The DSPT is an online self-assessment tool that allows organisations to measure their performance against the National Data Guardian’s 10 data security standards; it must be completed yearly by organisations that handle patient data. The DTAC is an assessment framework for care commissioners and providers to use when assuring digital health technology (DHT) products. It is… Continue reading NCSC Cyber Assessment Framework, NHS Data Security and Protection Toolkit & NHS Digital Technology Assessment Criteria
Category: Big Data
International AI Bodies and their Powers
Every major economy now has an AI safety body. The international network that came out of the Seoul Summit in 2024 has grown to include the UK, the US, the EU, France, Germany, Japan, South Korea, Canada, Singapore, India and Australia. On paper it looks like coordinated global governance. In practice, almost none of these… Continue reading International AI Bodies and their Powers
Understanding AI Classifiers, Terminologies, Terminology Engines
These three concepts get conflated constantly in healthcare informatics conversations. All three deal with clinical codes, yet they do fundamentally different jobs. Understanding what each does is critical if you're building anything that touches coded clinical data. Category What it does Major examples Healthcare context Terminology Clinical code systems Defines the codes, concepts, descriptions and… Continue reading Understanding AI Classifiers, Terminologies, Terminology Engines
Using CatBoost.ai in Healthcare
Because SNOMED and ICD Codes must be treated as categories for gradient boosting The 8 main blood types (A+, A-, B- rare, B+, O+, O- universal, AB+, AB-) are categories. If I used label encoding, each category would become an integer (e.g. A+ = 0, B- = 1, AB+ = 2). This is compact but introduces a false ordering and the… Continue reading Using CatBoost.ai in Healthcare
MLOps Toolsets for Different ML Types
I've been playing with various ML training tools and different monitoring and operations tools. I've been unsure whether one size fits all (e.g. LangSmith or MLflow) or whether certain tools are more proportionate to the need. I haven't gone into licensing costs but for each ML type, I have put together a list of… Continue reading MLOps Toolsets for Different ML Types
MLOps for Scikit-learn
Setting up MLOps for repeatable pipelines when using scikit-learn Not every AI problem requires a large language model. In many enterprise environments, the most valuable systems are still well-engineered, explainable, repeatable and operationally governed. This is where classical machine learning pipelines still provide benefit, and scikit-learn remains one of the strongest foundations for these… Continue reading MLOps for Scikit-learn
Ducks on Icebergs
Federating Data Between Snowflake and Databricks with DuckDB and Apache Iceberg If you're running both Snowflake and Databricks, as most enterprises I work with are, you've probably hit the federation problem. Data lives in both platforms, analysts need to query across them, and the obvious solutions (ETL everything into one place, or pay… Continue reading Ducks on Icebergs
EU Sovereign Cloud List
The rule of law is a fundamental principle, from the Mesopotamian Code of Ur-Nammu, through Magna Carta, to the International Criminal Court's decision to ditch Microsoft Office for European open source alternatives. Data sovereignty requires certainty that services will never be terminated or left at the mercy of a governmental body. For this reason I find it useful… Continue reading EU Sovereign Cloud List
Explaining Medallion Data Architectures in Healthcare
Faster Insight, Better Reuse, and Scalable Data Foundations Healthcare organisations face growing demand for better use of data: improving operational performance, supporting population health management, enabling AI, and accelerating research. Yet many still rely on fragmented pipelines, duplicated transformations, and slow bespoke data requests. At the same time, the economics of technology have changed. Modern… Continue reading Explaining Medallion Data Architectures in Healthcare








