ai – API Crazy

Using CatBoost.ai in Healthcare

May 18, 2026 mustnotgrumble

Because SNOMED and ICD codes must be treated as categories for gradient boosting

The eight main blood types (A+, A-, B- (rare), B+, O+, O- (universal donor), AB+, AB-) are categories. With label encoding, each category becomes an integer (e.g. A+ = 0, B- = 1, AB+ = 2). This is compact but introduces a false ordering and the…
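To make the false-ordering point concrete, here is a minimal sketch in plain Python. It is an illustration only, not the post's own code: the helper names `label_encode`, `one_hot_encode` and `hamming` are hypothetical, and the integer codes are simply assigned in list order. One-hot encoding is shown as a common contrast; the excerpt is truncated, so it does not confirm which alternative the full post recommends. (CatBoost itself accepts categorical columns through its `cat_features` parameter, which is not exercised here.)

```python
# Blood types from the excerpt; the ordering of this list is arbitrary.
BLOOD_TYPES = ["A+", "A-", "B+", "B-", "O+", "O-", "AB+", "AB-"]


def label_encode(categories):
    """Hypothetical helper: map each category to an integer by list position."""
    return {cat: idx for idx, cat in enumerate(categories)}


def one_hot_encode(categories):
    """Hypothetical helper: map each category to a binary indicator vector."""
    n = len(categories)
    return {
        cat: [1 if i == idx else 0 for i in range(n)]
        for idx, cat in enumerate(categories)
    }


def hamming(u, v):
    """Count the positions at which two equal-length vectors differ."""
    return sum(a != b for a, b in zip(u, v))


labels = label_encode(BLOOD_TYPES)
one_hot = one_hot_encode(BLOOD_TYPES)

# Label encoding implies that AB- is "greater than" A+ and that the two are
# 7 units apart, although blood types have no such order or distance.
false_gap = abs(labels["AB-"] - labels["A+"])

# With one-hot vectors, every pair of distinct blood types is equally far
# apart, so no ordering is implied.
pairwise = {
    hamming(one_hot[a], one_hot[b])
    for a in BLOOD_TYPES
    for b in BLOOD_TYPES
    if a != b
}

print("label-encoded gap between A+ and AB-:", false_gap)
print("distinct one-hot pairwise distances:", pairwise)
```

The trade-off is width: one-hot encoding needs one column per category, which is manageable for eight blood types but becomes unwieldy for high-cardinality code systems such as SNOMED or ICD.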

MLOps for Scikit-learn

May 17, 2026 / May 14, 2026 mustnotgrumble

Setting up MLOps for repeatable pipelines when using scikit-learn

Not every AI problem requires a large language model. In many enterprise environments, the most valuable systems are still well engineered, explainable, repeatable and operationally governed. This is where classical machine learning pipelines still provide benefit and scikit-learn remains one of the strongest foundations for these…

Ducks on Icebergs

Featured mustnotgrumble

Federating Data Between Snowflake and Databricks with DuckDB and Apache Iceberg

If you're running both Snowflake and Databricks (and most enterprises I work with are), you've probably hit the federation problem. Data lives in both platforms, analysts need to query across them, and the obvious solutions (ETL everything into one place, or pay…

Explaining Medallion Data Architectures in Healthcare

Featured mustnotgrumble

Faster Insight, Better Reuse, and Scalable Data Foundations

Healthcare organisations face growing demand for better use of data: improving operational performance, supporting population health management, enabling AI, and accelerating research. Yet many still rely on fragmented pipelines, duplicated transformations, and slow bespoke data requests. At the same time, the economics of technology have changed. Modern…