I’ve been experimenting with various ML training tools and with different monitoring and operations tools. I’ve been unsure whether one tool fits every case (e.g. LangSmith or MLflow) or whether certain tools are better matched to a particular need. I haven’t looked into licensing costs, but for each ML type I have put together a list of the tools I think are appropriate, plus a sub-article (under ML Type) showing how I have implemented each.
ML Types and MLOps Tools

| ML type | Typical use cases | Training / experiment | Serving / deploy | Monitoring / ops |
|---|---|---|---|---|
| Classical: Linear / logistic regression | Forecasting, pricing, churn prediction, A/B test analysis | scikit-learn, Statsmodels, MLflow, W&B | Flask, FastAPI, BentoML, SageMaker, Vertex AI | Evidently AI, Prometheus, Grafana, WhyLabs |
| Classical: SVMs / k-NN / naive Bayes | Text classification, anomaly detection, recommendation | scikit-learn, MLflow, DVC, Optuna | ONNX Runtime, Seldon Core, KServe | Evidently AI, Arize, Prometheus |
| Ensemble: Random forest / gradient boosting | Tabular data, A&E acuity predictions, hospital utilisation, fraud detection | XGBoost, LightGBM, CatBoost, MLflow, Optuna | Triton, SageMaker, Databricks, BentoML | Evidently AI, NannyML, Fiddler, Grafana |
| Deep learning: CNNs (computer vision) | Image classification, object detection, medical imaging, video | PyTorch, TensorFlow, Keras, W&B, ClearML | TorchServe, Triton, TFServing, SageMaker, Vertex AI | Arize, WhyLabs, Seldon Alibi, Prometheus |
| Deep learning: RNNs / LSTMs | Time series forecasting, speech recognition, sequence modeling | PyTorch, TensorFlow, MLflow, W&B | TorchServe, Triton, AWS Lambda, KServe | Arize, Evidently AI, Grafana |
| Deep learning: GANs / VAEs | Image generation, data augmentation, style transfer, drug discovery | PyTorch, TensorFlow, W&B, Neptune | Triton, Replicate, Modal | W&B, Prometheus |
| Transformer: Transformers (NLP) | Sentiment analysis, NER, translation, summarization | Hugging Face, PyTorch, DeepSpeed, W&B | HF Inference, vLLM, Triton, SageMaker | LangSmith, Arize Phoenix, WhyLabs |
| Diffusion: Diffusion models | Text-to-image, inpainting, video generation, 3D | Diffusers, PyTorch, W&B, Comet | Replicate, Modal, RunPod, Triton | W&B, Prometheus |
| Reinforcement: Reinforcement learning | Robotics, game AI, rec systems, autonomous vehicles | Ray / RLlib, Stable Baselines3, W&B, MLflow | Ray Serve, SageMaker RL | Ray Dashboard, Prometheus |
| Foundation: LLMs (GPT, Claude, Llama) | Chat, code gen, agents, RAG, reasoning, content creation | PyTorch, DeepSpeed, Megatron, FSDP, Axolotl, W&B | vLLM, TGI, Ollama, Triton, Anyscale, Together AI | LangSmith, Arize Phoenix, Braintrust, Humanloop, PromptLayer |
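
One way to test the "one size fits all" question against the table above is to encode a few rows as data and ask which tools they have in common. The sketch below is only an illustration: it covers three of the ten rows, and the names `ToolStack`, `STACKS` and `shared_tools` are hypothetical helpers invented for this example, not part of any of the listed tools.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class ToolStack:
    """Shortlisted tools for one ML type, split by lifecycle stage."""
    training: tuple[str, ...]
    serving: tuple[str, ...]
    monitoring: tuple[str, ...]


# Excerpt of three rows from the table; keys are illustrative labels only.
STACKS: dict[str, ToolStack] = {
    "linear / logistic regression": ToolStack(
        training=("scikit-learn", "Statsmodels", "MLflow", "W&B"),
        serving=("Flask", "FastAPI", "BentoML", "SageMaker", "Vertex AI"),
        monitoring=("Evidently AI", "Prometheus", "Grafana", "WhyLabs"),
    ),
    "random forest / gradient boosting": ToolStack(
        training=("XGBoost", "LightGBM", "CatBoost", "MLflow", "Optuna"),
        serving=("Triton", "SageMaker", "Databricks", "BentoML"),
        monitoring=("Evidently AI", "NannyML", "Fiddler", "Grafana"),
    ),
    "llms": ToolStack(
        training=("PyTorch", "DeepSpeed", "Megatron", "FSDP", "Axolotl", "W&B"),
        serving=("vLLM", "TGI", "Ollama", "Triton", "Anyscale", "Together AI"),
        monitoring=("LangSmith", "Arize Phoenix", "Braintrust", "Humanloop", "PromptLayer"),
    ),
}


def shared_tools(stage: str, ml_types: list[str]) -> set[str]:
    """Return the tools listed for `stage` in every one of the given ML types."""
    per_type = [set(getattr(STACKS[name], stage)) for name in ml_types]
    return set.intersection(*per_type)


if __name__ == "__main__":
    tabular = ["linear / logistic regression", "random forest / gradient boosting"]
    print(shared_tools("monitoring", tabular))       # tools common to both tabular rows
    print(shared_tools("monitoring", list(STACKS)))  # tools common to all three rows
```

For these three rows, the two tabular types share Evidently AI and Grafana for monitoring, while no monitoring tool appears in all three once LLMs are included. That fits the idea that the tooling should be proportionate to the ML type instead of standardised on a single product.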