Chapter 21 — MLOps

MLOps: Deploy, Monitor, Retrain

A model in a notebook delivers zero value. This chapter covers the lifecycle most courses skip: packaging, deployment, monitoring, drift detection, and retraining.

MLOps is what turns a one-off analysis into a reliable product. The work doesn't end at model.fit() — it ends when the model keeps delivering value in production and you'd notice the day it stops.
21.1 The ML lifecycle
21.2 Experiment tracking
Log every run (params, metrics, data version, code commit) so results are reproducible and comparable. Tools: MLflow, Weights & Biases, and DVC (for data versioning).
python
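import mlflow
import mlflow.sklearn
from sklearn.metrics import roc_auc_score

# Assumes `model` (a scikit-learn-style classifier) and the
# X_train / y_train / X_test / y_test splits are already defined.
with mlflow.start_run():
    # Hyperparameters used for this run
    mlflow.log_params({'n_estimators': 400, 'lr': 0.05})
    model.fit(X_train, y_train)
    # Evaluate on held-out data using the positive-class probability
    auc = roc_auc_score(y_test, model.predict_proba(X_test)[:, 1])
    mlflow.log_metric('auc', auc)
    # Store the fitted model as a run artifact
    mlflow.sklearn.log_model(model, 'model')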
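The run above records params and a metric; the data version and code commit mentioned earlier need a little extra work. A minimal sketch using only the standard library follows. The helper names (`data_fingerprint`, `current_commit`, `run_tags`) and the tag keys are hypothetical, not part of MLflow, and hashing a single file is just one possible way to version data.

```python
import hashlib
import subprocess


def data_fingerprint(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of the file at `path`, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def current_commit() -> str:
    """Return the current git commit hash, or 'unknown' if it cannot be read."""
    try:
        out = subprocess.run(
            ["git", "rev-parse", "HEAD"],
            capture_output=True, text=True, check=True,
        )
        return out.stdout.strip()
    except (subprocess.CalledProcessError, FileNotFoundError):
        # Not inside a git repository, or git is not installed.
        return "unknown"


def run_tags(data_path: str) -> dict:
    """Build tags that tie a run to the exact data file and code version."""
    return {
        "data_sha256": data_fingerprint(data_path),
        "git_commit": current_commit(),
    }
```

Inside the `with mlflow.start_run():` block, the resulting dict could be passed to `mlflow.set_tags(...)` so each run carries its data and code identifiers alongside the metrics.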
21.3 Deployment patterns
| Pattern | Use when | Avoid when |
|---|---|---|
| Batch scoring | Predictions needed daily/hourly (churn lists) | Instant response required |
| Real-time API | Low-latency per-request (fraud at checkout) | Huge volumes scored offline |
| Edge / embedded | Offline devices, privacy, low latency | Model too large for device |
| Streaming | Continuous event data | Simple periodic batches suffice |
python
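# Minimal real-time model API with FastAPI
import joblib
import pandas as pd
from fastapi import FastAPI

app = FastAPI()
# Load the model once at startup, not on every request
model = joblib.load('model.joblib')

@app.post('/predict')
def predict(features: dict):
    # The JSON body (feature name -> value) becomes a one-row DataFrame
    X = pd.DataFrame([features])
    # Probability of the positive class for that single row
    proba = model.predict_proba(X)[0, 1]
    return {'churn_probability': float(proba)}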
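For comparison with the real-time API, the batch-scoring pattern from the table can be sketched as a plain function that scores a DataFrame in chunks. This is an illustration under assumptions: `score_in_batches` and its `batch_size` default are hypothetical, and the model is assumed to expose a scikit-learn-style `predict_proba`.

```python
import pandas as pd


def score_in_batches(model, features: pd.DataFrame, batch_size: int = 10_000) -> pd.Series:
    """Score `features` chunk by chunk; return positive-class probabilities.

    The result keeps the input index so scores can be joined back to
    customer or transaction IDs.
    """
    parts = []
    for start in range(0, len(features), batch_size):
        chunk = features.iloc[start:start + batch_size]
        proba = model.predict_proba(chunk)[:, 1]
        parts.append(pd.Series(proba, index=chunk.index))
    if not parts:
        # Empty input: return an empty, correctly named series.
        return pd.Series(dtype=float, name="churn_probability")
    return pd.concat(parts).rename("churn_probability")
```

In a scheduled job, the model would typically be loaded with `joblib.load('model.joblib')`, as in the API example, and the returned series written to a table or file for downstream use.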
21.4 Monitoring & drift detection
| What to watch | Meaning | Action |
|---|---|---|
| Data drift | Input feature distribution shifts vs training | Alert; investigate; consider retraining |
| Concept drift | Relationship between X and y changes | Retrain on recent data |
| Prediction drift | Output distribution shifts | Check upstream data pipeline |
| Performance decay | Live metric drops once labels arrive | Retrain / rollback |
| Operational | Latency, error rate, throughput | Scale infra; fix serving bugs |
Detect drift with population stability index (PSI), KS-test, or tools like Evidently / NannyML. Always log live predictions so you can measure performance once true labels arrive.
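As a concrete illustration of the two statistical checks named above, here is a minimal sketch for a single numeric feature. The `psi` function is a hypothetical helper written for this example (quantile bins taken from the training sample, with a small `eps` to avoid division by zero); the KS test uses `scipy.stats.ks_2samp`. The data is synthetic.

```python
import numpy as np
from scipy.stats import ks_2samp


def psi(expected, actual, n_bins: int = 10, eps: float = 1e-6) -> float:
    """Population stability index of `actual` relative to `expected`."""
    expected = np.asarray(expected, dtype=float)
    actual = np.asarray(actual, dtype=float)
    # Bin edges are quantiles of the reference (training) sample.
    edges = np.unique(np.quantile(expected, np.linspace(0, 1, n_bins + 1)))
    # Use only interior edges so live values outside the training range
    # still land in the first or last bin.
    inner = edges[1:-1]
    n = len(inner) + 1
    e_counts = np.bincount(np.searchsorted(inner, expected, side="right"), minlength=n)
    a_counts = np.bincount(np.searchsorted(inner, actual, side="right"), minlength=n)
    # Clip empty bins to eps so the log ratio stays finite.
    e_frac = np.clip(e_counts / e_counts.sum(), eps, None)
    a_frac = np.clip(a_counts / a_counts.sum(), eps, None)
    return float(np.sum((a_frac - e_frac) * np.log(a_frac / e_frac)))


rng = np.random.default_rng(0)
train = rng.normal(0.0, 1.0, 5_000)
live = rng.normal(1.0, 1.0, 5_000)  # synthetic shift of one standard deviation

print("PSI:", psi(train, live))
print("KS p-value:", ks_2samp(train, live).pvalue)
```

A commonly quoted rule of thumb treats PSI below 0.1 as stable and above 0.25 as a significant shift, but these cutoffs are conventions rather than a standard; tune them per feature. With large samples the KS test flags even tiny shifts, so it is usually worth pairing the p-value with an effect-size measure such as PSI.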
21.5 Retraining strategy
decision tree
When to retrain?
│
├── Scheduled ─ stable domain ──────► Weekly / monthly cadence
├── Triggered ─ drift / decay alert ─► Retrain on recent window
└── Continuous ─ fast-moving data ──► Online / streaming updates
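The scheduled and triggered branches of the tree above can be sketched as one decision function. Everything here is an assumption for illustration: the `MonitorSnapshot` fields, the function name, and the default thresholds (PSI limit, allowed AUC drop, maximum model age) are hypothetical and should be set per use case. The continuous branch is not covered, since online updates are a different training setup rather than a decision rule.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class MonitorSnapshot:
    psi: float                   # worst per-feature PSI vs training data
    live_auc: Optional[float]    # None until true labels have arrived
    baseline_auc: float          # AUC measured when the model was deployed
    days_since_training: int


def should_retrain(
    s: MonitorSnapshot,
    psi_limit: float = 0.25,
    max_auc_drop: float = 0.03,
    max_age_days: int = 30,
) -> Tuple[bool, str]:
    """Return (retrain?, reason), checking the strongest signal first."""
    # Triggered: measured performance decay is the most direct evidence.
    if s.live_auc is not None and s.baseline_auc - s.live_auc > max_auc_drop:
        return True, "performance decay"
    # Triggered: input drift, usable before labels arrive.
    if s.psi > psi_limit:
        return True, "data drift"
    # Scheduled: fall back to a fixed cadence.
    if s.days_since_training >= max_age_days:
        return True, "scheduled"
    return False, "none"


print(should_retrain(MonitorSnapshot(psi=0.40, live_auc=None,
                                     baseline_auc=0.90, days_since_training=3)))
```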

Professional recommendation

Start simple: batch scoring + scheduled retraining + a drift dashboard. Add real-time serving and automated triggers only when the business case demands it. Always keep the previous model version so you can roll back instantly.
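To make the rollback advice concrete, here is a toy in-memory registry that keeps the history of promoted versions. `ModelRegistry` is a hypothetical class written for this illustration; a production setup would more likely rely on a dedicated model registry with persistent storage, but the bookkeeping idea is the same.

```python
class ModelRegistry:
    """Toy registry: stores models by version and tracks promotion history."""

    def __init__(self):
        self._models = {}    # version -> model object (or artifact path)
        self._history = []   # promoted versions, oldest first

    def register(self, version: str, model) -> None:
        """Store a model under a new, unique version label."""
        if version in self._models:
            raise ValueError(f"version {version!r} already registered")
        self._models[version] = model

    def promote(self, version: str) -> None:
        """Make a registered version the one that serves traffic."""
        if version not in self._models:
            raise KeyError(version)
        self._history.append(version)

    @property
    def current(self):
        """Version currently serving, or None if nothing was promoted."""
        return self._history[-1] if self._history else None

    def rollback(self) -> str:
        """Revert to the previously promoted version and return its label."""
        if len(self._history) < 2:
            raise RuntimeError("no previous version to roll back to")
        self._history.pop()
        return self._history[-1]

    def load_current(self):
        """Return the model object for the current version."""
        return self._models[self.current]


registry = ModelRegistry()
registry.register("v1", "model-v1-artifact")
registry.register("v2", "model-v2-artifact")
registry.promote("v1")
registry.promote("v2")
registry.rollback()          # v2 misbehaves, so go back to v1
print(registry.current)
```

Because old versions are never deleted on promotion, rolling back is a pointer change rather than a retraining job.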

21.6 Common mistakes
Common mistakes to avoid
Quick cheatsheet
mlflow.log_metric() -> Track experiment results
joblib.dump(model, 'model.joblib') -> Persist the trained model
FastAPI / @app.post -> Serve real-time predictions
PSI / KS-test -> Detect data drift
model registry / version -> Enable rollback