Replication — Eval
Goal
Show the test-set performance of the four trained models on the held-out final 12 months of district-month data. Numbers below are loaded from the metrics CSV produced by evaluator.py; they are the same numbers reported in the Results chapter.
Performance table
| | Model | RMSE (cases) | MAE (cases) | R² | MAPE (%) |
|---|---|---|---|---|---|
| 0 | Random Forest | 131.43 | 72.10 | 0.8563 | 41.48 |
| 1 | XGBoost | 129.07 | 69.50 | 0.8614 | 44.64 |
| 2 | MLP | 128.07 | 71.45 | 0.8635 | 43.50 |
| 3 | Weighted Ensemble | 126.20 | 68.07 | 0.8675 | 41.01 |
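For readers who want to check how the four columns of the performance table are defined, here is a minimal standard-library sketch of RMSE, MAE, R² and MAPE. The function name `regression_metrics`, the toy values, and the choice to skip zero-count rows when computing MAPE are assumptions made for illustration; evaluator.py may handle these details differently.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return RMSE, MAE, R^2 and MAPE (%) for paired observed/predicted values."""
    if len(y_true) == 0 or len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and equally long")
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    ss_res = sum(e * e for e in errors)                 # residual sum of squares
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    # MAPE is undefined where the observed count is zero. Skipping those rows
    # is an assumption of this sketch, not a statement about evaluator.py.
    ratios = [abs(e / t) for e, t in zip(errors, y_true) if t != 0]
    return {
        "rmse": math.sqrt(ss_res / n),
        "mae": sum(abs(e) for e in errors) / n,
        "r2": 1.0 - ss_res / ss_tot if ss_tot > 0 else float("nan"),
        "mape": 100.0 * sum(ratios) / len(ratios) if ratios else float("nan"),
    }


# Toy values for illustration only, not data from the study.
observed = [100.0, 200.0, 400.0, 50.0]
predicted = [110.0, 180.0, 420.0, 40.0]
metrics = regression_metrics(observed, predicted)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

RMSE and MAE are in the same units as the target (cases), whereas MAPE is relative to each observed value, so a model can rank differently on MAPE than on the absolute-error metrics.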
Feature–target correlations
| | Feature | Pearson_r_with_log_cases |
|---|---|---|
| 0 | cases_lag1 | 0.731 |
| 1 | temp_mean_lag1 | 0.599 |
| 2 | temp_roll3 | 0.559 |
| 3 | precip_lag1 | 0.313 |
| 4 | monsoon | 0.245 |
| 5 | precip_roll3 | 0.208 |
| 6 | humidity_lag1 | 0.180 |
| 7 | flood_lag1 | 0.122 |
| 8 | month_sin | -0.133 |
| 9 | month_cos | -0.226 |
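The correlation column can be reproduced in principle by correlating each feature with the log-transformed case count. The sketch below is a minimal standard-library version under two assumptions: that the log target is `log1p(cases)` (the source only names the column `Pearson_r_with_log_cases`), and that the toy series stand in for one district's monthly data. The helper name `pearson_r` is hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Toy monthly series for illustration only, not data from the study.
cases = [12, 30, 85, 240, 410, 150]
cases_lag1 = [8, 12, 30, 85, 240, 410]  # previous month's count

# Assumption: the target is log1p(cases), which stays finite at zero cases.
log_cases = [math.log1p(c) for c in cases]

r = pearson_r(cases_lag1, log_cases)
print(f"cases_lag1 vs log cases: r = {r:.3f}")
```

Pearson's r captures only linear association, so a low value for a cyclic encoding such as month_sin or month_cos does not by itself mean that the feature is uninformative to a non-linear model.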
Interpretation
- The Weighted Ensemble is the headline winner: R² = 0.8675, RMSE = 126.20, MAE = 68.07, MAPE = 41.01 %. It edges out every individual model on every metric, which is consistent with the rationale for combining structurally distinct learners whose errors are partially uncorrelated.
- All four models fall within 0.012 R² of each other (Random Forest 0.8563, XGBoost 0.8614, MLP 0.8635, Ensemble 0.8675; a spread of 0.0112). Such a narrow spread across structurally different learners suggests that the predictive ceiling is set by the climate × autoregressive signal in the data, not by the choice of model family.
- XGBoost is the operational pick when only a single model can be deployed: it has the lowest MAE among the individual models (69.50), offers gain-based feature importance for interpretation, handles missing values gracefully, and is reproducible under a fixed random seed.
- Random Forest has the lowest MAPE among the individual models (41.48 %), meaning marginally better proportional accuracy on low-incidence districts, where small absolute errors translate into large percentage errors. It remains the conservative baseline.
- MLP edges out the tree-based individual models on overall R² (0.8635) but at the cost of higher absolute-error variance and stricter input-scaling requirements.
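To illustrate why a weighted combination can beat each of its members when their errors are only partially correlated, here is a minimal sketch. The source does not state how the ensemble weights are chosen, so the inverse-RMSE weighting, the helper names, the validation RMSE values, and the toy predictions below are all assumptions for illustration, not the pipeline's actual method.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error between observed and predicted values."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def inverse_rmse_weights(val_rmse):
    """Map {model: validation RMSE} to weights proportional to 1/RMSE, summing to 1."""
    inverse = {name: 1.0 / value for name, value in val_rmse.items()}
    total = sum(inverse.values())
    return {name: w / total for name, w in inverse.items()}


def weighted_ensemble(predictions, weights):
    """Weighted average of per-model prediction lists of equal length."""
    names = list(predictions)
    n = len(predictions[names[0]])
    return [sum(weights[m] * predictions[m][i] for m in names) for i in range(n)]


# Toy test-set values for illustration only, not data from the study.
y_true = [100.0, 200.0, 300.0, 400.0]
preds = {
    "rf": [110.0, 190.0, 320.0, 380.0],
    "xgb": [95.0, 210.0, 290.0, 410.0],
    "mlp": [105.0, 205.0, 285.0, 415.0],
}

# Hypothetical validation-set RMSEs; weights should come from data that is
# separate from the test set to avoid leaking test information.
val_rmse = {"rf": 16.0, "xgb": 9.0, "mlp": 11.0}

weights = inverse_rmse_weights(val_rmse)
ensemble = weighted_ensemble(preds, weights)
individual_rmse = {name: rmse(y_true, p) for name, p in preds.items()}
ensemble_rmse = rmse(y_true, ensemble)
print(weights)
print(individual_rmse, ensemble_rmse)
```

In this toy case the models err in different directions on several points, so their errors partly cancel in the average. When the members' errors are strongly correlated, the gain shrinks toward zero, which fits the small margins reported here.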
Where the trained pickles live
The pipeline persists trained models to code/output/models/ next to the source code (see Train). The file layout is:
code/output/models/
├── rf_model.pkl
├── xgb_model.pkl
├── mlp_model.pkl (or lstm_model.h5 if TensorFlow is installed)
├── scaler_X.pkl
└── scaler_y.pkl
The directory is gitignored by default — model artifacts are regenerable from source and shouldn’t be committed.
To regenerate from scratch, see Train.
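Once the artifacts exist, they can be loaded back for inspection or prediction. The sketch below assumes the .pkl files were written with Python's standard pickle module; if the pipeline used joblib.dump instead, joblib.load would be the matching call. The helper name `load_artifacts` is hypothetical, and unpickling the real models requires the libraries that created them (for example scikit-learn and XGBoost) to be installed.

```python
import pickle
from pathlib import Path

# Directory named in the layout above, relative to the repository root.
MODEL_DIR = Path("code/output/models")

# File stems taken from the layout above (the optional lstm_model.h5 is not
# a pickle and is deliberately left out of this sketch).
EXPECTED = ("rf_model", "xgb_model", "mlp_model", "scaler_X", "scaler_y")


def load_artifacts(model_dir=MODEL_DIR, names=EXPECTED):
    """Load whichever expected .pkl files exist; return (artifacts, missing)."""
    model_dir = Path(model_dir)
    artifacts, missing = {}, []
    for name in names:
        path = model_dir / f"{name}.pkl"
        if path.is_file():
            # Only unpickle files you produced yourself: pickle can execute
            # arbitrary code while loading.
            with path.open("rb") as fh:
                artifacts[name] = pickle.load(fh)
        else:
            missing.append(path.name)
    return artifacts, missing


artifacts, missing = load_artifacts()
print("loaded:", sorted(artifacts))
print("missing:", missing)
```

Because the directory is gitignored, a fresh clone will report every file as missing until the training step has been run.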
Tip
The figures supporting this evaluation — model comparison, correlation heatmap, ecological-zone distribution, national trend — are on the Figures page.