End-to-end mini-applications built on the same techniques the courses teach — risk scoring, quant trading, knowledge graphs, macro pipelines. Each case ships with the dataset, the model, and a one-page write-up of the reasoning behind it.
Patient-level risk scoring on tabular EHR data. Gradient-boosted ensemble with calibration, SHAP attributions, and a fairness audit across protected groups.
Stack · pandas · lightgbm · scikit-learn · SHAP
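For readers unfamiliar with the calibration step mentioned above, here is a minimal sketch of a reliability check: predictions are grouped into equal-width probability bins and the mean predicted risk in each bin is compared with the observed event rate. The function names, the bin count, and the pure-Python implementation are assumptions made for illustration, not the case study's own code.

```python
def calibration_bins(probs, labels, n_bins=5):
    """Group predictions into equal-width probability bins.

    Returns one (mean_predicted, observed_rate, count) tuple per non-empty bin.
    """
    # Per bin: [sum of predicted probabilities, number of positives, count]
    sums = [[0.0, 0, 0] for _ in range(n_bins)]
    for p, y in zip(probs, labels):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 falls in the last bin
        sums[idx][0] += p
        sums[idx][1] += y
        sums[idx][2] += 1
    return [(s / n, pos / n, n) for s, pos, n in sums if n > 0]


def expected_calibration_error(probs, labels, n_bins=5):
    """Count-weighted mean gap between predicted and observed rates."""
    total = len(probs)
    return sum(
        n / total * abs(pred - obs)
        for pred, obs, n in calibration_bins(probs, labels, n_bins)
    )
```

Running the same function separately on each protected group is one simple way to fold calibration into a fairness audit: a large gap in calibration error between groups suggests the scores mean different things for different patients.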
Classic cross-sectional momentum portfolio — formation / holding period sweep, transaction-cost-aware backtest, Newey-West t-stats, and a sub-period regime breakdown.
Stack · pandas · numpy · statsmodels
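The Newey-West t-statistic mentioned above can be sketched in a few lines. This is a plain-Python illustration for the mean of a return series, using a Bartlett kernel and no small-sample correction; the function name and the choice of lag count are assumptions, and the case study itself may rely on a library implementation instead.

```python
import math


def newey_west_tstat(returns, lags):
    """t-statistic for the mean return with a HAC (Bartlett kernel) variance."""
    n = len(returns)
    mean = sum(returns) / n
    u = [r - mean for r in returns]  # demeaned returns

    # Lag-0 autocovariance (1/n normalisation, no small-sample correction)
    long_run_var = sum(x * x for x in u) / n

    for j in range(1, lags + 1):
        gamma_j = sum(u[t] * u[t - j] for t in range(j, n)) / n
        weight = 1.0 - j / (lags + 1)  # Bartlett weights decline linearly
        long_run_var += 2.0 * weight * gamma_j

    return mean / math.sqrt(long_run_var / n)
```

With `lags=0` this reduces to the ordinary t-statistic computed with a 1/n variance. Overlapping holding periods induce positive autocorrelation in portfolio returns, which is why the plain t-statistic tends to overstate significance in momentum backtests.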
Pair selection by Engle-Granger cointegration, half-life-aware OU-process entry / exit thresholds, walk-forward live simulation with slippage and stop-loss controls.
Stack · numpy · statsmodels · backtester
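To make "half-life-aware" concrete, the sketch below estimates the mean-reversion half-life of a spread from the discrete-time regression Δx_t = a + b·x_{t-1} + e_t, where the half-life is −ln 2 / ln(1 + b). The function name and the pure-Python least-squares fit are illustrative assumptions; how the case study maps the half-life onto entry and exit thresholds is not shown here.

```python
import math


def ou_half_life(spread):
    """Estimate the half-life of mean reversion from an AR(1)-style fit."""
    lagged = spread[:-1]
    delta = [nxt - cur for cur, nxt in zip(spread[:-1], spread[1:])]

    n = len(lagged)
    mean_x = sum(lagged) / n
    mean_d = sum(delta) / n

    # OLS slope of delta on the lagged level (intercept absorbed by demeaning)
    cov = sum((x - mean_x) * (d - mean_d) for x, d in zip(lagged, delta))
    var = sum((x - mean_x) ** 2 for x in lagged)
    slope = cov / var

    if not -1.0 < slope < 0.0:
        raise ValueError("spread does not look mean-reverting")

    return -math.log(2.0) / math.log(1.0 + slope)
```

A slope outside (−1, 0) means the fitted process either does not revert or oscillates, so the function raises rather than returning a meaningless number. A natural use is to scale the holding window or time-stop to a small multiple of the estimated half-life.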
Schema → entity-resolution → graph store → retrieval-augmented generation over an organisation's internal docs. Includes a Neo4j + pgvector reference architecture and an evaluation harness.
Stack · NetworkX · rdflib · Neo4j · pgvector
Live macro indicators (BIS, IMF, FRED) pulled via SDMX into a dashboard with consistent vintages, revision-aware time series, and nowcasting on key headline series.
Stack · pandasdmx · duckdb · plotly
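A revision-aware series boils down to storing every published value together with its publication (vintage) date and answering "what was known as of date D?". The sketch below shows that lookup in plain Python; the record layout, the `as_of` helper, and the numbers are all made up for illustration and are not taken from any real release.

```python
from datetime import date

# (reference_period, vintage_date, value); synthetic values for illustration only
observations = [
    ("P1", date(2024, 1, 25), 100.0),  # first release
    ("P1", date(2024, 2, 28), 100.4),  # first revision
    ("P1", date(2024, 3, 28), 100.1),  # second revision
    ("P2", date(2024, 4, 25), 101.0),
    ("P2", date(2024, 5, 30), 100.8),
]


def as_of(observations, cutoff):
    """Return {reference_period: value} using the latest vintage on or before cutoff."""
    latest = {}
    for period, vintage, value in observations:
        if vintage > cutoff:
            continue  # not yet published at the cutoff date
        if period not in latest or vintage > latest[period][0]:
            latest[period] = (vintage, value)
    return {period: value for period, (_, value) in latest.items()}
```

Calling `as_of(observations, date(2024, 3, 1))` returns only the values a forecaster could have seen on that day, which is the property a nowcasting backtest needs in order to avoid look-ahead bias.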
End-to-end social-media monitoring: ingest → claim extraction → stance classification → community detection on the share graph, with a small operator UI for triage.
Stack · spaCy · transformers · NetworkX
Short-horizon nodal-price spike forecaster. Combines weather, unit-commitment fundamentals, and a gradient-boosted residual to beat naive seasonal benchmarks on out-of-sample tests.
Stack · pandas · lightgbm · joblib
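"Beating a naive seasonal benchmark" can be made precise with a skill score. The sketch below builds a seasonal-naive forecast (repeat the last full season) and scores a model against it by mean absolute error; the function names and the choice of MAE as the metric are assumptions for illustration, since the write-up does not state which metric the case study uses.

```python
def seasonal_naive(history, season_length, horizon):
    """Forecast by repeating the last full season of observations."""
    last_season = history[-season_length:]
    return [last_season[h % season_length] for h in range(horizon)]


def mean_absolute_error(actual, predicted):
    """Average absolute gap between actual and predicted values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def skill_score(actual, model_forecast, benchmark_forecast):
    """1 - MAE_model / MAE_benchmark; positive means the model beats the benchmark."""
    return 1.0 - (
        mean_absolute_error(actual, model_forecast)
        / mean_absolute_error(actual, benchmark_forecast)
    )
```

A score of 0 means the model is no better than the benchmark and 1 means a perfect forecast. For hourly prices, a season length of 24 or 168 would be a typical assumption.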
Live calculator for two-sample tests on proportions and means. Handles unequal allocation, sequential / fixed-horizon designs, and plots power curves you can paste straight into a planning doc.
Stack · scipy.stats · numpy · plotly
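As a rough illustration of the calculation behind such a calculator, the sketch below approximates the power of a two-sided z-test for a difference in proportions with unequal allocation. It uses the normal approximation with an unpooled standard error and the standard library's `statistics.NormalDist` in place of `scipy.stats`; the function name and parameters are assumptions, and sequential designs are not covered.

```python
import math
from statistics import NormalDist


def power_two_proportions(p1, p2, n1, ratio=1.0, alpha=0.05):
    """Approximate power of a two-sided z-test comparing two proportions.

    n1 is the sample size of group 1; ratio is n2 / n1 (1.0 = equal allocation).
    """
    n2 = n1 * ratio
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1.0 - alpha / 2.0)

    # Unpooled standard error of the difference in sample proportions
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    shift = (p2 - p1) / se

    # Probability of landing in either rejection region under the alternative
    return std_normal.cdf(shift - z_crit) + std_normal.cdf(-shift - z_crit)


# Points for a power curve over group-1 sample sizes (hypothetical effect size)
curve = [(n, power_two_proportions(0.10, 0.12, n)) for n in (500, 1000, 2000, 4000)]
```

When `p1 == p2` the function returns `alpha`, as it should. For a fixed total sample, moving away from equal allocation lowers power, which is the trade-off an unequal-allocation option has to expose.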