Week 1 Day 2 — Baseline A: single-drug MLP

Summary

Baseline A trained successfully and passes the pre-registered quality gate (per-patient Spearman ≥ 0.40) with a large margin. This is the strong baseline the combo predictor (Week 3) will have to beat.

Architecture

Multi-task MLP, ~50K parameters:

Patient features (80) ─► [Patient MLP 128→64] ─► patient_emb ─┐
                                                              ├─► concat (128)
Drug ID (0..164)      ─► [Drug Embedding 64]  ─► drug_emb ────┘        │
                                                                       ▼
                                                  [Response Head 128→64→1]
                                                                       │
                                                                       ▼
                                                                predicted AUC
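The architecture above can be sketched in PyTorch (a minimal sketch: layer widths follow the diagram, but the activation, dropout, and other training details are assumptions, not the actual implementation):

```python
import torch
import torch.nn as nn

class SingleDrugMLP(nn.Module):
    """Multi-task MLP: patient features + drug embedding -> predicted AUC."""
    def __init__(self, n_patient_features=80, n_drugs=165, emb_dim=64):
        super().__init__()
        self.patient_mlp = nn.Sequential(
            nn.Linear(n_patient_features, 128), nn.ReLU(),
            nn.Linear(128, emb_dim), nn.ReLU(),
        )
        self.drug_emb = nn.Embedding(n_drugs, emb_dim)
        self.head = nn.Sequential(
            nn.Linear(2 * emb_dim, 64), nn.ReLU(),   # concat (128) -> 64
            nn.Linear(64, 1),                        # -> predicted AUC
        )

    def forward(self, patient_x, drug_id):
        p = self.patient_mlp(patient_x)              # (batch, 64)
        d = self.drug_emb(drug_id)                   # (batch, 64)
        return self.head(torch.cat([p, d], dim=-1)).squeeze(-1)

model = SingleDrugMLP()
n_params = sum(p.numel() for p in model.parameters())
```

With these widths the sketch lands in the tens of thousands of parameters, consistent with the ~50K figure above.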

5-fold CV by patient (no leakage)

Fold   best epoch   MAE     per-patient ρ   per-drug median ρ   wall time
1      23           34.15   0.698           0.278               137 s
2      29           35.75   0.709           0.226               149 s
3      20           36.39   0.681           0.286               110 s
4      29           35.69   0.700           0.369               138 s
5      28           33.21   0.730           0.354               133 s
mean   26           35.04   0.704           0.303               134 s
std    4            1.27    0.018           0.061
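Splitting "by patient" means grouping the (patient, drug) rows on patient ID so no patient contributes to both train and validation. A sketch with scikit-learn's GroupKFold on toy data (the actual split lives in the training script):

```python
import numpy as np
from sklearn.model_selection import GroupKFold

rng = np.random.default_rng(0)
n_rows = 1000                                     # toy (patient, drug) response rows
patient_ids = rng.integers(0, 100, size=n_rows)   # toy patient labels
X = rng.normal(size=(n_rows, 80))
y = rng.normal(size=n_rows)

gkf = GroupKFold(n_splits=5)
for train_idx, val_idx in gkf.split(X, y, groups=patient_ids):
    # no leakage: held-out patients never appear in the training split
    assert not set(patient_ids[train_idx]) & set(patient_ids[val_idx])
```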

Pre-registered gates

Gate                    Threshold   Actual   Verdict
Per-patient Spearman    ≥ 0.40      0.704    ✅ PASS (+76%)
MAE (AUC 0–300 scale)   < 50        34.6     ✅ PASS
Std across folds        < 0.05      0.018    ✅ PASS (extremely stable)
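The per-patient and per-drug ρ metrics are groupby Spearman correlations over the held-out predictions. A sketch, assuming cv_held_out_predictions.csv has patient_id, drug_id, y_true, and y_pred columns (the actual column names may differ):

```python
import pandas as pd
from scipy.stats import spearmanr

def grouped_spearman(df, by, min_n=3):
    """Spearman(y_true, y_pred) within each group with >= min_n observations."""
    out = {}
    for key, g in df.groupby(by):
        if len(g) >= min_n:
            out[key] = spearmanr(g["y_true"], g["y_pred"]).correlation
    return pd.Series(out, name=f"spearman_by_{by}")

# toy frame standing in for cv_held_out_predictions.csv
df = pd.DataFrame({
    "patient_id": [1] * 5 + [2] * 5,
    "drug_id":    list(range(5)) * 2,
    "y_true":     [10, 20, 30, 40, 50, 15, 25, 35, 45, 55],
    "y_pred":     [12, 18, 33, 39, 52, 14, 27, 33, 48, 51],
})
per_patient = grouped_spearman(df, "patient_id")
per_drug = grouped_spearman(df, "drug_id", min_n=2)
```

The same helper gives both tables; the per-patient series is what the ≥ 0.40 gate is averaged over.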

Per-drug predictability (most-predictable drugs)

Drugs with per-drug Spearman > 0.40:

Drug n_patients Spearman Biology
Venetoclax 382 0.580 BCL2i, AML SOC anchor
Sunitinib 513 0.537 Multi-kinase inhibitor
Sorafenib 518 0.520 FLT3i / RAF
Cabozantinib 469 0.512 VEGFR / FLT3
KW-2449 468 0.506 FLT3 / Aurora B
Tivozanib 467 0.500 VEGFR
Dasatinib 518 0.500 BCR-ABL / SRC
Dovitinib 474 0.494 RTK
Foretinib 472 0.487 MET / VEGFR
Selumetinib 475 0.482 MEK

Biological pattern: the most-predictable drugs are kinase inhibitors (FLT3, multi-RTK) and BCL2i — exactly the drugs whose response is tightly coupled to the FLT3/NPM1/BCL2-status features we encoded. This is a clean sanity check that the model is learning real biology, not memorizing noise.

Per-patient predictability distribution

(distribution figure; the underlying per-patient ρ values are in per_patient_spearman.csv)
Outputs

runs/baseline_single_drug_mlp/
├── final_model.pt                                  # checkpoint + scaler + drug vocab
├── cv_metrics.json                                 # fold-level + overall
├── cv_held_out_predictions.csv                     # held-out pred + true, all folds
├── per_patient_spearman.csv                        # per-patient ρ
├── per_drug_spearman.csv                           # per-drug ρ
└── predictions_all_patients_all_drugs.csv          # 613 × 165, used in Week 4

Implications for Week 4

The head-to-head comparison becomes:

for p in held_out_patients:
    # lower predicted AUC = stronger predicted response
    best_single = predictions_all_patients_all_drugs.loc[p].min()
    best_combo = min(combo_predictor(p, d1, d2) for (d1, d2) in legal_pairs)
    delta = best_single - best_combo   # Δ > 0 ⇒ combo predicted to beat the best single drug

A baseline with ρ=0.70 means best_single is a genuinely strong opponent — winning against it requires the combo predictor to carry information beyond single-drug matching, not just learn the same patterns. This strengthens the scientific value of whatever the head-to-head shows.

Next (Day 3) — DrugComb AML subset

User started the download. Once summary_v_1_5.csv is in place:

python -m combo_val.data.drugcomb_etl \
  --input data/raw/drugcomb/summary_v_1_5.csv \
  --out data/canonical/drugcomb_aml_pairs.csv

I’ll write the ETL to filter to AML cell lines and map drug names onto the BeatAML drug vocabulary (needed so the DrugComb-trained residual can be applied to the BeatAML patients’ drug space).
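A first cut of that ETL might look like the sketch below. The DrugComb summary column names (cell_line_name, drug_row, drug_col) and the cell-line and name-mapping sets are placeholder assumptions to be verified against the actual file:

```python
import pandas as pd

# hypothetical AML cell lines and DrugComb -> BeatAML name map; to be replaced
AML_CELL_LINES = {"MOLM-13", "MV4-11", "OCI-AML2", "OCI-AML3", "HL-60"}
NAME_MAP = {"venetoclax": "Venetoclax", "sorafenib": "Sorafenib"}

def etl(input_csv, out_csv):
    df = pd.read_csv(input_csv)
    df = df[df["cell_line_name"].isin(AML_CELL_LINES)]
    for col in ("drug_row", "drug_col"):
        df[col] = df[col].str.lower().map(NAME_MAP)
    # keep only pairs where both drugs map into the BeatAML vocabulary
    df = df.dropna(subset=["drug_row", "drug_col"])
    df.to_csv(out_csv, index=False)
    return df
```

Dropping pairs with unmapped drugs is deliberate: the residual model is only useful on combinations it can later score in the BeatAML drug space.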