S6E5 — Kaggle Playground

Metric	Value	Note
Public LB (ROC-AUC)	0.947	Best submission so far
5-Fold CV (OOF)	0.950	LGB + XGB ensemble
Feature Columns	~800	After feature engineering v1
Optuna Trials	200	TPE sampler per model

Model	3-Fold CV	5-Fold OOF	Hardware
LightGBM (Optuna)	0.950	0.950	GPU (CUDA)
XGBoost (Optuna)	0.950	0.949	GPU (CUDA)
CatBoost (Optuna)	0.947	—	GPU
Ensemble (avg)	0.951	0.950	—

Feature Engineering v1

Target encoding of the categorical columns (Race, Driver, Team), lag features on LapNumber, rolling statistics, and interaction terms, for ~800 feature columns in total. Validated with StratifiedKFold.
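The two leakage-prone steps above, target encoding and lag/rolling features, can be sketched roughly as follows. This is a minimal illustration and not the project's actual pipeline: the column names `target` and `LapTime`, the smoothing constant, and the window size are assumptions, and only `Driver` and `LapNumber` come from the notes above.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import StratifiedKFold


def oof_target_encode(df, col, target, n_splits=5, smoothing=20.0, seed=42):
    """Out-of-fold target encoding: each row is encoded from the other folds only."""
    encoded = pd.Series(df[target].mean(), index=df.index, dtype=float)
    skf = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=seed)
    for fit_idx, enc_idx in skf.split(df, df[target]):
        fit = df.iloc[fit_idx]
        fold_prior = fit[target].mean()
        stats = fit.groupby(col)[target].agg(["sum", "count"])
        # Shrink each category mean toward the fold's global mean.
        smooth = (stats["sum"] + smoothing * fold_prior) / (stats["count"] + smoothing)
        # Categories unseen in the fitting folds fall back to the fold prior.
        values = df.iloc[enc_idx][col].map(smooth).fillna(fold_prior)
        encoded.iloc[enc_idx] = values.to_numpy()
    return encoded


def add_lag_and_rolling(df, group_col, order_col, value_col, window=3):
    """Add a one-step lag and a trailing rolling mean within each group."""
    out = df.sort_values([group_col, order_col]).copy()
    grouped = out.groupby(group_col)[value_col]
    out[f"{value_col}_lag1"] = grouped.shift(1)
    # Shift before rolling so the window never includes the current row.
    out[f"{value_col}_roll{window}_mean"] = grouped.transform(
        lambda s: s.shift(1).rolling(window, min_periods=1).mean()
    )
    return out
```

Encoding inside the folds keeps a row's own label out of its feature value, which is the usual reason a target-encoded CV score would otherwise look better than the leaderboard score.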

Model Baseline

LightGBM, XGBoost, and CatBoost with default parameters, evaluated with 5-fold CV. Best single model: XGBoost at 0.950 CV. Averaged ensemble: 0.951 CV. Public LB: 0.94735.

Optuna Hyperparameter Tuning

200-trial TPE search per model. Parameter space: n_estimators 200-1200, learning rate 0.005-0.15, max_depth 3-12, subsample/colsample 0.5-1.0. CatBoost was dropped as underperforming (0.947 3-fold CV against 0.950 for LightGBM and XGBoost). Status: currently running.
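The search space above could be expressed for Optuna roughly as follows. This is a sketch under assumptions: `score_fn` is a hypothetical callback that trains a model with the given parameters and returns its CV ROC-AUC, `colsample_bytree` is assumed as the concrete name behind "colsample", and sampling the learning rate on a log scale is a choice made here, since the notes give only the range.

```python
def suggest_params(trial):
    """Sample one configuration from the ranges listed in the notes."""
    return {
        "n_estimators": trial.suggest_int("n_estimators", 200, 1200),
        # Log scale is an assumption; the source states only 0.005-0.15.
        "learning_rate": trial.suggest_float("learning_rate", 0.005, 0.15, log=True),
        "max_depth": trial.suggest_int("max_depth", 3, 12),
        "subsample": trial.suggest_float("subsample", 0.5, 1.0),
        "colsample_bytree": trial.suggest_float("colsample_bytree", 0.5, 1.0),
    }


def run_study(score_fn, n_trials=200, seed=42):
    """Maximise score_fn(params) with Optuna's TPE sampler."""
    import optuna  # imported here so suggest_params carries no hard dependency

    study = optuna.create_study(
        direction="maximize",
        sampler=optuna.samplers.TPESampler(seed=seed),
    )
    study.optimize(lambda trial: score_fn(suggest_params(trial)), n_trials=n_trials)
    return study
```

After a run, `study.best_params` and `study.best_value` hold the best configuration and its score. Running one study per model, as the notes describe, would mean calling `run_study` once for each model with a model-specific `score_fn`.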

Next Steps

Blend with public high-score submissions, pseudo-labeling, seed averaging, final 5-fold OOF ensemble with BayesianRidge stacking.
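Two of the planned steps, BayesianRidge stacking on OOF predictions and seed averaging, are sketched below as one possible shape for that work. The function names and the `predict_with_seed` callback are hypothetical, and the meta-model is cross-validated here only to get a less optimistic estimate of the stacked score.

```python
import numpy as np
from sklearn.linear_model import BayesianRidge
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import KFold, cross_val_predict


def stack_with_bayesian_ridge(oof_matrix, y, test_matrix, n_splits=5, seed=42):
    """Fit a BayesianRidge meta-model on base-model OOF predictions.

    oof_matrix:  (n_train, n_models) out-of-fold predictions
    test_matrix: (n_test, n_models) test predictions from the same models
    """
    meta = BayesianRidge()
    cv = KFold(n_splits=n_splits, shuffle=True, random_state=seed)
    # Cross-validated meta predictions avoid scoring the stack on rows it was fit on.
    meta_oof = cross_val_predict(meta, oof_matrix, y, cv=cv)
    cv_auc = roc_auc_score(y, meta_oof)
    meta.fit(oof_matrix, y)
    # The regression output is not clipped to [0, 1]; ROC-AUC depends only on ranking.
    return meta.predict(test_matrix), cv_auc


def seed_average(predict_with_seed, seeds):
    """Average predictions from the same pipeline run with different seeds."""
    return np.mean([predict_with_seed(s) for s in seeds], axis=0)
```

The columns of `oof_matrix` and `test_matrix` must be in the same model order, since the meta-model learns one weight per column.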

Race Lap Performance
