Overfitting Risk Score
Detecting predictive models that are too good to be true.
Overview
This pillar quantifies the risk that a predictive model is 'overfit', meaning it has memorized past data noise instead of learning the true underlying signal. An overfit model looks great on historical data but fails on future events, making it a hidden trap for traders.
What It Does
It analyzes a model's performance on data it was trained on versus new, unseen data. The pillar also assesses model complexity relative to the amount of data available. A large performance drop or excessive complexity triggers a high-risk score, warning that the model's predictive power is likely an illusion.
Why It Matters
This provides a crucial reality check, protecting you from models with impressive but fragile backtests. It helps distinguish robust, reliable signals from brittle ones that will collapse when faced with new market conditions, saving you from costly errors.
How It Works
The pillar first ingests a model's performance metrics from its training and validation datasets. It then calculates the percentage degradation in the chosen metric (e.g., accuracy) between the two. This is combined with a complexity score, derived from the ratio of model parameters to data points, to generate a final 0-100 risk score.
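A minimal sketch of that pipeline in Python. The 70/30 weighting, the cap on the complexity ratio, and the example numbers are illustrative assumptions, not the pillar's actual formula:

```python
def overfitting_risk_score(train_metric, val_metric, n_params, n_points,
                           w_degradation=0.7, w_complexity=0.3):
    """Blend performance degradation and model complexity into a 0-100 risk score.

    The 70/30 weighting is a hypothetical choice for illustration.
    """
    # Percentage drop from in-sample to out-of-sample performance
    degradation = max(0.0, (train_metric - val_metric) / train_metric)
    # Complexity: parameters per data point, capped at 1.0
    complexity = min(1.0, n_params / n_points)
    score = 100 * (w_degradation * degradation + w_complexity * complexity)
    return min(100.0, score)

# A model that scores 0.92 in-sample but only 0.61 out-of-sample, with
# 500 parameters fit on 2,000 points, earns a substantial risk score.
print(round(overfitting_risk_score(0.92, 0.61, 500, 2000), 1))  # → 31.1
```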
Methodology
The core calculation is the Out-of-Sample Performance Degradation (OOSPD), calculated as: (In-Sample_Metric - Out-of-Sample_Metric) / In-Sample_Metric. This is weighted by a Complexity Penalty, often derived from the Akaike Information Criterion (AIC) or a simple parameter-to-data-point ratio. Analysis typically relies on K-fold cross-validation results to ensure robustness.
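For example, OOSPD can be averaged across K-fold cross-validation results and scaled by a parameter-ratio penalty. The fold accuracies, parameter count, and penalty form below are invented for illustration:

```python
def oospd(in_sample, out_of_sample):
    """Out-of-Sample Performance Degradation for a 'higher is better' metric."""
    return (in_sample - out_of_sample) / in_sample

# Hypothetical (train accuracy, validation accuracy) pairs from 5-fold CV
folds = [(0.90, 0.72), (0.88, 0.70), (0.91, 0.69), (0.89, 0.74), (0.90, 0.71)]
mean_oospd = sum(oospd(tr, va) for tr, va in folds) / len(folds)

# A simple complexity penalty: 1 + parameters per training point (capped),
# assuming 350 parameters fit on 1,500 data points
penalty = 1 + min(1.0, 350 / 1500)
print(round(mean_oospd * penalty, 3))  # → 0.253
```

Averaging over folds, rather than trusting a single train/test split, keeps one lucky or unlucky partition from dominating the degradation estimate.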
Edge & Advantage
It provides a 'second-layer' analysis, evaluating the quality of the prediction source itself, an edge few traders consider. This allows you to fade popular models that are secretly fragile.
Key Indicators
- Out-of-Sample Performance Drop (high): The percentage decrease in a model's accuracy when tested on new data compared to its training data.
- Parameter-to-Data Ratio (medium): Compares the number of adjustable parameters in a model to the number of data points used for training.
- Cross-Validation Variance (medium): Measures how much a model's performance changes across different subsets of the data, with high variance indicating instability.
Data Sources
- Model Backtest Results: Provides the historical performance data (in-sample and out-of-sample) of the predictive model being analyzed.
- Model Specification: Details about the model's architecture, including the number of parameters and features used.
Example Questions This Pillar Answers
- Will this AI-driven stock prediction model maintain its stated accuracy next quarter?
- Is the high success rate of this sports betting algorithm likely to continue through the playoffs?
- What is the risk that a given crypto price model is overfit to the last bull run's data?
Use Overfitting Risk Score on a real market
Run this analytical framework on any Polymarket or Kalshi event contract.
Try PillarLab