📊 Backtesting & Validation Framework
Rigorous statistical validation ensuring strategy robustness before live deployment
Walk-Forward Analysis
Out-of-sample validation methodology
Every strategy undergoes walk-forward analysis with a 12-month in-sample optimization window and a 3-month out-of-sample testing window. The windows advance one month at a time, so consecutive out-of-sample windows overlap, producing a rolling series of out-of-sample results that approximates real-time performance.
This methodology guards against overfitting by ensuring that every performance metric is computed on data the model has never seen during parameter optimization. Only strategies that demonstrate consistent out-of-sample performance are promoted to paper trading.
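As an illustration of the schedule described above, the following is a minimal sketch that generates the 12-month/3-month windows with a one-month step. The names `Window`, `add_months`, and `walk_forward_windows` are hypothetical, and the sketch assumes window boundaries fall on the first day of a month with exclusive end dates; the framework's actual implementation may differ.

```python
from datetime import date
from typing import Iterator, NamedTuple


class Window(NamedTuple):
    train_start: date
    train_end: date   # exclusive
    test_start: date
    test_end: date    # exclusive


def add_months(d: date, months: int) -> date:
    """Shift a date by whole months, snapping to the first of the month."""
    total = d.year * 12 + (d.month - 1) + months
    return date(total // 12, total % 12 + 1, 1)


def walk_forward_windows(start: date, end: date,
                         train_months: int = 12,
                         test_months: int = 3,
                         step_months: int = 1) -> Iterator[Window]:
    """Yield every window whose test period ends on or before `end`."""
    train_start = start
    while True:
        train_end = add_months(train_start, train_months)
        test_end = add_months(train_end, test_months)
        if test_end > end:
            break
        # The test window starts exactly where the training window ends.
        yield Window(train_start, train_end, train_end, test_end)
        train_start = add_months(train_start, step_months)


windows = list(walk_forward_windows(date(2020, 1, 1), date(2022, 1, 1)))
for w in windows[:2]:
    print(w.train_start, w.train_end, w.test_start, w.test_end)
```

Parameters would be optimized on each training span and then scored only on the matching test span, so no metric is computed on data used for fitting.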
Monte Carlo Simulation
10,000-permutation stress testing
Trade sequences are randomly permuted 10,000 times to generate a distribution of possible equity curves. This reveals the range of outcomes attributable to luck vs. skill and provides confidence intervals for key metrics.
95th Percentile Max DD
Maximum drawdown not exceeded in 95% of simulations
Ruin Probability
Percentage of paths hitting 50% drawdown
Profit Factor Range
5th–95th percentile of the ratio of gross profit to gross loss
Recovery Time
Median time to recover from max drawdown
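The permutation procedure and two of the metrics above (95th percentile max drawdown and ruin probability) can be sketched as follows. This is an illustrative example only: it assumes per-trade returns are compounded on a single equity curve, uses a nearest-rank percentile, and all function names are hypothetical. Profit factor range and recovery time are not covered here.

```python
import math
import random


def max_drawdown(returns, start_equity=1.0):
    """Largest peak-to-trough decline of the compounded equity curve."""
    equity = peak = start_equity
    worst = 0.0
    for r in returns:
        equity *= 1.0 + r
        peak = max(peak, equity)
        worst = max(worst, 1.0 - equity / peak)
    return worst


def percentile(sorted_values, q):
    """Nearest-rank percentile of an already sorted list (0 < q <= 1)."""
    idx = max(0, math.ceil(q * len(sorted_values)) - 1)
    return sorted_values[min(idx, len(sorted_values) - 1)]


def monte_carlo_drawdowns(trade_returns, n_paths=10_000,
                          ruin_level=0.50, seed=0):
    """Shuffle the trade order n_paths times and summarize max drawdowns."""
    rng = random.Random(seed)      # seeded so runs are reproducible
    trades = list(trade_returns)   # copy; shuffle works in place
    drawdowns = []
    for _ in range(n_paths):
        rng.shuffle(trades)
        drawdowns.append(max_drawdown(trades))
    drawdowns.sort()
    return {
        "max_dd_p95": percentile(drawdowns, 0.95),
        "max_dd_median": percentile(drawdowns, 0.50),
        "ruin_probability": sum(dd >= ruin_level for dd in drawdowns) / n_paths,
    }
```

Because only the order of trades changes, each path ends at the same final equity; what varies is the path taken to get there, which is what the drawdown and ruin metrics measure.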
Realistic Cost Model
Friction-adjusted performance
Slippage: 0.02% per side for liquid ETFs, 0.05% for individual equities
Commissions: $0.005/share (Alpaca), 0.1% taker fee (crypto)
Spread: half the bid-ask spread applied to each trade at time of signal
Market impact: square-root model for position sizes > 1% of ADV
Borrow cost: short positions include borrow rate (Fed Funds + 0.5%)
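To make the cost components concrete, here is a rough sketch of a per-side cost calculation for an equity trade. The source does not give the impact formula's coefficient or inputs, so the form `coeff * daily_vol * sqrt(order / ADV)` with `coeff = 1.0` is an assumption (a commonly used square-root specification), as are all function and parameter names. Borrow cost for shorts is omitted.

```python
import math


def market_impact(order_shares: float, adv_shares: float, daily_vol: float,
                  coeff: float = 1.0, threshold: float = 0.01) -> float:
    """Square-root impact as a fraction of price.

    Returns zero when the order is at or below `threshold` (1%) of ADV.
    `coeff` and the use of daily volatility are assumptions of this sketch.
    """
    participation = order_shares / adv_shares
    if participation <= threshold:
        return 0.0
    return coeff * daily_vol * math.sqrt(participation)


def per_side_cost(price: float, shares: float, adv_shares: float,
                  daily_vol: float, spread: float,
                  pct_cost: float = 0.0002,           # 0.02% per side (liquid ETF)
                  commission_per_share: float = 0.005) -> float:
    """Estimated dollar cost of one side (entry or exit) of a trade."""
    notional = price * shares
    percentage_cost = pct_cost * notional
    half_spread_cost = 0.5 * spread * shares   # spread is quoted in dollars
    commission = commission_per_share * shares
    impact_cost = market_impact(shares, adv_shares, daily_vol) * notional
    return percentage_cost + half_spread_cost + commission + impact_cost


# 1,000 shares at $100 with a $0.02 spread, 0.1% of ADV: no impact term.
print(round(per_side_cost(100.0, 1_000, 1_000_000, 0.02, 0.02), 2))
```

A round trip would apply this cost twice, once at entry and once at exit, before computing net returns.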
Regime-Conditional Analysis
Performance by market state
All backtests are segmented by HMM regime state (Bull, Transition, Bear). This reveals which strategies perform well in which environments and validates the regime-adaptive allocation logic. A strategy is only approved if it demonstrates positive expectancy in its target regime(s) and does not incur outsized losses in adverse regimes.
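The approval rule can be illustrated with a small sketch that groups trade returns by regime label and checks both conditions. The regime labels are assumed to come from an already fitted HMM; the adverse-regime loss limit of -5% is a placeholder assumption because the source gives no threshold, and the function names are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def expectancy_by_regime(trades):
    """Summarize (regime_label, trade_return) pairs per regime."""
    buckets = defaultdict(list)
    for regime, ret in trades:
        buckets[regime].append(ret)
    return {
        regime: {
            "n_trades": len(rets),
            "expectancy": mean(rets),    # average return per trade
            "total_return": sum(rets),   # simple (non-compounded) sum
        }
        for regime, rets in buckets.items()
    }


def approve(stats, target_regimes, max_adverse_loss=-0.05):
    """Positive expectancy in every target regime, bounded loss elsewhere.

    max_adverse_loss is an assumed placeholder, not a documented limit.
    """
    for regime in target_regimes:
        if regime not in stats or stats[regime]["expectancy"] <= 0:
            return False
    for regime, s in stats.items():
        if regime not in target_regimes and s["total_return"] < max_adverse_loss:
            return False
    return True


trades = [("Bull", 0.02), ("Bull", 0.01), ("Bear", -0.01), ("Transition", 0.0)]
stats = expectancy_by_regime(trades)
print(approve(stats, {"Bull"}))
```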
Statistical Significance
Hypothesis testing framework
t-test on Returns
p < 0.05
Mean return statistically different from zero
Bootstrap Sharpe
95% CI lower bound > 0.5
Lower bound of the bootstrapped 95% confidence interval for the Sharpe ratio exceeds 0.5
Deflated Sharpe Ratio
DSR > 0.95
Accounts for multiple testing, data snooping, and non-normality
Minimum Track Record
200+ trades
Sufficient sample size for statistical reliability
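The first two checks above can be sketched with the standard library alone. This example makes several assumptions: the p-value uses a normal approximation to the t distribution (reasonable at 200+ observations), the Sharpe ratio is annualized with 252 periods per year and a zero risk-free rate, and the bootstrap is a plain percentile bootstrap that ignores serial correlation. All function names are hypothetical, and the Deflated Sharpe Ratio is not covered.

```python
import math
import random
from statistics import NormalDist, mean, stdev


def t_test_mean_zero(returns):
    """Two-sided test that the mean return is zero.

    The p-value uses a normal approximation to the t distribution.
    """
    n = len(returns)
    t_stat = mean(returns) / (stdev(returns) / math.sqrt(n))
    p_value = 2.0 * (1.0 - NormalDist().cdf(abs(t_stat)))
    return t_stat, p_value


def sharpe(returns, periods_per_year=252):
    """Annualized Sharpe ratio with a zero risk-free rate (assumption)."""
    return mean(returns) / stdev(returns) * math.sqrt(periods_per_year)


def bootstrap_sharpe_ci(returns, n_boot=2000, alpha=0.05, seed=0,
                        periods_per_year=252):
    """Percentile bootstrap confidence interval for the Sharpe ratio.

    Resamples with replacement; a resample with zero variance would
    raise ZeroDivisionError, which is not handled in this sketch.
    """
    rng = random.Random(seed)
    n = len(returns)
    samples = sorted(
        sharpe(rng.choices(returns, k=n), periods_per_year)
        for _ in range(n_boot)
    )
    lower = samples[int((alpha / 2) * n_boot)]
    upper = samples[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


def passes_checks(returns, min_sharpe_lower=0.5, max_p=0.05, min_obs=200):
    """Apply the t-test, bootstrap Sharpe, and sample-size thresholds."""
    if len(returns) < min_obs:
        return False
    _, p_value = t_test_mean_zero(returns)
    lower, _ = bootstrap_sharpe_ci(returns)
    return p_value < max_p and lower > min_sharpe_lower
```

Here `min_obs` is applied to the number of return observations as a stand-in for the 200-trade minimum; a trade-level implementation would count closed trades instead.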