Rosa Del Mar

Daily Brief

Issue 58 2026-02-27

Migration Playbook: Phased Automation, Forward Testing, And Reconciliation Loop

8 min read
General
Sources: 1 • Confidence: Medium • Updated: 2026-03-02 13:05

Key takeaways

  • Mabe asserts that validation work effectively begins after the backtest because live trading at tiny size reveals important simulation-to-reality differences.
  • Mabe trades a gapping-stock breakout strategy that enters on a breakout from a narrowing post-open range, places a stop on the opposite side of that tightening range, and holds for the day.
  • For short-selling strategies, Mabe prefers a 'pristine' backtest that includes commissions but does not explicitly model slippage or locate costs, treating those as post-backtest degradations.
  • In fully automated trading, the emotionally difficult discretionary decisions shift from intraday execution to decisions about how to respond to drawdowns.
  • Mabe claims, based on his backtests, that taking partial profits and moving stops to breakeven materially reduces performance versus holding full size to the strategy's natural exit.

Sections

Migration Playbook: Phased Automation, Forward Testing, And Reconciliation Loop

  • Mabe asserts that validation work effectively begins after the backtest because live trading at tiny size reveals important simulation-to-reality differences.
  • Mabe increased automation in stages: automating sizing, then exit orders, then computer-generated entry orders with manual transmit.
  • Mabe reports that live automated results were close to the backtest but not identical because backtests assume fills that are not always achievable in real markets.
  • Mabe built a reconciliation loop that logs trades (including slippage) to an online journal and generates daily reports of missed backtest trades to diagnose live-vs-backtest differences.
  • Mabe advocates going live quickly at small size to observe where issues arise rather than endlessly tweaking a backtest to perfection beforehand.
  • Mabe defines forward testing as taking live trades with very small size while systematically reconciling divergences from the backtest and their causes.
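
The reconciliation loop described above can be sketched as a daily comparison between the trades the backtest took and the trades that were actually filled. This is a minimal illustration, not Mabe's tooling: the `Trade` record, the `reconcile` function, and the matching key (date plus symbol) are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Trade:
    day: str       # trading date, e.g. "2026-02-27"
    symbol: str
    entry: float   # entry price assumed by the backtest, or actually filled


def reconcile(backtest, live, side="long"):
    """Compare backtest trades with live fills (hypothetical helper).

    Returns (missed, unexpected, slippage):
      missed     - backtest trades with no live counterpart
      unexpected - live trades the backtest did not take
      slippage   - per-share entry slippage for matched trades
                   (positive means the live fill was worse)
    """
    bt = {(t.day, t.symbol): t for t in backtest}
    lv = {(t.day, t.symbol): t for t in live}
    missed = [bt[k] for k in sorted(bt.keys() - lv.keys())]
    unexpected = [lv[k] for k in sorted(lv.keys() - bt.keys())]
    sign = 1.0 if side == "long" else -1.0
    slippage = {
        k: round(sign * (lv[k].entry - bt[k].entry), 4)
        for k in sorted(bt.keys() & lv.keys())
    }
    return missed, unexpected, slippage


# Made-up example data: the backtest took two trades, only one was filled live.
backtest = [Trade("2026-02-27", "AAA", 10.00), Trade("2026-02-27", "BBB", 25.50)]
live = [Trade("2026-02-27", "AAA", 10.03)]
missed, unexpected, slippage = reconcile(backtest, live)
print([t.symbol for t in missed])  # symbols to investigate in the daily report
print(slippage)
```

Each missed or unexpected trade is a prompt to find a cause (an unachievable fill, a rule implemented differently, a data difference), which is the diagnostic purpose the brief attributes to the daily report.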

System Definition: Rule Formalization, Sizing Via Fixed Dollar Risk, And Replicability Limits From Filters

  • Mabe trades a gapping-stock breakout strategy that enters on a breakout from a narrowing post-open range, places a stop on the opposite side of that tightening range, and holds for the day.
  • In Mabe's framework, a tightening range reduces stop distance and permits larger share size for the same fixed dollar risk.
  • Mabe required that the strategy be evaluated using expectancy and R-multiples before he made his first day trade.
  • Mabe sizes positions by risking a fixed dollar amount per trade and computing share size from the setup-defined stop distance, increasing the fixed risk amount gradually as confidence grew.
  • Mabe claims traders who believe they trade the same system can have very different results because their discretionary trade-skipping filters differ.
  • Mabe states that converting a discretionary approach into a backtest often starts by stripping discretion and encoding the underlying rules as a purely systematic strategy.
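
The sizing and evaluation ideas in this section reduce to simple arithmetic. The sketch below assumes whole-share sizing and ignores commissions and slippage; the function names and the example prices are illustrative, not taken from the source.

```python
import math


def shares_for_risk(risk_dollars, entry, stop):
    """Whole shares such that a stop-out loses at most risk_dollars (before costs)."""
    stop_distance = abs(entry - stop)
    if stop_distance <= 0:
        raise ValueError("entry and stop must differ")
    return math.floor(risk_dollars / stop_distance)


def r_multiple(entry, stop, exit_price, side="long"):
    """Trade result expressed in units of the initial risk (R)."""
    direction = 1.0 if side == "long" else -1.0
    return direction * (exit_price - entry) / abs(entry - stop)


def expectancy(r_multiples):
    """Average R per trade."""
    return sum(r_multiples) / len(r_multiples)


# A tighter range halves the stop distance and doubles the share size
# for the same fixed dollar risk.
print(shares_for_risk(100, 20.00, 19.50))  # wider range
print(shares_for_risk(100, 20.00, 19.75))  # tighter range
print(expectancy([2.0, -1.0, -1.0, 0.5]))
```

Expressing results in R keeps trades comparable as the fixed dollar risk is raised over time, since the R-multiple of a trade does not depend on how many dollars were at stake.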

Backtesting Methodology: Robustness Over High-Fidelity Execution Simulation

  • For short-selling strategies, Mabe prefers a 'pristine' backtest that includes commissions but does not explicitly model slippage or locate costs, treating those as post-backtest degradations.
  • Mabe claims very tight stops can create overly optimistic backtests due to bar-resolution and entry-bar assumptions about whether a stop could be hit immediately after entry.
  • To reduce curve fitting when tweaking systems, Mabe requires each added rule to be coherent and supported by a backtest with a large number of trades.
  • Mabe asserts that attempting to model slippage and real-world execution perfectly inside a backtest is generally futile because it cannot be captured exactly.
  • Mabe proposes a backtest sanity check: the core strategy should still work without stops or targets before adding them.
  • Mabe claims tick-by-tick backtesting can address some precision issues but is costly and resource-intensive, requiring a cost-benefit tradeoff.
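
One way to gauge the tight-stop problem without tick data is to flag trades where the entry bar also traded through the stop, then bound the result under an optimistic and a pessimistic assumption. This is a sketch under assumed conventions (bar high/low available per trade, results already expressed in R); it is not a method the source describes.

```python
def stop_ambiguous_on_entry_bar(bar_high, bar_low, stop, side="long"):
    """True if the entry bar also traded through the stop, so bar data alone
    cannot tell whether the stop was hit before or after the entry fill."""
    if side == "long":
        return bar_low <= stop
    return bar_high >= stop


def expectancy_bounds(trades):
    """trades: list of (r_multiple, ambiguous) pairs.

    The optimistic bound keeps the backtest's own result; the pessimistic
    bound scores every ambiguous trade as a full -1R stop-out.
    """
    n = len(trades)
    optimistic = sum(r for r, _ in trades) / n
    pessimistic = sum(-1.0 if amb else r for r, amb in trades) / n
    return optimistic, pessimistic


# Made-up results: two of four trades had the stop inside the entry bar.
trades = [(2.0, False), (1.5, True), (-1.0, False), (0.5, True)]
print(expectancy_bounds(trades))
```

A wide gap between the two bounds suggests the backtest depends heavily on the entry-bar assumption, which is where the cost of finer-resolution data may be justified.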

Automation Relocates Discretion To Governance And Drawdown Response

  • In fully automated trading, the emotionally difficult discretionary decisions shift from intraday execution to decisions about how to respond to drawdowns.
  • Predefining drawdown thresholds and actions in advance is presented as a way to reduce emotional decision-making during drawdowns.
  • Mabe claims scaling becomes harder as trade size increases because larger size changes psychology and can reintroduce errors or different emotional mistakes even in automated systems.
  • Mabe claims there are only two ways to build confidence in a trading system, backtesting and long-term repetition of live trading, and that backtesting is a shortcut to the confidence needed to scale size.
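
Predefined drawdown rules can be written down as data before any drawdown occurs. The sketch below measures drawdown in R on the cumulative results and looks up the action committed to in advance. The thresholds and actions in `POLICY` are hypothetical placeholders; the source does not state Mabe's actual values.

```python
def drawdown_r(r_multiples):
    """Return (current_drawdown, max_drawdown) of the cumulative-R equity curve."""
    equity = peak = max_dd = 0.0
    for r in r_multiples:
        equity += r
        peak = max(peak, equity)
        max_dd = max(max_dd, peak - equity)
    return peak - equity, max_dd


# Hypothetical policy, sorted from the largest threshold down:
# (drawdown in R, action agreed on before trading started).
POLICY = [(6.0, "halt and review"), (3.0, "cut risk per trade in half")]


def action_for(current_dd, policy=POLICY):
    """First action whose threshold the current drawdown has reached."""
    for threshold, action in policy:
        if current_dd >= threshold:
            return action
    return "keep trading"


results = [3.0, -1.0, -1.0, 2.0, -1.0, -1.0, -1.0]  # made-up trade results in R
current, worst = drawdown_r(results)
print(current, worst, action_for(current))
```

Because the lookup is mechanical, the decision made during the drawdown is only whether to follow the policy, not what the policy should be.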

Trade Management Claims That Contradict Common Heuristics

  • Mabe claims, based on his backtests, that taking partial profits and moving stops to breakeven materially reduces performance versus holding full size to the strategy's natural exit.
  • Mabe claims backtesting can debunk widely repeated trading advice that contains significant misinformation.
  • Mabe claims stops generally worsen strategy performance in backtests, though he considers stops necessary for practical risk control in live trading.
  • Mabe proposes a backtest sanity check: the core strategy should still work without stops or targets before adding them.
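
Claims like these can be checked by running the same trades under two management rules and comparing results in R. The sketch below handles a long trade over an ordered list of prices after entry and assumes fills exactly at the stop, target, and breakeven prices. The two price paths are invented to show the mechanics and do not reproduce or support Mabe's result; a real comparison needs a large sample of trades.

```python
def hold_to_exit(path, entry, stop):
    """R result for a long trade held at full size until the stop or the last price."""
    risk = entry - stop
    for price in path:
        if price <= stop:
            return -1.0
    return (path[-1] - entry) / risk


def partial_and_breakeven(path, entry, stop, target_r=1.0, fraction=0.5):
    """Sell `fraction` at +target_r, then move the stop on the remainder to entry."""
    risk = entry - stop
    target = entry + target_r * risk
    scaled = False
    for price in path:
        if not scaled:
            if price <= stop:
                return -1.0
            if price >= target:
                scaled = True            # partial profit taken at the target
        elif price <= entry:
            return fraction * target_r   # remainder exits at breakeven
    if scaled:
        return fraction * target_r + (1 - fraction) * (path[-1] - entry) / risk
    return (path[-1] - entry) / risk


# Invented paths, entry 10.00 and stop 9.00 (1R = 1.00 per share).
trend_day = [10.5, 11.0, 10.0, 12.0, 13.0]   # dips to entry, then runs
failed_day = [10.5, 11.2, 10.6, 9.0]         # reaches +1R, then stops out
for path in (trend_day, failed_day):
    print(hold_to_exit(path, 10.0, 9.0), partial_and_breakeven(path, 10.0, 9.0))
```

The two paths pull in opposite directions, which is why the question has to be settled by expectancy over many trades rather than by individual examples.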

Unknowns

  • What specific instruments/universes, liquidity constraints, and time periods do the strategy and backtests cover?
  • What were the exact automation changes and metrics used to attribute improvements (error rate, latency, slippage, fill rate), beyond qualitative description?
  • What drawdown thresholds and intervention actions are used in practice, and how often are they triggered?
  • How are trade-skipping filters defined, and which of them can be reliably codified without creating new overfitting risks?
  • What is the quantitative magnitude of backtest biases from tight stops under different bar-resolution and fill assumptions for this strategy class?

Investor overlay

Read-throughs

  • Systematic trading performance may depend more on post-backtest validation and live reconciliation than on adding high-fidelity execution assumptions to backtests.
  • Automation shifts risk from discretionary execution errors toward governance errors, especially drawdown response rules and scaling decisions.
  • Common intraday trade-management heuristics such as partial profits and breakeven stops may reduce edge for some breakout day-trading strategies, implying that management rules should be treated as testable hypotheses.

What would confirm

  • Forward test results at small size converge toward backtest behavior after reconciliation of missed trades and rule implementation gaps, with documented reductions in unexplained variance.
  • Precommitted drawdown thresholds and intervention actions are defined, tracked, and shown to reduce human inconsistency without materially degrading strategy expectancy.
  • Ablation tests show that partial profits and moving stops to breakeven reduce returns versus holding to the strategy exit, and the result persists across many trades and reasonable assumptions.

What would kill

  • Live trading at small size diverges persistently from backtests even after reconciliation, suggesting unmodeled execution effects or ambiguous rules dominate outcomes.
  • Strategy performance is highly sensitive to discretionary filters that cannot be reliably codified, preventing reproducible automation.
  • Tight-stop and bar-resolution assumptions materially change results, and robustness tests show the edge disappears under plausible fill and slippage conditions.
