Pennant results matrix — column glossary

Authoritative definitions for all 23 columns in results_matrix.csv. Each entry gives the type, a plain-English meaning, the units or format, when the cell is empty versus zero, and caveats worth knowing before reading or filling the column. Where a column references another artifact (registry, report), the entry points to it.

This glossary is the source of truth for column semantics. If the glossary and a row disagree, the glossary wins and the row is buggy — fix the row.

1. `run_id`

Type: string
Format: PEN-TEST-NNN or PEN-TEST-NNN<a-z> for sub-rows
Definition: Unique identifier for one test row. Sequential across all tests, reserved before the test runs. A single conceptual test that compares multiple cohorts or variants gets one base PEN-TEST-NNN plus suffix letters per sub-row (PEN-TEST-004a, 004b, … 004e). The suffix is lowercase.
Empty vs zero: never empty. Required field.
Caveats: Never reused. Skipping an ID is fine; reusing one breaks the audit trail. The matching test directory at tests/<date>_<run_id>/ may collapse multiple sub-rows into one directory (PEN-TEST-004a–e share tests/2026-05-11_PEN-TEST-004/); the suffix is for the matrix, not the filesystem.

2. `run_date`

Type: string
Format: ISO date YYYY-MM-DD
Definition: Date the test produced its final headline result. For multi-day tests, the completion date. For sub-rows produced by the same execution, all share the same date.
Empty vs zero: never empty. Required field.
Caveats: Not the date the row was added to the matrix. If a result is back-filled later, the original run date is used.

3. `purpose`

Type: string
Format: short phrase, sentence-case, no trailing period
Definition: Plain-English answer to "what were we trying to learn?" A single cell in this column should be intelligible without reading the report.
Empty vs zero: never empty. Required field.
Caveats: Keep terse; full context lives in the report. Avoid result claims here ("V2 is best") — those go in notes or key_metric once measured.

4. `period`

Type: string
Format: YYYY-YYYY (e.g. 2007-2026)
Definition: Calendar window the test covers. For detection scans, the date range of events emitted; for backtests, the date range of equity-curve simulation.
Empty vs zero: never empty.
Caveats: Year-resolution only. The actual scan/backtest date range may be a partial year on either end (2026 data through May 8 only). Report contains the precise range.
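A minimal sketch of reading this column, assuming only the YYYY-YYYY format stated above. The function name is hypothetical, and the start-not-after-end check is an added sanity check rather than a documented rule.

```python
import re

PERIOD_RE = re.compile(r"(\d{4})-(\d{4})")


def parse_period(period: str) -> tuple[int, int]:
    """Split a period cell such as '2007-2026' into (start_year, end_year)."""
    match = PERIOD_RE.fullmatch(period)
    if match is None:
        raise ValueError(f"malformed period: {period!r}")
    start, end = int(match.group(1)), int(match.group(2))
    if start > end:
        # Assumption: a window cannot end before it starts.
        raise ValueError(f"period runs backwards: {period!r}")
    return start, end
```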

5. `detection_id`

Type: string
Format: PEN-DET-<label> (lowercase short label, no embedded date)
Definition: ID of the detection-parameter set that produced the event cohort consumed by this row. References an entry in strategies/Pennant/registry.md Detection-variants table. The corresponding cohort parquets live at cohorts/DET-<UPPER>-<scan-date>/.
Empty vs zero: never empty for any test (everything consumes some detection variant; even baseline counts).
Caveats: Detection IDs are parameter-set IDs, not cohort IDs. If the same parameter set is re-scanned (different scan date), the detection ID stays the same; the cohort directory gets a new scan date in its directory name. The registry shows the latest cohort per detection ID.
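The relationship between a detection ID and its cohort directory can be sketched as below. The helper name is hypothetical, and the scan date is passed through untouched because this glossary does not state its format.

```python
def cohort_dir(detection_id: str, scan_date: str) -> str:
    """Map PEN-DET-<label> plus a scan date to cohorts/DET-<UPPER>-<scan-date>/."""
    prefix = "PEN-DET-"
    if not detection_id.startswith(prefix):
        raise ValueError(f"not a detection ID: {detection_id!r}")
    label = detection_id[len(prefix):]
    if not label or label != label.lower():
        raise ValueError(f"label must be non-empty and lowercase: {detection_id!r}")
    # The same parameter set re-scanned later keeps its ID; only scan_date changes.
    return f"cohorts/DET-{label.upper()}-{scan_date}/"
```

With a made-up label and date, `cohort_dir("PEN-DET-v2", "2026-05-08")` gives `cohorts/DET-V2-2026-05-08/`.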

6. `strategy_id`

Type: string
Format: PEN-<asset>-<NNN> (e.g. PEN-STOCK-001, PEN-OPT-001)
Definition: ID of the trading strategy variant simulated. Each has a locked spec at strategies/<id>.md registered in strategies/Pennant/registry.md.
Empty vs zero: empty for detection-only tests and population analyses (no strategy simulated). Required for backtests.
Caveats: A change to mechanics — different sizing, different exit thresholds, options overlay — requires a new strategy ID, not a parameter on the existing one. This is the lock-once-write rule.

7. `precursor`

Type: string
Format: one of none, rule1, rule2, rule1_or_rule2
Definition: Precursor filter applied at entry / event time. Rule 1 (Momentum) and Rule 2 (Breakout) are the surviving 5-y / 10-y / 20-y precursor profiles from the original Phase 1 pennant findings; documented in build_v1/reports/findings_report.md and the trading action plan. none means the unconditional population. rule1_or_rule2 means at least one of the two rules fires.
Empty vs zero: never empty; none is the explicit value.
Caveats: None of the Pennant-era tests (PEN-TEST-001..005) have used precursor filtering yet — every row is none. Future tests that apply Rule 1/2 will populate this column.

8. `regime_filter`

Type: string
Format: one of none, spy200+vix35, vix_vvix
Definition: Market-regime gate applied at entry time.
none — entries unconditional on market state.
spy200+vix35 — skip entries on days where SPY < SPY-200-SMA and VIX > 35 (the Phase 7 "circuit breaker" gate; 240 such days in the 2007–2026 calendar).
vix_vvix — placeholder for the VIX × VVIX joint gate explored in the VolGap call-only family; not yet used in any Pennant-line test.
Empty vs zero: never empty; none is the explicit value.
Caveats: Detection-only and population-analysis tests are none because the regime gate is a strategy concept, not a detector one. Backtests in PEN-TEST-004 use spy200+vix35.
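As a rough illustration of the gate logic described above (not the harness's actual code), the decision for one candidate entry day might look like this. The function and parameter names are hypothetical.

```python
def entry_allowed(regime_filter: str, spy_close: float,
                  spy_sma200: float, vix_close: float) -> bool:
    """Return True if an entry may be taken on this day under the given gate."""
    if regime_filter == "none":
        return True
    if regime_filter == "spy200+vix35":
        # Skip only when BOTH conditions hold: SPY below its 200-day SMA and VIX above 35.
        return not (spy_close < spy_sma200 and vix_close > 35.0)
    if regime_filter == "vix_vvix":
        # Placeholder value in the glossary; no rule is defined for it here.
        raise NotImplementedError("vix_vvix gate is not specified")
    raise ValueError(f"unknown regime_filter: {regime_filter!r}")
```

Note that a day with SPY below its 200-day SMA but VIX at 30 still allows entries, because the gate requires both conditions.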

9. `trades`

Type: int
Units: count (no thousands separators in CSV)
Definition: Polysemous — meaning depends on test type:
Backtest rows: number of trades taken (= entries that passed cash + regime gates). Not the number of detected events; not the number of cohort rows.
Detection-only rows: number of pennants detected. Equal to the row count of the cohort's events.parquet.
Population analyses: number of patterns analyzed (= cohort rows with usable forward outcomes). For PEN-TEST-005, this is 15,528 — 6 fewer than the 15,534 detected events, because 6 had no forward data.
Empty vs zero: never empty for retroactive rows; future rows could be empty if the test is a pure documentation exercise.
Caveats: Do not compare across test types without understanding the units. When a backtest shows 4,533 trades against a detection cohort of 5,155 events, the gap is the entries skipped (no cash, regime-gated), not a quality difference.

10. `win_rate`

Type: float
Units: percent (decimal value, e.g. 43.7 for 43.7 %)
Definition: % of completed trades that closed profitable (P&L > 0). Computed per the canonical backtest harness; the exact accounting (after-friction vs gross, partial fills as separate trades or aggregated) is whatever the report defines.
Empty vs zero: empty for non-backtests (no trades to win or lose). Zero would mean the strategy ran 1+ trades and every one lost — a measured outcome, distinct from "doesn't apply".
Caveats: Breakout / continuation strategies typically have win rates in the 40–55 % range — this is structural, not a defect. The PEN-STOCK-001 scaled-exit takes a half-exit at +15 % and trails the runner, so the per-trade outcome is asymmetric. Win-rate alone is not a quality metric for this class of strategy; pair it with profit_factor or sharpe.
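A minimal sketch of the empty-versus-zero distinction for this column, assuming a plain list of per-trade P&L values. The report's exact accounting (frictions, partial fills) is not modelled, and the function name is hypothetical.

```python
from typing import Optional, Sequence


def win_rate(pnls: Sequence[float]) -> Optional[float]:
    """Percent of trades with P&L > 0; None (empty cell) when there are no trades."""
    if not pnls:
        return None  # "doesn't apply": written to the CSV as an empty cell
    wins = sum(1 for pnl in pnls if pnl > 0)
    return 100.0 * wins / len(pnls)
```

`win_rate([])` returns `None`, while `win_rate([-1.0, -2.0])` returns `0.0`, a measured outcome.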

11. `cagr_pct`

Type: float
Units: percent (annualized)
Definition: Compound annual growth rate of equity over the period. Computed as ((final_equity / starting_capital) ^ (1 / years) − 1) × 100, so the stored value is a percent.
Empty vs zero: empty for non-backtests. Zero means a measured CAGR of 0 % (strategy ran but ended at starting capital).
Caveats: Sensitive to choice of starting capital and to whether dividends / cash drift are included. The Pennant-line backtests use $10K starting capital and ignore SPY drift on idle cash. Compare CAGRs only across backtests with matching capital + cash conventions.
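The formula can be sketched in a few lines. The result is scaled by 100 because the column is stored in percent. How `years` is measured (calendar days, trading days) is not specified here and is left to the caller.

```python
def cagr_pct(final_equity: float, starting_capital: float, years: float) -> float:
    """Compound annual growth rate in percent."""
    if starting_capital <= 0 or years <= 0:
        raise ValueError("starting_capital and years must be positive")
    growth = final_equity / starting_capital
    return (growth ** (1.0 / years) - 1.0) * 100.0
```

As a made-up check, quadrupling capital over two years is a doubling per year, so `cagr_pct(40000.0, 10000.0, 2.0)` is 100 (up to floating-point rounding).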

12. `total_return_pct`

Type: float
Units: percent (cumulative, not annualized)
Definition: (final_equity − starting_capital) / starting_capital × 100. The headline "what did $10K become" number.
Empty vs zero: empty for non-backtests. Zero means the strategy returned exactly the starting capital.
Caveats: Like cagr_pct, depends on starting capital and cash drift. Useful side-by-side with CAGR for sanity check: (1 + total/100) ^ (1/years) should equal 1 + cagr/100.
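The sanity check in the caveat can be automated. This sketch assumes both columns are percent values; the tolerance is an arbitrary choice to absorb rounding in the stored figures, and the function name is hypothetical.

```python
import math


def returns_consistent(total_return_pct: float, cagr_pct: float,
                       years: float, tol_pct: float = 0.05) -> bool:
    """Check that (1 + total/100) ^ (1/years) matches 1 + cagr/100 within tol_pct."""
    implied_cagr_pct = ((1.0 + total_return_pct / 100.0) ** (1.0 / years) - 1.0) * 100.0
    return math.isclose(implied_cagr_pct, cagr_pct, abs_tol=tol_pct)
```

A row whose two values fail this check most likely has a wrong `years` figure or a mistyped number.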

13. `final_equity`

Type: float
Units: dollars (no $ sign, no comma, raw number)
Definition: Equity at the end of the period. The dollar amount $10K turned into.
Empty vs zero: empty for non-backtests. Zero would mean a wipeout (PEN-STOCK-001 can't wipe out — the worst case is a drift down with no leverage).
Caveats: All Pennant-line backtests start at $10,000. Future variants might start elsewhere; if so, document in notes.

14. `max_dd_pct`

Type: float (negative)
Units: percent
Definition: Largest peak-to-trough drawdown in equity over the period, expressed as a negative percent of the high-water mark at the peak.
Empty vs zero: empty for non-backtests. Zero would mean the equity curve was monotonically non-decreasing — possible but unlikely over a 19-year window.
Caveats: Drawdown is path-dependent. Two strategies with identical CAGR can have very different max-DD. Pair with CAGR to compute the MAR ratio (cagr_pct / abs(max_dd_pct)) — values above 0.5 are good, above 1.0 are exceptional, below 0.2 mean the equity curve has poor sequence properties.
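A sketch of the drawdown and MAR computations under the definitions above, assuming the equity curve is a non-empty sequence of positive values in time order. Function names are hypothetical.

```python
from typing import Sequence


def max_dd_pct(equity: Sequence[float]) -> float:
    """Largest peak-to-trough drawdown as a negative percent of the running peak."""
    peak = equity[0]
    worst = 0.0
    for value in equity:
        peak = max(peak, value)                    # high-water mark so far
        drawdown = (value - peak) / peak * 100.0   # zero or negative
        worst = min(worst, drawdown)
    return worst


def mar_ratio(cagr_pct: float, max_dd_pct: float) -> float:
    """MAR = cagr_pct / abs(max_dd_pct); undefined when there was no drawdown."""
    if max_dd_pct == 0:
        raise ValueError("MAR is undefined for a zero drawdown")
    return cagr_pct / abs(max_dd_pct)
```

For the made-up curve `[100, 120, 90, 130]` the drawdown is measured from the 120 peak to the 90 trough, giving -25.0.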

15. `sharpe`

Type: float
Units: dimensionless ratio (annualized)
Definition: Annualized Sharpe ratio of daily equity returns. Risk-free rate assumed zero (the convention in the Pennant backtest harness — both the strategy and the benchmark would see the same risk-free, so it cancels in apples-to-apples comparisons).
Empty vs zero: empty for non-backtests. Zero would mean measured mean return = 0 with positive variance.
Caveats: Sharpe penalizes upside variance equally with downside variance, so a strategy with frequent +30 % winners can score worse on Sharpe than a strategy with steady +5 % winners. For breakout strategies prefer the Sortino ratio (when reported in key_metric) or the MAR ratio.
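One common way to compute an annualized Sharpe ratio with a zero risk-free rate is sketched below. The 252-day annualization factor and the sample standard deviation are assumptions; this glossary does not say which conventions the harness uses.

```python
import math
import statistics
from typing import Optional, Sequence


def sharpe(daily_returns: Sequence[float], periods_per_year: int = 252) -> Optional[float]:
    """Annualized Sharpe of daily returns, risk-free rate taken as zero."""
    if len(daily_returns) < 2:
        return None  # not enough data to estimate variance
    mean = statistics.fmean(daily_returns)
    stdev = statistics.stdev(daily_returns)  # sample standard deviation (assumption)
    if stdev == 0:
        return None  # undefined for a constant return series
    return mean / stdev * math.sqrt(periods_per_year)
```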

16. `profit_factor`

Type: float
Units: dimensionless ratio
Definition: Sum of profits divided by absolute sum of losses across all closed trades. Values > 1.0 mean profitable; intuition: a PF of 1.5 means $1.50 won per $1.00 lost.
Empty vs zero: empty for non-backtests. Zero would mean no winning trades — implausible but theoretically possible.
Caveats: Not reported by the PEN-TEST-004 harness — that test's headline summary tracks Sharpe and max-DD instead. Left empty in all current rows; future backtests should populate.
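For future backtests that populate this column, the definition can be sketched as follows. Returning `None` when there are no losing trades is an assumption, since the glossary does not define that case. The function name is hypothetical.

```python
from typing import Optional, Sequence


def profit_factor(pnls: Sequence[float]) -> Optional[float]:
    """Sum of profits divided by the absolute sum of losses over closed trades."""
    if not pnls:
        return None  # no trades: empty cell
    gross_profit = sum(pnl for pnl in pnls if pnl > 0)
    gross_loss = -sum(pnl for pnl in pnls if pnl < 0)
    if gross_loss == 0:
        return None  # assumption: ratio left empty when nothing was lost
    return gross_profit / gross_loss
```

`profit_factor([150.0, -100.0])` gives 1.5, matching the "$1.50 won per $1.00 lost" reading.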

17. `mean_mfe_pct`

Type: float
Units: percent
Definition: Mean of the forward 30-trading-day Maximum Favorable Excursion across the cohort. MFE = highest forward close vs anchor close, in percent. Available for detection-only and population tests; not meaningful for backtests (the strategy may exit before MFE is reached).
Empty vs zero: empty for backtest rows. Zero would mean no event in the cohort ever reached a positive forward close — not observed.
Caveats: Heavily skewed by penny-stock pops in the right tail (max observed: +1,795 %). Use the median (in the report) for a more robust central value; the mean reflects the tail.

18. `mean_mae_pct`

Type: float (negative)
Units: percent
Definition: Mean of the forward 30-trading-day Maximum Adverse Excursion. MAE = lowest forward close vs anchor close, in percent (so usually negative).
Empty vs zero: empty for backtest rows. Zero would mean a measured cohort mean of exactly 0 % (positive and negative MAE values cancelling), which has not been observed.
Caveats: Like mean_mfe_pct, skewed by a tail, here the left one (min observed: -98.6 %). The median is more robust. A positive MAE means the stock never closed below anchor in the window; this does happen (~25 % of huge-winner cluster patterns).
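The per-event MFE and MAE definitions can be sketched together, assuming `forward_closes` holds the closes after the anchor day in time order. The function name is hypothetical; the `None` return mirrors events with no forward data, which are excluded from population counts.

```python
from typing import Optional, Sequence, Tuple


def mfe_mae_pct(anchor_close: float, forward_closes: Sequence[float],
                window: int = 30) -> Optional[Tuple[float, float]]:
    """Return (MFE %, MAE %) over the first `window` forward closes, or None."""
    closes = forward_closes[:window]
    if not closes:
        return None  # no usable forward outcome for this event
    mfe = (max(closes) / anchor_close - 1.0) * 100.0
    mae = (min(closes) / anchor_close - 1.0) * 100.0
    return mfe, mae
```

If every forward close sits above the anchor, the returned MAE is positive, which is the case the caveat describes.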

19. `hit_rate_15pct_mfe`

Type: float
Units: percent
Definition: % of events in the cohort whose MFE reached ≥ +15 % at any point in the forward 30-trading-day window. The "did the move happen at all?" metric, independent of whether it stuck.
Empty vs zero: empty for backtest rows. Zero would mean not a single event in the cohort hit +15 %.
Caveats: The 15 % threshold matches the PEN-STOCK-001 Leg 1 exit. Other reports may quote +5 %, +10 %, +20 %, etc.; only the 15 % column lives in the matrix. The companion give-back metric (median 68 % — see PEN-TEST-005's key_metric) tells you that hit-rate ≠ retained gain.
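Given per-event MFE values in percent, the hit rate reduces to a threshold count. A minimal sketch with a hypothetical function name:

```python
from typing import Optional, Sequence


def hit_rate_pct(mfe_values: Sequence[float], threshold_pct: float = 15.0) -> Optional[float]:
    """Percent of events whose MFE reached at least threshold_pct."""
    if not mfe_values:
        return None  # empty cohort: empty cell
    hits = sum(1 for mfe in mfe_values if mfe >= threshold_pct)
    return 100.0 * hits / len(mfe_values)
```

The comparison is inclusive, so an event with an MFE of exactly 15.0 counts as a hit.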

20. `key_metric`

Type: string
Format: semicolon-separated name: value pairs, free-form per test
Definition: Flexible field for whatever's most distinctive about the test that the standard columns don't capture. Examples:
Population analyses: huge_winner_share_cluster2: 2.8%; give_back_median: 68%
Detection variants: per_pattern_expectancy: +4.9% vs baseline
Future Rule-1/Rule-2 tests: rule1_population_size: 4521; rule1_lift_vs_baseline: +X pp
Empty vs zero: empty when the standard columns capture everything important.
Caveats: Not machine-parsed. If a value here becomes important enough to compare across tests, it earns a dedicated column.

21. `report_link`

Type: string
Format: relative path from Pennant/ root
Definition: Path to the canonical detailed report markdown. Example: reports/pennant_strategy_backtest_2026-05-11.md.
Empty vs zero: never empty for completed tests.
Caveats: Always relative to Pennant/, never absolute, never with a leading slash. The same report can be referenced by multiple rows (PEN-TEST-004a–e all link to reports/pennant_strategy_backtest_2026-05-11.md).

22. `charts`

Type: string
Format: semicolon-separated relative paths from Pennant/ root (no spaces around the ;)
Definition: Charts associated with this row. In the rendered HTML / markdown publishing layer these become numbered hyperlinks ([1], [2], …). One row can carry many charts (PEN-TEST-005 has 8). All paths must point to files that exist under charts/.
Empty vs zero: empty if the test produced no chart (Phase 11a/a-2/a-3 detection-only A/B reports are pure text). Never the literal string none.
Caveats: Validator does not currently check that the files exist on disk. If you mistype a path, the reference is silently broken. Use the same path the report uses (relative to Pennant/).
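Since the validator does not check chart paths, a manual check before committing a row might look like the sketch below. It is not part of update_matrix.py; the function name and the `pennant_root` argument are hypothetical.

```python
from pathlib import Path


def missing_charts(charts_cell: str, pennant_root: str) -> list[str]:
    """Return the chart paths in a `charts` cell that do not exist under pennant_root."""
    if not charts_cell:
        return []  # empty cell: the test produced no chart
    root = Path(pennant_root)
    # Paths are separated by ';' with no surrounding spaces, relative to Pennant/.
    return [rel for rel in charts_cell.split(";") if not (root / rel).is_file()]
```

An empty list means every referenced file was found.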

23. `notes`

Type: string
Format: free-form, may contain commas (CSV-quoted on write)
Definition: Anything else worth recording that doesn't fit the structured columns. Constraints / caveats specific to this row; reasons a number looks weird; references to related tests; failure-mode commentary.
Empty vs zero: empty when nothing else needs saying.
Caveats: Not machine-parsed. Keep prose terse; longer context belongs in the report. Avoid duplicating information already in key_metric or purpose.

Conventions across all columns

Empty cells indicate "doesn't apply" or "not measured", not "missing data" and not "zero". The publishing layer renders empty cells as blank, not as n/a.
All percentages are stored as percent values, not fractions (43.7 means 43.7 %; the fraction form 0.437 is never used).
All dollar values are raw numbers (40398, not $40,398 or 40,398.00).
All paths are relative to Pennant/, no leading slash, forward slashes only.
All IDs are case-sensitive: PEN-TEST-001a ≠ pen-test-001a.
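When loading the matrix, these conventions imply that an empty cell and a zero must stay distinct. A minimal reading sketch using only the standard library follows. The sample rows are made up for illustration, only two numeric columns are converted, and the names are hypothetical.

```python
import csv
import io
from typing import Optional

FLOAT_COLUMNS = ("win_rate", "final_equity")  # subset, for illustration only

SAMPLE = (
    "run_id,win_rate,final_equity,notes\n"
    'PEN-TEST-900a,0,40398,"made-up row, for illustration"\n'
    "PEN-TEST-900b,,,\n"
)


def parse_float_cell(cell: str) -> Optional[float]:
    """Empty string -> None ("doesn't apply" / "not measured"); otherwise a float."""
    return None if cell == "" else float(cell)


def read_rows(csv_text: str) -> list[dict]:
    rows = []
    for raw in csv.DictReader(io.StringIO(csv_text)):
        row = dict(raw)
        for column in FLOAT_COLUMNS:
            row[column] = parse_float_cell(row[column])
        rows.append(row)
    return rows
```

In the sample, the first row's `win_rate` parses to `0.0` (measured) and the second row's to `None` (empty), and the quoted comma in `notes` survives intact.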

Updating the schema

If a new column is needed: (1) edit SCHEMA in infra/update_matrix.py; (2) migrate existing rows by reading the CSV into Python, adding the new key with an appropriate default, and writing the file back; (3) add a glossary entry here. The _validate() function in update_matrix.py rejects rows with extra or missing keys, so the CSV and schema must stay in lockstep.
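The migration step can be sketched with the standard csv module. This does not touch SCHEMA or _validate(), whose shapes are not shown here. It works on CSV text so it can be tried without touching the real file, and the column name in the example is hypothetical.

```python
import csv
import io


def add_column(csv_text: str, new_column: str, default: str = "") -> str:
    """Return csv_text with new_column appended and filled with `default` in every row."""
    reader = csv.DictReader(io.StringIO(csv_text))
    old_fields = list(reader.fieldnames or [])
    if new_column in old_fields:
        raise ValueError(f"column already exists: {new_column!r}")
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=old_fields + [new_column],
                            lineterminator="\n")
    writer.writeheader()
    for row in reader:
        row[new_column] = default  # empty string = "doesn't apply / not measured"
        writer.writerow(row)
    return out.getvalue()
```

Defaulting to an empty string rather than `0` keeps back-filled rows consistent with the empty-versus-zero convention. Fields containing commas are re-quoted on write.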
