Portofolio_maximizer

Name: Portofolio_maximizer
Rating: 3.5 (1 reviews)
Author: mrbestnaija

ML for Quantitative trading

Install

pip install -r

README

# Portfolio Maximizer – Autonomous Profit Engine

[![Python 3.10-3.12](https://img.shields.io/badge/python-3.10--3.12-blue.svg)](https://www.python.org/downloads/)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
[![Phase 7.9 In Progress](https://img.shields.io/badge/Phase%207.9-In%20Progress-blue.svg)](Documentation/EXIT_ELIGIBILITY_AND_PROOF_MODE.md)
[![Tests: 731](https://img.shields.io/badge/tests-731%20(718%20passing)-success.svg)](tests/)
[![Documentation](https://img.shields.io/badge/docs-comprehensive-informational.svg)](Documentation/)
[![Research Ready](https://img.shields.io/badge/research-reproducible-purple.svg)](#-research--reproducibility)

> End-to-end quantitative automation that ingests data, forecasts regimes, routes signals, and executes trades hands-free with profit as the north star.

**Version**: 4.2
**Status**: Phase 7.9 In Progress - Cross-session persistence, proof-mode validation, UTC normalization
**Last Updated**: 2026-02-09

---

## 🎯 Overview

Portfolio Maximizer is a self-directed trading stack that marries institutional-grade ETL with autonomous execution. It continuously extracts, validates, preprocesses, forecasts, and trades financial time series so profit-focused decisions are generated without human babysitting.

### Current Phase & Scope (Jan 2026)

**Phase 7.8 Complete** - All-Regime Weight Optimization:

- **3/6 regimes optimized** with SAMOSSA-dominant weights:
  - **CRISIS**: 60.69% RMSE improvement (17.15 → 6.74), 72% SAMOSSA
  - **MODERATE_MIXED**: 6.30% improvement (17.63 → 16.52), 73% SAMOSSA
  - **MODERATE_TRENDING**: 65.07% improvement (20.86 → 7.29), 90% SAMOSSA
- **Key Finding**: SAMOSSA dominates all three optimized regimes (72-90% weight), contradicting the initial hypothesis that GARCH would be optimal for CRISIS
- **Method**: Rolling cross-validation with scipy.optimize.minimize (3+ years of AAPL data)
- **Validation**: 2/20 holdout audits complete
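
The weight search described above can be illustrated with a minimal sketch. It assumes the optimizer minimizes ensemble RMSE over weights constrained to be non-negative and sum to 1; the function names, the SLSQP method choice, and the synthetic data are illustrative assumptions rather than the project's actual code, and the rolling cross-validation folds are omitted for brevity.

```python
import numpy as np
from scipy.optimize import minimize


def rmse(y_true: np.ndarray, y_pred: np.ndarray) -> float:
    """Root-mean-square error between actuals and predictions."""
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


def optimize_weights(forecasts: np.ndarray, actuals: np.ndarray) -> np.ndarray:
    """Find simplex-constrained ensemble weights that minimize RMSE.

    forecasts has shape (n_samples, n_models); one column per model.
    """
    n_models = forecasts.shape[1]
    start = np.full(n_models, 1.0 / n_models)  # start from equal weights
    result = minimize(
        lambda w: rmse(actuals, forecasts @ w),
        start,
        method="SLSQP",  # assumption: any constrained solver would do
        bounds=[(0.0, 1.0)] * n_models,
        constraints=[{"type": "eq", "fun": lambda w: np.sum(w) - 1.0}],
    )
    return result.x


# Synthetic stand-in data: three "models" with different noise levels.
rng = np.random.default_rng(0)
actuals = rng.normal(100.0, 5.0, size=200)
forecasts = np.column_stack([
    actuals + rng.normal(0.0, 4.0, size=200),  # noisy model
    actuals + rng.normal(0.0, 1.0, size=200),  # accurate model
    actuals + rng.normal(0.0, 6.0, size=200),  # noisiest model
])

weights = optimize_weights(forecasts, actuals)
equal_rmse = rmse(actuals, forecasts @ np.full(3, 1.0 / 3.0))
tuned_rmse = rmse(actuals, forecasts @ weights)
print("weights:", np.round(weights, 3))
print(f"RMSE equal={equal_rmse:.3f} tuned={tuned_rmse:.3f}")
```

In a rolling cross-validation setup, the same fit would presumably be repeated per fold and the resulting weights or errors aggregated per regime.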

**Phase 7.9 In Progress** - Holdout Audit Accumulation:

- Current: 9/20 audits complete
- Target: 20 audits for production deployment decision
- 3 regimes not optimized (insufficient samples): HIGH_VOL_TRENDING, MODERATE_RANGEBOUND, LIQUID_RANGEBOUND

**System Architecture**:
- Regime-aware ensemble routing with adaptive model selection
- 4 forecasting models: SARIMAX, GARCH, SAMOSSA, MSSA-RL
- Quantile-based confidence calibration (Phase 7.4)
- Rolling cross-validation optimization framework
- Comprehensive logging with phase-organized structure

### Key Features

- **🚀 Intelligent Caching**: 20x speedup with cache-first strategy (24h validity)
- **📊 Advanced Analysis**: MIT-standard time series analysis (ADF, ACF/PACF, stationarity)
- **📈 Publication-Quality Visualizations**: 8 professional plots with 150 DPI quality
- **🔄 Robust ETL Pipeline**: 4-stage pipeline with comprehensive validation
- **✅ Comprehensive Testing**: 141+ tests with high coverage across ETL, LLM, and integration modules
- **⚡ High Performance**: Vectorized operations, Parquet format (10x faster than CSV)
- **🧠 Modular Orchestration**: Dataclass-driven pipeline runner coordinating CV splits, neural/TS stages, and ticker discovery with auditable logging
- **🔐 Resilient Data Access**: Hardened Yahoo Finance extraction with pooling to reduce transient failures
- **🤖 Autonomous Profit Engine**: `scripts/run_auto_trader.py` keeps the signal router + trading engine firing so positions are sized and executed automatically

---

### Latest Enhancements (Jan 2026)

**Phase 7.8 Achievements**:

- All-regime weight optimization (3/6 regimes) with ~60-65% RMSE improvement for CRISIS/MODERATE_TRENDING and +6.30% for MODERATE_MIXED
- SAMOSSA dominance finding: 72-90% across ALL optimized regimes
- CRISIS regime optimization contradicts initial GARCH hypothesis
- Updated configuration files with data-driven weights
- Comprehensive documentation: [PHASE_7.8_RESULTS.md](Documentation/PHASE_7.8_RESULTS.md)

**Phase 7.7 Achievements**:

- Per-regime weight optimization framework established
- Organized log directory structure with phase-specific subdirectories
- Automated log organization script ([bash/organize_logs.sh](bash/organize_logs.sh))

**Infrastructure Improvements**:
- ENSEMBLE DB migration: CHECK constraint updated, busy_timeout for write resilience
- Enhanced confidence scoring with model key canonicalization
- SQLite read-only connections with immutable URI mode (WSL/DrvFS robustness)
- Position-based forecast alignment fallback for calendar vs business day handling
- Regime detection feature flag with instant enable/disable capability
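
The two SQLite items above can be sketched with the standard-library `sqlite3` module. This is a minimal illustration, not the project's connection code: the helper names and the 5000 ms timeout are assumptions, while `mode=ro`, `immutable=1`, and `PRAGMA busy_timeout` are standard SQLite features.

```python
import sqlite3
import tempfile
from pathlib import Path


def open_writer(db_path: Path, busy_timeout_ms: int = 5000) -> sqlite3.Connection:
    """Open a read-write connection that waits on locks instead of failing fast."""
    conn = sqlite3.connect(str(db_path))
    conn.execute(f"PRAGMA busy_timeout = {int(busy_timeout_ms)}")
    return conn


def open_read_only(db_path: Path) -> sqlite3.Connection:
    """Open a read-only connection via a file: URI with immutable=1.

    immutable=1 tells SQLite the file will not change, so it skips locking;
    it should only be used when no writer is active on the file.
    """
    uri = f"{db_path.resolve().as_uri()}?mode=ro&immutable=1"
    return sqlite3.connect(uri, uri=True)


with tempfile.TemporaryDirectory() as tmp:
    db_file = Path(tmp) / "demo.db"

    writer = open_writer(db_file)
    busy_timeout = writer.execute("PRAGMA busy_timeout").fetchone()[0]
    writer.execute("CREATE TABLE forecasts (ticker TEXT, value REAL)")
    writer.execute("INSERT INTO forecasts VALUES ('AAPL', 101.5)")
    writer.commit()
    writer.close()

    reader = open_read_only(db_file)
    rows = reader.execute("SELECT ticker, value FROM forecasts").fetchall()
    try:
        reader.execute("INSERT INTO forecasts VALUES ('NVDA', 1.0)")
        write_blocked = False
    except sqlite3.OperationalError:
        write_blocked = True  # writes are rejected on a read-only connection
    reader.close()
```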

## Academic Rigor & Reproducibility (MIT-style)

- **Traceable artifacts**: Log config + commit hashes alongside experiment IDs; keep hashes for data snapshots and generated plots (`logs/artifacts_manifest.jsonl` when present).
- **Deterministic runs**: Set and record seeds (`PYTHONHASHSEED`, RNG, hyper-opt samplers, RL) for every reported experiment; prefer config overrides over ad hoc flags.
- **Executable evidence**: Each figure/table used for publication should have a runnable script/notebook (target: `reproducibility/` folder) that regenerates it from logged artifacts.
- **Transparency**: Document MTM assumptions, cost models, and cron wiring in experiment notes; link back to `Documentation/RESEARCH_PROGRESS_AND_PUBLICATION_PLAN.md` for the publication plan and replication checklist.
- **Archiving plan**: Package replication bundles (configs, logs, plots, minimal sample data) for Zenodo/Dataverse deposit before submitting any paper/thesis.
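
One possible way to record seeds and a config hash per experiment is sketched below. Only the manifest filename comes from the text above; the record fields, helper names, and experiment ID are hypothetical. The sketch seeds only the standard-library RNG; NumPy, hyper-opt, and RL seeds would need the same treatment.

```python
import hashlib
import json
import os
import random
import tempfile
from pathlib import Path


def seed_everything(seed: int) -> dict:
    """Seed the stdlib RNG and report the seeds in effect.

    PYTHONHASHSEED only takes effect if set before the interpreter starts,
    so it is recorded here rather than set.
    """
    random.seed(seed)
    return {"seed": seed, "PYTHONHASHSEED": os.environ.get("PYTHONHASHSEED", "unset")}


def config_hash(config: dict) -> str:
    """Hash a config dict in a key-order-independent way."""
    canonical = json.dumps(config, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def append_manifest(manifest_path: Path, record: dict) -> None:
    """Append one JSON record per line (JSONL)."""
    manifest_path.parent.mkdir(parents=True, exist_ok=True)
    with manifest_path.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(record, sort_keys=True) + "\n")


config = {"ticker": "AAPL", "regime": "CRISIS", "folds": 5}  # illustrative values

seed_info = seed_everything(42)
draw_a = random.random()
seed_everything(42)
draw_b = random.random()  # same seed, so the same draw

with tempfile.TemporaryDirectory() as tmp:
    manifest = Path(tmp) / "logs" / "artifacts_manifest.jsonl"
    record = {"experiment_id": "demo-001", "config_sha256": config_hash(config)}
    record.update(seed_info)
    append_manifest(manifest, record)
    logged = [json.loads(line) for line in manifest.read_text(encoding="utf-8").splitlines()]
```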

---

## 📋 Table of Contents

- [Architecture](#-architecture)
- [Installation](#-installation)
- [Quick Start](#-quick-start)
- [Phase 7.8 Results](#-phase-78-results-all-regime-optimization)
- [Phase 7.9 Status](#-phase-79-cross-session-persistence--proof-mode)
- [Usage](#-usage)
- [Project Structure](#-project-structure)
- [Performance](#-performance)
- [Testing](#-testing)
- [Documentation](#-documentation)
- [Research & Reproducibility](#-research--reproducibility)
- [Contributing](#-contributing)
- [License](#-license)

---

## 🎖️ Phase 7.8 Results: All-Regime Optimization

### Key Results

**3/6 Regimes Optimized** with SAMOSSA-dominant weights:

| Regime | Samples | Folds | RMSE Before | RMSE After | Improvement | Optimal Weights |
|--------|---------|-------|-------------|------------|-------------|-----------------|
| **CRISIS** | 25 | 5 | 17.15 | 6.74 | **+60.69%** | 72% SAMOSSA, 23% SARIMAX, 5% MSSA-RL |
| **MODERATE_MIXED** | 20 | 4 | 17.63 | 16.52 | +6.30% | 73% SAMOSSA, 22% MSSA-RL, 5% SARIMAX |
| **MODERATE_TRENDING** | 50 | 10 | 20.86 | 7.29 | **+65.07%** | 90% SAMOSSA, 5% SARIMAX, 5% MSSA-RL |

### Major Finding: SAMOSSA Dominance

**SAMOSSA dominates all optimized regimes (72-90% weight)**, contradicting the initial hypothesis that GARCH would be optimal for the CRISIS regime.

- Pattern recognition outperforms volatility modeling in all three optimized regimes
- CRISIS regime: SAMOSSA (72%) + SARIMAX (23%) provides best defensive configuration
- MODERATE_TRENDING: Confirms Phase 7.7 results with 2x sample size validation

### Configuration Updates

```yaml
# config/forecasting_config.yml (lines 98-115)
regime_candidate_weights:
  CRISIS:
    - {sarimax: 0.23, samossa: 0.72, mssa_rl: 0.05}
  MODERATE_MIXED:
    - {sarimax: 0.05, samossa: 0.73, mssa_rl: 0.22}
  MODERATE_TRENDING:
    - {sarimax: 0.05, samossa: 0.90, mssa_rl: 0.05}
```
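
To show how such weights might be applied, here is a minimal sketch that looks up a regime's weights, checks that they sum to 1, and blends per-model point forecasts. The weights mirror the YAML above, flattened to one candidate per regime. The equal-weight default for non-optimized regimes, the function names, and the forecast values are assumptions, since the project's real defaults and routing code are not shown here.

```python
import math

# Weights copied from the configuration above (first candidate per regime).
REGIME_WEIGHTS = {
    "CRISIS": {"sarimax": 0.23, "samossa": 0.72, "mssa_rl": 0.05},
    "MODERATE_MIXED": {"sarimax": 0.05, "samossa": 0.73, "mssa_rl": 0.22},
    "MODERATE_TRENDING": {"sarimax": 0.05, "samossa": 0.90, "mssa_rl": 0.05},
}

# Hypothetical fallback: equal weights for regimes without optimized values.
DEFAULT_WEIGHTS = {"sarimax": 1 / 3, "samossa": 1 / 3, "mssa_rl": 1 / 3}


def weights_for(regime: str) -> dict:
    """Return optimized weights for a regime, or the default when none exist."""
    return REGIME_WEIGHTS.get(regime, DEFAULT_WEIGHTS)


def blend(forecasts: dict, weights: dict) -> float:
    """Weighted average of per-model forecasts; weights must sum to 1."""
    if not math.isclose(sum(weights.values()), 1.0, abs_tol=1e-9):
        raise ValueError("ensemble weights must sum to 1")
    return sum(weight * forecasts[model] for model, weight in weights.items())


# Illustrative point forecasts for a single horizon.
model_forecasts = {"sarimax": 100.0, "samossa": 110.0, "mssa_rl": 90.0}

crisis_forecast = blend(model_forecasts, weights_for("CRISIS"))
fallback_forecast = blend(model_forecasts, weights_for("LIQUID_RANGEBOUND"))
print(f"CRISIS blend: {crisis_forecast:.2f}")
print(f"LIQUID_RANGEBOUND blend (default weights): {fallback_forecast:.2f}")
```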

### Regimes Not Optimized (Insufficient Samples)

| Regime | Reason | Recommendation |
|--------|--------|----------------|
| **HIGH_VOL_TRENDING** | Rare in AAPL 2024-2026 data | Test with NVDA (higher volatility) |
| **MODERATE_RANGEBOUND** | Rare in trending market | Use default weights |
| **LIQUID_RANGEBOUND** | Very rare (stable markets) | Use default weights |

**Full Results**: [Documentation/PHASE_7.8_RESULTS.md](Documentation/PHASE_7.8_RESULTS.md)

---

## 🚀 Phase 7.9: Cross-Session Persistence & Proof Mode

### Objective

Establish reliable round-trip trade execution with cross-session position persistence, enabling profitability validation and holdout audit accumulation.

### Current Status

- **Closed trades**: 30 validated (proof-mode TIME_EXIT)
- **Holdout audits**: 9/20 (forecast audit gate active at 25% max violation rate)
- **UTC normalization**: Complete across execution and persistence layers
- **Frequency compatibility**: Deprecated pandas aliases (`'H'` -> `'h'`) resolved

### Key Components

- **Cross-session persistence**: `portfolio_state` + `portfolio_cash_state` tables via `--resume`
- **Proof mode** (`--proof-mode`): Tight max_holding (5d/6h), ATR stops/targets, flatten-before-reverse
- **Audit sprint**: `bash/run_20_audit_sprint.sh` with gate enforcement (forecast, quant health, dashboard)
- **UTC timestamps**: `etl/timestamp_utils.py` (`ensure_utc()`, `utc_now()`, `ensure_utc_index()`)
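
For orientation, UTC helpers of this kind could look like the sketch below. The function names match those listed for `etl/timestamp_utils.py`, but the bodies are guesses written against the standard library, not the project's implementation. In particular, treating naive timestamps as already being in UTC is an assumption, and `ensure_utc_index()` is left out because it depends on pandas.

```python
from datetime import datetime, timedelta, timezone


def utc_now() -> datetime:
    """Current time as a timezone-aware UTC datetime."""
    return datetime.now(timezone.utc)


def ensure_utc(ts: datetime) -> datetime:
    """Return ts as a timezone-aware UTC datetime.

    Assumption: naive timestamps are interpreted as UTC; aware timestamps
    are converted from their own offset.
    """
    if ts.tzinfo is None:
        return ts.replace(tzinfo=timezone.utc)
    return ts.astimezone(timezone.utc)


naive = datetime(2026, 2, 9, 12, 0)
plus_one = datetime(2026, 2, 9, 13, 0, tzinfo=timezone(timedelta(hours=1)))

naive_as_utc = ensure_utc(naive)        # tagged as UTC, clock time unchanged
converted = ensure_utc(plus_one)        # 13:00+01:00 becomes 12:00 UTC
now = utc_now()
```

Normalizing at the boundaries (ingest, execution, persistence) keeps comparisons between stored and live timestamps from mixing naive and aware values, which Python rejects with a `TypeError`.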

### Validation Commands

```bash
# Run proof-mode audit sprint
PROOF_MODE=1 RISK_MODE=research_production bash bash/run_20_audit_sprint.sh

# Check closed trades
python -c "
import sqlite3
conn = sqlite3.connect('data/portfolio_maximizer.db')
closed = conn.execute('SELECT COUNT(*) FROM trade_executions WHERE realized_pnl IS NOT NULL').fetchone()[0]
print(f'Closed trades with realized PnL: {closed}')
conn.close()
"
```

### Success Criteria

- [x] Cross-session position persistence working
- [x] Proof mode creates guaranteed round trips
- [x] UTC-aware timestamps across all layers
- [ ] 20/20 holdout audits accumulated
- [ ] Forecast audit gate violation rate < 25%

### Phase 7.10: Production Deployment (Future)

Prerequisites:

- 20/20 audits passed
- All 3 optimized regimes show consistent improvement
- Overall RMSE regression confirmed <25%

---

## 🏗️ Architecture

### System Architecture (7 Layers)

```
┌─────────────────────────────────────────────────────────┐
│              Portfolio Maximizer                          │
│              Production-Ready System                     │
└───────────────────────────
```

... (truncated)