This document covers Meta Models, an advanced meta-learning system for dynamic strategy selection and classification in the Systematica framework.
Meta Models use machine learning classifiers to decide which trading strategy to apply at any given time, drawing on market state representations and the historical performance of multiple base models whose features feed the strategy predictions.
Overview
The Meta Model system implements a hierarchical classification approach where multiple trading strategies compete for selection based on their historical performance.
The system consists of state representation engines, reward calculation modules, and classification models that together determine optimal strategy allocation.
Meta models operate through a multi-stage process:
- Feature Collection: Executes multiple base models and calculates rolling metrics
- State Representation: Generates input features from market data
- Target Generation: Creates categorical labels from best-performing strategies
- Classification: Trains ML model to predict optimal strategy
- Signal Generation: Converts predictions to trading signals
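The stages above can be sketched with generic tools. The rolling-Sharpe labelling below is an illustrative stand-in for the framework's feature collection and target generation; the strategy names, window sizes, and return series are invented for the example:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
# Toy daily returns for two base strategies (hypothetical names).
returns = pd.DataFrame(
    {"momentum": rng.normal(0.0005, 0.01, 500),
     "mean_reversion": rng.normal(0.0003, 0.008, 500)},
    index=pd.date_range("2022-01-01", periods=500, freq="D"),
)

# Feature collection: rolling Sharpe ratio per strategy (annualised).
window = 60
rolling_sharpe = (returns.rolling(window).mean()
                  / returns.rolling(window).std()) * np.sqrt(252)

# Target generation: label each period with the best-performing strategy.
labels = rolling_sharpe.dropna().idxmax(axis=1)
print(labels.value_counts())
```

A classifier trained on market state features against these labels then closes the loop from state representation to signal generation.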
Key configuration options:
- use_neptune: Load models from Neptune experiments vs. local config
- sklearn.Pipeline: Custom ML pipeline components
- window_in_days: Rolling metric calculation window
- training_window/testing_window: Cross-validation parameters
Architecture
System Overview
| Layer | Purpose | Key Responsibilities |
| --- | --- | --- |
| API Layer | Public Interface | Expose meta-model functionality to users |
| Core Layer | Business Logic | Orchestrate classification pipeline and registries |
| Processing Layer | Data Transformation | Handle feature scaling and model training |
| Config Layer | Configuration Management | Store and retrieve model/state configurations |
| Signal Layer | Strategy Generation | Generate trading signals and strategy mappings |
API & Core Components
| Component | Type | Function | Input | Output |
| --- | --- | --- | --- | --- |
| MetaModel | API Class | Public interface for meta-model operations | User requests | Processed results |
| BaseMetaModel | Core Engine | Central orchestrator managing all subsystems | Configuration data | Coordinated pipeline execution |
| State | State Representation | Generate market state features | Market data | Feature vectors |
Key Integration Points
| Integration | Components | Data Exchange | Frequency |
| --- | --- | --- | --- |
| Model Training | BaseMetaModel.from_model_config() | Feature matrices, labels | Per training cycle |
| Experiment Tracking | BaseMetaModel.from_neptune_config() | Metrics, parameters | Per experiment |
| State Extraction | State.from_config() | Configuration objects | Per prediction |
| Signal Generation | get_meta_model_signals() | Strategy identifiers | Per market update |
Config & State Management
| Config | Data Type | Purpose | Schema |
| --- | --- | --- | --- |
| state_registry | StateRegistry | Market state configurations | state_id |
| model_registry | FeatureConfig | Feature engineering configurations | feature_id |
| neptune_registry | NeptuneSelector | Experiment tracking selectors | experiment_id |
Signal Generation
| Component | Type | Function | Trigger | Output |
| --- | --- | --- | --- | --- |
| get_meta_model_signals() | Signal Generator | Generate trading/strategy signals | Market conditions | Signal list |
| model_mapping | Strategy Mapper | Map strategies to appropriate analyzers | Strategy type | Analyzer assignment |
Foundation
The BaseMetaModel class provides the foundation for meta-learning functionality, handling feature processing, category management, and classifier training.
| Component | Purpose | Key Methods |
| --- | --- | --- |
| Data Management | Handles time series data and symbol mapping | data, s1, s2 |
| Feature Processing | Manages rolling metrics and state features | features, get_inputs() |
| Category Mapping | Maps model performance to categorical labels | categories, label_mapping |
| Classification | Runs cross-validation and prediction | run_clf() |
Steps
| Step | Component | Function | Input | Output |
| --- | --- | --- | --- | --- |
| 1 | from_model_config() or from_neptune() | Model Initialization | Config/Neptune config | Model instance: Load pre-trained model or configuration |
| 2 | features | Feature Extraction | Raw data | Rolling time-series metrics DataFrame |
| 3 | categories | Model Selection | Rolling metrics | Best performing model per period |
| 4a | get_target() | Target Generation | Categories | Numeric category codes: Convert categorical labels to numeric format |
| 4b | get_inputs() | Input Preparation | Categories | State representation: Extract feature vectors for model training |
| 5 | run_clf() | Model Training | Targets + Inputs | Trained classifier |
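Steps 4a and 4b map naturally onto standard pandas operations. The snippet below is a minimal, self-contained sketch of target generation; the category names are placeholders, not the framework's actual labels:

```python
import pandas as pd

# Categories: best-performing model per period (step 3 output, toy values).
categories = pd.Series(["trend", "carry", "trend", "trend", "carry"])

# Step 4a (get_target analogue): convert categorical labels to numeric codes.
cat = pd.Categorical(categories)
target = cat.codes                        # numeric class codes for the classifier
label_mapping = dict(enumerate(cat.categories))  # code -> original label

print(list(target), label_mapping)
```

The inverse mapping is what later lets predicted codes be translated back into concrete strategies during signal generation.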
Model Creation and Data Sources
Meta Models can be created from two primary data sources: model registries for local configurations or Neptune experiments for cloud-based model tracking.
- From Model Registry: The from_model_config() method creates meta models using local feature configurations stored in ModelRegistry.
- From Neptune Experiments: The from_neptune() method creates meta models using cloud-based feature configurations stored in NeptuneAIRegistry.
Classification and Cross-Validation
The classification engine uses time series cross-validation to train models that predict which trading strategy should be active based on current market conditions.
The system provides comprehensive model evaluation tools including accuracy metrics, classification reports, and confusion matrices.
| Evaluation Method | Purpose | Output |
| --- | --- | --- |
| get_accuracy_score() | Overall classification accuracy | Float |
| get_report() | Detailed precision/recall metrics | DataFrame |
| get_confusion_matrix() | Prediction vs actual matrix | DataFrame |
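These evaluation methods correspond closely to standard scikit-learn utilities. Here is a minimal sketch of time series cross-validation with such tools; the feature data is synthetic and the split counts are arbitrary:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import TimeSeriesSplit
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(1)
X = rng.normal(size=(300, 4))                                # synthetic state features
y = (X[:, 0] + 0.1 * rng.normal(size=300) > 0).astype(int)   # strategy label

# Time series cross-validation: each fold trains on the past, tests on the future.
clf = make_pipeline(StandardScaler(), LogisticRegression())
scores = []
for train_idx, test_idx in TimeSeriesSplit(n_splits=5).split(X):
    clf.fit(X[train_idx], y[train_idx])
    pred = clf.predict(X[test_idx])
    scores.append(accuracy_score(y[test_idx], pred))

print(np.mean(scores))
print(confusion_matrix(y[test_idx], pred))
```

The key property, shared with the framework's rolling/expanding splitters, is that test folds always come strictly after their training data, so no look-ahead leaks into the evaluation.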
State Representation
State representation converts market data into feature vectors suitable for machine learning classification. The State class processes multiple data sources to create comprehensive market state descriptions.
Signal Generation and Trading
Meta Models generate trading signals by converting classifier predictions into actionable buy/sell decisions for the optimal strategy at each time period.
Signal Generation Pipeline
| Stage | Component | Input | Process | Output |
| --- | --- | --- | --- | --- |
| 1 | model_output | Raw data | Classifier predictions | Prediction probabilities/classes |
| 2 | model_mapping | Category definitions | Category to runner mapping | Strategy-runner associations: Map predictions to execution strategies |
| 3 | get_meta_model_signals() | Predictions + Mappings | Signal conversion | Trading signals |
| 4 | Signals | Trading signals | Signal formatting | Entry/exit points |
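Converting classifier predictions into per-strategy signals can be sketched as a one-hot mapping. The strategy names and the DataFrame-of-booleans output below are illustrative, not the framework's actual Signals type:

```python
import numpy as np
import pandas as pd

# Stage 1 analogue: predicted strategy code per period.
model_output = np.array([0, 0, 1, 1, 0, 1])

# Stage 2 analogue (model_mapping): category code -> strategy runner.
mapping = {0: "trend_runner", 1: "carry_runner"}

# Stages 3-4: one-hot signals, True where a strategy should be active.
index = pd.date_range("2024-01-01", periods=len(model_output), freq="D")
signals = pd.DataFrame(
    {name: model_output == code for code, name in mapping.items()},
    index=index,
)
print(signals.sum())
```

Exactly one strategy is active per period, which is what makes the meta model a selector rather than a blender.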
Portfolio Execution Pipeline
| Stage | Component | Input | Process | Output | Business Impact |
| --- | --- | --- | --- | --- | --- |
| 5 | select_symbols | Entry/exit points | Symbol filtering | Filtered symbol list | Focus on tradeable assets |
| 6 | run_from_signals() | Signals + symbols | Portfolio simulation | Simulated trades | Execute strategy simulation |
| 7 | PortfolioAnalyzer | Simulation results | Performance analysis | Analysis report | Measure strategy performance |
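Stages 5-7 can be approximated with a toy simulation: apply each strategy's returns only where its signal is active, then measure the combined result. All numbers here are synthetic, and the simple Sharpe calculation stands in for the framework's run_from_signals() and PortfolioAnalyzer components:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(3)
idx = pd.date_range("2024-01-01", periods=200, freq="D")
strat_returns = pd.DataFrame(
    {"trend": rng.normal(0.001, 0.01, 200),
     "carry": rng.normal(0.0005, 0.005, 200)},
    index=idx,
)

# One-hot signals choosing a strategy per day (toy regime alternation).
active = pd.Series(np.where(np.arange(200) % 50 < 25, "trend", "carry"),
                   index=idx)
signals = pd.get_dummies(active).astype(bool)

# Toy simulation: realised return is the selected strategy's return each day.
realised = (strat_returns * signals).sum(axis=1)

# Toy performance analysis: annualised Sharpe ratio of the meta strategy.
sharpe = realised.mean() / realised.std() * np.sqrt(252)
print(round(sharpe, 2))
```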
```python
MetaModel(
    preprocessor: sklearn.base.BaseEstimator = None,
    estimator: sklearn.base.BaseEstimator = None,
    n_steps: int = 1,
    training_window: int = 365,
    testing_window: int = 60,
    splitter: str = 'from_custom_rolling',
    custom_splitter: str | None = None,
    custom_splitter_kwargs: Dict[str, Any] = None,
    use_neptune: bool = False,
    metrics: str = 'sharpe_ratio',
    model_registry: List[systematica.registries.base.Register] = None,
    neptune_registry: List[systematica.registries.base.Register] = None,
    window_in_days: int = 365,
    minp: int = None,
    downsample: str = '1d',
    state_registry: systematica.registries.base.Register = None,
)
```
MetaModel is a class for cross-validation of meta models.
It allows for the evaluation of different model configurations and
the generation of trading signals based on the results. The class is designed
to work with various classifiers and preprocessors, and it supports custom
data splitting strategies for training and testing.
Method generated by attrs for class MetaModel.
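A hedged construction sketch using the signature above; the estimator choice and parameter values are assumptions about a typical setup, not a verified recipe, and the registries are assumed to be configured elsewhere:

```python
# Hypothetical usage sketch -- parameter names follow the signature above;
# registry defaults are taken from the framework's configuration.
from sklearn.ensemble import RandomForestClassifier
from sklearn.preprocessing import StandardScaler

meta = MetaModel(
    preprocessor=StandardScaler(),
    estimator=RandomForestClassifier(n_estimators=200),
    training_window=365,   # days of history per training fold
    testing_window=60,     # days evaluated after each fold
    window_in_days=365,    # rolling metric / state window
    metrics="sharpe_ratio",
    use_neptune=False,     # read configs locally, not from Neptune
)
```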
Ancestors
systematica.models.base.BaseStatArb
abc.ABC
Instance variables
- custom_splitter: str | None: Custom splitter to use for data partitioning. Defaults to None. If set, it should be a string that matches a custom splitter function.
- custom_splitter_kwargs: Dict[str, Any]: Additional keyword arguments for the custom splitter. Defaults to None. If custom_splitter is set, this should contain any parameters the custom splitter function requires.
- downsample: str: Resample data before state representation computation, used to speed up the process. Upsampling and ffill are performed straight after. If None, no resampling is performed. Defaults to '1d' (daily).
- estimator: sklearn.base.BaseEstimator: Classifier model. If None, defaults to sklearn's LogisticRegression (aka logit, MaxEnt) classifier. Defaults to None.
- metrics: str: Metric(s) used to calculate the reward. Defaults to 'sharpe_ratio'.
- minp: int: Minimum number of observations required. Defaults to None.
- model_registry: List[systematica.registries.base.Register]: Config of models to use for metric calculation. Defaults to model_registry.
- n_steps: int: Number of periods to shift backward by n positions. This operation intentionally looks ahead to train the model! Must be positive. Defaults to 1.
- neptune_registry: List[systematica.registries.base.Register]: Config of models to use for metric calculation when use_neptune is True. Defaults to neptune_registry.
- preprocessor: sklearn.base.BaseEstimator: Standardizes features. If None, defaults to StandardScaler. Defaults to None.
- splitter: str: Default splitter used when custom_splitter is not passed. Choices are from_rolling, from_custom_rolling, from_expanding, from_custom_expanding. Defaults to 'from_custom_rolling'.
- state_registry: systematica.registries.base.Register: State representation config. If None, uses StateRegistry. Defaults to None.
- testing_window: int: The size of the testing window. Defaults to 60.
- training_window: int: The size of the training window. Defaults to 365.
- use_neptune: bool: Use Neptune if True, local config otherwise. Defaults to False.
- window_in_days: int: The size of the rolling window in days, used for calculating both rolling metrics and state representations. Defaults to 365.
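The training_window and testing_window parameters imply rolling splits like the sketch below. This is a generic rolling scheme under assumed semantics, not the framework's exact from_custom_rolling implementation:

```python
def rolling_splits(n, training_window, testing_window):
    """Yield (train_indices, test_indices) for a rolling-window scheme."""
    start = 0
    while start + training_window + testing_window <= n:
        train = range(start, start + training_window)
        test = range(start + training_window,
                     start + training_window + testing_window)
        yield list(train), list(test)
        start += testing_window     # advance by one test block each fold

splits = list(rolling_splits(n=200, training_window=100, testing_window=25))
print(len(splits))
```

An expanding variant would fix the training start at 0 and grow the window instead of sliding it.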
Methods
get_signals
```python
get_signals(
    self,
    model_output: numpy.ndarray,
    select_symbols: str | int | list[str | int] = None,
    **kwargs,
) -> systematica.signals.base.Signals
```
Generate trading signals based on scores. See BaseStatArb.get_signals
Parameters:
| Name | Type | Default | Description |
| --- | --- | --- | --- |
| model_output | numpy.ndarray | -- | Output from the model, typically scores for each symbol. |
| select_symbols | str \| int \| list[str \| int] | None | Symbols to select for generating signals. If None, all symbols are used. |
Returns:
| Type | Description |
| --- | --- |
| Signals | Signals object containing buy and sell signals for the selected symbols. |