This document covers Meta Models, a meta-learning system for dynamic strategy selection and classification in the Systematica framework. Meta Models use machine learning classifiers, fed with features from multiple base models, to determine which trading strategy to apply at any given time based on market state representations and historical strategy performance.

Overview

The Meta Model system implements a hierarchical classification approach where multiple trading strategies compete for selection based on their historical performance. The system consists of state representation engines, reward calculation modules, and classification models that together determine optimal strategy allocation.
Meta models operate through a multi-stage process:
  • Feature Collection: Executes multiple base models and calculates rolling metrics
  • State Representation: Generates input features from market data
  • Target Generation: Creates categorical labels from best-performing strategies
  • Classification: Trains ML model to predict optimal strategy
  • Signal Generation: Converts predictions to trading signals
Key configuration options:
  • use_neptune: Load models from Neptune experiments vs. local config
  • sklearn.Pipeline: Custom ML pipeline components
  • window_in_days: Rolling metric calculation window
  • training_window/testing_window: Cross-validation parameters
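
The feature collection and target generation stages can be illustrated with a small, self-contained sketch. This is not Systematica code: the helpers `rolling_sharpe` and `best_strategy`, the synthetic returns, and the unannualized mean/std ratio used as the metric are all assumptions chosen for illustration.

```python
from typing import Optional

import numpy as np
import pandas as pd


def rolling_sharpe(
    returns: pd.DataFrame, window: int, minp: Optional[int] = None
) -> pd.DataFrame:
    """Rolling mean/std ratio per strategy column (hypothetical, not annualized)."""
    roll = returns.rolling(window, min_periods=minp or window)
    return roll.mean() / roll.std()


def best_strategy(metrics: pd.DataFrame) -> pd.Series:
    """Label each fully populated row with the column holding the highest metric."""
    valid = metrics.dropna(how="any")
    return valid.idxmax(axis=1)


# Synthetic daily returns for two made-up base strategies.
rng = np.random.default_rng(0)
index = pd.date_range("2024-01-01", periods=30, freq="D")
returns = pd.DataFrame(
    {
        "strategy_a": rng.normal(0.001, 0.01, len(index)),
        "strategy_b": rng.normal(0.000, 0.01, len(index)),
    },
    index=index,
)

metrics = rolling_sharpe(returns, window=10)  # feature collection
categories = best_strategy(metrics)           # target generation (labels)
print(categories.value_counts())
```

The first `window - 1` rows are dropped because the rolling metric is undefined there, which is one reason a minimum-observations setting such as minp matters in practice.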

Architecture

System Overview

| Layer | Purpose | Key Responsibilities |
| --- | --- | --- |
| API Layer | Public Interface | Expose meta-model functionality to users |
| Core Layer | Business Logic | Orchestrate classification pipeline and registries |
| Processing Layer | Data Transformation | Handle feature scaling and model training |
| Config Layer | Configuration Management | Store and retrieve model/state configurations |
| Signal Layer | Strategy Generation | Generate trading signals and strategy mappings |

API & Core Components

| Component | Type | Function | Input | Output |
| --- | --- | --- | --- | --- |
| MetaModel | API Class | Public interface for meta-model operations | User requests | Processed results |
| BaseMetaModel | Core Engine | Central orchestrator managing all subsystems | Configuration data | Coordinated pipeline execution |
| State | State Representation | Generate market state features | Market data | Feature vectors |

Key Integration Points

| Integration | Components | Data Exchange | Frequency |
| --- | --- | --- | --- |
| Model Training | BaseMetaModel.from_model_config() | Feature matrices, labels | Per training cycle |
| Experiment Tracking | BaseMetaModel.from_neptune_config() | Metrics, parameters | Per experiment |
| State Extraction | State.from_config() | Configuration objects | Per prediction |
| Signal Generation | get_meta_model_signals() | Strategy identifiers | Per market update |

Config & State Management

| Config | Data Type | Purpose | Schema |
| --- | --- | --- | --- |
| state_registry | StateRegistry | Market state configurations | state_id |
| model_registry | FeatureConfig | Feature engineering configurations | feature_id |
| neptune_registry | NeptuneSelector | Experiment tracking selectors | experiment_id |

Signal Generation

| Component | Type | Function | Trigger | Output |
| --- | --- | --- | --- | --- |
| get_meta_model_signals() | Signal Generator | Generate trading/strategy signals | Market conditions | Signal list |
| model_mapping | Strategy Mapper | Map strategies to appropriate analyzers | Strategy type | Analyzer assignment |

Meta-Learning Functionality

Foundation

The BaseMetaModel class provides the foundation for meta-learning functionality, handling feature processing, category management, and classifier training.
| Component | Purpose | Key Methods |
| --- | --- | --- |
| Data Management | Handles time series data and symbol mapping | data, s1, s2 |
| Feature Processing | Manages rolling metrics and state features | features, get_inputs() |
| Category Mapping | Maps model performance to categorical labels | categories, label_mapping |
| Classification | Runs cross-validation and prediction | run_clf() |

Steps

| Step | Component | Function | Input | Output |
| --- | --- | --- | --- | --- |
| 1 | from_model_config() or from_neptune() | Model Initialization | Config/Neptune config | Model instance: Load pre-trained model or configuration |
| 2 | features | Feature Extraction | Raw data | Rolling time-series metrics DataFrame |
| 3 | categories | Model Selection | Rolling metrics | Best performing model per period |
| 4a | get_target() | Target Generation | Categories | Numeric category codes: Convert categorical labels to numeric format |
| 4b | get_inputs() | Input Preparation | Categories | State representation: Extract feature vectors for model training |
| 5 | run_clf() | Model Training | Targets + Inputs | Trained classifier |
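
Step 4a (turning best-model labels into numeric codes) can be sketched as follows. The helper `encode_target` is hypothetical and only stands in for the idea behind get_target(); using pandas category codes and shifting the target by n_steps so that each row is paired with a later label are assumptions about how such a step might be implemented.

```python
import pandas as pd


def encode_target(categories: pd.Series, n_steps: int = 1):
    """Encode labels as integer codes and shift them backward by n_steps.

    Returns the shifted codes plus a code -> label mapping. Rows at the end
    that have no future label after the shift are dropped.
    """
    cat = categories.astype("category")
    label_mapping = dict(enumerate(cat.cat.categories))
    codes = cat.cat.codes.shift(-n_steps).dropna().astype(int)
    return codes, label_mapping


# Made-up "best performing model per period" labels.
categories = pd.Series(
    ["mean_reversion", "momentum", "momentum", "mean_reversion"],
    index=pd.date_range("2024-01-01", periods=4, freq="D"),
)

target, label_mapping = encode_target(categories, n_steps=1)
print(label_mapping)
print(target)
```

Because the target at each row comes from a later period, the input features must be restricted to the same index (for example `inputs.loc[target.index]`) before training.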

Model Creation and Data Sources

Meta Models can be created from two primary data sources: model registries for local configurations or Neptune experiments for cloud-based model tracking.
  • From Model Registry: The from_model_config() method creates meta models using local feature configurations stored in ModelRegistry.
  • From Neptune Experiments: The from_neptune() method creates meta models using cloud-based feature configurations stored in NeptuneAIRegistry.

Classification and Cross-Validation

The classification engine uses time series cross-validation to train models that predict which trading strategy should be active based on current market conditions.
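
A rolling time series split can be sketched in a few lines. The generator below is an illustrative assumption, not the splitter Systematica uses: it treats training_window and testing_window as row counts and advances each fold by one testing window.

```python
from typing import Iterator, Tuple


def rolling_splits(
    n_samples: int, training_window: int = 365, testing_window: int = 60
) -> Iterator[Tuple[range, range]]:
    """Yield (train, test) index ranges where the test block always follows
    the training block, so no future rows leak into training."""
    start = 0
    while start + training_window + testing_window <= n_samples:
        train_end = start + training_window
        yield range(start, train_end), range(train_end, train_end + testing_window)
        start += testing_window  # slide forward by one test block


splits = list(rolling_splits(500))
for train, test in splits:
    print(f"train {train.start}-{train.stop - 1}, test {test.start}-{test.stop - 1}")
```

An expanding variant would keep `start` fixed at 0 for the training range and only move the test block forward.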

Performance Evaluation

The system provides comprehensive model evaluation tools including accuracy metrics, classification reports, and confusion matrices.
| Evaluation Method | Purpose | Output |
| --- | --- | --- |
| get_accuracy_score() | Overall classification accuracy | Float |
| get_report() | Detailed precision/recall metrics | DataFrame |
| get_confusion_matrix() | Prediction vs actual matrix | DataFrame |
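
To make the three quantities concrete, here is a plain-Python sketch of accuracy, a confusion matrix, and per-class precision/recall. The function names and the toy labels are assumptions for illustration; they are not the Systematica methods listed above, which return a float and DataFrames.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def confusion_matrix(y_true, y_pred, labels):
    """Rows are true labels, columns are predicted labels."""
    counts = Counter(zip(y_true, y_pred))
    return [[counts[(t, p)] for p in labels] for t in labels]


def precision_recall(y_true, y_pred, label):
    """Precision and recall for a single class."""
    tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
    predicted = sum(p == label for p in y_pred)
    actual = sum(t == label for t in y_true)
    precision = tp / predicted if predicted else 0.0
    recall = tp / actual if actual else 0.0
    return precision, recall


# Toy strategy codes: true best strategy vs. classifier prediction.
y_true = [0, 1, 1, 0, 1]
y_pred = [0, 1, 0, 0, 1]

acc = accuracy(y_true, y_pred)
matrix = confusion_matrix(y_true, y_pred, labels=[0, 1])
precision, recall = precision_recall(y_true, y_pred, label=1)
print(acc, matrix, precision, recall)
```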

State Representation

State representation converts market data into feature vectors suitable for machine learning classification. The State class processes multiple data sources to create comprehensive market state descriptions.
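
As a rough illustration of what a state feature vector might look like, the sketch below derives two rolling features from a price series. The feature choice (windowed momentum and rolling volatility), the function name `market_state`, and the synthetic prices are assumptions; the actual State class may compute different features from different data sources.

```python
import numpy as np
import pandas as pd


def market_state(close: pd.Series, window: int) -> pd.DataFrame:
    """Build a small feature matrix: one row per timestamp, one column per feature."""
    returns = close.pct_change()
    state = pd.DataFrame(
        {
            "momentum": close.pct_change(window),            # return over the window
            "volatility": returns.rolling(window).std(),     # dispersion of 1-step returns
        }
    )
    return state.dropna()  # drop warm-up rows where a feature is undefined


# Synthetic, steadily rising prices: 100, 101, ..., 119.
close = pd.Series(np.linspace(100, 119, 20))
state = market_state(close, window=5)
print(state.head())
```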

Signal Generation and Trading

Meta Models generate trading signals by converting classifier predictions into actionable buy/sell decisions for the optimal strategy at each time period.

Signal Generation Pipeline

| Stage | Component | Input | Process | Output |
| --- | --- | --- | --- | --- |
| 1 | model_output | Raw data | Classifier predictions | Prediction probabilities/classes |
| 2 | model_mapping | Category definitions | Category to runner mapping | Strategy-runner associations: Map predictions to execution strategies |
| 3 | get_meta_model_signals() | Predictions + Mappings | Signal conversion | Trading signals |
| 4 | Signals | Trading signals | Signal formatting | Entry/exit points |
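
The conversion from predicted classes to entry/exit points can be sketched as below. This is an assumed, simplified rule (switch strategies whenever the predicted class changes), and `to_signals` with its tuple output is hypothetical; it does not reproduce get_meta_model_signals() or the Signals object.

```python
def to_signals(predictions, label_mapping):
    """Emit (step, strategy, action) events whenever the predicted strategy changes."""
    events = []
    active = None
    for step, code in enumerate(predictions):
        strategy = label_mapping[code]
        if strategy != active:
            if active is not None:
                events.append((step, active, "exit"))  # close the previous strategy
            events.append((step, strategy, "entry"))   # open the newly selected one
            active = strategy
    return events


label_mapping = {0: "mean_reversion", 1: "momentum"}
predictions = [0, 0, 1, 1, 0]

events = to_signals(predictions, label_mapping)
for event in events:
    print(event)
```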

Portfolio Execution Pipeline

| Stage | Component | Input | Process | Output | Business Impact |
| --- | --- | --- | --- | --- | --- |
| 5 | select_symbols | Entry/exit points | Symbol filtering | Filtered symbol list | Focus on tradeable assets |
| 6 | run_from_signals() | Signals + symbols | Portfolio simulation | Simulated trades | Execute strategy simulation |
| 7 | PortfolioAnalyzer | Simulation results | Performance analysis | Analysis report | Measure strategy performance |

MetaModel

MetaModel(
    preprocessor: sklearn.base.BaseEstimator = None,
    estimator: sklearn.base.BaseEstimator = None,
    n_steps: int = 1,
    training_window: int = 365,
    testing_window: int = 60,
    splitter: str = 'from_custom_rolling',
    custom_splitter: str | None = None,
    custom_splitter_kwargs: Dict[str, Any] = None,
    use_neptune: bool = False,
    metrics: str = 'sharpe_ratio',
    model_registry: List[systematica.registries.base.Register] = None,
    neptune_registry: List[systematica.registries.base.Register] = None,
    window_in_days: int = 365,
    minp: int = None,
    downsample: str = '1d',
    state_registry: systematica.registries.base.Register = None,
)
MetaModel cross-validates meta models. It evaluates different model configurations and generates trading signals from the results. It works with any scikit-learn-compatible classifier and preprocessor, and it supports custom data-splitting strategies for training and testing. The constructor is generated by attrs.
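
The documented defaults for preprocessor and estimator (StandardScaler and LogisticRegression) correspond to a standard scikit-learn pipeline. The sketch below shows that combination on synthetic data only; how MetaModel wires these objects together internally is not shown in this document, so the use of `make_pipeline` and the single train/test cut are assumptions.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic state features and a binary "best strategy" target.
rng = np.random.default_rng(42)
X = rng.normal(size=(200, 3))
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)

# Scale features, then classify; both steps are fitted on training rows only.
clf = make_pipeline(StandardScaler(), LogisticRegression())
clf.fit(X[:150], y[:150])

# Predict on later rows, mimicking a chronological train/test split.
predictions = clf.predict(X[150:])
print(predictions[:10])
```

Fitting the scaler inside the pipeline keeps test-window statistics out of the training step, which matters when the split is chronological.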

Ancestors

  • systematica.models.base.BaseStatArb
  • abc.ABC

Instance variables

  • custom_splitter: str | None: Custom splitter to use for data partitioning. Defaults to None. If set, it should be a string that matches a custom splitter function.
  • custom_splitter_kwargs: Dict[str, Any]: Additional keyword arguments for the custom splitter. Defaults to None. If custom_splitter is set, this should contain any necessary parameters for the custom splitter function.
  • downsample: str: Resample data before computing the state representation to speed up the process. Upsampling and forward-filling (ffill) are performed straight after. If None, no resampling is performed. Defaults to 1d (daily).
  • estimator: sklearn.base.BaseEstimator: Classifier model. If None, defaults to LogisticRegression (aka logit, MaxEnt) classifier from sklearn. Defaults to None.
  • metrics: str: Metric(s) to calculate the reward. Defaults to sharpe_ratio.
  • minp: int: Minimum number of observations required. Defaults to None.
  • model_registry: List[systematica.registries.base.Register]: Config of models to use for metric calculation. Defaults to model_registry.
  • n_steps: int: Number of positions by which to shift backward. This operation intentionally looks ahead to train the model! Must be positive. Defaults to 1.
  • neptune_registry: List[systematica.registries.base.Register]: Config of models to use for metric calculation when use_neptune is set to True. Defaults to neptune_registry.
  • preprocessor: sklearn.base.BaseEstimator: Standardize features. If None, defaults to StandardScaler. Defaults to None.
  • splitter: str: Default splitter to be used if custom_splitter is not passed. Choices are from_rolling, from_custom_rolling, from_expanding, from_custom_expanding. Defaults to “from_custom_rolling”.
  • state_registry: systematica.registries.base.Register: State representation config. If None, uses StateRegistry. Defaults to None.
  • testing_window: int: The size of the testing window. Defaults to 60.
  • training_window: int: The size of the training window. Defaults to 365.
  • use_neptune: bool: Use neptune if True, config otherwise. Defaults to False.
  • window_in_days: int: The size of the rolling window in days. Defaults to 365. This is the window size used for calculating both rolling metrics and state representations.
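
The downsample option describes a resample, compute, then upsample-and-ffill pattern. A minimal pandas sketch of that pattern follows; the helper name, the use of the first observation per bin as a stand-in for the expensive computation, and the hourly toy data are assumptions rather than Systematica behavior.

```python
import pandas as pd


def downsample_then_restore(values: pd.Series, rule: str = "1D") -> pd.Series:
    """Compute on a coarse grid, then forward-fill back onto the original index."""
    # "1D" is the pandas spelling of a daily rule. Taking the first value per
    # bin stands in for an expensive computation done on the coarse data.
    coarse = values.resample(rule).first()
    # Upsample: each original timestamp receives the latest coarse value.
    return coarse.reindex(values.index, method="ffill")


# Two days of hourly observations with values 0..47.
index = pd.date_range("2024-01-01", periods=48, freq=pd.Timedelta(hours=1))
values = pd.Series(range(48), index=index)

restored = downsample_then_restore(values)
print(restored.iloc[[0, 23, 24, 47]])
```

Using the first observation of each bin keeps every restored value available at or before its timestamp; labeling a bin with its last observation would leak later data into earlier rows.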

Methods

get_signals

get_signals(
    self,
    model_output: numpy.ndarray,
    select_symbols: str | int | list[str | int] = None,
    **kwargs,
) ‑> systematica.signals.base.Signals
Generate trading signals based on scores. See BaseStatArb.get_signals.
Parameters:
| Name | Type | Default | Description |
| --- | --- | --- | --- |
| model_output | tp.Array | -- | Output from the model, typically scores for each symbol. |
| select_symbols | str, int, or list | None | Symbols to select for generating signals. If None, all symbols are used. |
Returns:
| Type | Description |
| --- | --- |
| Signals | Signals object containing buy and sell signals for the selected symbols. |
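
Since select_symbols accepts a string, an integer, or a list of either, a caller-side normalization step might look like the sketch below. The helper `resolve_symbols` is hypothetical, and treating integers as positions in the symbol list is an assumption; the document does not state how get_signals interprets integer values.

```python
from typing import List, Optional, Sequence, Union

Symbol = Union[str, int]


def resolve_symbols(
    all_symbols: Sequence[str],
    select_symbols: Optional[Union[Symbol, List[Symbol]]] = None,
) -> List[str]:
    """Normalize a selection into a list of symbol names."""
    if select_symbols is None:
        return list(all_symbols)  # None means "use every symbol"
    if isinstance(select_symbols, (str, int)):
        select_symbols = [select_symbols]  # wrap a single value
    # Assumption: integers are positions in all_symbols, strings are names.
    resolved = [
        all_symbols[item] if isinstance(item, int) else item
        for item in select_symbols
    ]
    unknown = [name for name in resolved if name not in all_symbols]
    if unknown:
        raise KeyError(f"Unknown symbols: {unknown}")
    return resolved


symbols = ["BTC", "ETH", "SOL"]
print(resolve_symbols(symbols))
print(resolve_symbols(symbols, "ETH"))
print(resolve_symbols(symbols, ["BTC", 2]))
```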