get_meta_model_signals(
model_output: numpy.ndarray,
index: numpy.ndarray | pandas.core.indexes.base.Index,
model_mapping: Dict[str, systematica.portfolio.analyzer.PortfolioAnalyzer],
) -> systematica.signals.base.Signals
Map model_output labels to the signals of the matching PortfolioAnalyzer in
model_mapping, creating a new Signals namedtuple.
Parameters:
| Name | Type | Default | Description |
| --- | --- | --- | --- |
| model_output | tp.Array1d | -- | Meta model output. |
| index | tp.Array1d \| pd.Index | -- | Datetime index. |
| model_mapping | tp.Dict[str, PortfolioAnalyzer] | -- | Dictionary mapping category names to their respective runner objects. |
Returns:
| Type | Description |
| --- | --- |
| Signals | Signals namedtuple with combined clean signals aligned to the provided index. |
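The label-to-signal mapping can be sketched with plain pandas. Everything here is a hypothetical stand-in: `map_labels_to_signals`, the `entries`/`exits` attributes, and the bare namedtuple are assumptions for illustration, not the systematica `PortfolioAnalyzer` or `Signals` API.

```python
from collections import namedtuple

import numpy as np
import pandas as pd

# Hypothetical stand-in for systematica's Signals namedtuple.
Signals = namedtuple("Signals", ["entries", "exits"])

def map_labels_to_signals(model_output, index, signal_lookup):
    """For each timestamp, keep only the signals of the predicted category."""
    labels = pd.Series(model_output, index=index)
    entries = pd.Series(False, index=index)
    exits = pd.Series(False, index=index)
    for name, sig in signal_lookup.items():
        mask = labels == name
        # Reindex each category's signals to the common index, filling
        # missing timestamps with False, then copy only the masked rows.
        entries[mask] = sig.entries.reindex(index, fill_value=False)[mask]
        exits[mask] = sig.exits.reindex(index, fill_value=False)[mask]
    return Signals(entries=entries, exits=exits)
```

The key design point is that at each timestamp only the predicted category contributes signals; all other categories are silenced.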
meta_model_cv(
X: numpy.ndarray,
y: numpy.ndarray,
index: numpy.ndarray | pandas.core.indexes.base.Index,
columns: Sequence[Hashable] = None,
splitter: str = 'from_custom_rolling',
custom_splitter: str = None,
custom_splitter_kwargs: Dict[str, Any] = None,
preprocessor: sklearn.base.BaseEstimator = None,
estimator: sklearn.base.BaseEstimator = None,
training_window: Union[vectorbtpro.utils.params.Param, int] = 365,
testing_window: Union[vectorbtpro.utils.params.Param, int] = 60,
n_steps: int = 1,
to_numpy: bool = False,
raw_output: bool = False,
**split_kwargs,
) -> pandas.core.frame.DataFrame | numpy.ndarray
Meta classifier cross-validation.
When a split fails with one of the exceptions listed under Raises, the signal
generation process may leave a position running for an unintended period of time.
Parameters:
| Name | Type | Default | Description |
| --- | --- | --- | --- |
| X | tp.Array2d | -- | State representation: input (X). |
| y | tp.Array1d | -- | Reward target (y). |
| index | tp.Array1d | -- | The datetime index for the returns data. |
| columns | tp.Labels | None | The column labels for the output DataFrame. Default is None. |
| splitter | str | from_custom_rolling | The method for splitting the data into training and testing sets. Default is from_custom_rolling. |
| custom_splitter | str | None | Custom splitter function to use. Default is None. |
| custom_splitter_kwargs | tp.Kwargs | None | Custom arguments for the splitter function. Default is None. |
| preprocessor | BaseEstimator | None | Standardizes features. If None, defaults to StandardScaler. |
| estimator | BaseEstimator | None | Classifier model. If None, defaults to a logistic regression (aka logit, MaxEnt) classifier. |
| training_window | int | 365 | The size of the training window for cross-validation. Default is 365. |
| testing_window | int | 60 | The size of the testing window for cross-validation. Default is 60. |
| n_steps | int | 1 | Number of positions to shift the target backward, equivalent to pd.Series(y).shift(-n). This operation intentionally looks ahead to train the model! Must be positive. Default is 1. |
| to_numpy | bool | False | Whether to return the result as a NumPy array. Default is False. |
| raw_output | bool | False | Whether to return the raw output without any alignment. Default is False. |
| split_kwargs | tp.Kwargs | -- | Additional keyword arguments for the vectorbtpro splitter. |
Raises:
| Type | Description |
| --- | --- |
| Exception | The model does not accept missing values encoded as NaN natively. |
| Exception | Input y contains NaN. |
| Exception | This solver needs samples of at least 2 classes in the data, but the data contains only one class. |
Returns:
| Type | Description |
| --- | --- |
| pd.DataFrame \| tp.Array2d | Trained model output. |
Examples:
Import dependencies:
>>> from sklearn.ensemble import RandomForestClassifier
>>> from sklearn.pipeline import make_pipeline
>>> from sklearn.preprocessing import StandardScaler
>>> import pandas as pd
>>> import vectorbtpro as vbt
>>> import systematica as sma
Initialize BaseMetaModel as follows:
>>> model = sma.BaseMetaModel.from_model_config(
... window_in_days=365,
... metrics='sharpe_ratio',
... with_id=False,
... model_reward_config=None
... )
Get features with shifted y by n_steps=1:
>>> X = model.get_inputs(downsample='1d', to_numpy=False, state_config=None)
>>> y = model.get_target(to_numpy=False).shift(-1)
>>> index = model.data.index
Shifting backward by n positions is equivalent to pd.Series(arr).shift(-n).
This operation intentionally looks ahead to train the model!
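The lookahead introduced by the shift can be demonstrated with plain pandas on a toy series:

```python
import pandas as pd

# Each row's target becomes the NEXT period's value; the final row has
# no future observation, so it becomes NaN and is typically dropped.
y = pd.Series([10, 20, 30, 40])
y_shifted = y.shift(-1)
print(y_shifted.tolist())  # [20.0, 30.0, 40.0, nan]
```

This is why the target must never be shifted this way at inference time: the label at each timestamp encodes information from the future.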
Get custom splitter object:
>>> splitter = sma.CustomSplitter.from_custom_rolling(
... model.data.index,
... **sma.default_splitter_kwargs("from_custom_rolling", training_window=365, testing_window=60)
... )
Get Scikit-Learn Pipeline object:
>>> pipeline = make_pipeline(
... StandardScaler(),
... RandomForestClassifier()
... )
The implementation is equivalent to:
>>> X_slices, y_slices = splitter.take(X), splitter.take(y)
>>> y_tests = []
>>> y_preds = []
>>> for split in X_slices.index.unique(level="split"):
... X_train_slice = X_slices[(split, "train")]
... y_train_slice = y_slices[(split, "train")]
... X_test_slice = X_slices[(split, "test")]
... y_test_slice = y_slices[(split, "test")]
... try:
... fitted_pipe = pipeline.fit(X_train_slice, y_train_slice)
... test_pred = fitted_pipe.predict(X_test_slice)
... test_pred_ser = pd.Series(test_pred, index=y_test_slice.index)
... y_tests.append(y_test_slice)
... y_preds.append(test_pred_ser)
... except Exception:
... continue
>>> y_test = pd.concat(y_tests).rename("labels")
>>> y_pred = pd.concat(y_preds).rename("preds")
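As a follow-up sanity check (not part of meta_model_cv itself), the concatenated out-of-sample labels and predictions can be compared directly. The values below are illustrative toy data, not real model output:

```python
import pandas as pd

# Toy stand-ins for the concatenated y_test / y_pred series above.
y_test = pd.Series([1, 0, 1, 1], name="labels")
y_pred = pd.Series([1, 0, 0, 1], name="preds")

# Fraction of out-of-sample periods where the classifier hit the label.
hit_rate = (y_test == y_pred).mean()
print(hit_rate)  # 0.75
```

Because the concatenated series are index-aligned by construction, elementwise comparison is safe without further reindexing.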