BaseMetaModel

BaseMetaModel(
    data: vectorbtpro.data.base.Data,
    s1: str,
    s2: str,
    window_in_days: int,
    minp: int,
    runners: Dict[str, vectorbtpro.portfolio.base.Portfolio],
    features: pandas.core.frame.DataFrame | numpy.ndarray,
    add_features: Dict[str, Callable],
)
A model class for feature processing and category management. This class provides utilities to work with feature data, extract categories, and generate visualizations for model analysis. Method generated by attrs for class BaseMetaModel.

Static methods

set_keys

set_keys(
    with_id: bool,
    reward_registry: list,
) ‑> List[str]
Set keys.
Parameters:
| Name | Type | Default | Description |
| --- | --- | --- | --- |
| with_id | bool, optional | -- | Output id columns if True; otherwise, model name columns. Setting with_id to True is necessary later in the process to retrieve model parameters. Defaults to True. |
| reward_registry | list, optional | -- | Config of models to use for metric calculation. If None, uses config. |
Returns:
| Type | Description |
| --- | --- |
| tp.List[str] | Keys. |

from_model_config

from_model_config(
    window_in_days: int,
    minp: int = None,
    metrics: str | List[str] = 'sharpe_ratio',
    with_id: bool = True,
    reward_registry: List[vectorbtpro.utils.config.FrozenConfig] = None,
    add_features: Dict[str, Callable] = None,
) ‑> systematica.models.meta_model.base.BaseMetaModel
Create a Model instance from rolling metrics.
Parameters:
| Name | Type | Default | Description |
| --- | --- | --- | --- |
| window_in_days | int | -- | Rolling window size in days. |
| minp | int | None | Minimum number of observations required. |
| metrics | str \| List[str] | sharpe_ratio | Metric(s) to calculate from the data. |
| with_id | bool | True | Output id columns if True; otherwise, model name columns. Setting with_id to True is necessary later in the process to retrieve model parameters. Defaults to True. |
| reward_registry | List[vbt.FrozenConfig] | None | Config of models to use for metric calculation. If None, uses ModelRewardRegistry. |
| add_features | tp.Dict[str, tp.Callable] | None | Add features to the data object. If None, adds returns (rets) by default (see State.from_config). Defaults to None. |
Returns:
| Type | Description |
| --- | --- |
| BaseMetaModel | A new Model instance with features populated from calculated rolling metrics. |
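To make the roles of window_in_days and minp concrete, here is a minimal sketch of a rolling Sharpe ratio computed with plain pandas. It is not the library's implementation: the helper name rolling_sharpe, the zero risk-free rate, the annualization factor of 365, and the assumption that one row equals one day are all illustrative.

```python
from typing import Optional

import numpy as np
import pandas as pd


def rolling_sharpe(
    rets: pd.Series,
    window: int,
    minp: Optional[int] = None,
    ann_factor: int = 365,  # assumption: daily bars, 365 trading days
) -> pd.Series:
    """Annualized rolling Sharpe ratio with a zero risk-free rate (hypothetical helper)."""
    # With min_periods=None pandas waits for a full window before emitting a value.
    roll = rets.rolling(window, min_periods=minp)
    return roll.mean() / roll.std() * np.sqrt(ann_factor)


rng = np.random.default_rng(0)
index = pd.date_range("2024-01-01", periods=120, freq="D")
rets = pd.Series(rng.normal(0.001, 0.02, size=120), index=index)

full = rolling_sharpe(rets, window=30)            # first 29 rows are NaN
early = rolling_sharpe(rets, window=30, minp=10)  # first 9 rows are NaN
```

A smaller minp produces values earlier, at the cost of noisier estimates at the start of the series.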

from_neptune_config

from_neptune_config(
    window_in_days: int,
    minp: int = None,
    metrics: str | List[str] = 'sharpe_ratio',
    with_id: bool = True,
    reward_registry: List[vectorbtpro.utils.config.FrozenConfig] = None,
) ‑> systematica.models.meta_model.base.BaseMetaModel
Create a Model instance from rolling metrics.
Parameters:
| Name | Type | Default | Description |
| --- | --- | --- | --- |
| selector | tp.Dict[str, str \| int \| BaseTrialSelector] | -- | The ID(s) of the Neptune run to fetch, mapped to a trial selector. The trial selector is a custom parameter selection. If None, retrieve 'best/params' from Neptune. If int, retrieve that trial number. If BaseTrialSelector, retrieve params based on the selector's algorithm. |
| window_in_days | int | -- | Rolling window size in days. |
| minp | int | None | Minimum number of observations required. |
| metrics | str \| List[str] | sharpe_ratio | Metric(s) to calculate from the data. |
| with_id | bool | True | Output id columns if True; otherwise, model name columns. Setting with_id to True is necessary later in the process to retrieve model parameters. Defaults to True. |
| reward_registry | tp.List[vbt.FrozenConfig] | None | Config of models to use for metric calculation. If None, uses NEPTUNE_reward_registry. |
Returns:
| Type | Description |
| --- | --- |
| BaseMetaModel | A new Model instance with features populated from calculated rolling metrics. |

Instance variables

  • add_features: Dict[str, Callable]:
  • categories: pandas.core.series.Series:
  • category_mapping: Dict[str, int]:
  • data: vectorbtpro.data.base.Data:
  • features: pandas.core.frame.DataFrame | numpy.ndarray:
  • label_mapping: Dict[str, int]:
  • minp: int:
  • model_mapping: Dict[str, systematica.portfolio.analyzer.PortfolioAnalyzer]:
  • runners: Dict[str, vectorbtpro.portfolio.base.Portfolio]:
  • s1: str:
  • s2: str:
  • window_in_days: int:

Methods

run_clf

run_clf(
    self,
    splitter: str = 'from_custom_rolling',
    custom_splitter: str = None,
    custom_splitter_kwargs: Dict[str, Any] = None,
    preprocessor: sklearn.base.BaseEstimator = None,
    estimator: sklearn.base.BaseEstimator = None,
    training_window: vectorbtpro.utils.params.Param | int = 365,
    testing_window: vectorbtpro.utils.params.Param | int = 60,
    n_steps: int = 1,
    downsample: str = '1d',
    state_registry: List[vectorbtpro.utils.config.FrozenConfig] = None,
    to_numpy: bool = False,
    raw_output: bool = False,
    **split_kwargs,
) ‑> pandas.core.frame.DataFrame | numpy.ndarray
Meta classifier cross-validation (CV).
Parameters:
| Name | Type | Default | Description |
| --- | --- | --- | --- |
| splitter | str | from_custom_rolling | The method for splitting the data into training and testing sets. Default is "from_custom_rolling". |
| custom_splitter | str | None | Custom splitter function to use. Default is None. |
| custom_splitter_kwargs | tp.Kwargs | None | Custom arguments for the splitter function. Default is None. |
| preprocessor | BaseEstimator | None | Standardize features. If None, defaults to StandardScaler. Defaults to None. |
| estimator | BaseEstimator | None | Classifier model. If None, defaults to Logistic Regression (aka logit, MaxEnt) classifier. Defaults to None. |
| training_window | int | 365 | The size of the training window for cross-validation. Default is 365. |
| testing_window | int | 60 | The size of the testing window for cross-validation. Default is 60. |
| n_steps | int | 1 | Number of positions to shift the target backward. This operation intentionally looks ahead to train the model! Must be positive. Defaults to 1. |
| downsample | str | 1d | Resample data before state computation to speed up the process. If None, no resampling is performed. Defaults to 1d (daily). |
| state_registry | tp.List[vbt.FrozenConfig] | None | State config to use. If None, defaults to STATE_CONFIG. The default is None. |
| to_numpy | bool | False | Whether to return the result as a NumPy array. Default is False. |
| raw_output | bool | False | Whether to return the raw output without any alignment. Default is False. |
| split_kwargs | tp.Kwargs | -- | Additional keyword arguments for the vectorBT PRO splitter. |
Returns:
| Type | Description |
| --- | --- |
| pd.DataFrame \| tp.Array2d | Trained model output. |
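The interaction between n_steps, training_window, and testing_window can be sketched with scikit-learn alone. The loop below is an assumption about what a rolling split looks like, not the behaviour of the from_custom_rolling splitter; the synthetic data and the names X_, y_, and y_future are illustrative. It uses the defaults described above (StandardScaler followed by LogisticRegression).

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(42)
n = 600
X = pd.DataFrame(rng.normal(size=(n, 3)), columns=["f0", "f1", "f2"])
# Synthetic label at time t driven by the feature observed at t - 1.
noise = rng.normal(scale=0.5, size=n)
y = ((X["f0"].shift(1) + noise) > 0).astype(int)

n_steps = 1
# Shift the target backward so row t carries the label observed at t + n_steps.
y_future = y.shift(-n_steps)
valid = y_future.notna()
X_, y_ = X[valid], y_future[valid].astype(int)

training_window, testing_window = 365, 60
predictions = []
start = 0
while start + training_window + testing_window <= len(X_):
    train = slice(start, start + training_window)
    test = slice(start + training_window, start + training_window + testing_window)
    model = make_pipeline(StandardScaler(), LogisticRegression())
    model.fit(X_.iloc[train], y_.iloc[train])
    predictions.append(pd.Series(model.predict(X_.iloc[test]), index=X_.index[test]))
    start += testing_window  # roll both windows forward by one test block

# Out-of-sample predictions stitched together across folds.
y_pred = pd.concat(predictions)
```

Each test block is predicted by a model fitted only on the rows immediately before it, so the concatenated predictions are out-of-sample with respect to the features.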

get_target

get_target(
    self,
    to_numpy: bool = False,
) ‑> numpy.ndarray | pandas.core.series.Series
Convert categorical labels to numeric codes.
Parameters:
| Name | Type | Default | Description |
| --- | --- | --- | --- |
| to_numpy | bool | False | Output a NumPy array if True; a pandas Series otherwise. Defaults to False. |
Returns:
| Type | Description |
| --- | --- |
| tp.Array1d \| pd.Series | Array or Series containing the numeric encoding of categories. |
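As a rough illustration of this kind of encoding, pandas can map categorical labels to integer codes directly. The label values below are made up, and the real category_mapping may be built differently.

```python
import pandas as pd

# Hypothetical model names acting as categorical labels.
categories = pd.Series(
    ["model_a", "model_b", "model_a", "model_c"], dtype="category"
)

# Mapping from label to integer code (categories are sorted lexicographically).
category_mapping = {
    name: code for code, name in enumerate(categories.cat.categories)
}

target = categories.cat.codes   # pandas Series of integer codes
target_np = target.to_numpy()   # NumPy equivalent, as with to_numpy=True
```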

get_inputs

get_inputs(
    self,
    downsample: str = '1d',
    to_numpy: bool = False,
    state_registry: List[vectorbtpro.utils.config.FrozenConfig] = None,
) ‑> pandas.core.frame.DataFrame | numpy.ndarray
Get the state representation: input (X).
To include data in the vbt.Data object, use the data.add_feature method as follows:
data = sma.load_clean_data("1d")
rets = sma.get_returns(data.close)
data = data.add_feature("rets", rets, missing_index="drop")
Parameters:
| Name | Type | Default | Description |
| --- | --- | --- | --- |
| downsample | str | 1d | Resample data before state computation to speed up the process. If None, no resampling is performed. Defaults to 1d (daily). |
| to_numpy | bool | False | Output a NumPy array if True; a pandas DataFrame otherwise. The default is False. |
| state_registry | tp.List[StateConfig] | None | State config to use. If None, defaults to StateRegistry. The default is None. |
Returns:
| Type | Description |
| --- | --- |
| pd.DataFrame \| tp.Array2d | Input (exogenous) variables. |

get_accuracy_score

get_accuracy_score(
    self,
    y_pred: pandas.core.series.Series,
    normalize: bool = True,
) ‑> float
Compute the accuracy classification score. In multilabel classification, this function computes subset accuracy: the set of labels predicted for a sample must exactly match the corresponding set of labels in y_true.
See Also:
  • balanced_accuracy_score: Compute the balanced accuracy to deal with imbalanced datasets.
  • jaccard_score: Compute the Jaccard similarity coefficient score.
  • hamming_loss: Compute the average Hamming loss or Hamming distance between two sets of samples.
  • zero_one_loss: Compute the zero-one classification loss. By default, the function will return the percentage of imperfectly predicted subsets.
Parameters:
| Name | Type | Default | Description |
| --- | --- | --- | --- |
| y_pred | pd.Series | -- | Predicted labels, as returned by a classifier. |
| normalize | bool | True | If False, return the number of correctly classified samples. Otherwise, return the fraction of correctly classified samples. |
Returns:
| Type | Description |
| --- | --- |
| float or int | If normalize=True, return the fraction of correctly classified samples (float); otherwise, return the number of correctly classified samples (int). The best performance is 1 with normalize=True and the number of samples with normalize=False. |
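The effect of normalize can be checked by hand. The function below is a minimal NumPy sketch of the single-label case, not the method's implementation; the name accuracy and the sample labels are illustrative.

```python
import numpy as np


def accuracy(y_true, y_pred, normalize=True):
    """Fraction (normalize=True) or count (normalize=False) of exact matches."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    correct = int((y_true == y_pred).sum())
    return correct / len(y_true) if normalize else correct


y_true = [0, 1, 2, 2, 1]
y_pred = [0, 2, 2, 2, 1]

fraction = accuracy(y_true, y_pred)               # 4 of 5 correct -> 0.8
count = accuracy(y_true, y_pred, normalize=False)  # 4
```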

get_report

get_report(
    self,
    y_pred: pandas.core.series.Series,
) ‑> pandas.core.frame.DataFrame
Build a report showing the main classification metrics.
See Also:
  • precision_recall_fscore_support: Compute precision, recall, F-measure and support for each class.
  • confusion_matrix: Compute confusion matrix to evaluate the accuracy of a classification.
  • multilabel_confusion_matrix: Compute a confusion matrix for each class or sample.
Parameters:
| Name | Type | Default | Description |
| --- | --- | --- | --- |
| y_pred | pd.Series | -- | Estimated targets as returned by a classifier. |
Returns:
| Type | Description |
| --- | --- |
| pd.DataFrame | DataFrame summary of the precision, recall, and F1 score for each class. |
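For intuition about what such a report contains, the per-class metrics can be computed by hand. This is a sketch under the usual definitions of precision, recall, and F1; the helper classification_table and its column names are assumptions and may not match the DataFrame layout returned by get_report.

```python
import numpy as np
import pandas as pd


def classification_table(y_true, y_pred) -> pd.DataFrame:
    """Per-class precision, recall, F1 and support (hypothetical helper)."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    rows = {}
    for label in np.unique(np.concatenate([y_true, y_pred])):
        hits = int(np.sum((y_true == label) & (y_pred == label)))
        predicted = int(np.sum(y_pred == label))  # times the class was predicted
        support = int(np.sum(y_true == label))    # times the class actually occurs
        precision = hits / predicted if predicted else 0.0
        recall = hits / support if support else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        rows[int(label)] = {
            "precision": precision,
            "recall": recall,
            "f1-score": f1,
            "support": support,
        }
    return pd.DataFrame.from_dict(rows, orient="index")


report = classification_table([0, 1, 2, 2, 1], [0, 2, 2, 2, 1])
```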

get_confusion_matrix

get_confusion_matrix(
    self,
    y_pred: pandas.core.series.Series,
    normalize: str = None,
) ‑> pandas.core.frame.DataFrame
Compute confusion matrix to evaluate the accuracy of a classification. By definition, a confusion matrix $C$ is such that $C_{i,j}$ is equal to the number of observations known to be in group $i$ and predicted to be in group $j$. Thus, in binary classification, the count of true negatives is $C_{0,0}$, false negatives is $C_{1,0}$, true positives is $C_{1,1}$, and false positives is $C_{0,1}$.
See Also:
  • ConfusionMatrixDisplay.from_estimator: Plot the confusion matrix given an estimator, the data, and the label.
  • ConfusionMatrixDisplay.from_predictions: Plot the confusion matrix given the true and predicted labels.
  • ConfusionMatrixDisplay: Confusion Matrix visualization.
Parameters:
| Name | Type | Default | Description |
| --- | --- | --- | --- |
| y_pred | pd.Series | -- | Estimated targets as returned by a classifier. |
| normalize | str | None | Normalizes confusion matrix over the true (rows), predicted (columns) conditions or all the population. If None, confusion matrix will not be normalized. |
Returns:
| Type | Description |
| --- | --- |
| pd.DataFrame | Confusion matrix whose i-th row and j-th column entry indicates the number of samples with true label being i-th class and predicted label being j-th class. |
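The definition of $C_{i,j}$ and the three normalization modes can be reproduced in a few lines. This sketch assumes the normalize argument accepts the scikit-learn values "true", "pred", and "all"; the helper name confusion is illustrative, and the division assumes every class appears at least once in the relevant row or column.

```python
import numpy as np
import pandas as pd


def confusion(y_true, y_pred, normalize=None) -> pd.DataFrame:
    """Rows are true labels, columns are predicted labels (hypothetical helper)."""
    labels = sorted(set(y_true) | set(y_pred))
    pos = {label: k for k, label in enumerate(labels)}
    C = np.zeros((len(labels), len(labels)), dtype=float)
    for t, p in zip(y_true, y_pred):
        C[pos[t], pos[p]] += 1  # C[i, j]: true class i, predicted class j
    if normalize == "true":    # each row sums to 1
        C = C / C.sum(axis=1, keepdims=True)
    elif normalize == "pred":  # each column sums to 1
        C = C / C.sum(axis=0, keepdims=True)
    elif normalize == "all":   # whole matrix sums to 1
        C = C / C.sum()
    return pd.DataFrame(C, index=labels, columns=labels)


y_true = [0, 0, 1, 1, 1]
y_pred = [0, 1, 1, 1, 0]

counts = confusion(y_true, y_pred)
by_true = confusion(y_true, y_pred, normalize="true")
```

Here counts holds one true negative, one false positive, one false negative, and two true positives, matching the index convention described above.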

get_feature_importance

get_feature_importance(
    self,
    X: pandas.core.frame.DataFrame,
    tree_based_estimator: sklearn.ensemble._forest.RandomForestClassifier,
) ‑> pandas.core.frame.DataFrame
The impurity-based feature importances. The higher, the more important the feature. The importance of a feature is computed as the (normalized) total reduction of the criterion brought by that feature. It is also known as the Gini importance.
Impurity-based feature importances can be misleading for high cardinality features (many unique values). See sklearn.inspection.permutation_importance as an alternative.
Parameters:
| Name | Type | Default | Description |
| --- | --- | --- | --- |
| X | pd.DataFrame | -- | State representation: input (X). |
| tree_based_estimator | RandomForestClassifier | -- | Tree-based estimator exposing the feature_importances_ attribute. |
Returns:
| Type | Description |
| --- | --- |
| pd.DataFrame | Feature importances. The values sum to 1, unless all trees are single-node trees consisting of only the root node, in which case they are all zeros. |
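A short scikit-learn sketch shows where these values come from and that they sum to 1. The synthetic data, the column names, and the layout of the resulting DataFrame are assumptions for illustration; get_feature_importance may arrange its output differently.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
X = pd.DataFrame(
    rng.normal(size=(300, 3)), columns=["signal", "noise_a", "noise_b"]
)
# The label depends only on the "signal" column.
y = (X["signal"] > 0).astype(int)

forest = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, y)

# Impurity-based (Gini) importances, one value per feature.
importance = pd.DataFrame(
    {"importance": forest.feature_importances_}, index=X.columns
).sort_values("importance", ascending=False)
```

Because the label is a deterministic function of one column, that column should dominate the ranking; on real features, permutation importance is a useful cross-check, as noted above.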

plot_rolling_metrics

plot_rolling_metrics(
    self,
) ‑> vectorbtpro.utils.figure.FigureWidget
Visualize rolling metrics from features.
Returns:
| Type | Description |
| --- | --- |
| vbt.FigureWidget | A visualization of the features using vectorbtpro plotting. |

plot_target

plot_target(
    self,
) ‑> vectorbtpro.utils.figure.FigureWidget
Visualize encoded labels.
Returns:
| Type | Description |
| --- | --- |
| vbt.FigureWidget | A visualization of the encoded labels using vectorbtpro plotting. |

plot_heatmap_overlay

plot_heatmap_overlay(
    self,
    y_test_or_pred: pandas.core.series.Series,
    **layout_kwargs,
) ‑> vectorbtpro.utils.figure.FigureWidget
Plot a Series as a line and overlay it with a heatmap.
Parameters:
| Name | Type | Default | Description |
| --- | --- | --- | --- |
| y_test_or_pred | pd.Series | -- | Labels or estimated targets as returned by a classifier. |
| layout_kwargs | tp.Kwargs | -- | Additional Plotly layout keyword arguments. |
Returns:
| Type | Description |
| --- | --- |
| vbt.FigureWidget | Figure showing the Series as a line with a heatmap overlay. |