# Base

> systematica.models.meta_model.base

## `BaseMetaModel`

```python theme={null}
BaseMetaModel(
    data: vectorbtpro.data.base.Data,
    s1: str,
    s2: str,
    window_in_days: int,
    minp: int,
    runners: Dict[str, vectorbtpro.portfolio.base.Portfolio],
    features: pandas.core.frame.DataFrame | numpy.ndarray,
    add_features: Dict[str, Callable],
)
```

A model class for feature processing and category management.

This class provides utilities to work with feature data, extract categories,
and generate visualizations for model analysis.


### Static methods

#### `set_keys`

```python theme={null}
set_keys(
    with_id: bool,
    reward_registry: list,
) ‑> List[str]
```

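Build the list of column keys for the models in the reward registry: id keys if `with_id` is `True`, model-name keys otherwise.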

**Parameters**:

| Name              | Type             | Default | Description                                                                                                                                                           |
| ----------------- | ---------------- | ------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `with_id`         | `bool, optional` | `--`    | Output id columns if True. Otherwise, model name columns. Setting `with_id` to True is necessary later in the process to retrieve model parameters. Defaults to True. |
| `reward_registry` | `list, optional` | `--`    | Config of models to use for metric calculation. If None, uses config.                                                                                                 |

**Returns**:

| Type           | Description |
| -------------- | ----------- |
| `tp.List[str]` | Keys.       |

#### `from_model_config`

```python theme={null}
from_model_config(
    window_in_days: int,
    minp: int = None,
    metrics: str | List[str] = 'sharpe_ratio',
    with_id: bool = True,
    reward_registry: List[vectorbtpro.utils.config.FrozenConfig] = None,
    add_features: Dict[str, Callable] = None,
) ‑> systematica.models.meta_model.base.BaseMetaModel
```

Create a Model instance from rolling metrics.

<Notes>
  Equivalent to:

  ```python theme={null}
  for config in model_reward_registry:
      model = getattr(models, config.model)
      pf = model.run_pipeline(...)
      rolling_metric = default_metric_model.run_rolling_metric(...)
  ```
</Notes>

**Parameters**:

| Name              | Type                                          | Default        | Description                                                                                                                                                               |
| ----------------- | --------------------------------------------- | -------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `window_in_days`  | `int`                                         | `--`           | Rolling window size in days.                                                                                                                                              |
| `minp`            | `int`                                         | `None`         | Minimum number of observations required.                                                                                                                                  |
| `metrics`         | `str \| List[str]`                            | `sharpe_ratio` | Metric(s) to calculate from the data.                                                                                                                                     |
| `with_id`         | `bool`                                        | `True`         | Output id columns if `True`. Otherwise, model name columns. Setting `with_id` to `True` is necessary later in the process to retrieve model parameters. Defaults to True. |
| `reward_registry` | `List[vbt.FrozenConfig]`                      | `None`         | Config of models to use for metric calculation. If `None`, uses `ModelRewardRegistry`.                                                                                    |
| `add_features`    | `tp.Dict[str, tp.Callable]`                   | `None`         | Add features to the data object. If `None`, add returns `rets` by default (see `State.from_config`). Defaults to `None`.                                                  |

**Returns**:

| Type            | Description                                                                   |
| --------------- | ----------------------------------------------------------------------------- |
| `BaseMetaModel` | A new Model instance with features populated from calculated rolling metrics. |
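
To make the roles of `window_in_days` and `minp` concrete, here is a minimal, dependency-free sketch of a rolling Sharpe ratio. It is not the `systematica` implementation: the helper name `rolling_sharpe`, the zero risk-free rate, the 365-period annualization, the one-observation-per-day input, and the fallback of `minp` to the window size are all assumptions made for the example.

```python
import math
from statistics import mean, stdev
from typing import List, Optional


def rolling_sharpe(
    returns: List[float],
    window: int,
    minp: Optional[int] = None,
    periods_per_year: int = 365,  # assumption: one observation per calendar day
) -> List[Optional[float]]:
    """Annualized rolling Sharpe ratio with a zero risk-free rate (hypothetical helper)."""
    minp = window if minp is None else minp  # assumption: minp falls back to the window size
    out: List[Optional[float]] = []
    for end in range(1, len(returns) + 1):
        chunk = returns[max(0, end - window):end]
        # A sample standard deviation needs at least two observations.
        if len(chunk) < max(minp, 2):
            out.append(None)
            continue
        sd = stdev(chunk)
        out.append(None if sd == 0 else mean(chunk) / sd * math.sqrt(periods_per_year))
    return out


daily_returns = [0.01, -0.02, 0.015, 0.005, -0.01, 0.02]
sharpe = rolling_sharpe(daily_returns, window=3, minp=2)
print(sharpe)
```

With `minp=2` the first value is missing and every later value is defined; with `minp=None` the output stays missing until a full window is available.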

#### `from_neptune_config`

```python theme={null}
from_neptune_config(
    window_in_days: int,
    minp: int = None,
    metrics: str | List[str] = 'sharpe_ratio',
    with_id: bool = True,
    reward_registry: List[vectorbtpro.utils.config.FrozenConfig] = None,
) ‑> systematica.models.meta_model.base.BaseMetaModel
```

Create a Model instance from rolling metrics, using model parameters retrieved from Neptune runs.

**Parameters**:

| Name              | Type                                            | Default        | Description                                                                                                                                                                                                                                                |
| ----------------- | ----------------------------------------------- | -------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `selector`        | `tp.Dict[str, str \| int \| BaseTrialSelector]` | `--`           | Neptune run ID(s) to fetch, each mapped to a trial selector (a custom parameter selection). If `None`, retrieve `best/params` from Neptune. If `int`, retrieve that trial number. If `BaseTrialSelector`, retrieve params based on its algorithm.          |
| `window_in_days`  | `int`                                           | `--`           | Rolling window size in days.                                                                                                                                                                                                                               |
| `minp`            | `int`                                           | `None`         | Minimum number of observations required.                                                                                                                                                                                                                   |
| `metrics`         | `str`                                           | `sharpe_ratio` | Metric(s) to calculate from the data.                                                                                                                                                                                                                      |
| `with_id`         | `bool`                                          | `True`         | Output id columns if `True`. Otherwise, model name columns. Setting `with_id` to `True` is necessary later in the process to retrieve model parameters. Defaults to `True`.                                                                                |
| `reward_registry` | `tp.List[vbt.FrozenConfig]`                     | `None`         | Config of models to use for metric calculation. If `None`, uses `NEPTUNE_reward_registry`.                                                                                                                                                                 |

**Returns**:

| Type            | Description                                                                   |
| --------------- | ----------------------------------------------------------------------------- |
| `BaseMetaModel` | A new Model instance with features populated from calculated rolling metrics. |

### Instance variables

* `add_features: Dict[str, Callable]`:

* `categories: pandas.core.series.Series`:

* `category_mapping: Dict[str, int]`:

* `data: vectorbtpro.data.base.Data`:

* `features: pandas.core.frame.DataFrame | numpy.ndarray`:

* `label_mapping: Dict[str, int]`:

* `minp: int`:

* `model_mapping: Dict[str, systematica.portfolio.analyzer.PortfolioAnalyzer]`:

* `runners: Dict[str, vectorbtpro.portfolio.base.Portfolio]`:

* `s1: str`:

* `s2: str`:

* `window_in_days: int`:

### Methods

#### `run_clf`

```python theme={null}
run_clf(
    self,
    splitter: str = 'from_custom_rolling',
    custom_splitter: str = None,
    custom_splitter_kwargs: Dict[str, Any] = None,
    preprocessor: sklearn.base.BaseEstimator = None,
    estimator: sklearn.base.BaseEstimator = None,
    training_window: vectorbtpro.utils.params.Param | int = 365,
    testing_window: vectorbtpro.utils.params.Param | int = 60,
    n_steps: int = 1,
    downsample: str = '1d',
    state_registry: List[vectorbtpro.utils.config.FrozenConfig] = None,
    to_numpy: bool = False,
    raw_output: bool = False,
    **split_kwargs,
) ‑> pandas.core.frame.DataFrame | numpy.ndarray
```

Run cross-validation (CV) for the meta classifier.

**Parameters**:

| Name                     | Type                        | Default               | Description                                                                                                                                          |
| ------------------------ | --------------------------- | --------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------- |
| `splitter`               | `str`                       | `from_custom_rolling` | The method for splitting the data into training and testing sets. Default is "from\_custom\_rolling".                                                |
| `custom_splitter`        | `str`                       | `None`                | Custom splitter function to use. Default is `None`.                                                                                                  |
| `custom_splitter_kwargs` | `tp.Kwargs`                 | `None`                | Custom arguments for the splitter function. Default is `None`.                                                                                       |
| `preprocessor`           | `BaseEstimator`             | `None`                | Standardize features. If `None`, defaults to `StandardScaler`. Defaults to `None`.                                                                   |
| `estimator`              | `BaseEstimator`             | `None`                | Classifier model. If `None`, defaults to Logistic Regression (aka `logit`, `MaxEnt`) classifier. Defaults to `None`.                                 |
| `training_window`        | `int`                       | `365`                 | The size of the training window for cross-validation. Default is `365`.                                                                              |
| `testing_window`         | `int`                       | `60`                  | The size of the testing window for cross-validation. Default is `60`.                                                                                |
| `n_steps`                | `int`                       | `1`                   | Number of periods to shift backward. This operation intentionally looks ahead to train the model. Must be positive. Defaults to `1`.                 |
| `downsample`             | `str`                       | `1d`                  | Resample data before state computation to speed up the process. If None, no resampling is performed. Defaults to `1d` (daily).                       |
| `state_registry`         | `tp.List[vbt.FrozenConfig]` | `None`                | State config to use. If `None`, defaults to `STATE_CONFIG`. The default is `None`.                                                                   |
| `to_numpy`               | `bool`                      | `False`               | Whether to return the result as a NumPy array. Default is `False`.                                                                                   |
| `raw_output`             | `bool`                      | `False`               | Whether to return the raw output without any alignment. Default is `False`.                                                                          |
| `split_kwargs`           | `tp.Kwargs`                 | `--`                  | Additional keyword arguments for the vectorBT PRO splitter.                                                                                          |

**Returns**:

| Type                          | Description           |
| ----------------------------- | --------------------- |
| `pd.DataFrame \| tp.Array2d`  | Trained model output. |
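
The interplay of `training_window`, `testing_window`, and `n_steps` can be sketched without any library. The example below is a generic walk-forward scheme, not the behavior of the vectorBT PRO `from_custom_rolling` splitter: the helpers `shift_target` and `rolling_splits` are hypothetical, and it assumes that the backward shift applies to the target labels and that consecutive folds advance by one testing window.

```python
from typing import List, Optional, Sequence, Tuple


def shift_target(labels: Sequence[int], n_steps: int = 1) -> List[Optional[int]]:
    """Pair each row with the label observed n_steps periods later (look-ahead)."""
    if n_steps < 1:
        raise ValueError("n_steps must be positive")
    shifted = list(labels[n_steps:])
    # The last n_steps rows have no future label yet.
    return shifted + [None] * (len(labels) - len(shifted))


def rolling_splits(
    n_samples: int, training_window: int, testing_window: int
) -> List[Tuple[range, range]]:
    """Fixed-size training windows, each followed by a non-overlapping test fold."""
    splits = []
    start = 0
    while start + training_window + testing_window <= n_samples:
        train = range(start, start + training_window)
        test = range(train.stop, train.stop + testing_window)
        splits.append((train, test))
        start += testing_window  # assumption: slide forward by one test fold
    return splits


print(shift_target([0, 1, 2, 1], n_steps=1))  # [1, 2, 1, None]
for train, test in rolling_splits(n_samples=10, training_window=4, testing_window=2):
    print(list(train), list(test))
```

Each test fold starts right after its training window, so the model is always evaluated on rows it has not seen during fitting.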

#### `get_target`

```python theme={null}
get_target(
    self,
    to_numpy: bool = False,
) ‑> numpy.ndarray | pandas.core.series.Series
```

Convert categorical labels to numeric codes.

**Parameters**:

| Name       | Type   | Default | Description                                                                     |
| ---------- | ------ | ------- | ------------------------------------------------------------------------------- |
| `to_numpy` | `bool` | `False` | Output a NumPy array if `True`, a pandas Series otherwise. Defaults to `False`. |

**Returns**:

| Type                      | Description                                                    |
| ------------------------- | -------------------------------------------------------------- |
| `tp.Array1d \| pd.Series` | Array or Series containing the numeric encoding of categories. |
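
As a rough illustration of label encoding, the sketch below maps each distinct category to an integer code. The helper `encode_categories`, the example labels, and the choice of assigning codes in sorted label order are assumptions; the actual mapping used by the class is the one exposed through `category_mapping`.

```python
from typing import Dict, List, Sequence, Tuple


def encode_categories(categories: Sequence[str]) -> Tuple[List[int], Dict[str, int]]:
    """Map each distinct label to an integer code (codes follow sorted label order)."""
    mapping = {label: code for code, label in enumerate(sorted(set(categories)))}
    return [mapping[label] for label in categories], mapping


# Hypothetical category labels, used only for illustration.
codes, mapping = encode_categories(["model_b", "model_a", "model_b", "model_c"])
print(codes)    # [1, 0, 1, 2]
print(mapping)  # {'model_a': 0, 'model_b': 1, 'model_c': 2}
```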

#### `get_inputs`

```python theme={null}
get_inputs(
    self,
    downsample: str = '1d',
    to_numpy: bool = False,
    state_registry: List[vectorbtpro.utils.config.FrozenConfig] = None,
) ‑> pandas.core.frame.DataFrame | numpy.ndarray
```

Get state representation: Input (X).

<Note>
  To include data in the `vbt.Data` object, use the `data.add_feature` method as follows:

  ```python theme={null}
  data = sma.load_clean_data("1d")
  rets = sma.get_returns(data.close)
  data = data.add_feature("rets", rets, missing_index="drop")
  ```
</Note>

**Parameters**:

| Name             | Type                   | Default | Description                                                                                                                      |
| ---------------- | ---------------------- | ------- | -------------------------------------------------------------------------------------------------------------------------------- |
| `downsample`     | `str`                  | `1d`    | Resample data before state computation to speed up the process. If `None`, no resampling is performed. Defaults to `1d` (daily). |
| `to_numpy`       | `bool`                 | `False` | Output Numpy array if `True`. Pandas DataFrame object otherwise. The default is `False`.                                         |
| `state_registry` | `tp.List[StateConfig]` | `None`  | State config to use. If `None`, defaults to `StateRegistry`. The default is `None`.                                              |

**Returns**:

| Type                         | Description                  |
| ---------------------------- | ---------------------------- |
| `pd.DataFrame \| tp.Array2d` | Input (exogenous) variables. |

#### `get_accuracy_score`

```python theme={null}
get_accuracy_score(
    self,
    y_pred: pandas.core.series.Series,
    normalize: bool = True,
) ‑> float
```

Get the accuracy classification score.

In multilabel classification, this function computes subset accuracy:
the set of labels predicted for a sample must *exactly* match the
corresponding set of labels in `y_true`.

**See Also**:

* `balanced_accuracy_score`: Compute the balanced accuracy to deal with imbalanced datasets.
* `jaccard_score`: Compute the Jaccard similarity coefficient score.
* `hamming_loss`: Compute the average Hamming loss or Hamming distance between two sets of samples.
* `zero_one_loss`: Compute the zero-one classification loss. By default, the function will return the percentage of imperfectly predicted subsets.

**Parameters**:

| Name        | Type        | Default | Description                                                                                                                    |
| ----------- | ----------- | ------- | ------------------------------------------------------------------------------------------------------------------------------ |
| `y_pred`    | `pd.Series` | `--`    | Predicted labels, as returned by a classifier.                                                                                 |
| `normalize` | `bool`      | `True`  | If `False`, return the number of correctly classified samples. Otherwise, return the fraction of correctly classified samples. |

**Returns**:

| Type           | Description                                                                                                                                                                                                                                              |
| -------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `float or int` | If `normalize=True`, return the fraction of correctly classified samples (`float`), else returns the number of correctly classified samples (`int`). The best performance is `1` with `normalize=True` and the number of samples with `normalize=False`. |
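
The effect of `normalize` can be shown with a small plain-Python sketch for single-label predictions. The helper `accuracy` is hypothetical and only mirrors the behavior described above; it is not the method's implementation.

```python
from typing import Sequence, Union


def accuracy(
    y_true: Sequence[int], y_pred: Sequence[int], normalize: bool = True
) -> Union[float, int]:
    """Fraction (normalize=True) or count (normalize=False) of exact matches."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and of equal length")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true) if normalize else correct


y_true = [0, 1, 2, 1]
y_pred = [0, 2, 2, 1]
print(accuracy(y_true, y_pred))                   # 0.75
print(accuracy(y_true, y_pred, normalize=False))  # 3
```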

#### `get_report`

```python theme={null}
get_report(
    self,
    y_pred: pandas.core.series.Series,
) ‑> pandas.core.frame.DataFrame
```

Build a report showing the main classification metrics.

<Notes>
  The reported averages include macro average (averaging the unweighted
  mean per label), weighted average (averaging the support-weighted mean
  per label), and sample average (only for multilabel classification).
  Micro average (averaging the total true positives, false negatives and
  false positives) is only shown for multi-label or multi-class
  with a subset of classes, because it corresponds to accuracy
  otherwise and would be the same for all metrics.
  See also `precision_recall_fscore_support` for more details
  on averages.

  Note that in binary classification, recall of the positive class
  is also known as "sensitivity"; recall of the negative class is
  "specificity".
</Notes>

**See Also**:

* `precision_recall_fscore_support`: Compute precision, recall, F-measure and support for each class.
* `confusion_matrix`: Compute confusion matrix to evaluate the accuracy of a classification.
* `multilabel_confusion_matrix`: Compute a confusion matrix for each class or sample.

**Parameters**:

| Name     | Type        | Default | Description                                    |
| -------- | ----------- | ------- | ---------------------------------------------- |
| `y_pred` | `pd.Series` | `--`    | Estimated targets as returned by a classifier. |

**Returns**:

| Type           | Description                                                          |
| -------------- | -------------------------------------------------------------------- |
| `pd.DataFrame` | DataFrame summary of the precision, recall, F1 score for each class. |
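
To clarify how the per-class metrics and the macro and weighted averages relate, here is a minimal plain-Python sketch for single-label predictions. The helper `classification_summary` and its nested-dict output are assumptions for illustration; the method itself returns a DataFrame whose exact layout is not reproduced here.

```python
from collections import Counter
from typing import Dict, Sequence


def classification_summary(
    y_true: Sequence[int], y_pred: Sequence[int]
) -> Dict[str, Dict[str, float]]:
    """Per-class precision, recall, F1 and support, plus macro and weighted averages."""
    labels = sorted(set(y_true) | set(y_pred))
    support = Counter(y_true)      # number of true samples per class
    predicted = Counter(y_pred)    # number of predictions per class
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)

    report: Dict[str, Dict[str, float]] = {}
    for label in labels:
        precision = hits[label] / predicted[label] if predicted[label] else 0.0
        recall = hits[label] / support[label] if support[label] else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        report[str(label)] = {
            "precision": precision, "recall": recall,
            "f1-score": f1, "support": support[label],
        }

    for name in ("macro avg", "weighted avg"):
        # Macro: every class counts equally. Weighted: classes count by support.
        weights = [1 if name == "macro avg" else support[label] for label in labels]
        row = {
            metric: sum(report[str(label)][metric] * w for label, w in zip(labels, weights))
            / sum(weights)
            for metric in ("precision", "recall", "f1-score")
        }
        row["support"] = len(y_true)
        report[name] = row
    return report


summary = classification_summary([0, 0, 1, 1], [0, 1, 1, 1])
for name, row in summary.items():
    print(name, row)
```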

#### `get_confusion_matrix`

```python theme={null}
get_confusion_matrix(
    self,
    y_pred: pandas.core.series.Series,
    normalize: str = None,
) ‑> pandas.core.frame.DataFrame
```

Compute confusion matrix to evaluate the accuracy of a classification.

By definition a confusion matrix $C$ is such that $C_{i, j}$
is equal to the number of observations known to be in group $i$ and
predicted to be in group $j$.

Thus in binary classification, the count of true negatives is
$C_{0,0}$, false negatives is $C_{1,0}$, true positives is
$C_{1,1}$ and false positives is $C_{0,1}$.

**See Also**:

* `ConfusionMatrixDisplay.from_estimator`: Plot the confusion matrix
  given an estimator, the data, and the label.
* `ConfusionMatrixDisplay.from_predictions`: Plot the confusion matrix
  given the true and predicted labels.
* `ConfusionMatrixDisplay`: Confusion matrix visualization.

**Parameters**:

| Name        | Type        | Default | Description                                                                                                                                                 |
| ----------- | ----------- | ------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `y_pred`    | `pd.Series` | `--`    | Estimated targets as returned by a classifier.                                                                                                              |
| `normalize` | `str`       | `None`  | Normalizes confusion matrix over the true (rows), predicted (columns) conditions or all the population. If `None`, confusion matrix will not be normalized. |

**Returns**:

| Type           | Description                                                                                                                                                  |
| -------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------ |
| `pd.DataFrame` | Confusion matrix whose i-th row and j-th column entry indicates the number of samples with true label being i-th class and predicted label being j-th class. |
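
The definition of $C_{i, j}$ and the three normalization modes can be sketched in plain Python. The helper below is hypothetical and works on integer-coded labels; the option strings `"true"`, `"pred"`, and `"all"` follow scikit-learn's `confusion_matrix`, and it is an assumption that this method accepts the same values.

```python
from typing import List, Optional, Sequence


def confusion_matrix(
    y_true: Sequence[int],
    y_pred: Sequence[int],
    n_classes: int,
    normalize: Optional[str] = None,
) -> List[List[float]]:
    """matrix[i][j] counts samples with true class i and predicted class j."""
    matrix = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        matrix[t][p] += 1
    if normalize is None:
        return matrix
    if normalize == "true":  # each row (true class) sums to 1
        return [[v / (sum(row) or 1) for v in row] for row in matrix]
    if normalize == "pred":  # each column (predicted class) sums to 1
        col_sums = [sum(col) or 1 for col in zip(*matrix)]
        return [[v / s for v, s in zip(row, col_sums)] for row in matrix]
    if normalize == "all":   # the whole matrix sums to 1
        total = sum(map(sum, matrix)) or 1
        return [[v / total for v in row] for row in matrix]
    raise ValueError("normalize must be None, 'true', 'pred' or 'all'")


y_true = [0, 0, 1, 1]
y_pred = [0, 1, 1, 1]
print(confusion_matrix(y_true, y_pred, n_classes=2))                    # [[1, 1], [0, 2]]
print(confusion_matrix(y_true, y_pred, n_classes=2, normalize="true"))  # [[0.5, 0.5], [0.0, 1.0]]
```

In the unnormalized output, one true negative, one false positive, zero false negatives, and two true positives sit at positions (0,0), (0,1), (1,0), and (1,1), matching the binary layout described above.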

#### `get_feature_importance`

```python theme={null}
get_feature_importance(
    self,
    X: pandas.core.frame.DataFrame,
    tree_based_estimator: sklearn.ensemble._forest.RandomForestClassifier,
) ‑> pandas.core.frame.DataFrame
```

Compute the impurity-based feature importances.

The higher the value, the more important the feature.
The importance of a feature is computed as the (normalized)
total reduction of the criterion brought by that feature. It is also
known as the Gini importance.

<Warning>
  Impurity-based feature importances can be misleading for
  high cardinality features (many unique values). See `sklearn.inspection.permutation_importance` as an alternative.
</Warning>

**Parameters**:

| Name                   | Type                     | Default | Description                                                    |
| ---------------------- | ------------------------ | ------- | -------------------------------------------------------------- |
| `X`                    | `pd.DataFrame`           | `--`    | State representation: Input (X).                               |
| `tree_based_estimator` | `RandomForestClassifier` | `--`    | Tree-based estimator with a `feature_importances_` attribute.  |

**Returns**:

| Type           | Description                                                                                                                                               |
| -------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `pd.DataFrame` | The values of this array sum to `1`, unless all trees are single node trees consisting of only the root node, in which case it will be an array of zeros. |
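
The phrase "normalized total reduction of the criterion" can be made concrete with a toy calculation. The sketch below computes the Gini impurity decrease of a single threshold split per feature and normalizes the results so they sum to 1. It is a deliberately simplified stand-in, not the random-forest computation: the helper names, feature names, thresholds, and the one-split-per-feature setup are all assumptions.

```python
from collections import Counter
from typing import Dict, Sequence


def gini(labels: Sequence[int]) -> float:
    """Gini impurity: 1 minus the sum of squared class frequencies."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def impurity_decrease(values: Sequence[float], labels: Sequence[int], threshold: float) -> float:
    """Parent impurity minus the size-weighted impurity of the two children."""
    left = [y for x, y in zip(values, labels) if x <= threshold]
    right = [y for x, y in zip(values, labels) if x > threshold]
    n = len(labels)
    return gini(labels) - len(left) / n * gini(left) - len(right) / n * gini(right)


def normalized_importances(decreases: Dict[str, float]) -> Dict[str, float]:
    """Scale the decreases to sum to 1; all zeros if no split reduced impurity."""
    total = sum(decreases.values())
    if total == 0:
        return {name: 0.0 for name in decreases}
    return {name: value / total for name, value in decreases.items()}


labels = [0, 0, 1, 1]
features = {  # hypothetical feature columns
    "momentum": [0.1, 0.2, 0.8, 0.9],  # separates the classes perfectly at 0.5
    "noise": [0.3, 0.7, 0.2, 0.9],     # carries no information at 0.5
}
decreases = {name: impurity_decrease(col, labels, threshold=0.5) for name, col in features.items()}
importances = normalized_importances(decreases)
print(importances)  # {'momentum': 1.0, 'noise': 0.0}
```

The all-zeros branch corresponds to the case noted in the return description, where no split reduces the criterion.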

#### `plot_rolling_metrics`

```python theme={null}
plot_rolling_metrics(
    self,
) ‑> vectorbtpro.utils.figure.FigureWidget
```

Visualize rolling metrics from features.

**Returns**:

| Type               | Description                                                 |
| ------------------ | ----------------------------------------------------------- |
| `vbt.FigureWidget` | A visualization of the features using vectorbtpro plotting. |

#### `plot_target`

```python theme={null}
plot_target(
    self,
) ‑> vectorbtpro.utils.figure.FigureWidget
```

Visualize encoded labels.

**Returns**:

| Type               | Description                                                       |
| ------------------ | ----------------------------------------------------------------- |
| `vbt.FigureWidget` | A visualization of the encoded labels using vectorbtpro plotting. |

#### `plot_heatmap_overlay`

```python theme={null}
plot_heatmap_overlay(
    self,
    y_test_or_pred: pandas.core.series.Series,
    **layout_kwargs,
) ‑> vectorbtpro.utils.figure.FigureWidget
```

Plot a Series as a line and overlay it with a heatmap.

**Parameters**:

| Name             | Type        | Default | Description                                              |
| ---------------- | ----------- | ------- | -------------------------------------------------------- |
| `y_test_or_pred` | `pd.Series` | `--`    | Labels or estimated targets as returned by a classifier. |
| `layout_kwargs`  | `tp.Kwargs` | `--`    | Additional keyword arguments for the Plotly layout.      |

**Returns**:

| Type               | Description                                            |
| ------------------ | ------------------------------------------------------ |
| `vbt.FigureWidget` | Line plot of the Series overlaid with a heatmap.       |
