Overview
Statistical arbitrage models in Systematica are designed to identify and exploit temporary price divergences between related financial instruments. The framework provides several model categories, each implementing different statistical approaches to identify trading opportunities.
Ornstein-Uhlenbeck Process Models
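The Ornstein-Uhlenbeck (OU) process models implement mean-reverting statistical arbitrage based on the Ornstein-Uhlenbeck stochastic process.
- Mean Reversion Speed: The autocorrelation $b$ relates to the speed of mean reversion $\theta$ via $b = e^{-\theta \Delta t}$. A smaller $b$ (closer to 0) implies faster reversion (larger $\theta$).
- Trading Signals: Deviations from the long-term mean $\mu$ beyond a threshold measured in units of the equilibrium standard deviation $\sigma_{eq}$ suggest trading opportunities, as the series is expected to revert.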
- Parameter Estimation: Calculates long-term mean and equilibrium standard deviation
- S-Score Calculation: Generates standardized reversion signals
- Residual Integration: Uses PCA-decomposed residuals as input data
Key parameters:
- reversion: Speed of mean reversion threshold (auto-calculated if None)
- ann_factor: Annualization factor (inferred from data frequency)
- training_window/testing_window: Cross-validation windows
- n_components: PCA variance explained (default 0.5)
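The OU process is described by the stochastic differential equation $dX_t = \theta(\mu - X_t)\,dt + \sigma\,dW_t$, where:
- $X_t$: Price or spread at time $t$
- $\theta$: Speed of mean reversion
- $\mu$: Long-term mean (equilibrium level)
- $\sigma$: Volatility of the process
- $W_t$: Wiener process (random noise)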
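To make the roles of the parameters concrete, the following is a minimal sketch that simulates a mean-reverting path using the exact one-step transition of an OU process. The function name `simulate_ou`, its argument names (`theta` for speed of mean reversion, `mu` for the long-term mean, `sigma` for volatility), and the numeric values are illustrative assumptions and are not part of the Systematica API.

```python
import numpy as np

def simulate_ou(theta, mu, sigma, x0, n_steps, dt, seed=0):
    """Simulate an OU path (hypothetical helper, not a Systematica function)."""
    rng = np.random.default_rng(seed)
    b = np.exp(-theta * dt)  # one-step autocorrelation implied by the speed theta
    # Standard deviation of the Gaussian shock over one step of length dt
    step_std = sigma * np.sqrt((1.0 - b**2) / (2.0 * theta))
    x = np.empty(n_steps + 1)
    x[0] = x0
    for t in range(n_steps):
        # Pull the previous value towards mu, then add the random shock
        x[t + 1] = mu + b * (x[t] - mu) + step_std * rng.standard_normal()
    return x

# Ten years of daily data, assuming 252 trading days per year
path = simulate_ou(theta=8.4, mu=0.0, sigma=0.3, x0=1.0, n_steps=2520, dt=1 / 252)
```

A larger `theta` makes the path return to `mu` more quickly, while `sigma` controls how far the noise pushes it away between steps.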
Key Calculations
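The OU parameters are estimated for a time series $X_t$ as follows:
- Autocorrelation Coefficient ($b$):
  - Computed as the correlation between consecutive values: $b = \mathrm{Corr}(X_t, X_{t-1})$
  - Measures how strongly $X_{t-1}$ predicts $X_t$. A value close to 1 indicates slow mean reversion.
- Long-Term Mean ($\mu$):
  - Derived from the regression $X_t = a + b X_{t-1} + \varepsilon_t$, where: $a = \bar{X}_t - b\,\bar{X}_{t-1}$ and $\mu = a / (1 - b)$
  - Represents the equilibrium level the series reverts to.
- Equilibrium Standard Deviation ($\sigma_{eq}$):
  - Calculated from residuals $\varepsilon_t = X_t - a - b X_{t-1}$: $\sigma_{eq} = \sqrt{\mathrm{Var}(\varepsilon_t) / (1 - b^2)}$
  - Measures volatility around the mean, adjusted for autocorrelation.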
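The estimation steps above can be sketched in a few lines of NumPy. This is an illustration under stated assumptions, not the library's implementation: the helpers `estimate_ou` and `s_score` are hypothetical names, the s-score is taken to be the deviation from the long-term mean divided by the equilibrium standard deviation, and the synthetic AR(1) series exists only to exercise the code.

```python
import numpy as np

def estimate_ou(x):
    """Estimate autocorrelation, long-term mean and equilibrium std (hypothetical helper)."""
    x = np.asarray(x, dtype=float)
    x_prev, x_next = x[:-1], x[1:]
    # Autocorrelation coefficient: correlation between consecutive values
    b = np.corrcoef(x_prev, x_next)[0, 1]
    # Intercept of the regression x_next = a + b * x_prev + eps
    a = x_next.mean() - b * x_prev.mean()
    resid = x_next - (a + b * x_prev)
    mu = a / (1.0 - b)  # long-term mean; requires |b| < 1
    sigma_eq = np.sqrt(resid.var(ddof=1) / (1.0 - b**2))
    return {"b": b, "mu": mu, "sigma_eq": sigma_eq}

def s_score(x_last, mu, sigma_eq):
    """Standardized distance of the latest value from the long-term mean."""
    return (x_last - mu) / sigma_eq

# Synthetic AR(1) series with known parameters, for illustration only
rng = np.random.default_rng(42)
x = np.empty(5000)
x[0] = 0.5
for t in range(1, len(x)):
    x[t] = 0.1 + 0.8 * x[t - 1] + 0.05 * rng.standard_normal()

params = estimate_ou(x)
score = s_score(x[-1], params["mu"], params["sigma_eq"])
```

The estimates are only meaningful when the autocorrelation is strictly below 1 in absolute value; otherwise the series is not mean-reverting and both divisions break down.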
RollingOUProcess vs OUProcessCV
Two implementations are available: rolling window and cross-validation approaches.
- RollingOUProcess: Uses a sliding window for real-time parameter estimation, ideal for streaming data.
- OUProcessCV: Employs cross-validation splits for robust backtesting, optimizing parameters over training/testing periods.
| Feature | RollingOUProcess | OUProcessCV |
|---|---|---|
| Window Type | Rolling/sliding window | Cross-validation splits |
| Parameters | window, minp | training_window, testing_window |
| Use Case | Real-time/streaming | Backtesting/optimization |
| Function | rolling_ou_process_nb | ou_process_cv |
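The difference between the two windowing schemes can be illustrated with plain index arithmetic. The sketch below is an assumption about how such windows are typically laid out, not a description of Systematica's splitters; `rolling_windows` and `train_test_splits` are hypothetical helpers, and only the parameter names `window`, `minp`, `training_window`, and `testing_window` come from the table above.

```python
def rolling_windows(n_obs, window, minp=None):
    """Yield (start, end) ranges ending at each bar once minp observations exist."""
    minp = window if minp is None else minp
    for end in range(minp, n_obs + 1):
        yield max(0, end - window), end

def train_test_splits(n_obs, training_window, testing_window):
    """Yield consecutive (train, test) ranges that walk forward through time."""
    start = 0
    while start + training_window + testing_window <= n_obs:
        train = (start, start + training_window)
        test = (train[1], train[1] + testing_window)
        yield train, test
        start += testing_window  # advance by one test block

rolling = list(rolling_windows(n_obs=10, window=4, minp=2))
splits = list(train_test_splits(n_obs=20, training_window=8, testing_window=4))
```

In the rolling scheme, parameters are re-estimated at every bar from the most recent `window` observations. In the split scheme, parameters fitted on each training range are evaluated on the test range that follows it.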
RollingOUProcess
- Dimensionality Reduction: Principal Component Analysis is used to model returns of assets by decomposing them into systematic (market) components and idiosyncratic (residual) components. Assets are modeled using a multi-factor approach, with the residual returns assumed to be the source of alpha.
- Market-Neutral Portfolio: Ensuring that the portfolio’s factor exposures sum to zero.
- Eigenportfolio Construction: Assets are allocated to eigenportfolios based on the eigenvectors of the correlation matrix, with investments scaled by the volatility of each asset.
- Extracting the Residuals: Residuals — the deviations from a statistical relationship — are derived from models that predict or explain the relationship between asset prices.
- Mean-Reverting?: Residual-based strategies typically assume that residuals will revert to their mean over time, creating opportunities for profit.
- Calculate Returns: Use closing prices to calculate returns.
- Standardize Data: PCA seeks to maximize the variance of each component. Standardization will scale variance to a common measure.
- Reduce Dimensionality: Decompose asset returns using PCA, using n_components to set the share of variance to explain (~50% with the default of 0.5).
- Eigenportfolio Returns: Estimate the market-neutral eigenportfolio returns for each asset in the universe.
- Systematic and Residual Components: Estimate the residual components for each asset.
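The steps above can be sketched with NumPy alone. This is a simplified illustration under stated assumptions rather than the library's `get_residuals` implementation: `pca_residuals` and `var_explained` are hypothetical names, eigenportfolio weights are taken to be eigenvector entries divided by each asset's volatility, and residuals are obtained from an ordinary least-squares regression of each asset on the eigenportfolio returns. The synthetic one-factor returns are for demonstration only.

```python
import numpy as np

def pca_residuals(rets, var_explained=0.5):
    """Split returns into systematic and residual parts via PCA (hypothetical helper)."""
    rets = np.asarray(rets, dtype=float)
    std = rets.std(axis=0, ddof=1)
    z = (rets - rets.mean(axis=0)) / std           # standardized returns
    corr = np.cov(z, rowvar=False)                 # equals the correlation matrix
    eigvals, eigvecs = np.linalg.eigh(corr)        # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1]              # reorder to descending
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # Smallest number of components whose cumulative variance share reaches the target
    share = np.cumsum(eigvals) / eigvals.sum()
    k = min(int(np.searchsorted(share, var_explained)) + 1, rets.shape[1])
    # Eigenportfolio weights: eigenvector entries scaled by inverse asset volatility
    weights = eigvecs[:, :k] / std[:, None]
    factors = rets @ weights                       # eigenportfolio returns, shape (T, k)
    # Regress every asset on the factors; the unexplained part is the residual
    design = np.column_stack([np.ones(len(factors)), factors])
    beta, *_ = np.linalg.lstsq(design, rets, rcond=None)
    residuals = rets - design @ beta
    return residuals, factors, k

# Synthetic returns driven by one common factor, for illustration only
rng = np.random.default_rng(0)
market = 0.01 * rng.standard_normal(500)
betas = np.linspace(0.8, 1.2, 6)
rets = market[:, None] * betas + 0.005 * rng.standard_normal((500, 6))

residuals, factors, k = pca_residuals(rets, var_explained=0.5)
```

The residual series of each asset is what a mean-reversion model would then be fitted to, typically after accumulating the residual returns into a level series.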
Ancestors
- systematica.models.base.BaseStatArb
- abc.ABC
Descendants
- systematica.api.models.ou_process.OUProcessCV
Instance variables
- ann_factor: str | int: Annualization coefficient. If 'auto', the data frequency and annualization factor are inferred automatically. Defaults to 'auto'.
- autotune: bool: Tune estimator hyperparameters automatically. Defaults to False, which means no autotuning is performed.
- custom_splitter: str | None: Custom splitter to use for data partitioning. Defaults to None, which means no custom splitter is used. If set, it should be a string that matches a custom splitter function.
- custom_splitter_kwargs: dict | None: Additional keyword arguments for the custom splitter. Defaults to None, which means no additional arguments are passed. If custom_splitter is set, this should contain any necessary parameters for the custom splitter function.
- estimator: sklearn.base.BaseEstimator: scikit-learn estimator to use. If None, LinearRegression from sklearn is used. Defaults to None.
- minp: int: Minimum period. Defaults to None, which means no minimum period is applied.
- n_components: str | int | float: Number of components to keep. If n_components == 'mle', Minka's MLE is used to guess the dimension. If 0 < n_components < 1, the number of components is selected such that the amount of variance explained is greater than the percentage specified by n_components. Defaults to 0.5, which means half of the variance is captured.
- reversion: float: Speed of reversion. If None, the model evaluates the reversion speed coefficient automatically. The implied mean-reversion time should be less than half the size of the window used for residual estimation. For example, with a 60-day window, half is 30 days, so the speed should exceed 252/30 = 8.4, assuming 252 trading days in a year. Defaults to None.
- splitter: str: Default splitter to be used if custom_splitter is not passed. Choices are from_rolling, from_expanding, from_custom_rolling, from_custom_expanding. Defaults to from_custom_rolling.
- testing_window: int: The size of the testing window. Defaults to 60. This is the window size used for testing the model.
- training_window: int: The size of the training window. Defaults to 365. This is the window size used for training the model.
- window: int: The size of the rolling window. Defaults to 365.
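The 60-day example given for reversion can be worked through in code. The sketch below is one reading of that rule, with loud assumptions: the speed is taken to be annualized, the mean-reversion time is taken to be its reciprocal, and a one-step autocorrelation `b` is converted to a speed with `-log(b) * ann_factor`. Both helper functions are hypothetical and do not exist in Systematica.

```python
import math

def min_reversion_speed(window, ann_factor=252):
    """Smallest annualized speed whose reversion time fits in half the window."""
    half_window = window / 2
    return ann_factor / half_window

def passes_reversion_filter(b, window, ann_factor=252):
    """Check whether autocorrelation b (0 < b < 1) implies fast enough reversion."""
    speed = -math.log(b) * ann_factor  # annualized speed implied by b
    return speed > min_reversion_speed(window, ann_factor)

threshold = min_reversion_speed(window=60)          # 252 / 30
fast_enough = passes_reversion_filter(0.95, window=60)
too_slow = passes_reversion_filter(0.99, window=60)
```

Under these assumptions, a series with daily autocorrelation 0.95 reverts quickly enough for a 60-day window, while one with 0.99 does not.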
Methods
get_residuals
| Name | Type | Default | Description |
|---|---|---|---|
| rets | tp.Array2d | -- | A NumPy array of returns. |
| index | tp.Array1d | -- | A NumPy array of datetime indices. |
| columns | tp.Array1d | -- | A NumPy array of column indices. |
| Type | Description |
|---|---|
| tp.Array2d | A NumPy array of residuals. |
OUProcessCV
Extends RollingOUProcess.
Method generated by attrs for class OUProcessCV.
Ancestors
- systematica.api.models.ou_process.RollingOUProcess
- systematica.models.base.BaseStatArb
- abc.ABC

