Auto

class Auto(target_metric: etna.metrics.base.Metric, horizon: int, metric_aggregation: Literal['median', 'mean', 'std', 'percentile_5', 'percentile_25', 'percentile_75', 'percentile_95'] = 'mean', backtest_params: Optional[dict] = None, experiment_folder: Optional[str] = None, pool: Union[etna.auto.pool.generator.Pool, List[etna.pipeline.base.BasePipeline]] = Pool.default, runner: Optional[etna.auto.runner.base.AbstractRunner] = None, storage: Optional[optuna.storages._base.BaseStorage] = None, metrics: Optional[List[etna.metrics.base.Metric]] = None)

Bases: etna.auto.auto.AutoBase

Automatic pipeline selection via defined or custom pipeline pool.

Initialize Auto class.

Parameters
  • target_metric (etna.metrics.base.Metric) – Metric to optimize.

  • horizon (int) – Horizon to forecast for.

  • metric_aggregation (Literal['median', 'mean', 'std', 'percentile_5', 'percentile_25', 'percentile_75', 'percentile_95']) – Aggregation method for per-segment metrics. By default, mean aggregation is used.

  • backtest_params (Optional[dict]) – Custom parameters for backtest instead of default backtest parameters.

  • experiment_folder (Optional[str]) – Name used for saving experiment results; it also determines the name of the optuna study. By default, it isn’t set.

  • pool (Union[etna.auto.pool.generator.Pool, List[etna.pipeline.base.BasePipeline]]) – Pool of pipelines to choose from. By default, default pool from Pool is used.

  • runner (Optional[etna.auto.runner.base.AbstractRunner]) – Runner to use for distributed training. By default, LocalRunner is used.

  • storage (Optional[optuna.storages._base.BaseStorage]) – Optuna storage to use. By default, sqlite storage is used.

  • metrics (Optional[List[etna.metrics.base.Metric]]) – List of metrics to compute. By default, Sign, SMAPE, MAE, MSE, MedAE metrics are used.
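A minimal usage sketch of the constructor and the methods documented on this page. It assumes etna 2.x is installed with its auto-related optional dependencies; the synthetic dataset from generate_ar_df, the helper name run_auto_example, and all parameter values are illustrative choices, not recommendations from the library.

```python
def run_auto_example(n_trials: int = 5):
    """Run a short pool-stage search on synthetic data (hypothetical helper)."""
    # Imports are kept inside the function so the sketch can be read or
    # imported without etna being installed.
    from etna.auto import Auto
    from etna.datasets import TSDataset, generate_ar_df
    from etna.metrics import SMAPE

    # Synthetic data: 3 segments of daily AR-process values (placeholder data).
    df = generate_ar_df(periods=200, start_time="2021-01-01", n_segments=3, freq="D")
    ts = TSDataset(TSDataset.to_dataset(df), freq="D")

    auto = Auto(
        target_metric=SMAPE(),           # metric to optimize
        horizon=7,                       # forecast 7 steps ahead
        metric_aggregation="median",     # aggregate per-segment SMAPE by median
        backtest_params={"n_folds": 3},  # custom backtest parameters
    )

    # Pool stage only: tune_size is left at its default of 0.
    best_pipeline = auto.fit(ts, n_trials=n_trials)
    return best_pipeline, auto.summary(), auto.top_k(k=3)
```

Calling run_auto_example() trains real models from the default pool, so it may take a while and may need the optional model dependencies that the pool relies on.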


Methods

fit(ts[, timeout, n_trials, initializer, ...])

Start automatic pipeline selection.

objective(ts, target_metric, ...[, ...])

Optuna objective wrapper for the pool stage.

summary()

Get Auto trials summary.

top_k([k])

Get top k pipelines with the best metric value.

fit(ts: etna.datasets.tsdataset.TSDataset, timeout: Optional[int] = None, n_trials: Optional[int] = None, initializer: Optional[etna.auto.auto._Initializer] = None, callback: Optional[etna.auto.auto._Callback] = None, **kwargs) → etna.pipeline.base.BasePipeline

Start automatic pipeline selection.

There are two stages:

  • Pool stage: trying every pipeline in a pool

  • Tuning stage: tuning the tune_size best pipelines from the previous stage using etna.auto.auto.Tune.

The tuning stage starts only if the limits on n_trials and timeout haven’t been exceeded. Tuning proceeds from the best pipeline to the worst, and the trial limits (n_trials, timeout) are divided evenly among the pipelines. If there is no limit on the number of trials, only the first pipeline is tuned, and tuning continues until the user stops the process.
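The even split of the remaining limits can be sketched as follows. This is an illustration of the rule described above, not the library's code: the function name tuning_budget_per_pipeline and the choice to round up are assumptions.

```python
import math
from typing import Optional


def tuning_budget_per_pipeline(
    limit: Optional[int], used_by_pool: int, tune_size: int
) -> Optional[int]:
    """Share of a limit (trials or seconds) given to each tuned pipeline.

    Returns None when no limit is set, which corresponds to tuning the
    first pipeline until the user stops the process.
    """
    if tune_size <= 0:
        raise ValueError("tune_size must be positive for the tuning stage")
    if limit is None:
        return None
    # Whatever the pool stage consumed is no longer available for tuning.
    remaining = max(0, limit - used_by_pool)
    # Assumption: round up so that the whole remainder can be spent.
    return math.ceil(remaining / tune_size)
```

For example, with n_trials=100, 40 trials spent on the pool stage, and tune_size=3, each pipeline would get 20 tuning trials under these assumptions.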

Parameters
  • ts (etna.datasets.tsdataset.TSDataset) – TSDataset to fit on.

  • timeout (Optional[int]) – Timeout for optuna. N.B. this is the timeout for each worker. By default, it isn’t set.

  • n_trials (Optional[int]) – Number of trials for optuna. N.B. this is the number of trials for each worker. By default, it isn’t set.

  • initializer (Optional[etna.auto.auto._Initializer]) – Object that is called before each pipeline backtest, can be used to initialize loggers.

  • callback (Optional[etna.auto.auto._Callback]) – Object that is called after each pipeline backtest, can be used to log extra metrics.

  • **kwargs – The tune_size parameter (default: 0) determines how many pipelines to tune during the tuning stage. Other parameters are passed to optuna.study.Study.optimize().

Return type

etna.pipeline.base.BasePipeline

static objective(ts: etna.datasets.tsdataset.TSDataset, target_metric: etna.metrics.base.Metric, metric_aggregation: Literal['median', 'mean', 'std', 'percentile_5', 'percentile_25', 'percentile_75', 'percentile_95'], metrics: List[etna.metrics.base.Metric], backtest_params: dict, initializer: Optional[etna.auto.auto._Initializer] = None, callback: Optional[etna.auto.auto._Callback] = None) → Callable[[optuna.trial._trial.Trial], float]

Optuna objective wrapper for the pool stage.

Parameters
  • ts (etna.datasets.tsdataset.TSDataset) – TSDataset to fit on.

  • target_metric (etna.metrics.base.Metric) – Metric to optimize.

  • metric_aggregation (Literal['median', 'mean', 'std', 'percentile_5', 'percentile_25', 'percentile_75', 'percentile_95']) – Aggregation method for per-segment metrics.

  • metrics (List[etna.metrics.base.Metric]) – List of metrics to compute.

  • backtest_params (dict) – Custom parameters for backtest instead of default backtest parameters.

  • initializer (Optional[etna.auto.auto._Initializer]) – Object that is called before each pipeline backtest, can be used to initialize loggers.

  • callback (Optional[etna.auto.auto._Callback]) – Object that is called after each pipeline backtest, can be used to log extra metrics.

Returns

function that runs specified trial and returns its evaluated score

Return type

Callable[[optuna.trial._trial.Trial], float]
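To make the metric_aggregation options concrete, the sketch below reduces per-segment metric values to the single score that a trial would report. It is a standard-library illustration, not etna's implementation: the sample standard deviation and linear percentile interpolation are assumptions, and the segment names and values are made up.

```python
import statistics
from typing import Callable, Dict, Sequence


def percentile(values: Sequence[float], q: float) -> float:
    """q-th percentile using linear interpolation between neighbouring ranks."""
    ordered = sorted(values)
    position = (len(ordered) - 1) * q / 100
    lower = int(position)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (ordered[upper] - ordered[lower]) * (position - lower)


# One reducer per allowed metric_aggregation value.
AGGREGATIONS: Dict[str, Callable[[Sequence[float]], float]] = {
    "median": statistics.median,
    "mean": statistics.mean,
    "std": statistics.stdev,  # assumption: sample std, needs >= 2 segments
    "percentile_5": lambda v: percentile(v, 5),
    "percentile_25": lambda v: percentile(v, 25),
    "percentile_75": lambda v: percentile(v, 75),
    "percentile_95": lambda v: percentile(v, 95),
}


def aggregate(per_segment: Dict[str, float], how: str = "mean") -> float:
    """Collapse per-segment metric values into one score."""
    return float(AGGREGATIONS[how](list(per_segment.values())))


# Made-up per-segment metric values for five segments.
per_segment_smape = {
    "segment_a": 1.0,
    "segment_b": 2.0,
    "segment_c": 3.0,
    "segment_d": 4.0,
    "segment_e": 5.0,
}
```

With these values, aggregate(per_segment_smape, "mean") and aggregate(per_segment_smape, "median") both give 3.0, while "percentile_95" gives roughly 4.8, so a percentile aggregation puts more weight on the worst-performing segments.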

summary() → pandas.core.frame.DataFrame

Get Auto trials summary.

There are columns:

  • hash: hash of the pipeline;

  • pipeline: pipeline object;

  • metrics: columns with metrics’ values;

  • state: state of the trial;

  • study: name of the study in which trial was made.

Returns

dataframe with detailed info on each performed trial

Return type

pandas.core.frame.DataFrame

top_k(k: int = 5) → List[etna.pipeline.base.BasePipeline]

Get top k pipelines with the best metric value.

Only complete and non-duplicate studies are taken into account.

Parameters

k (int) – Number of pipelines to return.

Returns

List of top k pipelines.

Return type

List[etna.pipeline.base.BasePipeline]
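The selection rule described for top_k (keep complete entries, drop duplicates, return the k best) can be sketched with plain Python. The TrialRecord type, the "COMPLETE" state string, and the greater_is_better flag are hypothetical stand-ins for illustration; the actual method works on the stored optuna trials and real pipeline objects.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class TrialRecord:
    """Hypothetical stand-in for one row of the trials summary."""
    pipeline_hash: str
    pipeline: str  # a real summary holds a pipeline object here
    value: float   # aggregated target metric
    state: str


def top_k(records: List[TrialRecord], k: int = 5, greater_is_better: bool = False) -> List[str]:
    """Return up to k best pipelines among complete, non-duplicate records."""
    complete = [r for r in records if r.state == "COMPLETE"]
    # Best first: ascending for error-like metrics, descending otherwise.
    ordered = sorted(complete, key=lambda r: r.value, reverse=greater_is_better)
    seen = set()
    unique = []
    for record in ordered:
        if record.pipeline_hash in seen:
            continue  # same pipeline was already ranked higher
        seen.add(record.pipeline_hash)
        unique.append(record.pipeline)
    return unique[:k]


records = [
    TrialRecord("a", "pipeline_a", 10.0, "COMPLETE"),
    TrialRecord("b", "pipeline_b", 5.0, "COMPLETE"),
    TrialRecord("b", "pipeline_b_duplicate", 5.0, "COMPLETE"),
    TrialRecord("c", "pipeline_c", 1.0, "FAIL"),
    TrialRecord("d", "pipeline_d", 7.0, "COMPLETE"),
]
```

Here top_k(records, k=2) returns pipeline_b and pipeline_d: the failed trial is ignored even though it has the lowest value, and the duplicate of pipeline_b is counted once.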