auto

Classes

Auto(target_metric, horizon[, ...])

Automatic pipeline selection via defined or custom pipeline pool.

AutoAbstract()

Interface for Auto object.

AutoBase(target_metric, horizon[, ...])

Base class for Auto and Tune that implements the core logic behind these classes.

Tune(pipeline, target_metric, horizon[, ...])

Automatic tuning of custom pipeline.

_Callback(*args, **kwargs)

_Initializer(*args, **kwargs)

class Auto(target_metric: etna.metrics.base.Metric, horizon: int, metric_aggregation: Literal['median', 'mean', 'std', 'percentile_5', 'percentile_25', 'percentile_75', 'percentile_95'] = 'mean', backtest_params: Optional[dict] = None, experiment_folder: Optional[str] = None, pool: Union[etna.auto.pool.generator.Pool, List[etna.pipeline.base.BasePipeline]] = Pool.default, runner: Optional[etna.auto.runner.base.AbstractRunner] = None, storage: Optional[optuna.storages._base.BaseStorage] = None, metrics: Optional[List[etna.metrics.base.Metric]] = None)

Automatic pipeline selection via defined or custom pipeline pool.

Initialize Auto class.

Parameters
  • target_metric (etna.metrics.base.Metric) – Metric to optimize.

  • horizon (int) – Horizon to forecast for.

  • metric_aggregation (Literal['median', 'mean', 'std', 'percentile_5', 'percentile_25', 'percentile_75', 'percentile_95']) – Aggregation method for per-segment metrics. By default, mean aggregation is used.

  • backtest_params (Optional[dict]) – Custom parameters for backtest instead of default backtest parameters.

  • experiment_folder (Optional[str]) – Name used for saving experiment results; it also determines the name of the optuna study. Not set by default.

  • pool (Union[etna.auto.pool.generator.Pool, List[etna.pipeline.base.BasePipeline]]) – Pool of pipelines to choose from. By default, default pool from Pool is used.

  • runner (Optional[etna.auto.runner.base.AbstractRunner]) – Runner to use for distributed training. By default, LocalRunner is used.

  • storage (Optional[optuna.storages._base.BaseStorage]) – Optuna storage to use. By default, sqlite storage is used.

  • metrics (Optional[List[etna.metrics.base.Metric]]) – List of metrics to compute. By default, Sign, SMAPE, MAE, MSE, MedAE metrics are used.
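
The metric_aggregation options can be read as reductions over the per-segment metric values. The following is a minimal sketch in plain Python, not the library's implementation. It assumes that percentile_N is the N-th percentile with linear interpolation and that std is the population standard deviation; the names percentile, aggregate and smape_by_segment are hypothetical.

```python
import statistics


def percentile(values, q):
    """Return the q-th percentile (0-100) using linear interpolation between ranks."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100
    lower = int(pos)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (ordered[upper] - ordered[lower]) * (pos - lower)


def aggregate(per_segment, method="mean"):
    """Reduce a {segment: metric value} mapping to a single number."""
    values = list(per_segment.values())
    if method == "mean":
        return statistics.fmean(values)
    if method == "median":
        return statistics.median(values)
    if method == "std":
        # Assumption: population standard deviation.
        return statistics.pstdev(values)
    if method.startswith("percentile_"):
        return percentile(values, int(method.split("_")[1]))
    raise ValueError(f"Unknown aggregation method: {method}")


# Hypothetical per-segment metric values.
smape_by_segment = {"segment_a": 10.0, "segment_b": 20.0, "segment_c": 30.0, "segment_d": 40.0}
for method in ("mean", "median", "std", "percentile_25", "percentile_75"):
    print(method, aggregate(smape_by_segment, method))
```

Choosing a percentile or the median instead of the mean makes the optimized value less sensitive to a few badly forecast segments.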

fit(ts: etna.datasets.tsdataset.TSDataset, timeout: Optional[int] = None, n_trials: Optional[int] = None, initializer: Optional[etna.auto.auto._Initializer] = None, callback: Optional[etna.auto.auto._Callback] = None, **kwargs) -> etna.pipeline.base.BasePipeline

Start automatic pipeline selection.

There are two stages:

  • Pool stage: every pipeline in the pool is tried.

  • Tuning stage: the tune_size best pipelines from the pool stage are tuned with etna.auto.auto.Tune.

The tuning stage starts only if the limits on n_trials and timeout have not been exceeded. Tuning goes from the best pipeline to the worst, and the trial limits (n_trials, timeout) are divided evenly between the pipelines. If there is no limit on the number of trials, only the first pipeline is tuned, and tuning continues until the user stops the process.
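
One way to picture the even division of the trial limit in the tuning stage is sketched below. This is an illustration under assumptions, not the library's code: split_trials is a hypothetical helper, and handing leftover trials to the best-ranked pipelines is an assumed rounding rule.

```python
from typing import List, Optional


def split_trials(n_trials: Optional[int], tune_size: int) -> List[Optional[int]]:
    """Return a per-pipeline trial budget, ordered from the best pipeline to the worst."""
    if tune_size <= 0:
        # tune_size defaults to 0, so the tuning stage is skipped.
        return []
    if n_trials is None:
        # No limit: only the first pipeline is tuned, with an unbounded budget.
        return [None]
    if n_trials <= 0:
        # The limit is already exhausted, so tuning does not start.
        return []
    base, extra = divmod(n_trials, tune_size)
    # Assumption: leftover trials go to the best-ranked pipelines.
    return [base + (1 if i < extra else 0) for i in range(tune_size)]


print(split_trials(10, 3))
print(split_trials(None, 3))
```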

Parameters
  • ts (etna.datasets.tsdataset.TSDataset) – TSDataset to fit on.

  • timeout (Optional[int]) – Timeout for optuna. Note that this is the timeout for each worker. Not set by default.

  • n_trials (Optional[int]) – Number of trials for optuna. Note that this is the number of trials for each worker. Not set by default.

  • initializer (Optional[etna.auto.auto._Initializer]) – Object that is called before each pipeline backtest; it can be used to initialize loggers.

  • callback (Optional[etna.auto.auto._Callback]) – Object that is called after each pipeline backtest; it can be used to log extra metrics.

  • **kwargs – The tune_size parameter (default: 0) determines how many pipelines to tune during the tuning stage. Other parameters are passed to optuna.study.Study.optimize().

Return type

etna.pipeline.base.BasePipeline
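
The initializer and callback arguments are plain callables. The sketch below relies only on the generic _Initializer(*args, **kwargs) and _Callback(*args, **kwargs) signatures listed at the top of this page. The exact arguments passed by the library are not shown in this reference, so both objects accept anything. The class names are hypothetical.

```python
import logging

logger = logging.getLogger("auto-example")


class LoggingInitializer:
    """Counts and logs every backtest start."""

    def __init__(self):
        self.calls = 0

    def __call__(self, *args, **kwargs):
        self.calls += 1
        logger.info("Starting pipeline backtest #%d", self.calls)


class CollectingCallback:
    """Stores whatever is passed after each backtest for later inspection."""

    def __init__(self):
        self.records = []

    def __call__(self, *args, **kwargs):
        self.records.append({"args": args, "kwargs": kwargs})


initializer = LoggingInitializer()
callback = CollectingCallback()
# These objects would be passed as:
# auto.fit(ts, initializer=initializer, callback=callback)
```

Keeping state on the object (a counter, a list of records) is what makes a class more convenient here than a bare function.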

static objective(ts: etna.datasets.tsdataset.TSDataset, target_metric: etna.metrics.base.Metric, metric_aggregation: Literal['median', 'mean', 'std', 'percentile_5', 'percentile_25', 'percentile_75', 'percentile_95'], metrics: List[etna.metrics.base.Metric], backtest_params: dict, initializer: Optional[etna.auto.auto._Initializer] = None, callback: Optional[etna.auto.auto._Callback] = None) -> Callable[[optuna.trial._trial.Trial], float]

Optuna objective wrapper for the pool stage.

Parameters
  • ts (etna.datasets.tsdataset.TSDataset) – TSDataset to fit on.

  • target_metric (etna.metrics.base.Metric) – Metric to optimize.

  • metric_aggregation (Literal['median', 'mean', 'std', 'percentile_5', 'percentile_25', 'percentile_75', 'percentile_95']) – Aggregation method for per-segment metrics.

  • metrics (List[etna.metrics.base.Metric]) – List of metrics to compute.

  • backtest_params (dict) – Custom parameters for backtest instead of default backtest parameters.

  • initializer (Optional[etna.auto.auto._Initializer]) – Object that is called before each pipeline backtest; it can be used to initialize loggers.

  • callback (Optional[etna.auto.auto._Callback]) – Object that is called after each pipeline backtest; it can be used to log extra metrics.

Returns

objective: function that runs the specified trial and returns its evaluated score

Return type

Callable[[optuna.trial._trial.Trial], float]

summary() -> pandas.core.frame.DataFrame

Get Auto trials summary.

The summary has the following columns:

  • hash: hash of the pipeline;

  • pipeline: pipeline object;

  • metrics: columns with metrics’ values;

  • state: state of the trial;

  • study: name of the study in which trial was made.

Returns

study_dataframe: dataframe with detailed info on each performed trial

Return type

pandas.core.frame.DataFrame

top_k(k: int = 5) -> List[etna.pipeline.base.BasePipeline]

Get top k pipelines with the best metric value.

Only complete and non-duplicate studies are taken into account.

Parameters

k (int) – Number of pipelines to return.

Returns

List of top k pipelines.

Return type

List[etna.pipeline.base.BasePipeline]
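
Putting the constructor, fit, summary and top_k together, a typical session might look like the sketch below. The function name select_pipeline and all argument values (horizon, number of folds, trial counts, experiment name) are illustrative assumptions; SMAPE is taken from etna.metrics, and ts is expected to be a prepared TSDataset.

```python
def select_pipeline(ts, horizon=14):
    """Run Auto on a prepared TSDataset and return the best pipeline with diagnostics."""
    # Imports are local so that this sketch can be loaded without etna installed.
    from etna.auto.auto import Auto
    from etna.metrics import SMAPE

    auto = Auto(
        target_metric=SMAPE(),
        horizon=horizon,
        metric_aggregation="median",
        backtest_params={"n_folds": 3},  # illustrative value
        experiment_folder="auto-example",  # hypothetical experiment name
    )
    # tune_size > 0 enables the tuning stage for the best pipelines of the pool stage.
    best_pipeline = auto.fit(ts, n_trials=20, tune_size=2)
    summary = auto.summary()  # one row per performed trial
    top_pipelines = auto.top_k(k=3)
    return best_pipeline, summary, top_pipelines
```

Because the default storage is sqlite, results of earlier runs with the same experiment_folder may be picked up again; pass a different name or storage to start from scratch.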

class AutoAbstract

Interface for Auto object.

abstract fit(ts: etna.datasets.tsdataset.TSDataset, timeout: Optional[int] = None, n_trials: Optional[int] = None, initializer: Optional[etna.auto.auto._Initializer] = None, callback: Optional[etna.auto.auto._Callback] = None, **kwargs) -> etna.pipeline.base.BasePipeline

Start automatic pipeline selection.

Parameters
  • ts (etna.datasets.tsdataset.TSDataset) – TSDataset to fit on.

  • timeout (Optional[int]) – Timeout for optuna. Note that this is the timeout for each worker. Not set by default.

  • n_trials (Optional[int]) – Number of trials for optuna. Note that this is the number of trials for each worker. Not set by default.

  • initializer (Optional[etna.auto.auto._Initializer]) – Object that is called before each pipeline backtest; it can be used to initialize loggers.

  • callback (Optional[etna.auto.auto._Callback]) – Object that is called after each pipeline backtest; it can be used to log extra metrics.

  • **kwargs – Additional parameters for the method.

Return type

etna.pipeline.base.BasePipeline

abstract summary() -> pandas.core.frame.DataFrame

Get trials summary.

Returns

study_dataframe: dataframe with detailed info on each performed trial

Return type

pandas.core.frame.DataFrame

abstract top_k(k: int = 5) -> List[etna.pipeline.base.BasePipeline]

Get top k pipelines with the best metric value.

Only complete and non-duplicate studies are taken into account.

Parameters

k (int) – Number of pipelines to return.

Returns

List of top k pipelines.

Return type

List[etna.pipeline.base.BasePipeline]

class AutoBase(target_metric: etna.metrics.base.Metric, horizon: int, metric_aggregation: Literal['median', 'mean', 'std', 'percentile_5', 'percentile_25', 'percentile_75', 'percentile_95'] = 'mean', backtest_params: Optional[dict] = None, experiment_folder: Optional[str] = None, runner: Optional[etna.auto.runner.base.AbstractRunner] = None, storage: Optional[optuna.storages._base.BaseStorage] = None, metrics: Optional[List[etna.metrics.base.Metric]] = None)

Base class for Auto and Tune that implements the core logic behind these classes.

Initialize AutoBase class.

Parameters
  • target_metric (etna.metrics.base.Metric) – Metric to optimize.

  • horizon (int) – Horizon to forecast for.

  • metric_aggregation (Literal['median', 'mean', 'std', 'percentile_5', 'percentile_25', 'percentile_75', 'percentile_95']) – Aggregation method for per-segment metrics. By default, mean aggregation is used.

  • backtest_params (Optional[dict]) – Custom parameters for backtest instead of default backtest parameters.

  • experiment_folder (Optional[str]) – Name used for saving experiment results; it also determines the name of the optuna study. Not set by default.

  • runner (Optional[etna.auto.runner.base.AbstractRunner]) – Runner to use for distributed training. By default, LocalRunner is used.

  • storage (Optional[optuna.storages._base.BaseStorage]) – Optuna storage to use. By default, sqlite storage is used with name “etna-auto.db”.

  • metrics (Optional[List[etna.metrics.base.Metric]]) – List of metrics to compute. By default, Sign, SMAPE, MAE, MSE, MedAE metrics are used.

abstract fit(ts: etna.datasets.tsdataset.TSDataset, timeout: Optional[int] = None, n_trials: Optional[int] = None, initializer: Optional[etna.auto.auto._Initializer] = None, callback: Optional[etna.auto.auto._Callback] = None, **kwargs) -> etna.pipeline.base.BasePipeline

Start automatic pipeline selection.

Parameters
  • ts (etna.datasets.tsdataset.TSDataset) – TSDataset to fit on.

  • timeout (Optional[int]) – Timeout for optuna. Note that this is the timeout for each worker. Not set by default.

  • n_trials (Optional[int]) – Number of trials for optuna. Note that this is the number of trials for each worker. Not set by default.

  • initializer (Optional[etna.auto.auto._Initializer]) – Object that is called before each pipeline backtest; it can be used to initialize loggers.

  • callback (Optional[etna.auto.auto._Callback]) – Object that is called after each pipeline backtest; it can be used to log extra metrics.

  • **kwargs – Additional parameters for the method.

Return type

etna.pipeline.base.BasePipeline

abstract summary() -> pandas.core.frame.DataFrame

Get trials summary.

Returns

study_dataframe: dataframe with detailed info on each performed trial

Return type

pandas.core.frame.DataFrame

top_k(k: int = 5) -> List[etna.pipeline.base.BasePipeline]

Get top k pipelines with the best metric value.

Only complete and non-duplicate studies are taken into account.

Parameters

k (int) – Number of pipelines to return.

Returns

List of top k pipelines.

Return type

List[etna.pipeline.base.BasePipeline]
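
The selection rule "only complete and non-duplicate" can be illustrated with plain dictionaries. This is a sketch, not the library's implementation: the record fields (hash, state, value), the state names and the assumption that a lower metric value is better are all hypothetical.

```python
def top_k_records(trials, k=5, greater_is_better=False):
    """Return the k best trial records, skipping incomplete and duplicate ones."""
    seen_hashes = set()
    unique_complete = []
    for trial in trials:
        if trial["state"] != "COMPLETE":
            continue  # failed or still running trials are ignored
        if trial["hash"] in seen_hashes:
            continue  # the same pipeline was already counted
        seen_hashes.add(trial["hash"])
        unique_complete.append(trial)
    unique_complete.sort(key=lambda t: t["value"], reverse=greater_is_better)
    return unique_complete[:k]


# Hypothetical trial records.
trials = [
    {"hash": "a1", "state": "COMPLETE", "value": 12.0},
    {"hash": "b2", "state": "FAIL", "value": 3.0},
    {"hash": "a1", "state": "COMPLETE", "value": 12.0},
    {"hash": "c3", "state": "COMPLETE", "value": 8.5},
]
best = top_k_records(trials, k=2)
print([t["hash"] for t in best])
```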

class Tune(pipeline: etna.pipeline.base.BasePipeline, target_metric: etna.metrics.base.Metric, horizon: int, metric_aggregation: Literal['median', 'mean', 'std', 'percentile_5', 'percentile_25', 'percentile_75', 'percentile_95'] = 'mean', backtest_params: Optional[dict] = None, experiment_folder: Optional[str] = None, runner: Optional[etna.auto.runner.base.AbstractRunner] = None, storage: Optional[optuna.storages._base.BaseStorage] = None, metrics: Optional[List[etna.metrics.base.Metric]] = None, sampler: Optional[optuna.samplers._base.BaseSampler] = None, params_to_tune: Optional[Dict[str, etna.distributions.distributions.BaseDistribution]] = None)

Automatic tuning of custom pipeline.

This class takes the given pipeline and tries to optimize its hyperparameters by using params_to_tune.

Trials with duplicate parameters are skipped and previously computed results are returned.
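
The idea of skipping duplicate trials can be sketched as a cache keyed by a hash of the sampled parameters. This is an illustration under assumptions, not the library's mechanism: params_hash and CachedObjective are hypothetical names, and hashing a sorted JSON dump is just one possible way to identify identical parameter sets.

```python
import hashlib
import json


def params_hash(params):
    """Build a stable identifier for a parameter set, independent of key order."""
    payload = json.dumps(params, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


class CachedObjective:
    """Evaluate each distinct parameter set once and reuse the stored score."""

    def __init__(self, evaluate):
        self._evaluate = evaluate
        self._cache = {}
        self.evaluations = 0

    def __call__(self, params):
        key = params_hash(params)
        if key not in self._cache:
            # Only new parameter sets trigger the expensive evaluation (a backtest).
            self.evaluations += 1
            self._cache[key] = self._evaluate(params)
        return self._cache[key]


# Hypothetical scoring function standing in for a backtest.
objective = CachedObjective(lambda params: params["lag"] * 2.0)
first = objective({"lag": 7, "mode": "additive"})
second = objective({"mode": "additive", "lag": 7})  # same parameters, different order
```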

Initialize Tune class.

Parameters
  • pipeline (etna.pipeline.base.BasePipeline) – Pipeline to optimize.

  • target_metric (etna.metrics.base.Metric) – Metric to optimize.

  • horizon (int) – Horizon to forecast for.

  • metric_aggregation (Literal['median', 'mean', 'std', 'percentile_5', 'percentile_25', 'percentile_75', 'percentile_95']) – Aggregation method for per-segment metrics. By default, mean aggregation is used.

  • backtest_params (Optional[dict]) – Custom parameters for backtest instead of default backtest parameters.

  • experiment_folder (Optional[str]) – Name used for saving experiment results; it also determines the name of the optuna study. Not set by default.

  • runner (Optional[etna.auto.runner.base.AbstractRunner]) – Runner to use for distributed training. By default, LocalRunner is used.

  • storage (Optional[optuna.storages._base.BaseStorage]) – Optuna storage to use. By default, sqlite storage is used with name “etna-auto.db”.

  • metrics (Optional[List[etna.metrics.base.Metric]]) – List of metrics to compute. By default, Sign, SMAPE, MAE, MSE, MedAE metrics are used.

  • sampler (Optional[optuna.samplers._base.BaseSampler]) – Optuna sampler to use. By default, TPE sampler is used.

  • params_to_tune (Optional[Dict[str, etna.distributions.distributions.BaseDistribution]]) – Parameters of pipeline that should be tuned with corresponding tuning distributions. By default, pipeline.params_to_tune() is used.
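
A possible way to call Tune is sketched below. The function name tune_pipeline and the argument values (horizon, trial count, sampler seed, experiment name) are illustrative assumptions; pipeline is any existing pipeline object and ts a prepared TSDataset. params_to_tune is left out on purpose, so the default pipeline.params_to_tune() described above applies.

```python
def tune_pipeline(ts, pipeline, horizon=14):
    """Tune the hyperparameters of an existing pipeline and return the best variant."""
    # Imports are local so that this sketch can be loaded without etna or optuna installed.
    from etna.auto.auto import Tune
    from etna.metrics import SMAPE
    from optuna.samplers import TPESampler

    tuner = Tune(
        pipeline=pipeline,
        target_metric=SMAPE(),
        horizon=horizon,
        sampler=TPESampler(seed=0),  # fixed seed for reproducible sampling
        experiment_folder="tune-example",  # hypothetical experiment name
    )
    best_pipeline = tuner.fit(ts, n_trials=10)
    return best_pipeline, tuner.summary(), tuner.top_k(k=3)
```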

fit(ts: etna.datasets.tsdataset.TSDataset, timeout: Optional[int] = None, n_trials: Optional[int] = None, initializer: Optional[etna.auto.auto._Initializer] = None, callback: Optional[etna.auto.auto._Callback] = None, **kwargs) -> etna.pipeline.base.BasePipeline

Start automatic pipeline tuning.

Parameters
  • ts (etna.datasets.tsdataset.TSDataset) – TSDataset to fit on.

  • timeout (Optional[int]) – Timeout for optuna. Note that this is the timeout for each worker. Not set by default.

  • n_trials (Optional[int]) – Number of trials for optuna. Note that this is the number of trials for each worker. Not set by default.

  • initializer (Optional[etna.auto.auto._Initializer]) – Object that is called before each pipeline backtest; it can be used to initialize loggers.

  • callback (Optional[etna.auto.auto._Callback]) – Object that is called after each pipeline backtest; it can be used to log extra metrics.

  • **kwargs – Additional parameters for optuna.study.Study.optimize().

Return type

etna.pipeline.base.BasePipeline

static objective(ts: etna.datasets.tsdataset.TSDataset, pipeline: etna.pipeline.base.BasePipeline, params_to_tune: Dict[str, etna.distributions.distributions.BaseDistribution], target_metric: etna.metrics.base.Metric, metric_aggregation: Literal['median', 'mean', 'std', 'percentile_5', 'percentile_25', 'percentile_75', 'percentile_95'], metrics: List[etna.metrics.base.Metric], backtest_params: dict, initializer: Optional[etna.auto.auto._Initializer] = None, callback: Optional[etna.auto.auto._Callback] = None) -> Callable[[optuna.trial._trial.Trial], float]

Optuna objective wrapper.

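Parameters
  • ts (etna.datasets.tsdataset.TSDataset) – TSDataset to fit on.

  • pipeline (etna.pipeline.base.BasePipeline) – Pipeline to optimize.

  • params_to_tune (Dict[str, etna.distributions.distributions.BaseDistribution]) – Parameters of pipeline that should be tuned with corresponding tuning distributions.

  • target_metric (etna.metrics.base.Metric) – Metric to optimize.

  • metric_aggregation (Literal['median', 'mean', 'std', 'percentile_5', 'percentile_25', 'percentile_75', 'percentile_95']) – Aggregation method for per-segment metrics.

  • metrics (List[etna.metrics.base.Metric]) – List of metrics to compute.

  • backtest_params (dict) – Custom parameters for backtest instead of default backtest parameters.

  • initializer (Optional[etna.auto.auto._Initializer]) – Object that is called before each pipeline backtest; it can be used to initialize loggers.

  • callback (Optional[etna.auto.auto._Callback]) – Object that is called after each pipeline backtest; it can be used to log extra metrics.

Returns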

objective: function that runs the specified trial and returns its evaluated score

Return type

Callable[[optuna.trial._trial.Trial], float]

summary() -> pandas.core.frame.DataFrame

Get trials summary.

The summary has the following columns:

  • hash: hash of the pipeline;

  • pipeline: pipeline object;

  • metrics: columns with metrics’ values;

  • state: state of the trial.

Returns

study_dataframe: dataframe with detailed info on each performed trial

Return type

pandas.core.frame.DataFrame

top_k(k: int = 5) -> List[etna.pipeline.base.BasePipeline]

Get top k pipelines with the best metric value.

Only complete and non-duplicate studies are taken into account.

Parameters

k (int) – Number of pipelines to return.

Returns

List of top k pipelines.

Return type

List[etna.pipeline.base.BasePipeline]