density_outliers

Functions

absolute_difference_distance(x, y)

Calculate distance for get_anomalies_density() function by taking absolute value of difference.

get_anomalies_density(ts[, in_column, ...])

Compute outliers according to density rule.

get_segment_density_outliers_indices(series)

Get indices of outliers for one series.

absolute_difference_distance(x: float, y: float) → float

Calculate distance for get_anomalies_density() function by taking absolute value of difference.

Parameters
  • x (float) – first value

  • y (float) – second value

Returns

result – absolute difference between values

Return type

float
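As an illustration of the distance-function contract, the sketch below shows a plain-Python function with the documented behaviour (absolute value of the difference) next to a hypothetical alternative with the same Callable[[float, float], float] shape. The names absolute_difference and squared_difference are assumptions made for this example and are not part of the library; any callable with this signature should in principle be usable as distance_func.

```python
def absolute_difference(x: float, y: float) -> float:
    """Illustrative stand-in: absolute value of the difference."""
    return abs(x - y)


def squared_difference(x: float, y: float) -> float:
    """Hypothetical alternative distance with the same signature."""
    return (x - y) ** 2


if __name__ == "__main__":
    print(absolute_difference(3.0, 5.5))
    print(squared_difference(3.0, 5.5))
```

Note that the closeness threshold is expressed in units of the series' standard deviation, so a distance on a different scale (such as the squared difference) would likely need a different distance_coef.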

get_anomalies_density(ts: TSDataset, in_column: str = 'target', window_size: int = 15, distance_coef: float = 3, n_neighbors: int = 3, distance_func: typing.Callable[[float, float], float] = <function absolute_difference_distance>) → Dict[str, List[pandas._libs.tslibs.timestamps.Timestamp]]

Compute outliers according to density rule.

For each point in the series, build all windows of size window_size that contain the point. If any of these windows contains at least n_neighbors points that are closer to the target point than distance_coef * std(series), as measured by distance_func, the target point is not an outlier. Otherwise it is marked as an outlier.

Parameters
  • ts (TSDataset) – TSDataset with timeseries data

  • in_column (str) – name of the column in which to search for anomalies

  • window_size (int) – size of windows to build

  • distance_coef (float) – multiplier for the standard deviation of the series; the product is the distance threshold below which two points are considered close

  • n_neighbors (int) – minimum number of close neighbors a point must have in order not to be an outlier

  • distance_func (Callable[[float, float], float]) – distance function

Returns

dict of outliers in format {segment: [outliers_timestamps]}

Return type

Dict[str, List[pandas._libs.tslibs.timestamps.Timestamp]]

Notes

This is a variation of the distance-based (index) outlier detection method, adapted for time series.
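The density rule described above can be sketched for a single series in plain Python. This is a minimal illustration and not the library's implementation: the function name density_outlier_indices, the use of the population standard deviation, the exclusion of the point itself from its neighbors, and the clamping of the window to the series length are all assumptions made for this example.

```python
from statistics import pstdev
from typing import Callable, List, Sequence


def abs_diff(x: float, y: float) -> float:
    """Absolute difference between two values."""
    return abs(x - y)


def density_outlier_indices(
    series: Sequence[float],
    window_size: int = 15,
    distance_coef: float = 3.0,
    n_neighbors: int = 3,
    distance_func: Callable[[float, float], float] = abs_diff,
) -> List[int]:
    """Return indices of points with too few close neighbors in every window."""
    n = len(series)
    # Assumption: clamp the window so that short series still get one window.
    w = min(window_size, n)
    # Assumption: population standard deviation of the whole series.
    threshold = distance_coef * pstdev(series)
    outliers = []
    for i, value in enumerate(series):
        # Start positions of windows [s, s + w) that contain i and fit in the series.
        first_start = max(0, i - w + 1)
        last_start = min(i, n - w)
        is_outlier = True
        for s in range(first_start, last_start + 1):
            # Count points in the window, other than i itself, closer than the threshold.
            close = sum(
                1
                for j in range(s, s + w)
                if j != i and distance_func(value, series[j]) < threshold
            )
            if close >= n_neighbors:
                is_outlier = False  # one dense window is enough
                break
        if is_outlier:
            outliers.append(i)
    return outliers


if __name__ == "__main__":
    data = [1.0] * 10 + [100.0] + [1.0] * 10
    print(density_outlier_indices(data, window_size=5, distance_coef=1.0, n_neighbors=2))
```

In this example only the spike at index 10 has no close neighbors in any of its windows, so it is the only index reported. The real function works on a TSDataset and returns timestamps per segment rather than integer positions.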