streamsight.evaluators.EvaluatorBase

class streamsight.evaluators.EvaluatorBase(metric_entries: List[MetricEntry], setting: Setting, metric_k: int, ignore_unknown_user: bool = True, ignore_unknown_item: bool = True, seed: int | None = None)

Bases: object

Base class for evaluators.

Provides the methods and attributes common to all evaluator classes. Any new evaluator should inherit from this class.

Parameters:
  • metric_entries (List[MetricEntry]) – List of metric entries to compute

  • setting (Setting) – Setting to evaluate the algorithms on

  • metric_k (int) – Value of K for the metrics

  • ignore_unknown_user (bool, optional) – Ignore unknown users, defaults to True

  • ignore_unknown_item (bool, optional) – Ignore unknown items, defaults to True

__init__(metric_entries: List[MetricEntry], setting: Setting, metric_k: int, ignore_unknown_user: bool = True, ignore_unknown_item: bool = True, seed: int | None = None)

Methods

__init__(metric_entries, setting, metric_k)

metric_results([level, ...])

Results of the metrics computed.

prepare_dump()

Prepare evaluator for pickling.

restore()

Restore the generators after unpickling.

Attributes

setting

Setting to evaluate the algorithms on.

metric_k

Value of K for the metrics.

ignore_unknown_user

To ignore unknown users during evaluation.

ignore_unknown_item

To ignore unknown items during evaluation.

_get_evaluation_data() Tuple[InteractionMatrix, InteractionMatrix, int]

Get the evaluation data for the current step.

Internal method to get the evaluation data for the current step. The evaluation data consists of the unlabeled data, the ground truth data, and the current timestamp, returned as a tuple. The shapes are masked based on user_item_base. Unknown users in the ground truth data are also updated in user_item_base.

Note

_current_timestamp is updated with the current timestamp.

Returns:

Tuple of unlabeled data, ground truth data, and current timestamp

Return type:

Tuple[csr_matrix, csr_matrix, int]

Raises:

EOWSetting – If there is no more data to be processed

_prediction_shape_handler(X_true_shape: Tuple[int, int], X_pred: csr_matrix) csr_matrix

Handle shape difference of the prediction matrix.

If there is a difference in the shape of the prediction matrix and the ground truth matrix, this function will handle the difference based on ignore_unknown_user and ignore_unknown_item.

Parameters:
  • X_true_shape (Tuple[int,int]) – Shape of the ground truth matrix

  • X_pred (csr_matrix) – Prediction matrix

Raises:

ValueError – If the user dimension of the prediction matrix is smaller than that of the ground truth matrix

Returns:

Prediction matrix with the same shape as the ground truth matrix

Return type:

csr_matrix
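The shape alignment described above can be sketched as follows. This is only an illustration under stated assumptions, not the library's implementation: nested lists stand in for csr_matrix, the name align_prediction_shape is hypothetical, and the sketch assumes that ignoring unknown users and items means dropping the rows and columns beyond the ground truth shape, and that missing item columns are padded with zeros.

```python
from typing import List, Tuple

Matrix = List[List[float]]  # dense stand-in for csr_matrix in this sketch


def align_prediction_shape(true_shape: Tuple[int, int], pred: Matrix) -> Matrix:
    """Return a copy of `pred` with exactly `true_shape` rows and columns."""
    n_users, n_items = true_shape
    if len(pred) < n_users:
        # Mirrors the documented ValueError for a too-small user dimension.
        raise ValueError("prediction has fewer users than the ground truth")

    aligned = []
    for row in pred[:n_users]:                # drop rows of extra (unknown) users
        new_row = list(row[:n_items])         # drop columns of extra (unknown) items
        # Assumption: item columns missing from the prediction are zero-padded.
        new_row += [0.0] * (n_items - len(new_row))
        aligned.append(new_row)
    return aligned


# A 3x3 prediction cropped to a 2x2 ground truth shape.
cropped = align_prediction_shape((2, 2), [[1.0, 2.0, 3.0],
                                          [4.0, 5.0, 6.0],
                                          [7.0, 8.0, 9.0]])
```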

ignore_unknown_item

To ignore unknown items during evaluation.

ignore_unknown_user

To ignore unknown users during evaluation.

metric_k

Value of K for the metrics.

metric_results(level: MetricLevelEnum | Literal['macro', 'micro', 'window', 'user'] = MetricLevelEnum.MACRO, only_current_timestamp: bool | None = False, filter_timestamp: int | None = None, filter_algo: str | None = None) DataFrame

Results of the metrics computed.

Computes the metrics of all algorithms at the specified level and returns the results in a pandas DataFrame. The results can be filtered by algorithm name and by timestamp.

Specifics

  • User level: User level metrics computed across all timestamps.

  • Window level: Window level metrics computed across all timestamps. This can be viewed as a macro level metric in the context of a single window, where the scores of all users are averaged within the window.

  • Macro level: Macro level metrics computed for the entire timeline. This score is computed by averaging the scores of all windows, treating each window equally.

  • Micro level: Micro level metrics computed for the entire timeline. This score is computed by averaging the scores of all users, treating each user and the timestamp the user is in as a unique contribution to the overall score.
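The difference between the window, macro, and micro levels can be illustrated with a small worked example. This is a minimal sketch of the averaging described above, not the library's code: the data and the function names window_level, macro_level, and micro_level are hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical per-user scores: (window timestamp, user id, metric score).
user_scores = [
    (0, "u1", 1.0),
    (0, "u2", 0.0),
    (0, "u3", 0.5),
    (10, "u1", 1.0),
]


def window_level(scores):
    """Average the user scores within each window."""
    by_window = defaultdict(list)
    for timestamp, _user, score in scores:
        by_window[timestamp].append(score)
    return {ts: mean(vals) for ts, vals in by_window.items()}


def macro_level(scores):
    """Average the window scores, weighting every window equally."""
    return mean(list(window_level(scores).values()))


def micro_level(scores):
    """Average every (user, timestamp) score, weighting each one equally."""
    return mean([score for _ts, _user, score in scores])


print(window_level(user_scores))  # {0: 0.5, 10: 1.0}
print(macro_level(user_scores))   # 0.75
print(micro_level(user_scores))   # 0.625
```

The window at timestamp 10 contains a single user, so it pulls the macro score up to 0.75, while the micro score of 0.625 weights all four user scores equally.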

Parameters:
  • level (Union[MetricLevelEnum, Literal["macro", "micro", "window", "user"]]) – Level of the metric to compute, defaults to "macro"

  • only_current_timestamp (bool, optional) – Filter only the current timestamp, defaults to False

  • filter_timestamp (Optional[int], optional) – Timestamp value to filter on, defaults to None. If both only_current_timestamp and filter_timestamp are provided, filter_timestamp will be used.

  • filter_algo (Optional[str], optional) – Algorithm name to filter on, defaults to None

Returns:

Dataframe representation of the metric

Return type:

pd.DataFrame
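The precedence between only_current_timestamp and filter_timestamp can be sketched as a small helper. The function resolve_timestamp_filter is hypothetical and only illustrates the documented rule that filter_timestamp wins when both are given. It is not part of the library.

```python
from typing import Optional


def resolve_timestamp_filter(
    current_timestamp: int,
    only_current_timestamp: bool = False,
    filter_timestamp: Optional[int] = None,
) -> Optional[int]:
    """Return the timestamp to filter on, or None for no timestamp filter."""
    if filter_timestamp is not None:
        # An explicit timestamp takes precedence over only_current_timestamp.
        return filter_timestamp
    if only_current_timestamp:
        return current_timestamp
    return None
```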

prepare_dump() None

Prepare evaluator for pickling.

Prepares the evaluator for pickling by destroying its generators, which would otherwise cause pickling issues.

restore() None

Restore the generators after unpickling.

Restores the generators after the object has been loaded from a pickle file.
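The reason for the prepare_dump() and restore() pair is that Python generators cannot be pickled. The pattern can be sketched with a stand-in class. StreamHolder and its attributes are hypothetical and only mirror the two method names, so the evaluator's actual internals may differ.

```python
import pickle


class StreamHolder:
    """Hypothetical stand-in for an object that owns a generator."""

    def __init__(self, values):
        self.values = list(values)
        self.position = 0  # how many values have been consumed so far
        self._stream = self._make_stream()

    def _make_stream(self):
        # Resume from the recorded position so progress survives a reload.
        return (v for v in self.values[self.position:])

    def next_value(self):
        value = next(self._stream)
        self.position += 1
        return value

    def prepare_dump(self):
        # Generators cannot be pickled, so drop the generator before dumping.
        self._stream = None

    def restore(self):
        # Rebuild the generator after the object has been unpickled.
        self._stream = self._make_stream()


holder = StreamHolder([1, 2, 3])
holder.next_value()  # consume the first value

holder.prepare_dump()
loaded = pickle.loads(pickle.dumps(holder))
loaded.restore()
```

After restore(), the reloaded object continues from where the original left off, because the position is stored as plain data rather than inside the generator.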

setting

Setting to evaluate the algorithms on.