streamsight.metrics.Metric

class streamsight.metrics.Metric(timestamp_limit: int | None = None, cache: bool = False)

Bases: object

Base class for all metrics.

A Metric object is stateful: after calculate() has been called, the results can be retrieved in one of two ways:

  • Detailed results are stored in results.

  • The aggregated result value can be retrieved using value.

__init__(timestamp_limit: int | None = None, cache: bool = False)

Methods

__init__([timestamp_limit, cache])

cache_values(y_true, y_pred) – Cache the values of y_true and y_pred for later use.

calculate(y_true, y_pred) – Calculates this metric for all nonzero users in y_true, given true labels and predicted scores.

calculate_cached() – Calculate the metric using the cached values of y_true and y_pred.

get_params() – Get the parameters of the metric.

Attributes

identifier – Name of the metric.

macro_result – The global metric value.

micro_result – Micro results for the metric.

name – Name of the metric.

num_items – Dimension of the item-space in both y_true and y_pred.

num_users – Dimension of the user-space in both y_true and y_pred after elimination of users without interactions in y_true.

params – Parameters of the metric.

timestamp_limit – The timestamp limit for the metric.

_calculate(y_true: csr_matrix, y_pred: csr_matrix) → None

_eliminate_empty_users(y_true: csr_matrix, y_pred: csr_matrix) → Tuple[csr_matrix, csr_matrix]

Eliminate users that have no interactions in y_true.

Users with no interactions in y_true are also removed from the prediction matrix y_pred. This avoids division by zero and reduces computational overhead.

Parameters:
  • y_true (csr_matrix) – True user-item interactions.

  • y_pred (csr_matrix) – Predicted affinity of users for items.

Returns:

(y_true, y_pred), with users that have no interactions in y_true removed.

Return type:

Tuple[csr_matrix, csr_matrix]
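The row filtering described above can be sketched as follows. This is a minimal illustration, not the library's implementation: the standalone function name eliminate_empty_users and the sample matrices are assumptions made for the example, and only standard NumPy and SciPy calls are used.

```python
import numpy as np
from scipy.sparse import csr_matrix


def eliminate_empty_users(y_true: csr_matrix, y_pred: csr_matrix):
    """Hypothetical helper: drop rows (users) with no stored entries in y_true."""
    # Number of stored entries per row of y_true.
    interactions_per_user = y_true.getnnz(axis=1)
    # Row indices of users with at least one interaction.
    keep = np.flatnonzero(interactions_per_user > 0)
    # Apply the same row selection to both matrices so they stay aligned.
    return y_true[keep], y_pred[keep]


# Illustrative data: user 1 (the middle row) has no interactions in y_true.
y_true = csr_matrix(np.array([[1, 0, 1], [0, 0, 0], [0, 1, 0]]))
y_pred = csr_matrix(np.array([[0.9, 0.1, 0.0], [0.5, 0.4, 0.3], [0.0, 0.8, 0.2]]))

true_kept, pred_kept = eliminate_empty_users(y_true, y_pred)
print(true_kept.shape, pred_kept.shape)  # (2, 3) (2, 3)
```

Because both matrices are indexed with the same row selection, row i of the filtered y_true still corresponds to row i of the filtered y_pred.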

_false_negative: int

Number of false negatives computed. Used for caching to obtain macro results.

_false_positive: int

Number of false positives computed. Used for caching to obtain macro results.

property _indices: Tuple[ndarray, ndarray]

Indices in the prediction matrix for which scores were computed.

_map_users(users)

Map internal identifiers of users to actual user identifiers.

_set_shape(y_true: csr_matrix) → None

Set the number of users and items in the metric.

The values of self._num_users and self._num_items are set to the number of users and items in y_true, so that the metric is computed with the correct shape.

Parameters:

y_true (csr_matrix) – Binary representation of user-item interactions.

_true_positive: int

Number of true positives computed. Used for caching to obtain macro results.

_verify_shape(y_true: csr_matrix, y_pred: csr_matrix) → bool

Make sure the dimensions of y_true and y_pred match.

Parameters:
  • y_true (csr_matrix) – True user-item interactions.

  • y_pred (csr_matrix) – Predicted affinity of users for items.

Raises:

AssertionError – Shape mismatch between y_true and y_pred.

Returns:

True if dimensions match.

Return type:

bool
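A shape check of this kind might look like the sketch below. The standalone function verify_shape and its error message are assumptions for illustration. Only the documented behaviour is mirrored: return True on a match, raise AssertionError on a mismatch.

```python
import numpy as np
from scipy.sparse import csr_matrix


def verify_shape(y_true: csr_matrix, y_pred: csr_matrix) -> bool:
    """Hypothetical helper: True if shapes match, AssertionError otherwise."""
    assert y_true.shape == y_pred.shape, (
        f"Shape mismatch between y_true {y_true.shape} and y_pred {y_pred.shape}"
    )
    return True


same = verify_shape(csr_matrix((3, 4)), csr_matrix((3, 4)))

# A mismatch in the number of items raises AssertionError.
try:
    verify_shape(csr_matrix((3, 4)), csr_matrix((3, 5)))
    mismatch_detected = False
except AssertionError:
    mismatch_detected = True
```

Note that Python strips assert statements when run with the -O flag, so a check written this way is skipped in optimized mode.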

cache_values(y_true: csr_matrix, y_pred: csr_matrix) → None

Cache the values of y_true and y_pred for later use.

This basic implementation is useful when the metric can be calculated from the cumulative values of y_true and y_pred.

Note

This method should be overridden in the child class if the metric cannot be calculated with the cumulative values of y_true and y_pred. For example, Precision@K by default keeps only the top-K ranks of y_pred and y_true, so cumulative values may be dropped.

Parameters:
  • y_true (csr_matrix) – True user-item interactions.

  • y_pred (csr_matrix) – Predicted affinity of users for items.

Raises:

ValueError – If caching is disabled for the metric.

Deprecated: Caching values for the metric is no longer needed for core functionality due to a change in the compute method.

calculate(y_true: csr_matrix, y_pred: csr_matrix) → None

Calculates this metric for all nonzero users in y_true (users with at least one interaction), given true labels and predicted scores.

Parameters:
  • y_true (csr_matrix) – True user-item interactions.

  • y_pred (csr_matrix) – Predicted affinity of users for items.
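The stateful calculate-then-read pattern can be illustrated with a toy class. Everything in this sketch is an assumption for illustration: ToyRecall is not part of streamsight, the "score" key is made up, and averaging the per-user scores is just one possible way to form a global value. It only mimics the documented interface, where calculate() returns None and the results are read from micro_result and macro_result afterwards.

```python
import numpy as np
from scipy.sparse import csr_matrix


class ToyRecall:
    """Hypothetical stand-in mimicking the calculate-then-read pattern."""

    def __init__(self):
        self._scores = None  # per-user scores, filled in by calculate()

    def calculate(self, y_true: csr_matrix, y_pred: csr_matrix) -> None:
        assert y_true.shape == y_pred.shape
        # Keep only users with at least one interaction in y_true.
        keep = np.flatnonzero(y_true.getnnz(axis=1) > 0)
        y_true, y_pred = y_true[keep], y_pred[keep]
        # Per user: number of true items that received a positive score.
        hits = np.asarray(y_true.multiply(y_pred > 0).sum(axis=1)).ravel()
        totals = np.asarray(y_true.sum(axis=1)).ravel()
        self._scores = hits / totals

    @property
    def micro_result(self):
        # Assumed layout: one array of per-user values under a made-up key.
        return {"score": self._scores}

    @property
    def macro_result(self) -> float:
        # Assumed aggregation: plain mean over the remaining users.
        return float(self._scores.mean())


y_true = csr_matrix(np.array([[1, 0, 1], [0, 0, 0], [0, 1, 0]]))
y_pred = csr_matrix(np.array([[0.9, 0.1, 0.0], [0.5, 0.4, 0.3], [0.0, 0.8, 0.2]]))

metric = ToyRecall()
metric.calculate(y_true, y_pred)  # returns None; results live on the object
print(metric.micro_result["score"], metric.macro_result)
```

The user with no interactions in y_true contributes no score, so the per-user array has two entries rather than three.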

calculate_cached()

Calculate the metric using the cached values of y_true and y_pred.

calculate() is called on the cached values of y_true and y_pred.

Note

This method should be overridden in the child class if the metric cannot be calculated with the cumulative values of y_true and y_pred.

Raises:

ValueError – If caching is disabled for the metric.

Deprecated: Caching values for the metric is no longer needed for core functionality due to a change in the compute method.

get_params()

Get the parameters of the metric.

property identifier

Name of the metric.

property macro_result: float

The global metric value.

property micro_result: Dict[str, ndarray]

Micro results for the metric.

Returns:

Detailed results for the metric.

Return type:

Dict[str, np.ndarray]

property name

Name of the metric.

property num_items: int

Dimension of the item-space in both y_true and y_pred.

property num_users: int

Dimension of the user-space in both y_true and y_pred after elimination of users without interactions in y_true.

property params

Parameters of the metric.

property timestamp_limit

The timestamp limit for the metric.