streamsight.metrics.Metric

class streamsight.metrics.Metric(timestamp_limit: int | None = None, cache: bool = False)

Bases: object

Base class for all metrics.

A Metric object is stateful: after calculate() has been called, the results can be retrieved in one of two ways:

Detailed results are stored in results.

The aggregated result value can be retrieved using value.
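The stateful pattern described above can be sketched with a small stand-in class. This is an illustrative assumption, not streamsight code: HitFractionSketch and its scoring rule are hypothetical, and only the property names micro_result and macro_result are taken from the Attributes list on this page. Averaging the per-user scores to obtain the aggregate is also an assumption.

```python
import numpy as np
from scipy.sparse import csr_matrix


class HitFractionSketch:
    """Hypothetical stand-in that mimics the calculate-then-read pattern."""

    def __init__(self):
        self._scores = None

    def calculate(self, y_true: csr_matrix, y_pred: csr_matrix) -> None:
        # Binarize predictions: any nonzero score counts as "predicted".
        pred_binary = y_pred.astype(bool).astype(int)
        # Per-user count of true items that were also predicted.
        hits = np.asarray(y_true.multiply(pred_binary).sum(axis=1)).ravel()
        # Assumes every user has at least one interaction in y_true.
        totals = np.asarray(y_true.sum(axis=1)).ravel()
        self._scores = hits / totals

    @property
    def micro_result(self):
        # Detailed, per-user results.
        return {"score": self._scores}

    @property
    def macro_result(self) -> float:
        # Single aggregated value (here: the mean over users, an assumption).
        return float(self._scores.mean())


y_true = csr_matrix(np.array([[1, 0, 1], [0, 1, 0]]))
y_pred = csr_matrix(np.array([[0.9, 0.0, 0.0], [0.0, 0.7, 0.2]]))

metric = HitFractionSketch()
metric.calculate(y_true, y_pred)  # state is stored on the object
print(metric.micro_result["score"], metric.macro_result)
```

Nothing is returned by calculate(); both views of the result are read from the object afterwards.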

__init__(timestamp_limit: int | None = None, cache: bool = False)

Methods

`__init__`([timestamp_limit, cache])
`cache_values`(y_true, y_pred)	Cache the values of y_true and y_pred for later use.
`calculate`(y_true, y_pred)	Calculates this metric for all nonzero users in `y_true`, given true labels and predicted scores.
`calculate_cached`()	Calculate the metric using the cached values of y_true and y_pred.
`get_params`()	Get the parameters of the metric.

Attributes

`identifier`	Name of the metric.
`macro_result`	The global metric value.
`micro_result`	Micro results for the metric.
`name`	Name of the metric.
`num_items`	Dimension of the item-space in both `y_true` and `y_pred`.
`num_users`	Dimension of the user-space in both `y_true` and `y_pred` after elimination of users without interactions in `y_true`.
`params`	Parameters of the metric.
`timestamp_limit`	The timestamp limit for the metric.

_calculate(y_true: csr_matrix, y_pred: csr_matrix) → None

_eliminate_empty_users(y_true: csr_matrix, y_pred: csr_matrix) → Tuple[csr_matrix, csr_matrix]

Eliminate users that have no interactions in y_true.

Users with no interactions in y_true are eliminated from both y_true and the prediction matrix y_pred. This avoids division by zero and reduces computational overhead.

Parameters:

y_true (csr_matrix) – True user-item interactions.
y_pred (csr_matrix) – Predicted affinity of users for items.

Returns:

(y_true, y_pred), with users that have no interactions in y_true eliminated.

Return type:

Tuple[csr_matrix, csr_matrix]
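A minimal sketch of this elimination step, assuming rows represent users and that an "empty" user is a row with no stored entries in y_true. The function name eliminate_empty_users is illustrative and is not the library's private method.

```python
import numpy as np
from scipy.sparse import csr_matrix


def eliminate_empty_users(y_true: csr_matrix, y_pred: csr_matrix):
    """Keep only the rows (users) with at least one interaction in y_true."""
    # getnnz(axis=1) counts the stored entries in each row.
    nonzero_users = np.flatnonzero(y_true.getnnz(axis=1) > 0)
    # Apply the same row selection to both matrices so they stay aligned.
    return y_true[nonzero_users, :], y_pred[nonzero_users, :]


y_true = csr_matrix(np.array([[1, 0, 1], [0, 0, 0], [0, 1, 0]]))
y_pred = csr_matrix(np.array([[0.9, 0.1, 0.0], [0.5, 0.4, 0.3], [0.0, 0.8, 0.2]]))

y_true_kept, y_pred_kept = eliminate_empty_users(y_true, y_pred)
print(y_true_kept.shape, y_pred_kept.shape)
```

The middle user has no interactions in y_true, so that row is dropped from both matrices even though y_pred contains scores for it.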

_false_negative: int: Number of false negatives computed. Used for caching to obtain macro results.

_false_positive: int: Number of false positives computed. Used for caching to obtain macro results.

property _indices: Tuple[ndarray, ndarray]: Indices in the prediction matrix for which scores were computed.

_map_users(users): Map internal identifiers of users to actual user identifiers.

_set_shape(y_true: csr_matrix) → None

Set the number of users and items in the metric.

The values of self._num_users and self._num_items are set to the number of users and items in y_true, so that the metric is computed with the correct dimensions.

Parameters:

y_true (csr_matrix) – Binary representation of user-item interactions.

_true_positive: int: Number of true positives computed. Used for caching to obtain macro results.
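One way such counters could be used is to accumulate them across batches and derive a global value from the totals. The sketch below assumes binary y_true and binarized predictions; the helper confusion_counts and the precision/recall formulas are illustrative and are not confirmed to be how the library derives its macro result.

```python
import numpy as np
from scipy.sparse import csr_matrix


def confusion_counts(y_true: csr_matrix, y_pred_binary: csr_matrix):
    """Count true positives, false positives and false negatives."""
    # The element-wise product of two binary matrices is 1 only where both are 1.
    tp = int(y_true.multiply(y_pred_binary).sum())
    fp = int(y_pred_binary.sum()) - tp
    fn = int(y_true.sum()) - tp
    return tp, fp, fn


batches = [
    (csr_matrix(np.array([[1, 0, 1], [0, 1, 0]])),
     csr_matrix(np.array([[1, 1, 0], [0, 1, 0]]))),
    (csr_matrix(np.array([[0, 1, 1]])),
     csr_matrix(np.array([[0, 1, 0]]))),
]

# Running totals play the role of the cached counters.
tp_total = fp_total = fn_total = 0
for y_true, y_pred_binary in batches:
    tp, fp, fn = confusion_counts(y_true, y_pred_binary)
    tp_total += tp
    fp_total += fp
    fn_total += fn

global_precision = tp_total / (tp_total + fp_total)
global_recall = tp_total / (tp_total + fn_total)
print(tp_total, fp_total, fn_total, global_precision, global_recall)
```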

_verify_shape(y_true: csr_matrix, y_pred: csr_matrix) → bool

Make sure the dimensions of y_true and y_pred match.

Parameters:

y_true (csr_matrix) – True user-item interactions.
y_pred (csr_matrix) – Predicted affinity of users for items.

Raises:

AssertionError – Shape mismatch between y_true and y_pred.

Returns:

True if dimensions match.

Return type:

bool
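The check can be sketched as a plain assertion on the two shapes. The function name verify_shape and the error message are illustrative assumptions; only the behavior (AssertionError on mismatch, True otherwise) is taken from the description above.

```python
from scipy.sparse import csr_matrix


def verify_shape(y_true: csr_matrix, y_pred: csr_matrix) -> bool:
    """Return True if both matrices have the same (users, items) shape."""
    assert y_true.shape == y_pred.shape, (
        f"Shape mismatch between y_true {y_true.shape} and y_pred {y_pred.shape}"
    )
    return True


# Matching shapes pass the check.
shapes_match = verify_shape(csr_matrix((3, 4)), csr_matrix((3, 4)))

# Mismatched shapes raise AssertionError.
try:
    verify_shape(csr_matrix((3, 4)), csr_matrix((2, 4)))
    mismatch_detected = False
except AssertionError:
    mismatch_detected = True

print(shapes_match, mismatch_detected)
```

Because the check relies on assert, it is skipped when Python runs with the -O flag.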

cache_values(y_true: csr_matrix, y_pred: csr_matrix) → None

Cache the values of y_true and y_pred for later use.

Stores y_true and y_pred so that the metric can be computed later. This is useful when the metric can be calculated from the cumulative values of y_true and y_pred.

Note

This method should be overwritten in the child class if the metric cannot be calculated from the cumulative values of y_true and y_pred. For example, Precision@K by default keeps only the top-K ranks of y_pred and y_true, so cumulative values may be dropped.

Parameters:

y_true (csr_matrix) – True user-item interactions.
y_pred (csr_matrix) – Predicted affinity of users for items.

Raises:

ValueError – If caching is disabled for the metric.

Deprecated: Caching values for the metric is no longer needed for core functionality due to a change in the compute method.
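A minimal sketch of cumulative caching, assuming batches are kept in lists and stacked row-wise when needed. The class CachingSketch, its attribute names and the stacked() helper are hypothetical; the library's internal storage may differ.

```python
import numpy as np
from scipy.sparse import csr_matrix, vstack


class CachingSketch:
    """Hypothetical illustration of caching y_true and y_pred batches."""

    def __init__(self, cache: bool = False):
        self.cache = cache
        self._y_true_cache = []
        self._y_pred_cache = []

    def cache_values(self, y_true: csr_matrix, y_pred: csr_matrix) -> None:
        if not self.cache:
            raise ValueError("Caching is disabled for this metric.")
        self._y_true_cache.append(y_true)
        self._y_pred_cache.append(y_pred)

    def stacked(self):
        # Stack all cached batches row-wise into one pair of matrices.
        return (
            vstack(self._y_true_cache, format="csr"),
            vstack(self._y_pred_cache, format="csr"),
        )


sketch = CachingSketch(cache=True)
sketch.cache_values(csr_matrix(np.array([[1, 0, 1], [0, 1, 0]])),
                    csr_matrix(np.array([[0.9, 0.0, 0.1], [0.0, 0.7, 0.2]])))
sketch.cache_values(csr_matrix(np.array([[0, 1, 1]])),
                    csr_matrix(np.array([[0.0, 0.6, 0.0]])))
all_true, all_pred = sketch.stacked()

# With caching disabled, cache_values raises ValueError.
try:
    CachingSketch(cache=False).cache_values(all_true, all_pred)
    disabled_raises = False
except ValueError:
    disabled_raises = True

print(all_true.shape, all_pred.shape, disabled_raises)
```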

calculate(y_true: csr_matrix, y_pred: csr_matrix) → None

Calculates this metric for all nonzero users in y_true, given true labels and predicted scores.

Parameters:

y_true (csr_matrix) – True user-item interactions.
y_pred (csr_matrix) – Predicted affinity of users for items.

calculate_cached()

Calculate the metric using the cached values of y_true and y_pred.

This method calculates the metric using the cached values of y_true and y_pred. calculate() will be called on the cached values.

Note

This method should be overwritten in the child class if the metric cannot be calculated with the cumulative values of y_true and y_pred.

Raises:

ValueError – If caching is disabled for the metric.

Deprecated: Caching values for the metric is no longer needed for core functionality due to a change in the compute method.

get_params(): Get the parameters of the metric.

property identifier: Name of the metric.

property macro_result: float: The global metric value.

property micro_result: Dict[str, ndarray]

Micro results for the metric.

Returns:

Detailed results for the metric.

Return type:

Dict[str, np.ndarray]

property name: Name of the metric.

property num_items: int: Dimension of the item-space in both y_true and y_pred.

property num_users: int: Dimension of the user-space in both y_true and y_pred after elimination of users without interactions in y_true.

property params: Parameters of the metric.

property timestamp_limit: The timestamp limit for the metric.