streamsight.metrics.RecallK
- class streamsight.metrics.RecallK(K: int | None = 10, timestamp_limit: int | None = None, cache: bool = False)
Bases: ListwiseMetricK
Computes the fraction of true interactions that made it into the Top-K recommendations.
Recall per user is computed as:
\[\text{Recall}(u) = \frac{\sum\limits_{i \in \text{Top-K}(u)} y^{true}_{u,i}}{\sum\limits_{j \in I} y^{true}_{u,j}}\]
ref: RecPack
- Parameters:
K (int) – Size of the recommendation list consisting of the Top-K item predictions.
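The per-user formula above can be sketched in plain Python. This is a minimal illustration, not the library's implementation: the helper name `recall_at_k`, the dict-based inputs (used here in place of `csr_matrix` to stay dependency-free), and the tie-breaking rule are all assumptions made for the example.

```python
from typing import Dict, Set


def recall_at_k(true_items: Set[int], scores: Dict[int, float], k: int = 10) -> float:
    """Recall for one user: hits in the Top-K divided by all true interactions."""
    if not true_items:
        # The denominator of the formula would be zero for such a user.
        raise ValueError("user has no true interactions; recall is undefined")
    # Rank items by descending score; break ties by item id (an assumption)
    # so that the result is deterministic.
    top_k = sorted(scores, key=lambda item: (-scores[item], item))[:k]
    hits = sum(1 for item in top_k if item in true_items)
    return hits / len(true_items)


# One user with three true interactions; items 1 and 2 land in the Top-2,
# item 5 was never scored, so recall is 2 hits out of 3 true interactions.
true_items = {1, 2, 5}
scores = {1: 0.9, 2: 0.8, 3: 0.7, 4: 0.1}
print(recall_at_k(true_items, scores, k=2))
```

Note that the denominator counts all true interactions of the user, not only those that could fit into a list of size K.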
- __init__(K: int | None = 10, timestamp_limit: int | None = None, cache: bool = False)
Methods
__init__([K, timestamp_limit, cache])
cache_values(y_true, y_pred) – Cache the values of y_true and y_pred for later use.
calculate(y_true, y_pred) – Computes metric given true labels y_true and predicted scores y_pred.
calculate_cached() – Calculate the metric using the cached values of y_true and y_pred.
get_params() – Get the parameters of the metric.
prepare_matrix(y_true, y_pred) – Prepare the matrices for the metric calculation.
Attributes
col_names – The names of the columns in the results DataFrame.
identifier – Name of the metric.
macro_result – Global metric value obtained by taking the average over all users.
micro_result – User level results for the metric.
name – Name of the metric.
num_items – Dimension of the item-space in both y_true and y_pred.
num_users – Dimension of the user-space in both y_true and y_pred after elimination of users without interactions in y_true.
params – Parameters of the metric.
timestamp_limit – The timestamp limit for the metric.
- DEFAULT_K = 10
- _calculate(y_true: csr_matrix, y_pred_top_K: csr_matrix) → None
Computes the metric given true labels y_true and predicted scores y_pred. Only Top-K recommendations are considered.
To be implemented in the child class.
- Parameters:
y_true (csr_matrix) – Expected interactions per user.
y_pred_top_K (csr_matrix) – Ranks of the Top-K recommendations per user.
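To make the idea of "ranks for the Top-K recommendations" concrete, here is a small sketch for a single user. The helper `top_k_ranks` is hypothetical, and the encoding (rank 1 for the best item, unranked items omitted) is an assumption; this section does not state how the library encodes ranks inside `y_pred_top_K`.

```python
import heapq
from typing import Dict


def top_k_ranks(scores: Dict[int, float], k: int) -> Dict[int, int]:
    """Map each of the k highest-scored items to its rank (1 = best)."""
    # nlargest returns (item, score) pairs ordered from highest to lowest score.
    best = heapq.nlargest(k, scores.items(), key=lambda pair: pair[1])
    # Items outside the Top-K are left out, like zero entries in a sparse matrix.
    return {item: rank for rank, (item, _score) in enumerate(best, start=1)}


# Item 11 has the highest score, item 12 the second highest.
print(top_k_ranks({10: 0.2, 11: 0.9, 12: 0.5}, k=2))
```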
- _eliminate_empty_users(y_true: csr_matrix, y_pred: csr_matrix) → Tuple[csr_matrix, csr_matrix]
Eliminate users that have no interactions in y_true.
Users with no interactions in y_true are eliminated from the prediction matrix y_pred. This avoids division by zero and reduces computational overhead.
- Parameters:
y_true (csr_matrix) – True user-item interactions.
y_pred (csr_matrix) – Predicted affinity of users for items.
- Returns:
(y_true, y_pred), with users that have no interactions in y_true eliminated.
- Return type:
Tuple[csr_matrix, csr_matrix]
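The filtering step described here can be illustrated without sparse matrices. The following is a sketch under stated assumptions: the function name `eliminate_empty_users` and the dict-of-users representation are stand-ins chosen for the example, not the library's internals.

```python
from typing import Dict, Set, Tuple

Interactions = Dict[int, Set[int]]          # user -> items with a true interaction
Predictions = Dict[int, Dict[int, float]]   # user -> {item: predicted score}


def eliminate_empty_users(
    y_true: Interactions, y_pred: Predictions
) -> Tuple[Interactions, Predictions]:
    """Drop users without true interactions from both inputs."""
    # A user is kept only if there is at least one true interaction,
    # which guarantees a non-zero denominator in the recall formula.
    keep = {user for user, items in y_true.items() if items}
    filtered_true = {user: y_true[user] for user in keep}
    filtered_pred = {user: s for user, s in y_pred.items() if user in keep}
    return filtered_true, filtered_pred


y_true = {0: {1}, 1: set(), 2: {3, 4}}
y_pred = {0: {1: 0.5}, 1: {2: 0.9}, 2: {3: 0.1}}
filtered_true, filtered_pred = eliminate_empty_users(y_true, y_pred)
print(sorted(filtered_true), sorted(filtered_pred))
```

User 1 has predictions but no true interactions, so it is removed from both structures.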
- _false_negative: int
Number of false negatives computed. Used for caching to obtain macro results.
- _false_positive: int
Number of false positives computed. Used for caching to obtain macro results.
- property _indices
Indices in the prediction matrix for which scores were computed.
- _map_users(users)
Map internal identifiers of users to actual user identifiers.
- _scores: csr_matrix | None
- _set_shape(y_true: csr_matrix) → None
Set the number of users and items in the metric.
The values of self._num_users and self._num_items are set to the number of users and items in y_true. This ensures that the metric is computed with the correct shape.
- Parameters:
y_true (csr_matrix) – Binary representation of user-item interactions.
- _true_positive: int
Number of true positives computed. Used for caching to obtain macro results.
- _value: float
- _verify_shape(y_true: csr_matrix, y_pred: csr_matrix) → bool
Make sure the dimensions of y_true and y_pred match.
- Parameters:
y_true (csr_matrix) – True user-item interactions.
y_pred (csr_matrix) – Predicted affinity of users for items.
- Raises:
AssertionError – Shape mismatch between y_true and y_pred.
- Returns:
True if dimensions match.
- Return type:
bool
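A shape check with this contract (return True on a match, raise AssertionError otherwise) might look like the sketch below. The helper `verify_shape` and its tuple arguments are assumptions for illustration; with real matrices the `.shape` attributes would be compared in the same way.

```python
from typing import Tuple


def verify_shape(true_shape: Tuple[int, int], pred_shape: Tuple[int, int]) -> bool:
    """Return True if both (num_users, num_items) shapes match."""
    # An assert statement raises AssertionError on failure. Note that Python
    # strips assert statements when run with the -O flag.
    assert true_shape == pred_shape, (
        f"Shape mismatch between y_true {true_shape} and y_pred {pred_shape}"
    )
    return True


print(verify_shape((3, 5), (3, 5)))
```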
- _y_pred: csr_matrix
- _y_true: csr_matrix
- cache_values(y_true: csr_matrix, y_pred: csr_matrix) → None
Cache the values of y_true and y_pred for later use.
Basic method to cache the values of y_true and y_pred for later use. This is useful when the metric can be calculated from the cumulative values of y_true and y_pred.
Note
This method should be overwritten in the child class if the metric cannot be calculated from the cumulative values of y_true and y_pred. For example, the default behavior of Precision@K is to obtain the top-K ranks of y_pred and y_true, which may cause cumulative values to be dropped.
- Parameters:
y_true (csr_matrix) – True user-item interactions.
y_pred (csr_matrix) – Predicted affinity of users for items.
- Raises:
ValueError – If caching is disabled for the metric.
Deprecated: Caching values for the metric is no longer needed for core functionality due to a change in the compute method.
- calculate(y_true: csr_matrix, y_pred: csr_matrix) → None
Computes the metric given true labels y_true and predicted scores y_pred. Only Top-K recommendations are considered.
Detailed metric results can be retrieved with results. The global aggregate metric value is retrieved as value.
- Parameters:
y_true (csr_matrix) – True user-item interactions.
y_pred (csr_matrix) – Predicted affinity of users for items.
- calculate_cached()
Calculate the metric using the cached values of y_true and y_pred.
This method calculates the metric using the cached values of y_true and y_pred. calculate() will be called on the cached values.
Note
This method should be overwritten in the child class if the metric cannot be calculated with the cumulative values of y_true and y_pred.
- Raises:
ValueError – If caching is disabled for the metric.
Deprecated: Caching values for the metric is no longer needed for core functionality due to a change in the compute method.
- property col_names
The names of the columns in the results DataFrame.
- get_params()
Get the parameters of the metric.
- property identifier
Name of the metric.
- property macro_result: float | None
Global metric value obtained by taking the average over all users.
- Raises:
ValueError – If the metric has not been calculated yet.
- Returns:
The global metric value.
- Return type:
float, optional
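The relationship between user-level scores and the global value can be sketched as a plain average. The function below is hypothetical and only mirrors the documented behavior (mean over all users, ValueError before calculation); the `user_scores` mapping is an assumed stand-in for the metric's internal state.

```python
from statistics import fmean
from typing import Dict, Optional


def macro_result(user_scores: Optional[Dict[str, float]]) -> float:
    """Average the per-user metric values into one global value."""
    if user_scores is None:
        # Mirrors the documented ValueError for a metric that was never calculated.
        raise ValueError("metric has not been calculated yet")
    return fmean(user_scores.values())


# Three users with recall 1.0, 0.5 and 0.0 give a global value of 0.5.
print(macro_result({"a": 1.0, "b": 0.5, "c": 0.0}))
```

Every user contributes equally to this average, regardless of how many true interactions the user has.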
- property micro_result: dict[str, ndarray]
User level results for the metric.
Contains an entry for every user.
- Returns:
The results DataFrame with columns: user_id, score
- Return type:
pd.DataFrame
- property name
Name of the metric.
- property num_items: int
Dimension of the item-space in both y_true and y_pred.
- property num_users: int
Dimension of the user-space in both y_true and y_pred after elimination of users without interactions in y_true.
- property params
Parameters of the metric.
- prepare_matrix(y_true: csr_matrix, y_pred: csr_matrix) → Tuple[csr_matrix, csr_matrix]
Prepare the matrices for the metric calculation.
This method eliminates empty users and sets the shape of the matrices before the metric is calculated.
- Parameters:
y_true (csr_matrix) – True user-item interactions.
y_pred (csr_matrix) – Predicted affinity of users for items.
- Returns:
Tuple of the prepared matrices.
- Return type:
Tuple[csr_matrix, csr_matrix]
- property timestamp_limit
The timestamp limit for the metric.