streamsight.preprocessing.MinUsersPerItem

class streamsight.preprocessing.MinUsersPerItem(min_users_per_item: int, item_ix: str, user_ix: str, count_duplicates: bool = True)

Bases: Filter

Require that a minimum number of users has interacted with an item.

This code is adapted from RecPack [MVG22].

Example

Original interactions (user - item)
1 - a
1 - b
1 - c
2 - a
2 - b
2 - d
3 - a
3 - b
3 - d

After MinUsersPerItem(3)
1 - a
1 - b
2 - a
2 - b
3 - a
3 - b

Items c (one user) and d (two users) fall below the threshold, so their interactions are removed.
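
The example above can be reproduced with plain pandas. This is a minimal sketch of the filtering idea, not the streamsight implementation: the helper name `keep_items_with_min_users` and the column names `user` and `item` are assumptions made for illustration, and the numbers are read as users and the letters as items. It counts distinct users per item, which for this duplicate-free data gives the same result whichever way duplicates are handled.

```python
import pandas as pd

# Interactions from the example; the column names are assumptions.
interactions = pd.DataFrame(
    {
        "user": [1, 1, 1, 2, 2, 2, 3, 3, 3],
        "item": ["a", "b", "c", "a", "b", "d", "a", "b", "d"],
    }
)


def keep_items_with_min_users(df, min_users, item_ix="item", user_ix="user"):
    """Keep rows whose item has at least `min_users` distinct users (hypothetical helper)."""
    # Number of distinct users that interacted with each item.
    users_per_item = df.groupby(item_ix)[user_ix].nunique()
    # Items that reach the threshold.
    popular_items = users_per_item[users_per_item >= min_users].index
    # Return a copy so the caller's DataFrame is left untouched.
    return df[df[item_ix].isin(popular_items)].copy()


filtered = keep_items_with_min_users(interactions, min_users=3)
print(filtered)
```

Only the interactions with items a and b remain, because each of those items was seen by all three users.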

Parameters:

min_users_per_item (int) – Minimum number of users required.

item_ix (str) – Name of the column in which item identifiers are listed.

user_ix (str) – Name of the column in which user identifiers are listed.

count_duplicates (bool) – Count multiple interactions with the same user. Defaults to True.
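
To illustrate how `count_duplicates` can change the outcome, here is a hedged sketch of a stand-in class that mirrors the constructor signature above. `MinUsersPerItemSketch` is hypothetical and is not the streamsight class. It assumes that with `count_duplicates=False` each (user, item) pair is counted once, while with `count_duplicates=True` every interaction row counts toward the threshold.

```python
import pandas as pd


class MinUsersPerItemSketch:
    """Hypothetical stand-in mirroring the documented signature; not the library code."""

    def __init__(self, min_users_per_item, item_ix, user_ix, count_duplicates=True):
        self.min_users_per_item = min_users_per_item
        self.item_ix = item_ix
        self.user_ix = user_ix
        self.count_duplicates = count_duplicates

    def apply(self, df):
        # Assumption: when duplicates are not counted, collapse repeated
        # (user, item) pairs before counting interactions per item.
        rows = df if self.count_duplicates else df.drop_duplicates(
            [self.user_ix, self.item_ix]
        )
        counts = rows[self.item_ix].value_counts()
        keep = counts[counts >= self.min_users_per_item].index
        # Filter the original rows so repeated interactions are preserved.
        return df[df[self.item_ix].isin(keep)].copy()


# User 1 interacted with item "a" twice; the column names are assumptions.
df = pd.DataFrame({"user": [1, 1, 2, 3], "item": ["a", "a", "a", "b"]})

with_dups = MinUsersPerItemSketch(3, "item", "user", count_duplicates=True).apply(df)
without_dups = MinUsersPerItemSketch(3, "item", "user", count_duplicates=False).apply(df)
```

Under these assumptions, item a passes a threshold of 3 only when the repeated interaction from user 1 is counted, so `with_dups` keeps its three rows and `without_dups` is empty.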

__init__(min_users_per_item: int, item_ix: str, user_ix: str, count_duplicates: bool = True)

Methods

__init__(min_users_per_item, item_ix, user_ix[, count_duplicates])

apply(df)

Apply Filter to the DataFrame passed.

apply(df) → DataFrame

Apply Filter to the DataFrame passed.

Parameters:

df (pd.DataFrame) – DataFrame to filter