streamsight.preprocessing.MinItemsPerUser

class streamsight.preprocessing.MinItemsPerUser(min_items_per_user: int, item_ix: str, user_ix: str, count_duplicates: bool = True)

Bases: Filter

Require that a user has interacted with a minimum number of items.

This code is adapted from RecPack [MVG22].

Example

Original interactions (each row reads item - user)
1 - a
1 - b
1 - c
2 - a
2 - b
2 - d
3 - a
3 - b
3 - d

After MinItemsPerUser(3)
1 - a
1 - b
2 - a
2 - b
3 - a
3 - b

Parameters:

min_items_per_user (int) – Minimum number of items a user must have interacted with to be kept.

item_ix (str) – Name of the column in which item identifiers are listed.

user_ix (str) – Name of the column in which user identifiers are listed.

count_duplicates (bool) – Whether repeated interactions with the same item each count toward the minimum. Defaults to True.
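To make the effect of count_duplicates concrete, here is a minimal sketch of the counting rule in plain Python. It is not the library's implementation: the helper name min_items_per_user, the (user, item) tuple layout, and the sample rows are all assumptions made for illustration, and it assumes that users who pass the threshold keep all of their rows, duplicates included.

```python
from collections import Counter


def min_items_per_user(interactions, min_items, count_duplicates=True):
    """Keep the interactions of users with at least `min_items` interactions.

    Hypothetical stand-in for the DataFrame-based filter: `interactions`
    is a list of (user, item) pairs rather than a DataFrame.
    """
    # With count_duplicates=False, a repeated (user, item) pair counts once.
    counted = interactions if count_duplicates else set(interactions)
    per_user = Counter(user for user, _ in counted)
    # Rows of users that pass the threshold are kept unchanged.
    return [(u, i) for u, i in interactions if per_user[u] >= min_items]


# User 1 has three interactions but only two distinct items.
rows = [(1, "a"), (1, "a"), (1, "b"), (2, "a"), (2, "b"), (2, "c"), (3, "a")]

kept_with_duplicates = min_items_per_user(rows, 3, count_duplicates=True)
kept_distinct_only = min_items_per_user(rows, 3, count_duplicates=False)
```

In this sketch, user 1 survives only when duplicates are counted, user 2 survives in both cases, and user 3 is dropped in both cases.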

__init__(min_items_per_user: int, item_ix: str, user_ix: str, count_duplicates: bool = True)

Methods

__init__(min_items_per_user, item_ix, user_ix[, count_duplicates])

apply(df)

Apply Filter to the DataFrame passed.

apply(df) → DataFrame

Apply Filter to the DataFrame passed.

Parameters:

df (pd.DataFrame) – DataFrame to filter
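The following sketch shows the kind of DataFrame-in, DataFrame-out operation that apply(df) describes, using pandas only. It is an approximation under stated assumptions, not the library's code: the function apply_min_items_per_user and the column names "uid" and "iid" are hypothetical, and the sketch assumes the default count_duplicates=True, so every row counts toward a user's total.

```python
import pandas as pd


def apply_min_items_per_user(df, min_items_per_user, user_ix):
    """Hypothetical stand-in for MinItemsPerUser(...).apply(df)."""
    # Count the interactions recorded for each user.
    items_per_user = df[user_ix].value_counts()
    # Users that reach the threshold are kept.
    keep = items_per_user[items_per_user >= min_items_per_user].index
    # Return a filtered copy; the input DataFrame is left unmodified.
    return df[df[user_ix].isin(keep)].copy()


df = pd.DataFrame(
    {
        "uid": [1, 1, 1, 2, 2, 3],
        "iid": ["a", "b", "c", "a", "b", "a"],
    }
)

filtered = apply_min_items_per_user(df, 3, user_ix="uid")
```

Here only user 1 has three interactions, so filtered contains that user's three rows and df itself is unchanged.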