timestamp

logger = logging.getLogger(__name__) module-attribute

TimestampSplitter

Bases: Splitter

Split an interaction dataset by timestamp.

The splitter divides the data into two parts:

  1. Interactions with timestamps in the interval [t - t_lower, t), representing past interactions.
  2. Interactions with timestamps in the interval [t, t + t_upper), representing future interactions.

If t_lower or t_upper is None (the default), the corresponding interval is unbounded on that side.

Note that a user can appear in both the past and future interaction sets.
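
The interval rules above can be illustrated without the library. The following is a minimal sketch, assuming interactions are plain (user, timestamp) tuples; `split_by_timestamp` is a hypothetical helper written for this example and is not part of recnexteval.

```python
def split_by_timestamp(interactions, t, t_lower=None, t_upper=None):
    """Hypothetical stand-in that mirrors the documented interval rules."""
    past, future = [], []
    for user, ts in interactions:
        if ts < t:
            # Past window: [t - t_lower, t), unbounded below when t_lower is None.
            if t_lower is None or ts >= t - t_lower:
                past.append((user, ts))
        else:
            # Future window: [t, t + t_upper), unbounded above when t_upper is None.
            if t_upper is None or ts < t + t_upper:
                future.append((user, ts))
    return past, future


interactions = [("a", 5), ("a", 10), ("b", 12), ("b", 20), ("c", 1)]
past, future = split_by_timestamp(interactions, t=10, t_lower=5, t_upper=10)
print(past)    # [('a', 5)]
print(future)  # [('a', 10), ('b', 12)]
```

User "a" lands in both halves, while the interactions at timestamps 1 and 20 fall outside both windows and are dropped. The upper bound is exclusive, so timestamp 20 is not part of the future set.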

Attributes:

| Name | Type | Description |
| --- | --- | --- |
| past_interaction | InteractionMatrix | Interactions in the interval [t - t_lower, t), or all interactions before t when t_lower is None, representing unlabeled data for prediction. |
| future_interaction | InteractionMatrix | Interactions in the interval [t, t + t_upper), or [t, inf) when t_upper is None, used for training the model. |

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| t | int | Timestamp to split on, in seconds since the Unix epoch. | required |
| t_lower | None \| int | Seconds before t to include in the past interactions. If None, the interval is unbounded. | None |
| t_upper | None \| int | Seconds after t to include in the future interactions. If None, the interval is unbounded. | None |
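
Since t is an absolute Unix timestamp while t_lower and t_upper are durations in seconds, it can help to derive all three from calendar values. A small sketch using only the standard library; the specific date and window sizes are arbitrary choices for illustration.

```python
from datetime import datetime, timedelta, timezone

# Split point: midnight UTC on 1 January 2024, as seconds since the Unix epoch.
t = int(datetime(2024, 1, 1, tzinfo=timezone.utc).timestamp())

# Window sizes are durations, also expressed in seconds.
t_lower = int(timedelta(days=7).total_seconds())  # one week before t
t_upper = int(timedelta(days=1).total_seconds())  # one day after t

print(t, t_lower, t_upper)  # 1704067200 604800 86400
```

Based on the signature documented here, these values would be passed as `TimestampSplitter(t=t, t_lower=t_lower, t_upper=t_upper)`. Setting the timezone explicitly avoids a split point that depends on the local clock of the machine.
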
Source code in src/recnexteval/settings/splitters/timestamp.py
class TimestampSplitter(Splitter):
    """Split an interaction dataset by timestamp.

    The splitter divides the data into two parts:

    1. Interactions with timestamps in the interval `[t - t_lower, t)`,
       representing past interactions.
    2. Interactions with timestamps in the interval `[t, t + t_upper)`,
       representing future interactions.

    If `t_lower` or `t_upper` is None (the default), the corresponding
    interval is unbounded on that side.

    Note that a user can appear in both the past and future interaction sets.

    Attributes:
        past_interaction (InteractionMatrix): Interactions in the interval
            `[t - t_lower, t)`, or all interactions before `t` when `t_lower`
            is None, representing unlabeled data for prediction.
        future_interaction (InteractionMatrix): Interactions in the interval
            `[t, t + t_upper)`, or `[t, inf)` when `t_upper` is None, used for
            training the model.

    Args:
        t: Timestamp to split on, in seconds since the Unix epoch.
        t_lower: Seconds before `t` to include in
            the past interactions. If None, the interval is unbounded.
            Defaults to None.
        t_upper: Seconds after `t` to include in
            the future interactions. If None, the interval is unbounded.
            Defaults to None.
    """

    def __init__(
        self,
        t: int,
        t_lower: None | int = None,
        t_upper: None | int = None,
    ) -> None:
        super().__init__()
        self.t = t
        self.t_lower = t_lower
        self.t_upper = t_upper

    def split(self, data: InteractionMatrix) -> tuple[InteractionMatrix, InteractionMatrix]:
        """Split the interaction data by timestamp.

        The method computes the past and future subsets of the input data
        and returns them as a pair.

        Args:
            data: The interaction dataset to split.
                Must include timestamp information.

        Returns:
            A pair containing the past interactions and future interactions.
        """

        if self.t_lower is None:
            # timestamp < t
            past_interaction = data.timestamps_lt(self.t)
        else:
            # t - t_lower <= timestamp < t
            past_interaction = data.timestamps_lt(self.t).timestamps_gte(self.t - self.t_lower)

        if self.t_upper is None:
            # timestamp >= t
            future_interaction = data.timestamps_gte(self.t)
        else:
            # t <= timestamp < t + t_upper
            future_interaction = data.timestamps_gte(self.t).timestamps_lt(self.t + self.t_upper)

        logger.debug("%s has completed the split", self.identifier)

        return past_interaction, future_interaction

t = t instance-attribute

t_lower = t_lower instance-attribute

t_upper = t_upper instance-attribute

name property

Return the class name of the splitter.

Returns:

| Type | Description |
| --- | --- |
| str | The splitter class name. |

identifier property

Return a string identifier including the splitter's parameters.

The identifier includes the class name and a comma-separated list of attribute name/value pairs from self.__dict__.

Returns:

| Type | Description |
| --- | --- |
| str | Identifier string like Name(k1=v1,k2=v2). |
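
One way such an identifier could be assembled from self.__dict__ is sketched below. `SketchSplitter` and `SketchTimestampSplitter` are hypothetical classes written for this example; the actual Splitter base class is not shown on this page and may format values differently.

```python
class SketchSplitter:
    """Hypothetical base class; the real Splitter may differ."""

    @property
    def name(self) -> str:
        # The class name of the concrete splitter.
        return self.__class__.__name__

    @property
    def identifier(self) -> str:
        # Join instance attributes as k=v pairs, in assignment order.
        params = ",".join(f"{k}={v}" for k, v in self.__dict__.items())
        return f"{self.name}({params})"


class SketchTimestampSplitter(SketchSplitter):
    def __init__(self, t, t_lower=None, t_upper=None):
        self.t = t
        self.t_lower = t_lower
        self.t_upper = t_upper


splitter = SketchTimestampSplitter(t=10, t_upper=5)
print(splitter.identifier)  # SketchTimestampSplitter(t=10,t_lower=None,t_upper=5)
```

Because the string is built from instance attributes, two splitters configured with different t, t_lower, or t_upper values produce different identifiers, which makes log lines easy to tell apart.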

split(data)

Split the interaction data by timestamp.

The method computes the past and future subsets of the input data and returns them as a pair.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| data | InteractionMatrix | The interaction dataset to split. Must include timestamp information. | required |

Returns:

| Type | Description |
| --- | --- |
| tuple[InteractionMatrix, InteractionMatrix] | A pair containing the past interactions and future interactions. |
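
The chained filtering behind the returned pair can be tried out with a toy stand-in. In this sketch `ToyMatrix` is a hypothetical replacement for InteractionMatrix that only borrows the method names `timestamps_lt` and `timestamps_gte` from the source listing; it assumes each filter returns a new object and leaves the input untouched.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToyMatrix:
    """Hypothetical stand-in for InteractionMatrix: (user, item, timestamp) rows."""

    rows: tuple

    def timestamps_lt(self, ts):
        # Keep rows with timestamp strictly below ts.
        return ToyMatrix(tuple(r for r in self.rows if r[2] < ts))

    def timestamps_gte(self, ts):
        # Keep rows with timestamp at or above ts.
        return ToyMatrix(tuple(r for r in self.rows if r[2] >= ts))


data = ToyMatrix(((1, 100, 3), (1, 101, 8), (2, 100, 10), (2, 102, 14), (3, 103, 25)))
t, t_lower, t_upper = 10, 5, 10

# Same chaining as in split(): two half-open windows around t.
past = data.timestamps_lt(t).timestamps_gte(t - t_lower)
future = data.timestamps_gte(t).timestamps_lt(t + t_upper)

# Rows outside both windows belong to neither half.
dropped = sorted(set(data.rows) - set(past.rows) - set(future.rows))
print(past.rows)    # ((1, 101, 8),)
print(future.rows)  # ((2, 100, 10), (2, 102, 14))
print(dropped)      # [(1, 100, 3), (3, 103, 25)]
```

With both windows bounded, the two halves need not cover the whole dataset. Leaving t_lower and t_upper as None makes the pair a full partition of the input around t.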
