Preventing Conflicting Gradients in Neural Marked Temporal Point Processes

Tanguy Bosser, Souhaib Ben Taieb
(2024)

Sequences of labeled events occurring at irregular intervals arise naturally in a wide array of application domains. Neural Marked Temporal Point Processes (MTPP) are flexible models that capture complex inter-dependencies between such events. Learning these models involves learning two distributions: one over event times and one over event types (also called marks) conditional on time. However, most neural MTPP models implicitly enforce parameter sharing between the time and mark predictive distributions by jointly training a common representation of past event occurrences. While parameter sharing can be beneficial in certain applications, it may also trigger an undesirable phenomenon known as negative transfer, where the shared parameters benefit one task while hurting performance on the other. We propose a framework that unifies state-of-the-art parametrizations of neural MTPP models and enables independent modeling and training of the time and mark predictive distributions. In this context, we also propose a simple yet efficient parametrization of the mark distribution that relaxes the classical conditional independence assumption. Through experiments on multiple real-world event sequence datasets, we demonstrate the benefits of our independent learning framework compared to the original model parametrizations.
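
To make the notion of conflicting gradients in the title concrete, the following is a minimal sketch and not the authors' implementation. It assumes the common convention that two task gradients with respect to shared parameters conflict when their cosine similarity is negative. The two quadratic losses, the parameter vector `theta`, and the names `time_target` and `mark_target` are hypothetical stand-ins for the time and mark objectives.

```python
import numpy as np


def cosine_similarity(g1, g2):
    """Cosine of the angle between two gradient vectors (both assumed non-zero)."""
    return float(np.dot(g1, g2) / (np.linalg.norm(g1) * np.linalg.norm(g2)))


def gradients_conflict(g1, g2):
    """Assumed convention: gradients conflict when their cosine similarity is negative."""
    return cosine_similarity(g1, g2) < 0.0


# Hypothetical shared parameters and toy per-task optima (illustration only).
theta = np.zeros(2)
time_target = np.array([1.0, 0.0])   # minimizer of the toy "time" loss
mark_target = np.array([-1.0, 0.5])  # minimizer of the toy "mark" loss

# Each toy loss is 0.5 * ||theta - target||^2, so its gradient is theta - target.
grad_time = theta - time_target
grad_mark = theta - mark_target

cos = cosine_similarity(grad_time, grad_mark)
conflict = gradients_conflict(grad_time, grad_mark)
print(f"cosine similarity: {cos:.3f}, conflicting: {conflict}")
```

In this toy setting the two losses pull the shared parameters in nearly opposite directions, so a step that lowers one loss tends to raise the other. Giving each task its own parameters, as in the independent training described above, removes this interaction by construction.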