Article summary:

1. VideoIC: The article introduces VideoIC, a large-scale video interactive comments dataset of 4951 videos spanning 557 hours and 5 million comments. Its comment information is richer and denser than that of existing danmaku datasets, making it suitable for research on automatic video comment generation.

2. Multimodal Multitask Learning: The article proposes MML-CG, a novel model for comment generation in video interactive commenting. The model integrates multiple modalities to generate comments and predict temporal relations. A multitask loss function trains both tasks jointly, end to end.

3. Experimental Results: Extensive experiments on both the VideoIC and Livebot datasets evaluate the proposed model. The results demonstrate the efficacy of MML-CG and provide insights into the features of danmaku (live video interactive commenting).
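
The joint training mentioned in point 2 can be illustrated with a minimal sketch. The summary does not give the form of the multitask loss, so everything here is an assumption made for illustration: both tasks are scored with cross-entropy, the two terms are combined as a weighted sum, and the names `multitask_loss` and `weight` are hypothetical rather than taken from the article.

```python
import math


def cross_entropy(probs, target):
    """Negative log-probability assigned to the target index."""
    return -math.log(probs[target])


def multitask_loss(token_probs, token_targets,
                   relation_probs, relation_target, weight=0.5):
    """Weighted sum of a generation loss and a temporal-relation loss.

    Assumed form (not confirmed by the article):
        loss = generation_loss + weight * relation_loss
    """
    # Comment generation: mean cross-entropy over the target tokens.
    generation_loss = sum(
        cross_entropy(p, t) for p, t in zip(token_probs, token_targets)
    ) / len(token_targets)
    # Temporal relation prediction: one classification decision.
    relation_loss = cross_entropy(relation_probs, relation_target)
    return generation_loss + weight * relation_loss


# Toy example: one target token over a 2-word vocabulary,
# and a 2-class temporal relation.
loss = multitask_loss([[0.5, 0.5]], [0], [0.25, 0.75], 1, weight=1.0)
```

Because both terms feed a single scalar, one backward pass would update the shared multimodal encoder for both tasks, which is the usual sense of training jointly end to end.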

