NarraSum: A Large-Scale Dataset for Abstractive Narrative Summarization

Abstract

Narrative summarization aims to produce a distilled version of a narrative to describe its most salient events and characters. Writing a summary for a narrative is challenging as it requires an understanding of event causality and character behaviors. To encourage research in this direction, we propose NarraSum, a large-scale narrative summarization dataset. It contains 122K narratives, which are collected from the synopses of movies and TV episodes with diverse genres, and their corresponding abstractive summaries. Experiments show that there is a large performance gap between humans and the state-of-the-art summarization models on NarraSum. We hope that this dataset will promote future research in summarization, as well as broader studies of natural language understanding and generation. The dataset is available at https://github.com/zhaochaocs/narrasum.

Publication
In Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing
Kaiqiang Song
Kaiqiang Song
Senior Research Scientist

Kaiqiang Song (宋凯强) is a Senior Research Scientist at Tencent AI Lab, Seattle, specializing in Natural Language Processing. His research focuses on advancing artificial intelligence through machine learning, NLP, and large language models. He is dedicated to optimizing AI model architectures for practical applications like text summarization and text generation, bridging the gap between foundational AI research and real-world impact.