Adapting the Neural Encoder-Decoder Framework from Single to Multi-Document Summarization

Abstract

Generating a text abstract from a set of documents remains a challenging task. The neural encoder-decoder framework has recently been exploited to summarize single documents, but its success can in part be attributed to the availability of large parallel data automatically acquired from the Web. In contrast, parallel data for multi-document summarization are scarce and costly to obtain. There is a pressing need to adapt an encoder-decoder model trained on single-document summarization data to work with multiple-document input. In this paper, we present an initial investigation into a novel adaptation method. It exploits the maximal marginal relevance method to select representative sentences from multi-document input, and leverages an abstractive encoder-decoder model to fuse disparate sentences to an abstractive summary. The adaptation method is robust and itself requires no training data. Our system compares favorably to state-of-the-art extractive and abstractive approaches judged by automatic metrics and human assessors.

Publication
In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing
Kaiqiang Song
Kaiqiang Song
Senior Research Scientist

Kaiqiang Song (宋凯强) is a Senior Research Scientist at Tencent AI Lab, Seattle, specializing in Natural Language Processing. His research focuses on advancing artificial intelligence through machine learning, NLP, and large language models. He is dedicated to optimizing AI model architectures for practical applications like text summarization and text generation, bridging the gap between foundational AI research and real-world impact.