SurveySum: A Dataset for Summarizing Multiple Scientific Articles into a Survey Section
Leandro Carísio Fernandes, Gustavo Bartz Guedes, Thiago Soares Laitz, Thales Sales Almeida, Rodrigo Nogueira, Roberto Lotufo, Jayr Pereira
2024-09-02

Summary
This paper introduces SurveySum, a new dataset for summarizing multiple scientific articles into a single section of a survey.
What's the problem?
Summarizing scientific articles is difficult, especially when information from several papers has to be combined into one concise and informative summary. No dataset had been designed specifically for this task, which makes it hard for researchers to develop effective summarization tools.
What's the solution?
The authors introduce SurveySum, a dataset built specifically for summarizing multiple scientific articles into a section of a survey. They also provide two pipelines that perform this summarization, and they evaluate the pipelines with multiple metrics to compare their performance.
Why does it matter?
This research is important because it fills a gap in the tools available for summarizing scientific literature. By providing a dedicated dataset and effective summarization methods, it can help researchers quickly understand multiple studies, facilitating better communication and collaboration in the scientific community.
Abstract
Document summarization is a task to shorten texts into concise and informative summaries. This paper introduces a novel dataset designed for summarizing multiple scientific articles into a section of a survey. Our contributions are: (1) SurveySum, a new dataset addressing the gap in domain-specific summarization tools; (2) two specific pipelines to summarize scientific articles into a section of a survey; and (3) the evaluation of these pipelines using multiple metrics to compare their performance. Our results highlight the importance of high-quality retrieval stages and the impact of different configurations on the quality of generated summaries.
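The abstract stresses the retrieval stage that precedes summary generation. The sketch below illustrates the general retrieve-then-summarize idea only; it is not the authors' pipeline. It assumes a bag-of-words cosine similarity as the ranking function and a simple prompt template, and the names `tokenize`, `cosine`, `retrieve`, and `build_prompt` are hypothetical helpers introduced for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two term-frequency Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, chunks, k=2):
    """Return the k chunks most similar to the query (assumed ranking scheme)."""
    q = Counter(tokenize(query))
    scored = [(cosine(q, Counter(tokenize(c))), c) for c in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [chunk for _, chunk in scored[:k]]


def build_prompt(section_title, chunks, k=2):
    """Assemble a prompt for a downstream summarizer (hypothetical template)."""
    top = retrieve(section_title, chunks, k)
    context = "\n".join(f"[{i + 1}] {c}" for i, c in enumerate(top))
    return (
        f"Write the survey section '{section_title}' "
        f"using only these excerpts:\n{context}"
    )
```

Calling `build_prompt("neural passage retrieval", chunks)` with a list of text chunks taken from the cited papers returns a prompt containing the two best-matching excerpts, which would then be passed to whatever summarization model is in use. A weak ranking function at this step leaves the summarizer with irrelevant context, which is one way to read the abstract's point about retrieval quality.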