Paper, Code and dataset. Tags: #nlp, #text-summarization
TLDR generation requires extreme source compression, expert background knowledge, and an understanding of complex domain-specific language. To facilitate the study of this task, we introduce SciTLDR, a new multi-target dataset of 5.4K TLDRs over 3.2K papers.
We propose CATTS, a simple but effective learning strategy for generating TLDRs that exploits titles as an auxiliary training signal.
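The title-scaffolding idea can be sketched as a data-preparation step: train one seq2seq model on a mix of title-generation and TLDR-generation examples, marking each example's target type with a control token. This is a minimal illustrative sketch; the field names and control-token strings here are assumptions, not the paper's exact implementation.

```python
def make_example(source, target, task):
    """Append a control token to the source so the model knows which
    target type (title or TLDR) to generate. Token strings are assumed."""
    control = {"title": "<|TITLE|>", "tldr": "<|TLDR|>"}[task]
    return {"source": f"{source} {control}", "target": target}

def build_training_mix(papers):
    """Interleave title-generation (auxiliary) and TLDR-generation (main)
    examples. `papers` is a list of dicts with 'abstract', 'title', and
    'tldrs' keys (hypothetical schema)."""
    examples = []
    for p in papers:
        # Auxiliary signal: generate the paper's title from its abstract.
        examples.append(make_example(p["abstract"], p["title"], "title"))
        # Main task: one example per gold TLDR (SciTLDR is multi-target).
        for tldr in p["tldrs"]:
            examples.append(make_example(p["abstract"], tldr, "tldr"))
    return examples
```

At inference time, the same control token (`<|TLDR|>` in this sketch) is appended to the input so the model produces a TLDR rather than a title.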
In contrast to abstracts, TLDRs focus on the key aspects of a paper, such as its main contributions, eschewing nonessential background and methodological details. TLDRs enable readers to quickly discern a paper's key points and decide whether it is worth reading in full.