Abstract

Abstractive summarization has seen substantial improvements in recent years, largely due to advances in neural language modeling, language model pretraining, and the scaling of models and datasets. While large language models generate summaries that are fluent and coherent and that integrate the salient information from the source document well, several challenges remain. Most importantly, information that is either unsupported by the source document (hallucinations) or factually inaccurate finds its way into machine-written summaries. Moreover, and connected to this first point, knowledge retrieval and summary generation happen implicitly, which leads to a lack of interpretability and controllability of the models. In this thesis, we contribute to solving these problems by making the summarization process more interpretable, faithful, and controllable. The thesis consists of two parts. In Part I, we learn interpretable representations that help with summary structure, faithfulness, and document understanding. First, we plan summary content at the sentence level, building a next-sentence representation from the summary generated so far. Second, we integrate an entailment interpretation into standard text-encoding neural network architectures. In the last chapter of the first part, we use multiple object discovery methods from computer vision to identify semantic text units that should facilitate the extraction of salient information from source documents. In Part II, we turn to the evaluation of summarization models and also contribute annotated resources for our tasks. We start by using the attention weights and probability estimates produced during summary generation to identify hallucinations. We then apply summarization models in a novel semi-structured setting, where the model is asked to generate an interpretation from a long source document. For this novel task, we develop an evaluation technique that allows efficient contrastive evaluation of generative models with respect to user-specified distinctions.