Abstract

The ability to reason, plan and solve highly abstract problems is a hallmark of human intelligence. Recent advancements in artificial intelligence, propelled by deep neural networks, have revolutionized disciplines like computer vision and natural language processing. However, in spite of the amazing progress that we are witnessing, the challenge of creating models that can acquire human-level reasoning abilities in a sample-efficient manner persists. To take a step forward, it is crucial to acknowledge that all models inherently carry inductive biases and that human-level intelligence cannot be general and requires the incorporation of appropriate knowledge priors. Following this line of thought, this study aims to scrutinize and enhance the reasoning abilities of neural networks by incorporating proper knowledge priors and biasing learning through structured representations. Due to the complexity of the problem at hand, we aim to investigate it through multiple lenses. The thesis unfolds into three main parts, each focusing on distinct tasks and perspectives. In the first part of the thesis, our research revolves around reasoning and planning in interactive textual environments. We introduce novel environments for evaluating commonsense reasoning skills and decision-making abilities of neural agents. Then, we investigate whether graph-structured representations can serve as appropriate inductive biases for knowledge representation and reasoning with neural agents. We propose agents that use graphs both as a source of prior knowledge and as a model of the state of the world, showing that they act more sample efficiently. Further, we introduce a general algorithm inspired by case-based reasoning to train on-policy agents, improving their planning and out-of-distribution generalization abilities. In the second part, we isolate core factual reasoning challenges and investigate how language models can reason and benefit from prior knowledge.
We delve into language-understanding tasks and introduce an efficient method to navigate large-scale knowledge graphs and answer natural language questions requiring complex logical reasoning and robustness to distributional shifts. Then, we introduce a method to enhance language models with prior knowledge in entity-linking tasks, showing improvements by infusing appropriate structure in the latent space. Finally, drawing inspiration from developmental science, we focus on the core knowledge priors of human intelligence, concentrating our efforts on geometry and topology priors. We introduce a variant of the transformer model that incorporates lattice symmetry priors, showing that it is two orders of magnitude more sample efficient than standard transformers on fundamental geometric reasoning. The contributions of this thesis span several fronts. We achieve state-of-the-art results on several benchmarks, including popular textual environments, standard question answering and entity linking datasets, as well as geometric reasoning tasks. Our text-based neural agents are more sample efficient and resilient to distributional shifts than the baselines. The proposed question answering model is orders of magnitude more scalable than competitive approaches and achieves compositional generalization out of the training distribution. Our entity linking method achieves results comparable to large generative models with 18 times more parameters.