A Unified Framework For Automated, Accurate and Flexible Knowledge Graphs from Free Text
Research in Machine Learning and AI has made great advances, but recently there has been much discussion of the issues with state-of-the-art models: lack of interpretability and transparency, inability to generalize and reason in unseen situations, and, frequently, the need to train on large labeled datasets that are difficult and expensive to generate at the scale required. There is increasing interest in integrating symbolic knowledge representation and reasoning methods into Machine Learning solutions in order to tackle these issues and create adaptable systems that can be applied to a variety of domains and settings. Knowledge Graphs have been gaining visibility in the research community as a form of structured representation of information. Integrating Knowledge Graphs into downstream tasks has already shown great potential in use cases such as Question Answering and recommendation. A major bottleneck, however, is the still-unsolved challenge of automatically creating and curating Knowledge Graphs that are accurate, can be maintained with minimal manual intervention, and balance adherence to design requirements against the flexibility needed to integrate new knowledge and generalize. Our work addresses this Knowledge Graph creation and curation bottleneck, with a specific focus on extracting knowledge from free text. We intend to tackle it by considering all aspects of end-to-end Knowledge Graph construction and application: domain discovery, schema design, information extraction for Knowledge Graph population, Knowledge Graph completion, evaluation, maintenance, and use in downstream tasks.
These aspects are usually considered in isolation in research work, whereas we propose to approach the bottleneck through a more unified framework that both pushes the boundaries of the existing methods for each aspect and establishes end-to-end systems that benefit from mutual interaction and feedback. The first contribution addresses the bottleneck of Knowledge Graph construction and curation, with particular attention to knowledge extraction and Knowledge Graph population from free text. Our approach incorporates existing linguistic and domain-specific knowledge bases into downstream linguistic tasks, and enriches distributed semantic representations with syntactic information through recursive structures and neuro-symbolic reasoning. The goal of this part of the project is to establish a two-way relationship between NLP methods and symbolic knowledge representation and reasoning.