Abstract
Today, we are surrounded by enormous amounts of full-text data on platforms across the web. Most of the time, we are more interested in a direct answer to our query than in reading through the documents returned by a web search. A Question Answering (QA) framework retrieves valuable data from the web and distills it into usable knowledge. The QA process can be divided into two parts: (1) Information Retrieval (IR), which finds the documents that contain the answer to the question, and (2) Reading Comprehension (RC), which extracts the answer from those documents. Most state-of-the-art work in reading comprehension focuses on answering questions from a single document or paragraph; in that setting, the answer to a query is always contained in the given context. In this project, by contrast, we integrate reasoning over information spread across multiple documents by posing it as an inference problem on a graph. In our approach, we consider two types of relations between sentences: match-based relations and sentence-based relations. The graph given as input to the model has nodes that are mentions of entities appearing in the question and in relevant sentences from a knowledge base, while edges encode the relations between these mentions. We use a unigram-based method to score candidate sentences, assigning each unigram a weight equal to the negative log of its normalized unigram probability. Given these weights, we retrieve relevant sentences from the corpus using the Weighted Containment Similarity Measurement (WCSM), which measures the similarity between two finite sets, namely the question and a sentence from the corpus. For each question, we then construct a graph from the dependency parses of the sentences in its set of supporting sentences.
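The unigram weighting and containment scoring described above can be sketched as follows. The abstract does not give the exact WCSM formula, so this is an illustrative sketch under one common assumption: the weighted overlap of question terms found in a sentence is normalized by the total weight of the question terms. The function names `unigram_weights` and `wcsm` are our own labels, not names from the paper.

```python
import math
from collections import Counter

def unigram_weights(corpus_tokens):
    """Weight each unigram by the negative log of its normalized corpus
    probability, so rare (more informative) words receive higher weight."""
    counts = Counter(corpus_tokens)
    total = sum(counts.values())
    return {w: -math.log(c / total) for w, c in counts.items()}

def wcsm(question_tokens, sentence_tokens, weights):
    """Weighted containment similarity (illustrative): the total weight of
    question terms that also occur in the sentence, divided by the total
    weight of all question terms. Returns a score in [0, 1]."""
    q = set(question_tokens)
    s = set(sentence_tokens)
    q_weight = sum(weights.get(w, 0.0) for w in q)
    if q_weight == 0.0:
        return 0.0
    overlap = sum(weights.get(w, 0.0) for w in q & s)
    return overlap / q_weight
```

A sentence containing every question term scores 1.0, one sharing no terms scores 0.0, and partial overlaps fall in between, weighted toward rarer words.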
Entity Graph Convolutional Networks (Entity-GCNs) are applied to these graphs. All of these questions require multiple supporting facts to answer, and it is typically hard to locate every necessary supporting fact from the question alone, so we train our model to perform multi-step reasoning. Entity-GCN is compact and scalable because every node performs differentiable message passing in parallel. We apply a graph attention network based model to the ARC (AI2 Reasoning Challenge) dataset, an open-source benchmark containing both easy and challenging questions, each with four answer choices. The dataset also includes a supplementary knowledge base, which is used to answer the questions.
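One round of the per-node message passing mentioned above can be sketched minimally. Entity-GCN in practice uses relation-specific weight matrices and gating; here, scalar weights `w_self` and `w_msg` (our own simplification, not the paper's parameterization) stand in for those matrices, and node features are single scalars.

```python
def gcn_layer(node_feats, edges, w_self, w_msg):
    """One round of message passing: each node averages its neighbours'
    features, scales the aggregate, adds a scaled self term, and applies
    a ReLU. Edges are treated as undirected pairs of node indices."""
    n = len(node_feats)
    neighbours = {i: [] for i in range(n)}
    for u, v in edges:
        neighbours[u].append(v)
        neighbours[v].append(u)
    out = []
    for i in range(n):
        msgs = [node_feats[j] for j in neighbours[i]]
        agg = sum(msgs) / len(msgs) if msgs else 0.0
        h = w_self * node_feats[i] + w_msg * agg
        out.append(max(0.0, h))  # ReLU non-linearity
    return out
```

Because each node's update depends only on its own feature and its neighbours' features, all nodes can be updated in parallel, which is the property that makes the model compact and scalable; stacking several such rounds lets information propagate across multiple hops, enabling multi-step reasoning.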