Abstract
The web-connected world we live in generates a wealth of data that, since the advent of Web 2.0, can be put to many uses. Such evolving data can be represented naturally as a graph. Among the major categories of NoSQL databases, graph databases have been successful in handling the complexity of highly interconnected data. Neo4j, the most widely used graph database, is a JVM-based, open-source system that lets users model, query, and manipulate data stored in the form of a graph. It is well suited to managing densely connected, semi-structured data, especially when a relational approach would involve numerous joins. The domain in which this project applies Neo4j is supply chain management. The biggest challenge in supply chain management is its increasingly interconnected data: with globalization at its peak, supply networks keep growing more complex. Because a supply chain network involves many relationships, relational databases need more table joins to query it, which increases complexity and execution time and degrades performance. The main objective of this project is to solve one complex and commonly faced supply chain management problem: product recall. A product may be recalled because it is contaminated, defective, substandard, or malfunctioning. A recall requires backtracking through the supply chain to identify the root cause and to find the orders that must be recalled. This is often achieved with a complicated query involving multi-table joins, which becomes costly on large datasets, so a more efficient solution is desirable. This project includes two implementations, both of which create and manipulate the graph data. The first is a standalone application that uses the Neo4j server. The server is accessed through a web browser and provides a platform for running and visualizing ad hoc queries.
The second implementation is a Java-based web application with an interactive user interface that processes data on a button click. In this case, the database is embedded within the project, and I use Neo4j's Java APIs to manipulate the database and produce output. The web service is hosted on my local Tomcat server. Recalls in legacy systems typically involve looking up data in various source tables and joining them to form a solution. A graph database provides an efficient solution to the recall problem because a traversal from source to destination only involves finding a path along the relationships between nodes. This avoids the long query execution times caused by joining multiple tables and thereby improves efficiency, as demonstrated later in this report.
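To make the idea of a recall as a traversal concrete, the following is a minimal sketch in plain Java. It does not use Neo4j and is not the project's implementation: the class `RecallSketch`, its methods, and the node names (`batch:`, `product:`, `order:` prefixes) are hypothetical, chosen only for illustration. It shows how the orders affected by a faulty batch can be found by following relationships outward from that batch, without any join.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical in-memory stand-in for a supply chain graph (not the Neo4j API).
public class RecallSketch {
    // Adjacency list: each node maps to the nodes directly downstream of it.
    private final Map<String, List<String>> downstream = new HashMap<>();

    // Adds a directed relationship, e.g. batch -> product or product -> order.
    public void addSupplyEdge(String from, String to) {
        downstream.computeIfAbsent(from, k -> new ArrayList<>()).add(to);
    }

    // Breadth-first traversal from the source node; returns every reachable
    // node whose name starts with "order:" (an assumed naming convention).
    public Set<String> affectedOrders(String source) {
        Set<String> visited = new HashSet<>();
        Set<String> orders = new TreeSet<>(); // sorted for stable output
        Deque<String> queue = new ArrayDeque<>();
        queue.add(source);
        visited.add(source);
        while (!queue.isEmpty()) {
            String node = queue.poll();
            if (node.startsWith("order:")) {
                orders.add(node);
            }
            for (String next : downstream.getOrDefault(node, Collections.emptyList())) {
                if (visited.add(next)) { // add() returns false if already seen
                    queue.add(next);
                }
            }
        }
        return orders;
    }

    public static void main(String[] args) {
        RecallSketch graph = new RecallSketch();
        graph.addSupplyEdge("batch:B1", "product:P1");
        graph.addSupplyEdge("batch:B2", "product:P2");
        graph.addSupplyEdge("product:P1", "order:O1");
        graph.addSupplyEdge("product:P1", "order:O2");
        graph.addSupplyEdge("product:P2", "order:O3");
        // Recall everything that depends on batch B1.
        System.out.println(graph.affectedOrders("batch:B1")); // prints [order:O1, order:O2]
    }
}
```

The traversal only visits nodes connected to the faulty batch, so its cost depends on the size of the affected subgraph rather than on the total size of the dataset. In a relational schema, the same question would typically be answered by joining a table per hop.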