Abstract
Hackers continuously target modern web applications with increasingly intricate techniques, and it is very difficult to predict which security events an attacker might trigger. Current research in the cyber security community therefore focuses on predicting such malicious attacks before they occur. Two prominent solutions address this problem. The first analyzes the file hash, machine identifier, file name, and directory of binary files on the host machine, using a combination of the Random Forest algorithm and a semi-supervised algorithm based on posterior probabilities. However, it produces a binary outcome: it can predict only whether an attack event may occur, not the type of attack event. The second solution analyzes parameters such as the timestamp and event description with a unidirectional recurrent neural network based on LSTM. It considers a sequence of log events to predict the next attack event that the hacker may execute. However, its precision fell when certain parameters in the attack request were changed. Moreover, it can process only up to one hundred words in a single sequence, so it cannot handle longer requests. Analyzing server logs is one way to predict the possibility of an attack event before it occurs. Attention neural networks have recently become popular because they are better suited to identifying context changes in the data, and they can process up to ten thousand words in a single sequence. The approach used in this project relies on a single parameter from web server logs, the query string. It analyzes sequences of such query strings with an attention neural network to predict security events, with the aim of achieving better precision in predicting attack events.
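As a rough illustration of the input described above, the following sketch extracts query strings from web server log lines and groups them into overlapping sequences. It is only a sketch under stated assumptions: the abstract does not specify the log format, the window size, or any function names, so the Common Log Format request pattern, the default window of three, and the helpers `extract_query_string`, `sliding_windows`, and `query_sequences` are all hypothetical. The attention model itself is not shown.

```python
import re
from urllib.parse import urlsplit

# Assumption: logs follow the Common/Combined Log Format, where the request
# appears as "METHOD /path?query HTTP/x.y". The source does not state this.
REQUEST_RE = re.compile(r'"(?:[A-Z]+) (\S+) HTTP/[\d.]+"')


def extract_query_string(log_line):
    """Return the raw query string of the logged request, or None if no request is found."""
    match = REQUEST_RE.search(log_line)
    if match is None:
        return None
    # urlsplit separates the path from the query; the query is left URL-encoded.
    return urlsplit(match.group(1)).query


def sliding_windows(items, size):
    """Return consecutive overlapping windows of `size` items (empty if too few items)."""
    return [items[i:i + size] for i in range(len(items) - size + 1)]


def query_sequences(log_lines, window_size=3):
    """Build sequences of query strings; requests without a query string are skipped."""
    queries = [q for q in map(extract_query_string, log_lines) if q]
    return sliding_windows(queries, window_size)


# Hypothetical sample log lines, made up for illustration only.
sample_logs = [
    '10.0.0.1 - - [01/Jan/2024:00:00:01 +0000] "GET /search?q=shoes HTTP/1.1" 200 512',
    '10.0.0.1 - - [01/Jan/2024:00:00:02 +0000] "GET /index.html HTTP/1.1" 200 1024',
    '10.0.0.1 - - [01/Jan/2024:00:00:03 +0000] "GET /item?id=1%27%20OR%20%271%27=%271 HTTP/1.1" 500 0',
    '10.0.0.1 - - [01/Jan/2024:00:00:04 +0000] "POST /login?next=/admin HTTP/1.1" 302 0',
]

for sequence in query_sequences(sample_logs, window_size=2):
    print(sequence)
```

Each resulting sequence would then be tokenized and passed to the sequence model; how the project tokenizes query strings and labels sequences is not stated in the abstract and is left out here.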