Abstract
Cyber-attacks have increased rapidly over the past few years, posing significant threats to organizations and individuals. Regardless of size or industry, many organizations face a wide range of security challenges that can disrupt their operations and degrade overall performance. As attacks grow, many organizations also suffer data breaches, financial losses, and reputational damage. Penetration testing helps counter these attacks by identifying vulnerabilities and weaknesses within an organization’s network and applications.
An advanced approach to cyber-security assessment and protection is to perform penetration testing using machine learning (ML) techniques, which are now widely applied to many kinds of tasks. Instead of relying on manual identification, ML-driven scanners can automatically identify and prioritize vulnerabilities within an organization’s network and applications. ML algorithms can flag threats, exposures, and system configurations that attackers might exploit.
This project uses the National Vulnerability Database (NVD) dataset to visualize vulnerabilities across different networks. NVD, which is managed by the National Institute of Standards and Technology (NIST), is a comprehensive repository of vulnerabilities that organizations widely use for risk assessment. For each vulnerability, the dataset includes a detailed description, severity range, severity level, and website URL. It contains more than 100,000 vulnerabilities from the last 15 years. NVD provides a separate vulnerability file for each year; I merged the files from the past 15 years into a single file, which is used to train and test the machine learning models.
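The merging step described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the project's actual code: it assumes the yearly files were already exported to CSV, and the file naming pattern (`nvd_<year>.csv`), the `cve_id` column, and all function names are hypothetical.

```python
import csv
from pathlib import Path


def merge_yearly(records_by_year):
    """Combine per-year rows into one list, tagging each row with its year.

    A row whose identifier was already seen in an earlier year is skipped,
    so a vulnerability listed in two yearly files appears only once.
    """
    merged = []
    seen_ids = set()
    for year in sorted(records_by_year):
        for row in records_by_year[year]:
            cve_id = row.get("cve_id")  # hypothetical column name
            if cve_id is not None:
                if cve_id in seen_ids:
                    continue
                seen_ids.add(cve_id)
            merged.append({**row, "year": year})
    return merged


def load_yearly_csvs(folder):
    """Read files named like nvd_2010.csv (assumed naming) into {year: rows}."""
    records = {}
    for path in sorted(Path(folder).glob("nvd_*.csv")):
        year = int(path.stem.split("_")[1])
        with path.open(newline="", encoding="utf-8") as handle:
            records[year] = list(csv.DictReader(handle))
    return records


def write_merged(rows, out_path):
    """Write the merged rows to a single CSV file."""
    fieldnames = sorted({key for row in rows for key in row})
    with open(out_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)
```

Under these assumptions, `write_merged(merge_yearly(load_yearly_csvs("data")), "nvd_merged.csv")` would produce the single combined file; the `data` folder and output name are placeholders.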
This project aims to create an ensemble machine-learning model to enhance penetration testing. It evaluates how accurately and effectively vulnerabilities can be identified by building a penetration testing framework based on machine learning techniques. By analyzing network traffic, ML models can detect weaknesses within the organization that manual testing can miss. I have created a web application using React.js to visualize vulnerabilities. It displays the severity range, severity level, year, website URL, and description of each vulnerability, and it also shows the overall ML workflow, including data transformation and results. I combined React.js with a data visualization library to create interactive data representations that enhance the overall user experience.
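To illustrate the ensemble idea in general terms, the sketch below combines three base predictors by hard majority voting. It is an assumption-laden toy, not the project's model: the three rule-based functions stand in for trained ML models, the record fields (`score`, `description`, `network_accessible`) are hypothetical, and the score thresholds are a simplified three-level banding.

```python
from collections import Counter


def by_score(record):
    """Stand-in model 1: threshold a numeric severity score (simplified bands)."""
    score = record["score"]
    if score >= 7.0:
        return "HIGH"
    if score >= 4.0:
        return "MEDIUM"
    return "LOW"


def by_keywords(record):
    """Stand-in model 2: look for telltale phrases in the description."""
    text = record["description"].lower()
    if "remote code execution" in text or "arbitrary code" in text:
        return "HIGH"
    if "denial of service" in text:
        return "MEDIUM"
    return "LOW"


def by_access(record):
    """Stand-in model 3: treat network-reachable flaws as more severe."""
    return "HIGH" if record["network_accessible"] else "LOW"


def majority_vote(models, record):
    """Hard-voting ensemble: return the label predicted by most base models.

    Counter.most_common orders equal counts by first occurrence, so on a
    tie the label from the earliest model in the list wins.
    """
    votes = [model(record) for model in models]
    return Counter(votes).most_common(1)[0][0]


MODELS = [by_score, by_keywords, by_access]
```

In a real pipeline the stand-in functions would be replaced by trained classifiers; the voting step itself stays the same.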