Abstract
Social media platforms such as Twitter, Facebook, Instagram, and YouTube play an important role in giving people a place to share their opinions, thoughts, and points of view. These same platforms can also become stages for cyberbullying and digital harassment. Such offensive activity can cause mental distress and other adverse psychological effects.
According to McAfee’s Cyberbullying Report 2022, around 28% of children worldwide have faced cyberbullying, with the highest rates occurring in the US and India. Social media harassment involves hostile and aggressive behavior intended to repeatedly harm or disturb someone through various online channels. Among the different types of cyberbullying, this project focuses on offensive messages posted on social media platforms.
In this project, we analyzed offensive language data to understand its behavior, and we present an approach to offensive language detection on social media. Using an offensive language dataset, we explored the various types of offenses that occur on social media platforms. We then implemented and evaluated a recurrent neural network with LSTM and the transformer-based NLP models BERT and HateBERT, and analyzed the findings across all models. Our results show a substantial improvement in recall, precision, and F1 score, demonstrating the model’s effectiveness in identifying offensive language.
The purpose of our project was to find offensive content efficiently using models that classify at a high level and provide insight into the complex task of detecting offensive content. The outcome helped us understand more accurately the different kinds of offensive data available on social media platforms. The core vision of the project is to foster healthier communication on social media platforms.