CyberBullying Detection from Social Media Posts

With the widespread use and popularity of social media, social networking platforms have given us more valuable opportunities than ever before, and their benefits are undeniable. On the other hand, they also allow anonymous users, strangers, or peers to humiliate, insult, bully, and harass unsuspecting victims.


Cyberbullying is defined as an intentional, aggressive act carried out by an individual or group, using electronic platforms, repeatedly and extensively, against a victim who cannot easily defend themselves. Cyberbullying poses a significant threat to the physical and mental health of the victims. The main courses of action to combat it are detection and subsequent preventive measures.

Recent research has shown that most teenagers experience cyberbullying during online gaming or on social networking sites. The National Crime Prevention Council reports that approximately 50% of the youth in America are victims of cyberbullying. The consequences become serious, including suicide attempts, when victims fail to cope with the emotional strain of aggressive, abusive, threatening, or humiliating messages. The impact is exacerbated by the fact that children are reluctant to share their difficulties with adults, for fear of losing their mobile phone and Internet access privileges.

The challenges in fighting cyberbullying include: detecting cyber abuse in real-time, reporting it to law enforcement agencies and Internet service providers, and identifying predators and their victims.

Percentage of US Teens Who Are Victims of Cyberbullying


According to a national advocacy group in the US, bullying can take several forms, two of which target race and sexuality. Based on a report by Pew Research Center, two distinct categories of online harassment have been described among internet users:

  1. The first category includes less severe experiences such as swearing and humiliation, which those who see or experience them often say they ignore.

  2. Though targeting fewer online users, the second category includes more severe experiences such as sexual harassment, physical threats, stalking, and long-term harassment.


To investigate the rising global occurrence of online abuse, the Guardian commissioned research into the 70 million comments left on its site since 2006 and discovered that eight of the 10 most abused authors were women, and the two men were black. Two of the women and one of the men were gay. Of the eight women, one was Muslim and one Jewish.

The results of this study paint a pretty accurate picture of today’s cyberspace and its preferred victims. As a digital enterprise, this is the question you must answer: do you shut the comment section down or say “Don’t read the comments”? Yet comments are valuable for engaging readers and understanding their opinions and feedback. It turns out that comment moderation might be the answer. Several internet spaces are notorious for sexism, racism, misogyny, and bigotry. Reddit is often regarded as such a space, and so is 4chan. Yet one of the safest, most protected, and most supportive spaces is within Reddit: a subreddit called CreepyPMs, which has more than 30 moderators. As a result, there is rarely an awful sexist comment blaming a woman for receiving creepy messages (“well, what were you wearing on your profile page?”, etc.). This is what comment sections can and should be.

Moderating comment sections does take massive time and effort, though. Keeping up with comments in real time is demanding, and drawing the boundary between a good post and a bad one is difficult; dealing with only the worst comments saves a lot of frustration and time.

This is where machine learning comes in: semi-automated comment moderation, bot detection, and troll detection are some example applications. As natural language models like OpenAI's GPT-2 come into play, bots will improve rapidly and human comment moderation will reach its limits. Researchers at Indiana University have provided a tool for checking Twitter accounts called BotOrNot. Online trolls are also interesting: research from Stanford on Reddit has shown that just 1% of communities initiate 74% of conflicts. There has also been work on identifying multilingual abusive content, which is helpful if your organization is in a country where most people speak more than one language.

Sentiment analysis can not only support moderation but also help to understand the dynamics of online discussions. A subtask of content moderation is the identification of toxic comments. Various deep learning approaches, datasets, and architectures are tailored to sentiment analysis in online discussions. One way to make these approaches more intelligible and trustworthy is fine-grained rather than binary comment classification. But more classes require more training data. If your organization has enough data, this solution is well worth considering. If not, there are ways to compensate for limited training data, such as transfer learning.

Finally, there are some questions to ponder, on which sufficient data is not available:

  1. In organizations dealing with huge volumes of data, like Amazon and Flipkart, what percentage of abusive comments actually reaches the victim?

  2. How can ML algorithms identify the victim in comments where they are not explicitly mentioned (e.g., "the orange-faced dude", referring to Donald Trump)?

  3. How does the content of online abuse change across age, gender, religion, ethnicity, etc.?

CyberBullying in Social Media


Detection methods based on machine learning can identify cyberbullying terms and classify cyberbullying activities in social networks, such as flaming, harassment, racism, and terrorism, using fuzzy logic, genetic algorithms, and natural language processing, among other techniques.

They aim to address the following practical challenges:

  1. Detection timeliness, which is necessary to support victims as early as possible.

  2. Scalability to the staggering rates at which content is generated in online social networks.

  3. Quantifying or classifying the extent to which different cyberbullying incidents could impact victims.

Researchers have proposed various frameworks to detect cyberbullying behavior and quantify its severity in online social networks. One such work studying cyberbullying severity used 31 real-world transcripts as source data, obtained from a well-known American organization, Perverted-Justice, which investigates, recognizes, and reports the conduct of adults who solicit online sexual conversations with adults posing as youngsters. A combination of term frequency, time-series modeling, and Support Vector Machines showed the best results in abuse detection. A numeric class label from the set {0, 200, 600, 900} was assigned to questions asked by predators: 0 indicated posts with no cyberbullying activity, 200 questions asking for personal information, 600 posts containing words with sexual meaning, and 900 posts showing the predator’s attempts to approach the victim physically.
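The labelling scheme and the term-frequency features described above can be sketched in a few lines. This is only an illustration, not the cited authors' code: the class names in `Severity` and the whitespace tokenizer are assumptions made here for readability; only the numeric values come from the text.

```python
from collections import Counter
from enum import IntEnum


class Severity(IntEnum):
    """Numeric labels from the text; the member names are hypothetical."""
    NONE = 0                 # no cyberbullying activity
    PERSONAL_INFO = 200      # questions about personal information
    SEXUAL_TERMS = 600       # words with sexual meaning
    PHYSICAL_APPROACH = 900  # attempts to approach the victim physically


def term_frequency(text):
    """Relative frequency of each whitespace-separated token in a post."""
    tokens = text.lower().split()
    counts = Counter(tokens)
    total = len(tokens)
    return {term: count / total for term, count in counts.items()} if total else {}


features = term_frequency("where do you live where")
label = Severity(200)
print(features["where"], label.name)
```

A real pipeline would feed such feature vectors, together with their labels, to a classifier such as a Support Vector Machine.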

In contrast to this work, a second paper proposes categorizing severity into three levels for topics already considered sensitive and severe: sexuality, racism, physical appearance, intelligence, and politics. The authors investigate how a multi-class machine learning algorithm for detecting cyberbullying performs using this method. The researchers assigned tweets covering the forms of cyberbullying mentioned above to four classes: low, medium, and high severity, plus non-cyberbullied tweets. They generated features from Twitter content using pointwise mutual information (PMI) and developed a supervised machine learning solution for cyberbullying detection and multi-class categorization of its severity. The extracted features were used with Support Vector Machine, Naïve Bayes, Decision Tree, KNN, and Random Forest algorithms. Experimental results with this framework are encouraging in both the multi-class and binary settings with respect to Kappa, classifier accuracy, and F-measure. Finally, they compared the proposed features against baseline features across the machine learning algorithms, and the comparison indicated the significance of the proposed features for cyberbullying detection.
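To make the PMI idea concrete, here is a minimal sketch that scores how strongly each term is associated with a target class, using PMI(term, class) = log2(P(term, class) / (P(term) P(class))) estimated from document counts. The function name, the toy tweets, and the whitespace tokenizer are assumptions for illustration; the paper's exact feature construction is not described in this article.

```python
import math
from collections import Counter


def pmi_scores(docs, labels, target):
    """PMI between each term and the target class, from document frequencies."""
    n_docs = len(docs)
    n_target = sum(1 for y in labels if y == target)
    doc_freq = Counter()         # documents containing the term
    doc_freq_target = Counter()  # target-class documents containing the term
    for text, y in zip(docs, labels):
        terms = set(text.lower().split())
        doc_freq.update(terms)
        if y == target:
            doc_freq_target.update(terms)
    scores = {}
    for term, joint in doc_freq_target.items():
        p_joint = joint / n_docs
        p_term = doc_freq[term] / n_docs
        p_class = n_target / n_docs
        scores[term] = math.log2(p_joint / (p_term * p_class))
    return scores


# Toy data, invented for this example.
docs = ["you are ugly", "ugly loser", "nice day", "you are nice"]
labels = ["bully", "bully", "ok", "ok"]
scores = pmi_scores(docs, labels, "bully")
print(scores["ugly"], scores["you"])
```

Terms that appear only in bullying tweets get a positive score, while terms spread evenly across classes score close to zero, which is what makes PMI useful for picking discriminative features.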

There has also been research using deep learning models to build detection systems. For example, the SIMAH challenge (SocIaL Media And Harassment) addresses the difficulties of detecting harassment in Twitter posts and identifying the harassment category. Automatically detecting content containing harassment could be the basis for removing it. It is therefore essential to distinguish different types of harassment and to provide the means to control such a mechanism in a fine-grained way. One work classified a set of Twitter posts into non-harassment or harassment tweets, where the latter are further classified as indirect harassment, sexual harassment, or physical harassment. It explored how to use self-attention models for harassment classification to combine the outputs of different baselines. The transformer architecture encoded each baseline output, exploiting relationships between baselines and posts. The transformer then learned how to combine the results of these methods with a BERT representation of the post, reaching a macro-averaged F-score of 0.481 on the SIMAH test set.
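Since the result above is reported as a macro-averaged F-score, a short sketch of that metric may help: F1 is computed per class and then averaged with equal weight, so rare harassment categories count as much as the dominant non-harassment class. The label strings below are invented for the example.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    f1_scores = []
    for label in labels:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == label and p != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        if precision + recall:
            f1_scores.append(2 * precision * recall / (precision + recall))
        else:
            f1_scores.append(0.0)
    return sum(f1_scores) / len(f1_scores)


y_true = ["none", "none", "sexual", "physical"]
y_pred = ["none", "sexual", "sexual", "none"]
print(round(macro_f1(y_true, y_pred), 3))
```

Here the classifier never predicts "physical", so that class contributes an F1 of zero and pulls the macro average down, even though half of the individual predictions are correct.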

CONcISE is a novel work focusing on timely and accurate cyberbullying detection in Instagram media sessions. It proposes a sequential hypothesis testing formulation that drastically reduces the number of features used in classifying each comment while maintaining high classification accuracy. CONcISE raises an alert only after several detections. Extensive experiments on a real-world Instagram dataset with about 4M users and 10M comments demonstrate this approach’s effectiveness, scalability, timeliness, and benefits over existing methods.
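The general idea behind sequential hypothesis testing can be illustrated with Wald's sequential probability ratio test (SPRT). The sketch below is not the CONcISE algorithm; it is a generic SPRT over a stream of binary signals, and the probabilities `p0`, `p1` and the error rates `alpha`, `beta` are assumed values chosen only for the example.

```python
import math


def sprt(observations, p0=0.1, p1=0.6, alpha=0.05, beta=0.05):
    """Wald's SPRT over binary signals.

    p0: assumed probability of a positive signal under "no bullying".
    p1: assumed probability of a positive signal under "bullying".
    Returns (decision, number of observations consumed).
    """
    upper = math.log((1 - beta) / alpha)  # crossing this accepts "bullying"
    lower = math.log(beta / (1 - alpha))  # crossing this accepts "no bullying"
    llr = 0.0                             # running log-likelihood ratio
    for count, flagged in enumerate(observations, start=1):
        if flagged:
            llr += math.log(p1 / p0)
        else:
            llr += math.log((1 - p1) / (1 - p0))
        if llr >= upper:
            return "alert", count
        if llr <= lower:
            return "clear", count
    return "undecided", len(observations)


print(sprt([True, True, False, True]))
print(sprt([False, False, False, False, False]))
```

The test stops as soon as the evidence is strong enough in either direction, which is why sequential formulations can reach a decision after examining only a fraction of the available evidence.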

Sample Framework for ML Based Abuse Detector


1. Data accessibility

A machine learning model requires a vast amount of training data to produce accurate results, so creating a large and diverse dataset is crucial irrespective of the application. Data availability therefore becomes the most important criterion when selecting an Online Social Network (OSN) to study. Two primary features need to be considered: popularity (number of active users) and data accessibility. Access to the relevant data needed to develop models that characterize cyberbullying is a significant challenge in cyberbullying research. Currently, Facebook is the largest online social network, with over one billion active users. Although data extracted from Facebook is common in OSN research, the high percentage of restricted content (generally due to users’ privacy settings) strictly limits analysis using Facebook data.

In contrast, Twitter is considered the most studied OSN. Twitter’s well-defined public interface, the simplicity of its protocol, and the public nature of most of its material make it simple to obtain data from the network. Other web services incorporating social networking features are YouTube, Instagram, and Kaggle.

2. Dealing with class-imbalanced data

Class imbalance occurs when the number of instances from one class is significantly greater than that of another class. Most machine learning algorithms perform best when the number of cases in each class is approximately equal. Nevertheless, in many real-life applications and non-synthetic datasets, the data is imbalanced; an important class (the minority class) may have significantly fewer samples than the other class (the majority class). In such cases, standard classifiers are overwhelmed by the large class and ignore the sparse minority instances. This usually produces a biased classifier with higher predictive accuracy on the majority class but poorer predictive accuracy on the minority class. A solution is to modify the class distributions in the training data by oversampling the minority class or undersampling the majority class. SMOTE (Synthetic Minority Over-sampling Technique) is specifically designed for learning from imbalanced datasets and is one of the most widely adopted approaches due to its simplicity and effectiveness. It oversamples the minority class by creating synthetic examples interpolated between existing minority samples and their nearest neighbors, and it is often combined with undersampling of the majority class.
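The interpolation step at the heart of SMOTE can be sketched as follows. This is a simplified illustration under stated assumptions, not a reference implementation: the function name `smote_like`, the default `k`, and the toy points are invented here, and production code would normally rely on an established library.

```python
import random


def smote_like(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points between minority samples and their neighbors."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        index = rng.randrange(len(minority))
        base = minority[index]
        # k nearest minority neighbors of the base point (squared Euclidean distance).
        others = [p for j, p in enumerate(minority) if j != index]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbor = rng.choice(others[:k])
        # Pick a random point on the segment between base and neighbor.
        gap = rng.random()
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(base, neighbor)))
    return synthetic


minority = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
new_points = smote_like(minority, n_new=5)
print(len(new_points))
```

Because every synthetic point lies on a line segment between two real minority samples, the new data stays inside the region the minority class already occupies instead of simply duplicating existing rows.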

3. Selection of Machine Learning Models

Choosing the best classifier is the most vital phase of the text classification pipeline. We cannot determine the best model for a text classification task without a sound conceptual understanding of each algorithm. To select the best classifier, we need to test several machine learning algorithms such as Random Forest, Support Vector Machine, Naïve Bayes, Decision Tree, and K-Nearest Neighbors (KNN). Recently, Transformer-based deep learning models such as BERT have shown promising results and address the challenge of data volumes more effectively.
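Testing several candidates comes down to fitting each one on the same training data and scoring it on the same held-out data. The sketch below shows that loop with two deliberately tiny stand-ins: a majority-class baseline and a small multinomial Naïve Bayes with add-one smoothing. The class names, the `compare` helper, and the toy posts are assumptions for illustration only.

```python
import math
from collections import Counter, defaultdict


class MajorityBaseline:
    """Always predicts the most frequent training label."""
    def fit(self, texts, labels):
        self.label = Counter(labels).most_common(1)[0][0]
        return self

    def predict(self, texts):
        return [self.label for _ in texts]


class NaiveBayes:
    """Multinomial Naive Bayes over whitespace tokens, add-one smoothing."""
    def fit(self, texts, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for text, label in zip(texts, labels):
            self.word_counts[label].update(text.lower().split())
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, texts):
        return [self._predict_one(text) for text in texts]

    def _predict_one(self, text):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n_docs / total_docs)
            for word in text.lower().split():
                if word in self.vocab:  # ignore words never seen in training
                    score += math.log((counts[word] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


def compare(models, train, test):
    """Fit each model on train and return its accuracy on test."""
    results = {}
    for name, model in models.items():
        model.fit(*train)
        predictions = model.predict(test[0])
        correct = sum(p == y for p, y in zip(predictions, test[1]))
        results[name] = correct / len(test[1])
    return results


train = (["you are stupid", "nobody likes you loser", "great game today",
          "thanks for the help", "see you at lunch"],
         ["bully", "bully", "ok", "ok", "ok"])
test = (["stupid loser", "great help"], ["bully", "ok"])
print(compare({"majority": MajorityBaseline(), "naive_bayes": NaiveBayes()}, train, test))
```

Keeping a trivial baseline in the comparison is a useful sanity check: on imbalanced data a model that barely beats "always predict the majority class" has learned very little.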

4. Eliminating Human Bias

Bias can creep into algorithms undetected in several ways. AI systems learn to make decisions from training data, which can encode human preferences or reflect social or historical inequities even if we remove sensitive variables such as sexual orientation, race, and gender. For example, Amazon stopped using a hiring algorithm when it favored applicants based on words like “captured” or “executed”, which are commonly found on men’s resumes. Another source of bias is flawed data sampling, in which groups are under- or overrepresented in the training data. For instance, researchers at MIT found that facial analysis technologies had higher error rates for minorities, and particularly minority women, potentially due to unrepresentative training data.
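One simple way to surface this kind of bias is to break a model's error rate down by group instead of reporting a single overall number. The sketch below assumes each evaluation example carries a group tag; the function name and the toy data are hypothetical.

```python
from collections import defaultdict


def error_rate_by_group(y_true, y_pred, groups):
    """Fraction of misclassified examples within each group."""
    totals = defaultdict(int)
    errors = defaultdict(int)
    for truth, prediction, group in zip(y_true, y_pred, groups):
        totals[group] += 1
        errors[group] += int(truth != prediction)
    return {group: errors[group] / totals[group] for group in totals}


# Toy evaluation data, invented for this example.
y_true = ["ok", "ok", "abuse", "ok", "abuse", "ok"]
y_pred = ["ok", "ok", "abuse", "abuse", "ok", "ok"]
groups = ["A", "A", "A", "B", "B", "B"]
print(error_rate_by_group(y_true, y_pred, groups))
```

A large gap between groups, as between A and B in this toy data, is a signal to revisit how the training data was sampled and labelled.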

5. Data volume

Every minute, roughly 3.3 million new posts appear on Facebook and almost half a million on Twitter, which sees approximately 500 million tweets per day. Detection systems have to process this data onslaught, analyze various types of data, and provide actionable insights in real time.

6. Tackling the not-so-straightforward online abuse

Hate speech and abusive content come in various forms, and the challenge is to detect them as abuse when the posted content is not direct. For instance, the offensive text could be part of an image, in which case we must use computer vision models. The text could be in mixed languages: in India, people commonly speak and post in a mixture of Hindi and English, and abuse can get quite creative and thus difficult to detect. We might encounter obfuscated text (k1ll, gen0cide). Detecting deepfake pornography is another difficult challenge.
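For the obfuscated-text case, a common first step is to normalize look-alike characters before matching against known abusive terms. The sketch below is a minimal illustration: the substitution table `LEET_MAP`, the function names, and the one-word blocklist are assumptions, and a real system would need a far richer mapping.

```python
# Hypothetical substitution table for common character swaps.
LEET_MAP = str.maketrans({
    "0": "o", "1": "i", "3": "e", "4": "a",
    "5": "s", "7": "t", "@": "a", "$": "s",
})


def normalize(text):
    """Lowercase the text and undo simple character substitutions."""
    return text.lower().translate(LEET_MAP)


def flag_terms(text, blocklist):
    """Return the normalized tokens that appear in the blocklist."""
    return [token for token in normalize(text).split() if token in blocklist]


print(normalize("k1ll"), normalize("gen0cide"))
print(flag_terms("I will K1LL you", {"kill"}))
```

Note that this mapping also rewrites digits in legitimate numbers, so it is better suited to generating an extra matching view of the text than to replacing the original.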


The internet and social media have clear advantages, but their frequent use may also have significant adverse consequences. These include cybercrime, unwanted sexual exposure, and cyberbullying.

Online harassment has become a severe issue that affects people to a large extent. The anti-harassment standards and policies supplied by social platforms and the ability to flag, block, or report the bully are practical steps towards a safer online community, but they are not enough. Popular social media platforms such as Facebook, Twitter, Instagram, and others receive an enormous amount of flagged content every day. Manually scrutinizing this volume of content and users is very time-consuming and impractical.

Consequently, it is imperative to design data-driven automated methods to detect harmful behaviors in social media. Successful detection would enable early identification of damaging and threatening scenarios and help prevent such incidents. Future studies could enhance automated cyberbullying detection by combining textual data with images and video to build a machine learning model that detects online abuse and its severity. This would form the foundation of automated systems for analyzing contemporary online social behaviors that negatively affect mental health. Detection algorithms can analyze the bully’s posts and then align them to a preselected severity level, thus giving early awareness of the extent of the cyberbullying.