XAI for hate speech monitoring on social media
In this age of uninterrupted social media access, the connectivity available to an ordinary individual has reached unprecedented levels. Ideas now spread manyfold faster, and the world has become a global village of public views and opinions. Social media’s low cost and high-speed connectivity have made it a favourable avenue for alternative, or alt, views that may be underrepresented in mainstream media. Hate speech on social media has been an active research topic for many years under the broader umbrella of Internet Sciences. Most researchers define ‘hate speech’ as derogatory remarks directed at an individual or at a race, religion, gender, or sexual orientation. However, what does and does not count as derogatory remains a much-debated topic.
Several techniques have been proposed in the artificial intelligence community to identify hate speech and disinformation. Several benchmark datasets contain data gathered from popular social media platforms such as Twitter and Facebook, as well as from other platforms such as Gab and Whisper. These data are often biased by the source of labelling, and models trained on these datasets fail to identify multiple instances of hate speech. Conversely, many excerpts are falsely identified as hate speech. The lack of consistency in the definitions, labels and classification of hate speech, along with the lack of explainability behind classification decisions, causes mistrust between social media networks and their users.
The explainability of an AI model refers to the degree to which the model is understandable to its stakeholders. Explainable AI (XAI) aims to improve the user experience by increasing trust in the decision-making capabilities of a system. Adding explainability to purpose-built AI and ML solutions has been a sought-after goal for several years. Around the world, policymakers have pushed for more explainable and transparent solutions to enable effective policymaking using AI in critical systems. In the same way, an explainable model for hate speech detection can give policymakers the insights they need to legislate on hate speech control on social media. Explainable solutions also improve the public understanding of these frameworks and algorithms.
There is a need to address the lack of explainability in monitoring hate speech on social media. Establishing a certain level of trust between the system and its stakeholders is a genuine requirement, and explainability also enables policymakers to develop better policies through an increased understanding of the system. The following research objectives define the scope of the project: