Natural Language Processing for Multilingual Hate Speech Detection in Twitter


Description Social media is an integral part of our everyday lives and connects billions of people worldwide. At the same time, unmoderated user-generated content creates new problems such as hate speech. Hate speech is any discriminatory communication directed at a group or individual on the basis of race, color, ethnicity, gender, sexual orientation, nationality, or religion. In recent years, deep learning methods, and natural language processing (NLP) in particular, have made numerous breakthroughs in the automatic processing and classification of text.
The aim of this study is to detect hate speech against immigrants and women with deep learning methods, using a new multilingual (English and Spanish) Twitter dataset.
Task In this thesis, the student(s) design and implement a suitable NLP method that automatically analyses user-generated, multilingual text data and learns the language markers that point to hate speech.
Utilises Multilingual Word Embeddings, Attention Neural Networks, Recurrent Neural Networks.
Requirements Preliminary knowledge of Machine Learning and Natural Language Processing, good programming skills (e.g. Python, C++).
Languages English or German.
Supervisor Lukas Stappen, M. Sc. (lukas.stappen@informatik.uni-augsburg.de)
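
To give a feel for two of the building blocks listed under "Utilises", the sketch below applies attention pooling to word vectors drawn from a shared English/Spanish embedding space. It is a toy illustration only, not the method expected in the thesis: the embedding table, its values, the query vector and the function names are all made up here, and the recurrent encoder that would normally sit between the embeddings and the attention layer is left out for brevity.

```python
import math

# Toy shared embedding space (made-up values): translation pairs such as
# "friend"/"amigo" are given nearby vectors, as multilingual embeddings aim to do.
EMBEDDINGS = {
    "friend": [1.0, 0.0],
    "amigo": [0.9, 0.1],
    "the": [0.0, 0.1],
    "el": [0.0, 0.1],
}
UNKNOWN = [0.0, 0.0]  # fallback vector for out-of-vocabulary tokens


def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(tokens, query):
    """Return (attention weights, attention-weighted average of the token vectors)."""
    vectors = [EMBEDDINGS.get(t.lower(), UNKNOWN) for t in tokens]
    # Dot-product score between the query and every token vector.
    scores = [sum(q * v for q, v in zip(query, vec)) for vec in vectors]
    weights = softmax(scores)
    pooled = [
        sum(w * vec[i] for w, vec in zip(weights, vectors))
        for i in range(len(query))
    ]
    return weights, pooled


# Hypothetical query vector; in a trained model this would be learned.
QUERY = [2.0, 0.0]

weights_en, pooled_en = attention_pool(["the", "friend"], QUERY)
weights_es, pooled_es = attention_pool(["el", "amigo"], QUERY)
print(weights_en, weights_es)
```

Because both languages share one vector space, the same query puts most of the weight on the content word in the English and in the Spanish input. In a real model, the pooled vector would be passed to a classifier, and the attention weights could be inspected as a rough indication of which words the model relies on.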