Unsupervised Topic and Aspect Detection in Spoken Narratives


Extracting the relevant topics and entities of a conversation is an important part of sentiment analysis, which aims to categorize the opinions expressed towards a particular topic. When designing data sets intended to make sentiment analysis learnable in a supervised fashion, prior extraction of the relevant topics is essential, since only relevant, frequently used entities and topics are worth annotating. Besides the established k-means algorithms on word embeddings, a new form of attention clustering has recently emerged. It has shown good qualitative results and should be evaluated further on linguistic, spoken narrative data.
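
As an illustration of the k-means baseline on word embeddings mentioned above, the following is a minimal sketch rather than the method used in this work. The two-dimensional vectors in `toy_embeddings` are made-up stand-ins for real pretrained word embeddings, and the helper names (`farthest_first_init`, `kmeans`, `group_words`) are hypothetical. The deterministic farthest-first initialisation is chosen only to keep the example reproducible.

```python
import numpy as np

# Hypothetical toy "embeddings": in practice these would be pretrained
# word vectors with a few hundred dimensions.
toy_embeddings = {
    "food": [1.0, 0.1],
    "pizza": [0.9, 0.0],
    "pasta": [0.8, 0.2],
    "price": [0.1, 1.0],
    "cheap": [0.0, 0.9],
    "expensive": [0.2, 0.8],
}


def farthest_first_init(X, k):
    """Pick the first point, then repeatedly the point farthest from all chosen centroids."""
    centroids = [X[0]]
    for _ in range(1, k):
        # Squared distance of every point to its nearest chosen centroid.
        d = np.min([((X - c) ** 2).sum(axis=1) for c in centroids], axis=0)
        centroids.append(X[d.argmax()])
    return np.array(centroids)


def kmeans(X, k, n_iter=50):
    """Plain Lloyd's algorithm; returns (centroids, labels)."""
    centroids = farthest_first_init(X, k)
    labels = np.zeros(len(X), dtype=int)
    for _ in range(n_iter):
        # Assignment step: nearest centroid by squared Euclidean distance.
        dists = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        # Update step: move each centroid to the mean of its members.
        new_centroids = centroids.copy()
        for j in range(k):
            members = X[labels == j]
            if len(members) > 0:
                new_centroids[j] = members.mean(axis=0)
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return centroids, labels


def group_words(embeddings, k):
    """Cluster the words and return a mapping from cluster id to its words."""
    words = list(embeddings)
    X = np.array([embeddings[w] for w in words], dtype=float)
    _, labels = kmeans(X, k)
    groups = {}
    for word, label in zip(words, labels):
        groups.setdefault(int(label), []).append(word)
    return groups


topics = group_words(toy_embeddings, k=2)
print(topics)
```

On this toy input the two clusters separate the food-related words from the price-related words. Each cluster can then be read as a candidate topic or aspect to annotate.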

Task In this work, the student(s) will implement unsupervised topic and aspect detection utilising attention clustering and compare the results to the previously common k-means on word embeddings on two different sentiment databases (SEWA, EmCaR).
Utilises Keras, TensorFlow/Theano, attention neural networks
Requirements Advanced knowledge in machine learning and natural language processing, good programming skills (e.g. Python, C++)
Languages German or English
Supervisor Lukas Stappen, M. Sc.