Document Level Emotion Detection from Bangla Text Using Machine Learning Techniques

Abstract

Automatically understanding emotion in documents is an active research topic in machine learning. Many applications, such as email clients and blogs, can now suggest joyful or angry expressions based on written text. Although Bangla is a widely spoken language, it lacks a rich corpus with annotated emotion labels, so emotion recognition from Bangla documents is less developed than for other languages. In this work, we present a new dataset of Bangla documents annotated with three emotions: Happy, Sad, and Angry. Two major feature extraction techniques, Bag of Words (BoW) and Word Embedding, are used to extract features from the documents. BoW features are used by the Logistic Regression and Multinomial Naive Bayes classifiers. Word Embedding features are used by the Artificial Neural Network (ANN) and Convolutional Neural Network (CNN) classifiers. Among all four, the Multinomial Naive Bayes classifier gives the best performance, with an accuracy of 68.27% on the test set. We have made our dataset (https://doi.org/10.6084/m9.figshare.13052789.v1) publicly available for use in further research.
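
As a rough illustration of the BoW plus Multinomial Naive Bayes combination mentioned in the abstract, the following is a minimal pure-Python sketch. It is not the authors' implementation: the whitespace tokenizer, the add-one smoothing value, the function names (bag_of_words, train_mnb, predict), and the six English toy documents are all assumptions made for illustration, and the toy data is invented rather than taken from the published dataset.

```python
import math
from collections import Counter, defaultdict


def bag_of_words(text):
    """Whitespace tokenizer (an assumption); returns token counts as the BoW vector."""
    return Counter(text.split())


def train_mnb(docs, labels, alpha=1.0):
    """Fit a Multinomial Naive Bayes model with add-alpha (Laplace) smoothing."""
    token_counts = defaultdict(Counter)  # per-class token frequencies
    vocab = set()
    for text, label in zip(docs, labels):
        bow = bag_of_words(text)
        token_counts[label].update(bow)
        vocab.update(bow)
    class_counts = Counter(labels)
    return {
        "alpha": alpha,
        "vocab": vocab,
        "token_counts": token_counts,
        # log P(class), estimated from the share of training documents
        "log_prior": {c: math.log(n / len(docs)) for c, n in class_counts.items()},
        # total number of tokens observed in each class
        "totals": {c: sum(token_counts[c].values()) for c in class_counts},
    }


def predict(model, text):
    """Return the class with the highest log posterior for the given text."""
    alpha, vocab = model["alpha"], model["vocab"]
    scores = {}
    for label, log_prior in model["log_prior"].items():
        denom = model["totals"][label] + alpha * len(vocab)
        score = log_prior
        for token, freq in bag_of_words(text).items():
            if token in vocab:  # out-of-vocabulary tokens are ignored
                count = model["token_counts"][label][token]
                score += freq * math.log((count + alpha) / denom)
        scores[label] = score
    return max(scores, key=scores.get)


# Invented toy corpus (English placeholders, NOT the published Bangla dataset).
toy_docs = [
    "great joyful day", "joyful happy news",
    "lost and lonely", "lonely tearful night",
    "furious at delay", "furious rude service",
]
toy_labels = ["Happy", "Happy", "Sad", "Sad", "Angry", "Angry"]

model = train_mnb(toy_docs, toy_labels)
print(predict(model, "joyful day"))
```

On real Bangla text, the tokenizer and any preprocessing would need to be chosen to suit the language; the accuracy figure quoted in the abstract refers to the authors' own experiments and not to this sketch.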

Publication
2021 International Conference on Information and Communication Technology for Sustainable Development (ICICT4SD)
Sadia Afrin Purba
Software Engineer I

My research interests include machine learning, especially computer vision and NLP.