IIIT-H researchers teach machines to interpret emotions using films
HYDERABAD: Researchers at the International Institute of Information Technology, Hyderabad (IIIT-H) are teaching machines to recognise and interpret human emotions and mental states. This is part of the evolving field of affective computing.
With a new machine learning (ML) model that analyses emotions in complex movie scenes, researchers from the IIIT-H Centre for Visual Information Technology (CVIT) have taken artificial intelligence (AI) a step closer to that goal.
In a study, ‘How you feelin’? Learning emotions and mental states in movie scenes,’ primary author Dhruv Srivastava, along with co-authors Aditya Kumar Singh and Prof. Makarand Tapaswi, introduces EmoTx, an ML model that relies on a Transformer-based architecture to understand and label emotions not only for each movie character in a scene but also for the scene as a whole.
Since cinema offers a vast amount of emotional data mirroring the complexities of everyday life, the research group chose movies as their starting point. Unlike static images, though, movies are extremely complex for machines to interpret.
To train their model, the team of researchers used MovieGraphs, an existing dataset of movie clips collected by Prof. Tapaswi in his previous work, which provides detailed graph-based annotations of the social situations depicted in movie scenes.
In effect, EmoTx was trained to accurately label the emotions and mental states of characters in each scene through a three-pronged process: first, by analysing the full video and the actions involved; second, by interpreting the facial features of individual characters; and third, by extracting the subtitles that accompany the dialogue in each scene.
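To illustrate the general idea, the sketch below shows how a Transformer can fuse those three input streams, video clips, per-character face tracks and subtitle text, into a single token sequence, then emit multi-label emotion predictions for the scene and for each character. This is a minimal PyTorch illustration of the technique described above, not the authors’ published EmoTx code: the class name EmotionSceneModel, the feature dimension, the layer counts and the 26-label emotion space are all illustrative assumptions.

```
import torch
import torch.nn as nn

class EmotionSceneModel(nn.Module):
    """Illustrative multimodal Transformer (hypothetical, not the real EmoTx).

    Fuses scene video features, per-character face features and subtitle
    features into one token sequence, then predicts multi-label emotions
    for the overall scene and for each character.
    """

    def __init__(self, feat_dim=512, num_emotions=26, num_layers=2, num_heads=8):
        super().__init__()
        # Learned embeddings marking which modality each token came from:
        # 0 = video, 1 = face, 2 = subtitle
        self.modality_emb = nn.Embedding(3, feat_dim)
        # Learned classification tokens: one for the scene, one per character slot
        self.scene_token = nn.Parameter(torch.randn(1, 1, feat_dim))
        self.char_token = nn.Parameter(torch.randn(1, 1, feat_dim))
        encoder_layer = nn.TransformerEncoderLayer(
            d_model=feat_dim, nhead=num_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(encoder_layer, num_layers=num_layers)
        # Multi-label heads: one logit per emotion (sigmoid is applied in the loss)
        self.scene_head = nn.Linear(feat_dim, num_emotions)
        self.char_head = nn.Linear(feat_dim, num_emotions)

    def forward(self, video_feats, face_feats, sub_feats):
        # video_feats: (B, Tv, D)  clip features from the full video
        # face_feats:  (B, C, Tf, D)  face-track features per character
        # sub_feats:   (B, Ts, D)  encoded subtitle/dialogue features
        B, C = face_feats.shape[:2]
        # Tag each stream with its modality embedding, then flatten faces
        video = video_feats + self.modality_emb.weight[0]
        faces = face_feats.flatten(1, 2) + self.modality_emb.weight[1]
        subs = sub_feats + self.modality_emb.weight[2]
        scene_tok = self.scene_token.expand(B, -1, -1)   # (B, 1, D)
        char_toks = self.char_token.expand(B, C, -1)     # (B, C, D)
        # One joint sequence lets attention mix evidence across modalities
        tokens = torch.cat([scene_tok, char_toks, video, faces, subs], dim=1)
        out = self.encoder(tokens)
        scene_logits = self.scene_head(out[:, 0])        # (B, num_emotions)
        char_logits = self.char_head(out[:, 1:1 + C])    # (B, C, num_emotions)
        return scene_logits, char_logits

# Example forward pass with random stand-in features (2 scenes, 4 characters)
model = EmotionSceneModel()
video = torch.randn(2, 30, 512)
faces = torch.randn(2, 4, 10, 512)
subs = torch.randn(2, 12, 512)
scene_logits, char_logits = model(video, faces, subs)
# Multi-label training would use a sigmoid-based loss such as BCEWithLogitsLoss
loss = nn.BCEWithLogitsLoss()(scene_logits, torch.zeros_like(scene_logits))
```

Because a character or scene can express several emotions at once, a sigmoid-per-label loss is the natural fit here rather than a single softmax over emotion classes.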
The study has now been accepted for presentation at the 2023 Conference on Computer Vision and Pattern Recognition (CVPR) in Vancouver from June 18 to 23.