CS Forum: "Movies and meaning: from low-level features to mind reading" Sergio Benini, University of Brescia

When dealing with movies, closing the tremendous discontinuity between low-level features and the richness of semantics in the viewers’ cognitive processes requires a variety of approaches and perspectives at the crossroads of video content analysis, neuroscience, and psychology.

When attempting to relate movie content to users’ affective responses, previous work in content analysis suggests that a direct mapping of audio-visual properties onto elicited emotions is difficult, due to the high variability of individual reactions. To reduce the gap between the objective level of features and the subjective sphere of emotions, we exploit an intermediate representation, the connotative properties of movies: the set of shooting and editing conventions that help in transmitting meaning to the audience [1] [2].

One of these stylistic features, the shot scale, i.e. the distance of the camera from the subject, effectively regulates theory of mind: increasing spatial proximity to the character triggers a higher occurrence of mental state references in viewers’ story descriptions [3] [4] [7]. When considered together with shot duration, i.e. the length of camera takes, shot scale does not follow random patterns in movies by the same director, so it may also be employed for automatic attribution of movie authorship [5].

Movies are also becoming an important stimulus in neural decoding, an ambitious line of research within contemporary neuroscience aiming at “mind-reading”. We address the challenge of producing generalizable decoding models, which allow the reconstruction of perceived audiovisual features from human functional magnetic resonance imaging (fMRI) data without prior training of the algorithm on the decoded content [6] [7]. In this field we also aim to combine fMRI data and deep features in a hybrid model able to predict specific video object classes [8].

Sergio Benini received his MSc degree in Electronic Engineering (2000, cum laude) and his PhD in Information Engineering (2006) from the University of Brescia, Italy. Between 2001 and 2003 he was with Siemens Mobile Communications R&D. During his PhD he spent almost one year at British Telecom Research in the UK. Since 2005 he has been Assistant Professor at the University of Brescia. In 2012 he co-founded Yonder, a spin-off company specializing in NLP, Machine Learning, and Cognitive Computing.