
ORIGINAL RESEARCH article

Front. Psychol.
Sec. Perception Science
Volume 15 - 2024 | doi: 10.3389/fpsyg.2024.1509392
This article is part of the Research Topic "Visual Perception and Mental Imagery in Aging, Health and Disease"

Assessment of Human Emotional Reactions to Visual Stimuli "Deep-Dreamed" by Artificial Neural Networks

Provisionally accepted
  • University of Notre Dame, Notre Dame, United States

The final, formatted version of the article will be published soon.

    Introduction: While it is documented that visual stimuli synthesized by Artificial Neural Networks (ANNs) may evoke emotional reactions, the precise mechanisms connecting the strength and type of such reactions with the way ANNs are used to synthesize the stimuli are yet to be discovered. Understanding these mechanisms allows for designing methods that synthesize images attenuating or enhancing selected emotional states, which may provide unobtrusive and widely applicable treatment of mental dysfunctions and disorders.

    Methods: A Convolutional Neural Network (CNN), a type of ANN used in computer vision tasks that models the way humans solve visual tasks, was applied to synthesize ("dream" or "hallucinate") images with no semantic content by maximizing activations of neurons in precisely selected layers of the CNN. The emotions evoked in 150 human subjects observing these images were self-reported on a two-dimensional scale (arousal and valence) using self-assessment manikin (SAM) figures. Correlations were calculated between arousal and valence values and both the images' global visual properties (e.g., color, brightness, clutter feature congestion, clutter sub-band entropy) and the position of the CNN layers stimulated to obtain a given image.

    Results: Synthesized images that maximized activations of some of the CNN layers led to significantly higher or lower arousal and valence levels compared to the average subject's reactions. Multiple linear regression analysis found that a small set of selected global visual features (hue, feature congestion, and sub-band entropy) are significant predictors of the measured arousal; however, no statistically significant dependencies were found between global visual features and the measured valence.
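The "deep-dream" synthesis described above can be illustrated with a minimal sketch. The toy below stands in for one CNN layer with a single random linear projection and maximizes one neuron's response by gradient ascent on the input "image"; the layer, weights, and step size are illustrative assumptions, not the study's actual network or procedure (which uses a real CNN and backpropagation).

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for one CNN layer: a random linear projection.
# (Illustrative assumption -- the study uses a real CNN.)
W = rng.normal(size=(16, 64))         # 16 "neurons", 64-pixel flattened image

def score(x, unit):
    """Pre-activation of one neuron for input image x."""
    return float(W[unit] @ x)

def dream(x, unit, steps=200, lr=0.05):
    """Gradient ascent on the input to maximize one neuron's response."""
    for _ in range(steps):
        x = x + lr * W[unit]          # gradient of (w . x) w.r.t. x is w
        x = np.clip(x, 0.0, 1.0)      # keep "pixel" values in a valid range
    return x

x0 = rng.uniform(0.0, 1.0, size=64)   # start from a random image
x1 = dream(x0, unit=3)                # "dreamed" image for neuron 3
```

After the ascent, the chosen neuron's response to `x1` exceeds its response to the random starting image `x0`; in a real CNN the same idea, applied to whole layers, yields the semantically empty "dreamed" textures shown to subjects.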
    Discussion: This study demonstrates that the specific method of synthesizing images by maximizing activations of small and precisely selected parts of the CNN used in this work may lead to visual stimuli that enhance or attenuate emotional reactions. This method paves the way for developing tools that provide non-invasive stimulation to support well-being (manage stress, enhance mood) and to assist patients with certain mental conditions by complementing traditional therapeutic interventions.
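The multiple linear regression reported in the Results can be sketched with ordinary least squares. The data below are entirely synthetic (the feature values and coefficients are invented for illustration, not the study's measurements); the sketch only shows the shape of the analysis, predicting arousal from hue, feature congestion, and sub-band entropy.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 150  # matches the number of subjects, purely illustrative

# Synthetic global image features (hypothetical values, not the study's data).
hue        = rng.uniform(0.0, 1.0, n)
congestion = rng.uniform(0.0, 1.0, n)   # clutter: feature congestion
entropy    = rng.uniform(0.0, 1.0, n)   # clutter: sub-band entropy

# Synthetic arousal ratings from an assumed linear relation plus noise.
arousal = 2.0 + 1.5 * hue - 0.8 * congestion + 0.5 * entropy \
          + rng.normal(0.0, 0.1, n)

# Design matrix with an intercept column; ordinary least-squares fit.
X = np.column_stack([np.ones(n), hue, congestion, entropy])
beta, *_ = np.linalg.lstsq(X, arousal, rcond=None)
print(beta)  # roughly recovers [2.0, 1.5, -0.8, 0.5]
```

In the actual study, the significance of each fitted coefficient would additionally be tested (e.g., via t-statistics) to support the claim that these three features predict arousal while none predicts valence.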

    Keywords: Emotional reactions, visual stimuli, deep learning, artificial neural networks, Visual Stimuli Synthesis

    Received: 10 Oct 2024; Accepted: 26 Nov 2024.

    Copyright: © 2024 Marczak-Czajka, Redgrave, Mitcheff, Villano and Czajka. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

    * Correspondence:
    Agnieszka Marczak-Czajka, University of Notre Dame, Notre Dame, United States
    Adam Czajka, University of Notre Dame, Notre Dame, United States

    Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.