The brain’s fascinating ability to adapt its internal neural dynamics to the temporal structure of the sensory environment is becoming increasingly clear. It is thought to be metabolically beneficial to align ongoing oscillatory activity to the relevant inputs in a predictable stream, so that they will enter at optimal processing phases of the spontaneously occurring rhythmic excitability fluctuations. However, some contexts have a more predictable temporal structure than others. Here, we tested the hypothesis that the processing of rhythmic sounds is more efficient than the processing of irregularly timed sounds. To do this, we simultaneously acquired functional magnetic resonance imaging (fMRI) and electroencephalography (EEG) data while participants detected oddball target sounds in alternating blocks of rhythmic (i.e., with equal inter-stimulus intervals) or random (i.e., with randomly varied inter-stimulus intervals) tone sequences. Behaviorally, participants detected target sounds faster and more accurately when these were embedded in rhythmic streams. The fMRI response in the auditory cortex was stronger during random compared to rhythmic tone sequence processing. Simultaneously recorded N1 responses showed larger peak amplitudes and longer latencies for tones in the random (vs. the rhythmic) streams. These results provide complementary evidence for more efficient neural and perceptual processing during temporally predictable sensory contexts.
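The contrast between rhythmic and random tone streams can be illustrated with a minimal sketch. The function name `tone_onsets`, the 0.5 s mean interval, and the uniform jitter range are all hypothetical choices made for illustration; the abstract does not report the actual timing parameters or how the random intervals were drawn. Intervals are treated as onset-to-onset for simplicity.

```python
import random

def tone_onsets(n_tones, mean_isi, jitter=0.0, seed=None):
    """Return onset times (in seconds) for a stream of n_tones tones.

    jitter = 0 yields a rhythmic stream with equal intervals.
    jitter > 0 draws each interval uniformly from mean_isi +/- jitter
    (an assumed distribution, not taken from the study).
    """
    rng = random.Random(seed)
    onsets, t = [], 0.0
    for _ in range(n_tones):
        onsets.append(t)
        t += mean_isi + rng.uniform(-jitter, jitter)
    return onsets

def intervals(onsets):
    """Onset-to-onset intervals between successive tones."""
    return [b - a for a, b in zip(onsets, onsets[1:])]

# Hypothetical parameters: 10 tones, 0.5 s mean interval.
rhythmic = tone_onsets(10, mean_isi=0.5)
irregular = tone_onsets(10, mean_isi=0.5, jitter=0.2, seed=1)

print(intervals(rhythmic))
print(intervals(irregular))
```

Both streams share the same mean interval in expectation, so only the predictability of the next tone onset differs between them.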
Correlated sensory inputs coursing along the individual sensory processing hierarchies arrive at multisensory convergence zones in cortex where inputs are processed in an integrative manner. The exact hierarchical level of multisensory convergence zones and the timing of their inputs are still under debate, although increasingly, evidence points to multisensory integration (MSI) at very early sensory processing levels. While MSI is said to be governed by stimulus properties including space, time, and magnitude, violations of these rules have been documented. The objective of the current study was to determine, both psychophysically and electrophysiologically, whether differential visual-somatosensory (VS) integration patterns exist for stimuli presented to the same versus opposite hemifields. Using high-density electrical mapping and complementary psychophysical data, we examined multisensory integrative processing for combinations of visual and somatosensory inputs presented to both left and right spatial locations. We assessed how early during sensory processing VS interactions were seen in the event-related potential and whether spatial alignment of the visual and somatosensory elements resulted in differential integration effects. Reaction times to all VS pairings were significantly faster than those to the unisensory conditions, regardless of spatial alignment, pointing to engagement of integrative multisensory processing in all conditions. In support, electrophysiological results revealed significant differences between multisensory simultaneous VS and summed V + S responses, regardless of the spatial alignment of the constituent inputs. Nonetheless, multisensory effects were earlier in the aligned conditions, and were found to be particularly robust in the case of right-sided inputs (beginning at just 55 ms). 
In contrast to previous work on audio-visual and audio-somatosensory inputs, the current work suggests a degree of spatial specificity to the earliest detectable multisensory integrative effects in response to VS pairings.
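The comparison between the simultaneous VS response and the summed V + S responses can be sketched as follows. This is only an illustration on made-up waveforms: the sample values, the 10 ms sampling step, and the fixed amplitude threshold are hypothetical, and the threshold stands in for the statistical testing across participants that an actual analysis would use.

```python
def additive_model_difference(vs, v, s):
    """Pointwise multisensory interaction term: VS - (V + S)."""
    if not (len(vs) == len(v) == len(s)):
        raise ValueError("waveforms must have the same number of samples")
    return [m - (a + b) for m, a, b in zip(vs, v, s)]

def first_latency_exceeding(diff, times_ms, threshold):
    """Earliest time at which |diff| exceeds threshold, or None."""
    for t, d in zip(times_ms, diff):
        if abs(d) > threshold:
            return t
    return None

# Made-up averaged waveforms (arbitrary units), one sample every 10 ms.
times_ms = [0, 10, 20, 30, 40, 50, 60, 70]
v = [0.0, 0.1, 0.2, 0.4, 0.6, 0.8, 1.0, 0.9]   # visual alone
s = [0.0, 0.2, 0.5, 0.7, 0.6, 0.4, 0.3, 0.2]   # somatosensory alone
vs = [0.0, 0.3, 0.7, 1.1, 1.2, 1.2, 1.9, 1.8]  # simultaneous pairing

diff = additive_model_difference(vs, v, s)
onset = first_latency_exceeding(diff, times_ms, threshold=0.25)
print(onset)
```

If the two inputs were processed independently, the difference would stay near zero at every sample; a sustained departure from zero is what is read as an integration effect, and the time at which it first appears gives the onset latency.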
In well-controlled laboratory experiments, researchers have found that humans can perceive delays between auditory and visual signals as short as 20 ms. Conversely, other experiments have shown that humans can tolerate audiovisual asynchrony that exceeds 200 ms. This seeming contradiction in human temporal sensitivity can be attributed to a number of factors such as experimental approaches and precedence of the asynchronous signals, along with the nature, duration, location, complexity and repetitiveness of the audiovisual stimuli, and even individual differences. In order to better understand how temporal integration of audiovisual events occurs in the real world, we need to close the gap between the experimental setting and the complex setting of everyday life. With this work, we aimed to contribute one brick to the bridge that will close this gap. We compared perceived synchrony for long-running and eventful audiovisual sequences to shorter sequences that contain a single audiovisual event, for three types of content: action, music, and speech. The resulting windows of temporal integration showed that participants were better at detecting asynchrony for the longer stimuli, possibly because the long-running sequences contain multiple corresponding events that offer audiovisual timing cues. Moreover, the points of subjective simultaneity differed between content types, suggesting that the nature of a visual scene could influence the temporal perception of events. An expected outcome from this type of experiment was the rich variation among participants' distributions and the derived points of subjective simultaneity. Hence, the design of similar experiments calls for more participants than traditional psychophysical studies. Heeding this caution, we conclude that existing theories on multisensory perception are ready to be tested on more natural and representative stimuli.
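To make the two summary measures concrete, the sketch below derives a point of subjective simultaneity and a rough width of the temporal integration window from one participant's synchrony judgments. The abstract does not state how these quantities were estimated; the moment-based summary used here (weighted mean and standard deviation of the response distribution) is a simple stand-in for a fitted curve, and the asynchrony levels, response proportions, and sign convention (positive values meaning the audio lags the video) are hypothetical.

```python
import math

def pss_and_width(asynchronies_ms, prop_synchronous):
    """Summarise a distribution of 'synchronous' responses.

    Returns (pss, width): the response-weighted mean asynchrony and the
    response-weighted standard deviation. This is a moment-based
    approximation, not a fitted psychometric function.
    """
    total = sum(prop_synchronous)
    if total <= 0:
        raise ValueError("no 'synchronous' responses to summarise")
    pss = sum(a * p for a, p in zip(asynchronies_ms, prop_synchronous)) / total
    var = sum(p * (a - pss) ** 2
              for a, p in zip(asynchronies_ms, prop_synchronous)) / total
    return pss, math.sqrt(var)

# Hypothetical data: proportion of 'synchronous' responses per asynchrony.
asynchronies_ms = [-300, -200, -100, 0, 100, 200, 300, 400]
prop_synchronous = [0.05, 0.20, 0.60, 0.90, 0.90, 0.60, 0.20, 0.05]

pss, width = pss_and_width(asynchronies_ms, prop_synchronous)
print(pss, width)
```

A narrower width corresponds to better asynchrony detection, and a shift of the point of subjective simultaneity away from zero indicates that one modality must lead for the pair to be perceived as simultaneous.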