- 1Center for Mind and Brain, University of California, Davis, Davis, CA, United States
- 2Neuroscience Graduate Group, University of California, Davis, Davis, CA, United States
- 3Department of Music, Max Planck Institute for Empirical Aesthetics, Frankfurt am Main, Germany
- 4Max Planck – NYU Center for Language, Music, and Emotion (CLaME), Frankfurt am Main, Germany
- 5Center for Neuroscience, University of California, Davis, Davis, CA, United States
- 6Department of Psychology, University of California, Davis, Davis, CA, United States
Synchronization of movement enhances cooperation and trust between people. However, the degree to which individuals can synchronize with each other depends on their ability to perceive the timing of others' actions and produce movements accordingly. Here, we introduce an assistive device—a multi-person adaptive metronome—to facilitate synchronization abilities. The adaptive metronome is implemented on Arduino Uno circuit boards, allowing for negligible temporal latency between tapper input and adaptive sonic output. Across five experiments (two single-tapper and three group experiments with four tappers), we analyzed the effects of metronome adaptivity (percent correction based on the immediately preceding tap-metronome asynchrony) and auditory feedback on tapping performance and subjective ratings. In all experiments, tapper synchronization with the metronome was significantly enhanced with 25–50% adaptivity, compared to no adaptation. In group experiments with auditory feedback, synchrony remained enhanced even at 70–100% adaptivity; without feedback, synchrony at these high adaptivity levels returned to near baseline. Subjective ratings of being in the groove, in synchrony with the metronome, in synchrony with others, liking the task, and difficulty all reduced to one latent factor, which we termed enjoyment. This same factor structure replicated across all experiments. In predicting enjoyment, we found an interaction between auditory feedback and metronome adaptivity: enjoyment increased at optimal levels of adaptivity only with auditory feedback, and decreased sharply at higher levels of adaptivity, especially without feedback. Exploratory analyses relating person-level variables to tapping performance showed that musical sophistication and trait sadness contributed to the degree to which an individual differed in tapping stability from the group. Nonetheless, individuals and groups benefitted from adaptivity, regardless of their musical sophistication. Further, individuals who tapped less variably than the group (which only occurred ∼25% of the time) were more likely to feel "in the groove." Overall, this work replicates previous single-person adaptive metronome studies and extends them to group contexts, thereby contributing to our understanding of the temporal, auditory, psychological, and personal factors underlying interpersonal synchrony and subjective enjoyment during sensorimotor interaction. Further, it provides an open-source tool for studying such factors in a controlled way.
Highlights
– To aid people in synchronizing with each other, we built an assistive device that adapts in real-time to groups of people tapping together.
– By varying the adaptivity of a metronome, we show that we can enhance group synchrony and subjective feelings of enjoyment.
– Both individuals and groups benefit from an optimally adaptive metronome, regardless of their previous musical experience.
– Auditory feedback about one’s own and others’ taps influences both motor synchrony and subjective experience, and interacts with metronome adaptivity.
– The multi-person adaptive metronome makes it possible to study, in a controlled way, the factors that influence interpersonal synchronization and social bonding.
Introduction
Sensorimotor synchronization (SMS)—the temporal alignment of motor behavior with a rhythmic sensory stimulus—has been observed in a variety of species and sensory modalities (Greenfield, 2005; Ravignani et al., 2014). Among humans, SMS has been shown to enhance prosocial behavior, social bonding, social cognition, perception, and mood—both within groups and toward outsiders [see Mogan et al. (2017) for review]. While sensorimotor synchronization can occur in a variety of contexts, we focus here specifically on auditory-motor coupling, which has been studied extensively in the music cognition literature as a critical mechanism of musical engagement and prosocial behavior.
When humans interact in motoric synchrony, they are more likely to subsequently exhibit cooperative behavior, successful joint actions, trust of others, and altruism (Wiltermuth and Heath, 2009; Valdesolo et al., 2010). While such benefits occur during pure motor synchrony—for example, walking in step together—music provides a powerful temporal cue for organizing movement and is a culturally relevant activity. The use of music, or even just a metronome, can enhance cooperative behavior—typically measured via economic games, such as the Public Goods Game (Wiltermuth and Heath, 2009; Kniffen et al., 2017) or Prisoner's Dilemma (Anshel and Kipper, 1988)—and feelings of synchronization or connection with others, typically measured via self-report surveys (Hove and Risen, 2009; Fairhurst et al., 2013, 2014; Zhang et al., 2016; Kirschner and Tomasello, 2010). Interestingly, these prosocial effects are not specific to adults (Kirschner and Tomasello, 2010). Infants as young as 14 months show increased helping behavior toward adults who have bounced together with them synchronously to music (Cirelli et al., 2014a). Such helping behavior even generalizes to positive affiliates of those adults, i.e., other adults whom the infants had seen interacting with the bouncing adult, though not toward neutral strangers (Cirelli et al., 2014b).
Importantly, however, for music to benefit motoric synchrony and interpersonal coordination, those engaging with it must be able to extract its temporal regularity. Generally, the beat and its (sub)harmonics are the relevant periodicities to which those engaging with music synchronize their movements. Because the perceived beat is typically isochronous (Merker et al., 2009) or quasi-isochronous (Merchant et al., 2015), many studies of sensorimotor synchronization involve asking individual participants to tap to an isochronous auditory pulse or beat (i.e., tap with a metronome), as such a task contains the most basic elements of what occurs during engagement with more complex music. However, the degree to which individuals can perceive and align their action to an auditory pulse varies among individuals and may impede their ability to synchronize well with others.
The most basic version of a sensorimotor synchronization task involves having a single participant synchronize their finger taps with a metronome. These single-person isochronous tapping experiments have revealed much about the motor and cognitive processes underlying sensorimotor synchronization. In terms of motor constraints, the lower inter-onset-interval (IOI) limit for a single finger tapping in 1:1 synchrony with an auditory pulse is around 150–200 ms, though highly trained musicians and/or bimanual tapping may result in slightly lower IOIs of around 100 ms [see Repp (2005) for review]. On the other end of the spectrum, a perceptual constraint prevents participants from synchronizing their taps with IOIs longer than ∼1.8 s. Such durations become too hard to predict accurately; participants' taps become reactive to the metronome tones, rather than anticipatory (Repp, 2005; cf. Repp and Doggett, 2007). Though these studies of IOI constraints have typically been conducted with adult participants, a growing number of developmental studies (van Noorden and De Bruyn, 2009; Provasi et al., 2014; Thompson et al., 2015) have found that synchronizing with slower tempi becomes less difficult with age. With these limitations in mind, in the following experiments, we kept adult participants within a comfortable range of synchronization tempi, using a starting tempo of 120 beats per minute (an IOI of 500 ms).
Complementing tapping tasks using invariant (strict) metronomes set at different tempi, studies seeking to understand the dynamics underlying more realistic joint musical interactions (Repp and Keller, 2008; Fairhurst et al., 2013) have used metronomes that adapt their timing based on the asynchronies between the metronome's tone and the participant's tap. The idea behind an adaptive metronome is that it mimics, in a controlled way, the adaptive behavior another human might adopt during joint tapping. For example, Fairhurst et al. (2013) used a personal computer-based adaptive metronome, implemented in MAX/MSP, that adapted each subsequent metronome tone by some percentage (0, 25, 50, 75, or 100%) of the participant's tap asynchrony relative to the current tone, while participants lay in a magnetic resonance imaging (MRI) scanner. They found that, across sets of taps (20 taps/trial), adaptive metronome settings of 25% and 50% brought participants into greater synchrony with the metronome, compared to a non-adapting metronome. The opposite was also true; a metronome adapting by 75% or 100% significantly worsened synchronization performance. Functional MRI analyses revealed distinct brain networks that were differentially activated when participants were in vs. out of synchrony. Greater synchrony resulted in increased motor and "Default Mode Network" activity, possibly related to the social and effortless aspects associated with being "in the groove" (Janata et al., 2012), whereas poor synchrony resulted in increased activity in cognitive control areas of the brain, likely reflecting the increased effort required to align taps with the metronome. Given this ability to enhance or perturb sensorimotor synchronization and subjective experience using an adaptive metronome—coupled with the known group benefits of synchronous motor action outlined previously—we sought, in the current study, to extend the use of an adaptive metronome to a multi-person context.
To our knowledge, the current study is the first to utilize an adaptive metronome in a group-tapping context; however, recent work has explored synchronization among groups of individuals [see Repp and Su (2013), Part 3, for review]. Such studies range from examinations of synchronization in musical ensembles (Rasch, 1979), or duetting pianists (Goebl and Palmer, 2009; Zamm et al., 2015; Demos et al., 2019), to those of how synchrony affects social affiliation between two tappers (adults: Hove and Risen, 2009; Kokal et al., 2011; children + adult: Rabinowitch and Cross, 2018). Additionally, others have explored important factors, such as auditory and visual feedback, that influence tapping synchrony. With respect to auditory feedback, Konvalinka et al. (2010) showed that coupling between dyadic tappers changes as a function of auditory feedback, and that participants are actually worst at keeping the tempo when they can hear each other. More recent evidence suggests such a result might be mediated by musicianship (Schultz and Palmer, 2019): musicians perform well with self and other feedback, but non-musicians are worse when receiving feedback from others' taps. In a study of triads with one leader, two followers, and varied conditions of cross-participant feedback, Ogata et al. (2019) showed that the effect of auditory feedback depends on the assigned leader-follower roles. Using visual feedback, Tognoli et al. (2007) observed that seeing the other tapper's finger induced spontaneous synchronization during self-paced finger tapping (without a metronome). Additionally, Timmers et al. (2020) recently showed that visual information reduced synchronization accuracy during dyadic co-performance (one participant live, one participant recorded).
Considering these studies, especially in conjunction with the results of single-person tapping tasks, it is clear that audio and/or visual feedback has an effect on tapping performance. Nonetheless, it remains unclear whether such effects manifest among multiple individuals when the metronome is adaptive. Thus, in the following experiments, we manipulated all versions of auditory feedback (hearing only the metronome, hearing only the metronome and oneself but not others in the group, and hearing the metronome, oneself, and everyone else in the group). So as to avoid visual cues influencing synchronization, we asked participants to look only at their own tapping finger.
The main questions posed in our experiments were: (1) can group synchrony (both objective and subjective) be enhanced using an adaptive metronome? And (2) does auditory feedback influence group synchrony and subjective experience? To address these questions, we introduce a novel hardware/software system for perturbing a metronome and collecting multi-person tapping data in a highly customizable, temporally precise way. This research thus extends the previous adaptive metronome paradigm of Repp and Keller (2008) and Fairhurst et al. (2013) to group contexts.
The Current Study
We created an adaptive device to assist individuals or groups tasked with synchronizing to a pulse. While a variety of hardware and software has been developed for tapping experiments, none fill the exact niche of our device. For example, the E-music Box (Novembre et al., 2015) is a group music-making system giving participants the ability to control the output timing of a musical sequence based on their cyclical rotary control of an electromagnetic music box. Our device, on the other hand, adapts to the measured asynchrony of an individual participant’s tap, or to the mean asynchrony of a group of participants tapping together, relative to a defined metronome period. While other systems have been developed to collect and analyze tapping data, such as FTAP (Finney, 2001), MatTAP (Elliott et al., 2009), Tap-It (Kim et al., 2012), Tap Arduino (Schultz and van Vugt, 2016), and TeensyTap (van Vugt, 2020), none of these are currently capable of adaptive, group-tapping experiments.
In building the multi-person adaptive metronome, we sought to keep latencies in the system to a minimum. We were inspired by Tap Arduino (Schultz and van Vugt, 2016), which uses a force-sensitive resistor (FSR) as a tap pad connected to an Arduino microcontroller and PC to collect tapping data. In the validation of Tap Arduino, Schultz and van Vugt (2016) showed that the average latency of the Arduino-based tap pad is less than 3 ms, significantly lower than latencies produced by a standard percussion pad (∼9 ms), FTAP (∼15 ms), and MAX/MSP (∼16 ms) (Cycling’74, 2014). Hence, we decided to build our device using multiple Arduino Uno microcontrollers and FSRs. Please see additional details below in Apparatus.
In order to better understand the sensorimotor synchronization of groups in the presence of an adaptive metronome, we conducted five tapping experiments. The goal of the first experiment was to replicate Fairhurst et al. (2013). Therefore, the experiment involved single tappers. The metronome played aloud through a speaker and adapted to participant performance using positive phase correction, as in experiment 1 of Repp and Keller (2008) and Fairhurst et al. (2013), with the following equation:
$$t_{\mathrm{met}_{n+1}} = t_{\mathrm{met}_n} + \mathrm{IOI} + \alpha \cdot \mathrm{async}_n \qquad (1)$$

where the time of the upcoming metronome tone ($t_{\mathrm{met}_{n+1}}$) is equal to the time of the current metronome tone, plus the metronome inter-onset-interval (IOI), plus an adaptation based on the human participant's asynchrony. The asynchrony ($\mathrm{async}_n$) is defined as the participant's tap time minus the metronome tone time ($t_{\mathrm{tap}_n} - t_{\mathrm{met}_n}$). Alpha (α) represents the adaptivity parameter and is a fractional multiplier of the asynchrony. Note that if the participant fails to tap, $\mathrm{async}_n = 0$; the next metronome tone will, therefore, occur at the default IOI. Alpha can be set to any number. When set to α = 0, the metronome does not adapt, and is thus considered a control (reference) condition. In Fairhurst et al. (2013), alpha values of 0.25 and 0.5 were found to be optimally adaptive, resulting in a smaller average standard deviation of asynchronies than with a non-adaptive metronome, while values of 0.75 and 1 were overly adaptive, leading to larger SD asynchronies.
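For readers who prefer code, the update rule in Equation 1 can be sketched as a small simulation. The following Python sketch is illustrative only (it is not the C++ firmware that runs on the metronome Arduino), and the noisy-tapper model is our own assumption.

```python
import random

def next_tone_time(t_met, ioi, alpha, async_n):
    """Equation 1: schedule the next metronome tone.

    t_met   -- onset time of the current tone (ms)
    ioi     -- nominal inter-onset interval (ms)
    alpha   -- adaptivity (0 = non-adaptive, 1 = full correction)
    async_n -- tap time minus tone time (ms); 0 if no tap was registered
    """
    return t_met + ioi + alpha * async_n

def simulate_round(alpha, n_tones=25, ioi=500.0, tap_sd=20.0, seed=1):
    """Simulate one round of tapping with a hypothetical noisy tapper."""
    rng = random.Random(seed)
    t_met, asynchronies = 0.0, []
    for _ in range(n_tones):
        async_n = rng.gauss(0.0, tap_sd)   # assumed tapper noise model
        asynchronies.append(async_n)
        t_met = next_tone_time(t_met, ioi, alpha, async_n)
    return asynchronies

# alpha = 0 behaves like a strict metronome; alpha = 0.25 nudges each
# upcoming tone a quarter of the way toward the preceding tap.
print(simulate_round(alpha=0.25)[:5])
```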
Because we were also interested in the effects of auditory feedback on tapping stability, in Experiment 2 we performed an additional version of the single-person experiment outlined above in which participants now heard the sound triggered by their own tap through headphones (the metronome still played aloud through a speaker). We expected the findings from tapping studies with auditory feedback and a standard metronome (e.g., Aschersleben and Prinz, 1995; Aschersleben, 2002) to replicate, such that participants would become more accurate when they received auditory feedback from their taps. We further hypothesized that this greater objective synchrony (tapping accuracy) would correspond to increased ratings of feeling in the groove, liking of the task, etc.
Experiments 3–5 involved groups of four tappers. In these cases, the metronome algorithm was adjusted to adapt based on the average asynchrony of the group (of size I; in our case, 4 tappers):
$$t_{\mathrm{met}_{n+1}} = t_{\mathrm{met}_n} + \mathrm{IOI} + \alpha \cdot \frac{1}{I}\sum_{i=1}^{I} \mathrm{async}_{i,n} \qquad (2)$$

with all terms as previously defined. In Experiment 3, participants heard the sound of the metronome through its speaker but not the sound of their own tap. In Experiment 4, participants heard the metronome through its speaker and only the sound of their own tap through headphones. In Experiment 5, participants heard the metronome through its speaker and the sounds of their own and everyone else's taps through speakers attached to each individual Arduino.
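Before turning to the hypotheses, the group version of the update can be sketched in the same style as above (illustrative Python, not the Arduino firmware); we assume the single-tapper convention that a missed tap contributes an asynchrony of 0.

```python
def next_tone_time_group(t_met, ioi, alpha, asynchronies):
    """Equation 2: adapt by a fraction of the group's mean asynchrony.

    asynchronies -- one value per tapper (ms); 0.0 assumed for a missed tap.
    """
    mean_async = sum(asynchronies) / len(asynchronies)
    return t_met + ioi + alpha * mean_async

# Four tappers: one early, one late, one nearly on time, one missed tap.
print(next_tone_time_group(t_met=0.0, ioi=500.0, alpha=0.35,
                           asynchronies=[-30.0, 45.0, 5.0, 0.0]))
```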
Across all experiments, we hypothesized that, compared to no adaptivity or highly adaptive conditions, low levels of metronome adaptivity (i.e., 25–50%) would result in more stable tapping performance, as well as a more positively valenced subjective experience. We further hypothesized that participants being able to hear their taps more clearly in Experiments 2, 4, and 5 would result in greater temporal accuracy than in Experiments 1 and 3 (in which they could not hear a sound produced by their taps). With regard to Experiment 5, we hypothesized participants would feel more connected to and in synchrony with the group when they could hear each other's taps. A summary of experimental conditions and hypotheses is provided in Table 1.
Apparatus
The hardware and software comprising the multi-person adaptive metronome were custom-built using open-source tools. The hardware consists of five Arduino Uno devices, five Adafruit Wave Shields (v1.1) for playing sound, and four force sensitive resistors (Interlink Electronics, 2018) for tapping. The Wave Shields are connected to the Arduinos to enable them to produce sound (i.e., the metronome sound and the sound produced by tapping on the tap pads). For each participant, a single force sensitive resistor (FSR) is used as a tap pad and is connected to a single Arduino which registers the taps, communicates with the metronome Arduino, and plays sounds triggered by the participant's taps. The fifth Arduino is the metronome Arduino. It integrates inputs from the other Arduinos, implements the metronome timing function, generates metronome tones, and transmits event data to a connected computer. All four Arduinos responsible for registering taps are wired into the metronome Arduino (see Figures 1A,B). A speaker is connected to the Wave Shield of each Arduino. The Wave Shields have a headphone jack and volume wheel, allowing the experimenter to choose whether participants should hear sounds produced during the experiment via a speaker, individually through headphones, or not at all.
Figure 1. Overview of the GEM system. (A) Wiring diagram illustrating the connections between one tapping device (Tap) and the metronome (Met). Note that we repeat these connections three times (indicated by the dotted black box) to arrive at our 4 Tapper + metronome system. Note also that the connections from all Adafruit Wave Shields (v1.1) depicted here in black carry through to the Arduino Uno boards underneath them to which they are attached, indicated by the dotted gray lines connecting the blue Arduino Uno (bottom left) to the Wave Shield. (B) Schematic of all devices comprising the multi-person adaptive metronome and their means of communication. The metronome Arduino can be in either an “idle” or “run” state. In the idle state, the Experimental Control Computer (ECC; housed in a separate experiment control room, represented by dotted black box) can set experiment parameters on the metronome Arduino, which in turn sets parameters on the tapping Arduinos. Upon receiving the message from the ECC to start a trial, the metronome Arduino is in a constant loop of producing metronome tones according to our adaptive algorithm, registering participant taps, and sending data packets to the ECC. The tapping Arduinos are constantly polling, reading the value of the FSR input pin. When the FSR pin exceeds a specified threshold (a tap occurs), a digital output pin is pulsed on, resulting in the triggering of an interrupt on the metronome Arduino. Participants can hear the sound of their own tap through headphones, speakers, or not at all. The metronome is always heard through a speaker. (C) A photo of the system used in these experiments. (D) The view of our multi-person adaptive metronome, from the participants’ perspective. The speaker in the top center of the box played the tones from the metronome (all Experiments). During experiments in which participants could hear the sound produced by their own tap, they used headphones, which can be seen hanging from the side of the table. During Experiment 5, when participants could hear everyone in the group, we placed an additional speaker to the left of each tap pad.
Software associated with the multi-person adaptive metronome is freely available for download. All software was custom built in C++, Python 2.7, and Julia 0.6.0 (Bezanson et al., 2012). Separate programs were compiled for the metronome and tapping Arduinos, and downloaded to the respective Arduinos. Because the adaptive metronome code executed directly on the metronome Arduino, the adaptivity calculations were efficient, with minimal delay for registering taps and adjusting subsequent metronome tones. The metronome Arduino was connected to an experiment control computer (ECC)–a MacBook Pro (Apple, Inc., Cupertino, CA, United States)–via USB serial port running at 115,200 baud. The PySerial package was used for two-way communication between the metronome Arduino and the ECC. A custom-written Python program running on the ECC handled the randomization of experiment conditions, transmitted alpha values for each trial to the metronome Arduino, received data from the metronome Arduino that was streamed to custom binary files, and displayed relevant information to the experimenter.
We implemented a graphical user interface (GUI) to allow for easily customizable data collection procedures. With a simple Python script, experimenters can set parameters of the metronome (e.g., tempo, adaptivity percentages, number of repetitions for each adaptivity condition, number of tapping Arduinos in the experiment, data output paths, etc.). From the GUI, experimenters can input participant IDs, control the start and stop of runs or practice runs, and monitor the time remaining in the experiment. Please see Figure 1 for an overview of our hardware, software, and experimental set-up. Also note that all code to create the system is publicly available: https://github.com/janatalab/GEM.
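To give a sense of what such a presets script contains, here is a hypothetical parameter dictionary; the field names are ours for illustration, and the actual configuration format is defined in the GEM repository.

```python
# Hypothetical experiment presets (field names illustrative only; see the
# GEM repository for the actual configuration format).
EXPERIMENT_PARAMS = {
    "tempo_bpm": 120,                        # initial IOI of 500 ms
    "alphas": [0.0, 0.25, 0.5, 0.75, 1.0],   # adaptivity conditions
    "repeats_per_alpha": 10,                 # rounds per adaptivity level
    "tones_per_round": 25,                   # ~13 s per round at 500 ms IOI
    "n_tappers": 1,                          # tapping Arduinos in use
    "serial_port": "/dev/tty.usbmodem1421",  # hypothetical ECC serial port
    "baud_rate": 115200,
    "data_dir": "data/exp1/",
}
```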
For single-person experiments (Exps. 1 and 2), participants always tapped on the same individual tap pad (of the four possible tap pads). At their seat, participants also had an iPad running our experiment web interface, Ensemble (Tomic and Janata, 2007), to answer surveys. All experimental instructions, presentation code, data, and analysis code related to the experiments reported in this paper are available in a separate GitHub repository: https://github.com/janatalab/GEM-Experiments-POC.
Adaptivity Calculations
The calculation of tap asynchronies and subsequent metronome adjustments was defined in relation to the metronome period, or inter-onset-interval (IOI). The fundamental temporal unit for registering taps in relation to a metronome tone was defined as +/– half of the metronome IOI. For example, with a metronome IOI of 500 ms, taps registered in the window spanning 250 ms before to 250 ms after the tone are ascribed to the current tone. When a tap is registered on the tap pad of any of the tapping Arduinos, an interrupt is triggered on the metronome Arduino. Interrupts register the timing of an event with nanosecond precision, making them a precise way to timestamp taps.
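The windowing rule can be sketched as follows (an illustrative Python helper, not the Arduino interrupt handler; the boundary handling at exactly half an IOI is our arbitrary choice).

```python
def ascribe_tap(tap_time, tone_time, ioi=500.0):
    """Ascribe a tap to the current tone if it falls within +/- IOI/2.

    Returns the signed asynchrony (tap minus tone, in ms), or None if the
    tap falls outside the current tone's window.
    """
    asynchrony = tap_time - tone_time
    return asynchrony if -ioi / 2 <= asynchrony < ioi / 2 else None

print(ascribe_tap(1030.0, 1000.0))  # 30.0 ms late -> ascribed to this tone
print(ascribe_tap(1300.0, 1000.0))  # outside the window -> None
```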
Once the temporal window of registering taps for the current tone has elapsed (e.g., 250 ms after the metronome tone for an IOI of 500 ms), the times of the taps registered for that tone are used to calculate the timing of the next metronome tone. Upon calculating the time of the next metronome tone, and setting a corresponding timer, the data are sent to the ECC with less than 1 ms delay. The data packet sent to the ECC after every metronome event window is 12 bytes: 4 bytes for the time of the metronome tone onset, and 2 bytes for each tapper’s tap time, relative to the metronome (8 bytes total). The ECC streams these data to a custom-formatted binary data file.
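Because the byte-level encoding is not spelled out here, the sketch below shows only one plausible way the ECC could unpack such a 12-byte packet (a 4-byte tone onset followed by four 2-byte tap offsets). The little-endian byte order, unsigned 32-bit onset, signed 16-bit offsets, and file name are all our assumptions; the authoritative format is defined by the GEM firmware and Julia parser.

```python
import struct

# Assumed layout (little-endian): uint32 tone onset, then 4 x int16 tap
# offsets relative to the tone. This is a guess for illustration; the GEM
# repository defines the format the firmware actually emits.
PACKET_FORMAT = "<I4h"
PACKET_SIZE = struct.calcsize(PACKET_FORMAT)  # 12 bytes

def parse_packet(packet):
    tone_onset, *tap_offsets = struct.unpack(PACKET_FORMAT, packet)
    return {"tone_onset": tone_onset, "tap_offsets": tap_offsets}

with open("example_run.gem", "rb") as f:   # hypothetical file name
    while len(chunk := f.read(PACKET_SIZE)) == PACKET_SIZE:
        print(parse_packet(chunk))
```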
Personality and Individual Factors
In each experiment, all participants completed the Goldsmiths Musical Sophistication Index (Gold-MSI; Müllensiefen et al., 2014), the Internality, Powerful Others, and Chance Scales (IPC scales; Levenson, 1981), the Brief form of the Affective Neuroscience Personality Scales (BANPS; Barrett et al., 2013), and a short form assessing basic demographic information. The Gold-MSI assesses five subcategories of possible musical expertise: active engagement, perceptual abilities, musical training, singing abilities, and emotions, which, when combined, generate an overall general musical sophistication score. We assumed people with higher musical sophistication would exhibit less variable tapping performance. The BANPS is the brief version of the Affective Neuroscience Personality Scales (Davis et al., 2003; Davis and Panksepp, 2011) and measures behavioral traits in relation to six primary affective systems: play, seek, care, fear, anger, and sadness, each of which has hypothesized neural correlates. These scales were included to explore potential relationships between neuromodulatory systems and interpersonal dynamics. The IPC scales (Levenson, 1981) measure the degree to which participants think that: (1) they have control over their own life (internality), (2) others control events in their life (powerful others), and (3) chance dictates their life (chance). Previously, the internality subscale was associated with the degree to which participants were categorized as leaders or followers in a single-person adaptive metronome task (Fairhurst et al., 2014). In the current experiments, we wondered whether those with higher internality scores might be more closely synchronized with the metronome and/or the group. Collectively, the above surveys allow for an exploratory analysis of the degree to which person-level factors influence tapping behavior and subjective experience in group settings.
Experiment 1
Though the multi-person adaptive metronome was built with four tappers in mind, our initial experiments involved only one tapper so we could confirm the system worked as expected in a known scenario. Thus the goal of the first experiment was to replicate the findings of Fairhurst et al. (2013) with our new Arduino-based system.
Methods
Participants
A statistical power analysis was performed for sample size estimation, based on data from Fairhurst et al. (2013) (N = 16), comparing metronome adaptivity = 0 to metronome adaptivity = 0.25. The effect size in this original study was −0.95, considered large by Cohen's (1988) criteria. With a significance level of 0.05 and power of 0.90, the projected sample size needed to detect this effect in the current experiment was approximately N = 12 for this simplest comparison between adaptivity conditions. We thus sought to collect data from at least 20 participants to allow for attrition. Data collection was set to stop when a maximum of 30 participants had completed the task, or the academic term ended, whichever came first.
Twenty-one undergraduate students from the University of California, Davis, participated in exchange for partial course credit. Data from one participant were discarded because of self-reported abnormal hearing. Data from two participants could not be used due to technical issues with our internet connection during data collection, which resulted in loss of survey data. After the additional data cleaning procedures described below, we had a total of 15 participants, aged 21 +/– 2 years; 8 were female. For all experiments reported in this paper, participants provided informed consent in accordance with a protocol approved by the Institutional Review Board of the University of California, Davis.
Stimuli
The only sound heard throughout the experiment was the sound of the metronome. This sound was a marimba sample from GarageBand, with pitch A2 and a 400 ms duration, played through a speaker connected to the metronome Arduino at a comfortable listening volume. All stimuli discussed in this paper are available in the GEM-Experiments-POC repository.
Procedure
The participant was seated in a sound-attenuating room and instructed to synchronize their tapping with the metronome, starting with the third tone. They were told to try to maintain the initial tempo established by the metronome's first two tones and that the metronome would adapt based on their performance. Following the delivery of instructions, we ensured participants understood the task and showed them how to tap on the force-sensitive resistor with the index finger of their dominant hand. Participants then completed one practice round of tapping (∼13 s); a maximum of three practice trials was possible if participants struggled to understand the task. Metronome adaptivity was always 0 during the practice rounds.
Throughout the experiment, participants completed ten rounds of tapping at each of five adaptivity levels (0, 0.25, 0.5, 0.75, 1), for a total of 50 rounds of tapping. Adaptivity level was randomized across rounds of tapping. Each round consisted of 25 metronome tones and lasted approximately 13 s. The initial metronome tone inter-onset-interval for all rounds was set to 500 ms (120 beats per minute). Following each round of tapping, participants answered a short questionnaire which assessed, on a set of 5-point scales, the degree to which they: (1) felt synchronized with the metronome, (2) felt in the groove, (3) felt they had influence over the metronome pulse, (4) found the task difficult, and (5) would have liked to continue with the task. We defined synchronization as the degree to which participants thought their taps were aligned with the tones of the metronome (from an objective perspective) and "being in the groove" as an effortless, pleasurable feeling of oneness with the metronome (a subjective experience; see Janata et al., 2012). This distinction was explained to participants during the instruction period (for the full instruction text, please visit our GEM-Experiments-POC repository wiki). In total, all rounds of tapping, including post-tapping surveys, typically lasted 30–40 min. Participants then completed the personality-related and demographic surveys, which typically took 15–20 min. The entire experiment lasted approximately 1 h.
Data Analysis
Binary tapping data files were converted to .csv using a custom file parser in Julia (available in the GEM repository). All data were then preprocessed and concatenated using custom scripts in MATLAB (Mathworks, Natick, MA, United States). Note that while all hardware and software used to build and run the adaptive metronome are open source, MATLAB is not. We used MATLAB purely because of its convenient interface with our web-based data collection tool, Ensemble (Tomic and Janata, 2007). After concatenating all cleaned tapping and survey data into tables in MATLAB, the remaining analyses were conducted in Python. The anonymized data tables for each experiment and the analysis code are available in the GEM-Experiments-POC repository. All statistical analyses can be recreated using the provided data tables and Jupyter notebooks.
Tapping Data
Participants who missed 30% or more of the required number of taps were eliminated from further analysis. Of the data remaining, any rounds missing > 30% of the taps were discarded. If these data cleaning procedures resulted in a participant not having any observations for any of the adaptivity conditions, that participant was removed from further analyses (3 participants). All tapping data were analyzed with respect to metronome adaptivity condition. The dependent measure of interest was the standard deviation of tap asynchronies, which serves as a measure of synchronization stability, as in Repp and Keller (2008) and Fairhurst et al. (2013).
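As a minimal illustration (the column names below are ours, not necessarily those of the released data tables), the dependent measure is simply the standard deviation of asynchronies within each round, averaged within each adaptivity condition.

```python
import pandas as pd

# Hypothetical long-format tapping table (one row per registered tap).
taps = pd.DataFrame({
    "participant":   [1, 1, 1, 1, 1, 1],
    "alpha":         [0.0, 0.0, 0.0, 0.25, 0.25, 0.25],
    "round":         [1, 1, 1, 2, 2, 2],
    "asynchrony_ms": [-35.0, -20.0, -42.0, -12.0, -8.0, -15.0],
})

# SD of asynchronies per round, then averaged within adaptivity condition.
sd_async = (taps.groupby(["participant", "alpha", "round"])["asynchrony_ms"]
                .std()
                .groupby(["participant", "alpha"])
                .mean()
                .rename("sd_async"))
print(sd_async)
```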
Survey Data
All questionnaires were analyzed via a custom MATLAB implementation of the original instrument authors’ scoring metrics. For the post-tapping survey, data were z-score normalized within each rating scale for each participant. Correlations among surveyed variables were checked and an exploratory factor analysis was conducted.
Statistical Analyses
All statistical analyses were conducted in Python using the pingouin (Vallat, 2018), statsmodels (Seabold and Perktold, 2010), and factor-analyzer (Biggs, 2020) packages. All code is available in the associated Jupyter notebooks.
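As an illustration of how these packages map onto the analyses reported below (a sketch on synthetic stand-in data with hypothetical column names, not the exact notebook code), a repeated measures ANOVA and a one-sided paired t-test in pingouin look roughly like this; pg.rm_corr is used analogously for the repeated-measures correlations.

```python
import numpy as np
import pandas as pd
import pingouin as pg

# Synthetic stand-in data: one row per participant x adaptivity condition.
rng = np.random.default_rng(0)
alphas = np.tile([0.0, 0.25, 0.5, 0.75, 1.0], 15)
df = pd.DataFrame({
    "participant": np.repeat(np.arange(1, 16), 5),
    "alpha": alphas,
    "sd_async": 30 - 5 * (alphas == 0.25) + rng.normal(0, 3, alphas.size),
})

# Repeated measures ANOVA: effect of adaptivity on SD asynchrony.
aov = pg.rm_anova(data=df, dv="sd_async", within="alpha",
                  subject="participant", detailed=True)

# One-sided paired t-test: is SD asynchrony lower at alpha = 0.25 than at 0?
# (older pingouin versions use tail= rather than alternative=)
base = df.query("alpha == 0.0").sort_values("participant")["sd_async"].to_numpy()
opt = df.query("alpha == 0.25").sort_values("participant")["sd_async"].to_numpy()
ttest = pg.ttest(opt, base, paired=True, alternative="less")

print(aov[["Source", "F", "p-unc"]])
print(ttest[["T", "dof", "p-val"]])
```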
Results
The tapping results for Experiment 1 are plotted in Figure 2, left panel (solid line), as a function of metronome adaptivity condition (α). A repeated measures analysis of variance (ANOVA) revealed a main effect of α condition on tapping SD asynchrony, F(4, 56) = 8.16, p = 0.002, η2 = 0.26. All paired t-tests were one-sided, with a significance level of 0.05. The t-tests revealed significant differences in SD asynchrony between the baseline condition (α = 0) and what we might call optimally adaptive conditions: α = 0.25, t(14) = 2.81, p = 0.007, d = 0.48, Δ async = −8.10 ms, and α = 0.5, t(14) = 2.92, p = 0.006, d = 0.39, Δ async = −6.64 ms. There were no significant differences from the baseline condition when α = 0.75 or 1, both t < 1.
Figure 2. Tapping performance (standard deviation of tap—metronome asynchrony), left panel, and subjective enjoyment, right panel, averaged across participants, as a function of metronome adaptivity, in the two single tapper experiments. Solid black lines represent experiments in which participants could only hear the sound produced by the metronome; dashed lines, experiments in which participants could hear the metronome and the sound of their own tap. Error bars represent standard error of the mean.
Turning to subjective ratings, we found participants' perceived synchrony with the metronome was significantly negatively correlated with their tapping SD asynchrony [rrm(603) = −0.40, 95% CI (−0.47, −0.33), p < 0.001]. Reduced asynchronies (better synchrony with the metronome) were associated with a greater sense of subjective synchrony, suggesting participants have at least a moderately accurate assessment of their own tapping performance. Before analyzing subjective ratings in relation to metronome adaptivity, we checked for correlations among the individual items and found all showed significant correlations with each other (see Supplementary Appendix 1). Thus, we performed an exploratory factor analysis to identify one or more latent variables. Two preliminary tests confirmed the suitability of applying a factor analysis to these data (Bartlett's test of sphericity = 810.04, p < 0.001; Kaiser-Meyer-Olkin measure of sampling adequacy = 0.738, p < 0.001). Eigenvalues and a scree plot indicated a 1-factor solution. We therefore ran the factor analysis again, specifying a one-factor solution, using the maximum likelihood method, and no rotation (rotation is not possible with one factor). Factor loadings and communalities are shown in Table 2. Groove, synchrony, and liking all loaded strongly and positively onto this factor, while difficulty had a high negative loading. With a loading less than 0.3, participants' felt influence over the pulse played only a small role in this overall factor and was therefore excluded from the overall factor score. We hereafter refer to this factor as "Enjoyment."
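These steps (Bartlett's test of sphericity, the KMO measure, then a one-factor maximum-likelihood solution without rotation) map directly onto the factor-analyzer package cited above. The sketch below assumes a table of z-scored ratings with one column per item; the column names and file name are hypothetical.

```python
import pandas as pd
from factor_analyzer import FactorAnalyzer
from factor_analyzer.factor_analyzer import (calculate_bartlett_sphericity,
                                              calculate_kmo)

# Hypothetical ratings table: one row per participant x round, one z-scored
# column per item (groove, synchrony, influence, difficulty, liking).
ratings = pd.read_csv("exp1_ratings_zscored.csv")   # hypothetical file name

chi_square, p_value = calculate_bartlett_sphericity(ratings)
_, kmo_total = calculate_kmo(ratings)
print(f"Bartlett chi2 = {chi_square:.2f}, p = {p_value:.3g}; KMO = {kmo_total:.3f}")

# One-factor maximum-likelihood solution; rotation is not applicable with
# a single factor.
fa = FactorAnalyzer(n_factors=1, method="ml", rotation=None)
fa.fit(ratings)
print(pd.DataFrame({"loading": fa.loadings_[:, 0],
                    "communality": fa.get_communalities()},
                   index=ratings.columns))
```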
Participants’ enjoyment factor scores are plotted as a function of metronome adaptivity in Figure 2, right panel (solid line). A repeated measures ANOVA revealed a main effect of α condition on enjoyment, F(4, 56) = 13.65, p < 0.001, η2 = 0.49. There was no significant difference between baseline (α = 0) and α = 0.25. However, a significant decrease in enjoyment was observed between baseline and α = 0.5, 0.75, and 1. Detailed statistics for all pairwise comparisons can be viewed in the associated Jupyter Notebook.
Discussion
The tapping results of Experiment 1 replicate those of Fairhurst et al. (2013), using our new Arduino-based adaptive metronome. The Arduino-based system functions as expected: it brought participants into greater synchrony with the metronome at adaptivity levels of 25–50% (compared to baseline), as in Fairhurst et al. (2013). At higher levels of adaptivity (75% or 100%) participants’ tapping performance returned to baseline or worse. We can, therefore, conclude, like Fairhurst et al. (2013), that it is possible for the metronome to be in “optimal” vs. “overly” adaptive states, as could be the case with a real person with whom one might interact. By simulating such states in a controlled way, we can assess the degree to which optimal vs. extreme adaptivity influence participants’ experience.
With respect to subjective experience during the task, we first found that participants' perceived synchrony with the metronome was correlated with their objective (measured) tapping synchrony. We also found that their synchrony rating was highly correlated with their groove, liking, and difficulty ratings. An exploratory factor analysis showed all of these ratings could be reduced to one collective factor, which we labeled Enjoyment. These correlations between our rating items echo previous studies that have also often found associations between groove, liking, and difficulty (e.g., Janata et al., 2012; Hurley et al., 2014; Witek et al., 2014).
In examining participants' enjoyment scores in relation to metronome adaptivity, we did not see any increase in enjoyment with optimal adaptivity, though we did find significant decreases in enjoyment when the metronome adapted by 50% or more. It is hard to directly compare these subjective findings with those of Fairhurst et al. (2013), as they used only difficulty, influence, and synchrony ratings on a visual analog scale, ranging from 0 to 10, whereas we included additional items, all on 5-point scales. Nonetheless, our results follow their overall pattern of subjective rating findings (see their Supplementary Table 1), with about the same ratings observed in the baseline and 25% adaptivity conditions, followed by increased difficulty (in our case, decreased enjoyment) at adaptivity levels of 50% and higher, compared to baseline. Whether the overall patterns in tapping synchrony and subjective experience reported here will remain the same when participants receive auditory feedback about their taps is the topic of Experiment 2.
Experiment 2
Before applying our adaptive metronome to group contexts, we first wanted to more clearly characterize its functioning in the single-person use case. To this end, we repeated Experiment 1, but additionally provided participants auditory feedback from their taps. We hypothesized such feedback would lead to better synchronization with the metronome and increased enjoyment of the task.
Methods
All methods were identical to those of Experiment 1, with the following exception: participants could hear the sound produced by their own tap through headphones. This sound was a woodblock sample from the Proteus 2000 sound module. Participants could still hear the metronome through its speaker. Therefore, before the experiment started, we had participants calibrate their headphone volume such that the volume of their tap was perceptually matched to that of the metronome. Twenty-eight participants took part in the study. Two were removed according to the data cleaning procedures outlined in Exp. 1, for a total of 26 remaining participants, mean age 20 +/– 2 years; 24 were female. No one took part in more than one of the experiments reported in this paper.
Results
The tapping data for Exp. 2 are plotted in Figure 2, left panel (dashed line). A repeated measures analysis of variance again showed a main effect of α condition on tapping SD asynchrony in Exp. 2: F(4, 100) = 8.69, p < 0.001, η2 = 0.26. Comparing means, we not only found significant differences between baseline and α = 0.25, t(25) = 5.33, p < 0.001, d = 0.77, Δ async = −3.94 ms, and α = 0.5, t(25) = 4.39, p < 0.001, d = 0.78, Δ async = −4.28 ms, but also between baseline and α = 0.75, t(25) = 4.51, p < 0.001, d = 0.72, Δ async = −3.81 ms.
To assess the effect of auditory feedback on tapping SD asynchrony, we conducted a mixed ANOVA with auditory feedback as a between-subjects predictor, adaptivity condition as a within-subjects predictor, and their interaction. Compared to Exp. 1, we found tapping performance improved significantly when participants received auditory feedback from their taps [main effect of auditory feedback, F(1,39) = 8.57, p < 0.001, η2 = 0.18]; on average, participants' SD async decreased by 9.79 ms (i.e., tapping became more stable). The main effect of adaptivity was also significant [F(4,156) = 10.86, p < 0.001, η2 = 0.22], as was the interaction between adaptivity and auditory feedback [F(4,156) = 2.59, p < 0.001, η2 = 0.06]—at higher levels of adaptivity, the effect of auditory feedback on tapping performance was more pronounced.
With regard to subjective experience, participants’ perceived synchrony with the metronome was significantly negatively correlated with their tapping SD asynchrony [rrm(1,085) = −0.30, 95% CI (−0.36, −0.25), p < 0.001], suggesting accurate assessment of their own tapping capabilities. As in Experiment 1, we again found significant correlations between subjective rating scales (see Supplementary Appendix 1), and, therefore, conducted a confirmatory factor analysis, in line with the exploratory results of Exp. 1. Two preliminary tests confirmed the suitability of this approach (Bartlett χ2= 1,307.07, p < 0.001; KMO = 0.727, p < 0.001). Our 1 factor solution showed all loadings in the same directions and relative magnitudes as in Exp. 1, Table 2 (see Supplementary Appendix 2 for the loadings of this experiment). We therefore continued with the labeling of this factor as Enjoyment.
Participants' Enjoyment factor scores are plotted in Figure 2, right panel (dashed line). A repeated measures ANOVA revealed a main effect of α condition on enjoyment score, F(4, 100) = 4.04, p = 0.015, η2 = 0.14. All paired t-tests were one-sided, with a significance level of 0.05. Compared to baseline (α = 0), there was a significant increase in enjoyment at α = 0.25 [t(25) = −3.30, p = 0.014] and α = 0.75 (t = −2.03, p = 0.03).
To assess the effect of auditory feedback on enjoyment, we conducted a mixed ANOVA with auditory feedback as a between subjects predictor, adaptivity condition as a within subjects predictor, and their interaction. While the main effect of auditory feedback was not significant, the interaction between auditory feedback and adaptivity condition was [F(4,156) = 8.84, p < 0.001]. With no adaptivity (baseline condition), having auditory feedback was significantly less enjoyable [t(30) = 3.91, p = 0.001], whereas at higher levels of adaptivity (α = 0.75 and 1), auditory feedback resulted in greater enjoyment. All comparisons of means can be found in the associated Jupyter Notebook.
Discussion
When participants received auditory feedback from their taps, they achieved greater synchrony with the metronome (in all conditions). In terms of metronome adaptivity, performance was improved, even at higher levels of adaptivity (75%), but returned to near baseline with 100% adaptivity. Auditory feedback resulted in a significant overall decrease in SD asynchrony, compared to no feedback (overall difference in means between Exp. 1 and 2 was ∼10 ms). Though there was no overall difference in enjoyment scores between Exps. 1 and 2, in Experiment 2, we did find a significant improvement in participants’ enjoyment at 25% and 75% adaptivity, compared to no adaptivity baseline, as well as an interaction between auditory feedback and alpha condition. Whereas in Experiment 1, enjoyment plummeted at higher levels of adaptivity, in Experiment 2, when participants received auditory feedback, enjoyment remained high. These results speak to the relevance of auditory feedback in influencing both tapping accuracy and subjective enjoyment in an adaptive metronome context. We thus continued to explore the effect of auditory feedback in group tapping contexts, reported below.
Experiment 3
After replicating and extending single-person adaptive metronome studies using our Arduino-based system, we next sought to validate the use of an adaptive metronome in multi-person contexts.
Methods
Participants
Because the power analysis detailed in Experiment 1 was based on individuals rather than groups, and no previous adaptive metronome group-tapping experiment existed to use for an effect size calculation, we erred on the side of more groups, with data collection set to stop when a maximum of 35 groups had completed the task, or the academic term ended, whichever came first. Participants were not systematically assigned into specific groups. They could register for a timeslot in the experiment via UC Davis’s online recruitment system. Hence, the four people who had signed up (for, e.g., the 9 am–10 am timeslot) were all grouped together. One hundred twenty-four undergraduate students (31 groups of 4) from the University of California, Davis, participated in exchange for partial course credit. Four groups had to be removed due to technical difficulties (Wi-Fi issues during survey completion). The remaining 108 participants (27 groups) had a mean age of 21 +/- 3 years; 83 were female.
Stimuli
Same as in Experiment 1. Participants only heard the metronome (marimba sample) and did not hear any sound produced by their own or others’ taps. As in Experiment 1, the initial metronome tone inter-onset-interval for all rounds was set to 500 ms (120 beats per minute).
Procedure
The procedure was largely the same as in Experiment 1, with the following exceptions. Participants were instructed to keep their gaze on their own finger as they tapped and not to speak with the other participants during the experiment. Surveys, such as the IPC scales, were completed before the tapping experiment started, rather than at the end, so that responses would not be influenced by any perceived social dynamics during the experiment. The post-tapping survey presented after each round asked an additional question about how in synchrony participants felt with the others. Participants completed six rounds of tapping at each of four adaptivity levels (0, 0.35, 0.7, 1), for a total of 24 rounds of tapping. Each round lasted approximately 30 s (as opposed to 13 s in Experiments 1 and 2). As previously, adaptivity level was randomized across rounds of tapping. Instructions remained the same; we acknowledge a possible ambiguity in how participants could now interpret the phrase "based on your performance" (either as an individual or as a group)—both would be acceptable and accurate interpretations.
Data Analysis
All data concatenation, cleaning, and analysis procedures were identical to those for Exps. 1 and 2, with the following exceptions.
Tapping Data
Groups in which 30% or more of the required number of taps were missed (across the entire group) were eliminated from further analysis.
The main dependent measure of interest was the standard deviation of the mean asynchrony of the group, relative to the metronome tones (referred to below as SD async). An additional metric of interest was the relative performance of the individuals with respect to the group, which we calculated as the individual's SD asynchrony with respect to the metronome minus the group's SD asynchrony with respect to the metronome, as defined in the following sequence of equations. Equation 3 represents the tapping asynchrony of the individual (i) with respect to the metronome (met) for each metronome window (w),

$$\mathrm{async}_{i,w} = t_{\mathrm{tap}_{i,w}} - t_{\mathrm{met}_w} \qquad (3)$$

Equation 4 the asynchrony of the group (G) with respect to the metronome for each metronome window,

$$\mathrm{async}_{G,w} = \frac{1}{I}\sum_{i=1}^{I} \mathrm{async}_{i,w} \qquad (4)$$

and Equation 5 the difference in standard deviation of tapping asynchrony between the individual and the group,

$$\Delta \mathrm{SD}_{i} = \mathrm{SD}_{w}\left(\mathrm{async}_{i,w}\right) - \mathrm{SD}_{w}\left(\mathrm{async}_{G,w}\right) \qquad (5)$$
We were also interested in the stability of the group members' tapping relative to each other and calculated this by first taking the standard deviation of the four participants' asynchronies for each metronome window,

$$\mathrm{SD}^{G}_{w} = \mathrm{SD}_{i}\left(\mathrm{async}_{i,w}\right)$$

and then taking the standard deviation of those standard deviations over time (i.e., across windows),

$$\mathrm{SD\ of\ SD} = \mathrm{SD}_{w}\left(\mathrm{SD}^{G}_{w}\right)$$
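A compact numpy sketch of these group-level metrics, starting from a windows-by-tappers matrix of asynchronies (the random data and variable names here are placeholders, not the released data):

```python
import numpy as np

# Placeholder asynchronies in ms: shape (n_windows, n_tappers).
rng = np.random.default_rng(0)
async_mat = rng.normal(0.0, 25.0, size=(60, 4))

group_async = np.nanmean(async_mat, axis=1)      # Eq. 4: group mean per window
group_sd_async = np.nanstd(group_async)          # main measure: SD of group asynchrony
indiv_sd_async = np.nanstd(async_mat, axis=0)    # each individual's SD vs. the metronome
delta_sd = indiv_sd_async - group_sd_async       # Eq. 5: individual minus group

within_window_sd = np.nanstd(async_mat, axis=1)  # spread across tappers, per window
sd_of_sd = np.nanstd(within_window_sd)           # members' stability relative to each other

print(group_sd_async, delta_sd, sd_of_sd, sep="\n")
```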
These different metrics are illustrated in Figure 3 and detailed in its caption.
Figure 3. An example run (left panels) at each adaptivity condition (rows) during the group tapping experiments. The metrics illustrated in this figure are applicable to all group tapping experiments, though this particular example comes from Exp. 5 (when all participants could hear each other and the metronome). The individual colored lines and dots represent the individual tappers in the group. The solid gray line represents the adjustments of the metronome. Note that there are no adjustments in the top left panel when α = 0 (i.e., the control condition that mimics a standard metronome). In the middle panels, the average asynchrony (dots) and SD of the average asynchrony (error bars) of all individuals (colored dots) and the group (black dot), with respect to the metronome, are plotted. A solid line at 0 and dotted lines at +/– 100 ms are plotted for reference. The group SD asynchrony (error bars of the black dot) was our main metric of interest (see Figure 4). In the right-most panels, the colored dots show the difference between each individual's SD asynchrony and the group SD asynchrony. This metric indexes how variable the individual is vs. the group as a whole, with respect to the metronome, and is used in our models assessing individual differences. The black dot represents the average of the SD asynchrony of the individuals, across all windows; the error bar is the SD of this average SD asynchrony, which we consider an index of how variable individuals' taps are with respect to each other (rather than the metronome).
Survey Data
All questionnaires were analyzed via a custom MATLAB implementation of the original instrument authors' scoring metrics. For the post-tapping survey, data were z-score normalized within each rating scale for each participant and then averaged across participants within each group.
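A brief pandas sketch of this normalization (column names and file name are hypothetical): ratings are z-scored within each participant and item, then averaged across the four group members.

```python
import pandas as pd
from scipy.stats import zscore

# Hypothetical long-format survey table: one row per participant x round x item.
surveys = pd.read_csv("group_ratings.csv")   # hypothetical file name

surveys["z"] = (surveys.groupby(["participant", "item"])["rating"]
                       .transform(lambda r: zscore(r, ddof=1)))
group_means = (surveys.groupby(["group", "alpha", "round", "item"])["z"]
                      .mean()
                      .reset_index())
print(group_means.head())
```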
Results
A repeated measures ANOVA showed a main effect of α condition on the average group tapping SD asynchrony [F(3, 78) = 11.33, p < 0.001, η2 = 0.30]; see Figure 4, left panel, solid line. Comparisons between all means showed significant improvements in tapping performance at α = 0.35 [t(26) = 3.02, p < 0.001, d = 0.54, Δ async = −6.22 ms] and α = 0.7 [t(26) = 2.09, p < 0.001, d = 0.33, Δ async = −4.19 ms], but not at α = 1, compared to baseline. The group's tapping SD asynchrony was negatively correlated with both the average (across the group) of each individual's perceived synchrony with the metronome [rrm (605) = −0.42, 95% CI (−0.48, −0.35), p < 0.001] and the average (across the group) of each individual's perceived synchrony with the group [rrm (605) = −0.37, 95% CI (−0.44, −0.30), p < 0.001]. Collectively, these correlations indicate groups can accurately judge their own tapping stability. Similarly, the group's tapping variability with respect to each other (i.e., SD of SD async; Figure 3, right panel, black error bars) was significantly correlated with the average perceived synchrony of the group [rrm (605) = −0.32, 95% CI (−0.39, −0.24), p < 0.001]; see Figure 5.
Figure 4. Tapping performance and subjective enjoyment, averaged across participants, as a function of metronome adaptivity, in the three group tapping experiments. Solid black lines represent experiments in which participants could only hear the sound produced by the metronome; dashed lines, experiments in which participants could hear the metronome and the sound of their own tap; dotted lines, experiments in which participants could hear the metronome, themselves, and all others in the group. Error bars represent standard error of the mean.
Figure 5. Correlations between motor tapping asynchrony and subjectively rated synchrony with the metronome in all experiments. Note that in group tapping experiments, we have additional data about how in synchrony participants felt with the group. We analyzed that data separately as a function of synchrony with respect to the metronome (dotted purple line) and synchrony with respect to everyone in the group (red line). Error bars represent confidence intervals of the correlation coefficients.
As in the previous experiments, we again found significant correlations between the subjective rating scales (see Supplementary Appendix 1), and, therefore, conducted a confirmatory factor analysis, in line with the exploratory and confirmatory results of Exps. 1 and 2, respectively. Two preliminary tests confirmed the suitability of this approach (Bartlett χ2= 2,129.74, p < 0.001; KMO = 0.865, p < 0.001). Our 1 factor solution showed all loadings in the same directions and relative magnitudes as in Exps. 1 and 2. The additional item present in the group experiments (“To what extent did you feel in synchrony with the other players in the group?”) had a high positive loading, similar to the rated synchrony of the self with the metronome. Factor loadings are shown in Table 3. The Enjoyment factor explains 65% of the variance in the ratings data.
The group's average Enjoyment factor scores are plotted in Figure 4, right panel (solid line). A repeated measures ANOVA revealed a main effect of α condition on enjoyment score, F(3, 78) = 78.03, p < 0.001, η2 = 0.75. All paired t-tests were one-sided, with a significance level of 0.05. Compared to baseline (α = 0), there was no change in Enjoyment at α = 0.35; however, there was a significant decrease in enjoyment at higher adaptivity levels: α = 0.7 [t(26) = 6.38, p < 0.001] and α = 1 (t = 10.16, p < 0.001).
Discussion
With this experiment, we show that an adaptive metronome can be employed successfully in a group context. By adapting to the average asynchrony of the group, we find groups of tappers can be brought into greater synchrony (compared to baseline). Conversely, too much adaptivity results in performance at or worse than baseline. Thus, the tapping stability of groups can be manipulated using our multi-person adaptive metronome in a manner analogous to the results observed in single-tapper contexts without auditory feedback (Exp. 1).
In terms of subjective experience, we replicate the factor structure of our previous experiments, with the addition of feeling in synchrony with the group also loading onto the Enjoyment factor. We found enjoyment significantly decreased at higher levels of adaptivity but not at optimal adaptivity (35%), mimicking the effects observed in Exp. 1. We also found that groups are generally able to accurately assess the stability of their own tapping performance, as indicated by the significant correlation between their ratings of synchrony and their measured tapping SD asynchrony, on the trial level. In the following experiments, we explore the effect of auditory feedback on group tapping stability and enjoyment.
Experiment 4
Methods
Participants
One hundred and four undergraduate students (26 groups of 4) participated. One group had to be removed due to technical difficulties during data collection. In total there were 25 groups, 100 participants, with a mean age of 20 +/– 2 years, 79 female.
Stimuli
Participants heard the metronome (marimba sample) through its speaker and their own tap sound through headphones. All participants’ taps produced the same sound (the woodblock sample from Experiment 2). Before the experiment started, we had participants calibrate their headphone volume such that the volume of their tap was perceptually matched to that of the metronome.
Procedure
Same as Exp. 3.
Results
A repeated measures ANOVA showed a main effect of α condition on the average group tapping SD asynchrony [F(3, 72) = 22.09, p < 0.001, η2 = 0.48]. Comparisons between all means showed significant differences between α = 0 and all other conditions. Tapping performance significantly improved at α = 0.35 [t(24) = 5.17, p < 0.001, d = 1.10, Δ async = −5.70 ms], α = 0.7 [t(24) = 7.71, p < 0.001, d = 1.43, Δ async = −6.51 ms], and α = 1 [t(24) = 4.51, p < 0.001, d = 1.10, Δ async = −5.28 ms]. In brief, tapping performance remained significantly improved, no matter how adaptive the metronome was (see Figure 4, left panel, dashed line).
As in Experiment 3, the group’s tapping SD asynchrony was negatively correlated with the average perceived synchrony with the metronome [rrm (541) = −0.24, 95% CI (−0.32, −0.16), p < 0.001] and the average perceived synchrony with the group [rrm (541) = −0.22, 95% CI (−0.30, −0.14), p < 0.001]. Similarly, the group’s tapping variability with respect to each other was significantly negatively correlated with the average perceived synchrony of the group [rrm (541) = −0.33, 95% CI (−0.4, −0.25), p < 0.001].
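These trial-level associations are repeated-measures correlations; a minimal sketch in Python (pingouin; Vallat, 2018) with hypothetical column names is shown below.

```python
# Sketch of the repeated-measures correlation between measured and rated
# synchrony; `trials` is a hypothetical DataFrame with one row per group/trial.
import pingouin as pg

rmc = pg.rm_corr(data=trials, x='sd_async', y='rated_sync_metronome',
                 subject='group_id')
print(rmc[['r', 'dof', 'pval', 'CI95%']])
```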
Ratings data were again subject to a confirmatory factor analysis (see Factor Loadings in Supplementary Appendix 2). The group’s average Enjoyment factor scores are plotted in Figure 4, right panel (dashed line). A repeated measures ANOVA showed no main effect of α condition on enjoyment score, F(3, 72) = 1.53, p = 0.213, η2 = 0.06.
Discussion
We successfully replicated the effect observed in Experiment 3: the multi-person adaptive metronome can be used to bring groups of tappers into greater synchrony, compared to their baseline. In contrast to Exp. 3, in the current experiment with self-related auditory feedback, we found groups were able to maintain enhanced synchrony even at higher levels of adaptivity. These results are interesting as they suggest that 1) groups are able to adapt more flexibly to a greater degree of difficulty when they have additional information (auditory feedback), and/or 2) individuals in groups are able to synchronize better with the metronome, and thereby with each other, when they receive feedback, so as to negate the potentially disruptive effects of a highly adaptive metronome. Pushing the group into a difficult and unpleasant state may, therefore, require adaptivity levels greater than 100%. Future experiments should explore this possibility. In Exp. 5, for consistency, we retained the same levels of adaptivity while adding auditory feedback about the other tappers in the group.
Experiment 5
Methods
Participants
One hundred and four undergraduate students (26 groups of 4) participated. Three groups were removed due to technical issues, and an additional three groups were removed during data cleaning (outlined below). In total, there were 20 groups (80 participants), with a mean age of 21 ± 4 years; 57 were female.
Stimuli
Participants heard the metronome (marimba sample) through its speaker and their own taps (woodblock sample) through their own individual speakers. In other words, all participants’ taps produced the same sound, and all participants could hear the tap sounds produced by all others.
Procedure
Same as Exp. 3.
Results
A repeated measures ANOVA showed a main effect of α condition on the group tapping SD asynchrony [F(3, 54) = 33.73, p < 0.001, η2 = 0.65]; see Figure 4 (left panel, dotted line). As in Experiment 4, tapping performance improved significantly in all adaptivity conditions, compared to baseline [α = 0.35: t(18) = 7.29, p < 0.001, d = 1.61, Δ async = −13.61 ms; α = 0.7: t(18) = 6.19, p < 0.001, d = 1.56, Δ async = −13.21 ms; α = 1: t(18) = 6.23, p < 0.001, d = 1.49, Δ async = −12.86 ms].
As in Experiments 3 and 4, the group’s tapping SD asynchrony was negatively correlated with the average perceived synchrony with the metronome [rrm (407) = −0.16, 95% CI (−0.26, −0.07), p < 0.001] and the average perceived synchrony with the group [rrm (407) = −0.25, 95% CI (−0.33, −0.15), p < 0.001]. The group’s tapping variability with respect to each other was significantly negatively correlated with the average perceived synchrony of the group [rrm (407) = −0.35, 95% CI (−0.43, −0.26), p < 0.001].
A comparison of the correlation coefficients among experiments is shown in Figure 5. Across all experiments, it is interesting that the strongest correlations between subjective feelings of synchrony and actual measured SD asynchrony occur with no auditory feedback (i.e., participants most accurately judge their tapping performance when they do not have auditory feedback, despite being worse at the task). It is also interesting that the correlation between members’ rated synchrony with each other and their measured SD asynchrony with respect to each other does not seem to change as a function of auditory feedback, meaning that in all group experiments participants can equally well tell how in sync they are with each other, regardless of auditory feedback (even though their ability to judge their group synchrony with the metronome is best with less auditory feedback).
To examine the effect of auditory feedback on SD asynchrony (across Exps. 3, 4, and 5), we ran a mixed ANOVA, with auditory feedback as a between-subjects factor, adaptivity condition as a within-subjects factor, and their interaction. There was a main effect of auditory feedback in predicting tapping performance [F(2, 281) = 32.29, p < 0.001, η2 = 0.19]. Compared to no feedback, tapping SD asynchrony decreased on average by 10.52 ms with self-feedback [t(149.96) = 8.23, p < 0.001, d = 1.11] and by 5.53 ms with self + other feedback [t(179.97) = 3.46, p < 0.001, d = 0.50]. Correspondingly, tapping performance was better, on average, in Experiment 4 compared to 5, by 4.99 ms [t(111.19) = 4.14, p < 0.001, d = 0.68]. In other words, tapping performance is best in Exp. 4, when participants hear the metronome and only themselves, followed by Exp. 5, when all participants can hear each other. Group tapping performance is worst when participants receive no auditory feedback (Exp. 3). There was also a main effect of adaptivity condition [F(3, 204) = 38.12, p < 0.001, η2 = 0.36], as well as an interaction between auditory feedback and adaptivity condition [F(6, 204) = 10.75, p < 0.001, η2 = 0.24]. As is visible in Figure 4, at higher levels of adaptivity, having no auditory feedback significantly worsens performance.
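A sketch of this mixed design in Python (pingouin; Vallat, 2018) follows; it is not the authors’ exact specification, and the pooled DataFrame and its columns (‘feedback’, ‘alpha’, ‘sd_async’, ‘group_id’) are assumptions.

```python
# Sketch of the mixed ANOVA: auditory feedback between groups, adaptivity within.
import pingouin as pg

mixed = pg.mixed_anova(data=pooled, dv='sd_async', within='alpha',
                       subject='group_id', between='feedback')

# Follow-up comparisons between feedback conditions
# (pg.pairwise_ttests in older pingouin releases)
posthoc = pg.pairwise_tests(data=pooled, dv='sd_async', between='feedback',
                            padjust='bonf')
```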
With respect to subjective experience, ratings data affirmed the same factor structure as the previous experiments (see Factor Loadings in Supplementary Appendix 2). The group’s average Enjoyment factor scores are plotted in Figure 4 (right panel, dotted line). A repeated measures ANOVA showed a main effect of α condition on enjoyment score, F(3, 54) = 3.96, p = 0.013, η2 = 0.18. Enjoyment showed no significant difference between baseline and α = 0.35, but did significantly decrease at higher levels of adaptivity (α = 0.7 and 1). In comparing Enjoyment scores across all three group tapping experiments, via a mixed ANOVA with auditory feedback as a between-subjects factor, adaptivity condition as a within-subjects factor, and their interaction, we find a main effect of adaptivity condition [F(3, 204) = 21.58, p < 0.001, η2 = 0.39] but no main effect of auditory feedback. However, we do find a significant interaction between adaptivity and auditory feedback [F(6, 204) = 47.55, p < 0.001, η2 = 0.41]. At baseline (no adaptivity), enjoyment is highest when participants receive no auditory feedback. This effect is reversed at higher levels of adaptivity, where having auditory feedback is significantly more enjoyable than having none. Statistics for all t-tests can be found in the associated Jupyter notebook.
Discussion
With this experiment, we again successfully replicated our ability to enhance group synchrony with the metronome by adapting its timing. Contrary to our hypothesis that groups would be even better at tapping when they received auditory feedback about everyone in the group, we found that, though hearing everyone resulted in better performance than hearing only the metronome, groups performed best in Exp. 4, when they could hear only themselves. Similarly, enjoyment did not pattern as it did in Exp. 4: in this experiment, enjoyment did not increase at optimal levels of adaptivity (as it tended to in Exp. 4), but rather decreased at higher levels of adaptivity (70 and 100%), a result more similar to Exp. 3. Implications of these results will be elaborated further in the General Discussion. First, we explore the role of individual differences in shaping tapping performance and subjective experience.
Individual Differences
Musical Sophistication
Across all experiments, we were interested in whether participants’ benefit from metronome adaptivity (in terms of improved tapping performance) could be predicted by their musical sophistication. We defined adaptivity benefit as the difference in tapping performance between the baseline and optimally adaptive condition (α = 0.25 and 0.35 for individual and group experiments, respectively). Thus, a positive value corresponds to improved tapping synchrony or enjoyment, while a negative value indicates decreased synchrony or enjoyment. As the subscales of the Gold-MSI were all significantly and highly correlated with each other, we used only the overall general sophistication index. We felt this general index was better than choosing one subscale, as the musical training, perceptual abilities, and even singing abilities subscales all contain items that have a direct bearing on tapping abilities.
To test whether musical sophistication mattered in terms of how much people, or groups, benefitted from metronome adaptivity, we split both the single-tapper and group tapping data into two musicianship groups via median split (single tapper: low musicianship, n = 19, sophistication mean = 57 ± 8; high musicianship, n = 18, sophistication mean = 89 ± 10; group tapping: low musicianship, n = 39, sophistication mean = 66 ± 5; high musicianship, n = 32, sophistication mean = 77 ± 5). The adaptivity benefit of each of the musicianship groups in the single and group tapping contexts was compared against zero using one-tailed, one-sample t-tests on the paired (baseline minus optimal) difference scores. All musicianship groups showed a significant benefit from metronome adaptivity [single/low: t(18) = 5.33, p < 0.001, d = 1.12, mean benefit = 5.93 ms; single/high: t(17) = 2.05, p = 0.028, d = 0.48, benefit = 4.89 ms; group/low: t(38) = 6.68, p < 0.001, d = 1.07, benefit = 6.49 ms; group/high: t(31) = 4.85, p < 0.001, d = 0.86, benefit = 9.87 ms]; see Figure 6. These results indicate that people of all musical levels benefit from an adaptive metronome. There were no significant differences between the means of low vs. high musicianship individuals, or groups, both ts < 1.5.
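The computation can be sketched as follows (Python; pingouin, Vallat, 2018); the per-participant or per-group summary table `perf` and its column names are our assumptions, not the authors’ variable names.

```python
# Sketch of the adaptivity-benefit analysis: benefit = baseline SD asynchrony
# minus SD asynchrony at the optimally adaptive level (positive = improvement).
import pingouin as pg

perf['benefit'] = perf['sd_async_baseline'] - perf['sd_async_optimal']
perf['musicianship'] = (perf['msi'] > perf['msi'].median()).map({True: 'high',
                                                                 False: 'low'})
# One-tailed, one-sample t-test of each subgroup's mean benefit against zero
for level, sub in perf.groupby('musicianship'):
    res = pg.ttest(sub['benefit'], 0, alternative='greater')
    print(level, res[['T', 'p-val', 'cohen-d']])
```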
Figure 6. The enhanced tapping accuracy (in ms) that participants experienced during optimally adaptive metronome conditions, as a function of their musical sophistication, in single and group tapping experiments. Participants (top panel) or groups (bottom panel) scoring in the top half of musical sophistication (Gold-MSI) are plotted in green, while those in the bottom half (low musical sophistication) are plotted in gray. Positive values indicate the extent to which participants’ SD asynchrony improved. Negative values indicate the opposite (that participants or groups became worse at optimal adaptivity).
Personality Factors and Individual-Group Asynchronies
Beyond musical sophistication, and perhaps internality (Fairhurst et al., 2014), we had no a priori hypotheses about which individual traits might be relevant in the context of group tapping. Nonetheless, we were interested in exploring person-level predictors of participants’ SD asynchrony with respect to the group, as well as predictors of participants’ subjective ratings from their individual asynchronies.
To start, we asked which person-level variables predict participants’ individual SD asynchrony difference with respect to the group (i.e., the individual minus group SD asynchrony; see colored dots in Figure 3, rightmost panels). We ran a linear mixed effects model (lme4 package in R, Bates et al., 2015; sjPlot package, Lüdecke, 2021) with the individual minus group SD asynchrony difference as the dependent variable; all person-level variables were included as fixed effects, while a term specifying adaptivity condition nested within participant, nested in group, nested in experiment was included as a random effect. Note that, as opposed to earlier analyses in which we were interested in adaptivity as a predictor of synchrony, here we wanted to understand the role of person-level features over all conditions. The variance inflation factors for all fixed effects were checked, and all were found to be safely < 2.
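The reported model was fit with lme4 in R; a rough Python stand-in using statsmodels (Seabold and Perktold, 2010) is sketched below. The column names are hypothetical, only a subset of the person-level predictors is shown, and the variance-component specification only approximates the nested random-effects structure described above.

```python
# Approximate stand-in for the lme4 model: individual-minus-group SD asynchrony
# predicted by person-level variables, with nesting approximated via variance
# components (experiment as the top-level grouping factor).
import statsmodels.formula.api as smf

model = smf.mixedlm(
    "sd_diff ~ msi + sadness",                      # subset of fixed effects shown
    data=dat,
    groups="experiment",                            # outermost level of nesting
    vc_formula={"group": "0 + C(group_id)",         # random intercepts for groups
                "participant": "0 + C(participant_id)"},  # ...and participants
)
result = model.fit()
print(result.summary())
```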
All model output is presented in Table 4. We found musical sophistication was a significant predictor of individuals’ difference scores. Interestingly, we also found a significant effect for the sadness scale of the BANPS (Barrett et al., 2013). Overall, the variance explained by the fixed effects, though significant, is nonetheless small (∼3%), while the random effects (adaptivity, participants, groups, experiment) account for much more variance (∼48%) in the tapping data.
The sadness scale is known to be related to neuroticism from the five-factor model, as well as to the behavioral inhibition system from reinforcement sensitivity theory (Barrett et al., 2013). Behaviors associated with the sadness system are often related to loss and grieving, separation distress, or the breaking of social bonds; these behaviors stand in a somewhat antagonistic relationship with the play system and its related behaviors. More concretely, higher sadness scores are associated with social phobia and negative affect, and show a negative relationship with self-esteem (Barrett et al., 2013). Electrical brain stimulation studies have implicated areas from the dorsal periaqueductal gray to the anterior cingulate, and the role of neurotransmitters associated with social bonding (i.e., endogenous opioids, oxytocin, and prolactin), in the sadness system, which evolved over a hundred million years ago (e.g., in birds); see Panksepp (2010). The fact that sadness scores in the current study predicted an individual’s distance from the group, even when accounting for other factors (musicianship, adaptivity, etc.), is perhaps indicative of the way physiological/affective states related to sadness affect one’s overall ability to connect with others—a situation with both psychological and neurochemical underpinnings. Indeed, oxytocin has recently been suggested to improve predictive sensorimotor abilities in dyadic tapping contexts (Gebauer et al., 2016). In general, high sadness might involve low overall arousal, which may translate to less attention, motivation, or motoric responsivity in trying to align with a beat. Future studies should more directly investigate the possible links between person-level variables and tapping performance in social contexts.
A second exploratory analysis aimed to predict participants’ individual groove ratings from their SD asynchrony difference from the group. Note that we used groove ratings because groove was the scale we were most interested in and because this analysis was at the single-participant level (i.e., the factor analysis to obtain Enjoyment scores reported earlier had been done on the group-average level; obtaining factor scores for individuals would require re-running the factor analysis in a multi-level way, which is beyond the scope of these exploratory analyses). Groove rating was the dependent variable, individual-group SD asynchrony the independent variable, and all random effects were as specified in the previous model (adaptivity nested in participant, group, and experiment). The full model is reported in Table 5.
Individual differences in SD asynchrony from the group significantly predicted individual groove ratings, though this effect was quite small (explaining 0.9% of variance). With random effects included, the variance explained by the model increased to ∼8%. This low percentage of variance explained perhaps indicates that while individual performance in a group context is a predictor of individual subjective experience, complex subjective states, like being in the groove, are also much more than a function of individual performance when in a group context.
We also wish to note the asymmetry with respect to individual SD async differences: Having a negative asynchrony difference is predicted to be more groove-inducing than the opposite (positive value = less groove); see Figure 7. At first this finding may seem confusing, as one may think a difference from the group is just that, and the sign should not necessarily matter. To explore the underlying source of this important asymmetry further, we asked whether tappers with negative SD asynchronies with respect to the group tended to be the better tappers in the group. We assigned a tapper rank for each run, by sorting the tappers in the group by the absolute value of their individual minus group asynchronies. For example, if, on a given run, a group’s four tappers had the following SD asynchrony differences [10.9, −7.8, 6.5, 15.4], their ranks would be [3, 2, 1, 4]. We then compared this ranking (based on absolute value) with the signed differences. Out of the 1,627 runs across all groups for this analysis, only 424 runs (26%) had tappers exhibiting negative SD async differences. Of those 424 runs, 269 (63%) were from a tapper who ranked 1 (144/424) or 2 (125/424) for that run, suggesting there was an asymmetry in the type of tapper that tends to have a negative SD asynchrony difference to the group (N.B. there were 149 unique tappers in this pool). To further confirm the relationship between rank and signed SD asynchrony difference, we ran a linear mixed model with signed individual minus group SD async difference as the dependent variable and rank as the predictor (with all random effects as before). The overall model is significant; with rank 1 as the referent, tappers of ranks 3 and 4 have significantly higher SD asynchrony differences (see Table 6 and Figure 8).
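The ranking step can be made concrete with a short sketch (Python), using the worked example above; this reproduces the ranks [3, 2, 1, 4] for the differences [10.9, −7.8, 6.5, 15.4].

```python
# Rank tappers within a run by |individual minus group SD asynchrony|
# (rank 1 = most similar to the group), then inspect the sign of the raw value.
import numpy as np

def rank_tappers(sd_diffs):
    diffs = np.asarray(sd_diffs, dtype=float)
    return np.abs(diffs).argsort().argsort() + 1    # 1-based ranks

diffs = [10.9, -7.8, 6.5, 15.4]
ranks = rank_tappers(diffs)                          # -> array([3, 2, 1, 4])
negative_top_two = [(d, r) for d, r in zip(diffs, ranks) if d < 0 and r <= 2]
```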
Figure 7. Predicted groove ratings as a function of individuals’ SD async differences from the group, adjusted for random effects of adaptivity, participant, group, and experiment (see model reported in Table 5).
Figure 8. Predicted individual minus group SD asynchrony as a function of individual rank in the group, adjusted for random effects of adaptivity, participant, group, and experiment (see Table 6).
Upon further reflection, an interpretation for this asymmetric finding begins to come into focus: Tappers whose individual minus group SD asynchrony is negative are more stable tappers than the group. While it was often the case that the group was more stable than any individual (all individuals with positive SD async differences), over a quarter of the time, there were some individuals who outperformed the group—such performance was associated with higher groove ratings. Future studies might more directly explore these relations, as well as the way individual vs. group performance might change as a function of group size.
General Discussion
To probe the psychological effects of motor synchronization among individuals and groups of tappers, we developed an open-source, multi-person adaptive metronome. Across five experiments, individuals and groups tapped along with our adaptive metronome, while we manipulated both metronome adaptivity and auditory feedback. In an initial proof-of-concept experiment, we replicated the findings of Fairhurst et al. (2013) using our new device. Specifically, we showed that at optimal levels of adaptivity (25–50%), individuals achieve improved motor synchrony with the metronome, compared to non-adaptive conditions, and that subjective enjoyment can be manipulated as a function of metronome adaptivity. In a second experiment, we extended these single person tapping findings by showing synchrony with the metronome is further enhanced with the use of auditory feedback (participants hearing a sound produced by their own tap). In this experiment, asynchronies were lower in all conditions, compared to Experiment 1. Hence, depriving a person of self-auditory feedback induces a significant synchronization cost. However, auditory feedback evokes mixed effects in terms of subjective experience: no feedback results in higher subjective enjoyment when the metronome does not adapt (adaptivity = 0%); however, participants report much less enjoyment at higher levels of adaptivity, when they have no feedback. Conversely, with auditory feedback, subjective enjoyment can be enhanced via use of optimally adaptive metronome conditions.
In experiments 3–5, we tested synchronization abilities among groups of four people tapping together. We showed that the synchronization abilities of the group improve with both optimal levels of adaptivity and auditory feedback. Hearing oneself and/or the others in the group led to greater stability in tapping performance, compared to no feedback, especially at higher levels of adaptivity. The role of auditory feedback in improving tapping performance is in line with previous results (Goebl and Palmer, 2009; Konvalinka et al., 2010; Schultz and Palmer, 2019; N.B. these studies involve tapping continuation paradigms and do not use an adaptive metronome). Interestingly, though, while hearing everyone in the group is most similar to an actual music-making situation, and perhaps most ideal in terms of social engagement, tapping performance was best when participants could only hear themselves (not all others in the group). This finding makes sense when considering the likely perceptual interference caused by hearing others’ taps. This logic is in line with the results of a previous dyadic tapping study which showed poorer synchronization performance when tappers could hear each other (Konvalinka et al., 2010), as well as Versaci and Laje’s (2021) findings that auditory feedback decreases re-synchronization accuracy after period perturbation. While having information about others’ taps would be relevant for synchronizing appropriately with the metronome in the adaptive conditions, it is possible hearing all taps was distracting for overall performance and/or that masking occurred because all participants produced the same tap sound (Meyer, 2009). Future studies should explore whether assigning each participant a different instrument timbre affects these results. Overall, these findings point to the importance of considering the type of auditory feedback participants receive when designing future tapping studies.
With regard to subjective experience, we found that the factor structure we identified in the single-person experiments replicated in the group experiments. Namely, feeling in the groove, feeling in synchrony with the metronome, feeling in synchrony with the group, and wanting to continue the task all loaded positively onto a one-factor solution, while experiencing the task as difficult loaded negatively. We termed this factor Enjoyment and found that, while Enjoyment was easy to modulate in the individual tapper experiments, in group experiments it was only with no auditory feedback and at higher levels of adaptivity that we were able to reduce subjective enjoyment. With auditory feedback, enjoyment stayed relatively flat, not moving significantly from baseline, though trending up or down depending on whether the adaptivity level was optimal. However, those enjoyment scores were averaged across the whole group.
In exploratory analyses, we related an individual’s groove ratings on any given trial to their personal SD asynchrony distance from the group on that trial. We found that the lower the SD asynchrony difference, the more in the groove participants felt. We also found that their SD asynchrony difference from the group was predicted by their Gold-MSI score and the sadness scale of the BANPS. Such exploratory findings relating individual differences to group performance and subjective experience should be more systematically explored in future studies by, for example, comparing groups of people matched or mismatched in musical ability and/or some specific personality feature of interest, or by assigning people with certain personality traits to certain musical roles. Importantly, although musical sophistication was relevant in predicting an individual’s performance, we showed that individuals and groups benefited from optimal metronome adaptivity regardless of their musical abilities.
Future Directions
We have shown that the multi-person adaptive metronome helps to bring groups of people into greater synchrony, and we know from previous studies that such alignment increases social bonding and cooperative behavior. Future studies should directly investigate this connection using the adaptive metronome. For example, to further study the social utility of interacting with a multi-person adaptive metronome, one could have participants complete tasks such as the public goods game (e.g., Fischbacher and Gachter, 2010) or a facial emotion recognition task (Passarelli et al., 2018) after rounds of tapping. Such experiments could additionally manipulate person-level variables to study group affiliation, for example, to test whether the adaptive metronome can enhance cohesion between people with conflicting viewpoints, identities, and/or milieus.
Others might also consider studying the effects of group size on synchronization dynamics. For example, in a previous tapping continuation task, Okano et al. (2017) showed that dyads tend to show a greater increase in tapping speed over time than solo tappers. These dynamics could now be explored in an adaptive context, with more tappers. Such medium-size group research has been suggested to be particularly fruitful for uniting large- and small-scale theories of coordination dynamics, under a framework combining the Kuramoto and Haken–Kelso–Bunz equations (Kelso, 2021), or for studying coordination at local vs. global timescales (Okano et al., 2019). We wish to note, however, that at present the group size with our current system is limited to four tappers because of the limited number of input/output pins available on the Arduino Uno. While this hardware allows the study of group sizes from 2 to 4 people, those interested in exploring the dynamics of larger groups should implement the adaptive metronome code on microcontrollers with a greater number of I/O pins.
Future studies should also experiment with different real-time adaptive algorithms. For example, it would be possible to differentially favor the best or worst participants via a weighted average of participant asynchronies (sketched below), or to adjust the volume of the metronome based on participant performance. Additionally, others might consider using more interesting repeating patterns for the metronome, such as the clave son pattern, and asking participants to tap along. It is likely that more interesting rhythms will lead to even greater engagement with the task and feelings of group affiliation, as previous studies have shown that rhythmic music has greater effects on prosocial behavior than a purely isochronous metronome (Stupacher et al., 2017a,b) and that rhythmic complexity modulates synchronization abilities (Mathias et al., 2020). It would also be possible to have participants tap freely and have the metronome come in as an additional player, based on the rhythmic characteristics of the participants’ tapping, though this would require significant additional programming and perhaps a microcontroller with more random-access memory than the Arduino Uno.
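As a hypothetical illustration of the weighted-average idea (not an implemented GEM algorithm), the group correction sketched earlier could be generalized as follows, with the weighting scheme (e.g., inverse tapping variability to favor the most stable tappers) left to the experimenter.

```python
# Hypothetical weighted variant of the group phase correction: each tapper's
# asynchrony is weighted (e.g., by inverse tapping variability) before the
# alpha correction is applied; weights are normalized to sum to one.
def weighted_next_onset(prev_onset_ms, ioi_ms, asynchronies_ms, weights, alpha):
    total = sum(weights)
    weighted_async = sum(w * a for w, a in zip(weights, asynchronies_ms)) / total
    return prev_onset_ms + ioi_ms + alpha * weighted_async
```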
Without developing new metronome algorithms, researchers might still experiment with sonic and group dynamics, using the existing system as is. For instance, one could use different timbral or registral qualities of the metronome and tapping sounds to influence participant dynamics [see Keller and Repp (2008) for single person example]. Likewise, the attack, duration, and frequency of a sound are known to affect the perception of the center of the beat and tapping synchronization with the beat (see e.g., Hove et al., 2007; Danielsen et al., 2019); such dynamics could now be explored in a multi-person context, with sounds varying in these different features assigned to different participants and/or the metronome. Similarly, one could assign certain tappers certain roles and investigate leader-follower dynamics in dyads, triads, and quartets. As visual information has been shown to influence synchronization dynamics and groove ratings (see e.g., Tognoli et al., 2007; Dotov et al., 2021), participants’ visual information with respect to each other could also be manipulated. Single finger or bimanual tapping could be used, different metronome tempi could be investigated, and so on. Further, the effects of the multi-person adaptive metronome could be explored in groups of individuals sharing (or not) similar preferred endogenous tempi and/or musical expertise, as previous work points to the importance of both of these factors in influencing synchronization abilities (Zamm et al., 2016; Schultz and Palmer, 2019; Scheurich et al., 2020). Additionally, recent work reveals systematic differences in tapping synchronization abilities in neurodevelopmental (Vishne et al., 2021) and neurodegenerative (Janzen et al., 2019; Curzel et al., 2021) disorders; the adaptive metronome may, therefore, be a useful therapeutic device in clinical contexts.
In summary, this low-latency, low-cost, adaptive metronome system holds promise for bringing groups of people into greater motoric and psychological alignment in a variety of contexts. For example, it could be used in an assistive context, so that those with difficulty perceiving and tapping to a beat, or with little musical training, can easily come together to have a musical experience and feel connected; or in a pedagogical context, so that those just starting out in music might experience less frustration when learning to keep time accurately and might more quickly come to experience the feeling of being in synchrony with a metronome or in the groove with others. Most importantly, the multi-person adaptive metronome, which we internally refer to as GEM: the Groove Enhancement Machine, allows for studying, in a controlled way, the variables which may impact motor synchronization, subjective experience, and social bonding in a group context. In making the code and wiring diagram for GEM publicly available (see GitHub repository), we hope others will join us in building out the capabilities of the metronome system, and in carrying out future experiments probing the psychological and neural underpinnings of interpersonal synchronization.
Data Availability Statement
The datasets presented in this study can be found in online repositories. All code required to program the Arduinos and create the GEM system is available at: https://github.com/janatalab/GEM/releases/tag/v1.0.0. All code required to run the experiments reported in this paper and to recreate the statistical analyses is available at: https://github.com/janatalab/GEM-Experiments-POC.
Ethics Statement
The studies involving human participants were reviewed and approved by University of California, Davis, Institutional Review Board. The patients/participants provided their written informed consent to participate in this study.
Author Contributions
LF: methodology, software, validation, data curation, investigation, formal analysis, visualization, and writing—original draft. PA: methodology, software, validation, data curation, formal analysis, and writing—review and editing. PJ: conceptualization, methodology, software, validation, data curation, formal analysis, visualization, writing—review and editing, supervision, resources, project administration, and funding acquisition. All authors contributed to the article and approved the submitted version.
Funding
Research reported in this publication was supported by the National Academies Keck Futures Initiative of the National Academy of Sciences under award number NAFKI ADSEM6.
Conflict of Interest
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
Publisher’s Note
All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.
Acknowledgments
We would like to thank Lily Brown and Ruby Nguyen for their assistance with data collection. We would also like to thank Wisam Reid, David Miranda, Jonathan Berger, Scott Auerbach, and Kiju Lee for their design input and assistance.
Supplementary Material
The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fnhum.2022.916551/full#supplementary-material
References
Adafruit (2018). Available online at: https://www.adafruit.com/product/94 (accessed July 2016).
Anshel, A., and Kipper, D. A. (1988). The influence of group singing on trust and cooperation. J. Music Ther. 25, 145–155. doi: 10.1093/jmt/25.3.145
Arduino (2018). Available online at: https://store.arduino.cc/usa/arduino-uno-rev3/ (accessed July 2016).
Aschersleben, G. (2002). Temporal control of movements in sensorimotor synchronization. Brain Cogn. 48, 66–79. doi: 10.1006/brcg.2001.1304
Aschersleben, G., and Prinz, W. (1995). Synchronizing actions with events: the role of sensory information. Percept. Psychophys. 57, 305–317. doi: 10.3758/bf03213056
Barrett, F. S., Robins, R. W., and Janata, P. (2013). A brief form of the affective neuroscience personality scales. Psychol. Assess. 25, 826–843. doi: 10.1037/a0032576
Bates, D., Maechler, M., Bolker, B., and Walker, S. (2015). Fitting linear mixed effects models using lme4. J. Stat. Softw. 67, 1–48. doi: 10.18637/jss.v067.i01
Bezanson, J., Karpinski, S., Shah, V. B., and Edelman, A. (2012). Julia: A Fast Dynamic Language for Technical Computing. Available online at: https://julialang.org/ (accessed January, 2018).
Biggs, J. (2020). Factor Analyzer. Available online at: https://github.com/EducationalTestingService/factor_analyzer (accessed March, 2020).
Cirelli, L. K., Einarson, K. M., and Trainor, L. J. (2014a). Interpersonal synchrony increases prosocial behavior in infants. Dev. Sci. 17, 1003–1011. doi: 10.1111/desc.12193
Cirelli, L. K., Wan, S., and Trainor, L. J. (2014b). Fourteen-month- old infants use interpersonal synchrony as a cue to direct helpfulness. Phil. Trans. R. Soc. B 369:20130400. doi: 10.1098/rstb.2013.0400
Curzel, F., Brigadoi, S., and Cutini, S. (2021). fNIRS and e-drum: an ecological approach to monitor hemodynamic and behavioural effects of rhythmic auditory cueing training. Brain Cogn. 151:105753. doi: 10.1016/j.bandc.2021.105753
Cycling’74 (2014). Max/MSP 7.0, Jitter 1.2.3 Graphical Audio and Video Environment [Computer Program]. San Francisco, CA: Cycling’74.
Danielsen, A., Nymoen, K., Anderson, E., Câmara, G. S., Langerød, M. T., Thompson, M. R., et al. (2019). Where is the beat in that note? Effects of attack, duration, and frequency on the perceived timing of musical and quasi-musical sounds. J. Exp. Psychol. Hum. Percept. Perform. 45, 402–418. doi: 10.1037/xhp0000611
Davis, K. L., and Panksepp, J. (2011). The brain’s emotional foundations of human personality and the affective neuroscience personality scales. Neurosci. Biobehav. Rev. 35, 1946–1958. doi: 10.1016/j.neubiorev.2011.04.004
Davis, K. L., Panksepp, J., and Normansell, L. (2003). The affective neuroscience personality scales: normative data and implications. Neuropsychoanalysis 5, 57–69. doi: 10.1080/15294145.2003.10773410
Demos, A. P., Layeghi, H., Wanderley, M. M., and Palmer, C. (2019). Staying together: a bidirectional delay–coupled approach to joint action. Cogn. Sci. 43:e12766. doi: 10.1111/cogs.12766
Dotov, D., Bosnyak, D., and Trainor, L. J. (2021). Collective music listening: movement energy is enhanced by groove and visual social cues. Q. J. Exp. Psychol. 74, 1037–1053.
Elliott, M. T., Welchman, A. E., and Wing, A. M. (2009). MatTAP: a MATLAB toolbox for the control and analysis of movement synchronisation experiments. J. Neurosci. Methods 177, 250–257. doi: 10.1016/j.jneumeth.2008.10.002
Fairhurst, M. T., Janata, P., and Keller, P. E. (2013). Being and feeling in sync with an adaptive virtual partner: brain mechanisms underlying dynamic cooperativity. Cereb. Cortex 23, 2592–2600. doi: 10.1093/cercor/bhs243
Fairhurst, M. T., Janata, P., and Keller, P. E. (2014). Leading the follower: an fMRI investigation of dynamic cooperativity and leader-follower strategies in synchronization with an adaptive virtual partner. Neuroimage 84, 688–697. doi: 10.1016/j.neuroimage.2013.09.027
Finney, S. A. (2001). FTAP: a Linux-based program for tapping and music experiments. Behav. Res. Methods Instr. Comput. 33, 65–72. doi: 10.3758/bf03195348
Fischbacher, U., and Gachter, S. (2010). Social preferences, beliefs, and the dynamics of free riding in public goods experiments. Am. Econ. Rev. 100, 541–556.
Gebauer, L., Witek, M. A., Hansen, N. C., Thomas, J., Konvalinka, I., and Vuust, P. (2016). Oxytocin improves synchronisation in leader-follower interaction. Sci. Rep. 6:38416. doi: 10.1038/srep38416
Goebl, W., and Palmer, C. (2009). Synchronization of timing and motion among performing musicians. Music Percept. 26, 427–438.
Greenfield, M. D. (2005). Mechanisms and evolution of communal sexual displays in arthropods and anurans. Adv. Study Behav. 35, 1–62.
Hove, M. J., and Risen, J. L. (2009). It’s all in the timing: interpersonal synchrony increases affiliation. Soc. Cogn. 27, 949–961.
Hove, M. J., Keller, P. E., and Krumhansl, C. L. (2007). Sensorimotor synchronization with chords containing tone-onset asynchronies. Percept. Psychophys. 69, 699–708. doi: 10.3758/BF03193772
Hurley, B. K., Martens, P. A., and Janata, P. (2014). Spontaneous sensorimotor coupling with multipart music. J. Exp. Psychol. 40:1679. doi: 10.1037/a0037154
Janata, P., Tomic, S. T., and Haberman, J. M. (2012). Sensorimotor coupling in music and the psychology of the groove. J. Exp. Psychol. Gen. 141:54. doi: 10.1037/a0024208
Janzen, T. B., Haase, M., and Thaut, M. H. (2019). Rhythmic priming across effector systems: a randomized controlled trial with Parkinson’s disease patients. Hum. Mov. Sci. 64, 355–365. doi: 10.1016/j.humov.2019.03.001
Keller, P. E., and Repp, B. H. (2008). Multilevel coordination stability: integrated goal representations in simultaneous intra-personal and inter-agent coordination. Acta Psychol. (Amst) 128, 378–386. doi: 10.1016/j.actpsy.2008.03.012
Kelso, J. A. S. (2021). Unifying large- and small-scale theories of coordination. Entropy 23:537. doi: 10.3390/e23050537
Kim, H.-S., Kaneshiro, B., and Berger, J. (2012). “Tap-It: an iOS app for sensori-motor synchronization experiments,” in Paper presented at the 12th International Conference on Music Perception and Cognition, (Thessaloniki).
Kirschner, S., and Tomasello, M. (2010). Joint music making promotes prosocial behavior in 4-year-old children. Evol. Hum. Behav. 31, 354–365. doi: 10.1016/j.evolhumbehav.2010.04.004
Kniffen, K., Yan, J., Wansink, B., and Schulze, W. (2017). The sound of cooperation: musical influences on cooperative behavior. J. Organ. Behav. 38, 372–390. doi: 10.1002/job.2128
Kokal, I., Engel, A., Kirschner, S., and Keysers, C. (2011). Synchronized drumming enhances activity in the caudate and facilitates prosocial commitment-if the rhythm comes easily. PLoS One 6:e27272. doi: 10.1371/journal.pone.0027272
Konvalinka, I., Vuust, P., Roepstorff, A., and Frith, C. D. (2010). Follow you, follow me: continuous mutual prediction and adaptation in joint tapping. Q. J. Exp. Psychol. 63, 2220–2230. doi: 10.1080/17470218.2010.497843
Levenson, H. (1981). “Differentiating among internality, powerful others, and chance,” in Research With the Locus of Control Construct, Vol. 1, ed. H. M. Lefcourt (New York, NY: Academic Press), 15–63.
Lüdecke, D. (2021). sjPlot: Data Visualization for Statistics in Social Science. R Package Version 2.8.10. Available online at: https://CRAN.R-project.org/package=sjPlot (accessed November 26, 2021).
Mathias, B., Zamm, A., Gianferrara, P. G., Ross, B., and Palmer, C. (2020). Rhythm complexity modulates behavioral and neural dynamics during auditory–motor synchronization. J. Cogn. Neurosci. 32, 1864–1880. doi: 10.1162/jocn_a_01601
Merchant, H., Grahn, J., Trainor, L., Rohrmeier, M., and Fitch, W. T. (2015). Finding the beat: a neural perspective across humans and non-human primates. Philos. Trans. R. Soc. Lond. B Biol. Sci. 370:20140093. doi: 10.1098/rstb.2014.0093
Merker, B. H., Madison, G. S., and Eckerdal, P. (2009). On the role and origin of isochrony in human rhythmic entrainment. Cortex 45, 4–17. doi: 10.1016/j.cortex.2008.06.011
Meyer, J. (2009). Acoustics and the Performance of Music: Manual for Acousticians, Audio Engineers, Musicians, Architects and Musical Instrument Makers. New York, NY: Springer Science and Business Media.
Mogan, R., Fischer, R., and Bulbulia, J. A. (2017). To be in synchrony or not? A meta-analysis of synchrony’s effects on behavior, perception, cognition and affect. J. Exp. Soc. Psychol. 72, 13–20. doi: 10.1016/j.jesp.2017.03.009
Müllensiefen, D., Gingras, B., Musil, J., and Stewart, L. (2014). The musicality of non-musicians: an index for assessing musical sophistication in the general population. PLoS One 9:e89642. doi: 10.1371/journal.pone.0089642
Novembre, G., Varlet, M., Muawiyath, S., Stevens, C. J., and Keller, P. E. (2015). The E-music box: an empirical method for exploring the universal capacity for musical production and for social interaction through music. R. Soc. Open Sci. 2:150286. doi: 10.1098/rsos.150286
Ogata, T., Katayama, T., and Ota, J. (2019). Cross-feedback with partner contributes to performance accuracy in finger-tapping rhythm synchronization between one leader and two followers. Sci. Rep. 9:7800. doi: 10.1038/s41598-019-43352-x
Okano, M., Kurebayashi, W., Shinya, M., and Kudo, K. (2019). Hybrid dynamics in a paired rhythmic synchronization–continuation task. Phys. A Stat. Mech. Appl. 524, 625–638.
Okano, M., Shinya, M., and Kudo, K. (2017). Paired synchronous rhythmic finger tapping without an external timing cue shows greater speed increases relative to those for solo tapping. Sci. Rep. 7:43987. doi: 10.1038/srep43987
Panksepp, J. (2010). Affective neuroscience of the emotional BrainMind: evolutionary perspectives and implications for understanding depression. Dialogues Clin. Neurosci. 12:533. doi: 10.31887/DCNS.2010.12.4/jpanksepp
Passarelli, M., Masini, M., Bracco, F., Petrosino, M., and Chiorri, C. (2018). Development and validation of the Facial Expression Recognition Test (FERT). Psychol. Assess. 30, 1479–1490. doi: 10.1037/pas0000595
Provasi, J., Anderson, D. I., and Barbu-Roth, M. (2014). Rhythm perception, production, and synchronization during the perinatal period. Front. Psychol. 5:1048. doi: 10.3389/fpsyg.2014.01048
Rabinowitch, T. C., and Cross, I. (2018). Joint rhythmic tapping elicits distinct emotions depending on tap timing and prior musical training. Emotion 19, 808–817. doi: 10.1037/emo0000474
Rasch, R. A. (1979). Synchronization in performed ensemble music. Acta Acust. United Acust. 43, 121–131.
Ravignani, A., Bowling, D. L., and Fitch, W. T. (2014). Chorusing, synchrony, and the evolutionary functions of rhythm. Front. Psychol. 5:1118. doi: 10.3389/fpsyg.2014.01118
Repp, B. H. (2005). Sensorimotor synchronization: a review of the tapping literature. Psychon. Bull. Rev. 12, 969–992. doi: 10.3758/bf03206433
Repp, B. H., and Doggett, R. (2007). Tapping to a very slow beat: a comparison of musicians and non-musicians. Music Percept. 24, 367–376.
Repp, B. H., and Keller, P. E. (2008). Sensorimotor synchronization with adaptively timed sequences. Hum. Mov. Sci. 27, 423–456. doi: 10.1016/j.humov.2008.02.016
Repp, B. H., and Su, Y. H. (2013). Sensorimotor synchronization: a review of recent research (2006-2012). Psychon. Bull. Rev. 20, 403–452. doi: 10.3758/s13423-012-0371-2
Scheurich, R., Pfordresher, P. Q., and Palmer, C. (2020). Musical training enhances temporal adaptation of auditory-motor synchronization. Exp. Brain Res. 238, 81–92. doi: 10.1007/s00221-019-05692-y
Schultz, B. G., and Palmer, C. (2019). The roles of musical expertise and sensory feedback in beat keeping and joint action. Psychol. Res. 83, 419–431. doi: 10.1007/s00426-019-01156-8
Schultz, B. G., and van Vugt, F. T. (2016). Tap Arduino: an Arduino microcontroller for low-latency auditory feedback in sensorimotor synchronization experiments. Behav. Res. Methods 48, 1591–1607. doi: 10.3758/s13428-015-0671-3
Seabold, S., and Perktold, J. (2010). “Statsmodels: econometric and statistical modeling with python,” in Proceedings of the 9th Python in Science Conference. Austin, TX.
Stupacher, J., Maes, P. J., Witte, M., and Wood, G. (2017a). Music strengthens prosocial effects of interpersonal synchronization – If you move in time with the beat. J. Exp. Soc. Psychol. 72, 39–44. doi: 10.1016/j.jesp.2017.04.007
Stupacher, J., Wood, G., and Witte, M. (2017b). Synchrony and sympathy: social entrainment with music compared to a metronome. Psychomusicol. Music Mind Brain 27, 158–166. doi: 10.1037/pmu0000181
Thompson, E. C., White-Schwoch, T., Tierney, A., and Kraus, N. (2015). Beat synchronization across the lifespan: intersection of development and musical experience. PLoS One 10:e0128839. doi: 10.1371/journal.pone.0128839
Timmers, R., Macritchie, J., Schabrun, S. M., Thapa, T., Verlet, M., and Keller, P. (2020). Neural multimodal integration underlying synchronization with a co-performer in music: influences of motor expertise and visual information. Neurosci. Lett. 721:134803. doi: 10.1016/j.neulet.2020.134803
Tognoli, E., Lagarde, J., DeGuzman, G. C., and Kelso, J. A. (2007). The phi complex as a neuromarker of human social coordination. Proc. Natl. Acad. Sci. U.S.A. 104, 8190–8195. doi: 10.1073/pnas.0611453104
Tomic, S. T., and Janata, P. (2007). Ensemble: a Web-based system for psychology survey and experiment management. Behav. Res. Methods 39, 635–650. doi: 10.3758/bf03193036
Valdesolo, P., Ouyang, J., and DeSteno, D. (2010). The rhythm of joint action: synchrony promotes cooperative ability. J. Exp. Soc. Psychol. 46, 693–695. doi: 10.1016/j.jesp.2010.03.004
Vallat, R. (2018). Pingouin: statistics in Python. J. Open Source Softw. 3:1026. doi: 10.21105/joss.01026
van Noorden, L., and De Bruyn, L. (2009). “The development of synchronisation skills of children 3 to 11 years old,” in Paper presented at the 7th Triennial Conference of the European Society for the Cognitive Sciences of Music, (Jyväskylä).
van Vugt, F. T. (2020). The teensytap framework for sensorimotor synchronization experiments. Adv. Cogn. Psychol. 16:302. doi: 10.5709/acp-0304-y
Versaci, L., and Laje, R. (2021). Time-oriented attention improves accuracy in a paced finger-tapping task. Eur. J. Neurosci. 54, 4212–4229. doi: 10.1111/ejn.15245
Vishne, G., Jacoby, N., Malinovitch, T., Epstein, T., Frenkel, O., and Ahissar, M. (2021). Slow update of internal representations impedes synchronization in autism. Nat. Commun. 12:5439. doi: 10.1038/s41467-021-25740-y
Wan, S., Cirelli, L., Drouin, S., and Trainor, L. J. (2014). “How interpersonal synchrony directs infant helpfulness to encourage group cohesion,” in Paper Presented at the 10th NeuroMusic Conference, (Hamilton, ON: McMaster University).
Witek, M. A., Clarke, E. F., Wallentin, M., Kringelbach, M. L., and Vuust, P. (2014). Syncopation, body-movement and pleasure in groove music. PLoS One. 9:e94446. doi: 10.1371/journal.pone.0094446
Zamm, A., Pfordresher, P. Q., and Palmer, C. (2015). Temporal coordination in joint music performance: effects of endogenous rhythms and auditory feedback. Exp. Brain Res. 233, 607–615. doi: 10.1007/s00221-014-4140-5
Zamm, A., Wellman, C., and Palmer, C. (2016). Endogenous rhythms influence interpersonal synchrony. J. Exp. Psychol. 42, 611–616. doi: 10.1037/xhp0000201
Keywords: auditory feedback, tapping, social, individual differences, open-source, assistive device
Citation: Fink LK, Alexander PC and Janata P (2022) The Groove Enhancement Machine (GEM): A Multi-Person Adaptive Metronome to Manipulate Sensorimotor Synchronization and Subjective Enjoyment. Front. Hum. Neurosci. 16:916551. doi: 10.3389/fnhum.2022.916551
Received: 09 April 2022; Accepted: 24 May 2022;
Published: 15 June 2022.
Edited by:
Viktor Müller, Max Planck Institute for Human Development, Germany
Reviewed by:
Rodrigo Laje, National University of Quilmes, Argentina
Lira Yu, The University of Tokyo, Japan
Kazutoshi Kudo, The University of Tokyo, Japan
Copyright © 2022 Fink, Alexander and Janata. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
*Correspondence: Lauren K. Fink, lauren.fink@ae.mpg.de; Petr Janata, pjanata@ucdavis.edu