Robust Temporal Averaging of Time Intervals Between Action and Sensation

Zeng, Huanke; Chen, Lihan

doi:10.3389/fpsyg.2019.00511

ORIGINAL RESEARCH article

Front. Psychol., 19 March 2019

Sec. Cognition

Volume 10 - 2019 | https://doi.org/10.3389/fpsyg.2019.00511

This article is part of the Research Topic Temporal Cognition: Its Development, Neurocognitive Basis, Relationships to Other Cognitive Domains, and Uniquely Human Aspects

Huanke Zeng

Lihan Chen^*

School of Psychological and Cognitive Sciences, Beijing Key Laboratory of Behavior and Mental Health, Peking University, Beijing, China

Perception of the time interval between one’s own action (a finger tap) and the associated sensory feedback (a visual flash or an auditory beep) is critical for precise and flexible control of action and for behavioral decisions. Previous studies have examined temporal averaging for multiple time intervals and its role in perceptual organization and crossmodal integration. In the present study, we extended temporal averaging from sensory stimuli to the coupling of action and its sensory feedback. We investigated whether and how temporal averaging could be achieved over the multiple intervals in a sequence of action-sensory feedback events, and hence affect subsequent timing behavior. In the unimodal task, participants voluntarily tapped their index finger at a constant pace while receiving auditory feedback (beeps) with varied intervals as well as variances throughout the sequence. In the crossmodal task, for a given sequence, each tap was randomly accompanied by either a visual flash or an auditory beep as sensory feedback. When the sequence was over, observers produced a subsequent tap with either an auditory or a visual stimulus, which enclosed a probe interval. In both tasks, participants were required to make a two-alternative forced choice (2AFC) to indicate whether the target interval was shorter or longer than the mean interval between taps and their associated sensory events in the preceding sequence. In both scenarios, participants’ judgments of the probe interval suggested that they had internalized the mean interval associated with specific bindings of action and sensation, showing a robust temporal averaging process for the interval between action and sensation.

Introduction

Perception of the time interval between one’s action and its sensory feedback (such as a visual flash or an auditory beep), i.e., sensorimotor timing, is critical for daily perception, behavioral decisions and even human living (Repp, 2005). Two prominent examples of sensorimotor timing are sensorimotor synchronization (Aschersleben and Bertelson, 2003; Repp, 2005, 2006a,b) and the temporal recalibration effect (TRE) (Stekelenburg et al., 2011; Sugano et al., 2012, 2014, 2016, 2017). In sensorimotor synchronization, observers produce tapping movements in synchrony with a sequence of isochronously (and continuously) repeated pacing signals, which are either light flashes or auditory beeps (Aschersleben and Bertelson, 2003). A typical finding in sensorimotor synchronization is that the timing of the taps is biased significantly more toward auditory signals than toward visual flashes when the taps are synchronized with continuous visual or auditory stimuli, indicating the preference of the perceptual system for continuous information with visual stimuli (Varlet et al., 2012; Armstrong and Issartel, 2014). The TRE, on the other hand, reflects the nature of “causality” between an action and its sensory feedback, as well as a time adaptation aftereffect. In a seminal study, Stetson et al. (2006) inserted a temporal delay between one’s own action (key presses) and the associated sensory feedback (visual flashes). Following a period of adaptation, however, flashes that appeared unexpectedly after the keypresses were often perceived as occurring before the keypresses (Stetson et al., 2006), demonstrating a recalibration effect for motor-sensory temporal order judgments.

In a typical sensorimotor synchronization task, observers usually tap in time with pacing signals that have a regular rhythm. However, the pacing rhythm is often not regular, in which case observers have to calculate the “mean” rhythm (as a temporal reference) for making and executing the subsequent prompted action decision, whether in temporal estimation or (re)production tasks. The ability to extract the average time interval in an action-sensory feedback sequence reflects individual timing sensitivity (the “temporal window” for sensory integration) and helps us adapt to environmental changes (Repp, 2005). The computation of the “mean,” i.e., the temporal averaging process, has been demonstrated in a number of contexts, including crossmodal interaction in recent studies (Cheng et al., 1996; Matell and Henning, 2013; Schweickert et al., 2014; De Corte and Matell, 2016a; Chen et al., 2018). One compelling example of temporal averaging is the central tendency effect within the broader framework of Bayesian optimization. In the central tendency effect, observers incorporate the mean of the statistical distribution of sensory properties, so that their estimates are assimilated (biased) toward the mean (Jazayeri and Shadlen, 2010; Burr et al., 2013; Shi et al., 2013; Karaminis et al., 2016; Roach et al., 2017). For example, the discrimination of a target sensory interval was biased toward the preceding time interval from a different modality (Burr et al., 2013), and the discrimination of visual apparent motion was modulated by the perceived mean inter-interval in the preceding auditory sequence (Chen et al., 2018; Wan and Chen, 2018).

The perception of the time interval between an action and its sensory feedback, in which the perception of time is biased toward the concurrent actions, differs from the perception of time intervals between purely sensory events. A recent study showed that motor timing during rhythmic tapping influences visual timing. Tomassini et al. (2018) asked participants to tap their finger with the same rhythm as a preceding sequence of four auditory tones. During finger tapping, they were presented with an empty visual interval and judged its duration against the previously established (internalized) interval of 150 ms. The perceived time was maximally expanded halfway between two consecutive finger taps, with the maximal expansion anchored to the center of the inter-tap interval. This distortion in time perception indicates that a timing mechanism exists to keep perception and action accurately synchronized (Tomassini et al., 2018). In another seminal study, Yon et al. (2017) investigated the influence of movement duration on the perceived duration of an auditory tone. The judgments of tone duration were attracted toward the duration of the executed movement: the tones were perceived to last longer when participants executed a movement of longer duration (Yon et al., 2017).

Temporal averaging raises empirical questions regarding the distribution of irregular (unequal) time intervals (De Corte and Matell, 2016a; Chen et al., 2018; Wan and Chen, 2018), the selective averaging of one of the sequences (Overduin et al., 2008), as well as potential capacity limits of simultaneous temporal processing (Cheng et al., 2014). Schweickert et al. (2014) demonstrated that observers estimated the average of tone durations and that their performance was influenced by the distribution of the tone durations. In general, the estimated averages were a linear function of the stimulus means. The estimates were accurate for the smallest population mean but were underestimates for the larger means, and human observers subjectively shortened the durations in memory (Schweickert et al., 2014). With multiple intervals, human observers could encode two different, and distinct, standard durations. In this case, temporal generalization with respect to one of the two standards was subject to the memory load of the temporal references as well as their variances (Jones and Wearden, 2004). Moreover, taking two consecutively presented standards as an example (A and B, each presented three times, with the duration of B 100 ms longer than that of A), certain combinations of delay and interference could render the memory of A unusable, so that a new standard (“false memory”) is constructed on the basis of the remembered relationship between A and B (Ogden et al., 2008). Therefore, the internal representation of temporal statistics depends on the distribution of time intervals and the variances of the intervals, and is affected by a potential memory mixing effect (due to the time delay as well as the interference among the many intervals being encoded).

In the current study, we examined the mechanisms of temporal averaging of the time intervals between an action and its sensory feedback (visual flash or auditory beep). First, we investigated how the mean and irregularity (variance) of the distribution of time intervals affect the perception of a target interval in the loop of action and its sensory feedback. Second, we examined how human observers can selectively average the sensory-specific time intervals in two sequences in which the actions were bound with either visual flashes or auditory beeps (Chen and Vroomen, 2013). Lastly, we examined the potential memory mixing effect induced by the memory load (and decay) and the inherent individual capacity limit of simultaneous temporal processing.

We implemented four experiments to address these issues. In Experiment 1, we examined the ability to extract the mean interval from a sound sequence and replicated the central tendency effect. In Experiments 2 and 3, we studied selective temporal averaging in which the actions were bound with two types of events: beeps of two different pitches, or two types of sensory stimuli (visual flashes and auditory beeps). In Experiment 2, we investigated whether observers could selectively separate the different mean action-auditory feedback intervals and hence compare the produced interval with the preceding duration-specific mean auditory intervals. To examine whether the ability of temporal averaging depends on the individual modalities (events), in Experiment 3 we used both auditory beeps and visual flashes as sensory feedback and examined the selectivity of temporal assimilation to either short or long mean intervals (actions associated with visual or auditory feedback). By averaging, human observers could take both the mean interval information and the variance of the intervals into account (Acerbi et al., 2012). In Experiment 4, we further examined whether the variability of the intervals (manipulated through the coefficient of variation, CV) affects the averaging of temporal information. The results from the four experiments largely support a robust temporal averaging process for time intervals between actions and their associated sensations. We further validated the effectiveness of temporal averaging of the intervals, as opposed to sampling from individual intervals (including the last interval of the action-sensation loops), and discussed the limited role of memory load in the averaging process with the current paradigms.

Materials and Methods

Stimuli and Apparatus

Auditory stimuli in a sound sequence were pure tones (30 ms, 500 Hz or 1000 Hz) presented at 65 dB SPL. Two pure tones of 2000 Hz were used as cueing signals. The starting cue (duration of 500 ms) prompted the beginning of a trial. The testing cue (for the last tap, duration of 200 ms) signaled the upcoming probe interval for discrimination (see the following procedure for more details).

The visual flash was a black disk (duration of 30 ms, 2.74 degrees in diameter, 11 cd/m² in luminance) appearing at the center of the screen on a gray background (16.8 cd/m² in luminance), presented on a 27-inch screen (ASUS PG278QR, NVIDIA GeForce GTX 1080 Ti graphics card). The viewing distance from the participants to the center of the monitor was 60 cm. Auditory stimuli were delivered through NVIDIA High Definition Audio. Participants wore a Sennheiser Momentum 2 headset to receive the sounds. We used RTBox v6 (Suzhou Litong Company Limited, China) to collect responses. The experimental program was written with Matlab (Mathworks Inc.) and the Psychophysics Toolbox (Brainard, 1997; Pelli, 1997; Kleiner et al., 2007).

In Experiment 1, only 500-Hz tones were used and the mean of the eight intervals between taps and tones (sensory feedback) was 800 ms. The eight sequential intervals were in the range of 600 to 1000 ms, and were drawn from a Gaussian distribution of N(800, 100). Using customized code, we composed each trial (sequence) to ensure that the coefficient of variation (CV, i.e., the ratio of the standard deviation to the mean) of all intervals was between 0.1 and 0.15, so as to largely randomize the temporal information while keeping it within the range in which human observers can perform the tasks. In Experiment 2, two mean intervals were used. The short interval (mean of 400 ms) was associated with the low-pitch tone (500 Hz) and the long interval (mean of 800 ms) was associated with the high-pitch tone (1000 Hz). The short sequential intervals were in the range of 200 to 600 ms, and were drawn from a Gaussian distribution of N(400, 100). The CV of the intervals was between 0.1 and 0.15. The mapping between tone pitch and mean interval was reversed in the other condition. In Experiment 3, configurations similar to those of Experiment 2 were used, except that both auditory and visual feedback were used. In Experiment 4, we designed two types of tap-tone sequences in which the mean tap-tone interval was kept at 800 ms. For one sequence, the taps were followed by low-pitch tones (500 Hz) with a low CV (between 0.1 and 0.15) of the intervals. For the other sequence, the taps were followed by high-pitch tones (1000 Hz) with a high CV (between 0.3 and 0.35). The CVs were chosen on the basis of previous evidence that human observers can perform the relevant tasks well in this range (Chen et al., 2018; Getty, 1975a,b).

For all the above experimental conditions, following the sequences of action-sensory feedback, participants pressed a button and generated an interval of 200, 400, 600, 800, 1000, 1200, or 1400 ms to compare with the preceding long mean interval (800 ms), or an interval of 100, 200, 300, 400, 500, 600, or 700 ms to compare with the preceding short mean interval (400 ms).
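
The sequence construction described above can be illustrated with a minimal Python sketch. The authors’ customized code is not given in the text, so everything here is an assumption made for illustration: the function name `draw_sequence`, the use of rejection sampling, the reading of N(800, 100) as a mean of 800 ms with a standard deviation of 100 ms, and the use of the sample standard deviation when computing the CV. The sketch does not force the mean of a given sequence to equal exactly 800 ms.

```python
import random
import statistics


def draw_sequence(mean_ms=800.0, sd_ms=100.0, n=8,
                  lo_ms=600.0, hi_ms=1000.0,
                  cv_min=0.10, cv_max=0.15,
                  rng=None, max_tries=100_000):
    """Rejection-sample n intervals (ms) that satisfy the range and CV limits.

    Hypothetical helper: the sampling scheme is an assumption, not the
    authors' procedure.
    """
    rng = rng or random.Random()
    for _ in range(max_tries):
        seq = [rng.gauss(mean_ms, sd_ms) for _ in range(n)]
        # Reject sequences containing any interval outside the allowed range.
        if not all(lo_ms <= x <= hi_ms for x in seq):
            continue
        # CV = standard deviation / mean (sample SD assumed here).
        cv = statistics.stdev(seq) / statistics.mean(seq)
        if cv_min <= cv <= cv_max:
            return seq
    raise RuntimeError("no sequence satisfied the constraints")


if __name__ == "__main__":
    long_seq = draw_sequence()  # long condition, 600 to 1000 ms
    short_seq = draw_sequence(mean_ms=400.0, lo_ms=200.0, hi_ms=600.0)
    print([round(x) for x in long_seq])
    print([round(x) for x in short_seq])
```

Rejection sampling is used here only because it maps directly onto the stated constraints (range and CV); other schemes that meet the same constraints would serve equally well.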

In the formal experiments, the preceding sequence contained two different intermixed durations, each cued by a different pitch or by a different sensory event (visual flash or auditory beep). In this context, people can extract and maintain a standard for each duration. The two standards might interact and may interfere somewhat with each other in memory. To examine whether there are perceptual shifts and response biases due to the mixing of the two sequences (standards), we further implemented control tests with the same tasks as in the formal experiments, but obtained the baseline data for the mean 400 and 800 ms interval conditions from separate groups of participants.

Procedure

The experiments were performed in compliance with the institutional guidelines set by the Academic Affairs Committee, School of Psychological and Cognitive Sciences, Peking University. The protocol was approved by the Committee for Protecting Human and Animal Subjects, School of Psychological and Cognitive Sciences, Peking University. All participants gave written informed consent in accordance with the Declaration of Helsinki, and were paid for their time on a basis of 40 CNY/hour, i.e., 6.3 United States dollars/hour.

In a preceding action-sensation sequence, participants made voluntary taps that triggered either auditory beeps or visual flashes as sensory feedback. This loop with multiple tap-sensation intervals (with a mean interval of 400 or 800 ms) served as a temporal reference for the subsequent comparison with the target interval (in a single action-sensation loop). The target interval was defined by a tap and its associated sensory feedback (visual flash or auditory beep). The target interval was 200, 400, 600, 800, 1000, 1200, or 1400 ms for the long mean duration (800 ms) condition and 100, 200, 300, 400, 500, 600, or 700 ms for the short mean duration (400 ms) condition. A typical trial started with a black fixation cross on the monitor screen, which appeared 500 ms before the first cueing tone and lasted until the second cueing tone was over. The first cueing beep (2000 Hz, 500 ms) indicated the start of the action-sensory feedback sequence and prompted the participants to start tapping within 3 s. Each tap was followed by either a visual flash or an auditory beep, for a total of eight action-sensation intervals (mean 400 ms or 800 ms). When the last sensory feedback was over, after a blank interval of 300 ms, participants heard a 2000 Hz beep (200 ms), which prompted a last tap that generated the target interval (with either a visual flash or an auditory beep) (Figure 1). We used the method of constant stimuli to compare the target interval duration with the mean action-sensation interval duration. Participants made a two-alternative forced choice (2AFC) with the RTBox to indicate which interval was longer: the mean action-sensation interval or the last target interval (Figure 1). The specific methods for each experiment are detailed below.

FIGURE 1

Figure 1. Stimulus configurations and schema for the experiments. (Upper): Experiments 1, 2, and 4. In a typical trial, upon hearing a beep, participants voluntarily pressed a button to trigger its sensory feedback (“beeps,” with the same or different pitches). When the sequence of multiple action-sensory events was over, another signaling beep prompted the participants to make a further press, which was followed by a last sensory feedback. At this point they were asked to judge whether the probe interval (between the offset of the action and the onset of the beep) was shorter or longer than the mean interval between the action and its sensory feedback. (Lower): The procedure for Experiment 3. The general procedure was the same as shown in the upper panel; however, the sensory feedback included mixed streams of visual flashes and auditory beeps. Participants were asked to compare the probe interval between tap and flash, or between tap and beep, with the corresponding mean of the preceding intervals of the same type. Detailed information is given in the main text.

Experiment 1

Thirteen participants (ages from 19 to 25, 6 males) took part in Experiment 1. We used 500 Hz tones as sensory feedback for participants’ voluntary taps. Participants first tapped eight times consecutively, with each tap followed by a 500 Hz auditory beep as sensory feedback. The time intervals between action and sensory feedback were not equal (mean interval of 800 ms and coefficient of variation of 0.1 to 0.15). The target interval was 200, 400, 600, 800, 1000, 1200, or 1400 ms. Participants completed two blocks of tests, each block containing seven trials for each target interval. Participants also received 14 practice trials, two for each target interval, to become familiar with the task.

The data from Experiment 1 served as a subset of the baseline data, in which only one type of auditory signal was used. Three further control experiments were implemented to provide baseline data in which only a single type of stimulus was presented eight times, i.e., 500 Hz tones with short intervals, visual flashes with long intervals (mean 800 ms) and visual flashes with short intervals (mean 400 ms). The control experiments were adapted from Experiment 1. In addition to the specific mappings of sensory feedback and intervals, in each control experiment participants received practice (visual feedback of “correct” or “wrong” after each response) until their accuracy was above 75%. The number of practice blocks was identical to that in the formal experiments. Thirteen participants (ages from 19 to 24, 5 males) took part in control experiment 1 (CE1). In CE1 (baseline corresponding to Experiment 2 and Experiment 3), the sensory feedback was 500 Hz auditory beeps, but the mean tap-beep interval was 400 ms. Thirteen participants (ages from 19 to 24, 3 males) took part in CE2. In CE2 (baseline for Experiment 3), we used visual flashes as sensory feedback for the taps. The mean tap-flash interval was 800 ms. Thirteen participants (ages from 18 to 24, 3 males) took part in CE3. In CE3 (baseline for Experiment 3), the tap-visual flash sequence was adopted with a mean tap-flash interval of 400 ms. For all the control experiments, after the preceding sequence was over, the probe interval was given and was always demarcated by a sensory event with the same properties as those in the sequence. The probe interval was 200, 400, 600, 800, 1000, 1200, or 1400 ms for the long mean duration (800 ms) condition, and 100, 200, 300, 400, 500, 600, or 700 ms for the short mean duration (400 ms) condition.

Experiment 2

Seventeen participants (ages from 20 to 25, 5 males) took part in Experiment 2. We used two kinds of auditory feedback (500 or 1000 Hz) and two sets of tap-sensation intervals (mean = 400 ms or mean = 800 ms; the CVs of both sets of intervals were 0.1 to 0.15). In one condition, short intervals were marked by 500 Hz tones and long intervals were marked by 1000 Hz tones. Nine participants were tested in this condition. In the other condition, eight participants were tested with the associations between intervals and tones reversed (short intervals with high-pitch tones and long intervals with low-pitch tones). In a tap-sensation sequence, the short and long intervals were mixed. Participants made eight taps, for which the ratio of short to long intervals was selected from one of the given sets (1:1, 3:5, 5:3). Participants were prompted to compare the target interval with the preceding mean interval of action-sensory feedback in four blocks, in which both the target interval and the preceding intervals between action and sensation were marked by tones of the same pitch. In each block, each of the seven target intervals was presented four times. Prior to the formal experiment, participants completed two practice tasks. In the first task, they practiced with both short and long mean intervals (but in a given sequence only 500 Hz or only 1000 Hz tones were presented). Each target interval was presented three times, resulting in 42 trials. Participants could take another practice session until their accuracy was above 75%. In the second task, they received another 14 trials (with mixed tones of 500 and 1000 Hz, seven times for each condition). Both practice tasks were implemented with visual feedback of “correct” or “wrong” responses. When the practice session was over, participants took the formal test.

Experiment 3

Sixteen participants (ages from 20 to 25, 7 males) took part in Experiment 3. The stimuli configurations and timing parameters were similar to those in Experiment 2, except that the 1000 Hz tones were replaced by visual black disks as sensory feedback. The practice protocol was the same as the one in Experiment 2.

Experiment 4

Twelve participants (ages from 20 to 25, 4 males) took part in Experiment 4. The stimulus settings and timing parameters were similar to those in Experiment 2, except that the two sets of action-sensation intervals had the same mean (800 ms) but different CVs. In one configuration, the intervals marked with 500 Hz tones had CVs of 0.1 to 0.15 (i.e., low variance), and the intervals marked with 1000 Hz tones had CVs of 0.3 to 0.35 (i.e., high variance). In the other configuration, the mappings between tone pitches and CVs were reversed. Prior to the formal experiment, participants completed a 14-trial practice with feedback of “correct” or “wrong” responses, as in Experiment 2.

Data Analysis

In all four experiments, the proportions of “longer” reports for the target duration across the seven intervals were fitted with a psychometric curve using a logistic function (Treutwein and Strasburger, 1999; Wichmann and Hill, 2001). The transitional threshold, that is, the point of subjective equality (PSE) at which the participant was equally likely to report the target interval as shorter or longer than the mean interval, was calculated by estimating the 50% point of “longer” reports on the fitted curve. The just noticeable difference (JND), an indicator of the sensitivity of interval discrimination, was calculated as half of the difference between the lower (25%) and upper (75%) bounds of the thresholds from the psychometric curve.
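
As an illustration of how a PSE and a JND follow from a fitted logistic curve, the sketch below fits p(x) = 1 / (1 + exp(-(b0 + b1·x))) by maximum likelihood and then reads off the 50% point and half of the 25%-to-75% spread. The text specifies only that a logistic function was fitted, so the Newton-Raphson procedure, the function names `fit_logistic` and `pse_and_jnd`, and the response counts are all hypothetical and are not taken from the study.

```python
import math


def fit_logistic(levels, n_longer, n_trials, iters=50):
    """Maximum-likelihood fit of p(x) = 1 / (1 + exp(-(b0 + b1 * x))).

    Newton-Raphson on rescaled x for numerical stability (an assumed
    fitting method). Returns (intercept, slope) in the units of `levels`.
    """
    centre = sum(levels) / len(levels)
    scale = max(levels) - min(levels)
    z = [(x - centre) / scale for x in levels]
    b0, b1 = 0.0, 0.0
    for _ in range(iters):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for zi, k, n in zip(z, n_longer, n_trials):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * zi)))
            w = n * p * (1.0 - p)
            g0 += k - n * p            # gradient w.r.t. b0
            g1 += (k - n * p) * zi     # gradient w.r.t. b1
            h00 += w
            h01 += w * zi
            h11 += w * zi * zi
        det = h00 * h11 - h01 * h01
        if det < 1e-12:
            break
        d0 = (h11 * g0 - h01 * g1) / det
        d1 = (h00 * g1 - h01 * g0) / det
        b0 += d0
        b1 += d1
        if abs(d0) + abs(d1) < 1e-10:
            break
    # Convert the coefficients back to the original (ms) scale.
    return b0 - b1 * centre / scale, b1 / scale


def pse_and_jnd(intercept, slope):
    """PSE is the 50% point; JND is half the 25%-to-75% distance."""
    pse = -intercept / slope
    x25 = (math.log(1.0 / 3.0) - intercept) / slope  # logit(0.25) = -ln 3
    x75 = (math.log(3.0) - intercept) / slope        # logit(0.75) = +ln 3
    return pse, (x75 - x25) / 2.0


if __name__ == "__main__":
    probe_ms = [200, 400, 600, 800, 1000, 1200, 1400]
    longer = [0, 1, 3, 7, 11, 13, 14]   # hypothetical "longer" counts
    trials = [14] * 7                   # two blocks of seven trials per level
    pse, jnd = pse_and_jnd(*fit_logistic(probe_ms, longer, trials))
    print(f"PSE = {pse:.1f} ms, JND = {jnd:.1f} ms")
```

For a logistic function the JND defined this way reduces to ln(3) divided by the slope, so a steeper curve corresponds directly to a smaller JND.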

Results

Experiment 1 and Control Experiments

Exp1

Baseline bias when eight sequential stimuli were drawn from a single distribution

The mean PSE and JND were 869.3 ± 24.1 ms (standard deviation) and 194.4 ± 29.4 ms. All the mean PSEs and JNDs are plotted in Figure 3. A one-sample t-test showed that participants underestimated the target interval, compared with 800 ms, t(12) = 10.368, p < 0.001 (Figure 2, left).
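
The reported t value can be checked against the summary statistics given above. The short sketch below is only an arithmetic check, assuming that the ± value is the between-participant standard deviation (as stated) and that n = 13; the function name `one_sample_t` is hypothetical.

```python
import math


def one_sample_t(mean, sd, n, reference):
    """One-sample t statistic computed from summary statistics."""
    standard_error = sd / math.sqrt(n)
    return (mean - reference) / standard_error


# Experiment 1: mean PSE 869.3 ms, SD 24.1 ms, 13 participants, reference 800 ms.
t_exp1 = one_sample_t(869.3, 24.1, 13, 800.0)
print(f"t(12) = {t_exp1:.2f}")
```

The result is approximately 10.37, in line with the reported t(12) = 10.368.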

FIGURE 2

Figure 2. The fitted psychometric curves for Experiment 1 (left, averaged plot) and Experiment 2 (right, plot for a typical participant). The proportions of reporting the probe action-sensation interval as being longer than the mean preceding action-sensation intervals were plotted as a function of the probe intervals (200–1400 ms, with 200 ms as step size, left figure) or probe intervals of different ranges (right figure solid line with short range of 100–700 ms and dashed line with long range of 200–1400 ms). The crossing points on the x-axis indicated the PSEs.

Effects of individual standards within the sequence

To evaluate whether certain intervals in the action-sensation sequences play a significant role in determining the estimation of the probe interval, e.g., a potential recency effect stemming from the last interval (Wan and Chen, 2018), we performed a binary logistic regression for each participant, with responses to target intervals (“0” as shorter and “1” as longer compared with the mean interval) as the dependent variable and the eight sequential intervals and the probe interval as predictor variables. The Omnibus Tests of Model Coefficients reached significance for all participants’ models (ps < 0.001), which suggested that at least one of the predictor variables contributed significantly to the discrimination of the probe interval. The Hosmer and Lemeshow tests of the models were not significant (ps > 0.143), implying a good fit of the models. We then implemented one-sample t-tests comparing the parameter estimates of the eight sequential intervals across all participants with “0.” None of these sequential intervals reached significance (ps > 0.521). Finally, a repeated-measures ANOVA was implemented on the parameter estimates of the sequential intervals of all participants, with the position of the sequential interval as a within-subject variable. The difference between sequential intervals was marginally significant [F(7,84) = 2.112, p = 0.051, η² = 0.150] and the effect of the intercept was not significant [F(1,12) = 0.291, p = 0.599, η² = 0.024]. The detailed values are given in Table 1.

TABLE 1

Table 1. The parameter estimates of the binary logistic regressions. The probe intervals were labeled as 1 to 7 in the regression models.

CE1

In this separate control experiment with 500 Hz auditory beeps and short mean durations, the mean PSE and JND were 470.8 ± 19.5 ms and 119.1 ± 24.5 ms. A one-sample t-test revealed a significant bias toward perceived “compression” of the probe intervals (compared with the reference of 400 ms) [t(12) = 13.333, p < 0.001]. Binary logistic regression was applied in the same way as in Exp1. The Omnibus Tests of Model Coefficients reached significance for all models (ps < 0.001). The Hosmer and Lemeshow tests of the models were not significant (ps > 0.196) for eleven participants, but were significant for the remaining two participants (indicating that their models did not fit well). We therefore implemented the one-sample t-tests with these two participants excluded. None of the sequential intervals reached significance (ps > 0.055). The repeated-measures ANOVA revealed a marginally significant effect of the intercept [F(1,12) = 4.585, p = 0.053, η² = 0.276] but no significant effect of sequential intervals [F(7,84) = 0.702, p = 0.610, η² = 0.055].

CE2

The mean PSE and JND of the control experiment with visual flashes and long mean duration (800 ms) were 832.7 ± 27.6 and 138.2 ± 7.5 ms. A one-sample t-test for this condition showed the same tendency of participants to “compress” the probe intervals as above [t(12) = 4.271, p = 0.001] (Figure 3). For the binary logistic regressions, the Omnibus Tests of Model Coefficients reached significance (ps < 0.001) and the Hosmer and Lemeshow tests of the models were not significant (ps > 0.579). One-sample t-tests showed that none of the effects of the sequential intervals were significant (ps > 0.345). Neither the effect of sequential intervals [F(1.000,12.003) = 1.007, p = 0.335, η² = 0.077] nor the effect of the intercept [F(1,12) = 0.958, p = 0.347, η² = 0.074] was significant in the repeated-measures ANOVA.

FIGURE 3

Figure 3. Bar plots of the mean PSEs and JNDs for the experiments. A, audio; V, visual; CV, coefficient of variation. The error bars represent standard errors. The horizontal axis depicts the experimental conditions. PSEs and JNDs of Exp2, Exp3, and Exp4 were from data collapsed across tone pitches or feedback modalities. The left halves of (A) and (B) show PSEs and JNDs for the short mean duration conditions. The right halves of (A) and (B) show PSEs and JNDs for the long mean duration conditions.

CE3

For the control experiment with visual flashes and short mean duration (400 ms), the mean PSE and JND were 418.5 ± 13.0 and 75.7 ± 10.3 ms. Participants were biased to “compress” the probe intervals [t(12) = 5.128, p < 0.001]. For the binary logistic regressions, the Omnibus Tests of Model Coefficients reached significance (ps < 0.001) and the Hosmer and Lemeshow tests of the models were not significant (ps > 0.364). One-sample t-tests showed that none of the sequential intervals contributed significantly to the perceived probe intervals (ps > 0.277). The repeated-measures ANOVA showed neither an effect of sequential intervals [F(1.001,12.015) = 1.018, p = 0.333, η² = 0.078] nor an effect of the intercept [F(1,12) = 0.960, p = 0.347, η² = 0.074].

Combined analysis of data from Exp1 and CEs

A 2 × 2 ANOVA with modality (auditory/visual) and mean duration (short/long) as between-subject factors showed, for both PSEs and JNDs, a significant main effect of modality [PSE: F(1,48) = 54.890, p < 0.001, η² = 0.533; JND: F(1,48) = 79.144, p < 0.001, η² = 0.622] and a significant main effect of mean duration [PSE: F(1,48) = 4577.967, p < 0.001, η² = 0.990; JND: F(1,48) = 151.808, p < 0.001, η² = 0.760]. The interaction of modality and mean duration was not significant [PSE: F(1,48) = 1.725, p = 0.195, η² = 0.035; JND: F(1,48) = 1.314, p = 0.257, η² = 0.027]. More specifically, PSEs and JNDs in the auditory modality were significantly larger than those in the visual modality, and PSEs and JNDs in the short mean duration condition were significantly smaller than those in the long mean duration condition.
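
The effect sizes above can be related to the F statistics with a short arithmetic check. The text does not state which variant of η² is reported; the sketch assumes it is partial eta squared, computed as F·df_effect / (F·df_effect + df_error), and this assumption reproduces the reported values to three decimals. The function name `partial_eta_squared` is hypothetical.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F statistic and its degrees of freedom."""
    return f_value * df_effect / (f_value * df_effect + df_error)


# Reported F(1,48) values from the 2 x 2 between-subject ANOVA.
reported = {
    "PSE, modality": 54.890,
    "JND, modality": 79.144,
    "PSE, mean duration": 4577.967,
    "JND, mean duration": 151.808,
    "PSE, interaction": 1.725,
    "JND, interaction": 1.314,
}
for label, f_value in reported.items():
    print(f"{label}: {partial_eta_squared(f_value, 1, 48):.3f}")
```

The error degrees of freedom of 48 are also consistent with four independent groups of 13 participants (52 − 4 = 48).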

The data from Exp1 and the CEs serve as control references for the following experiments.