Spontaneous Task Structure Formation Results in a Cost to Incidental Memory of Task Stimuli

Bejjani, Christina; Egner, Tobias

doi:10.3389/fpsyg.2019.02833

ORIGINAL RESEARCH article

Front. Psychol., 17 December 2019

Sec. Cognition

Volume 10 - 2019 | https://doi.org/10.3389/fpsyg.2019.02833


Christina Bejjani^1,2*

Tobias Egner^1,2

¹Department of Psychology and Neuroscience, Duke University, Durham, NC, United States
²Center for Cognitive Neuroscience, Duke University, Durham, NC, United States

Humans are characterized by their ability to leverage rules for classifying and linking stimuli to context-appropriate actions. Previous studies have shown that when humans learn stimulus-response associations for two-dimensional stimuli, they implicitly form and generalize hierarchical rule structures (task-sets). However, the cognitive processes underlying structure formation are poorly understood. Across four experiments, we manipulated how trial-unique images mapped onto responses to bias spontaneous task-set formation and investigated structure learning through the lens of incidental stimulus encoding. Participants performed a learning task designed to either promote task-set formation (by “motor-clustering” possible stimulus-action rules), or to discourage it (by using arbitrary category-response mappings). We adjudicated between two hypotheses: Structure learning may promote attention to task stimuli, thus resulting in better subsequent memory. Alternatively, building task-sets might impose cognitive demands (for instance, on working memory) that divert attention away from stimulus encoding. While the clustering manipulation affected task-set formation, there were also substantial individual differences. Importantly, structure learning incurred a cost: spontaneous task-set formation was associated with diminished stimulus encoding. Thus, spontaneous hierarchical task-set formation appears to involve cognitive demands that divert attention away from encoding of task stimuli during structure learning.

Introduction

Humans are characterized by a remarkable degree of cognitive flexibility, allowing us to respond to an identical stimulus in a variety of ways, as a function of context. This flexibility derives from our ability to form, apply, and update “task-sets” that define context-specific stimulus-response rules (e.g., Monsell, 2003). The study of task-sets has, to a large extent, focused on how people maintain and switch between explicitly instructed rules, as in classic cued task-switching studies (e.g., Allport et al., 1994; Rogers and Monsell, 1995; reviewed in Kiesel et al., 2010; Vandierendonck et al., 2010). By contrast, how people learn to form task-sets through trial-and-error learning has received much less attention in the task-switching literature (but see Dreisbach et al., 2006, 2007). Moreover, while a parallel learning literature has examined how individuals infer and identify causal structure (e.g., Gershman and Niv, 2010; Wilson and Niv, 2012; Schapiro et al., 2013), not much attention has been paid to the process of task structure building in terms of its immediate cognitive demands and consequences.

Connecting these distinct but related literatures, we here ask the question: How does learning task structure affect the processing of the stimuli that form the input of the learning process? One potent way to answer this question is through the lens of incidental stimulus encoding, since on-task fluctuations in attention ramify in subsequent memory for task stimuli (e.g., deBettencourt et al., 2017). By measuring incidental memory for task stimuli as a function of the structure learning process via a surprise recognition memory test, we can infer where participants focused their attention. This approach has proved successful in elucidating attentional processing during cued task-switching (Richter and Yeung, 2012; Chiu and Egner, 2016). To our knowledge, no previous study has assessed how the process of building and applying a hierarchical task-set impacts the encoding of individual task stimuli. Importantly, an improved understanding of implicit structure formation via incidental encoding adds further insight into interactions between attention and memory in everyday scenarios.

We therefore sought to examine structure learning through incidental task stimulus encoding. We build on recent studies that have shown that when humans learn stimulus-response associations for multi-dimensional stimuli, they implicitly form and generalize abstract, hierarchical task sets (Badre et al., 2010; Collins and Frank, 2013, 2016a,b; Collins et al., 2014; Collins, 2017; Bhandari and Badre, 2018), sometimes even in the absence of inherent structure and performance advantages. Specifically, we here leverage the design of a recent learning task (Collins and Frank, 2016a) that manipulated how stimulus-action rules were mapped to response keys as a means of promoting structure learning. In this task, participants learn to map four two-dimensional stimuli (blue/green triangle/circle) onto four response buttons. To solve this learning problem, participants can memorize all four stimulus-response associations (“flat mapping”) or form a hierarchical mapping, where one higher-level dimension (e.g., color) acts as a context that cues the task-set and the other dimension (e.g., shape) cues the appropriate response rule. We therefore refer to “structure” when participants discriminate the causal relationship between variables in their environment and organize their responses into a hierarchical task-set.

Whether participants build hierarchical task-sets can be inferred from switch costs (cf. Monsell, 2003): participants are slower and more error-prone when their supraordinate dimension (in the above example, color) switches between trials than when their sub-ordinate dimension (i.e., shape) changes, due to additional processing involved in reconfiguring the task-set (Rogers and Monsell, 1995) and/or overcoming proactive interference from the previous set (Allport et al., 1994). Switch costs should be non-existent for participants who adopt a flat mapping, because if they have simply memorized the four stimulus-response associations, there are no supraordinate or sub-ordinate stimulus dimensions. Any transition from one stimulus to another would simply represent a change in the specific S-R mapping retrieved, but would not constitute a task-switch. By contrast, switch costs should be non-zero for participants who adopt a hierarchical task-set, because a change in the supraordinate stimulus dimension would initiate a change in the task rule that is used to determine the correct response (e.g., Collins and Frank, 2013, 2016a,b; Collins et al., 2014).

Given the non-instructed nature of task-set formation processes in this protocol, the experimenter does not have direct control over whether participants adopt a flat mapping or hierarchical task-set strategy in learning the task. However, in order to promote hierarchical set formation in half of the participants and discourage it in the other half, we adopted a biasing technique from Collins and Frank (2016a). Specifically, the proclivity for task-set formation can be biased by manipulating whether stimulus-response mappings are “motor-clustered” along a supraordinate dimension, such that the response mappings for each task-set are either spatially adjacent or not (Figure 1). For example, the color dimension could cue this shape task-set: if participants observe a blue stimulus, they use their right index finger for a triangle and right middle finger for a circle, via a clustered mapping, whereas for a non-clustered mapping, they use their right index finger and right ring or pinky fingers, respectively. This manipulation ultimately results in a greater proclivity for hierarchical structure building (i.e., greater switch costs) for mappings that are motor-biased (i.e., spatially adjacent) than those that are not (Collins and Frank, 2016a), and allows us to naturally bias the likelihood of participants engaging in structure learning.

FIGURE 1

Figure 1. Summary of Task Manipulation. For “non-clustered” learners, response mappings were arbitrarily determined; for “clustered” learners, they were spatially adjacent on the keyboard along a higher-level stimulus dimension. For instance, if all learners applied a hierarchical task-set where color cued the shape task-set, clustered learners would press their right index and middle fingers for a blue square and circle, while non-clustered learners would press their right index and pinky fingers for a blue square and circle. This clustering manipulation was primarily used to promote differences in the strength of structure learning and thus our ability to detect differences in incidental encoding later. Object images were used in Experiment 1, organized by border color (red/blue) and shape (square/circle). Faces were used in Experiments 2–4, organized by apparent gender identity (female/male) and age (young, i.e., less than 30 vs. old, i.e., older than 45).

We here extend this prior work with some key design modifications: we adapt the task to the item (Experiment 1) and category level (Experiments 2–4), using trial-unique object and face stimuli that differ along various stimulus dimensions, so as to assess effects of spontaneous structure learning (as gauged via switch costs) on incidental stimulus encoding in a surprise memory phase (MP). These changes allow us to control for low-level feature priming effects that are inevitable in the original task due to the small stimulus set, and improve ecological validity. Finally, our primary design change allows us to understand structure learning in terms of its immediate cognitive demands and consequences, that is, to infer how implicit task-set formation unfolds by assessing incidental encoding, which can only be done using trial-unique stimuli. Of particular focus in this context are the early stages of task performance, as this is when the structure learning process would be expected to take place (cf. Badre et al., 2010).

If building spontaneous hierarchical task-sets imposes cognitive demands that divert attention away from stimulus encoding, this would result in worse memory for stimuli encountered, in particular during early structure learning. For instance, structure learning may impose demands on working memory through hypothesis testing of possible rule structures (cf. Gershman and Niv, 2010; Ashby and Maddox, 2011). Alternatively, during learning, participants may selectively attend more to particular stimulus features or dimensions as a means of exploiting environmental redundancy, thus resulting in better subsequent memory for task stimuli (cf. Aly and Turk-Browne, 2017). To adjudicate between these hypotheses, we ran four experiments that individually manipulated the level of stimulus encoding, changes to category-response associations, and instructions about the category-response associations.

Experiment 1

The goal of Experiment 1 was to replicate the effect of spontaneous task-set formation but with trial-unique stimuli. In this first experiment, we directly adopted the shape/color dimensions employed in previous studies (e.g., Collins and Frank, 2016a) as task-relevant stimulus features, while placing inside the colored shapes trial-unique, task irrelevant object images. This manipulation allowed us to examine how structure learning affects both incidental encoding of these items as well as source memory of the color-shape context in which an item was encountered.

The task consisted of a trial-and-error learning phase (LP), an instructed filler task phase, and a surprise MP (Figure 2). In the LP, participants had to learn, via trial-and-error response feedback, which response buttons matched superficial border categories that were characterized by shape (square/circle) and color (blue/red) dimensions and surrounded trial-unique object images. For “non-clustered” learners, response mappings were arbitrary; for “clustered” learners, they were clustered together on the keyboard (v, b, n, m) by stimulus features along a higher-level dimension (e.g., shape or color) (Figure 1). For instance, if participants saw a red square and used color as their higher-level dimension, the corresponding hierarchical task-set that clustered learners should learn was “If red, press v for square and b for circle,” while for non-clustered learners the flat mapping was “If red, press v for square and m for circle.” The motor-clustering manipulation of learner group was primarily used to promote differences in the likelihood and/or strength of structure learning. This experiment remained closest to the original learning task (cf. Collins and Frank, 2016a), but further tested incidental memory of the task-irrelevant object images. In the filler phase (FP), participants performed a standard instructed task-switching paradigm with different classes of trial-unique stimuli. In the surprise MP, participants were presented with new images and images from the LP, and were first asked to identify whether they had seen the images in the LP and then asked about the rule context in which they had seen the images.
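The difference between the two mapping types can be made concrete with a small check. The sketch below is illustrative only: the function name and the tuple encoding of stimuli are hypothetical, the clustered mapping follows the example task-set given in the text, and the blue entries of the non-clustered mapping are assumed for the sake of the example.

```python
# Response keys in left-to-right keyboard order, as used in the learning phase.
KEYS = ["v", "b", "n", "m"]

def is_motor_clustered(mapping, dimension):
    """Return True if, for every value of `dimension` (0 = color, 1 = shape),
    the two keys sharing that value are neighbors on the keyboard."""
    positions = {}
    for stimulus, key in mapping.items():
        positions.setdefault(stimulus[dimension], []).append(KEYS.index(key))
    return all(max(p) - min(p) == 1 for p in positions.values())

# Clustered by color: red -> v/b, blue -> n/m.
clustered = {("red", "square"): "v", ("red", "circle"): "b",
             ("blue", "square"): "n", ("blue", "circle"): "m"}

# Red entries follow the example in the text; blue entries are assumed.
non_clustered = {("red", "square"): "v", ("red", "circle"): "m",
                 ("blue", "square"): "n", ("blue", "circle"): "b"}
```

Under these assumptions, the clustered mapping is adjacent along color but not along shape, whereas the non-clustered mapping is adjacent along neither dimension.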

FIGURE 2

Figure 2. Summary of Task Procedure. (A,D) In the trial-and-error learning phase (LP), participants learned the associations between response buttons and image categories via feedback. In Experiment 1, object images were presented within borders that varied by shape (circle/square) and color (red/blue). In Experiments 2–4, face images were presented and varied by age (young/old) and assumed gender identity (male/female). (B,E) Next, in the instructed filler phase (FP), participants performed a standard task-switching paradigm in which the color of the border cued the task-relevant judgment, with trial-unique face images [instructed age (young/old) and gender (female/male)] in Experiment 1 and object images [instructed animacy (man-made/natural) and physical size (smaller/larger than a shoebox)] in Experiments 2–4. (C,F) Finally, in the incidental memory phase (MP), participants judged whether they had previously seen images in the learning phase. In Experiment 1 alone, because object images were shown within different border colors and shapes, participants were also probed for their source memory of the border color and shape.

If participants are learning the latent task structure through hypothesis testing, and this process imposes cognitive demands (e.g., on working memory) that divert attention away from the task stimuli, we should observe greater incidental encoding for non-clustered (flat) compared to clustered (hierarchical) learners. However, if participants selectively attend more to particular stimulus dimensions during structure learning, we should observe greater incidental encoding for clustered (hierarchical) compared to non-clustered (flat) learners. For both hypotheses, we predict that differences in memory between the two learner groups should be observed primarily for images that are presented early in the task, when structure learning would be most evident. Clustered learners should take less time than non-clustered learners to acquire category-response mapping associations because of the inherent rule structure (cf. Badre et al., 2010), but once these associations are learned, the learners should perform equally well due to similar attentional and cognitive resources.

Materials and Methods

Participants

Our target sample size was determined based on previous experiments that examined structure learning (Collins and Frank, 2013, 2016a,b; Collins et al., 2014), which had final sample sizes between twenty-two and thirty-five participants. In particular, Collins and Frank (2016a) reported a Cohen’s d_z of 0.88 when comparing motor-clustered RT switch costs against zero, suggesting that for a one-sample two-sided t-test with a high power of 0.90 and error rate of 0.05, we would only need sixteen participants to find significant switch costs. Note that in this study, we do not compute a one-sample t-test against zero, but instead use permutation-based analyses (cf. Maris and Oostenveld, 2007; Groppe et al., 2011). These analyses are typically more robust and sensitive to trial count in addition to sample size. Moreover, because we were more interested in examining structure building through the lens of incidental memory than probing switch costs, we aimed to recruit around thirty viable participants for each of the two learning groups across our experiments, thus doubling the a priori estimated sample size.
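As a rough plausibility check on the a priori estimate, the required sample size can be approximated with the standard normal formula n ≈ ((z_{1−α/2} + z_{power}) / d_z)². This is a sketch under a stated simplification, not the calculation used for the study: the normal approximation typically comes out slightly below an exact calculation based on the noncentral t distribution, and the function name is hypothetical.

```python
from math import ceil
from statistics import NormalDist

def approx_n_one_sample(d_z, alpha=0.05, power=0.90):
    """Normal-approximation sample size for a two-sided one-sample t-test.
    Assumption: the t distribution is replaced by the standard normal."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value for the two-sided test
    z_power = z.inv_cdf(power)          # quantile matching the target power
    return ceil(((z_alpha + z_power) / d_z) ** 2)

# Effect size reported by Collins and Frank (2016a) for RT switch costs.
n_approx = approx_n_one_sample(0.88)
```

With d_z = 0.88 the approximation lands a little below the sixteen participants cited above, which is the direction of discrepancy expected from ignoring the heavier tails of the t distribution.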

Eighty-one Amazon Mechanical Turk (MTurk) workers consented to participate for a $3.85 ($0.13/min) fee in accordance with the policies of the Duke University Institutional Review Board. Nine participants were excluded because of poor accuracy on the LP (<65%; see instruction paragraph below) and nine participants were excluded because of incorrect category-response associations on the post-test questionnaire (see post-test section for more details), resulting in a final sample size of sixty-three (mean age = 32.1, SD = 8.7; 31 female, 32 male; clustered n = 31, non-clustered n = 32). This level of exclusions is consistent with research suggesting that attrition rates among web-based experiments vary between 3 and 37% (cf. Chandler et al., 2014).

Workers were told the approximate length of the study and the number of the tasks that they had to complete. Workers were asked to take no longer than 4 min for any of the breaks that occurred during the study (e.g., between task phases). Finally, they were also informed that they needed to get above 65% accuracy on the LP for compensation, and that if they got above 90% accuracy, they could earn a flat $1 bonus. Nine workers earned the bonus. Workers who participated in one experiment were explicitly preempted from participating in the others. All exclusion criteria remained the same across experiments.

Stimuli

We obtained object images from the Cabeza lab database¹ and Google searches of images with a license for non-commercial reuse with modification. Pilot data suggested that participants did not spontaneously form task-sets when asked to categorize object images by innate properties such as animacy (man-made/natural) and physical size (smaller/larger than a shoebox), and found these judgments quite difficult. We therefore organized our object images according to superficial stimulus features: border color (red/blue) and shape (square/circle). All object images were cropped to 500 × 300 pixels.

Experimental Procedure

The main study manipulation involved motor-clustering biases that encouraged the formation of either a flat or hierarchically structured task-set for shape/color border categories (Figure 1; Collins and Frank, 2016a). This clustering manipulation allowed us to investigate how participants form structured task-sets, shown through switch costs incurred when switching between feature dimension rules that were either superficial (Experiment 1) or inherent to (Experiments 2–4) trial-unique images from higher-level stimulus categories.

The task consisted of consecutive Learning, Filler, and Memory Phases (Figure 2). On each trial in the LP (Figure 2A), participants saw a fixation cross for 500 ms and an object image for 1250 ms, followed by performance feedback for 500 ms and a blank ITI screen for 750 ms. Trial-unique object images were shown within a border that varied in color (red/blue) and shape (circle/square), which were the dimensions guiding button responses. Participants were instructed to learn the association between response buttons and border categories (red/blue circle/square) via trial-and-error feedback about their responses. Each response button (v, b, n, and m on a QWERTY keyboard, mapped onto right index, right middle, right ring, and right pinky fingers, respectively) was associated with only one shape/color combination.

To create an encoding-retrieval interval and distract participants from further encoding the LP images prior to the surprise memory test, participants then underwent a 4-minute FP (Figure 2B). The filler task consisted of a standard, cued task-switching protocol. Results of this task phase were of no interest to our study goals and are therefore not reported.

Crucially, in the MP (Figure 2C), we then tested whether the learner groups differed in their incidental stimulus encoding and source memory. Participants were shown all the Old images from the LP and ∼1/3 New images (44 due to uneven division of 128 Learning trials), each displayed for 2000 ms. They were then asked whether the borders that had surrounded these images were blue/red or a square/circle, with each source memory question also shown for 2000 ms. The response mappings were always shown on-screen (h, j, k, l mapped to Definitely Old/right index, Probably Old/right middle, Definitely New/right ring, Probably New/right pinky fingers, and a and s mapped to Red/Square/left middle and Blue/Circle/left index fingers, respectively).

For all task phases, if participants did not respond before the image disappeared from the screen, a feedback time-out (“respond quicker”) was provided for 1000 ms to encourage quicker responses. In Experiment 1, this occurred on a total of 3.05% of LP trials, 5.15% of MP trials, and 4.24% and 4.06% of source memory color and shape trials.

All images were presented in the center of the screen. All stimulus categories and trial types were shown in random order, with equal frequency in every task phase [LP: 128 total trials across 1 run; FP: 120 trials across 1 run; MP: 172 trials (128 old/44 new) across 2 runs].

To counterbalance the memory judgments and dimensions around which clustering biases were formed, and manipulate clustering biases, we ran eight task versions. Four versions were essentially duplicates, but with clustered instead of non-clustered motor mappings. We did not have a priori hypotheses about whether participants would prefer to use border color/shape (Experiment 1) or age/gender (Experiments 2–4) as supraordinate dimensions, so we ran clustered mapping versions for each dimension. We also varied whether the source memory questions about border color were asked before or after those on border shape.

Post-test Questionnaire

After the main experiment, participants filled out a questionnaire that assessed their explicit knowledge of the response mapping and image category associations. If participants could not accurately report the border category-response associations (e.g., whether the button “v” was associated with a “red square” border), we assumed that they either did not learn the associations or were not sufficiently motivated to respond correctly, and they were therefore excluded from the study. Participants also responded to a series of debriefing questions inquiring about their awareness of the task structure. However, responses to these questions largely matched the implicit measures of learned task-switch costs and are therefore not reported here. Finally, participants marked how difficult they found the task on a 1–5 Likert scale, anchored by not difficult and very difficult, and this measure is not reported here for the same reason.

Data Analysis

Analyses were carried out on accuracy (proportion correct) and reaction time (RT) for the LP data and Hit and False Alarm rates for the memory data. Memory data only included trials in which the participants registered a response that was not excessively fast (<200 ms) before the feedback time-out. LP RT was analyzed for correct trials that were not excessively fast (<200 ms) or slow (feedback time-out: >1250 ms). The first trial was excluded for RT and accuracy data for switch cost calculations.
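The trial-level inclusion rules for the LP RT analysis can be sketched as a simple filter. The `Trial` record and function names below are hypothetical, and the stated rationale for dropping the first trial (it has no preceding trial from which to define a transition) is our reading rather than something spelled out above.

```python
from dataclasses import dataclass
from typing import Optional

FAST_MS = 200      # responses faster than this count as excessively fast
TIMEOUT_MS = 1250  # image duration; slower responses hit the feedback time-out

@dataclass
class Trial:
    index: int               # 0-based position within the learning phase
    rt_ms: Optional[float]   # None when no response was registered
    correct: bool

def rt_trials(trials):
    """Trials entering the RT analysis: correct, not too fast, not timed out."""
    return [t for t in trials
            if t.correct and t.rt_ms is not None
            and FAST_MS <= t.rt_ms <= TIMEOUT_MS]

def switch_cost_trials(trials):
    """Drop the first trial, which cannot be labeled as a repeat or a switch."""
    return [t for t in trials if t.index > 0]
```

For accuracy analyses only `switch_cost_trials` would apply, since the RT thresholds are described for RT data on correct trials.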

Learning Phase Switch Costs

In the Learning Phase, we calculated hypothetical switch costs (Monsell, 2003) to test whether participants formed task-sets according to a higher-level dimension (Collins and Frank, 2013, 2016a,b; Collins et al., 2014). For example, participants could adopt the following task-set structure: “If Red, press v for square and b for circle; if Blue, press n for square and m for circle.” Here, participants would be quicker and more accurate to respond to a red square trial following a red circle trial, or a blue square trial following a blue circle trial, or vice versa (“supraordinate-repeat” trials). If, however, they were forced to switch between the supraordinate red and blue feature rules, they should be slower and less accurate when responding (“supraordinate-switch” trials). Mean performance on supraordinate-repeat and supraordinate-switch trials was then subtracted, one from the other, to obtain a final switch cost. In calculating switch costs, we excluded trials where there was a double feature switch (e.g., red square followed by blue circle), because these could not be unambiguously attributed to either feature rule.
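The transition labeling described above can be sketched as follows. Several choices here are assumptions made for illustration: the direction of subtraction (switch minus repeat), the exclusion of exact stimulus-category repetitions (which the text does not address), and all function and label names. Trial filtering on RT and accuracy is omitted for brevity.

```python
from statistics import mean

def classify_transition(prev, curr, supra=0):
    """Label a trial by how its (color, shape) features changed relative to
    the previous trial. `supra` indexes the assumed supraordinate dimension."""
    sub = 1 - supra
    supra_changed = prev[supra] != curr[supra]
    sub_changed = prev[sub] != curr[sub]
    if supra_changed and sub_changed:
        return "double-switch"   # excluded: cannot be attributed to one rule
    if supra_changed:
        return "supra-switch"
    if sub_changed:
        return "supra-repeat"
    return "full-repeat"         # assumption: left out of both conditions

def switch_cost(stimuli, values, supra=0):
    """Mean of `values` (RT or accuracy) on supraordinate-switch trials minus
    the mean on supraordinate-repeat trials (assumed direction)."""
    by_type = {"supra-switch": [], "supra-repeat": []}
    for prev, curr, value in zip(stimuli, stimuli[1:], values[1:]):
        label = classify_transition(prev, curr, supra)
        if label in by_type:
            by_type[label].append(value)
    return mean(by_type["supra-switch"]) - mean(by_type["supra-repeat"])

# Toy sequence of (color, shape) stimuli with made-up RTs in milliseconds.
stimuli = [("red", "square"), ("red", "circle"), ("blue", "circle"),
           ("blue", "square"), ("red", "circle"), ("red", "square"),
           ("blue", "square")]
rts = [600, 500, 700, 520, 650, 480, 720]
```

Treating the other dimension as supraordinate (`supra=1`) simply swaps the two conditions and flips the sign of the cost, which is why the sign carries information about the supraordinate dimension rather than about the presence of structure.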

If participants did not form structured task-sets and instead memorized the associations between buttons and image categories (flat mapping), their switch costs should empirically be zero. Because of our assumption about supraordinate repeat and switch trials, the sign (positive or negative values) of these switch costs only indicated which feature dimension was supraordinate. We therefore took the absolute (unsigned) value of all switch costs and then tested whether participants formed implicit task-sets with a standard permutation method that created an empirical null distribution within each learner group. Specifically, we shuffled the labels of each trial type within each participant, assuming that the conditions were empirically meaningless, and then recalculated switch costs for each participant 10,000 times. For group-level analysis, we compared the mean z-score switch cost obtained from our sample, within each learner group, against the 10,000 mean z-scores obtained from permutations, testing whether the z-score across participants was larger than the z-scores generated by an empirical null distribution (two-sided test: 2.5%). For individual difference analysis, we compared the normalized switch cost (d-prime, i.e., $\frac{\mu_{S} - \mu_{N}}{\sqrt{\frac{1}{2}\left(\sigma_{S}^{2} + \sigma_{N}^{2}\right)}}$) for each participant against the 10,000 switch costs generated from permutations (two-sided test: 2.5%), testing whether an individual showed a switch cost larger than the switch costs generated by an individual-specific empirical null distribution. We calculated z-scores using the standard error of the mean for the group-level analysis, treating all participants as having similar variance, and used standard deviation for the individual difference analysis to account for individual variance. Note that the normalized switch cost does not indicate the strength of structure formation.
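A minimal sketch of the individual-level permutation test is given below. It is not the analysis code released with the study. It assumes that S and N in the normalized cost denote switch and non-switch (repeat) trials, uses sample variances, and returns the percentage of shuffled unsigned costs that fall below the observed unsigned cost. Function names and the seed parameter are hypothetical.

```python
import random
from math import sqrt
from statistics import mean, variance

def normalized_cost(switch, repeat):
    """(mean_S - mean_N) / sqrt(0.5 * (var_S + var_N)).
    Needs at least two trials per condition and non-zero pooled variance."""
    spread = sqrt(0.5 * (variance(switch) + variance(repeat)))
    return (mean(switch) - mean(repeat)) / spread

def permutation_percentile(switch, repeat, n_perm=10_000, seed=0):
    """Percentage of label-shuffled unsigned costs below the observed one."""
    rng = random.Random(seed)
    observed = abs(normalized_cost(switch, repeat))
    pooled = list(switch) + list(repeat)
    n_switch = len(switch)
    below = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # break the link between trials and labels
        null = abs(normalized_cost(pooled[:n_switch], pooled[n_switch:]))
        below += null < observed
    return 100 * below / n_perm
```

Under this reading, a participant would count as showing a meaningful switch cost when the returned percentile exceeds 97.50. The group-level test follows the same logic, but compares the group mean z-score against the mean z-scores obtained from each round of shuffling.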

Finally, we determined the number of participants who used each feature dimension as a supraordinate rule by evaluating their response mappings and the sign of the raw switch costs.

In sum, we ran permutation tests to determine (a) whether the learner groups showed overall significant switch costs and (b) whether individuals, irrespective of group, showed statistically meaningful switch costs, suggesting spontaneous structure formation. We also determined whether the clustering manipulation was effective by investigating the use of each supraordinate rule and the magnitude of switch costs across groups (see Supplementary Text).

Memory

In the MP, we collapsed “Definitely Old” and “Probably Old” and “Definitely New” and “Probably New” responses into aggregate “Old” and “New” measures. We then calculated Hit and False Alarm rates and assessed whether incidental encoding was above chance by comparing Hit and False Alarm rates within each learner group with a paired t-test. We also used an independent-samples t-test to compare Hit Rates for the one border/image category that had the same response mapping across learner groups (e.g., blue square in Figure 1) as a baseline measure of individual differences in memory capacity or motivation between the groups that might otherwise confound our results.
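The collapsing step and the above-chance comparison can be sketched as follows. The data layout (pairs of an old/new flag and a rating string) and the function names are hypothetical, and the paired t statistic is computed by hand only to keep the example free of external dependencies.

```python
from math import sqrt
from statistics import mean, stdev

OLD_RATINGS = {"Definitely Old", "Probably Old"}

def hit_and_false_alarm(responses):
    """`responses` holds (is_old_image, rating) pairs for one participant.
    Returns (hit rate, false alarm rate) after collapsing confidence levels."""
    said_old_to_old = [r in OLD_RATINGS for is_old, r in responses if is_old]
    said_old_to_new = [r in OLD_RATINGS for is_old, r in responses if not is_old]
    return (sum(said_old_to_old) / len(said_old_to_old),
            sum(said_old_to_new) / len(said_old_to_new))

def paired_t(hits, false_alarms):
    """Paired t statistic across participants (df = n - 1)."""
    diffs = [h - f for h, f in zip(hits, false_alarms)]
    return mean(diffs) / (stdev(diffs) / sqrt(len(diffs)))
```

A reliably positive t statistic for Hit minus False Alarm rates within a learner group indicates that old images were endorsed as old more often than new images, that is, above-chance recognition.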

Exploratory pilot analyses, estimating the trial at which the 95% confidence intervals for performance surpassed chance performance (Smith et al., 2004), suggested that participants took around fifteen trials to learn the associations between response mappings and image categories. However, these individual differences in learning varied by experiment, and depended on the size of the training set for the model (e.g., initial 40 trials vs. all trials). Overall, participants were also more accurate later in the LP, suggesting that after the first sixty or so trials, they had learned the stimulus-response rules.

Thus, to ensure adequate power, we created three bins of 30 and a fourth bin of 38 LP trials, in order to assess learning over the duration of the LP. To assess whether any differences in memory occurred due to time-dependent structure learning, we next ran a repeated-measures ANOVA on Hit rates with LP bin (4) as a within-participants factor and learner group (clustered/non-clustered) as a between-participants factor. In particular, we anticipated differences in memory to be most prominent for images that were shown earlier in the LP, because non-clustered learners should learn the associations between response mappings and border/image categories slightly slower than clustered learners. Thus, we expected early differences in learning to manifest via differences in memory between the learner groups. Because LP bin was a critical factor in our analysis, if participants had fewer than 10 trials in any bin due to a lack of response, they were excluded from the ANOVA. We used this exclusion criterion (i.e., <10 trials) for all memory analyses.
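The binning and exclusion rule can be sketched as below. We assume here that bins follow LP presentation order and that the 38-trial bin comes last. The text gives the bin sizes but not their exact placement, so the bin edges, like the function name and data layout, are illustrative.

```python
BIN_EDGES = [0, 30, 60, 90, 128]  # three bins of 30 trials, then one of 38
MIN_TRIALS = 10                   # fewer usable trials in any bin -> exclude

def hit_rate_by_bin(trials):
    """`trials` holds (lp_index, said_old) pairs for old images with a valid
    memory response, where lp_index is the 0-based learning-phase position.
    Returns four hit rates, or None if the participant must be excluded."""
    rates = []
    for lo, hi in zip(BIN_EDGES, BIN_EDGES[1:]):
        in_bin = [said_old for idx, said_old in trials if lo <= idx < hi]
        if len(in_bin) < MIN_TRIALS:
            return None
        rates.append(sum(in_bin) / len(in_bin))
    return rates
```

The resulting four hit rates per participant would then serve as the within-participants LP bin factor in the repeated-measures ANOVA.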

To determine whether source memory was above chance, we compared accuracy for old images rated as new and old with a paired t-test for each learner group. We then compared accuracy between learner groups using a repeated-measures ANOVA, with trial type (border color/shape) and rating (old/new) as within-participants factors and learner group (clustered/non-clustered) as a between-participant factor.

In sum, we determined whether participants (a) encoded images above chance, (b) differed in baseline levels of motivation or memory capacity, and (c) showed differences in time-dependent learning across groups.

All data were Greenhouse–Geisser corrected where appropriate. Effect sizes were calculated according to published recommendations (Lakens, 2013)². Data and experimental code are online at https://github.com/christinabejjani/ClusteredTSMem.

Results

Learning Phase (LP)

To test whether participants formed hierarchical task-sets at the group-level, we compared their normalized switch costs against an empirical null distribution. Specifically, we counted the number of mean z-score switch costs from the empirical null distribution that were smaller than the mean z-score switch cost obtained from our data for both learner groups separately. Mean z-score switch costs were not significant (i.e., two-sided test: greater than 97.50%) for clustered learners or non-clustered learners when compared to their respective empirical null distributions (clustered: RT = 68.41%, Accuracy = 16.54%; non-clustered: RT = 59.98%, Accuracy = 92.98%). Thus, neither group showed significant evidence of structure formation. See Figures 3, 4 for RT and accuracy switch cost distributions and Supplementary Table 1 for switch cost means and confidence intervals across all experiments.

FIGURE 3

Figure 3. Observed Learning Phase RT Switch Costs. Learning phase reaction time (RT) switch costs are shown as a function of experiment and learner group (clustered: red; non-clustered: green). Switch costs are displayed in milliseconds on the x-axis, with the mean of each distribution outlined in black as a scatter plot point. Participant data are displayed as scatter points. Note that only Experiment 2 generated significant switch costs for the clustered learner group.

Figure 4. Observed Learning Phase Accuracy Switch Costs. Learning phase accuracy switch costs are shown as a function of experiment and learner group (clustered: red; non-clustered: green). Switch costs are displayed as a proportion of correct responses on the x-axis, with the mean of each distribution outlined in black as a scatter plot point. Participant data are displayed as scatter points. Note that only Experiment 2 generated significant switch costs for the clustered learner group.

To explore individual differences in structure formation, we then compared the normalized switch cost for each participant against their individual-specific null distribution. Thirteen participants in the clustered group and two participants in the non-clustered group had normalized RT or accuracy switch costs that were larger than at least 97.50% of their respective permuted switch costs.
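The individual-level criterion can be sketched as follows. The permutation scheme used here (shuffling switch/repeat labels within one participant) is an assumption made for illustration only; the study's own procedure is defined in its Methods and shared code. All names and data are hypothetical.

```python
import random


def rt_switch_cost(rts, is_switch):
    """Mean RT on switch trials minus mean RT on repeat trials."""
    switch = [rt for rt, s in zip(rts, is_switch) if s]
    repeat = [rt for rt, s in zip(rts, is_switch) if not s]
    return sum(switch) / len(switch) - sum(repeat) / len(repeat)


def percent_permuted_below(rts, is_switch, n_perm=2000, seed=0):
    """Percentage of label-shuffled switch costs smaller than the observed cost."""
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    observed = rt_switch_cost(rts, is_switch)
    labels = list(is_switch)
    below = 0
    for _ in range(n_perm):
        rng.shuffle(labels)  # assumed null: trial labels carry no information
        if rt_switch_cost(rts, labels) < observed:
            below += 1
    return 100.0 * below / n_perm


def shows_structure(rt_percent, acc_percent, criterion=97.5):
    """Either measure reaching the criterion counts, mirroring 'RT or accuracy'."""
    return rt_percent >= criterion or acc_percent >= criterion


# Hypothetical participant: 20 repeat trials near 500 ms, 20 switch trials near 600 ms.
rts = [500 + (i % 5) * 10 for i in range(20)] + [600 + (i % 5) * 10 for i in range(20)]
is_switch = [False] * 20 + [True] * 20

rt_pct = percent_permuted_below(rts, is_switch)
print(rt_pct, shows_structure(rt_pct, acc_percent=50.0))
```

Because the hypothetical switch trials are uniformly slower than the repeat trials, this participant's observed cost exceeds nearly all permuted costs and would be counted among those showing structure formation.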

Signed mean RT switch costs indicated that 32 participants categorized object images by border color and 31 by border shape (clustered: 18 BC, 13 BS; non-clustered: 14 BC, 18 BS). This suggests two points: (1) participants had about equal likelihood of using either stimulus feature as a supraordinate dimension and (2) most participants in the clustered group followed the mapping suggested by the cluster manipulation.

Memory Phase (MP)

Both clustered [Hit vs. FA: t(30) = 6.51, p < 0.001, Cohen’s d = 0.91, CL effect size = 88%] and non-clustered learners [t(31) = 4.28, p < 0.001, Cohen’s d = 0.58, CL effect size = 78%] remembered the task-irrelevant object images above chance.
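As a worked check on the common language (CL) effect sizes, the sketch below assumes the formula for dependent samples from Lakens (2013), in which CL is the standard normal cumulative probability of the mean difference divided by the standard deviation of the differences (equivalently, of t/√n). The function names are hypothetical. Note that t/√n is not the same quantity as the reported Cohen's d, which is smaller and so appears to use a different standardizer that cannot be recovered from t and n alone.

```python
from math import sqrt
from statistics import NormalDist


def standardized_mean_difference(t_value, n_pairs):
    """t / sqrt(n): mean difference divided by the SD of the paired differences."""
    return t_value / sqrt(n_pairs)


def common_language_es(t_value, n_pairs):
    """Assumed CL formula for dependent samples: Phi(t / sqrt(n))."""
    return NormalDist().cdf(standardized_mean_difference(t_value, n_pairs))


# Hit vs. false-alarm t-tests as reported in the text; n = df + 1 paired observations.
reported = [("clustered", 6.51, 30), ("non-clustered", 4.28, 31)]
for label, t_value, df in reported:
    cl = common_language_es(t_value, df + 1)
    print(f"{label}: CL = {round(100 * cl)}%")
```

Under this assumption, the two tests round to 88% and 78%, in agreement with the CL effect sizes reported above.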

However, consistent with the lack of group-level structure formation, we observed little evidence for group differences in incidental memory. See Supplementary Table 2 for full behavioral data from the MP across experiments.

There were no differences in incidental encoding via time-dependent learning [Figure 5; LP bin: F(2.75,167.96) = 1.04, p = 0.371, η_p² = 0.02; LP bin × group: F(2.75,167.96) = 1.27, p = 0.288, η_p² = 0.02]. Nor did we observe any differences in overall Hit Rate between learner groups [group: F(1,61) = 1.82, p = 0.182, η_p² = 0.03].

Figure 5. Hit Rates in the Memory Phase. Hit rates are shown as a function of experiment (top row: Experiment 1; bottom row: Experiment 4), learner group (clustered: red line in left panel and all lines in middle panel; non-clustered: green line in left panel and all lines in right panel), and learning phase bins (four bins of thirty images each). The line in the left panel indicates the mean Hit rate across participants within learner groups, with the 95% confidence interval shaded for each group. The center and right panels indicate individual Hit rates, with each individual in a different color.

Neither clustered nor non-clustered learners showed above-chance source memory accuracy for border color or border shape trials (Clustered: Old images rated Old vs. New: ts < 0.42; Non-clustered: ts < 0.97). There were no significant effects on source memory (Fs < 1.68).

Discussion

The results of Experiment 1 did not replicate previous evidence of spontaneous task-set formation (cf. Collins and Frank, 2013, 2016a,b; Collins et al., 2014). Consistent with the lack of learner group differences in spontaneous task-set formation, we also found no significant differences in recognition memory between groups. In fact, overall memory performance was rather poor.

In Experiment 1, the trial-unique images were irrelevant to task-set formation, which may have degraded stimulus processing altogether. Prior studies employed very small sets of task-relevant stimulus features (cf. Collins and Frank, 2013, 2016a,b; Collins et al., 2014), which did not allow for tests of recognition memory and may suggest that structure formation is prone to large individual differences or results from feature priming effects across trials. To answer our question about how learning task structure impacts incidental stimulus encoding, and whether spontaneous structure formation is constrained by feature priming, we addressed this limitation on stimulus encoding in Experiment 2 by using task-relevant trial-unique images.

Experiment 2

We ran Experiment 2 to address the question of how structure learning, facilitated by motor clustering biases, affects incidental encoding. Because the object images in Experiment 1 were task-irrelevant, differences in how learning affects incidental encoding may have been masked, limiting our ability to adjudicate between our hypotheses. Specifically, the effect of structure learning on incidental memory may depend on the level of stimulus encoding. We therefore sought to render the trial-unique images directly task-relevant, which should promote deeper encoding. Including task-relevant trial-unique images should also control for feature priming confounds in previous work, since each trial now shows a completely new and unique stimulus (cf. Collins and Frank, 2013, 2016a,b; Collins et al., 2014).

Faces are social, have inherent value (e.g., Smith et al., 2010), can prime attentional categories (e.g., Cañadas et al., 2013), and may be categorized by a number of features (e.g., age, gender identity, emotion). Here, using face features as task-relevant stimulus categories (face age/gender identity), we tested how implicit task-set learning affects cognitive flexibility and incidental encoding.