The attentional selection in visual search within short-term memory representations
University of Oxford, Oxford, UK
Our representation of the visual world can be modulated by spatially specific attentional biases that depend flexibly on task goals. We compared searching for task-relevant features in perceived versus remembered objects. When searching perceptual input, selected task-relevant and suppressed task-irrelevant features elicited contrasting spatiotopic ERP effects, despite them being perceptually identical. This was also true when participants searched a memory array, suggesting that memory had retained the spatial organization of the original perceptual input and that this representation could be modulated in a spatially specific fashion. However, task-relevant selection and task-irrelevant suppression effects were of the opposite polarity when searching remembered compared to perceived objects. We suggest that this surprising result stems from the nature of feature- and object-based representations when stored in visual short-term memory. When stored, features are integrated into objects, meaning that the spatially specific selection mechanisms must operate upon objects rather than specific feature-level representations.
It is becoming increasingly clear that attentional biases can operate upon representations stored in visual short-term memory (VSTM), but the mechanisms by which this occurs remain largely unknown. In the case of incoming perceptual events, top-down attentional signals are believed to bias neural processing throughout multiple visual areas according to anticipated spatial locations or object features that are task-relevant (Desimone and Duncan, 1995). In the case of events maintained in VSTM, it is not clear whether the same feature-level information is available for attentional biasing, and whether the representation maintains the relative spatiotopic layout that was present in the original percept (Fabiani et al., 2003).
According to an increasingly popular account, VSTM maintenance involves at least some portion of the activity related to the initial perceptual processing of the items, supported by the activity of domain-general executive areas, especially posterior parietal and prefrontal cortex (D’Esposito et al., 2000; Passingham and Sakai, 2004). We might therefore expect VSTM to retain something of its original feature-based constitution and spatial organization. Such a representation might provide a template upon which top-down biases might operate according to the current task-goal (Desimone and Duncan, 1995).
In our study, we investigated whether the individual features that constitute objects are still available for attentional selection within VSTM. We made use of an on-line ERP marker related to attentional selection of targets within perceptual/visual arrays – the N2pc. This component is revealed by comparing recordings from electrodes contralateral to the attended target with those from ipsilateral electrodes. The comparison collapses across left- and right-hemisphere electrodes, but maintains the relative contralaterality of the electrode position. The N2pc is an enhanced negativity contralateral, relative to ipsilateral, to the target stimulus over posterior electrode sites, typically occurring between 200 and 300 ms (Eimer, 1996; Hickey et al., 2006; Luck and Hillyard, 1994a,b). The standard paradigm for eliciting the N2pc involves the presentation of an array of stimuli, in which participants must search covertly for the presence of a particular object.
In the first of two experiments, we tested whether this marker of the spatial biasing of neural activity according to the target location could be induced when participants searched for a specific feature of a perceived object within a visual array, while ignoring the other feature dimension. In addition to testing for the presence of feature-specific biases in neural activity, the experiment enabled us to control rigorously for the degree of perceptual similarity between the array items and the probe item. It became possible, therefore, to test whether this modulation could be driven entirely by a top-down task goal, rather than by bottom-up perceptual processes or perceptual priming. In the second experiment, we used a similar approach to test whether feature-based selection could proceed in a similar fashion when participants searched for target features from within an array of items maintained in VSTM.
A schematic of the first experiment is presented in Figure 1A. All stimuli (in both probe and search arrays) were shapes with a thin colored border (i.e., each was defined by a conjunction of a color and shape feature). After the presentation of the probe stimulus, participants were told, via a cue, which of the two features was relevant for the current trial. An array of two objects then appeared, one to the left and the other to the right of fixation. The participant’s task was to search the array covertly and establish whether it contained the relevant feature. For example, when color was the task-relevant feature, participants determined whether an object in the array shared the color of the probe. On every trial one feature of the probe would always be present. In arrays containing a ‘target’, the matching feature in the array object would be the task-relevant feature. In arrays containing a ‘distracter’, the matching feature in the array object would be task-irrelevant. Therefore, the probe partially matched one of the array objects on every trial, but it was never a perfect visual match. This control over the degree of visual similarity between target and distracter arrays differentiates our task from the traditional visual search situation. Participants should not simply search the array for the best visual match to the probe item – this is as likely to be a distracter as it is a target. Instead, participants must search for a task-relevant feature match, and ignore any task-irrelevant matches. Would we still observe a spatially specific modulation of the perceptual representation, even when this search is feature-guided and relies entirely on top-down control rather than perceptual similarity?
Figure 1. The experimental paradigm. (A) Perceptual search: participants first saw a probe stimulus and then a task cue instructing them on whether the color or shape of that probe was relevant for the upcoming search. Finally, an array of two shapes appeared and participants had to respond as to whether the relevant probe feature was present in the array. (B) Memory search: participants first saw the array of two objects, and were subsequently instructed as to whether color or shape was relevant on that trial. Finally, participants were presented with the probe, and had to decide whether the memory array contained the relevant feature of the probe. In both cases the actual display consisted of white figures (shapes and cues) on a black background. In both cases participants responded with their right hand, using the index finger for ‘target-present’ and the middle finger for ‘target-absent’.
Materials and Methods
Participants
Fourteen participants (four male) completed a combined behavioral and electrophysiological recording session; all were right-handed and had normal or corrected-to-normal vision. One participant was removed due to excessive oculomotor artefacts, leaving 13 participants, with an average age of 25 ± 2.61 years (SD). Each participant provided written informed consent. The study was approved by the medical ethics review board at the University of Oxford, UK.
Task
We presented participants with single shapes, each with a thin colored border. Following a delay, participants were cued to ‘select’ either the color or shape feature from their memory of the shape. Following a further delay participants were presented with two shapes, one to the left and one to the right of fixation (see Figure 1). Participants’ task was to search this simple array covertly for the task-relevant feature. Participants also had to ignore any distracters; i.e., they had to ignore any matches to the task-irrelevant feature.
Design
Each trial started with the onset of an asterisk (which was formed from the two task cues overlaid on one another). After a randomized variable interval (500–1000 ms), a single white shape with a colored border appeared for 400 ms, which was termed the ‘probe’. This was then again replaced by the asterisk for 400–800 ms. Following this, the asterisk turned into a cue, which was either a ‘+’ or an ‘×’, for 1000 ms. The cue instructed participants to select either the color or shape feature from the preceding probe. The meaning of the cue (select shape or select color) was counterbalanced across participants. This cue then turned back into the asterisk for 100–500 ms, until the onset of the ‘array’. This was a pair of objects, each with a colored border, with one being presented to the left and one to the right of fixation. One feature from the probe stimulus would always be present in the array, i.e., one of the array shapes would share either the color or the shape of the original probe item, but it would never be shared by both array shapes, nor would both probe features be present in the array. After 200 ms the array disappeared. The asterisk was present until a response was selected, but timed-out at 1800 ms. A failure to respond within this window was recorded as an error. Participants searched this array of objects, to establish whether the task-relevant feature was present or absent. Importantly, the participants were only to search for a match in the relevant dimension (e.g., shape), and ignore any matches in the irrelevant dimension (e.g., color). Responses were made with the right hand, with the index finger meaning target present and the middle finger meaning target absent. Following a blank screen of 1000–1500 ms the next trial started.
There were equal numbers of ‘select color’ and ‘select shape’ trials, equal numbers of task-switch and task-repeat trials, and an equal number of target-present and target-absent (distracter) trials.
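The trial structure described above can be summarized as a short sketch. This is an illustration only, not the presentation software used in the study: the function names, the event labels and the uniform sampling of the jittered intervals are assumptions, and the sketch balances only task and target presence (not task-switch versus task-repeat). The 1800 ms entry is the upper bound of the response window; the text does not state whether it was timed from array onset or offset.

```python
import random
from itertools import product

def make_trial(rng, task, target_present):
    """Return (event, duration_ms) pairs for one Experiment 1 trial."""
    array_type = "target" if target_present else "distracter"
    return [
        ("asterisk", rng.randint(500, 1000)),   # jittered pre-probe interval
        ("probe", 400),
        ("asterisk", rng.randint(400, 800)),
        ("cue:" + task, 1000),                  # '+' or 'x', mapping counterbalanced
        ("asterisk", rng.randint(100, 500)),
        ("array:" + array_type, 200),
        ("response_window", 1800),              # upper bound; ends at the response
        ("blank", rng.randint(1000, 1500)),
    ]

def make_trials(n_per_cell, seed=0):
    """Equal numbers of color/shape and target/distracter trials, shuffled."""
    rng = random.Random(seed)
    cells = list(product(("color", "shape"), (True, False))) * n_per_cell
    rng.shuffle(cells)
    return [make_trial(rng, task, present) for task, present in cells]

trials = make_trials(n_per_cell=5)
```

Each element of `trials` is an ordered event list, so the jittered intervals can be inspected or summed to estimate trial duration.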
Procedure
Task performance was broken into short blocks containing 10 trials, as recent work has suggested that shorter blocks enable participants to make better use of a cue to select a task-set (Astle et al., 2008). Participants completed a pure block of the ‘select-color’ task and a pure block of the ‘select-shape’ task (the order of which was counterbalanced across participants) as practice. Following this, participants completed eight blocks of mixed practice, before proceeding to 50 blocks of experimental trials. A break was given after every block, the duration of which was determined by the subject.
Stimuli
A black background was used throughout the experiment. Arrays were composed of two objects quasi-randomly selected from a set of four white shapes (a triangle, diamond, square and a circle). Each object had a 2-mm thick colored border (red, orange, green or blue). In pilot work, we had varied both the thickness of the color border and the shade of the color in order to make the color task more difficult. This meant that the shape and color tasks had roughly equal difficulty, and, by making colored borders of the shape less salient, irrelevant colors were not more distracting than irrelevant shapes.
Each item in the target array subtended approximately 2° × 2° at its widest/highest point, and was positioned along the outer edge of a diagonal (upper left and lower right; or upper right and lower left) of an invisible 2 × 2 matrix that subtended approximately 2.6° × 2.4°. The probe items were identical in size to the target array items. The two cues were a white ‘×’ or ‘+’, each subtending approximately 2° × 2°.
Electrophysiological recording protocol
Participants were seated in an electrically shielded booth, approximately 1 m from the screen, with their head on a chin-rest. The EEG was recorded continuously using NuAmps amplifiers (Neuroscan, Inc.) from 40 silver–silver chloride electrodes placed on the scalp with an elasticated cap, positioned according to the 10-20 international system. The montage included six midline sites (FZ, FCZ, CZ, CPZ, PZ and OZ) and 14 sites over each hemisphere (FP1/FP2, F3/F4, F7/F8, FC3/FC4, FT7/FT8, C3/C4, T7/T8, CP3/CP4, TP7/TP8, P3/P4, P7/P8, PO3/PO4, PO7/PO8 and O1/O2). Electrodes were placed around the eyes to monitor for blinks and eye movements. Additional electrodes were used as ground and reference sites. Electrodes were referenced to the right mastoid site. The electrode between FPZ and FZ on the midline served as the ground electrode. Electrode impedances were kept below 5 kΩ. The ongoing brain activity at each electrode site was sampled every 1 ms (1000-Hz analog-to-digital sampling rate). Activity was filtered with a low-pass filter of 300 Hz. The EEG was recorded continuously during the entire duration of each experimental run.
Event-related potential formation
The EEG was re-referenced off-line to the algebraic average of the right and left mastoids. Bipolar electro-oculogram (EOG) signals were derived by computing the difference between the voltages at electrodes placed horizontally to each eye (HEOG) and between the voltages at electrodes placed vertically (VEOG). The re-referenced and transformed continuous data were further filtered to exclude high-frequency noise (low-pass-filter 40 Hz).
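The re-referencing step follows from simple arithmetic. Because the data were recorded against the right mastoid, that site is zero by definition, so the average of the two mastoids is half of the recorded left-mastoid signal, and subtracting it from every channel gives the averaged-mastoid reference. The sketch below illustrates this and the bipolar EOG derivation; the function names and the list-of-samples representation are assumptions made for illustration.

```python
def rereference_to_mean_mastoids(channel, left_mastoid):
    """Re-reference one channel to the average of the two mastoids.

    Both inputs are assumed to be recorded against the right mastoid,
    so the mastoid average is (left_mastoid + 0) / 2.
    """
    return [v - 0.5 * m for v, m in zip(channel, left_mastoid)]

def bipolar(site_a, site_b):
    """Bipolar derivation (e.g., HEOG or VEOG): sample-wise difference."""
    return [a - b for a, b in zip(site_a, site_b)]

po7 = rereference_to_mean_mastoids([10.0, 20.0], [4.0, -2.0])
heog = bipolar([5.0, 3.0], [2.0, 4.0])
```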
The continuous EEG was then segmented into epochs starting 100 ms before and ending 400 ms after each stimulus array. The EEG epochs were normalized to the baseline period before stimulus presentation. Epochs containing excessive noise or drift (±100 μV) at any electrode were excluded. In addition, epochs with eye-movement artifacts (blinks or saccades) were rejected. Blinks were identified as large deflections (±50 μV) in the HEOG or VEOG electrodes. Finally, the first trial in a block and any trials with incorrect or absent behavioral responses were also discarded.
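A minimal sketch of the epoching and rejection steps is given below, assuming the 1000-Hz sampling rate and the −100 to 400 ms epoch described above. The function names are hypothetical, and applying the thresholds to the absolute baseline-corrected voltage is an assumption about how the ±100 μV and ±50 μV criteria were implemented.

```python
SAMPLE_RATE_HZ = 1000
PRE_MS, POST_MS = 100, 400

def extract_epoch(signal, onset_index):
    """Cut one epoch around a stimulus onset and subtract the pre-stimulus mean."""
    pre = PRE_MS * SAMPLE_RATE_HZ // 1000
    post = POST_MS * SAMPLE_RATE_HZ // 1000
    segment = signal[onset_index - pre:onset_index + post]
    baseline = sum(segment[:pre]) / pre
    return [v - baseline for v in segment]

def is_clean(epoch_by_channel, eeg_limit_uv=100.0, eog_limit_uv=50.0,
             eog_channels=("HEOG", "VEOG")):
    """Return False if any channel exceeds its absolute voltage limit."""
    for name, epoch in epoch_by_channel.items():
        limit = eog_limit_uv if name in eog_channels else eeg_limit_uv
        if max(abs(v) for v in epoch) > limit:
            return False
    return True

# Toy signal: flat 2 uV baseline followed by a 5 uV plateau after onset.
epoch = extract_epoch([2.0] * 100 + [5.0] * 400, onset_index=100)
```

An epoch would then be retained only if `is_clean` returns `True` for the full set of EEG and EOG channels and the behavioral response on that trial was correct.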
Epochs were averaged according to whether target or distracter items appeared in the left or the right of the array. ERPs from trials containing targets/distracters located on the right, and from trials containing targets/distracters located on the left, side of the arrays were subsequently combined by an averaging procedure that preserved the electrode location relative to the target/distracter side (ipsilateral or contralateral). We then compared waveforms over posterior contralateral versus ipsilateral electrode locations for target and distracter trials. The N2pc is observed as a relative negativity at posterior electrode sites contralateral versus ipsilateral to an attended stimulus. Typically the N2pc is recorded and displayed at PO7/8, usually being maximal between 200 and 300 ms post array onset (Eimer, 1996; Eimer and Kiss, 2008; Hickey et al., 2006; Kiss et al., 2008; Luck and Hillyard, 1994a,b). We included four electrode sites over both contralateral and ipsilateral scalp in our analyses (PO7/8, PO3/4, P7/8 and O1/2).
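The collapsing of left- and right-sided trials into contralateral and ipsilateral waveforms can be sketched as follows for a single electrode pair (e.g., PO7/PO8). The trial dictionary layout and function names are hypothetical, and pooling all trials before averaging (rather than averaging each side first) is an assumption, since the text does not specify the weighting.

```python
def mean_waveform(waveforms):
    """Sample-wise mean across a list of equal-length waveforms."""
    n = len(waveforms)
    return [sum(samples) / n for samples in zip(*waveforms)]

def contra_ipsi(trials):
    """Collapse across stimulus side while preserving relative laterality.

    Each trial is a dict with 'side' ('left' or 'right', the side of the
    target/distracter) and epochs from the left- and right-hemisphere
    electrode of one pair.
    """
    contra, ipsi = [], []
    for trial in trials:
        if trial["side"] == "left":
            contra.append(trial["right_hemi"])
            ipsi.append(trial["left_hemi"])
        else:
            contra.append(trial["left_hemi"])
            ipsi.append(trial["right_hemi"])
    return mean_waveform(contra), mean_waveform(ipsi)

example_trials = [
    {"side": "left", "left_hemi": [1.0, 1.0], "right_hemi": [-1.0, -3.0]},
    {"side": "right", "left_hemi": [-3.0, -1.0], "right_hemi": [3.0, 1.0]},
]
contra, ipsi = contra_ipsi(example_trials)
```

In this toy example the contralateral waveform is more negative than the ipsilateral one at both samples, which is the direction expected for an N2pc.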
Statistical comparisons
We submitted the mean voltage across the 230–280 ms window to a repeated-measures ANOVA for factors target versus distracter, hemisphere contralateral versus ipsilateral relative to the stimulus and electrode (four levels). Where appropriate, the statistical tests were calculated using the Greenhouse–Geisser correction, to control for the potential non-sphericity of EEG data (Jennings and Wood, 1976).
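As an illustration of how each cell of the ANOVA could be obtained, the sketch below computes the mean voltage within a latency window from one averaged epoch. The epoch start (−100 ms) and the sampling rate come from the recording description; the function name and the half-open treatment of the window (230 ms inclusive, 280 ms exclusive) are assumptions.

```python
def mean_in_window(epoch, start_ms, end_ms, epoch_start_ms=-100,
                   sample_rate_hz=1000):
    """Mean voltage of an epoch between start_ms (inclusive) and end_ms (exclusive)."""
    first = (start_ms - epoch_start_ms) * sample_rate_hz // 1000
    last = (end_ms - epoch_start_ms) * sample_rate_hz // 1000
    window = epoch[first:last]
    return sum(window) / len(window)

# Toy epoch: a 2 uV deflection confined to 230-280 ms after array onset.
toy_epoch = [0.0] * 330 + [2.0] * 50 + [0.0] * 120
n2pc_window_mean = mean_in_window(toy_epoch, 230, 280)
```

One such value per participant, condition (target/distracter), hemisphere (contralateral/ipsilateral) and electrode pair would form the input to the repeated-measures ANOVA.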
We also statistically compared the distribution of the contralateral–ipsilateral differences. We did this by comparing the differences across one hemisphere (as the other hemisphere is just the mirror of this), using 13 electrodes (we omitted FP1/2, because these recordings tend to be noisy). We compared differences of the same polarity, using normalized voltages (McCarthy and Wood, 1985). This enabled us to restrict the comparison to the distribution of the effects, rather than their amplitude.
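The logic of amplitude normalization can be illustrated with vector scaling, one of the rescaling approaches associated with McCarthy and Wood (1985). The text does not state which variant was used here, so the choice of vector scaling, the function name and the example values below are assumptions. Each condition's set of electrode values is divided by its vector length, so that two effects with the same scalp distribution but different overall amplitudes become identical.

```python
import math

def vector_scale(values):
    """Divide electrode values by their vector length (root sum of squares)."""
    norm = math.sqrt(sum(v * v for v in values))
    if norm == 0:
        raise ValueError("cannot normalize an all-zero distribution")
    return [v / norm for v in values]

small_effect = vector_scale([1.0, 2.0, 2.0])   # hypothetical difference values
large_effect = vector_scale([2.0, 4.0, 4.0])   # same shape, twice the amplitude
```

After scaling, any remaining condition by electrode interaction reflects a difference in the shape of the distribution rather than in its overall size.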
Results and Discussion
Behavioral comparisons
We compared reaction times (RTs) to targets and distracters. We also compared the RTs for those trials upon which participants searched for an object of a particular color, with those upon which they searched for a particular shape. A repeated-measures ANOVA with condition (target/distracter) and judgment (color match/shape match) as within-subject factors revealed a significant main effect of target/distracter [F(1,12) = 11.146, p = 0.006]. Participants responded more quickly to targets than distracters (target: 512 ms; distracter: 551 ms). There was no effect of whether the relevant feature was a color or a shape [F(1,12) = 0.015, p = 0.906], and these factors did not interact [F(1,12) = 2.695, p = 0.127]. For errors, the main effect of target/distracter approached significance [F(1,12) = 4.251, p = 0.061], with distracter trials being more error-prone (target: 12% errors; distracter: 16% errors). The main effect of judgment reached significance [F(1,12) = 6.502, p = 0.025], with participants being more error-prone on the color judgment than on the shape judgment (color: 16% errors; shape: 12% errors). There was no interaction between these factors [F(1,12) = 0.427, p = 0.526]. Thus, participants did not find task-irrelevant colors more distracting than task-irrelevant shapes, or vice versa.
ERP results
Epochs were averaged according to whether they were collected on target or distracter trials. We further divided the target and distracter waveforms into those during which the target or distracter were presented on the left or on the right of the array. ERPs from trials containing targets/distracters located on the right, and from trials containing targets/distracters located on the left, were combined by an averaging procedure that preserved the electrode location relative to the target side (ipsilateral or contralateral). We then compared the four waveforms: target (contralateral and ipsilateral) and distracter (contralateral and ipsilateral). We compared ERPs at four lateral posterior electrode sites over both contralateral and ipsilateral scalp (PO7/8, PO3/4, P7/8 and O1/2).
Reliable contralateral–ipsilateral differences occurred following the onset of the array. These are summarized in Figure 2. We calculated the mean voltage across a window, 230–280 ms post array onset, when the N2pc was expected to be maximal (Eimer, 1996; Luck and Hillyard, 1994a,b). We submitted these mean voltages to a three-way ANOVA, with the within-participants factors of target/distracter, hemisphere relative to stimulus (contralateral/ipsilateral), and electrode pair (PO7/8, PO3/4, P7/8, O1/2). This resulted in a two-way interaction between target/distracter and hemisphere [F(1,12) = 6.295, p = 0.027], driven by a significant difference between contralateral and ipsilateral waveforms on target trials [F(1,12) = 5.943, p = 0.031], but no such difference on distracter trials [F(1,12) = 0.001, p = 0.981]. There was no three-way interaction between target/distracter, hemisphere and electrode [F(2.36, 28.24) = 0.539, p = 0.618].
Figure 2. Grand average waveforms time-locked to the array onset at 0 ms. The left-hand figure shows target-locked waveforms, and the right-hand figure shows distracter-locked waveforms. In both cases the blue waveform indicates the average voltage (μV) recording from electrodes ipsilateral to the matching feature, and the red waveform indicates the average voltage (μV) recording from electrodes contralateral to the matching feature. In both cases the waveforms indicate the mean of the PO3/4, PO7/8, O1/2 and P7/8 recordings. Below each waveform is a topographical plot. Both plots show the distribution of the contralateral minus ipsilateral difference, with the left-hemisphere showing contralateral minus ipsilateral and the right-hemisphere being the mirror of this (ipsilateral minus contralateral). This is shown for the targets from 230 to 280 ms, and for the distracters from 300 to 350 ms.
We also noted that there appeared to be a contralateral positivity, relative to the ipsilateral waveform, for distracters (see Figure 2). To identify the time-course at which this effect was reliable for the distracter trials, we ran successive point-wise t-tests, comparing the average of the four electrodes from contralateral and ipsilateral hemisphere recordings (Guthrie and Buchwald, 1991; Murray et al., 2002). We defined the peak on the basis of the significance of these t-tests (p value) rather than the amplitude of the difference. On this basis, we selected a 50-ms window from 300–350 ms, which spanned the peak of the difference, and submitted the mean voltages from this window to a three-way ANOVA. The interaction between target/distracter and hemisphere reached significance [F(1,12) = 5.116, p = 0.043]. This was the result of an effect of contralaterality on distracter trials [F(1,12) = 6.783, p = 0.023], but not on target trials [F(1,12) = 0.274, p = 0.610]. The three-way interaction between target/distracter, hemisphere and electrode did not approach significance [F(2.03, 24.31) = 0.274, p = 0.610].
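The point-wise procedure can be sketched as a paired t-test at every time point across participants, followed by locating the most significant sample and any runs of consecutive significant samples. This is a simplified illustration rather than the authors' analysis code: the function names are hypothetical, significance is judged against a fixed critical t value supplied by the caller (for 13 participants, df = 12, the two-tailed 0.05 criterion is approximately 2.179), and no minimum run length is enforced.

```python
import math

def paired_t(a, b):
    """Paired-samples t statistic for two equal-length lists."""
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    mean = sum(diffs) / n
    var = sum((d - mean) ** 2 for d in diffs) / (n - 1)
    if var == 0:
        return 0.0 if mean == 0 else math.copysign(math.inf, mean)
    return mean / math.sqrt(var / n)

def pointwise_t(contra, ipsi):
    """contra/ipsi hold one waveform per participant; returns one t per sample."""
    n_samples = len(contra[0])
    return [paired_t([w[i] for w in contra], [w[i] for w in ipsi])
            for i in range(n_samples)]

def peak_sample(t_values):
    """Index of the most significant sample (largest |t| for fixed df)."""
    return max(range(len(t_values)), key=lambda i: abs(t_values[i]))

def significant_runs(t_values, t_crit):
    """(start, end) index pairs, end exclusive, where |t| exceeds t_crit."""
    runs, start = [], None
    for i, t in enumerate(t_values):
        if abs(t) > t_crit and start is None:
            start = i
        elif abs(t) <= t_crit and start is not None:
            runs.append((start, i))
            start = None
    if start is not None:
        runs.append((start, len(t_values)))
    return runs

# Toy data: three participants, four samples; df = 2, so t_crit = 4.303.
contra = [[0.0, 1.0, 5.0, 5.0], [1.0, 0.0, 6.0, 4.0], [0.0, -1.0, 7.0, 6.0]]
ipsi = [[0.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0], [1.0, 0.0, 0.0, 0.0]]
t_values = pointwise_t(contra, ipsi)
runs = significant_runs(t_values, t_crit=4.303)
```

A window for the follow-up ANOVA could then be centered on `peak_sample(t_values)`, mirroring the choice of a window spanning the peak of the difference.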
Thus we found two distinct, spatially specific effects: The first resembled the N2pc (230–280 ms), with a negativity contralateral to the location of task-relevant match (a target). The second occurred slightly later (300–350 ms) and was characterized by a positivity contralateral to the location of a task-irrelevant match (a distracter). In short, the modulation was not based upon the degree of perceptual similarity between the array and probe items (i.e., bottom-up influences), because targets and distracters, which were perfectly matched in this respect, elicited contrasting effects. Rather the modulation was dependent upon the task-relevance of that similarity.
Topographical analyses
We produced a topographical plot for these two effects – between 230 and 280 ms on target trials, and between 300 and 350 ms on distracter trials (see Figure 2). In each plot, the left-hemisphere displays the contralateral minus ipsilateral ERP, and the right-hemisphere displays the mirror of this difference. The first plot shows the distribution of the difference on target trials (230–280 ms), and the second on distracter trials (300–350 ms). We tested whether these two effects had a significantly different distribution by comparing the contralateral minus ipsilateral difference from the target trials with the ipsilateral minus contralateral difference from the distracter trials, such that in both cases the differences were negative. We then normalized these data, to control for any differences in amplitude (McCarthy and Wood, 1985). Having controlled for the differences in polarity and potential differences in amplitude, we compared the distribution of the differences by calculating a two-way ANOVA, with the within-participants factors of target/distracter and electrode. The analysis used the subtraction waveforms across all lateral electrode pairs, except FP1/2 (omitted because recordings tended to be noisy at these sites), leaving 13 pairs. There was no significant interaction between target/distracter and electrode [F(5.93,71.20) = 0.494, p = 0.809], suggesting that whilst the effects were at different latencies and of the opposite polarity, the contralateral ‘target negativity’ and ‘distracter positivity’ had statistically indistinguishable distributions.
Initial ERP studies had linked the N2pc to distracter-suppression (Luck and Hillyard, 1994a,b). Many studies exploring the N2pc, however, have simultaneously presented targets in one hemifield and distracters in the other hemifield. Given that target-selection and distracter-suppression might elicit opposing effects, the contralateral–ipsilateral difference that is termed the ‘N2pc’ could, in principle, index both target-selection and distracter-suppression. A recent study overcame this confound by manipulating the laterality of targets and distracters orthogonally (Hickey et al., 2008). Targets elicited a contralateral negativity and salient distracters elicited the reverse: a contralateral positivity. In this recent study, participants searched for objects rather than specific features; however, these results are consistent with those that we observed in Experiment 1. Specifically, we suggest that the N2pc may reflect both spatially specific enhancement of task-relevant features and suppression of task-irrelevant features (Desimone and Duncan, 1995) in a ‘salience’ map used to guide selection and action. In Experiment 1 this is brought about by the top-down biases applied to the specific features of objects that are necessary for distinguishing targets and distracters, though, as in the case of Hickey et al. (2008), top-down biases may also be applied to object-level representations. Our data demonstrate that this spatially specific modulation can occur entirely on the basis of top-down control. In addition, these enhancement and suppression effects can be temporally separated, with participants prioritizing the enhancement of task-relevant features before the suppression of task-irrelevant features. The fact that we observed distracter-suppression effects would also suggest that the irrelevant probe feature is still present in VSTM when subjects search the perceptual array; even though subjects already know by this point that the feature is irrelevant, and may cause them to make an incorrect response, their attention is still captured by its correspondence with a feature in the perceptual array. One possibility is that both features of the probe stimulus become bound together into a single object when they are stored in VSTM (Awh et al., 2007; Vogel et al., 2001).
In a second experiment, we tested whether equivalent attentional biasing mechanisms are applied when the selection of targets or suppression of distracters occurs within representations that are stored in VSTM. If VSTM representations retain their constituent feature-level information within a spatial layout, in a way that is analogous to that in the original percept, one might expect similar ERP markers of spatially specific selection and suppression.
A number of previous electrophysiological studies of VSTM have examined spatial selection processes that occur during a retention interval, as subjects maintain items presented to the left or to the right of fixation in VSTM (Klaver et al., 1999). When participants selectively maintain objects presented on one side of space in order to perform a subsequent change-detection task, a sustained negative potential is observed over the contralateral posterior scalp, termed contralateral delay activity or sustained posterior contralateral negativity (Dell’Acqua et al., 2006; Klaver et al., 1999; Vogel and Machizawa, 2004). This lateralized brain activity has been found to vary with the number of items being held in VSTM (McCollough et al., 2007) and according to individual differences in VSTM capacity (Vogel and Machizawa, 2004). Accordingly, it has been interpreted to reflect delay activity in posterior areas related to the spatially selective maintenance of items in VSTM, though it is also possible that it reflects delay activity related to spatial attention in anticipation of the subsequent items that are relevant for task performance (e.g., Chelazzi et al., 1993; Luck et al., 1997). Although these studies have shed light on the question of what determines VSTM capacity and on the maintenance of perceptual information within VSTM, an orthogonal issue remains unexplored: do VSTM representations retain feature-specific information that can be searched in a spatially specific manner? In Experiment 2 we turn to this question.
Very few studies have examined the specific processes that occur as subjects search VSTM, though a consistent picture is emerging. Fabiani et al. (2003) first demonstrated that objects stored in VSTM retain a spatial layout that is based upon the original percept. More recently, Kuo et al. (submitted) compared the selection of whole objects from perceptual input and whole objects from VSTM: similar, spatially specific N2pc potentials were elicited when participants searched for a probe object within a visually presented array or a stored array. That is, target objects elicited a relative negativity over contralateral scalp regardless of whether they were present in perceptual input or stored in VSTM, suggesting that objects held in VSTM retain a spatial organization that is based upon the original percept and that similar mechanisms of spatial selection of objects can operate on perceptual and VSTM representations. The present experiment goes further by testing whether VSTM search can also be directed at the level of individual features within objects, by examining biasing mechanisms related both to selection and suppression of information based on task goals, and by controlling rigorously for bottom-up factors or feature-overlap between probe and array that could otherwise bias neural activity.
In Experiment 2 we reversed the order of the stimulus displays in our task. Participants first viewed an array of items, and subsequently viewed the probe stimulus (see Figure 1B, and the Section ‘Materials and Methods’ for details). When the probe is presented participants must search their memory of the array. The central question was whether this VSTM search would elicit lateralized visual ERP effects similar to those observed in Experiment 1.
We can conceive of three possible outcomes. The first amounts to finding that VSTM does not retain its feature-based or object-based information in a spatial layout that can be biased by ‘top-down’ signals about task-relevant items. In this case we should not observe any effects of contralaterality, for either targets or distracters. In light of the recent findings by Kuo et al. (submitted), this possibility is unlikely. The second possibility is that feature-level information is represented within a spatial layout in much the same way as it is during perception, such that task-goal biasing signals can operate upon the feature-level information in much the same way as during perceptual search. If this is the case, we would expect to observe a similar pattern of modulation during VSTM search as in Experiment 1. Target arrays would be expected to elicit relative negativities over contralateral versus ipsilateral sites; and distracter arrays would be expected to elicit relative positivities over contralateral versus ipsilateral sites. The third possibility is that VSTM retains a spatially organized representation, but that its nature differs from that of perceptual representations in the way that individual feature-levels are stored or can be accessed. In VSTM, the features may be bound into integrated objects. Indeed, some evidence suggests that when items are stored in VSTM, features are bound together to form single objects (Awh et al., 2007; Vogel et al., 2001). This would make it difficult to distinguish targets from distracters, because the features that are necessary to do so would be bound together. Therefore, in this final possibility, participants will need to employ different mechanisms to distinguish targets and distracters.
Materials and Methods
All methods were the same as in Experiment 1 unless otherwise noted.
Participants
Twelve right-handed participants (seven male) with normal or corrected-to-normal vision participated, with an average age of 24.5 ± 2.94 years (SD).
Design
The second experiment was the same as the first experiment, except for a change in the order in which the probe and array were presented. Rather than being presented with the probe stimulus first, and then the array, participants viewed the array first, and subsequently the probe (see Figure 1B). The timing of stimulus presentations was the same as in Experiment 1: the original array was presented for 400 ms and the subsequent probe was presented for 200 ms.
Electrophysiological recording and ERP formation
In Experiment 2, the ERPs were time-locked to probe onset. ERPs were constructed according to whether a target or distracter had appeared on the left or the right of the remembered array maintained in VSTM. ERPs at contralateral and ipsilateral posterior electrodes were compared for target and distracter trials. The distribution of the N2pc effect was more anterior during VSTM search in Experiment 2, so we revised the electrodes used for analysis accordingly (PO7/8, P3/4, P7/8 and TP7/8).
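The contralateral/ipsilateral assignment described above can be sketched in Python. This is a minimal illustration rather than the analysis pipeline actually used: the function name `contra_ipsi`, the dictionary layout of a trial, and the `ELECTRODE_PAIRS` constant are assumptions made for the example. Only the electrode sites are taken from the text, and the sketch relies on the standard convention that odd-numbered sites lie over the left hemisphere.

```python
from statistics import fmean

# Homologous (left, right) electrode pairs named in the text.
# Odd-numbered sites are over the left hemisphere, even-numbered over the right.
ELECTRODE_PAIRS = [("PO7", "PO8"), ("P3", "P4"), ("P7", "P8"), ("TP7", "TP8")]


def contra_ipsi(trial, item_side):
    """Return (contralateral, ipsilateral) waveforms for one trial.

    trial: hypothetical dict mapping electrode name -> list of voltages,
           one value per time sample, time-locked to probe onset.
    item_side: "left" or "right", the side on which the target or
               distracter appeared in the remembered array.
    """
    if item_side not in ("left", "right"):
        raise ValueError("item_side must be 'left' or 'right'")

    contra_waves, ipsi_waves = [], []
    for left_site, right_site in ELECTRODE_PAIRS:
        if item_side == "left":
            # Item was on the left: right-hemisphere sites are contralateral.
            contra_waves.append(trial[right_site])
            ipsi_waves.append(trial[left_site])
        else:
            contra_waves.append(trial[left_site])
            ipsi_waves.append(trial[right_site])

    # Average across electrode pairs, sample by sample.
    contra = [fmean(samples) for samples in zip(*contra_waves)]
    ipsi = [fmean(samples) for samples in zip(*ipsi_waves)]
    return contra, ipsi
```

Because the probe is presented centrally, `item_side` here refers to the location in the memorized array, not to anything on the screen at the time the ERP is measured.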
Results and Discussion
Behavioral comparisons
We compared performance across the four trial types in the same way as in Experiment 1, using a repeated-measures ANOVA. For RTs, there was a main effect of target/distracter [555 ms for target versus 610 ms for distracter trials, F(1,11) = 16.649, p = 0.002]. There was also a main effect of the judgment [568 ms for the color match versus 597 ms for the shape match judgment, F(1,11) = 9.904, p = 0.009], but these two factors did not interact [F(1,11) = 1.210, p = 0.295]. In the error data, the main effect of target/distracter approached significance [F(1,11) = 3.511, p = 0.088], with distracter trials being more error-prone (22% errors) than target trials (17% errors), but there was no main effect of judgment [F(1,11) = 1.199, p = 0.297]. As before, these two factors did not interact [F(1,11) = 0.652, p = 0.436]. As in Experiment 1, we concluded that whilst the judgments were not equally easy, participants did not find irrelevant but matching colors more or less distracting than irrelevant but matching shapes.
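For readers who want to see where an F(1,11) value comes from, the following Python sketch may help. In a two-by-two within-participant design each effect has one numerator degree of freedom, so the F ratio for a main effect equals the square of a paired t statistic computed on per-participant means. The function name `main_effect_F` and the numbers in the usage lines are invented for illustration; they are not the data from this experiment, and the statistical software actually used is not stated in the text.

```python
import math
from statistics import fmean, stdev


def main_effect_F(level_a, level_b):
    """F ratio for a two-level within-participant factor.

    level_a, level_b: per-participant means for the two levels of the
    factor (each already averaged over the other factor).
    Returns (F, (df_numerator, df_denominator)).
    Assumes the differences are not all identical (non-zero variance).
    """
    if len(level_a) != len(level_b) or len(level_a) < 2:
        raise ValueError("need two equal-length lists with at least 2 participants")
    diffs = [a - b for a, b in zip(level_a, level_b)]
    standard_error = stdev(diffs) / math.sqrt(len(diffs))
    t = fmean(diffs) / standard_error
    # With one numerator degree of freedom, F = t squared.
    return t * t, (1, len(diffs) - 1)


# Made-up RTs (ms) for three hypothetical participants.
F, df = main_effect_F([611.0, 602.0, 613.0], [610.0, 600.0, 610.0])
```

With twelve participants, as here, the same calculation yields the degrees of freedom (1, 11) reported above.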
ERP comparisons
We averaged our waveforms as in Experiment 1 (Figure 3), except that ipsilateral and contralateral hemisphere corresponded to the relative location of the target/distracter item in the remembered array (see Section ‘Materials and Methods’ for details). We used the same N2pc comparison window as we had in Experiment 1 (230–280 ms), and compared the average voltages across this window using a repeated-measures ANOVA. As in Experiment 1, target/distracter interacted with hemisphere [F(1,11) = 9.351, p = 0.011]. Surprisingly, however, the effects were in the opposite direction to those observed in Experiment 1. Targets elicited a contralateral positivity [F(1,11) = 5.302, p = 0.042] and distracters elicited a contralateral negativity [F(1,11) = 11.841, p = 0.006]. There was no statistically significant three-way interaction between target/distracter, hemisphere and electrode [F(1.81, 19.87) = 1.139, p = 0.335].
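The window measure that feeds this ANOVA can be sketched as follows. This is a minimal Python illustration under stated assumptions: the function names `mean_amplitude` and `lateralized_difference` and the list-based waveform format are hypothetical, and only the 230–280 ms window comes from the text.

```python
from statistics import fmean


def mean_amplitude(waveform, times_ms, start_ms=230, end_ms=280):
    """Mean voltage of a waveform within an inclusive time window.

    waveform: list of voltages (microvolts), one per sample.
    times_ms: list of sample times in ms relative to probe onset.
    """
    window = [v for v, t in zip(waveform, times_ms) if start_ms <= t <= end_ms]
    if not window:
        raise ValueError("no samples fall inside the window")
    return fmean(window)


def lateralized_difference(contra, ipsi, times_ms):
    """Contralateral minus ipsilateral mean amplitude in the N2pc window.

    A negative value corresponds to a contralateral negativity,
    a positive value to a contralateral positivity.
    """
    return mean_amplitude(contra, times_ms) - mean_amplitude(ipsi, times_ms)
```

On this convention, the target effect reported for Experiment 2 would appear as a positive difference and the distracter effect as a negative one, the reverse of Experiment 1.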
Figure 3. Grand average waveforms time-locked to the probe onset at 0 ms. The left-hand figure shows target-locked waveforms and the right-hand figure shows distracter-locked waveforms. In both cases the blue waveform indicates the average voltage (μV) from electrodes ipsilateral to the matching feature, and the red waveform indicates the average voltage (μV) from electrodes contralateral to the matching feature. Importantly, in the memory task ipsilateral and contralateral are defined on the basis of the original array, because the probe (to which the waveforms are time-locked) is centrally presented. In both cases the waveforms indicate the mean of the P7/8, P3/4, PO7/8 and TP7/8 recordings. Below each waveform is a topographical plot. Both plots show the distribution of the contralateral–ipsilateral difference, with the left hemisphere showing contralateral minus ipsilateral and the right hemisphere being the mirror of this (ipsilateral minus contralateral). This is shown for the targets and distracters from 230 to 280 ms.
Topographical comparisons
We produced and compared topographical plots in the same way as for Experiment 1. As in Experiment 1, we had contrasting polarities for contralateral versus ipsilateral differences in target and distracter trials. To test whether these effects differed in distribution, as well as in polarity, we rectified voltage differences across pairs of lateral electrodes, normalized the resulting values, and compared these across target versus distracter trials. There was no interaction between target/distracter and electrode [F(3.47,38.15) = 1.776, p = 0.161]. Thus, as in Experiment 1, although the effects elicited by targets and distracters differed in polarity, the distribution of these effects was statistically indistinguishable.
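The rectify-then-normalize step can be illustrated with a short Python sketch. The text does not say which normalization was applied, so the vector-length scaling used here (dividing by the root sum of squares across electrode pairs) is an assumption chosen because it is a common way to remove overall amplitude before comparing scalp distributions. The function name `rectify_and_normalize` and the example values are likewise hypothetical.

```python
import math


def rectify_and_normalize(differences):
    """Rectify and scale contralateral-minus-ipsilateral differences.

    differences: one voltage difference per lateral electrode pair.
    Rectifying (absolute value) removes polarity; dividing by the vector
    length removes overall amplitude, leaving only the distribution.
    The scaling method is an assumption, not confirmed by the source.
    """
    rectified = [abs(d) for d in differences]
    length = math.sqrt(sum(r * r for r in rectified))
    if length == 0:
        raise ValueError("all differences are zero; cannot normalize")
    return [r / length for r in rectified]


# Invented values: opposite polarity and different size, same shape.
target_profile = rectify_and_normalize([3.0, -4.0])
distracter_profile = rectify_and_normalize([-6.0, 8.0])
```

In this toy case the two profiles come out identical, which is the situation the non-significant target/distracter by electrode interaction describes: effects that differ in polarity but not in distribution.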
We also compared topographies from Experiments 1 and 2 by including a between-participants factor of perceptual search versus memory search. The interaction between perceptual/memory search and electrode approached significance [F(5.49, 126.30) = 2.195, p = 0.053], suggesting that the distribution of the contralateral/ipsilateral differences was different when searching visual input versus when searching memory. Whilst in the perceptual version (Experiment 1) the electrodes F3/4, TP7/8, PO7/8, O1/2, PO3/4 and P7/8 showed a significant contralateral–ipsilateral difference (p values <0.042), in the memory version (Experiment 2) this difference was significant at electrodes FT7/8, T7/8, TP7/8, P3/4 and P7/8 (p values <0.046). There was no three-way interaction between target/distracter, electrode and perceptual/memory search [F(6.69, 153.81) = 0.630, p = 0.723], consistent with the earlier analyses showing that target and distracter effects had the same distribution as one another within Experiment 1, and likewise within Experiment 2.
The striking polarity inversions of both target-selection and distracter-suppression in VSTM search (Experiment 2) relative to visual search (Experiment 1) were unexpected and intriguing. The findings during VSTM search clearly support the existence of a spatially organized representation that can be modulated by biasing signals based on task goals (Kuo et al., submitted). However, they also strongly suggest that the feature-level information in VSTM representations cannot be biased in the same way by task-goal signals as in perceptual representations.
Why should targets and distracters produce one pattern of effects in the perceptual search and a seemingly opposite pattern of effects in the memory search? We suspect that the difference stems from basic differences in the nature of the representations that participants are searching. The most plausible explanation may lie in how individual feature values are coded relative to objects in VSTM and perceptual representations.
In perceptual search (Experiment 1), task-goal biasing signals can prioritize the processing of the specific relevant feature value before any search-array items are presented and before features become integrated into object representations. In other words, participants are able to search the array at a feature-specific, rather than object-specific, level.
By contrast, we speculate that when information is transformed into VSTM, these features become bound together in objects (Awh et al., 2007; Vogel et al., 2001). This poses a challenge for identifying the task-relevant feature in our VSTM-search task. The targets and distracters always match the probe item on one feature but not another. They can only be distinguished by whether the individual matching feature is in the correct, task-relevant dimension instructed by the cue. Therefore, when participants search the array in VSTM, they must actively separate the relevant and irrelevant features of the array objects in order to distinguish targets from distracters. For example, if color were the relevant feature, then when participants searched their memory for a green square and found a green triangle, they would have to suppress the irrelevant mismatching feature (i.e., the triangle shape) in order to facilitate the match in the relevant dimension (i.e., the green color). Conversely, if shape were relevant, then when participants searched their memory for the green square and found the green triangle, their attention may have been drawn to the irrelevant, but matching, feature (i.e., the green color). In order to establish correctly that the object is in fact a distracter, participants would then have to enhance the relevant feature (i.e., the triangle). According to this account, the contralateral negativity is an index of spatially specific enhancement across both experiments: in Experiment 1 the perceptual input of the matching relevant feature is selectively enhanced, in order to find the target; in Experiment 2 the relevant feature of a distracter object is enhanced, in order for it to be correctly identified as a distracter. Likewise, the contralateral positivity is an index of suppression across both experiments: in Experiment 1 the perceptual input from the irrelevant feature is suppressed, such that participants ignore the object containing it; in Experiment 2 the irrelevant information from the target is suppressed, such that participants can correctly identify that object as a target.
The time-course of the effects in the perceptual and memory searches also supports this account. When participants searched through perceptual input (Experiment 1), matches to the task-relevant feature elicited an effect that preceded that elicited by matches to the task-irrelevant feature. In contrast, when participants searched memory, matches to the task-relevant feature elicited an effect that was synchronous with that elicited by matches to the task-irrelevant feature. This supports the idea that search of perceptual input operates by biasing feature-level representations in advance, whereas when participants search VSTM they must deploy enhancement and suppression simultaneously on feature-integrated objects.
Our findings and interpretations are also compatible with a previous study investigating the ERP correlates of VSTM search (Kuo et al., submitted). The authors contrasted perceptual and VSTM search when targets could be specified at the object-level. Across two experiments, targets were defined by either shape or color, but there was only one defining feature per object and there were no distracting task-irrelevant features. Importantly, in these object-based searches, the same contralateral negativity was elicited by targets, both during perceptual and VSTM search. We have also replicated this finding in a subsequent experiment that required subjects to search memory for abstract shapes that contained only one feature, without any conflicting task-irrelevant features. The result of this previous study (Kuo et al., submitted) would also rule out the possibility that the inverted polarity seen in Experiment 2 results from time-locking the contralateral and ipsilateral waveforms to a centrally presented probe, rather than to an array (as in Experiment 1). In the previous study, despite the VSTM search N2pc being time-locked to a probe and the perceptual search N2pc being time-locked to an array, both types of N2pc had the same polarity (with contralateral more negative than ipsilateral in both cases). Such a finding is however in line with our suggestion that the contralateral negativity may reflect spatially specific enhancement of the task-relevant features or objects. Moreover, because there is no conflicting feature-level information interfering with target selection in this previous study, there was no reversal of polarity when participants searched their VSTM.
We demonstrated that VSTM is modulated in a spatially specific way as participants search their memory. However, targets and distracters in the remembered arrays (Experiment 2) elicited contrasting effects to those obtained when participants searched actual visual arrays (Experiment 1). One possible explanation for this novel finding is that constituent features of objects become bound together when encoded in memory, so that the task-relevant and task-irrelevant features then need to be decomposed in order to distinguish targets from distracters.
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
D.E.A. is supported by a post-doctoral fellowship from the Economic and Social Research Council, UK. D.E.A. and G.S. are supported by a project grant from the John Fell fund (Oxford University Press). G.S. and A.C.N. are both supported by project grants from the Wellcome Trust. B-C.K. was supported by a Scholarship from the National Science Council of Taiwan.