A Metacognitive Perspective of Visual Working Memory With Rich Complex Objects

Sahar, Tomer; Sidi, Yael; Makovski, Tal

doi:10.3389/fpsyg.2020.00179

ORIGINAL RESEARCH article

Front. Psychol., 25 February 2020

Sec. Cognition

Volume 11 - 2020 | https://doi.org/10.3389/fpsyg.2020.00179

This article is part of the Research Topic Understanding the Operation of Visual Working Memory in Rich Complex Visual Context


Tomer Sahar¹,²*

Yael Sidi¹

Tal Makovski¹

¹Department of Psychology and Education, The Open University of Israel, Ra’anana, Israel
²Department of Psychology, University of Haifa, Haifa, Israel

Visual working memory (VWM) has been extensively studied in the context of memory capacity. However, less research has been devoted to the metacognitive processes involved in VWM. Most metacognitive studies of VWM tested simple, impoverished stimuli, whereas outside of the laboratory setting, we typically interact with meaningful, complex objects. Thus, the present study aimed to explore the extent to which people are able to monitor VWM of real-world objects that are more ecologically valid and further afford less inter-trial interference. Specifically, in three experiments, participants viewed a set of either four or six memory items, consisting of images of unique real-world objects that were not repeated throughout the experiment. Following the memory array, participants were asked to indicate where the probe item appeared (Experiment 1), whether it appeared at all (Experiment 2), or whether it appeared and what its temporal order was (Experiment 3). VWM monitoring was assessed by subjective confidence judgments regarding participants’ objective performance. Similar to common metacognitive findings in other domains, we found that subjective judgments overestimated performance and underestimated errors, even for real-world, complex items held in VWM. These biases seem not to be task-specific as they were found in temporal, spatial, and identity VWM tasks. Yet, the results further showed that meaningful, real-world objects were better remembered than distorted items, and this memory advantage also translated to metacognitive measures.

Introduction

To what degree does one have access to one’s own mental processes? The body of research termed Metacognition aims to answer this question. The field of metacognition refers to “thinking about thinking” (Flavell, 1979) and it deals with the evaluation and monitoring of cognitive processes and the control and regulation of these processes (see Koriat, 2007 for review). Broadly speaking, monitoring of cognitive processes refers to one’s awareness of the operation of a specific cognitive process while it occurs. From an experimental perspective, monitoring is usually assessed by collecting participants’ direct (subjective) confidence judgments regarding the relevant process, and matching that with the actual outcome. Consequently, metacognitive studies often find that monitoring may be based on heuristics: a pragmatic but not necessarily optimal approach to generate subjective judgments, as in certain situations, they may prove unreliable and lead to biased decisions (Tversky and Kahneman, 1974).

To assess the degree to which monitoring coincides with the actual performance, two central measures are used: calibration and resolution (Fleming and Lau, 2014; Fiedler et al., 2019). Calibration, or absolute accuracy, refers to the gap between subjective confidence judgments and task performance scores (e.g., correct responses). Thus, calibration is maximized when the proportion of correct responses equals the subjective confidence judgments given by the observer, and the absolute difference is zero. That is, subjective confidence ratings equal the actual performance. An overconfidence bias occurs when subjective confidence exceeds task scores, that is, when the observer overestimates her performance. Conversely, an underconfidence bias occurs when high performance is underestimated.

Resolution, or relative accuracy, is the extent to which confidence judgments differ between correct and incorrect responses. This is measured as a correlation between confidence and accuracy. Resolution is maximized when high performance is predicted by high confidence judgments and low performance is predicted by low confidence judgments (Ackerman and Goldsmith, 2011; for reviews, see Schwartz and Efklides, 2012; Goldsmith et al., 2014). Note that calibration and resolution are two independent measures. Calibration reflects the extent to which confidence judgments deviate from actual performance, whereas resolution is a correlation that reflects the extent to which judgments track changes in performance.

The current study aimed at examining visual working memory (VWM) from a metacognitive perspective. VWM is considered to be a fundamental, capacity-limited on-line buffer, and individual differences in this ability are related to high cognitive functions, such as intelligence (Luck and Vogel, 2013). Hence, understanding how people access and assess the content held in VWM can shed new light on the mechanisms underlying VWM processes. Furthermore, the relationship between working memory and metacognitive abilities is likely to be bi-directional. For example, Komori (2016) showed that in a dual-task setting, observers with high working memory capacity made more accurate judgments about their performance than observers with low capacity. On the flip side, researchers also rely on the assumption that observers' metacognitive reports are accurate and use these reports to assess VWM processes (Adam et al., 2017). Thus, studying metacognitive processes within VWM can yield valuable insights into both VWM and metacognitive processes.

Metacognitive studies of VWM have mainly examined the accuracy of subjective estimations of VWM limit and the extent that subjective and objective visual knowledge dissociate from one another. The correspondence of objective VWM measures and subjective judgments showed that, overall, subjective judgments reliably reflect (at least to some extent) VWM content and objective visual information (Rademaker et al., 2012; Vandenbroucke et al., 2014; Samaha and Postle, 2017; Suchow et al., 2017). Yet, other studies have stressed the separability of objective visual information and subjective judgments (Bona et al., 2013; Bona and Silvanto, 2014; Vlassova et al., 2014; Maniscalco and Lau, 2015). For instance, Adam and Vogel (2017) showed that while subjective judgments predicted some variation in memory performance, observers were consistently unaware of their own memory failures.

One issue of measuring metacognitive processes in VWM is the repeated use of a limited set of simple stimuli (e.g., colors, orientations) in VWM tasks. This results in a narrow, homogeneous stimuli space and increases the likelihood of proactive interference. The outcome of proactive interference is that items from previous trials are harder to reject, and are mistakenly reported as if they appeared in the current trial (e.g., Keppel and Underwood, 1962; Hartshorne, 2008; Makovski and Jiang, 2008; Makovski, 2016; but see Lin and Luck, 2012). Thus, without accounting for these errors, studies might inaccurately estimate VWM performance, and more importantly for the current purposes, they might impair our ability to adequately assess the metacognitive processes involved in VWM because both subjective and objective performance are likely to be contaminated by information encountered in previous trials.

One way to minimize proactive interference is by using real-world objects instead of simple stimuli. These stimuli make it possible to test numerous distinct items without repetition throughout the experiment (Endress and Potter, 2014; Makovski, 2016; Shoval et al., 2019). Testing real-world objects in VWM tasks further bears an ecological benefit as we typically interact with meaningful, rich, complex objects and not with impoverished stimuli such as color patches. Accordingly, recent findings showed that the visual and semantic heterogeneity of meaningful objects leads to improved VWM performance and extends the typical limit of VWM capacity (Brady et al., 2016; Shoval et al., 2019). However, it is still unknown how accurate people are in monitoring VWM of rich, real-world objects.

The goal of the current study was to explore observers’ ability to monitor VWM processes using distinct complex stimuli and various VWM tasks. Three experiments were conducted in order to reveal the correspondence between objective and subjective memory performance while minimizing proactive interference by using non-repeating images of real-world objects. Specifically, we measured observers’ resolution and calibration while they were performing VWM tasks with unique (i.e., presented only once in the task) and distinct real-world objects. This allowed us to estimate the metacognitive abilities of VWM across three domains (i.e., spatial, identity, temporal) with minimal interference from the information shown in previous trials.

Experiment 1

The aim of the first experiment was to examine spatial VWM performance from a metacognitive perspective. Thus, on each trial, participants memorized a set of six images of real-world objects, presented sequentially at distinct locations (Makovski, 2016). After a short retention period, one of the presented images appeared and participants were asked to indicate the item’s location. Next, they were asked to evaluate their confidence by indicating the degree of certainty that they chose the correct item’s location on a 0–100 scale. This allowed us to assess both subjective and objective performance and thereby estimate resolution and calibration.

Method

Participants

Participants were students (age: 18–35) from the Open University of Israel who took part in the experiment for course credit. All had normal or corrected-to-normal visual acuity and were without learning disabilities or attention disorders. Power calculation showed that a minimum sample size of 20 participants provided a power of 0.8 for detecting a Cohen’s d effect size of 0.66 using a two-tailed paired samples t-test. Twenty-two participants completed Experiment 1 (19 females, mean age = 27).
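The power figure reported above can be roughly checked by simulation. The following is a minimal sketch, not the authors' procedure: it assumes a two-tailed alpha of 0.05 (the text does not state alpha), normally distributed paired differences with unit standard deviation, and a hard-coded critical t value for df = 19 taken from a standard table. The function name `simulated_paired_power` is hypothetical.

```python
import math
import random
import statistics


def simulated_paired_power(d, n, t_crit, n_sims=5000, seed=0):
    """Monte Carlo estimate of the power of a two-tailed paired t-test.

    Paired differences are drawn from a normal distribution with mean d and
    SD 1, so d plays the role of Cohen's d for the differences.
    """
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_sims):
        diffs = [rng.gauss(d, 1.0) for _ in range(n)]
        # One-sample t statistic on the paired differences.
        t = statistics.mean(diffs) / (statistics.stdev(diffs) / math.sqrt(n))
        if abs(t) > t_crit:
            rejections += 1
    return rejections / n_sims


# Assumption: alpha = .05, two-tailed; critical t for df = 19 is about 2.093.
T_CRIT_DF19 = 2.093
power = simulated_paired_power(d=0.66, n=20, t_crit=T_CRIT_DF19)
print(f"Estimated power: {power:.2f}")
```

Under these assumptions the estimate should land close to the reported 0.8; an analytic calculation with the noncentral t distribution would serve the same purpose.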

Materials and Stimuli

The task was created and implemented with MATLAB software (MathWorks Inc., Natick, MA, United States, 2010) and Psychtoolbox (Brainard, 1997) on a 23.5” Eizo Foris monitor (1920 × 1080, 120 Hz) and a standard PC. Participants were tested individually in a dim room. They sat approximately 50 cm from the screen. A black fixation cross (0.96°) was presented at the center of a white background screen. Two columns of three black-frame empty squares (5.6° × 5.6°) served as place-holders (located 14° to the left and right of fixation, and 14° above, at fixation level, and 14° below the fixation, Figure 1). The image set included 1200 images of real-world objects (4.8° × 4.8°) drawn from a previously published set (Brady et al., 2008¹). Confidence judgments were collected by scrolling with the mouse over a rectangle bar (40° × 1.9°). The initial position of the cursor was at the middle of the bar (i.e., at 50%). The bar was interactively filled with the color blue from its left edge to the position of the cursor. The percentage of the filled area, from 0 to 100, served as a numeric indicator for confidence and it was presented above the rectangle. Participants finalized their judgment response by pressing the space key. Note that responding without moving the cursor was impossible, and a response of 50% was not allowed.
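For readers who wish to relate the reported visual angles to physical sizes, the standard conversion is size = 2 · distance · tan(angle / 2). The sketch below applies it at the approximate 50 cm viewing distance given above; the function name `visual_angle_to_cm` is illustrative, and the formula assumes a stimulus centered on the line of sight. Conversion to pixels is omitted because the physical dimensions of the display area are not reported.

```python
import math


def visual_angle_to_cm(angle_deg, distance_cm):
    """Physical size (cm) of a centered stimulus subtending angle_deg at distance_cm."""
    return 2.0 * distance_cm * math.tan(math.radians(angle_deg) / 2.0)


VIEWING_DISTANCE_CM = 50.0  # approximate distance reported in the text

# Stimulus sizes (in degrees) taken from the description above.
for label, deg in [("fixation cross", 0.96), ("object image", 4.8), ("place-holder", 5.6)]:
    size_cm = visual_angle_to_cm(deg, VIEWING_DISTANCE_CM)
    print(f"{label}: {deg} deg is about {size_cm:.2f} cm")
```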

FIGURE 1

Figure 1. (A) Illustration of the trial sequence in Experiment 1: Each trial began with blank place-holders. The presentation sequence consisted of six items, each at a distinct location. Between the final stimulus and the presentation of the probe, a blank place-holders display was shown for 600 ms. Following this short retention, participants were asked to indicate where the probed item appeared and to rate their confidence regarding their response. (B) Experiment 1’s results: Mean confidence (gray line) and the mean percentage of correct location (red line) plotted as a function of the probed-item’s serial position during the presentation sequence. Error bars represent standard error of the mean.

Procedure

The trial began with a 950 ms fixation and place-holders display that remained visible throughout the trial. Each trial consisted of six unique images, randomly drawn in each trial for each subject. Each image appeared in isolation within a distinct place-holder for 500 ms. The items appeared sequentially in random order and after the last image was shown, a fixation cross was displayed for 600 ms. Then, the probe item, which was always one of the six items presented in that trial, appeared above fixation together with the six empty place-holders and the mouse cursor at fixation. The probe item was evenly and randomly chosen from the six possible locations and six serial positions. Participants were instructed to indicate the place-holder in which the probe item appeared by clicking on its position using the mouse. There was no time limit for this task and only accuracy was emphasized. After a response was registered, participants were instructed to indicate their subjective confidence that they made a correct response by scrolling with the mouse over a 0 (“not-sure”) to 100 (“very sure”) scale. A numeric value of confidence was accordingly shown, and the participants were instructed to choose any value that reflected their subjective confidence except for 50%. The next trial began after 500 ms of a blank display (Figure 1A).

Participants performed 180 experimental trials (five trials for each of the 36 combinations of six locations and six serial positions), preceded by eight practice trials. Participants could take a short break every 36 trials.
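The balanced design described above (six probe locations crossed with six serial positions, five trials per combination) can be sketched as follows. This is an illustration in Python rather than the MATLAB/Psychtoolbox code used in the study; the function name `build_probe_schedule` and the zero-based indices are assumptions.

```python
import itertools
import random

N_LOCATIONS = 6
N_SERIAL_POSITIONS = 6
REPETITIONS = 5


def build_probe_schedule(seed=None):
    """Return a shuffled list of (probe_location, probe_serial_position) pairs.

    Every location x serial-position combination occurs exactly REPETITIONS times.
    """
    rng = random.Random(seed)
    cells = list(itertools.product(range(N_LOCATIONS), range(N_SERIAL_POSITIONS)))
    schedule = cells * REPETITIONS
    rng.shuffle(schedule)
    return schedule


schedule = build_probe_schedule(seed=1)
print(len(schedule))  # 6 * 6 * 5 = 180 trials
```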

Results

Accuracy

Figure 1B depicts performance as a function of serial position. The overall accuracy reflected moderately poor performance, but was above chance level [16.6%, M = 44.3%, SD = 22.9, t(21) = 18.2, p < 0.001]. A repeated-measures analysis of variance (ANOVA, Greenhouse–Geisser corrected) of accuracy as a function of the probed-item serial position was significant, F(2.84,59.7) = 82.063, p < 0.001, η_p² = 0.796. Bonferroni corrected comparisons showed that accuracy was best for the last item (the 6th item, all p’s < 0.001). Accuracy for the second to last item (the 5th item) was also better than for all previous positions (all p’s < 0.001). There was no other significant difference between positions 1–4 (all p’s > 0.1) except that the fourth item was better than the second item (p = 0.017). These results reflect a typical recency effect as the locations of the last two items were better remembered than the locations of the first four items (Broadbent and Broadbent, 1981).

Confidence

Similar to accuracy, a repeated-measures ANOVA of confidence ratings (Figure 1B) as a function of the probed-item serial position revealed a significant effect, F(2.79,58.75) = 77.872, p < 0.001, η_p² = 0.788. Bonferroni corrected post hoc comparisons showed that confidence was highest for the last presented item (all p’s < 0.001). Confidence for the second to last presented item was also higher than for all previously presented items (all p’s < 0.046). No other significant difference was found (all p’s > 0.08).

Calibration

Calibration was calculated as the difference between confidence and accuracy in each serial position of each subject. Repeated-measures ANOVA of calibration as a function of serial position revealed a significant main effect, F(5,105) = 9.063, p < 0.001, η_p² = 0.301. To further examine the source of the overconfidence bias, we conducted post hoc Bonferroni corrected comparisons, which showed that the last item significantly differed from the third, second, and first items (all p’s < 0.033). The fifth item differed from all previous items (all p’s < 0.028). No other comparisons were significant. Bayesian one-sample t-tests further showed a reliable and positive difference from zero for the first four items (BF = 36, 13, 21, 5.4, respectively), but did not show a reliable difference from zero for the last two presented items (BF = 0.26, 0.22, respectively). This suggests that the overconfidence bias was driven by the first four items, whereas observers were well-calibrated for the last two items (see Figure 1B).
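The calibration score described above (confidence minus accuracy, per serial position and participant) can be illustrated with a short sketch. The data layout, the function name `calibration_by_position`, and the toy trials are assumptions made for illustration; they are not the study's data or analysis code. Positive values indicate overconfidence and negative values underconfidence.

```python
from collections import defaultdict


def calibration_by_position(trials):
    """Return {serial_position: mean confidence - percent correct} for one participant.

    trials: iterable of (serial_position, correct, confidence) tuples, where
    correct is a bool and confidence is on the 0-100 scale.
    """
    confidence = defaultdict(list)
    accuracy = defaultdict(list)
    for position, correct, conf in trials:
        confidence[position].append(conf)
        # Score accuracy on the same 0-100 scale as confidence.
        accuracy[position].append(100.0 if correct else 0.0)
    return {
        pos: sum(confidence[pos]) / len(confidence[pos])
        - sum(accuracy[pos]) / len(accuracy[pos])
        for pos in sorted(confidence)
    }


# Made-up trials for illustration only (not data from the study).
toy_trials = [
    (1, False, 60), (1, True, 70),
    (6, True, 90), (6, True, 95),
]
result = calibration_by_position(toy_trials)
print(result)
```

In this toy example the first serial position shows overconfidence (mean confidence of 65 against 50% correct), whereas the sixth shows slight underconfidence.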

Resolution

For each participant, resolution was calculated as the Gamma correlation coefficient (i.e., Goodman–Kruskal correlation) between accuracy and confidence (Nelson, 1984) collapsed across all serial positions. The averaged resolution across participants was moderate (M = 0.521, SD = 0.1), suggesting that observers’ discrimination between the better- and less-remembered location of the probed item was only accurate to some extent.
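The Goodman–Kruskal gamma is defined over all pairs of trials as (C − D) / (C + D), where C is the number of concordant pairs (the trial with higher accuracy also has higher confidence) and D the number of discordant pairs; tied pairs are ignored. A minimal sketch follows, with a hypothetical function name and made-up trial data that do not come from the study.

```python
from itertools import combinations


def goodman_kruskal_gamma(accuracy, confidence):
    """Goodman-Kruskal gamma between two equally long sequences.

    Returns None when every pair is tied on at least one variable,
    because gamma is undefined in that case.
    """
    concordant = discordant = 0
    for (a1, c1), (a2, c2) in combinations(zip(accuracy, confidence), 2):
        product = (a1 - a2) * (c1 - c2)
        if product > 0:
            concordant += 1
        elif product < 0:
            discordant += 1
        # product == 0 means a tie on accuracy or confidence: pair is skipped.
    if concordant + discordant == 0:
        return None
    return (concordant - discordant) / (concordant + discordant)


# Made-up trials for illustration: 1 = correct location, 0 = incorrect.
toy_accuracy = [1, 1, 0, 0, 1, 0]
toy_confidence = [90, 70, 40, 80, 60, 30]
gamma = goodman_kruskal_gamma(toy_accuracy, toy_confidence)
print(f"gamma = {gamma:.3f}")
```

Because accuracy is binary, only pairs consisting of one correct and one incorrect trial contribute; here 7 such pairs are concordant and 2 are discordant, giving gamma = 5/9.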

Discussion

The results of Experiment 1 showed that participants’ sensitivity to their performance in the VWM task was moderate, as reflected by their resolution. However, this estimate (0.521) seems to be numerically larger than correlations previously reported in other metacognitive studies of VWM, which varied between 0.19 and 0.47 (0.22–0.39, Thomas et al., 2012; 0.19–0.43, Yue et al., 2013; 0.47, Adam and Vogel, 2017; but see Masson and Rotello, 2009). We also found that calibration was highly influenced by the item’s serial position as observers were overconfident in the first four items but well-calibrated in the last two items.

In the current experiment, we asked observers about the item’s location and not about the memory of the item itself. That is, the objective and subjective measures were only based on the spatial memory of the item (where the item was presented). However, a crucial aspect of memory is the explicit access to the item’s identity, which is also often used as a measure of memory performance (e.g., “was this chair presented?”). Therefore, in Experiment 2, we turn to directly examine whether participants explicitly remember the probed item and particularly their confidence that the probed item appeared.

Experiment 2

While we usually interact with both the item’s identity and its location, they are not necessarily recalled together nor do they decay together in an obligatory manner (Köhler et al., 2001; Pertzov et al., 2012). Thus, testing spatial memory alone, as in the previous experiment, does not provide a full view of the metacognitive abilities of VWM. Specifically, it remains unclear whether people can accurately assess their VWM when it is based on the item’s identity.

Several changes were therefore made in Experiment 2. First, the presentation set-size was reduced to four items to ensure that the capacity limit was not exceeded. As in Experiment 1, each item appeared at a distinct location, and items were not repeated throughout the experiment. After the presentation sequence, observers were asked to indicate whether they explicitly remembered that the probed item appeared and to rate their confidence regarding the item’s appearance. Importantly, the probed item was always an item from the presentation sequence. Afterward, they were asked to indicate its location. When participants reported that the probed item did not appear, an “appearance error” was registered but the trial continued in the same way. That is, participants were asked to guess a possible location and were not told anything about whether the item actually appeared or not (note that the item always appeared). This allowed us to examine the location accuracy in those trials where participants reported that they did not remember that the item appeared (i.e., its identity). Note that in this experiment, we focused on participants’ reports and confidence ratings that the item appeared and thus we did not measure the confidence in knowing where the item appeared (as in Experiment 1).