To Punish or to Restore: How Children Evaluate Victims' Responses to Immorality

Liu, Xin; Yang, Xin; Wu, Zhen

doi:10.3389/fpsyg.2021.696160

ORIGINAL RESEARCH article

Front. Psychol., 13 August 2021

Sec. Human Developmental Psychology

Volume 12 - 2021 | https://doi.org/10.3389/fpsyg.2021.696160


Xin Liu¹

Xin Yang²

Zhen Wu¹*

¹Department of Psychology, Tsinghua University, Beijing, China
²Department of Psychology, Yale University, New Haven, CT, United States

Punishment is important for deterring transgressions and maintaining cooperation, while restoration is also an effective way to resolve conflicts and undo harm. Which way do children prefer when evaluating others' reactions to immorality? Across four experiments, Chinese preschoolers (aged 4–6, n = 184) evaluated victims' different reactions to possession violations (i.e., punishing the perpetrator or restoring the belongings). Children evaluated restorative reactions more positively than punitive ones. This tendency to favor restoration over punishment was influenced by the degree of punishment, with more pronounced patterns observed when punishment was harsher (Experiments 1–3). Indeed, when different degrees of punishment were directly contrasted (Experiment 4), children viewed victims who imposed milder punishment (“steal one object, remove one or two objects”) more positively than those who imposed harsh punishment (“steal one object, remove three objects”). These patterns were especially manifested in preschoolers who chose restoration when being put in the victim's situation, suggesting a consistency between evaluations and behaviors. Taken together, the current study showed that children prioritize protecting the victim over harshly punishing the perpetrator, which suggests an early take on the preferred way to uphold justice.

Introduction

Across different cultures, justice is one of the most crucial positive virtues (Peterson and Seligman, 2004). Converging theories and empirical evidence suggest that justice has evolved in the ecological context of pressure to maintain cooperation (Tyler, 2009), positive social interactions (Cohen, 1991), and social norms (Boyles et al., 2008). Theories from other disciplines (e.g., mathematics, Capraro and Perc, 2021; physics, Perc, 2016) have also emphasized the importance of justice and cooperation. Justice ensures that people receive the benefits and punishment they deserve. For example, when facing possession violations, one may return objects to their rightful owners and (or) punish the perpetrator to a fair degree. Both solutions sustain justice. Traditionally, studies have focused on how people enforce justice through punishment (e.g., Henrich et al., 2006). However, recent studies reveal that people prefer compensating victims and restoring possessions over punishment when these options are available (e.g., FeldmanHall et al., 2014; Riedl et al., 2015; Yang et al., 2021). The debate over the priority of punishment vs. restoration touches on the principles we use when dealing with injustice. The current study approaches this debate from a developmental perspective: how do children evaluate victims' different responses to possession violations, such as punishing the perpetrator or returning the possessions to the victim? Studying young children's preferences may hint at human nature in upholding justice.

Punishment has traditionally been defined as a penalty or retribution directed toward those who cause harm or violate social norms (Clutton-Brock and Parker, 1995). In fairness violations (Fowler, 2005; Herrmann et al., 2008) and in situations that call for rehabilitating justice (e.g., illegality and crime, Heffner and FeldmanHall, 2019), people show a desire for punishment. Punishment is a common method in judicial systems, is widespread across human societies, and plays an important role in ensuring social harmony (Hofmann et al., 2018). It serves as a powerful tool to support the cooperative system by deterring selfishness, decreasing incentives to take advantage of the system, and rewarding behaviors that comply with norms in the long run (Fehr and Gächter, 2002; Krasnow et al., 2016). It may also be used for reputational reasons, as people are more likely to punish norm violations when observed (Kurzban et al., 2007); moreover, those who have enacted punishment are judged as more trustworthy (Jordan et al., 2016). The preference for punishment emerges early in development. Infants as young as 6 months prefer individuals who act negatively toward antisocial others (Hamlin et al., 2011; Kanakogi et al., 2017). Three-year-olds punish selfish peers both when they are directly affected (Wu and Gao, 2018) and when they are third-party observers (Vaish et al., 2011). At 6 years of age, children incur a cost to punish selfish peers even as unaffected third-party observers (McAuliffe et al., 2015; Salali et al., 2015). In addition, 5-year-olds anonymously allocate unpleasant items to antisocial adults (Kenward and Osth, 2015) and choose to play with a character who shows retaliatory aggression toward the perpetrator, again implying a preference for punishment (Etchu, 2005). These findings all speak to the possibility that children may positively evaluate punitive behaviors and victims who punish perpetrators.

Despite the well-documented evidence on the preference for punishment (e.g., Henrich et al., 2006), it is important to note that past studies usually contrast punishment with “inaction,” that is, doing nothing or accepting the injustice (e.g., McAuliffe et al., 2015; Wu and Gao, 2018). Therefore, it is unclear whether children prefer punishment or merely dislike “doing nothing” in the face of injustice. Indeed, recent work has revealed that when alternative actions are available, punishment is not always the preferred way to resolve conflict. For example, when facing property loss or unfair distributions, adults prefer to compensate the victims rather than punish the perpetrators (e.g., Lotz et al., 2011; FeldmanHall et al., 2014; Heffner and FeldmanHall, 2019). These findings suggest that restorative actions may be a preferred avenue to restore justice, at least in adults. In different fields of social science such as law and criminology, scholars have argued that restoration, as compared to punishment, calls attention to victims' welfare (Wenzel et al., 2008). Restoration is also beneficial for repairing the relationship destroyed by the perpetrator, thereby maintaining cooperation (McCullough, 2008).

Recent developmental work with both punishment and restoration options also shows a similar preference for restoration in children. In the face of the unpermitted loss of their own or others' possessions, children from age three choose to intervene by returning the possessions to the original owner (restoration) rather than by removing the possessions to a place inaccessible to the perpetrator (punishment) (Riedl et al., 2015; Yang et al., 2021). In a separate line of work, 5–9-year-olds prefer third-party helping to third-party punishment (Lee and Warneken, 2020). These findings suggest that punishment is not always favored by children, and other options such as restoration are sometimes more valued.

However, open questions remain concerning children's preference between restoration and punishment. To begin with, the developmental work reviewed above examined either children's own behaviors (Riedl et al., 2015; Yang et al., 2021) or attitudes toward unaffected observers who intervened in the face of immorality (Lee and Warneken, 2020). Little is known, however, about how children evaluate victims' responses to immorality. Learning how children evaluate victims' responses is important, as children might be the victims of immorality themselves, and whether they support or oppose different responses from victims reflects their moral values and behavioral tendencies (see also Oostenbroek and Vaish, 2019, who proposed that children evaluated forgiving victims positively because they approved of repairing cooperation). Therefore, the present study aimed to examine how children, as unaffected bystanders, evaluate victims' responses to immorality.

Second, the punishment option in these previous studies is either very harsh (made the perpetrator lose all or most of the resources; e.g., Yang et al., 2021) or very mild and ineffective for the perpetrator (the perpetrator did not lose anything they initially owned; e.g., Riedl et al., 2015). Therefore, it remains unclear whether children genuinely prefer restoration over punishment, or they only prefer restoration when punishment is too harsh or ineffective. In fact, adults expect the degree of punishment to match the immorality of transgression, suggesting that the degree of punishment matters in adults' reasoning about justice (Wenzel and Okimoto, 2016). Specifically, if punishment is too mild to fit the transgression, it will be inefficient and unsatisfying (Adams, 2016); but if it is too severe, it will lead to other negative consequences including violence (Murphy, 2003) and further damage (McCullough et al., 2013). Prior work on how the degree of punishment corresponds to different transgressions mostly comes from a judicial perspective and only tests adult participants (Murphy, 2003; McCullough et al., 2013; Adams, 2016). Children also frequently encounter social conflicts such as unpermitted taking of toys and unequal resource distributions (Hartup et al., 1988; Laursen and Adams, 2018). However, little is known about how children weigh different degrees of punishment against restoration. Therefore, the present study aimed to systematically investigate children's relative preference between restoration and different degrees of punishment in contexts that they routinely encounter in their lives.

Studying this question with children will enrich our understanding of the origin of human justice. It will also provide insights into educational practices for the development of moral reasoning and social skills. For example, teachers and parents frequently encounter the problem of guiding children to deal with daily conflicts with peers. If children's toys are taken away by others, should teachers or parents ask children to first restore the toys or to punish the perpetrator? As children begin to develop their own views of justice and morality, understanding how they interpret and evaluate these different actions may shed light on potential solutions to this problem.

The other goal of the present study is to examine the connection between children's evaluations and their own behaviors. Previous work comparing punishment and restoration measured only children's behaviors (to punish or to restore; Riedl et al., 2015; Yang et al., 2021) or their evaluations of or attitudes toward others' behaviors (Oostenbroek and Vaish, 2019; Lee and Warneken, 2020). We do not yet know whether children's evaluations connect or disconnect with their own behaviors in the context of punishment and restoration. Theories and studies show that the knowledge–behavior gap commonly exists among adults (Rimal, 2000) and children (Blake et al., 2014; Blake, 2018) in various domains. For instance, preschool children understand fairness principles and prefer fair allocations, but they do not allocate resources fairly (Rogers and Tisak, 1996; Smith et al., 2013). Given the documented gap between evaluations and behaviors (see also Kollmuss and Agyeman, 2002; Juvan and Dolnicar, 2014), we aimed to examine whether children's evaluations of punishment and restoration parallel their actual behaviors when they are victims of transgressions.

The Present Study

The present study investigated how children, as unaffected observers, evaluated the victims who chose punishment or restoration in the face of immorality. We tested children in possession violation cases, as children at this age are familiar with these scenarios (Hartup et al., 1988). Importantly, we included Chinese children, who have been relatively less studied and may differ from the WEIRD (Western, Educated, Industrialized, Rich, and Democratic) group. Compared to WEIRD cultures, Chinese culture places more emphasis on duty-based communal obligations and spiritual purity (Buchtel et al., 2015). Therefore, unlike WEIRD adults, Chinese adults think the “informal immorality controls” (which are based on moral rules rather than laws) are also important (Jiang et al., 2010). Although studies on adults find cultural differences, little is known about whether cultural differences in justice judgment emerge in children. Studies with Chinese children could help us further understand the commonalities in children's justice development (Henrich et al., 2010). We tested children aged 4–6 because children in this age range can distinguish restoration and punishment (Riedl et al., 2015; Lee and Warneken, 2020; for studies that tested children from the same cultural background, see Yang et al., 2021).

Specifically, using a within-subject design, we presented children with scenarios in which one victim chose to restore the possessions (the restorer) and the other chose to punish the perpetrator (the punisher) in response to possession violations. Importantly, we manipulated the degree of punishment and contrasted each of them with restoration (Experiments 1–3) and among themselves (Experiment 4). We incorporated a battery of evaluation measures, including children's ratings of these different actions and their attitudes toward the victims. Testing how children evaluated others' behaviors avoids triggering potential negative emotions and self-interested tendencies (because children were not victims), thus offering a relatively neutral assessment. We hypothesized that the degrees of punishment played a role in children's relative evaluations of restoration and punishment. Children may evaluate restoration relatively more positively when punishment was too mild or too harsh (for reasons discussed above). We also further examined how different degrees of punishment compared among themselves. Additionally, in order to understand whether children's relative evaluations between punishment and restoration aligned with their own behaviors, we also asked children what they would do in a similar situation. We hypothesized that children who chose restoration themselves would especially evaluate the restorer more positively than the punisher.

Experiment 1: Restoration vs. Harsh Retribution

Method

Participants

The participants were 48 Chinese children (24 girls, M = 62.89 months, SD = 3.96, range = 56.62–72.10 months) from a private preschool in Beijing. Four additional participants were tested but excluded from data analyses for not completing the experiment. We determined this sample size based on prior work in this field (Oostenbroek and Vaish, 2019) and given our resources. Post-hoc power analysis using the current sample size and main results with G*Power 3.1 (two tails, α = 0.05, difference between two dependent means, effect size calculated from the behavior rating measure) indicated that we achieved 84% power. In all experiments reported in this paper, parents' consent forms were obtained via the preschool and children received a picture book for participation. This research project has Institutional Review Board approval from Tsinghua University, protocol #201602.

Procedure

A female experimenter tested the children individually in a quiet room. The experiment consisted of 5 phases: introduction, video watching, comprehension check, main evaluation task, and a behavioral task. We tested children's evaluation first because it is the main focus of the current study. If the behavioral task were completed before the evaluation, children might change their evaluations to justify their own behavior (Bandura et al., 1996; Tsang, 2002). It took ~15 min to complete the experiment for each child. In what follows, we summarize the procedure of this experiment; exact scripts are included in Supplementary Materials (same for the following experiments).

Introduction. Children were introduced to the rules of a novel game (adapted from Yang et al., 2021; see Figure 1). In this game, two players (played by real-life puppets) faced each other across the purple game board. Each player began with two wooden blocks (the players' toys) and put their blocks on the horizontal lane closer to them. There were two cars on the game board (we added this feature to make the game more appealing to children). Children were told that each car only moved in the given direction (as demonstrated in Figure 1). Pushing different cars resulted in wooden blocks being moved to either the storehouse (blocks in the storehouse did not belong to either player, so we called this the “punishment” option), or the original location (returning the relocated blocks to where they originally belonged, so we called this the “restoration” option). We manipulated the puppets, the game board, the cars, and the blocks in real life. To ensure that children understood the respective consequences of pushing each car, we asked them to push the cars by themselves. Children could not proceed to the next step until they answered the comprehension questions correctly (How many blocks are left on the upper or the lower lane after moving this car?). If they failed to correctly answer the questions, the experimenter would reintroduce the rules (n = 10). All children understood the rules after the experimenter reintroduced the rules once or twice.


Figure 1. The arrangement of the game board in three experiments (see online version for the colored version of this figure). (A–C) represent the experimental set-up in Experiments 1, 2, and 3, respectively. The victim (top) owns the blocks (represented by white squares in the figure) on the upper horizontal lane, while the perpetrator (bottom) owns the blocks on the lower horizontal lane. There are two cars (shaded gray rectangles with four black “wheels”); these cars can move on the dark purple lanes that they are located at. Car A is on the lower horizontal lane and can move toward the “storehouse” on the right side, whereas Car B is on the vertical lane and moves upwards toward the upper horizontal lane. The white arrows illustrate the relocation of the victim's block caused by the perpetrator. The difference among the three experiments is the number of blocks that Car A can push to the storehouse. In Experiment 1 (A), Car A can push 3 blocks to the storehouse, thus the perpetrator has no blocks left. In Experiment 2 (B), Car A can push 2 blocks to the storehouse, thus the perpetrator has 1 block left. In Experiment 3 (C), Car A can push 1 block to the storehouse, thus the perpetrator has 2 blocks left.
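The block counts described in the Figure 1 caption can be traced with a short sketch. This is an illustration only, not part of the study materials: the names Board, after_theft, punish, and restore are hypothetical, and the sketch assumes that each player starts with two blocks, that the perpetrator takes exactly one, and that Car A removes 3, 2, or 1 blocks in Experiments 1, 2, and 3, respectively.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Board:
    """Block counts on the game board (hypothetical helper, not from the paper)."""
    victim: int          # blocks on the victim's (upper) lane
    perpetrator: int     # blocks on the perpetrator's (lower) lane
    storehouse: int = 0  # blocks that belong to neither player


def after_theft(start: int = 2) -> Board:
    """Each player starts with `start` blocks; the perpetrator takes one."""
    return Board(victim=start - 1, perpetrator=start + 1)


def punish(board: Board, pushed: int) -> Board:
    """Car A pushes up to `pushed` of the perpetrator's blocks to the storehouse."""
    removed = min(pushed, board.perpetrator)
    return Board(board.victim, board.perpetrator - removed, board.storehouse + removed)


def restore(board: Board) -> Board:
    """Car B returns the taken block to the victim's lane."""
    return Board(board.victim + 1, board.perpetrator - 1, board.storehouse)


if __name__ == "__main__":
    stolen = after_theft()
    # Assumed mapping of experiment number to blocks removed by Car A.
    for experiment, pushed in {1: 3, 2: 2, 3: 1}.items():
        print(f"Experiment {experiment}: punishment -> {punish(stolen, pushed)}")
    print(f"Restoration -> {restore(stolen)}")
```

Under these assumptions, punishment leaves the victim with one block in every experiment, whereas restoration returns both players to two blocks each.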

Video watching. In order to standardize the procedure, each child watched both video clips that showed different events. A video sample has been uploaded to the Open Science Framework at this weblink https://osf.io/u59kd. Children were first introduced to the two real-life puppets featured in each video (learning their names and greeting them in person) before watching videos on an iPad. Each video began with two puppets facing each other as described above. Then one puppet (the victim) left for the restroom, and the other puppet (the perpetrator) took a block from the victim's lane and put it on his or her own lane. Later the victim returned, realized that his or her block was stolen by the perpetrator, and faced a decision between punishment and restoration, as specified above. In the punishment video, the victim punished the perpetrator (by moving one car); in the restoration video, the victim restored his or her block (by moving the other car). The experimenter referred to the puppets by their names (e.g., “Hua”) rather than “the victim,” “the perpetrator,” or “the puppet” (these words were never used with children during testing). Whether children watched the restoration video or the punishment video first was counterbalanced across children, as was the role of each puppet (acting as a perpetrator, a punisher, or a restorer).

Comprehension Check. After watching each video clip, the experimenter showed the two real-life puppets and the game board that appeared in the previous video clip, to help children recall the plot. Children were asked the following comprehension questions: “Who went to the restroom?”, “Who took a block without the other's permission?”, “Which car did Hua (the victim) move?”, and “How many blocks did Hua and Feng (the perpetrator) have at the end?” Those who failed to give correct answers on the first try (n = 16 after the restoration video and n = 7 after the punishment video) passed the task after re-watching the video clips. More than half of the children correctly answered the comprehension questions on the first try. The number of replays in the restoration condition and the punishment condition did not differ significantly (χ² (2) = 4.69, p = 0.10). In addition, whether children correctly answered the comprehension questions on the first try did not significantly influence children's performance in the main tasks (behavior ratings and liking scores) (ps > 0.05; for the detailed statistics, see Supplementary Materials).

Main Evaluation Task. There were three measures (in the following order): (1) Behavior ratings of the four puppets. With visual aids of happy vs. unhappy faces, we asked children to evaluate each puppet's behavior (two perpetrators, and two victims: one punisher and one restorer; “Is his or her behavior good or bad?”), followed by a question about the extent to which children considered the behavior good or bad (“Is it a little good, or very good?” or “Is it a little bad, or very bad?”) and a justification question (“Why do you think so?”; for results on children's justifications, see Supplementary Materials). By measuring children's behavior ratings, we can learn whether children approve of punishment and restoration. (2) Liking scores of the two victims (the punisher and the restorer; “Do you like him or her?”), followed by a question about the extent to which children liked or disliked the protagonist (“Do you like him or her a little or a lot?” or “Do you dislike him or her a little or a lot?”). By asking how much children like the punisher or the restorer, we can learn whether the protagonist's behavior affects children's evaluation of the protagonist. (3) Sticker allocation task between the two victims. The participant was given a sticker and was asked which victim to give it to, the punisher or the restorer. This question examined whether children preferred the punisher or the restorer.

Behavioral task. Finally, to probe children's own behavioral responses, we asked children which option they preferred if they were victims of similar possession violations. Children were instructed to imagine playing with a classmate who took a block away when they went to the restroom (in fact, the experimenter moved the block using the above apparatus for demonstration). Children then pushed a car either to punish the classmate or to restore his or her possessions (i.e., the block).

Coding and Scoring

(1) Behavior ratings: there were four raw rating scores (two scores for the two perpetrators, one score for the punisher, one score for the restorer). The scores ranged from 1 to 4, with higher scores indicating more positive ratings of the behavior. (2) Liking scores: there were two raw liking scores (one for each victim). The scores ranged from 1 to 4, with higher scores indicating more positive attitudes toward the victim. For main analyses on these two measures, we further computed difference scores between the two victims (restorer minus punisher; range −3 to 3); higher values indicated relatively stronger positivity toward restoration or the restorer. (3) Sticker allocation task (forced choice between the two victims): we coded whether children gave the sticker to the punisher or the restorer, and counted the number of children who gave the sticker to each victim. (4) Behavioral task: we coded whether children chose punishment or restoration, and counted the number of children who performed each action. A second coder coded a subset of children (30% of the data, n = 16) on these four measures and the inter-rater reliability was perfect (κ = 1.00).
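The scoring scheme can be sketched as follows. The mapping of the two-step answers onto the 1 to 4 scale (very bad = 1, a little bad = 2, a little good = 3, very good = 4) is an assumption that is consistent with the description above but is not spelled out in the text; the names RATING, score, and difference_score are hypothetical.

```python
# Assumed mapping of the two-step answer (valence, then degree) onto the 1-4 scale.
RATING = {
    ("bad", "very"): 1,
    ("bad", "a little"): 2,
    ("good", "a little"): 3,
    ("good", "very"): 4,
}


def score(valence: str, degree: str) -> int:
    """Convert a child's two-step answer into a raw score from 1 to 4."""
    return RATING[(valence, degree)]


def difference_score(restorer: int, punisher: int) -> int:
    """Restorer minus punisher; positive values favor restoration (range -3 to 3)."""
    return restorer - punisher


if __name__ == "__main__":
    # Example: restoration rated "very good", punishment rated "a little bad".
    restorer = score("good", "very")
    punisher = score("bad", "a little")
    print(difference_score(restorer, punisher))
```

The same difference score applies to the liking measure, with "like" and "dislike" taking the place of "good" and "bad".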

Results

Data analyses were conducted with R version 4.0.2 (R Core Team, 2020). Preliminary analyses with linear models (gender as an independent factor and age as a covariate) showed no significant effects of gender or age on children's responses; thus, we collapsed the data across these factors. The order of video watching (i.e., punishment first or restoration first) and the number of times video clips were replayed did not significantly influence the results either (ps > 0.05; for the detailed statistics, see Supplementary Materials). Main analyses for behavior ratings and liking scores were conducted via linear regressions (using the package “lmerTest;” Kuznetsova et al., 2017), while for count data we used binomial or Chi-square tests (for sticker allocation and children's own behavioral task).

Main Evaluation Measures

On the behavior ratings, we first ensured that children distinguished the victims from the perpetrators: children rated the victims' behaviors more positively (M = 3.16, SD = 1.03) compared to the perpetrators' (M = 1.43, SD = 0.52), B = 1.73, SE = 0.11, p < 0.001, R² = 0.53.

Main analyses focused on children's relative evaluations of restoration vs. punishment (using the difference scores described above for behavior ratings and liking scores). We found that children rated restoration significantly more positively than punishment (via an intercept-only model), M_{difference scores} = 0.46, SD = 1.05, B = 0.46, SE = 0.15, p = 0.004, Cohen's d = 0.44 (see Figure 2). On the liking scores (see Figure 2), children showed more favorable attitudes toward the restorer compared to the punisher, M_{difference scores} = 0.40, SD = 1.27, B = 0.40, SE = 0.18, p = 0.04, Cohen's d = 0.31 (via an intercept-only model). Consistent with these results, in the sticker allocation task, children tended to give the sticker to the restorer (n = 31 out of 48) rather than the punisher (n = 17 out of 48), p = 0.06 (via a binomial test) (see Table 1).
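The reported standard errors and effect sizes can be checked from the summary statistics alone. The sketch below assumes that the intercept-only model is equivalent to a one-sample test of the difference scores against 0, so that SE = SD / √n, and that Cohen's d was computed as M / SD; both are assumptions that reproduce the reported values, not a description of the authors' scripts. The function names are hypothetical.

```python
from math import sqrt


def standard_error(sd: float, n: int) -> float:
    """Standard error of a mean difference score (assumes SE = SD / sqrt(n))."""
    return sd / sqrt(n)


def cohens_d(mean: float, sd: float) -> float:
    """One-sample Cohen's d against 0 (assumes d = M / SD)."""
    return mean / sd


if __name__ == "__main__":
    n = 48  # Experiment 1 sample size
    # (mean, SD) of the difference scores reported for Experiment 1.
    for label, (mean, sd) in {
        "behavior ratings": (0.46, 1.05),
        "liking scores": (0.40, 1.27),
    }.items():
        se = standard_error(sd, n)
        d = cohens_d(mean, sd)
        print(f"{label}: SE = {se:.2f}, d = {d:.2f}")
```

With the Experiment 1 values, this yields SE = 0.15 and d = 0.44 for behavior ratings, and SE = 0.18 and d = 0.31 for liking scores, matching the figures reported above.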


Figure 2. Differences in behavior ratings and liking scores between the restorer and the punisher (restorer minus punisher) in Experiments 1, 2, and 3. In Experiment 1, differences in behavior ratings and liking scores were both significant, showing that children approved of the restoration behavior; in Experiment 2, only the difference in behavior ratings was significant; in Experiment 3, neither the difference in behavior ratings nor the difference in liking scores was significant. The error bars represent the 95% CI. ***p < 0.001, **p < 0.01, *p < 0.05.


Table 1. The number (proportion) of children who gave stickers to the punisher or the restorer in Experiments 1–3.

Connections Between Evaluations and Behaviors

When children's own block was taken away, 20 (out of 48, ~42%) children chose to punish the perpetrator, while 28 children chose to restore the block. A binomial test showed that the numbers of these two types of children (punitive children: children who chose punishment; restorative children: children who chose restoration) were not significantly different, p = 0.31.
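The binomial tests reported for count data can be illustrated with an exact two-sided test against chance (0.5). This is a minimal sketch, not the authors' R code: the function name binom_two_sided is hypothetical, and the two-sided p-value is defined here as the total probability of all outcomes that are no more likely than the observed one, which is one common convention.

```python
from math import comb


def binom_two_sided(successes: int, n: int, p: float = 0.5) -> float:
    """Exact two-sided binomial p-value.

    Sums the probabilities of all outcomes that are no more likely
    than the observed count under the null proportion `p`.
    """
    pmf = [comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]
    # Small tolerance so that ties with the observed probability are included.
    cutoff = pmf[successes] * (1 + 1e-7)
    return sum(prob for prob in pmf if prob <= cutoff)


if __name__ == "__main__":
    # 28 of 48 children chose restoration in the behavioral task.
    print(f"behavioral task: p = {binom_two_sided(28, 48):.2f}")
    # 31 of 48 children gave the sticker to the restorer.
    print(f"sticker allocation: p = {binom_two_sided(31, 48):.2f}")
```

Under this convention, the two calls give p-values of about 0.31 and 0.06, in line with the values reported for the behavioral task and the sticker allocation task in Experiment 1.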

We then compared evaluation results between these two types of children for each measure (adding type of children in the previous models), as shown in Table 2. On behavior ratings, restorative children (M_{difference scores} = 0.75, SD = 0.97) rated restoration more positively than punishment (difference scores compared to 0, p < 0.001). Punitive children, however, did not evaluate these two behaviors differently (M_{difference scores} = 0.05, SD = 1.05, which did not differ from 0, p = 0.85). Restorative children also scored significantly higher on the difference scores than punitive children, B = −0.70, SE = 0.29, p = 0.02, R² = 0.11. These results suggest that the relative positivity toward restoration shown above was especially driven by the restorative children.


Table 2. Means (standard deviations) of difference scores on behavior ratings and liking scores in Experiments 1–3 as a function of children's type (punitive or restorative).

On liking scores, these two types of children did not differ, B = −0.42, SE = 0.37, p = 0.26, R² = 0.03 (see Table 2). They did not significantly differ in their allocations of stickers either, χ² (1) = 2.19, p = 0.14 (see Table 3).


Table 3. The number (proportion) of children who gave stickers to the punisher or the restorer in Experiments 1–3 as a function of children's type (punitive or restorative).

Discussion

Experiment 1 found that children evaluated restoration more positively than punishment. Specifically, children rated restorative behaviors more highly than punitive ones (behavior ratings), preferred the victim who chose restoration (liking scores), and tended to share the sticker with the restorer (sticker allocations). This finding suggests that children prioritized compensating the victim over punishing the perpetrator. Interestingly, this trend was especially pronounced (at least on behavior ratings) in children who chose restoration when facing possession violations themselves. These findings further suggest that children's behavior paralleled their evaluations to some extent.

Experiment 2: Restoration vs. Moderate Retribution

In Experiment 1, one alternative reason why the preschoolers viewed punishment more negatively than restoration is that removing all three of the perpetrator's blocks was too harsh. Children may have thought that even though punishment was necessary, it was inappropriate to impose such a high degree of punishment. To test this possibility, in Experiment 2, we decreased the harshness of the punishment option by changing it to removing two instead of three of the perpetrator's blocks. That is, in the punishment condition, both the perpetrator and the victim consequently got one block (see Figure 1B). The study procedure was the same as Experiment 1, except that in the ‘punishment’ video, the punisher removed 2 of the perpetrator's blocks into the storehouse.