Prosodic Prominence in Polar Questions and Exclamatives

Repp, Sophie; Seeliger, Heiko

doi:10.3389/fcomm.2020.00053

ORIGINAL RESEARCH article

Front. Commun., 04 August 2020

Sec. Psychology of Language

Volume 5 - 2020 | https://doi.org/10.3389/fcomm.2020.00053

Sophie Repp

Heiko Seeliger^*

Department of German Language and Literature, University of Cologne, Cologne, Germany

This study investigates prosodic prominence in string-identical verb-first exclamatives and questions in German. It presents results from three production experiments comparing polar exclamatives/questions with different finite verbs [auxiliary, lexical verb (unergative)] and/or subjects (d-pronoun, full phrase) in order to explore the prominence-lending characteristics of various lexical, syntactic and semantic factors, which seem to be relevant for prosodic prominence in exclamations but not in other speech acts. The results show that clause-initial finite verbs are accented much more often in exclamatives than in questions, indicating that the C-position is an attractor for prosodic prominence in exclamatives. Furthermore, d-pronouns are accented very frequently in exclamatives but virtually never in polar questions. Given full subjects are also accented more often in exclamatives than in questions. With respect to verb type, the findings show that finite auxiliaries are only accented in exclamatives, but that lexical verbs are also accented in questions. Thus, the lexical verbs tested in this study may carry an accent irrespective of clause-initial or clause-final position and independently of speech act. While some of the findings can be explained by semantic-pragmatic factors, not all of them can. We suggest that exclamations have a prosodic constructional default, which is determined by the speech act type: it comprises a requirement for the accentuation of certain elements in the clause, a low speaking rate and a reduced sensitivity to information-structural requirements for low prosodic prominence.

Introduction

Many syntactic structures are ambiguous with respect to the speech act they express. For instance, the English declarative Peter is here may express an assertion or a question. Non-declarative structures also may be ambiguous. In English and German, verb-first structures may be used as questions or as exclamations. For example, the German sentence Hat der geschrien (lit.: has he screamed) may express the question “Did he scream?” or the exclamation “(Boy,) did he scream!” Verb-second wh-structures also may be used as questions or as exclamations in German. For instance, Was hat die für Schuhe gekauft (lit.: what has she for shoes bought) may express the question “What kind of shoes did she buy?” or the exclamation “The shoes she bought!”

It is generally agreed that speech-act-ambiguous structures are disambiguated by context and by prosody. The particular prosodic marking strategies contributing to the disambiguation have mainly been investigated for the disambiguation of assertions vs. questions, or for the disambiguation of different types of questions, for instance of information-seeking vs. rhetorical questions, or of information-questions vs. echo questions (for German e.g., von Essen, 1964; Bierwisch, 1966; Isačenko and Schädlich, 1966; Batliner, 1989a; Selting, 1995; Brinckmann and Benzmüller, 1999; Kügler, 2003; Schneider and Lintfert, 2003; Kohler, 2004; Peters, 2005; Niebuhr et al., 2010; Niebuhr, 2012; Truckenbrodt, 2012; Petrone and Niebuhr, 2014; Repp and Rosin, 2015; Wochner et al., 2015; Michalsky, 2017; Braun et al., 2019b; Neitsch and Niebuhr, 2019). Experimental investigations for other speech acts are rare. In the 1980s a number of sentence types that are associated with such other speech acts in German received some attention: exclamatives (speech act exclamation), imperatives (e.g., speech act order), optatives (wish) and adhortatives (suggestion), see inter alia the contributions in Altmann (1988) and Altmann et al. (1989). However, these investigations were rather restricted both in the linguistic materials that were used (very few items) and in the contextual control. More recent investigations of non-assertive speech acts other than questions are reported in Repp (2015, 2019), who compared exclamations to questions, more specifically wh-exclamatives to wh-interrogatives.

The current paper is concerned with the prosody of exclamations vs. questions, and explores prosodic prominence relations as a disambiguating factor. Exclamations are expressive speech acts (Searle, 1969). Speakers use expressive speech acts to reveal their psychological state concerning a state of affairs. For exclamations, this psychological state most commonly is surprise (e.g., d'Avis, 2002; Zanuttini and Portner, 2003) but other states can also be expressed, such as admiration, indignation, mockery or disgust, see e.g., Rett (2011) for discussion. The flexibility of expressing different psychological states is also found in echo questions and rhetorical questions (e.g., Repp and Rosin, 2015; Neitsch and Niebuhr, 2019). Exclamations are interesting from a prosodic perspective because the licensing conditions for the prosodic prominence relations in them—as reflected for instance in the accent distribution—seem to be somewhat different from the licensing conditions in other speech acts. In exclamations, these conditions seem to comprise factors like particular lexical items or speech-act-specific semantic properties. Furthermore, syntactic positions that in other speech acts would not be considered prominence-lending may be prominence-lending in exclamations. Finally, exclamations have been claimed to be less susceptible to discourse factors that otherwise have been shown to be highly relevant for prosodic prominence relations, most notably the givenness of expressions or referents in the discourse.

In this paper, we explore the role of some of these factors for the prosodic prominence relations in a specific type of string-identical exclamation vs. question: polar exclamatives vs. polar interrogatives. The factors that we are interested in concern on the one hand core-grammatical aspects (lexical choice, syntax, semantics), and on the other hand, pragmatic and discourse aspects. Polar exclamatives seem to display the kind of speech-act specific licensing conditions that we described above. However, our knowledge concerning these conditions in polar exclamatives is sketchy and the few suggestions in the literature have hardly been backed up by experimental or other quantitative evidence. In the following we will summarize what has been suggested about the prosody of polar exclamatives—and by way of comparison—polar questions. We will highlight in what way prosodic prominence seems to be particular in these speech acts, and we will lay out the precise research questions that our study addresses. Before we do this, however, we will briefly describe the syntax and the semantics of the two speech acts under investigation because these are important for the above-mentioned prosodic licensing conditions.

Polar Exclamatives and Polar Questions

In German, polar exclamatives and polar interrogatives are verb-first structures: a finite verb occurs in clause-initial position, see (1). The subject in (1) is a so-called d-pronoun. D-pronouns are personal pronouns that are often used in colloquial German, which is the register where polar exclamatives typically occur. D-pronouns are homophonous with the respective definite articles, e.g., der can be used as a nominative masculine singular personal pronoun (“he”) and as a nominative masculine singular definite article (“the_MASC.SING”). Many examples in the theoretical literature on exclamatives contain a d-pronoun (e.g., Rosengren, 1992; Brandner, 2010; d'Avis, 2013) but these pronouns also occur regularly in assertions and questions.

(1) [example not preserved in the source text]

The verb-first structure in (1) roughly has the meanings given in the English translations. As a polar exclamative, (1) may be interpreted as expressing the speaker's surprise at the length of time that the subject referent talked. Other psychological states like indignation also may be expressed but since the experiments conducted for this study only tested exclamatives expressing surprise we will not discuss other states here. As a polar interrogative, (1) inquires whether or not the subject referent talked for a long time.

Polar exclamatives are not polar in the same sense as polar interrogatives are. As we just mentioned, polar interrogatives inquire whether or not a certain state of affairs holds. In other words, they expect a positive-polar or a negative-polar answer. Polar exclamatives, in contrast, presuppose that a certain state of affairs holds. They presuppose the equivalent of a positive-polar answer. However, they are not truly polar because they do not express surprise at the fact that the state of affairs holds, i.e., in (1) that the subject referent talked (for a long time). Rather, they express that the speaker is surprised at the degree to which something holds, which exceeds the expected or standard degree (d'Avis, 2002; Rett, 2008, 2011; Brandner, 2010). In (1), which contains the gradable predicate lange (“long”), the scale for the observed degree and the expected degree is the length of time: (1) expresses that the length of the talking in the situation exceeded the speaker's expectations regarding talking time. The term polar exclamative thus essentially concerns the string identity of polar exclamatives and polar interrogatives and is not a semantic notion. Another note on terminology that we would like to make is that polar exclamatives and polar interrogatives are sentence types, which are used to express the speech acts exclamation and directive (= question), respectively. For ease of exposition, we will use the terms polar exclamative and polar question as short-hand for the two speech acts that we are considering: degree-related exclamations that are realized by a verb-first structure and polar questions that are realized by a verb-first structure.

The Prosody of Polar Exclamatives vs. Polar Questions

As already mentioned, exclamations are an interesting case when it comes to prosodic prominence relations. There is agreement that they contain a so-called exclamative accent. This accent is usually described as a very prominent accent (see below for details). Its position seems to be tied to certain lexical elements and syntactic positions, which can be considered speech-act specific prominence-lending factors, and is not determined by default sentence accent assignment rules or influenced by information structure (Jacobs, 1988; Oppenrieder, 1988). In other words, the speech act exclamation seems to dictate where the exclamative accent occurs and how prominent it is. For polar exclamatives like (2), Rosengren (1992) suggests that there typically is an exclamative accent on the argument of the exclamative relation (the subject d-pronoun die) and/or on the (gradable) predicate, i.e., the adjective schön. There might also be accents on both of these elements.

(2) [example not preserved in the source text]

In polar exclamatives without an adjective and with a finite lexical verb like (3), ideally two accents occur. According to Rosengren (1992), an accent on only the argument, Leo, produces a reading with a narrow focus on Leo. According to Batliner (1989a), an accent on only the verb, säuft [“drinks (alcohol)”], produces a question interpretation. Note that the predicate denoted by säuft is not a gradable predicate although in the habitual reading it obviously means that the drinking person drinks regularly, and thus a lot. However, (3) may also have a non-habitual meaning: Leo is drinking heavily in the current situation. Thus, the non-gradable predicate can be coerced into a gradable reading.

(3) [example not preserved in the source text]

An accentuation option that has not been discussed much in the literature is that in structures like (2) the exclamative accent intuitively may also occur on the finite auxiliary [in (2) preferably without an accent on the gradable predicate]. For finite auxiliaries in wh-exclamatives, it has been shown that they are typical attractors of prosodic prominence, provided they occur in the C-position (Repp, 2019). (4) illustrates that wh-exclamatives may occur with verb-second word order, i.e., with the finite auxiliary in the C-position (4a), or with verb-final word order, i.e., with the finite auxiliary in ν/V/T and without overt C (4b). Repp (2019) found that in verb-second wh-exclamatives, the auxiliary is regularly accented whereas in verb-final wh-exclamatives it never is (see Section Discussion for a discussion of potential syntactic-semantic reasons for this observation).

(4) [example not preserved in the source text]

In the polar exclamative in (2), the clause-initial auxiliary is in the C-position. So, if our intuitions are correct and polar exclamatives pattern with verb-second wh-exclamatives, the auxiliary in (2) can carry a prominent accent. Another reason to assume that the finite auxiliary might be accented in polar exclamatives is Truckenbrodt's (2012) suggestion that in exclamatives with a degree reading, the nuclear accent may occur on an element toward the beginning of the clause, i.e., “early,” rather than toward the end, which would be the default. Note in this connection that Repp (2019) also found that in wh-exclamatives, finite auxiliaries are not quite as often the carrier of a prominent accent as subject d-pronouns are. Speakers typically choose one or the other, and they choose the subject d-pronoun more often.

The accents on the auxiliary and the d-pronoun in wh-exclamatives are prominent in the sense that they are phonetically more prominent than when they occur in a corresponding wh-question. They have a higher maximum pitch and a larger pitch excursion, and the accented syllable is longer and louder (Repp, 2019). For declarative structures that are used as exclamations, it has also been observed that the exclamative accent is prominent phonetically: it has a higher and later pitch peak and a longer duration than a non-exclamative nuclear accent in the same structural position would have in assertions (also cf. Oppenrieder, 1988; Scholz, 1991; see Batliner, 1988a,c for perception studies).

On the basis of the previous findings about accentuation in exclamatives, we may assume that in polar exclamatives like (2) all three lexical items may carry an accent—although maybe not at the same time. The reasons for the attraction of prosodic prominence are different for the three items. In the case of the finite auxiliary, previous research suggests that the reason might be syntactic: the auxiliary occurs in the C-position. This assumption gets support from the suggestions about (3), where the arguably accented lexical verb säuft (“drinks/is drinking”) also is in the C-position. However, the accentuation of säuft may also be a consequence of default accentuation because in unergative intransitive sentences, the lexical verb may (but need not) carry the nuclear accent by default—at least in assertions (for German cf. Uhmann, 1991; Féry, 1993, 2011; Kratzer and Selkirk, 2007; Verhoeven and Kügler, 2015). This observation in turn raises the issue of whether lexical verbs that are not finite—and thus do not occur in C—typically carry an accent in exclamatives or not. This connects to the issue of an “early” nuclear accent in degree exclamatives mentioned above (Truckenbrodt, 2013). Turning to the adjective in (2), the reason for the attraction of prosodic prominence is lexico-semantic: the adjective is the gradable element in the exclamative relation, which essentially is a degree relation. An interesting question arising here is whether it is the gradability that is decisive or the fact that the adjective is the predicate of the exclamative relation. Finally, the d-pronoun may attract the accent for several reasons. The reason that is given in the literature is semantic: the d-pronoun is the argument of the exclamative relation. However, in all the examples considered thus far the argument of the degree relation also is the subject of the clause. 
So the reason may also be syntactic: the d-pronoun in (2) is the subject, and subjects might just happen to attract prosodic prominence in exclamatives—maybe because they occur toward the beginning of the clause. An unlikely reason for the d-pronoun attracting prosodic prominence in (2) is the choice of lexical item—but only if in polar exclamatives like (3), a completely different subject, Leo, must indeed be accented.

Exclamatives like (2) contain only three monosyllabic words, and for each of them there are potentially good reasons to attract prosodic prominence. An obvious question is therefore whether speakers ever choose to place three accents. This is unlikely because rhythmical patterns with alternating strong and weak syllables are generally preferred (e.g., Liberman and Prince, 1977; Hayes, 1984; Selkirk, 1984; Couper-Kuhlen, 1986; Jessen, 1999; Domahs et al., 2008). Thus, we may ask if there are preferences for certain accentuation patterns. In short exclamatives, speakers may display a preference for the realization of only one of the potential accents if they do not divide the clause into smaller phrases; recall that in verb-second wh-exclamatives there seems to be a preference for accenting the d-pronoun over the finite auxiliary. An additional question that we may ask in this context is whether in sentences that allow for rhythmical alternations due to their greater length, speakers place more accents on the relevant elements than in sentences that are prosodically short.

In summary, the present study aims at answering the following research questions concerning polar exclamatives:

(5) Research questions concerning polar exclamatives

i. Is the C-position an attractor for prosodic prominence independently of verb type?

ii. Are lexical verbs attractors for prosodic prominence irrespective of finiteness?

iii. Are gradable adjectives attractors for prosodic prominence if they are not the predicate of the exclamative relation, i.e., is gradability the decisive characteristic?

iv. Are subjects attractors for prosodic prominence independently of their form as d-pronoun?

v. Does the length of a polar exclamative influence the prominence relations?

Questions (i–iv) address the issue of whether the speech act exclamation requires the prosodic prominence of particular elements in polar exclamatives in terms of lexical choice, syntax (syntactic position, grammatical function) and semantics (gradability, exclamative relation). This issue is relevant for a wider understanding of prosodic prominence because it explores interface factors that are not immediately relevant when looking at assertive speech acts. Although it is widely accepted that syntactic factors play an important role for accentuation patterns, semantic factors like gradability or the exclamative degree relation are not typical contenders for being prominence-lending factors¹. Furthermore, although it is widely accepted that certain lexical items—like function words—typically are unstressed and thus unaccented, it is not usually assumed that certain lexical items—like d-pronouns—must be accented (and not just carry word stress). Note that such lexically induced prosodic prominence is different from expectation-based prosodic prominence where certain words in the linguistic context—like focus particles—raise the expectation that the subsequent word is accented (e.g., Baumann and Winter, 2018).

As already discussed, our study juxtaposes polar exclamatives with string-identical polar questions. This comparison is informative beyond providing a baseline for the above-mentioned prominence-lending factors in exclamatives, as we will see in a moment. Concerning the baseline aspect, the prosody of polar questions prima facie does not seem to be susceptible to any of the lexical or semantic factors that may be relevant for exclamatives. We do not expect that d-pronouns, gradable predicates or the argument of the degree relation are typical attractors of prosodic prominence in questions. Neither do we expect that the C-position is a prominence-lending position. Rather, we expect the default accent patterns that are familiar from assertions, with the caveat that the altered word order—the finite verb occurs in clause-initial position—might lead to altered default accentuation patterns. If a lexical verb that carries the nuclear accent in an assertion occurs in clause-initial position in a question, this will have consequences for the position of the nuclear accent.

Our expectations concerning the baseline aspect are backed up to some extent by the comparison of wh-questions and wh-exclamatives reported in Repp (2019): in wh-questions speakers placed an accent on the subject d-pronoun much less frequently than in wh-exclamatives. Maybe surprisingly, in wh-questions speakers accented the auxiliary in the C-position more often than in wh-exclamatives. However, this observation might have to do with the information-structural setup of the experiments: much of the material in the target sentences was given, so that C became a good candidate for accentuation in the questions. Finally, in wh-questions, the wh-pronoun also was a frequent carrier of an accent, which virtually never happened in wh-exclamatives.

Beyond the baseline aspect, a comparison of polar exclamatives and polar questions is also informative with respect to the contextual sensitivity of exclamatives. Some researchers have argued that exclamatives are information-structurally inert: they are not marked prosodically for focus (Jacobs, 1988; Oppenrieder, 1988). Others have claimed that there are only rudimentary information-structural effects (Batliner, 1988c). Repp (2019) found that wh-exclamatives seem to be fairly rigid in their accent patterns irrespective of contextual factors: in wh-exclamatives, object nouns denoting given referents were hardly ever deaccented (for the prosody of givenness in assertions, see e.g., Peters, 2005; Baumann, 2006; Röhr and Baumann, 2010; Baumann et al., 2015). In wh-questions, in contrast, object nouns denoting given referents were deaccented. Whether wh-exclamatives show no effect of information status whatsoever remained unclear in Repp's study. Reduced phonetic prominence on given referents (lower maximum pitch and a smaller pitch excursion) was only reliable in one of the two experiments. Considering these results, the issue arises whether the same holds for polar exclamatives. A recent study by Seeliger and Repp (2020) suggests that this is indeed the case. Seeliger and Repp investigated information structure in polar exclamatives with a different syntactic structure from the exclamatives tested in the current study. They tested transitive sentences with new, given or contrastive objects. They found that givenness did not result in deaccentuation, although the number of the arguably more prominent L+H* accents (vs. H*) was lower for given objects than for new objects. So, there was some givenness marking. For contrastive objects, this study did find clear prosodic reflexes. We will discuss these results in detail in Section Discussion.

A very obvious but crucial difference between exclamatives and questions that we have not discussed yet is their overall prosodic contour. Exclamations always have a falling contour (Altmann, 1993; Repp, 2015, 2019). Verb-first questions may end in a high or low boundary tone but the final pitch accent is always rising (Kügler, 2003; also cf. Kohler, 2004). We are assuming here that a low pitch accent followed by a high boundary tone is functionally equivalent to a rising pitch accent followed by a high boundary tone. The choice of high and low boundary tones in questions has been associated with the speaker's attitude toward the addressee or toward the answer. A high boundary tone has been suggested to indicate a friendly/interested attitude toward the addressee, or that the speaker has no clear expectation with respect to the answer. A low boundary tone has been associated with a clear answer expectation (for German see e.g., von Essen, 1964; Batliner, 1988c; Selting, 1995; Kügler, 2003; Kohler, 2004; Peters, 2005; Petrone and Niebuhr, 2014). Furthermore, the speech mode might play a role. In read speech, polar questions end more often in a rise than in spontaneous speech (Michalsky, 2017). Phonetically, questions ending in a high boundary tone display a higher pitch offset than other utterances and than intonation phrases forming part of a longer utterance (continuation rises), irrespective of their syntax (Michalsky, 2017). Michalsky suggests that a higher pitch offset is typical of interrogativity in general. Previous research furthermore suggests that the phonetic features of prenuclear accents may be used by listeners to differentiate assertions from declarative questions (Petrone and Niebuhr, 2014). Finally, questions seem to have a higher speaking rate than assertions (e.g., Niebuhr et al., 2010; also see van Heuven and van Zanten, 2005 for Dutch and Orkney English).

Concerning global aspects of the utterance like speaking rate, exclamations also seem to differ from other speech acts. Exclamations are longer than string-identical assertions (Altmann, 1993) and questions (Repp, 2019), i.e., the speaking rate is lower. We may speculate that this is a consequence of exclamations being expressive speech acts, plausibly involving a greater emotional arousal, which might involve a slower, more expansive speaking style. Higher emotional arousal has been suggested to have prosodic effects, even if speaking rate has not been investigated specifically. There seem to be global effects like a greater pitch range (Bänziger and Scherer, 2005), which also is picked up by listeners during interpretation (Ladd et al., 1985). Furthermore, heightened arousal has been associated with a greater global intensity (Lieberman and Michaels, 1962; Banse and Scherer, 1996; and subsequent research). All these phonetic effects are indications that emotional arousal is a prominence-lending property in the sense that it increases the prosodic prominence of an entire utterance. This arguably makes the utterance stand out in comparison to other utterances in the discourse.
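The relation between utterance duration and speaking rate stated above can be made concrete with a minimal sketch. It assumes that speaking rate is expressed in syllables per second, which is a common convention but is not specified in the text; the function name and the durations are hypothetical and serve illustration only.

```python
def speaking_rate(n_syllables: int, duration_s: float) -> float:
    """Return the speaking rate in syllables per second (assumed unit)."""
    if duration_s <= 0:
        raise ValueError("duration_s must be positive")
    return n_syllables / duration_s


# Invented durations for one string-identical verb-first sentence,
# for illustration only (not measurements from any study).
exclamative_rate = speaking_rate(n_syllables=4, duration_s=1.0)
question_rate = speaking_rate(n_syllables=4, duration_s=0.8)

# With identical segmental material, the longer utterance has the lower rate.
print(exclamative_rate < question_rate)
```

Because the two utterances are string-identical, the syllable count is constant, so a comparison of speaking rates reduces to a comparison of durations.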

Based on these findings we formulate the following additional research questions regarding our comparison of polar exclamatives and polar questions:

(6) Research questions concerning a comparison of polar exclamatives and polar questions

vi. Do polar exclamatives display less sensitivity to information-structural demands imposed by the context than polar questions?

vii. Are the speech acts reliably distinguished by their falling contour (exclamatives) vs. various characteristics of interrogativity marking (final rise, rising nuclear accent etc.)?

viii. Are polar exclamatives uttered with a lower speaking rate than polar questions?

ix. Are polar exclamatives uttered with a greater intensity than polar questions?

x. Are polar exclamatives uttered with a higher pitch range than polar questions?

Prosodic Prominence

As laid out in the previous sections, the goal of this study is to explore prosodic prominence in polar exclamatives and in polar questions. We took for granted that the presence of an accent increases prominence and we also suggested that various acoustic features contribute to the prosodic prominence of utterance parts or of the entire utterance. In this section, we briefly summarize what prosodic characteristics have been argued to contribute to prosodic prominence—which is also an issue of what is perceived as prominent. For reasons of space we restrict our discussion to German and we focus on research that is concerned with the prominence of words in utterances. This research builds on earlier work investigating prominence within words, i.e., lexical stress, which suggests that inter alia the following factors contribute to prominence: segmental changes resulting in hyperarticulation, more pitch movement, longer duration, higher intensity, changed spectral balance (for an overview, see Gordon and Roettger, 2017; van Heuven and Turk, 2020; cf. Baumann and Winter, 2018).

Both (categorical) phonological and phonetic factors have been argued to be relevant for the prosodic prominence of words in utterances. Phonologically, a syllable carrying a pitch accent uncontroversially is more prominent than a syllable not carrying an accent. Different pitch accent types seem to differ in their perceived degree of prominence (e.g., Jessen, 1999; Niebuhr, 2009; Baumann, 2014; Baumann and Röhr, 2015; Baumann and Winter, 2018). In general, rising accents are perceived as more prominent than falling accents. Baumann and Röhr (2015) tentatively suggest the following prominence hierarchy from high to low prominence, in terms of GToBI (Grice et al., 2005): rising (L+H* > L*+H > H*) > falling (H+!H* > H+L* > L*). The rising !H* accent grouped with the falling accents in this study. As the GToBI labels indicate, the steepness of the fall/rise plays a role, with a greater steepness being perceived as more prominent. Overall, high starred tones are perceived as more prominent than downstepped or low starred tones. The role of the type of pitch accent for prominence perception was also shown in a larger-scale study by Baumann and Winter (2018).
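The tentative hierarchy quoted above can be written down as an ordered list, which makes pairwise comparisons of accent types explicit. The following is a minimal sketch only: the names `HIERARCHY`, `rank` and `more_prominent` are hypothetical, and the !H* accent is left out because the text states only that it grouped with the falling accents, not where exactly it ranks.

```python
# Tentative prominence hierarchy as quoted in the text (Baumann and Röhr, 2015),
# ordered from most to least prominent. !H* is omitted (rank not specified).
HIERARCHY = ["L+H*", "L*+H", "H*", "H+!H*", "H+L*", "L*"]

# Grouping as given in the text: the first three count as rising.
RISING = set(HIERARCHY[:3])


def rank(accent: str) -> int:
    """Return the position in the hierarchy (0 = most prominent)."""
    return HIERARCHY.index(accent)


def more_prominent(a: str, b: str) -> bool:
    """True if accent type a ranks above accent type b in the hierarchy."""
    return rank(a) < rank(b)


print(more_prominent("L+H*", "H*"))   # True
print(more_prominent("L*", "H+L*"))   # False
```

An ordered list is used rather than numeric scores because the source proposes only an ordinal ranking and gives no distances between accent types.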

Phonetically, many of the features that are relevant for lexical stress have been found to contribute to the perceived prominence of words within utterances, too. As may be expected from the findings about the accent types, a higher pitch excursion contributes to prominence (e.g., Mixdorff and Widera, 2001; Arnold et al., 2013; Baumann and Winter, 2018). Other relevant pitch measures are a higher maximum and, to a lesser extent, a higher minimum pitch (Baumann and Winter, 2018). Furthermore, a longer syllable duration contributes to increased prominence (e.g., Mixdorff and Widera, 2001; Tamburini and Wagner, 2007; Baumann and Winter, 2018), but it has been suggested that duration is most relevant for low prominence levels, whereas higher prominence levels are associated with pitch measures (Mixdorff and Widera, 2001; also cf. Niebuhr, 2009 for the greater role of pitch vs. duration). For intensity, the results have been mixed. Sometimes intensity has been found to be a reliable factor (Baumann and Winter, 2018), sometimes it has not (Nöth et al., 1991). Furthermore, spectral tilt and spectral emphasis play a role (Baumann and Winter, 2018).
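Pitch excursion is often reported in semitones so that values are comparable across speakers with different pitch registers. The sketch below assumes this semitone convention and a hypothetical function name; the text does not state which unit the cited studies used.

```python
import math


def excursion_semitones(f0_min_hz: float, f0_max_hz: float) -> float:
    """Pitch excursion between two f0 values in semitones.

    Uses the standard conversion 12 * log2(f_max / f_min); applying it
    here is an assumption, not a description of the cited studies.
    """
    if f0_min_hz <= 0 or f0_max_hz <= 0:
        raise ValueError("f0 values must be positive")
    return 12 * math.log2(f0_max_hz / f0_min_hz)


# A doubling of f0 corresponds to one octave, i.e., 12 semitones.
print(excursion_semitones(100.0, 200.0))  # 12.0
```

On this scale, a rise from 100 to 200 Hz and a rise from 200 to 400 Hz count as the same excursion, which a difference in Hz would not capture.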

Many of the above features have been identified indirectly, as it were, because they have been found to be present in the marking of information-structural categories that have been associated with high vs. low prosodic prominence, e.g., focus and new information vs. given information. We cannot do justice to this vast literature here (see e.g., Batliner, 1989b; Peters, 2002; Baumann, 2006; Baumann et al., 2006; Baumann and Riester, 2013). It is noteworthy, though, that later peak alignment, which has been associated with higher prosodic prominence for the marking of e.g., contrastive focus vs. narrow focus (Grice et al., 2017) may be interpreted by listeners as signaling surprise when the change is from a medial to a late peak (Kohler, 1991). This is of course directly relevant for exclamatives.

The Experiments: Overview

The present study addresses the research questions formulated in the previous section in three production experiments. In the experiments, speakers produced polar exclamatives and polar questions embedded in a dialogue context. The experiments differed in the lexical make-up so that questions (i–v) about the specific licensing conditions for prosodic prominence in polar exclamatives could be explored. Questions (vi–x) were explored in all three experiments.

The target sentences were verb-first structures, see Table 1. Experiment 1 tested verb-first structures with a finite auxiliary in the clause-initial C-position, a full subject consisting of a definite article and a noun, a gradable adjective and a non-finite lexical verb in the past participle form. The tense was present perfect, which in German is used in oral speech to talk about the past. In Experiments 2 and 3 the tense was present tense, which allowed a change of the verbal lexical material: instead of a finite auxiliary, a finite lexical verb occurred clause-initially. Experiment 3 further differed from Experiment 2 in the form of subject. In Experiment 3 the subject was a d-pronoun. All multi-syllabic words in the materials were stressed on the penultimate syllable. For ease of exposition, we will refer to words rather than syllables when talking about prosodic prominence in the following—we always mean the stressed syllable in the respective word, which is the exponent of prominence.

Table 1. Target sentences in the three experiments.

Research questions (i) and (ii) were addressed in the above design through the variation of the tense in Experiments 1 vs. 2/3. If (i) the C-position is an attractor for prosodic prominence in exclamatives independently of verb type, the finite verb should be prosodically prominent in all three experiments, e.g., carry an accent. If it is the auxiliary rather than the C-position that attracts prominence, we should see differences between the experiments. Furthermore, if, as has been found for wh-questions, the auxiliary in C (also) carries an accent in polar questions, we expect differences in terms of accent type or phonetic differences between the two speech acts. If (ii) lexical verbs are attractors for prosodic prominence irrespective of finiteness, lexical verbs should be prominent in exclamatives irrespective of their syntactic position. Since, as we argued above, the lexical verbs in the present study might also be accented because of default accentuation of unergative verbs, there should be no difference between exclamatives and questions.

Question (iii), whether gradable adjectives attract prosodic prominence even if they are not the predicate of the exclamative relation, is addressed as follows. We already mentioned that the adjective always was gradable. However, it was not the “immediate” predicate in the exclamative relation in the sense that it was predicated of the subject. Rather, it was used adverbially as a modifier of the lexical verb, in Table 1 gut fahnden (lit. well-investigate). According to Rosengren (1992), gut would still be the predicate in the exclamative relation, simply because it is gradable. Compositionally, however, there is no actual predication relation between subject and adverbial adjective. Rather, the combined meaning of verb and adjective is the predicate of the exclamative relation in Rosengren's sense. In more current parlance, we would not talk about an exclamative relation. Instead, we would assume that the exclamatives in Table 1 express that the speaker is surprised that the degree of goodness (i.e., the quality) of the investigation carried out by the dogs exceeds the expected degree of goodness of investigations in comparable situations. Now, if gradability is what matters for prosodic prominence (= Rosengren's view), the adverbial adjective should be prominent in the exclamatives but not in the questions because the latter do not “depend” on the degree semantics contributed by the adjective. If, however, the relation between subject and predicate matters, there should not be a difference between exclamatives and questions. Of course, other scenarios and explanations are conceivable as well. It might be the case that speakers make adverbial modifiers prominent in both speech acts just because they are adverbial modifiers. Something along these lines has been suggested as a default accentuation rule for adverbials (cf. Gussenhoven, 1984; Selkirk, 1984).

Question (iv) about the lexical form of the subject is addressed by comparing Experiments 1/2 vs. 3. If d-pronouns are lexical items that attract prosodic prominence, there should be greater differences between exclamatives and questions in Experiment 3 than in the other experiments. We also expect accents to occur frequently on the d-pronoun in exclamatives. Note that our comparisons between experiments in this study will only be on a general level because we conducted no statistical comparisons between experiments. See the Results and Discussion sections for details.

Question (v), whether there are more accents in longer than in shorter exclamatives for rhythmical reasons, is addressed by a comparison of the three experiments. The target sentences of Experiment 1 have more syllables than those of Experiment 2, which in turn have more syllables than those of Experiment 3.

Question (vi) regarding the information-structural inertness of exclamatives will be discussed in the Materials section, where we present the context in which the target utterances were embedded. Questions (vii–x) concerning the comparison of exclamatives and questions with respect to more general characteristics (overall contour, utterance duration, intensity, and pitch range) find immediate translations into predictions for the comparison of polar exclamatives vs. polar questions in all three experiments.

In addition to research questions (i–x), the present study also explored the influence of speaker sex on the prosodic realization of exclamatives and questions. Speaker sex was reported to be potentially relevant for the production of exclamations in earlier literature. For the production of wh-exclamatives, Repp (2019) reports that male and female speakers displayed different preferences for the accentuation of d-pronouns vs. finite auxiliaries. Female speakers never accented the finite auxiliary whereas male speakers often did. There were also finer phonetic differences, which were difficult to explain. Oppenrieder (1989) reports that male speakers produce longer exclamatives than female speakers. We will not discuss these differences in this section because they could not be replicated in the present study. Instead, other, unsystematic effects occurred, see the Results sections.

Another type of speaker-related variation that has been reported in earlier literature is that there overall seem to be personal preferences of various sorts such that some speakers have a preferred accentuation pattern for wh-exclamatives (e.g., some speakers always accent d-pronouns), whereas other speakers do not have the same preference or have no preference at all and are inconsistent (Repp, 2019). We are exploring here if there are also such preferences for verb-first structures. We will detail this issue in the Results and Discussion sections where relevant.

Materials and Methods

All experiments had a 2 × 2 design, with the within-subjects factor SPEECH ACT and the between-subjects factor SEX. There were 10 experimental items, each in two conditions: exclamative vs. question. The two conditions were string-identical. They were verb-first sentences with a different lexical make-up in the three experiments (see previous section). The target sentence occurred with an exclamation mark in the exclamative condition and with a question mark in the question condition. All target sentences were embedded in a colloquial-register dialogue context, see (7) for a sample item from Experiment 1 (see the Supplementary Materials for a list of all experimental items). Participants were told that they would take part in a (pseudo)-dialogue between two comic book authors, who are discussing story ideas for the next volume in a comic series. Participants read and heard the part of the first comic book author. Then they read and vocally enacted the part of the second author, which contained the target sentence flanked by two other sentences. The flanking sentences served to support a reading of the target sentence as exclamative vs. question. In the exclamative condition, the sentences make clear that the second author knows exactly what is going on in the comic story and gives some enthusiastic comments. In the question condition, the sentences reveal that the second author is not sure about the contents of the story and makes some enquiries. In the sample item in (7), the two sentences flanking the target sentence might be taken as signaling that speaker 2 has an expectation for a positive answer. However, note that being motivated and being intent on being successful does not necessarily imply that the way the investigation was carried out was good. 
The materials were not completely balanced for answer expectations: six items might be taken to signal a rather clear answer expectation, whereas for the remaining four this was unlikely (see the Supplementary Materials). We address this issue again in the Discussion section.

The context for the overall story was always given by the first author. It introduced the subject referent(s) of the target sentence and prepared for the action that was described in that sentence. In (7) the context introduces some dogs that start a private investigation agency. This sets the scene for the action that the verb-first exclamative exclaims about—that the dogs carried out a very good investigation—or that the verb-first question asks about—whether the dogs carried out a good investigation. Thus, the information-structural setup of the materials was such that the subject of the target sentence was given and that the meaning of the lexical verb was new but accessible. In assertions, such an information-structural setup typically induces deaccentuation of the subject but not of the lexical verb.

Since the tense of the target sentence was present perfect in Experiment 1 and present tense in Experiments 2 and 3, the contexts of the latter experiments were adapted so that the story was set in the present.

In addition to the 20 experimental items, there were 20 fillers. The target sentences in the fillers were declarative verb-second structures containing a negation. Half of them expressed the speech act (corrective) exclamation, the other half expressed the speech act (double-checking) question, e.g., Die Katzen wollen kein Restaurant aufmachen!/? (lit: “the cats want no restaurant open.make”; translation: “The cats do not want to open a restaurant!/?”). The fillers were the same in all three experiments except that in Experiment 3 the subject was a d-pronoun. Fillers and experimental items were presented in a pseudorandomized order.

The recordings were made in a sound-proof booth. The items were presented on a computer screen. A recording session started with three practice items. The items were presented in four blocks of ten items. Between blocks, participants were engaged in a short distraction task. In all trials, first the contribution of the first author was presented visually and auditorily. The visual presentation was a speech bubble next to a cartoon picture of a female person. The auditory presentation was from a recording through headphones. All stimuli had been pre-recorded by a female speaker and checked for naturalness by three native speakers. After the contribution of the first author, participants clicked on a key on the keyboard. A unisex cartoon shadow of another person appeared on the screen together with a speech bubble containing the text to be spoken by the participant (= second comic book author). When the participant had silently read that text, they started the recording. They could repeat their utterances if they felt that the recording was not good enough in any sense.

The three experiments had 20 participants each, who were mainly recruited from the student population of the University of Cologne (mean age: 24.3; range 18–52; 71% from North Rhine-Westphalia). Each participant took part in only one of the experiments. Participants were paid or received course credit.

Results and Discussion

Of the 1,200 utterances, 42 were excluded because of disfluencies or text errors (Exp. 1: 10/13 questions/exclamatives; Exp. 2: 2/4; Exp. 3: 6/7). All utterances were annotated for syllables and accents according to the GToBI system by three trained research assistants such that at least two assistants annotated each utterance. For annotations where annotators disagreed, consensus was reached by joint final annotation. The identification of accents was carried out in two steps: first, annotators only labeled where they perceived any accents at all, prenuclear and nuclear; second, annotators identified the nuclear accent. After the second round of annotation, remaining differences in annotation were inspected on a case-by-case basis and a final annotation was agreed on. The presence or absence of a nuclear accent was problematic in 8–13% of utterances (depending on the experiment), while the kind of the nuclear accent was problematic in 3–7% of utterances.

Some syllables in the lexical materials were such that they very often were not realized. This concerned unstressed syllables of the form /Nən/, where /N/ stands for any nasal. In most realizations of this sequence, there is no schwa and the two nasals assimilate in place of articulation. For example, singen (“to sing”), /zIŋən/, was usually realized as [zIŋ:]. It is not generally possible to locate a syllable boundary within the lengthened nasal. In order to keep the number of syllables per utterance constant during annotation, we annotated the non-realized syllable by assigning to it the final period of the waveform of the nasal. This resulted in syllable durations of 5–10 ms, depending on the pitch, and allowed us to divide syllables into realized and non-realized syllables after the annotation on the basis of the bimodal distribution of duration. Non-realized syllables were treated differently in the various analyses (see below).
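The duration range given above follows from the fact that one period of a periodic waveform lasts 1/f0 seconds. The following Python lines are a minimal sketch of this arithmetic and of a duration-based split into realized and non-realized syllables. The f0 values of 100 and 200 Hz and the 20 ms cut-off are illustrative assumptions; the cut-off actually used to separate the two modes of the distribution is not stated here.

```python
def period_ms(f0_hz):
    """Duration of one waveform period in milliseconds at a given f0."""
    return 1000.0 / f0_hz


def is_realized(duration_ms, cutoff_ms=20.0):
    """Classify a syllable by duration.

    cutoff_ms is a hypothetical threshold lying between the two modes of
    the bimodal duration distribution; it is not taken from the study.
    """
    return duration_ms > cutoff_ms


print(period_ms(100.0))  # 10.0 (low-pitched voice: one period lasts 10 ms)
print(period_ms(200.0))  # 5.0  (higher-pitched voice: one period lasts 5 ms)
```

A placeholder syllable of one period thus falls far below the duration of any realized syllable, which is why the two groups can be separated after annotation.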

The utterance-level acoustic measures that we investigated were duration (speaking rate), pitch range, mean intensity and intensity range. Syllable-level acoustic measures were analyzed for accented syllables with enough data points in each condition. The measures were maximum pitch, minimum pitch, pitch range, temporal alignment of the pitch peak, syllable duration, mean intensity. The acoustic measures were extracted automatically using ProsodyPro (Xu, 2013). Pitch contours were corrected manually by unvoicing octave jumps. Outliers in all measures were investigated individually. If they represented measurement errors, we corrected them; if they represented real data points, we left them in the data.

For the statistical analysis, we fitted mixed models. Unless otherwise noted, all models included the fixed effects SPEECH ACT and SEX, their interaction, random intercepts for subjects and items, and a random by-subject slope for the factor SPEECH ACT. Both factors were sum-to-zero contrast coded. In cases of model convergence problems and/or singular model fits, we first removed the random slope. If problems persisted, we removed the interaction of the fixed effects from the model. We will note any divergences from this general model formula. All interval scale data were analyzed with the R package lme4 (Bates et al., 2015). The p-values that we report for these data are based on the Kenward-Roger approximation (lmerTest; Kuznetsova et al., 2017). For further details, see below. We only report significant results.
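To illustrate what sum-to-zero contrast coding means for the fixed-effects part of such a model, the following Python sketch builds the design-matrix row for each cell of the 2 × 2 design. The values +1/−1 and the choice of which level receives +1 are assumptions made for illustration; the signs of the coefficients reported below are compatible with this assignment, but the coding values themselves are not given in the text.

```python
# Assumed sum-to-zero coding; the level that receives +1 is an assumption.
SPEECH_ACT = {"question": 1.0, "exclamative": -1.0}
SEX = {"female": 1.0, "male": -1.0}


def design_row(speech_act, sex):
    """Return [intercept, SPEECH ACT, SEX, SPEECH ACT x SEX] for one cell."""
    a = SPEECH_ACT[speech_act]
    s = SEX[sex]
    return [1.0, a, s, a * s]


# One row per cell of the 2 x 2 design.
cells = [(a, s) for a in SPEECH_ACT for s in SEX]
rows = [design_row(a, s) for a, s in cells]

# Every predictor column sums to zero across the four cells.
column_sums = [sum(col) for col in zip(*rows)]
print(column_sums)  # [4.0, 0.0, 0.0, 0.0]
```

Under this coding and in a balanced design, the intercept corresponds to the grand mean, and the coefficient of a factor corresponds to half the difference between its two level means.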

Results for the Utterance-Level Acoustic Measures

Data Treatment

The utterance-level measures were pre-treated as follows. Speaking rate was normalized as syllables per second. Non-realized syllables were treated as if present in the calculation. We did not analyze the logarithm of the duration because the numbers are harder to conceptualize and speaking rate is more straightforwardly comparable between experiments. The mapping between log-duration and syllables per second is approximately linear. Pitch range was calculated by subtracting the minimum pitch of the syllable with the lowest pitch in the utterance from the maximum pitch of the syllable with the highest pitch, with all pitch values on the semitone scale relative to 1 Hz. Mean intensity is the mean intensity across the utterance. Intensity range was calculated by subtracting the mean intensity of the syllable with the lowest intensity in the utterance from the mean intensity of the syllable with the highest intensity, with all intensity values on the decibel scale with an arbitrary reference point. Non-realized syllables were excluded from the computation of pitch range and intensity range, since the pitch and intensity values of those syllables were likely to be spurious.
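The computations described in this paragraph can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the extraction procedure used in the study: the dictionary keys (`f0_min`, `f0_max`, `mean_db`, `realized`) and the example values are hypothetical.

```python
import math


def hz_to_semitones(f_hz, ref_hz=1.0):
    """Convert a frequency in Hz to semitones relative to ref_hz."""
    return 12.0 * math.log2(f_hz / ref_hz)


def speaking_rate(syllables, duration_s):
    """Syllables per second; non-realized syllables count as present."""
    return len(syllables) / duration_s


def pitch_range_st(syllables):
    """Highest syllable maximum minus lowest syllable minimum, in semitones."""
    realized = [s for s in syllables if s["realized"]]
    highest = max(hz_to_semitones(s["f0_max"]) for s in realized)
    lowest = min(hz_to_semitones(s["f0_min"]) for s in realized)
    return highest - lowest


def intensity_range_db(syllables):
    """Largest minus smallest per-syllable mean intensity, in dB."""
    means = [s["mean_db"] for s in syllables if s["realized"]]
    return max(means) - min(means)


# Hypothetical three-syllable utterance; the last syllable is non-realized
# and is therefore ignored for pitch range and intensity range.
utterance = [
    {"f0_min": 180.0, "f0_max": 220.0, "mean_db": 68.0, "realized": True},
    {"f0_min": 110.0, "f0_max": 150.0, "mean_db": 61.0, "realized": True},
    {"f0_min": 300.0, "f0_max": 400.0, "mean_db": 40.0, "realized": False},
]

print(speaking_rate(utterance, 0.5))        # 6.0
print(round(pitch_range_st(utterance), 2))  # 12.0 (220 Hz vs. 110 Hz = one octave)
print(intensity_range_db(utterance))        # 7.0
```

Because the pitch range is a difference of two semitone values, the 1 Hz reference cancels out; it matters only for the absolute pitch values.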

Results

Table 2 and Figure 1 present the results of the utterance-level measures for all three experiments. Speaking rate showed a main effect of SPEECH ACT in all experiments. Speakers realized fewer syllables per second in exclamatives than questions, i.e., exclamatives had a slower speaking rate and were longer. Across experiments, effect sizes increased as the number of syllables decreased. The target sentences of Experiment 1 consisted of nine syllables and showed the smallest, though still highly significant effect of SPEECH ACT (b = 0.22, SE = 0.06, t = 3.9, p < 0.001); the target sentences of Experiment 2 consisted of six syllables and showed a larger effect (b = 0.22, SE = 0.05, t = 4.6, p < 0.001); the target sentences of Experiment 3 consisted of four syllables and showed the largest effect (b = 0.29, SE = 0.04, t = 8.5, p < 0.001).

Table 2. Utterance-level measures.

Figure 1. Utterance-level measures in the three experiments by condition.

Intensity range showed a main effect of SPEECH ACT in Experiment 1 (b = −0.65, SE = 0.25, t = −2.5, p < 0.05) and in Experiment 2 (b = −0.68, SE = 0.18, t = −3.6, p < 0.01). Intensity range was higher in exclamatives than in questions. In Experiment 3, the direction of the difference was the same but it was not significant. SEX was not significant in Experiment 1, interacted with SPEECH ACT in Experiment 2 (b = 0.57, SE = 0.18, t = 3.1, p < 0.01), and had a significant main effect in Experiment 3 (b = 0.7, SE = 0.25, t = 2.8, p < 0.05). The interaction in Experiment 2 results from the effect of SPEECH ACT manifesting itself only for male speakers. In Experiment 3, female speakers overall showed a greater intensity range than male speakers. Across experiments, intensity range increased as the number of syllables increased. Experiment 1 had the largest absolute values of intensity range (global mean across conditions: 11.4 dB), followed by Experiment 2 (8 dB), followed by Experiment 3 (6 dB). Mean intensity showed no effects of SPEECH ACT. In Experiment 2, there was a main effect of SEX. Women spoke louder than men (b = 1.17, SE = 0.3, t = 3.4, p < 0.01). This was also the case in Experiments 1 and 3, but not at a significant level.

Pitch range showed a main effect of SPEECH ACT in all experiments. It was higher in questions than in exclamatives (Experiment 1: b = 1.7, SE = 0.18, t = 9.3, p < 0.001; Experiment 2: b = 2.34, SE = 0.2, t = 11.6, p < 0.001; Experiment 3: b = 1.81, SE = 0.25, t = 7.4, p < 0.001). In Experiment 1, there was also a main effect of SEX. Men had a higher pitch range than women (b = −1.0, SE = 0.42, t = −2.4, p < 0.05). The direction of the difference was the same in Experiments 2 and 3, but it was not significant. Across experiments, pitch range was constant (global means: 10.9/10.6/10.9 semitones).

Discussion

The analysis of the utterance-level acoustic measures pertains to research questions (viii–x) about utterance-level differences between polar exclamatives and questions, as well as to the additional questions about the role of speaker sex. With respect to research question (viii)—whether the speaking rate in exclamatives is lower than in questions—the results suggest that this is reliably the case. Concerning question (ix)—whether the intensity of exclamatives is greater than that of questions—we found that exclamatives tend to be uttered with a greater intensity range than questions, but not robustly so. The shorter the target utterances were, the less reliable the effect was and the smaller the effect size was. Mean intensity was not influenced by speech act. Thus, question (ix) must be answered in the negative. Question (x)—whether the pitch range is greater in exclamatives than in questions—must also be answered in the negative. Pitch range was reliably smaller in exclamatives than in questions. In addition, our analysis showed that speaking rate was proportionately faster in longer utterances than in shorter utterances. As for speaker sex, we found spurious effects except for pitch range, which tended to be larger for male speakers, but this effect was not robust.

In what follows, we will discuss these findings in detail. With regard to speaking rate we hypothesized that polar exclamatives would display a slower speaking rate than polar questions because wh-exclamatives had been found to be longer than wh-questions (Repp, 2019), and declarative exclamations had been found to be longer than assertions (Altmann, 1993). Thus, overall it seems that exclamations are spoken more slowly than other speech acts. We can only speculate here about the reasons for this. A slower speaking rate may make an utterance more prominent in comparison to other speech acts and thus attract the listener's attention to the nature of the speech act. Speakers uttering an exclamation are expressing their psychological state, viz. some kind of emotional arousal (surprise, indignation), and reducing the speaking rate may draw attention to the speaker's personal involvement.

As for the observation that the speaking rate was faster in longer utterances than in shorter utterances in our experiments, this finding is consistent with previous findings that speaking rate increases in longer utterances, a phenomenon that has been called anticipatory shortening (see Fletcher, 2010, for an overview). Fletcher (2010) reports that in previous studies mean articulation rates for German read speech ranged from 5.3 syllables per second (Trouvain, 2004) to 6.04 (Künzel, 1997). The longer utterances of Experiment 1 partly show a faster speaking rate than this, whereas the very short utterances of Experiment 3 were clearly slower.

Turning to utterance-level intensity range and mean intensity, our findings—in conjunction with the previous findings on exclamative and interrogative wh-structures (Repp, 2019)—suggest that mean intensity is not a feature reliably distinguishing exclamatives from questions, although intensity range may be relevant for longer utterances. However, overall it seems that the greater intensity associated with greater emotional arousal (Banse and Scherer, 1996) is not associated with the speech act exclamation. As for the finding that the intensity range is greater in longer verb-first structures than in shorter verb-first structures, this is consistent with findings in Duběda (2006), who presents evidence that intensity drops globally across an utterance, and the slope of this drop increases with utterance length.

Regarding pitch range, we assume that the higher pitch range in the polar questions is due to the final rise in these structures. The data suggest that the potential exclamative accent(s) in the polar exclamatives (see below) did not reach the same pitch excursion as the final rise. This assumption is compatible with the previous findings on exclamative and interrogative wh-structures. These mostly ended in a final fall and there was no utterance-level difference in pitch range between wh-exclamatives and wh-questions. Obviously, this latter result is a null result, so we are taking it as only weak evidence for our interpretation. Still, at the moment there is no reason to assume that the greater global pitch range that has been associated with emotional arousal (Bänziger and Scherer, 2005) manifests itself as a distinguishing characteristic for exclamations vs. questions. Finally, the observation that pitch range is insensitive to the number of syllables in verb-first structures ties in well with observations for assertions, where the slope of pitch declination depends on utterance length: shorter utterances show a steeper slope (cp. e.g., Maeda, 1976; Fuchs et al., 2015).

The effects of speaker sex were not consistent across the experiments. Thus, although there are occasional differences between male and female speakers, these are not reliable and are probably due to the sample that was picked for the individual experiments. It is of course possible that the sample size was simply not large enough to detect potential reliable differences but considering that the earlier experiments on wh-exclamatives found no global differences relating to speaker sex either, we will assume that speaker sex does not impact utterance-level acoustic differences between exclamations and questions.

Results for the Overall Contour and the Distribution of Accents/Accent Types

The choice of final contour was clear-cut in all experiments. With very few exceptions, exclamatives ended in a fall (low boundary tone) and questions ended in a rise (high boundary tone). In Experiment 1, there were 3.2% falling questions and 6.4% rising exclamatives; in Experiment 2, there were 1% falling questions and 2% rising exclamatives; in Experiment 3, there were no falling questions and one rising exclamative (~0.5%).

The accent distribution also differed clearly in the two speech acts. Table 3 shows the mean number of accents across conditions in all experiments. Figure 2 breaks down the total number of accents per syllable across the utterance per condition, and Table 4 specifies the proportion of accentuations per syllable, collapsed over SEX. Tables 5–7 specify the total number of individual accent types per syllable, also collapsed over SEX.

Table 3. Mean number of accented syllables per utterance.

Figure 2. Accent distribution in Experiments 1–3. The syllable labels refer to the words that the accented syllables occur in. The order is left-to-right according to the temporal sequence.

Table 4. Percentage of accentuation per syllable per speech act in Experiments 1-3.

Table 5. Number of accent types per speech act in Experiment 1.

Table 6. Number of accent types per speech act in Experiment 2.

Table 7. Number of accent types per speech act in Experiment 3.

Descriptively, exclamatives contained more accented syllables than questions did. To detail this observation, we fitted cumulative link mixed models using R package ordinal (Christensen, 2019) with the number of accented syllables as the dependent variable². There was a main effect of SPEECH ACT in all experiments (Experiment 1: b = −1.1, SE = 0.24, z = −4.5, p < 0.001; Experiment 2: b = −0.63, SE = 0.18, z = −3.4, p < 0.001; Experiment 3: b = −1.6, SE = 0.16, z = −9.7, p < 0.001). Speakers produced fewer accents in questions than in exclamatives. In Experiment 3, there also was a main effect of SEX. Women produced more accented syllables than men did (b = 0.9, SE = 0.2, z = 4.2, p < 0.001).

To analyze the frequency of occurrence of accents on individual syllables, we fitted generalized linear mixed models with a binomial logit link using R package lme4 (Bates et al., 2015) for each syllable. In cases of complete separation, we report only the percentages of accentuation, since the maximum likelihood estimate for these models does not exist. The results for the distribution of accent types are given only descriptively.
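Why a binomial model has no maximum likelihood estimate in such cases can be seen from the raw proportions: if a condition contains only accented (or only unaccented) tokens, the corresponding log-odds are infinite. The following Python sketch flags such cells. It is a simplified check on the level of the fixed-effect conditions only and is not the procedure used in the study; in mixed models, separation can also arise at the level of individual participants. The function names and the example data are hypothetical.

```python
from collections import defaultdict


def accent_rates(observations):
    """Proportion of accented tokens per condition.

    observations: iterable of (condition, accented) pairs, accented as bool.
    """
    counts = defaultdict(lambda: [0, 0])  # condition -> [accented, total]
    for condition, accented in observations:
        counts[condition][0] += int(accented)
        counts[condition][1] += 1
    return {c: acc / total for c, (acc, total) in counts.items()}


def has_degenerate_cell(observations):
    """True if some condition contains only one outcome (rate of 0 or 1)."""
    return any(r in (0.0, 1.0) for r in accent_rates(observations).values())


# Hypothetical data: the syllable is always accented in exclamatives.
data = [
    ("exclamative", True), ("exclamative", True), ("exclamative", True),
    ("question", True), ("question", False), ("question", False),
]
print(has_degenerate_cell(data))  # True
```

In such a case, reporting the percentages of accentuation, as is done here, is the more informative option.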

For the accentuation of the clause-initial finite verb, the analysis revealed that in Experiments 1 and 2 there was a main effect of SPEECH ACT. The finite verb was accented more often in exclamatives than in questions. In Experiment 1, the finite verb was the auxiliary (b = −2.7, SE = 0.4, z = −6.7, p < 0.001; no random slope for SPEECH ACT in the model), in Experiment 2 it was the lexical verb (b = −1.2, SE = 0.2, z = −5.8, p < 0.001). In Experiment 3, the difference was in the same direction but was not significant. In Experiment 3, there was a main effect of SEX. Women produced accents more often than men did (b = 1.3, SE = 0.36, z = 3.6, p < 0.001).

The accent type that occurred most frequently on the clause-initial finite verb was H^* in both speech acts. In exclamatives, H^* amounted to 57% (Experiment 1), 80% (Experiment 2), and 64% (Experiment 3) of the accents produced on the verb. The second most frequent accent, L+H^* made up 41, 18, and 34% of the accents, respectively. In questions, H^* made up 40, 48, and 44%. The distribution of the other accent types was unspecific.

The accentuation of subjects could not be analyzed with GLMMs. The numbers in Table 4 indicate that subjects are accented more often in exclamatives than in questions. In fact, the number of accents on subjects in questions is negligible. Furthermore, the descriptive data suggest that the presence of an accent on a subject correlates with the subject's form. The full subjects in Experiments 1 and 2 are accented rarely (12%, 16%), whereas the pronominal subjects in Experiment 3 are accented frequently (57%). The accent type that occurred most frequently on the subject in the exclamatives was H^* in Experiments 2 and 3 (83%, 70%). In Experiment 1, L+H^* was slightly more frequent than H^* (53 vs. 47%). L+H^* was also relatively frequent in Experiment 3 (30%).

Turning to the accentuation of adjectives, the results show that the adjective was accented frequently in both speech acts in all experiments. There was a main effect of SPEECH ACT in Experiments 1 and 2. The adjective was accented more often in questions than in exclamatives (Experiment 1: b = 0.7, SE = 0.2, z = 3.6, p < 0.001; Experiment 2: b = 1.4, SE = 0.2, z = 6.1, p < 0.001, no random slope for factor SPEECH ACT in Experiment 2). There was also a significant interaction in Experiment 1 (b = 0.4, SE = 0.2, z = 2.1, p < 0.05), such that women produced more accented syllables in questions than in exclamatives, whereas for men there was no difference. For Experiment 3, binomial models could not be fitted because of complete separation. Many participants placed an accent on every adjective they produced. The accent types that occurred most frequently—or almost exclusively—across experiments were H^* in exclamatives and L^* in questions³.

For the clause-final non-finite lexical verb in Experiment 1, there was a main effect of SPEECH ACT. The verb was accented more often in exclamatives than in questions (b = −0.4, SE = 0.16, z = −2.5, p < 0.05). The type of accent that occurred most frequently was H^* in exclamatives (78%) and L^* in questions (90%).

The accent (type) distribution was reflected in different contour types that the two speech acts came with. As already mentioned, exclamatives almost always ended in a fall and questions almost always ended in a rise. The accent patterns that speakers chose depended to some extent on the lexical make-up of the utterance, i.e., the preferences for particular patterns differed in the three experiments. Figure 3 gives an overview of the most frequent contour types per condition in each experiment.

Figure 3. Percentage of contours within condition per experiment for contours that occurred in at least 5% of the utterances per condition (for exclamatives only falling, for questions only rising contours). Bars representing contours with one pitch accent have only one color: solid for monotonal accents, dotted for bitonal accents. The color coding is for the word categories where the accent(s) occur (also cp. Figure 2). Bars representing contours with two pitch accents have two colors: vertical stripes for combinations of two monotonal accents, horizontal stripes for combinations of a bitonal with a monotonal accent. If the nuclear accent is L^*, there is a black frame around the bar.

In Experiment 1 (Aux_FIN.Subj_FULL.Adj.VLex), there were 41 patterns in exclamatives and 22 in questions. Overall, speakers most frequently produced contours with a single accent. 47% of the exclamatives and 84% of the questions contained only one accent. In exclamatives, the most frequent contour contained a single H^* accent on the adjective (24% of exclamatives), see Figure 4 for an example. The second-most frequent contour contained a H^* accent both on the clause-initial auxiliary and on the adjective (9%); the third contained a single H^* on the auxiliary (7%). In questions, speakers most often produced a single L^* on the adjective (72% of questions), see Figure 4. The second-most frequent pattern was a single L^* on the clause-final lexical verb (10%).

FIGURE 4

Figure 4. Sample contour for an exclamative with a single H^* accent on the adjective (left) and for a question with a single L^* accent on the adjective (right) in Experiment 1.

In Experiment 2 (VLex_FIN.Subj_FULL.Adj), there were 20 accent patterns in exclamatives and 8 in questions. Again, single-accent contours were most frequent (49% exclamatives, 69% questions). The two most frequent patterns in exclamatives were a single H^* on the finite verb (29%) and H^* both on the clause-initial finite verb and on the adjective (29%; Figure 5). A single H^* on the adjective occurred in 11% of the exclamatives. The most frequent pattern in questions was again a single L^* on the adjective (64%; Figure 5), followed by H^* on the finite verb and L^* on the adjective (17%). The third-most frequent pattern was L+H^* on the finite verb and L^* on the adjective (9%).

FIGURE 5

Figure 5. Sample contour for an exclamative with an H^* accent on the finite lexical verb and on the adjective (left) and for a question with a single L^* accent on the adjective (right) in Experiment 2.

In Experiment 3 (VLex_FIN.Subj_D-PRON.Adj), there were 15 accent patterns in exclamatives and 7 in questions. In this experiment, exclamatives were predominantly realized with two accents (67%) and there even were a considerable number of exclamatives with three accents (12%). Questions, in contrast, again mostly contained only one accent (76%). The two most frequent patterns in exclamatives contained two H^* accents, namely either on the subject d-pronoun and on the adjective (27%, Figure 6), or on the clause-initial finite verb and on the adjective (20%). The third-most frequent pattern was a single H^* on the adjective (10%). In questions, the most frequent pattern again was a single L^* on the adjective (71%, Figure 6). The second-most frequent pattern was one with two accents: H^* on the clause-initial finite verb and L^* on the adjective (13.4%).

FIGURE 6

Figure 6. Sample contour for an exclamative with an H^* accent on the subject d-pronoun and on the adjective (left) and for a question with a single L^* accent on the adjective (right) in Experiment 3.

There was considerable inter-individual variation in the choice of accent placement and contour type. As for the accent placement on syllables that appear to be most strongly specific to exclamatives, investigation of individual choices revealed the following. The clause-initial auxiliary of Experiment 1 was accented by eight participants in at least 70% of the exclamatives (three participants accented it in all exclamatives), while 12 participants accented the auxiliary in maximally 30% of the exclamatives (eight participants never accented it). Thus, there was a sort of bimodal distribution. The subject d-pronoun of Experiment 3 was accented by five participants in all exclamatives, while four participants never accented it, and the remaining 11 participants accented the d-pronoun in any number of exclamatives between these two extremes.
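The per-participant breakdown described above can be illustrated with a short sketch. The 70% and 30% cut-offs come from the text; the function names, participant identifiers, and trial counts are hypothetical and serve only to show the grouping logic.

```python
def accent_rates(trials):
    """Map each participant to the proportion of trials with an accent.

    `trials` is an iterable of (participant, accented) pairs, where
    `accented` is True if the target word carried a pitch accent.
    """
    totals, hits = {}, {}
    for participant, accented in trials:
        totals[participant] = totals.get(participant, 0) + 1
        hits[participant] = hits.get(participant, 0) + int(accented)
    return {p: hits[p] / totals[p] for p in totals}


def split_by_rate(rates, low=0.30, high=0.70):
    """Group participants into low, intermediate, and high accenters."""
    groups = {"low": [], "mid": [], "high": []}
    for participant, rate in sorted(rates.items()):
        if rate <= low:
            groups["low"].append(participant)
        elif rate >= high:
            groups["high"].append(participant)
        else:
            groups["mid"].append(participant)
    return groups


# Hypothetical data: accented trials out of 10 exclamatives per participant.
accented_counts = {"P1": 10, "P2": 0, "P3": 5, "P4": 8, "P5": 2}
trials = [
    (participant, i < k)
    for participant, k in accented_counts.items()
    for i in range(10)
]
groups = split_by_rate(accent_rates(trials))
```

A distribution counts as roughly bimodal in this sense when most participants land in the "low" or "high" group and few in "mid", which is the situation reported for the auxiliary in Experiment 1.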

The inter-individual variation in the choice of accent placement (pooled over accent types/contours, ignoring rises/falls) is illustrated in Figure 7. There were significant correlations in participants' choices for accent patterns across exclamatives and questions in Experiments 1 and 2. In both cases this concerned the most frequent accent pattern in exclamatives, which in Experiment 1 was the pattern with a single accent on the adjective (Kendall's τ = 0.38, p < 0.05), and in Experiment 2, the pattern with an accent on the finite lexical verb and the adjective (Kendall's τ = 0.46, p < 0.01). The more a participant used the most frequent pattern in exclamatives the more s/he used it in questions. Within exclamatives and within questions there were negative correlations between the two most-frequent patterns (Kendall's τ between −0.41 and −0.81). The more often a participant produced one of the patterns the less s/he produced the other. Figure 7 also shows that some participants had clear preferences especially in Experiments 1 and 3, whereas others alternated more readily between the patterns.
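For readers unfamiliar with the statistic, the following sketch computes Kendall's τ over per-participant proportions. It implements the tie-corrected variant (τ-b) in plain Python; whether the reported values are τ-a or τ-b is not stated in the text, so the variant is an assumption, and the proportions below are hypothetical.

```python
from itertools import combinations
from math import sqrt


def kendall_tau_b(x, y):
    """Kendall's tau-b for two equally long sequences of numbers."""
    concordant = discordant = ties_x = ties_y = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        dx, dy = x1 - x2, y1 - y2
        if dx == 0 and dy == 0:
            continue  # tied on both variables: contributes to neither term
        if dx == 0:
            ties_x += 1  # tied on x only
        elif dy == 0:
            ties_y += 1  # tied on y only
        elif dx * dy > 0:
            concordant += 1
        else:
            discordant += 1
    denom = sqrt(
        (concordant + discordant + ties_x)
        * (concordant + discordant + ties_y)
    )
    return (concordant - discordant) / denom if denom else float("nan")


# Hypothetical proportions of one accent pattern per participant,
# in exclamatives and in questions (not the study's data).
excl = [0.1, 0.3, 0.5, 0.7, 0.9]
ques = [0.2, 0.6, 0.4, 1.0, 0.8]
tau = kendall_tau_b(excl, ques)
# Of the 10 participant pairs, 8 are concordant and 2 discordant.
```

Because τ depends only on the ordering of participants, it is suitable for proportions that are bounded and often pile up at 0 or 1, as is the case for accentuation rates.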

FIGURE 7

Figure 7. Inter-individual variation for accent patterns (pooled over accent types and rises/falls) in the three experiments. For Experiments 1 and 2, only the three most frequent accent patterns per condition are shown. The patterns are sorted (left to right) from most frequent to least frequent in the exclamatives. The colors on the x-axes correspond to those of Figure 3; deviations from Figure 3 in frequency (orderings) are due to pooling of the data here over accent types. The participants are sorted (top to bottom) from most frequent to least frequent use of the accent pattern that was most frequent in exclamatives and also present in questions (marked by a dotted outline on the x-axes). The darker a tile is, the more often a participant used the respective contour.

Results for the Syllable-Level Acoustic Measures

As mentioned, syllable-level acoustic measures were analyzed for accented syllables with enough data points in each condition for fitting mixed linear models. Duration (log) and mean intensity were investigated across accent types. Mean intensity was normalized so that the loudest segment of every utterance had the same amplitude. Figures 8 and 9 show the results for duration and intensity for those accented syllables in each experiment for which effects were observed. The pitch measures (semitones re 1 Hz) were investigated only for accented syllables with similar accents. Specifically, H^* and L+H^* were pooled if this produced enough data points for the analysis. This was the case for the clause-initial finite lexical verb in Experiments 2 and 3. We do not report main effects of SEX on minimum and maximum pitch because this is not informative. Temporal alignment of the pitch peak within a syllable was measured as the proportion t_pitch.max/total syllable duration. However, there were no effects of alignment. As before, we only report model parameters if there were models that could be fitted, and we only report significant results.
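The syllable-level measures can be made concrete with a small sketch. The conversion to semitones relative to 1 Hz is the standard formula 12 · log2(f / 1 Hz). The remaining choices are assumptions for illustration only: pitch excursion is taken as maximum minus minimum pitch in semitones, the time of the pitch peak is measured from syllable onset, the logarithm of duration is the natural logarithm, and all function names and values are hypothetical.

```python
import math


def hz_to_semitones(f_hz, ref_hz=1.0):
    """Convert a frequency in Hz to semitones relative to ref_hz."""
    return 12.0 * math.log2(f_hz / ref_hz)


def pitch_excursion_st(f_min_hz, f_max_hz):
    """Pitch excursion in semitones (assumed here: maximum minus minimum)."""
    return hz_to_semitones(f_max_hz) - hz_to_semitones(f_min_hz)


def peak_alignment(t_peak, t_onset, t_offset):
    """Position of the pitch peak as a proportion of syllable duration.

    Assumes the peak time is measured from syllable onset, so that
    0 means the peak is at the onset and 1 at the offset.
    """
    return (t_peak - t_onset) / (t_offset - t_onset)


def log_duration(duration_ms):
    """Log-transformed duration (natural log assumed)."""
    return math.log(duration_ms)


# Hypothetical accented syllable: 180 ms long, F0 from 190 Hz to 260 Hz,
# pitch peak 120 ms after syllable onset.
excursion = pitch_excursion_st(190.0, 260.0)
alignment = peak_alignment(t_peak=0.120, t_onset=0.0, t_offset=0.180)
log_dur = log_duration(180.0)
```

Working in semitones rather than Hz makes excursions comparable across speakers with different pitch ranges, since a semitone difference corresponds to a fixed frequency ratio irrespective of the reference value.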

FIGURE 8

Figure 8. Duration of selected accented syllables in ms. Clause-initial finite verbs were an auxiliary in Experiment 1 and a lexical verb in Experiments 2 and 3.

FIGURE 9

Figure 9. Intensity of selected accented syllables in dB. Clause-initial finite verbs were an auxiliary in Experiment 1 and a lexical verb in Experiments 2 and 3.

For accented realizations of the penultimate syllable of the clause-initial finite verb, models could be fitted for Experiments 2 and 3. In Experiment 2, duration showed a significant interaction of SPEECH ACT and SEX (b = 0.05, SE = 0.02, t = 2.3, p < 0.05). The interaction indicates that male speakers produced a longer syllable in exclamatives than in questions and women did the opposite, but the single comparisons were not significant. In Experiment 3, there was a main effect of SPEECH ACT on the duration (b = −0.06, SE = 0.02, t = −2.3, p < 0.05). The syllable was longer in exclamatives than in questions. There also was a main effect of SPEECH ACT on the pitch excursion in Experiment 3. Syllables with H^* or L+H^* accents had a larger pitch excursion in exclamatives than in questions (b = −0.6, SE = 0.3, t = −2.1, p < 0.05; no random slope for SPEECH ACT). This difference remained significant even if the accent type was added to the model as a predictor.

For accented realizations of the penultimate syllable of the full subject in Experiment 1, the analysis revealed a main effect of SPEECH ACT on the duration (b = −0.13, SE = 0.04, t = −3.5, p < 0.01, no random slopes in the model). The syllable was longer in exclamatives than in questions. In the other experiments, there were not enough data points in questions.

For accented adjectives, analysis revealed a main effect of SPEECH ACT on the duration in Experiments 1 and 3, and a non-significant effect in the same direction in Experiment 2 (Experiment 1: b = −0.06, SE = 0.02, t = −3.5, p < 0.01; Experiment 2: b = −0.03, SE = 0.01, t = −1.9, p = 0.07; Experiment 3: b = −0.03, SE = 0.007, t = −3.9, p < 0.01). Accented adjectives were longer in exclamatives than in questions. There also was a main effect of SPEECH ACT on the mean intensity in Experiment 1. Accented adjectives were louder in exclamatives than in questions (b = −0.8, SE = 0.3, t = −2.5, p < 0.05). In Experiment 2, the effect was reversed. Accented adjectives were quieter in exclamatives than in questions (b = 0.7, SE = 0.26, t = 2.8, p < 0.05).

For accented realizations of the penultimate syllable of the clause-final non-finite verb in Experiment 1, the analysis revealed a main effect of SPEECH ACT on the duration (b = −0.06, SE = 0.02, t = −3, p < 0.05). The syllable was longer in exclamatives than in questions. There also was a main effect of SPEECH ACT on the intensity of that syllable. It was louder in exclamatives than in questions (b = −0.9, SE = 0.3, t = −3.1, p < 0.01; no random slope for SPEECH ACT).

In sum, the analysis of the acoustic measures of the accented syllables shows a fairly persistent effect of SPEECH ACT on duration, which is longer in exclamatives. Furthermore, the pitch excursion on the accented syllable of the finite lexical verb was larger in exclamatives in one of the experiments.

Discussion

The results for the contour, the accent (type) distribution and the syllable-level acoustic measures pertain to research questions (i–v), which explore speech-act specific characteristics of exclamatives, as well as to research questions (vi–vii), which concern the direct comparison between exclamatives and questions. We start our discussion with research question (vii) about falling vs. rising contours, which for the present study is straightforward to answer.

Falling and Rising Contours in Exclamatives and Questions

The polar exclamatives and the polar questions in the present study were distinguished reliably by the final contour. In the exclamatives, the final contour was almost always falling. The nuclear accent usually was a H^* accent. It was followed by a low boundary tone. In the questions, the final contour was almost always rising. The nuclear accent usually was a L^* accent, which was followed by a high boundary tone. In the introduction we suggested that this final contour is functionally equivalent to a combination of a rising pitch accent with a high boundary tone, which was described by Kügler (2003) as one of two typical contours of polar questions in German. The other one is a rising pitch accent (L^*+H) followed by a low boundary tone (also cf. Kohler, 2004). As for the pragmatic difference that has been associated with the different boundary tones, recall that high boundary tones have been suggested to signal the speaker's friendly/interested attitude toward the addressee or their lack of expectations, whereas low boundary tones signal clear answer expectations (also cf. Batliner, 1988b; Kügler, 2003; Kohler, 2004; Petrone and Niebuhr, 2014). In Section Materials and Methods we mentioned that our items were not perfectly balanced for answer expectations but a small majority of items (six out of ten) had a fairly clear answer expectation, whereas the other items did not. This difference did not play a role for the question prosody produced by the participants in the study: we found virtually no low boundary tones in questions. We suggest that the speech mode was decisive for the choice of high boundary tones: the speech mode was non-spontaneous, which promotes high boundary tones (Michalsky, 2017). In addition, considering that the participants did not address a real person, speaker attitude toward an addressee was unlikely to play a significant role.

Turning next to the speech-act-specific characteristics of exclamatives, we first consider the research questions addressing the potential of certain syntactic positions, lexical items and semantic features to attract prosodic prominence.

Prominence Attraction by Individual Elements

Question (i) asked if in exclamatives the C-position is an attractor for prosodic prominence independently of verb type. This question was fuelled by two observations in the literature. In polar exclamatives with a clause-initial finite lexical verb, that verb has been argued to obligatorily carry an accent (Batliner, 1988c; Rosengren, 1992). In wh-exclamatives, the finite auxiliary in the C-position has been found to often carry a prominent accent (Repp, 2019). The current results suggest that we can answer research question (i) in the affirmative: the C-position attracts prosodic prominence irrespective of verb type. Both auxiliaries and lexical verbs in C are accented often, and they are accented more often in polar exclamatives than in polar questions. We note that the effect was reliable in two experiments and went in the same direction in the third. Accented finite verbs also were longer in exclamatives than in questions in one of the experiments but this could be a consequence of the overall slower speaking rate of exclamatives that we argued is a hallmark of exclamations in our discussion of utterance-level characteristics. We note that the duration effect was present for virtually all accented syllables that we investigated.

The difference in accentuation of the clause-initial finite verb in exclamatives vs. questions was largest for auxiliaries. In polar questions, the clause-initial finite auxiliary was virtually never accented. This deviates from earlier findings for wh-questions, where auxiliaries often carried a prominent accent (Repp, 2019). We assume that the earlier findings were indeed a result of the information structure in the wh-questions: the information in the question was mostly given so that the auxiliary could easily carry the nuclear accent. This was not the case in the current study. Nevertheless, despite the clear differences between questions and exclamatives, our study also showed that there is considerable inter-individual variation in the accentuation of the finite auxiliary in exclamatives. About half of the participants accented the auxiliary rarely or never, whereas the other half accented it often or always. Similar differences have been observed for wh-exclamatives (Repp, 2019). In the current study, speakers who did not accent the auxiliary frequently in exclamatives often produced a contour with a single accent on the adjective, which is another potential attractor of prosodic prominence in the polar exclamatives (see below for discussion). Thus, it seems that speakers consider accenting the auxiliary or accenting the adjective as viable options when producing polar exclamatives. In sum, our findings about the finite verbs corroborate the assumption that the C-position is a prosodic attractor in exclamatives, and only in those.

A natural question to ask at this point is what makes C such a prime candidate for accentuation in exclamatives. This is a question that we cannot address in detail here because answering it requires a careful investigation of the syntax-semantics interface. We will discuss some potential explanations but the upshot of our discussion will be that we do not think that there is a syntactic-semantic or information-structural explanation available for the accentuation of C. The C-position in German is often thought to host the sentence operator, or alternatively the illocutionary operator. Note, however, that for exclamatives it is debated whether they are a sentence type different from questions and how exactly they are to be described at the syntax-semantics-pragmatics interface⁴. An accent on an element in C in any kind of sentence type has been argued to mark so-called VERUM-focus (Höhle, 1988, 1992). VERUM-focus originally was thought to be focus on the truth of the proposition expressed by an assertion (hence its name) or on the polarity of that proposition (e.g., Turco et al., 2014 for a prosodic study). More recently it has been proposed to be an illocutionary operator (e.g., Romero and Han, 2004; Repp, 2009), or focus on a sentence mood operator (e.g., Stommel, 2011; Lohnstein, 2012, 2016). None of these conceptions of VERUM seem to be straightforwardly applicable to exclamations because exclamations do not negotiate truth (and thus polarity), and because the illocutionary operator conceptions are not applicable to expressive speech acts⁵.

An anonymous reviewer suggests that the propositional content of exclamatives always is given (even though it need not be discourse-given), which might play a role for their particular prosodic characteristics. This idea seems to be generally compatible with VERUM focus playing a role in exclamatives because VERUM focus contexts are typically such that the entire information in the sentence with VERUM focus is given. For instance, the question DID he come to the party? would only be appropriate in a context where the issue whether or not he came to the party had been addressed before. Now, although exclamatives can be uttered out of the (linguistic) blue (e.g., What a beautiful day!), they are indeed often assumed to presuppose their propositional content, or to be factive (e.g., Michaelis and Lambrecht, 1996; d'Avis, 2002; Zanuttini and Portner, 2003; Abels, 2010; Brandner, 2010; Driemel, 2015, 2016). The function of the speech act exclamation is then to reveal the speaker's psychological state regarding that presupposed content. Importantly, presupposition in itself does not equal givenness in the discourse sense: for instance, definite noun phrases can introduce new entities into the discourse and accordingly be accented. Still, it might be the case that accenting the element in C and deaccenting the remainder of the utterance signals the “givenness” of the entire propositional content of the clause in a way that does not trigger any (unwanted) focus readings, which is where VERUM focus in other speech acts and VERUM focus in exclamatives would be parallel. It is interesting to note in this connection that Driemel (2015, 2016) observes for verb-second wh-exclamatives in German that a finite verb in C may be accented independently of verb type (auxiliary or lexical verb), whereas in verb-final wh-exclamatives the finite verb may only be accented if it is a lexical verb. An accented clause-final auxiliary at first sight seems to be ungrammatical. 
However, if the exclamative occurs in a context that licenses polarity focus, accenting a clause-final auxiliary is fine⁶. Thus, although polarity focus is one of the potential functions of VERUM focus (see above), it seems that accenting an element in the C-position in wh-exclamatives is not dependent on a VERUM focus discourse context, whereas accenting a finite verb in another position is.

It is worth pointing out that in Swedish, verb-first exclamatives also carry an accent on the verb in C, and other C-elements like complementizers may do, too (Delsing, 2010). English exclamatives, in contrast, seem to allow a pitch accent on C only if their propositional content is all given, as was pointed out to us by an anonymous reviewer for the translation of example (2) in the introduction (Boy/Wow, is she beautiful!). For German, we think that an explanation in terms of VERUM focus does not capture the facts. The reasons are that (a) presupposition does not equal discourse givenness, as laid out above, and presupposition is unlikely to have the same prosodic effects as givenness; and (b) in our experiments, the information in the exclamatives was not all-given: the information conveyed by the adjective was new, and the information conveyed by the lexical verb was accessible. Both of these elements were regularly accented, i.e., attracted prosodic prominence, which is an observation to which we will turn in a moment. Further below we will propose that exclamations come with a prosodic constructional default, and that accenting C is one way of realizing this default. The cross-linguistic differences between Swedish and German on the one hand, and English on the other, are a matter for future research.

Research question (ii) asked whether lexical verbs are attractors for prosodic prominence in exclamatives irrespective of finiteness. This question on the one hand had been fuelled by the claim that in short exclamatives like Säuft der Leo! (lit.: drinks the leo) the verb must be accented (Batliner, 1989a), which could be a consequence of the verb being in C. On the other hand, unergative verbs carry a default nuclear accent in assertions (Uhmann, 1991; Féry, 1993, 2011; Kratzer and Selkirk, 2007; Verhoeven and Kügler, 2015), and they might do the same in other speech acts. In the latter case, no difference is expected between questions and exclamatives. The results suggest that irrespective of position, the lexical verb attracts prosodic prominence in exclamatives more than it does in questions. In exclamatives, lexical verbs carried an accent more often even when in clause-final position, and across experiments finite lexical verbs in the C-position were accented more often than finite auxiliaries in that position were, which seems to speak for an additive effect of prominence attraction by the syntactic position and the verb type.⁷ The phonetic differences that we found for accented lexical verbs also point in the direction that lexical verbs are attractors for prosodic prominence in exclamatives but these differences are not fully informative because the longer duration of the accented syllable in lexical verbs in either position might be an utterance-level effect, and the higher mean intensity of the accented syllable in clause-final verbs might be spurious because intensity proved not to be a reliable difference between the two speech acts. The higher pitch excursion of the accented syllable of clause-initial verbs in short exclamatives—which was not found for auxiliaries—on its own is not yet informative. 
Still, on the basis of the accentuation findings, we suggest that lexical verbs are attractors for prosodic prominence in exclamatives more so than they are in questions. The reason for this might be that they are “part” of the exclamative relation, to which we turn in more detail when we discuss question (iii).

Although lexical verbs are accented less often in questions than in exclamatives, our findings show that they still are accented regularly (21–36%)—independent of position. This suggests that unergative verbs may carry an accent in questions (and exclamatives) for the same reasons that they do in assertions. Of course, the number of accentuations is not high, and in comparison to the adverbial adjectives, which we discuss next, lexical verbs are not accented frequently. The reason might be that the meaning of the verb was always accessible information-structurally, because of the context. An alternative explanation is that adverbials might carry the nuclear accent by default. We will come back to both issues.

Question (iii) asked whether gradable adjectives are an attractor for prosodic prominence if they are not the immediate predicate of the exclamative relation, i.e., if gradability is the decisive characteristic. Recall that Rosengren (1992) suggested that gradable predicates carry an accent in polar exclamatives like Ist der schön! (lit.: is he handsome) because they are the gradable predicate in the exclamative (degree) relation. In our study, we tested exclamatives where the gradable adjective was used adverbially as a modifier of the lexical verb, e.g., Fahnden die gut! (lit. investigate they well). We suggested that the combined meaning of adjective and verb should be considered the predicate of the exclamative relation in Rosengren's sense. The results show that the adjective was accented frequently in exclamatives but also in questions. As a matter of fact, in two of the three experiments, adjectives were accented more frequently in questions than in exclamatives. Although the accent type in the exclamatives (H^*) is more prominent than the accent type in questions (L^*) on the prominence scale that we discussed in the introduction (Kohler and Gartenberg, 1991; Niebuhr, 2009; Baumann, 2014; Baumann and Röhr, 2015; Baumann and Winter, 2018), we assume that this is not decisive because L^* was part of a nuclear contour which ended in a high boundary tone, and prominence in such a rising contour can arguably be created by initial low rather than high pitch (cf. Kügler and Genzel, 2012; Repp, 2019). The acoustic measures for the adjective do not yield additional information because we found the duration effects that may be an utterance-level effect, and intensity effects were inconsistent. Thus, overall, our results indicate that gradability is not enough to attract an accent in polar exclamatives.

It is of course an interesting question why the adjectives were accented more often in questions than in exclamatives. We propose that two aspects may be informative with respect to this issue. The first is the distribution of the accents across the clause, i.e., different contour types (ignoring falls/rises). Across experiments, questions most frequently were realized with a single L^* accent on the adjective. Exclamatives came with a very wide range of contours with or without an accent on the adjective. Many of these were not, or hardly ever, produced for questions. To be sure, as already mentioned, adjectives overall were accented very frequently, which can probably be explained as a default accentuation for adverbial modifiers (Selkirk, 1984; cf. Gussenhoven, 1984). Yet exclamatives more readily seem to forego this default because of the underlying speech-act specific requirement or option to produce prosodic prominence (also) on other elements in the clause. The other aspect that may be relevant for the frequent accentuation of the adjective in the questions is, as we indicated above, information structure. The meaning of the verbal predicate was accessible because the context already prepared for that meaning, and the subject referent always was given. As a consequence, the adjectival modifier may be considered the only piece of truly new information. Therefore, it was a good candidate to carry the nuclear accent. In exclamatives, information-structural factors arguably have less impact on prosodic prominence relations (see below). Future research must show which of these two aspects carries the burden of the explanation or if a different explanation must be sought.

Question (iv) asked if subjects are attractors for prosodic prominence in polar exclamatives independently of their form as d-pronoun. The question was fuelled on the one hand by Rosengren's (1992) suggestion that there should be an accent on the argument of the exclamative relation, which in our case is the subject (also Batliner, 1988c). On the other hand, it had been observed that in wh-exclamatives subject d-pronouns are an attractor for prosodic prominence (Repp, 2019). We found that full subjects do not often attract an accent in polar exclamatives. This indicates that there is no semantic or structural requirement for their accentuation. The observation that they nevertheless seem to be accented more often in exclamatives than in questions, and have a longer duration, will be addressed further below, where we discuss subjects in questions. Concerning d-pronouns, we found that these regularly attract an accent in polar exclamatives, and that only 20% of the speakers never accented them. We think that it is not possible to find an information-structural motivation for the high prosodic prominence of the subject d-pronoun. The referents of the d-pronoun were always given, and although there were other referents in the discourse context, it was clear that there was no contrast between the referents of the d-pronoun and the other referents: the referents of the d-pronoun were just the topic of the discourse passage in which the exclamative occurred. As already mentioned, we will propose further below that exclamations come with a prosodic constructional default. We will argue that accenting the subject d-pronoun is another way of realizing this default.

Turning next to the subject in polar questions, we expected a prosodic prominence marking that reflects the contextual requirements in terms of information structure. Recall that all subjects were given information so they should not carry an accent. This is what we found indeed for the questions. However, as we just saw, in exclamatives participants chose not to mark the givenness of the subject to the same extent. Thus, in exclamatives low prosodic prominence is given up in favor of speech-act-specific high prominence. Still, considering that the proportion of accented full subjects in exclamatives is fairly low (between 12% and 20%), full subjects seem to have a different prosodic status from d-pronouns. Whereas d-pronouns seem to truly attract prosodic prominence, the prosodic characteristics of full subjects fit into a generally reduced sensitivity to givenness (as well as into the overall slower speaking rate) in exclamatives.

Information-Structural Sensitivity

Summarizing the discussion of the previous subsection with respect to question (vi), which asked if polar exclamatives display less sensitivity than polar questions to information-structural demands imposed by the context, our answer is affirmative. As we just stated, full subjects display less givenness marking in exclamatives than in questions. Furthermore, we discussed that the accessible lexical verbs also were accented more frequently in exclamations than in questions. Finally, there is the frequent accentuation of the d-pronoun, which by definition denotes given information. In sum, this evidence suggests a reduced sensitivity to givenness in exclamatives. Overall, our findings tie in well with the findings for wh-exclamatives in Repp (2015, 2019), who also found a lack of deaccentuation for given information. They also tie in well with earlier claims in the non-experimental literature on exclamatives in general (Jacobs, 1988; Oppenrieder, 1988). It seems then that prosodic givenness marking is regularly overridden by speech-act-related prominence in exclamations.

It is important to highlight that our study did not manipulate information structure as an experimental factor. So our claims are largely based on somewhat indirect evidence. Still, considering that the polar questions in our experiment were largely produced in a way that reflects the context-induced information structure of the utterance in the expected way, we can be quite sure that exclamatives are “special,” at least when it comes to givenness marking. Furthermore, our findings match those reported in Seeliger and Repp (2020), who directly tested information structure in polar exclamatives with transitive verbs: in that study given objects were not deaccented. They were “only” marked less frequently than new objects with L+H^* accents (vs. H^*). Seeliger and Repp also report a finding that was somewhat surprising from the current perspective. We mentioned in the introductory sections that Seeliger and Repp also tested prosodic reflexes of contrast. Contrast was implemented in that study as follows: The exclamative occurred in a context which contained an explicit alternative to the object referent, and the exclamative expressed that the state-of-affairs involving the object referent was even more impressive than the state-of-affairs involving the alternative. Seeliger and Repp found that contrastive objects were marked by higher prosodic prominence than new objects very reliably: by more frequent accentuation, by more prominent accent types (L+H^* vs. H^*) and for the prominent L+H^* accents by acoustic parameters that are associated with greater prominence (longer duration, higher F0, higher F0 excursion). These findings are not yet surprising from the current perspective: information-structurally induced high prosodic prominence—for contrast—seems to be easily compatible with the speech act exclamative, which supports the assumption that exclamatives impose a general requirement for high prosodic prominence. 
However, Seeliger and Repp also found that the subject d-pronoun, which was accented in around 80% of the utterances when the object of the transitive structures was given or new, was accented in only 40% of the utterances when the object was contrastive⁸. In other words, it seems that the prominence of a contrastive object was increased by decreasing the prominence of another element of the clause. We conclude from this that prominence reduction seems to be possible if the prominence relations are altered in favor of an element that requires high prosodic prominence (to mark contrast).

A Constructional Default for Exclamations

Taken together, the present study and the earlier studies on exclamatives suggest that, although high prosodic prominence in exclamations may be motivated by information structure (in the case of contrast), there are also elements that attract prosodic prominence for reasons that are not likely to be information-structural/discourse-semantic. We argued that this is the case for the finite verb in the C-position, and even more so for the d-pronoun, for which a discourse-semantic/information-structural motivation seems to be quite out of reach. To be sure, the C-position and the d-pronoun are not always carriers of prosodic prominence—there is flexibility, and this flexibility is not totally arbitrary. There is interaction with information structure (contrast), and we saw that individual speakers may have preferred accentuation patterns (also see the next subsection). Nevertheless, it seems clear that exclamations come with accentuation patterns that cannot easily be explained by discourse semantics.

We propose that the speech act exclamation (covering polar exclamatives and wh-exclamatives) has a prosodic constructional default. Neitsch and Niebuhr (2019) posit prosodic constructions as feature bundles that systematically differ in their values between speech acts (in Neitsch and Niebuhr's case, rhetorical vs. information-seeking German questions). They assume that gradient acoustic parameters, such as duration, are more important for a prosodic construction than categorical phonological features. Still, they allow for the possibility that categorical features can play a role in a prosodic construction. We propose that the prosodic construction for exclamations comes with a requirement for a prominent accent on certain elements, which is a categorical feature. This requirement reflects the observation in older literature that exclamations are characterized by an exclamative accent, except that we now know much more about the position(s) of such an accent: if there is an element for which there is a discourse-semantic motivation for high prosodic prominence, e.g., a contrastive element, it is very likely that this element carries a/the accent; otherwise, the d-pronoun or the C-element does. Supporting evidence for the d-pronoun being a prime candidate for realizing the required accent comes from the observation that in the current study, a small percentage of exclamatives with a full subject had an accent on the article of the subject noun phrase. Plausibly, these accents were placed erroneously because the definite article and the d-pronoun are homophonous. Participants might have read the article and prematurely decided that it was a d-pronoun rather than an article preceding a noun. Importantly, the prosodic construction allows more than one (prominent?) accent in exclamatives. This is something that speakers exploit: the exclamatives in our study contained more accents than the questions did.
However, we note here that combining an accent on a d-pronoun and on the C-element does not seem to be possible – probably for rhythmical reasons: the two elements are adjacent.
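
The placement preferences just described can be loosely sketched as a decision rule. The following Python sketch is an illustration only and rests on strong simplifying assumptions: the names `Element` and `accent_candidates` are hypothetical, the rule is deterministic whereas the proposal concerns tendencies, and resolving an adjacent C-element and d-pronoun in favor of the d-pronoun is an arbitrary choice made here for concreteness.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Element:
    """One clause element (hypothetical representation, for illustration only)."""
    form: str
    position: int              # linear position in the clause
    contrastive: bool = False  # discourse-semantic motivation for high prominence
    d_pronoun: bool = False    # subject d-pronoun
    in_c: bool = False         # finite verb in the C-position


def accent_candidates(clause):
    """Return the elements that this simplified default would accent."""
    # 1. A contrastive element, if present, is the preferred accent carrier.
    contrastive = [e for e in clause if e.contrastive]
    if contrastive:
        return contrastive

    # 2. Otherwise the d-pronoun or the C-element carries the accent.
    defaults = [e for e in clause if e.d_pronoun or e.in_c]
    d_pronouns = [e for e in defaults if e.d_pronoun]

    result = []
    for e in defaults:
        # 3. Assumption: if the C-element is adjacent to a d-pronoun, only the
        #    d-pronoun is kept, so that two adjacent accents are avoided.
        if e.in_c and any(abs(e.position - d.position) == 1 for d in d_pronouns):
            continue
        result.append(e)
    return result


# Schematic clause: finite verb in C, d-pronoun subject, gradable adjective.
clause = [
    Element("V_fin", 0, in_c=True),
    Element("d-pron", 1, d_pronoun=True),
    Element("adjective", 2),
]
print([e.form for e in accent_candidates(clause)])
```

The sketch deliberately leaves out gradient properties such as speaking rate, and it says nothing about accents on other elements such as the gradable adjective.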

In addition to the categorical feature of requiring an accent on one (or more) of the above-mentioned elements, we assume that the prosodic construction comprises generally reduced givenness marking. This means that given elements other than the d-pronoun may carry an accent (as well). We found evidence for this in the present study. Also recall that both Seeliger and Repp (2020) and Repp (2019) found that in exclamatives with a transitive structure, given direct objects are very reliably accented. We furthermore assume that a slower speaking rate is an ingredient of the prosodic construction because this has been found consistently in all production studies of exclamatives. Speaking rate/duration is a gradient acoustic parameter, unlike accentuation. As already mentioned, Neitsch and Niebuhr (2019) consider it likely that continuous prosodic variables contribute more than categorical prosodic variables to prosodic constructions. At present, we remain agnostic on the relative contributions of different types of prosodic variables because this requires careful investigation, also in perception studies. Furthermore, we know from studies on focus marking that different speakers can and do use different tools to fulfill categorical pragmatic functions like focus marking. Grice et al. (2017) argue that these tools can be categorical themselves (such as using a prosodic event that is perceived as an L+H^* accent in order to mark contrastive focus) or continuous (such as using a later alignment of the pitch peak of the stressed syllable of a contrastively focused word, but not to such an extent that it is perceived as an L+H^* accent). Finally, we do not take a prosodic construction to be a necessarily coherent feature bundle whose values can only be changed in certain ways with respect to each other, as is suggested by Neitsch and Niebuhr. 
For instance, prima facie, there is no need to assume that a slower speaking rate must be accompanied by certain kinds of accentuation patterns. It is possible that prosodic constructions are collections of various features that are associated with the prosodic marking of specific utterance types, which can “step in” to help identification of the speech act in redundant ways. This requires future research.

Further General Prosodic Characteristics and Inter-Individual Variation

In this final subsection, we will discuss prosodic characteristics that concern prominence relations in exclamations in a more global way: the interaction of accentuation patterns with the length of the utterance, and the overall choice of type of accent. Then we will consider inter-individual variation.

We will start with research question (v), which asked if the length of a polar exclamative influences the prominence relations in the exclamative. We may answer this question in the affirmative. However, against our expectations we found that shorter exclamatives contained more accents than longer exclamatives did, rather than the other way round. Thus, the adjacency of stressable syllables did not induce speakers to produce fewer accents in order to keep to a rhythmical structure of alternating strong and weak syllables. Rather, it seems that the speech act exclamation favors a high number of accents even, and especially, in shorter utterances. There may be two reasons for this. On the one hand, short exclamatives are arguably what is produced predominantly in everyday oral speech and participants therefore more easily adopt an “expressive” or “exalted” speaking style (also see next paragraph). On the other hand, the short exclamatives in the present study contained the prominence-attracting d-pronoun in addition to the gradable adjective and the finite verb, which are also present in the longer structure tested here. We suggest that the presence of this d-pronoun contributed considerably to the higher number of accents.

As for the overall choice of accent in the present study, which overwhelmingly was H^*, with clause-initial finite verbs and subject d-pronouns also sometimes carrying L+H^*, the data suggest that speakers chose accent types that in earlier studies on assertions (Baumann and Röhr, 2015) were found to be more prominent than others. However, they did not choose L^*+H, which according to the earlier studies also is perceived as very prominent. This absence in conjunction with the fairly strong preference for H^* over L+H^* suggests that speakers do not necessarily use bitonal accents with salient rises, which arguably produce maximum prominence. Recall in this connection that Seeliger and Repp (2020) found that contrastive objects in exclamatives are marked with L+H^* more frequently in comparison to new or given objects. This might be taken to suggest that this accent is used specifically for marking contrast(ive focus) in German, as is often assumed in the literature (for German e.g., Weber et al., 2006; Gotzner et al., 2013; Gotzner, 2014; Grice et al., 2017; Braun and Biezma, 2019; Braun et al., 2019a). However, this is not fully compatible with the finding in the present study that the proportion of L+H^* is highest in the short exclamatives with a d-pronoun subject, which also had the slowest speaking rate of all exclamatives. These short exclamatives contained no more contrastive elements than the longer exclamatives did. Rather, it seems that participants chose a more “exalted” speaking style in the shorter exclamatives. This suggests that a particular prosodic category, like the L+H^* accent, may be associated with rather different functions. L+H^* may be used to mark contrast, and it may be used to mark a greater “degree of exclamativity” in short utterances with a slow speaking rate. One potential way to explain this observation is to appeal to the effort code (Gussenhoven, 2004). 
Contrast and exclamativity arguably are easily derived from effort because they constitute “special” communicative situations: they are both linked to unexpectedness. However, since such claims are hard to falsify, we will leave this as a speculation here. On an alternative speculative note, we could consider contrast as having an illocutionary component. After all, contrast is a relational property and juxtaposing something is an act (of speech). By this speculation, L+H^* does not mark the information-structural category contrast but a speech act that involves deviance from expectations or context.

Turning finally to general aspects of the inter-individual variation in the production of exclamatives, we found that speakers seem to have preferences for accent patterns across speech acts. This is at least what we observed for the most frequent accent pattern in exclamatives in two of the experiments. The more often a speaker produced one of these patterns in the exclamatives, the more often s/he produced it in the questions, although to a lesser extent. Thus, speakers displayed a certain consistency during the experimental session. Future research exploring corpus data must show if such preferences are truly consistent or if they are an artifact of the experimental situation. However, within speech acts, some speakers alternated between the first and second most frequent patterns, whereas others showed differential preferences for these patterns. Therefore, we may doubt that there is general consistency. None of the inter-individual differences correlated with the experimental factor speaker sex. For this factor, we only found spurious effects. Thus, we are assuming that female and male speakers do not systematically differ in the production of exclamatives vs. questions.
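
One way to quantify the kind of cross-speech-act consistency described above is to compute, per speaker, the proportion of utterances showing a given accent pattern in each speech act and then to correlate these proportions across speakers. The Python sketch below illustrates this under stated assumptions: it is not necessarily the analysis used in the present study, the helper names `pattern_proportions` and `pearson` are hypothetical, and the records are toy values invented purely for illustration, not data from the experiments.

```python
from collections import defaultdict
from math import sqrt


def pattern_proportions(records, pattern):
    """Proportion of utterances with `pattern`, per (speaker, speech act).

    `records` is an iterable of (speaker, speech_act, accent_pattern) tuples.
    """
    counts = defaultdict(lambda: [0, 0])  # (speaker, act) -> [hits, total]
    for speaker, act, observed in records:
        cell = counts[(speaker, act)]
        cell[0] += int(observed == pattern)
        cell[1] += 1
    return {key: hits / total for key, (hits, total) in counts.items()}


def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)


# Toy records (invented): pattern "A" vs. some other pattern "B".
toy = (
    [("s1", "excl", "A")] * 3 + [("s1", "excl", "B")] * 1
    + [("s1", "quest", "A")] * 2 + [("s1", "quest", "B")] * 2
    + [("s2", "excl", "A")] * 2 + [("s2", "excl", "B")] * 2
    + [("s2", "quest", "A")] * 1 + [("s2", "quest", "B")] * 3
    + [("s3", "excl", "A")] * 1 + [("s3", "excl", "B")] * 3
    + [("s3", "quest", "B")] * 4
)

props = pattern_proportions(toy, "A")
speakers = sorted({speaker for speaker, _ in props})
excl = [props[(s, "excl")] for s in speakers]
quest = [props[(s, "quest")] for s in speakers]
r = pearson(excl, quest)
print(speakers, excl, quest, r)
```

In the toy records, each speaker uses pattern "A" less often in questions than in exclamatives, but the ranking of speakers is the same in both speech acts, which is the configuration the paragraph above describes.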

The consequences of the inter-individual variation must be explored further in perception studies. Generally, we would expect that listeners can use a broad variety of cues to interpret intended meanings. For instance, Grice et al. (2017) elicited broad, narrow and contrastive focus from naïve speakers and then had listeners match the recordings to wh-questions with different focus structures. Even the recordings of speakers that seemed to not clearly mark contrastive focus (either by not using L+H^* accents or by not delaying the pitch peak, or by doing neither) could be matched to the licensing wh-questions at the same level as the recordings of speakers who clearly marked contrastive focus. This suggests that there are bundles of potentially subtle features that listeners can and do use to extract semantic and pragmatic information from speech. Variation between speakers is then almost to be expected: different speakers will make use of different features to different extents.

Conclusion

The present study shows that German polar questions and polar exclamatives differ systematically in their prosodic realization, in particular with respect to prosodic prominence. Certain lexical items and certain syntactic positions attract prosodic prominence in exclamatives but not in questions. We have suggested that the reasons for this are partly semantic-pragmatic, and partly prosodic. We proposed that in addition to the semantic relation between subject and lexical verb/adverbial adjective (“exclamative relation”), there are features that belong to the prosodic construction exclamation. This construction is characterized by a slower speaking rate, by a tendency to accent C-elements and d-pronominal subjects (for which there is no prima facie semantic-pragmatic or information-structural reason), and by an overall tendency to give a greater preference to speech-act-specific high prosodic prominence over an information-structurally induced low prosodic prominence. All these aspects of the prosodic construction seem to be purely motivated by the pragmatic factor speech act exclamative. They do not seem to have (discourse-)semantic correlates in the way that discourse givenness, i.e., high discourse prominence, negatively correlates with prosodic prominence, i.e., low prosodic prominence. Thus, overall the current findings suggest that the interface conditions for prosodic prominence are highly sensitive to the type of speech act, and the present study has contributed to exploring some of the details of these interface conditions.

Data Availability Statement

The datasets generated for this study are available on request to the corresponding author.

Ethics Statement

The studies involving human participants were reviewed and approved by XLinC Lab, University of Cologne. The patients/participants provided their written informed consent to participate in this study.

Author Contributions

SR designed the experiments and wrote the article. HS statistically analyzed the experiments and wrote the article. SR and HS developed the full set of materials and supervised the recordings and the data annotation. All authors contributed to the article and approved the submitted version.

Funding

This research was supported by the German Science Foundation (DFG) as part of the Collaborative Research Center 1252 Prominence in Language, University of Cologne, Project A6 Prominence in non-assertive speech acts.

Conflict of Interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Acknowledgments

The materials were spoken by Jonilla Ried. The data were recorded in the Speech Lab of XLinC Lab, University of Cologne. The recordings were run by Claudia Kilter, Lukas Kurzeja, Hanna Maurer, Brita Rietdorf, and Marlon Siewert. The data were annotated by Lukas Kurzeja, Hanna Maurer, and Marlon Siewert.

Supplementary Material

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fcomm.2020.00053/full#supplementary-material

Footnotes

1. ^An anonymous reviewer highlights the work by Pierrehumbert and Hirschberg (1990) in this connection, who building on Ward and Hirschberg (1985, 1986) suggest that the L^*+H accent (followed or not by L-H%) is used to convey uncertainty about/lack of commitment with respect to a contextually evoked scale. Pierrehumbert and Hirschberg also explicitly mention degrees in this connection. Still, the semantic-pragmatic notion of a scale in their sense is different from the semantic notion of gradability used here. A scale is defined in these works as a partially ordered set. Partially ordered sets include relationships like type-subtype or part-whole, as well as more generally alternatives, which does not apply to gradability as part of the lexical semantics of a predicate. For L^*+H, Pierrehumbert and Hirschberg also suggest that a scale is evoked. Again, what they have in mind is a contextually evoked scale rather than the degree semantics of a lexical item.

2. ^The models for Experiments 1 and 3 estimated the variances of the by-item random intercepts as 0, i.e., the models are singular fits.

3. ^We pooled pre-nuclear and nuclear accents for every syllable for the acoustic analysis further below. There were only 9 pre-nuclear accents on adjectives in the questions of Experiment 1, and none in Experiments 2 and 3.

4. ^See e.g., d'Avis (2002) for the idea that exclamatives are self-answering questions. Also see Huddleston (1993) for a purely pragmatic account of the interpretation of polar exclamatives vs. polar questions in English.

5. ^Lohnstein's (2012, 2016) proposal that VERUM focus is focus on a sentence mood operator prima facie is promising because the particular type of speech act (sentence type) plays a role in this proposal. Lohnstein argues that VERUM focus reduces the alternatives of (verbal) behavior of the addressee to the one alternative that serves the function of the speech act uttered by the speaker. For instance, placing an accent on a do-auxiliary in a question like So, DID he come to the party? signals that the addressee is expected to promptly give the true answer rather than digressing and/or giving a non-direct answer. Lohnstein does not discuss exclamatives. Importantly, however, an accent on C in exclamatives does not have an effect that is comparable to that in questions (or imperatives). As a matter of fact, it is hard to say what alternative behaviors could be excluded in exclamatives: the addressee is only expected to add to their discourse representation that the speaker has a certain psychological state concerning a certain state-of-affairs.

6. ^For lexical verbs, alternatives readily come to mind so that accenting these verbs in clause-final position does not incur the same intuition of unacceptability.

7. ^The finding that the difference in accentuation on the finite auxiliary in Experiment 1 and on the finite lexical verb in Experiment 3 is not very large is probably due to the fact that the d-pronoun in Experiment 3, which is adjacent to the finite verb, is accented very frequently.

8. ^Note that the auxiliary in the C-position was not accented frequently in that study. We assume that this is a consequence of the rhythmical structure of the sentences that Seeliger and Repp (2020) tested. Whereas we tested bisyllabic verbs in C, Seeliger and Repp tested monosyllabic verbs, which were adjacent to the d-pronoun. Accenting both elements would have resulted in a rhythmical clash.

References

Abels, K. (2010). Factivity in exclamatives is a presupposition. Stud. Linguistica 64, 141–157. doi: 10.1111/j.1467-9582.2010.01164.x

Altmann, H. (1988). Intonationsforschungen. Tübingen: Niemeyer. doi: 10.1515/9783111358413

Altmann, H. (1993). “Satzmodus,” in Syntax: Ein internationales Handbuch zeitgenössischer Forschung, eds J. Jacobs, A. von Stechow, W. Sternefeld, and T. Vennemann (Berlin: de Gruyter), 1006–1029.