The Bayes Estimators of the Variance and Scale Parameters of the Normal Model With a Known Mean for the Conjugate and Noninformative Priors Under Stein’s Loss

Zhang, Ying-Ying; Rong, Teng-Zhong; Li, Man-Man

doi:10.3389/fdata.2021.763925

ORIGINAL RESEARCH article

Front. Big Data, 03 January 2022

Sec. Data Science

Volume 4 - 2021 | https://doi.org/10.3389/fdata.2021.763925

This article is part of the Research Topic Bayesian Inference and AI.


Ying-Ying Zhang^1,2*

Teng-Zhong Rong^1,2

Man-Man Li^1,2

¹Department of Statistics and Actuarial Science, College of Mathematics and Statistics, Chongqing University, Chongqing, China
²Chongqing Key Laboratory of Analytic Mathematics and Applications, Chongqing University, Chongqing, China

For the normal model with a known mean, the Bayes estimation of the variance parameter under the conjugate prior is studied in Lehmann and Casella (1998) and Mao and Tang (2012). However, they only calculate the Bayes estimator with respect to a conjugate prior under the squared error loss function. Zhang (2017) calculates the Bayes estimator of the variance parameter of the normal model with a known mean with respect to the conjugate prior under Stein’s loss function which penalizes gross overestimation and gross underestimation equally, and the corresponding Posterior Expected Stein’s Loss (PESL). Motivated by their works, we have calculated the Bayes estimators of the variance parameter with respect to the noninformative (Jeffreys’s, reference, and matching) priors under Stein’s loss function, and the corresponding PESLs. Moreover, we have calculated the Bayes estimators of the scale parameter with respect to the conjugate and noninformative priors under Stein’s loss function, and the corresponding PESLs. The quantities (prior, posterior, three posterior expectations, two Bayes estimators, and two PESLs) and expressions of the variance and scale parameters of the model for the conjugate and noninformative priors are summarized in two tables. After that, the numerical simulations are carried out to exemplify the theoretical findings. Finally, we calculate the Bayes estimators and the PESLs of the variance and scale parameters of the S&P 500 monthly simple returns for the conjugate and noninformative priors.

1 Introduction

There are four basic elements in Bayesian decision theory, and specifically in Bayesian point estimation: the data, the model, the prior, and the loss function. In this paper, we are interested in data from the normal model with a known mean, with respect to the conjugate and noninformative (Jeffreys’s, reference, and matching) priors, under Stein’s and the squared error loss functions. We will analytically calculate the Bayes estimators of the variance and scale parameters of this model for each of these priors and loss functions.

The squared error loss function has been used by many authors for the problem of estimating the variance, σ², based on a random sample from a normal distribution (see, for instance, Maatta and Casella, 1990). As pointed out by Casella and Berger (2002), the squared error loss function penalizes overestimation and underestimation equally, which is fine for a location parameter with parameter space $Θ = (- \infty, \infty)$ . For a variance or scale parameter, the parameter space is $Θ = (0, \infty)$ , where 0 is a natural lower bound and the estimation problem is not symmetric. In these cases, we should not choose the squared error loss function, but rather a loss function which penalizes gross overestimation and gross underestimation equally, that is, one under which an action a incurs an infinite loss when it tends to 0 or ∞. Stein’s loss function has this property, and many authors therefore recommend it for the positive restricted parameter space $Θ = (0, \infty)$ (see, for example, James and Stein, 1961; Petropoulos and Kourouklis, 2005; Oono and Shinozaki, 2006; Bobotas and Kourouklis, 2010; Zhang, 2017; Xie et al., 2018; Zhang et al., 2019; Sun et al., 2021). In the normal model with a known mean μ, our parameters of interest are θ = σ² (a variance parameter) and θ = σ (a scale parameter). Therefore, we will select Stein’s loss function.

The motivation and contributions of our paper are summarized as follows. For the normal model with a known mean μ, the Bayes estimation of the variance parameter θ = σ² under the conjugate prior, which is an Inverse Gamma distribution, is studied in Example 4.2.5 (p.236) of Lehmann and Casella (1998) and Example 1.3.5 (p.15) of Mao and Tang (2012). However, they only calculate the Bayes estimator with respect to a conjugate prior under the squared error loss. Zhang (2017) calculates the Bayes estimator of the variance parameter θ = σ² of the normal model with a known mean with respect to the conjugate prior under Stein’s loss function, which penalizes gross overestimation and gross underestimation equally, and the corresponding Posterior Expected Stein’s Loss (PESL). Motivated by these works (Lehmann and Casella, 1998; Mao and Tang, 2012; Zhang, 2017), we want to calculate the Bayes estimators of the variance and scale parameters of the normal model with a known mean for the conjugate and noninformative priors under Stein’s loss function. Our contributions are twofold. First, we have calculated the Bayes estimators of the variance parameter θ = σ² with respect to the noninformative (Jeffreys’s, reference, and matching) priors under Stein’s loss function, and the corresponding PESLs. Second, we have calculated the Bayes estimators of the scale parameter θ = σ with respect to the conjugate and noninformative priors under Stein’s loss function, and the corresponding PESLs. For more literature on Bayesian estimation and inference, we refer readers to (Sindhu and Aslam, 2013a; Sindhu and Aslam, 2013b; Sindhu et al., 2013; Sindhu et al., 2016a; Sindhu et al., 2016b; Sindhu et al., 2016c; Sindhu et al., 2017; Sindhu et al., 2018; Sindhu and Hussain, 2018).

The rest of the paper is organized as follows. In Section 2, we analytically calculate the Bayes estimators of the variance and scale parameters of the normal model with a known mean, with respect to the conjugate and noninformative priors under Stein’s loss function, and the corresponding PESLs. We also analytically calculate the Bayes estimators under the squared error loss function, and the corresponding PESLs. The quantities (prior, posterior, three posterior expectations, two Bayes estimators, and two PESLs) and expressions of the variance and scale parameters for the conjugate and noninformative priors are summarized in two tables. Section 3 reports numerical simulation results for the combination of the noninformative prior and the scale parameter; these support the two theoretical inequalities for the Bayes estimators and the PESLs, and the finding that the PESLs depend only on the number of observations and not on the mean or the sample. In Section 4, we calculate the Bayes estimators and the PESLs of the variance and scale parameters of the S&P 500 monthly simple returns for the conjugate and noninformative priors. Some conclusions and discussions are provided in Section 5.

2 Bayes Estimator, PESL, IRSL, and BRSL

In this section, we will analytically calculate the Bayes estimator $δ_{s}^{π, θ} (x)$ of the variance parameter $θ = σ^{2} \in Θ = (0, \infty)$ under Stein’s loss function, the PESL at $δ_{s}^{π, θ} (x)$ , $P E S L_{s}^{π, θ} (x)$ , and the Integrated Risk under Stein’s Loss (IRSL) at $δ_{s}^{π, θ}$ , $I R S L_{s}^{π, θ} = B R S L^{π, θ}$ , which is also the Bayes Risk under Stein’s Loss (BRSL) for π, θ. See (Robert, 2007) for the definitions of the posterior expected loss, the integrated risk, and the Bayes risk. We will also analytically calculate the Bayes estimator $δ_{s}^{π, σ} (x)$ of the scale parameter $σ \in Θ = (0, \infty)$ under Stein’s loss function, the PESL at $δ_{s}^{π, σ} (x)$ , $P E S L_{s}^{π, σ} (x)$ , and the IRSL at $δ_{s}^{π, σ}$ , $I R S L_{s}^{π, σ} = B R S L^{π, σ}$ , which is also the BRSL for π, σ.

Suppose that we observe X₁, X₂, …, X_n from the hierarchical normal model with a mixing variance parameter θ = σ²:

\begin{cases} X_{i} | θ \overset{iid}{\sim} N (μ, θ), & i = 1, 2, \dots, n, \\ θ \sim π (θ), \end{cases} (1)

where − ∞ < μ < ∞ is a known constant, $N (μ, θ)$ is the normal distribution with a known mean μ and an unknown variance θ, and $π (θ)$ is the prior distribution of θ. For the normal model with a known mean μ, the Bayes estimation of the variance parameter θ = σ² under the conjugate prior which is an Inverse Gamma distribution is studied in Example 4.2.5 (p.236) of (Lehmann and Casella, 1998) and Example 1.3.5 (p.15) of (Mao and Tang, 2012). However, they only calculate the Bayes estimator with respect to a conjugate prior under the squared error loss. (Zhang, 2017) calculates the Bayes estimator of the variance parameter θ = σ² with respect to the conjugate prior under Stein’s loss function, and the corresponding PESL. Motivated by the works of (Lehmann and Casella, 1998; Mao and Tang, 2012; Zhang, 2017), we want to calculate the Bayes estimators of the variance parameter of the normal model with a known mean for the noninformative (Jeffreys’s, reference, and matching) priors under Stein’s loss function. The usual Bayes estimator with respect to a prior $π (θ)$ is to calculate $δ_{2}^{π, θ} (x) = E (θ | x)$ under the squared error loss function. As pointed out in the introduction, we should calculate and use the Bayes estimator of the variance parameter θ with respect to a prior $π (θ)$ under Stein’s loss function, that is, $δ_{s}^{π, θ} (x)$ .

Alternatively, we may be interested in the scale parameter θ = σ. Motivated by the works of (Lehmann and Casella, 1998; Mao and Tang, 2012; Zhang, 2017), we also want to calculate the Bayes estimators of the scale parameter θ = σ with respect to the conjugate and noninformative priors under Stein’s loss function, and the corresponding PESLs. Suppose that we observe X₁, X₂, …, X_n from the hierarchical normal model with a mixing scale parameter θ = σ:

\begin{cases} X_{i} | σ \overset{iid}{\sim} N (μ, σ^{2}), & i = 1, 2, \dots, n, \\ σ \sim π (σ), \end{cases} (2)

where − ∞ < μ < ∞ is a known constant, $N (μ, σ^{2})$ is the normal distribution with a known mean μ and an unknown variance σ², and $π (σ)$ is the prior distribution of σ. The usual Bayes estimator with respect to a prior $π (σ)$ is to calculate $δ_{2}^{π, σ} (x) = E (σ | x)$ under the squared error loss function. As pointed out in the introduction, we should calculate and use the Bayes estimator of the scale parameter σ with respect to a prior $π (σ)$ under Stein’s loss function, that is, $δ_{s}^{π, σ} (x)$ .

Now let us explain why we choose Stein’s loss function on $Θ = (0, \infty)$ . Stein’s loss function is given by

L_{s} (θ, a) = \frac{a}{θ} - \log \frac{a}{θ} - 1, (3)

where θ > 0 is the unknown parameter of interest and a is an action or estimator. The squared error loss function is given by

L_{2} (θ, a) = {(a - θ)}^{2} . (4)

The asymmetric Linear Exponential (LINEX) loss function (Varian et al., 1975; Zellner, 1986; Robert, 2007) is given by

L_{L} (θ, a) = e^{c (a - θ)} - c (a - θ) - 1, (5)

where c ≠ 0 determines its shape. In particular, when c > 0, the LINEX loss function tends to ∞ exponentially as a → ∞, while when c < 0, it tends to ∞ linearly. Note that on the positive restricted parameter space $Θ = (0, \infty)$ , Stein’s loss function penalizes gross overestimation and gross underestimation equally, that is, an action a will incur an infinite loss when it tends to 0 or ∞. By contrast, the squared error loss function does not penalize gross overestimation and gross underestimation equally, as an action a will incur a finite loss (in fact θ²) when it tends to 0 and an infinite loss when it tends to ∞. Similarly, the LINEX loss functions do not penalize gross overestimation and gross underestimation equally, as an action a will incur a finite loss (in fact $e^{- c θ} + c θ - 1$ ) when it tends to 0 and an infinite loss when it tends to ∞. Figure 1 shows the four loss functions on $Θ = (0, \infty)$ when θ = 2.


FIGURE 1. The four loss functions on $Θ = (0, \infty)$ when θ = 2.
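The limiting behaviour described above can be checked numerically. The following is a minimal Python sketch, not the authors' code; the function names and the value c = 0.5 are illustrative assumptions (the shape constants used for Figure 1 are not stated in the text). It evaluates the losses of Eqs 3, 4, 5 at θ = 2 for actions near 0 and for large actions.

```python
import math

def stein_loss(theta, a):
    """Stein's loss, Eq. 3: a/theta - log(a/theta) - 1."""
    r = a / theta
    return r - math.log(r) - 1.0

def squared_error_loss(theta, a):
    """Squared error loss, Eq. 4."""
    return (a - theta) ** 2

def linex_loss(theta, a, c):
    """LINEX loss, Eq. 5, with shape constant c != 0."""
    d = c * (a - theta)
    return math.exp(d) - d - 1.0

theta = 2.0
c = 0.5  # assumed, illustrative shape constant

# Columns: a, Stein, squared error, LINEX(c), LINEX(-c).
for a in (1e-8, 1e-3, 2.0, 1e2, 1e3):
    print(a,
          stein_loss(theta, a),
          squared_error_loss(theta, a),
          linex_loss(theta, a, c),
          linex_loss(theta, a, -c))
```

Only Stein's loss grows without bound at both ends of (0, ∞); the other losses stay bounded as a tends to 0.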

As pointed out by (Zhang, 2017), the Bayes estimator

δ_{s}^{π, θ} (x) = \frac{1}{E (\frac{1}{θ} | x)}

minimizes the PESL, that is,

δ_{s}^{π, θ} (x) = \arg \min_{a \in A} E [L_{s} (θ, a) | x],

where $A= \{a (x) : a (x) > 0\}$ is an action space, $a = a (x) > 0$ is an action (estimator), which is a function only of x, $L_{s} (θ, a)$ given by (Eq. 3) is Stein’s loss function, and θ > 0 is the unknown parameter of interest. Note that Stein’s loss function has a nice property that it penalizes gross overestimation and gross underestimation equally, that is, an action a will incur an infinite loss when it tends to 0 or ∞. Moreover, note that θ may be the variance parameter σ² or the scale parameter σ.
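For readers who want the intermediate step, the following is a short derivation sketch of why this action minimizes the PESL. It assumes that $E (θ^{- 1} | x)$ and $E (\log θ | x)$ are finite and is not quoted from (Zhang, 2017).

```latex
\begin{aligned}
E [L_{s} (θ, a) \mid x] &= a \, E (θ^{-1} \mid x) - \log a + E (\log θ \mid x) - 1, \\
\frac{\partial}{\partial a} E [L_{s} (θ, a) \mid x] &= E (θ^{-1} \mid x) - \frac{1}{a} = 0
\;\Longrightarrow\; a = \frac{1}{E (θ^{-1} \mid x)}, \\
\frac{\partial^{2}}{\partial a^{2}} E [L_{s} (θ, a) \mid x] &= \frac{1}{a^{2}} > 0 .
\end{aligned}
```

The positive second derivative shows that the stationary point is the unique minimizer on $(0, \infty)$.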

The usual Bayes estimator of θ is $δ_{2}^{π, θ} (x) = E (θ | x)$ which minimizes the Posterior Expected Squared Error Loss. It is interesting to note that

δ_{s}^{π, θ} (x) \leq δ_{2}^{π, θ} (x), (6)

whose proof exploits Jensen’s inequality and can be found in Zhang (2017). Note that the inequality (Eq. 6) is a special case of an inequality in Zhang et al. (2018). As calculated in Zhang (2017), the PESL at $δ_{s}^{π, θ} (x) = {[E (θ^{- 1} | x)]}^{- 1}$ is

P E S L_{s}^{π, θ} (x) = {E [L_{s} (θ, a) | x]|}_{a = \frac{1}{E (\frac{1}{θ} | x)}} = \log E (\frac{1}{θ} | x) + E (\log θ | x),

and the PESL at $δ_{2}^{π, θ} (x) = E (θ | x)$ is

\begin{aligned} P E S L_{2}^{π, θ} (x) & = {E [L_{s} (θ, a) | x]|}_{a = E (θ | x)} \\ & = E (θ | x) E (\frac{1}{θ} | x) - \log E (θ | x) + E (\log θ | x) - 1 . \end{aligned}

As observed in (Zhang, 2017),

P E S L_{s}^{π, θ} (x) \leq P E S L_{2}^{π, θ} (x), (7)

which is a direct consequence of the general methodology for finding a Bayes estimator, that is, of the fact that $δ_{s}^{π, θ} (x)$ minimizes the PESL. The numerical simulations will exemplify (Eqs 6, 7) later. Note that the calculations of $δ_{s}^{π, θ} (x)$ , $δ_{2}^{π, θ} (x)$ , $P E S L_{s}^{π, θ} (x)$ , and $P E S L_{2}^{π, θ} (x)$ depend only on the three expectations $E (θ | x)$ , $E (θ^{- 1} | x)$ , and $E (\log θ | x)$ .
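One way to see both inequalities at once is sketched below. The symbol u is introduced here for illustration only, and the sketch assumes that $E (θ | x)$ and $E (θ^{- 1} | x)$ are finite. Jensen's inequality applied to the convex function 1/θ gives u ≥ 1, which is (Eq. 6), and subtracting the two PESL expressions gives (Eq. 7).

```latex
\begin{aligned}
u &= E (θ \mid x) \, E (θ^{-1} \mid x) \geq 1, \\
P E S L_{2}^{π, θ} (x) - P E S L_{s}^{π, θ} (x) &= u - 1 - \log u \geq 0 .
\end{aligned}
```

The function u − 1 − log u is nonnegative for every u > 0 and vanishes only at u = 1, so the gap between the two PESLs is small exactly when the two Bayes estimators are close.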

2.1 Conjugate Prior

The problem of finding the Bayes estimator under a conjugate prior is a standard problem that is treated in almost every text on Mathematical Statistics.

The quantities and expressions of the variance and scale parameters of the normal models (Eqs 1, 2) with a known mean μ for the conjugate prior are summarized in Table 1. In the table, α > 0 and β > 0 are known constants,

α^{*} = α + \frac{n}{2}, β^{*} = {[\frac{1}{β} + \frac{1}{2} \sum_{i = 1}^{n} {(x_{i} - μ)}^{2}]}^{- 1},

ψ (z) = \frac{Γ^{'} (z)}{Γ (z)} = \frac{d}{d z} \log Γ (z) = \mathrm{digamma} (z)

is the digamma function, and $Γ (z)$ is the gamma function. In R software (R Core Team, 2021), the function digamma(z) calculates $ψ (z)$ . The quantities and expressions of the variance parameter θ = σ² for the conjugate prior are quoted from Zhang (2017), where they are calculated. The calculations of the quantities and expressions of the scale parameter θ = σ for the conjugate prior can be found in the Supplementary Material. We remark that the calculations of the quantities and expressions in Table 1 are not trivial, especially that of $E^{π_{c}} (\log θ | x)$ .
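As a rough illustration of how α* and β* are obtained from data, here is a minimal Python sketch. The function name and the data are hypothetical. The two estimators at the end assume that the posterior of θ is $I G (α^{*}, β^{*})$ and reuse the Inverse Gamma expectations $E (θ^{- 1} | x) = α^{*} β^{*}$ and $E (θ | x) = 1 / [(α^{*} - 1) β^{*}]$; Table 1 itself is not reproduced here.

```python
def conjugate_posterior(x, mu, alpha, beta):
    """Return (alpha_star, beta_star) for an IG(alpha, beta) prior on theta = sigma^2."""
    n = len(x)
    s = sum((xi - mu) ** 2 for xi in x)        # sum of squared deviations from the known mean
    alpha_star = alpha + n / 2.0
    beta_star = 1.0 / (1.0 / beta + 0.5 * s)
    return alpha_star, beta_star

x = [0.3, -1.2, 0.8, 0.1]                      # hypothetical data
a_star, b_star = conjugate_posterior(x, mu=0.0, alpha=3.0, beta=1.0)

# Assumed IG(alpha_star, beta_star) posterior:
delta_s = 1.0 / (a_star * b_star)              # 1 / E(1/theta | x), Stein's loss
delta_2 = 1.0 / ((a_star - 1.0) * b_star)      # E(theta | x), needs alpha_star > 1
print(a_star, b_star, delta_s, delta_2)
```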


TABLE 1. The quantities and expressions for the conjugate prior.

2.2 Noninformative Priors

Famous noninformative priors include the Jeffreys’s (Jeffreys, 1961), reference (Bernardo, 1979; Berger and Bernardo, 1992), and matching (Tibshirani, 1989; Datta and Mukerjee, 2004) priors. See also (Berger, 2006; Berger et al., 2015) and the references therein.

The Jeffreys’s noninformative prior for θ = σ² is

π_{J} (θ) \propto \frac{1}{θ} \quad \text{or} \quad π_{J} (σ^{2}) \propto \frac{1}{σ^{2}} .

See Part I (p.66) of (Chen, 2014), where μ is assumed known in the normal model $N (μ, θ)$ . The Jeffreys’s noninformative prior for θ = σ is

π_{J} (σ) \propto \frac{1}{σ} .

See Example 3.5.6 (p.131) of (Robert, 2007), where μ is assumed known in the normal model $N (μ, σ^{2})$ .

Since μ is assumed known in the normal models, there is only one unknown parameter. Therefore, the reference prior is equal to the Jeffreys’s prior, and the matching prior is also equal to the Jeffreys’s prior (see pp.130–131 of (Ghosh et al., 2006)). In summary, when μ is assumed known in the normal models, the three noninformative priors coincide, that is,

π_{n} (θ) = π_{J} (θ) = π_{R} (θ) = π_{M} (θ) \propto \frac{1}{θ}

and

π_{n} (σ) = π_{J} (σ) = π_{R} (σ) = π_{M} (σ) \propto \frac{1}{σ},

where $π_{n} (\cdot)$ stands for the noninformative prior.

Note that as in many statistics textbooks, the probability density function (pdf) of $θ \sim I G (α, β)$ is given by

f_{θ} (θ | α, β) = \frac{1}{Γ (α) β^{α}} {(\frac{1}{θ})}^{α + 1} \exp (- \frac{1}{β θ}), θ > 0, α > 0, β > 0 .

The conjugate prior of the scale parameter θ = σ is a Square Root of the Inverse Gamma (SRIG) distribution that we define below.

DEFINITION 1 Let $θ = σ^{2} \sim I G (α, β)$ with α > 0 and β > 0. Then $σ = \sqrt{θ} \sim S R I G (α, β)$ and the pdf of σ is given by

f_{σ} (σ | α, β) = \frac{2}{Γ (α) β^{α}} {(\frac{1}{σ})}^{2 α + 1} \exp (- \frac{1}{β σ^{2}}), σ > 0, α > 0, β > 0 .

Definition 1 defines the SRIG distribution, which is the conjugate prior of the scale parameter θ = σ of the normal distribution. We give the definition here because the SRIG distribution cannot be found in standard textbooks. Moreover, Definition 1 is consistent with the change-of-variables formula, since

\begin{aligned} f_{σ} (σ | α, β) & = f_{θ} (θ | α, β) |θ^{'} (σ)| \\ & = \frac{1}{Γ (α) β^{α}} {(\frac{1}{σ^{2}})}^{α + 1} \exp (- \frac{1}{β σ^{2}}) \cdot 2 σ \\ & = \frac{2}{Γ (α) β^{α}} {(\frac{1}{σ})}^{2 α + 1} \exp (- \frac{1}{β σ^{2}}) . \end{aligned}

We have the following proposition which gives the three expectations of the $S R I G (α, β)$ distribution. The calculations needed in the proposition can be found in the Supplementary Material. We remark that the calculations of $E (σ)$ and $E (σ^{- 1})$ are straightforward by utilizing a simple transformation of θ = σ² and the integration of an $I G (α, β)$ distribution. However, the calculation of $E (\log σ)$ requires more care: it first uses the transformation $y = 1 / (β σ^{2})$ and then an interchange of the order of integration and differentiation.

PROPOSITION 1 Let $σ = \sqrt{θ} \sim S R I G (α, β)$ with α > 0 and β > 0. Then

\begin{aligned} E (σ) & = \frac{Γ (α - \frac{1}{2})}{Γ (α) β^{\frac{1}{2}}}, \quad \text{for } α > \frac{1}{2} \text{ and } β > 0, \\ E (\frac{1}{σ}) & = \frac{Γ (α + \frac{1}{2}) β^{\frac{1}{2}}}{Γ (α)}, \quad \text{for } α > 0 \text{ and } β > 0, \\ E (\log σ) & = - \frac{1}{2} \log β - \frac{1}{2} ψ (α), \quad \text{for } α > 0 \text{ and } β > 0 . \end{aligned}
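The closed forms in Proposition 1 can be sanity-checked against direct numerical integration of the SRIG density from Definition 1. The sketch below is an illustration, not the Supplementary Material's derivation. The values α = 3 and β = 2, the truncation point, and the step count are arbitrary assumptions, and ψ(3) is obtained from the standard identity ψ(3) = 1 + 1/2 − γ, with γ the Euler-Mascheroni constant.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant

def srig_pdf(sigma, alpha, beta):
    """pdf of SRIG(alpha, beta) from Definition 1, assembled on the log scale."""
    log_f = (math.log(2.0) - math.lgamma(alpha) - alpha * math.log(beta)
             - (2.0 * alpha + 1.0) * math.log(sigma)
             - 1.0 / (beta * sigma ** 2))
    return math.exp(log_f)

def midpoint_expectation(g, alpha, beta, upper=60.0, steps=120000):
    """Midpoint-rule approximation of E[g(sigma)] over (0, upper]."""
    h = upper / steps
    total = 0.0
    for i in range(steps):
        s = (i + 0.5) * h
        total += g(s) * srig_pdf(s, alpha, beta)
    return total * h

alpha, beta = 3.0, 2.0  # illustrative values
psi_alpha = 1.5 - EULER_GAMMA  # psi(3) = 1 + 1/2 - gamma

closed_mean = math.exp(math.lgamma(alpha - 0.5) - math.lgamma(alpha)) / math.sqrt(beta)
closed_inv = math.exp(math.lgamma(alpha + 0.5) - math.lgamma(alpha)) * math.sqrt(beta)
closed_log = -0.5 * math.log(beta) - 0.5 * psi_alpha

num_mean = midpoint_expectation(lambda s: s, alpha, beta)
num_inv = midpoint_expectation(lambda s: 1.0 / s, alpha, beta)
num_log = midpoint_expectation(math.log, alpha, beta)

print(closed_mean, num_mean)
print(closed_inv, num_inv)
print(closed_log, num_log)
```

Each printed pair should agree up to the quadrature error.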

The relationship between the two distributions $I G (α, β)$ and $S R I G (α, β)$ is given in the following proposition whose proof can be found in the Supplementary Material. We remark that the proof of the proposition is straightforward by utilizing the monotone transformations θ = σ² and $σ = \sqrt{θ}$ .

PROPOSITION 2 $θ = σ^{2} \sim I G (α, β)$ if and only if $σ = \sqrt{θ} \sim S R I G (α, β)$ , where α > 0 and β > 0.

The posterior distributions of θ and σ for the noninformative priors are given in the following theorem whose proof can be found in the Supplementary Material.

THEOREM 1 Let $X | θ \sim N (μ, θ)$ and $X | σ \sim N (μ, σ^{2})$ where μ is known and θ = σ² is unknown, $π (θ) \propto \frac{1}{θ}$ , and $π (σ) \propto \frac{1}{σ}$ . Then

π (θ | x) \sim I G (\tilde{α}, \tilde{β}) \quad \text{and} \quad π (σ | x) \sim S R I G (\tilde{α}, \tilde{β}),

where

\tilde{α} = \frac{n}{2} \quad \text{and} \quad \tilde{β} = \frac{2}{\sum_{i = 1}^{n} {(x_{i} - μ)}^{2}} . (8)
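In code, (Eq. 8) amounts to one pass over the data. The following minimal Python sketch uses a hypothetical function name and made-up data, and it assumes that at least one observation differs from μ so that the sum of squares is positive.

```python
def noninformative_posterior(x, mu):
    """Return (alpha_tilde, beta_tilde) of Eq. 8 for the prior 1/theta (or 1/sigma)."""
    n = len(x)
    s = sum((xi - mu) ** 2 for xi in x)  # must be positive
    return n / 2.0, 2.0 / s

x = [1.0, 2.0, 3.0]                      # hypothetical data
alpha_tilde, beta_tilde = noninformative_posterior(x, mu=2.0)
print(alpha_tilde, beta_tilde)
```

The same pair of hyperparameters indexes both posteriors in Theorem 1, the Inverse Gamma one for θ and the SRIG one for σ.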

We have the following two remarks for Theorem 1.

Remark 1 Let θ = σ². In the derivation of $π (σ | x)$ , if we derive it in this way,

\begin{aligned} f_{σ} (σ) & = π (σ | x) \\ & \propto {(\frac{1}{σ})}^{n + 1} \exp (- \frac{1}{2 σ^{2}} \sum_{i = 1}^{n} {(x_{i} - μ)}^{2}) \\ & = {(\frac{1}{σ^{2}})}^{\frac{n + 1}{2}} \exp (- \frac{1}{2 σ^{2}} \sum_{i = 1}^{n} {(x_{i} - μ)}^{2}) \\ & = {(\frac{1}{θ})}^{\frac{n + 1}{2}} \exp (- \frac{1}{2 θ} \sum_{i = 1}^{n} {(x_{i} - μ)}^{2}) = f_{θ} (θ) \\ & \sim I G ({\tilde{α}}_{1}, \tilde{β}), \end{aligned}

where

{\tilde{α}}_{1} = \frac{n - 1}{2} \quad \text{and} \quad \tilde{β} = \frac{2}{\sum_{i = 1}^{n} {(x_{i} - μ)}^{2}},

then by Proposition 2, $f_{σ} (σ) = π (σ | x) \sim S R I G ({\tilde{α}}_{1}, \tilde{β})$ , which is different from $S R I G (\tilde{α}, \tilde{β})$ . In fact, the above practice is equivalent to deriving the pdf of θ from the pdf of σ by $f_{θ} (θ) = f_{σ} (σ)$ , ignoring the Jacobian term $|σ^{'} (θ)|$ , which is wrong. This derivation is therefore a pitfall for incautious users. ‖

Remark 2 The two posterior distributions in Theorem 1, $π (θ | x) \sim I G (\tilde{α}, \tilde{β})$ and $π (σ | x) \sim S R I G (\tilde{α}, \tilde{β})$ , follow Proposition 2 by accident. We have

f_{θ} (θ) = π (θ | x) \propto f (x | θ) π (θ) \propto f (x | θ) \frac{1}{θ} \sim I G (\tilde{α}, \tilde{β})

and

f_{σ} (σ) = π (σ | x) \propto f (x | σ) π (σ) \propto f (x | θ) \frac{1}{σ} \sim S R I G (\tilde{α}, \tilde{β}) .

Note that $σ = \sqrt{θ}$ , and thus

f_{σ} (σ) |σ^{'} (θ)| \propto f (x | θ) \frac{1}{σ} \frac{1}{2 \sqrt{θ}} = f (x | θ) \frac{1}{\sqrt{θ}} \frac{1}{2 \sqrt{θ}} = f (x | θ) \frac{1}{2 θ} \propto f_{θ} (θ), (9)

which is the reason why $π (θ | x) = f_{θ} (θ)$ and $π (σ | x) = f_{σ} (σ)$ follow Proposition 2. Note that the posterior distributions depend on the prior distributions. If the prior distributions $π (θ)$ and $π (σ)$ are chosen to differ from $\frac{1}{θ}$ and $\frac{1}{σ}$ , then the relationship (Eq. 9) may not be satisfied, and thus $π (θ | x)$ and $π (σ | x)$ may not follow Proposition 2. ‖

2.2.1 The Quantities and Expressions of the Variance Parameter

In this subsubsection, we will calculate the expressions of the quantities (three posterior expectations, two Bayes estimators, and two PESLs) of the variance parameter θ = σ².

Now we calculate the three expectations $E (θ | x)$ , $E (θ^{- 1} | x)$ , and $E (\log θ | x)$ for the variance parameter θ = σ². By Theorem 1, $π (θ | x) \sim I G (\tilde{α}, \tilde{β})$ , and thus

E (θ | x) = \frac{1}{(\tilde{α} - 1) \tilde{β}}, \quad \tilde{α} > 1, \quad \text{and} \quad E (\frac{1}{θ} | x) = \tilde{α} \tilde{β} .

From (Zhang, 2017), we know that

E (\log θ | x) = - \log \tilde{β} - ψ (\tilde{α}) .

It is easy to see that, for $\tilde{α} > 1$ ,

δ_{s}^{π, θ} (x) = \frac{1}{E (\frac{1}{θ} | x)} = \frac{1}{\tilde{α} \tilde{β}} < \frac{1}{(\tilde{α} - 1) \tilde{β}} = E (θ | x) = δ_{2}^{π, θ} (x),

which exemplifies (Eq. 6). From (Zhang, 2017), we find that

P E S L_{s}^{π, θ} (x) = \log \tilde{α} - ψ (\tilde{α}), \quad \text{for } \tilde{α} > 0,

and

P E S L_{2}^{π, θ} (x) = \frac{1}{\tilde{α} - 1} + \log (\tilde{α} - 1) - ψ (\tilde{α}), \quad \text{for } \tilde{α} > 1 .

It can be directly proved that $P E S L_{s}^{π, θ} (x) \leq P E S L_{2}^{π, θ} (x)$ for $\tilde{α} > 1$ , which exemplifies (Eq. 7); the proof, which exploits the Taylor series expansion of e^x, can be found in the Supplementary Material. Note that $P E S L_{s}^{π, θ} (x)$ and $P E S L_{2}^{π, θ} (x)$ depend only on $\tilde{α} = n / 2$ . Therefore, they depend only on n, and not on μ or x. Numerical simulations will exemplify this result.
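Because the two PESLs for the variance parameter are functions of n alone, they can be tabulated without any data. The following Python sketch is an illustration under stated assumptions. The standard library has no digamma function, so the helper below is our own approximation (the recurrence ψ(z) = ψ(z + 1) − 1/z followed by a truncated asymptotic series), not a library routine.

```python
import math

def digamma(z):
    """Approximate psi(z) for z > 0: shift z up to at least 6, then use an asymptotic series."""
    shift = 0.0
    while z < 6.0:
        shift -= 1.0 / z
        z += 1.0
    inv2 = 1.0 / (z * z)
    series = inv2 * (1.0 / 12.0 - inv2 * (1.0 / 120.0 - inv2 * (1.0 / 252.0 - inv2 / 240.0)))
    return shift + math.log(z) - 0.5 / z - series

def pesl_variance(n):
    """Return (PESL_s, PESL_2) for theta = sigma^2 with alpha_tilde = n / 2 (requires n > 2)."""
    a = n / 2.0
    pesl_s = math.log(a) - digamma(a)
    pesl_2 = 1.0 / (a - 1.0) + math.log(a - 1.0) - digamma(a)
    return pesl_s, pesl_2

for n in (5, 10, 50, 200):
    print(n, pesl_variance(n))
```

Neither μ nor x enters `pesl_variance`, which is the point of the remark above. In this table both PESLs shrink as n grows, and the first stays below the second.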

The IRSL at $δ_{s}^{π, θ}$ or the BRSL for θ = σ² is (similar to (Robert, 2007))

\begin{array}{l} I R S L_{s}^{π, θ} = B R S L^{π, θ} = r (π, δ_{s}^{π, θ}) \\ = E^{π} [R (θ, δ_{s}^{π, θ})] \\ = \int_{Θ} R (θ, δ_{s}^{π, θ}) π (θ) d θ \\ = \int_{Θ} \int_{X} L (θ, δ_{s}^{π, θ} (x)) f (x | θ) d x π (θ) d θ \\ = \int_{X} \int_{Θ} L (θ, δ_{s}^{π, θ} (x)) f (x | θ) π (θ) d θ d x \\ = \int_{X} \int_{Θ} L (θ, δ_{s}^{π, θ} (x)) π (θ | x) d θ m^{π, θ} (x) d x \\ = \int_{X} {P E S L^{π, θ} (a (x) | x)|}_{a = δ_{s}^{π, θ}} m^{π, θ} (x) d x \\ = \int_{X} P E S L_{s}^{π, θ} (x) m^{π, θ} (x) d x \\ = \int_{X} [\log \tilde{α} - ψ (\tilde{α})] m^{π, θ} (x) d x \\ = \log \tilde{α} - ψ (\tilde{α}) \\ = P E S L_{s}^{π, θ} (x), \end{array}

since $\tilde{α}$ does not depend on x, where

m^{π, θ} (x) = \int_{0}^{\infty} f (x | θ) π (θ) d θ

is the marginal density of x with prior $π (θ)$ .

2.2.2 The Quantities and Expressions of the Scale Parameter

In this subsubsection, we will calculate the expressions of the quantities (three posterior expectations, two Bayes estimators, and two PESLs) of the scale parameter θ = σ.

Now let us calculate $δ_{s}^{π, σ} (x)$ , $δ_{2}^{π, σ} (x)$ , $P E S L_{s}^{π, σ} (x)$ , and $P E S L_{2}^{π, σ} (x)$ for the scale parameter σ. To calculate these quantities, we need to calculate the three expectations $E (σ | x)$ , $E (σ^{- 1} | x)$ , and $E (\log σ | x)$ . Since $π (σ | x) \sim S R I G (\tilde{α}, \tilde{β})$ by Theorem 1, from Proposition 1, we have

E (σ | x) = \frac{Γ (\tilde{α} - \frac{1}{2})}{Γ (\tilde{α}) {\tilde{β}}^{\frac{1}{2}}}, \quad \text{for } \tilde{α} > \frac{1}{2} \text{ and } \tilde{β} > 0, (10)

E (\frac{1}{σ} | x) = \frac{Γ (\tilde{α} + \frac{1}{2}) {\tilde{β}}^{\frac{1}{2}}}{Γ (\tilde{α})}, \quad \text{for } \tilde{α} > 0 \text{ and } \tilde{β} > 0, (11)

E (\log σ | x) = - \frac{1}{2} \log \tilde{β} - \frac{1}{2} ψ (\tilde{α}), \quad \text{for } \tilde{α} > 0 \text{ and } \tilde{β} > 0 . (12)

It can be proved that, for $\tilde{α} > \frac{1}{2}$ ,

δ_{s}^{π, σ} (x) = \frac{1}{E (\frac{1}{σ} | x)} = \frac{Γ (\tilde{α})}{Γ (\tilde{α} + \frac{1}{2}) {\tilde{β}}^{\frac{1}{2}}} < \frac{Γ (\tilde{α} - \frac{1}{2})}{Γ (\tilde{α}) {\tilde{β}}^{\frac{1}{2}}} = E (σ | x) = δ_{2}^{π, σ} (x),

which exemplifies (Eq. 6), and the proof which exploits the positivity of $ψ^{'} (x)$ can be found in the Supplementary Material.

Now we calculate $P E S L_{s}^{π, σ} (x)$ and $P E S L_{2}^{π, σ} (x)$ for the scale parameter σ. From (Zhang, 2017), we know that the PESL at $δ_{s}^{π, σ} (x) = {[E (σ^{- 1} | x)]}^{- 1}$ is

P E S L_{s}^{π, σ} (x) = {E [L_{s} (θ, a) | x]|}_{a = \frac{1}{E (\frac{1}{σ} | x)}} = \log E (\frac{1}{σ} | x) + E (\log σ | x),

and the PESL at $δ_{2}^{π, σ} (x) = E (σ | x)$ is

\begin{aligned} P E S L_{2}^{π, σ} (x) & = {E [L_{s} (θ, a) | x]|}_{a = E (σ | x)} \\ & = E (σ | x) E (\frac{1}{σ} | x) - 1 - \log E (σ | x) + E (\log σ | x) . \end{aligned}

Substituting (Eqs 10, 11, 12), into the above expressions, we obtain

\begin{aligned} P E S L_{s}^{π, σ} (x) & = \log \frac{Γ (\tilde{α} + \frac{1}{2}) {\tilde{β}}^{\frac{1}{2}}}{Γ (\tilde{α})} - \frac{1}{2} \log \tilde{β} - \frac{1}{2} ψ (\tilde{α}) \\ & = \log Γ (\tilde{α} + \frac{1}{2}) - \log Γ (\tilde{α}) - \frac{1}{2} ψ (\tilde{α}), \end{aligned}

for $\tilde{α} > 0$ and $\tilde{β} > 0$ , and

\begin{aligned} P E S L_{2}^{π, σ} (x) & = \frac{Γ (\tilde{α} - \frac{1}{2})}{Γ (\tilde{α}) {\tilde{β}}^{\frac{1}{2}}} \frac{Γ (\tilde{α} + \frac{1}{2}) {\tilde{β}}^{\frac{1}{2}}}{Γ (\tilde{α})} - 1 - \log \frac{Γ (\tilde{α} - \frac{1}{2})}{Γ (\tilde{α}) {\tilde{β}}^{\frac{1}{2}}} - \frac{1}{2} \log \tilde{β} - \frac{1}{2} ψ (\tilde{α}) \\ & = \frac{Γ (\tilde{α} - \frac{1}{2}) Γ (\tilde{α} + \frac{1}{2})}{Γ^{2} (\tilde{α})} - 1 - \log Γ (\tilde{α} - \frac{1}{2}) + \log Γ (\tilde{α}) - \frac{1}{2} ψ (\tilde{α}), \end{aligned}

for $\tilde{α} > \frac{1}{2}$ and $\tilde{β} > 0$ . It can be directly proved that $P E S L_{s}^{π, σ} (x) \leq P E S L_{2}^{π, σ} (x)$ for $\tilde{α} > \frac{1}{2}$ and $\tilde{β} > 0$ , which exemplifies (Eq. 7); the proof, which exploits the Taylor series expansion of log u for u near 1, can be found in the Supplementary Material. Note that $P E S L_{s}^{π, σ} (x)$ and $P E S L_{2}^{π, σ} (x)$ depend only on $\tilde{α} = n / 2$ . Therefore, they depend only on n, and not on μ or x. Numerical simulations will exemplify this result.

The IRSL at $δ_{s}^{π, σ}$ or the BRSL for θ = σ is (similar to (Robert, 2007))

\begin{array}{l} I R S L_{s}^{π, σ} = B R S L^{π, σ} = r (π, δ_{s}^{π, σ}) \\ = E^{π} [R (σ, δ_{s}^{π, σ})] \\ = \int_{Σ} R (σ, δ_{s}^{π, σ}) π (σ) d σ \\ = \int_{Σ} \int_{X} L (σ, δ_{s}^{π, σ} (x)) f (x | σ) d x π (σ) d σ \\ = \int_{X} \int_{Σ} L (σ, δ_{s}^{π, σ} (x)) f (x | σ) π (σ) d σ d x \\ = \int_{X} \int_{Σ} L (σ, δ_{s}^{π, σ} (x)) π (σ | x) d σ m^{π, σ} (x) d x \\ = \int_{X} {P E S L^{π, σ} (a (x) | x)|}_{a = δ_{s}^{π, σ}} m^{π, σ} (x) d x \\ = \int_{X} P E S L_{s}^{π, σ} (x) m^{π, σ} (x) d x \\ = \int_{X} [\log Γ (\tilde{α} + \frac{1}{2}) - \log Γ (\tilde{α}) - \frac{1}{2} ψ (\tilde{α})] m^{π, σ} (x) d x \\ = \log Γ (\tilde{α} + \frac{1}{2}) - \log Γ (\tilde{α}) - \frac{1}{2} ψ (\tilde{α}) \\ = P E S L_{s}^{π, σ} (x), \end{array}

since $\tilde{α}$ does not depend on x, where

m^{π, σ} (x) = \int_{0}^{\infty} f (x | σ) π (σ) d σ

is the marginal density of x with prior $π (σ)$ .

The quantities and expressions of the variance and scale parameters for the noninformative priors are summarized in Table 2. In the table, $\tilde{α}$ and $\tilde{β}$ are given by (Eq. 8).


TABLE 2. The quantities and expressions for the noninformative priors.

From Tables 1, 2, we find that there are four combinations of the expressions of the quantities: conjugate prior and variance parameter, conjugate prior and scale parameter, noninformative prior and variance parameter, and noninformative prior and scale parameter. The forms of the expressions of the quantities are the same for the variance parameter under the conjugate and noninformative priors, since they have the same Inverse Gamma posterior distributions. Similarly, the forms of the expressions of the quantities are the same for the scale parameter under the conjugate and noninformative priors, since they have the same Square Root of the Inverse Gamma posterior distributions.

The inequalities (Eqs 6, 7) appear in both Tables 1, 2. In fact, there are 8 inequalities in total, 4 in each table. Since the forms of the expressions of the quantities are the same in Tables 1, 2 and differ only in the parameters, there are actually 4 distinct inequalities, namely those in Table 2. Of these four, the inequality between the Bayes estimators of the variance parameter is obvious, and the proofs of the other three can be found in the Supplementary Material.

3 Numerical Simulations

In this section, we will numerically exemplify the theoretical studies of (Eqs 6, 7), and that the PESLs depend only on n, but do not depend on μ and x. The numerical simulation results are similar for the four combinations of the expressions of the quantities, and thus we only present the results for the combination of the noninformative prior and the scale parameter.

First, we fix μ = 0 and n = 10, and assume that σ = 1 is drawn from the improper prior distribution. After that, we draw a random sample

x = rnorm (n = n, mean = μ, sd = σ)

from N(μ, σ²).

To generate a random sample $σ = (σ_{1}, \dots, σ_{k})$ with k = 1000 from

π_{n} (σ | x) = S R I G (\tilde{α}, \tilde{β}),

we will adopt the following algorithm. First, compute $\tilde{α}$ and $\tilde{β}$ from (Eq. 8). Second, generate a random sample

G = rgamma (n = k, shape = \tilde{α}, scale = \tilde{β}) \sim G (\tilde{α}, \tilde{β}) .

Third, compute

I G = \frac{1}{G} \sim I G (\tilde{α}, \tilde{β}) .

Fourth, compute

σ = \sqrt{I G} \sim S R I G (\tilde{α}, \tilde{β}) .

Hence, σ is a random sample from the $S R I G (\tilde{α}, \tilde{β})$ distribution. Figure 2 shows the histogram of σ|x and the density estimation curve of π_n(σ|x). It is with respect to π_n(σ|x) that $δ_{s}^{π_{n}, σ} (x)$ minimizes the PESL. From the figure, we see that the $S R I G (\tilde{α}, \tilde{β})$ distribution is continuous and right skewed, with its peak toward the left.
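The four sampling steps above translate directly into code. The sketch below is a Python analogue of the R-style calls, not the authors' script. It relies on `random.gammavariate(alpha, beta)` taking the shape first and the scale second, matching the shape and scale arguments of the rgamma call above. The seed is arbitrary, so the sample differs from the one behind Figure 2.

```python
import math
import random

def sample_srig(alpha_tilde, beta_tilde, k, rng):
    """Draw k values from SRIG(alpha_tilde, beta_tilde) as sqrt(1 / G), G ~ Gamma(shape, scale)."""
    draws = []
    for _ in range(k):
        g = rng.gammavariate(alpha_tilde, beta_tilde)  # (shape, scale)
        draws.append(math.sqrt(1.0 / g))
    return draws

rng = random.Random(1)                      # arbitrary seed, not the paper's
mu, n, sigma_true = 0.0, 10, 1.0
x = [rng.gauss(mu, sigma_true) for _ in range(n)]

s = sum((xi - mu) ** 2 for xi in x)
alpha_tilde, beta_tilde = n / 2.0, 2.0 / s  # Eq. 8
sigma = sample_srig(alpha_tilde, beta_tilde, k=1000, rng=rng)

# Compare the sample mean of 1/sigma with the closed form of Proposition 1.
mc_inv = sum(1.0 / v for v in sigma) / len(sigma)
exact_inv = (math.exp(math.lgamma(alpha_tilde + 0.5) - math.lgamma(alpha_tilde))
             * math.sqrt(beta_tilde))
print(mc_inv, exact_inv)
```

With k = 1000 the Monte Carlo average agrees with the closed form only up to sampling error; a larger k tightens the agreement.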

FIGURE 2. The histogram of σ|x and the density estimation curve of π_n(σ|x).

The Bayes estimators ($δ_{s}^{π_{n}, σ} (x)$ and $δ_{2}^{π_{n}, σ} (x)$) and the PESLs ($\mathrm{PESL}_{s}^{π_{n}, σ} (x)$ and $\mathrm{PESL}_{2}^{π_{n}, σ} (x)$) are computed by the following algorithm. First, compute $\tilde{α}$ and $\tilde{β}$ from (Eq. 8). Second, compute

\begin{array}{rl} E_{1} & = E (σ | x) = \frac{Γ (\tilde{α} - \frac{1}{2})}{Γ (\tilde{α}) {\tilde{β}}^{\frac{1}{2}}}, \\ E_{2} & = E (\frac{1}{σ} | x) = \frac{Γ (\tilde{α} + \frac{1}{2}) {\tilde{β}}^{\frac{1}{2}}}{Γ (\tilde{α})}, \\ E_{3} & = E (\log σ | x) = - \frac{1}{2} \log \tilde{β} - \frac{1}{2} ψ (\tilde{α}) . \end{array}

Third, compute

\begin{array}{rl} δ_{s}^{π_{n}, σ} (x) & = \frac{1}{E_{2}}, \\ δ_{2}^{π_{n}, σ} (x) & = E_{1}, \\ \mathrm{PESL}_{s}^{π_{n}, σ} (x) & = \log (E_{2}) + E_{3}, \\ \mathrm{PESL}_{2}^{π_{n}, σ} (x) & = E_{1} \times E_{2} - \log (E_{1}) + E_{3} - 1 . \end{array}

Numerical results show that

δ_{s}^{π_{n}, σ} (x) = 0.7712483 < 0.8152161 = δ_{2}^{π_{n}, σ} (x)

and

\mathrm{PESL}_{s}^{π_{n}, σ} (x) = 0.0267013 < 0.02826706 = \mathrm{PESL}_{2}^{π_{n}, σ} (x),

which exemplify (Eqs 6, 7).
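The three-step algorithm above can be written as a short function. This is a sketch in Python rather than the authors' R code. The names `digamma` and `srig_summary` are hypothetical, the digamma function ψ is approximated by the standard recurrence plus asymptotic series because the Python standard library does not provide it, and the value $\tilde{β} = 0.35$ is an arbitrary illustrative choice. Taking $\tilde{α} = n / 2 = 5$ is an assumption about (Eq. 8), which is not shown in this section; under that assumption the two PESLs agree with the values reported above to about six decimal places.

```python
import math

def digamma(x):
    """Approximate psi(x) for x > 0: shift x up by recurrence, then asymptotic series."""
    result = 0.0
    while x < 10.0:
        result -= 1.0 / x          # psi(x) = psi(x + 1) - 1/x
        x += 1.0
    inv2 = 1.0 / (x * x)
    series = inv2 * (1.0 / 12 - inv2 * (1.0 / 120 - inv2 * (1.0 / 252 - inv2 / 240)))
    return result + math.log(x) - 0.5 / x - series

def srig_summary(alpha, beta):
    """Bayes estimators and PESLs of sigma under an SRIG(alpha, beta) posterior."""
    lg = math.lgamma
    e1 = math.exp(lg(alpha - 0.5) - lg(alpha)) / math.sqrt(beta)   # E(sigma | x)
    e2 = math.exp(lg(alpha + 0.5) - lg(alpha)) * math.sqrt(beta)   # E(1/sigma | x)
    e3 = -0.5 * math.log(beta) - 0.5 * digamma(alpha)              # E(log sigma | x)
    return {
        "delta_s": 1.0 / e2,                                 # Bayes estimator, Stein's loss
        "delta_2": e1,                                       # Bayes estimator, squared error loss
        "pesl_s": math.log(e2) + e3,                         # PESL at delta_s
        "pesl_2": e1 * e2 - math.log(e1) + e3 - 1.0,         # PESL at delta_2
    }

out = srig_summary(alpha=5.0, beta=0.35)   # beta is an arbitrary illustrative value
print(out)
```

The estimators scale with $\tilde{β}^{- 1 / 2}$, so they will not match the reported 0.7712483 and 0.8152161 unless $\tilde{β}$ is computed from the same sample x.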

In Figure 3, we fix μ = 0 and n = 10, but let the seed number run from 1 to 10 (i.e., we change x), and plot the estimators and the PESLs against the seed. The left plot shows that the estimators depend on x in an unpredictable manner, and that $δ_{s}^{π_{n}, σ} (x)$ is uniformly smaller than $δ_{2}^{π_{n}, σ} (x)$, so (Eq. 6) is exemplified. The two Bayes estimators are distinguishable because n = 10 is small. The right plot shows that the PESLs do not depend on x, and that $\mathrm{PESL}_{s}^{π_{n}, σ} (x)$ is uniformly smaller than $\mathrm{PESL}_{2}^{π_{n}, σ} (x)$, so (Eq. 7) is exemplified.

FIGURE 3. The estimators as functions of x (left) and the PESLs as functions of x (right).

Now we allow one of the two parameters μ and n to change, holding the other fixed. Moreover, we also assume that the sample x is fixed, as is the case for real data. Figure 4 shows the estimators and PESLs as functions of μ and n. We see from the left plots of the figure that the estimators depend on μ and n, and (Eq. 6) is exemplified. More specifically, the estimators are first decreasing and then increasing functions of μ, and the estimators attain the minimum when μ = 0. However, the estimators fluctuate around some value when n increases. The right plots of the figure show that the PESLs depend only on n, not on μ, and (Eq. 7) is exemplified. More specifically, the PESLs are decreasing functions of n. Furthermore, the two PESLs as functions of n are indistinguishable, as the two PESLs are very close. In summary, the results of the figure exemplify (Eqs 6, 7).
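The dependence of the PESLs on n alone can also be checked numerically. The sketch below is in Python, not the authors' R code, and assumes that (Eq. 8) gives $\tilde{α} = n / 2$ and $\tilde{β} = 2 / \sum (x_{i} - μ)^{2}$ for the noninformative prior. The helper names are hypothetical, and the seeds and grid of μ values are arbitrary. Changing x or μ changes only $\tilde{β}$, which cancels in both PESLs.

```python
import math
import random

def digamma(x):
    """Approximate psi(x) for x > 0: shift x up by recurrence, then asymptotic series."""
    result = 0.0
    while x < 10.0:
        result -= 1.0 / x
        x += 1.0
    inv2 = 1.0 / (x * x)
    series = inv2 * (1.0 / 12 - inv2 * (1.0 / 120 - inv2 * (1.0 / 252 - inv2 / 240)))
    return result + math.log(x) - 0.5 / x - series

def posterior_params(x, mu):
    """Assumed form of Eq. 8: alpha_tilde = n / 2, beta_tilde = 2 / sum((x_i - mu)^2)."""
    return len(x) / 2.0, 2.0 / sum((xi - mu) ** 2 for xi in x)

def pesls(alpha, beta):
    """Return (PESL_s, PESL_2) for the scale parameter under SRIG(alpha, beta)."""
    lg = math.lgamma
    log_e1 = lg(alpha - 0.5) - lg(alpha) - 0.5 * math.log(beta)   # log E(sigma | x)
    log_e2 = lg(alpha + 0.5) - lg(alpha) + 0.5 * math.log(beta)   # log E(1/sigma | x)
    e3 = -0.5 * math.log(beta) - 0.5 * digamma(alpha)             # E(log sigma | x)
    return log_e2 + e3, math.exp(log_e1 + log_e2) - log_e1 + e3 - 1.0

# Vary the sample x (seeds 1 to 10) with mu = 0 and n = 10 fixed.
by_seed = []
for seed in range(1, 11):
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0) for _ in range(10)]
    by_seed.append(pesls(*posterior_params(x, 0.0)))

# Vary mu with the last sample x held fixed.
by_mu = [pesls(*posterior_params(x, mu)) for mu in (-2.0, -1.0, 0.0, 1.0, 2.0)]

# Vary n; beta is set to 1 because it does not affect the PESLs.
by_n = [pesls(n / 2.0, 1.0) for n in range(2, 51)]
print(by_seed[0], by_mu[0], by_n[0], by_n[-1])
```

The loop over n starts at 2 because $E (σ | x)$ requires $\tilde{α} > 1 / 2$.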

FIGURE 4. Left: The estimators as functions of μ and n. Right: The PESLs as functions of μ and n.

Since the estimators $δ_{s}^{π_{n}, σ} (x)$ and $δ_{2}^{π_{n}, σ} (x)$ and the PESLs $\mathrm{PESL}_{s}^{π_{n}, σ} (x)$ and $\mathrm{PESL}_{2}^{π_{n}, σ} (x)$ depend on $\tilde{α}$ and $\tilde{β}$, where $\tilde{α} > 1 / 2$ and $\tilde{β} > 0$, we can plot the surfaces of the estimators and the PESLs on the domain $(\tilde{α}, \tilde{β}) \in (0.5, 10] \times (0,10] = D$ via the R function persp3d() in the R package rgl (see Adler and Murdoch, 2017; Zhang et al., 2017; Zhang et al., 2019; Sun et al., 2021). We remark that the R function persp() in the R package graphics cannot add another surface to an existing surface, but persp3d() can. Moreover, persp3d() allows one to rotate the perspective plots of the surface as desired. Figure 5 plots the surfaces of the estimators and the PESLs, and the surfaces of the difference of the estimators and the difference of the PESLs. From the left two plots of the figure, we see that $δ_{s}^{π_{n}, σ} (x) < δ_{2}^{π_{n}, σ} (x)$ for all $(\tilde{α}, \tilde{β})$ on D, which exemplifies (Eq. 6). From the right two plots of the figure, we see that $\mathrm{PESL}_{s}^{π_{n}, σ} (x) < \mathrm{PESL}_{2}^{π_{n}, σ} (x)$ for all $(\tilde{α}, \tilde{β})$ on D, which exemplifies (Eq. 7). In summary, the results of the figure exemplify (Eqs 6, 7).
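As a complement to the surface plots, the two differences can be evaluated on a grid over D and their minima inspected. The following Python sketch does not draw the surfaces and is not the authors' rgl code. The function name `differences` and the grid spacing of 0.05 are hypothetical choices. It uses the identity that the PESL difference equals $E_{1} E_{2} - 1 - \log (E_{1} E_{2})$, which follows from the two PESL formulas above because $E_{3}$ cancels.

```python
import math

def differences(alpha, beta):
    """Return (delta_2 - delta_s, PESL_2 - PESL_s) for SRIG(alpha, beta), alpha > 1/2."""
    lg = math.lgamma
    log_e1 = lg(alpha - 0.5) - lg(alpha) - 0.5 * math.log(beta)   # log E(sigma | x)
    log_e2 = lg(alpha + 0.5) - lg(alpha) + 0.5 * math.log(beta)   # log E(1/sigma | x)
    d_est = math.exp(log_e1) - math.exp(-log_e2)                  # E1 - 1/E2
    # E(log sigma | x) appears in both PESLs and cancels in their difference.
    d_pesl = math.exp(log_e1 + log_e2) - 1.0 - (log_e1 + log_e2)
    return d_est, d_pesl

alphas = [0.5 + 0.05 * i for i in range(1, 191)]   # 0.55, ..., 10.0
betas = [0.05 * j for j in range(1, 201)]          # 0.05, ..., 10.0
grid = [differences(a, b) for a in alphas for b in betas]
min_d_est = min(d for d, _ in grid)
min_d_pesl = min(d for _, d in grid)
print(min_d_est > 0.0, min_d_pesl > 0.0)
```

A grid check of this kind only illustrates the inequalities at the sampled points; the proofs referred to earlier cover the whole domain.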

FIGURE 5. The domain for $(\tilde{α}, \tilde{β})$ is D = (0.5, 10] × (0, 10] for all the plots. a is for $\tilde{α}$ and b is for $\tilde{β}$ in the axes of all the plots. The red surface is for $δ_{2}^{π_{n}, σ} (x)$ and the blue surface is for $δ_{s}^{π_{n}, σ} (x)$ in the upper two plots. (upper left) The estimators as functions of $\tilde{α}$ and $\tilde{β}$. $δ_{s}^{π_{n}, σ} (x) < δ_{2}^{π_{n}, σ} (x)$ for all $(\tilde{α}, \tilde{β})$ on D. (upper right) The PESLs as functions of $\tilde{α}$ and $\tilde{β}$. $\mathrm{PESL}_{s}^{π_{n}, σ} (x) < \mathrm{PESL}_{2}^{π_{n}, σ} (x)$ for all $(\tilde{α}, \tilde{β})$ on D. (lower left) The surface of $δ_{2}^{π_{n}, σ} (x) - δ_{s}^{π_{n}, σ} (x)$, which is positive for all $(\tilde{α}, \tilde{β})$ on D. (lower right) The surface of $\mathrm{PESL}_{2}^{π_{n}, σ} (x) - \mathrm{PESL}_{s}^{π_{n}, σ} (x)$, which is also positive for all $(\tilde{α}, \tilde{β})$ on D.

4 A Real Data Example

In this section, we use data from finance. The R package quantmod (Ryan and Ulrich, 2017) is used to download the data ^GSPC (the S&P 500) from 2020-04-24 to 2021-07-02 from “finance.yahoo.com.” It is commonly believed that the monthly simple returns of the index data or the stock data are normally distributed. It is simple to check that the S&P 500 monthly simple returns follow the normal model. Usually, the data from real examples can be regarded as iid from the normal model with an unknown mean μ. However, the mean μ could be estimated from prior information or historical information. Alternatively, the mean μ could be estimated by the sample mean. Therefore, for simplicity, we assume that the mean μ is known. Assume that

μ = \bar{x}, α = 1, β = 1

for the S&P 500 monthly simple returns.

The Bayes estimators and the PESLs of the variance and scale parameters of the S&P 500 monthly simple returns for the conjugate and noninformative priors are summarized in Table 3. From the table, we observe the following facts.

• The two inequalities (Eqs 6, 7) are exemplified.

• Given the prior (conjugate or noninformative), the Bayes estimators are similar across different loss functions (Stein’s or squared error).

• Given the loss function, the Bayes estimators are quite different across different priors. Therefore, the prior has a larger influence than the loss function in calculating the Bayes estimators.

TABLE 3. The Bayes estimators and the PESLs of the S&P 500 monthly simple returns.

More results (the data of the S&P 500 monthly simple returns, the plot of the S&P 500 monthly close prices, the plot of the S&P 500 monthly simple returns, the histogram of the S&P 500 monthly simple returns) for the real data example can be found in the Supplementary Material due to space limitations.

5 Conclusions and Discussions

For the variance (θ = σ²) and scale (θ = σ) parameters of the normal model with a known mean μ, we recommend and analytically calculate the Bayes estimators, $δ_{s}^{π, θ} (x)$, with respect to the conjugate and noninformative (Jeffreys’s, reference, and matching) priors under Stein’s loss function, which penalizes gross overestimation and gross underestimation equally. These estimators minimize the PESLs. We also analytically calculate the Bayes estimators, $δ_{2}^{π, θ} (x) = E (θ | x)$, with respect to the conjugate and noninformative priors under the squared error loss function, and the corresponding PESLs. The quantities ($π (θ)$, $π (θ | x)$, $E^{π} (θ | x)$, $E^{π} (θ^{- 1} | x)$, $E^{π} (\log θ | x)$, $δ_{s}^{π, θ} (x)$, $δ_{2}^{π, θ} (x)$, $\mathrm{PESL}_{s}^{π, θ} (x)$, $\mathrm{PESL}_{2}^{π, θ} (x)$) and expressions of the variance and scale parameters for the conjugate and noninformative priors are summarized in Tables 1, 2, respectively. Note that $E^{π} (\log θ | x)$, which is essential for the calculation of $\mathrm{PESL}_{s}^{π, θ} (x)$ and $\mathrm{PESL}_{2}^{π, θ} (x)$, depends on the digamma function.

Proposition 1 gives the three expectations of the $\mathrm{SRIG} (α, β)$ distribution. Moreover, Proposition 2 gives the relationship between the two distributions $\mathrm{IG} (α, β)$ and $\mathrm{SRIG} (α, β)$.

For the conjugate and noninformative priors, the posterior distribution of θ = σ², $π (θ | x)$ , follows an Inverse Gamma distribution, and the posterior distribution of σ, $π (σ | x)$ , follows an SRIG distribution which is defined in Definition 1.

We find that the IRSL at $δ_{s}^{π, θ}$ or the BRSL for θ = σ² is

\mathrm{PESL}_{s}^{π, θ} (x) = \log \tilde{α} - ψ (\tilde{α}) .

In addition, the IRSL at $δ_{s}^{π, σ}$ or the BRSL for θ = σ is

\mathrm{PESL}_{s}^{π, σ} (x) = \log Γ (\tilde{α} + \frac{1}{2}) - \log Γ (\tilde{α}) - \frac{1}{2} ψ (\tilde{α}) .
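The two closed forms above can be recovered in a few lines. The following is a sketch, assuming the usual form of Stein’s loss, $L (θ, a) = a / θ - \log (a / θ) - 1$, and the posterior expectations implied by $1 / σ^{2} | x \sim G (\tilde{α}, \tilde{β})$; the symbol a for a generic action is introduced here for illustration only. Taking the posterior expectation of the loss gives

\mathrm{PESL} (a) = a E (θ^{- 1} | x) - \log a + E (\log θ | x) - 1,

which is minimized at $a = 1 / E (θ^{- 1} | x)$, so that

\mathrm{PESL}_{s}^{π, θ} (x) = \log E (θ^{- 1} | x) + E (\log θ | x) .

For θ = σ², $E (θ^{- 1} | x) = \tilde{α} \tilde{β}$ and $E (\log θ | x) = - \log \tilde{β} - ψ (\tilde{α})$, hence

\mathrm{PESL}_{s}^{π, θ} (x) = \log \tilde{α} + \log \tilde{β} - \log \tilde{β} - ψ (\tilde{α}) = \log \tilde{α} - ψ (\tilde{α}) .

For θ = σ, $E (σ^{- 1} | x) = Γ (\tilde{α} + \frac{1}{2}) {\tilde{β}}^{\frac{1}{2}} / Γ (\tilde{α})$ and $E (\log σ | x) = - \frac{1}{2} \log \tilde{β} - \frac{1}{2} ψ (\tilde{α})$, hence

\mathrm{PESL}_{s}^{π, σ} (x) = \log Γ (\tilde{α} + \frac{1}{2}) + \frac{1}{2} \log \tilde{β} - \log Γ (\tilde{α}) - \frac{1}{2} \log \tilde{β} - \frac{1}{2} ψ (\tilde{α}) = \log Γ (\tilde{α} + \frac{1}{2}) - \log Γ (\tilde{α}) - \frac{1}{2} ψ (\tilde{α}) .

In both cases $\tilde{β}$ cancels, which is why these PESLs depend on the data only through $\tilde{α}$.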

The numerical simulations of the combination of the noninformative prior and the scale parameter exemplify the theoretical studies of (Eqs 6, 7), and that the PESLs depend only on n, but do not depend on μ and x. Moreover, in the real data example, we have calculated the Bayes estimators and the PESLs of the variance and scale parameters of the S&P 500 monthly simple returns for the conjugate and noninformative priors.

In the frequentist paradigm, if $\hat{σ}$ is the Maximum Likelihood Estimator (MLE) of σ, then ${\hat{σ}}^{2}$ is the MLE of σ². In the Bayesian paradigm, this invariance does not hold in general, so we should usually estimate the variance parameter σ² and the scale parameter σ separately. In Table 2, we find that

δ_{s}^{π_{n}, σ^{2}} (x) = \frac{1}{\tilde{α} \tilde{β}} \quad \text{and} \quad δ_{s}^{π_{n}, σ} (x) = \frac{Γ (\tilde{α})}{Γ (\tilde{α} + \frac{1}{2}) {\tilde{β}}^{\frac{1}{2}}} .

It is easy to see that

δ_{s}^{π_{n}, σ^{2}} (x) \neq {[δ_{s}^{π_{n}, σ} (x)]}^{2} .

Similarly,

δ_{2}^{π_{n}, σ^{2}} (x) \neq {[δ_{2}^{π_{n}, σ} (x)]}^{2} .
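A quick numerical check makes the two non-equalities concrete. The sketch below is in Python; the function name `estimators` and the values $\tilde{α} = 5$, $\tilde{β} = 0.35$ are illustrative assumptions. The expression used for $δ_{2}^{π_{n}, σ^{2}} (x)$, namely $1 / ((\tilde{α} - 1) \tilde{β})$ for $\tilde{α} > 1$, is not stated in this section; it is the mean of the Inverse Gamma distribution under the parametrization implied by the formulas above.

```python
import math

def estimators(alpha, beta):
    """Bayes estimators of sigma^2 and sigma under Stein's and squared error loss."""
    lg = math.lgamma
    return {
        # Stein's loss, variance parameter: 1 / (alpha * beta)
        "s_var": 1.0 / (alpha * beta),
        # Stein's loss, scale parameter: Gamma(alpha) / (Gamma(alpha + 1/2) * sqrt(beta))
        "s_scale": math.exp(lg(alpha) - lg(alpha + 0.5)) / math.sqrt(beta),
        # Squared error loss, variance parameter (assumed IG mean, needs alpha > 1)
        "2_var": 1.0 / ((alpha - 1.0) * beta),
        # Squared error loss, scale parameter: Gamma(alpha - 1/2) / (Gamma(alpha) * sqrt(beta))
        "2_scale": math.exp(lg(alpha - 0.5) - lg(alpha)) / math.sqrt(beta),
    }

est = estimators(5.0, 0.35)   # illustrative posterior parameters
print(est["s_var"], est["s_scale"] ** 2)
print(est["2_var"], est["2_scale"] ** 2)
```

By Jensen’s inequality the squared scale estimator lies above the variance estimator under Stein’s loss and below it under the squared error loss, so squaring the estimator of σ does not give the estimator of σ² in either case.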

When there is no prior information about the unknown parameter of interest, we prefer the noninformative prior, as the hyperparameters α and β are somewhat arbitrary for the conjugate prior.

We remark that the Bayes estimator under Stein’s loss function is more appropriate than that under the squared error loss function, not because the former is smaller, but because Stein’s loss function, which penalizes gross overestimation and gross underestimation equally, is more appropriate for a parameter restricted to be positive.

Data Availability Statement

The original contributions presented in the study are included in the article/Supplementary Material; further inquiries can be directed to the corresponding author.

Author Contributions

This work was carried out in collaboration among all authors. Author YYZ wrote the first draft of the article. Author TZR did literature searches and revised the article. Author MML revised the article. All authors read and approved the final article.

Funding

The research was supported by the Ministry of Education (MOE) project of Humanities and Social Sciences on the west and the border area (20XJC910001), the National Social Science Fund of China (21XTJ001), the National Natural Science Foundation of China (12001068; 72071019), and the Fundamental Research Funds for the Central Universities (2020CDJQY-Z001; 2021CDJQY-047).

Conflict of Interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Publisher’s Note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors, and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

Acknowledgments

The authors are extremely grateful to the editor, the guest associate editor, and the reviewers for their insightful comments that led to significant improvement of the article.

Supplementary Material

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fdata.2021.763925/full#supplementary-material

References

Adler, D., and Murdoch, D. (2017). Rgl: 3D Visualization Using OpenGL. R package version 0.98.1.

Berger, J. O., and Bernardo, J. M. (1992). On the Development of the Reference Prior Method. Bayesian Statistics 4. London: Oxford University Press.

Berger, J. O., Bernardo, J. M., and Sun, D. C. (2015). Overall Objective Priors. Bayesian Anal. 10, 189–221. doi:10.1214/14-ba915

Berger, J. O. (2006). The Case for Objective Bayesian Analysis. Bayesian Anal. 1, 385–402. doi:10.1214/06-ba115

Bernardo, J. M. (1979). Reference Posterior Distributions for Bayesian Inference. J. R. Stat. Soc. Ser. B (Methodological) 41, 113–128. doi:10.1111/j.2517-6161.1979.tb01066.x

Bobotas, P., and Kourouklis, S. (2010). On the Estimation of a Normal Precision and a Normal Variance Ratio. Stat. Methodol. 7, 445–463. doi:10.1016/j.stamet.2010.01.001

Casella, G., and Berger, R. L. (2002). Statistical Inference. 2nd edition. USA: Duxbury.

Chen, M. H. (2014). Bayesian Statistics Lecture. Changchun, China: Statistics Graduate Summer School, School of Mathematics and Statistics, Northeast Normal University.

Datta, G. S., and Mukerjee, R. (2004). Probability Matching Priors: Higher Order Asymptotics. New York: Springer.

Ghosh, J. K., Delampady, M., and Samanta, T. (2006). An Introduction to Bayesian Analysis. New York: Springer.

James, W., and Stein, C. (1961). Estimation with Quadratic Loss. Proc. Fourth Berkeley Symp. Math. Stat. Probab. 1, 361–380.

Jeffreys, H. (1961). Theory of Probability. 3rd edition. Oxford: Clarendon Press.

Lehmann, E. L., and Casella, G. (1998). Theory of Point Estimation. 2nd edition. New York: Springer.

Maatta, J. M., and Casella, G. (1990). Developments in Decision-Theoretic Variance Estimation. Stat. Sci. 5, 90–120. doi:10.1214/ss/1177012263

Mao, S. S., and Tang, Y. C. (2012). Bayesian Statistics. 2nd edition. Beijing, China: Statistics Press.

Oono, Y., and Shinozaki, N. (2006). On a Class of Improved Estimators of Variance and Estimation under Order Restriction. J. Stat. Plann. Inference 136, 2584–2605. doi:10.1016/j.jspi.2004.10.023

Petropoulos, C., and Kourouklis, S. (2005). Estimation of a Scale Parameter in Mixture Models with Unknown Location. J. Stat. Plann. Inference 128, 191–218. doi:10.1016/j.jspi.2003.09.028

R Core Team (2021). R: A Language and Environment for Statistical Computing. Vienna, Austria: R Foundation for Statistical Computing.

Robert, C. P. (2007). The Bayesian Choice: From Decision-Theoretic Motivations to Computational Implementation. 2nd paperback edition. New York: Springer.

Ryan, J. A., and Ulrich, J. M. (2017). Quantmod: Quantitative Financial Modelling Framework. R package version 0.4-10.

Sindhu, T. N., and Aslam, M. (2013). Bayesian Estimation on the Proportional Inverse Weibull Distribution under Different Loss Functions. Adv. Agric. Sci. Eng. Res. 3, 641–655.

Sindhu, T. N., Aslam, M., and Hussain, Z. (2016). Bayesian Estimation on the Generalized Logistic Distribution under Left Type-II Censoring. Thailand Statistician 14, 181–195.

Sindhu, T. N., and Aslam, M. (2013). Objective Bayesian Analysis for the Gompertz Distribution under Doudly Type II Cesored Data. Scientific J. Rev. 2, 194–208.

Sindhu, T. N., and Hussain, Z. (2018). Mixture of Two Generalized Inverted Exponential Distributions with Censored Sample: Properties and Estimation. Stat. Applicata-Italian J. Appl. Stat. 30, 373–391.

Sindhu, T. N., Saleem, M., and Aslam, M. (2013). Bayesian Estimation for Topp Leone Distribution under Trimmed Samples. J. Basic Appl. Scientific Res. 3, 347–360.

Sindhu, T. N., Aslam, M., and Hussain, Z. (2016). A Simulation Study of Parameters for the Censored Shifted Gompertz Mixture Distribution: A Bayesian Approach. J. Stat. Manage. Syst. 19, 423–450. doi:10.1080/09720510.2015.1103462

Sindhu, T. N., Feroze, N., and Aslam, M. (2017). A Class of Improved Informative Priors for Bayesian Analysis of Two-Component Mixture of Failure Time Distributions from Doubly Censored Data. J. Stat. Manage. Syst. 20, 871–900. doi:10.1080/09720510.2015.1121597

Sindhu, T. N., Khan, H. M., Hussain, Z., and Al-Zahrani, B. (2018). Bayesian Inference from the Mixture of Half-Normal Distributions under Censoring. J. Natn. Sci. Found. Sri Lanka 46, 587–600. doi:10.4038/jnsfsr.v46i4.8633

Sindhu, T. N., Riaz, M., Aslam, M., and Ahmed, Z. (2016). Bayes Estimation of Gumbel Mixture Models with Industrial Applications. Trans. Inst. Meas. Control. 38, 201–214. doi:10.1177/0142331215578690

Sun, J., Zhang, Y.-Y., and Sun, Y. (2021). The Empirical Bayes Estimators of the Rate Parameter of the Inverse Gamma Distribution with a Conjugate Inverse Gamma Prior under Stein's Loss Function. J. Stat. Comput. Simulation 91, 1504–1523. doi:10.1080/00949655.2020.1858299

Tibshirani, R. (1989). Noninformative Priors for One Parameter of Many. Biometrika 76, 604–608. doi:10.1093/biomet/76.3.604

Varian, H. R. (1975). “A Bayesian Approach to Real Estate Assessment,” in Studies in Bayesian Econometrics and Statistics. Editors S. E. Fienberg, and A. Zellner (Amsterdam: North Holland), 195–208.

Xie, Y.-H., Song, W.-H., Zhou, M.-Q., and Zhang, Y.-Y. (2018). The Bayes Posterior Estimator of the Variance Parameter of the Normal Distribution with a Normal-Inverse-Gamma Prior Under Stein’s Loss. Chin. J. Appl. Probab. Stat. 34, 551–564.

Zellner, A. (1986). Bayesian Estimation and Prediction Using Asymmetric Loss Functions. J. Am. Stat. Assoc. 81, 446–451. doi:10.1080/01621459.1986.10478289

Zhang, Y.-Y. (2017). The Bayes Rule of the Variance Parameter of the Hierarchical Normal and Inverse Gamma Model under Stein's Loss. Commun. Stat. - Theor. Methods 46, 7125–7133. doi:10.1080/03610926.2016.1148733

Zhang, Y.-Y., Wang, Z.-Y., Duan, Z.-M., and Mi, W. (2019). The Empirical Bayes Estimators of the Parameter of the Poisson Distribution with a Conjugate Gamma Prior under Stein's Loss Function. J. Stat. Comput. Simulation 89, 3061–3074. doi:10.1080/00949655.2019.1652606

Zhang, Y.-Y., Xie, Y.-H., Song, W.-H., and Zhou, M.-Q. (2018). Three Strings of Inequalities Among Six Bayes Estimators. Commun. Stat. - Theor. Methods 47, 1953–1961. doi:10.1080/03610926.2017.1335411

Zhang, Y.-Y., Zhou, M.-Q., Xie, Y.-H., and Song, W.-H. (2017). The Bayes Rule of the Parameter in (0,1) under the Power-Log Loss Function with an Application to the Beta-Binomial Model. J. Stat. Comput. Simulation 87, 2724–2737. doi:10.1080/00949655.2017.1343332

Keywords: Bayes estimator, variance and scale parameters, normal model, conjugate and noninformative priors, Stein’s loss

Citation: Zhang Y-Y, Rong T-Z and Li M-M (2022) The Bayes Estimators of the Variance and Scale Parameters of the Normal Model With a Known Mean for the Conjugate and Noninformative Priors Under Stein’s Loss. Front. Big Data 4:763925. doi: 10.3389/fdata.2021.763925

Received: 24 August 2021; Accepted: 01 November 2021;
Published: 03 January 2022.

Edited by:

Niansheng Tang, Yunnan University, China

Reviewed by:

Guikai Hu, East China University of Technology, China
Akio Namba, Kobe University, Japan
Tabassum Sindhu, Quaid-i-Azam University, Pakistan

Copyright © 2022 Zhang, Rong and Li. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Ying-Ying Zhang, cm9iZXJ0emhhbmd5eWluZ0BxcS5jb20=
