- División de Ciencias de la Tierra, Centro de Investigación Científica y Educación Superior de Ensenada (CICESE), Ensenada, Mexico
Kernel Density Estimation is a powerful tool that can be used to extract information about the underlying plumbing system in zones of distributed volcanism. Different approaches concerning the form in which this tool should be applied, however, exist in the literature. One of those approaches holds that an unbiased selection of a parameter known as the bandwidth is preferable to other alternatives because it reduces biases in the analysis. Nevertheless, there are more than 30 different forms in which a bandwidth can be “objectively” selected, which calls into question the meaning of “objectivity” in the selection of a method used for its calculation. Furthermore, as shown in this work, the range of allowed “objective” choices of the bandwidth is not much different from a typical range that could be selected subjectively. Consequently, instead of focusing on the question of “what is the best method?” it is shown here that a more informative approach is to focus on the questions of “what are the special values of different methods, and what are their several advantageous applicabilities?”. The benefits of this shift in approach are illustrated with application to three locations of volcanic interest that have a previously well-constrained volcanic structure.
1 Introduction
Statistics and probability are two separate branches of mathematics useful to analyze the relative frequency of events. Statistics involves the description and analysis of the frequency of past events in an attempt to make sense of observations in the real world, whereas probability deals with predictions concerning the likelihood of future events through the examination of the consequences of mathematical definitions that are issued independently of the real world (Skiena, 2003). When applied to spatial data (i.e., data that can be drawn on a map), statistics includes the generation of summaries describing spatial patterns and can be extended to comparisons between those summaries and expectations raised by theories of how the identified spatial patterns formed (Ripley, 1981). Any attempt to forecast the spatial location and timing of future events should be considered part of a probabilistic approach. If the objects of interest are volcanic vents, studies involving their spatial distribution can therefore have either a statistical or a probabilistic orientation, depending on the emphasis placed on the forecasting component.
There is a vast literature discussing several probability aspects that may produce variable and mutually inconsistent hazard and risk estimates in both seismic and volcanic contexts (e.g., Bernreuter et al., 1988; Marzocchi et al., 2008; Neri et al., 2008; Marzocchi and Bebbington, 2012; Marzocchi and Jordan, 2014; Ake et al., 2018; Bevilacqua et al., 2018; Selva et al., 2019; Marzocchi et al., 2021). Works with a more pronounced statistical approach, however, are not as common (Richardson et al., 2012; Cañón-Tapia, 2014; Cañón-Tapia and Mendoza-Borunda, 2014; Delcamp et al., 2019; Jacobo-Bojórquez and Cañón-Tapia, 2020; Cañón-Tapia, 2021b). It must be remarked that the distinction made here between probabilistic and statistical reports is based on the purpose of the study rather than on the tools used for the presentation of data. This might seem confusing at first because some studies use what would seem to be a statistical tool to assess the hazard of a future eruption (Connor, 1987; Connor and Hill, 1995; Condit and Connor, 1996; Conway et al., 1998; Weller et al., 2006; Jaquet et al., 2008; Connor and Connor, 2009; Kiyosugi et al., 2009; Bebbington and Cronin, 2011; Bebbington, 2015; Connor et al., 2019). Nevertheless, the use of a statistical description within a probabilistic framework is not uncommon, and is sometimes referred to as “statistical inference” (DeGroot and Schervish, 2012).
The distinction between the probabilistic and statistical approaches outlined above is important for several reasons, particularly when examining the literature aimed at characterizing the spatial distribution of volcanic vents. Studies of vent distribution in volcanic fields aim to increase our understanding of a wide variety of aspects. To mention but a few, those aspects include the relationship between polygenetic and monogenetic edifices, geochemical complexities that might arise within a field, the distinction between volcanic events and volcanic edifices, the role played by temporal gaps in volcanic activity, and even issues related to the very definition of what a volcanic field is. Readers interested in those particular issues are referred to the works by Valentine and Gregg (2008), Szakács and Cañón-Tapia (2010), Kereszturi and Németh (2013), Németh and Kereszturi (2015), Cañón-Tapia (2016), and references therein. Underlying all of those issues at the most general level, the statistical approach aims to present a description of the data (e.g., already existing and identifiable volcanic vents) in such a form that some hypothesis concerning the physical structure leading to the observed distribution can be formulated. In contrast, the probabilistic approach focuses on the identification of the site and time with the largest likelihood for the occurrence of a future event. At a more subtle level, the assumptions made by a statistically oriented interpretation also differ markedly from those made by a probabilistically oriented one. The statistical interpretation should include assessment of the physical conditions favoring the formation of two or more vents during a single eruption, the orientation of tabular conduits that transport magma to the surface and their relationship with the prevailing stress orientation at the time, the possible shift of location of magma reservoirs, etc.
A recent review of these topics, with relevant references, is provided by Cañón-Tapia (2021a) and Rivalta et al. (2019). In contrast, a probabilistic or statistical-inference interpretation includes assumptions concerning the type of underlying distribution (Uniform, Clustered, Poisson, etc.), but most importantly it makes assumptions about the equality of likelihood of each outcome (past, present and future) and the continuity of the preferred mathematical distribution both in space and time to validate the interpretations of analysis with respect to a single underlying probabilistic model. The most basic underlying assumptions, however, are not explicitly discussed in many works. Consequently, it is not surprising that although much effort has been made to characterize vent distribution, it has not been possible to reach a consensus about the methodological approach that guarantees extraction of the largest amount of reliable information from a given set of vents, or even about the type of information that can be obtained under all circumstances.
Failure to understand the differences between various methodological approaches might have a negative influence on the scientific and technical acceptability of models, which is ultimately detrimental to the advance of scientific knowledge in general. From a practical point of view, this failure might result in the rejection of works submitted to scientific journals on the grounds of alleged errors, ignorance about the method, or indulgence in bespoke methods that produce what the author wants to see. Such statements are difficult to discuss in a formal context because most journals nowadays do not accept the inclusion of references to personal communications or to unpublished works. Nevertheless, those comments play an important role in shaping the evolution of scientific knowledge because uncritical acceptance of such statements by some editors leads to rejections of works that are based on alternative methodologies and eliminates the possibility of having a truly open discussion on those subjects. In the end, acceptance of those criticisms followed by rejection of the papers that present alternative methodologies results in the propagation of unilateral points of view that may not have factual support in every circumstance. Although such extreme situations are not the rule across the scientific literature, it is in the best interest of science to reduce the occurrence of such events. This goal can be reached by keeping an open mind capable of working with multiple hypotheses and multiple methodologies to enrich the outcome of scientific enquiry in the most general terms. In addition, it is important to help the geoscience community grasp the fundamental aspects of different families of methods developed within theoretical statistics, providing landmarks that can be used to assess the possible merits and faults of various works beyond personal opinions.
In this work I focus attention on several aspects of Kernel Density Estimation (KDE) in a volcanic context. Among others, KDE has been used for forecasting purposes in seismology by Frankel (1995) and Hiemer et al. (2014) and in a volcanic context by Connor et al. (2019). Nevertheless, aspects of KDE that have not been treated in detail include: a) the definition of the function used to estimate the probability density of the data, b) fulfillment in the volcanic context (i.e., the physical world) of the conditions imposed when establishing that function, and c) the possible interpretations of that function in connection with the physical world. Although those issues have been discussed in the natural hazard literature (see references listed above), it would seem that their most fundamental aspects have not been assimilated by the community at large. Thus, in order to make all of this information accessible to the widest possible audience, the presentation in this work leans towards a more colloquial style. In addition, because the intention of this work is not to single out any particular work that was produced by adopting approaches other than those described below (that judgement is better left to each reader), there are sections of the text that are presented without providing specific references, although the most relevant references are provided en masse. Despite these characteristics, by clarifying here many aspects of KDE that have not been addressed before in the context of the spatial distribution of volcanic vents, it is hoped that the community will have more elements than currently available to make better judgements concerning the conclusions that have been reached whenever this type of methodology has been employed.
More importantly, by facilitating comprehension of the several aspects of this method discussed below, it is hoped that not only will others be encouraged to use this powerful tool, but also that a wider diversity of hypotheses can be generated and tested, ultimately contributing to a better understanding of volcanic systems in general.
2 From KDE Theory to Volcanic Reality
This section presents arguments showing that the most common situations of volcanic interest are likely to involve observations that may not satisfy the assumptions for which KDE was conceived. Aspects examined include the definition of KDE, its interpretation in the context of the physical origin of volcanic vents, and the contrast between the idealized theoretical scenario and the conditions imposed by the real world in volcanic contexts.
2.1 What is Kernel Density Estimation?
KDE encompasses a variety of non-parametric procedures to estimate the probability density function (PDF) associated with a random variable. Despite its name, the PDF is a fundamental concept in statistics as defined above (i.e., with no intention of producing a forecast of the likelihood of future events). Nevertheless, the normalization of the area beneath that function, inherent to its definition, facilitates its use in statistical-inference realms. In any case, it is important to remark that KDE theory was developed with a statistical context in mind.
Specifically, KDE was developed as a tool for the informal investigation of properties such as skewness and multimodality of the data (Silverman, 1986). As first introduced by Rosenblatt (1956), given a set of N independent observations (x1, x2, x3, … xN), all of which are associated with the same PDF, it is postulated that each of those observations contributes to the definition of a common distribution which can have multiple modes. Thus, in the statistical literature it is common to read that a non-parametric estimator of the PDF can be obtained by evaluating:
at every appropriate value of x. Ironically, the “non-parametric” definition of
Having established the intended meaning of the “non-parametric” adjective when applied to equation 1, it is important to also note that the specific form of K( ) may lead to some differences in the results. The influence of this selection, however, is not as large as the influence exerted by the parameter h (Cañón-Tapia, 2013). Therefore, selection of K( ) will not be discussed further in this paper. Instead, attention will be focused on the role played by the parameter h.
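The relative roles of K( ) and h can be illustrated with a minimal sketch. It assumes the univariate form of the estimator commonly given in the statistical literature (the sum of K((x - x_i)/h) over the N observations, divided by N h) together with a Gaussian K( ). The function names and the positions used are hypothetical and are not data from this work.

```python
import math


def gaussian_kernel(u):
    """Standard normal density, used here as the kernel K( )."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def kde(x, observations, h):
    """Kernel estimate at x: (1 / (N h)) * sum_i K((x - x_i) / h).

    This is the commonly quoted univariate form; it is an assumption
    of this sketch, not a transcription of equation 1.
    """
    n = len(observations)
    return sum(gaussian_kernel((x - xi) / h) for xi in observations) / (n * h)


# Hypothetical one-dimensional positions (km), for illustration only.
positions = [1.0, 1.4, 2.1, 6.0, 6.3]

# The same observations give different density values at the same point
# when only the bandwidth h changes.
for h in (0.5, 2.0):
    print(h, kde(2.0, positions, h))
```

In this sketch the kernel is fixed, so any difference between the two printed values is due to h alone.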
Selection of a suitable value (or values) of the bandwidth has remained a very controversial subject even within the community specialized in the field of statistics. This topic is treated in more detail in Section 3. Before that, it is important to examine other aspects of KDE that have been the source of some confusion, or that have elicited differences of opinion in volcanic contexts.
2.2 How Suitable are Volcanic Variables to be Analyzed With a Kernel Estimator?
Until now, KDE has been used to characterize vent distributions in zones of distributed volcanism (e.g., Connor, 1987; Connor and Hill, 1995; Lutz and Gutmann, 1995; Condit and Connor, 1996; Conway et al., 1998; Weller et al., 2006; Jaquet et al., 2008; Connor and Connor, 2009; Kiyosugi et al., 2009; Srisutthiyakorn et al., 2010; Bebbington and Cronin, 2011; Richardson et al., 2012; Rose et al., 2013; Cañón-Tapia, 2014; Cañón-Tapia and Mendoza-Borunda, 2014; Bebbington, 2015; Connor et al., 2019; Delcamp et al., 2019; Jacobo-Bojórquez and Cañón-Tapia, 2020; Cañón-Tapia, 2021b; Cañón-Tapia, 2021c). In addition, KDE has also been used to estimate rock/mineral compositions and ages of activity (Bevilacqua et al., 2018; Champion et al., 2018; Marzoli et al., 2018; Stock et al., 2018). Despite this large list of works, there are several aspects embedded in the definition of a kernel that need to be assessed in the specific situation of vent distribution. Five of those aspects are discussed in this section.
2.2.1 Interpretation of the PDF in the Context of the Physical Origin of Vent Distribution
First, it must be noted that the definition of Eq. 1 is valid for cases in which the N observations are all associated with the same PDF that will be estimated. While this is relatively easy to constrain in the context of statistical proofs, where the procedure usually is such that the real PDF is known and the set (or sets) of N observations are drawn from that PDF to test the goodness of the estimator
Returning to the situation presented by volcanic vents, it might be justified to consider that N vents were produced by the same zone of magma storage as indicated by Figure 1A. However, by only looking at the position of vents at the surface it might not be straightforward to say if the observed vent distribution depicts a situation like that shown in Figure 1A (a unique and simple zone of magma storage), Figure 1B (two independent systems overlapping with each other), or Figure 1C (random fluctuations of a unique system that extends deeper beneath the surface). In the first scenario (only one zone of magma storage), the location and shape of the distribution might be controlled by the covering of older vents by younger products, or by the lack of enough time to have an adequate statistical sample of the distribution. In the second scenario (two independent, yet overlapping systems) we need to separate the N1 observations of system A from the N2 observations of system B, and only then is it justified to estimate the two independent PDFs. In the third scenario (one vertically extensive system with random fluctuations induced by reservoirs at shallower depths) it is justified to consider a mixture of vents emanating from any of the shallow zones of magma storage as part of the same set of N observations, and therefore to calculate a single PDF that should be related to the whole system that includes the different levels of magma storage at different depths. It is equally well justified, however, to attempt to isolate at least the most important intermediate-depth reservoirs to gain some insights into the complexity of the volcanic system as a whole. In some circumstances we may be fortunate to have enough information about the composition of all the vents in the region, so that it is possible to be certain which of the three described scenarios is more likely.
In most cases of volcanic interest, however, we may not have enough information justifying such a neat separation of vents, and therefore we need to make the analysis considering that there are at least three alternative scenarios that deserve to be evaluated.
![www.frontiersin.org](https://www.frontiersin.org/files/Articles/779095/feart-10-779095-HTML-r1/image_m/feart-10-779095-g001.jpg)
FIGURE 1. Cartoons of three possible scenarios in a region of distributed volcanism and their associated probability density distribution (PDF). In all three scenarios the upper part shows alternative PDFs (solid, dashed or grey lines) and the lower diagram shows vents (triangles), zones of magma storage (ellipses) and conduits allowing the vertical transport of magma (lines). The PDFs could be normalized or not, so the vertical axes are left without a label to accommodate both alternatives. (A) All vents are related to one zone of magma storage. Kurtosis and skewness of the distribution (solid line) might be related to inadequate sampling (older vents covered by younger products or immature field not having time to erupt a large enough number of vents). (B) Two overlapping zones of magma storage. Each zone of magma storage produces its own unimodal PDF (solid lines), but they overlap in the central parts of the figure. The shallow zone of magma storage has a PDF with a negative kurtosis and a positive skew (black line), whereas that of the deeper zone tends more to normality, with a slight positive kurtosis (grey line). The dashed line shows a possible combined PDF. (C) A vertically extended zone of magma storage with multiple shallow subsystems. It is uncertain which subsystems have been sufficiently sampled to yield their own PDFs with well-defined characteristics (dashed lines), and which have not. The overall tendency is towards a unimodal PDF (solid line) that reflects the influence of the deeper zone of magma storage, albeit with some noise introduced by the shallower features.
Time is another variable that needs to be taken into consideration. Its influence as an independent variable is similar to that of the composition of the erupted products. Consequently, in the absence of enough time-related information to justify separating the observations into independent sets, the analysis of the spatial distribution of vents needs to be completed taking into consideration the possible presence of different PDFs within the set of observations. As a result, due to the unknown structure of the buried parts of a volcanic system, assessing the real number of modes that are significant in one PDF obtained from the location of vents is not a trivial matter. Nevertheless, a thorough analysis must remain open to assessing several alternative possibilities.
2.2.2 Density, Intensity and Normalized Values of
A second aspect to be considered from equation 1 is that each
Thus, to avoid the annoying fact that different values of the vent density can be calculated at a given point, depending on the value of h used,
2.2.3 Influence of a Small Number of Observations
A third aspect to be considered from Eq. 1 is that even when the definition of
Thus, not only might the estimated PDF be a poor representation of the real PDF regardless of the method used to select the bandwidth (due to a small number of related observations, or to a biased sampling, for example), but the difficulty of making reliable tests with alternative sets of observations is also increased because each subset might be even less able to capture the complexity of the real PDF. This limitation is inherent to the nature of volcanic activity and cannot be overcome by a bootstrap approach, because if a true representation of the real PDF is not reached by a small number of observations, it will not be reached by any of the resampled sets either. Actually, use of any resampling method might be misleading in those cases because it would provide a false sense of objectivity that is not justified at all, while simultaneously it might overemphasize some bias in the sampling that could have been present in the original set.
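The limitation of resampling can be seen in a minimal sketch of a plain nonparametric bootstrap (resampling with replacement), which is only one of several resampling schemes and is assumed here for illustration. The positions and function name are hypothetical. However many resampled sets are drawn, they contain only positions that were already present in the original set, so an unsampled part of the real PDF remains unsampled.

```python
import random


def bootstrap_resample(observations, rng):
    """Draw len(observations) values from the observations, with replacement."""
    return [rng.choice(observations) for _ in observations]


# Hypothetical one-dimensional vent positions (km), for illustration only.
observed = [1.0, 1.4, 2.1, 6.0, 6.3]

rng = random.Random(0)  # fixed seed so the sketch is repeatable
resamples = [bootstrap_resample(observed, rng) for _ in range(1000)]

# Positions that appear in some resample but not in the original set.
new_positions = {x for sample in resamples for x in sample} - set(observed)
print(len(new_positions))  # prints 0
```

The resampled sets differ from each other only in how often each original position is repeated, which is why a sampling bias present in the original set can be reinforced rather than removed.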
2.2.4 Independence of the Observations
A fourth aspect that needs to be taken into consideration is that Eq. 1 was defined for the case of a set of observations that are independent from each other. In most situations of volcanic interest the independence of two vents may not be entirely ensured. Actually, in many cases two or more vents could have been produced by the same eruption and be associated with a common dyke at depth. In those cases, the two vents are not independent observations, and therefore do not satisfy the requirements imposed on the set from which
2.2.5 Completeness of the Record
Finally, a fifth aspect worth mentioning is that obliteration of vents by the most recent eruptions (whether by covering of vents by the new products or by destruction of those vents during the youngest event), as well as the possible occurrence of eruptions that do not leave a clear indication of the vent through which the products were erupted (for example, fissure eruptions that leave feeble traces that are easily eroded or covered by more recent events), also contribute to separating the characteristics of the available observations from those assumed to characterize the set of N observations from which
2.2.6 Implications of the Disparity Between Assumptions and Observations
The arguments presented in this section show that the most common situations of volcanic interest are likely to involve observations that may not satisfy the assumptions for which KDE was conceived. Under those circumstances it is worth asking whether it is appropriate to complete any analysis of vent distribution using this method. An answer to this question is better postponed until the role played by the smoothing factor has been discussed, and a few illustrative examples have been examined.
3 Estimators of the Bandwidth
After deciding which kernel function K( ) will be used to calculate
Thus, to facilitate the discussion, hereafter all calculations will be made with reference to a univariate Gaussian (or normal) kernel. Upon this selection of K( ), Eq. 1 takes the form:
It must be noted that in Equation 2,
It is remarked that even when the observations and evaluation points are represented as points in a plane, equation 2 is not strictly bivariate. Indeed, it requires only one variable (the distance between two points) for the calculation of
In those situations in which the real PDF from which the N observations have been drawn is known (which, we should recall, has been referred to as F), it is reasonable to determine which of the various estimators of
Many comparisons of bandwidth selectors (hereafter referred to only as selectors) have been made (Heidenreich et al., 2013; Schindler, 2011; Turlach, 1993 and references therein). Invariably those works have concluded that it has not been possible to find one selector that uniformly performs better than all alternatives when a variety of complex Fs are considered. In practice this indicates that the selector adopted in one situation may or may not have been the best choice, depending on the real F that was under study. In other words, the only way to be certain that the selector is the most adequate for the F that needs to be characterized is to know F before making that study. Thus, from a pragmatic point of view it is worth asking: if F is already known, therefore allowing us to calculate very precisely the error of a particular
Evidently, it can be argued that any unbiased estimation of h is preferable over a subjective selection of a single h because it should lead to a more robust (stable, reproducible) estimation of
To avoid discussions concerning how close a given
An indication that the method used to select an “objective” h exerts a large influence on the calculated
![www.frontiersin.org](https://www.frontiersin.org/files/Articles/779095/feart-10-779095-HTML-r1/image_m/feart-10-779095-g002.jpg)
FIGURE 2. Six examples of PDFs calculated with different selectors as indicated on top of each panel. The panels were selected among the examples presented in Figure 2 and Figure 3 of Duong (2007).
The variability of possible results shown in Figure 2, all of which can be claimed to have been calculated in an “unbiased and objective form”, indicates that deciding which selector is used turns out to be a subjective choice that includes several assumptions concerning the form in which errors should be handled (MISE, SAMSE, etc.), and even about the specific form of the distribution that underlies the observed group of vents. Thus, whether knowingly or not, the choice of a selector for the bandwidth, and consequently the resulting selection of just one h, are prone to bias. The cumbersome procedure followed and the involvement of a measure of error, however, contribute to disguising such subjective decisions behind an apparently objective choice.
Another important aspect of the results shown in Figure 2 is that there is no way in which an outside observer could decide with 100% certainty which
The exploratory approach has been deemed very dated, with the implication that it is undesirable because of its propensity to be affected by a subjective bias (Connor et al., 2019). Nevertheless, as discussed above, due to the diversity of alternative methods for the selection of h, and due to the striking differences that can be obtained from the adoption of even apparently similar methods for that “unbiased” selection, it cannot be asserted that use of a complicated selector is entirely devoid of biases. All things considered, perhaps the only difference is that when the exploratory approach is adopted the subjective choices are in the foreground, whereas when an arbitrary selector is favoured the subjective choices are hidden behind the expectations (conscious or subconscious) of the analyst. Consequently, given that subjective decisions are always involved, whether a value of h is selected directly or through the selection of a method to obtain a supposedly “optimal” h, the merits of the exploratory approach should not be dismissed without further examination.
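One simple way to put the subjective choices in the foreground is to evaluate the estimate over a range of h and report how its structure changes. The following minimal sketch does so by counting the local maxima of a univariate Gaussian kernel estimate on a grid. The estimator form, the function names, the grid, the bandwidths and the positions are all assumptions made for illustration, not the procedure followed in this work.

```python
import math


def kde(x, observations, h):
    """Univariate Gaussian kernel estimate at x (assumed standard form)."""
    n = len(observations)
    norm = n * h * math.sqrt(2.0 * math.pi)
    return sum(math.exp(-0.5 * ((x - xi) / h) ** 2) for xi in observations) / norm


def count_modes(observations, h, grid):
    """Count local maxima of the estimate evaluated on an ordered grid."""
    values = [kde(x, observations, h) for x in grid]
    return sum(
        1
        for i in range(1, len(values) - 1)
        if values[i - 1] < values[i] >= values[i + 1]
    )


# Hypothetical one-dimensional positions (km), for illustration only.
positions = [1.0, 1.4, 2.1, 6.0, 6.3]
grid = [i * 0.05 for i in range(-100, 301)]  # from -5 km to 15 km

# Scan several bandwidths instead of committing to a single "optimal" h.
for h in (0.1, 0.5, 1.0, 10.0):
    print(h, count_modes(positions, h, grid))
```

Reporting the whole scan, rather than a single value, leaves the reader free to judge which features persist over a range of h and which appear only for particular choices.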
In summary, in this section it has been shown that unless the real distribution from which the vents have been drawn, F, is known with 100% accuracy, it is not possible to decide in an unbiased form which value of h needs to be chosen to minimize the error between the estimated
4 Calibrating the Method in Volcanic Contexts
The theory discussed in the previous section is applied to a few case studies in this section. Thus, in this Section I present the results of the various
Although a large number of selectors have been advanced in the literature, it is beyond the purpose of this work to make an exhaustive comparison of all of them. Furthermore, as many of those selectors might require access to very specialized programs or to algorithms that might not be easily accessible for the bulk of the geoscience community, the comparison of selectors presented below was restricted to only five. The five selectors included are those available in the generic function “density” of the R environment. These options include nrd0 and nrd, which are the rule of thumb proposed by Silverman (1986) and its variation as proposed by Scott (1992), respectively; ucv and bcv, which are the unbiased and biased cross-validation methods; and sj, which corresponds to the selector presented by Sheather and Jones (1991). Despite the limited number of selectors presented here, the results serve to illustrate the variability that might characterize the use of different selectors, and therefore suffice to illustrate the main aspect highlighted in the discussion.
All the vent locations used are in Latitude-Longitude pairs, but these were not inserted directly into the functions available in R, which are designed to deal with Cartesian coordinates on a plane. Thus, to avoid issues related to the distortion associated with the shape of the Earth, Latitude-Longitude pairs were converted to UTM before using the density function in R. The resulting hs were then converted to km to produce the diagrams shown in the following sections.
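For readers who do not use R, the two rule-of-thumb selectors can be sketched in a few lines. The formulas below follow what the R documentation for bw.nrd0 and bw.nrd describes, to the best of my understanding: 0.9 or 1.06 times the smaller of the standard deviation and the interquartile range divided by 1.34, times N to the power -1/5. They should be read as an approximation of those functions rather than a verified port, and the ucv, bcv and sj selectors are not reproduced. The helper name and the sample values are hypothetical.

```python
import statistics


def _spread(x):
    """Return the sample standard deviation and the IQR scaled by 1.34."""
    sd = statistics.stdev(x)
    # method="inclusive" interpolates quartiles in the same way as the
    # default quantile type in R (an assumption of this sketch).
    q1, _, q3 = statistics.quantiles(x, n=4, method="inclusive")
    return sd, (q3 - q1) / 1.34


def bw_nrd0(x):
    """Rule of thumb attributed to Silverman (1986)."""
    sd, scaled_iqr = _spread(x)
    # Fall back to other scales if the smaller spread measure is zero.
    lo = min(sd, scaled_iqr) or sd or abs(x[0]) or 1.0
    return 0.9 * lo * len(x) ** -0.2


def bw_nrd(x):
    """Variation attributed to Scott (1992)."""
    sd, scaled_iqr = _spread(x)
    return 1.06 * min(sd, scaled_iqr) * len(x) ** -0.2


# Hypothetical one-dimensional positions (km), for illustration only.
positions = [1.0, 1.4, 2.1, 6.0, 6.3]
print(bw_nrd0(positions), bw_nrd(positions))
```

Both rules use the same measure of spread and differ only in the leading constant, so for most data sets the two bandwidths differ by a fixed ratio of 1.06/0.9.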
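The reason why Latitude-Longitude pairs should not be used directly can be seen with a minimal sketch. It is not the UTM conversion used in this work: it substitutes a simple local equirectangular approximation on a spherical Earth, and the mean radius, the function name and the reference coordinates (chosen to lie roughly at the latitude of the Big Island) are assumptions made for illustration.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius, spherical approximation


def to_local_km(lat, lon, lat0, lon0):
    """Approximate east and north offsets (km) from a reference point.

    Local equirectangular approximation; a stand-in for a proper
    projection such as UTM, adequate only over short distances.
    """
    east = EARTH_RADIUS_KM * math.radians(lon - lon0) * math.cos(math.radians(lat0))
    north = EARTH_RADIUS_KM * math.radians(lat - lat0)
    return east, north


# Hypothetical reference point, for illustration only.
lat0, lon0 = 19.8, -155.5

# The same offset of 0.1 degrees spans different distances on the ground
# depending on whether it is measured along longitude or latitude.
east, _ = to_local_km(lat0, lon0 + 0.1, lat0, lon0)
_, north = to_local_km(lat0 + 0.1, lon0, lat0, lon0)
print(east, north)
```

A bandwidth expressed in degrees would therefore smooth over a shorter distance in the east-west direction than in the north-south direction, which is avoided by working in projected coordinates.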
4.1 Mauna Kea, Hawaii
Mauna Kea volcano is one of the five shields that form the Big Island of Hawaii. Its development has followed the same general trends as other Hawaiian large shields, but Mauna Kea is characterized by the presence of a large number of cinder cones along its slopes and close to the summit. Porter (1972) and Wolfe et al. (1997) described the distribution of cinder cones atop Mauna Kea volcano, grouping them in three rift zones, each separated from their neighbors approximately by 120° of arc. In addition to those three rifts, Porter (1972) also mentioned a summit group, and other arcuate or aligned groups randomly located on the slopes of the volcano at various altitudes. According to Porter, some of these linear groups were almost perpendicular to the radial southeast rift, and therefore contributed to diffuse the boundaries of the radial zone, especially in the East. In other words, based on the geology of Mauna Kea volcano it is clear that not all the rift zones are equally developed. The radial East Rift is the best defined of the three near the summit, whereas that on the northeast is the least developed. Still, even the Northeast Rift zone gives a sense of elongation radial to the summit. Most complications are found on the Southeast Rift zone, closer to the base of the larger edifice.
Most of the cones on top of Mauna Kea belong to the Laupahoehoe stratigraphic group, and therefore can be considered to represent examples of coeval activity. Although no claim is made in this work in the sense that all the cones were produced during the same eruption, the relative similarity in age of the cones at least justifies an interpretation in the sense that the ambient regional-tectonic stress should have remained relatively constant. In any case, a distinction based on general composition or age is not granted.
Thus, the things that can be expected to be revealed by an analysis of vent distribution on top of Mauna Kea volcano include: 1) Definition of the three main rifts and the summit groups, 2) Definition or hints about other arcuate groups, and 3) evidence suggesting that all of these vents belong, or at least can be related to one larger system that feeds a central conduit as well as the flank eruptions that formed the three main rifts.
The location of eruptive centers on top and around Mauna Kea were obtained from GoogleEarth, and saved as a MATLAB file, together with a hill-shadow image and georeferenced cell. The image was produced by using a section of the SRTM 1arc_v3 geotif image downloaded from the USGS EarthExplorer server that includes the topography of most of the Big Island. A total of 242 vents were used on the analyses.
The values of h yielded by the various selectors range from 0.9 to 2.5 km (Table 1). As shown in Figure 3, the largest “optimal” h, obtained with the bcv method (Figure 3E), highlights the presence of one large system that has two main subsystems, and probably a third, small one in the southwest corner of the figure. Two of the rift zones can be suspected from the position and shape of the two main subclusters, but the definition of a third rift is not entirely clear. Also, the summit group is not visible at all. In contrast, the smallest “optimal” h, obtained with the ucv method (Figure 3D), allows an immediate identification of the three rift zones and of the summit group. In addition, several groups located mainly on the outskirts of the main edifice are also visible, and the character of each of the three zones is easy to infer. Nevertheless, this small h does not convey very effectively the sense that all of the various clusters are related to one larger system. In particular, isolated clusters of low probability density on the outskirts of the main edifice fail to be integrated into one larger system. Of the intermediate values of h, the one obtained with sj (Figure 3F) leads to an
![www.frontiersin.org](https://www.frontiersin.org/files/Articles/779095/feart-10-779095-HTML-r1/image_m/feart-10-779095-t001.jpg)
TABLE 1. Optimal bandwidths (in km) selected with five different methods for three areas of distributed volcanism.
![www.frontiersin.org](https://www.frontiersin.org/files/Articles/779095/feart-10-779095-HTML-r1/image_m/feart-10-779095-g003.jpg)
FIGURE 3. Estimated PDFs using the selector and corresponding bandwidth as indicated on top of each panel. The first panel (A) shows the location of the vents used for the calculations. Mauna Kea example.
In summary, the ucv method (Figure 3D) provides the most complete information in this case, but even so, it fails to convey the idea of a unique larger system. All the other options convey that message of a unique system but lack enough resolution to allow clear identification of its main features. Notably, none of the calculated values of h leads to an erroneous interpretation.
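The trade-off described above, where a small h resolves individual clusters and a large h merges them into one system, can be reproduced with a minimal sketch. The code below assumes an isotropic bivariate Gaussian kernel with a single scalar bandwidth h, which may differ from the kernel and bandwidth parameterization used for Figure 3. The vent locations are synthetic (two artificial clusters), and the helper names `kde_grid` and `count_modes` are hypothetical.

```python
import numpy as np


def kde_grid(points, h, gx, gy):
    """Isotropic bivariate Gaussian KDE evaluated on a regular grid."""
    xx, yy = np.meshgrid(gx, gy)                  # shape (ny, nx)
    dx = xx[..., None] - points[:, 0]             # shape (ny, nx, n)
    dy = yy[..., None] - points[:, 1]
    kernel = np.exp(-(dx**2 + dy**2) / (2.0 * h**2))
    return kernel.sum(axis=-1) / (len(points) * 2.0 * np.pi * h**2)


def count_modes(density):
    """Count interior grid cells strictly higher than their 8 neighbours."""
    ny, nx = density.shape
    core = density[1:-1, 1:-1]
    is_peak = np.ones(core.shape, dtype=bool)
    for di in (-1, 0, 1):
        for dj in (-1, 0, 1):
            if di == 0 and dj == 0:
                continue
            neighbour = density[1 + di:ny - 1 + di, 1 + dj:nx - 1 + dj]
            is_peak &= core > neighbour
    return int(is_peak.sum())


# Synthetic "vents": two clusters 8 km apart (illustration only, not real data)
rng = np.random.default_rng(0)
vents = np.vstack([rng.normal([0.0, 0.0], 1.0, size=(60, 2)),
                   rng.normal([8.0, 0.0], 1.0, size=(60, 2))])

step = 0.25                                       # grid spacing in km
gx = np.arange(-6.0, 14.0 + step, step)
gy = np.arange(-6.0, 6.0 + step, step)

bandwidths = [0.3, 1.0, 2.5, 6.0]                 # km, chosen arbitrarily
modes = {h: count_modes(kde_grid(vents, h, gx, gy)) for h in bandwidths}
print(modes)
```

Counting grid maxima is a crude summary of each PDF; it is used here only to show how the number of apparent clusters changes with h, not as a substitute for inspecting the maps themselves.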
4.2 San Rafael Volcanic Field, Colorado Plateau
The San Rafael region, Colorado Plateau (western United States), provides a good example of a zone of distributed volcanism where dykes are not associated with a single, central conduit, and where the spatial relationship between dykes and volcanoes can be observed directly at the surface due to the effects of erosion. In this work I used the locations of 62 conduits identified by Kiyosugi et al. (2012). Based on the measurements made in the field by those authors, it is well documented that the dykes have a preferred orientation along a NW-SE direction. Also, based on the described field relations it is clear that many vents have a direct relationship with the larger sills identified in the field. Thus, an analysis of vent distribution in the San Rafael area can be expected to reveal: 1) the most prevalent association that exists between vents and the larger sills, and 2) the regional orientation of stress as indicated by the orientation of dykes and/or sills.
The values of h yielded by the various selectors range from 0.7 to 4.7 km. As shown in Figure 4, the largest “optimal” h, again obtained with the bcv method (Figure 4E), suggests the presence of one roughly elliptical zone of influence with a general NE-SW orientation and two subclusters that loosely coincide in location with the foci of the rough ellipse. In contrast, the smallest “optimal” h, again obtained with the ucv method (Figure 4D), yields a large number of local maxima that correspond to independent clusters of small dimensions, some of which actually enclose only one or two of the individual volcanoes. Two of the remaining
![www.frontiersin.org](https://www.frontiersin.org/files/Articles/779095/feart-10-779095-HTML-r1/image_m/feart-10-779095-g004.jpg)
FIGURE 4. Estimated PDFs using the selector and corresponding bandwidth as indicated on top of each panel. The first panel (A) shows the location of the vents used for the calculations (red triangles); the dykes and sills identified by Kiyosugi et al. (2012) are indicated with the red lines. San Rafael example.
4.3 Washington Cascades
The segment of the Cascades located between Mount Rainier and Mount Hood includes a large number of volcanic vents, some of which are large stratovolcanoes and some of which are small, monogenetic cinder cones. A general description of the area is provided by Hildreth (2007), and details of each of the groupings identified can be found in the references therein. From a tectonic point of view, volcanism can be divided into three main groups. These include the main axial belt (from Mount Hood to Bumping Lake), a fore-arc section and a back-arc section. The axial belt may be subdivided into three or four sections, each dominated by a larger structure that also comprises several smaller vents. The larger structures in this belt are Mount Hood to the south, Mount Adams in the middle and Goat Rocks to the north. A less prominent structure is Jennies Butte, located between Mt Adams and Goat Rocks, slightly off axis towards the back-arc region. The fore-arc section can also be divided into four to seven groups, depending on the emphasis made on the distribution. The main groups in the fore-arc are Mt Rainier to the north, Mt St Helens, Indian Heaven and a zone of diffuse volcanism. The zone of diffuse volcanism itself can be considered to be formed by three distinctive groups (Portland, Wind River and Blue Lake). Finally, the back-arc region is composed of vents of the Simcoe Mountains that may have a north-south division based on the age of the vents. The identification of groups in this area proved to be elusive when out-of-the-box clustering methods were employed in isolation. Nevertheless, when the results of several of those methods were combined to define common group associations, the main geological divisions could be identified (Cañón-Tapia, 2020).
An analysis of vent distribution in this area can be expected to reveal: 1) the presence of the most important groups of vents, 2) the north-south orientation and/or distribution of some of those groups, especially in the arc section, and 3) the different character of the various groups that can be identified. The location of the vents used in this case is the same as that used in a previous report (Cañón-Tapia, 2020).
The values of h yielded by the various selectors range from 5.7 to 9.5 km. As shown in Figure 5, the largest “optimal” h, obtained with the nrd method (Figure 5C), suggests the presence of five clusters, two of which are much more prominent than the other three. The two prominent groups have different orientations, and those two orientations are also reflected in the independent orientations of the less prominent groups. The location of the groups roughly coincides with the location of five of the known groups: three in the center of the map (Mt St Helens, a combination of Indian Heaven-Mt Adams, and Simcoe Mountains), and two in the south (Portland area and Mount Hood). All of these groups are still visible, and actually better defined, on the
![www.frontiersin.org](https://www.frontiersin.org/files/Articles/779095/feart-10-779095-HTML-r1/image_m/feart-10-779095-g005.jpg)
FIGURE 5. Estimated PDFs using the selector and corresponding bandwidth as indicated on top of each panel. The first panel (A) shows the location of the vents used for the calculations (red triangles). Washington Cascades example.
None of the diagrams provides enough information to suggest a possible east-west distinction between fore-arc, arc and back-arc settings. Also, none of the diagrams shows all the main groups with clarity. Furthermore, several diagrams provide somewhat misleading information about the expected groups, especially when the fore-arc is fused into the same group as the arc volcanoes.
5 Discussion
5.1 Is There a Better Bandwidth Selector?
As has been the case when different selectors are compared to each other within the parameters of statistical theory, none of the selectors examined above yields consistently better results than all its competitors. For the case of Mauna Kea the most complete image of the volcanic system was conveyed by the ucv selector, whereas in the San Rafael case the least ambiguous image was provided by the sj selector. In the case of the Washington Cascades none of the selectors did a good job of capturing the complexities of the region. Numerically, the range of bandwidths was ample in all three examples, and none of the selectors systematically yields the smallest or largest value. Perhaps more importantly, while in the Mauna Kea case all the selectors convey an incomplete but correct image of the volcanic system, some of the selectors lead to wrong conclusions in the San Rafael example. Also, ambiguous information is conveyed for the case of the Washington Cascades. Thus, albeit few in number, these three examples indicate that there is no single best selector that can be used in each and every case of volcanic interest.
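To illustrate why different “objective” selectors return different bandwidths for the same data, the sketch below implements three of them for a one-dimensional sample. The selector names used in this work resemble those of the `bw.nrd0`, `bw.nrd` and `bw.ucv` functions of R's stats package; that correspondence is an assumption. The two rules of thumb follow the formulas documented for those R functions, whereas `bw_ucv` is a brute-force minimization of the least-squares cross-validation score for a Gaussian kernel and is not expected to reproduce R's binned implementation exactly. How one-dimensional selectors were combined for the two-dimensional vent data is not specified in this section, so the sketch works on a single coordinate of synthetic data.

```python
import numpy as np


def _spread(x):
    """min(standard deviation, IQR / 1.34); assumes the result is non-zero."""
    q25, q75 = np.percentile(x, [25, 75])
    return min(np.std(x, ddof=1), (q75 - q25) / 1.34)


def bw_nrd0(x):
    """Silverman's rule of thumb (factor 0.9)."""
    x = np.asarray(x, dtype=float)
    return 0.9 * _spread(x) * len(x) ** (-0.2)


def bw_nrd(x):
    """Scott's variant of the rule of thumb (factor 1.06)."""
    x = np.asarray(x, dtype=float)
    return 1.06 * _spread(x) * len(x) ** (-0.2)


def ucv_score(x, h):
    """Least-squares (unbiased) cross-validation score, Gaussian kernel."""
    n = len(x)
    d2 = (x[:, None] - x[None, :]) ** 2
    # Integral of the squared estimate: pairwise Gaussians with variance 2 h^2
    int_f2 = np.exp(-d2 / (4.0 * h**2)).sum() / (n**2 * 2.0 * h * np.sqrt(np.pi))
    # Mean leave-one-out density at the data points (diagonal terms removed)
    k = np.exp(-d2 / (2.0 * h**2)) / (h * np.sqrt(2.0 * np.pi))
    loo = (k.sum() - np.trace(k)) / (n * (n - 1))
    return int_f2 - 2.0 * loo


def bw_ucv(x, candidates):
    """Candidate bandwidth with the lowest cross-validation score."""
    x = np.asarray(x, dtype=float)
    scores = [ucv_score(x, h) for h in candidates]
    return float(candidates[int(np.argmin(scores))])


# Synthetic bimodal sample (illustration only, not vent data)
rng = np.random.default_rng(1)
sample = np.concatenate([rng.normal(0.0, 1.0, 80), rng.normal(6.0, 0.5, 40)])
candidates = np.linspace(0.05, 3.0, 60)
print(bw_nrd0(sample), bw_nrd(sample), bw_ucv(sample, candidates))
```

All three functions are defensible, data-driven choices, yet they need not agree, which is the point made in this section.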
5.2 Alternatives to Bandwidth Selectors
Although it can be argued that the sj method was relatively better because it provided somewhat reliable information in all three of the studied locations, it is also clear that the information provided by the corresponding
Table 1 summarizes the values of h that were found by each selector in each of the three scenarios examined. As shown in the table, the difference between the largest and smallest value of h does not depend in a simple way on either the number of observations or the size of the study area. Also, the largest, smallest or middle values of h are not consistently related to a particular selector. Consequently, it is almost impossible to identify a universal rule to decide which selector is the most appropriate for all types of volcanic scenarios, or even whether one selector is likely to yield a bandwidth that is smaller, larger or intermediate in relation to other selectors. Based on the descriptions provided in the previous section, however, if instead of focusing attention on only one
![www.frontiersin.org](https://www.frontiersin.org/files/Articles/779095/feart-10-779095-HTML-r1/image_m/feart-10-779095-g006.jpg)
FIGURE 6. Sequence of PDFs used for exploration of the main features of the PDFs associated with the distribution of vents atop Mauna Kea volcano.
![www.frontiersin.org](https://www.frontiersin.org/files/Articles/779095/feart-10-779095-HTML-r1/image_m/feart-10-779095-g007.jpg)
FIGURE 7. Sequence of PDFs used for exploration of the main features of the PDFs associated with the distribution of vents in the San Rafael area.
![www.frontiersin.org](https://www.frontiersin.org/files/Articles/779095/feart-10-779095-HTML-r1/image_m/feart-10-779095-g008.jpg)
FIGURE 8. Sequence of PDFs used for exploration of the main features of the PDFs associated with the distribution of vents in the Washington Cascades.
As shown in Figure 6, the structure of the three main rifts, the central summit group and the areas where the SE Rift is complicated and intersected by other zones of linear volcanism are more clearly defined in the
Thus, Figures 6–8 show three important facts: 1) the range of “optimal” h’s (i.e., those defined by using one of the “objective” selectors) does not always capture the entire set of interesting features of a region, but a wider range does, 2) the examination of the sequence of produced
5.3 Is the Volcanic Case Very Special in Nature?
The results reported above illustrate that there is no “golden rule” to select one specific value of h. The existence of alternative statistical or probabilistic formulations that make it impossible to select just one model is not unique to the KDE method examined in this work; it is a common feature of many natural sciences. This has led to different approaches to dealing with the uncertainty associated with the different models, and has even been referred to as the “range of technically defensible interpretations” (Ake et al., 2018) or the “extended expert’s distribution” (Marzocchi and Jordan, 2014). Furthermore, the different ways in which the existing information needs to be approached are at the heart of the general discussion between the frequentist and Bayesian approaches, and of the different ways in which various workers classify and handle different sources of error (Friedl and Hörmann, 2008; Marzocchi et al., 2021; O'Hagan, 2008). Thus, the specific problem of bandwidth selection with the kernel method is not exclusive to the volcanic context explored in this paper, but forms part of a wider context within the fields of probability and statistics. As discussed next, which of those perspectives is adopted exerts some influence on the interpretation of results.
5.3.1 Probabilistic vs. Statistical Approach to Vent Distribution
From the perspective of hazard analysis (probabilistic in nature) it may not be important to know which of the existing systems in a region are active at present, and depending on the information available it might be possible to focus attention on a single PDF arising from a mixture of several processes. Nevertheless, if the study aims to infer clues concerning the physical structure present beneath a zone of distributed volcanism (statistical in nature), it might be more informative to examine the whole sequence of PDFs rather than trying to extract information from only one of those diagrams. In particular, it must be noted that the sequence of diagrams produced with increasing values of h follows an order that is ultimately controlled by the underlying structure. Such order is not random, and it is directly related to the specific situation under examination. In other words, the whole sequence of
On the other hand, if one encounters a situation in which only one
5.4 The Role of Experts’ Judgment
Another criticism that has been made to the exploration of several
The convenience of learning to work with multiple hypotheses was formalized more than 100 years ago by Chamberlin (1897). As explained by Chamberlin, learning to work with multiple hypotheses leads to less inclination to misapply evidence and to more caution in drawing conclusions. The validity of that assertion is such that those ideas were reprinted more than half a century after they were first issued (Chamberlin, 1965), and it might be worth reprinting them once again to remind the new generations of scientists that attachment to a ruling theory is not always the best approach. Consequently, in line with the approach of multiple hypotheses, it is considered here that a thorough examination of a range of possible
It is worth emphasizing that in the particular case of the analysis of spatial distributions of vents it is not suggested that each and every
6 Concluding Remarks
As pointed out by Chamberlin (1965), investigations often proceed on the presumption that there is a definite process through which all results are of maximum excellence, and therefore the question of ‘what is the best method?’ is more often asked than ‘what are the special values of different methods, and what are their several advantageous applicabilities?’. This frame of mind clearly promotes the point of view that problems often arise when assessment teams do not understand how to execute a specific method, when in reality there might not be errors, but only differences in opinion about the outcomes of different methods. Furthermore, in many cases it is implicitly assumed that the selected method of study is well conceived for the purposes of a specific application, which, as shown in Section 2, may not always be the case for KDE in many volcanic contexts. Nevertheless, as shown by the examples of Section 4, the examination of a sequence of PDFs can provide enough clues to assess which of the conceptual situations depicted in Figure 1 seems better suited to describe a particular situation, or at least can be used to formulate better informed hypotheses that can guide future studies in a region.
This approach has been well recognized in the context of inferential analysis in some branches of Earth sciences (e.g., Budnitz et al., 1997), but is not commonly appreciated in the context of statistical applications.
Thus, in parallel to this conceptual issue, throughout this work it has been shown that there is a large diversity of “unbiased” and/or “objective” estimators that can be used to select a single value of h, which in turn can be used to produce a single estimate of the real PDF associated with a group of vents, with a mainly statistical approach in mind. Because the real distribution of vents is a priori unknown, the range of “optimal” values of h is not well constrained. Because some methods of estimating h work better with some types of distribution than with others, and because the real distribution is unknown when examining the location of vents, the only way to be thorough in the analysis is to use more than one method to estimate h. In so doing, we are not far from adopting the exploratory approach in which a range of hs is used to produce a sequence of
Data Availability Statement
The raw data supporting the conclusions of this article will be made available by the author, without undue reservation.
Author Contributions
EC-T: design, data processing, interpretation, and manuscript writing.
Funding
This research was funded by CONACYT grant A1-S-23107.
Conflict of Interest
The author declares that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
Publisher’s Note
All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.
Acknowledgments
I thank J. Selva and two reviewers for comments that helped to improve the clarity of this work. I am also grateful for the continuous negative anonymous comments made on several papers submitted for publication over the past 5 years, because this work would otherwise not have been deemed necessary. Finally, I also thank the editors for handling this work with the required open mind alluded to by one of the reviewers during the revision process.
References
Ake, J., Munson, C., Stamatakos, J., Juckett, M., Coppersmith, K., and Bommer, J. (2018). Updated Implementation Guidelines for SSHAC Hazard Studies. US Nuclear Regulatory Commission.
Aspinall, W. P., and Cooke, R. M. (2013). “Quantifying Scientific Uncertainty from Expert Judgement Elicitation,” in Risk and Uncertainty Assessment for Natural Hazards. Editors J. Rougier, S. Sparks, and L. Hill (Cambridge: Cambridge University Press), 64–99.
Bebbington, M. S., and Cronin, S. J. (2011). Spatio-temporal hazard Estimation in the Auckland Volcanic Field, New Zealand, with a New Event-Order Model. Bull. Volcanol. 73, 55–72. doi:10.1007/s00445-010-0403-6
Bebbington, M. S. (2013). Assessing Spatio-Temporal Eruption Forecasts in a Monogenetic Volcanic Field. J. Volc. Geoth. Res. 252, 14–28. doi:10.1016/j.jvolgeores.2012.11.010
Bebbington, M. S. (2015). Spatio-volumetric hazard Estimation in the Auckland Volcanic Field. Bull. Volcanol. 77, 39. doi:10.1007/s00445-015-0921-3
Bernreuter, D. L., Savy, J. B., Mensing, R. W., and Chen, J. C. (1988). Seismic hazard Characterization of 69 Nuclear Plant Sites East of Rocky Mountains. Lawrence Livermore National Laboratory.
Bevilacqua, A., Bursik, M., Patra, A., Bruce Pitman, E., Sangani, R., Kobs‐Nawotniak, S., et al. (2018). Late Quaternary Eruption Record and Probability of Future Volcanic Eruptions in the Long Valley Volcanic Region (CA, USA). J. Geophys. Res. Solid Earth 123, 5466–5494. doi:10.1029/2018jb015644
Budnitz, R. J., Apostolakis, G., Boore, D. M., Cluff, L. S., Coppersmith, K. J., Cornell, C., et al. (1997). Recommendations for Probabilistic Seismic hazard Analysis: Guidance on Uncertainty and Use of Experts. Lawrence Livermore National Laboratory.
Cañón-Tapia, E. (2020). Influence of Method Selection on Clustering Analyses of point-like Features: Examples from Three Zones of Distributed Volcanism. Geomorphology 354, 107063. doi:10.1016/j.geomorph.2020.107063
Cañón-Tapia, E. (2014). Insights into the Dynamics of Planetary Interiors Obtained through the Study of Global Distribution of Volcanoes II: Tectonic Implications from Venus. J. Volcanology Geothermal Res. 281, 70–84. doi:10.1016/j.jvolgeores.2014.05.013
Cañón-Tapia, E., and Mendoza-Borunda, R. (2014). Insights into the Dynamics of Planetary Interiors Obtained through the Study of Global Distribution of Volcanoes I: Empirical Calibration on Earth. J. Volcanology Geothermal Res. 281, 53–69. doi:10.1016/j.jvolgeores.2014.05.015
Cañón-Tapia, E. (2016). Reappraisal of the Significance of Volcanic fields. J. Volcanology Geothermal Res. 310, 26–38. doi:10.1016/j.jvolgeores.2015.11.010
Cañón-Tapia, E. (2021a). Vent Distribution and Sub-volcanic Systems: Myths, Fallacies, and Some Plausible Facts. Earth-Science Rev. 221, 103768. doi:10.1016/j.earscirev.2021.103768
Cañón-Tapia, E. (2021b). Vent Distribution on Jeju Island, South Korea: Glimpses into the Subvolcanic System. J. Geophys. Res. Solid Earth 126, e2021JB022269.
Cañón-Tapia, E. (2013). Volcano Clustering Determination: Bivariate Gauss vs. Fisher Kernels. J. Volcanology Geothermal Res. 258, 203–214. doi:10.1016/j.jvolgeores.2013.04.015
Cañón-Tapia, E. (2021c). “Volcano Distribution and Tectonics: A Planetoidic Perspective,” in In the Footsteps of Warren B. Hamilton: New Ideas in Earth Science. Editors G. Foulger, D. M. Jurdy, C. A. Stein, L. C. Hamilton, K. Howard, et al. (Geological Society of America).
Chamberlin, T. C. (1897). Studies for Students: The Method of Multiple Working Hypotheses. J. Geology. 5, 837–848. doi:10.1086/607980
Chamberlin, T. C. (1965). The Method of Multiple Working Hypotheses. Science 148, 754–759. doi:10.1126/science.148.3671.754
Champion, D. E., Cyr, A., Fierstein, J., and Hildreth, W. (2018). Monogenetic Origin of Ubehebe Crater Maar Volcano, Death Valley, California: Paleomagnetic and Stratigraphic Evidence. J. Volcanology Geothermal Res. 354, 67–73. doi:10.1016/j.jvolgeores.2017.12.018
Condit, C. D., and Connor, C. B. (1996). Recurrence Rates of Volcanism in Basaltic Volcanic fields: An Example from the Springerville Volcanic Field, Arizona. Geol. Soc. Am. Bull. 108, 1225–1241. doi:10.1130/0016-7606(1996)108<1225:rrovib>2.3.co;2
Connor, C. B. (1987). Cinder Cone Distribution Described Using Cluster Analysis and Two-Dimensional Fourier Analysis in the Central Transmexican Volcanic Belt, Mexico, and in SE Guatemala and NW El Salvador. Hanover: Dartmouth College, 318.
Connor, C. B., Connor, L., Germa, A., Richardson, J., Bebbington, M., Gallant, E., et al. (2019). How to Use Kernel Density Estimation as a Diagnostic and Forecasting Tool for Distributed Volcanic Vents. Siv 43, 1–25. doi:10.5038/2163-338x.4.3
Connor, C. B., and Connor, L. J. (2009). “Estimating Spatial Density with Kernel Methods,” in Volcanic and Tectonic hazard Assessment for Nuclear Facilities. Editors C. B. Connor, N. A. Chapman, and L. J. Connor (Cambridge: Cambridge University Press), 346–368.
Connor, C., and Hill, B. E. (1995). Three Nonhomogeneous Poisson Models for the Probability of Basaltic Volcanism: Application to the Yucca Mountain Region. J. Geophys. Res. 100, 10107–10125. doi:10.1029/95jb01055
Conway, F. M., Connor, C. B., Hill, B. E., Condit, C. D., Mullaney, K., and Hall, C. M. (1998). Recurrence Rates of Basaltic Volcanism in SP Cluster, San Francisco Volcanic Field, Arizona. Geol 26, 655–658. doi:10.1130/0091-7613(1998)026<0655:rrobvi>2.3.co;2
Delcamp, A., Mossoux, S., Belkus, H., Tweheyo, C., Mattsson, H. B., and Kervyn, M. (2019). Control of the Stress Field and Rift Structures on the Distribution and Morphology of Explosive Volcanic Craters in the Manyara and Albertine Rifts. J. Afr. Earth Sci. 150, 566–583. doi:10.1016/j.jafrearsci.2018.09.012
Duong, T., and Hazelton, M. (2003). Plug-in Bandwidth Matrices for Bivariate Kernel Density Estimation. J. Nonparametric Stat. 15, 17–30. doi:10.1080/10485250306039
Duong, T. (2007). Ks: Kernel Density Estimation and Kernel Discriminant Analysis for Multivariate Data in R. J. Stat. Softw. 21. doi:10.18637/jss.v021.i07
Frankel, A. (1995). Mapping Seismic Hazard in the Central and Eastern United States. Seismological Res. Lett. 66, 8–21. doi:10.1785/gssrl.66.4.8
Friedl, H., and Hörmann, S. (2008). “Frequentist Probability Theory,” in Handbook of Probability. Editor T. Rudas (Los Angeles: SAGE Publications), 15–34.
Heidenreich, N. B., Schindler, A., and Sperlich, S. (2013). Bandwidth Selection for Kernel Density Estimation: a Review of Fully Automatic Selectors. Asta Adv. Stat. Anal. 97, 403–433. doi:10.1007/s10182-013-0216-y
Hiemer, S., Woessner, J., Basili, R., Danciu, L., Giardini, D., and Wiemer, S. (2014). A Smoothed Stochastic Earthquake Rate Model Considering Seismicity and Fault Moment Release for Europe. Geophys. J. Int. 198, 1159–1172. doi:10.1093/gji/ggu186
Hildreth, W. (2007). Quaternary Magmatism in the Cascades - Geological Perspectives. US Geological Survey, 125.
Jacobo-Bojórquez, R. A., and Cañón-Tapia, E. (2020). Distribution of Eruptive Centers on Top of Large Shield Volcanoes in the Inner Solar System: General Classification and Glimpses of Their Subvolcanic Structure. J. Geophys. Res. Planets 125, e2020JE006431.
Jaquet, O., Connor, C., and Connor, L. (2008). Probabilistic Methodology for Long-Term Assessment of Volcanic Hazards. Nucl. Tech. 163, 180–189. doi:10.13182/nt08-a3980
Jones, M. C., Marron, J. S., and Sheather, S. J. (1996). Progress in Data-Based Bandwidth Selection for Kernel Density Estimation. Comput. Stat. 11, 337–381.
Jones, M. C. (1991). The Roles of ISE and MISE in Density Estimation. Stat. Probab. Lett. 12, 51–56. doi:10.1016/0167-7152(91)90163-l
Kereszturi, G., and Németh, K. (2013). “Monogenetic Basaltic Volcanoes: Genetic Classification, Growth, Geomorphology and Degradation,” in Updates in Volcanology: New Advances in Understanding Volcanic Systems. Editor K. Németh (InTech), 3–88.
Kiyosugi, K., Connor, C. B., Ferwerda, B. P., Germa, A. M., Connor, L. J., Hintz, A. R., et al. (2012). Relationship between dike and Volcanic Conduit Distribution in a Highly Eroded Monogenetic Volcanic Field: San Rafael, Utah, USA. Geology 40, 695–698. doi:10.1130/g33074.1
Kiyosugi, K., Connor, C. B., Zhao, D., Connor, L. J., and Tanaka, K. (2009). Relationships between Volcano Distribution, Crustal Structure, and P-Wave Tomography: an Example from the Abu Monogenetic Volcano Group, SW Japan. Bull. Volcanol. 72 (3), 331–340. doi:10.1007/s00445-009-0316-4
Lutz, M. T., and Gutmann, J. T. (1995). An Improved Method for Determining and Characterizing Alignments of Pointlike Features and its Implications for the Pinacate Volcanic Field, Sonora, Mexico. J. Geophys. Res. 100, 17659–17670. doi:10.1029/95jb01058
Marzocchi, W., and Bebbington, M. S. (2012). Probabilistic Eruption Forecasting at Short and Long Time Scales. Bull. Volcanol. 74 (8), 1777–1805. doi:10.1007/s00445-012-0633-x
Marzocchi, W., and Jordan, T. H. (2014). Testing for Ontological Errors in Probabilistic Forecasting Models of Natural Systems. Proc. Natl. Acad. Sci. U.S.A. 111, 11973–11978. doi:10.1073/pnas.1410183111
Marzocchi, W., Sandri, L., and Selva, J. (2008). BET_EF: a Probabilistic Tool for Long- and Short-Term Eruption Forecasting. Bull. Volcanol. 70, 623–632. doi:10.1007/s00445-007-0157-y
Marzocchi, W., Selva, J., and Jordan, T. H. (2021). A Unified Probabilistic Framework for Volcanic hazard and Eruption Forecasting. Nat. Hazards Earth Syst. Sci. 21, 3509–3517. doi:10.5194/nhess-21-3509-2021
Marzoli, A., Callegaro, S., Dal Corso, J., Davies, J. H. F. L., Youbi, N., Bertrand, H., et al. (2018). “The Central Atlantic Magmatic Province (CAMP): A Review,” in The Late Triassic World. Editor L. H. Tanner (Cham: Springer), 91–125. doi:10.1007/978-3-319-68009-5_4
Németh, K., and Kereszturi, G. (2015). Monogenetic Volcanism: Personal Views and Discussion. Int. J. Earth Sci. doi:10.1007/s00531-015-1243-6
Neri, A., Aspinall, W. P., Cioni, R., Bertagnini, A., Baxter, P. J., Zuccaro, G., et al. (2008). Developing an Event Tree for Probabilistic hazard and Risk Assessment at Vesuvius. J. Volcanology Geothermal Res. 178, 397–415. doi:10.1016/j.jvolgeores.2008.05.014
O'Hagan, A. (2008). “The Bayesian Approach to Statistics,” in Handbook of Probability. Editor T. Rudas (Los Angeles: SAGE Publications), 85–100.
Porter, S. C. (1972). Distribution, Morphology, and Size Frequency of Cinder Cones on Mauna Kea Volcano, Hawaii. Geol. Soc. Am. Bull. 83, 3607–3612. doi:10.1130/0016-7606(1972)83[3607:dmasfo]2.0.co;2
Richardson, J. A., Miller, D. M., Bleacher, J. E., Connor, C., Gregg, T. K., Connor, L., et al. (2012). Comparison of Monogenetic Volcano Clusters on Earth, Venus and Mars. San Francisco, CA: AGU Fall Meeting. abstract # V44C-04.
Rivalta, E., Corbi, F., Passarelli, L., Acocella, V., Davis, T., and Di Vito, M. A. (2019). Stress Inversions to Forecast Magma Pathways and Eruptive Vent Location. Sci. Adv. 5. doi:10.1126/sciadv.aau9784
Rose, W. I., Palma, J. L., Escobar Wolf, R., and Matías Gomez, R. O. (2013). “A 50 Yr Eruption of a Basaltic Composite Cone: Pacaya, Guatemala,” in Understanding Open-Vent Volcanism and Related Hazards. Editors W. I. Rose, J. L. Palma, H. Delgado-Granados, and N. R. Varley (Boulder: Geological Society of America), 1–21. doi:10.1130/2013.2498(01)
Rosenblatt, M. (1956). Remarks on Some Nonparametric Estimates of a Density Function. Ann. Math. Stat. 27, 642–669. doi:10.1214/aoms/1177728190
Schindler, A. (2011). Bandwidth Selection in Nonparametric Kernel Estimation. Georg-August-Universität.
Scott, D. W. (1992). Multivariate Density Estimation: Theory, Practice, and Visualization. New York: Wiley, 317.
Selva, J., Acocella, V., Bisson, M., Caliro, S., Costa, A., Della Seta, M., et al. (2019). Multiple Natural Hazards at Volcanic Islands: a Review for the Ischia Volcano (Italy). J. Appl. Volcanol. 8, 5. doi:10.1186/s13617-019-0086-4
Sheather, S. J., and Jones, M. C. (1991). A Reliable Data-Based Bandwidth Selection Method for Kernel Density Estimation. J. R. Stat. Soc. Ser. B (Methodological) 53, 683–690. doi:10.1111/j.2517-6161.1991.tb01857.x
Silverman, B. W. (1986). Density Estimation for Statistics and Data Analysis. New York: Chapman & Hall, 175.
Silverman, B. W. (1981). Using Kernel Density Estimates to Investigate Multimodality. J. R. Stat. Soc. Ser. B (Methodological) 43, 97–99. doi:10.1111/j.2517-6161.1981.tb01155.x
Skiena, S. (2003). Calculated Bets. Computers, Gambling, and Mathematical Modeling to Win. Cambridge UK: Cambridge University Press, 232.
Srisutthiyakorn, N., Kiefer, W. S., and Kirchoff, M. (2010). Spatial Distribution of Volcanoes in the Marius Hills and Comparison with Volcanic fields on Earth and Venus. Lunar Planet. Sci. Conf. 41. abstract 1185.
Stock, M. J., Bagnardi, M., Neave, D. A., Maclennan, J., Bernard, B., Buisman, I., et al. (2018). Integrated Petrological and Geophysical Constraints on Magma System Architecture in the Western Galápagos Archipelago: Insights from Wolf Volcano. Geochem. Geophys. Geosyst. 19, 4722–4743. doi:10.1029/2018gc007936
Szakács, A., and Cañón-Tapia, E. (2010). “Some Challenging New Perspectives of Volcanology,” in What Is a Volcano? Editors E. Cañón-Tapia, and A. Szakács (Geological Society of America), 123–140. doi:10.1130/2010.2470(09)
Turlach, B. A. (1993). Bandwidth Selection in Kernel Density Estimation: A Review. Humboldt-Universität zu Berlin. Discussion Paper 9307, Institut für Statistik und Ökonometrie.
Valentine, G. A., and Gregg, T. K. P. (2008). Continental Basaltic Volcanoes - Processes and Problems. J. Volcanology Geothermal Res. 177, 857–873. doi:10.1016/j.jvolgeores.2008.01.050
Wand, M. P., and Jones, M. C. (1994). Multivariate Plug-In Bandwidth Selection. Comput. Stat. 9, 97–116.
Wand, M. P., and Jones, M. C. (1993). Comparison of Smoothing Parameterizations in Bivariate Kernel Density Estimation. J. Am. Stat. Assoc. 88, 520–528. doi:10.1080/01621459.1993.10476303
Weller, J. N., Martin, A. J., Connor, C. B., Connor, L. J., and Karakhanian, A. (2006). “Modelling the Spatial Distribution of Volcanoes: an Example from Armenia,” in Statistics in Volcanology. Editors H. Mader, M. S. Coles, C. Connor, and L. J. Connor (London: Geological Society), 77–87.
Keywords: kernel analyses, bandwidth selector, clustering methods, vent distribution, vent clustering
Citation: Cañón-Tapia E (2022) Kernel Analyses of Volcanic Vent Distribution: How Accurate and Complete are the Objective Bandwidth Selectors? Front. Earth Sci. 10:779095. doi: 10.3389/feart.2022.779095
Received: 17 September 2021; Accepted: 31 March 2022;
Published: 14 April 2022.
Edited by:
Luis E. Lara, Servicio Nacional de Geología y Minería de Chile (SERNAGEOMIN), Chile
Reviewed by:
Derek Rust, University of Portsmouth, United Kingdom
Daniele Maestrelli, National Research Council, Italy
Jacopo Selva, Istituto Nazionale di Geofisica e Vulcanologia (INGV), Italy
Copyright © 2022 Cañón-Tapia. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
*Correspondence: Edgardo Cañón-Tapia, ecanon@cicese.mx