AUTHOR=Bell David M. , Wilson Barry T. , Werstak Charles E. , Oswalt Christopher M. , Perry Charles H. 

TITLE=Examining k-Nearest Neighbor Small Area Estimation Across Scales Using National Forest Inventory Data

JOURNAL=Frontiers in Forests and Global Change

VOLUME=5

YEAR=2022

URL=https://www.frontiersin.org/journals/forests-and-global-change/articles/10.3389/ffgc.2022.763422

DOI=10.3389/ffgc.2022.763422

ISSN=2624-893X

ABSTRACT=<p>National forest inventories (NFI), such as the one conducted by the United States Forest Service Forest Inventory and Analysis (FIA) program, provide valuable information regarding the status of forests at regional to national scales. However, forest managers often need information at stand to landscape scales. Given various small area estimation (SAE) approaches, including design-based and model-based estimation, it may not be clear which is most appropriate for the user’s application. In this study, our objective was to assess the uncertainty in tree aboveground live carbon (ALC) estimates for differing modes of SAE across multiple scales to provide guidance for appropriate scales of application. We calculated means and variances for ALC with design-based (Horvitz-Thompson), model-assisted (generalized regression), and model-based (k-nearest neighbor synthetic) estimators for estimation units over a range of sizes for 30 subregions in California, United States. For larger areas (10,000–64,800 ha), relative efficiencies greater than one indicated that the generalized regression estimator (GREG) generated estimates with less error than the Horvitz-Thompson estimator (HT), while the bias-adjusted synthetic estimator relative efficiency compared to either the Horvitz-Thompson or model-assisted estimators exceeded one for areas 25,000 ha and smaller. Variance estimates from the unadjusted synthetic estimator underestimated the total error, because the estimator ignores bias and thus only addresses model variance. Across scales (250–64,800 ha, 0–27 plots per area of interest), 93% of the variation in the synthetic estimator’s relative standard error was explained by forest area, forest dominance, and regional variation in forest landscapes. Our results support model-assisted estimation use except for small areas where few plots (&lt;10 in the current study) are available for generating estimates in spite of biases in estimates. However, users should exercise caution when interpreting model-based estimates of error as they may not account for model mis-specification, and thus induced bias. This research explored multiple scales of application for SAE procedures applied to NFI data regarding carbon pools, potentially supporting a multi-scale approach to forest monitoring. Our results guides users in developing defensible estimates of carbon pools, particularly as it relates to the limits of inference at a variety of spatial scales.</p>