
94% of researchers rate our articles as excellent or good
Learn more about the work of our research integrity team to safeguard the quality of each article we publish.
Find out more
ORIGINAL RESEARCH article
Front. Anal. Sci. , 20 February 2025
Sec. Chemometrics
Volume 5 - 2025 | https://doi.org/10.3389/frans.2025.1494615
Mass spectrometry (MS) is a powerful analytical technique employed for a variety of applications including drug development, quality assurance, food inspection, and monitoring environmental pollutants. Recently, in the production of actively developed antibody and nucleic acid pharmaceuticals, impurities with various modifications have been generated. These impurities can lead to a decrease in drug stability, pharmacokinetics, and efficacy, making it crucial to distinguish between them. We previously modeled mass spectrometry for each possible number of constituents in a sample, using parameters such as monoisotopic mass and ion counts, and employed stochastic variational inference to determine the optimal parameters and the maximum posterior probability for each model. By comparing the maximum posterior probabilities among models, we selected the optimal number of constituents and inferred their corresponding monoisotopic masses and ion counts. However, MS spectra are sparse and predominantly flat, which can lead to vanishing gradients when using simple optimization techniques. To solve this problem, using MCMC as in our previous studies would take a very long time. To address this difficulty, in this study, we blur the comparative spectra and gradually reduce the blur to prevent vanishing gradients while inferring accurate values. Furthermore, we incorporate MS/MS spectra into the model to increase the amount of information available for inference, thereby improving the accuracy of parameter inference. This modification improved the mass error from an average of 1.348 Da–0.282 Da. Moreover, the required time, even including the processing of additional five MS/MS spectra, was reduced to less than half.
Mass Spectrometry (MS) serves as a robust analytical method and is employed across various fields including drug development, food safety inspections, and environmental pollutant monitoring. In recent years, with the vigorous development of antibodies and nucleic acid drugs, impurities with different modifications have been produced. Such impurities may adversely affect the stability, pharmacokinetics, and efficacy of drugs (Sanghvi, 2011; Weinberg et al., 2005; Tamara, den Boer, and Heck, 2022; Pecori et al., 2022). It is, therefore, essential in pharmaceutical development and quality control to identify and address these multiple impurities. Additionally, understanding the monoisotopic mass of the constituents can offer crucial insights into the origins of impurity formation. Similarly, knowledge of the ion concentrations of these constituents assists in evaluating the potential impact of the impurities.
In contemporary mass spectrometry, accurately identifying impurities in middle or high molecules with minor modifications remains challenging. Traditional chromatography methods frequently struggle to effectively separate these impurities. It is also difficult to separate them on the MS axis due to increased spectral complexity from isotopes and multivalent ions.
Enhancing the hardware resolution allows for the distinction of subtle variations between isotopes and modifications. However, high-resolution techniques like Fourier Transform Ion Cyclotron Resonance (FT-ICR) necessitate large-scale equipment and significant investment, making them cumbersome to manage. Therefore, it is more practical to use devices suited for standard laboratories, such as Triple Quadrupole MS and Quadrupole Time-of-Flight MS (Q-TOF-MS).
Consequently, there is ongoing research into software-driven signal analysis. Efforts have been made to deduce mass from the data provided by mass spectrometers. Basic techniques for generating m/z lists from spectral data include wavelet transformations (Zhang et al., 2009). However, for spectra from medium to high molecular compounds that display broad isotope distributions, particularly those ionized by electrospray ionization (ESI) which generates multivalent ions, pinpointing the monoisotopic mass becomes more complex.
For tackling charge deconvolution and deisotoping in spectra from multivalent ions, numerous algorithms have been introduced, including heuristic Gaussian fitting via nonlinear least squares minimization (Dasari et al., 2009). The ReSpect algorithm, employing the Maximum Entropy method (Ferrige et al., 1992), has been widely utilized (Zhang and Alecio, 1998; Tranter, 2000; Ferrige et al., 2003). This algorithm calculates
In prior research (Tomono et al., 2024), we inferred the number of constituents based on their monoisotopic masses and ion counts. We modeled these using multiple assumed constituent counts and then derived the maximum posterior probability and optimal model parameters for each numbers of constituents using NUTS (No-U-Turn Sampler), Simulated Annealing, and Stochastic Variational Inference. Despite these efforts, the accuracy of our results was insufficient.
Consequently, this study introduces an improved methodology to accurately infer the optimal number of constituents and their monoisotopic masses and ion counts using MS/MS (Tandem Mass Spectrometry) spectra. This methodology is beneficial for detecting impurities in pharmaceutical products, optimizing synthesis conditions for medium to high molecular drugs, and enhancing quality assurance processes in manufacturing settings.
Our method initially models the physical MS and MS/MS system with all possible numbers of constituents. For each model with a different number of constituents, we calculate the optimal monoisotopic masses and ion counts and derived the posterior probabilities. The monoisotopic mass refers to the sum of the masses of the most abundant isotopes of each element present in a molecule or ion. This calculation is achieved by using Stochastic Variational Inference (SVI) (Wingate and Weber, 2013; Ranganath et al., 2014; Kingma and Welling, 2013).
However, this model encounters specific issues inherent in mass spectrometry. The MS spectrum we are comparing is mostly flat with several sharp peaks localized in certain areas. Applying simple optimization methods to such data often leads to vanishing gradients, making it difficult to effectively explore parameters. One way to avoid this difficulty is to use Markov chain Monte Carlo methods (MCMC) and Simulated Annealing, but this requires significant computational time.
Therefore, we propose a new method called Spectral Annealing Inference (SAI). SAI combines SVI and spectral annealing by Point Spread Function (PSF) to explore optimal parameters while avoiding vanishing gradients and local optima. After calculating all posterior probabilities by SAI, we select the most probable number of constituents, as well as their monoisotopic masses and ion counts.
To prevent selecting overfitted complex models, we introduce a prior distribution of the number of constituents. In this paper, we define a constituent as a set of ions that share the same monoisotopic masses,
To avoid the curse of dimensionality where the search space expands exponentially with the number of constituents, we employ a staged search approach. We incrementally increase the number of constituents from
Initially, we develop a model for
According to prior research, the spectrum in mass spectrometry can be approximated using the following model (Tomono et al., 2024). The probability distribution of mass of constituent
The mathematical expressions of the distributions generated by these binominal processes are:
Here,
Typically, the spectrum obtained from a mass spectrometer is represented along the mass-to-charge ratio
where
and
The theoretical probability distribution
As previously stated, regardless of its charge state or mass, a single ion contributes as a single delta function. Therefore, the observed spectrum of ions is proportional to the probability distribution of ions along the
Due to the point spread of the detector’s response
where
In this subsection, we particularly focus on the generation process of MS/MS spectra. Hybrid mass spectrometers equipped with multiple separation mechanisms allow for the selective passage of precursor ions based on specific
We define a set of ions sharing the monoisotopic mass
Accordingly, the number of atoms in constituent
Additionally, we approximate the charge distribution of constituent
Accordingly, the number of chargeable sites of constituent
When the total number of ions of constituent
When an ion
The probability distribution
In a manner similar to the MS spectrum, the observed spectrum of ions is proportional to the probability distribution of ions along the
Therefore, the MS/MS spectrum for
Here, we set
We consider a scenario in which we obtain a set of observed spectra
We determine the posterior probability and optimal parameters by maximizing logarithmic posterior probability
In this study, the set of parameters for inference, denoted as
Here, we introduce two likelihoods derived from observation error models. The observed spectrum typically includes thermal noise from detection circuitry, which is assumed to follow a normal distribution. Therefore, we base the observational error, representing a deviation between observed data and true values, on this distribution. For inference, we employ square error-based likelihood derived from the normal distribution. However, because low-intensity regions within the spectrum have less contribution to the overall error evaluation if we use a square error-based likelihood, relying solely on this likelihood reduces accuracy of parameter estimation where the errors in the low-intensity spectral regions must be reflected. To overcome this difficulty, we additionally introduce a likelihood function sensitive to errors in the low-intensity parts of the spectrum. To evaluate the discrepancies between the observed and inferred spectra regardless of spectral intensity, we use the correlation coefficient along the
Let
To introduce the additional correlation-based likelihood, we employ the von Mises distribution as an error model, which is defined by the correlation coefficient between two vectors representing the observed and inferred spectra. The logarithmic likelihoods based on the von Mises distribution are denoted as
The total log-likelihood of the inferred spectrum set (
In determining the appropriate number of constituents
Additionally, to ensure that the monoisotopic masses of the constituents do not overlap, we introduce a penalty function
This penalty function reaches its maximum value when the monoisotopic masses of different constituents completely coincide.
By assuming a uniform prior distribution of each parameter, the logarithmic prior probability is defined as:
Here, by substituting Equations 19, 22 into Equation 14, we obtain the logarithmic posterior probability
We use Stochastic Variational Inference (SVI) to infer the Maximum A Posteriori (MAP) values of each parameter and to determine the model’s highest posterior probability. SVI replaces the complex posterior probability distribution with a more manageable approximate distribution (variational posterior
Since
Given that
The optimization problem under this setup can be solved using conventional numerical optimization techniques. In this case, we used Adam (Kingma and Jimmy, 2014), a type of stochastic gradient descent widely used in machine learning, to find the value of
However, the MS and MS/MS spectra to be compared are mostly flat with several localized sharp peaks. Simply applying SVI to such data can result in vanishing gradients, making it difficult to effectively explore parameters. Therefore, to create appropriate gradients of the likelihood function, we convolve a Gaussian distribution
Then, we performed SVI and iteratively narrowing the variance of
For this study,
The blurred spectra at each step are represented as shown in Equations 29–32.
Using these blurred spectra, we derive the modified log-likelihood
By substituting Equation 37 in place of Equation 19 into Equation 14, the modified logarithmic likelihood
At each iteration step
By repeating this process from
In this section, we detail the outcomes of our experiments to validate the inference accuracy of constituent counts, monoisotopic mass, and ion quantities in our proposed method. All the experiments were conducted exclusively using numerical simulations. These simulations generated data to mimic real-world mass spectrometry analyses. We specifically focused on simulating the mass spectra of nucleic acid drugs and their impurities, such as Fomivirsen and its altered sequences. This is because current analytical methodologies have challenges in accurately identifying these substances, due to the complexities arising from their isotopic and charge distributions. We compared the performance of our proposed method against established baseline method, UniDec. The performance was evaluated based on accuracy of constituent count inference, deviations in monoisotopic mass, and discrepancies in ion quantities.
The specifications of a computer used to verify the proposed method, as well as the software versions, are summarized in Table 1. The proposed method handled data with high dimensions along the time axis, requiring a large memory size. Additionally, to rapidly explore the parameter space using SVI, the high-speed probabilistic programming library, NumPyro, along with its compatible CUDA and GPU, were used.
Based on the nucleic acid drug Fomivirsen (Perry and Balfour, 1999) (ID: A), two impurities with modified base sequences were added, and MS spectra for a total of three constituents were generated using simulation methods presented in the prior research (Tomono et al., 2024). Specific details were provided in Table 2. This setup replicated a system where the principal constituent’s isotopic distribution was mixed with the spectra of the impurities. The mutation from C (Cytosine) to U (Uracil), known as deamination, can occur during the synthesis process due to solvent conditions and thermal stress (Gao, Choudhry, and Cao, 2018; Stavnezer, 2011).
The single constituents A to C were combined according to the 10 combinations listed in Table 3. These combinations included both three-constituents mixtures (A, B, and C) and two-constituents mixtures (A and B, A and C, or B and C). To verify the accuracy of ion count inference, the ion counts of constituents A, B, and C were mixed at a ratio of 200,000:200,000 and 200,000:20,000. The reason for testing both balanced and imbalanced mixing ratios was to validate if our proposed algorithm tends to provide appropriate ratios of multiple constituents whether their actual ratios were balanced or imbalanced. When the ratio of ion counts between constituents was 10:1, the algorithm should not excessively provide less imbalanced ratios. This setup enabled the analysis of complex mixtures consisting of a few constituents. For instance, the standards for total desamido impurity and total impurities in injectable glucagon are set at 14% or less and 31% or less, respectively (Bao et al., 2022). To ensure rigor, we selected a stricter ratio of 10:1 (10%), which is below these standards yet sufficiently impactful to be considered as impurities. Additionally, the 10 patterns of combinations of each constituent were selected to comprehensively evaluate differences of 1 Da due to deamidation, while also considering workload required for our experimental performance evaluation and the constraints of a budget.
We set the number of chargeable sites
Next, we generated the MS/MS spectra of these mixture. The sequences, molecular formulas, monoisotopic masses, and conversion rates of the fragments generated from the dissociation of constituents A, B, and C are defined in Table 4. The MS/MS spectra were generated using these parameters. This time, we selected five peaks in ascending order of
We estimated the optimal parameters for assumed constituent count models. Table 5 presents the logarithm of the maximum posterior probabilities of each model. By selecting the constituent count that maximizes the logarithm of the posterior probability in each mixture, we inferred the number of constituents present in each mixture. Our method successfully inferred the true number of constituents in 80% of cases (8 out of 10 datasets). In the two cases where estimation failed, it is possible that the algorithm converged to a different local minimum.
Currently, there are no established guidelines for the quality control of nucleic acid-based pharmaceuticals (International Council for Harmonisation, 2023; World Health Organization, 2018). Therefore, we believe this result serves as a valuable benchmark for identifying the presence and quantity of impurities in pharmaceuticals and implementing appropriate corrective measures. However, there is still room for improvement in its accuracy.
Table 6 shows the optimal monoisotopic mass of the models of the selected number of constituents for each mixture, as described in Table 3. The median error was −0.005 Da, the average error in monoisotopic mass was −0.282 Da, and the maximum error was −1.840 Da. The standard deviation was 0.552 Da. The distribution of these errors is shown in Figure 1. As observed in the box plot in Figure 1, the errors in the monoisotopic masses inferred by the proposed method are discretely distributed approximately 1 Da apart, corresponding to the mass differences between isotopes. The extreme case of No. 6, which produced the maximum error of −1.840 Da, can also be explained by this discrete distribution. This large error is likely caused by the posterior probabilities of the monoisotopic masses being distributed in a comb-like pattern (Tomono et al., 2024), increasing the chances of the algorithm converging to a local minimum 1–2 steps away. However, no clear trend was observed between the ion count ratios of the constituents and the error magnitudes. Using the mean as the representative value and all data from No. 1 to No. 10, the 95% confidence interval calculated using the t-distribution (Student, 1908) ranges from −0.721 Da to +0.157 Da. This indicates the method could potentially be used to investigate the causes of impurities that occur with a difference of 1 Da (Rentel et al., 2019; Pourshahian, 2021).
Figure 1. Distribution of errors in the inferred monoisotopic masses (excluding points that the algorithm could not infer).
Additionally, the inferred ion counts for each constituent showed errors with a median of 1.1 times the true values, averaging up to twice the true values, with some errors reaching up to twelve times the true values, as shown in Table 7. This was thought to be due to the trade-off relationship between the ion counts of different constituents; that was, a decrease in the ion count of one constituent was compensated by an increase in another. This was further supported by the fact that the average error across the total ion counts of all constituents stabilized at 8% of the true value. For instance, the standard for total desamido impurities and total impurities in injectable glucagon were, respectively, below 14% and 31%. Therefore, the accuracy of ion count inference in our proposed method was insufficient to assess the impact of impurities.
Table 7. Optimal ion counts and relative quantities of the model with the maximum posterior probability.
We also performed deconvolution on the same mixture data using UniDec, a popular deconvolution software, and compared the inference results. For this verification, we used UniDec (Version 7.0.1). The specific parameter settings used during this verification are shown in Table 8. The Mass Range was set to the same range as the proposed method, and Sample Mass Every (Da) was set to 0.1 to ensure sufficient detection of impurities with a difference of 1 Da. Default values were used for parameters not mentioned.
The accuracy of estimating the number of constituents was 40% (4 out of 10). This was thought to be because the UniDec algorithm, which iterated through multiple deconvolutions to arrive at the number of constituents, did not necessarily guarantee the accuracy of the constituent count. Note that using UniDec to determine the number of constituents was not its intended application. The median error of the monoisotopic mass inferred using UniDec was −0.008 Da, which is slightly worse than that of the proposed method. On the other hand, the average error was 0.091 Da, and the maximum error was 0.993 Da, both slightly better than those of the proposed method. However, in principle, accurate inference on the monoisotopic mass required precise identification of the number of constituents. The error in estimating the number of ions was, on average, 3.2 times the true value and up to 17 times at maximum. This result was not better than that of the proposed method.
For reference, Figure 2 presents a comparison between the spectrum of Mixture No. 1 and the spectrum reconstructed from its inferred parameters. Figure 2A provides an overview of the charge distribution, while Figure 2B offers a detailed view of the isotopic distribution. The gray vertical dashed lines in Figures 2A, B indicate the m/z of the fragmented ions. Additionally, Figures 2C, D display the MS/MS spectrum of the fragmented ion groups and its detailed view, respectively. The five graphs correspond to the five peaks in Figure 2B, each representing the MS/MS spectra of the ions contained in those peaks when they are fragmented. These results demonstrated that the generated spectrum closely matched with the observed data. Furthermore, the appearance of the MS/MS spectra was consistent with findings from prior research cited in references (Agthoven et al., 2019; Szalwinski et al., 2020; Gonzalez et al., 2022).
Figure 2. Comparison of observed and inferred spectra for Mixture No. 1. (A) MS spectrum overall view, (B) MS spectrum enlarged view, (C) MS/MS spectrum overall view, (D) MS/MS spectrum enlarged view.
We confirmed that our proposed method allowed for accurate inference of parameters such as monoisotopic mass from simulated MS and MS/MS data of the nucleic acid drug Fomivirsen and its impurities, and it also successfully selected the correct number of constituents with an 80% probability, even in mixtures with a mass ratio of 10:1. These results were better compared to the 40% accuracy rate achieved with UniDec. This success was attributed to our approach of creating models for each constituent count, enabling comparative evaluation and selection of models for each constituent count. This capability suggests the presence of impurities in pharmaceuticals and could aid in the search for better synthesis conditions for medium to high molecular weight drugs, as well as in quality assurance in manufacturing facilities.
As shown in Table 6, we were able to infer monoisotopic mass with greater accuracy than previous studies (Tomono et al., 2024), with an average inference error of −0.282 Da, which was an improvement over the 1.348 Da error reported in prior research. Although this accuracy was slightly inferior to UniDec’s 0.091 Da, it was sufficient for distinguishing differences as small as 1 Da due to deamidation. We believe this improvement is due to the incorporation of the MS/MS spectra into the physical model, which increased the constraints on the model’s degrees of freedom. Additionally, the use of the correlation-based likelihood contributed to more stringent constraints on the spectral shape.
As indicated in Table 7, the inferred ion quantities for each constituent showed an average relative error of twice the true value. Although a direct comparison with the prior studies, which used a 1:1 mixing ratio, was not straightforward due to our use of a 10:1 ratio, the results were favorable compared to UniDec, which had an average error of 3.2 times the true value. The errors observed in our proposed method might result from a trade-off among the ion quantities of each constituent, where a decrease in one was offset by an increase in another. Despite our expectations that incorporating MS/MS spectra would tighten inference constraints and enhance both mass and ion quantity accuracy, the performance fell short of expectations, failing to reduce the relative error to below the 10% threshold required for impurity analysis in nucleic acid drugs. A possible solution to these issues would be to represent the ion quantities as probability distributions. By accounting for the uncertainty in the ion quantities of constituents in the sample, an improvement in inference accuracy was expected.
Despite the sixfold increase in data volume due to the incorporation of MS/MS spectra as observational data, the analysis time per data point was 13 h. While this duration did not match the few seconds required by UniDec, it was less than half the time required by existing methods (Tomono et al., 2024) that use MCMC.
In this study, we assumed the numbers of constituents in a given sample and created models of MS and MS/MS mass spectrometry based on parameters such as monoisotopic mass and ion quantity. We then applied our proposed method, Spectral Annealing Inference (SAI), which effectively seeks the maximum posterior probability by optimizing parameters for the observed data. After obtaining the maximum posterior probability for each constituent count model, we selected the model that had the highest maximum posterior probability across all models. As a result, we successfully estimated the number of constituents and simultaneously inferred the monoisotopic mass with high accuracy.
Future challenges include adapting to complex samples with a large number of constituents and improving the accuracy of ion counts inference.
The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation.
TT: Conceptualization, Investigation, Methodology, Project administration, Software, Validation, Visualization, Writing–original draft. SH: Conceptualization, Writing–review and editing. JI: Writing–review and editing. TW: Conceptualization, Supervision, Writing–review and editing.
The author(s) declare that no financial support was received for the research, authorship, and/or publication of this article.
We extend our deepest gratitude to Yoshihiro Ueno3, Yusuke Tagawa3, and Daisuke Hiramaru2,5 for their invaluable advice and coordination throughout the entire project. This manuscript was partially translated with the assistance of ChatGPT (GPT-4, OpenAI) as of September 2024. The final translation were reviewed and confirmed by the authors.
Authors TT and JI were employed by Shimadzu Corporation.
The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.
Agthoven, M. A. van, Lam, Y. P. Y., O’Connor, P. B., Rolando, C., and Delsuc, M.-A. (2019). Two-dimensional mass spectrometry: new perspectives for tandem mass spectrometry. Eur. Biophysics J. EBJ 48 (3), 213–229. doi:10.1007/s00249-019-01348-5
Bao, Z., Cheng, Y.-C., Luo, M. Z., and Zhang, J. Y. (2022). Comparison of the purity and impurity of glucagon-for-injection products under various stability conditions. Sci. Pharm. 90 (2), 32. doi:10.3390/scipharm90020032
Dasari, S., Wilmarth, P. A., Reddy, A. P., Robertson, L. J. G., Nagalla, S. R., and Larry, L. D. (2009). Quantification of isotopically overlapping deamidated and 18O-labeled peptides using isotopic envelope mixture modeling. J. Proteome Res. 8 (3), 1263–1270. doi:10.1021/pr801054w
Ferrige, A., Ray, S., Alecio, R., Ye, S., and Waddell, K. (2003). Electrospray-MS charge deconvolutions without compromise – an enhanced data reconstruction algorithm utilising variable peak modelling. Santa Fe, NM: American Society for Mass Spectrometry. Available at: https://positiveprobability.com/POSTERS/ASMS%202003.pdf.
Ferrige, A. G., Seddon, M. J., Green, B. N., Jarvis, S. A., Skilling, J., and Staunton, J. (1992). Disentangling electrospray spectra with maximum entropy. Rapid Commun. Mass Spectrom. RCM 6 (11), 707–711. doi:10.1002/rcm.1290061115
Gao, J., Choudhry, H., and Cao, W. (2018). Apolipoprotein B MRNA editing enzyme catalytic polypeptide-like family genes activation and regulation during tumorigenesis. Cancer Sci. 109 (8), 2375–2382. doi:10.1111/cas.13658
Gonzalez, L. E., Szalwinski, L. J., Sams, T. C., Dziekonski, E. T., and Cooks, R. G. (2022). Metabolomic and lipidomic profiling of Bacillus using two-dimensional tandem mass spectrometry. Anal. Chem. 94 (48), 16838–16846. doi:10.1021/acs.analchem.2c03961
International Council for Harmonisation (ICH) (2023). ICH Q2(R2): validation of analytical procedures. Geneva, Switzerland: International Council for Harmonisation.
Kingma, D. P., and Jimmy, Ba. (2014). Adam: a method for stochastic optimization. ArXiv [Cs.LG]. doi:10.48550/arXiv.1412.6980
Kingma, D. P., and Welling, M. (2013). Auto-encoding variational Bayes. ArXiv [Stat.ML]. Available at: http://arxiv.org/abs/1312.6114v11.
Lucy, L. B. (1974). An iterative technique for the rectification of observed distributions. Astronomical J. 79 (June), 745. doi:10.1086/111605
Marty, M. T. (2020). A universal score for deconvolution of intact protein and native electrospray mass spectra. Anal. Chem. 92 (6), 4395–4401. doi:10.1021/acs.analchem.9b05272
Marty, M. T., Baldwin, A. J., Marklund, E. G., Hochberg, G. K. A., Benesch, J. L. P., and Robinson, C. V. (2015). Bayesian deconvolution of mass and ion mobility spectra: from binary interactions to polydisperse ensembles. Anal. Chem. 87 (8), 4370–4376. doi:10.1021/acs.analchem.5b00140
Neath, A. A., and Cavanaugh, J. E. (2012). The bayesian information criterion: background, derivation, and applications. WIREs Comput. Stat. 4 (2), 199–203. doi:10.1002/wics.199
Pecori, R., Di Giorgio, S., Paulo Lorenzo, J., and Nina Papavasiliou, F. (2022). Functions and consequences of AID/APOBEC-Mediated DNA and RNA deamination. Nat. Rev. Genet. 23 (8), 505–518. doi:10.1038/s41576-022-00459-8
Perry, C. M., and Balfour, J. A. (1999). Fomivirsen. Drugs 57 (3), 375–380. doi:10.2165/00003495-199957030-00010
Pourshahian, S. (2021). Therapeutic oligonucleotides, impurities, degradants, and their characterization by mass spectrometry. Mass Spectrom. Rev. 40 (2), 75–109. doi:10.1002/mas.21615
Ranganath, R., Gerrish, S., and Blei, D. (2014). “Black box variational inference,” in Proceedings of the seventeenth international conference on artificial intelligence and statistics. Proceedings of machine learning research. Editors S. Kaski, and J. Corander (Reykjavik, Iceland: PMLR), 33, 814–822.
Rentel, C., DaCosta, J., Roussis, S., Chan, J., Capaldi, D., and Bao, M. (2019). Determination of oligonucleotide deamination by high resolution mass spectrometry. J. Pharm. Biomed. Analysis 173 (September), 56–61. doi:10.1016/j.jpba.2019.05.012
Richardson, W. H. (1972). Bayesian-based iterative method of image restoration. J. Opt. Soc. Am. 62 (1), 55. doi:10.1364/josa.62.000055
Sanghvi, Y. S. (2011). “A status update of modified oligonucleotides for chemotherapeutics applications,” in Current protocols in nucleic acid chemistry. Editor S. L. Beaucage, 1–22.
Schwarz, G. (1978). Estimating the dimension of a model. Ann. Statistics 6 (2), 461–464. doi:10.1214/aos/1176344136
Stavnezer, J. (2011). Complex regulation and function of activation-induced cytidine deaminase. Trends Immunol. 32 (5), 194–201. doi:10.1016/j.it.2011.03.003
Szalwinski, L. J., Holden, D. T., Morato, N. M., and Cooks, R. G. (2020). 2D MS/MS spectra recorded in the time domain using repetitive frequency sweeps in linear Quadrupole ion traps. Anal. Chem. 92 (14), 10016–10023. doi:10.1021/acs.analchem.0c01719
Tamara, S., den Boer, M. A., and Heck, A. J. R. (2022). High-resolution native mass spectrometry. Chem. Rev. 122 (8), 7269–7326. doi:10.1021/acs.chemrev.1c00212
Tomono, T., Hara, S., Nakai, Y., Takahara, K., Iida, J., and Washio, T. (2024). A bayesian approach for constituent estimation in nucleic acid mixture models. Front. Anal. Sci. 3. doi:10.3389/frans.2023.1301602
Weinberg, W. C., Frazier-Jessen, M. R., Wu, W. J., Weir, A., Hartsough, M., Keegan, P., et al. (2005). Development and regulation of monoclonal antibody products: challenges and opportunities. Cancer Metastasis Rev. 24 (4), 569–584. doi:10.1007/s10555-005-6196-y
Wingate, D., and Weber, T. (2013). Automated variational inference in probabilistic programming. ArXiv E-Prints, January, arXiv:1301.1299.
World Health Organization (WHO) (2018). Good practices for pharmaceutical quality control laboratories. Geneva, Switzerland: World Health Organization.
Zhang, J., Gonzalez, E., Hestilow, T., Haskins, W., and Huang, Y. (2009). Review of peak detection algorithms in liquid-chromatography-mass spectrometry. Curr. Genomics 10 (6), 388–401. doi:10.2174/138920209789177638
Keywords: LC-MS, ESI, chemometrics, Bayesian inference, deconvolution, signal processing, nucleic-acid-drugs
Citation: Tomono T, Hara S, Iida J and Washio T (2025) Enhancing constituent estimation in nucleic acid mixture models using spectral annealing inference and MS/MS information. Front. Anal. Sci. 5:1494615. doi: 10.3389/frans.2025.1494615
Received: 11 September 2024; Accepted: 23 January 2025;
Published: 20 February 2025.
Edited by:
Venugopal Rao Soma, University of Hyderabad, IndiaReviewed by:
Jingzhe Li, Lam Research, United StatesCopyright © 2025 Tomono, Hara, Iida and Washio. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
*Correspondence: Taichi Tomono, dF90YWljaGlAc2hpbWFkenUuY28uanA=
Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.
Research integrity at Frontiers
Learn more about the work of our research integrity team to safeguard the quality of each article we publish.