- 1Institute of Biology, ELTE Eötvös Loránd University, Budapest, Hungary
- 2National Laboratory for Health Security, ELTE Eötvös Loránd University, Budapest, Hungary
- 3National Reference Laboratory for Retroviruses, Department of Virology, National Center for Public Health and Pharmacy, Budapest, Hungary
- 4Department of Bioinformatics, Semmelweis University, Budapest, Hungary
- 5Department of Microbiology, Immunology and Transplantation, Rega Institute, KU Leuven, Leuven, Belgium
- 6Department of Infectology, Central Hospital of Southern Pest, National Institute of Hematology and Infectious Diseases, National Center of HIV, Budapest, Hungary
- 7Department of Genetics, ELTE Eötvös Loránd University, Budapest, Hungary
- 8Synthetic and Systems Biology Unit, Institute of Biochemistry, HUN-REN Biological Research Centre, Szeged, Hungary
- 9HUN-REN Office for Supported Research Groups, Budapest, Hungary
- 10Department of Infectious Diseases and Hospital Epidemiology, University Hospital Zurich, University of Zurich, Zurich, Switzerland
- 11Institute of Medical Virology, University of Zurich, Zurich, Switzerland
- 12Department of Internal Medicine and Hematology, Departmental Group of Infectious Diseases, Semmelweis University, Budapest, Hungary
Introduction: Although antiretroviral therapy can suppress the transmission of HIV-1, the pandemic persists and continues to evolve. Monitoring virus transmission patterns and evolving variants is therefore essential for improving prevention strategies. To address this need, we present the first comprehensive molecular epidemiological analysis of the HIV-1 epidemic in Hungary—a country in the contact zone of major HIV-1 subtypes A, B, and F.
Methods: We analyzed partial pol sequences obtained from 1,120 Hungarian patients in the context of routine drug resistance genotyping between 2008 and 2024, along with 2,202 international background sequences selected based on sequence similarity. We performed subtyping, drug resistance testing, maximum likelihood and Bayesian phylogenetic inference, distance-based and phylogenetic clustering, and Bayesian phylogeographic analyses to identify domestic clusters and cross-border transmission.
Results: Most sequences (814/1,120) belonged to subtype B; however, the frequency of non-B subtypes (mainly A, F, and several recombinant forms) has increased steadily since 2014, reaching 41.3% in 2024. Phylogenetic analyses identified 136 domestic clusters and a large transmitted drug resistance clade persisting over two decades. Remarkably, one in six recent diagnoses mapped to a single cluster, and approximately a third of all new diagnoses to the five most active clusters. Individuals in larger clusters were more often young, men who have sex with men (MSM), and had higher CD4+ counts. While the most affected risk group is still MSM, suspected heterosexual cases have increased recently, with a clear separation between sub-epidemics. Incorporation of international sequences revealed 149 mixed clusters, 56 mixed monophyletic pairs, and at least 122 independent introductions, linking the Hungarian epidemic predominantly to other European countries (Germany, UK, Poland, Spain, Italy, and Croatia for subtype B; Russia and Poland for subtype A; Romania for subtype F, Italy for CRF18_cpx), and to more distant sources for CRF01_AE (Thailand, China), CRF02_AG (Spain, Cameroon) and CRF56_cpx (Turkey).
Discussion: In summary, recent HIV infections in Hungary stem mainly from domestic transmission among MSM, with a few highly active transmission clusters underscoring the need for targeted interventions. The epidemic is most strongly linked to Western and Central Europe, with increasing introductions and spread of non-B subtypes from Eastern Europe and beyond.
Graphical Abstract. Created in BioRender. Zsichla, L. (2025) https://BioRender.com/za6ltyr.
1 Introduction
Human Immunodeficiency Virus type 1 (HIV-1) is characterized by high genetic diversity, primarily due to its error-prone reverse transcriptase and the absence of proofreading mechanisms (Roberts et al., 1988). The strain of HIV-1 responsible for the global pandemic, known as group M, has diversified into 10 recognized subtypes (A–D, F–H, and J–L) (Nair et al., 2024), which, due to distinct founder effects and subsequent local transmission (Ferdinandy et al., 2015), have become predominant in specific geographical regions. In addition, recombination between HIV-1 strains further contributes to genetic variability, leading to the emergence of 167 known circulating recombinant forms (CRFs) as of November 2025, which demonstrate evidence of sustained transmission, in contrast to unique recombinant forms (URFs) (LANL Sequence Database, 2025). This extensive genetic diversity facilitates the tracking of transmission dynamics within and between population groups (Beloukas et al., 2016; Nduva et al., 2021) and the identification of case clusters at both national and international levels (Hassan et al., 2017; Grabowski et al., 2018).
Hungary, a country with a population of 9.5 million, has one of the lowest rates of HIV diagnoses in the WHO European Region, with 2.1–2.7 registered cases per 100,000 inhabitants annually since 2014 and a cumulative total of 5,123 HIV and 1,285 AIDS diagnoses from 1986 to the end of 2024 (European Centre for Disease Prevention Control, 2024). Approximately 9 out of 10 diagnosed individuals were male, and the most prevalent risk group was men who have sex with men (MSM, 77.9% of cases). Other documented transmission routes—heterosexual contact (HET, 18.4%), people who inject drugs (PWID, 1.0%), and mother-to-child transmission (MTCT, 0.6%)—represent less prevalent risk categories (National Center for Public Health Pharmacy, 2025). Similar to other MSM-dominant epidemics in Europe, the HIV-1 epidemic in Hungary has historically been dominated by subtype B; however, several non-B subtypes began to increase in prevalence in the mid-2010s (Áy et al., 2020).
Hungary is situated at the intersection of several epidemiologically distinct regions, which makes it an ideal sentinel country to detect geographical patterns of international HIV spread or the expansion of (potentially) more transmissible subtypes. Neighboring countries to Hungary's north, west, and south—Austria (Leierer et al., 2024), Slovakia (Habekova et al., 2010), Slovenia (Lunar et al., 2015, 2018), Croatia (Oroz et al., 2019; Planinić et al., 2023), and Serbia (Jovanović et al., 2019)—exhibit low-prevalence HIV epidemics that are dominated by subtype B and concentrated in the MSM risk group, similar to the epidemic in Hungary (European Centre for Disease Prevention Control, 2024). In contrast, Ukraine, located at Hungary's northeastern border and at the edge of the subtype A-prevalent region of Eastern Europe (particularly affected by the A6 lineage), exhibits a higher overall HIV prevalence, with a larger proportion of cases among PWID and HET individuals (Vasylyeva et al., 2018, 2019). The ongoing war has resulted in large-scale displacement of Ukrainian citizens since 2022, mainly toward Russia, Poland, Romania, the Republic of Moldova, Slovakia, and Hungary (Lloyd and Sirkeci, 2022), potentially facilitating cross-border transmission of the A6 sub-subtype (Serwin et al., 2023, 2024; Weibull Wärnberg et al., 2023). Finally, to the east, Romania has another distinct epidemiological profile, uniquely dominated by subtype F (Preda and Manolescu, 2022). The country also has a Hungarian ethnic minority population exceeding 1 million individuals, and accounts for the largest number of non-citizen residents in Hungary (Kincses, 2015).
This unique epidemiological context implies that the genomic surveillance of HIV-1 in Hungary has relevance at the regional and international levels and can provide early warning signals for the cross-regional transmission of expanding HIV-1 lineages. Leveraging data from the national routine drug resistance monitoring system, we present the first large-scale molecular epidemiological analysis of the HIV-1 epidemic in Hungary until the end of 2024, seeking to identify transmission clusters, epidemiological links to and from other countries, and analyze the shifting dominance of HIV-1 subtypes and the compartmentalization of risk groups.
2 Materials and methods
2.1 Study population and sequencing
In this nationwide molecular epidemiological study, we analyzed sequences of 1,146 peripheral blood samples collected from 1,120 individuals (diagnosed between 1993 and 2024) living with HIV-1 between 2008 and 2024, as part of routine HIV-1 genotyping. In our analyses, we retained the first sample per patient, resulting in a final set of 1,120 analyzed sequences.
Serving as a primary HIV Center in Hungary, the National Institute of Hematology and Infectious Diseases in the Central Hospital of Southern Pest provided the majority of blood samples used to obtain the HIV-1 sequences included in this study. To enhance the comprehensiveness and geographic coverage of our dataset, additional samples were contributed from a clinic at Semmelweis University and two smaller HIV treatment centers in Pécs and Debrecen, both established in or after 2014. Most of the samples were obtained from newly diagnosed, antiretroviral therapy (ART)-naïve patients living with HIV-1; however, ART-experienced individuals exhibiting detectable HIV-1 viral load and patients with known pre-exposure prophylaxis (PrEP) usage were also included in this study.
Sample processing and Sanger sequencing of the viral protease (PR), partial reverse transcriptase (RT), and integrase (INT) genomic regions were performed by the National Reference Laboratory for Retroviruses at the National Center for Public Health and Pharmacy, Budapest, Hungary, in accordance with previously described protocols (Mezei et al., 2011; Áy et al., 2020, 2021).
2.2 Subtyping and drug resistance analyses
For HIV-1 subtype determination, the PR, RT and INT coding sequences were analyzed using the Stanford HIVDB subtyping program (Rhee et al., 2003), the REGA (de Oliveira et al., 2005) and the COMET tools (Struck et al., 2014). The final subtype assignment was determined by expert opinion based on the results of the three tools. Sequences that could not be unambiguously classified were labeled as “non-typable”. The presence and clinical relevance of drug resistance mutations (DRMs) were assessed by the Stanford HIV Drug Resistance Database algorithm (Liu and Shafer, 2006); mutations classified as surveillance drug resistance mutations (SDRMs) that formed clusters of three or more sequences on the maximum likelihood phylogenetic tree of sequences sampled in Hungary were defined as transmitted drug resistance (TDR) clades in our analysis.
2.3 Maximum likelihood phylogenetic analysis
The genomic region covering partial PR and RT (PRRT) was amplified as a single fragment in most samples, while in a subset of samples, it was generated as two contigs. These contigs were then concatenated by inserting indeterminate “N” nucleotides at the missing positions (primarily between codons RT13-32), based on alignment with the HXB2/K03455 reference sequence. Additionally, the full INT region of the genome was amplified for most samples.
Three maximum likelihood phylogenetic trees were built from the sequences sampled in Hungary. The first tree, used to characterize domestic transmission in Hungary, included all sequences sampled in Hungary and a reference dataset encompassing HIV-1 group M subtypes (A, B, C, D, F, and G) and circulating recombinant forms (CRFs) possibly present in Hungary based on initial subtyping results (Los Alamos Sequence Database, Recombination Identification Program Alignment 2023, https://www.hiv.lanl.gov/content/sequence/NEWALIGN/help.html#RIP, accessed on 13 January 2025). The second tree—to identify transmission clusters specific to the country and possible epidemiological links to other countries—additionally to the sequences from Hungary and the reference dataset, included an additional set of background control sequences. This set was constructed by pooling the five most similar NCBI BLASTN (megablast) (Zhang et al., 2000) search results against all HIV sequences in NCBI GenBank for each of the partial pol sequences obtained in Hungary. All sequences submitted previously to GenBank from Hungary were excluded, as these are already reported in (Mezei et al., 2011; Áy et al., 2020). GenBank was queried separately for PRRT, INT, and concatenated PRRT and INT sequences (concatenated in the same way as described previously for PR and partial RT sequences). The resulting three sequence sets were combined and deduplicated before further analysis. The third phylogenetic tree included all sequences sampled in Hungary classified as “non-typable” based on subtyping results, and the whole Recombination Identification Program Alignment to determine the subtype of ambiguously subtyped sequences.
Multiple sequence alignments were performed separately for PRRT and INT sequences using MAFFT v7.490 (Katoh and Standley, 2013) [with five cycles of iterative refinement (- -maxiterate 5)]. Nucleotide positions associated with the most common drug resistance mutations, based on the Stanford drug resistance mutation lists (Liu and Shafer, 2006), were excluded to avoid the confounding effect of ART-driven convergent evolution. Additionally, nucleotide positions at both ends of the alignments that were either absent or indeterminate in at least 50% of the sequences from Hungary or in the combined dataset were trimmed. The PRRT and INT alignments were manually inspected and corrected using Aliview v1.28 (Larsson, 2014). After this step, the alignments were concatenated into a unified PRRT-INT dataset, combining the two genomic fragments. The total alignment length for the dataset including only sequences from Hungary was 2,084 base pairs (918 bp for the PRRT and 1,166 bp for the INT region). Sequences from Hungary covered HXB2 positions 2,268-3,308 (codons PR6-RT253) and 4,044-5,242 (codons RNAse59-Vif63, including the full integrase region INT1-288). With the inclusion of international background sequences, the total alignment length increased to 2,137 bp (945 bp for the PRRT and 1,192 bp for the INT region).
Maximum likelihood phylogenetic trees were constructed using RAxML-NG v1.2.2 (Kozlov et al., 2019) under the GTR+Γ model with partitioned analysis. Branch support values were calculated using the automated bootstrap method (- -autoMRE) in RAxML-NG, with a maximum of 500 replicates.
The steps of the data pipeline were automated using scripts implemented in GNU bash v5.1.16, Python v3.10.11 (Van Rossum and Drake, 2009), and R v4.3.1 (R Core Team, 2023).
2.4 Transmission cluster analyses
In our main analysis, potential transmission clusters were identified using ClusterPicker v1.2.5 (Ragonnet-Cronin et al., 2013) applied on phylogenetic trees and HIV-TRACE v0.8.0 (Kosakovsky Pond et al., 2018) on multiple sequence alignments; we then considered the union of clusters identified by both methods. ClusterPicker was applied with a maximum genetic distance threshold of 0.045 nucleotide substitutions per site within clusters and a bootstrap support threshold of 80 (Novitsky et al., 2020). HIV-TRACE was set to use a maximum pairwise distance threshold of 0.01 substitutions per site (Wertheim et al., 2017; Labarile et al., 2023) and the preprocessed sequence of the HXB2 reference genome (GenBank accession number K03455.1). Sequence sets with at least two entries were classified as transmission clusters in the Hungarian-only tree, while in the tree including international sequences, clusters were defined as those with at least two sequences, including at least one from Hungary. To assess the impact of different genetic distance thresholds and branch support values on the number and size of clusters both for individual tools and for the combined assignments, we conducted sensitivity analyses comparing results with distance thresholds of 0.005, 0.0075, 0.01, 0.0125, and 0.015 for HIV-TRACE, and 0.015, 0.03, 0.045, and 0.055 for ClusterPicker. For ClusterPicker, we tested branch support thresholds of 0, 50, 80, and 95. We estimated the growth of identified transmission clusters in the Hungarian-only tree based on the distribution of diagnosis dates.
2.5 Phylogenetic cherry analysis
We extracted all monophyletic pairs—henceforth referred to as “cherries”—from the phylogenetic tree including international background sequences, using the R package ape v5.8 (Paradis and Schliep, 2019), applying a maximum phylogenetic distance of 0.045 substitutions per site (Kusejko et al., 2022). Cherries were classified into two categories: domestic cherries, where both sequences were obtained in Hungary; and mixed cherries, where one sequence originated in Hungary and the other abroad. Additionally, we calculated the assortativity factor (AF) for risk groups, defined as the ratio of expected (random pairings of tips) to observed pairings among domestic cherries. We checked the sensitivity of assortativity analysis results by repeating the cherry extraction using maximum phylogenetic distances of 0.015, 0.03, and 0.055 substitutions per site. For further details on the assortativity analysis, see (Kusejko et al., 2022).
2.6 Bayesian phylogenetic and phylogeographic analyses
Time-calibrated trees were built using Bayesian phylogenetic analysis in BEAST v10.5.0 (Baele et al., 2025). We used the concatenated Hungarian and international background PRRT-INT sequences described in the Maximum likelihood phylogenetic analysis section, supplemented with country of origin and sampling date data derived from NCBI GenBank for the background sequence set. Subtypes were analyzed separately, including only those with at least 10 sequences from Hungary (subtypes A, B, C, F, CRF01_AE, CRF02_AG, CRF18_cpx, and CRF56_cpx).
The Bayesian analysis employed the GTR+Γ nucleotide substitution model with codon partitions, combined with an uncorrelated lognormal relaxed molecular clock model and a Bayesian Skygrid coalescent tree prior. Ancestral state reconstruction used an asymmetric substitution model. We implemented a Bayesian stochastic search variable selection (BSSVS) analysis to investigate migration flows, simultaneously obtaining Markov jumps indicating import events to, or export from Hungary. Priors for the mean clock rate parameters were estimated based on the dataset of Bletsa et al. (2019).
Markov chain Monte Carlo analyses were run for 107 to 108 generations, depending on dataset size. Convergence was assessed using Tracer v1.7.2 (Rambaut et al., 2018); the first 10% of samples were discarded as burn-in, and the effective sample size threshold for all parameters was set to 200.
The maximum clade credibility (MCC) tree was generated using TreeAnnotator v1.10.4 (Helfrich et al., 2018) with node heights based on common ancestor heights. The timing of introduction events into Hungary was estimated using the median node heights, the 95% highest posterior density intervals of the most recent common ancestor (MRCA) of Hungarian clades and the MRCA of these clades with their closest international background sequence(s).
2.7 Statistical analyses
To compare trait frequencies between samples with B and non-B subtype sequences we used Fisher's exact test. Temporal trends in subtype incidence were evaluated using binomial logistic regression, with additional models adjusting for age and sex, and subtype-specific changes expressed as odds ratios (OR) with 95% confidence intervals and average marginal effects on the probability scale (percentage-point change per year, Δpp/year). Differences between patients in or outside large clusters were assessed using binomial logistic regression for categorical traits and for continuous traits dichotomized at the population median (age, CD4+ T cell count, and viral load), while two-sided t-tests were used to compare the original continuous values. To test associations between cherry type and patient characteristics we used chi-squared tests.
For phylogenetic distance calculations, we used the normalized Robinson-Foulds distance (Robinson and Foulds, 1981), quartet divergence (Estabrook et al., 1985), and Clustering Information Distance (Smith, 2020) metrics from the TreeDist R package (Smith, 2022). Ancestral states and the gain or loss of mutations along the largest TDR clade phylogeny were inferred using maximum parsimony.
To estimate emigration from and immigration to Hungary for specific countries, we used international bilateral migration flow data between 2015 and 2020 (Abel and Cohen, 2019) and inbound/outbound tourism statistics up to 2024 from the Hungarian Central Statistical Office (https://www.ksh.hu/stadat?lang=hu&theme=tur, accessed on April 2, 2025). HIV prevalence estimates were derived from the Global Burden of Disease Study 2021 Results (Global Burden of Disease Collaborative Network, 2022). Data on the number of sequences uploaded to NCBI GenBank by country of origin were queried on April 3, 2025. We modeled Markov jump imports and exports between Hungary and other countries using a joint multivariate negative binomial regression model. Predictors included bilateral migration rates (estimated with a closed demographic accounting system and pseudo-Bayesian method, retaining the sum of immigration and emigration), national HIV prevalence per 100,000, and the ratio of published sequences to the estimated number of PLWH. Continuous predictors were log-transformed and standardized. Model selection was assessed by absolute Akaike information criterion (AIC) values, and predictor importance was evaluated using likelihood-ratio tests. Observed vs. expected counts were compared to identify countries with over- or underrepresentation in international HIV connections.
Statistical significance was defined as P < 0.05 (two-tailed) for all tests. All statistical tests and visualizations were performed in R v4.3.1(R Core Team, 2023).
3 Results
3.1 Characteristics of the study population
A total of 1,120 patients diagnosed with HIV-1 between 1993 and 2024 were analyzed (with samples collected between 2008 and 2024), with the majority being therapy-naïve at the time of sampling (1,063 of 1,120; 94.9%) (Figure 1A). PrEP usage within the treatment-naïve group was documented in three cases (3 of 1,063; 0.3%). Among patients with known nationality (897 of 1,120), 8.5% (76 of 897) originated from countries other than Hungary, most commonly Ukraine (n = 17), Brazil (n = 5), China (n = 5), Romania (n = 5), Russia (n = 5), Serbia (n = 4), Argentina (n = 2), and Italy (n = 2), along with 31 other countries (n = 1 for each). The most common route of infection was among men who have sex with men (MSM; 951 of 1,120; 84.9%), followed by heterosexual transmission (HET) in 112 (10.0%) cases, maternal transmission (MTCT) in 7 (0.6%), injection drug use in 1 case (0.1%); transmission route was unknown for 49 (4.4%) patients. The majority of patients were male (1,034; 92.3%), female patients accounted for 84 (7.5%) cases, and the sex of 2 (0.2%) patients was not reported. The median age at diagnosis of the study group was 34 years (IQR: 27–42). At the time of sample collection, the median viral load was 56,000 copies/mL (IQR: 12,400–210,000), and the median CD4+ T cell count was 418 cells/μL (IQR: 241–613). The characteristics of the study population are summarized in Table 1.
Figure 1. Sampling intensity and subtyping in Hungary until the end of 2024. (A) The number of new registered and sequenced samples per year in Hungary. (B) Subtype distribution of sequenced samples from Hungary by year. (C) The ratio of subtype B and non-B subtypes among sequences from Hungary between 2014 and 2024. (D) Trend analysis of HIV-1 subtype incidence in Hungary. Odds ratios (OR) for subtype detection over the study period are shown with 95% confidence intervals (error bars). Error bars are colored by subtype.
3.2 Frequency and trends of HIV-1 subtypes
Using information from three subtyping tools, we found that the majority of patients carried subtype B HIV-1 strains (814; 72.7%), while pure subtypes F (112; 10.0%), A (71; 6.3%; sub-subtypes: A6: 57 and A1: 9, ambiguous: 5), C (11; 1.0%) and G (2; 0.2%), and CRFs CRF02_AG (31; 2.8%), CRF01_AE (19; 1.7%), CRF18_cpx (12; 1.1%), CRF06_cpx (2; 0.2%) and CRF19_cpx (2; 0.2%) were also detected (Figure 1B). We classified 43 sequences as non-typable due to ambiguous subtyping and considered them as “non-B” in subsequent analyses.
The annual incidence of subtype B significantly decreased over the study period, from almost complete dominance in 2014 to approximately 60% in 2024 (OR = 0.82, 95% CI 0.78–0.86; Δpp/year = −3.68; FDR < 0.0001) (Figures 1C, D). This decline in incidence remained significant after adjusting for co-factors (age and sex) in multivariate analyses (OR = 0.83, 95% CI 0.78–0.87; Δpp/year = −3.52; P < 0.0001) and when the test was restricted to men (OR = 0.81, 95% CI 0.77–0.86; Δpp/year = −3.66; P < 0.0001). The decreasing trend of subtype B was observed among MSM (OR = 0.82, 95% CI 0.77–0.87; Δpp/year = −3.42; OR < 0.0001), but for the HET transmission group it was not significant (OR = 0.92, 95% CI 0.82–1.04; Δpp/year = −2.02; P = 0.188) (Supplementary Figure 1). Among non-B subtypes, the incidence of subtype A (OR = 1.13, 95% CI 1.03–1.13; Δpp/year = 0.80; FDR = 0.020), subtype F (OR = 0.81, 95% CI 0.77–0.86; Δpp/year = −3.66; FDR = 0.002), CRF18_cpx (OR = 4.40, 95% CI 1.80–10.7; Δpp/year = 1.64; FDR = 0.003), and non-typable sequences (OR = 1.27, 95% CI 1.12–1.45; Δpp/year = 0.89; FDR = 0.001) increased significantly over the study period.
The ratio of non-B subtypes was significantly associated with female sex (P < 0.0001) and with the HET risk group (P < 0.0001) compared to all other risk groups (Table 1). The most prevalent non-B subtypes among HET individuals were A (18/112; 16.1%), F (15/112; 13.4%), CRF02_AG (9/112; 8.0%), CRF01_AE (7/112; 6.3%), and C (4/112; 3.6%). The percentage of females and non-MSM individuals was exceptionally high in the case of subtype C (45.5% and 54.6%), CRF01_AE (26.3% and 36.8%) and CRF02_AG (22.6% and 30.0%) and lower for more prevalent subtypes such as subtype A (12.7% females and 26.5% non-MSM), B (5.05% and 6.92%) and F (10.7% and 18.2%) and non-typable sequences (11.6% and 23.3%). The ratio of B compared to non-B subtypes showed no significant association with age groups or CD4+ T cell number at diagnosis (Table 1).
3.3 Maximum likelihood phylogenetic reconstruction and clustering analysis of sequences from Hungary
To characterize the domestic dynamics of the epidemic in Hungary, we first conducted analyses on the set of sequences obtained in Hungary. A total of 198 partial pol sequences reported in earlier studies (Mezei et al., 2011; Áy et al., 2020), along with the 922 reported in this study, were included in the phylogenetic and transmission clustering analyses, resulting in a final dataset of 1,120 sequences (Figure 2 and Supplementary Figure 2). Overall, out of the 1,120 total sequences, 979 (87.4%) included both the partial PRRT and INT regions, 139 (12.4%) included only PRRT, and 2 (0.2%) included only INT.
Figure 2. Maximum likelihood phylogenetic tree of HIV-1 sequences obtained in Hungary up to the end of 2024. A reference dataset encompassing HIV-1 group M subtypes (A, B, C, D, F, G) and circulating recombinant forms (CRFs) present in Hungary was downloaded from the Los Alamos HIV Sequence Database (RIP Alignment 2023). Branches of the phylogenetic tree are colored according to transmission clusters. Colored concentric circles around the phylogeny denote the following patient/viral characteristics corresponding to each sequence (from innermost to outermost): (1) subtype of the sequence; (2) biological sex of the patient; (3) most likely route of infection (MSM, men who have sex with men; HET, heterosexual contact; MTCT, mother to child transmission; PWID, people who inject drugs); (4) age at diagnosis, (5) CD4+ T cell count, (6) surveillance drug resistance mutations grouped by drug class (PI, protease inhibitor; NRTI, nucleoside reverse transcriptase inhibitor; NNRTI, non-nucleoside reverse transcriptase inhibitor; INI, integrase inhibitor); and (7) positive or negative antiretroviral treatment history at the time of sampling.
Our applied molecular clustering approach identified 136 possible transmission clusters, each containing at least two sequences from Hungary. Approximately two-thirds of the sequences (846/1120; 75.5%) were part of clusters. Among clustered patients from Hungary with available metadata, the majority were male (95.7%), belonged to the MSM population (92.8%), and carried subtype B strains (75.5%) (Supplementary Table 1). Of the 76 patients with a nationality other than Hungarian, 33 (43.4%) were part of clusters.
Cluster sizes ranged from 2 to 96 (average = 6.2, median = 3, IQR: 2–5), with 17 clusters containing at least 10 sequences. These large clusters included 39.5% (442/1120) of all sequences, and the patients in these clusters were more frequently MSM (93.4% vs. 79.4%; OR = 3.71 (2.47–5.74), P < 0.0001), younger at sampling (median 33.6 vs. 37.1 years; OR for age below the population median = 1.94 (1.52–2.47), P < 0.0001), and had higher CD4+ T cell counts (median 520 vs. 404 cells/mm3; OR for CD4 above the population median = 2.33 (1.81–3.00), P < 0.0001) and higher viral loads (median 4.75 vs. 4.63 log copies/mL; OR for viral load above the population median = 1.42 (1.11–1.82), P = 0.005) compared with patients assigned to no or to smaller clusters. Of the 17 large clusters, 12 consisted of subtype B sequences (9 patients with non-Hungarian nationality from 6 clusters), while we also detected two subtype F, one subtype A (A6, one patient with non-Hungarian nationality), one CRF02_AG (one patient with non-Hungarian nationality), and one CRF18_cpx cluster (one patient with non-Hungarian nationality) with at least ten sequences. Only one non-Hungarian patient was in a basal position (a Romanian patient within the largest cluster, n = 96), suggesting that all others likely acquired the infection in Hungary.
Four transmission clusters (n = 19) contained only non-typable sequences, all detected after 2018. An additional phylogenetic analysis with a more extended reference dataset indicated that their closest relatives were CRF56_cpx (two clusters of sizes 7 and 3), CRF20_BG (n = 3), and CRF141_BF (n = 6, including three non-Hungarian patients, one of whom—a Chinese patient—was in a basal position).
We performed sensitivity analyses for transmission cluster identification using a wide range of commonly applied parameters. The number of identified clusters and average cluster size varied considerably for ClusterPicker and HIV-TRACE when applied separately (Supplementary Figures 3A, 4). Using the union of the two methods identified 116–182 transmission clusters with an average size of 4.2–7.7, encompassing 46.0%−85.6% of all sequences (Supplementary Figure 5).
3.4 Transmitted drug resistance
We identified four TDR clades (including 5.0% of sequences overall) based on the clustering of surveillance drug resistance mutations (SDRMs) and polymorphisms in the Hungarian-only phylogeny (Figure 2). Three of these corresponded directly to clusters identified by our molecular clustering approach. The fourth and largest TDR clade was divided into three clusters (along with several non-clustering sequences) by the automated analysis due to the high genetic divergence within the clade (Supplementary Figure 6). Two of the 36 patients had received treatment at the time of sampling, with one occupying a basal position in the clade, suggesting that the origin of these SDRMs may have resulted from unsuccessful treatment of a patient in Hungary. The 36 sequences in this subtype B clade carried two characteristic DRMs: RT M41L (28/36) and RT T215E (36/36). Three additional polymorphisms were also prevalent: PR A71V (36/36), INT S230N (25/26), and INT M50I (10/26). The emergence of these five mutations was located near the root of the clade, and loss of mutations occurred several times, most often for RT M41L (6 times). Loss of INT M50I characterized a subclade that corresponded to the largest cluster within the clade identified by our molecular clustering approach. The combination of M41L and T215E, observed in most sequences, is associated with low to intermediate resistance to certain nucleoside reverse transcriptase inhibitors (NRTIs). In addition to the major TDR clade, a smaller subtype B cluster (n = 3) carried the RT T215E mutation only, which confers potential low to intermediate resistance to certain NRTIs. A subtype F cluster (n = 5) was characterized by the presence of PR L10V (5/5), INT S230N (5/5), RT T215S (4/5), and RT M41L (4/5), with the last two mutations contributing to low or intermediate NRTI resistance. Finally, a CRF18_cpx cluster (n = 12) contained PR F53Y, PR K10I, and INT L74I mutations in all sequences, none of which are associated with PR or INT resistance.
3.5 Cluster growth analysis
We analyzed the detectable growth of clusters based on the time of diagnosis for samples within each cluster (Figure 3). Using our main clustering parameters, we found that during the 2023–2024 period, 58 out of 136 clusters (42.65%) grew by at least one sequence, while 5 out of 136 clusters (3.68%) expanded by at least five sequences. Sequences added to these five clusters accounted for 35.7% (82/230) of all newly diagnosed and sequenced cases in 2023–2024.
Figure 3. Cumulative size of clusters identified from Hungarian HIV-1 sequences up to the end of 2024. (A) Sequencing rate of new diagnoses in Hungary (excluding anonymous testing) since 2014. Diagnosed cases lacking identifying information in the national surveillance system were excluded from this analysis to avoid bias from potential duplicates. (B) Cumulative size of domestic clusters over time, where sequences were added to clusters based on the patient's diagnosis date when available; otherwise, they were added based on the sequence sampling date. The cumulative cluster size represents the total number of patients diagnosed up to a given year, with point sizes proportional to their share among all new diagnoses in that year. (C) Active time periods of all identified Hungarian clusters, approximated by the first and last recorded diagnoses within each cluster. Individual sequences are represented by gray diamonds, with increasing opacity indicating the number of new diagnoses in the cluster in a given year. Cluster timelines are colored according to sequence subtype. In both panels, the 10 largest clusters at the end of 2024 are labeled from 1 to 10 in descending order of size.
The three largest clusters identified (designated as #1, #2, and #3 in Figure 3) were MSM-dominant (>85% MSM) subtype B clusters with first samples collected in 2014, 2009, and 2015. By the end of 2024, these clusters contained 96, 54, and 43 sequences, respectively, with 42, 8, and 5 sequences added during the 2023–2024 period. This corresponded to a factor of expansion over the 2 years of 1.78, 1.17, and 1.13, respectively, and meant also that approximately one in six new diagnoses in the last 2 years could be linked to the single largest cluster. In addition, two MSM-dominant non-B subtype clusters exhibited substantial growth during the investigated period. A subtype F cluster (#7 in Figure 3), first detected in 2019, expanded from 7 to 21 sequences between 2023 and 2024. Similarly, a CRF18_cpx cluster, first identified in 2023, rapidly grew to 12 sequences by the end of 2024.
In our sensitivity analyses, the proportion of clusters that grew by at least one sequence ranged from 30.6% to 51.6%, while the proportion of clusters expanding by at least five sequences ranged from 1.3% to 5.6% (Supplementary Figure 7).
3.6 Extended phylogenetic analysis with international sequences
To infer the international connections of the epidemic in Hungary, we extended our maximum likelihood phylogenetic analyses by incorporating 2,202 unique international background sequences selected for the highest similarity to those reported from Hungary (Figure 4A and Supplementary Figure 1). The majority of international background sequences originated from the USA (n = 427), Germany (n = 133), Spain (n = 110), Russia (n = 105), Romania (n = 104), the UK (n = 93), Canada (n = 68), South Korea (n = 67), Australia (n = 64), Poland (n = 60), Brazil (n = 59), Italy (n = 58), Cyprus (n = 57), Cameroon (n = 56), China (n = 54), and Thailand (n = 48).
Figure 4. Maximum likelihood phylogenetic tree of HIV-1 sequences obtained in Hungary and the most similar international sequences. (A) Country of origin count data in the deduplicated set of top 5 most similar public sequences queried to isolates from Hungary. Bars representing each country are colored by geographical region, as defined by Nair et al. (2024). (B) Maximum likelihood phylogenetic tree of Hungarian and international background partial pol HIV-1 sequences. Branches of the phylogenetic tree are colored by transmission clusters containing at least one sequence from Hungary. Colored concentric circles denote the following patient/viral characteristics for each sequence (from innermost to outermost): (1) cluster assignment (as defined above); (2) subtype of the sequence; (3) sequence origin (Hungary or international); and (4) geographical region.
We repeated the phylogenetic clustering analysis on the extended phylogenetic tree (Figure 4B). Using the default parameters in the clustering analysis, we identified 76 Hungarian-only clusters with an average size of 6.39 sequences and 149 mixed clusters (containing at least one Hungarian and one international sequence) with an average cluster size of 7.50, including 3.05 sequences sampled in Hungary on average. The topology of larger mixed clusters (size greater than 15 and containing at least 25% sequences from Hungary) in the extended phylogenetic tree remained highly similar to their counterparts in the Hungarian-only tree, with background sequences positioned primarily at basal locations within the subtrees (Supplementary Figure 8). In our sensitivity analyses, we observed a strong negative association between the number of Hungarian and mixed clusters (Supplementary Figure 9).
Among the background sequences included in mixed clusters, the top countries of origin were Germany (n = 102), Romania (n = 69), the UK (n = 54), Poland (n = 49), Spain (n = 39), Italy (n = 37), Russia (n = 32), China (n = 23), the USA (n = 23), Croatia (n = 22), Czechia (n = 20), Belgium (n = 17) and Cyprus (n = 15) (Figure 5).
Figure 5. International connections of Hungarian HIV-1 sequences based on mixed sequence clusters. We identified mixed clusters (containing at least one Hungarian and one international sequence) on the maximum likelihood phylogenetic tree of Hungarian and international background sequences. (A) The number of sequences found in mixed clusters by country of origin. Bars representing each country are colored by geographical region, as defined by (Nair et al., 2024). Cross-border connections of Hungarian sequences within mixed clusters, displayed (B) worldwide, colored by geographical region and (C) within Europe, colored by country. The width of the lines is proportional to the number of international sequences in mixed clusters per country, while point sizes correspond to the number of sequences uploaded to the Los Alamos HIV Database from each country.
The majority of background sequences in mixed clusters came from the Western and Central Europe and North America (WCENA) region (n = 495, subtype distribution: 61.8% B, 15.6% F, 7.68% A, 5.05% CRF02_AG, 1.98% other subtypes, and 7.89% unknown), followed by Eastern Europe and Central Asia (EECA) (n = 40, subtype distribution: 50.00% B, 47.50% A, and 2.50% other), East Asia (n = 34, subtype distribution: 35.29% B, 29.41% CRF07_BC, 20.59% CRF01_AE, 20.59% CRF02_AG, and 14.70% other), Latin America (n = 18, subtype distribution: 61.1% B and 38.9% other), Southeast Asia (n = 21, subtype distribution: 85.7% CRF01_AE and 14.3% other), Oceania (n = 10), West Africa (n = 9), the Middle East and North Africa (MENA) (n = 8), East Africa (n = 7), The Caribbean (n = 6) and Ethiopia (n = 1) regions, as defined by the modified UNAIDS geographical regions list adopted from (Nair et al., 2024) (Supplementary Figure 10).
3.7 Cherry and assortativity analysis
To identify closely related samples with potential infection-relatedness, we extracted all monophyletic pairs [hereafter referred to as “cherries” (Kusejko et al., 2022)] that included at least one sequence originating from Hungary from the extended phylogeny (Figure 6). These cherries were classified into two groups: domestic cherries, where both sequences originated from Hungary, and mixed cherries, where one sequence was obtained from a sample from Hungary and the other was an international background sequence. In our main analysis, we considered only cherries with a maximum genetic distance of 0.045, which included 95.5% of all domestic cherries (273/286) and 50.0% of all mixed cherries (56/112) (Supplementary Figure 11). With these settings, 48.9% of all sequences from Hungary were in domestic cherries and 5.0% in mixed cherries. Individuals in the MSM risk group were significantly more likely to be part of a cherry compared to those in the HET group (54.9% vs. 42.9%, P = 0.021). In contrast, no significant associations were observed between being in a cherry and subtype (54.5% for subtype B vs. 53.1% for non-B, P = 0.732) or sex (54.7% for males vs. 44.0% for females, P = 0.076). Additionally, the number of mixed cherries showed no association with subtype, sex, or risk group.
Figure 6. International connections of Hungarian HIV-1 sequences identified in the cherry analysis. Hungarian and mixed cherries—defined as genetically closely related monophyletic pairs containing wither two Hungarian or one Hungarian and one international sequence respectively—were identified based on a maximum cophenetic distance of 0.045. (A) Distribution of Hungarian HIV-1 sequences by cherry type. (B) Assortativity of risk groups based on domestic cherries. An assortativity factor (AF) above 1 indicates a higher frequency of cherry pairs with a given risk group combination than expected under random pairing of tips, while AF < 1 indicates overdispersion compared to random pairing. Risk group abbreviations: MSM, men who have sex with men; HET, heterosexual. Error bars indicate 95% confidence intervals. (C) Map of cross-border connections of Hungarian sequences in mixed cherries within Europe. The width of the lines is proportional to the number of international sequences in mixed cherries per country, while point sizes correspond to the number of sequences uploaded to the Los Alamos HIV Database by each country. (D) Difference in time of sample collection between the Hungarian and international sequences in mixed cherries. Cherries are colored by the country of origin of the international sequence. Cases where the international background sequence lacks a sampling date are faded out. The date of each sample (indicated by empty circles) corresponds to the patient's diagnosis date when available or otherwise to the sequence sampling date; the narrow end of the symbols indicates the sampling year of the international sequence.
To assess the degree of separation or mixture between the transmission groups in Hungary, we conducted an assortativity analysis on the domestic cherries based on the risk group of patients. Our analysis revealed that MSM/MSM (assortativity factor, AF = 1.20, 95% CI: 1.15–1.25) and HET/HET (AF = 4.34, 95% CI: 1.18–7.89) cherries were significantly overrepresented, while MSM/HET (AF = 0.40, 95% CI: 0.21–0.60) cherries were underrepresented in our phylogeny, indicating limited mixing between the two major risk groups. These results remained qualitatively robust when the analysis was repeated using maximum cophenetic distances of 0.015, 0.03, and 0.055 for domestic cherries (Supplementary Figure 12).
We also used the results of the cherry analysis to highlight possible cases of close epidemiological linkage to other countries. The international sequences in the 56 mixed cherries originated from 18 different countries, most frequently from Poland (n = 10, 17.9%), Germany (n = 8, 14.3%), Croatia (n = 6, 10.7%), the UK (n = 6, 10.7%), Romania (n = 5, 8.9%), and Spain (n = 4, 7.1%). Overall, 83.9% (47/56) of cases originated from the WCENA region. Among the 53 cases where the sampling date of the international sequence was known, the international sequence was detected earlier in 39 cases, later in 9 cases, and in the same year in 5 cases.
3.8 Bayesian phylogenetic analysis of Hungarian and international sequences
To estimate the number and timing of independent HIV introduction events to Hungary, we performed a subtype-specific Bayesian phylogenetic analysis of Hungarian and international sequence sets (Supplementary Figures 13–20). The subtype F Bayesian MCC tree showed the closest agreement with the maximum likelihood consensus tree (normalized Clustering Information Distance, nCID = 0.171), followed by CRF18_cpx (nCID = 0.188), CRF56_cpx (nCID = 0.205), subtype C (nCID = 0.298), subtype A (nCID = 0.323), CRF02_AG (nCID = 0.354), subtype B (nCID = 0.370), and CRF01_AE (nCID = 0.498) (Supplementary Figure 21). In all cases, the Bayesian MCC trees showed greater similarity to the maximum likelihood consensus tree than either that expected from random trees of equal size or the average similarity among maximum likelihood bootstrap replicates. We attempted Bayesian phylogenetic reconstruction for the non-typable clusters with probable subtypes CRF56_cpx, CRF20_BG, and CRF141_BF; however, only the CRF56_cpx tree converged successfully.
We estimated at least 122 independent introduction events to Hungary; 83 for subtype B (time to most recent common ancestor intervals: 1979–2022), 15 for subtype F (1996–2018), 12 for subtype A (1998–2021), four for CRF01_AE (1997–2022), four for CRF02_AG (1992–2022), two for CRF56_cpx (1997–2020), one for CRF18_cpx (2013–2021) and one for subtype C (1990–2012) (Figure 7). Among the 17 large transmission clusters (≥10 sequences) identified on the Hungarian-only maximum likelihood tree, 15 corresponded to a single introduction event in the subtype-specific Bayesian MCC trees. The clades corresponding to the four active transmission clusters among the 10 largest clusters—the three subtype B clusters and the subtype F cluster described in the Cluster growth analysis subsection—were estimated to have been introduced between 2009–2010, 1984–1999, 2006–2017, and 1996–2007, respectively. In contrast, the six clusters inactive in 2023–24 (clusters #4, #5, #6, #8, #9, and #10 in Figure 3) were estimated to have been introduced between 2006–2008, 2005–2011, 2005–2009, 1987–2013, 2005–2008, and 1998–2002. Regarding three TDR clusters—the large subtype B clade and the small subtype B and subtype F clusters described in the Transmitted drug resistance subsection—introduction to Hungary was estimated between 1988–1993, 1978–2010, and 2001–2011.
Figure 7. Bayesian phylogenetic and phylogeographic analyses of HIV-1 sequences obtained in Hungary and the most similar international sequences. (A) Estimated introduction date intervals for all lineages introduced to Hungary, defined as the time between the most recent common ancestor of sequences from Hungary and the most recent common ancestor of those sequences together with their closest international relatives. Error bars indicate 95% highest posterior density intervals. Colored circles represent the annual number of diagnosed or sampled cases from Hungary in each lineage. Lineage intervals, error bars, and circles are colored according to sequence subtype. (B–E) Cross-border HIV-1 import (B) global, (C) Europe and export (D) global, (E) Europe events inferred from the Bayesian phylogeographic analyses. Line widths reflect the median estimated Markov jumps per country, line colors indicate HIV-1 subtypes, and countries are shaded by geographical region.
3.9 Bayesian phylogeographic analysis
To quantify the number of import and export events from and to specific countries for each subtype, we performed Bayesian discrete phylogeographic analyses with an asymmetric substitution model. For subtype B, the most strongly supported sources of importation were the USA [median number of Markov jumps, 95% highest posterior density (HPD): 46 (35–57)], Germany [19 (14–25)], the UK [7 (4–10)], Poland [6 (4–8)], Spain [5 (2–8)], Italy [4 (2–6)], Croatia [3 (1–4)], and Russia [3 (2–5)]. Export events of subtype B were also supported, with the largest numbers inferred toward Germany [29 (22–35)], the United Kingdom [13 (6–18)], Poland [10 (7–12)], the USA[9 (3–14)], Croatia [8 (6–10)], Spain [7 (4–12)], Cyprus [6 (4–9)], Romania [6 (3–7)], Italy [5 (2–9)], Russia [4 (1–6)], Argentina [3 (1–5)], Belgium [3 (1–6)], and Czechia [3 (1–4)]. Among non-B subtypes, subtype A was estimated to have been introduced primarily from Russia [29 (19–36)] and Poland [4 (2–10)], with exports mainly directed toward Germany [5 (0–7)]. Other non-B subtypes were inferred to be predominantly imported without strong evidence of onward export, including subtype F from Romania [28 (24–32)], CRF01_AE mainly from Thailand [11 (0–15)] and China [4 (0–10)], CRF02_AG from Spain [9 (3–16)] and Cameroon [3 (0–9)], CRF18_cpx from Italy [1 (0–1)], CRF56_cpx from Turkey [3 (0–4)] and subtype C possibly from India [2 (0–5)], Germany [1 (0–4)], or Tanzania [1 (0–5)].
We compared observed Markov jump imports and exports between Hungary and other countries against model-based expectations. Expected values were derived from a multivariable joint negative binomial model using bilateral migration or tourism estimates, national HIV prevalence rates, and the ratio of published sequences to estimated PLWH (sequencing intensity) as predictors. The migration-based model provided a better fit to the observed Markov jump patterns than a model using bilateral tourism data (AIC: 23.3 vs. 34.4). Within the migration-based model, migration rate was the strongest predictor of Markov jump counts (likelihood ratio [LR] = 93.10, P = 0.001), followed by the ratio of published sequences to estimated PLWH (LR = 20.03, P = 0.001) and national HIV prevalence (LR = 13.22, P = 0.002). Compared to model expectations from the best fit model, the countries most overrepresented in estimated Markov jump imports were the USA (observed = 47 vs. expected = 10), Russia (32 vs. 8), Thailand (11 vs. 2), Poland (10 vs. 3), Romania (29 vs. 22) and Spain (14 vs. 8), while the most underrepresented were Ukraine (0 vs. 16), Germany (21 vs. 33), South Africa (0 vs. 12), Canada (1 vs. 12), Austria (1 vs. 11), Italy (7 vs. 16), Australia (0 vs. 8), Switzerland (0 vs. 6), and Serbia (0 vs. 6). For Markov jump exports, the most overrepresented countries were Germany (35 vs. 17), the UK (15 vs. 5), Poland (10 vs. 3), the USA (10 vs. 3), and Spain (7 vs. 3), whereas Serbia (1 vs. 17), Romania (6 vs. 13), Australia (1 vs. 7), and Ukraine (0 vs. 5) were the most underrepresented.
4 Discussion
Building on previous efforts to assess the prevalence and transmission of drug resistance mutations (Mezei et al., 2011; Áy et al., 2020, 2021), this study is the first to present a comprehensive molecular epidemiological analysis of the HIV-1 epidemic in Hungary. Our results offer insights into the shifting patterns and composition of the epidemic, which could serve as a good basis for targeted intervention strategies and may hold clues to the future trends of HIV-1 in other countries, as well.
The overrepresentation of the MSM exposure group and of younger age in large transmission clusters reinforces the notion that active local transmission in Hungary has been primarily driven by sexual contacts among young and middle-aged MSM. This finding is consistent with clinical observations and with our previous report, which included cases diagnosed between 2013 and 2017 in Hungary (Áy et al., 2020). Additionally, the presence of large clusters of closely related sequences and the short intervals between most diagnoses within clusters imply intense episodic transmission among MSM, driving the local epidemic similarly to observations from several other countries (Lewis et al., 2008; Sullivan et al., 2009; van Griensven et al., 2009; Phillips et al., 2013). However, while the transmission of HIV remains apparently more intense among MSM in Hungary, the higher CD4+ T cell counts at diagnosis in large clusters suggest early presentation for testing and treatment, possibly due to greater awareness of exposure risk in this group (Mukolo et al., 2013). This suggests that although individuals in sexually active MSM communities tend to present earlier for testing and treatment, further reduction of new transmissions would require even earlier diagnosis and expanded testing efforts, as already shown for both MSM and PWID (Marzel et al., 2016; Ratmann et al., 2016; Günthard et al., 2025). Our observation that individuals in larger clusters in Hungary tend to have higher viral loads is consistent with other reports (Quinn et al., 2000; Aldous et al., 2012; Wertheim et al., 2019). This phenomenon implies the possibility that HIV-1 may be undergoing natural selection for increased infectivity and virulence (Herbeck et al., 2012; Wertheim et al., 2019).
In agreement with earlier observations from other countries (Esbjörnsson et al., 2016; Kusejko et al., 2022; Nduva et al., 2022), our assortativity analysis suggests that transmission networks among MSM and HET are largely, but not completely separated, indicating a bridge role for bisexual individuals and occasional spillover between the two groups. We note that misclassification of MSM transmission based on self-reporting (Goldstein et al., 2015) may have introduced false MSM-HET cherries: in 53 out of 56 mixed MSM-HET cherries, the HET node was also male, which is suggestive of a reporting bias, and implies that the true degree of separation is likely stronger than our estimate. Nonetheless, the presence of occasional connections between HET and MSM risk groups is further supported by our observation of several transmission clusters with mixed sex and risk group composition.
Our results strongly suggest that the emergence of local transmission networks appears to be fueled by both domestic transmissions and repeated viral introductions from abroad, contributing to the increasing genetic diversity of HIV-1 and the decreasing incidence of previously dominant subtype B in Hungary. Substantial increase was observed in the incidence of subtypes A and F, CRF18_cpx and non-typable isolates (possibly belonging to or closely related to CRFs 56_cpx, 20_BG and 141_BF based on their phylogenetic position). While this rise in genetic diversity in recent years is not broadly observed across the Western and Central Europe and North America region (Nair et al., 2024), it is characteristic of several national epidemics in Central Europe, including those in Croatia (Planinić et al., 2023), Slovakia (Habekova et al., 2010), and Poland (Serwin et al., 2021). These increasingly prevalent non-B subtypes—particularly those uncommon in Europe and Hungary, such as subtype C, CRF01_AE, and CRF02_AG—were associated with female sex and with heterosexual and maternal transmission, suggesting a potential role for immigration from regions with higher rates of heterosexual transmission. This hypothesis is further supported by the observation that approximately one-third of all new HIV diagnoses in 2022 and 2023 were among individuals whose country of origin was outside Hungary. These included regions with a higher proportion of heterosexual cases, such as Central and Eastern Europe (18.3% of diagnoses in 2022 and 13.2% in 2023, respectively), Latin America and the Caribbean (3.6% and 5.3%), and South and Southeast Asia (1.3% and 3.1%), along with a notable proportion of cases with unknown origin (8.0% and 11.4%) (European Centre for Disease Prevention Control, 2024).
Regardless of the route of infection, Bayesian phylogeographic analysis revealed that subtype B introductions were mainly associated with the USA and European countries such as Germany, the UK, Poland, Spain, Italy, Croatia and Russia, while subtype A6 appeared to originate primarily from Russia and Poland; A1 from Italy; subtype F from Romania; CRF01_AE from Thailand and China; CRF02_AG from Spain and Cameroon, CRF18_cpx from Italy and CRF56_cpx from Turkey. Multiple potential export events from Hungary were also inferred, including subtype B to Germany, the UK, Poland, the USA, Croatia, Spain, Cyprus, Romania, and Italy, and subtype A6 to Germany. We showed that suspected HIV-1 import and export events—based on the number of international sequences in mixed clusters, mixed cherries, or Markov jumps—can be partially explained by known migration patterns, HIV prevalence and public sequence availability. Migration appears to account for international connections more effectively than tourism data, suggesting that long-term migration plays a more significant role than tourism (with the possible exception of the dispersal of CRF01_AE sequences from Thailand related to sex tourism (Angelis et al., 2015), which may underline the origin of the eight CRF01_AE sequences from Thailand observed in our mixed clusters).
We found that the USA, Germany, the UK, Poland, Romania, Russia, Thailand, and Spain were overrepresented in international sequence connections compared to the expectations derived from our statistical model, which incorporated international migration, HIV prevalence, and sequencing coverage data. This suggests either the presence of additional, unexplained factors driving higher transmission between vulnerable populations in Hungary and these countries, or a relative overrepresentation of publicly available sequence data from these regions that was not fully captured by our model. In contrast, we observed the complete absence of connections with some countries where links would have been expected, particularly in Hungary's close geographical vicinity—such as Ukraine (17 patients registered in the Hungarian surveillance system, of which only one heterosexual case clustered with two other Hungarian sequences), Austria, and Serbia (2 of 4 patients clustering with Hungarian sequences). This may reflect a strong underrepresentation of these countries in public sequence databases that our model may not have been able to account for, or it may indicate genuinely weaker epidemiological connections between Hungary and these neighboring countries.
Although several war refugees from Ukraine have been registered in the Hungarian HIV surveillance system since 2022, our Bayesian phylogeographic analysis found no evidence of international transmission between the two countries through the end of 2024—unlike in other European countries such as Germany (Hanke et al., 2024) and Poland (Serwin et al., 2023, 2024). Several possible explanations may account for this observation. First, while individuals living with HIV-1 from Ukraine are recorded in the Hungarian surveillance system, many may be transiting through the country without significantly contributing to the local epidemic. This is particularly plausible given that most Ukrainian patients entering the Hungarian system were already receiving treatment. Second, the limited availability of HIV-1 sequence data from Ukraine likely contributes to underdetection. Finally, delays between infection and diagnosis may have prevented our sequence-based analysis from capturing transmission events within the study period.
Our work is subject to limitations. First, the limited availability of publicly accessible HIV-1 sequence data constrained our ability to detect all potential international connections to sequences from Hungary, particularly for countries with sparse representation. Because our national dataset is more up-to-date than international repositories, potential importation events are likely to be captured more completely than export events, which may be missing from databases due to delayed reporting. Second, short partial pol sequences (~2,000 bp) used for routine drug resistance genotyping limit phylogenetic resolution and frequently produce low branch support values, particularly at deep nodes—as observed in our dataset (Supplementary Figure 1). Under these conditions, phylogeny-based methods that rely on maximum pairwise genetic distance and branch support thresholds (e.g., ClusterPicker) may fail to detect transmission clusters or may subdivide them, especially when monophyletic clades contain outlier sequences with greater genetic distances (Supplementary Figure 3). By integrating distance-based (HIV-TRACE) and phylogenetic (ClusterPicker) approaches, we recovered several large, highly active clusters that ClusterPicker alone either missed or subdivided (Supplementary Figure 3). The observed low quartet divergence between our subtype-specific maximum likelihood and Bayesian phylogenies indicates strong local topological concordance, and thus confirms that, despite limited resolution at deep nodes, closely related sequences were consistently placed in close topological proximity across phylogenetic methods (Supplementary Figure 21). Finally, considering that molecular clustering is sensitive to parameter choice, genetic distance and branch support parameters must be selected in accordance with the objectives of the study (Hassan et al., 2017; Novitsky et al., 2020). To assess long-term HIV-1 transmission dynamics in Hungary, we followed recommendations for the routine tracking of transmission clusters over extended periods (Hassan et al., 2017; Novitsky et al., 2020; Labarile et al., 2023) to select our default parameters. Testing a wide range of key parameters showed that overall conclusions remained consistent. Our default parameter selection is supported by the finding that 15 of 17 large clusters corresponded to a single introduction event in Bayesian analyses. Moreover, three of four TDR clades identified by the clustering of drug resistance mutations were identical to those detected by our molecular clustering approach (in which we excluded DRM positions). The remaining TDR clade shows high internal divergence, consistent with long-term persistence [diagnoses span 24 years, with the older cases previously reported in (Áy et al., 2020)], and a bootstrap support of 95 at the root node is strongly suggestive of a single transmission cluster (Supplementary Figure 6).
5 Conclusion
Overall, our analyses indicate that the HIV-1 epidemic in Hungary is primarily sustained by domestic transmission, characterized by a limited number of highly active transmission clusters, and complemented by repeated introductions of viral strains from abroad that subsequently initiate local spread. While the epidemic exhibits the strongest international links with Western and Central Europe, its increasing genetic diversity appears to be driven primarily by the repeated introduction and local transmission of non-B subtypes, predominantly originating from Eastern Europe as well as Southeast and East Asia. Further surveillance is needed to monitor the spread of these non-B subtypes within Hungary and into Western and other Central European countries, toward which Hungary may act as a bridge through the epidemic links that were demonstrated in our analyses.
Data availability statement
The datasets presented in this study can be found in online repositories. The names of the repository/repositories and accession number(s) can be found below: https://www.ncbi.nlm.nih.gov/genbank/, JF419394–JF419453, KX999940–KX999999, KY021932–KY021991, KY950415–KY950436, KY967234–KY967255, MK213272–MK213318, MK236491–MK236537, MK250657–MK250695, PP333487–PP333522, PP313557, PV816828–PV816852, PV816854–PV816951, PV816953–PV817030, PV817032–PV817114, PV817116–PV817176, PV817178–PV817198, PV817200–PV817202, PV817204–PV817207, PV817209–PV817298, PV817300–PV817315, PV817317–PV817339, PV817341–PV817403, PV817405–PV817416, PV817418–PV817493, PV817495–PV817508, PV817510–PV817530, PV817532–PV817552, PV817555–PV817578, PV817580–PV817592, PV817594–PV817693, PV817695–PV817697, PV817699–PV817714, PV817716–PV817727 and PV817729–PV817735 for the PR and RT regions and PP313558–PP313598, PV903188–PV903273, PV903275–PV903401, PV903403–PV903478, PV903480–PV903563, PV903565–PV903625, PV903627–PV903647, PV903649–PV903651, PV903653–PV903656, PV903658–PV903747, PV903749–PV903764, PV903766–PV903779, PV903781–PV903789, PV903791–PV903853, PV903855–PV903866, PV903868–PV903878, PV903880–PV903913, PV903915–PV903926, PV903928–PV903948, PV903951–PV903971, PV903974–PV903997, PV903999–PV904011, PV904013–PV904111, PV904113–PV904131, PV904133–PV904144, and PV904146–PV904152 for the INT region.
Ethics statement
The studies involving humans were approved by Medical Research Council, Scientific and Research Ethics Committee, Budapest, Hungary (BM/18124-1/2023 and BM/1753-1/2025). The studies were conducted in accordance with the local legislation and institutional requirements. The Ethics Committee/Institutional Review Board waived the requirement of written informed consent for participation from the participants or the participants' legal guardians/next of kin because this was a retrospective anonymized analysis of viral sequence and patient data.
Author contributions
LZ: Formal analysis, Investigation, Methodology, Software, Validation, Visualization, Writing – original draft, Writing – review & editing. LA: Data curation, Formal Analysis, Methodology, Resources, Writing – review & editing. DM: Formal analysis, investigation, Methodology, Software, Writing – review & editing. PL: Methodology, Writing – review & editing. ÁL: Data curation, Writing – review & editing. EA: Methodology, Writing – review & editing. KK: Methodology, Writing – review & editing. RK: Methodology, Writing – review & editing. JS: Data curation, Writing – review & editing. BL: Data curation, Writing – review & editing. ÉÁ: Conceptualization, Data curation, Methodology, Resources, Supervision, Writing – review & editing. VM: Conceptualization, Funding acquisition, Methodology, project administration, Supervision, Writing – review & editing.
Funding
The author(s) declare that financial support was received for the research and/or publication of this article. This work was supported by the National Research, Development and Innovation Office in Hungary (RRF-2.3.1-21-2022-00006) as a part of the National Laboratory for Health Security, and by HUN-REN (TKCS-2024/66 to EA) and by the EKÖP-25 University Excellence Scholarship Program (EKÖP-25-3-I-ELTE-680 to LZ) of the Ministry for Culture and Innovation from the source of the National Research, Development and Innovation Fund.
Acknowledgments
We thank the staff of the Department of Dermatology, Venereology and Dermatooncology and the Department of Pediatrics at Semmelweis University, the Department of Infectology at the University of Debrecen, the Department of Infectious Diseases at the University of Pécs, and the National Reference Laboratory for Retroviruses, and all patients involved in the study for providing blood samples for routine HIV-1 genotyping. We are deeply grateful to János Minárovits, who was instrumental in the launch of HIV molecular diagnostics in Hungary, and who put us (ÉÁ and VM) on the path that led to this work. We are also grateful to Marco Labarile for insightful recommendations on methodological decisions and important discussions on the interpretation of our results. We disclose the use of ChatGPT to aid analysis code development and to identify grammatical mistakes and improve text flow during manuscript preparation.
Conflict of interest
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
Generative AI statement
The author(s) declare that Gen AI was used in the creation of this manuscript. We disclose the use of ChatGPT to aid analysis code development and to identify grammatical mistakes and improve text flow during manuscript preparation.
Any alternative text (alt text) provided alongside figures in this article has been generated by Frontiers with the support of artificial intelligence and reasonable efforts have been made to ensure accuracy, including review by the authors wherever possible. If you identify any issues, please contact us.
Publisher's note
All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.
Supplementary material
The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fmicb.2025.1732254/full#supplementary-material
References
Abel, G. J., and Cohen, J. E. (2019). Bilateral international migration flow estimates for 200 countries. Sci. Data 6:82. doi: 10.1038/s41597-019-0089-3
Aldous, J. L., Pond, S. K., Poon, A., Jain, S., Qin, H., Kahn, J. S., et al. (2012). Characterizing HIV transmission networks across the United States. Clin. Infect. Dis. 55, 1135–1143. doi: 10.1093/cid/cis612
Angelis, K., Albert, J., Mamais, I., Magiorkinis, G., Hatzakis, A., Hamouda, O., et al. (2015). Global dispersal pattern of HIV type 1 subtype CRF01_AE: a genetic trace of human mobility related to heterosexual sexual activities centralized in Southeast Asia. J. Infect. Dis. 211, 1735–1744. doi: 10.1093/infdis/jiu666
Áy, É., Müller, V., Mezei, M., Pocskay, Á., Koroknai, A., Müller, D., et al. (2020). Transmitted drug resistance in newly diagnosed and treatment-naïve HIV type 1-infected patients in Hungary. J. Glob. Antimicrob. Resist. 20, 124–130. doi: 10.1016/j.jgar.2019.07.014
Áy, É., Pocskay, Á., Lakatos, B., Szlávik, J., Mezei, M., and Minárovits, J. (2021). Prevalence of resistance mutations associated with integrase inhibitors in therapy-naive HIV-positive patients in Hungary. Acta Microbiol. Immunol. Hung. 68, 87–91. doi: 10.1556/030.2021.01433
Baele, G., Ji, X., Hassler, G. W., McCrone, J. T., Shao, Y., Zhang, Z., et al. (2025). BEAST X for Bayesian phylogenetic, phylogeographic and phylodynamic inference. Nat. Methods 22, 1653–1656. doi: 10.1038/s41592-025-02751-x
Beloukas, A., Psarris, A., Giannelou, P., Kostaki, E., Hatzakis, A., and Paraskevis, D. (2016). Molecular epidemiology of HIV-1 infection in Europe: an overview. Infect. Genet. Evol. 46, 180–189. doi: 10.1016/j.meegid.2016.06.033
Bletsa, M., Suchard, M. A., Ji, X., Gryseels, S., Vrancken, B., Baele, G., et al. (2019). Divergence dating using mixed effects clock modelling: an application to HIV-1. Virus Evol. 5:vez036. doi: 10.1093/ve/vez036
de Oliveira, T., Deforche, K., Cassol, S., Salminen, M., Paraskevis, D., Seebregts, C., et al. (2005). An automated genotyping system for analysis of HIV-1 and other microbial sequences. Bioinformatics 21, 3797–3800. doi: 10.1093/bioinformatics/bti607
Esbjörnsson, J., Mild, M., Audelin, A., Fonager, J., Skar, H., Bruun Jørgensen, L., et al. (2016). HIV-1 transmission between MSM and heterosexuals, and increasing proportions of circulating recombinant forms in the Nordic Countries. Virus Evol. 2:vew010. doi: 10.1093/ve/vew010
Estabrook, G. F., McMorris, F. R., and Meacham, C. A. (1985). Comparison of undirected phylogenetic trees based on subtrees of four evolutionary units. Syst. Biol. 34, 193–200. doi: 10.2307/sysbio/34.2.193
European Centre for Disease Prevention and Control (2024). HIV/AIDS Surveillance in Europe 2024 – 2023 Data. Available onlien at: https://www.ecdc.europa.eu/en/publications-data/hiv-aids-surveillance-europe-2024-2023-data (accessed May 27, 2025).
Ferdinandy, B., Mones, E., Vicsek, T., and Müller, V. (2015). HIV competition dynamics over sexual networks: first comer advantage conserves founder effects. PLoS Comput. Biol. 11:e1004093. doi: 10.1371/journal.pcbi.1004093
Global Burden of Disease Collaborative Network (2022). Global Burden of Disease Study 2021 (GBD 2021) Results. Available onlien at: https://vizhub.healthdata.org/gbd-results/ (Accessed April 2, 2025).
Goldstein, N. D., Welles, S. L., and Burstyn, I. (2015). To be or not to be: Bayesian correction for misclassification of self-reported sexual behaviors among men who have sex with men. Epidemiology 26:637. doi: 10.1097/EDE.0000000000000328
Grabowski, M. K., Herbeck, J. T., and Poon, A. F. Y. (2018). Genetic cluster analysis for HIV prevention. Curr. HIV/AIDS Rep. 15, 182–189. doi: 10.1007/s11904-018-0384-1
Günthard, H. F., Kusejko, K., and Kouyos, R. D. (2025). Phylogenetics and molecular evolution to understand and curb the HIV pandemic. Nat. Rev. Microbiol. doi: 10.1038/s41579-025-01202-w
Habekova, M., Takacova, M., Lysy, J., Mokras, M., Camacho, R., Truska, P., et al. (2010). Genetic subtypes of HIV type 1 circulating in Slovakia. AIDS Res. Hum. Retroviruses 26, 1103–1107. doi: 10.1089/aid.2009.0220
Hanke, K., Rykalina, V., Koppe, U., Gunsenheimer-Bartmeyer, B., Heuer, D., and Meixenberger, K. (2024). Developing a next level integrated genomic surveillance: advances in the molecular epidemiology of HIV in Germany. Int. J. Med. Microbiol. 314:151606. doi: 10.1016/j.ijmm.2024.151606
Hassan, A. S., Pybus, O. G., Sanders, E. J., Albert, J., and Esbjörnsson, J. (2017). Defining HIV-1 transmission clusters based on sequence data. AIDS 31, 1211–1222. doi: 10.1097/QAD.0000000000001470
Helfrich, P., Rieb, E., Abrami, G., Lücking, A., and Mehler, A. (2018). “TreeAnnotator: versatile visual annotation of hierarchical text relations, in Language Resources and Evaluation Conference, 11th Edn (Miyazaki).
Herbeck, J. T., Müller, V., Maust, B. S., Ledergerber, B., Torti, C., Di Giambenedetto, S., et al. (2012). Is the virulence of HIV changing? A meta-analysis of trends in prognostic markers of HIV disease progression and transmission. AIDS 26:193. doi: 10.1097/QAD.0b013e32834db418
Jovanović, L., Šiljić, M., Cirković, V., Salemović, D., Pešić-Pavlović, I., Todorović, M., et al. (2019). Exploring evolutionary and transmission dynamics of HIV epidemic in Serbia: bridging socio-demographic with phylogenetic approach. Front. Microbiol. 10:287. doi: 10.3389/fmicb.2019.00287
Katoh, K., and Standley, D. M. (2013). MAFFT Multiple Sequence Alignment Software Version 7: improvements in performance and usability. Mol. Biol. Evol. 30, 772–780. doi: 10.1093/molbev/mst010
Kincses, Á. (2015). Romanian citizens in Hungary according to 2011 Population Census data. Forum Geogr. 14, 190–198. doi: 10.5775/fg.2067-4635.2015.225.d
Kosakovsky Pond, S. L., Weaver, S., Leigh Brown, A. J., and Wertheim, J. O. (2018). HIV-TRACE (TRAnsmission Cluster Engine): a tool for large scale molecular epidemiology of HIV-1 and other rapidly evolving pathogens. Mol. Biol. Evol. 35, 1812–1819. doi: 10.1093/molbev/msy016
Kozlov, A. M., Darriba, D., Flouri, T., Morel, B., and Stamatakis, A. (2019). RAxML-NG: a fast, scalable and user-friendly tool for maximum likelihood phylogenetic inference. Bioinformatics 35, 4453–4455. doi: 10.1093/bioinformatics/btz305
Kusejko, K., Tschumi, N., Chaudron, S. E., Nguyen, H., Battegay, M., Bernasconi, E., et al. (2022). Similar but different: integrated phylogenetic analysis of Austrian and Swiss HIV-1 sequences reveal differences in transmission patterns of the local HIV-1 epidemics. J. Acquir. Immune Defic. Syndr. 90:e4–e12. doi: 10.1097/QAI.0000000000002949
Labarile, M., Loosli, T., Zeeb, M., Kusejko, K., Huber, M., Hirsch, H. H., et al. (2023). Quantifying and predicting ongoing human immunodeficiency virus type 1 transmission dynamics in Switzerland using a distance-based clustering approach. J. Infect. Dis. 227, 554–564. doi: 10.1093/infdis/jiac,457
LANL Sequence Database (2025). Available onlien at: https://www.hiv.lanl.gov/ (accessed May 27, 2025).
Larsson, A. (2014). AliView: a fast and lightweight alignment viewer and editor for large datasets. Bioinformatics 30, 3276–3278. doi: 10.1093/bioinformatics/btu531
Leierer, G., Rappold, M., Strickner, S., and Zangerle, R. (2024). 47th Report of the Austrian HIV Cohort Study. Available onlien at: https://www.google.com/search?q=https://www.aidsgesellschaft.at/wp-content/uploads/2025/01/Kohortenbericht-AHIVCOS-47_2024.pdf (accessed May 27, 2025).
Lewis, F., Hughes, G. J., Rambaut, A., Pozniak, A., and Brown, A. J. L. (2008). Episodic sexual transmission of HIV revealed by molecular phylodynamics. PLoS Med. 5:e50. doi: 10.1371/journal.pmed.0050050
Liu, T. F., and Shafer, R. W. (2006). Web resources for HIV type 1 genotypic-resistance test interpretation. Clin. Infect. Dis. 42, 1608–1618. doi: 10.1086/503914
Lloyd, A. T., and Sirkeci, I. (2022). A long-term view of refugee flows from Ukraine: war, insecurities, and migration. Migration Letters 19, 523–535. doi: 10.33182/ml.v19i4.2313
Lunar, M. M., Lepej, S. Ž., TomaŽič, J., Vovko, T. D., Pečavar, B., Turel, G., et al. (2018). HIV-1 transmitted drug resistance in Slovenia and its impact on predicted treatment effectiveness: 2011–2016 update. PLoS ONE 13:e0196670. doi: 10.1371/journal.pone.0196670
Lunar, M. M., Matković, I., TomaŽič, J., Vovko, T. D., Pečavar, B., and Poljak, M. (2015). Longitudinal trends of recent HIV-1 infections in Slovenia (1986-2012) determined using an incidence algorithm. J. Med. Virol. 87, 1510–1516. doi: 10.1002/jmv.24209
Marzel, A., Shilaih, M., Yang, W.-L., Böni, J., Yerly, S., Klimkait, T., et al. (2016). HIV-1 transmission during recent infection and during treatment interruptions as major drivers of new infections in the Swiss HIV Cohort Study. Clin. Infect. Dis. 62, 115–122. doi: 10.1093/cid/civ732
Mezei, M., Áy, É., Koroknai, A., Tóth, R., Balázs, A., Bakos, Á., et al. (2011). Molecular epidemiological analysis of env and pol sequences in newly diagnosed HIV type 1-infected, untreated patients in Hungary. AIDS Res. Hum. Retroviruses 27, 1243–1247. doi: 10.1089/aid.2011.0077
Mukolo, A., Villegas, R., Aliyu, M., and Wallston, K. A. (2013). Predictors of late presentation for HIV diagnosis: a literature review and suggested way forward. AIDS Behav 17, 5–30. doi: 10.1007/s10461-011-0097-6
Nair, M., Gettins, L., Fuller, M., Kirtley, S., and Hemelaar, J. (2024). Global and regional genetic diversity of HIV-1 in 2010–21: systematic review and analysis of prevalence. Lancet Microbe 5:e776–e787. doi: 10.1016/S2666-5247(24)00151-4
National Center for Public Health and Pharmacy (2025). Data on HIV/AIDS cases until the end of 2024 in Hungary. ÁNTSZ. Available onlien at: https://antsz.hu/felso_menu/temaink/jarvany/hivaids/HIV_adatok/HIV_2024_eves/HIV_2024_eves.html (accessed September 8, 2025).
Nduva, G. M., Nazziwa, J., Hassan, A. S., Sanders, E. J., and Esbjörnsson, J. (2021). The role of phylogenetics in discerning HIV-1 mixing among vulnerable populations and geographic regions in Sub-Saharan Africa: a systematic review. Viruses 13:1174. doi: 10.3390/v13061174
Nduva, G. M., Otieno, F., Kimani, J., Wahome, E., McKinnon, L. R., Cholette, F., et al. (2022). Quantifying rates of HIV-1 flow between risk groups and geographic locations in Kenya: a country-wide phylogenetic study. Virus Evol. 8:veac016. doi: 10.1093/ve/veac016
Novitsky, V., Steingrimsson, J. A., Howison, M., Gillani, F. S., Li, Y., Manne, A., et al. (2020). Empirical comparison of analytical approaches for identifying molecular HIV-1 clusters. Sci. Rep. 10:18547. doi: 10.1038/s41598-020-75560-1
Oroz, M., Begovac, J., Planinić, A., Rokić, F., Lunar, M. M., Zorec, T. M., et al. (2019). Analysis of HIV-1 diversity, primary drug resistance and transmission networks in Croatia. Sci. Rep. 9:17307. doi: 10.1038/s41598-019-53520-8
Paradis, E., and Schliep, K. (2019). ape 5.0: an environment for modern phylogenetics and evolutionary analyses in R. Bioinformatics 35, 526–528. doi: 10.1093/bioinformatics/bty633
Phillips, A. N., Cambiano, V., Nakagawa, F., Brown, A. E., Lampe, F., Rodger, A., et al. (2013). Increased HIV incidence in men who have sex with men despite high levels of ART-induced viral suppression: analysis of an extensively documented epidemic. PLoS ONE 8:e55312. doi: 10.1371/journal.pone.0055312
Planinić, A., Begovac, J., Rokić, F., Šimičić, P., Oroz, M., Jakovac, K., et al. (2023). Characterization of Human Immunodeficiency Virus-1 transmission clusters and transmitted drug-resistant mutations in Croatia from 2019 to 2022. Viruses 15:2408. doi: 10.3390/v15122408
Preda, M., and Manolescu, L. C. S. (2022). Romania, a harbour of HIV-1 subtype F1: where are we after 33 years of HIV-1 infection? Viruses 14:2081. doi: 10.3390/v14092081
Quinn, T. C., Wawer, M. J., Sewankambo, N., Serwadda, D., Li, C., Wabwire-Mangen, F., et al. (2000). Viral load and heterosexual transmission of human immunodeficiency virus type 1. N. Engl. J. Med. 342, 921–929. doi: 10.1056/NEJM,200003303421303
R Core Team (2023). R: A Language and Environment for Statistical Computing. Vienna, Austria: R Foundation for Statistical Computing. Available onlien at: https://www.R-project.org/ (Accessed February, June 16, 2023).
Ragonnet-Cronin, M., Hodcroft, E., Hué, S., Fearnhill, E., Delpech, V., Brown, A. J. L., et al. (2013). Automated analysis of phylogenetic clusters. BMC Bioinformatics 14:317. doi: 10.1186/1471-2105-14-317
Rambaut, A., Drummond, A. J., Xie, D., Baele, G., and Suchard, M. A. (2018). Posterior summarization in Bayesian phylogenetics using Tracer 1.7. Syst. Biol. 67, 901–904. doi: 10.1093/sysbio/syy032
Ratmann, O., van Sighem, A., Bezemer, D., Gavryushkina, A., Jurriaans, S., Wensing, A., et al. (2016). Sources of HIV infection among men having sex with men and implications for prevention. Sci. Transl. Med. 8:320ra2. doi: 10.1126/scitranslmed.aad1863
Rhee, S.-Y., Gonzales, M. J., Kantor, R., Betts, B. J., Ravela, J., and Shafer, R. W. (2003). Human immunodeficiency virus reverse transcriptase and protease sequence database. Nucleic Acids Res. 31, 298–303. doi: 10.1093/nar/gkg100
Roberts, J. D., Bebenek, K., and Kunkel, T. A. (1988). The accuracy of reverse transcriptase from HIV-1. Science 242, 1171–1173. doi: 10.1126/science.2460925
Robinson, D. F., and Foulds, L. R. (1981). Comparison of phylogenetic trees. Math. Biosci. 53, 131–147. doi: 10.1016/0025-5564(81)90043-2
Serwin, K., Chaillon, A., Scheibe, K., Urbańska, A., Aksak-Was, B., Zabek, P., et al. (2023). Circulation of Human Immunodeficiency Virus 1 A6 variant in the eastern border of the European Union-dynamics of the virus transmissions between Poland and Ukraine. Clin. Infect. Dis. 76, 1716–1724. doi: 10.1093/cid/ciad058
Serwin, K., Scheibe, K., Urbańska, A., Aksak-Was, B., Karasińska-Cieślak, M., Zabek, P., et al. (2024). Phylodynamic evolution of HIV-1 A6 sub-subtype epidemics in Poland. J. Med. Virol. 96:e29482. doi: 10.1002/jmv.29482
Serwin, K., Urbańska, A., Scheibe, K., Witak-Jedra, M., Jankowska, M., Hlebowicz, M., et al. (2021). Molecular epidemiology and HIV-1 variant evolution in Poland between 2015 and 2019. Sci. Rep. 11:16609. doi: 10.1038/s41598-021-96125-w
Smith, M. R. (2020). Information theoretic generalized Robinson–Foulds metrics for comparing phylogenetic trees. Bioinformatics 36, 5007–5013. doi: 10.1093/bioinformatics/btaa614
Smith, M. R. (2022). Robust analysis of phylogenetic tree space. Syst. Biol. 71, 1255–1270. doi: 10.1093/sysbio/syab100
Struck, D., Lawyer, G., Ternes, A.-M., Schmit, J.-C., and Bercoff, D. P. (2014). COMET: adaptive context-based modeling for ultrafast HIV-1 subtype identification. Nucleic Acids Res. 42:e144. doi: 10.1093/nar/gku739
Sullivan, P. S., Hamouda, O., Delpech, V., Geduld, J. E., Prejean, J., Semaille, C., et al. (2009). Reemergence of the HIV epidemic among men who have sex with men in North America, Western Europe, and Australia, 1996–2005. Ann. Epidemiol. 19, 423–431. doi: 10.1016/j.annepidem.2009.03.004
van Griensven, F., de Lind van Wijngaarden, J. W., Baral, S., and Grulich, A. (2009). The global epidemic of HIV infection among men who have sex with men. Curr. Opin. HIV AIDS 4:300. doi: 10.1097/COH.0b013e32832c3bb3
Vasylyeva, T. I., Liulchuk, M., du Plessis, L., Fearnhill, E., Zadorozhna, V., Babii, N., et al. (2019). The changing epidemiological profile of HIV-1 subtype B epidemic in Ukraine. AIDS Res. Hum. Retroviruses 35, 155–163. doi: 10.1089/aid.2018.0167
Vasylyeva, T. I., Liulchuk, M., Friedman, S. R., Sazonova, I., Faria, N. R., Katzourakis, A., et al. (2018). Molecular epidemiology reveals the role of war in the spread of HIV in Ukraine. Proc. Natl. Acad. Sci. U. S. A. 115, 1051–1056. doi: 10.1073/pnas.1701447115
Weibull Wärnberg, A., Brännström, J., Elvstam, O., Gisslén, M., Månsson, F., Sönnerborg, A., et al. (2023). The molecular epidemiology of HIV-1 in Sweden 1996 to 2022, and the influence of migration from Ukraine. Euro Surveill 28:2300224. doi: 10.2807/1560-7917.ES.2023.28.48.2300224
Wertheim, J. O., Oster, A. M., Switzer, W. M., Zhang, C., Panneer, N., Campbell, E., et al. (2019). Natural selection favoring more transmissible HIV detected in United States molecular transmission network. Nat. Commun. 10:5788. doi: 10.1038/s41467-019-13723-z
Wertheim, J. O., Pond, S. L. K., Forgione, L. A., Mehta, S. R., Murrell, B., Shah, S., et al. (2017). Social and genetic networks of HIV-1 transmission in New York City. PLoS Pathog. 13:e1006000. doi: 10.1371/journal.ppat.1006000
Keywords: molecular epidemiology, HIV, phylogenetics, phylogeography, transmission cluster analysis
Citation: Zsichla L, Adravecz L, Müller D, Lemey P, Lakatos, Ari E, Kusejko K, Kouyos R, Szlávik J, Lakatos B, Áy and Müller V (2025) Molecular epidemiology of HIV-1 in Hungary: an evolving contact zone of colliding virus subtypes. Front. Microbiol. 16:1732254. doi: 10.3389/fmicb.2025.1732254
Received: 25 October 2025; Revised: 21 November 2025;
Accepted: 27 November 2025; Published: 16 December 2025.
Edited by:
Maria Santoro, University of Rome Tor Vergata, ItalyReviewed by:
Hehe Zhao, Chinese Center for Disease Control and Prevention, ChinaMonika Habekova, Slovak Medical University, Slovakia
Copyright © 2025 Zsichla, Adravecz, Müller, Lemey, Lakatos, Ari, Kusejko, Kouyos, Szlávik, Lakatos, Áy and Müller. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
*Correspondence: Viktor Müller, bXVlbGxlci52aWt0b3JAdHRrLmVsdGUuaHU=; Éva Áy, YXkuZXZhQG5uZ3lrLmdvdi5odQ==
Lilla Adravecz2,3