The integration of human papillomavirus (HPV) into the host genome is closely associated with the development of cervical cancer. However, little is known about the complete structure of HPV integration events in the host genome.
In this study, three HPV-positive cell lines, HeLa, SiHa, and CaSki, were subjected to Nanopore long-read sequencing to detect HPV integration. Analysis of viral integration patterns using in-house software (HPV-TSD) yielded multiple complete integration patterns for the three cell lines.
We found distinct differences between the integration patterns of HPV18 and HPV16. Furthermore, the integration characteristics differed significantly between SiHa and CaSki cells, even though both carry HPV16; integration in CaSki cells was relatively complex. In HeLa cells, the integrated form of HPV18 was dominant, whereas the percentage of integrated HPV16 in SiHa and CaSki cells was significantly lower. In addition, the viral sequences in HeLa cells were incomplete and existed in an integrated state. We also identified a large number of tandem repeats in both HPV16 and HPV18 integrations. Our study not only demonstrated the feasibility of high-throughput long-read sequencing for studying HPV integration, but also explored a variety of HPV integration models and confirmed that integration is an important form in which HPV exists in cell lines.
Elucidating HPV integration patterns will provide critical guidance for developing detection algorithms for HPV integration, as well as for applying knowledge of viral integration in clinical practice and drug research and development.