- 1Division of Dermatology, McGill University Health Centre, Montreal, QC, Canada
- 2Department of Pathology, Section of Dermatopathology, The University of Texas MD Anderson Cancer Center, Houston, TX, United States
- 3Department of Pathology, McGill University Health Centre, Montreal, QC, Canada
- 4Division of Dermatology, Université de Montréal, Montréal, QC, Canada
- 5Division of Dermatology, Université Laval, Québec, QC, Canada
- 6Department of Dermatology, The University of Texas MD Anderson Cancer Center, Houston, TX, United States
- 7Division of Dermatology, Ottawa Hospital Research Institute, University of Ottawa, Ottawa, ON, Canada
Cutaneous T-cell lymphomas (CTCLs) are a heterogeneous group of malignancies with courses ranging from indolent to potentially lethal. We recently studied gene expression profiles generated by TruSeq targeted RNA gene expression sequencing in a cohort of 157 patients. We observed that the sequencing library quality and depth from formalin-fixed paraffin-embedded (FFPE) skin samples were significantly lower when biopsies were obtained prior to 2009. We also observed that the fresh CTCL samples clustered together, even though they included stage I–IV disease. In this study, we compared TruSeq gene expression patterns in older (≤2008) vs. more recent (≥2009) FFPE samples to determine whether these clustering analyses and the previously described differentially expressed gene (DEG) findings are robust when analyzed based on the year of biopsy. We also explored biases found in FFPE samples when subjected to the TruSeq analysis of gene expression. Our results showed that ≤2008 and ≥2009 samples clustered as well as the full data set and, importantly, both analyses produced nearly identical trends and findings. Specifically, both analyses enriched nearly identical DEGs when comparing benign vs. (1) stage I–IV and (2) stage IV (alone) CTCL samples. Results obtained using either ≤2008 or ≥2009 samples were strongly correlated. Furthermore, by using subgroup analyses, we were able to identify additional novel DEGs, which did not reach statistical significance in the prior full data set analysis. Those included CTCL-upregulated BCL11A, SELL, IRF1, SMAD1, CASP1, BIRC5, and MAX and CTCL-downregulated MDM4, SERPINB3, and THBS4 genes. With respect to sample biases, fresh samples tightly clustered together whether we performed subgroup analyses or the full data set analysis. 
While principal component analysis revealed that fresh samples were spatially closer together, indicating some preprocessing batch effect, they remained in the proximity to other normal/benign and FFPE CTCL samples and were not clustering as outliers by themselves. Notably, this did not affect the determination of DEGs when analyzing ≥2009 samples (fresh and FFPE biopsies) vs. ≥2009 FFPE samples alone.
Introduction
Cutaneous T-cell lymphomas (CTCLs) represent ~4–8% of all non-Hodgkin’s lymphomas and are characterized by infiltration of malignant T lymphocytes into the skin (1). Most patients first present with stage I disease, limited to the skin, which can either follow an indolent course (in 70–80% of cases) or progress to a potentially devastating, deadly malignancy with a median survival of <3 years (2). The diagnosis of CTCL is rather challenging for several reasons. First, mycosis fungoides (MF) and Sézary syndrome (SS), the most recognized variants of CTCL, can have a variable presentation (3). Second, other common and rare benign inflammatory dermatoses can mimic CTCL and vice versa. Classically, MF may present with centrally distributed erythematous patches and plaques that are not specific to CTCL and are commonly misdiagnosed as chronic eczema, psoriasis, pityriasis rubra pilaris, drug eruptions, and dermatophyte infections. Finally, histopathological analysis of skin biopsies and PCR evaluation of T-cell receptor clonality lack sensitivity in early MF patients and in erythrodermic disease. Unfortunately, current time to CTCL diagnosis from its initial presentation averages ~6 years (4).
Factors involved in the pathogenesis and prognostication of CTCL have emerged from recent epidemiological (5–8), karyotype/chromosomal (9–23), exome sequencing (24–28), and gene and microRNA expression profiling studies (3, 29–41), but remain incomplete and poorly elucidated. The lymphocyte precursor population was proposed to be different between MF (skin resident memory T lymphocytes) vs. SS (skin tropic central memory T lymphocytes with wide tropism) (42–45). Importantly, significant disease heterogeneity was noted on a molecular level, and genetic alterations in MF/SS were often not replicated between different studies. Pathways that are believed to be involved in CTCL pathogenesis include T-cell function/signaling/differentiation, JAK/STAT/NF-κB signaling, cytokine production, chromatin remodeling, cell cycle checkpoint regulation, DNA repair, as well as cancer testis and embryonic stem cell signaling and function (24, 25, 28, 46). The discovery and validation of prognostic biomarkers for disease progression and patient survival remain critical to help identify the minority of stage I MF patients who will eventually progress to advanced disease (~20–30% of patients). Poor disease outcome may be heralded by high expression of TOX, GTSF1, NOTCH1, CCR4, ITK, FYB, SYC1, LCK or miR155, miR21, and let-7i microRNAs (26, 31, 39, 47).
Recently, we used Illumina’s TruSeq targeted RNA gene expression platform to analyze a new cohort of 157 patients with biopsy-confirmed CTCL and compared it to a cohort of patients with normal skin and benign skin conditions (41). A number of patients in this study provided longitudinal biopsy samples (41). Analyzed samples included (A) 29 formalin-fixed paraffin-embedded (FFPE) tissues from benign inflammatory dermatoses and skin tag biopsies (1 sample per patient; 7 skin tag samples and 22 benign inflammatory dermatoses samples); (B) 134 FFPE samples of lesional CTCL skin from 110 patients; and (C) an additional 18 samples of freshly obtained and liquid nitrogen snap-frozen skin samples from a different group of CTCL patients. We processed 181 skin biopsy samples, either freshly obtained or FFPE, using the TruSeq platform, capturing 284 genes that were previously identified as important for CTCL diagnosis and/or prognosis (32, 48). We identified 75 statistically significant differentially expressed genes (DEGs) between benign skin samples and either all CTCL or stage IV CTCL samples (41) and validated a number of our previous diagnostic and prognostic expression markers (3, 41).
However, we noticed non-trivial heterogeneity when performing clustering based on the TruSeq gene expression data, where early-stage CTCL samples and benign samples were admixed in the same clusters with the stage IV advanced CTCL disease. We hypothesized that this could be due to differences in TruSeq library sequencing depth and/or variation in the quality of the FFPE samples obtained during 2007–2008 (older) vs. 2009–2012 (more recent) years. Indeed, recent samples that were freshly obtained and snap frozen had comparable total numbers of sequencing reads (400–1,000 K reads), while older FFPE samples often had <300 K sequencing reads (41). In addition, we observed that freshly obtained snap-frozen CTCL samples were often tightly grouped in the same cluster, independent of their disease stage (41). This suggests that TruSeq gene expression analysis may be affected by intrinsic biases based on the very nature of the samples analyzed (e.g., FFPE vs. fresh-frozen biopsies).
Notably, these variables (i.e., old vs. new; FFPE vs. freshly obtained snap frozen) were not formally evaluated in the prior publication but may contribute to the observed heterogeneity. These variations contribute toward a larger problem, known as the batch effect, in the field of gene expression-based analyses that utilize TruSeq, RNA-Seq, gene expression microarrays, and other approaches to identify DEGs. Differences in preprocessing, sequencing runs, technicians/centers, date of experiments, populations, and experimental design can account for heterogeneity that will remain despite normalization and use of control samples. Potential consequences of batch effect include reduction of statistical accuracy, introduction of spurious DEGs, and discrepancies between observed and true correlations (49). Several techniques can be used to minimize batch effects without removing true signals including surrogate variable analysis (50), ComBat (51), and principal component-based approaches (i.e., EIGENSTRAT among others) (52).
In this study, we aimed to characterize TruSeq gene expression patterns separately in older (≤2008) vs. more recent (≥2009) FFPE samples to determine whether clustering analyses results display robustness when compared to the full data set. We also explored sample processing biases (old vs. new and FFPE vs. freshly obtained snap frozen).
Materials and Methods
Patients and Samples
As described before (41), all patients were enrolled in the study in accordance with the IRB-approved protocols: PA12-0267, PA12-0497, and Lab97-256 at the MD Anderson Cancer Center (MDACC) and A09-M106-13A and 13-201-GEN at McGill University/McGill University Health Centre (MUHC). This study was carried out in accordance with the recommendations of the Research Ethics Board of the McGill University/MUHC with written informed consent from all subjects in accordance with the Declaration of Helsinki. This study was carried out in accordance with the recommendations of the MDACC Research Ethics Board, which exempted us from obtaining written informed consent from patients, who earlier signed a hospital consent allowing their stored biopsy samples to be used for research.
Data Acquisition
Processed TruSeq data from Litvinov et al. (41) were re-analyzed in this study based on transcripts per million (TPM) and RNA integrity number (RIN) parameters. Raw data were deposited in the NCBI SRA, accession number SRP114956. We separated CTCL FFPE samples obtained from the MDACC into two subgroups: older (≤2008) vs. more recent (≥2009).
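For readers less familiar with the TPM unit, the standard definition can be sketched as follows. This is a minimal illustration only: the exact normalization used in the original TruSeq pipeline is not described here, the function name `counts_to_tpm` and the read counts are hypothetical, and the default of equal target lengths is an assumption that may be reasonable for a targeted amplicon panel.

```python
def counts_to_tpm(counts, lengths=None):
    """Convert raw read counts per gene to transcripts per million (TPM).

    counts  : dict mapping gene -> read count for one sample
    lengths : optional dict mapping gene -> target length; when omitted,
              all targets are assumed to have equal length (an assumption).
    """
    genes = list(counts)
    if lengths is None:
        lengths = {g: 1.0 for g in genes}
    # Length-normalized read rate for each gene
    rates = {g: counts[g] / lengths[g] for g in genes}
    total = sum(rates.values())
    if total == 0:
        return {g: 0.0 for g in genes}
    # Rescale so that the values in one sample sum to one million
    return {g: rates[g] / total * 1e6 for g in genes}


# Hypothetical read counts for three panel genes in one sample
sample_counts = {"TOX": 300, "GTSF1": 100, "LTBP4": 600}
tpm = counts_to_tpm(sample_counts)
```

Because each sample is rescaled to the same total, TPM values are comparable between samples sequenced at different depths, which matters here given the lower read counts of the ≤2008 samples.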
Clustering
Unsupervised hierarchical clustering was performed in R, using packages stats, cluster, and gplots. A pairwise dissimilarity (distance) matrix was calculated using Gower’s method, which performs well in the case of incomplete/missing data when compared to other methods (53). Clusters were obtained using Ward’s clustering method and criteria (54). Silhouette plots followed by visual inspection (to ensure appropriately sized clusters) were used to assess cluster and subcluster divisions. We repeated similar comparisons for all samples, benign samples vs. stage IV CTCL disease, and early (stage ≤IIA) vs. intermediate (stages IIB and III) vs. advanced (stage IV) CTCL.
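To illustrate why Gower's method tolerates missing values, the sketch below computes Gower dissimilarities for numeric expression profiles in plain Python. It is not the R code used in this study; the function names and the toy values are hypothetical, and skipping features with zero range is a simplifying choice made for this example.

```python
import math


def feature_ranges(rows):
    """Range (max - min) of each feature, ignoring missing values (None)."""
    ranges = []
    for col in zip(*rows):
        vals = [v for v in col if v is not None]
        ranges.append(max(vals) - min(vals) if vals else 0.0)
    return ranges


def gower_distance(a, b, ranges):
    """Gower dissimilarity between two numeric profiles.

    Each feature contributes |a - b| / range; features missing in either
    profile (or with zero range) are left out of both the sum and the count.
    """
    total, used = 0.0, 0
    for x, y, r in zip(a, b, ranges):
        if x is None or y is None or r == 0:
            continue
        total += abs(x - y) / r
        used += 1
    return total / used if used else math.nan


def gower_matrix(rows):
    """Full pairwise dissimilarity matrix for a list of profiles."""
    ranges = feature_ranges(rows)
    n = len(rows)
    return [[gower_distance(rows[i], rows[j], ranges) for j in range(n)]
            for i in range(n)]


# Three hypothetical samples x three genes; None marks a missing value
profiles = [
    [1.0, 10.0, None],
    [2.0, None, 5.0],
    [3.0, 30.0, 9.0],
]
d = gower_matrix(profiles)
```

Each pairwise distance is averaged only over the genes observed in both samples, so a sample with a few missing genes can still be placed in the dissimilarity matrix that is then passed to the hierarchical clustering step.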
Principal Component Analysis (PCA)
Principal component analysis was performed on scaled, centered TPM data using package pcaMethods (55). Probabilistic PCA was used to account for missing data. Score plots of principal components 1 and 2 were generated.
Statistical Analyses
Differences in mean TPMs were determined using two-tailed Welch’s t-test. Power analysis showed an 86% power to detect a twofold expression change at a significance level of 0.05 for the comparison between the smallest subgroups, with complete data points. Correlations were computed using Spearman’s rho on log-2 ratios. Mean RINs were compared using a Bayesian analysis with Markov Chain Monte Carlo (MCMC) simulations, using R package rjags; at least 100,000 iterations were performed to estimate p values.
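As a worked illustration of the comparison of group means, the following sketch computes the test statistic and degrees of freedom, assuming the unequal-variance (Welch) form of the two-sample t-test. The function name `welch_t` and the TPM values are hypothetical. Converting the statistic to a two-tailed p value requires the Student t distribution, which is not in the Python standard library, so that step is left out.

```python
import math
from statistics import mean, variance


def welch_t(x, y):
    """Return (t statistic, Welch-Satterthwaite degrees of freedom).

    x, y : lists of expression values (e.g., TPMs of one gene) in two groups.
    Sample variances are used and are not assumed equal between groups.
    """
    nx, ny = len(x), len(y)
    # Squared standard error of each group mean
    vx, vy = variance(x) / nx, variance(y) / ny
    t = (mean(x) - mean(y)) / math.sqrt(vx + vy)
    df = (vx + vy) ** 2 / (vx ** 2 / (nx - 1) + vy ** 2 / (ny - 1))
    return t, df


# Hypothetical TPM values for one gene in two groups of samples
benign = [1.0, 2.0, 3.0, 4.0, 5.0]
ctcl = [2.0, 4.0, 6.0, 8.0, 10.0]
t_stat, dof = welch_t(benign, ctcl)
```

The fractional degrees of freedom reflect the unequal variances of the two groups, which is the usual situation when comparing benign samples with a heterogeneous set of CTCL samples.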
Results
Subgroup Clustering Analysis of All Samples
We previously noted that the ≤2008 FFPE samples had a significantly lower number of sequence reads per sample when compared to the ≥2009 samples (mean 103,406 ± 96,620 vs. 437,218 ± 550,840 reads, respectively). Therefore, we repeated unsupervised hierarchical clustering for benign samples (skin tags and benign inflammatory dermatoses), fresh liquid nitrogen snap-frozen CTCL samples, and either ≤2008 or ≥2009 FFPE CTCL samples. For ≤2008 FFPE sample analysis (Figure 1), we observed three major clusters. Cluster 1 comprised exclusively the FFPE CTCL samples, mostly early-stage (≤IIA) (12/33), along with two mid-stage (IIB and III) (2/8) and one late-stage (IV) (1/14) disease. In Cluster 2, 21 of 22 samples were from CTCL patients representing advanced stages (mid = 4/8 and late = 9/14), along with one eczema sample and a number of early-stage CTCL samples (8/33). Cluster 3 formed multiple subgroups (~4) that comprised mostly benign samples (28/29) and fresh CTCL samples, along with many early-stage and mid-late stage FFPE CTCL samples (early = 13/33, mid = 2/8, and late = 4/14). As previously discussed (41), one of the subgroups encompassed all fresh CTCL samples, which tightly clustered together (18/18). Two of the subgroups contained mostly benign samples, while the last one had early-stage FFPE CTCL samples. For ≥2009 FFPE samples (Figure 2), we noted two small clusters and two larger clusters. The first small cluster on the left panel (Cluster 1) contained six FFPE CTCL samples (two early, one mid, and three late). The second small cluster on the right (Cluster 4) included 18/18 fresh CTCL samples, similarly to our previous analyses, along with 2 benign biopsies. The first large cluster on the center left panel (Cluster 2) exhibited significant molecular disease heterogeneity. 
The first subgroup (A) had primarily mid-stage (n = 9) CTCL skin biopsies, one early and two late-stage samples, while the other two subgroups (B and C) were very heterogeneous with respect to their composition. For the second large cluster on the center right panel (Cluster 3), a similar admixture was observed with three subgroups, one subgroup comprising primarily benign skin samples (C) and the other two containing predominantly early (A) and advanced (B) stage CTCL disease samples.
Figure 1. Unsupervised hierarchical clustering analysis based on TruSeq targeted RNA gene expression analysis of 284 select genes in benign inflammatory dermatoses (green), freshly obtained and snap-frozen cutaneous T-cell lymphoma (CTCL) samples (gray), and ≤2008 formalin-fixed paraffin-embedded (FFPE) CTCL samples (early stage, yellow; mid stage, orange; and late stage, dark red). A color key refers to gene expression in log(transcripts per million).
Figure 2. Unsupervised hierarchical clustering analysis based on TruSeq targeted RNA gene expression analysis of 284 select genes in benign inflammatory dermatoses (green), freshly obtained and snap-frozen cutaneous T-cell lymphoma (CTCL) samples (gray), and ≥2009 formalin-fixed paraffin-embedded (FFPE) CTCL samples (early stage, yellow; mid stage, orange; and late stage, dark red). A color key refers to gene expression in log(transcripts per million).
Subgroup Clustering Analysis of Healthy Skin/Benign Inflammatory Dermatoses Samples vs. Stage IV CTCL Samples
We then performed unsupervised hierarchical clustering for benign samples (which included skin tags and benign dermatoses that often clinically mimic CTCL) vs. stage IV CTCL disease. As before, two analyses were performed, one for ≤2008 and one for ≥2009 FFPE biopsies. In the case of ≤2008 samples (Figure 3), there were two major clusters that separated these biopsies quite well based on gene expression changes. Cluster 1 had 13 samples, 12 of which were stage IV CTCL disease (including 12/14 of total stage IV CTCL samples) and 1 sample from a patient with chronic eczema. Cluster 2 contained 30 samples in total and comprised mostly benign dermatoses and skin tags (n = 28) and 2 stage IV CTCL samples.
Figure 3. Unsupervised hierarchical clustering analysis based on TruSeq targeted RNA gene expression analysis of 284 select genes in benign inflammatory dermatoses (green) vs. ≤2008 stage IV formalin-fixed paraffin-embedded cutaneous T-cell lymphoma (CTCL) samples (red). A color key refers to gene expression in log(transcripts per million).
Surprisingly, for ≥2009 samples (Figure 4), greater overall heterogeneity was observed. However, we noted one small cluster in the right panel and one large cluster with three subgroups in the center. Cluster 1 (right panel) had nine samples, eight of which were stage IV samples (8/20 total stage IV CTCL samples). Cluster 2 was subdivided into three subgroups, where 2A samples (n = 7) with advanced CTCL disease tightly clustered together, while 2B (n = 20) and 2C (n = 11) samples included primarily benign dermatoses and skin tags (85 and 82%, respectively, for each subcluster).
Figure 4. Unsupervised hierarchical clustering analysis based on TruSeq targeted RNA gene expression analysis of 284 select genes in benign inflammatory dermatoses (green), vs. ≥2009 stage IV formalin-fixed paraffin-embedded cutaneous T-cell lymphoma (CTCL) samples (red). A color key refers to gene expression in log(transcripts per million).
Identification of Differentially Expressed Genes (DEGs) in All Samples Using Subgroup Analyses Based on the Year of Biopsy and Benign vs. Malignant Nature of Samples
We then analyzed our full data set by performing Welch’s t-test to compare benign dermatoses vs. either (1) all CTCL samples or (2) stage IV CTCL. In our initial report (41), we identified important differentially expressed genes (DEGs) including TOX, FYB, LEF1, CCR4, ITK, EED, POU2AF, IL-26, STAT5, BLK, GTSF1, PSORS1C2, CD70, and STAT signaling genes; LTA, NFKB1, NFKB2, and IL-15; and other inflammatory cytokines. In this study, we repeated the analysis separately for the FFPE samples obtained ≤2008 and ≥2009.
As presented in Table 1, our analysis revealed 54 DEGs (p < 0.05), when ≤2008 stage I–IV CTCL or ≤2008 stage IV CTCL samples were compared to benign skin samples. This list included 47/75 DEGs that were enriched in the initially reported full data set (41). New highlighted CTCL-upregulated targets in this analysis included BCL11A, SELL, IRF1, SMAD1, CASP1, and BIRC5, while THBS4 was upregulated in benign skin samples. For ≥2009 samples, 41 significant DEGs (p < 0.05) were found when freshly obtained and ≥2009 FFPE CTCL biopsies were analyzed together in a similar way (Table 2). Importantly, the same 41 DEGs were identified using the ≥2009 FFPE samples alone (i.e., excluding the freshly obtained biopsies from this analysis). In the latter analysis, four additional CTCL-upregulated DEGs (EP400, NFKB1, TRRAP, and MAX) were revealed as being statistically significant (Table 2).
Table 1. Genes with statistically significant differences in expression both between benign skin dermatoses vs. all ≤2008 CTCL samples (left panel) and between benign skin lesions vs. ≤2008 stage IV CTCL samples (right panel).
Table 2. Genes with statistically significant differences in expression both between benign skin dermatoses vs. all ≥2009 CTCL samples (left panel) and between benign skin lesions vs. ≥2009 stage IV CTCL samples (right panel).
Based on these combined results, 42/75 DEGs were confirmed in both analyses, which highlights significant robustness of these tests. Of course, many of the initially identified DEGs did not achieve statistical significance since the number of samples analyzed in each of these subanalyses (i.e., ≤2008 and ≥2009) was significantly smaller than when all the data were analyzed as one set. Moreover, based on the original TruSeq data, subgroup analysis showed consistency in log-2 ratios between ≤2008 and ≥2009 CTCL samples. Indeed, rank–rank correlation when comparing benign dermatoses vs. all FFPE CTCL samples was ρ = 0.71 (strong; p < 10⁻¹⁶), while this indicator was ρ = 0.55 (medium; p < 10⁻¹⁶) when comparison was made between benign dermatoses and stage IV FFPE biopsies.
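The rank–rank comparison of log-2 ratios between the two subgroups can be sketched as follows. This is an illustrative example with hypothetical mean TPM values and helper names, not the analysis code; Spearman's rho is computed here as the Pearson correlation of average ranks, which handles ties.

```python
import math


def log2_ratio(mean_ctcl, mean_benign):
    """Log-2 ratio of mean expression in CTCL vs. benign samples."""
    return math.log2(mean_ctcl / mean_benign)


def average_ranks(values):
    """Ranks starting at 1; tied values receive the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length lists."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical (mean CTCL TPM, mean benign TPM) pairs for four genes
older = [(800, 200), (300, 300), (50, 400), (120, 60)]   # <=2008 subgroup
newer = [(600, 200), (250, 300), (100, 400), (90, 60)]   # >=2009 subgroup

ratios_older = [log2_ratio(c, b) for c, b in older]
ratios_newer = [log2_ratio(c, b) for c, b in newer]
rho = spearman_rho(ratios_older, ratios_newer)
```

In this toy example the four genes are ordered identically in both subgroups, so rho equals 1 even though the magnitudes of the log-2 ratios differ. A rank-based measure is therefore informative about whether the same genes are up- or downregulated in both subgroups, independent of differences in effect size between older and newer samples.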
Clustering Analysis of All FFPE CTCL Samples Using Subgroup Analyses Based on the Year of Biopsy and CTCL Clinical Cancer Stage
We then performed unsupervised hierarchical clustering analysis for early (≤IIA) vs. mid (IIB and III) vs. late (IV) stage CTCL for ≤2008 vs. ≥2009 samples. As in our original report (41), we noted significant molecular heterogeneity. However, for the ≤2008 CTCL FFPE samples (Figure 5), there were two major clusters. Cluster 1 had 12 samples, 10 of which were early-stage CTCL biopsies (10/31 of the total early-stage CTCL samples). Cluster 2 was rather heterogeneous with respect to its composition and could be subdivided into two subclusters: 2A, larger, with samples from all different stages and 2B with the well-defined subgroup of early-stage CTCL biopsies (10/11) on the right side of this subcluster. For ≥2009 samples (Figure 6), the distribution of samples was very heterogeneous, as was seen in our earlier report (41).
Figure 5. Unsupervised hierarchical clustering analysis based on TruSeq targeted RNA gene expression analysis of 284 select genes in ≤2008 early-stage (stage ≤IIA, yellow), mid-stage (stages IIB and III, orange), and late-stage (stage IV, dark red) formalin-fixed paraffin-embedded cutaneous T-cell lymphoma (CTCL) samples. A color key refers to gene expression in log(transcripts per million).
Figure 6. Unsupervised hierarchical clustering analysis based on TruSeq targeted RNA gene expression analysis of 284 select genes in ≥2009 early-stage (stage ≤IIA, yellow), mid-stage (stages IIB and III, orange), and late-stage (stage IV, dark red) formalin-fixed paraffin-embedded cutaneous T-cell lymphoma (CTCL) samples. A color key refers to gene expression in log(transcripts per million).
Identification of Differentially Expressed Genes (DEGs) in All FFPE CTCL Samples Using Subgroup Analyses Based on the Year of Biopsy and CTCL Clinical Cancer Stage
We next searched for the DEGs that were highlighted when we compared (1) early-stage (≤IIA) to mid and late CTCL stages (≥IIB) samples and (2) stage I vs. stage IV CTCL samples. Similarly, in this case, we analyzed ≤2008 and ≥2009 CTCL samples separately to test the robustness of the TruSeq results based on the year of skin biopsy. For ≤2008 samples, 12 genes were highlighted as being statistically significant: TOX, EED, and LCP2 were upregulated in late-stage CTCL, while ATXN7, CHD1, HUNK, TP63, KIT, JUNB, LTBP4, HDAC2, and OTUB2 were expressed preferentially in early-stage CTCL samples (Table 3). For ≥2009 samples, three different genes were identified: SKAP1 and GTSF1 were upregulated in late-stage CTCL, while BCL11A was upregulated in early-stage CTCL (Table 4). Overall, merging both subgroups, we validated three of the four genes observed in the full data set when performing the same analysis: TOX and GTSF1 were upregulated in late-stage CTCL, and LTBP4 was upregulated in early-stage CTCL.
Table 3. Genes with statistically significant differences in expression both between ≤2008 early-stage (≤IIA) vs. mid- and late-stage (≥IIB) formalin-fixed paraffin-embedded (FFPE) cutaneous T-cell lymphoma (CTCL) samples (left panel) and between ≤2008 stage I vs. stage IV FFPE CTCL samples (right panel).
Table 4. Genes with statistically significant differences in expression both between ≥2009 early stage (≤IIA) vs. mid and late stage (≥IIB) formalin-fixed paraffin-embedded (FFPE) cutaneous T-cell lymphoma (CTCL) samples (left panel) and between ≥2009 stage I vs. stage IV FFPE CTCL samples (right panel).
This subgroup analysis showed low consistency in log-2 ratios between ≤2008 and ≥2009 samples when comparing early vs. mid and late FFPE CTCL samples (ρ = 0.28; low; p < 10⁻⁴). Moreover, there was no correlation when comparing stage I vs. stage IV FFPE tissues (ρ = 0.06; no correlation; p = 0.36).
Comparison of the TruSeq Data Quality in FFPE vs. Freshly Obtained Snap-Frozen Samples
With respect to the RINs, a measure of sample quality prior to conducting the TruSeq analysis, FFPE samples had much lower RINs than freshly obtained snap-frozen samples, as expected [fresh: mean 6.1, 95% confidence interval (95% CI) 5.5–6.8; FFPE: mean 2.4, 95% CI 2.3–2.5; p < 10⁻⁶ with MCMC]. However, these RINs were within the expected range for FFPE samples (56, 57). RNA libraries were also more concentrated in the freshly obtained samples (fresh: mean 227 ng/µL, 95% CI 110–239; FFPE: mean 64 ng/µL, 95% CI 53–75; p = 0.0012 with MCMC). There was no difference in RINs between ≤2008 and ≥2009 FFPE samples (≤2008: mean 2.3, 95% CI 2.2–2.3; ≥2009: mean 2.4, 95% CI 2.3–2.6; p = 0.92 with MCMC). RNA libraries were less concentrated in ≤2008 FFPE samples vs. ≥2009 samples, possibly explaining in part the lower TruSeq sequencing depth in ≤2008 samples (≤2008: mean 44 ng/µL, 95% CI 33–55; ≥2009: mean 76 ng/µL, 95% CI 60–92; p = 0.0007 with MCMC).
To detect possible batch effects, we performed PCA on TPM data. We aimed to determine whether (1) freshly obtained flash snap-frozen samples cluster together and (2) ≤2008 and ≥2009 FFPE samples cluster in different areas. We observed a tight cluster of fresh CTCL samples (gray), whether using ≤2008 (Figures 7A,B) or ≥2009 (Figures 7C,D) FFPE CTCL samples, indicating that differences in preprocessing protocols might explain these findings (tight associations in clustering analyses). However, these freshly obtained samples were also in close spatial proximity to normal/benign samples and a number of FFPE samples. When comparing ≤2008 and ≥2009 FFPE samples, we observed no clear clusters (Figures 7E,F), but rather two loose associations. First, many newer samples (≥2009) were clustering around the center of the distribution, toward normal/benign and freshly obtained samples, indicating less preprocessing batch effect. Second, samples showing greater variability were mostly older samples (≤2008), indicating that there might be some processing batch effect among these. Taken together, these findings may explain why performing individual subgroup analyses enabled us to uncover additional DEGs.
Figure 7. Principal component score plots. (A,B) First and second principal component scores of normal/benign (green), freshly obtained and liquid nitrogen snap-frozen cutaneous T-cell lymphoma (CTCL) (gray), ≤2008 early-stage formalin-fixed paraffin-embedded (FFPE) CTCL (yellow), ≤2008 mid-stage FFPE CTCL (orange), and ≤2008 advanced stage FFPE CTCL (red) samples are plotted. (C,D) First and second principal component scores of normal/benign (green), freshly obtained and liquid nitrogen snap-frozen CTCL (gray), ≥2009 early-stage FFPE CTCL (yellow), ≥2009 mid-stage FFPE CTCL (orange), and ≥2009 advanced stage FFPE CTCL (red) samples are plotted. (E,F) First and second principal component scores of normal/benign (green), freshly obtained and liquid nitrogen snap-frozen CTCL (gray), ≤2008 FFPE CTCL (red), and ≥2009 FFPE CTCL (yellow) samples are plotted.
Discussion
In this study, we used subgroup analysis to determine whether older ≤2008 FFPE samples, which were sequenced at a lower depth on the TruSeq platform, were comparable to those obtained ≥2009. We also systematically analyzed sample processing biases based on the year of biopsy and the nature (i.e., FFPE vs. freshly obtained snap frozen) of the samples. Clustering analysis showed that ≤2008 and ≥2009 samples clustered as well as the full data set and, in a number of instances, demonstrated even better defined clusters. In particular, for ≤2008 samples, clusters were more reminiscent of the three clusters found in the landmark Boston CTCL cohort (3, 32, 48) when looking at all samples. There was also a better discrimination between benign and stage IV CTCL samples in ≤2008 samples than in the ≥2009 samples. Both analyses produced nearly identical trends and findings. Specifically, both analyses enriched nearly identical DEGs when comparing benign vs. (1) stage I–IV and (2) stage IV (alone) CTCL samples. Importantly, in this subgroup analysis, we recapitulated most of the targets seen within the full data set. Results obtained using either ≤2008 or ≥2009 samples were strongly correlated. Known upregulated targets in CTCL vs. benign dermatoses were validated, including TOX, FYB, LEF1, and STAT signaling genes, inflammatory interleukins, NF-κB pathway signaling members, and cancer testis genes. We had previously reviewed in detail how these genes relate to the biology of CTCL tumorigenesis (3, 31).
Furthermore, this subgroup analysis enabled us to discover additional genes that did not reach statistical significance in the full data set analysis. This may seem counterintuitive given the inherently decreased power of subgroup analyses. However, potential reasons why additional DEGs can be identified through subgroup analysis include reduced per-sample variability due to increased in-group similarity and the removal of outliers in some groups.
Those new DEGs included CTCL-upregulated BCL11A (regulation of RNA transcription), SELL (cell adhesion molecule in the selectin family), IRF1 (Interferon transcription factor), SMAD1 (BMP signaling and gene expression), CASP1 (caspase involved in proteolysis), BIRC5 (inhibitor of apoptosis, survivin), MAX (Myc-associated transcription factor), and CTCL-downregulated MDM4 (negative regulator of p53), SERPINB3 (serine protease involved in inflammatory response), and THBS4 (cell-cell and cell-matrix interactions) genes. Of note, THBS4 promoter was previously found to be frequently hypermethylated in 52% of CTCL samples, which leads to the downregulation in expression of this tumor suppressor gene (58). CASP1 single-nucleotide polymorphisms were associated with changes in NF-κB signaling and development of other non-Hodgkin lymphomas, including diffuse-large B cell lymphomas and small lymphocytic lymphoma/chronic lymphocytic leukemia (59).
By using the full data set analysis, we found significant heterogeneity in our clusters (41). When we performed clustering on ≤2008 or ≥2009 FFPE CTCL samples, we still did not obtain three clusters that were previously described in the historic Boston cohort of CTCL patients (3, 32, 48). However, in this study of subgroup analyses, we noted less heterogeneity than we observed in the full data set analysis (41). PCA results also supported this conclusion. Indeed, in this subgroup analysis, samples of similar clinical disease stages were most often grouped together.
Subsequently, when we studied the DEGs enriched in both (1) early vs. mid and late CTCL and (2) stage I vs. stage IV disease, four genes were differentially expressed: TOX (involved in chromatin processes and T-cell development), FYB (T-cell adaptor protein), and GTSF1 (germ cell maintenance) were upregulated, and LTBP4 (latent TGF-beta binding protein) was downregulated in later CTCL stages. By merging subgroup analysis of ≤2008 and ≥2009 FFPE samples, our targets included TOX, GTSF1, and LTBP4 as well. In particular, TOX overexpression is a hallmark of poor prognosis in CTCL, although low level of TOX expression has been previously reported in benign dermatoses (31, 60). TOX and GTSF1 are aberrantly expressed developmental and meiotic genes that can prognosticate CTCL progression toward advanced disease (29, 31, 34). We also found that EED (Polycomb complex member expressed in embryonic stem cells), SKAP1 (T-cell adhesion), and LCP2 (T-cell receptor-mediated signaling) were upregulated in advanced CTCL stages. Surprisingly, we also found multiple genes with higher expression in early-stage tumors. These included BCL11A (see above), ATXN7 (chromatin remodeling, AKT signaling), HUNK (AMPK-related kinase), CHD1 (chromatin remodeling), TP63 (transcription factor), KIT (receptor tyrosine kinase), JUNB (transcription factor), HDAC2 (histone deacetylase), and OTUB2 (deubiquitinase, inhibits proteolysis). Based on these combined results, transcription factors, chromatin remodelers, and global cell signaling processes are upregulated early in the disease, while in the advanced stages of CTCL, T-cell-specific genes, inflammatory mediators, and stem cell/germ cell maintenance genes appear to be driving cancer progression. These results further argue that subgroup analysis can often yield additional clues into the biology of cancers.
Formalin-fixed paraffin-embedded samples have RNA of lesser quality than freshly obtained snap-frozen samples (61). However, FFPE samples are much easier to obtain in the clinical setting, have a longer storage half-life, and are suitable for immunohistochemistry in a clinical pathology lab (62). Our FFPE RINs were comparable to those obtained in previous studies (56, 57). Regardless of whether we performed subgroup analyses or the full data set analysis, fresh samples clustered tightly together. While PCA revealed that fresh samples were spatially closer together, indicating some preprocessing batch effect, they remained in proximity to other normal/benign and FFPE CTCL samples and did not cluster as outliers by themselves. Moreover, this batch effect did not affect the determination of DEGs when all ≥2009 samples (fresh and FFPE biopsies) were compared with ≥2009 FFPE samples alone. Other reports comparing freshly obtained frozen samples with FFPE samples showed a strong correlation (ρ > 0.70) in gene expression analysis (63). Formalin acts as a crosslinking agent for protein–protein, DNA–protein, and RNA–protein interactions (64). Crosslinking nucleic acids to proteins has its advantages in molecular medicine and is especially useful in characterizing transcription factor binding sites via chromatin immunoprecipitation (65) or RNA–protein interactions using RNA immunoprecipitation (66). In this study, we have successfully applied TruSeq targeted RNA sequencing to CTCL samples, both fresh and FFPE. A recent, direct comparison of TruSeq-analyzed RNA obtained from matched FFPE vs. fresh samples produced strongly correlated gene expression findings (R² > 0.70) (67). Interestingly, previous studies showed that RINs can range from 2.2 to 2.8 (median 2.3) for FFPE samples and from 3.8 to 8.0 (median 6.8) for freshly obtained samples (67), which is consistent with our findings detailed in this report.
In the study by Graw et al., Illumina sequence reads from FFPE and freshly obtained matched samples showed a 0.33% error rate (67), which is consistent with previous reports for identical samples processed on the Illumina platform, for which a 0.30% error rate was reported (68). In summary, our results indicate that performing targeted gene expression studies on the TruSeq platform from FFPE samples is a viable option that can be used in the real-life, clinical medicine setting.
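The two agreement measures quoted above, Spearman's ρ and R², can be computed for a pair of matched samples as in the following minimal sketch. It uses only the Python standard library; the expression values are hypothetical toy numbers chosen for illustration and are not data from this study or from the cited reports.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient r for two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / sqrt(sxx * syy)


def average_ranks(values):
    """1-based ranks; tied values share the mean of the positions they occupy."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's rho: the Pearson correlation of the rank-transformed values."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical log2 expression values for the same six genes in a matched pair.
ffpe = [2.1, 5.4, 3.3, 8.0, 6.2, 1.0]
fresh = [2.8, 6.1, 3.0, 9.2, 5.9, 1.5]

rho = spearman(ffpe, fresh)
r_squared = pearson(ffpe, fresh) ** 2
print(f"Spearman rho = {rho:.3f}, R^2 = {r_squared:.3f}")
```

Spearman's ρ depends only on the ordering of genes by expression, so it is less sensitive to a systematic shift in signal between FFPE and fresh material than R², which measures linear agreement of the values themselves.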
Ethics Statement
All patients were enrolled in the study in accordance with the IRB-approved protocols: PA12-0267, PA12-0497, and Lab97-256 at the MD Anderson Cancer Center (MDACC) and A09-M106-13A and 13-201-GEN at McGill University/McGill University Health Centre (MUHC). This study was carried out in accordance with the recommendations of the Research Ethics Board of the McGill University/McGill University Health Centre with written informed consent from all subjects in accordance with the Declaration of Helsinki. This study was carried out in accordance with the recommendations of the MD Anderson Cancer Center (MDACC) Research Ethics Board, which exempted us from obtaining written informed consent from patients, who earlier signed a hospital consent allowing their stored biopsy samples to be used for research.
Author Contributions
PL, EN, MT, LM, AW, DS, XN, NP, MG, MD, and IL procured and analyzed patient sample data presented in this paper. PL, EN, and IL performed bioinformatic and statistical analyses. MT and AW performed pathological analysis of the original skin samples. PL, EN, MT, LM, AW, DS, XN, NP, MG, MD, and IL wrote the paper. NP, LM, MG, DS, MD, and IL supervised the study.
Conflict of Interest Statement
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
The reviewer YG and handling editor declared their shared affiliation.
Acknowledgments
We thank Mr. Philippe Thibault for his assistance with the bioinformatic analysis. This work was supported by the new investigator funding program from the Ottawa Hospital Research Institute to IL; Canadian Dermatology Foundation research grants to DS and IL; a Joan Sealy Trust Cancer Research Fund (Ottawa Hospital Research Institute) grant to IL; a Department of Medicine, The Ottawa Hospital research grant to IL; and Fonds de la recherche en santé du Québec (FRSQ) research grants to DS (FRQS# 22648) and to IL (# 34753 and 36769). MD is a Blanche Bender Professor of Cancer Research and received funding from the Dorothy and Martin Spatz Foundation.
References
1. Willemze R, Jaffe ES, Burg G, Cerroni L, Berti E, Swerdlow SH, et al. WHO-EORTC classification for cutaneous lymphomas. Blood (2005) 105:3768–85. doi:10.1182/blood-2004-09-3502
2. Han T, Abdel-Motal UM, Chang DK, Sui J, Muvaffak A, Campbell J, et al. Human anti-CCR4 minibody gene transfer for the treatment of cutaneous T-cell lymphoma. PLoS One (2012) 7:e44455. doi:10.1371/journal.pone.0044455
3. Litvinov IV, Netchiporouk E, Cordeiro B, Dore MA, Moreau L, Pehr K, et al. The use of transcriptional profiling to improve personalized diagnosis and management of cutaneous T-cell lymphoma (CTCL). Clin Cancer Res (2015) 21:2820–9. doi:10.1158/1078-0432.CCR-14-3322
4. Kirsch IR, Watanabe R, O’Malley JT, Williamson DW, Scott LL, Elco CP, et al. TCR sequencing facilitates diagnosis and identifies mature T cells as the cell of origin in CTCL. Sci Transl Med (2015) 7:308ra158. doi:10.1126/scitranslmed.aaa9122
5. Ghazawi FM, Netchiporouk E, Rahme E, Tsang M, Moreau L, Glassman S, et al. Comprehensive analysis of cutaneous T-cell lymphoma (CTCL) incidence and mortality in Canada reveals changing trends and geographic clustering for this malignancy. Cancer (2017) 123(18):3550–67. doi:10.1002/cncr.30758
6. Litvinov IV, Tetzlaff MT, Rahme E, Habel Y, Risser DR, Gangar P, et al. Identification of geographic clustering and regions spared by cutaneous T-cell lymphoma in Texas using 2 distinct cancer registries. Cancer (2015) 121:1993–2003. doi:10.1002/cncr.29301
7. Litvinov IV, Tetzlaff MT, Rahme E, Jennings MA, Risser DR, Gangar P, et al. Demographic patterns of cutaneous T-cell lymphoma incidence in Texas based on two different cancer registries. Cancer Med (2015) 4:1440–7. doi:10.1002/cam4.472
8. Moreau JF, Buchanich JM, Geskin JZ, Akilov OE, Geskin LJ. Non-random geographic distribution of patients with cutaneous T-cell lymphoma in the Greater Pittsburgh Area. Dermatol Online J (2014) 20(7):13030.
9. Barba G, Matteucci C, Girolomoni G, Brandimarte L, Varasano E, Martelli MF, et al. Comparative genomic hybridization identifies 17q11.2 approximately q12 duplication as an early event in cutaneous T-cell lymphomas. Cancer Genet Cytogenet (2008) 184:48–51. doi:10.1016/j.cancergencyto.2008.03.007
10. Caprini E, Cristofoletti C, Arcelli D, Fadda P, Citterich MH, Sampogna F, et al. Identification of key regions and genes important in the pathogenesis of Sezary syndrome by combining genomic and expression microarrays. Cancer Res (2009) 69:8438–46. doi:10.1158/0008-5472.CAN-09-2367
11. Fischer TC, Gellrich S, Muche JM, Sherev T, Audring H, Neitzel H, et al. Genomic aberrations and survival in cutaneous T cell lymphomas. J Invest Dermatol (2004) 122:579–86. doi:10.1111/j.0022-202X.2004.22301.x
12. Karenko L, Sarna S, Kahkonen M, Ranki A. Chromosomal abnormalities in relation to clinical disease in patients with cutaneous T-cell lymphoma: a 5-year follow-up study. Br J Dermatol (2003) 148:55–64. doi:10.1046/j.1365-2133.2003.05116.x
13. Laharanne E, Oumouhou N, Bonnet F, Carlotti M, Gentil C, Chevret E, et al. Genome-wide analysis of cutaneous T-cell lymphomas identifies three clinically relevant classes. J Invest Dermatol (2010) 130:1707–18. doi:10.1038/jid.2010.8
14. Mao X, Lillington D, Scarisbrick JJ, Mitchell T, Czepulkowski B, Russell-Jones R, et al. Molecular cytogenetic analysis of cutaneous T-cell lymphomas: identification of common genetic alterations in Sezary syndrome and mycosis fungoides. Br J Dermatol (2002) 147:464–75. doi:10.1046/j.1365-2133.2002.04966.x
15. Mao X, Lillington DM, Czepulkowski B, Russell-Jones R, Young BD, Whittaker S. Molecular cytogenetic characterization of Sezary syndrome. Genes Chromosomes Cancer (2003) 36:250–60. doi:10.1002/gcc.10152
16. Mao X, McElwaine S. Functional copy number changes in Sezary syndrome: toward an integrated molecular cytogenetic map III. Cancer Genet Cytogenet (2008) 185:86–94. doi:10.1016/j.cancergencyto.2008.05.006
17. Prochazkova M, Chevret E, Mainhaguiet G, Sobotka J, Vergier B, Belaud-Rotureau MA, et al. Common chromosomal abnormalities in mycosis fungoides transformation. Genes Chromosomes Cancer (2007) 46:828–38. doi:10.1002/gcc.20469
18. Salgado R, Servitje O, Gallardo F, Vermeer MH, Ortiz-Romero PL, Karpova MB, et al. Oligonucleotide array-CGH identifies genomic subgroups and prognostic markers for tumor stage mycosis fungoides. J Invest Dermatol (2010) 130:1126–35. doi:10.1038/jid.2009.306
19. Shapiro PE, Warburton D, Berger CL, Edelson RL. Clonal chromosomal abnormalities in cutaneous T-cell lymphoma. Cancer Genet Cytogenet (1987) 28:267–76. doi:10.1016/0165-4608(87)90213-5
20. Thangavelu M, Finn WG, Yelavarthi KK, Roenigk HH Jr, Samuelson E, Peterson L, et al. Recurring structural chromosome abnormalities in peripheral blood lymphocytes of patients with mycosis fungoides/Sezary syndrome. Blood (1997) 89:3371–7.
21. van Doorn R, van Kester MS, Dijkman R, Vermeer MH, Mulder AA, Szuhai K, et al. Oncogenomic analysis of mycosis fungoides reveals major differences with Sezary syndrome. Blood (2009) 113:127–36. doi:10.1182/blood-2008-04-153031
22. Vermeer MH, van Doorn R, Dijkman R, Mao X, Whittaker S, van Voorst Vader PC, et al. Novel and highly recurrent chromosomal alterations in Sezary syndrome. Cancer Res (2008) 68:2689–98. doi:10.1158/0008-5472.CAN-07-6398
23. Wain EM, Mitchell TJ, Russell-Jones R, Whittaker SJ. Fine mapping of chromosome 10q deletions in mycosis fungoides and Sezary syndrome: identification of two discrete regions of deletion at 10q23.33-24.1 and 10q24.33-25.1. Genes Chromosomes Cancer (2005) 42:184–92. doi:10.1002/gcc.20115
24. Wang L, Ni X, Covington KR, Yang BY, Shiu J, Zhang X, et al. Genomic profiling of Sezary syndrome identifies alterations of key T cell signaling and differentiation genes. Nat Genet (2015) 47:1426–34. doi:10.1038/ng.3444
25. Ungewickell A, Bhaduri A, Rios E, Reuter J, Lee CS, Mah A, et al. Genomic analysis of mycosis fungoides and Sezary syndrome identifies recurrent alterations in TNFR2. Nat Genet (2015) 47:1056–60. doi:10.1038/ng.3370
26. Sandoval J, Diaz-Lagares A, Salgado R, Servitje O, Climent F, Ortiz-Romero PL, et al. MicroRNA expression profiling and DNA methylation signature for deregulated microRNA in cutaneous T-cell lymphoma. J Invest Dermatol (2015) 135:1128–37. doi:10.1038/jid.2014.487
27. McGirt LY, Jia P, Baerenwald DA, Duszynski RJ, Dahlman KB, Zic JA, et al. Whole-genome sequencing reveals oncogenic mutations in mycosis fungoides. Blood (2015) 126:508–19. doi:10.1182/blood-2014-11-611194
28. da Silva Almeida AC, Abate F, Khiabanian H, Martinez-Escala E, Guitart J, Tensen CP, et al. The mutational landscape of cutaneous T cell lymphoma and Sezary syndrome. Nat Genet (2015) 47:1465–70. doi:10.1038/ng.3442
29. Huang Y, Litvinov IV, Wang Y, Su MW, Tu P, Jiang X, et al. Thymocyte selection-associated high mobility group box gene (TOX) is aberrantly over-expressed in mycosis fungoides and correlates with poor prognosis. Oncotarget (2014) 5:4418–25. doi:10.18632/oncotarget.2031
30. Litvinov IV, Cordeiro B, Fredholm S, Odum N, Zargham H, Huang Y, et al. Analysis of STAT4 expression in cutaneous T-cell lymphoma (CTCL) patients and patient-derived cell lines. Cell Cycle (2014) 13:2975–82. doi:10.4161/15384101.2014.947759
31. Litvinov IV, Cordeiro B, Huang Y, Zargham H, Pehr K, Dore MA, et al. Ectopic expression of cancer testis antigens in cutaneous T-cell lymphoma (CTCL) patients. Clin Cancer Res (2014) 20(14):3799–808. doi:10.1158/1078-0432.CCR-14-0307
32. Litvinov IV, Jones DA, Sasseville D, Kupper TS. Transcriptional profiles predict disease outcome in patients with cutaneous T-cell lymphoma. Clin Cancer Res (2010) 16:2106–14. doi:10.1158/1078-0432.CCR-09-2879
33. Litvinov IV, Kupper TS, Sasseville D. The role of AHI1 and CDKN1C in cutaneous T-cell lymphoma progression. Exp Dermatol (2012) 21:964–6. doi:10.1111/exd.12039
34. Litvinov IV, Netchiporouk E, Cordeiro B, Zargham H, Pehr K, Gilbert M, et al. Ectopic expression of embryonic stem cell and other developmental genes in cutaneous T-cell lymphoma. Oncoimmunology (2014) 3:e970025. doi:10.4161/21624011.2014.970025
35. Litvinov IV, Pehr K, Sasseville D. Connecting the dots in cutaneous T cell lymphoma (CTCL): STAT5 regulates malignant T cell proliferation via miR-155. Cell Cycle (2013) 12:2172–3. doi:10.4161/cc.25550
36. Litvinov IV, Zhou Y, Kupper TS, Sasseville D. Loss of BCL7A expression correlates with poor disease prognosis in patients with early-stage cutaneous T-cell lymphoma. Leuk Lymphoma (2013) 54(3):653–4. doi:10.3109/10428194.2012.717695
37. Kopp KL, Ralfkiaer U, Nielsen BS, Gniadecki R, Woetmann A, Odum N, et al. Expression of miR-155 and miR-126 in situ in cutaneous T-cell lymphoma. APMIS (2013) 121:1020–4. doi:10.1111/apm.12162
38. Marstrand T, Ahler CB, Ralfkiaer U, Clemmensen A, Kopp KL, Sibbesen NA, et al. Validation of a diagnostic microRNA classifier in cutaneous T-cell lymphomas. Leuk Lymphoma (2014) 55:957–8. doi:10.3109/10428194.2013.815352
39. Ralfkiaer U, Hagedorn PH, Bangsgaard N, Lovendorf MB, Ahler CB, Svensson L, et al. Diagnostic microRNA profiling in cutaneous T-cell lymphoma (CTCL). Blood (2011) 118:5891–900. doi:10.1182/blood-2011-06-358382
40. Ralfkiaer U, Lindahl LM, Litman T, Gjerdrum LM, Ahler CB, Gniadecki R, et al. MicroRNA expression in early mycosis fungoides is distinctly different from atopic dermatitis and advanced cutaneous T-cell lymphoma. Anticancer Res (2014) 34(12):7207–17.
41. Litvinov IV, Tetzlaff MT, Thibault P, Gangar P, Moreau L, Watters AK, et al. Gene expression analysis in Cutaneous T-Cell Lymphomas (CTCL) highlights disease heterogeneity and potential diagnostic and prognostic indicators. Oncoimmunology (2017) 6:e1306618. doi:10.1080/2162402X.2017.1306618
42. Clark RA. Resident memory T cells in human health and disease. Sci Transl Med (2015) 7:269rv1. doi:10.1126/scitranslmed.3010641
43. Scarisbrick JJ, Prince HM, Vermeer MH, Quaglino P, Horwitz S, Porcu P, et al. Effect of specific prognostic markers on survival and development of a prognostic model. J Clin Oncol (2015) 33:3766–73. doi:10.1200/JCO.2015.61.7142
44. Alberti-Violetti S, Talpur R, Schlichte M, Sui D, Duvic M. Advanced-stage mycosis fungoides and Sezary syndrome: survival and response to treatment. Clin Lymphoma Myeloma Leuk (2015) 15:e105–12. doi:10.1016/j.clml.2015.02.027
45. Talpur R, Singh L, Daulat S, Liu P, Seyfer S, Trynosky T, et al. Long-term outcomes of 1,263 patients with mycosis fungoides and Sezary syndrome from 1982 to 2009. Clin Cancer Res (2012) 18:5051–60. doi:10.1158/1078-0432.CCR-12-0604
46. McGirt LY, Baerenwald DA, Vonderheid EC, Eischen CM. Early changes in miRNA expression are predictive of response to extracorporeal photopheresis in cutaneous T-cell lymphoma. J Eur Acad Dermatol Venereol (2015) 29:2269–71. doi:10.1111/jdv.12571
47. Kamstrup MR, Gjerdrum LM, Biskup E, Lauenborg BT, Ralfkiaer E, Woetmann A, et al. Notch1 as a potential therapeutic target in cutaneous T-cell lymphoma. Blood (2010) 116:2504–12. doi:10.1182/blood-2009-12-260216
48. Shin J, Monti S, Aires DJ, Duvic M, Golub T, Jones DA, et al. Lesional gene expression profiling in cutaneous T-cell lymphoma reveals natural clusters associated with disease outcome. Blood (2007) 110:3015–27. doi:10.1182/blood-2006-12-061507
49. Danish HH, Liu S, Jhaveri J, Flowers CR, Lechowicz MJ, Esiashvili N, et al. Validation of cutaneous lymphoma international prognostic index (CLIPI) for mycosis fungoides and Sezary syndrome. Leuk Lymphoma (2016) 57(12):2813–9. doi:10.3109/10428194.2016.1173210
50. Leek JT, Storey JD. Capturing heterogeneity in gene expression studies by surrogate variable analysis. PLoS Genet (2007) 3:1724–35. doi:10.1371/journal.pgen.0030161
51. Johnson WE, Li C, Rabinovic A. Adjusting batch effects in microarray expression data using empirical Bayes methods. Biostatistics (2007) 8:118–27. doi:10.1093/biostatistics/kxj037
52. Price AL, Patterson NJ, Plenge RM, Weinblatt ME, Shadick NA, Reich D. Principal components analysis corrects for stratification in genome-wide association studies. Nat Genet (2006) 38:904–9. doi:10.1038/ng1847
53. Gower JC. A general coefficient of similarity and some of its properties. Biometrics (1971) 27:857–74. doi:10.2307/2528823
54. Murtagh F, Legendre P. Ward’s hierarchical agglomerative clustering method: which algorithms implement Ward’s criterion? J Classif (2014) 31:274–95. doi:10.1007/s00357-014-9161-z
55. Stacklies W, Redestig H, Scholz M, Walther D, Selbig J. pcaMethods – a bioconductor package providing PCA methods for incomplete data. Bioinformatics (2007) 23:1164–7. doi:10.1093/bioinformatics/btm069
56. von Ahlfen S, Missel A, Bendrat K, Schlumpberger M. Determinants of RNA quality from FFPE samples. PLoS One (2007) 2:e1261. doi:10.1371/journal.pone.0001261
57. Ribeiro-Silva A, Zhang H, Jeffrey SS. RNA extraction from ten year old formalin-fixed paraffin-embedded breast cancer samples: a comparison of column purification and magnetic bead-based technologies. BMC Mol Biol (2007) 8:118. doi:10.1186/1471-2199-8-118
58. van Doorn R, Zoutman WH, Dijkman R, de Menezes RX, Commandeur S, Mulder AA, et al. Epigenetic profiling of cutaneous T-cell lymphoma: promoter hypermethylation of multiple tumor suppressor genes including BCL7a, PTPRG, and p73. J Clin Oncol (2005) 23:3886–96. doi:10.1200/JCO.2005.11.353
59. Langmead B, Trapnell C, Pop M, Salzberg SL. Ultrafast and memory-efficient alignment of short DNA sequences to the human genome. Genome Biol (2009) 10:R25. doi:10.1186/gb-2009-10-3-r25
60. Huang Y, Su MW, Jiang X, Zhou Y. Evidence of an oncogenic role of aberrant TOX activation in cutaneous T-cell lymphoma. Blood (2015) 125:1435–43. doi:10.1182/blood-2014-05-571778
61. Scicchitano MS, Dalmas DA, Bertiaux MA, Anderson SM, Turner LR, Thomas RA, et al. Preliminary comparison of quantity, quality, and microarray performance of RNA extracted from formalin-fixed, paraffin-embedded, and unfixed frozen tissue samples. J Histochem Cytochem (2006) 54:1229–37. doi:10.1369/jhc.6A6999.2006
62. Perlmutter MA, Best CJ, Gillespie JW, Gathright Y, Gonzalez S, Velasco A, et al. Comparison of snap freezing versus ethanol fixation for gene expression profiling of tissue specimens. J Mol Diagn (2004) 6:371–7. doi:10.1016/S1525-1578(10)60534-X
63. Mittempergher L, de Ronde JJ, Nieuwland M, Kerkhoven RM, Simon I, Rutgers EJ, et al. Gene expression profiles from formalin fixed paraffin embedded breast cancer tissue are largely comparable to fresh frozen matched tissue. PLoS One (2011) 6:e17163. doi:10.1371/journal.pone.0017163
64. Werner M, Chott A, Fabiano A, Battifora H. Effect of formalin tissue fixation and processing on immunohistochemistry. Am J Surg Pathol (2000) 24:1016–9. doi:10.1097/00000478-200007000-00014
65. Solomon MJ, Varshavsky A. Formaldehyde-mediated DNA-protein crosslinking: a probe for in vivo chromatin structures. Proc Natl Acad Sci U S A (1985) 82:6470–4. doi:10.1073/pnas.82.19.6470
66. Au PC, Helliwell C, Wang MB. Characterizing RNA-protein interaction using cross-linking and metabolite supplemented nuclear RNA-immunoprecipitation. Mol Biol Rep (2014) 41:2971–7. doi:10.1007/s11033-014-3154-1
67. Graw S, Meier R, Minn K, Bloomer C, Godwin AK, Fridley B, et al. Robust gene expression and mutation analyses of RNA-sequencing of formalin-fixed diagnostic tumor samples. Sci Rep (2015) 5:12335. doi:10.1038/srep12335
Keywords: cutaneous T-cell lymphoma, mycosis fungoides, Sézary syndrome, prognostic markers, diagnostic markers, expression profiling, TruSeq
Citation: Lefrançois P, Tetzlaff MT, Moreau L, Watters AK, Netchiporouk E, Provost N, Gilbert M, Ni X, Sasseville D, Duvic M and Litvinov IV (2017) TruSeq-Based Gene Expression Analysis of Formalin-Fixed Paraffin-Embedded (FFPE) Cutaneous T-Cell Lymphoma Samples: Subgroup Analysis Results and Elucidation of Biases from FFPE Sample Processing on the TruSeq Platform. Front. Med. 4:153. doi: 10.3389/fmed.2017.00153
Received: 17 July 2017; Accepted: 06 September 2017;
Published: 22 September 2017
Edited by:
Ralf J. Ludwig, University of Lübeck, Germany
Reviewed by:
Hiroaki Iwata, Hokkaido University, Japan
Unni Samavedam, University of Cincinnati, United States
Yask Gupta, University of Lübeck, Germany
Copyright: © 2017 Lefrançois, Tetzlaff, Moreau, Watters, Netchiporouk, Provost, Gilbert, Ni, Sasseville, Duvic and Litvinov. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
*Correspondence: Philippe Lefrançois, philippe.lefrancois2@mail.mcgill.ca;
Ivan V. Litvinov, ivan.litvinov@mcgill.ca