- 1Division of Abdominal Organ Transplantation and Hepatobiliary Surgery, Department of Surgery, Keck School of Medicine, University of Southern California, Los Angeles, CA, United States
- 2Eli and Edythe Broad Center for Regenerative Medicine and Stem Cell Research, Keck School of Medicine, University of Southern California, Los Angeles, CA, United States
- 3Department of Molecular Microbiology and Immunology, Keck School of Medicine, University of Southern California, Los Angeles, CA, United States
- 4Norris Medical Library, University of Southern California, Los Angeles, CA, United States
- 5Lawrence J. Ellison Institute for Transformative Medicine, University of Southern California, Los Angeles, CA, United States
- 6Department of Medicine, Keck School of Medicine, University of Southern California, Los Angeles, CA, United States
- 7Department of Chemical Engineering and Materials Sciences, University of Southern California, Los Angeles, CA, United States
The liver is unique in both its ability to maintain immune homeostasis and in its potential for immune tolerance following solid organ transplantation. Single-cell RNA sequencing (scRNA seq) is a powerful approach to generate highly dimensional transcriptome data to understand cellular phenotypes. However, when scRNA data is produced by different groups, with different data models, different standards, and samples processed in different ways, it can be challenging to draw meaningful conclusions from the aggregated data. The goal of this study was to establish a method to combine ‘human liver’ scRNA seq datasets by 1) characterizing the heterogeneity between studies and 2) using the meta-atlas to define the dominant phenotypes across immune cell subpopulations in healthy human liver. Publicly available scRNA seq data generated from liver samples obtained from a combined total of 17 patients and ~32,000 cells were analyzed. Liver-specific immune cells (CD45+) were extracted from each dataset, and immune cell subpopulations (myeloid cells, NK and T cells, plasma cells, and B cells) were examined using dimensionality reduction (UMAP), differential gene expression, and ingenuity pathway analysis. All datasets co-clustered, but cell proportions differed between studies. Gene expression correlation demonstrated similarity across all studies, and canonical pathways that differed between datasets were related to cell stress and oxidative phosphorylation rather than immune-related function. Next, a meta-atlas was generated via data integration and compared against PBMC data to define gene signatures for each hepatic immune subpopulation. This analysis defined key features of hepatic immune homeostasis, with decreased expression across immunologic pathways and enhancement of pathways involved with cell death. This method for meta-analysis of scRNA seq data provides a novel approach to broadly define the features of human liver immune homeostasis. Specific pathways and cellular phenotypes described in this human liver immune meta-atlas provide a critical reference point for further study of immune mediated disease processes within the liver.
Introduction
The liver is an immunologically complex organ with an abundance of resident leukocytes which comprise a significant proportion of residing nonparenchymal cells (1, 2). A unique but poorly understood feature of the liver is its physiologic immunotolerance, which is thought to be the result of the organ receiving much of its dual blood supply from the portal vein. The portal vein shuttles blood to the liver via the enterohepatic circulation, which carries both nutritionally-derived and bacterial antigens without causing an untoward inflammatory response (3). Further characterization of the unique hepatic immune microenvironment has the potential to aid study of a spectrum of immune-mediated disease processes of the liver including autoimmune hepatitis, cholangiopathy, transplant rejection, and the tumor microenvironment of intrahepatic malignancy (4).
Single cell RNA sequencing (scRNA seq) has led to rapid advances in our understanding of cellular phenotypes of both in situ human tissues in recent years (5, 6). Application of scRNA seq to human liver tissue to examine both healthy and pathogenic states at the molecular level has proven useful to identify heterogeneity among various cell types, including immune cell populations and epithelial progenitors (7–9). Generation of these libraries through high-throughput sequencing techniques such as 10x Genomics or Drop-seq generates highly dimensional data, which is suitable to both answer and also to rapidly generate hypotheses. Given the relatively high cost of scRNA seq, a typical human study only analyzes a small number of unique samples (as few as three unique patients), despite the known genetic heterogeneity across individuals (10, 11). In 2018, the NIH updated the policy for data access of genomic datasets, with the goal of enhancing researchers’ ability to perform pooled analyses or meta-analyses on genomics data particularly in human specimens (12). More recently, there are several NIH requests for applications specifically targeted at secondary and integrated analytic approaches to harness the power of these existing, funded datasets (13).
Combining liver-specific immune cell subsets of scRNA seq datasets yields a higher number of cell libraries across a larger sample of patients, which is crucial when considering that knowledge generated from these human “atlases” may be used to understand biochemical mechanisms of disease and to develop therapeutics for clinical use. One major drawback of combining unique studies is that differences in technique have been demonstrated to lead to unexpected alterations in sequencing results (14, 15). A study by Bonnycastle et al. used scRNA seq of human pancreatic islet cells and compared different tissue processing techniques, including fresh, fixed, and cryopreserved tissues. Despite processing tissue samples from the same source, this analysis demonstrated differences among cell type proportions recovered as well as changes in gene expression signatures across different processing techniques (16). Thus, combining multiple human liver scRNA datasets has the potential to account for more biological variability across different patients, potentially attenuating the effects of pre-analytical variables that are not biologically meaningful.
The aim of this study was to determine if a meta-analysis of existing normal human liver scRNA seq datasets could be performed to generate a comprehensive human liver immune meta-atlas for future use as a reference point for examining immune-mediated liver diseases. In Part 1, an extensive comparison was performed between studies in order to establish whether generalizability across datasets would be meaningful. In Part 2, the pooled meta-atlas was explored to define expression profiles across four major immune subpopulations and to provide a re-usable meta-atlas of healthy liver immune homeostasis.
Methods
Institutional Review Board approval was not required for this study, as deidentified, publicly available data obtained from human subjects was analyzed. Data is available in the public domain as outlined below.
Systematic Review and Data Acquisition
A comprehensive review of the literature for scRNA seq studies involving normal human liver yielded three recent publications: 1. Aizarani et al., and based on the methods used in the study, it was referred to as “Liver Atlas, Cholangiocyte and Endothelial enriched” (LACEe), 2. Zhao et al., referred to as “Liver CD45+ enriched” (LCD45e), and 3. MacParland et al., referred to as “Non-biased Liver” (Lnb) (7–9). Each raw dataset was found in the Gene Expression Omnibus, LACEe: GSE124395, LCD45e: GSE125188, Lnb: GSE115469 (17). Datasets were imported into R as UMI count matrices using the RStudio interface. The workflow for isolation of our cell populations of interest from each dataset is summarized in Figure 1. Briefly, the CD45+ cell compartment was isolated in the LACEe and Lnb studies in order to extract the leukocyte population and exclude hepatocytes and other non-parenchymal cells. For LCD45e, the CD45 population was isolated prior to sequencing, and included cells derived from liver, spleen, and peripheral blood that had been barcoded for source identification. In this case, only the liver cells were extracted for analysis. Authors from each dataset were contacted for clarification on aspects of the datasets as needed in order to ensure the accuracy of our analysis.
Figure 1 Combining liver-specific scRNA seq data to create a human liver immune meta-atlas. Schematic diagram summarizing the number of subjects and major pre-analytical variables pertaining to scRNA sequencing of human liver. Post-experimental analysis performed in this study is highlighted in the shaded region. CD45+ cells were selected from the LACEe and Lnb studies. CD45+ cells (which were already pre-selected for in the LCD45e dataset) which were of liver origin were post-experimentally extracted as splenic and peripheral blood cells were also included in the original dataset. The scRNA seq analysis pipeline (Seurat v3 in R) was applied, then clustering and differential expression analysis were used to assess the ability to combine scRNA seq techniques and generate datasets for tertiary pathway analysis of immune function in the normal human liver. Created with BioRender.com.
Data Analysis and Visualization
The Seurat version 3.0 R toolkit (RRID: SCR_007322) was used for all data analysis due to its ability to integrate across datasets; the data analysis pipeline followed the tutorial outlined by the package developers (18, 19). Single cell expression count matrices barcodes and gene IDs from each study were downloaded from GSE. After import of the datasets into Seurat, we filtered barcodes for CD45+ when handling the LACEe and Lnb datasets. The LCD45e dataset included cell counts which were exclusively from CD45+ cells, therefore this filtering was not necessary. Spleen and peripheral blood mononuclear cells (PBMCs) were handled by excluding the associated barcodes and only including barcodes for the liver immune cells. To normalize the data, UMI counts were scaled by library size and a natural log transformation; gene counts for each cell are divided by the total UMI count of that cell, scaled by a factor of 10,000, and then transformed via a natural log plus 1 function (“NormalizeData”). For downstream analysis, normalized data is additionally scaled so that the mean expression across cells is 0 and the variance is 1 (“ScaleData”). In order to reduce the dimensionality of the data for clustering functions, Principal Component Analysis (PCA) was utilized, and we determined the first 30 principal components explained sufficient observed variance (“RunPCA”). Next, to identify clusters within the reduced dimensional space, cells were embedded in a k-nearest neighborhood-based graph structure (“FindNeighbors”) and were then partitioned into clusters (“FindClusters”). Finally, to aid in visualization, Uniform Manifold Approximation and Projection (UMAP) was run over the reduced dimensional space (‘RunUMAP”) and identified clusters were projected on to the UMAP plot. Accounting for noise and batch differences between the three studies was done by using reference cells from each. These anchors were identified (“FindIntegrationAnchors”), which represent pairs of cells from each dataset identified and scored based on their close proximity using k-nearest neighborhood approach. These anchors are then used to measure the expression difference between studies (“IntegrateData”), which is then removed from the corresponding normalized data. Integration is run between the normalization and scaling steps (19).
Clustering was performed in Seurat v3. To establish reliability of this method, the clustering algorithm was compared between Seurat and an alternative approach using the Harmony R package (https://github.com/immunogenomics/harmony) (20). This package scales the data to make nearby cells more similar when clustering is performed. Harmony is also easily incorporated into the Seurat pipeline. Results using Harmony-based clustering were essentially identical to Seurat. Across cell types, 96.5% to 99.9% of cells in Seurat clusters were present in the same Harmony clusters (Supplemental Figure 1). Additionally, agreement was high across datasets ranging from 96.4% to 99.4%. This data confirms that our clusters are accurate and downstream analyses should not be affected by the choice in algorithm.
Utilizing the UMAP plot of the integrated data, we were able to manually identify four clusters corresponding to four major immune subpopulations by plotting expression of known biomarkers and grouping previously identified clusters: myeloid cells, NK&T cells, B cells, and plasma cells (Supplemental Figure 2). CD3D was used to identify T lymphocytes, and NK cells were identified by expression of KLRF1 and FCGR3A. The myeloid cell lineage was identified using FCGR3A and the specific marker, CD14. The B cell cluster was classified using the CD19 marker, and plasma cells were identified by SDC1 (CD138) expression (21–25). After clustering, immune cell proportions were quantified and characterized between datasets using a Chi-squared test (α=0.05). Gene correlation analysis was performed between datasets as pooled cell types and additionally with stratification by major cell type using both standard linear regression and rank-rank hypergeometric overlap (RRHO).
Differential Expression Analysis
DE genes were identified across immune cell subpopulations (i.e., between a particular cell subpopulation and all other cell subpopulations) within a dataset. Log normalized gene expression were used, averaging over all cells in a given subpopulation for a specific gene and taking the difference to the same of all other cell subpopulations (“FindMarkers”), using the Wilcoxon rank-sum test for significance. Genes that were DE were examined both within an immune cell subpopulation and between datasets. Similarly, DE genes were identified across datasets (i.e., pairwise between datasets) within a particular cell subpopulation using the same methods. To account for some technical variation between dataset, we calculated the difference of differences of a particular subpopulation between datasets and all other cell subpopulations between datasets. Log normalized gene expression was used as described above taking the difference of differences between dataset (pairwise) and cell subpopulation (particular versus all other), using a t-test for significance.
The candidates for immune profile of human liver across our major cell types of interest were identified by applying differential expression analysis across immune cell subpopulations in liver in the combined meta-atlas compared with immune cell subpopulations from a published dataset generated from normal human PBMC (67,221 cells; NCBI Gene Expression Omnibus: GSE171555).
Pathway Analysis
Gene IDs, expression fold changes and p-values generated from differential expression analyses were imported into Ingenuity Pathway Analysis software (IPA, Qiagen, RRID: SCR_008653). Core expression analysis was performed which generated canonical pathways which were upregulated or downregulated based on these data. Gene function heatmaps were also generated in order to provide a comprehensive expression profile.
Volcano Plots
Pairwise comparisons of genes between data sets or cell types were conducted using Seurat “FindMarkers” function with default Wilcoxon Rank Sum test. Volcano plots were constructed using these results; genes were identified and colored based on P-adjBonfcorr<0.05 and a ratio of expression (exponentiated log fold change) ≥1.25 or ≤0.8. Additionally, differences of differences were calculated in a similar way comparing the differential expression between datasets and the differential expression between immune cell subpopulations. T-tests were performed to determine significance and genes were colored based on the previous criteria plus P-adjBonfcorr<0.05 of the difference of differences effect.
Correlation Analysis
Expression levels were averaged over cells and plotted pairwise between datasets. Scatter plots were fit with linear and quadratic regression models to show potential relationships. Additionally, genes were ranked based on their differential expression between cell types and these rankings were compared between datasets. Correlation between the rankings was assessed multiple ways. First via scatter plots and Spearman’s correlation coefficient. Second using Rank-Rank Hypergeometric Overlap from the RRHO package in R and constructing heatmaps showing regions of significant overlap between the ranked lists (26).
Heatmaps
The top DE genes between cell types were identified and combined between each dataset. Raw expression values for these genes were then normalized by subtracting the mean expression and dividing by the standard deviation of a given gene over all cells. Heatmaps were constructed based on these values; identified genes are shown as rows while cells are shown as columns. Cells are grouped based on cell type while genes are grouped based on the cell type differential expression from which they were identified. Outputs from heatmap analyses were used in the creation of an online interactive tool of the combined human liver meta-atlas (27).
Chord Plots
Results from the pathway analysis were used to select cell functions that were relevant to the immune subpopulations. Cell functions and genes were linked to represent the presence of a connection between the two, yes/no. From there, plots were constructed (Circlize “chordDiagram”) by connecting selected cell functions to the genes that were identified performing them, grouping by cell function.
Results
Part 1: Approach to Meta-Analysis of Liver Immune scRNASeq Data
To begin, immune-specific subsets of human liver tissue were identified and extracted from three unique studies: “LACEe”, “Lnb” and “LCD45e” (Figure 1) (7–9). Techniques used for the RNA sequencing of individual datasets are summarized in Table 1. The combined dataset included single-cell RNA sequencing of 17 normal human liver samples with approximately 32,000 hepatic CD45+ cells.
Table 1 Comparison of methods from three peer-reviewed single cell RNA sequencing studies of the liver.
Clustering analysis was performed on all three datasets individually to detect differences in global cellular phenotypes (Figures 2A–C). Projection of all three datasets to the same UMAP coordinates revealed a homogeneous interdigitation of each cluster, however the LCD45e subset comprised the majority of cells (~24,000 cells compared to ~4,000 cells in each of the other two studies; Figure 2D). Pairwise gene expression correlation analysis was performed which revealed that the LACEe dataset had less concordance with both the Lnb (R=0.81; Figure 2E), and LCD45e datasets (R=0.79; Figure 2F). Lnb and LCD45e, both obtained using the 10x platform, demonstrated a more idealized relationship (R=0.95; Figure 2G).
Figure 2 Data visualization with clustering and gene expression correlation between datasets. (A–C) UMAP plots of individual studies. (D) Plot showing co-clustering of three single cell datasets. (E–G) Gene expression correlation plots with solid lines showing linear and quadratic regressions. Dashed line representing idealized relationship. (H) Bar graph representation of immune cell proportions of each individual dataset. (I) X2 comparison showing significant differences in proportions of immune cell subpopulations. With pairwise analysis, most chi-squared values reached statistical significance except for plasma cells in LCD45e versus Lnb and for B Cells in Lnb versus LACEe (indicated by blue rectangles).
Characterization of the Frequency and Proportion of Immune Cell Subpopulations Between Liver scRNA Seq Datasets
Next, leukocyte subpopulations were identified based on expression of major cell-specific lineage marker genes and compared between the individual datasets (Supplemental Figure 2). Clusters were then designated as NK and T cells (combined), myeloid cells, B cells and plasma cells, representing the four major groups of interest for our study.
When compared between all three datasets there were differences in proportions of each immune cell subpopulation (X2 test, p<0.01 for each cell type; Figure 2I). Pairwise analysis between datasets also revealed differences in cell-type composition, and only two conditions did not reach statistical significance: comparison of LCD45e and Lnb in plasma cells (X2 test, p=0.16) and comparison of Lnb and LACEe in B cells (X2 test, p=0.08), which may be a consequence of B cells and plasma cells representing the smallest populations of cells between datasets. This observed scarcity of liver resident B cells and plasma cells is consistent previous reports of liver immune cell composition (2). Despite some differences in cell-type recovery, the ranking of abundance of each cell type was preserved between datasets, such that NK and T cells were the most numerous (ranging from 51%-69% of cells), followed by myeloid cells (18%-32%), plasma cells (4%-14%), and then B cells (2%-7%; Figure 2H).
Comparative Analysis of Leukocyte Subpopulations Between Datasets Reveals Concordance, Despite Differences in Liver Tissue Handling and Sequencing Techniques
A deeper analysis of gene expression profiles within immune cell subpopulation was conducted to identify differences in cellular phenotypes. Analysis of housekeeping genes was conducted, as an internal control of biological noise, thus serving as a marker of technical noise between datasets (28). Expression analysis was performed on an established set of human housekeeping genes across immune cell subpopulations and between datasets using log normalized expression values (Supplemental Figure 3). Despite a relatively good agreement between means, comparison with one-way ANOVA identified differences in housekeeping gene expression (p<0.01). Pairwise analysis also showed significant differences in expression levels, which is likely the result of the large number of cells included and is thus a highly powered test.
Differences in gene expression profiles across immune cell subpopulations are anticipated due to their distinct biological functions. The number of differentially expressed (DE) genes was quantified between individual immune cell subpopulations (Supplemental Figure 4). Volcano plots were used to illustrate meaningfully DE genes, which were plotted as blue dots and indicate a fold-change ratio in expression of either <0.8 or >1.25 and Bonferroni corrected p-value<0.05. Non-DE genes were plotted as red dots. B cells had the fewest DE genes between all datasets, whereas cells of the myeloid lineage had the most DE genes. No individual dataset was an outlier with respect to the quantification of differential expression, but the LCD45e dataset had the highest number of DE genes in B cells and myeloid cells, and the lowest number in plasma cells and NK and T cells.
Due to heterogeneous gene expression between leukocyte subpopulations (Supplemental Figure 4) and differences in cell proportions (Figures 2H, I), both of which could potentially bias dataset correlation analysis, pairwise expression correlation between datasets was performed and stratified by leukocyte subpopulation (Supplemental Figures 5A–D and Supplemental Figure 6). Comparison of Lnb and LCD45e showed a more idealized relationship (R=0.93-0.96 depending on leukocyte subpopulation, Supplemental Figures 5A.vi.-D.vi), while LACEe correlated less with the other two studies (Supplemental Figures 5A.iv,v-D.iv,v). The pattern of gene expression correlation was also analyzed using Rank-Rank Hypergeometric Overlap (RRHO) heatmaps (Supplemental Figures 5A–D, panels i-iii) (29). This technique uses heatmaps to indicate the degree of ranked differential expression agreement, where yellow indicates strong statistical overlap and blue represents no statistical overlap between corresponding axes. These panels represent overlap of differential expression in ranked form such that the bottom left of the panel represents the most upregulated genes, and genes in the upper right portion of the panel represent the downregulated genes (thus a higher rank corresponds to downregulation). When comparing NK and T cell expression patterns in LACEe versus Lnb, there are two heatmap regions with a high degree of statistical overlap (Supplemental Figure 5A). Examination of the remaining two pairwise comparisons in NK and T cells from different datasets, the overlap among the more highly expressed genes is evident, but genes at lower expression levels have less overlap. Plasma cells and B cells also exhibit RRHO agreement between studies. For myeloid cells, LACEe and Lnb have better RRHO for the most highly expressed genes, whereas Lnb and LCD45e have better agreement in genes expressed at lower levels. There is a bimodal agreement between LACEe and LCD45e (Supplemental Figure 5B).
Differential Gene Expression in Leukocyte Subpopulations Highlights the Degree of Agreement Between Datasets
The RRHO analysis demonstrated that there was relatively strong correlation with genes at the highest levels of expression. To begin to identify the dominant phenotype for each leukocyte subpopulation, the top 100 most abundant transcripts were identified for each cell type in Lnb which was the dataset with the most agreement with the other two datasets based on correlation (Figures 2E–G). A comparison was performed to determine if the same genes were present as the top 100 most abundant transcripts for each immune cell subpopulation of the other two studies. Depending on leukocyte subpopulation, LCD45e had between 87 and 94 out of 100 ‘top genes’ in agreement. Alternately, LACEe had rather poor agreement with top matching genes for each leukocyte subpopulation (Supplemental Figure 7).
Analysis was then performed between datasets in order to quantify how many genes were DE in each immune cell subpopulation (Figure 3B). In this analysis, DE genes with a fold change ratio <0.8 or >1.25 and a Bonferroni corrected p-value <0.05 are represented in volcano plots and compare individual gene expression level within an immune cell subpopulation, between datasets (16). Some comparisons yielded a high number of DE genes between datasets (e.g., 3526 genes when comparing LCD45e and LACEe in myeloid cells). In contrast, when comparing the Lnb dataset to the LCD45e dataset among the B cell population, there were only 767 DE genes and the myeloid cell population between these two datasets had 924 DE genes, again demonstrating fewer expression differences between Lnb and LCD45e relative to LACEe.
Figure 3 Differential gene expression analysis between datasets. (A) Schematic diagram for identification of a differentially expressed (DE) gene. The blue cells represent different expression levels for a gene in each leukocyte subpopulation (e.g., expression in B cells) and the purple cells represent the expression levels of that same gene, but in all other cell subpopulations. Differential expression only takes levels from the B cell into account. We also examine the Difference of Differences (DoD) which takes the difference in expression between B cells and all other cell types for dataset #1 and subtracts it from the differences in expressed between B cells and all other cell types for dataset #2 which may help categorize the differential expression that is accounted for by technical differences between studies (such as sequencing depth) rather than biologically relevant changes. (B) Volcano plots showing the number of DE genes (blue dots) and DoD genes (green dots) for each dataset condition as pairwise comparisons between studies and stratified by cell type. (C) Lists of differential gene expression numbers, DoD genes and then the percent of DoD genes out of the number of DE genes.
To better characterize the heterogeneity in gene expression, the difference of differences (DoD) was calculated between datasets (pairwise) and stratified by leukocyte subpopulation (particular versus all other). This metric involved calculating a gene’s expression difference between the cell subpopulation of interest and all other cell types and then comparing that difference to the difference seen in another dataset (Figure 3A). For example, a B cell in the LACEe dataset has a difference of expression of the gene CD25 compared to all other cell types (represented as Δ1), the difference in CD25 expression between B cells and other cell types in Lnb is Δ2. Subtracting Δ2 from Δ1 yields the DoD, (Figure 3) which we used to approximate the proportion of DE genes where expression alteration could be explained by changes relative to other immune cell subpopulations (DoD significance defined as: DE with a fold change ratio <0.8 or >1.25 with P-adjBonfcorr<0.05 and DoD P-adjBonfcorr<0.05). We also anticipate that due to comparison between datasets and across differing technical platforms (10x Genomics versus mCel-seq) that there will be differences in the depth of sequencing, a phenomenon which has been known to affect differential expression analysis when comparing scRNA seq datasets (30). Given this, we propose that our DoD analysis “explains”, or at least accounts for, some of the differential expression. Figure 3C lists the number of DE and DoD genes for each analysis as well as a percent total. The last column in the table shows the proportion of DE genes that also had a significant DoD (the ratio of the two previous columns). Notably in comparing myeloid cells between LACEe and Lnb, there are a high number of DE genes, but only 34% of these are also DoD genes. This suggests that nearly 70% of the observed differences of myeloid cells between these two datasets can be accounted for by comparing to the average expression level in the remaining cells of each rather than representing a unique feature of the myeloid population.
Dominant Gene Expression Profiles for Immune Cell Subpopulations Are Concordant Between Datasets
Once DE genes were identified, the most significantly DE genes were chosen among each dataset to determine if an integrated dataset could identify dominant gene expression signatures associated with each immune cell subpopulation within human liver. The topmost DE genes for each leukocyte subpopulation in each individual dataset were identified. All three datasets’ candidate genes were combined to create a master list of top DE genes for each immune cell subpopulation (Figure 4D). Expression levels were plotted as a heatmap for each individual cell and organized by immune cell subpopulation (Figures 4A–C). Some heterogeneity was expected within cell type, as the myeloid and NK and T cell populations represent a variety of unique subsets. Despite this, the heatmaps demonstrate that candidate genes listed are indeed DE in the immune subpopulation of interest when compared with all other subpopulations and represent reproducible markers between each dataset, regardless of processing technique. The expression patterns outlined in the heatmaps have excellent agreement across studies, confirming that these datasets could be integrated to produce a meta-atlas of human liver immune function.
Figure 4 Heatmap representation of dominant differentially expressed genes between each immune cell subpopulation in each dataset. (A) Heatmap of the most DE genes across leukocyte subpopulation plotted using cells from the LACEe dataset. (B) Heatmap of the most DE genes across immune cell subpopulation plotted using cells from the Lnb dataset. (C) Heatmap of the most DE genes across leukocyte subpopulation plotted using cells from the LCD45e dataset. For all heatmaps, z-scores are shown representing standardized, log normalized expression across cells for a given gene. (D) Table of the list of genes used for heatmap analysis for each leukocyte subpopulation. Immune cell subpopulations are grouped from left to right: NK&T, myeloid cells, plasma cells then B cells and the gene lists going from top to bottom are ranked in the same order. The same gene lists were used to analyze cells from each of the three datasets.
Differential Gene Expression Profiles Between Datasets Reveals Pathways Affected by Tissue Processing and Establishes That Immunologic Pathways Are Preserved
After identification of all the genes that had a statistically significant differential expression (log fold change <0.8 or >1.25 and adjusted p<0.05 as shown in volcano plots from Figure 3B) among the three datasets, a canonical pathway analysis was performed using the Ingenuity Pathway Analysis software (IPA, Qiagen, RRID: SCR_008653). The association study was performed for each leukocyte subpopulation: NK & T cells, B cell, plasma cell and myeloid cells. Input of the DE gene name, average log fold change and p values provided an output of canonical pathways which were ranked on their likelihood to be altered between pairwise datasets. The probability that a signaling pathway was affected by differences between datasets was represented as a -log(P value) and a cutoff of 7.0 was set in order to establish those canonical pathways which were most different. Disease-specific pathways of no relevance to immune-specific subpopulations were excluded (e.g., Atherosclerosis Signaling, Bladder Cancer Signaling) as they were extraneous to this analysis.
Across immune cell subpopulations, there was general agreement between the canonical pathways affected based on the pairwise dataset comparison and therefore NK & T cell subsets and B cell subsets are shown as examples (Figures 5A–F respectively) as myeloid and plasma cells demonstrated the same pathways which were affected, albeit to varying degrees.
Figure 5 Characterization of differentially expressed genes points to signaling pathways that differ between datasets. (A) Pathway analysis of NK and T cells based on differential expression between the LACEe and Lnb datasets. (B) Identification of canonical pathways altered between LCD45e and LACEe. (C) Pathway analysis of NK and T cells based on differential expression between Lnb and LCD45e in NK and T cells. (D–F) B cell pathway analysis as in A-C. The same comparisons were made for myeloid cells and plasma cells and were largely similar to what is shown for the immune subpopulations represented in this figure. Bars represent -log(P values) of each canonical pathway’s likelihood of being altered between comparisons and was obtained with Ingenuity Pathway Analysis Software (Qiagen).
The analysis of DE genes in NK and T cells for LACEe vs Lnb pair identified three pathways: EIF2 signaling (-log(P value)=14.7, 20 DE genes involved), oxidative phosphorylation (-log(P value)=8.8, 11 DE genes involved), and regulation of eIF4 and 70S6K signaling (-log(P value)=7, 11 DE genes involved; Figure 5A). B cell DE genes between LACEe vs Lnb, showed a similar result with the addition of mTOR signaling. Of note, many of the genes that led to the presumptive alterations of these pathways were ribosomal proteins with the exception of the oxidative phosphorylation canonical pathway, where genes affected were NADH dehydrogenase subunits, cytochrome c oxidase subunits and ATP synthase subunit F0 subunit 6.
The LCD45e versus LACEe pairwise comparison of canonical pathways was reminiscent of the altered pathways seen in the LACEe versus Lnb differential comparison. The expression analysis between Lnb and LCD45e, which was the pairwise comparison with the best transcriptome concordance and the fewest DE genes, actually had the highest number of altered canonical pathways (Figures 5C, F). Even though several canonical pathways were identified as altered between these two datasets (Lnb vs LCD45e) many of the DE genes that made up the pathway differences were repeated. For instance, ALB or the protein albumin was counted as being a DE molecule in six out of the eight canonical signaling pathways. Likewise, genes such as APOA1, APOA2, ORM1, ORM2 and SERPINA1 contributed to the majority of canonical pathway alterations. In fact, most of the molecules identified were redundant downstream effectors of the pathways and thus do not necessarily reflect true perturbations of the signaling pathway between datasets.
In characterizing the canonical pathways that differed between datasets, the major differences were noted among pathways which involved protein synthesis, cell death and apoptosis signaling. Signaling pathways related to immunologic functions, which are the functions that are of relevance to this study were largely preserved. For example, the T cell receptor signaling pathway, CD28 signaling in T helper cells and IL-6 signaling pathways all had 2 or fewer genes which contributed to differences noted which supports the idea that these pathways were largely comparable between studies and that proceeding with a combined meta-dataset is feasible.
Part 2: Key Features of the Integrated Liver Immune Meta-Atlas
Given that gene expression correlation was high between datasets and differential gene expression was relatively low, with conservation of phenotypically relevant genes and signaling pathways, we established meta-signatures for each immune cell subpopulation. Genes which were DE from one leukocyte subpopulation versus corresponding cells from a PBMC dataset, were identified. This is in contrast with the prior analysis, which used genes DE between datasets. These candidate genes comprising a list of 449 to 573 total genes depending on leukocyte subpopulation were generated and run through the Ingenuity Pathway Analysis Software (Qiagen, RRID: SCR_008653). The output of this software identified a series of genes with linked cellular functions to determine which canonical pathways were affected.
Pathway analysis output generated characterizations of various cellular biochemical functions. Functions were sub-divided into types including Systemic Autoimmune Syndrome, Signal Transduction, Proliferation, Neoplasia, Inflammation, Infection, Differentiation, Chronic Inflammatory Disorder, Cell Viability, Cell Death, Cell Proliferation, Cell Movement, Binding, and Activation. Z-score values of whether expression was upregulated or downregulated were used to establish a functionality signature for each cell type. Specifically, these pathway analysis data were then transferred for further tertiary analysis in R using the Circlize package to produce meta-signatures as chord plots which relate specific cell functions to individual genes having differential expression. In order to reduce complexity of the figures, redundant cell functions and functions related to disease states noted to be irrelevant to immune subpopulations were excluded. The resulting chord diagrams link genes associated with specific cellular functions, allowing for visualization of the immune phenotype for each leukocyte subpopulation (Figure 6).
Figure 6 Gene meta-signatures reveal the expression landscape of immune homeostasis in the human liver immune meta-atlas. (A–D) Chord plots representing 8-14 immune cell-specific diseases or functions (Systemic Autoimmune Syndrome, Signal Transduction, Proliferation, Neoplasia, Inflammation, Infection, Differentiation, Chronic Inflammatory Disorder, Cell Viability, Cell Death, Cell Proliferation, Cell Movement, Binding, Activation) with links to the respective genes which have been identified as DE between the liver leukocyte subpopulation listed versus the corresponding PBMC leukocyte subpopulation.
NK and T cells represented the largest proportion of cells in our study but had the smallest number of DE genes identified by our analysis, with plasma cells having the most genes. For this immune cell subpopulation, 16 genes were identified as being DE, representing 8 major immunologically relevant diseases and functions (Figure 6A). Of the DE genes in NK and T cells, C-C motif chemokine ligand 4 (CCL4L1/CCL4L2), C-C motif chemokine ligand 3 like 3 (CCL3L3) X-C motif chemokine ligand 1 (XCL1), ALOX5AP (involved in leukotriene synthesis) were chemokines and immune modulators noted to be upregulated. Among the most downregulated genes in the NK and T cell subpopulation were LIME1 and TRBV20-1 which are both involved in T cell receptor (TCR) activation signaling. The myeloid lineage was marked by a gene expression signature involving 22 genes across the relevant diseases and functions (Figure 6B). Upregulation of the following DE genes was shown: NOP53 ribosome biogenesis factor (involved in apoptosis and cell cycle signaling), Y box binding protein 1 (YBX1, a transcriptional regulator), and TLE5 (transcriptional regulator) among others. The most notably downregulated DE genes were the pro-inflammatory cytokine of the TNF family, TNFSF13, FOS proto-oncogene, CSF3R (GM-CSF receptor controlling proliferation and differentiation of granulocytes; all p values < 0.001).
Plasma cells with 45 DE genes shown in the chord diagram demonstrated upregulation of the histone proteins implicated in transcriptional repression (H2BC8 and HDAC3) as well as TLE5, which is implicated in transcriptional regulation and repression (all p values < 0.001). The most downregulated genes were immunoglobulin subunit genes from both heavy and light chains: IGHV4-39, IGHV3-11, IGHV3-49, IGHV1-69, IGKV1-16, IGLV2-8, IGLV6-57 (p values < 0.001; Figure 6C). Lastly, the B cell meta-signature was comprised of 54 DE genes (Figure 6D). Upregulation was seen in the chemokine CCL4, SEM1 (part of 26S proteasome complex), NSD3 (transcriptional repression), and BIRC3 (inhibition of apoptosis, all p values < 0.001). Genes which were downregulated included immunoglobulin genes: IGKV3-20, IGKV3-15, IGKV1-40, IGHV3-23, IGHV3-30, IGLV1-51, as well as HLA-DRB5 (involved with antigen presentation, all p values < 0.001). A detailed report of genes implicated in leukocyte subpopulation homeostasis including log fold change values is summarized in Supplemental Table 1.
Identification of the most significantly DE genes pointed to a phenotype of decreased immune cell functioning in human liver immune cell subpopulations versus corresponding peripheral immune cell subpopulations. To explore this further, signaling pathways and functions were characterized using Ingenuity Pathway Analysis (IPA). Histograms were generated for each immune cell subpopulation using the differential expression between PBMC and the human liver meta-atlas in order to characterize which canonical pathways were highly altered between compartments. The significance of the perturbation is charted as -log(P-value) and a cutoff of 2.4 was used in order to identify the most significantly altered pathways. The direction of the perturbation was estimated by a Z-score, which was generated as an output of IPA, and upregulated pathways are color coded as orange, downregulated pathways are blue, and grey pathways correspond to those where the directionality could not be predicted. Heatmaps were also generated which corresponded to specific immune cell functions: Infectious disease, Cell Trafficking, Inflammatory Response, Cell-Cell Interaction, Hematologic development and function, and Cell Death and Survival (Figure 7).
Figure 7 Detailed pathway analysis of human liver immune meta-atlas shows selective upregulation of death-related transcripts and downregulation of immune-related cell functions. (A) Pathway analysis of NK and T cells based on differential expression between human liver meta-atlas versus normal human PBMC. Histograms showing -log(P-value) corresponding to the significance of the involved signaling pathway. Orange corresponds to upregulated pathways, blue shows downregulated pathways and gray corresponds to unknown directionality based on Ingenuity Knowledge Base. Heatmaps showing perturbations in specific functions of NK and T cells in the human liver relative to PBMC. (B) Myeloid Cell pathway analysis showing canonical pathway functioning in human liver relative to human PBMCs. Heatmaps of specific functions within Myeloid cells relative to PBMCs. (C) Canonical pathway functioning Plasma Cells of the human liver relative to PBMC with associated heatmap of specific functions. (D) B Cell pathway analysis with histograms showing canonical pathways and heatmaps highlighting specific cellular functions in liver relative to PBMC.
NK and T cells had upregulation of Myc-Mediated Apoptosis Signaling, Death Receptor Signaling, Tumoricidal Function of Hepatic Natural Killer Cells, Inhibition of ARE-Mediated mRNA Degradation Pathway and TNFR1 signaling in the liver compartment relative to PBMCs. Downregulation was demonstrated in the Spliceosomal Cycle. Heatmaps further highlighted immune functions such as infectious disease-related activity, cell trafficking, inflammatory response and cell-cell interactions were largely downregulated, whereas cell death and apoptosis functions were upregulated (Figure 7A). In the myeloid cell subpopulation, upregulated canonical pathways were Inhibition of ARE-Mediated mRNA Degradation Pathway, Aryl Hydrocarbon Receptor Signaling, Protein Kinase A Signaling, Cyclins and Cell Cycle Regulation, and Wnt/beta-catenin Signaling. Downregulated pathways were Necroptosis Signaling and ERK/MAPK Signaling. Phenotypic patterns of diminished immune cell functioning, and enhanced cell death were seen in this population as well (Figure 7B).
Plasma cells had upregulation of the PD-1/PD-L1 Cancer Immunotherapy Pathway and downregulation of a substantial number of canonical pathways. Heatmaps showed that phenotypically these cells were largely in a state of depressed functioning relative to the peripheral blood immune compartment (Figure 7C). Human liver B cells, comprising the smallest immune cell subpopulation, showed upregulation of ERK/MAPK Signaling, PI3K/AKT Signaling, mTOR Signaling, TWEAK Signaling and MSP-RON Signaling in Macrophages Pathway. Downregulation was demonstrated in the Spliceosomal Cycle, Sumoylation Pathway, Senescence Pathway, MSP-RON Signaling in Cancer Cells Pathway, EIF2 Signaling, P13K Signaling in B Lymphocytes, Th1 Pathway and Protein Kinase A Signaling. Heatmap analysis showed a mixture of upregulated and downregulated functions. Infectious disease and hematologic development related functions were again depressed, but cell trafficking and inflammatory responses had a modest upregulation in liver relative to PBMC. Cell death functions were again enhanced. Overall, these results indicated that liver immune homeostasis points to a general phenotype of depressed immune and pro-inflammatory functions via the aforementioned pathways and a possible increase in cell death and apoptotic events.
Discussion
In this study, meta-analysis of three high-quality scRNA seq studies enabled deep characterization of liver immune cell function and features related to physiologic immunotolerance and immune homeostasis. Focused examination of immune cell types, proportions, and gene expression profiles of revealed that significant differences exist between datasets created from ‘normal human liver’ samples. Despite these differences, the overall ranking of abundance of immune subsets was preserved, with NK and T cells dominating and plasma and B cells being relatively rare. Further analysis of gene expression profiles involving the most DE genes in the integrated dataset provided the opportunity to minimize the effect of technical ‘noise’ and establish the gene expression signature of human liver immune milieu. These results point to reduced expression of immune related functions and enhanced expression of cell death-related pathways. This human liver meta-atlas has the potential to serve as a key reference point for future studies using scRNA seq designed to characterize immune-based pathologies within the liver, such as liver allograft rejection, autoimmune disease, and the tumor microenvironment in primary liver malignancies.
To establish validity of performing a meta-analysis of an integrated dataset using scRNA seq data generated in three distinct labs with different tissue handling, processing, and sequencing techniques, visualization of the integrated data with co-clustering of cell populations was performed (Figure 2). This approach was based on the work of Bonnycastle et al., who specifically studied the impact of tissue processing and handling on the integrity of scRNASeq data (16). In our analysis, immune cell proportions and expression profiles differed somewhat between datasets, with highest correlation between the Lnb and LCD45e studies (Supplemental Figure 5). This is not particularly surprising, as these two datasets used the same 10x Genomics technique for the scRNA seq. It is important to note that extraction of the CD45+ population and combination with other non-liver derived CD45+ cells did not appear to impact the integrity of the data, as was the case with the LCD45e. While various scRNA seq techniques have the potential to introduce noise and bias into the transcriptome data, the Seurat pipeline has been designed to correct for some of these differences by re-aligning cell clusters based on nearest neighbor correction and allowing for integrated downstream analyses (18, 19). This would suggest that differences we observed in expression analyses could be the result of pre-analytical variables such as tissue processing. We further characterize noise with the quantification of an established set of human housekeeping genes (28). There were differences noted in expression of human housekeeping genes within an immune cell subpopulation between datasets, however, mean expression values were not substantially altered and differences noted here are likely the result of large sample sizes. A potential explanation for differences in cellular proportions may be related to sample size effect as well. The LACEe dataset encompasses the largest sample of human tissues, derived from nine total liver specimens, which likely introduces a higher degree of biological variation than Lnb and LCD45e which have five and three liver donors, respectively. The clinical scenario also differs between LACEe, which was created using partial hepatectomy specimens from patients with liver malignancy, and the other two studies, which involve specimens derived from adult brain-dead organ donors. Despite attempts at correction, there were differences noted in gene expression analysis, yet immune subpopulations still co-clustered with good interdigitation between datasets and DE analysis showed good agreement with heatmaps (Figures 2, 4). As such, we proceeded with deeper immune profiling using the integrated dataset.
Differences in tissue handling and processing has been shown to prompt transcriptional changes within the cell (15, 16, 31). A major novelty of our study is the in-depth characterization of differential expression between datasets with identification of relevant canonical pathways affected. Our results indicate that pathways involved in oxidative phosphorylation and cell stress were primarily affected and that immunologically relevant pathways were largely preserved. We suspect that non-immune function related changes may be the consequence of pre-analytical tissue processing. Known pre-analytical variables related to each technique used for the generation of each dataset are summarized in Table 1. Unknown variables of potential importance could include the time interval of ‘Pringle’ clamp time during resection or from the moment of tissue resection to cell processing and sequencing, which could lead to significant ischemic insults and thus impact transcription profiles. In ischemic injury of rodent and human cortex, insult has been demonstrated to promote pro-inflammatory gene expression alterations (32, 33). The fact that these ischemia-induced pro-inflammatory changes are not demonstrated in this study shows that either ‘Pringle’ time in liver specimens captured in these datasets were either not significant enough to cause ischemia, or that liver ischemia does not produce as profound an inflammatory response as cortical ischemia. Overall, the observation that immune pathway profiles for each subpopulation were relatively conserved supports the use of this novel meta-analysis approach to examine existing datasets in order to understand immune homeostasis in human livers.
Immunotolerance is a unique feature of hepatic homeostasis and is thought to be a consequence of the constant stream of antigens being delivered to the liver from the gut. This involves regulation of both the adaptive and innate immune systems to prevent an uncontrolled inflammatory cascade in response to foreign antigens in the portal circulation (1, 34). One potential theory is that effector T-cells are regularly destroyed in the liver. Animal studies have suggested that hepatic immune regulation is mediated by the milieu of anti-inflammatory cytokines and inhibitory regulators such as IL-10 and TGF- ß, leading to inhibition of effector T cell function and expansion of the Treg population (35). Comparison of hepatic immune cell gene expression patterns to a different immune compartment enabled identification of a liver-specific signature of immune homeostasis. Indeed, this analysis revealed that NK, T cell, and myeloid subpopulations had enhanced cell death functions and depressed immune-related functioning (Figure 7). Similarly, plasma and B cells both showed downregulation of immunoglobulin genes and exhibited downregulation of immune functional pathways.
There are limitations to this study. Further differences in scRNA seq technique (comparing 10x Genomics to mCel-seq) likely introduces a differing amount of bias into the sequencing datasets. In addition, because this study explored pre-existing datasets, there was no ability to control for patient-specific parameters in this study design such as inclusion/exclusion based on liver function studies or other potentially relevant clinical factors including duration and storage of samples prior to processing. Despite this, combining scRNA seq and other genomics datasets for the characterization of normal physiology and disease processes will become a more prevalent practice and further efforts should be made to standardize the processes of pre-analytical processing variables as well as to enhance the integration abilities during data analysis.
In conclusion, our results present an integrated dataset of scRNA seq results of the liver immune environment from ~32,000 cells across 17 human livers, making it the largest human liver meta-atlas available. Key immune subset expression profiles that describe the landscape of liver immune homeostasis were defined and can be explored in real-time on our interactive website (27). These results can be incorporated into future RNA sequencing studies and have implications for understanding mechanism of disease and identifying new therapeutic targets in immunologic diseases of the liver including allograft rejection and immune tolerance.
Data Availability Statement
The original contributions presented in the study are included in the article/Supplementary Material. Further inquiries can be directed to the corresponding author. An online “point-and-click” resource for interacting with the meta-atlas which is available on our lab website: https://usctransplantlab.shinyapps.io/meta_rna_seq/.
Ethics Statement
Ethical review and approval was not required for the study on human participants in accordance with the local legislation and institutional requirements. Written informed consent for participation was not required for this study in accordance with the national legislation and the institutional requirements.
Author Contributions
BR contributed to study design, acquiring data, analyzing data, and writing the manuscript. AB contributed to study design and writing the manuscript. PS was responsible for acquiring initial datasets and performed computational analyses and data integration. CG contributed to study design, analyzing data, statistical analyses and aided with writing the manuscript. DH helped with study design and writing the manuscript. Y-HL provided resources for data analysis and contributed to study design. NU provided insights for study design and contributed to verification of data analysis techniques. JL and OA helped with the initial study design. JE conceived the original idea, supervised the project and contributed to study design and data analysis. All authors contributed to the article and approved the submitted version.
Conflict of Interest
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
Acknowledgments
We thank the Eli and Edythe Broad Center for Regenerative Medicine at the University of Southern California, Keck School of Medicine for providing the Broad Clinical Research Fellowship which was awarded to Dr. Rocque for 2020-2021. We also thank the USC Libraries Bioinformatics core for providing software licensing and resources.
Supplementary Material
The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fimmu.2021.679521/full#supplementary-material
References
1. Freitas-Lopes M, Mafra K, David B, Carvalho-Gontijo R, Menezes G. Differential Location and Distribution of Hepatic Immune Cells. Cells (2017) 6:48. doi: 10.3390/cells6040048
2. Racanelli V, Rehermann B. The Liver as an Immunological Organ. Hepatology (2006) 43:S54–S62. doi: 10.1002/hep.21060
3. Zheng M, Tian Z. Liver-Mediated Adaptive Immune Tolerance. Front Immunol (2019) 10:2525:1–12. doi: 10.3389/fimmu.2019.02525
4. Wang Y, Zhang C. The Roles of Liver-Resident Lymphocytes in Liver Diseases. Front Immunol (2019) 10:1582. doi: 10.3389/fimmu.2019.01582
5. Chen G, Ning B, Shi T. Single-Cell RNA-Seq Technologies and Related Computational Data Analysis. Front Genet (2019) 10:317. doi: 10.3389/fgene.2019.00317
6. Hwang B, Lee JH, Bang D. Single-Cell RNA Sequencing Technologies and Bioinformatics Pipelines. Exp Mol Med (2018) 50:1–14. doi: 10.1038/s12276-018-0071-8
7. MacParland SA, Liu JC, Ma XZ, Innes BT, Bartczak AM, Gage BK, et al. Single Cell RNA Sequencing of Human Liver Reveals Distinct Intrahepatic Macrophage Populations. Nat Commun (2018) 9(4383):1–21. doi: 10.1038/s41467-018-06318-7
8. Zhao J, Zhang S, Liu Y, He X, Qu M, Xu G, et al. Single-Cell RNA Sequencing Reveals the Heterogeneity of Liver-Resident Immune Cells in Human. Cell Discov (2020) 6(22):1–19. doi: 10.1038/s41421-020-0157-z
9. Aizarani N, Saviano A, Sagar, Mailly L, Durand S, Herman JS, Pessaux P, et al. A Human Liver Cell Atlas Reveals Heterogeneity and Epithelial Progenitors. Nature (2019) 572(7768):199–204. doi: 10.1038/s41586-019-1373-2
10. Lappalainen T, Sammeth M, Friedländer MR, T Hoen PAC, Monlong J, Rivas MA, et al. ‘Transcriptome and Genome Sequencing Uncovers Functional Variation in Humans. Nature (2013) 501:506–11. doi: 10.1038/nature12531
11. Melé M, Ferreira PG, Reverter F, DeLuca DS, Monlong J, Sammeth M, et al. The Human Transcriptome Across Tissues and Individuals. Science (2015) 348:660–5. doi: 10.1126/science.aaa0355
12. Psaty BM, Rich SS, Boerwinkle E. Innovation in Genomic Data Sharing at the NIH. N Engl J Med (2019) 380(23):2192–5. doi: 10.1056/NEJMp1902363
13. RFA-RM-21-007: Pilot Projects Enhancing Utility and Usage of Common Fund Data Sets (R03 Clinical Trial Not Allowed). Available at: https://grants.nih.gov/grants/guide/rfa-files/RFA-RM-21-007.html (Accessed January 25, 2021).
14. Van Den Brink SC, Sage F, Vértesy Á, Spanjaard B, Peterson-Maduro J, Baron CS, et al. Single-Cell Sequencing Reveals Dissociation-Induced Gene Expression in Tissue Subpopulations. Nat Methods (2017) 14:935–6. doi: 10.1038/nmeth.4437
15. Denisenko E, Guo BB, Jones M, Hou R, De Kock L, Lassmann T, et al. Systematic Assessment of Tissue Dissociation and Storage Biases in Single-Cell and Single-Nucleus RNA-Seq Workflows. Genome Biol (2020) 21:1–25. doi: 10.1186/s13059-020-02048-6
16. Bonnycastle LL, Gildea DE, Yan T, Narisu N, Swift AJ, Wolfsberg TG, et al. Single-Cell Transcriptomics From Human Pancreatic Islets: Sample Preparation Matters. Biol Methods Protoc (2019) 4(1):1–14. doi: 10.1093/biomethods/bpz019
17. Edgar R, Domrachev M, Lash AE. Gene Expression Omnibus: NCBI Gene Expression and Hybridization Array Data Repository. Nucleic Acids Res (2002) 30:207–10. doi: 10.1093/nar/30.1.207
18. Butler A, Hoffman P, Smibert P, Papalexi E, Satija R. Integrating Single-Cell Transcriptomic Data Across Different Conditions, Technologies, and Species. Nat Biotechnol (2018) 36:411–20. doi: 10.1038/nbt.4096
19. Stuart T, Butler A, Hoffman P, Hafemeister C, Papalexi E, Mauck WM, et al. Comprehensive Integration of Single-Cell Data. Cell (2019) 177:1888–1902.e21. doi: 10.1016/j.cell.2019.05.031
20. Korsunsky I, Millard N, Fan J, Slowikowski K, Zhang F, Wei K, et al. Fast, Sensitive and Accurate Integration of Single-Cell Data With Harmony. Nat Methods (2019) 16:1289–96. doi: 10.1038/s41592-019-0619-0
21. Roda-Navarro P, Arce I, Renedo M, Montgomery K, Kucherlapati R, Fernández-Ruiz E. Human KLRF1, a Novel Member of the Killer Cell Lectin-Like Receptor Gene Family: Molecular Characterization, Genomic Structure, Physical Mapping to the NK Gene Complex and Expression Analysis. Eur J Immunol (2000) 30:568–76. doi: 10.1002/1521-4141(200002)30:2<568::AID-IMMU568>3.0.CO;2-Y
22. Zhang AW, O’Flanagan C, Chavez EA, Lim JLP, Ceglia N, McPherson A, et al. Probabilistic Cell-Type Assignment of Single-Cell RNA-Seq for Tumor Microenvironment Profiling. Nat Methods (2019) 16:1007–15. doi: 10.1038/s41592-019-0529-1
23. Shimizu Y, Kohyama M, Yorifuji H, Jin H, Arase N, Suenaga T, et al. Fcγriiia-Mediated Activation of NK Cells by IgG Heavy Chain Complexed With MHC Class II Molecules. Int Immunol (2019) 31:303–14. doi: 10.1093/intimm/dxz010
24. Wu Z, Zhang Z, Lei Z, Lei P. CD14: Biology and Role in the Pathogenesis of Disease. Cytokine Growth Factor Rev (2019) 48:24–31. doi: 10.1016/j.cytogfr.2019.06.003
25. O’Connell FP, Pinkus JL, Pinkus GS. CD138 (Syndecan-1), a Plasma Cell Marker: Immunohistochemical Profile in Hematopoietic and Nonhematopoietic Neoplasms. Am J Clin Pathol (2004) 121:254–63. doi: 10.1309/617DWB5GNFWXHW4L
26. Cahill KM, Huo Z, Tseng GC, Logan RW, Seney ML. Improved Identification of Concordant and Discordant Gene Expression Signatures Using an Updated Rank-Rank Hypergeometric Overlap Approach. Sci Rep (2018) 8:1–11. doi: 10.1038/s41598-018-27903-2
27. Goldbeck C. Interactive Gene Mapper (2021). Available at: https://usctransplantlab.shinyapps.io/meta_rna_seq/ (Accessed June 4, 2021).
28. Eisenberg E, Levanon EY. Human Housekeeping Genes, Revisited. Trends Genet (2013) 29:569–74. doi: 10.1016/j.tig.2013.05.010
29. Plaisier SB, Taschereau R, Wong JA, Graeber TG. Rank-Rank Hypergeometric Overlap: Identification of Statistically Significant Overlap Between Gene-Expression Signatures. Nucleic Acids Res (2010) 38:1–17. doi: 10.1093/nar/gkq636
30. Rizzetto S, Eltahla AA, Lin P, Bull R, Lloyd AR, Ho JWK, et al. Impact of Sequencing Depth and Read Length on Single Cell RNA Sequencing Data of T Cells. Sci Rep (2017) 7:1–11. doi: 10.1038/s41598-017-12989-x
31. O’Flanagan CH, Campbell KR, Zhang AW, Kabeer F, Lim JLP, Biele J, et al. Dissociation of Solid Tumor Tissues With Cold Active Protease for Single-Cell RNA-Seq Minimizes Conserved Collagenase-Associated Stress Responses. Genome Biol (2019) 20:1–13. doi: 10.1186/s13059-019-1830-0
32. Barr TL, Conley Y, Ding J, Dillman A, Warach S, Singleton A, et al. Genomic Biomarkers and Cellular Pathways of Ischemic Stroke by RNA Gene Expression Profiling. Neurology (2010) 75:1009–14. doi: 10.1212/WNL.0b013e3181f2b37f
33. Wang X, Yue TL, Barone FC, White RF, Gagnon RC, Feuerstein GZ. Concomitant Cortical Expression of TNF-α and IL-1β mRNAs Follows Early Response Gene Expression in Transient Focal Ischemia. Mol Chem Neuropathol (1994) 23:103–14. doi: 10.1007/BF02815404
34. Heymann F, Tacke F. Immunology in the Liver-From Homeostasis to Disease. Nat Rev Gastroenterol Hepatol (2016) 13:88–110. doi: 10.1038/nrgastro.2015.200
Keywords: single cell, scRNA-seq, RNA-seq, liver, immunotolerance, liver homeostasis, immunology
Citation: Rocque B, Barbetta A, Singh P, Goldbeck C, Helou DG, Loh Y-HE, Ung N, Lee J, Akbari O and Emamaullee J (2021) Creation of a Single Cell RNASeq Meta-Atlas to Define Human Liver Immune Homeostasis. Front. Immunol. 12:679521. doi: 10.3389/fimmu.2021.679521
Received: 12 March 2021; Accepted: 28 June 2021;
Published: 16 July 2021.
Edited by:
Guido Moll, Charité – Universitätsmedizin Berlin, GermanyReviewed by:
Andrew Malone, Washington University in St. Louis, United StatesFederico Simonetta, Geneva University Hospitals (HUG), Switzerland
Copyright © 2021 Rocque, Barbetta, Singh, Goldbeck, Helou, Loh, Ung, Lee, Akbari and Emamaullee. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
*Correspondence: Juliet Emamaullee, SnVsaWV0LmVtYW1hdWxsZWVAbWVkLnVzYy5lZHU=