Single-cell transcriptomics
Single-cell transcriptomics examines the gene expression level of individual cells in a given population by simultaneously measuring the RNA concentration, typically messenger RNA (mRNA), of hundreds to thousands of genes.[1] Single-cell transcriptomics makes it possible to unravel heterogeneous cell populations, reconstruct cellular developmental pathways, and model transcriptional dynamics, all of which were previously masked in bulk RNA sequencing.[2]
Background
The development of high-throughput RNA sequencing (RNA-seq) and microarrays has made gene expression analysis routine. RNA analysis was previously limited to tracing individual transcripts by Northern blots or quantitative PCR. Higher throughput and speed allow researchers to frequently characterize the expression profiles of populations of thousands of cells. Data from bulk assays have led to the identification of genes that are differentially expressed in distinct cell populations and to biomarker discovery.[3]

These studies are limited because they provide measurements for whole tissues and, as a result, show an average expression profile for all the constituent cells. This has several drawbacks. Firstly, different cell types within the same tissue can have distinct roles in multicellular organisms. They often form subpopulations with unique transcriptional profiles. Correlations in the gene expression of the subpopulations can often be missed because the subpopulations are not identified.[1] Secondly, bulk assays cannot distinguish whether a change in the expression profile is due to a change in regulation or a change in composition, for example when one cell type comes to dominate the population. Lastly, when the goal is to study cellular progression through differentiation, average expression profiles can only order cells by time rather than by developmental stage. Consequently, they cannot show trends in gene expression levels specific to certain stages.[4]
Recent advances in biotechnology allow the measurement of gene expression in hundreds to thousands of individual cells simultaneously. While these breakthroughs in transcriptomics technologies have enabled the generation of single-cell transcriptomic data, they have also presented new computational and analytical challenges. Bioinformaticians can apply techniques from bulk RNA-seq to single-cell data. Still, many new computational approaches have had to be designed for this data type to facilitate a complete and detailed study of single-cell expression profiles.[5]
Experimental steps
There is so far no standardized technique for generating single-cell data, but all methods must include cell isolation from the population, lysate formation, amplification through reverse transcription, and quantification of expression levels. Common techniques for measuring expression are quantitative PCR and RNA-seq.[6]
Isolating single cells
Several methods are available to isolate cells for single-cell analysis, differing primarily in throughput and potential for cell selection. Low-throughput techniques, such as micropipetting, cytoplasmic aspiration,[7] and laser capture microdissection, typically isolate hundreds of cells but enable deliberate cell selection.
High-throughput methods allow for the rapid isolation of hundreds to tens of thousands of cells.[8] Common high-throughput approaches include fluorescence-activated cell sorting (FACS) and the use of microfluidic devices. Microfluidic platforms often isolate single cells either by mechanical separation into microwells (e.g., BD Rhapsody, Takara ICELL8, Vycap Puncher Platform, CellMicrosystems CellRaft) or by encapsulation within droplets (e.g., 10x Genomics Chromium, Illumina Bio-Rad ddSEQ, 1CellBio InDrop, Dolomite Bio Nadia).[9] Furthermore, optimized protocols have been developed by integrating these isolation techniques directly with scRNA-seq workflows. For instance, combining FACS with scRNA-seq led to protocols like SORT-seq,[10] and a list of studies using SORT-seq is maintained by Single Cell Discoveries.[11] Similarly, the integration of microfluidic devices with scRNA-seq has been highly optimized in protocols such as those developed by 10x Genomics.[12]
Single-cell RNA-seq techniques that rely on split-pool barcoding, including sci-RNA-seq, SPLiT-seq, and microSPLiT, can uniquely label cells without requiring the isolation of individual cells.[13][14][15]
Quantitative PCR (qPCR)
qPCR can be applied to measure the level of expression of each transcript. Gene-specific primers are used to amplify the corresponding gene, as with regular PCR; as a result, data is usually obtained for fewer than 100 genes. Housekeeping genes, whose expression should be constant under the conditions studied, are included for normalization. The most commonly used housekeeping genes include GAPDH and β-actin, although the reliability of normalization through this process is questionable, as there is evidence that their level of expression can vary significantly.[16] Fluorescent dyes are used as reporter molecules to detect the PCR product and monitor the progress of the amplification: the increase in fluorescence intensity is proportional to the amplicon concentration. Fluorescence is plotted against cycle number, and a threshold fluorescence level is used to find the cycle number at which the plot reaches this value. The cycle number at this point is known as the threshold cycle (Ct) and is measured for each gene.[17]
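The threshold-cycle calculation can be illustrated with a minimal sketch. The function names, the linear interpolation between cycles, and the assumption that the product doubles perfectly in every cycle (used to turn a Ct difference into a relative expression value) are illustrative choices and are not taken from the cited sources.

```python
def threshold_cycle(fluorescence, threshold):
    """Return the fractional cycle at which fluorescence first reaches the threshold.

    fluorescence[i] is the reading after cycle i + 1. Linear interpolation between
    the two readings that bracket the threshold gives a fractional Ct.
    Returns None if the threshold is never reached.
    """
    for i, value in enumerate(fluorescence):
        if value >= threshold:
            if i == 0:
                return 1.0
            previous = fluorescence[i - 1]
            # Fraction of the way from cycle i to cycle i + 1.
            fraction = (threshold - previous) / (value - previous)
            return i + fraction
    return None


def relative_expression(ct_target, ct_housekeeping):
    """Expression of a target gene relative to a housekeeping gene.

    Assumes perfect doubling per cycle (an idealisation), so a gene that
    crosses the threshold one cycle later is taken to be half as abundant.
    """
    delta_ct = ct_target - ct_housekeeping
    return 2.0 ** (-delta_ct)


# Hypothetical amplification curve that doubles every cycle.
curve = [1, 2, 4, 8, 16, 32]
ct = threshold_cycle(curve, threshold=12)
```

In practice amplification efficiency is below 100% and varies between primer pairs, so the doubling assumption is only a starting point.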
Single-cell RNA-seq (scRNA-Seq)
The single-cell RNA-seq technique converts a population of RNAs to a library of cDNA fragments that can be sequenced. In droplet-based technologies such as 10x Genomics Chromium, single cells are isolated in droplets together with beads coated with barcoded oligonucleotides. Both cells and beads are supplied in limited amounts such that co-occupancy with multiple cells and beads is a very rare event. Cells are lysed within the droplets, and RNAs are reverse transcribed using the barcoded oligo-dT oligonucleotides as primers. After reverse transcription, the emulsion is broken, releasing the barcoded cDNA from all the droplets into a single solution. This pooled cDNA is then prepared for sequencing via the addition of sequencing adapters and PCR amplification.
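Why limiting dilution keeps co-occupancy rare can be sketched numerically. The example below assumes that the number of cells per droplet follows a Poisson distribution, which is a common modelling assumption rather than something stated above; the loading rates are hypothetical.

```python
import math


def poisson_pmf(k, rate):
    """Probability of exactly k cells in a droplet when the mean is `rate`."""
    return math.exp(-rate) * rate ** k / math.factorial(k)


def multiplet_fraction(rate):
    """Fraction of cell-containing droplets that hold two or more cells."""
    p_empty = poisson_pmf(0, rate)
    p_single = poisson_pmf(1, rate)
    p_occupied = 1.0 - p_empty
    return (1.0 - p_empty - p_single) / p_occupied


# Hypothetical loading rates (mean cells per droplet).
for rate in (0.05, 0.1, 0.5, 1.0):
    print(f"rate={rate}: multiplet fraction={multiplet_fraction(rate):.3f}")
```

Under this model, lowering the loading rate reduces multiplets at the cost of leaving most droplets empty.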

These fragments are sequenced by high-throughput next generation sequencing techniques and the reads are mapped back to the reference genome, providing a count of the number of reads associated with each gene.[18] Transcripts from a particular cell are identified by each cell's unique barcode.[19][20]
Normalization of RNA-Seq data accounts for cell-to-cell variation in the efficiencies of cDNA library formation and sequencing. One method relies on extrinsic RNA spike-ins that are added in equal quantities to each cell lysate; read counts are then normalized by the number of reads mapped to spike-in mRNA.[21] Another control uses unique molecular identifiers (UMIs), short DNA sequences (6–10 nt) that are added to each cDNA before amplification and act as a barcode for each cDNA molecule. Normalization is achieved by using the number of unique UMIs associated with each gene to account for differences in amplification efficiency.[22]
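The two normalization ideas can be sketched as follows. The read tuples, barcodes, and function names are hypothetical, and the sketch ignores sequencing errors in barcodes and UMIs, which real pipelines usually correct.

```python
from collections import defaultdict


def count_umis(reads):
    """Collapse mapped reads into UMI counts per (cell barcode, gene).

    `reads` is an iterable of (cell_barcode, umi, gene) tuples, one per read.
    Reads sharing all three values are treated as amplification duplicates
    of a single original cDNA molecule and are counted once.
    """
    molecules = defaultdict(set)
    for cell_barcode, umi, gene in reads:
        molecules[(cell_barcode, gene)].add(umi)
    return {key: len(umis) for key, umis in molecules.items()}


def scale_by_spike_ins(counts, spike_in_totals):
    """Divide each count by the total spike-in count observed in the same cell."""
    return {
        (cell, gene): value / spike_in_totals[cell]
        for (cell, gene), value in counts.items()
    }


# Hypothetical reads: the first two are duplicates of the same molecule.
reads = [
    ("AAAC", "TTG", "GAPDH"),
    ("AAAC", "TTG", "GAPDH"),
    ("AAAC", "CCA", "GAPDH"),
    ("GGGT", "TTG", "GAPDH"),
    ("GGGT", "ACG", "ACTB"),
]
umi_counts = count_umis(reads)
scaled = scale_by_spike_ins(umi_counts, {"AAAC": 4, "GGGT": 2})
```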
Spike-ins, UMIs and other approaches have been combined to help identify artifacts during library preparation[23] and to normalize more accurately.
Applications
scRNA-Seq is becoming widely used across biological disciplines including development, neurology,[24] oncology,[25][26][27] autoimmune disease,[28] infectious disease,[29] brain disease,[30] and environmental virology.[31][32] Several scRNA-Seq protocols have been published: Tang et al.,[33] STRT,[34] SMART-seq,[35] CEL-seq,[36] RAGE-seq,[37] Quartz-seq[38] and C1-CAGE.[39] These protocols differ in their strategies for reverse transcription, cDNA synthesis and amplification, and in whether they can accommodate sequence-specific barcodes (i.e. UMIs) or process pooled samples.[40] In 2017, two approaches were introduced to simultaneously measure single-cell mRNA and protein expression through oligonucleotide-labeled antibodies, known as REAP-seq[41] and CITE-seq.[42] A 2025 review in Science reported that applying single-cell transcriptomics to microbial communities reveals functional heterogeneity within gut communities, characteristic antibiotic responses, and the dynamics of mobile genetic elements (Pountain, Andrew W.; Yanai, Itai (2025-09-04). "Dissecting microbial communities with single-cell transcriptome analysis". Science. 389 (6764) eadp6252. doi:10.1126/science.adp6252. PMC 12467864. PMID 40906858).
scRNA-Seq has provided considerable insight into the development of embryos and organisms, including the worm Caenorhabditis elegans,[43] and the regenerative planarian Schmidtea mediterranea.[44][45] The first vertebrate animals to be mapped in this way were zebrafish[46][47] and Xenopus laevis.[48] In each case multiple stages of the embryo were studied, allowing the entire process of development to be mapped on a cell-by-cell basis.[49] Science recognized these advances as the 2018 Breakthrough of the Year.[50]
Considerations
A problem associated with single-cell data is zero-inflated gene expression distributions. The excess zeros, known as technical dropouts, are common because the low mRNA concentrations of less-expressed genes are often not captured in the reverse transcription process. The percentage of mRNA molecules in the cell lysate that are detected is often only 10-20%.[51]
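The link between low capture efficiency and dropouts can be made concrete with a simplified model. The sketch below assumes each mRNA molecule is captured independently with the same probability, which is an idealisation introduced here for illustration; the function names are hypothetical.

```python
def dropout_probability(molecules, capture_efficiency):
    """Probability that none of a gene's mRNA molecules in a cell is detected.

    Assumes independent capture of each molecule with equal probability.
    """
    return (1.0 - capture_efficiency) ** molecules


def observed_zero_fraction(counts):
    """Fraction of cells in which a gene has a count of zero."""
    return sum(1 for c in counts if c == 0) / len(counts)


# With 10% capture efficiency, a gene present at 2 copies per cell is missed
# in most cells, while a gene at 100 copies is almost always detected.
low = dropout_probability(2, 0.10)
high = dropout_probability(100, 0.10)
```

Under these assumptions, zeros for weakly expressed genes mostly reflect sampling rather than absence of the transcript.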
When using RNA spike-ins for normalization the assumption is made that the amplification and sequencing efficiencies for the endogenous and spike-in RNA are the same. Evidence suggests that this is not the case given fundamental differences in size and features, such as the lack of a polyadenylated tail in spike-ins and therefore shorter length.[52] Additionally, normalization using UMIs assumes the cDNA library is sequenced to saturation, which is not always the case.[22]
In the amplification step, either PCR or in vitro transcription (IVT) is currently used to amplify cDNA. One of the advantages of PCR-based methods is the ability to generate full-length cDNA. However, differences in PCR efficiency between sequences (for instance, due to GC content or snapback structure) are also amplified exponentially, producing libraries with uneven coverage. On the other hand, while libraries generated by IVT can avoid PCR-induced sequence bias, specific sequences may be transcribed inefficiently, thus causing sequence drop-out or generating incomplete sequences.[53][54]
Challenges for scRNA-Seq include preserving the initial relative abundance of mRNA in a cell and identifying rare transcripts.[55] The reverse transcription step is critical as the efficiency of the RT reaction determines how much of the cell's RNA population will be eventually analyzed by the sequencer. The processivity of reverse transcriptases and the priming strategies used may affect full-length cDNA production and the generation of libraries biased toward the 3' or 5' end of genes.
A further consideration when sequencing large, branched cell types, such as neurons, comes from the removal of distal processes containing local pools of RNA during the single-cell isolation process. In these cells, scRNA-seq datasets only capture transcripts in the central cell body, omitting transcripts from RNA pools localized to cellular processes that can be involved in local translation or other RNA-mediated subcellular mechanisms. In the brain it has been estimated that over 40% of total RNA is not sequenced by scRNA-seq due to the prevalence of local transcriptomes in cellular processes such as axons, dendrites, myelin, and endfeet.[56]
Data analysis
Insights based on single-cell data analysis assume that the input is a matrix of normalized gene expression counts, generated by the approaches outlined above, and can provide opportunities that are not obtainable from bulk assays.
Three main insights are provided:[57]
- Identification and characterization of cell types and their spatial organisation in time
- Inference of gene regulatory networks and their strength across individual cells
- Classification of the stochastic component of transcription
The techniques outlined have been designed to help visualise and explore patterns in the data in order to facilitate the revelation of these three features.
Clustering

Clustering allows for the formation of subgroups in the cell population. Cells can be clustered by their transcriptomic profile in order to analyse the sub-population structure and identify rare cell types or cell subtypes. Alternatively, genes can be clustered by their expression states in order to identify covarying genes. A combination of both clustering approaches, known as biclustering, has been used to simultaneously cluster by genes and cells to find genes that behave similarly within cell clusters.[58]
Clustering methods that can be applied include K-means clustering, which forms disjoint groups, and hierarchical clustering, which forms nested partitions.
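As an illustration of how K-means forms disjoint groups of cells, the following is a minimal sketch of Lloyd's algorithm. Starting the centroids at the first k cells is a deterministic choice made for this example only; practical implementations use randomized or more careful initialization, and the toy expression vectors are hypothetical.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two expression vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def k_means(cells, k, iterations=20):
    """Assign each cell (a list of expression values) to one of k clusters."""
    centroids = [list(cell) for cell in cells[:k]]
    labels = [0] * len(cells)
    for _ in range(iterations):
        # Assignment step: each cell joins its nearest centroid.
        labels = [
            min(range(k), key=lambda c: squared_distance(cell, centroids[c]))
            for cell in cells
        ]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [cell for cell, label in zip(cells, labels) if label == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = [sum(values) / len(members) for values in zip(*members)]
    return labels


# Two hypothetical subpopulations measured on two genes.
cells = [[0, 0], [10, 10], [0, 1], [1, 0], [10, 11], [11, 10]]
labels = k_means(cells, k=2)
```

Every cell receives exactly one label, which is what distinguishes this partition from the nested groupings of hierarchical clustering.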
Biclustering
Biclustering provides several advantages by improving the resolution of clustering. Genes that are only informative to a subset of cells and are hence only expressed there can be identified through biclustering. Moreover, similarly behaving genes that differentiate one cell cluster from another can be identified using this method.[59]
Dimensionality reduction
Dimensionality reduction algorithms such as principal component analysis (PCA) and t-SNE can be used to simplify data for visualisation and pattern detection by transforming cells from a high- to a lower-dimensional space. The result is a plot with each cell as a point in a 2-D or 3-D space. Dimensionality reduction is frequently used before clustering, as cells in high dimensions can wrongly appear to be close due to distance metrics behaving non-intuitively.[60]
Principal component analysis
The most frequently used technique is PCA, which identifies the directions of largest variance, the principal components, and transforms the data so that the first principal component has the largest possible variance and each successive principal component in turn has the highest variance possible while remaining orthogonal to the preceding components. The contribution each gene makes to each component is used to infer which genes contribute the most to variance in the population and are involved in differentiating subpopulations.[61]
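The first principal component and its gene loadings can be sketched without external libraries by applying power iteration to the gene-gene covariance matrix. Power iteration, the all-ones starting vector, and the toy data are choices made for this example; the method fails if the starting vector happens to be orthogonal to the leading component, and library implementations typically use a singular value decomposition instead.

```python
import math


def first_principal_component(cells, iterations=100):
    """Return (loadings, variance) of the first principal component.

    `cells` is a list of rows, one per cell, each holding one value per gene.
    """
    n_cells, n_genes = len(cells), len(cells[0])
    means = [sum(row[g] for row in cells) / n_cells for g in range(n_genes)]
    centered = [[row[g] - means[g] for g in range(n_genes)] for row in cells]
    # Sample covariance between every pair of genes.
    cov = [
        [sum(row[i] * row[j] for row in centered) / (n_cells - 1) for j in range(n_genes)]
        for i in range(n_genes)
    ]

    def apply_cov(vector):
        return [sum(cov[i][j] * vector[j] for j in range(n_genes)) for i in range(n_genes)]

    vector = [1.0] * n_genes  # starting guess (an assumption of this sketch)
    for _ in range(iterations):
        product = apply_cov(vector)
        norm = math.sqrt(sum(x * x for x in product))
        vector = [x / norm for x in product]
    # Variance explained along the component: v^T C v.
    variance = sum(v * p for v, p in zip(vector, apply_cov(vector)))
    return vector, variance


# Hypothetical data: gene 1 is exactly twice gene 0, gene 2 is small noise.
cells = [[1, 2, 0.0], [2, 4, 0.1], [3, 6, -0.1], [4, 8, 0.0]]
loadings, variance = first_principal_component(cells)
top_gene = max(range(len(loadings)), key=lambda g: abs(loadings[g]))
```

The gene with the largest absolute loading (`top_gene`) is the one contributing most to the variance captured by this component.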
Differential expression
Detecting differences in gene expression level between two populations is used for both single-cell and bulk transcriptomic data. Specialised methods have been designed for single-cell data that consider single-cell features such as technical dropouts and the shape of the distribution, e.g. bimodal vs. unimodal.[62]
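One way to take dropouts into account is to describe each gene in two parts: how often it is detected, and how strongly it is expressed when detected. The sketch below is only a descriptive summary in the spirit of such two-part (hurdle) approaches; it computes no test statistic or p-value, and the function names and counts are hypothetical.

```python
import math


def two_part_summary(counts):
    """Return (detection rate, mean log1p expression among expressing cells)."""
    expressed = [c for c in counts if c > 0]
    detection_rate = len(expressed) / len(counts)
    if not expressed:
        return detection_rate, 0.0
    mean_log = sum(math.log1p(c) for c in expressed) / len(expressed)
    return detection_rate, mean_log


def compare_groups(group_a, group_b):
    """Difference between two cell groups in both parts of the summary."""
    rate_a, mean_a = two_part_summary(group_a)
    rate_b, mean_b = two_part_summary(group_b)
    return {
        "detection_rate_difference": rate_a - rate_b,
        "mean_log_difference": mean_a - mean_b,
    }


# Hypothetical gene: same level when detected, but detected in fewer cells of group A.
result = compare_groups([0, 0, 3, 3], [3, 3, 3, 3])
```

In this example the groups differ only in detection rate, a pattern that a comparison of plain means would conflate with a change in expression level.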
Gene ontology enrichment
Gene ontology terms describe gene functions and the relationships between those functions, grouped into three classes:
- Molecular function
- Cellular component
- Biological process
Gene Ontology (GO) term enrichment is a technique used to identify which GO terms are over-represented or under-represented in a given set of genes. In single-cell analysis, the input list of genes of interest can be selected based on differentially expressed genes or groups of genes generated from biclustering. The number of genes annotated to a GO term in the input list is normalized against the number of genes annotated to that GO term in the background set of all genes in the genome to determine statistical significance.[63]
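A common way to assess over-representation is a one-sided hypergeometric test, sketched below. Using this particular test is an assumption of the example, since the text above does not name one, and a real analysis would also correct for testing many GO terms at once.

```python
from math import comb


def enrichment_p_value(overlap, list_size, annotated, background):
    """Probability of seeing at least `overlap` annotated genes by chance.

    overlap    -- genes in the input list annotated to the GO term
    list_size  -- number of genes in the input list
    annotated  -- genes in the background annotated to the GO term
    background -- total number of genes in the background set
    """
    total = comb(background, list_size)
    upper = min(list_size, annotated)
    favourable = sum(
        comb(annotated, i) * comb(background - annotated, list_size - i)
        for i in range(overlap, upper + 1)
    )
    return favourable / total


# Hypothetical numbers: 3 of 5 input genes carry a term that annotates
# 5 of the 20 background genes.
p = enrichment_p_value(overlap=3, list_size=5, annotated=5, background=20)
```

Under-representation can be tested in the same way by summing the lower tail instead.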
Pseudotemporal ordering
Pseudo-temporal ordering (or trajectory inference) is a technique that aims to infer gene expression dynamics from snapshot single-cell data. The method tries to order the cells in such a way that similar cells are closely positioned to each other. This trajectory of cells can be linear, but can also bifurcate or follow more complex graph structures. The trajectory, therefore, enables the inference of gene expression dynamics and the ordering of cells by their progression through differentiation or response to external stimuli. The method relies on the assumptions that the cells follow the same path through the process of interest and that their transcriptional state correlates to their progression. The algorithm can be applied to both mixed populations and temporal samples.
More than 50 methods for pseudo-temporal ordering have been developed, and each has its own requirements for prior information (such as starting cells or time course data), detectable topologies, and methodology.[64] An example algorithm is the Monocle algorithm,[65] which carries out dimensionality reduction of the data, builds a minimal spanning tree using the transformed data, orders cells in pseudo-time by following the longest connected path of the tree and consequently labels cells by type. Another example is the diffusion pseudotime (DPT) algorithm,[63] which uses a diffusion map and diffusion process. Another class of methods, such as MARGARET,[66] employs graph partitioning to capture complex trajectory topologies such as disconnected and multifurcating trajectories.
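The minimal-spanning-tree idea can be illustrated with a simplified sketch. It is not the Monocle implementation: the dimensionality reduction step is skipped, only cells lying on the longest path of the tree are ordered, and the direction of the ordering is arbitrary unless a starting cell is known. All function names and the toy coordinates are hypothetical.

```python
import math


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def minimum_spanning_tree(points):
    """Prim's algorithm; returns {node: [(neighbor, weight), ...]}."""
    adjacency = {i: [] for i in range(len(points))}
    # best[i] = (distance from cell i to the tree, closest cell in the tree)
    best = {i: (euclidean(points[0], points[i]), 0) for i in range(1, len(points))}
    while best:
        node = min(best, key=lambda i: best[i][0])
        weight, parent = best.pop(node)
        adjacency[node].append((parent, weight))
        adjacency[parent].append((node, weight))
        for other in best:
            d = euclidean(points[node], points[other])
            if d < best[other][0]:
                best[other] = (d, node)
    return adjacency


def farthest_node(adjacency, start):
    """Return the node farthest from `start` along tree edges, plus parent links."""
    distances, parents, stack = {start: 0.0}, {start: None}, [start]
    while stack:
        node = stack.pop()
        for neighbor, weight in adjacency[node]:
            if neighbor not in distances:
                distances[neighbor] = distances[node] + weight
                parents[neighbor] = node
                stack.append(neighbor)
    return max(distances, key=distances.get), parents


def pseudotime_order(points):
    """Order cells along the longest path of the minimum spanning tree."""
    tree = minimum_spanning_tree(points)
    end_a, _ = farthest_node(tree, 0)
    end_b, parents = farthest_node(tree, end_a)
    path, node = [], end_b
    while node is not None:  # walk back from end_b to end_a
        path.append(node)
        node = parents[node]
    return path[::-1]


# Hypothetical cells sampled along one trajectory, listed in shuffled order.
cells = [[2.0, 0.0], [0.0, 0.0], [4.0, 0.0], [1.0, 0.0], [3.0, 0.0]]
order = pseudotime_order(cells)
```

For these cells the recovered order follows the underlying line in one direction or the other, which shows why prior information such as a known starting cell is needed to orient the trajectory.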
Network inference
Gene regulatory network inference is a technique that aims to construct a network, shown as a graph, in which the nodes represent the genes and edges indicate co-regulatory interactions. The method relies on the assumption that a strong statistical relationship between the expression of genes is an indication of a potential functional relationship.[67] The most commonly used method to measure the strength of a statistical relationship is correlation. However, correlation fails to identify non-linear relationships and mutual information is used as an alternative. Gene clusters linked in a network signify genes that undergo coordinated changes in expression.[68]
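The difference between the two measures can be shown on a toy example in which one gene depends on another through a symmetric, non-linear relationship. The values below are hypothetical centred expression levels, and the mutual information estimate is the simple plug-in estimate for discrete values; continuous expression data would first need to be binned.

```python
import math
from collections import Counter


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return covariance / (spread_x * spread_y)


def mutual_information(x, y):
    """Plug-in mutual information (in nats) between two discrete sequences."""
    n = len(x)
    count_x, count_y = Counter(x), Counter(y)
    count_xy = Counter(zip(x, y))
    return sum(
        (c / n) * math.log(c * n / (count_x[a] * count_y[b]))
        for (a, b), c in count_xy.items()
    )


# Gene B is the square of gene A: fully dependent, but not linearly.
gene_a = [-2, -1, 0, 1, 2]
gene_b = [a * a for a in gene_a]
r = pearson(gene_a, gene_b)
mi = mutual_information(gene_a, gene_b)
```

Here the correlation is zero even though gene B is completely determined by gene A, whereas the mutual information is clearly positive.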
Integration
The presence or strength of technical effects and the types of cells observed often differ in single-cell transcriptomics datasets generated using different experimental protocols and under different conditions. This difference results in strong batch effects that may bias the findings of statistical methods applied across batches, particularly in the presence of confounding.[69] As a result of the aforementioned properties of single-cell transcriptomic data, batch correction methods developed for bulk sequencing data were observed to perform poorly. Consequently, researchers developed statistical methods to correct for batch effects that are robust to the properties of single-cell transcriptomic data to integrate data from different sources or experimental batches. Laleh Haghverdi performed foundational work in formulating the use of mutual nearest neighbors between each batch to define batch correction vectors.[70] These vectors make it possible to merge datasets that each include at least one shared cell type. An orthogonal approach involves the projection of each dataset onto a shared low-dimensional space using canonical correlation analysis.[71] Mutual nearest neighbors and canonical correlation analysis have also been combined to define integration "anchors" comprising reference cells in one dataset, to which query cells in another dataset are normalized.[72] Another class of methods (e.g., scDREAMER[73]) uses deep generative models such as variational autoencoders for learning batch-invariant latent cellular representations which can be used for downstream tasks such as cell type clustering, denoising of single-cell gene expression vectors and trajectory inference.[66]
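The mutual nearest neighbors idea can be sketched in a heavily simplified form. The example uses a single nearest neighbor and one averaged correction vector for the whole batch, whereas published methods use several neighbors and cell-specific, smoothed correction vectors; the function names and toy batches are hypothetical.

```python
def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(query, reference):
    """Index of the cell in `reference` closest to `query`."""
    return min(range(len(reference)), key=lambda j: squared_distance(query, reference[j]))


def mutual_nearest_pairs(batch_a, batch_b):
    """Pairs (i, j) where cell i of batch A and cell j of batch B pick each other."""
    nearest_in_b = [nearest(cell, batch_b) for cell in batch_a]
    nearest_in_a = [nearest(cell, batch_a) for cell in batch_b]
    return [(i, j) for i, j in enumerate(nearest_in_b) if nearest_in_a[j] == i]


def correct_batch(batch_a, batch_b):
    """Shift batch B by the average difference across mutual nearest pairs."""
    pairs = mutual_nearest_pairs(batch_a, batch_b)
    if not pairs:
        raise ValueError("no mutual nearest neighbors found")
    n_genes = len(batch_a[0])
    shift = [
        sum(batch_a[i][g] - batch_b[j][g] for i, j in pairs) / len(pairs)
        for g in range(n_genes)
    ]
    return [[value + s for value, s in zip(cell, shift)] for cell in batch_b]


# Hypothetical batches: the same two cell types, with batch B offset by (1, 1).
batch_a = [[0.0, 0.0], [10.0, 0.0]]
batch_b = [[11.0, 1.0], [1.0, 1.0]]
pairs = mutual_nearest_pairs(batch_a, batch_b)
corrected = correct_batch(batch_a, batch_b)
```

The mutual requirement matters when a cell type is present in only one batch: its cells may have a nearest neighbor in the other batch, but are unlikely to be chosen in return, so they do not distort the correction.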
References
- ^ a b Kanter, Itamar; Kalisky, Tomer (2015). "Single cell transcriptomics: methods and applications". Frontiers in Oncology. 5: 53. doi:10.3389/fonc.2015.00053. ISSN 2234-943X. PMC 4354386. PMID 25806353.
- ^ Liu, Serena; Trapnell, Cole (2016). "Single-cell transcriptome sequencing: recent advances and remaining challenges". F1000Research. 5: F1000 Faculty Rev–182. doi:10.12688/f1000research.7223.1. ISSN 2046-1402. PMC 4758375. PMID 26949524.
- ^ Szabo, David T. (2014). "Chapter 62 - Transcriptomic biomarkers in safety and risk assessment of chemicals". Biomarkers in Toxicology. Academic Press. pp. 1033–1038. ISBN 978-0-12-404630-6.
- ^ Trapnell, Cole (October 2015). "Defining cell types and states with single-cell genomics". Genome Research. 25 (10): 1491–1498. doi:10.1101/gr.190595.115. ISSN 1549-5469. PMC 4579334. PMID 26430159.
- ^ Stegle, O.; Teichmann, S.; Marioni, J. (2015). "Computational and analytical challenges in single-cell transcriptomics". Nature Reviews Genetics. 16 (3): 133–145. doi:10.1038/nrg3833. PMID 25628217. S2CID 205486032.
- ^ Kolodziejczyk, Aleksandra A.; Kim, Jong Kyoung; Svensson, Valentine; Marioni, John C.; Teichmann, Sarah A. (May 2015). "The Technology and Biology of Single-Cell RNA Sequencing". Molecular Cell. 58 (4): 610–620. doi:10.1016/j.molcel.2015.04.005. PMID 26000846.
- ^ "Cytoplasmic aspiration | Single Cell Analysis". www.single-cell-analysis.com. Archived from the original on 2023-11-30. Retrieved 2025-04-15.
- ^ Poulin, Jean-Francois; Tasic, Bosiljka; Hjerling-Leffler, Jens; Trimarchi, Jeffrey M.; Awatramani, Rajeshwar (1 September 2016). "Disentangling neural cell diversity using single-cell transcriptomics". Nature Neuroscience. 19 (9): 1131–1141. doi:10.1038/nn.4366. ISSN 1097-6256. PMID 27571192. S2CID 14461377.
- ^ Valihrach L, Androvic P, Kubista M (March 2018). "Platforms for Single-Cell Collection and Analysis". International Journal of Molecular Sciences. 19 (3): 807. doi:10.3390/ijms19030807. PMC 5877668. PMID 29534489.
- ^ Muraro, Mauro J.; Dharmadhikari, Gitanjali; Grün, Dominic; Groen, Nathalie; Dielen, Tim; Jansen, Erik; van Gurp, Leon; Engelse, Marten A.; Carlotti, Francoise; de Koning, Eelco J. P.; van Oudenaarden, Alexander (2016-10-26). "A Single-Cell Transcriptome Atlas of the Human Pancreas". Cell Systems. 3 (4): 385–394.e3. doi:10.1016/j.cels.2016.09.002. ISSN 2405-4712. PMC 5092539. PMID 27693023.
- ^ "SORT-seq Archives". Single Cell Discoveries. Retrieved 2022-11-15.
- ^ Zheng, Grace X. Y.; Terry, Jessica M.; Belgrader, Phillip; Ryvkin, Paul; Bent, Zachary W.; Wilson, Ryan; Ziraldo, Solongo B.; Wheeler, Tobias D.; McDermott, Geoff P.; Zhu, Junjie; Gregory, Mark T.; Shuga, Joe; Montesclaros, Luz; Underwood, Jason G.; Masquelier, Donald A. (2017-01-16). "Massively parallel digital transcriptional profiling of single cells". Nature Communications. 8 (1) 14049. Bibcode:2017NatCo...814049Z. doi:10.1038/ncomms14049. ISSN 2041-1723. PMC 5241818. PMID 28091601.
- ^ Cao, Junyue; Packer, Jonathan S.; Ramani, Vijay; Cusanovich, Darren A.; Huynh, Chau; Daza, Riza; Qiu, Xiaojie; Lee, Choli; Furlan, Scott N.; Steemers, Frank J.; Adey, Andrew; Waterston, Robert H.; Trapnell, Cole; Shendure, Jay (2017-08-18). "Comprehensive single-cell transcriptional profiling of a multicellular organism". Science. 357 (6352): 661–667. Bibcode:2017Sci...357..661C. doi:10.1126/science.aam8940. ISSN 1095-9203. PMC 5894354. PMID 28818938.
- ^ Rosenberg, Alexander B.; Roco, Charles M.; Muscat, Richard A.; Kuchina, Anna; Sample, Paul; Yao, Zizhen; Graybuck, Lucas T.; Peeler, David J.; Mukherjee, Sumit; Chen, Wei; Pun, Suzie H.; Sellers, Drew L.; Tasic, Bosiljka; Seelig, Georg (2018-04-13). "Single-cell profiling of the developing mouse brain and spinal cord with split-pool barcoding". Science. 360 (6385): 176–182. Bibcode:2018Sci...360..176R. doi:10.1126/science.aam8999. ISSN 1095-9203. PMC 7643870. PMID 29545511.
- ^ Gaisser, Karl D.; Skloss, Sophie N.; Brettner, Leandra M.; Paleologu, Luana; Roco, Charles M.; Rosenberg, Alexander B.; Hirano, Matthew; DePaolo, R. William; Seelig, Georg; Kuchina, Anna (October 2024). "High-throughput single-cell transcriptomics of bacteria using combinatorial barcoding". Nature Protocols. 19 (10): 3048–3084. doi:10.1038/s41596-024-01007-w. ISSN 1750-2799. PMC 11575931. PMID 38886529.
- ^ Radonić, Aleksandar; Thulke, Stefanie; Mackay, Ian M.; Landt, Olfert; Siegert, Wolfgang; Nitsche, Andreas (23 January 2004). "Guideline to reference gene selection for quantitative real-time PCR". Biochemical and Biophysical Research Communications. 313 (4): 856–862. Bibcode:2004BBRC..313..856R. doi:10.1016/j.bbrc.2003.11.177. ISSN 0006-291X. PMID 14706621.
- ^ Wildsmith, S. E.; Archer, G. E.; Winkley, A. J.; Lane, P. W.; Bugelski, P. J. (1 January 2001). "Maximization of signal derived from cDNA microarrays". BioTechniques. 30 (1): 202–206, 208. doi:10.2144/01301dd04. ISSN 0736-6205. PMID 11196312.
- ^ Wang, Zhong; Gerstein, Mark; Snyder, Michael (January 2009). "RNA-Seq: a revolutionary tool for transcriptomics". Nature Reviews. Genetics. 10 (1): 57–63. doi:10.1038/nrg2484. ISSN 1471-0056. PMC 2949280. PMID 19015660.
- ^ Klein AM, Mazutis L, Akartuna I, Tallapragada N, Veres A, Li V, Peshkin L, Weitz DA, Kirschner MW (May 2015). "Droplet barcoding for single-cell transcriptomics applied to embryonic stem cells". Cell. 161 (5): 1187–1201. doi:10.1016/j.cell.2015.04.044. PMC 4441768. PMID 26000487.
- ^ Macosko EZ, Basu A, Satija R, Nemesh J, Shekhar K, Goldman M, Tirosh I, Bialas AR, Kamitaki N, Martersteck EM, Trombetta JJ, Weitz DA, Sanes JR, Shalek AK, Regev A, McCarroll SA (May 2015). "Highly Parallel Genome-wide Expression Profiling of Individual Cells Using Nanoliter Droplets". Cell. 161 (5): 1202–1214. doi:10.1016/j.cell.2015.05.002. PMC 4481139. PMID 26000488.
- ^ Jiang, Lichun; Schlesinger, Felix; Davis, Carrie A.; Zhang, Yu; Li, Renhua; Salit, Marc; Gingeras, Thomas R.; Oliver, Brian (September 2011). "Synthetic spike-in standards for RNA-seq experiments". Genome Research. 21 (9): 1543–1551. doi:10.1101/gr.121095.111. ISSN 1088-9051. PMC 3166838. PMID 21816910.
- ^ a b Islam, Saiful; Zeisel, Amit; Joost, Simon; La Manno, Gioele; Zajac, Pawel; Kasper, Maria; Lönnerberg, Peter; Linnarsson, Sten (1 February 2014). "Quantitative single-cell RNA-seq with unique molecular identifiers". Nature Methods. 11 (2): 163–166. doi:10.1038/nmeth.2772. ISSN 1548-7091. PMID 24363023. S2CID 6765530.
- ^ Islam S, Zeisel A, Joost S, La Manno G, Zajac P, Kasper M, Lönnerberg P, Linnarsson S (February 2014). "Quantitative single-cell RNA-seq with unique molecular identifiers". Nature Methods. 11 (2): 163–6. doi:10.1038/nmeth.2772. PMID 24363023. S2CID 6765530.
- ^ Raj B, Wagner DE, McKenna A, Pandey S, Klein AM, Shendure J, Gagnon JA, Schier AF (June 2018). "Simultaneous single-cell profiling of lineages and cell types in the vertebrate brain". Nature Biotechnology. 36 (5): 442–450. doi:10.1038/nbt.4103. PMC 5938111. PMID 29608178.
- ^ Olmos D, Arkenau HT, Ang JE, Ledaki I, Attard G, Carden CP, Reid AH, A'Hern R, Fong PC, Oomen NB, Molife R, Dearnaley D, Parker C, Terstappen LW, de Bono JS (January 2009). "Circulating tumour cell (CTC) counts as intermediate end points in castration-resistant prostate cancer (CRPC): a single-centre experience". Annals of Oncology. 20 (1): 27–33. doi:10.1093/annonc/mdn544. PMID 18695026.
- ^ Levitin HM, Yuan J, Sims PA (April 2018). "Single-Cell Transcriptomic Analysis of Tumor Heterogeneity". Trends in Cancer. 4 (4): 264–268. doi:10.1016/j.trecan.2018.02.003. PMC 5993208. PMID 29606308.
- ^ Jerby-Arnon L, Shah P, Cuoco MS, Rodman C, Su MJ, Melms JC, Leeson R, Kanodia A, Mei S, Lin JR, Wang S, Rabasha B, Liu D, Zhang G, Margolais C, Ashenberg O, Ott PA, Buchbinder EI, Haq R, Hodi FS, Boland GM, Sullivan RJ, Frederick DT, Miao B, Moll T, Flaherty KT, Herlyn M, Jenkins RW, Thummalapalli R, Kowalczyk MS, Cañadas I, Schilling B, Cartwright AN, Luoma AM, Malu S, Hwu P, Bernatchez C, Forget MA, Barbie DA, Shalek AK, Tirosh I, Sorger PK, Wucherpfennig K, Van Allen EM, Schadendorf D, Johnson BE, Rotem A, Rozenblatt-Rosen O, Garraway LA, Yoon CH, Izar B, Regev A (November 2018). "A Cancer Cell Program Promotes T Cell Exclusion and Resistance to Checkpoint Blockade". Cell. 175 (4): 984–997.e24. doi:10.1016/j.cell.2018.09.006. PMC 6410377. PMID 30388455.
- ^ Stephenson W, Donlin LT, Butler A, Rozo C, Bracken B, Rashidfarrokhi A, Goodman SM, Ivashkiv LB, Bykerk VP, Orange DE, Darnell RB, Swerdlow HP, Satija R (February 2018). "Single-cell RNA-seq of rheumatoid arthritis synovial tissue using low-cost microfluidic instrumentation". Nature Communications. 9 (1) 791. Bibcode:2018NatCo...9..791S. doi:10.1038/s41467-017-02659-x. PMC 5824814. PMID 29476078.
- ^ Avraham R, Haseley N, Brown D, Penaranda C, Jijon HB, Trombetta JJ, Satija R, Shalek AK, Xavier RJ, Regev A, Hung DT (September 2015). "Pathogen Cell-to-Cell Variability Drives Heterogeneity in Host Immune Responses". Cell. 162 (6): 1309–21. doi:10.1016/j.cell.2015.08.027. PMC 4578813. PMID 26343579.
- ^ Xiang R, Wang J, Chen Z, Tao J, Peng Q, Ding R, Zhou T, Tu Z, Wang S, Yang T, Chen J, Jia Z, Li X, Zhang X, Chen S, Cheng N, Zhao M, Li J, Xue Q, Zhang H, Jiang C, Xing N, Ouyang K, Pekny A, Michalowska MM, de Pablo Y, Wilhelmsson U, Mitsios N, Liu C, Xu X, Fan X, Pekna M, Pekny M, Chen X, Liu L, Mulder J, Wang M, Wang J (July 2025). "Spatiotemporal transcriptomic maps of mouse intracerebral hemorrhage at single-cell resolution". Neuron. 113 (13): 2102-2122.e7. doi:10.1016/j.neuron.2025.04.026. PMID 40412375.
- ^ Fromm, Amir; Hevroni, Gur; Vincent, Flora; Schatz, Daniella; Martinez-Gutierrez, Carolina A.; Aylward, Frank O.; Vardi, Assaf (2024). "Single-cell RNA-seq of the rare virosphere reveals the native hosts of giant viruses in the marine environment". Nature Microbiology. 9 (6): 1619–1629. doi:10.1038/s41564-024-01669-y. PMC 1265207. PMID 38605173.
- ^ Fromm, Amir; Shaler, Talia S.; Aylward, Frank O.; Vardi, Assaf (2025). "A single-cell perspective on host–virus dynamics in the ocean". Trends in Microbiology. doi:10.1016/j.tim.2025.05.005.
- ^ Tang F, Barbacioru C, Wang Y, Nordman E, Lee C, Xu N, Wang X, Bodeau J, Tuch BB, Siddiqui A, Lao K, Surani MA (May 2009). "mRNA-Seq whole-transcriptome analysis of a single cell". Nature Methods. 6 (5): 377–82. doi:10.1038/NMETH.1315. PMID 19349980. S2CID 16570747.
- ^ Islam S, Kjällquist U, Moliner A, Zajac P, Fan JB, Lönnerberg P, Linnarsson S (July 2011). "Characterization of the single-cell transcriptional landscape by highly multiplex RNA-seq". Genome Research. 21 (7): 1160–7. doi:10.1101/gr.110882.110. PMC 3129258. PMID 21543516.
- ^ Ramsköld D, Luo S, Wang YC, Li R, Deng Q, Faridani OR, Daniels GA, Khrebtukova I, Loring JF, Laurent LC, Schroth GP, Sandberg R (August 2012). "Full-length mRNA-Seq from single-cell levels of RNA and individual circulating tumor cells". Nature Biotechnology. 30 (8): 777–82. doi:10.1038/nbt.2282. PMC 3467340. PMID 22820318.
- ^ Hashimshony T, Wagner F, Sher N, Yanai I (September 2012). "CEL-Seq: single-cell RNA-Seq by multiplexed linear amplification". Cell Reports. 2 (3): 666–73. doi:10.1016/j.celrep.2012.08.003. PMID 22939981.
- ^ Singh M, Al-Eryani G, Carswell S, Ferguson JM, Blackburn J, Barton K, Roden D, Luciani F, Phan T, Junankar S, Jackson K, Goodnow CC, Smith MA, Swarbrick A (2018). "High-throughput targeted long-read single cell sequencing reveals the clonal and transcriptional landscape of lymphocytes". bioRxiv. 10 (1): 3120. doi:10.1101/424945. PMC 6635368. PMID 31311926.
- ^ Sasagawa Y, Nikaido I, Hayashi T, Danno H, Uno KD, Imai T, Ueda HR (April 2013). "Quartz-Seq: a highly reproducible and sensitive single-cell RNA sequencing method, reveals non-genetic gene-expression heterogeneity". Genome Biology. 14 (4) 3097: R31. doi:10.1186/gb-2013-14-4-r31. PMC 4054835. PMID 23594475.
- ^ Kouno T, Moody J, Kwon AT, Shibayama Y, Kato S, Huang Y, Böttcher M, Motakis E, Mendez M, Severin J, Luginbühl J, Abugessaisa I, Hasegawa A, Takizawa S, Arakawa T, Furuno M, Ramalingam N, West J, Suzuki H, Kasukawa T, Lassmann T, Hon CC, Arner E, Carninci P, Plessy C, Shin JW (January 2019). "C1 CAGE detects transcription start sites and enhancer activity at single-cell resolution". Nature Communications. 10 (1) 360. Bibcode:2019NatCo..10..360K. doi:10.1038/s41467-018-08126-5. PMC 6341120. PMID 30664627.
- ^ Dal Molin A, Di Camillo B (2019). "How to design a single-cell RNA-sequencing experiment: pitfalls, challenges and perspectives". Briefings in Bioinformatics. 20 (4): 1384–1394. doi:10.1093/bib/bby007. PMID 29394315.
- ^ Peterson VM, Zhang KX, Kumar N, Wong J, Li L, Wilson DC, Moore R, McClanahan TK, Sadekova S, Klappenbach JA (October 2017). "Multiplexed quantification of proteins and transcripts in single cells". Nature Biotechnology. 35 (10): 936–939. doi:10.1038/nbt.3973. PMID 28854175. S2CID 205285357.
- ^ Stoeckius M, Hafemeister C, Stephenson W, Houck-Loomis B, Chattopadhyay PK, Swerdlow H, Satija R, Smibert P (September 2017). "Simultaneous epitope and transcriptome measurement in single cells". Nature Methods. 14 (9): 865–868. doi:10.1038/nmeth.4380. PMC 5669064. PMID 28759029.
- ^ Cao J, Packer JS, Ramani V, Cusanovich DA, Huynh C, Daza R, Qiu X, Lee C, Furlan SN, Steemers FJ, Adey A, Waterston RH, Trapnell C, Shendure J (August 2017). "Comprehensive single-cell transcriptional profiling of a multicellular organism". Science. 357 (6352): 661–667. Bibcode:2017Sci...357..661C. doi:10.1126/science.aam8940. PMC 5894354. PMID 28818938.
- ^ Plass M, Solana J, Wolf FA, Ayoub S, Misios A, Glažar P, Obermayer B, Theis FJ, Kocks C, Rajewsky N (May 2018). "Cell type atlas and lineage tree of a whole complex animal by single-cell transcriptomics". Science. 360 (6391) eaaq1723. doi:10.1126/science.aaq1723. PMID 29674432.
- ^ Fincher CT, Wurtzel O, de Hoog T, Kravarik KM, Reddien PW (May 2018). "Schmidtea mediterranea". Science. 360 (6391) eaaq1736. doi:10.1126/science.aaq1736. PMC 6563842. PMID 29674431.
- ^ Wagner DE, Weinreb C, Collins ZM, Briggs JA, Megason SG, Klein AM (June 2018). "Single-cell mapping of gene expression landscapes and lineage in the zebrafish embryo". Science. 360 (6392): 981–987. Bibcode:2018Sci...360..981W. doi:10.1126/science.aar4362. PMC 6083445. PMID 29700229.
- ^ Farrell JA, Wang Y, Riesenfeld SJ, Shekhar K, Regev A, Schier AF (June 2018). "Single-cell reconstruction of developmental trajectories during zebrafish embryogenesis". Science. 360 (6392) eaar3131. doi:10.1126/science.aar3131. PMC 6247916. PMID 29700225.
- ^ Briggs JA, Weinreb C, Wagner DE, Megason S, Peshkin L, Kirschner MW, Klein AM (June 2018). "The dynamics of gene expression in vertebrate embryogenesis at single-cell resolution". Science. 360 (6392) eaar5780. doi:10.1126/science.aar5780. PMC 6038144. PMID 29700227.
- ^ Griffith M, Walker JR, Spies NC, Ainscough BJ, Griffith OL (August 2015). "Informatics for RNA Sequencing: A Web Resource for Analysis on the Cloud". PLOS Computational Biology. 11 (8) e1004393. Bibcode:2015PLSCB..11E4393G. doi:10.1371/journal.pcbi.1004393. PMC 4527835. PMID 26248053.
- ^ You J. "Science's 2018 Breakthrough of the Year: tracking development cell by cell". Science Magazine. American Association for the Advancement of Science.
- ^ Kharchenko, Peter V.; Silberstein, Lev; Scadden, David T. (1 July 2014). "Bayesian approach to single-cell differential expression analysis". Nature Methods. 11 (7): 740–742. doi:10.1038/nmeth.2967. ISSN 1548-7091. PMC 4112276. PMID 24836921.
- ^ Svensson, Valentine; Natarajan, Kedar Nath; Ly, Lam-Ha; Miragaia, Ricardo J.; Labalette, Charlotte; Macaulay, Iain C.; Cvejic, Ana; Teichmann, Sarah A. (6 March 2017). "Power analysis of single-cell RNA-sequencing experiments". Nature Methods. advance online publication (4): 381–387. doi:10.1038/nmeth.4220. ISSN 1548-7105. PMC 5376499. PMID 28263961.
- ^ Eberwine J, Sul JY, Bartfai T, Kim J (January 2014). "The promise of single-cell sequencing". Nature Methods. 11 (1): 25–7. doi:10.1038/nmeth.2769. PMID 24524134. S2CID 11575439.
- ^ Shapiro E, Biezuner T, Linnarsson S (September 2013). "Single-cell sequencing-based technologies will revolutionize whole-organism science". Nature Reviews. Genetics. 14 (9): 618–30. doi:10.1038/nrg3542. PMID 23897237. S2CID 500845.
- ^ Hebenstreit D (November 2012). "Methods, Challenges and Potentials of Single Cell RNA-seq". Biology. 1 (3): 658–67. doi:10.3390/biology1030658. PMC 4009822. PMID 24832513.
- ^ Ament, Seth A.; Poulopoulos, Alexandros (2023). "The brain's dark transcriptome: Sequencing RNA in distal compartments of neurons and glia". Current Opinion in Neurobiology. 81 102725. doi:10.1016/j.conb.2023.102725. PMC 10524153. PMID 37196598.
- ^ Stegle, Oliver; Teichmann, Sarah A.; Marioni, John C. (1 March 2015). "Computational and analytical challenges in single-cell transcriptomics". Nature Reviews Genetics. 16 (3): 133–145. doi:10.1038/nrg3833. ISSN 1471-0056. PMID 25628217. S2CID 205486032.
- ^ Buettner, Florian; Natarajan, Kedar N.; Casale, F. Paolo; Proserpio, Valentina; Scialdone, Antonio; Theis, Fabian J.; Teichmann, Sarah A.; Marioni, John C.; Stegle, Oliver (1 February 2015). "Computational analysis of cell-to-cell heterogeneity in single-cell RNA-sequencing data reveals hidden subpopulations of cells". Nature Biotechnology. 33 (2): 155–160. doi:10.1038/nbt.3102. ISSN 1087-0156. PMID 25599176.
- ^ Ntranos, Vasilis; Kamath, Govinda M.; Zhang, Jesse M.; Pachter, Lior; Tse, David N. (26 May 2016). "Fast and accurate single-cell RNA-seq analysis by clustering of transcript-compatibility counts". Genome Biology. 17 (1): 112. doi:10.1186/s13059-016-0970-8. ISSN 1474-7596. PMC 4881296. PMID 27230763.
- ^ Pierson, Emma; Yau, Christopher (1 January 2015). "ZIFA: Dimensionality reduction for zero-inflated single-cell gene expression analysis". Genome Biology. 16 241. doi:10.1186/s13059-015-0805-z. ISSN 1474-760X. PMC 4630968. PMID 26527291.
- ^ Treutlein, Barbara; Brownfield, Doug G.; Wu, Angela R.; Neff, Norma F.; Mantalas, Gary L.; Espinoza, F. Hernan; Desai, Tushar J.; Krasnow, Mark A.; Quake, Stephen R. (15 May 2014). "Reconstructing lineage hierarchies of the distal lung epithelium using single-cell RNA-seq". Nature. 509 (7500): 371–375. Bibcode:2014Natur.509..371T. doi:10.1038/nature13173. PMC 4145853. PMID 24739965.
- ^ Korthauer, Keegan D.; Chu, Li-Fang; Newton, Michael A.; Li, Yuan; Thomson, James; Stewart, Ron; Kendziorski, Christina (1 January 2016). "A statistical approach for identifying differential distributions in single-cell RNA-seq experiments". Genome Biology. 17 (1): 222. doi:10.1186/s13059-016-1077-y. ISSN 1474-760X. PMC 5080738. PMID 27782827.
- ^ a b Haghverdi, Laleh; Büttner, Maren; Wolf, F. Alexander; Buettner, Florian; Theis, Fabian J. (1 October 2016). "Diffusion pseudotime robustly reconstructs lineage branching" (PDF). Nature Methods. 13 (10): 845–848. doi:10.1038/nmeth.3971. ISSN 1548-7091. PMID 27571553. S2CID 3594049.
- ^ Saelens, Wouter; Cannoodt, Robrecht; Todorov, Helena; Saeys, Yvan (2018-03-05). "A comparison of single-cell trajectory inference methods: towards more accurate and robust tools". bioRxiv 276907. doi:10.1101/276907. Retrieved 2018-03-12.
- ^ Trapnell, Cole; Cacchiarelli, Davide; Grimsby, Jonna; Pokharel, Prapti; Li, Shuqiang; Morse, Michael; Lennon, Niall J.; Livak, Kenneth J.; Mikkelsen, Tarjei S.; Rinn, John L. (23 March 2017). "Pseudo-temporal ordering of individual cells reveals dynamics and regulators of cell fate decisions". Nature Biotechnology. 32 (4): 381–386. doi:10.1038/nbt.2859. ISSN 1087-0156. PMC 4122333. PMID 24658644.
- ^ a b Pandey, Kushagra; Zafar, Hamim (2022). "Inference of cell state transitions and cell fate plasticity from single-cell with MARGARET". Nucleic Acids Research. 50 (15): e86. doi:10.1093/nar/gkac412. ISSN 0305-1048. PMC 9410915. PMID 35639499.
- ^ Wei, J.; Hu, X.; Zou, X.; Tian, T. (1 December 2016). "Inference of genetic regulatory network for stem cell using single cells expression data". 2016 IEEE International Conference on Bioinformatics and Biomedicine (BIBM). pp. 217–222. doi:10.1109/BIBM.2016.7822521. ISBN 978-1-5090-1611-2. S2CID 27737735.
- ^ Moignard, Victoria; Macaulay, Iain C.; Swiers, Gemma; Buettner, Florian; Schütte, Judith; Calero-Nieto, Fernando J.; Kinston, Sarah; Joshi, Anagha; Hannah, Rebecca; Theis, Fabian J.; Jacobsen, Sten Eirik; de Bruijn, Marella F.; Göttgens, Berthold (1 April 2013). "Characterization of transcriptional networks in blood stem and progenitor cells using high-throughput single-cell gene expression analysis". Nature Cell Biology. 15 (4): 363–372. doi:10.1038/ncb2709. ISSN 1465-7392. PMC 3796878. PMID 23524953.
- ^ Hicks, Stephanie C; Townes, William F; Teng, Mingxiang; Irizarry, Rafael A (6 November 2017). "Missing data and technical variability in single-cell RNA-sequencing experiments". Biostatistics. 19 (4): 562–578. doi:10.1093/biostatistics/kxx053. PMC 6215955. PMID 29121214.
- ^ Haghverdi, Laleh; Lun, Aaron T L; Morgan, Michael D; Marioni, John C (2 April 2018). "Batch effects in single-cell RNA-sequencing data are corrected by matching mutual nearest neighbors". Nature Biotechnology. 36 (5): 421–427. doi:10.1038/nbt.4091. PMC 6152897. PMID 29608177.
- ^ Butler, Andrew; Hoffman, Paul; Smibert, Peter; Papalexi, Efthymia; Satija, Rahul (2 April 2018). "Integrating single-cell transcriptomic data across different conditions, technologies, and species". Nature Biotechnology. 36 (5): 421–427. doi:10.1038/nbt.4096. PMC 6700744. PMID 29608179.
- ^ Stuart, Tim; Butler, Andrew; Hoffman, Paul; Hafemeister, Christoph; Papalexia, Efthymia; Mauck, William M III; Hao, Yuhan; Marlon, Stoeckius; Smibert, Peter; Satija, Rahul (6 June 2019). "Comprehensive Integration of Single-Cell Data". Cell. 177 (7): 1888–1902. doi:10.1016/j.cell.2019.05.031. PMC 6687398. PMID 31178118.
- ^ Shree, Ajita; Pavan, Musale Krushna; Zafar, Hamim (27 November 2023). "scDREAMER for atlas-level integration of single-cell datasets using deep generative model paired with adversarial classifier". Nature Communications. 14 (1): 7781. Bibcode:2023NatCo..14.7781S. doi:10.1038/s41467-023-43590-8. PMC 10682386. PMID 38012145.
External links
- Dissecting Tumor Heterogeneity with Single-Cell Transcriptomics
- The ultimate single-cell RNA sequencing guide by single-cell RNA sequencing service provider Single Cell Discoveries.
Single-cell transcriptomics
Background
Overview and importance
Single-cell transcriptomics encompasses a suite of technologies designed to profile the transcriptome, the full repertoire of RNA molecules including messenger RNAs (mRNAs) and non-coding RNAs, within individual cells, thereby enabling the measurement of gene expression at unprecedented resolution.[5] This approach captures the dynamic and heterogeneous nature of cellular states, revealing variations in gene activity that define cell types, developmental trajectories, and responses to environmental cues.[2] By focusing on single cells rather than populations, it addresses fundamental limitations of traditional methods, providing insights into biological processes at the granular level essential for understanding complex systems.[6]

The importance of single-cell transcriptomics lies in its ability to uncover cellular heterogeneity that is obscured in bulk RNA sequencing, where gene expression signals are averaged across thousands of cells, masking rare subpopulations comprising less than 5% of a tissue.[7] This resolution shift has revolutionized the study of dynamic processes, such as cell differentiation and lineage tracing, by identifying transient states and rare cell types that drive tissue function and disease progression.[2] For instance, in oncology, it elucidates intratumor heterogeneity, highlighting diverse malignant subclones that contribute to therapy resistance and tumor evolution.[8] In neuroscience, single-cell transcriptomics has illuminated the vast neuronal diversity in the brain, cataloging thousands of distinct subtypes based on unique transcriptional signatures that underpin circuit function and vulnerability to disorders.[9]

Overall, these capabilities position single-cell transcriptomics as a cornerstone of precision medicine, facilitating personalized diagnostics and targeted interventions by linking molecular profiles to individual cellular behaviors in health and disease.[10]

Historical development
The development of single-cell transcriptomics originated in the early 1990s with pioneering efforts to quantify gene expression in individual cells using techniques like quantitative PCR (qPCR) and mRNA amplification. A foundational advance came in 1992 when Eberwine et al. demonstrated the amplification of mRNA from single live neurons via microinjection of primers, nucleotides, and enzymes into acutely dissociated rat hippocampal cells, allowing detection of specific transcripts in defined neuronal populations.[11] This approach addressed the challenges of low RNA abundance in single cells and set the stage for more comprehensive profiling.[12]

A critical innovation emerged with the Switching Mechanism at the 5' end of the RNA Template (SMART), introduced in 2001, which exploited the template-switching activity of certain reverse transcriptases to generate full-length cDNA from minimal RNA inputs without fragmentation or tailing.[13] This method enabled efficient amplification while preserving transcript integrity, becoming integral to subsequent single-cell protocols. Building on this, the Smart-seq protocol was adapted for single cells in 2012, supporting plate-based full-length mRNA sequencing from individual circulating tumor cells or limited samples, though still constrained to low throughput (typically 1-100 cells per experiment).[14]

Sequencing-based profiling had already arrived in 2009 with the first single-cell RNA sequencing (scRNA-seq) experiment by Tang et al., who developed an mRNA-seq assay to profile the whole transcriptome of individual mouse oocytes and blastomeres, detecting 5,270 (75%) more genes than contemporary microarrays and uncovering novel splice junctions and transcript isoforms.[15] This proof-of-concept shifted focus from targeted qPCR to unbiased genome-wide analysis, revealing maternal mRNA contributions to early embryonic development.[16]

Scalability surged in the mid-2010s through droplet-based microfluidics, enabling high-throughput profiling.
In 2015, Macosko et al. described Drop-seq, a method that encapsulates single cells with barcoded mRNA-capture beads in nanoliter droplets, facilitating simultaneous RNA-seq of thousands of mouse retinal cells and identifying rare neuronal subtypes. That same year, Klein et al. introduced inDrop, using releasable hydrogel barcodes in droplets to index transcripts from embryonic stem cells, achieving comparable throughput while minimizing biases in gene detection. These innovations marked a departure from labor-intensive plate-based systems to automated, scalable platforms processing >1,000 cells per run. Commercialization in 2016 via the 10x Genomics Chromium system democratized access, integrating droplet encapsulation with gel-bead emulsions to routinely capture and barcode up to 10,000 cells per sample, accelerating adoption in diverse biological contexts.[17]

CRISPR-Cas9 genome editing, recognized by the 2020 Nobel Prize in Chemistry, further propelled the field by enabling genome-wide perturbation screens paired with scRNA-seq, illuminating gene function at single-cell resolution.[18] By the 2020s, throughput evolved to exceed 1 million cells per experiment, with recent integrations to spatial transcriptomics (such as sequencing-free whole-genome profiling of 23,000 human genes in single cells and tissue sections) enhancing spatiotemporal resolution of cellular heterogeneity.[19] This progression has fundamentally expanded the ability to dissect complex tissues into their molecular constituents.

Experimental methods
Cell isolation and preparation
Cell isolation and preparation represent the foundational steps in single-cell transcriptomics, where tissues or cell suspensions are processed to yield viable individual cells or nuclei suitable for downstream RNA capture and sequencing. This process aims to preserve cellular integrity and transcriptional states while minimizing artifacts such as stress-induced gene expression changes or loss of fragile cell types. Effective isolation ensures high-quality input, typically targeting yields of 10,000 to 1,000,000 cells per sample to support comprehensive profiling across diverse populations.[10][20]

Mechanical dissociation techniques, such as pipetting or grinding, are commonly employed for non-adherent cells or soft tissues to gently separate cells without chemical interference, though they may result in lower yields and higher debris compared to enzymatic methods. Enzymatic dissociation, using proteases like trypsin for epithelial cells or collagenase for stromal components, is widely adopted for adherent tissues, but requires optimization to avoid prolonged exposure that induces stress responses, such as upregulation of heat shock genes. For instance, dissociation at 4°C minimizes these artifacts by reducing enzymatic activity and preserving RNA integrity. Tissue-specific protocols are essential; in brain tissue, thin slicing followed by mild enzymatic treatment helps isolate neurons while mitigating dissociation-induced signatures in glia, where aggressive methods can artifactually elevate inflammatory gene expression.[21][22]

Fluorescence-activated cell sorting (FACS), a flow cytometry technique, enables marker-based sorting of specific cell populations prior to transcriptomics, improving purity and reducing heterogeneity, particularly for rare subtypes like immune cells in complex tissues.
Microfluidic approaches, such as droplet encapsulation in platforms like 10x Genomics, integrate isolation with barcoding, allowing high-throughput processing but necessitating uniform cell suspensions to avoid encapsulation biases. Post-isolation, viability is assessed using trypan blue exclusion, with protocols recommending >80% viable cells to ensure reliable RNA recovery, as dead cells compromise library quality.[10][23]

For archival or frozen samples, cryopreservation with 10% DMSO in fetal bovine serum maintains cell viability for months, enabling retrospective studies, though thawing must be rapid to limit RNA degradation. Fixation methods, such as methanol for nuclei, preserve samples for delayed processing without significantly altering transcriptomic profiles, making them suitable for droplet-based workflows. Single-nucleus RNA sequencing (snRNA-seq) circumvents issues with fragile intact cells by isolating nuclei via homogenization, which is particularly advantageous for frozen or fibrous tissues like brain or muscle.[24][25][26]

Key challenges include doublet formation, where multiple cells are captured as one, inflating heterogeneity and requiring computational detection downstream; this risk increases with higher cell loads in droplet systems. Dissociation biases further complicate representation, as epithelial cells are more susceptible to lysis than robust fibroblasts, leading to underrepresentation of fragile types in suspensions. Recent advances like laser capture microdissection (LCM) address spatial precision, enabling isolation of targeted cells from tissue sections for transcriptomics, with 2024 protocols enhancing RNA yield from fixed samples via optimized lysis buffers. Yield optimization focuses on balancing recovery with quality, often achieving 10^4-10^6 cells through iterative protocol refinement to support low-input RNA amplification needs.[22][27]

RNA capture, amplification, and library preparation
In single-cell transcriptomics, RNA capture begins immediately after cell lysis to preserve the transcriptome snapshot. It typically targets messenger RNA (mRNA), which is scarce: mRNA makes up roughly 1–5% of total cellular RNA (approximately 0.1–1.5 pg per mammalian cell).[28] The predominant method employs poly-A selection, where oligo-dT primers bound to magnetic beads or surfaces hybridize to the poly-A tails of mRNA molecules, enabling efficient isolation from the total RNA pool while excluding non-coding RNAs like ribosomal RNA unless specifically targeted.[5] For broader transcriptome coverage, including non-polyadenylated RNAs, alternative strategies such as ribosomal RNA depletion or total RNA capture via random priming are used, though these increase complexity and potential off-target capture.[29] To facilitate high-throughput multiplexing, unique molecular identifiers (UMIs) and cell-specific barcodes are incorporated during capture; these short DNA sequences tag individual transcripts and cells, allowing demultiplexing post-sequencing and mitigating amplification artifacts.[30]

Following capture, reverse transcription converts RNA to complementary DNA (cDNA) using reverse transcriptase enzymes, often with template-switching mechanisms to add universal priming sites for subsequent amplification. Amplification is essential due to the sparse starting material, employing whole-transcriptome amplification (WTA) via polymerase chain reaction (PCR) or in vitro transcription (IVT) to generate sufficient material for sequencing.[5] Plate-based protocols like SMART-seq utilize template-switching oligo (TSO) technology to achieve full-length cDNA coverage, enabling isoform detection but at the cost of higher per-cell expense and lower throughput.
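
The role of UMIs and cell barcodes described above can be illustrated with a minimal Python sketch. It assumes each sequenced read has already been reduced to a (cell barcode, UMI, gene) triple, and it collapses reads by exact UMI match only; real pipelines also correct sequencing errors in barcodes and UMIs, which this sketch does not attempt. The function name `count_umis` and the example barcodes are hypothetical and chosen purely for illustration.

```python
from collections import defaultdict


def count_umis(reads):
    """Count unique molecules per (cell barcode, gene).

    `reads` is an iterable of (cell_barcode, umi, gene) tuples.
    Reads sharing the same cell barcode, gene, and UMI are treated as
    PCR duplicates of one original molecule and counted once.
    """
    molecules = defaultdict(set)
    for cell_barcode, umi, gene in reads:
        molecules[(cell_barcode, gene)].add(umi)
    return {key: len(umis) for key, umis in molecules.items()}


# Hypothetical reads: the first two are PCR duplicates of one molecule.
reads = [
    ("AAAC", "TTGCA", "Actb"),
    ("AAAC", "TTGCA", "Actb"),
    ("AAAC", "GGATC", "Actb"),
    ("AAAC", "CCTAG", "Gapdh"),
    ("CGTA", "TTGCA", "Actb"),
]

counts = count_umis(reads)
for (cell_barcode, gene), n in sorted(counts.items()):
    print(cell_barcode, gene, n)
```

Five reads collapse to four molecules here, because the duplicated read in cell `AAAC` contributes only once to the `Actb` count. The same UMI appearing in a different cell (`CGTA`) is counted separately, since the cell barcode is part of the key.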
In contrast to plate-based protocols, droplet-based methods, such as Drop-seq and the 10x Genomics Chromium system, focus on 3'-end capture with UMIs to reduce PCR duplicates and bias, supporting thousands of cells per run through emulsion-based barcoding in gel-beads-in-emulsion (GEMs).[30] However, these approaches introduce 3' bias and dropouts, where longer genes or lowly expressed transcripts are underrepresented due to incomplete reverse transcription and fragmentation inefficiencies.[29]

Library preparation transforms amplified cDNA into a sequencing-ready format, involving fragmentation to generate insert sizes compatible with short-read platforms, followed by end-repair, A-tailing, and adapter ligation for Illumina-compatible indexing.[5] For efficiency, tagmentation enzymes like those in the Nextera system simultaneously fragment and append adapters, streamlining the process and reducing hands-on time in high-throughput workflows.[10] In the 10x Genomics protocol, post-amplification libraries are purified and quantified before pooling, with UMIs enabling absolute quantification by counting unique transcript molecules rather than PCR duplicates.[30] Recent enhancements include UMI-optimized kits from 10x Genomics that achieve capture efficiencies of approximately 30% of input mRNA, reducing dropout rates through advanced barcoding and enzymatic formulations and enhancing quantification accuracy in diverse cell types.[31]

Sequencing platforms and protocols
Single-cell transcriptomics relies primarily on high-throughput sequencing platforms to generate the vast amounts of data required for profiling gene expression at cellular resolution. The dominant platform is Illumina's short-read sequencing by synthesis (SBS) technology, exemplified by the NovaSeq series, which offers ultra-high throughput of up to 20 billion single reads per dual flow cell run in under two days, enabling the analysis of millions of cells per experiment.[32] This platform's high accuracy, with base-calling error rates below 0.1%, makes it ideal for standard single-cell RNA sequencing (scRNA-seq) workflows.[33]

Emerging long-read technologies, such as Oxford Nanopore Technologies' nanopore sequencing and Pacific Biosciences' (PacBio) single-molecule real-time (SMRT) sequencing, are gaining traction for their ability to capture full-length transcripts and detect isoforms, which short-read methods often fragment. Nanopore sequencing provides real-time, ultra-long reads exceeding 100 kb with moderate accuracy (90-98%), and pilot studies in 2024 demonstrated its application to scRNA-seq for isoform-level resolution in mouse retina cells, yielding over 1.4 billion long reads from 30,000 cells.[34] Similarly, PacBio's HiFi reads (5-30 kb) achieve high accuracy (up to 99.9%) for full-length isoform sequencing in single cells, as shown in methods like MAS-Seq, which integrate barcoding and unique molecular identifiers (UMIs) to profile isoform diversity.[35]

| Platform | Read Length | Throughput | Accuracy | Key scRNA-seq Application | Citation |
|---|---|---|---|---|---|
| Illumina (e.g., NovaSeq) | Short (50-300 bp) | Very high (up to 20B reads/run) | >99.9% | High-throughput gene quantification | [33] |
| Oxford Nanopore | Long (>100 kb) | Moderate-high | 90-98% | Isoform detection in single cells | [34] |
| PacBio (SMRT/HiFi) | Long (5-30 kb) | Moderate | 90-99.9% | Full-length transcript profiling | [35] |
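
The throughput figures quoted in this section can be turned into a rough depth estimate. The sketch below is simple arithmetic, not a planning tool: it assumes every read is assigned to a cell, which overstates usable depth in practice. The target of 20,000 reads per cell is an assumed illustrative value that does not come from the text, and the function names are hypothetical.

```python
def reads_per_cell(total_reads: int, n_cells: int) -> int:
    """Average number of reads per cell, rounded down."""
    if n_cells <= 0:
        raise ValueError("n_cells must be positive")
    return total_reads // n_cells


def cells_supported(total_reads: int, target_reads_per_cell: int) -> int:
    """Number of cells a run can cover at a given per-cell depth."""
    if target_reads_per_cell <= 0:
        raise ValueError("target_reads_per_cell must be positive")
    return total_reads // target_reads_per_cell


# Figures from the text: 1.4 billion long reads from 30,000 cells.
nanopore_depth = reads_per_cell(1_400_000_000, 30_000)

# Figure from the text: up to 20 billion reads per run.
# ASSUMPTION: 20,000 reads per cell is an illustrative target only.
novaseq_cells = cells_supported(20_000_000_000, 20_000)

print(f"Average long reads per cell: {nanopore_depth:,}")
print(f"Cells covered at assumed depth: {novaseq_cells:,}")
```

Under these assumptions, the nanopore pilot corresponds to roughly 46,000 reads per cell, and a 20-billion-read run would cover about one million cells, consistent with the "millions of cells per experiment" scale when shallower depths are used.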
