E-spatial

Beta

New application is live now

E-spatial

Single-cell spatial explorer

Notebooks

Premium

Identifying tumor cells at the single-cell level using machine learning - inferCNV
lock icon

BioTuring

Tumors are complex tissues of cancerous cells surrounded by a heterogeneous cellular microenvironment with which they interact. Single-cell sequencing enables molecular characterization of single cells within the tumor. However, cell annotation—the assignment of cell type or cell state to each sequenced cell—is a challenge, especially identifying tumor cells within single-cell or spatial sequencing experiments. Here, we propose ikarus, a machine learning pipeline aimed at distinguishing tumor cells from normal cells at the single-cell level. We test ikarus on multiple single-cell datasets, showing that it achieves high sensitivity and specificity in multiple experimental contexts. **InferCNV** is a Bayesian method, which agglomerates the expression signal of genomically adjointed genes to ascertain whether there is a gain or loss of a certain larger genomic segment. We have used **inferCNV** to call copy number variations in all samples used in the manuscript.
Only CPU
inferCNV
iBRIDGE: A Data Integration Method to Identify Inflamed Tumors from Single-Cell RNAseq Data and Differentiate Cell Type-Specific Markers of Immune-Cell Infiltration
lock icon

BioTuring

The development of immune checkpoint-based immunotherapies has been a major advancement in the treatment of cancer, with a subset of patients exhibiting durable clinical responses. A predictive biomarker for immunotherapy response is the pre-existing T-cell infiltration in the tumor immune microenvironment (TIME). Bulk transcriptomics-based approaches can quantify the degree of T-cell infiltration using deconvolution methods and identify additional markers of inflamed/cold cancers at the bulk level. However, bulk techniques are unable to identify biomarkers of individual cell types. Although single-cell RNA sequencing (scRNAseq) assays are now being used to profile the TIME, to our knowledge there is no method of identifying patients with a T-cell inflamed TIME from scRNAseq data. Here, we describe a method, iBRIDGE, which integrates reference bulk RNAseq data with the malignant subset of scRNAseq datasets to identify patients with a T-cell inflamed TIME. Utilizing two datasets with matched bulk data, we show iBRIDGE results correlated highly with bulk assessments (0.85 and 0.9 correlation coefficients). Using iBRIDGE, we identified markers of inflamed phenotypes in malignant cells, myeloid cells, and fibroblasts, establishing type I and type II interferon pathways as dominant signals, especially in malignant and myeloid cells, and finding the TGFβ-driven mesenchymal phenotype not only in fibroblasts but also in malignant cells. Besides relative classification, per-patient average iBRIDGE scores and independent RNAScope quantifications were utilized for threshold-based absolute classification. Moreover, iBRIDGE can be applied to in vitro grown cancer cell lines and can identify the cell lines that are adapted from inflamed/cold patient tumors.
Only CPU
iBRIDGE
SCEVAN: Single CEll Variational ANeuploidy analysis
lock icon

BioTuring

In the realm of cancer research, grasping the intricacies of intratumor heterogeneity and its interplay with the immune system is paramount for deciphering treatment resistance and tumor progression. While single-cell RNA sequencing unveils diverse transcriptional programs, the challenge persists in automatically discerning malignant cells from non-malignant ones within complex datasets featuring varying coverage depths. Thus, there arises a compelling need for an automated solution to this classification conundrum. SCEVAN (De Falco et al., 2023), a variational algorithm, is designed to autonomously identify the clonal copy number substructure of tumors using single-cell data. It automatically separates malignant cells from non-malignant ones, and subsequently, groups of malignant cells are examined through an optimization-driven joint segmentation process.
Required GPU
scevan
PopV: the variety of cell-type transfer tools for classify cell-types
lock icon

BioTuring

PopV uses popular vote of a variety of cell-type transfer tools to classify cell-types in a query dataset based on a test dataset. Using this variety of algorithms, they compute the agreement between those algorithms and use this agreement to predict which cell-types have a high likelihood of the same cell-types observed in the reference.
Required GPU

Trends

scGPT: Towards Building a Foundational Model for Single-Cell Multi-omics Using Generative AI

BioTuring

Generative pre-trained models have demonstrated exceptional success in various fields, including natural language processing and computer vision. In line with this progress, scGPT has been developed as a foundational model tailored specifically for the field of single-cell biology. It employs the generative pre-training transformer framework on an extensive dataset comprising more than 33 million cells. scGPT effectively extracts valuable biological insights related to genes and cells and can be fine-tuned to excel in numerous downstream applications.
Required GPU
scgpt
Seurat
Bioalpha-Biocolab: Enabling Large-Scale Data Uploads for BBrowserX single-cell analysis platform

BioTuring

Single-cell data analysis is revolutionizing biological research, but often these dataset sizes can be massive and pose challenges for submission process. Bioalpha-Biocolab addresses this issue by implementing advanced algorithms and leveraging efficient computational resources to overcome these challenges.
Required GPU
AlphaSC
Inference and analysis of cell-cell communication using CellChat

BioTuring

Understanding global communications among cells requires accurate representation of cell-cell signaling links and effective systems-level analyses of those links. We construct a database of interactions among ligands, receptors and their cofactors that accurately represent known heteromeric molecular complexes. We then develop **CellChat**, a tool that is able to quantitatively infer and analyze intercellular communication networks from single-cell RNA-sequencing (scRNA-seq) data. CellChat predicts major signaling inputs and outputs for cells and how those cells and signals coordinate for functions using network analysis and pattern recognition approaches. Through manifold learning and quantitative contrasts, CellChat classifies signaling pathways and delineates conserved and context-specific pathways across different datasets. Applying **CellChat** to mouse and human skin datasets shows its ability to extract complex signaling patterns.
Required GPU
CellChat
BioTuring data converter: Seurat to Scanpy for spatial transcriptomics (Visium)

BioTuring

This notebook illustrates how to convert data from a Seurat V4 object to annotation data using the BioStudio data transformation library (currently under development). It facilitates continued research using libraries that interact with Scanpy in Python. Currently, this version of the BioStudio library only supports converting a Seurat V4 object with a single assay.
CellPhoneDB: inferring cell–cell communication from combined expression of multi-subunit ligand–receptor complexes

BioTuring

Cell–cell communication mediated by ligand–receptor complexes is critical to coordinating diverse biological processes, such as development, differentiation and inflammation. To investigate how the context-dependent crosstalk of different cell types enables physiological processes to proceed, we developed CellPhoneDB, a novel repository of ligands, receptors and their interactions. In contrast to other repositories, our database takes into account the subunit architecture of both ligands and receptors, representing heteromeric complexes accurately. We integrated our resource with a statistical framework that predicts enriched cellular interactions between two cell types from single-cell transcriptomics data. Here, we outline the structure and content of our repository, provide procedures for inferring cell–cell communication networks from single-cell RNA sequencing data and present a practical step-by-step guide to help implement the protocol. CellPhoneDB v.2.0 is an updated version of our resource that incorporates additional functionalities to enable users to introduce new interacting molecules and reduces the time and resources needed to interrogate large datasets. CellPhoneDB v.2.0 is publicly available, both as code and as a user-friendly web interface; it can be used by both experts and researchers with little experience in computational genomics. In our protocol, we demonstrate how to evaluate meaningful biological interactions with CellPhoneDB v.2.0 using published datasets. This protocol typically takes ~2 h to complete, from installation to statistical analysis and visualization, for a dataset of ~10 GB, 10,000 cells and 19 cell types, and using five threads.
Identifying tumor cells at the single-cell level using machine learning - inferCNV

BioTuring

Tumors are complex tissues of cancerous cells surrounded by a heterogeneous cellular microenvironment with which they interact. Single-cell sequencing enables molecular characterization of single cells within the tumor. However, cell annotation—the assignment of cell type or cell state to each sequenced cell—is a challenge, especially identifying tumor cells within single-cell or spatial sequencing experiments. Here, we propose ikarus, a machine learning pipeline aimed at distinguishing tumor cells from normal cells at the single-cell level. We test ikarus on multiple single-cell datasets, showing that it achieves high sensitivity and specificity in multiple experimental contexts. **InferCNV** is a Bayesian method, which agglomerates the expression signal of genomically adjointed genes to ascertain whether there is a gain or loss of a certain larger genomic segment. We have used **inferCNV** to call copy number variations in all samples used in the manuscript.
Only CPU
inferCNV
pySCENIC: Single-Cell rEgulatory Network Inference and Clustering

BioTuring

SCENIC Suite is a set of tools to study and decipher gene regulation. Its core is based on SCENIC (Single-Cell Regulatory Network Inference and Clustering) which enables you to infer transcription factors, gene regulatory networks and cell types from single-cell RNA-seq data. pySCENIC is a lightning-fast python implementation of the SCENIC pipeline (Single-Cell Regulatory Network Inference and Clustering) which enables biologists to infer transcription factors, gene regulatory networks and cell types from single-cell RNA-seq data.
Only CPU
pySCENIC
scVI-tools: single-cell variational inference tools

BioTuring

scVI-tools (single-cell variational inference tools) is a package for end-to-end analysis of single-cell omics data primarily developed and maintained by the Yosef Lab at UC Berkeley. scvi-tools has two components - Interface for easy use of a range of probabilistic models for single-cell omics (e.g., scVI, scANVI, totalVI). - Tools to build new probabilistic models, which are powered by PyTorch, PyTorch Lightning, and Pyro.
Required GPU
scVI
expiMap: Biologically informed deep learning to query gene programs in single-cell atlases

BioTuring

The development of large-scale single-cell atlases has allowed describing cell states in a more detailed manner. Meanwhile, current deep leanring methods enable rapid analysis of newly generated query datasets by mapping them into reference atlases. expiMap (‘explainable programmable mapper’) Lotfollahi, Mohammad, et al. is one of the methods proposed for single-cell reference mapping. Furthermore, it incorporates prior knowledge from gene sets databases or users to analyze query data in the context of known gene programs (GPs).
Required GPU
expiMap
MuSiC: Multi-subject Single-cell Deconvolution

BioTuring

Knowledge of cell type composition in disease relevant tissues is an important step towards the identification of cellular targets of disease. MuSiC is a method that utilizes cell-type specific gene expression from single-cell RNA sequencing (RNA-seq) data to characterize cell type compositions from bulk RNA-seq data in complex tissues. By appropriate weighting of genes showing cross-subject and cross-cell consistency, MuSiC enables the transfer of cell type-specific gene expression information from one dataset to another. MuSiC enables the characterization of cellular heterogeneity of complex tissues for understanding of disease mechanisms. As bulk tissue data are more easily accessible than single-cell RNA-seq, MuSiC allows the utilization of the vast amounts of disease relevant bulk tissue RNA-seq data for elucidating cell type contributions in disease. This notebook provides a walk through tutorial on how to use MuSiC to estimate cell type proportions from bulk sequencing data based on multi-subject single cell data by reproducing the analysis in MuSiC paper, now is published on Nature Communications.
Only CPU
MuSiC
NicheNet: modeling intercellular communication by linking ligands to target genes

BioTuring

Computational methods that model how the gene expression of a cell is influenced by interacting cells are lacking. We present NicheNet, a method that predicts ligand–target links between interacting cells by combining their expression data with prior knowledge of signaling and gene regulatory networks. We applied NicheNet to the tumor and immune cell microenvironment data and demonstrated that NicheNet can infer active ligands and their gene regulatory effects on interacting cells.
Only CPU
nichenetr
BayesPrism: Cell type and gene expression deconvolution for bulk RNA-seq data

BioTuring

Reconstructing cell type compositions and their gene expression from bulk RNA sequencing (RNA-seq) datasets is an ongoing challenge in cancer research. BayesPrism (Chu, T., Wang, Z., Pe’er, D. et al., 2022) is a Bayesian method used to predict cellular composition and gene expression in individual cell types from bulk RNA-seq datasets, with scRNA-seq as references. This notebook illustrates an example workflow for bulk RNA-seq deconvolution using BayesPrism. The notebook content is inspired by BayesPrism's vignette and modified to demonstrate how the tool works on BioTuring's platform.
immunedeconv: an R package for unified access to computational methods for estimating immune cell fractions from bulk RNA sequencing data

BioTuring

Methods like fluorescence activated cell sorting (FACS) or Immunohistochemistry (IHC)-staining have been used as a gold standard to estimate the immune cell content within a sample, however these methods are limited in their scalability and by the availability of good antibodies against the cell type markers. High throughput transcriptomic methods allow to get a transcriptional landscape in the sample with a relatively small amount of material that can be extremely limited in clinical settings (e.g. tumor biopsies), which led to high utility of methods like RNA-seq and microarrays to characterize patient tumor samples. However, RNA-seq does not provide a detailed information on a cellular composition of a sample, which then has to be inferred using computational techniques. Such conceptual differences between the methods can be classified in two categories: - Marker gene-based approaches - Deconvolution-based apporeaches
PyWGCNA: a Python package designed to do Weighted Gene Correlation Network analysis (WGCNA)

BioTuring

PyWGCNA is a Python library designed to do weighted correlation network analysis (WGCNA). It can be used for: - Finding clusters (modules) of highly correlated genes - For summarizing such clusters using the module eigengene - For relating modules to one another and to external sample traits (using eigengene network methodology) - For calculating module membership measures. Users can also compare WGCNA networks from different datasets, or to external gene lists, to assess the conservation or functional enrichment of each module.
Only CPU
PyWGCNA
WGCNA: an R package for Weighted Gene Correlation Network Analysis

BioTuring

WGCNA: an R package for Weighted Gene Correlation Network Analysis Correlation networks are increasingly being used in bioinformatics applications. For example, weighted gene co-expression network analysis is a systems biology method for describing the correlation patterns among genes across microarray samples. Weighted correlation network analysis (WGCNA) can be used for: - Finding clusters (modules) of highly correlated genes - Summarizing such clusters using the module eigengene or an intramodular hub gene - Relating modules to one another and to external sample traits (using eigengene network methodology) - For calculating module membership measures All of these are important for identifying potential candidate genes associated with measured traits as well as identifying genes that are consistently co-expressed and could be contributing to similar molecular pathways. Using WGCNA is also extremely useful statistically as it accounts for inter-individual variation in gene expression and alleviates issues associated with multiple testing.
Only CPU
WGCNA
Notebooks
Required GPU
scgpt
Seurat
Required GPU
AlphaSC
Required GPU
CellChat
Only CPU
inferCNV
Only CPU
pySCENIC
Required GPU
scVI
Required GPU
expiMap
Only CPU
MuSiC
Only CPU
nichenetr
Only CPU
PyWGCNA
Only CPU
WGCNA