Single-cell Analysis Tool Inputs


Cellismo / Muon file: [sample_name].cellismo

Seurat file: [sample_name]_seurat.rds

These are prebuilt input files for popular single-cell analysis toolkits: Muon (the .cellismo file) and Seurat (the .rds file).

In RNA and AbSeq experiments, each file contains the RSEC molecules-per-cell data table for putative cells, along with cell and bioproduct metadata.

In ATAC-Seq experiments, each file contains the cell-by-peak data table for putative cells and a cell-by-gene_activity table calculated from peak region annotations, along with cell and bioproduct metadata. If Transcription Factor Motif analysis was performed, two more tables are included: a cell-by-motif table of motif enrichment z-scores generated by the pyChromvar algorithm, and a peak-by-motif matrix indicating which motifs were detected in which peaks.

Metadata includes (if applicable): sample tag calls, putative cell origin, TCR/BCR chain types and CDR3, immune cell type (Experimental), protein aggregate status, and tSNE/UMAP coordinates. Additionally, the chromosome and contig lengths of the reference genome used are attached to the .cellismo object (mdata.uns['genome_contig_lengths']). Unusual characters in the chromosome/contig names are encoded to be "URL-safe" using the function quote_plus() from the Python standard-library module urllib.parse. The Cellismo viewer decodes these names automatically; if another tool is used to analyze the file, the names need to be decoded first (for example, with urllib.parse.unquote_plus()).
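As an illustration of the name encoding described above, the following minimal sketch decodes URL-safe contig names with urllib.parse.unquote_plus(). It assumes the stored value behaves like a mapping from encoded name to length; the actual layout of mdata.uns['genome_contig_lengths'] is not specified here, and the helper name decode_contig_names and the example contig names are hypothetical.

```python
from urllib.parse import quote_plus, unquote_plus


def decode_contig_names(contig_lengths):
    """Return a copy of the mapping with URL-safe contig names decoded.

    Assumption: contig_lengths maps encoded contig name -> length.
    """
    return {unquote_plus(name): length for name, length in contig_lengths.items()}


# Hypothetical example data: a name with unusual characters, encoded the same
# way the text describes (quote_plus), next to a name that needs no encoding.
encoded = {quote_plus("HLA-A*01:01"): 3503, "chr1": 248956422}

decoded = decode_contig_names(encoded)
print(decoded)
```

Names made up only of letters, digits, and the characters "_", ".", "-", and "~" pass through unchanged, so decoding every name is safe.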

| Data Type | Seurat Object Location | Cellismo Object (MuData) Location |
| --- | --- | --- |
| Expression/Count Matrices | | |
| mRNA Expression (RSEC) | obj@assays[['RNA']]$counts | mdata.mod['rna'].X |
| AbSeq Expression (RSEC) | obj@assays[['ADT']]$counts | mdata.mod['prot'].X |
| ATAC-Seq Peaks | obj@assays[['peaks']]$counts | mdata.mod['atac_peaks'].X |
| ATAC Gene Activity Scores | obj@assays[['gene_activity']]$counts | mdata.mod['atac_gene_activity'].X |
| ATAC TF Motif Scores (ChromVAR) | obj@assays[['chromvar']]$data | mdata.mod['atac_motif'].X |
| Cell-Level Metadata | | |
| Sample Tag Assignments, VDJ Per-Cell Info, Immune Cell Type Predictions, Protein Aggregate Status | obj@meta.data. A data frame where columns correspond to metadata fields (e.g., obj$Sample_Name, obj$TCR_Paired_Chains). | mdata.mod[MODALITY].obs. Data linked to each modality. A data frame where columns correspond to metadata fields (e.g., mdata.mod['rna'].obs['Cell_Type_Experimental']). |
| Dimensionality Reduction (UMAP/tSNE) | obj@reductions$umap or obj@reductions$tsne | mdata.mod[MODALITY].obsm['X_umap'] or mdata.mod[MODALITY].obsm['X_tsne']. Data linked to each modality. |
| Feature-Level Metadata | | |
| ATAC Peak-to-Gene Links | Not present | mdata.mod['atac_peaks'].uns['atac']['peak_annotation'] |
| ATAC TF Motif Matrix (Peak x Motif) | Stored in a Motif object associated with the peaks assay: obj@assays[['peaks']]@motifs | mdata.mod['atac_peaks'].varm['peaks_by_motif']. Motif names are stored in mdata.mod['atac'].uns['atac']['motif_names'] |
| ATAC TF Motif Genomic Positions | Stored in a Motif object associated with the peaks assay: obj@assays[['peaks']]@motifs@positions | mdata.mod['atac_motif'].uns['motif_positions'] |
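Because UMAP/tSNE coordinates are stored per modality in the MuData object, it can help to check which modalities carry which embedding before plotting. The sketch below shows one way to do that. The helper find_embeddings and the toy object are hypothetical; the toy stands in for a loaded MuData object so the example runs without the file or any third-party package, and only the mdata.mod[MODALITY].obsm['X_umap'] / ['X_tsne'] locations are taken from the text above.

```python
from types import SimpleNamespace

# Embedding keys named in the documentation for each modality's .obsm.
EMBEDDING_KEYS = ("X_umap", "X_tsne")


def find_embeddings(mdata):
    """Map each modality name to the embedding keys present in its .obsm."""
    return {
        name: [key for key in EMBEDDING_KEYS if key in modality.obsm]
        for name, modality in mdata.mod.items()
    }


# Hypothetical stand-in that mimics only the attributes used above
# (a .mod mapping whose values have an .obsm mapping). With a real file,
# mdata would be the loaded MuData object instead.
toy = SimpleNamespace(
    mod={
        "rna": SimpleNamespace(obsm={"X_umap": [[0.1, 0.2]], "X_tsne": [[1.0, 2.0]]}),
        "prot": SimpleNamespace(obsm={"X_umap": [[0.3, 0.4]]}),
    }
)

embeddings = find_embeddings(toy)
print(embeddings)
```

The same membership checks should work on a real MuData object, since its modalities expose .obsm as a mapping; how the .cellismo file is loaded is not covered here.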