scSNViz

Introduction

The scSNViz script provides a comprehensive visualization of single cell-specific expressed SNVs (sceSNVs) by mapping them onto a dimensionally reduced representation of the data (e.g., UMAP, t-SNE, or PCA, depending on the selected technique). It generates a series of plots that represent the basic statistics and properties of the SNVs in the dataset. scSNViz integrates existing packages, including Seurat, Slingshot, scType, and CopyKat.

Input

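The script requires two inputs:

    An SNV file produced by SCReadCounts (-t)

    Either an RDS file containing a Seurat object (-r) or a STARsolo output folder containing the counts matrix (-m)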

Additionally, lists of SNVs with cell-barcode information not processed through SCReadCounts can be submitted in similar format using the (-t) option.

See sample files for reference.
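
Because the -m option expects a folder holding three specific files (barcodes.tsv.gz, features.tsv.gz, and matrix.mtx.gz, as listed under Options), it can help to check the folder before launching a run. The following is a minimal sketch; the function name check_counts_dir is hypothetical and is not part of scSNViz.

```shell
# Hypothetical helper (not part of scSNViz): verify that a STARsolo folder
# contains the three files listed for the -m option.
check_counts_dir() {
    dir="$1"
    for f in barcodes.tsv.gz features.tsv.gz matrix.mtx.gz; do
        if [ ! -f "$dir/$f" ]; then
            echo "missing: $dir/$f" >&2   # report the first missing file
            return 1
        fi
    done
    return 0
}

# Usage (the path is a placeholder):
# check_counts_dir SAMNXX_wasp_Solo.out/Gene/filtered && echo "counts matrix folder looks complete"
```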

Options

	-h, --help 
		Show this help message and exit

	-r RDS-FILE, --rds-file=RDS-FILE
		RDS file containing Seurat object.

	-m COUNTSMATRIX-FILE, --countsmatrix-file=COUNTSMATRIX-FILE
		Name of the STARsolo output folder that contains
                     the following files:
                     barcodes.tsv.gz, features.tsv.gz, and matrix.mtx.gz

	-t SNV-FILE, --snv-file=SNV-FILE
		scReadCounts file

	-w DIMENSIONALITY-REDUCTION, --dimensionality-reduction=DIMENSIONALITY-REDUCTION
		options include tSNE, PCA, UMAP. Default=UMAP.

	-x TH-VARS, --th-vars=TH-VARS
		Threshold for number of sceSNVs. Default=0 (display cells with N_SNVs > 0).

	-y TH-READS, --th-reads=TH-READS
		Threshold for number of variant reads (N_VAR). Default=0 (positions covered with N_VAR > 0 are considered sceSNVs).

	-c, --disable-title
		Disable title for individual SNV plots. Default=F

	-d, --disable-ind-plots
		Disable individual SNV plots. Default=F.

	-v, --hide-ind-plots
		Hide individual sceSNV plots in the combined HTML. Default=F.

	-e, --disable-3d-axis
		Disable axes in 3D plots. Default=F.

	-g, --disable-slingshot
		Disable slingshot curves in 3D plots. Default=F.

	-i, --enable-sctype
		Enable sctype to run. Default=F.

	-j TISSUE-TYPE, --tissue-type=TISSUE-TYPE
		tissue type for scType; options include:
                     Immunesystem, Pancreas, Liver, Eye, Kidney, Brain,
                     Lung, Adrenal, Heart, Intestine, Muscle, Placenta,
                     Spleen, Stomach, Thymus

	-k COLOR-SCALE, --color-scale=COLOR-SCALE
		if you would like to change the default color settings with
                     these options, you may use Blues, Reds, YlOrRd, YlGnBu, plasma, RdBu

	-b, --enable-cell-border
		Enable cell border. Default=F

	-q, --enable-dynamic-cell-size
		Enable cell size to depend on number of reads. Default=F

	-u, --enable-copykat
		Enable CopyKat for displaying CNVs. Default=F

	-s, --save-each-plot
		Save plots as separate HTML files. Default=F
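
To make the -x and -y thresholds concrete, here is a minimal shell/awk sketch of the filtering they describe. It assumes a hypothetical tab-separated table with a header row and the columns barcode, n_var, and n_ref (one row per cell and SNV position). This layout, the function name count_scesnvs, and the strict ">" comparison for non-default thresholds are illustrative assumptions, not the scSNViz implementation.

```shell
# Hypothetical sketch: count sceSNVs per cell under the two thresholds.
#   $1 = table with columns: barcode <TAB> n_var <TAB> n_ref (assumed layout)
#   $2 = th_reads, mirrors -y (default 0)
#   $3 = th_vars,  mirrors -x (default 0)
count_scesnvs() {
    file="$1"
    th_reads="${2:-0}"
    th_vars="${3:-0}"
    awk -F'\t' -v tr="$th_reads" -v tv="$th_vars" '
        # A position counts as an sceSNV in a cell if N_VAR > th_reads
        NR > 1 && $2 > tr { n[$1]++ }
        END {
            # Keep only cells with N_SNVs > th_vars
            for (c in n) if (n[c] > tv) printf "%s\t%d\n", c, n[c]
        }
    ' "$file" | sort
}

# Usage (file name is a placeholder):
# count_scesnvs per_cell_snvs.tsv 10 1
```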

Output

Directory structure

scSNViz generates outputs for the set of the sceSNVs and for each individual sceSNV, as follows:

sample_SNVs_dimensionality_reduction_xry/
│
├── Exploratory_Combined_Plots.html
├── sample_SNVs-summary.txt
└── Figures_Individual_Plots_HTML/   (optional)
    ├── Cell_types_scType.html       (optional)
    ├── CNVs_CopyKat.html            (optional)
    ├── Median_VAF_RNA.html
    ├── Mean_VAF_RNA.html
    ├── N_REFreads.html
    ├── Total_VAF_RNA.html
    ├── N_sceSNVs.html
    ├── N_VARreads.html
    ├── Histogram_N_SNV.png
    ├── Histogram_N_VARreadsCounts.png
    ├── Histogram_MeanSNVsVAF.png
    ├── Histogram_TotalVAF.png
    └── Individual_sceSNVs/
        ├── VARreads/
        │   └── 3D N_VAR plot HTML files for each sceSNV
        ├── REFreads/
        │   └── 3D N_REF plot HTML files for each sceSNV
        └── VAF/
            └── 3D VAF plot HTML files for each sceSNV

Note: The Figures_Individual_Plots_HTML directory and its contents are optional and will only be generated if the user requests it using the -s option.
In sample_SNVs_dimensionality_reduction_xry, dimensionality_reduction indicates the choice of dimensionality reduction selected. x and y indicate the threshold for sceSNVs and the variant reads, respectively.

Description

For the set of the sceSNVs, the separately produced figures show the following:

    MeanSNVsVAF: Histogram of mean VAF per SNV per cell

    N_SNV: Histogram of the number of SNVs per cell

    N_VARreadsCounts: Histogram of the number of Variant Reads per cell

    TotalVAF: Histogram of the Total VAF per cell (VARreads/(VARreads + REFreads) per cell)

    Cell_types_scType: 3D UMAP/t-SNE/PCA representation of the cell types identified by scType

    CNVs_CopyKat: 3D UMAP/t-SNE/PCA representation of the copy number variations identified by CopyKat

    Mean_VAF_RNA: 3D UMAP/t-SNE/PCA representation of mean VAF for each cell

    Median_VAF_RNA: 3D UMAP/t-SNE/PCA representation of median VAF for each cell

    N_sceSNVs: 3D UMAP/t-SNE/PCA representation of the number of SNVs for each cell

    N_VARreads: 3D UMAP/t-SNE/PCA representation of number of Variant Reads for each cell

    N_REFreads: 3D UMAP/t-SNE/PCA representation of number of Reference Reads for each cell

    Total_VAF_RNA: 3D UMAP/t-SNE/PCA representation of Total VAF per cell

sample_SNVs-summary: a text file of the summary statistics per cell

Individual_sceSNVs: contains 3D dimensionality reduction plots for each individual sceSNV, showing the number of variant reads (VARreads), the number of reference reads (REFreads), and the VAF

Exploratory_Combined_Plots: combines all the separately generated plots above into a single HTML file
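
The per-cell Total VAF described above, VARreads/(VARreads + REFreads), can be illustrated with a short shell/awk sketch. It assumes a hypothetical tab-separated table with a header row and the columns barcode, n_var, and n_ref (one row per cell and SNV position); this layout and the function name per_cell_total_vaf are assumptions made for illustration, not the format or code used by scSNViz.

```shell
# Hypothetical sketch: Total VAF per cell = sum(VAR reads) / sum(VAR + REF reads).
#   $1 = table with columns: barcode <TAB> n_var <TAB> n_ref (assumed layout)
per_cell_total_vaf() {
    awk -F'\t' '
        NR > 1 {                 # skip the header row
            var[$1] += $2        # variant reads summed over all SNVs in the cell
            ref[$1] += $3        # reference reads summed over all SNVs in the cell
        }
        END {
            for (c in var) {
                total = var[c] + ref[c]
                # Skip cells without any covering reads to avoid dividing by zero
                if (total > 0) printf "%s\t%.3f\n", c, var[c] / total
            }
        }
    ' "$1" | sort
}

# Usage (file name is a placeholder):
# per_cell_total_vaf per_cell_snvs.tsv
```

Summing reads before dividing weights each SNV by its coverage, which is why Total VAF can differ from the mean of the per-SNV VAFs in the same cell.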


Installation

Download the R file:

The following CRAN packages are required:

The following Bioconductor packages are required:

Other packages:

Note also that the matrixStats package (a dependency of Seurat) needs to be downgraded to version 1.1.0:

Examples

% Rscript scSNViz.r -t sample_SNVs.txt -m SAMNXX_wasp_Solo.out/Gene/filtered/
% Rscript scSNViz.r -t sample_SNVs.txt -m SAMNXX_wasp_Solo.out/Gene/filtered/ \
                         --dimensionality-reduction=umap \
                         --th-vars=1 --th-reads=10 \
                         -i --tissue-type=Liver -c -d -u  
% Rscript scSNViz.r -t sample_SNVs.txt -r sample_Seurat_object.rds \
                         --dimensionality-reduction=tsne \
                         --th-vars=1 --th-reads=10 \
                         -i --tissue-type=Immunesystem \
                         -c -d -e --color-scale=YlGnBu