scCHyMErA-Seq

Code repository for the scCHyMErA-Seq project

scCHyMErA-Seq is a platform for efficient exon perturbations and gene knockouts with single-cell RNA-sequencing phenotypic readouts. To facilitate downstream analysis, this repository includes a ready-to-use pipeline built with scverse tools.

Input Files

To run the scCHyMErA-Seq pipeline, you’ll need:

A matrix file (*matrix.h5) produced by Cell Ranger
A metadata file containing cell barcodes and guide information
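
As a rough illustration of how the two inputs relate, the sketch below reads a barcode-to-guide table and keeps only the barcodes that also appear in the expression matrix. The column names `cell_barcode` and `guide`, the helper names, and the toy barcodes are assumptions made for this example; the real metadata file may be laid out differently. With real data, the matrix barcodes would typically come from an AnnData object loaded with `scanpy.read_10x_h5`.

```python
import csv
import io


def load_guide_calls(handle, barcode_col="cell_barcode", guide_col="guide"):
    """Map each cell barcode to its guide call from an open CSV handle.

    The column names are hypothetical defaults, not taken from the pipeline.
    """
    reader = csv.DictReader(handle)
    missing = {barcode_col, guide_col} - set(reader.fieldnames or [])
    if missing:
        raise ValueError(f"missing columns: {sorted(missing)}")
    return {row[barcode_col]: row[guide_col] for row in reader}


def shared_barcodes(matrix_barcodes, guide_calls):
    """Return the barcodes present in both the matrix and the metadata."""
    return sorted(set(matrix_barcodes) & set(guide_calls))


# Toy data standing in for the real files.
example_csv = io.StringIO(
    "cell_barcode,guide\n"
    "AAACCCAAGAAACACT-1,guideA\n"
    "AAACCCAAGAAACCAT-1,guideB\n"
)
calls = load_guide_calls(example_csv)
kept = shared_barcodes(["AAACCCAAGAAACACT-1", "TTTGTTGTCAAAAAAA-1"], calls)
print(kept)
```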

Generating Input Files

Use the cellranger count pipeline with CRISPR Guide Capture analysis. See the Cell Ranger documentation for details.

module load cellranger
cellranger count --id=s \
    --transcriptome=refdata-gex-GRCh38-2024-A \
    --libraries=library.csv \
    --feature-ref=feature_reference.csv \
    --create-bam=true

This creates the matrix file (filtered_feature_bc_matrix.h5) and the protospacer call files, along with many other outputs.
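
When the run needs to be launched from a script rather than typed by hand, the same command can be assembled in Python. The sketch below only rebuilds the flags shown above; the function names and the `dry_run` switch are hypothetical and not part of this repository.

```python
import shlex
import subprocess


def build_count_command(run_id, transcriptome, libraries, feature_ref):
    """Assemble the cellranger count argument list shown in this section."""
    return [
        "cellranger", "count",
        f"--id={run_id}",
        f"--transcriptome={transcriptome}",
        f"--libraries={libraries}",
        f"--feature-ref={feature_ref}",
        "--create-bam=true",
    ]


def run_count(cmd, dry_run=True):
    """Print the command; execute it only when dry_run is False."""
    print(shlex.join(cmd))
    if not dry_run:
        subprocess.run(cmd, check=True)


cmd = build_count_command(
    "s", "refdata-gex-GRCh38-2024-A", "library.csv", "feature_reference.csv"
)
run_count(cmd)  # dry run: prints the command without calling cellranger
```

Passing the arguments as a list avoids shell quoting problems if any path contains spaces.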

Example: library.csv

sample,fastqs,lanes,library_type
GEX,Sample_GEX,Any,Gene Expression
Cas9,Sample_Cas9,Any,CRISPR Guide Capture
Cas12a,Sample_Cas12a,Any,CRISPR Guide Capture
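
A small check before submitting the job can catch a misspelled library type early. This is a minimal sketch that assumes the column layout and the two library types shown in the example above; `check_libraries` is a hypothetical helper, and Cell Ranger itself remains the authority on what it accepts.

```python
import csv
import io

# Columns and library types taken from the example library.csv above.
EXPECTED_COLUMNS = ["sample", "fastqs", "lanes", "library_type"]
EXPECTED_TYPES = {"Gene Expression", "CRISPR Guide Capture"}


def check_libraries(handle):
    """Return a list of problems found in a library CSV file handle."""
    reader = csv.DictReader(handle)
    missing = [c for c in EXPECTED_COLUMNS if c not in (reader.fieldnames or [])]
    if missing:
        return [f"missing columns: {missing}"]
    problems = []
    seen_types = set()
    # Data rows start on line 2 because line 1 is the header.
    for line_no, row in enumerate(reader, start=2):
        lib_type = row["library_type"]
        seen_types.add(lib_type)
        if lib_type not in EXPECTED_TYPES:
            problems.append(f"line {line_no}: unexpected library_type {lib_type!r}")
    for required in sorted(EXPECTED_TYPES - seen_types):
        problems.append(f"no row with library_type {required!r}")
    return problems


example = io.StringIO(
    "sample,fastqs,lanes,library_type\n"
    "GEX,Sample_GEX,Any,Gene Expression\n"
    "Cas9,Sample_Cas9,Any,CRISPR Guide Capture\n"
    "Cas12a,Sample_Cas12a,Any,CRISPR Guide Capture\n"
)
problems = check_libraries(example)
print(problems or "library.csv looks consistent with the example")
```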

Loading Files and Downstream Analysis

Prerequisites

Install the following Python packages:

scanpy
anndata
pertpy (Mixscape analysis)
decoupler (pseudobulk matrix calculation)
PyDESeq2 (differential expression analysis)
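
To confirm that the environment is complete before starting a long job, something like the following can be run. It assumes the import names `scanpy`, `anndata`, `pertpy`, `decoupler`, and `pydeseq2` for the packages listed above; `missing_packages` is a hypothetical helper.

```python
import importlib.util

# Import names assumed for the packages listed above.
REQUIRED = ["scanpy", "anndata", "pertpy", "decoupler", "pydeseq2"]


def missing_packages(names=REQUIRED):
    """Return the import names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_packages()
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All listed packages are importable.")
```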

Usage

Quality Control

python qc_cells.py filtered_feature_bc_matrix.h5
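
This README does not describe what qc_cells.py computes, so the following is only a generic illustration of per-cell quality control on toy counts. The metrics, the `MT-` prefix convention for mitochondrial genes, and both thresholds are assumptions for this sketch, not the script's actual settings.

```python
def qc_metrics(counts):
    """Compute total counts, detected genes, and mitochondrial percentage."""
    total = sum(counts.values())
    n_genes = sum(1 for c in counts.values() if c > 0)
    mito = sum(c for gene, c in counts.items() if gene.startswith("MT-"))
    pct_mito = 100.0 * mito / total if total else 0.0
    return {"total_counts": total, "n_genes": n_genes, "pct_mito": pct_mito}


def passes_qc(metrics, min_genes=2, max_pct_mito=20.0):
    """Apply illustrative thresholds; real cutoffs depend on the dataset."""
    return metrics["n_genes"] >= min_genes and metrics["pct_mito"] <= max_pct_mito


# Toy per-cell gene counts.
cells = {
    "cell_1": {"ACTB": 50, "GAPDH": 30, "MT-CO1": 5},
    "cell_2": {"ACTB": 2, "GAPDH": 0, "MT-CO1": 8},
}
kept = [name for name, counts in cells.items() if passes_qc(qc_metrics(counts))]
print(kept)
```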

Matrix Preprocessing & Mixscape

python scanpy_analysis_split.py
python scanpy_analysis_combined.py

Outputs:

UMAPs of all processed cells
Cluster-specific LDA plots (highlighted cluster vs grey others)

UMAP + Leiden Clustering

Arguments for scanpy_analysis_split.py and scanpy_analysis_combined.py:

Argument	Description
`-o`, `--out`	Output directory for plots (default: current working directory)
`--analysis`	Type of analysis: `KO` or `Exon` (used in `scanpy_analysis_split.py` only)
`--resolution`	Leiden clustering resolution (0–1; higher = more clusters)
`-m`, `--matrix_input`	Path to input matrix file (`.h5`)
`-a`, `--anno_csv`	Path to annotation file (CSV) with cell barcode and guide pairing

These scripts also generate inputs for chymeraseq.md and gprofiler_analysis.md
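
For readers who want to see how the documented options fit together, here is a minimal argparse sketch that mirrors the table above. It is not the parser used by the scripts; the help strings are copied from the table, and anything the table does not state (such as a default resolution) is left unset.

```python
import argparse
import os


def build_parser():
    """Build a parser mirroring the arguments documented in the table."""
    parser = argparse.ArgumentParser(description="Sketch of the documented options")
    parser.add_argument("-o", "--out", default=os.getcwd(),
                        help="Output directory for plots")
    parser.add_argument("--analysis", choices=["KO", "Exon"],
                        help="Type of analysis (split script only)")
    parser.add_argument("--resolution", type=float,
                        help="Leiden clustering resolution")
    parser.add_argument("-m", "--matrix_input",
                        help="Path to input matrix file (.h5)")
    parser.add_argument("-a", "--anno_csv",
                        help="Path to annotation CSV with barcode and guide pairing")
    return parser


args = build_parser().parse_args([
    "--analysis", "Exon",
    "--resolution", "0.15",
    "-m", "filtered_feature_bc_matrix.h5",
    "-a", "paired_hgRNA_calls_per_cell.csv",
])
print(args.analysis, args.resolution, args.matrix_input)
```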

Example SLURM Job

#!/bin/bash
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=8
#SBATCH --open-mode=append
#SBATCH --time=1:00:00
#SBATCH --mem=300g
#SBATCH --job-name=schymeraseq

timestamp=$(date +%Y%m%d_%H%M)

export PYTHONHASHSEED=0
export NUMBA_CPU_NAME=generic

python scanpy_analysis_split.py -o ./ --analysis Exon --resolution 0.15 \
    -m filtered_feature_bc_matrix.h5 -a paired_hgRNA_calls_per_cell.csv \
    --timestamp $timestamp

python scanpy_analysis_split.py -o ./ --analysis KO --resolution 0.15 \
    -m filtered_feature_bc_matrix.h5 -a paired_hgRNA_calls_per_cell.csv \
    --timestamp $timestamp

python scanpy_analysis_combined.py -o ./ --resolution 0.15 \
    -m filtered_feature_bc_matrix.h5 -a paired_hgRNA_calls_per_cell.csv \
    --timestamp $timestamp
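
The job script computes one timestamp and passes it to all three runs. The same pattern can be sketched in Python, for example to generate the commands for a workflow manager. The helper names are hypothetical, and the argument order differs from the script above on the assumption that the scripts accept options in any order.

```python
from datetime import datetime
import shlex


def make_timestamp(now=None):
    """Mirror `date +%Y%m%d_%H%M` from the job script."""
    return (now or datetime.now()).strftime("%Y%m%d_%H%M")


def build_commands(timestamp, resolution="0.15",
                   matrix="filtered_feature_bc_matrix.h5",
                   anno="paired_hgRNA_calls_per_cell.csv"):
    """Return the three analysis commands, all sharing one timestamp."""
    common = ["-o", "./", "--resolution", resolution,
              "-m", matrix, "-a", anno, "--timestamp", timestamp]
    return [
        ["python", "scanpy_analysis_split.py", "--analysis", "Exon", *common],
        ["python", "scanpy_analysis_split.py", "--analysis", "KO", *common],
        ["python", "scanpy_analysis_combined.py", *common],
    ]


commands = build_commands(make_timestamp())
for cmd in commands:
    print(shlex.join(cmd))
```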

Bulk Differential Expression Analysis

To identify differentially expressed genes for each perturbation:

python pseudobulk_deg.py \
    -m filtered_feature_bc_matrix.h5 \
    -a paired_hgRNA_calls_per_cell.csv \
    -p exon_mxs_obs.csv \
    --timestamp $timestamp
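
Pseudobulk analysis sums single-cell counts within groups of cells before testing for differential expression. The toy sketch below shows only that aggregation step in plain Python; per the prerequisites, the pipeline relies on decoupler and PyDESeq2 for the real computation, and the function name, group labels, and counts here are made up for illustration.

```python
from collections import defaultdict


def pseudobulk(cell_counts, cell_groups):
    """Sum per-cell gene counts within each group label."""
    totals = defaultdict(lambda: defaultdict(int))
    for cell, counts in cell_counts.items():
        group = cell_groups.get(cell)
        if group is None:
            continue  # cells without a group label are skipped
        for gene, count in counts.items():
            totals[group][gene] += count
    return {group: dict(genes) for group, genes in totals.items()}


# Toy counts: two perturbed cells and one control cell.
cell_counts = {
    "c1": {"GENE1": 3, "GENE2": 0},
    "c2": {"GENE1": 5, "GENE2": 2},
    "c3": {"GENE1": 1, "GENE2": 7},
}
cell_groups = {"c1": "exonA", "c2": "exonA", "c3": "control"}
bulk = pseudobulk(cell_counts, cell_groups)
print(bulk)
```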

Functional Enrichment Analysis

To run functional enrichment analysis using g:Profiler:

# --excel_file: Excel, CSV, or TXT file generated by scanpy.get.rank_genes_groups_df()

python gprofiler_analysis.py \
    --excel_file DEG_exons_mod.csv \
    --out deg_exons_mod_0.5 \
    --lfc_cutoff 0.5 \
    --run_gprofiler
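
How gprofiler_analysis.py applies --lfc_cutoff is not spelled out here, so the following is just one plausible reading: split genes into up- and down-regulated lists by log fold change before enrichment. It assumes the `names`, `logfoldchanges`, and `pvals_adj` columns that scanpy.get.rank_genes_groups_df() produces; the adjusted p-value cutoff and the function name are hypothetical additions.

```python
import csv
import io


def filter_by_lfc(handle, lfc_cutoff=0.5, padj_cutoff=0.05):
    """Split genes into up/down lists by log fold change and adjusted p-value."""
    up, down = [], []
    for row in csv.DictReader(handle):
        lfc = float(row["logfoldchanges"])
        padj = float(row["pvals_adj"])
        if padj > padj_cutoff:
            continue  # not significant under the assumed cutoff
        if lfc >= lfc_cutoff:
            up.append(row["names"])
        elif lfc <= -lfc_cutoff:
            down.append(row["names"])
    return up, down


# Toy table using the assumed column names.
example = io.StringIO(
    "names,logfoldchanges,pvals_adj\n"
    "GENE1,1.2,0.001\n"
    "GENE2,-0.8,0.01\n"
    "GENE3,0.3,0.001\n"
    "GENE4,2.0,0.5\n"
)
up, down = filter_by_lfc(example)
print("up:", up, "down:", down)
```

The resulting gene lists are the kind of input an enrichment query takes.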
