🐪 camlhmp 🐪 - Classification through yAML Heuristic Mapping Protocol
camlhmp is a tool for generating organism typing tools from YAML schemas. Through discussions
with Tim Read, we identified a need for a straightforward method to define and manage typing
schemas for organisms of interest. YAML was chosen for its simplicity and readability.
Full documentation for camlhmp can be found at https://rpetit3.github.io/camlhmp/.
The primary purpose of camlhmp is to provide a framework that enables researchers to
independently define typing schemas for their organisms of interest using YAML. This
approach facilitates the management and analysis of biological data for researchers at any
level of experience.
camlhmp does not supply pre-defined typing schemas. Instead, it equips researchers
with the necessary tools to create and maintain their own schemas, ensuring these schemas
can easily remain up to date with the latest scientific developments.
Finally, the development of camlhmp was driven by a practical need to streamline
maintenance of multiple organism typing tools. Managing these tools separately is
time-consuming and challenging. camlhmp simplifies this by providing a single
framework for each tool.
To quickly get started with camlhmp, you can install it through Bioconda and run the
command-line interface:
# Install camlhmp through Bioconda
conda create -n camlhmp -c conda-forge -c bioconda camlhmp
conda activate camlhmp
camlhmp --help
# Example usage of camlhmp-blast-alleles
# Acquire test data
wget https://raw.githubusercontent.com/rpetit3/camlhmp/refs/heads/main/tests/data/blast/alleles/spn-pbptype.yaml
wget https://raw.githubusercontent.com/rpetit3/camlhmp/refs/heads/main/tests/data/blast/alleles/spn-pbptype.fasta
wget https://github.com/rpetit3/camlhmp/raw/refs/heads/main/tests/data/blast/alleles/SRR2912551.fna.gz
# Run camlhmp-blast-alleles
camlhmp-blast-alleles \
--yaml spn-pbptype.yaml \
--targets spn-pbptype.fasta \
--input SRR2912551.fna.gz
Running camlhmp-blast-alleless with following parameters:
--input SRR2912551.fna.gz
--yaml spn-pbptype.yaml
--targets spn-pbptype.fasta
--outdir ./
--prefix camlhmp
--min-pident 95
--min-coverage 95
Starting camlhmp for S. pneumoniae PBP typing...
Running tblastn...
Processing hits...
Final Results...
S. pneumoniae PBP typing
βββββ³ββββ³ββββ³ββββ³ββββ³ββββ³ββββ³ββββ³ββββ³βββββ³ββββ³βββββ³ββββ³βββββ³ββββ³βββββ³ββββ³βββββ³ββββ³βββββ
β β¦ β β¦ β β¦ β β¦ β β¦ β β¦ β β¦ β β¦ β β¦ β 1β¦ β β¦ β 2β¦ β β¦ β 2β¦ β β¦ β 2β¦ β β¦ β 2β¦ β β¦ β 2β¦ β
β‘ββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ©
β β¦ β β¦ β β¦ β β¦ β β¦ β β¦ β β¦ β β¦ β β¦ β β 0 β 1β¦ β β¦ β 5β¦ β β 2 β β¦ β 1β¦ β β¦ β β
βββββ΄ββββ΄ββββ΄ββββ΄ββββ΄ββββ΄ββββ΄ββββ΄ββββ΄βββββ΄ββββ΄βββββ΄ββββ΄βββββ΄ββββ΄βββββ΄ββββ΄βββββ΄ββββ΄βββββ
Writing outputs...
Final predicted type written to ./camlhmp.tsv
tblastn results written to ./camlhmp.tblastn.tsv
For more example commands and outputs, see the documentation for each command at https://rpetit3.github.io/camlhmp/.
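The run above reports two tab-separated outputs, ./camlhmp.tsv and ./camlhmp.tblastn.tsv. As a minimal sketch for a first look at them, the shell functions below list a file's column names and count its data rows. The names `list_columns` and `count_rows` are illustrative and not part of camlhmp. The sketch assumes the first line of each file is a header, and no column names are hard-coded because they are not described here.

```shell
# Illustrative helpers (not part of camlhmp) for inspecting TSV outputs.
# Assumption: the first line of each file is a header row.

# Print the column names of a tab-separated file, one per line.
list_columns() {
    head -n 1 "$1" | tr '\t' '\n'
}

# Print the number of data rows, i.e. every line after the header.
count_rows() {
    awk 'NR > 1 { n++ } END { print n + 0 }' "$1"
}

# Summarize the two files named in the run above, if they exist.
for f in ./camlhmp.tsv ./camlhmp.tblastn.tsv; do
    if [ -f "$f" ]; then
        echo "== $f ($(count_rows "$f") data rows)"
        list_columns "$f"
    fi
done
```

If a different --outdir or --prefix was used, adjust the two paths in the loop accordingly.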
camlhmp is available through PyPI and Bioconda. While you can install it through PyPI,
installing through Bioconda is recommended so that non-Python dependencies are also
installed.
camlhmp has been developed and tested on x86-64 Linux and macOS systems.
| OS | Architecture | Supported? |
|---|---|---|
| Linux | x86-64 | ✅ |
| Linux | aarch64 | ❌ (missing dependencies) |
| macOS | x86-64 | ✅ |
| macOS | arm64 | ❌ (missing dependencies) |
| Windows | x86-64 | ❌ _(consider using WSL2)_ |
Tip
Docker containers are available from biocontainers/camlhmp
which can be used with the --platform flag to run on Apple Silicon and ARM-based Linux systems.
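As a rough sketch of the tip above, the wrapper below asks Docker for the x86-64 image with `--platform linux/amd64` and runs a camlhmp command against files in the current directory. Several things here are assumptions rather than documented camlhmp behavior: the function name `run_camlhmp_in_docker`, the `CAMLHMP_IMAGE` and `DRY_RUN` variables, and the choice to mount the working directory at /data. It also assumes your Docker installation can emulate linux/amd64. `CAMLHMP_IMAGE` is deliberately left unset so that you supply the full image reference, including tag, yourself.

```shell
# Illustrative wrapper (not part of camlhmp): run a camlhmp command in a
# container, requesting the x86-64 image via Docker's --platform flag.
# CAMLHMP_IMAGE is a placeholder variable; set it to the full reference
# (including tag) of the biocontainers/camlhmp image you want to use.
run_camlhmp_in_docker() {
    if [ -z "${CAMLHMP_IMAGE:-}" ]; then
        echo "Set CAMLHMP_IMAGE to a biocontainers/camlhmp image reference" >&2
        return 1
    fi
    # With DRY_RUN set, the docker command is printed instead of executed.
    # The current directory is mounted at /data and used as the working dir.
    ${DRY_RUN:+echo} docker run --rm --platform linux/amd64 \
        -v "$PWD":/data -w /data \
        "$CAMLHMP_IMAGE" "$@"
}

# Example (commented out because it needs Docker and a real image reference):
# CAMLHMP_IMAGE="<image reference>" run_camlhmp_in_docker camlhmp --help
```

Running under emulation is typically slower than running natively, so this is best suited to small inputs or occasional use.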
camlhmp relies on the following dependencies:
dependencies:
python:
- biopython >=1.83
- pyyaml >=6.0.1
- executor >=23.2
- rich >=13.7.1,<14
- rich-click >=1.6.0
non_python:
- blast >=2.15.0
- pigz
conda create -n camlhmp -c conda-forge -c bioconda camlhmp
conda activate camlhmp
camlhmp
🐪 camlhmp 🐪 - Classification through YAML Heuristic Mapping Protocol
Available camlhmp commands
┏━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ command               ┃ description                                                          ┃
┡━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ camlhmp-blast-alleles │ Classify assemblies using BLAST against alleles of a set of genes    │
│ camlhmp-blast-regions │ Classify assemblies using BLAST against larger genomic regions       │
│ camlhmp-blast-targets │ Classify assemblies using BLAST against individual genes or proteins │
│ camlhmp-extract       │ Extract typing targets from a set of reference sequences             │
└───────────────────────┴──────────────────────────────────────────────────────────────────────┘
To install camlhmp through PyPI, you can use pip:
pip install camlhmp
camlhmp
🐪 camlhmp 🐪 - Classification through YAML Heuristic Mapping Protocol
Available camlhmp commands
┏━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ command               ┃ description                                                          ┃
┡━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ camlhmp-blast-alleles │ Classify assemblies using BLAST against alleles of a set of genes    │
│ camlhmp-blast-regions │ Classify assemblies using BLAST against larger genomic regions       │
│ camlhmp-blast-targets │ Classify assemblies using BLAST against individual genes or proteins │
│ camlhmp-extract       │ Extract typing targets from a set of reference sequences             │
└───────────────────────┴──────────────────────────────────────────────────────────────────────┘
Warning
Installing through PyPI will not install non-Python dependencies. You will need to ensure these are installed manually.
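After a pip-only install, one way to confirm the non-Python dependencies are present is to check that their executables are on your PATH. The sketch below does this with the shell built-in `command -v`. The function name `check_tools` is illustrative and not part of camlhmp. `blastn` and `tblastn` are used as representative BLAST+ executables, which is an assumption about the ones you need. The check only confirms that each tool is present, not that it meets the minimum version listed in the dependencies.

```shell
# Illustrative check (not part of camlhmp): report which of the named
# executables can be found on PATH. Returns non-zero if any is missing.
check_tools() {
    missing=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found    $tool"
        else
            echo "missing  $tool"
            missing=1
        fi
    done
    return "$missing"
}

# blastn and tblastn ship with BLAST+; pigz is a separate package.
check_tools blastn tblastn pigz \
    || echo "Install the missing tools before running camlhmp."
```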
If you make use of camlhmp in your analysis, please cite the following:
- camlhmp
  Petit III RA, Read TD camlhmp: Classification through yAML Heuristic Mapping Protocol (GitHub)
- BLAST+
  Camacho C, Coulouris G, Avagyan V, Ma N, Papadopoulos J, Bealer K, Madden TL BLAST+: architecture and applications. BMC Bioinformatics 10, 421 (2009)
If I'm being honest, I really wanted to name a tool with "camel" in it because they are my wife's favorite animal 🐪 and they also remind me of my friends in Oman!
Once it was decided YAML was going to be the format for defining schemas, I quickly stumbled on "Classification through YAML" and soon found out I wasn't the only one who thought of "CAML". But, no matter, it was decided it would be something with "CAML", then Tim Read came in with the save and suggested "Heuristic Mapping Protocol". So, here we are - camlhmp!
I'm not a lawyer and MIT has always been my go-to license. So, MIT it is!
As of v1.1.3, camlhmp has been developed with minimal assistance of Artificial
Intelligence (AI). GitHub Copilot was used for auto-completion, but otherwise all
code was written and reviewed by the author.
Support for this project came (in part) from the Wyoming Public Health Division, and the Center for Applied Pathogen Epidemiology and Outbreak Control (CAPE).

