mlorenz49/novaice

novaice

Chemical perturbation modeling in 24 hours.

[!Important] This model was developed during the Nucleate Hackathon 2025, Munich and does not represent a serious scientific project.

Getting started

Please refer to the documentation, in particular, the API documentation.

Installation

You need to have Python 3.11 or newer installed on your system. If you don't have Python installed, we recommend installing uv.

There are several alternative options to install novaice:

  1. Install the latest development version:
pip install git+https://github.com/lucas-diedrich/novaice.git

Usage

novaice is a simple model that predicts gene expression across chemical perturbation conditions. It assumes that each observation is a (drug $d_i$, gene expression $X_i$) pair. The task is to predict gene expression from a vector representation of the drug. We implement an MLP model that predicts the parameters of a normal distribution ($\mu, \sigma$) describing the log1p-normalized RNA-seq data.
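The training objective is not stated here. A common choice for a model that outputs $\mu$ and $\sigma$ is the Gaussian negative log-likelihood, and the sketch below assumes that objective. The function name `gaussian_nll` and the example values are illustrative and not part of novaice.

```python
import math


def gaussian_nll(x: float, mu: float, sigma: float) -> float:
    """Negative log-likelihood of x under a normal distribution N(mu, sigma^2)."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    return 0.5 * math.log(2 * math.pi * sigma**2) + (x - mu) ** 2 / (2 * sigma**2)


# Hypothetical example: one gene with a normalized count of 9 in one observation.
x = math.log1p(9.0)  # log1p-transformed expression value

nll_close = gaussian_nll(x, mu=2.3, sigma=0.5)  # predicted mean near the observed value
nll_far = gaussian_nll(x, mu=0.0, sigma=0.5)    # predicted mean far from the observed value
```

Under this objective, a prediction whose mean is close to the observed value gets a lower loss than one that is far away at the same $\sigma$, while a larger $\sigma$ softens the penalty for a miss at the cost of a larger log-variance term.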

We implement various methods to embed chemical compounds from their SMILES strings in the .pp module.
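The embedding methods in the .pp module are not described here. To illustrate what "a vector representation of the drug" can look like, the sketch below uses a deliberately simple stand-in: hashed character n-gram counts of the SMILES string. This is not one of novaice's methods, and the function name `hashed_ngram_embedding` and its parameters are hypothetical.

```python
import hashlib


def hashed_ngram_embedding(smiles: str, n: int = 3, dim: int = 64) -> list[float]:
    """Stand-in featurizer: count character n-grams of a SMILES string in `dim` hashed buckets."""
    vec = [0.0] * dim
    # Strings shorter than n contribute a single (shorter) n-gram.
    for i in range(max(len(smiles) - n + 1, 1)):
        gram = smiles[i : i + n]
        # hashlib gives a hash that is stable across Python processes, unlike built-in hash().
        digest = hashlib.sha256(gram.encode("utf-8")).digest()
        bucket = int.from_bytes(digest[:4], "big") % dim
        vec[bucket] += 1.0
    total = sum(vec)
    return [v / total for v in vec]  # normalize so the entries sum to 1


# SMILES string of aspirin, used only as example input
aspirin_embedding = hashed_ngram_embedding("CC(=O)OC1=CC=CC=C1C(=O)O")
```

Any fixed-length numeric vector per compound can play this role; chemically informed embeddings would normally be preferred over a string hash.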

Evaluation is based on the feature-wise (per-gene) $R^2$ between the maximum likelihood estimate of gene abundance and the measured data. We also assess how well the model's predicted uncertainty is calibrated.
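The two evaluation ideas above can be sketched in plain Python. The sketch assumes that the maximum likelihood estimate is the predicted mean $\mu$ (the mode of a normal distribution) and that calibration is checked by comparing the nominal level of a central prediction interval with its empirical coverage. The function names `r2_per_gene` and `interval_coverage` and the toy data are hypothetical, not novaice's API.

```python
from statistics import NormalDist, fmean


def r2_per_gene(measured, predicted):
    """Return one R^2 per gene; inputs are lists of rows (observations) over genes."""
    scores = []
    for g in range(len(measured[0])):
        y = [row[g] for row in measured]
        y_hat = [row[g] for row in predicted]
        y_mean = fmean(y)
        ss_res = sum((a - b) ** 2 for a, b in zip(y, y_hat))
        ss_tot = sum((a - y_mean) ** 2 for a in y)
        # R^2 is undefined for a gene with constant measured expression.
        scores.append(1.0 - ss_res / ss_tot if ss_tot > 0 else float("nan"))
    return scores


def interval_coverage(measured, mu, sigma, level=0.9):
    """Fraction of measured values inside the central `level` interval of N(mu, sigma^2)."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    hits = 0
    total = 0
    for x_row, mu_row, sigma_row in zip(measured, mu, sigma):
        for x, m, s in zip(x_row, mu_row, sigma_row):
            if abs(x - m) <= z * s:
                hits += 1
            total += 1
    return hits / total


# Toy data: 3 observations x 2 genes (values are made up for illustration)
measured = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0]]
mu = [[1.0, 2.5], [2.0, 4.0], [3.0, 5.5]]
sigma = [[0.5, 0.5], [0.5, 0.5], [0.5, 0.5]]

r2 = r2_per_gene(measured, mu)
coverage_90 = interval_coverage(measured, mu, sigma, level=0.9)
```

For a well-calibrated model, the empirical coverage should be close to the nominal level across a range of levels; coverage far above the level suggests overestimated uncertainty, coverage far below suggests overconfidence.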

Run your model

# ChemPertVAEModel is exported alongside the MLP model; this example uses the MLP model
from novaice.tl import ChemPertMLPModel, ChemPertVAEModel

# `adata` is an AnnData object that holds the expression data and the drug embeddings
# Register the data, then create the model
ChemPertMLPModel.setup_anndata(adata, drug_embedding_key="drug_embedding")
model = ChemPertMLPModel(adata)

# Train
model.train(max_epochs=50)

# Predict gene expression
predictions = model.predict_gene_expression()

Use TensorBoard logging

To track training metrics with TensorBoard:

from lightning.pytorch.loggers import TensorBoardLogger

# Create TensorBoard logger
tb_logger = TensorBoardLogger("logs", name="my_experiment")

# Train with logging
model.train(max_epochs=50, logger=tb_logger, log_every_n_steps=5)

Then start TensorBoard to visualize the training:

tensorboard --logdir=logs

Open your browser to http://localhost:6006 to view the training metrics.

Release notes

See the changelog.

Contact

For questions and help requests, you can reach out in the scverse discourse. If you found a bug, please use the issue tracker.
