Releases: datajoint/datajoint-python

Release 2.0.0

03 Feb 01:55
3a1c2db

DataJoint 2.0 - Computational Foundation for Agentic Data Pipelines

This is a major release representing a complete rewrite of the DataJoint Python library. It introduces a modernized architecture with an extensible type system, object-augmented schemas, semantic matching, and improved developer experience.

Related:

  • PR #1311 — Complete rewrite implementation
  • Discussion #1235 — DataJoint 2.0 design
  • Discussion #1354 — Object-Augmented Schemas (OAS)
  • Discussion #1256 — Extensible type system
  • Discussion #1243 — Semantic matching and lineage

💥 Breaking Changes

Platform Requirements

  • Python 3.10+ required - Dropped support for Python 3.9 and earlier
  • MySQL 8.0+ required - Dropped support for MySQL 5.x and pre-8.0 versions

Architecture Changes

  • New package structure - Source code moved to src/datajoint/
  • Extensible Type/Codec System - New <codec> syntax replaces hardcoded blob/attach handling. Custom codecs extend dj.Codec with encode()/decode() methods
  • Object-Augmented Schemas (OAS) - Schema-addressed storage (<object@>, <npy@>) creates browsable paths mirroring database structure
  • Semantic Matching with Lineage - ~lineage table tracks attribute origins. Joins and restrictions require that namesake attributes share the same lineage
  • Table-Specific Jobs Tables - Each Computed/Imported table has its own ~~table_name jobs table (replaces shared jobs table)
  • New Configuration System - pydantic-settings based config with datajoint.json, .secrets/ directory, and DJ_* environment variables
  • New Test Infrastructure - Uses testcontainers for automatic MySQL/MinIO management (no manual docker-compose required)

Removed/Deprecated Features

  • dj.conn() interactive prompts - Use environment variables or config file
  • dj.kill() and dj.kill_quick() - Use database administration tools
  • otumat dependency - S3 credential management simplified
  • Positional tuple inserts deprecated - Use dict with explicit field names
  • ~log table deprecated - Schema-level logging table no longer used

🚀 Major Features

Core Type System

Scientist-friendly type names with portable semantics:

  • Numeric: float32, float64, int64, int32, int16, int8, bool
  • Special: uuid (binary(16)), json, bytes (longblob)
  • Temporal: date, datetime
  • String: char(n), varchar(n), enum(...)
  • Fixed-point: decimal(m,n)

Extensible Codec System

import datajoint as dj

class GraphCodec(dj.Codec):
    name = "graph"  # referenced in table definitions as <graph>

    def get_dtype(self, is_store):
        return "<blob>"

    def encode(self, value, *, key=None, store_name=None): ...
    def decode(self, stored, *, key=None): ...

# Use in definitions: data : <graph>

Built-in codecs: <blob>, <blob@>, <attach>, <attach@>, <hash@>, <object@>, <npy@>, <filepath@>
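
To make the encode()/decode() contract concrete, here is a minimal stand-in written in plain Python so that it runs without a database. It mirrors the method signatures shown above but does not subclass dj.Codec. The class name EdgeListGraphCodec and the adjacency-dict graph representation are illustrative assumptions, not part of DataJoint.

```python
class EdgeListGraphCodec:
    """Plain-Python stand-in mirroring the codec method signatures above.

    Assumption: a graph is a dict mapping each node to a set of neighbors.
    A real codec would subclass dj.Codec instead of being a bare class.
    """

    name = "graph"

    def get_dtype(self, is_store):
        # Same return value as the GraphCodec example in the release notes.
        return "<blob>"

    def encode(self, value, *, key=None, store_name=None):
        # Flatten the adjacency dict into a sorted list of (source, target) pairs.
        return sorted((src, dst) for src, nbrs in value.items() for dst in nbrs)

    def decode(self, stored, *, key=None):
        # Rebuild the adjacency dict from the stored edge list.
        # Nodes without outgoing edges are not preserved by this sketch.
        graph = {}
        for src, dst in stored:
            graph.setdefault(src, set()).add(dst)
        return graph


codec = EdgeListGraphCodec()
graph = {"a": {"b", "c"}, "b": {"c"}}
# Decoding the encoded value should give back the original graph.
assert codec.decode(codec.encode(graph)) == graph
```

The point of the sketch is the round trip: whatever encode() produces must be enough for decode() to reconstruct the original value.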

Object-Augmented Schemas (OAS)

  • Hash-addressed (<blob@>, <attach@>, <hash@>): Content-addressed with MD5 deduplication (base32-encoded, 26 chars). Paths: _hash/{hash[:2]}/{hash[2:4]}/{hash}
  • Schema-addressed (<object@>, <npy@>): Paths mirror schema structure: {schema}/{table}/{pk}/{attribute}
  • Filepath references (<filepath@>): Reference existing files in stores without copying
  • Lazy references: NpyRef and ObjectRef provide metadata access without I/O
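
The two path layouts above can be sketched with the standard library alone. This is an illustration of the naming scheme, not DataJoint's implementation: the function names are hypothetical, the lowercase rendering of the hash is an assumption, and the name=value rendering of the primary key is an assumption (the release notes only give the {pk} placeholder).

```python
import base64
import hashlib


def content_hash(data: bytes) -> str:
    """Return a 26-character base32 rendering of the MD5 digest of data."""
    digest = hashlib.md5(data, usedforsecurity=False).digest()  # 16 bytes
    # 16 bytes encode to 26 base32 characters followed by 6 "=" padding characters.
    encoded = base64.b32encode(digest).decode("ascii")
    return encoded.rstrip("=").lower()  # lowercase is an assumption


def hash_addressed_path(data: bytes) -> str:
    """Build _hash/{hash[:2]}/{hash[2:4]}/{hash} for the given content."""
    h = content_hash(data)
    return f"_hash/{h[:2]}/{h[2:4]}/{h}"


def schema_addressed_path(schema: str, table: str, pk: dict, attribute: str) -> str:
    """Build {schema}/{table}/{pk}/{attribute}.

    Assumption: the primary key is rendered as name=value path segments.
    """
    pk_part = "/".join(f"{name}={value}" for name, value in pk.items())
    return f"{schema}/{table}/{pk_part}/{attribute}"
```

Because the hash-addressed path depends only on the content, storing identical bytes twice resolves to the same path, which is what makes deduplication possible. The schema-addressed path depends only on where the value lives in the schema, which is what makes the store browsable.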

Semantic Matching

  • Lineage tracking identifies attribute origins (schema.table.attribute)
  • Binary operations (join, restrict, union, aggr) enforce lineage compatibility
  • Use schema.rebuild_lineage() for legacy schema migration
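
As a conceptual model of the lineage check, the sketch below treats a table heading as a dict from attribute name to a lineage string of the form schema.table.attribute. The function name, the dict representation, and the error type are assumptions made for illustration; DataJoint's internal data structures are not described in these notes.

```python
def matching_attributes(left: dict, right: dict) -> list:
    """Return the attribute names two headings may be matched on.

    Each heading maps attribute name -> lineage ("schema.table.attribute").
    Namesake attributes with different lineage are rejected rather than
    silently matched by name.
    """
    matched = []
    for name, lineage in left.items():
        if name not in right:
            continue
        if lineage != right[name]:
            raise ValueError(
                f"attribute {name!r} has incompatible lineage: "
                f"{lineage} vs {right[name]}"
            )
        matched.append(name)
    return matched


session = {"subject_id": "lab.subject.subject_id", "session": "lab.session.session"}
scan = {"subject_id": "lab.subject.subject_id", "scan": "imaging.scan.scan"}
visit = {"subject_id": "clinic.patient.subject_id", "visit": "clinic.visit.visit"}

# Same name and same origin: the two headings can be joined on subject_id.
assert matching_attributes(session, scan) == ["subject_id"]
```

Calling matching_attributes(session, visit) raises ValueError in this sketch, because the two subject_id attributes share a name but originate in different tables.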

Jobs 2.0

  • Per-table job queues with ~~table_name naming pattern
  • Composite index (status, priority, scheduled_time) for efficient job fetching
  • Improved error tracking and job status management
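
To show why a composite index on (status, priority, scheduled_time) suits job fetching, here is a small sketch that uses SQLite from the standard library as a stand-in for MySQL. Everything beyond the three indexed column names is an assumption: the table name, the job_key column, the "pending" status value, and the convention that a lower priority number is served first.

```python
import sqlite3


def build_demo_queue() -> sqlite3.Connection:
    """Create an in-memory jobs table with a few illustrative rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE jobs (
            job_key TEXT PRIMARY KEY,
            status TEXT NOT NULL,
            priority INTEGER NOT NULL,
            scheduled_time TEXT NOT NULL
        )
        """
    )
    # Composite index in the same column order as the release notes describe.
    conn.execute(
        "CREATE INDEX idx_job_fetch ON jobs (status, priority, scheduled_time)"
    )
    conn.executemany(
        "INSERT INTO jobs VALUES (?, ?, ?, ?)",
        [
            ("a", "pending", 5, "2025-01-01 00:00:00"),
            ("b", "pending", 1, "2025-01-02 00:00:00"),
            ("c", "error", 0, "2025-01-01 00:00:00"),
            ("d", "pending", 1, "2025-01-01 12:00:00"),
        ],
    )
    return conn


def next_job(conn: sqlite3.Connection, now: str):
    """Return the key of the next runnable job, or None if there is none."""
    row = conn.execute(
        """
        SELECT job_key FROM jobs
        WHERE status = 'pending' AND scheduled_time <= ?
        ORDER BY priority, scheduled_time
        LIMIT 1
        """,
        (now,),
    ).fetchone()
    return row[0] if row else None
```

The query filters on status by equality and then orders by priority and scheduled_time, which follows the index column order, so the database can read candidate rows in index order instead of sorting the whole table.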

New Query Operator

  • extend(other) - Left-joins a functionally dependent table, preserving primary key and row count
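
The behavior described for extend can be modeled on plain lists of dicts. This is a sketch of the semantics only, under the assumption that the other table holds at most one row per key value. The free function below and its key parameter are hypothetical and do not reflect the signature of the DataJoint method.

```python
def extend(rows, other, key):
    """Left-join `other` onto `rows` using the attribute names in `key`.

    Every input row appears exactly once in the output. Rows without a
    match get None for the attributes contributed by `other`.
    """
    lookup = {}
    for row in other:
        k = tuple(row[name] for name in key)
        if k in lookup:
            # More than one match would change the row count.
            raise ValueError(f"`other` has more than one row for key {k}")
        lookup[k] = row

    # Attributes that `other` adds beyond the matching key.
    extra = sorted({name for row in other for name in row} - set(key))

    result = []
    for row in rows:
        match = lookup.get(tuple(row[name] for name in key), {})
        result.append({**row, **{name: match.get(name) for name in extra}})
    return result


sessions = [
    {"session": 1, "subject": "m1"},
    {"session": 2, "subject": "m2"},
    {"session": 3, "subject": "m1"},
]
subjects = [{"subject": "m1", "species": "mouse"}]

extended = extend(sessions, subjects, key=["subject"])
assert len(extended) == len(sessions)  # row count is preserved
```

In this example the session for subject m2 is kept with species set to None, whereas an ordinary inner join would drop it.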

Modernized Output Methods

  • keys() - Returns list of primary key dicts
  • to_arrays(*attrs) - Returns tuple of numpy arrays
  • to_dicts() - Returns list of dictionaries
  • to_pandas() - Returns pandas DataFrame
  • to_polars() - Returns Polars DataFrame
  • to_arrow() - Returns PyArrow Table
  • fetch() preserved with deprecation warning for backward compatibility

Configuration Enhancements

  • datajoint.json project config with parent directory search
  • .secrets/ directory for sensitive values (add it to .gitignore)
  • database.database_prefix setting for automatic schema name prefixing
  • database.create_tables setting to control automatic table creation
  • dj.config.override() context manager for temporary config changes
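
A parent directory search for a project config file can be sketched as follows. The function name is hypothetical, and walking all the way to the filesystem root is an assumption; the release notes do not say where the real search stops.

```python
from pathlib import Path


def find_project_config(start, filename="datajoint.json"):
    """Return the nearest config file at or above `start`, or None."""
    start = Path(start).resolve()
    # Check the starting directory first, then each ancestor in turn.
    for directory in (start, *start.parents):
        candidate = directory / filename
        if candidate.is_file():
            return candidate
    return None
```

Calling find_project_config(Path.cwd()) from any subdirectory of a project would return the config file in the closest enclosing directory, so scripts do not need to be launched from the project root.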

📚 Documentation

Documentation has been moved to a dedicated repository and completely rewritten using the Diátaxis framework.

⚖️ License Change

DataJoint 2.0 is released under Apache 2.0 license (previously LGPLv2.1).

Release 0.14.7

02 Feb 19:25
832d92b

🐛 Bug Fixes

When using the generator-based make pattern (make_fetch, make_compute, make_insert), make_kwargs passed to populate() were not forwarded to make_fetch. This caused a TypeError when make_kwargs was used with the tripartite pattern.
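
The forwarding that this fix restores can be illustrated with a toy class. This is a simplified model rather than DataJoint code: the class, its in-memory rows list, and the scale argument are hypothetical, and the real populate() also handles concerns such as transactions and job reservation that are omitted here.

```python
class ScaledValue:
    """Toy stand-in for a computed table using the three-part make pattern."""

    def __init__(self):
        self.rows = []

    def make_fetch(self, key, **kwargs):
        # Gather inputs; receives the keyword arguments given to populate().
        return (key["value"], kwargs.get("scale", 1))

    def make_compute(self, key, value, scale):
        # Pure computation on the fetched inputs.
        return (value * scale,)

    def make_insert(self, key, result):
        # Store the computed result alongside the key.
        self.rows.append({**key, "result": result})

    def populate(self, keys, make_kwargs=None):
        make_kwargs = make_kwargs or {}
        for key in keys:
            # The step the fix concerns: make_kwargs must reach make_fetch.
            fetched = self.make_fetch(key, **make_kwargs)
            computed = self.make_compute(key, *fetched)
            self.make_insert(key, *computed)


table = ScaledValue()
table.populate([{"value": 3}], make_kwargs={"scale": 2})
assert table.rows == [{"value": 3, "result": 6}]
```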

Fixes #1350


⚠️ End-of-Life Notice

This is the final maintenance release for the 0.14.x branch.

  • No further 0.14.x releases are planned
  • There will be no v0.15 — the next major version is v2.0
  • Only security fixes will be considered, on a case-by-case basis

We encourage all users on 0.14.x to plan their migration to v2.0.


Full Changelog: v0.14.6...v0.14.7

Release 0.14.6

31 Jul 22:06
701f5ad

⚡️ Enhancements

📝 Documentation

Full Changelog: v0.14.5...v0.14.6

Release 0.14.5

25 Jul 13:37
4fc2cca

⚡️ Enhancements

🐛 Bug Fixes

  • fix: improve error handling when make_fetch referential integrity fails (#1245) @ttngu207

📝 Documentation

Full Changelog: v0.14.4...v0.14.5

Release 0.14.4

17 Apr 17:01
e0e337f

⚡️ Enhancements

  • Update pyproject.toml (#1229) @dimitri-yatsenko
  • fix: 📝 update home url (#1227) @yambottle
  • fix: 🔥 remove redundant contribution.md | check docs developer guide (#1226) @yambottle
  • datajoint-python developer guide (#1225) @yambottle
  • fix: 🐛 test/release status badge url typo (#1224) @yambottle
  • Dev 846 dj release (#1223) @yambottle
  • feat: ✨ dependabot for action version's auto update (#1222) @yambottle
  • Dev 846 release ci fix (#1216) @yambottle
  • Dev 861 pre commit (#1212) @yambottle
  • Dev 861 stale issues (#1208) @yambottle
  • DEV-861-auto-label (#1209) @yambottle

🐛 Bug Fixes

  • fix: 📝 update home url (#1227) @yambottle
  • fix: 🔥 remove redundant contribution.md | check docs developer guide (#1226) @yambottle
  • fix: 🐛 test/release status badge url typo (#1224) @yambottle
  • Fix #1218 (#1219) @yambottle
  • fix: 🐛 fix rate limit and rename (#1214) @yambottle

📝 Documentation

  • fix: 📝 update home url (#1227) @yambottle
  • fix: 🔥 remove redundant contribution.md | check docs developer guide (#1226) @yambottle
  • datajoint-python developer guide (#1225) @yambottle
  • fix: 🐛 test/release status badge url typo (#1224) @yambottle
  • Dev 861 add readme badge (#1221) @yambottle

Full Changelog: v0.14.3...v0.14.4

Release 0.14.3

23 Sep 17:55
77b75e9

  • Added - dj.Top restriction (#1024) PR #1084
  • Fixed - Added encapsulating double quotes to comply with DOT language - PR #1177
  • Added - DataJoint Python CLI (#940) - PR #1095
  • Added - Ability to set hidden attributes on a table - PR #1091
  • Added - Ability to specify a list of keys to populate - PR #989
  • Fixed - fixed topological sort #1057 - PR #1184
  • Fixed - .parts() not always returning parts #1103 - PR #1184
  • Changed - replace setup.py with pyproject.toml - PR #1183
  • Changed - disable add_hidden_timestamp configuration option by default - PR #1188

Release 0.14.2

22 Aug 19:11
c357772

  • Added - Migrate nosetests to pytest - PR #1142
  • Added - Codespell GitHub Actions workflow
  • Added - GitHub Actions workflow to manually release docs
  • Changed - Update datajoint/nginx to v0.2.6
  • Changed - Migrate docs from https://docs.datajoint.org/python to https://datajoint.com/docs/core/datajoint-python
  • Fixed - DevContainer configuration - PR #1115
  • Fixed - Updated set_password to work on MySQL 8 - PR #1106
  • Added - Missing tests for set_password - PR #1106
  • Changed - Returning success count after the .populate() call - PR #1050
  • Fixed - Autopopulate.populate excludes reserved jobs in addition to ignore and error jobs
  • Fixed - Issue #1159 (cascading delete) - PR #1160
  • Changed - Minimum Python version for DataJoint-Python is now 3.8 - PR #1163
  • Fixed - docker compose commands in CI #1164
  • Changed - Default delete behavior now includes masters of part tables - PR #1158

Release 0.14.1

07 Jun 21:40
e2bfe7a

  • Fixed - Fix altering a part table that uses the "master" keyword - PR #991
  • Fixed - .ipynb output in tutorials is not visible in dark mode (#1078) PR #1080
  • Fixed - preview table font for dark mode PR #1089
  • Changed - Readme to update links and include example pipeline image
  • Changed - Docs to add landing page and update navigation
  • Changed - .data method to .stream in the get() method for S3 (external) objects PR #1085
  • Fixed - Docs to rename create_virtual_module to VirtualModule
  • Added - Skeleton from datajoint-company/datajoint-docs repository for docs migration
  • Added - Initial pytest for test_connection

Release 0.14.0

13 Feb 17:35
4360b17

  • Added - json data type (#245) PR #1051
  • Fixed - Activating a schema requires all tables to exist even if create_tables=False PR #1058
  • Changed - Populate call with reserve_jobs=True to exclude error and ignore keys - PR #1062
  • Added - Support for inserting data with CSV files - PR #1067
  • Changed - Switch testing image from pydev to djtest PR #1012
  • Added - DevContainer development environment compatible with GH Codespaces PR #1071
  • Fixed - Convert lingering prints by replacing with logs PR #1073
  • Changed - table.progress() defaults to no stdout PR #1073
  • Changed - table.describe() defaults to no stdout PR #1073
  • Deprecated - table._update() PR #1073
  • Deprecated - old-style foreign key syntax PR #1073
  • Deprecated - dj.migrate_dj011_external_blob_storage_to_dj012() PR #1073
  • Added - Method to set job keys to "ignore" status - PR #1068

Release 0.13.8

21 Sep 17:07
e7dc65a

  • Add - New documentation structure based on markdown PR #1052
  • Bugfix - Fix queries with backslashes (#999) PR #1052