Add AST-based dataflow diagram generation for documentation #566

Open
archit7-beep wants to merge 18 commits into mllam:main from archit7-beep:add-dataflow-generator

@archit7-beep
Contributor

@archit7-beep archit7-beep commented Mar 31, 2026

Describe your changes

This PR introduces an initial prototype for generating dataflow diagrams from the codebase using an AST-based approach to improve documentation clarity.

The implementation extracts method-level relationships (e.g., process_step) and generates Mermaid-based diagrams representing how data flows through different components.
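A minimal sketch of how such an extraction could work, not the code in this PR: it uses the built-in `ast` module to collect `self.<method>()` calls between methods of the same class and prints them as a Mermaid flowchart. The `Model` class and every method name other than `process_step` are hypothetical and serve only as input for the illustration.

```python
import ast

# Hypothetical input; only `process_step` is named in the PR description.
SOURCE = '''
class Model:
    def training_step(self, batch):
        prediction = self.process_step(batch)
        return self.compute_loss(prediction)

    def process_step(self, batch):
        return self.predict_step(batch)

    def predict_step(self, batch):
        return batch

    def compute_loss(self, prediction):
        return prediction
'''


def method_call_edges(source):
    """Return sorted (caller, callee) pairs for self.<method>() calls
    between methods defined on the same class."""
    edges = set()
    for cls in ast.walk(ast.parse(source)):
        if not isinstance(cls, ast.ClassDef):
            continue
        # Methods defined directly in the class body.
        methods = {n.name: n for n in cls.body if isinstance(n, ast.FunctionDef)}
        for name, func in methods.items():
            for node in ast.walk(func):
                if (
                    isinstance(node, ast.Call)
                    and isinstance(node.func, ast.Attribute)
                    and isinstance(node.func.value, ast.Name)
                    and node.func.value.id == "self"
                    and node.func.attr in methods
                ):
                    edges.add((name, node.func.attr))
    return sorted(edges)


def to_mermaid(edges):
    """Render the edges as a top-down Mermaid flowchart."""
    lines = ["flowchart TD"]
    lines += [f"    {caller} --> {callee}" for caller, callee in edges]
    return "\n".join(lines)


edges = method_call_edges(SOURCE)
mermaid = to_mermaid(edges)
print(mermaid)
```

This sketch only sees direct `self.` calls inside one class; calls through other objects, inherited methods, or dynamically dispatched calls would need additional handling.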

This work complements the ongoing structured documentation efforts by adding visual context to better understand model and pipeline behavior.

Example diagrams

🔁 Data Flow


🏗️ Structure Diagram


Motivation:
While current documentation improves structure and API visibility, understanding data flow still requires manual code tracing. This approach aims to make data pipelines easier to understand and improve onboarding for new contributors.

Dependencies:

  • Python AST (built-in)
  • Mermaid (for diagram rendering)

Issue Link

Related to #61

Type of change

  • 🐛 Bug fix (non-breaking change that fixes an issue)
  • ✨ New feature (non-breaking change that adds functionality)
  • 💥 Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • 📖 Documentation (Addition or improvements to documentation)

Checklist before requesting a review

  • My branch is up-to-date with the target branch - if not update your fork with the changes from the target branch (use pull with --rebase option if possible).
  • I have performed a self-review of my code
  • For any new/modified functions/classes I have added docstrings that clearly describe its purpose, expected inputs and returned values
  • I have placed in-line comments to clarify the intent of any hard-to-understand passages of my code
  • I have updated the README to cover introduced code changes
  • I have added tests that prove my fix is effective or that my feature works
  • I have given the PR a name that clearly describes the change, written in imperative form (context).
  • I have requested a reviewer and an assignee (assignee is responsible for merging). This applies only if you have write access to the repo, otherwise feel free to tag a maintainer to add a reviewer and assignee.

Checklist for reviewers

Each PR comes with its own improvements and flaws. The reviewer should check the following:

  • the code is readable
  • the code is well tested
  • the code is documented (including return types and parameters)
  • the code is easy to maintain

Author checklist after completed review

  • I have added a line to the CHANGELOG describing this change, in a section
    reflecting type of change (add section where missing):
    • added: when you have added new functionality
    • changed: when default behaviour of the code has been changed
    • fixes: when your contribution fixes a bug
    • maintenance: when your contribution relates to repo maintenance, e.g. CI/CD or documentation

Checklist for assignee

  • PR is up to date with the base branch
  • the tests pass
  • (if the PR is not just maintenance/bugfix) the PR is assigned to the next milestone. If it is not, propose it for a future milestone.
  • author has added an entry to the changelog (and designated the change as added, changed, fixed or maintenance)
  • Once the PR is ready to be merged, squash commits and merge the PR.

@archit7-beep archit7-beep marked this pull request as ready for review April 1, 2026 16:25
@joeloskarsson
Collaborator

Hi, could you add pictures of the diagrams here in the PR, so it is easier to get an idea of what they contain?

@archit7-beep
Contributor Author

> Hi, could you add pictures of the diagrams here in the PR, so it is easier to get an idea of what they contain?

@joeloskarsson I have added the diagrams to the PR description.

@sadamov sadamov self-requested a review April 9, 2026 07:11