C2m2 scripts #255

Draft
wilke0818 wants to merge 6 commits into main from c2m2_scripts

Conversation

@wilke0818
Contributor

  • Create scripts and documentation for generating C2M2 metadata
  • Create metadata based on the bundled data (PhysioNet feature data, not raw audio data)
  • Create subject data separately, so that metadata about sex and race can be included but carries no links to other data types; this prevents diseases from being connected to subjects (per DAC/consent agreements)
  • Generate links between all other C2M2 types, including biosamples (any voice recording), files, and diseases
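
The separation described in the list above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code in this PR: the function `build_tables`, the record fields (`participant_id`, `recording_id`, `file_name`, `diseases`), and the simplified table and column names are all hypothetical placeholders rather than the C2M2 schema.

```python
def build_tables(records):
    """Build demographic-only subject rows plus linked rows for all other types.

    `records` is a hypothetical list of dicts, one per voice recording.
    """
    # participant_id is used only to deduplicate; it is never written out.
    demographics = {r["participant_id"]: (r["sex"], r["race"]) for r in records}

    # Sort by demographic values so row order does not mirror the input order.
    subject_rows = [
        {"local_id": f"subject-{i}", "sex": sex, "race": race}
        for i, (sex, race) in enumerate(sorted(demographics.values()), start=1)
    ]

    biosample_rows = []
    file_biosample_rows = []
    biosample_disease_rows = []
    for r in records:
        # Each voice recording is treated as one biosample.
        biosample_id = f"biosample-{r['recording_id']}"
        biosample_rows.append({"local_id": biosample_id})
        file_biosample_rows.append({"file": r["file_name"], "biosample": biosample_id})
        for disease in r["diseases"]:
            biosample_disease_rows.append({"biosample": biosample_id, "disease": disease})

    # Deliberately no table linking subjects to biosamples, files, or diseases.
    return {
        "subject": subject_rows,
        "biosample": biosample_rows,
        "file_describes_biosample": file_biosample_rows,
        "biosample_disease": biosample_disease_rows,
    }
```

In this sketch the subject rows share no identifier with any other table, so sex and race can be reported in aggregate while diseases stay attached only to biosamples and files.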

Contributor

@gemini-code-assist bot left a comment

Code Review

This pull request introduces a suite of scripts to convert Bridge2AI voice bundle datasets into C2M2-compliant metadata, including tools for processing file metadata, subject demographics, and biosample associations. The reviewer identified a critical issue where only the first feature file was processed and recommended removing the redundant and buggy peds_bundle_to_c2m2.py script. Additional feedback suggests using more idiomatic pathlib methods for file traversal, implementing more specific exception handling, correcting documentation links and formatting, and completing missing ontology mappings for pediatric conditions and parquet file formats.
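
To illustrate two of the points in the review (processing every feature file rather than only the first, and using `pathlib` for traversal), here is a minimal sketch. It does not reproduce the scripts in this PR: the function names `find_feature_files` and `process_all`, the `handler` callback, and the default `*.parquet` pattern are assumptions made for the example.

```python
from pathlib import Path


def find_feature_files(bundle_dir, pattern="*.parquet"):
    """Return every file under bundle_dir matching pattern, in a stable order."""
    # Path.rglob walks the directory tree recursively.
    return sorted(p for p in Path(bundle_dir).rglob(pattern) if p.is_file())


def process_all(bundle_dir, handler, pattern="*.parquet"):
    """Apply handler to every matching feature file, not just the first one."""
    results = []
    for path in find_feature_files(bundle_dir, pattern):
        results.append(handler(path))
    return results
```

A call such as `process_all("bundle/", lambda p: p.name)` would return the names of all matching files in sorted path order, which makes it easy to check that none are skipped.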
