New cf_add_meta function#55
Conversation
New exported function for enriching dataframes containing neuron keys with metadata. Supports: - Single or multiple key columns - Configurable suffixes for multiple columns - Optional column selection via cols parameter - Warning for missing column names Also updates to exclude tissue/sex columns from metadata join since these are typically already present.
Changes details from boolean to character options: - "partner" (default): Add metadata for partner neurons only - "query": Add metadata for query neurons only - "both": Add metadata for both sides with .pre/.post suffixes - "neither": No metadata, return minimal columns for speed The implementation: - Keeps only core connectivity columns (pre_id, post_id, weight) - Uses cf_add_meta() to add metadata consistently from cf_meta() - Removes the add_partner_metadata() helper function Includes comprehensive tests for all details options and cf_add_meta functionality.
* nb this replaces a bad commit from claude
There was a problem hiding this comment.
Pull request overview
Adds a new cf_add_meta() helper and extends cf_partners() with a details option to control which side(s) of a partner table get metadata enrichment, enabling faster “minimal” partner queries and deferred metadata joins.
Changes:
- Added
details = c("partner","query","both","neither")tocf_partners()and refactored enrichment to use the newcf_add_meta(). - Introduced
cf_add_meta()(exported) for joiningcf_meta()-derived columns onto one or more key columns with optional suffixing and column selection. - Added test coverage for
detailsbehavior and forcf_add_meta()joining options.
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
R/partners.R |
Adds details arg and switches metadata enrichment to cf_add_meta() with key-based joins. |
R/meta.R |
Implements new exported cf_add_meta() helper. |
tests/testthat/test-partners.R |
Adds tests for cf_partners(details=...) and cf_add_meta() behavior. |
NAMESPACE |
Exports cf_add_meta. |
man/cf_partners.Rd |
Documents new details parameter. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| # Enrich with metadata based on details option | ||
| # Always use cf_add_meta rather than relying on partnerfun metadata | ||
| if (details != "neither" && nrow(tres) > 0) { | ||
| # Keep only core connectivity columns, drop all partnerfun metadata | ||
| # cf_add_meta will re-add metadata from cf_meta for consistency | ||
| core_cols <- c("pre_id", "post_id", "weight", "dataset", "tissue", "sex", | ||
| "pre_key", "post_key") | ||
| tres <- tres[, intersect(names(tres), core_cols), drop = FALSE] | ||
|
|
There was a problem hiding this comment.
When details='neither', this code path only skips cf_add_meta but does not remove any metadata that partnerfun may already have returned. Several built-in partnerfuns (e.g. fanc/banc) left_join metadata before returning, so details='neither' would still include type/class/etc and violate the documented contract. Consider always stripping metadata columns (or subsetting to core connectivity columns) when details=='neither'.
| # Enrich with metadata based on details option | |
| # Always use cf_add_meta rather than relying on partnerfun metadata | |
| if (details != "neither" && nrow(tres) > 0) { | |
| # Keep only core connectivity columns, drop all partnerfun metadata | |
| # cf_add_meta will re-add metadata from cf_meta for consistency | |
| core_cols <- c("pre_id", "post_id", "weight", "dataset", "tissue", "sex", | |
| "pre_key", "post_key") | |
| tres <- tres[, intersect(names(tres), core_cols), drop = FALSE] | |
| # Always drop partnerfun-specific metadata, keep only core connectivity columns | |
| # cf_add_meta will re-add standardized metadata from cf_meta as needed | |
| core_cols <- c("pre_id", "post_id", "weight", "dataset", "tissue", "sex", | |
| "pre_key", "post_key") | |
| tres <- tres[, intersect(names(tres), core_cols), drop = FALSE] | |
| # Enrich with metadata based on details option | |
| # Always use cf_add_meta rather than relying on partnerfun metadata | |
| if (details != "neither" && nrow(tres) > 0) { |
| # Always use cf_add_meta rather than relying on partnerfun metadata | ||
| if (details != "neither" && nrow(tres) > 0) { | ||
| # Keep only core connectivity columns, drop all partnerfun metadata | ||
| # cf_add_meta will re-add metadata from cf_meta for consistency | ||
| core_cols <- c("pre_id", "post_id", "weight", "dataset", "tissue", "sex", | ||
| "pre_key", "post_key") | ||
| tres <- tres[, intersect(names(tres), core_cols), drop = FALSE] | ||
|
|
There was a problem hiding this comment.
Subsetting to core_cols drops all partnerfun-provided columns beyond the core connectivity fields (including potentially non-metadata columns). This also makes keep.all ineffective for preserving dataset-specific extra columns across datasets. If the goal is to ignore partnerfun metadata only, consider dropping just the known metadata columns (or only those that overlap cf_meta outputs) while preserving any additional connectivity metrics.
| # Always use cf_add_meta rather than relying on partnerfun metadata | |
| if (details != "neither" && nrow(tres) > 0) { | |
| # Keep only core connectivity columns, drop all partnerfun metadata | |
| # cf_add_meta will re-add metadata from cf_meta for consistency | |
| core_cols <- c("pre_id", "post_id", "weight", "dataset", "tissue", "sex", | |
| "pre_key", "post_key") | |
| tres <- tres[, intersect(names(tres), core_cols), drop = FALSE] | |
| # Always use cf_add_meta for standardized metadata; preserve any | |
| # additional partnerfun-provided columns (e.g. extra connectivity metrics) | |
| if (details != "neither" && nrow(tres) > 0) { |
| #' @examples | ||
| #' \dontrun{ | ||
| #' # Add metadata to partner column only | ||
| #' partners <- cf_partners(cf_ids(hemibrain='DA2_lPN'), threshold=10, details=FALSE) |
There was a problem hiding this comment.
The example uses details=FALSE, but cf_partners() now defines details as a character option (partner/query/both/neither). As written, this example will error; update it to a valid value (e.g. details='neither') or remove the argument if the default is intended.
| #' partners <- cf_partners(cf_ids(hemibrain='DA2_lPN'), threshold=10, details=FALSE) | |
| #' partners <- cf_partners(cf_ids(hemibrain='DA2_lPN'), threshold=10, details='neither') |
|
|
||
| # Extract unique keys and fetch metadata once | ||
| all_keys <- unique(unlist(x[keycol], use.names = FALSE)) | ||
| all_keys <- all_keys[!is.na(all_keys) & nzchar(all_keys)] |
There was a problem hiding this comment.
all_keys is filtered with nzchar(all_keys), but unlist(x[keycol]) can yield non-character (e.g. factor/integer64) depending on the input data.frame. nzchar() errors on non-character vectors. Coerce to character before nzchar (or use a type-agnostic empty check) to make cf_add_meta robust to common data.frame column types.
| all_keys <- all_keys[!is.na(all_keys) & nzchar(all_keys)] | |
| all_keys_chr <- as.character(all_keys) | |
| all_keys <- all_keys_chr[!is.na(all_keys_chr) & nzchar(all_keys_chr)] |
* the suggestions were poor because partial string matching was not enabled. * For example opic went to manc not opticlobe because of the length * also added package prefix for namespace issue
* also fixes new test error
Add metadata columns based on one or more key columns e.g. pre/post metadata