Skip to content

Publisher filter#210

Open
FuhuXia wants to merge 3 commits intomainfrom
publisher-filter
Open

Publisher filter#210
FuhuXia wants to merge 3 commits intomainfrom
publisher-filter

Conversation

@FuhuXia
Copy link
Copy Markdown
Member

@FuhuXia FuhuXia commented Mar 16, 2026

For

About:
add publisher as a new filter, next to keywords, organizations...

Changes:

  • added new publisher subfields to the OpenSearch mapping
  • added get_top_publishers() to get top 100 publishers
  • added /api/publishers
  • added publisher filter on the UI

TODO
this need a new OpenSearch index.

@FuhuXia FuhuXia requested a review from a team March 16, 2026 22:10
Copy link
Copy Markdown
Member

@cmhedrickREI cmhedrickREI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

"""Return the top 100 publishers ordered by dataset count."""
publishers = self.opensearch.get_publisher_counts(size=100)

return sorted(
Copy link
Copy Markdown
Contributor

@rshewitt rshewitt Apr 8, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this seems complicated. opensearch agg should return non-empty publishers with count > 0. if that's less than 100 so what. we don't need to sort by name only dataset count. also, each item should always contain the necessary keys so no need to check if one exists. why can't this be...

return sorted( publishers, key=lambda  item: item["count"], reverse=True)

or better yet why can the OS result be sorted already?

org_id=None,
org_types=None,
keywords: list[str] = None,
publisher: str | None = None,
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

where is get_contextual_aggregations called?

publisher_size=100,
) -> dict:
"""
Get keyword and organization aggregations based on current search context.
Copy link
Copy Markdown
Contributor

@rshewitt rshewitt Apr 9, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

update docstring to include publishers?

publisher_size=100,
) -> dict:
"""
Get keyword and organization aggregations based on current search context.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

update docstring for publisher

}

class PublisherAutocomplete {
constructor(options) {
Copy link
Copy Markdown
Contributor

@rshewitt rshewitt Apr 9, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

at this point we now have 3 classes sharing identical functionality. publisher, keyword, and organization. seems like a good opportunity to extract a base class so we can avoid having to review as much content.

_get_publisher_counts,
)

publishers = interface_with_dataset.get_top_publishers()
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this sorts by multiple fields (count then name) and checks for key presence. can you update the fixture data with more examples to reflect the expected sorting behavior?

assert geography_panel is not None


class TestPublisherSearch:
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

does this work for organization? how about multiple publishers? do the chips look they way we expect them to? (only the selected ones are present and the counts are correct), do aggregates update after filtering by publisher?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants