adding second dataset and editing utils + tests #54
Merged
aileeny825 merged 2 commits into main, Apr 24, 2026
aileeny825
approved these changes
Apr 24, 2026
Hi team—
This PR extends the existing data pipeline to incorporate hourly wholesale electricity price data (LMP) alongside the current EIA demand datasets. It introduces a new ingestion script, load_gridstatus_hourly_prices_to_bigquery.py, which uses the Python gridstatus library to pull a rolling 7-day window of hourly price data for three regions: CAISO (California), NYISO (New York), and ERCOT (Texas). The script standardizes the schema across ISOs (LMP plus its energy, congestion, and loss components) and replaces the destination BigQuery table (eia_data.hourly_pricing_main) on each run.
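For reviewers who want a quick picture of the windowing and schema step, here is a minimal sketch. It is not the code in load_gridstatus_hourly_prices_to_bigquery.py: the target column names, the `rolling_window` and `standardize_row` helpers, and the per-ISO column map are all assumptions for illustration, and the sketch works on plain dicts rather than on gridstatus output.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical shared schema; the column names used in the PR are not shown here.
STANDARD_COLUMNS = [
    "interval_start_utc", "iso", "location",
    "lmp", "energy", "congestion", "loss",
]


def rolling_window(now, days=7):
    """Return (start, end) for a window of whole hours ending at the current hour."""
    end = now.replace(minute=0, second=0, microsecond=0)
    return end - timedelta(days=days), end


def standardize_row(iso, raw, column_map):
    """Rename ISO-specific keys to the shared schema.

    column_map maps a standard column name to the source key for this ISO.
    Columns with no source key (or missing from the raw row) become None.
    """
    row = {"iso": iso}
    for target in STANDARD_COLUMNS:
        if target == "iso":
            continue
        source = column_map.get(target)
        row[target] = raw.get(source) if source else None
    return row


# Example: an ISO feed that only exposes a total LMP, with no components.
start, end = rolling_window(datetime(2026, 4, 24, 15, 37, tzinfo=timezone.utc))
example = standardize_row(
    "ERCOT",
    {"Time": "2026-04-24T15:00:00Z", "Location": "HB_HOUSTON", "LMP": 25.0},
    {"interval_start_utc": "Time", "location": "Location", "lmp": "LMP"},
)
```

Filling missing components with None keeps every ISO on the same set of columns, which is what lets a single destination table be replaced on each run.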
The BigQuery utilities were updated to include a new pricing_table_id configuration field and a read_pricing_data() helper function to support downstream analysis and visualization.
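As a rough illustration of how the new field could be consumed, the sketch below shows a config object carrying pricing_table_id and a query builder that a helper like read_pricing_data() might call. The `BigQueryConfig` class, the `demand_table_id` field, the `build_pricing_query` function, and the `iso` column name are hypothetical; the actual utilities in this PR may be shaped differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BigQueryConfig:
    project_id: str
    demand_table_id: str   # hypothetical name for an existing EIA table field
    pricing_table_id: str  # the new field described in this PR


def build_pricing_query(config, iso=None):
    """Return (sql, params) selecting from the pricing table.

    When iso is given, the filter uses a named parameter (@iso) instead of
    string interpolation, so the value is never spliced into the SQL text.
    """
    sql = f"SELECT * FROM `{config.project_id}.{config.pricing_table_id}`"
    params = {}
    if iso is not None:
        sql += " WHERE iso = @iso"  # assumes the table has an `iso` column
        params["iso"] = iso
    return sql, params


config = BigQueryConfig(
    project_id="my-project",  # placeholder project ID
    demand_table_id="eia_data.hourly_demand",  # placeholder table
    pricing_table_id="eia_data.hourly_pricing_main",
)
sql, params = build_pricing_query(config, iso="CAISO")
```

Returning the SQL and its parameters separately keeps the builder testable without a BigQuery client, which fits the kind of config-level test updates mentioned here.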
A new GitHub Actions workflow was also added to refresh the pricing data daily and create a 30-day snapshot backup, consistent with the existing EIA pipelines. Tests were updated to account for the new configuration field, and all tests are currently passing.
Notably, ERCOT required additional handling due to its MIS-based data access pattern, and it currently takes ~20 minutes to load. If that causes problems for our daily ingestion, we can swap ERCOT for a different region that is easier to access.
This work sets up the team's next step: building a Streamlit analytics page that merges price and demand data and analyzes price-demand relationships by ISO over the last seven days.
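To make the planned merge concrete, here is a minimal sketch of joining price and demand on ISO and hour. The `merge_price_demand` function and the field names (`hour`, `lmp`, `demand_mwh`) are assumptions, and it presumes prices have already been aggregated to one value per ISO per hour; the Streamlit page may well do this with a DataFrame merge instead.

```python
def merge_price_demand(prices, demand):
    """Inner-join price and demand rows on (iso, hour).

    Both inputs are lists of dicts. Rows without a match on the other side
    are dropped, as in an inner join.
    """
    demand_by_key = {(d["iso"], d["hour"]): d["demand_mwh"] for d in demand}
    merged = []
    for p in prices:
        key = (p["iso"], p["hour"])
        if key in demand_by_key:
            merged.append({
                "iso": p["iso"],
                "hour": p["hour"],
                "lmp": p["lmp"],
                "demand_mwh": demand_by_key[key],
            })
    return merged


# Made-up sample rows, for illustration only.
prices = [
    {"iso": "CAISO", "hour": "2026-04-24T15:00Z", "lmp": 31.5},
    {"iso": "NYISO", "hour": "2026-04-24T15:00Z", "lmp": 42.0},
]
demand = [
    {"iso": "CAISO", "hour": "2026-04-24T15:00Z", "demand_mwh": 24000},
]
merged = merge_price_demand(prices, demand)
```

An inner join is used here so that hours missing from either dataset do not produce half-empty rows in the analysis.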
Thanks!
Aria