Once we're confident that NGED's S3 data pipeline is working well, then delete all the CKAN code, including:
- The entire `nged_data` package.
- All the old CKAN Dagster assets.
- Anything in the data contracts that's only used for the CKAN data, e.g. the different substation names in the location data versus the live substation data.
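Before deleting anything, it might help to list what still refers to the CKAN code. The following is only a rough sketch: the function name `find_references`, the search patterns, and the file suffixes are all assumptions for illustration, not existing code in this repo.

```python
import re
from pathlib import Path

# Hypothetical patterns: adjust to however the CKAN code is actually referenced.
PATTERNS = [r"\bnged_data\b", r"\bckan\b"]


def find_references(root, patterns=PATTERNS, suffixes=(".py", ".toml", ".md")):
    """Return (path, line_number, line) for each line matching any pattern."""
    regex = re.compile("|".join(patterns), flags=re.IGNORECASE)
    hits = []
    for path in sorted(Path(root).rglob("*")):
        # Only scan regular files with one of the chosen suffixes.
        if not path.is_file() or path.suffix not in suffixes:
            continue
        text = path.read_text(encoding="utf-8", errors="ignore")
        for number, line in enumerate(text.splitlines(), start=1):
            if regex.search(line):
                hits.append((path, number, line.strip()))
    return hits


# Example usage (run from the repo root):
#     for path, number, line in find_references("."):
#         print(f"{path}:{number}: {line}")
```

The word boundaries mean `nged_json_data` is not reported as a match for `nged_data`, so the new package should not show up as a false positive.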
TODO
- Do we need `startTime`? Can't we just store `endTime`, and ensure the incoming data is half-hourly?
- Rename `value` to `power` in the XGBoostFeatures, and elsewhere?
- Rename `end_time` to `valid_time` in `PowerTimeSeries` to match the use of `valid_time` in the NWP data? Or maybe it's nice to be explicit that `end_time` is the end time of the half-hour period?
- Rename `packages/nged_json_data` to `nged_timeseries_data` (assuming we use a different package for the non-time-series data, like the adjacency matrix)? Or maybe to `nged_data`?
- Rename `flows_30m` to `power_time_series`.
- `h3_res_5` column in the `time_series_metadata.parquet`, and, crucially, make sure it's not over-complicated. I think there might be some silly code in the model training that fills in the H3 if it's missing. But it shouldn't be missing!
- `csv` or `substation` #105
Related