-
Notifications
You must be signed in to change notification settings - Fork 2
Add scripts to upload sites to the catalog #37
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR adds scripts to enable bulk upload of CKAN site instance metadata to the POSE Ecosystem catalog. The implementation provides a CSV-based workflow for creating and updating site entries in the catalog.
- Adds a Python upload script with CSV parsing, API interaction, and validation logic
- Includes a template CSV file with example site data for different site types
- Provides comprehensive documentation covering field definitions, usage instructions, and troubleshooting
Reviewed Changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 6 comments.
| File | Description |
|---|---|
| sites-upload/upload_sites.py | Main upload script handling CSV processing, CKAN API calls, field validation, and error handling |
| sites-upload/sites_metadata_template.csv | Template CSV with example site entries demonstrating various field configurations |
| sites-upload/config_sites.py | Configuration file for CKAN instance URL and API key |
| sites-upload/README.md | Documentation covering CSV format, field descriptions, usage instructions, and troubleshooting |
Comments suppressed due to low confidence (2)
sites-upload/upload_sites.py:114
- Variable response is not used.
response = action('organization_show', {'id': org_name})
sites-upload/upload_sites.py:24
- File is opened but is not closed.
open(os.path.abspath(file_dict.get('path')), 'rb'),
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| data_dict['upload'] = ( | ||
| file_dict.get('file_name'), | ||
| open(os.path.abspath(file_dict.get('path')), 'rb'), | ||
| 'application/octet-stream' | ||
| ) | ||
| fields = dict(data_dict) | ||
| m = MultipartEncoder(fields=fields) | ||
| r = requests.post( | ||
| ckanConfig.get('url') + '/api/action/' + action, | ||
| data=m, | ||
| headers={ | ||
| 'content-type': m.content_type, | ||
| 'Authorization': ckanConfig.get('api_key') | ||
| } | ||
| ) |
Copilot
AI
Nov 20, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
File handle is never closed, leading to a resource leak. The opened file should be closed after the request completes. Consider using a context manager or explicitly closing the file after the request.
| data_dict['upload'] = ( | |
| file_dict.get('file_name'), | |
| open(os.path.abspath(file_dict.get('path')), 'rb'), | |
| 'application/octet-stream' | |
| ) | |
| fields = dict(data_dict) | |
| m = MultipartEncoder(fields=fields) | |
| r = requests.post( | |
| ckanConfig.get('url') + '/api/action/' + action, | |
| data=m, | |
| headers={ | |
| 'content-type': m.content_type, | |
| 'Authorization': ckanConfig.get('api_key') | |
| } | |
| ) | |
| file_path = os.path.abspath(file_dict.get('path')) | |
| with open(file_path, 'rb') as f: | |
| data_dict['upload'] = ( | |
| file_dict.get('file_name'), | |
| f, | |
| 'application/octet-stream' | |
| ) | |
| fields = dict(data_dict) | |
| m = MultipartEncoder(fields=fields) | |
| r = requests.post( | |
| ckanConfig.get('url') + '/api/action/' + action, | |
| data=m, | |
| headers={ | |
| 'content-type': m.content_type, | |
| 'Authorization': ckanConfig.get('api_key') | |
| } | |
| ) |
|
|
||
| try: | ||
| response = action('organization_show', {'id': org_name}) | ||
| return True |
Copilot
AI
Nov 20, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The response object is assigned but never checked. If the action() call raises no exception but returns a failed response (e.g., status code 404), this will incorrectly return True. Should check response status or success in the returned JSON.
| return True | |
| if response.status_code == 200: | |
| json_data = response.json() | |
| if json_data.get('success'): | |
| return True | |
| logger.warning(f"Organization {org_name} not found or API call failed: status={response.status_code}, response={response.text}") | |
| return False |
| print(r.json()) | ||
| print("\n") |
Copilot
AI
Nov 20, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using print() statements for output is inconsistent with the logging framework used throughout the rest of the script. Should use logger.info() or logger.debug() instead for consistent logging.
| print(r.json()) | |
| print("\n") | |
| logger.info(r.json()) | |
| logger.info("\n") |
|
|
||
| Use decimal degrees format: | ||
| - Latitude: -90 to 90 (negative = South) | ||
| - longitude: -180 to 180 (negative = West) |
Copilot
AI
Nov 20, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Corrected capitalization of 'longitude' to match the consistent capitalization of 'Latitude' in line 159.
| - longitude: -180 to 180 (negative = West) | |
| - Longitude: -180 to 180 (negative = West) |
| ``` | ||
| Error: 403 Forbidden | ||
| ``` | ||
| **Solution:** Check API key in `config.py` |
Copilot
AI
Nov 20, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The filename reference is incorrect. The actual configuration file is named 'config_sites.py', not 'config.py'. This will confuse users trying to troubleshoot authentication errors.
| **Solution:** Check API key in `config.py` | |
| **Solution:** Check API key in `config_sites.py` |
| """ | ||
| Perform CKAN API action with optional file upload | ||
| """ | ||
| fields = data_dict |
Copilot
AI
Nov 20, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add new scripts to upload ckan sites to the catalog