This repository contains a data processing workflow to export geospatial data layers for the 30 x 30 planning project in the Democratic Republic of the Congo (DRC) using the Google Earth Engine (GEE) Python API. It computes grid-level summary statistics for various environmental and land cover datasets in the DRC, exporting the results to CSV.
All data layers are listed in gee_grid_analysis/datasets.py. New data layers can be added to this module.
The output of this repository can be used within the 30 x 30 optimization toolbox available here.
drc_30x30_data_prep/
├── Dockerfile
├── authenticate.ipynb
├── requirements.txt
├── README.md
├── gee_grid_analysis/
│   ├── __init__.py
│   ├── datasets.py
│   └── utils.py
├── scripts/
│   ├── gee_export.py
│   └── merge_tables.py
└── data/
    └── drc_1km_grid_reference_table.csv
After the code runs, an output folder is created and the merged CSV file is written to it.
This toolbox is developed in Python. To ensure reproducible results, it is packaged in a Docker environment. To run the code locally, install Docker (free of charge) and follow the instructions below.
If you are running this on Windows, we recommend running it inside Windows Subsystem for Linux (WSL). Follow the instructions here to enable WSL on Windows.
git clone git@github.com:ClarkCGA/drc_30x30_data_prep.git
cd drc_30x30_data_prep
docker build -t drc-30x30-data-prep .
docker run -it -p 8888:8888 -v $(pwd):/app/ drc-30x30-data-prep
This prints the Jupyter Lab URL (including its token). Copy the URL and paste it into a browser to launch Jupyter Lab.
Open authenticate.ipynb and run the two cells to authenticate with your GEE account in the Docker container.
Open a terminal in Jupyter Lab and run the following to export all the data from GEE:
python scripts/gee_export.py
All exports are sent to your Google Drive, in a folder named GEE_Downloads. You can monitor progress in the Earth Engine Tasks tab.
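As a rough illustration of what one per-layer export involves, the sketch below assembles the arguments for an Earth Engine table export to the GEE_Downloads Drive folder. The helper names (`build_export_args`, `start_export`), the `tree_cover` layer name, the mean reducer, and the 1000 m scale are assumptions made for this example; they are not taken from scripts/gee_export.py.

```python
def build_export_args(layer_name, folder="GEE_Downloads"):
    """Return keyword arguments for ee.batch.Export.table.toDrive (hypothetical helper)."""
    return {
        "description": f"{layer_name}_grid_stats",  # task name shown in the Tasks tab
        "folder": folder,                           # Google Drive folder
        "fileNamePrefix": layer_name,               # CSV file name without extension
        "fileFormat": "CSV",
    }


def start_export(image, grid, layer_name, scale=1000):
    """Summarize `image` over each grid cell and queue a Drive export.

    Assumes ee.Initialize() has already succeeded (see authenticate.ipynb).
    """
    import ee  # imported lazily so build_export_args works without Earth Engine

    stats = image.reduceRegions(
        collection=grid,
        reducer=ee.Reducer.mean(),  # assumed statistic; the real script may differ
        scale=scale,                # assumed pixel scale in meters
    )
    task = ee.batch.Export.table.toDrive(
        collection=stats, **build_export_args(layer_name)
    )
    task.start()
    return task


# "tree_cover" is a made-up layer name used only for illustration.
args = build_export_args("tree_cover")
print(args["folder"], args["fileNamePrefix"])
```

Only `build_export_args` runs without an authenticated Earth Engine session; `start_export` needs a real `ee.Image` and a grid `ee.FeatureCollection`.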
After the GEE export completes, move all the exported CSV files to data/gee, then run the following script to add spatial neighborhood information and merge all CSV files into a single file for further analysis:
python scripts/merge_tables.py
This will generate a file named drc_1km_data_planning_units.csv in the output folder.
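The merge step can be pictured as joining the per-layer tables on a shared grid-cell identifier. The sketch below uses only the standard library. The `grid_id` key, the column names, and the values are made up for illustration, and the spatial neighborhood step is not shown, so this is not a description of what scripts/merge_tables.py actually does.

```python
import csv
import io


def read_rows(handle):
    """Read an open CSV file handle into a list of dict rows."""
    return list(csv.DictReader(handle))


def merge_on_key(tables, key="grid_id"):
    """Outer-join several lists of rows on `key`; each table contributes its columns."""
    merged = {}
    for rows in tables:
        for row in rows:
            merged.setdefault(row[key], {key: row[key]}).update(row)
    return [merged[k] for k in sorted(merged)]


# Two tiny in-memory tables standing in for files in data/gee (made-up values).
elevation = read_rows(io.StringIO("grid_id,elevation\n1,350\n2,410\n"))
forest = read_rows(io.StringIO("grid_id,forest_pct\n1,82\n3,15\n"))

merged = merge_on_key([elevation, forest])
for row in merged:
    print(row)
```

Cells missing from one table simply lack that column in the merged row. A real merge would likely fill such gaps explicitly before writing the final CSV.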
To change/add datasets or export parameters, modify:
gee_grid_analysis/datasets.py
gee_grid_analysis/utils.py
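For readers unfamiliar with this kind of module, one plausible way to organize a list of data layers is a small registry of dataset descriptions. The sketch below is hypothetical: `DatasetSpec`, `DATASETS`, `add_dataset`, and the field names are invented for this example and may not match how gee_grid_analysis/datasets.py is written. The second asset path is a placeholder.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DatasetSpec:
    """Hypothetical description of one layer to summarize per grid cell."""
    name: str              # used for the output column and file name
    asset_id: str          # Earth Engine asset identifier
    band: str              # band to summarize
    reducer: str = "mean"  # statistic computed per grid cell


# Illustrative registry; this entry is an example, not the project's actual list.
DATASETS = [
    DatasetSpec(name="elevation", asset_id="USGS/SRTMGL1_003", band="elevation"),
]


def add_dataset(registry, spec):
    """Return a new registry with `spec` appended, rejecting duplicate names."""
    if any(existing.name == spec.name for existing in registry):
        raise ValueError(f"dataset {spec.name!r} is already registered")
    return [*registry, spec]


# Placeholder asset path and band; replace with a real layer.
extended = add_dataset(
    DATASETS,
    DatasetSpec(name="my_layer", asset_id="projects/your-project/assets/your_layer", band="b1"),
)
print([d.name for d in extended])
```

Rejecting duplicate names keeps output columns and exported file names unambiguous when layers are added later.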
This project is funded by the Wildlife Conservation Society (WCS) through a contract with Clark CGA.
If you run into any issues running this code, or have questions about it, you can open a ticket on this GitHub repository or contact the Clark CGA team at [email protected].