gcp-pdp/geo-earthengine

Google Earth Engine raster to BigQuery SQL conversion project

Extracting large rasters (dozens of gigabytes each) from Google Earth Engine (GEE) requires enough RAM plus swap space for raster fetching. The per-chunk download process does not need much memory as long as there are no network or server errors. When errors do occur, a memory-intensive process reorganizes the downloaded chunks and re-downloads the missing ones. We usually see 1-2 transfer errors for every 10 GB of downloaded rasters. 32 GB of RAM plus swap is the right amount for the scripts below, where the raster cache size is set to 30 000 MB (GDAL_CACHEMAX=30000). With 3 GB of RAM plus 30 GB of swap we are able to download 500 GB of rasters in 12-24 hours.
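As a rough planning aid, the figures above can be turned into a quick estimate. The sketch below is plain arithmetic on the numbers quoted in this README (1-2 errors per 10 GB, 500 GB in 12-24 hours); the function names are hypothetical and are not part of the repository's scripts.

```python
def expected_transfer_errors(total_gb, errors_per_10gb=(1, 2)):
    """Return the (low, high) number of transfer errors to expect for a download.

    The default rate of 1-2 errors per 10 GB is the one quoted in this README.
    """
    low, high = errors_per_10gb
    return (total_gb / 10 * low, total_gb / 10 * high)


def throughput_mb_per_s(total_gb, hours):
    """Average download throughput in MB/s, taking 1 GB as 1000 MB."""
    return total_gb * 1000 / (hours * 3600)


if __name__ == "__main__":
    low, high = expected_transfer_errors(500)
    print(f"500 GB download: about {low:.0f}-{high:.0f} transfer errors expected")
    print(f"Throughput at 24 h: {throughput_mb_per_s(500, 24):.1f} MB/s")
    print(f"Throughput at 12 h: {throughput_mb_per_s(500, 12):.1f} MB/s")
```

For a 500 GB download this suggests several dozen chunk retries, which is why the memory-intensive recovery path should be budgeted for rather than treated as an exception.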

Access to Google Earth Engine (GEE) requires a service account key (named /root/gee-export.json in the scripts). To create your own, follow the link Create and register a service account to use Earth Engine.
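For illustration, a service account key can be used from the Earth Engine Python API roughly as follows. This is a minimal sketch, not necessarily how the scripts in this repository authenticate: it assumes the earthengine-api package is installed and that the key is a standard Google Cloud JSON key containing a client_email field. The helper names are hypothetical.

```python
import json

# Path used by the scripts in this repository.
KEY_FILE = "/root/gee-export.json"


def service_account_email(key_file):
    """Read the service account e-mail from a Google Cloud JSON key file."""
    with open(key_file) as f:
        return json.load(f)["client_email"]


def init_earth_engine(key_file=KEY_FILE):
    """Authenticate to Earth Engine with a service account key (hypothetical helper)."""
    # Imported lazily so the helper above also works without earthengine-api.
    import ee

    credentials = ee.ServiceAccountCredentials(service_account_email(key_file), key_file)
    ee.Initialize(credentials)
```

Calling init_earth_engine() once at the start of a session should be enough; the service account must already be registered for Earth Engine as described in the linked guide.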

GeoTIFF to BigQuery CSV conversion tool

Use the provided script geotif-to-bqcsv.py to convert WGS 84 GeoTIFF files to BigQuery CSV data and a table schema:

./geotif-to-bqcsv.py GeoTIFF_file [CSV_file]

or for batch conversion:

find . -name '*.tif' -print0 | parallel -0 geotif-to-bqcsv.py '{}' '{}'.csv

With only the mandatory first argument, the script prints the corresponding BigQuery table schema. With the optional second argument, it also converts the entire GeoTIFF file to CSV, writes the result to the specified file, and prints the schema.
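To give an idea of what a raster-to-CSV conversion involves, the sketch below maps pixel indices to WGS 84 coordinates with a GDAL-style six-element geotransform and writes one CSV row per pixel. It is an illustration only: the lon/lat/value column layout and the function names are assumptions, and the actual output format of geotif-to-bqcsv.py may differ.

```python
import csv
import io


def pixel_center(geotransform, col, row):
    """Return (x, y) of a pixel center using a GDAL-style affine geotransform.

    geotransform = (x_origin, pixel_width, row_rotation,
                    y_origin, column_rotation, pixel_height)
    """
    x0, dx, rx, y0, ry, dy = geotransform
    x = x0 + (col + 0.5) * dx + (row + 0.5) * rx
    y = y0 + (col + 0.5) * ry + (row + 0.5) * dy
    return x, y


def raster_to_csv(values, geotransform, nodata=None):
    """Convert a 2D list of pixel values to CSV text, skipping nodata pixels.

    The lon,lat,value columns are an assumed layout for illustration.
    """
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["lon", "lat", "value"])
    for row, line in enumerate(values):
        for col, value in enumerate(line):
            if nodata is not None and value == nodata:
                continue
            lon, lat = pixel_center(geotransform, col, row)
            writer.writerow([lon, lat, value])
    return out.getvalue()


if __name__ == "__main__":
    # 2x2 raster with 1-degree pixels; upper-left corner at lon 10, lat 50.
    gt = (10.0, 1.0, 0.0, 50.0, 0.0, -1.0)
    print(raster_to_csv([[1, 2], [3, 0]], gt, nodata=0))
```

Skipping nodata pixels keeps the CSV small for sparse rasters, at the cost of no longer having one row per pixel.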

See WorldPop.sh to extract data for the year 2020 in WGS84 coordinates.

See AnnualNPP.sh to extract the entire dataset and convert it into WGS84 coordinates.

See GFS.sh to extract data for the date "2021/04/13" and a forecasting interval of 384 hours in WGS84 coordinates.
