Parameterize Geoprocessing Rasters #3406

rajadain · 2021-07-20T19:49:58Z

Overview

Previously, the raster layer names were hard-coded in specific JSON blocks of configuration. Now, since we may want to use different variants of each type of raster layer (land, soil, etc.), we parameterize the configuration, replacing a token with the real layer name at run time.

There is a default set of layers that will be used if no overrides are provided. Only the layers being overridden need to be provided; the rest come from the defaults.

When using the default layers, the existing cache keys will be used. When using overridden layers, those layer names will be added to the cache key. If the default layers are ever changed, we'll have to clear the geoprocessing cache. This PR only tokenizes the layers, and does not change the default ones.

This will mainly be used for MapShed modeling, but we also refactor Land analysis to use this, which cuts down on lines of code and demos the feature.

Connects #3405

Testing Instructions

Check out this branch
Delete all cached geoprocessing runs:
```
$ vagrant ssh services -c 'redis-cli -n 1 --raw KEYS ":1:geop_*" | xargs redis-cli -n 1 DEL'
```
This is necessary whenever we change the default layers. Not the case in this PR, but might as well.

Restart celery:

$ vagrant ssh worker -c 'sudo service celeryd restart'

Go to :8000 and find a shape that has variation between the 2001 and 2019 land covers. Or just type "Van Sciver Lake" into the search box in the top right and select the "Van Sciver Lake-Delaware River, HUC-12 Subwatershed" shape (it may take a while to show up; wait for the spinner to end).
In the Land Analysis, ensure that 2001 and 2019 have different results in the chart and table (compare the value of Deciduous Forest if using Van Sciver Lake), and compare with the same shape on https://staging.modelmywatershed.org/ to confirm.

Previously the layer names used for geoprocessing were hard-coded. As of #3399 we now have a number of NLCD layers, which may be used for modeling purposes. To replace the NLCD layer in all operations we switch to this tokenized scheme. Now the operation definitions only carry the tokens, with a base configuration defined with the previously used values. Both `run` and `multi`, the two geoprocessing functions, now take an optional `layer_overrides` dictionary, which is a mapping from token to layer name. At run time, these overrides are combined with the base configuration, which is in turn used to replace all tokens with real layer names in the operation.
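The merge-then-replace step described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the token names (`__LAND__`, `__SOIL__`), the layer names, the operation shape, and the helpers `replace_tokens` and `resolve_operation` are all hypothetical.

```python
# Hypothetical base configuration: token -> default layer name.
DEFAULT_LAYERS = {
    "__LAND__": "default-land-layer",
    "__SOIL__": "default-soil-layer",
}

# Hypothetical operation definition that carries only tokens.
OPERATION = {
    "input": {
        "polygon": [],
        "rasters": ["__LAND__", "__SOIL__"],
        "targetRaster": "__LAND__",
    }
}


def replace_tokens(node, layers):
    """Return a new structure in which every string equal to a token is replaced."""
    if isinstance(node, dict):
        return {key: replace_tokens(value, layers) for key, value in node.items()}
    if isinstance(node, list):
        return [replace_tokens(item, layers) for item in node]
    if isinstance(node, str):
        return layers.get(node, node)
    return node


def resolve_operation(operation, layer_overrides=None):
    """Combine overrides with the defaults, then substitute tokens."""
    # Overrides win; any token not overridden falls back to the default.
    layers = {**DEFAULT_LAYERS, **(layer_overrides or {})}
    return replace_tokens(operation, layers)
```

Calling `resolve_operation(OPERATION, {"__LAND__": "land-2001-example"})` would swap only the land layer and leave the soil layer at its default. Because the helper builds new containers, `OPERATION` itself is not modified.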

Since the cached value now depends on the layer overrides provided, we now add them to the cache key. Any existing caches will have to be deleted if ever we make changes to the default layers in base settings.
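One way to picture the cache key change is the sketch below. The `geop_` prefix comes from the cache clearing command in the testing instructions; the rest of the key layout, the function name `cache_key`, and its parameters are assumptions made for illustration and may differ from the real implementation.

```python
def cache_key(operation_name, shape_hash, layer_overrides=None):
    """Build a cache key; the layout here is hypothetical."""
    key = f"geop_{shape_hash}__{operation_name}"
    if layer_overrides:
        # Sort so the same overrides always yield the same key,
        # regardless of the order they were supplied in.
        suffix = "__".join(
            f"{token}={layer}" for token, layer in sorted(layer_overrides.items())
        )
        key = f"{key}__{suffix}"
    return key
```

With no overrides the key is unchanged, which is why runs that use the default layers keep hitting their existing cache entries, and why changing the defaults requires clearing the cache.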

Previously we had a separate JSON block for every NLCD year for land analysis, which differed by only one line. Now, using the new token feature, we reduce those to only one block, and replace the tokens according to the requested NLCD year. The schema validator is also updated to now use direct values, since the nlcd_XYZ_ara configs are no longer available.
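As a rough sketch of how a single block can serve every NLCD year, the requested year can be mapped to an overrides dictionary. The token `__LAND__`, the layer names, and the helper `land_overrides` are hypothetical stand-ins; only the idea of choosing the layer from the requested year comes from the description above.

```python
# Hypothetical mapping from NLCD year to raster layer name.
LAND_LAYER_BY_YEAR = {
    2001: "land-2001-example",
    2019: "land-2019-example",
}


def land_overrides(year):
    """Return the overrides for the requested year, rejecting unknown years."""
    try:
        return {"__LAND__": LAND_LAYER_BY_YEAR[year]}
    except KeyError:
        raise ValueError(f"Unsupported NLCD year: {year}") from None
```

The returned dictionary is the kind of value that would be passed as `layer_overrides`, so the per-year JSON blocks are no longer needed.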

A hitherto unnoticed bug was revealed when the Analyze Land endpoint switched to using tokenized instances of the same JSON config: the token replacements would change the default config itself, since dict.copy() is a shallow copy. This is fixed by using copy.deepcopy() instead. It is unlikely that this would have occurred in the past, because:

1. We used a separate JSON block for every geoprocessing run, rather than reusing the same blocks as we now do with tokenization.
2. The contents of the JSON block were not changed other than adding the input polygon, which would be overwritten on every call.
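The difference between the two copies can be reproduced in a few lines. The config shape and function names below are hypothetical; only the behavior of `dict.copy()` and `copy.deepcopy()` is standard Python.

```python
import copy

# Hypothetical shared config, reused across calls.
BASE = {"input": {"rasters": ["__LAND__"]}}


def tokenize_shallow(config, layer):
    data = config.copy()                # copies only the top-level dict
    data["input"]["rasters"] = [layer]  # the nested dict is shared, so config changes too
    return data


def tokenize_deep(config, layer):
    data = copy.deepcopy(config)        # copies nested dicts and lists as well
    data["input"]["rasters"] = [layer]  # config is left untouched
    return data
```

After `tokenize_shallow(BASE, ...)` the token in `BASE` is gone, so a later call with a different layer would find nothing to replace. `tokenize_deep` leaves `BASE` intact.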

rajadain · 2021-07-21T17:40:21Z

There's a failing test, taking a look at it now

not ok 118 Firefox 88.0 - [1008 ms] - Modeling Models ScenarioModel #fetchResults sets results to null on polling failure
    ---
        message: >
            uncaught exception: AssertionError: attemptSave should have been called twice (http://localhost:7357/333799856769/js/test.vendor.js:22067)
        browser log: |
            LOG: Failed to get modeling results.
            LOG: Completed polling for modeling results
    ...

rajadain · 2021-07-21T18:29:17Z

It was a seemingly transient failure. I'm not seeing that test fail locally, and it now passes on CI as well.

jwalgran

I was able to run through the testing steps and see the same changes between 2001 and 2019 on both staging and this branch. I left a few comments not related to the functionality.

src/mmw/apps/modeling/geoprocessing.py

src/mmw/mmw/settings/base.py

rajadain · 2021-07-22T02:13:34Z

Thanks for all the great suggestions! They have all been recorded in future issues or addressed herein. Going to merge this now so it's ready for demo tomorrow, especially since the last commit is only documentation changes.

rajadain added 4 commits July 19, 2021 15:16

Use layer names in cache key

c3fa36d

rajadain added the PA DEP Funding Source: Pennsylvania Department of Environment Protection label Jul 20, 2021

rajadain requested a review from jwalgran July 20, 2021 19:49

rajadain assigned jwalgran Jul 20, 2021

jwalgran approved these changes Jul 21, 2021

src/mmw/apps/modeling/geoprocessing.py (resolved)

src/mmw/apps/modeling/geoprocessing.py (resolved)

src/mmw/mmw/settings/base.py (outdated, resolved)

src/mmw/mmw/settings/base.py (resolved)

jwalgran assigned rajadain and unassigned jwalgran Jul 21, 2021

Update README with cache clearing command

55be1c0

rajadain merged commit de273cf into develop Jul 22, 2021

rajadain deleted the tt/parameterize-geoprocessing-rasters branch July 22, 2021 02:13

This was referenced Jul 22, 2021

Add Multi Layer Support to Projects #3400

Closed

Allow Projects to Override Layers for Geoprocessing #3411

Merged
