InVEST Reports: Author's Guide
- quickly assess the validity of the results, making it obvious if something went wrong even if there was no error
- reduce some tedium for people who do lots of model iterations and need to view the results each time
- make it clear to beginners what the primary model outputs are and how to interpret them
- provide examples of how to visualize outputs, including figures people can re-use or take inspiration from
A model's MODEL_SPEC will have a reporter attribute.
The reporter module must have a callable report function. It will generate the data & plots to be rendered by the jinja template, and will call the template.render function. The report function should have a definition like this:
def report(file_registry, args_dict, model_spec, target_html_filepath):
    """Generate an HTML summary of model results.

    Args:
        file_registry (dict): The ``natcap.invest.FileRegistry.registry``
            that was returned by the model's ``execute`` method.
        args_dict (dict): The arguments that were passed to the model's
            ``execute`` method.
        model_spec (natcap.invest.spec.ModelSpec): the model's ``MODEL_SPEC``.
        target_html_filepath (str): path to an HTML file to be generated by
            this function.

    Returns:
        ``None``
    """

All model templates should be rendered with these variables, in addition to other model-specific variables:
TODO: we can try to minimize the need to define all of these in every reporter
with open(target_html_filepath, 'w', encoding='utf-8') as target_file:
    target_file.write(TEMPLATE.render(
        report_script=__file__,
        timestamp=time.strftime('%Y-%m-%d %H:%M'),
        model_id=model_spec.model_id,
        model_name=model_name,
        model_description=model_description,
        userguide_page=model_spec.userguide,
        args_dict=args_dict,
        model_spec_outputs=model_spec.outputs,
        ...
    ))

For plotting rasters, use matplotlib wrapper functions defined in raster_utils.py.
For plotting vectors, use altair.
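
One way to reduce the repetition noted in the TODO above is to gather the shared template variables in a single helper, and let each reporter add its model-specific entries on top. The sketch below is only an illustration under stated assumptions: `common_context` is a hypothetical name that does not exist in natcap.invest, `model_name` and `model_description` are passed in by the caller because this guide does not show where they come from, and the `SimpleNamespace` is a stand-in for a real `ModelSpec` with placeholder values.

```python
import time
from types import SimpleNamespace


def common_context(report_script, model_spec, args_dict,
                   model_name, model_description):
    """Return the template variables shared by every report.

    Hypothetical helper; not part of natcap.invest.
    """
    return {
        'report_script': report_script,
        'timestamp': time.strftime('%Y-%m-%d %H:%M'),
        'model_id': model_spec.model_id,
        'model_name': model_name,
        'model_description': model_description,
        'userguide_page': model_spec.userguide,
        'args_dict': args_dict,
        'model_spec_outputs': model_spec.outputs,
    }


# Stand-in for a real ModelSpec, exposing only the attributes used above.
# All values here are placeholders.
example_spec = SimpleNamespace(
    model_id='example_model', userguide='example_model.html', outputs=[])

context = common_context(
    'reporter.py', example_spec, {'workspace_dir': 'workspace'},
    'Example Model', 'Example description.')

# Model-specific variables are layered on top of the shared ones.
context['model_specific_caption'] = 'placeholder caption'
```

A reporter using a helper like this could then pass the dictionary to the template as keyword arguments, for example `TEMPLATE.render(**context)`.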
The model's report template should extend from base.html and add items within the content block:
{% extends 'base.html' %}
<!-- Only override the styles block if additional styles are needed by this model -->
{% block styles %}
{{ super() }}
{% include 'datatable-styles.html' %}
{% endblock styles %}
{% block content %}
{{ super() }}
... more content rendered here ...
{% endblock content %}
<!-- Only override the scripts block if additional scripts are needed by this model -->
{% block scripts %}
{{ super() }}
{% include 'datatable-js.html' %}
{% endblock scripts %}

- Figures should usually have captions. Captions should include references to the data sources displayed in the figure.
- Figure labels (legends, axes, titles, etc) should include the units of measure for numeric variables.
- Captions and units should be composed of text taken directly from a model_spec when possible.
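
As a rough illustration of the caption and unit guidelines above, the sketch below builds a caption and an axis label from a spec entry instead of hard-coding the text. `OutputSpec` is a hypothetical stand-in, and its `id`, `about`, and `units` fields are assumptions about what a spec entry exposes; check the actual `model_spec.outputs` objects before relying on these names. The example values are placeholders.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class OutputSpec:
    """Hypothetical stand-in for one entry of a model spec's outputs."""
    id: str
    about: str
    units: Optional[str] = None


def caption_for(output):
    """Build a caption that names the data source shown in the figure."""
    return f'{output.id}: {output.about}'


def axis_label(name, output):
    """Append the units of measure to a label when the output has units."""
    return f'{name} ({output.units})' if output.units else name


# Placeholder values, not taken from a real model spec.
example_output = OutputSpec(
    id='example_output.tif',
    about='Example description taken from the spec.',
    units='Mg/ha')

print(caption_for(example_output))
print(axis_label('Carbon', example_output))
```

Because the text comes from the spec, the caption stays in sync if the spec's description or units change.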
To best serve the goals of the report, most reports should present the final outputs first, and then work backwards through intermediate outputs to inputs. This makes it clear to beginners what the primary outputs are and provides a narrative for describing how the model arrived at those final results. At the same time, this order gives experienced users a quick look at the high-level results and the opportunity to interrogate intermediate outputs as needed. Here's an example page layout:
If a modeler can only look at one or two output datasets to understand how their model performed, which would they be?
- For models that summarize results by performing zonal stats, the zonal stats table is a good choice.
- The "ecosystem service" metric is also a good choice.
- Which datasets will help to explain the results/patterns in the final outputs?
- Which datasets have a distinctly large influence on the final outputs?
- Which datasets/parameters often require adjustments by the modeler before running the model again? Which outputs would a user review in order to decide whether or not to make an adjustment?
A table summarizing the min/max/mean/nodata for raster outputs. This is useful for finding rasters with outlier values, or rasters with excessive amounts of nodata pixels.
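
The row for one raster in such a table could be computed along the lines of the sketch below. It assumes the pixel values have already been read into a flat Python sequence; a real reporter would more likely read the raster with a library such as GDAL and compute the statistics on arrays. The function name `raster_stats` is hypothetical, and treating NaN as nodata is a choice made for this example.

```python
import math


def raster_stats(pixels, nodata):
    """Summarize the valid pixels and count the nodata pixels.

    Pixels equal to ``nodata`` and NaN pixels are both counted as nodata.
    """
    valid = [p for p in pixels if p != nodata and not math.isnan(p)]
    nodata_count = len(pixels) - len(valid)
    return {
        'min': min(valid) if valid else None,
        'max': max(valid) if valid else None,
        'mean': sum(valid) / len(valid) if valid else None,
        'nodata_count': nodata_count,
        'nodata_percent': 100.0 * nodata_count / len(pixels) if pixels else 0.0,
    }


stats = raster_stats([1.0, 3.0, -9999.0, 5.0], nodata=-9999.0)
print(stats)
```

Reporting the nodata share as a percentage makes rasters with excessive nodata easy to spot when the rows are sorted or scanned.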
It's often useful to look at input data when interpreting patterns in the results. But input data can be more challenging to visualize than outputs because its structure is less predictable. In many cases, it may make more sense to plot processed input data (like the output of an align_and_resize call) rather than the raw input data. See https://github.com/natcap/invest-reports/issues/37
- Which inputs most heavily influence results?
- Which inputs are most likely to be the focus of the analysis? (e.g. an LULC in a project exploring landuse scenarios)
It's useful to see the same stats for input rasters as for outputs. For example, to see if the amount of nodata in outputs matches that of inputs.
The args_dict passed to the model's execute method.
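
A minimal sketch of turning the arguments into a table is shown below. In practice this would probably be done inside the Jinja template, since `args_dict` is already passed to it; the Python version here only illustrates the idea, and `args_table_html` is a hypothetical name. Values are escaped because file paths and other user-supplied arguments may contain characters that are special in HTML.

```python
import html


def args_table_html(args_dict):
    """Render the model arguments as a two-column HTML table."""
    rows = []
    for key in sorted(args_dict):
        rows.append('<tr><td>{}</td><td>{}</td></tr>'.format(
            html.escape(str(key)), html.escape(str(args_dict[key]))))
    return (
        '<table>\n'
        '<thead><tr><th>Argument</th><th>Value</th></tr></thead>\n'
        '<tbody>\n' + '\n'.join(rows) + '\n</tbody>\n'
        '</table>')


# Placeholder arguments for illustration only.
table = args_table_html({
    'workspace_dir': 'C:/runs/<test>',
    'results_suffix': 'a&b',
})
print(table)
```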
Model spec descriptions of all datasets generated by the model.
- Report authors should consult with scientific experts to review drafts of the report.
- Reports should be tested against multiple datastacks if possible. The sample data can be one.
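
Testing against several datastacks could start with a simple smoke test like the sketch below, which calls a `report` function once per set of arguments and checks that a non-empty HTML file is written. Everything here is an assumption made for illustration: `fake_report` is a placeholder with the same signature as a reporter's `report` function, each datastack is represented only by its arguments dictionary, and the empty file registry and `None` model spec would be replaced by real objects from a model run.

```python
import os
import tempfile


def fake_report(file_registry, args_dict, model_spec, target_html_filepath):
    """Placeholder with the same signature as a reporter's ``report``."""
    with open(target_html_filepath, 'w', encoding='utf-8') as target_file:
        target_file.write('<html><body>{}</body></html>'.format(
            args_dict['results_suffix']))


def check_report_for_datastacks(report_func, datastacks):
    """Run ``report_func`` per datastack; return the paths it wrote.

    Raises ``AssertionError`` if any generated report is empty.
    """
    written = []
    workspace = tempfile.mkdtemp()
    for index, args_dict in enumerate(datastacks):
        target = os.path.join(workspace, f'report_{index}.html')
        # A real test would pass the model's file registry and MODEL_SPEC.
        report_func({}, args_dict, None, target)
        if os.path.getsize(target) == 0:
            raise AssertionError(f'empty report for datastack {index}')
        written.append(target)
    return written


paths = check_report_for_datastacks(
    fake_report,
    [{'results_suffix': 'sample'}, {'results_suffix': 'other'}])
print(paths)
```

A check like this only confirms that the report renders; it does not replace a review of the figures and tables by someone who knows the model.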