Adding a batteries-included plot#
Each batteries-included plot should have its own file within the /src/arviz-plots/plots
folder.
Important
Batteries-included plots provide opinionated defaults for common EABM tasks. Consequently, setting those defaults should represent a significant part of both the code itself and the work done.
In fact, there are cases where getting the function to work is easier than managing to generate sensible defaults for the different parameter combinations.
Call signature#
All batteries-included plots should have a similar signature.
This includes taking a DataTree (in some cases also a dictionary of DataTrees) as first input, and having inputs for defining subsets with var_names or coords and for the dimensions to reduce.
They should also have inputs that are passed downstream to customize the objects created when plotting, like plot_kwargs or pc_kwargs.
And last but not least, they should all return a PlotCollection.
Function signature and docstring template
Here is a template of the function signature with common default values, together with the matching docstring.
Several parts of the template require extra input that is plot dependent.
These placeholders are indicated with [[description of what to fill here]].
def plot_xyz(
    # initial base arguments
    dt,
    var_names=None,
    filter_vars=None,
    group="posterior",
    coords=None,
    sample_dims=None,
    # plot specific arguments
    [[...]],
    # more base arguments
    plot_collection=None,
    backend=None,
    labeller=None,
    aes_map=None,
    plot_kwargs=None,
    stats_kwargs=None,
    pc_kwargs=None,
):
    """Plot description in 1 line.

    Extended description.

    Parameters
    ----------
    [[choose one]]
    dt : DataTree
    dt : DataTree or dict of {str : DataTree}
        Input data. In case of dictionary input, the keys are taken to be model names.
        In such cases, a dimension "model" is generated and can be used to map to aesthetics.
    var_names : str or sequence of str, optional
        One or more variables to be plotted.
        Prefix the variables by ~ when you want to exclude them from the plot.
    filter_vars : {None, "like", "regex"}, default None
        If None, interpret `var_names` as the real variables names.
        If "like", interpret `var_names` as substrings of the real variables names.
        If "regex", interpret `var_names` as regular expressions on the real variables names.
    group : str, default "posterior"
        Group to be plotted.
    coords : dict, optional
    sample_dims : str or sequence of hashable, optional
        Dimensions to reduce unless mapped to an aesthetic.
        Defaults to ``rcParams["data.sample_dims"]``
    [[...]]
    plot_collection : PlotCollection, optional
    backend : {"matplotlib", "bokeh"}, optional
    labeller : labeller, optional
    aes_map : mapping of {str : sequence of str or False}, optional
        Mapping of artists to aesthetics that should use their mapping in `plot_collection`
        when plotted. Valid keys are the same as for `plot_kwargs`.

        [[Description of default aesthetic mappings]]
    plot_kwargs : mapping of {str : mapping or False}, optional
        Valid keys are:

        * [[first artist id]] -> [[function called when drawing the first artist]]
        * [[repeat for all artists]]

    stats_kwargs : mapping, optional
        Valid keys are:

        * [[stats/summary/diagnostic name]] -> [[function in arviz-stats used for computation]]
        * [[repeat for all computations]]

    pc_kwargs : mapping
        [[choose one]]
        Passed to :class:`arviz_plots.PlotCollection.wrap`
        Passed to :class:`arviz_plots.PlotCollection.grid`

    Returns
    -------
    PlotCollection
    """
Initial defaults#
The first thing that should generally happen in a plot_xyz function is processing the input arguments and setting initial defaults.
This generally means creating mutable objects for those input arguments the function expects to be mutable objects (e.g. dicts) and getting default values from rcParams.
The general templates are therefore:
# rcParams
if parameter is None:
    parameter = rcParams["xyz.parameter"]

# mutable inputs
if xyz_kwargs is None:
    xyz_kwargs = {}
# if xyz is modified by the function, then also add
else:
    xyz_kwargs = xyz_kwargs.copy()
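To see why the copy matters, here is a minimal, self-contained sketch. The function name `plot_xyz_sketch` and the `"color"` default are hypothetical and only illustrate the template; they are not part of arviz-plots.

```python
def plot_xyz_sketch(xyz_kwargs=None):
    """Hypothetical function that follows the mutable-input template."""
    if xyz_kwargs is None:
        xyz_kwargs = {}
    else:
        # shallow copy, so the defaults set below never leak into the caller's dict
        xyz_kwargs = xyz_kwargs.copy()
    xyz_kwargs.setdefault("color", "gray")  # illustrative default only
    return xyz_kwargs


user_kwargs = {"alpha": 0.5}
resolved = plot_xyz_sketch(user_kwargs)
```

After the call, `user_kwargs` still contains only `alpha`, while `resolved` also carries the default. Without the copy, the default would be written into the dictionary owned by the caller.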
There are also cases where this includes converting multiple input types into the type of object actually used within the function.
One example is sample_dims, which can be a string or a sequence, but which the functions expect to be a sequence.
Thus, if it’s None we use the above template to get the default value; then, if it is a string, we create a list with the string as its only element:
if sample_dims is None:
    sample_dims = rcParams["data.sample_dims"]
if isinstance(sample_dims, str):
    sample_dims = [sample_dims]
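The same normalization can be tried in isolation. In this sketch `rcParams` is a plain dictionary standing in for the real object, its value is an assumption, and `normalize_sample_dims` is a hypothetical helper (the final `list(...)` conversion is an addition for illustration, not part of the template).

```python
# Stand-in for the real rcParams object; the value used here is an assumption
rcParams = {"data.sample_dims": ["chain", "draw"]}


def normalize_sample_dims(sample_dims=None):
    """Hypothetical helper: always return sample_dims as a list."""
    if sample_dims is None:
        sample_dims = rcParams["data.sample_dims"]
    if isinstance(sample_dims, str):
        # a bare string would otherwise be iterated character by character
        sample_dims = [sample_dims]
    return list(sample_dims)


from_default = normalize_sample_dims()
from_string = normalize_sample_dims("sample")
from_tuple = normalize_sample_dims(("chain", "draw"))
```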
Another example is processing dt, var_names, group… where several arguments can take multiple types and values.
In this case, there is a helper function to take care of that:
distribution = process_group_variables_coords(
    dt, group=group, var_names=var_names, filter_vars=filter_vars, coords=coords
)
Finally, if needed, we set default arguments for PlotCollection and create an instance of it.
In addition to having to copy pc_kwargs when it is not None following the template above, it might also be necessary to set defaults for dictionaries within pc_kwargs such as pc_kwargs["aes"].
Use the following pattern for such cases:
pc_kwargs["aes"] = pc_kwargs.get("aes", {}).copy()
pc_kwargs["aes"].setdefault("color", ["model"])
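The nested copy is needed because a shallow copy of pc_kwargs still shares the inner "aes" dictionary with the caller. A minimal sketch of the pattern, wrapped in a hypothetical helper `set_aes_defaults` so it can be run on its own:

```python
def set_aes_defaults(pc_kwargs=None):
    """Hypothetical helper applying the nested-default pattern."""
    pc_kwargs = {} if pc_kwargs is None else pc_kwargs.copy()
    # copy the nested dict too: the shallow copy above still shares it with the caller
    pc_kwargs["aes"] = pc_kwargs.get("aes", {}).copy()
    pc_kwargs["aes"].setdefault("color", ["model"])
    return pc_kwargs


user_pc_kwargs = {"aes": {"linestyle": ["chain"]}}
with_default = set_aes_defaults(user_pc_kwargs)
with_override = set_aes_defaults({"aes": {"color": ["chain"]}})
```

Since `setdefault` only fills missing keys, a user-provided "color" mapping takes precedence over the default, and `user_pc_kwargs` is left unchanged.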
PlotCollection dependent defaults#
Once we have made sure that plot_collection is not None, we continue setting defaults.
There are arguments such as aes_map that need information from the plot_collection input to have their defaults set.
A common pattern will therefore be:
if aes_map is None:
    aes_map = {}
else:
    aes_map = aes_map.copy()
aes_map.setdefault("artist", plot_collection.aes_set)
aes_map.setdefault("annotation", ["color"])
Here the default is that “artist” uses all available aesthetic mappings, whereas “annotation” uses only the mapping for color. Whether a color mapping is actually set is checked later on, so this default can be hardcoded.
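The behaviour of these defaults can be checked without a real PlotCollection. In the sketch below `fake_plot_collection` is a stand-in that only exposes an `aes_set` attribute, and `default_aes_map` is a hypothetical helper wrapping the pattern; both are assumptions made for illustration.

```python
from types import SimpleNamespace

# Stand-in exposing only the attribute used below; a real PlotCollection has far more
fake_plot_collection = SimpleNamespace(aes_set={"color", "linestyle"})


def default_aes_map(plot_collection, aes_map=None):
    """Hypothetical helper wrapping the aes_map default pattern."""
    aes_map = {} if aes_map is None else aes_map.copy()
    aes_map.setdefault("artist", plot_collection.aes_set)
    aes_map.setdefault("annotation", ["color"])
    return aes_map


defaults = default_aes_map(fake_plot_collection)
custom = default_aes_map(fake_plot_collection, {"artist": ["color"]})
```

In `custom`, the user choice for "artist" is kept and only the missing "annotation" key is filled in.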
We might also want to tweak some aesthetic values, in which case get_aes_as_dataset and update_aes_from_dataset can be helpful.
See the plot_forest source code for an example.
Adding artists to the plot#
Important
Before starting to add artists individually, check if part of the plot can be composed by calling an existing function.
For example, plot_trace_dist calls plot_trace to fill the right column and plot_dist to fill the left one.
Each artist should have its own id and ideally also its own call to .map.
The id is what is used to get the artist specific kwargs from plot_kwargs and aes_map, and what is used to store the artist in the viz attribute.
The independent calls to .map allow each artist in the plot to use different aesthetic mappings.
Consequently, there are multiple steps that should be followed for each artist:
1. Access the respective kwargs in plot_kwargs. Only proceed to step 2 if these are not False.
2. Use filter_aes to get the dimensions, active aesthetics and aesthetics to be ignored for this particular artist.
3. (optional) If necessary and particular to this artist, call the stats/summary/diagnostic function. Details on this are in the Computation section.
4. Set default arguments.
   * Check we aren’t overriding active aesthetics by setting defaults.
   * Use only properties that are part of the common interface for defaults. If you need to set a default for a property not on the list, open an issue to discuss it. This is key to ensure all plotting backends work seamlessly and behave as expected.
5. Call map to create the artist.
Here is a general template:
# step 1
artist_kwargs = copy(plot_kwargs.get("artist", {}))
if artist_kwargs is not False:
    # step 2
    artist_dims, artist_aes, artist_ignore = filter_aes(
        plot_collection, aes_map, "artist", sample_dims
    )
    # step 3 (optional)
    artist_data = stats(..., dims=artist_dims, **stats_kwargs.get("artist", {}))
    # step 4
    if "color" not in artist_aes:
        artist_kwargs.setdefault("color", "gray")
    # step 5
    plot_collection.map(
        visual,
        "artist",
        data=artist_data,  # optional
        ignore_aes=artist_ignore,
        ...,  # if needed, add more arguments
        **artist_kwargs
    )
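Steps 1 and 4 involve only plain dictionaries, so they can be sketched without any plotting objects. The helper `resolve_artist_kwargs` and the artist ids "line" and "marker" below are hypothetical; the list of active aesthetics is passed in by hand instead of coming from filter_aes.

```python
from copy import copy


def resolve_artist_kwargs(plot_kwargs, artist_id, active_aes):
    """Hypothetical helper covering steps 1 and 4 for a single artist.

    Returns None when the artist is disabled with ``False``.
    """
    # step 1: copy so the defaults set below don't modify the user's dictionary
    artist_kwargs = copy(plot_kwargs.get(artist_id, {}))
    if artist_kwargs is False:
        return None
    # step 4: only set a default color if no active aesthetic already controls it
    if "color" not in active_aes:
        artist_kwargs.setdefault("color", "gray")
    return artist_kwargs


plot_kwargs = {"line": {"alpha": 0.7}, "marker": False}
line_kwargs = resolve_artist_kwargs(plot_kwargs, "line", active_aes=[])
mapped_kwargs = resolve_artist_kwargs(plot_kwargs, "line", active_aes=["color"])
marker_kwargs = resolve_artist_kwargs(plot_kwargs, "marker", active_aes=[])
```

When "color" is an active aesthetic, no default is set, so the mapping defined in the plot collection is not overridden. An artist set to False is skipped altogether.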
Note
There might be some cases in which multiple artists are given the same aesthetic mappings and keyword arguments, but this should be done rarely and for artists that eventually call the same function in the backend module.
One example of this particular case is the var+coord labels in plot_trace_dist.
The left column labels the x axis with the variable name and coordinate subset whereas the right column labels the y axis.
Therefore, there are two .map calls, one to labelled_x and one to labelled_y, but they can be considered the same element, so they both get plot_kwargs and aes_map from the label kwarg.
Computation#
Warning
Computation relies on the arviz-stats library, which is at an earlier development stage than arviz-plots.
So at this point there aren’t many recommendations on the functions themselves.
Functions should reduce the dimensions returned by filter_aes (artist_dims above).
Moreover, for the result to be a valid data argument when calling .map, it must be a Dataset with the same variables as var_names (or all variables in the input data if var_names is not given).