arviz_plots.PlotCollection.generate_aes_dt

PlotCollection.generate_aes_dt(aes=None, **kwargs)

Generate the aesthetic mappings.

Populate and store the DataTree attribute .aes of the PlotCollection.

Parameters:
aes : mapping of {str : list of hashable or False}, optional

Dictionary with aesthetics as keys and as values a list of the dimensions it should be mapped to. It can also take False as value to indicate that no mapping should be considered for that aesthetic key.

**kwargs : mapping, optional

Dictionary with aesthetics as keys and as values a list of the values that should be taken by that aesthetic.

Notes

Mappings are applied only when all dimensions defined in the mapping are found in the variable. Thus, a mapping for ["chain", "hierarchy"] would be applied if both dimensions are present in the variable; otherwise it is ignored completely.
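The rule above amounts to a subset check on dimension names. The following is a minimal sketch of that check, not the library's implementation; `mapping_applies` is a hypothetical helper introduced only for illustration.

```python
def mapping_applies(mapping_dims, variable_dims):
    """Hypothetical helper: True only if every mapped dimension is in the variable."""
    return set(mapping_dims).issubset(variable_dims)


mapping = ["chain", "hierarchy"]

# Both dimensions present: the mapping is applied.
both_present = mapping_applies(mapping, ("chain", "draw", "hierarchy"))

# Only one of the two dimensions present: the mapping is ignored.
one_missing = mapping_applies(mapping, ("chain", "draw"))

# None of the dimensions present: the mapping is ignored as well.
none_present = mapping_applies(mapping, ("draw",))

print(both_present, one_missing, none_present)
```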

It can be the case that a mapping is ignored for a specific variable because it has none of the dimensions that define the mapping or because it doesn’t have all of them. In such cases, out of the values in the property cycle, the first one is taken out and reserved as neutral_element. Then, the cycle excluding the first element is used when applying the mapping, and the neutral element is used when the mapping can’t be applied.

It is possible to force the inclusion of the neutral element from the property value cycle by providing the same value in both the first and second positions in the cycle, but this is generally not recommended.
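The handling of the neutral element can be pictured with plain Python. This is only a sketch of the behaviour described in these notes, assuming the values are consumed cyclically; `split_cycle` and `take` are hypothetical helpers and not part of arviz_plots.

```python
from itertools import cycle, islice


def split_cycle(values, needs_neutral=True):
    """Hypothetical helper: return (neutral_element, values used by the mapping)."""
    values = list(values)
    if needs_neutral:
        # The first value is reserved; the mapping uses the remaining ones.
        return values[0], values[1:]
    # No variable needs a fallback, so the whole cycle is available.
    return None, values


def take(values, n):
    """Take n values, wrapping around when fewer than n are available."""
    return list(islice(cycle(values), n))


# Six colors for six coordinate values: one color is reserved, so one repeats.
neutral, rest = split_cycle([f"C{i}" for i in range(6)])
mapped = take(rest, 6)

# Repeating the first value keeps it available inside the mapping too.
forced_neutral, forced_rest = split_cycle(["C0", "C0", "C1", "C2"])
forced = take(forced_rest, 3)

# When the mapping applies to every variable no neutral element is reserved.
_, all_styles = split_cycle(["-", ":", "--", "-."], needs_neutral=False)

print(neutral, mapped, forced, all_styles)
```

With six values and six coordinate values to cover, reserving the first value leaves only five, so the sixth coordinate value wraps around to the start of the remaining cycle.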

Examples

Initialize a PlotCollection with the rugby dataset as data. Faceting and aesthetics mapping are independent. Thus, as we are limiting ourselves to the use of this method, we can provide an empty DataTree as viz_dt.

from datatree import DataTree
from arviz_base import load_arviz_data
from arviz_plots import PlotCollection
from arviz_base.datasets import REMOTE_DATASETS, RemoteFileMetadata
# TODO: remove this monkeypatching once the arviz_example_data repo has been updated
REMOTE_DATASETS.update({
    "rugby_field": RemoteFileMetadata(
        name="rugby_field",
        filename="rugby_field.nc",
        url="http://figshare.com/ndownloader/files/44667112",
        checksum="53a99da7ac40d82cd01bb0b089263b9633ee016f975700e941b4c6ea289a1fb0",
        description="Variant of the rugby model."
    )
})
idata = load_arviz_data("rugby_field")
pc = PlotCollection(idata.posterior, DataTree(), backend="matplotlib")
pc.generate_aes_dt(
    aes={
        "color": ["team"],
        "y": ["field", "team"],
        "marker": ["field"],
        "linestyle": ["chain"],
    },
    color=[f"C{i}" for i in range(6)],
    y=list(range(13)),
    linestyle=["-", ":", "--", "-."],
)
pc.aes
<xarray.DatasetView> Size: 0B
Dimensions:  ()
Data variables:
    *empty*

The generated aes_dt has one group per variable in the posterior group in the provided data. Each group in the aes_dt DataTree is a Dataset with the aesthetics that apply to that variable and the required shape for all values that aesthetic needs to take.

Thus, when we subset the data for plotting with ds[var_name].sel(**kwargs) we can get its aesthetics with aes_dt[var_name].sel(**kwargs).
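As a rough illustration of that pattern, the sketch below builds small stand-ins by hand with xarray instead of using a real PlotCollection. The arrays, their values and the `selection` dictionary are assumptions made for this example; only `Dataset.sel` and `DataArray.sel` are actual xarray calls.

```python
import numpy as np
import xarray as xr

coords = {"chain": [0, 1, 2, 3], "field": ["home", "away"]}

# Hypothetical stand-in for one posterior variable with dims chain, draw, field.
data = xr.DataArray(
    np.zeros((4, 10, 2)), dims=("chain", "draw", "field"), coords=coords
)

# Hypothetical stand-in for the matching aesthetics group: a scalar for the
# unmapped aesthetic and arrays along the mapped dimensions for the others.
aes = xr.Dataset(
    {
        "color": xr.DataArray("C0"),
        "marker": ("field", ["+", "^"]),
        "linestyle": ("chain", ["-", ":", "--", "-."]),
    },
    coords=coords,
)

# The same selection subsets both the data and its aesthetics.
selection = {"chain": 2, "field": "away"}
subset = data.sel(**selection)
plot_kwargs = {name: da.item() for name, da in aes.sel(**selection).data_vars.items()}

print(subset.dims, plot_kwargs)
```

Because the aesthetics share coordinates with the data, no separate bookkeeping is needed to know which color, marker or linestyle belongs to a given subset.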

Let’s inspect its contents for some variables. We’ll start with the intercept, which has dimensions chain, draw, field.

pc.aes["intercept"]
<xarray.DatasetView> Size: 120B
Dimensions:    (field: 2, chain: 4)
Coordinates:
  * field      (field) <U4 32B 'home' 'away'
  * chain      (chain) int64 32B 0 1 2 3
Data variables:
    color      <U2 8B 'C0'
    y          int64 8B 0
    marker     (field) <U1 8B '+' '^'
    linestyle  (chain) <U2 32B '-' ':' '--' '-.'

In this case, only the marker and linestyle mappings can be applied, so these two get arrays storing the values for the different coordinate values whereas the other two properties color and y get a scalar that corresponds to the neutral element.

We didn’t provide any values for the marker, but as we specified the backend, some default values were generated for us. We did provide 4 values for the linestyle, and these four values are the ones stored for the mapping.

Let’s move on to the sd_att variable, which in this case had dimensions chain, draw:

pc.aes["sd_att"]
<xarray.DatasetView> Size: 84B
Dimensions:    (chain: 4)
Coordinates:
  * chain      (chain) int64 32B 0 1 2 3
Data variables:
    color      <U2 8B 'C0'
    y          int64 8B 0
    marker     <U1 4B 'o'
    linestyle  (chain) <U2 32B '-' ':' '--' '-.'

Now only the linestyle mapping can be applied, so we get an array of values for it and scalar values for the others. It is worth noting that the value of the marker is different from the two values we saw before for the intercept.

This is the neutral element. As we didn’t provide any values for the marker, 3 default values were set: one for each coordinate value of the field dimension and an extra one to act as the neutral element for those variables where the mapping does not apply. In fact, the values we have seen so far for color and y are also the ones corresponding to the neutral element.

The only mapping without a neutral element is the linestyle one, because the chain dimension is present in all variables, so every variable gets an array with its 4 values.

Next let’s check the atts_team variable, now with dimensions chain, draw, team:

pc.aes["atts_team"]
<xarray.DatasetView> Size: 316B
Dimensions:    (team: 6, chain: 4)
Coordinates:
  * team       (team) <U8 192B 'Wales' 'France' 'Ireland' ... 'Italy' 'England'
  * chain      (chain) int64 32B 0 1 2 3
Data variables:
    color      (team) <U2 48B 'C1' 'C2' 'C3' 'C4' 'C5' 'C1'
    y          int64 8B 0
    marker     <U1 4B 'o'
    linestyle  (chain) <U2 32B '-' ':' '--' '-.'

This case is similar to the intercept, except that the mappings that apply are now color and linestyle instead of marker and linestyle. However, we manually provided values for the color cycle and gave only 6 values. Thus, once the 1st one was taken as the neutral element and excluded, only 5 values were left for 6 teams, and we end up with the same color (the 2nd in the cycle) for both the 1st and last teams (according to the order defined in the team coordinate values).

Note however that if we had sliced the posterior to keep only variables with the team dimension there would be no need for the neutral element (like it currently happens with linestyle) and there wouldn’t be any repeated elements in the color cycle.

To finish, let’s check the atts variable where all mappings can be applied because it has chain, draw, field, team dimensions.

pc.aes["atts"]
<xarray.DatasetView> Size: 440B
Dimensions:    (team: 6, field: 2, chain: 4)
Coordinates:
  * team       (team) <U8 192B 'Wales' 'France' 'Ireland' ... 'Italy' 'England'
  * field      (field) <U4 32B 'home' 'away'
  * chain      (chain) int64 32B 0 1 2 3
Data variables:
    color      (team) <U2 48B 'C1' 'C2' 'C3' 'C4' 'C5' 'C1'
    y          (field, team) int64 96B 1 2 3 4 5 6 7 8 9 10 11 12
    marker     (field) <U1 8B '+' '^'
    linestyle  (chain) <U2 32B '-' ':' '--' '-.'

Consequently, all aesthetics have arrays storing their values, and all values differ from the neutral element where there is one. Moreover, we gave 13 values for y, which is one more than the 12 unique combinations of field and team, so there are no repeated values in the y cycle even after excluding the neutral element.
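The arithmetic behind the y values can be checked with a short sketch. It assumes values are assigned in (field, team) order and wrap around when they run out; the team labels and the `assign` helper are hypothetical placeholders, not the dataset's coordinate values or the library's code.

```python
from itertools import cycle, islice, product

fields = ["home", "away"]
teams = [f"team_{i}" for i in range(6)]  # hypothetical labels
combos = list(product(fields, teams))  # 2 * 6 = 12 combinations


def assign(values, keys):
    """Hypothetical helper: reserve the first value, cycle the rest over keys."""
    neutral, rest = values[0], values[1:]
    return neutral, dict(zip(keys, islice(cycle(rest), len(keys))))


# 13 values: one is reserved, leaving 12 distinct values for 12 combinations.
neutral, y_map = assign(list(range(13)), combos)

# 12 values: only 11 remain after reserving one, so the last combination
# wraps around and repeats the first mapped value.
neutral_short, y_short = assign(list(range(12)), combos)

print(neutral, sorted(y_map.values()))
print(len(set(y_short.values())), y_short[("away", "team_5")])
```

In general, a mapping that needs a neutral element requires one more value than the number of coordinate combinations to avoid repeats.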