dival.evaluation module

Tools for the evaluation of reconstruction methods.

class dival.evaluation.TaskTable(name='')[source]

Bases: object

Task table containing reconstruction tasks to evaluate.

name

Name of the task table.

Type: str

tasks

Tasks that shall be run. The fields of each dict are set from the parameters to append() (or append_all_combinations()). Cf. documentation of append() for details.

Type: list of dict

results

Results from the latest call to run().

Type: ResultTable or None

__init__(name='')[source]

run(save_reconstructions=True, reuse_iterates=True, show_progress='text')[source]

Run all tasks and return the results.

The returned ResultTable object is also stored as results.

Parameters:

save_reconstructions (bool, optional) –
Whether the reconstructions should be saved in the results. The default is True.

If measures shall be applied after this method returns, it must be True.

If False, no iterates (intermediate reconstructions) will be saved, even if task['options']['save_iterates']==True.
reuse_iterates (bool, optional) –
Whether to reuse iterates from other sub-tasks if possible. The default is True.

If this option is True and there are sub-tasks whose hyper parameter choices differ only in the number of iterations of an IterativeReconstructor, only the sub-task with the maximum number of iterations is run; the results for the other sub-tasks are taken from the iterates stored during that run.

Note 1: If enabled, the callbacks assigned to the reconstructor are run only for the sub-tasks with the maximum number of iterations described above.

Note 2: If the reconstructor is non-deterministic, this option can affect the results as the same realization is used for multiple sub-tasks.
show_progress (str, optional) –
Whether and how to show progress. Options are:

'text' (default)
print a line before running each task

'tqdm'
show a progress bar with tqdm

None
do not show progress

Returns:

results – The results.

Return type:

ResultTable
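The idea behind reuse_iterates can be sketched without dival. The following is only an illustration of the principle, not dival's implementation: toy_iterative_reconstruction is a hypothetical stand-in for an IterativeReconstructor, and the update rule (Heron's iteration for a square root) is chosen purely because it is short and deterministic.

```python
def toy_iterative_reconstruction(y, iterations):
    """Hypothetical stand-in for an iterative reconstructor.

    Runs Heron's iteration towards sqrt(y) and returns all iterates.
    """
    x = y
    iterates = []
    for _ in range(iterations):
        x = 0.5 * (x + y / x)
        iterates.append(x)
    return iterates


# Sub-tasks that differ only in the number of iterations.
iteration_choices = [2, 5, 10]

# Run only the sub-task with the maximum number of iterations ...
all_iterates = toy_iterative_reconstruction(2.0, max(iteration_choices))

# ... and read off the results of the other sub-tasks from the stored iterates.
reused = {n: all_iterates[n - 1] for n in iteration_choices}

# For a deterministic method this matches running each sub-task separately.
separate = {n: toy_iterative_reconstruction(2.0, n)[-1]
            for n in iteration_choices}
```

For a non-deterministic method the two dictionaries would generally differ, which is the effect described in Note 2.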

append(reconstructor, test_data, measures=None, dataset=None, hyper_param_choices=None, options=None)[source]

Append a task.

Parameters:

reconstructor (Reconstructor) – The reconstructor.
test_data (DataPairs) – The test data.
measures (sequence of (Measure or str), optional) – Measures that will be applied. Either Measure objects or their short names can be passed.
dataset (Dataset, optional) – The dataset that will be passed to reconstructor.train if it is a LearnedReconstructor.
hyper_param_choices (dict of list or list of dict, optional) –
Choices of hyper parameter combinations to try as sub-tasks.
- If a dict of lists is specified, all combinations of the list elements (cartesian product space) are tried.
- If a list of dicts is specified, each dict is taken as a parameter combination to try.
The current parameter values are read from Reconstructor.hyper_params in the beginning and used as default values for all parameters not specified in the passed dicts. Afterwards, the original values are restored.
options (dict) –
Options that will be used. Options are:

'skip_training' : bool, optional
Whether to skip training. Can be used for manual training of reconstructors (or loading of a stored state). Default: False.

'save_best_reconstructor' : dict, optional
If specified, save the best reconstructor from the sub-tasks (cf. hyper_param_choices) by calling Reconstructor.save_params(). For hyper_param_choices=None, the reconstructor from the single sub-task is saved. This option requires measures to be non-empty if there are multiple sub-tasks. The fields are:

'path' : str
The path to save the best reconstructor at (argument to save_params()). Note that this path is used during execution of the task to store the best reconstructor params so far, so the file(s) are most likely updated multiple times.

'measure' : Measure or str, optional
The measure used to define the “best” reconstructor (in terms of mean performance). Must be one of the measures. By default measures[0] is used. This field is ignored if there is only one sub-task.

'save_iterates' : bool, optional
Whether to save the intermediate reconstructions of iterative reconstructors. Default: False. Will be ignored if save_reconstructions=False is passed to run. If reuse_iterates=True is passed to run and there are sub-tasks for which iterates are reused, these iterates are the same objects for all of those sub-tasks (i.e. no copies).

'save_iterates_measure_values' : bool, optional
Whether to compute and save the measure values for each intermediate reconstruction of iterative reconstructors (the default is False).

'save_iterates_step' : int, optional
Step size for 'save_iterates' and 'save_iterates_measure_values' (the default is 1).
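The two accepted forms of hyper_param_choices, and the use of the current hyper parameter values as defaults, can be illustrated with a short sketch. It does not use dival: expand_choices and the parameter names 'iterations' and 'gamma' are hypothetical, and the order in which the combinations are generated is an assumption.

```python
from itertools import product


def expand_choices(hyper_param_choices, defaults):
    """Return one complete parameter dict per sub-task (illustrative only)."""
    if hyper_param_choices is None:
        # No choices given: a single sub-task using the current values.
        combos = [{}]
    elif isinstance(hyper_param_choices, dict):
        # Dict of lists: cartesian product of the list elements.
        keys = list(hyper_param_choices)
        value_lists = [hyper_param_choices[k] for k in keys]
        combos = [dict(zip(keys, values)) for values in product(*value_lists)]
    else:
        # List of dicts: each dict is one parameter combination.
        combos = [dict(c) for c in hyper_param_choices]
    # Parameters not specified in a combination fall back to the defaults.
    return [{**defaults, **c} for c in combos]


defaults = {'iterations': 10, 'gamma': 0.1}  # hypothetical current values

from_dict = expand_choices({'iterations': [5, 20], 'gamma': [0.1, 0.5]},
                           defaults)
from_list = expand_choices([{'iterations': 5}, {'gamma': 0.5}], defaults)
```

Here from_dict contains four sub-task parameter sets, while from_list contains two, each completed with the default for the parameter it leaves out.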

append_all_combinations(reconstructors, test_data, measures=None, datasets=None, hyper_param_choices=None, options=None)[source]

Append tasks of all combinations of test data, reconstructors and optionally datasets. The order is taken from the lists, with test data changing slowest and reconstructor changing fastest.

Parameters:

reconstructors (list of Reconstructor) – Reconstructor list.
test_data (list of DataPairs) – Test data list.
measures (sequence of (Measure or str)) – Measures that will be applied. The same measures are used for all combinations of test data and reconstructors. Either Measure objects or their short names can be passed.
datasets (list of Dataset, optional) – Dataset list. Required if reconstructors contains at least one LearnedReconstructor.
hyper_param_choices (list of (dict of list or list of dict), optional) – Choices of hyper parameter combinations for each reconstructor, which are tried as sub-tasks. The i-th element of this list is used for the i-th reconstructor. See append for documentation of how the choices are passed.
options (dict) – Options that will be used. The same options are used for all combinations of test data and reconstructors. See append for documentation of the options.
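The task order described above (test data changing slowest, reconstructor changing fastest) corresponds to a nested loop or, equivalently, to itertools.product. The sketch below uses plain strings as placeholders for DataPairs and Reconstructor objects and leaves out datasets, whose position in the ordering is not stated here; it is an illustration, not dival's code.

```python
from itertools import product

test_data = ['test_data_a', 'test_data_b']   # placeholders for DataPairs
reconstructors = ['fbp', 'tv', 'learned']    # placeholders for Reconstructors

# product() varies its last argument fastest, matching the documented order.
tasks = [{'test_data': d, 'reconstructor': r}
         for d, r in product(test_data, reconstructors)]

for task_ind, task in enumerate(tasks):
    print(task_ind, task['test_data'], task['reconstructor'])
```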

class dival.evaluation.ResultTable(row_list)[source]

Bases: object

The results of a TaskTable.

Cf. TaskTable.results.

results

The results. The index is given by 'task_ind' and 'sub_task_ind', and the columns are 'reconstructions', 'reconstructor', 'test_data', 'measure_values' and 'misc'.

Type: pandas.DataFrame

__init__(row_list)[source]

Usually, objects of this type are constructed by TaskTable.run(), which sets TaskTable.results, rather than by manually calling this constructor.

Parameters:

row_list (list of dict) – Result rows. Used to build results of type pandas.DataFrame.

apply_measures(measures, task_ind=None)[source]

Apply (additional) measures to reconstructions.

This is not possible if the reconstructions were not saved, in which case a ValueError is raised.

Parameters:

measures (list of Measure) – Measures to apply.
task_ind (int or sequence of ints, optional) – Indexes of tasks to which the measures shall be applied. If None, this is interpreted as “all results”.

Raises:

ValueError – If reconstructions are missing or task_ind is not valid.

plot_reconstruction(task_ind, sub_task_ind=0, test_ind=-1, plot_ground_truth=True, **kwargs)[source]

Plot the reconstruction at the specified index. Supports only 1d and 2d reconstructions.

Parameters:

task_ind (int) – Index of the task.
sub_task_ind (int, optional) – Index of the sub-task (default 0).
test_ind (sequence of int or int, optional) – Index in test data. If -1, plot all reconstructions (the default).
plot_ground_truth (bool, optional) – Whether to show the ground truth next to the reconstruction. The default is True.
kwargs (dict) – Keyword arguments that are passed to plot_image() if the reconstruction is 2d.

Returns:

ax_list – The axes in which the reconstructions and, if requested, the ground truth were plotted.

Return type:

list of np.ndarray of matplotlib.axes.Axes

plot_all_reconstructions(**kwargs)[source]

Plot all reconstructions.

Parameters:

kwargs (dict) – Keyword arguments that are forwarded to plot_reconstruction().

Returns:

ax – The axes the reconstructions were plotted in.

Return type:

np.ndarray of matplotlib.axes.Axes

plot_convergence(task_ind, sub_task_ind=0, measures=None, fig_size=None, gridspec_kw=None)[source]

Plot measure values for saved iterates.

This shows the convergence behavior with respect to the measures.

Parameters:

task_ind (int) – Index of the task.
sub_task_ind (int, optional) – Index of the sub-task (default 0).
measures ([list of ] Measure, optional) – Measures to apply. Each measure is plotted in a subplot. If None is passed, all measures in result['measure_values'] are used.

Returns:

ax – The axes the measure values were plotted in.

Return type:

np.ndarray of matplotlib.axes.Axes

plot_performance(measure, reconstructors=None, test_data=None, weighted_average=False, **kwargs)[source]

Plot average measure values for different reconstructors. The values have to be computed previously, e.g. by apply_measures().

The average is computed over all rows of results with the specified test_data that store the requested measure value.

Note that for tasks with multiple sub-tasks, all of them are used when computing the average (i.e., the measure values for all hyper parameter choices are averaged).

Parameters:

measure (Measure or str) – The measure to plot (or its short_name).
reconstructors (sequence of Reconstructor, optional) – The reconstructors to compare. If None (default), all reconstructors that are found in the results are compared.
test_data ([sequence of ] DataPairs, optional) – Test data to take into account for computing the mean value. By default, all test data is used.
weighted_average (bool, optional) – Whether to weight the rows according to the number of pairs in their test data. Default: False, i.e. all rows are weighted equally. If True, all test data pairs are weighted equally.

Returns:

ax – The axes the performance was plotted in.

Return type:

matplotlib.axes.Axes
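The effect of weighted_average can be shown with a small numeric sketch. The rows below are hypothetical and assume that each result row is summarized by its mean measure value together with the number of pairs in its test data; this is not dival's internal representation.

```python
# Hypothetical result rows: mean measure value and number of test data pairs.
rows = [
    {'mean_value': 30.0, 'num_pairs': 10},
    {'mean_value': 20.0, 'num_pairs': 90},
]

# weighted_average=False: every row counts equally.
unweighted = sum(r['mean_value'] for r in rows) / len(rows)

# weighted_average=True: every test data pair counts equally.
total_pairs = sum(r['num_pairs'] for r in rows)
weighted = sum(r['mean_value'] * r['num_pairs'] for r in rows) / total_pairs
```

With these numbers the unweighted average is 25.0, whereas weighting by the number of pairs pulls the result towards the larger test set and gives 21.0.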

to_string(max_colwidth=70, formatters=None, hide_columns=None, show_columns=None, **kwargs)[source]

Convert to string. Used by __str__().

Parameters:

max_colwidth (int, optional) – Maximum width of the columns, cf. the option 'display.max_colwidth' of pandas.
formatters (dict of functions, optional) – Custom formatter functions for the columns, passed to results.to_string.
hide_columns (list of str, optional) – Columns to hide. Default: ['reconstructions', 'misc'].
show_columns (list of str, optional) – Columns to show. Overrides hide_columns.
kwargs (dict) – Keyword arguments passed to results.to_string.

Returns:

string – The string.

Return type:

str

print_summary()[source]: Prints a summary of the results.