hepdata_lib package

Module contents

hepdata_lib main.

class hepdata_lib.AdditionalResourceMixin

Bases: object

Functionality related to additional materials.

add_additional_resource(description, location, copy_file=False, file_type=None)

Add any kind of additional resource. If copy_file is set to False, the location and description will be added as-is. This is useful e.g. for the case of providing a URL to a web-based resource.

If copy_file is set to True, we will try to copy the file from the location you have given into the output directory. This only works if the location is a local file. If the location you gave does not exist or points to a file larger than 100 MB, a RuntimeError will be raised. While the file checks are performed immediately (i.e. the file must exist when this function is called), the actual copying only happens once the create_files function of the submission object is called.

Parameters:
  • description (string) – Description of what the resource is.

  • location (string) – Can be either a URL pointing to a web-based resource or a local file path.

  • copy_file (bool) – If set to true, will attempt to copy a local file to the tar ball.

  • file_type (string) – Type of the resource file. Currently, only “HistFactory” has any effect.
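
The check-now, copy-later behaviour described above can be sketched with the standard library alone. This is an illustrative stand-in, not the hepdata_lib implementation: the class name ResourceHolderSketch, the 1 MB = 10**6 bytes convention, and the layout of the stored dictionary are all assumptions made for the example.

```python
import os
import shutil

MAX_SIZE_MB = 100  # size limit quoted in the description above


class ResourceHolderSketch:
    """Hypothetical stand-in for the mixin; NOT the library code."""

    def __init__(self):
        self.additional_resources = []
        self.files_to_copy = []

    def add_additional_resource(self, description, location, copy_file=False):
        if copy_file:
            # The checks run immediately, so the file must exist now.
            if not os.path.isfile(location):
                raise RuntimeError(f"File not found: {location}")
            # Assumption: 1 MB is counted as 10**6 bytes.
            if os.path.getsize(location) / 1e6 > MAX_SIZE_MB:
                raise RuntimeError(f"File larger than {MAX_SIZE_MB} MB: {location}")
            # Only remember the path; the copy itself is deferred.
            self.files_to_copy.append(location)
        self.additional_resources.append(
            {"description": description, "location": location}
        )

    def copy_files(self, outdir):
        # Deferred step: copy everything that was registered earlier.
        os.makedirs(outdir, exist_ok=True)
        for path in self.files_to_copy:
            shutil.copy(path, outdir)
```

With copy_file=False nothing is checked, which is why a URL can be passed as the location.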

copy_files(outdir)

Copy the files in the files_to_copy list to the output directory.

Parameters:

outdir (string) – Output directory path to copy to.

class hepdata_lib.Submission

Bases: AdditionalResourceMixin

Top-level object of a HEPData submission.

Holds all the lower-level objects and controls writing.

Append link to additional_resources list.

Parameters:
  • description (string) – Description of what the link refers to.

  • location (string) – URL to link to.

add_record_id(r_id, r_type)

Append record_id to record_ids list.

Appends a record ID to the related_records list.

Parameters:

r_id (integer) – The record’s ID.

add_table(table)

Append table to tables list.

Parameters:

table (Table) – The table to be added.

create_files(outdir='.', validate=True, remove_old=False)

Create the output files.

Implicitly triggers file creation for all tables that have been added to the submission, all variables associated to the tables and all uncertainties associated to the variables.

If validate is True, the hepdata-validator package will be used to validate the output tar ball.

If remove_old is True, the output directory will be deleted before recreation.

files_to_copy_nested()

List files-to-copy for this Submission and nested daughters

static get_license()

Return the default license.

read_abstract(filepath)

Read in the abstract file.

Parameters:

filepath (string) – Path to the text file containing the abstract.

class hepdata_lib.Table(name)

Bases: AdditionalResourceMixin

A table is a collection of variables.

It also holds meta-data such as a general description, the location within the paper, etc.

add_image(file_path, outdir=None)

Add an image file to the table.

This function only stores the path to the image. Any additional processing will be done later (see write_images function).

Parameters:
  • file_path (string) – Path to the image file.

  • outdir – Deprecated.

Appends a DOI string to the related_tables list.

Parameters:

doi (string) – The table DOI.

add_variable(variable)

Add a variable to the table

Parameters:

variable (Variable) – Variable to add.

property name

Name getter.

write_images(outdir)

Write image files and thumbnails into the output directory.

Parameters:

outdir (string) – Path to output directory. Will be created if it doesn’t exist.

write_output(outdir)

Write the table files into the output directory.

Parameters:

outdir (string) – Path to output directory. Will be created if it doesn’t exist.

write_yaml(outdir='.')

Write the table (and all its variables) to a YAML file.

This function is intended to be called internally by the Submission object. Except for debugging purposes, no user should have to call this function.

class hepdata_lib.Uncertainty(label, is_symmetric=True)

Bases: object

Store information about an uncertainty on a variable

Uncertainties can be symmetric or asymmetric. The main information is stored as one (two) lists in the symmetric (asymmetric) case. The list entries are the uncertainty for each of the list entries in the corresponding Variable.

scale_values(factor)

Multiply each value by constant factor.

Parameters:

factor (float) – Value to multiply by.
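
As a rough illustration of what scaling means for the two storage layouts described above (a plain list for symmetric uncertainties, a list of tuples for asymmetric ones), here is a small sketch. The function name scale_uncertainty_values is hypothetical and the code is not taken from the library.

```python
def scale_uncertainty_values(values, factor):
    """Multiply every entry by factor (hypothetical helper, not library code).

    Symmetric uncertainties are plain numbers; asymmetric ones are
    tuples, and each component of the tuple is scaled.
    """
    scaled = []
    for entry in values:
        if isinstance(entry, tuple):
            scaled.append(tuple(factor * part for part in entry))
        else:
            scaled.append(factor * entry)
    return scaled
```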

set_values_from_intervals(intervals, nominal)

Set values relative to set of nominal values. Useful if you do not have the actual uncertainty available, but the upper and lower boundaries of an interval.

Parameters:
  • intervals (List of tuples of two floats) – Lower and upper interval boundaries

  • nominal (List of floats) – Interval centers
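
The conversion from interval boundaries to uncertainties relative to the nominal values can be sketched as below. The function name values_from_intervals is hypothetical, and the sign convention (lower minus nominal, upper minus nominal, so the downward part is usually negative) is an assumption of this example rather than something the text above states.

```python
def values_from_intervals(intervals, nominal):
    """Turn (lower, upper) boundaries into offsets from the nominal values.

    Hypothetical helper. Assumed convention: each result is
    (lower - nominal, upper - nominal).
    """
    if len(intervals) != len(nominal):
        raise ValueError("intervals and nominal must have the same length")
    return [
        (lower - center, upper - center)
        for (lower, upper), center in zip(intervals, nominal)
    ]
```

For instance, an interval of (9.0, 12.0) around a nominal value of 10.0 becomes (-1.0, 2.0) under this convention.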

property values

Value getter.

Returns:

list – values, either as a direct list of values if uncertainty is symmetric, or list of tuples if it is asymmetric.

class hepdata_lib.Variable(name, is_independent=True, is_binned=True, units='', values=None, zero_uncertainties_warning=True)

Bases: object

A Variable is a wrapper for a list of values + some meta data.

add_qualifier(name, value, units='')

Add a qualifier.

add_uncertainty(uncertainty)

Add an uncertainty.

If the Variable object already has values assigned to it, it is required that the value list of the Uncertainty object has the same length as the list of Variable values.

If the list of values of the Variable is empty, no requirement is applied on the length of the list of Uncertainty values.
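
The length rule above can be pictured with a toy class. VariableSketch is a hypothetical stand-in that stores uncertainties as bare lists, and raising ValueError on a mismatch is an assumption of this sketch; the text above does not say which exception the library uses.

```python
class VariableSketch:
    """Toy model of the length check; NOT the hepdata_lib Variable."""

    def __init__(self, values=None):
        self.values = list(values) if values else []
        self.uncertainties = []

    def add_uncertainty(self, uncertainty_values):
        # The lengths are only compared when values are already assigned.
        if self.values and len(uncertainty_values) != len(self.values):
            raise ValueError(
                f"Got {len(uncertainty_values)} uncertainty values "
                f"for {len(self.values)} variable values"
            )
        self.uncertainties.append(list(uncertainty_values))
```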

make_dict()

Return all data in this Variable as a dictionary.

The dictionary structure follows the hepdata conventions, so that dumping this dictionary to YAML will give a file that hepdata can read.

Uncertainties associated to this Variable are also written into the dictionary.

This function is intended to be called internally by the Submission object. Except for debugging purposes, no user should have to call this function.

scale_values(factor)

Multiply each value by constant factor. Also applies to uncertainties.

property values

Value getter.

hepdata_lib.dict_constructor(loader, node)

Construct dict.

hepdata_lib.dict_representer(dumper, data)

Represent dict.

.C file reader

class hepdata_lib.c_file_reader.CFileReader(cfile)

Bases: object

Reads ROOT Objects from .C files

property cfile

The .C file this reader reads from.

check_for_comments(line)

Check line for comment

create_tgraph(x_value, y_value)

Function to create pyroot TGraph object

create_tgraph_dict(graph_list, list_of_tgraphs)

Function to create pyroot TGraph dict

create_tgrapherrors(x_value, y_value, dx_value, dy_value)

Function to create pyroot TGraphErrors object

create_tgrapherrors_dict(graph_list)

Function to create pyroot TGraphErrors dict

find_graphs()

Find all TGraphs in .C file

get_graphs()

Parse the .C file trying to find TGraph objects

read_graph(graphname)

Function to read values of a graph

hepdata_lib helper functions.

hepdata_lib.helpers.any_uncertainties_nonzero(uncertainties, size)

Return a mask of bins where any of the uncertainties is nonzero.

hepdata_lib.helpers.check_file_existence(path_to_file)

Check that the given file path exists. If not, raise RuntimeError.

Parameters:

path_to_file (string) – File path to check.

hepdata_lib.helpers.check_file_size(path_to_file, upper_limit=None, lower_limit=None)

Check that the file size is between the upper and lower limits. If not, raise RuntimeError.

Parameters:
  • path_to_file (string) – File path to check.

  • upper_limit (float) – Upper size limit in MB.

  • lower_limit (float) – Lower size limit in MB.
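
A size check of this kind can be approximated in a few lines. The sketch below is not the library code: the name check_size_sketch is hypothetical, and it assumes that 1 MB means 10**6 bytes and that a limit of None disables the corresponding check.

```python
import os


def check_size_sketch(path_to_file, upper_limit=None, lower_limit=None):
    """Raise RuntimeError if the file size (in MB) is outside the limits.

    Hypothetical helper. Assumptions: 1 MB = 10**6 bytes, and a limit
    of None means that side is not checked.
    """
    size_mb = os.path.getsize(path_to_file) / 1e6
    if upper_limit is not None and size_mb > upper_limit:
        raise RuntimeError(f"{path_to_file} is larger than {upper_limit} MB")
    if lower_limit is not None and size_mb < lower_limit:
        raise RuntimeError(f"{path_to_file} is smaller than {lower_limit} MB")
    return size_mb
```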

hepdata_lib.helpers.convert_pdf_to_png(source, target)

Wrapper for the ImageMagick convert utility.

Parameters:
  • source (str) – Source file in PDF format.

  • target (str) – Output file in PNG format.

hepdata_lib.helpers.convert_png_to_thumbnail(source, target)

Wrapper for the ImageMagick convert utility in thumbnail mode.

Parameters:
  • source (str) – Source file in PNG format.

  • target (str) – Output thumbnail file in PNG format.

hepdata_lib.helpers.execute_command(command)

Execute shell command using subprocess. If executable does not exist, return False. For other errors raise RuntimeError. Else return True on success.

Parameters:

command (string) – Command to execute.
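
The three outcomes described above (False for a missing executable, RuntimeError for other failures, True on success) could be reproduced roughly as follows. This is a sketch built on shlex, shutil.which and subprocess.run; the name execute_command_sketch is hypothetical, and the library may detect a missing executable differently.

```python
import shlex
import shutil
import subprocess


def execute_command_sketch(command):
    """Run a command string and report the outcome (hypothetical helper)."""
    args = shlex.split(command)
    # Missing executable: signal with False instead of raising.
    if not args or shutil.which(args[0]) is None:
        return False
    result = subprocess.run(args, capture_output=True, text=True, check=False)
    # Any other failure is reported as a RuntimeError.
    if result.returncode != 0:
        raise RuntimeError(
            f"Command failed with exit code {result.returncode}: {command}"
        )
    return True
```

Returning False rather than raising lets a caller treat an optional external tool as simply unavailable.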

hepdata_lib.helpers.file_is_outdated(file_path, reference_file_path)

Check if the given file is outdated compared to the reference file.

Also returns true if the reference file does not exist.

Parameters:
  • file_path (str) – Path to the file to check.

  • reference_file_path (str) – Path to the reference file.

hepdata_lib.helpers.find_all_matching(path, pattern)

Utility function that works like ‘find’ in bash.
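
A recursive search comparable to find <path> -name <pattern> can be written with os.walk and fnmatch. The sketch below only illustrates the idea; the name find_matching_sketch is hypothetical, and matching on the file name alone (not the full path) is an assumption.

```python
import fnmatch
import os


def find_matching_sketch(path, pattern):
    """Recursively collect files under path whose name matches pattern.

    Hypothetical helper; the pattern uses shell-style wildcards and is
    compared against the file name only.
    """
    matches = []
    for root, _dirs, files in os.walk(path):
        for name in files:
            if fnmatch.fnmatch(name, pattern):
                matches.append(os.path.join(root, name))
    return matches
```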

hepdata_lib.helpers.get_number_precision(value)

Get precision of an input value. Exact integer powers of 10 are assigned the same precision as smaller numbers. For example:

  • get_number_precision(10.0) = 1

  • get_number_precision(10.001) = 2

  • get_number_precision(9.999) = 1
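
One formula that reproduces the three example values is the ceiling of the base-10 logarithm of the absolute value. The sketch below uses that formula; the name number_precision_sketch is hypothetical, the return value of 0 for an input of 0 is an assumption, and nothing here claims to be the library's actual implementation.

```python
import math


def number_precision_sketch(value):
    """Ceiling of log10(|value|); hypothetical helper.

    Assumption: an input of 0 maps to 0, since log10(0) is undefined.
    """
    value = abs(value)
    if value == 0:
        return 0
    return math.ceil(math.log10(value))
```

Because the ceiling of an exact integer is the integer itself, 10.0 gets the same result as 9.999 instead of moving up to the next level.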

hepdata_lib.helpers.get_value_precision_wrt_reference(value, reference)

Relative precision of the first argument with respect to the second one. value and reference can each be a float or an int; value can be a float when reference is an int and vice versa.

Parameters:
  • value (float, int) – First value.

  • reference (float, int) – Reference value (usually the uncertainty on value).

hepdata_lib.helpers.relative_round(value, relative_digits)

Rounds to a given relative precision

hepdata_lib.helpers.round_value_and_uncertainty(cont, val_key='y', unc_key='dy', sig_digits_unc=2)

Round values and uncertainties according to the precision of the uncertainty, and also round the uncertainty to a given number of significant digits. Typical usage:

    reader = RootFileReader("rootfile.root")
    data = reader.read_hist_1d("histogramName")
    round_value_and_uncertainty(data, "y", "dy", 2)

This will round data["y"] to match the precision of data["dy"] for each element, after rounding each element of data["dy"] to 2 significant digits, e.g. 26.5345 +/- 1.3456 -> 26.5 +/- 1.3.

Parameters:
  • cont (dictionary) – Dictionary as returned e.g. by RootFileReader::read_hist_1d().

  • sig_digits_unc (integer) – Number of significant digits used to round the uncertainty.
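
To make the rounding rule concrete without needing a ROOT file, the sketch below applies the same idea to a plain dictionary. It is an independent illustration, not the library code: the names round_pair_sketch and round_container_sketch are hypothetical, and only symmetric uncertainties stored as lists of floats are handled.

```python
import math


def round_pair_sketch(value, uncertainty, sig_digits_unc=2):
    """Round the uncertainty to sig_digits_unc significant digits and the
    value to the same decimal place (hypothetical helper)."""
    if uncertainty == 0:
        return value, uncertainty  # nothing to derive a precision from
    exponent = math.floor(math.log10(abs(uncertainty)))
    decimals = sig_digits_unc - 1 - exponent
    return round(value, decimals), round(uncertainty, decimals)


def round_container_sketch(cont, val_key="y", unc_key="dy", sig_digits_unc=2):
    """Apply round_pair_sketch element-wise, modifying cont in place."""
    pairs = [
        round_pair_sketch(value, unc, sig_digits_unc)
        for value, unc in zip(cont[val_key], cont[unc_key])
    ]
    cont[val_key] = [pair[0] for pair in pairs]
    cont[unc_key] = [pair[1] for pair in pairs]


data = {"y": [26.5345], "dy": [1.3456]}
round_container_sketch(data, "y", "dy", 2)
print(data)  # {'y': [26.5], 'dy': [1.3]}
```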

hepdata_lib.helpers.round_value_and_uncertainty_to_decimals(cont, val_key='y', unc_key='dy', decimals=3)

Round values and uncertainties to a given number of decimals. The default is to round to 3 digits after the decimal point. Possible use case: correlations, where typical values lie between -1 and 1.

Parameters:
  • cont (dictionary) – Dictionary as returned e.g. by RootFileReader::read_hist_1d().

  • decimals (integer) – Number of decimals to round to.

hepdata_lib.helpers.round_value_to_decimals(cont, key='y', decimals=3)

Round all values in a dictionary to a given number of decimals in one go. The default is to round to 3 digits after the decimal point. Possible use case: correlations, where typical values lie between -1 and 1.

Parameters:
  • cont (dictionary) – Dictionary as returned e.g. by RootFileReader::read_hist_1d().

  • decimals (integer) – Number of decimals to round to.

hepdata_lib.helpers.sanitize_value(value)

Handle conversion of input types for internal storage.

Parameters:

value (string, int, or castable to float) – User-side input value to sanitize.

Strings and integers are left alone, everything else is converted to float.
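
The conversion rule stated above is simple enough to sketch directly. The function name sanitize_sketch is hypothetical, and Decimal is used only as a convenient example of a type that is castable to float.

```python
from decimal import Decimal


def sanitize_sketch(value):
    """Leave strings and integers alone, convert everything else to float.

    Hypothetical helper written from the description above.
    """
    if isinstance(value, (str, int)):
        return value
    return float(value)


print(sanitize_sketch("abc"))           # abc
print(sanitize_sketch(3))               # 3
print(sanitize_sketch(Decimal("1.5")))  # 1.5
```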

hepdata_lib utilities to interact with ROOT data formats.

class hepdata_lib.root_utils.RootFileReader(tfile)

Bases: object

Easily extract information from ROOT histograms, graphs, etc

read_graph(path_to_graph)

Extract lists of X and Y values from a TGraph.

Parameters:

path_to_graph (str) – Absolute path in the current TFile.

Returns:

dict – For a description of the contents, check the documentation of the get_graph_points function.

read_hist_1d(path_to_hist, **kwargs)

Read in a TH1.

Parameters:
  • path_to_hist (str) – Absolute path in the current TFile.

  • **kwargs – See below

Keyword Arguments:
  • xlim (tuple) –

    limit x-axis range to consider (xmin, xmax)

  • force_symmetric_errors

    Force readout of symmetric errors instead of determining type automatically

Returns:

dict – For a description of the contents, check the documentation of the get_hist_1d_points function

read_hist_2d(path_to_hist, **kwargs)

Read in a TH2.

Parameters:
  • path_to_hist (str) – Absolute path in the current TFile.

  • **kwargs – See below

Keyword Arguments:
  • xlim (tuple) –

    limit x-axis range to consider (xmin, xmax)

  • ylim (tuple) –

    limit y-axis range to consider (ymin, ymax)

  • force_symmetric_errors

    Force readout of symmetric errors instead of determining type automatically

Returns:

dict – For a description of the contents, check the documentation of the get_hist_2d_points function

read_limit_tree(path_to_tree='limit', branchname_x='mh', branchname_y='limit')

Read in CMS combine limit tree.

Parameters:
  • path_to_tree (str) – Absolute path in the current TFile

  • branchname_x (str) – Name of the branch that identifies each of the toys/parameter points.

  • branchname_y (str) – Name of the branch that contains the limit values.

Returns:

list – Lists with 1+5 entries per toy/parameter point in the file. The entries correspond to the one number in the x branch and the five numbers in the y branch.

read_tree(path_to_tree, branch_name)

Extract a list of values from a tree branch.

Parameters:
  • path_to_tree (str) – Absolute path in the current TFile.

  • branch_name (str) – Name of branch to read.

Returns:

list – The values saved in the tree branch.

retrieve_object(path_to_object)

Generalized function to retrieve a TObject from a file.

There are three use cases:

  1. The object is saved under the exact path given. In this case, the function behaves identically to TFile.Get.

  2. The object is saved as a primitive in a TCanvas. In this case, the path has to be formatted as PATH_TO_CANVAS/NAME_OF_PRIMITIVE.

  3. The object is saved as a primitive in a TPad that is nested in a TCanvas. In this case, the path has to be formatted as CANVAS/PAD1/PAD2…/NAME_OF_PRIMITIVE.

Parameters:

path_to_object (str) – Absolute path in current TFile.

Returns:

TObject – The object corresponding to the given path.

property tfile

The TFile this reader reads from.

hepdata_lib.root_utils.get_graph_points(graph)

Extract lists of X and Y values from a TGraph.

Parameters:

graph (TGraph, TGraphErrors, TGraphAsymmErrors) – The graph to extract values from.

Returns:

dict – Lists of x, y values saved in dictionary (keys are “x” and “y”). If the input graph is a TGraphErrors (TGraphAsymmErrors), the dictionary also contains the errors (keys “dx” and “dy”). For symmetric errors, the errors are simply given as a list of values. For asymmetric errors, a list of tuples of (down,up) values is given.

hepdata_lib.root_utils.get_hist_1d_points(hist, **kwargs)

Get points from a TH1.

Parameters:
  • hist (TH1D) – Histogram to extract points from

  • **kwargs – See below

Keyword Arguments:
  • xlim (tuple) –

    limit x-axis range to consider (xmin, xmax)

  • force_symmetric_errors

    Force readout of symmetric errors instead of determining type automatically

Returns:

dict – Lists of x/y values saved in dictionary. The bin centers are stored under the “x” key. The bin edges may be found under “x_edges” as a list of tuples (lower_edge, upper_edge). The bin contents are stored under the “y” key. Bin content errors are stored under the “dy” key as either a list of floats (symmetric case) or a list of down/up tuples (asymmetric). Symmetric errors are returned if the histogram error option TH1::GetBinErrorOption() returns TH1::kNormal.

hepdata_lib.root_utils.get_hist_2d_points(hist, **kwargs)

Get points from a TH2.

Parameters:
  • hist (TH2D) – Histogram to extract points from

  • **kwargs – See below

Keyword Arguments:
  • xlim (tuple) –

    limit x-axis range to consider (xmin, xmax)

  • ylim (tuple) –

    limit y-axis range to consider (ymin, ymax)

  • force_symmetric_errors

    Force readout of symmetric errors instead of determining type automatically

Returns:

dict – Lists of x/y/z values saved in dictionary. Corresponding keys are “x”/“y” for the values of the bin center on the respective axis. The bin edges may be found under “x_edges” and “y_edges” as a list of tuples (lower_edge, upper_edge). The bin contents are stored under the “z” key. Bin content errors are stored under the “dz” key as either a list of floats (symmetric case) or a list of down/up tuples (asymmetric). Symmetric errors are returned if the histogram error option TH1::GetBinErrorOption() returns TH1::kNormal.