Histogram Factory

hist() takes a large number of arguments, which usually leads to code repetition. The HistogramFactory lets you define default arguments once and override them when creating an individual histogram.
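The idea of defaults that can be overridden per call can be sketched with the standard library alone. The snippet below is only an illustration of the pattern, not how HistogramFactory is implemented; `plot` and `make_plot` are hypothetical names, and `plot` merely returns its arguments instead of drawing anything.

```python
from functools import partial


def plot(data, variable, bins=10, range=None, weight=None):
    """Hypothetical stand-in for a plotting function with many arguments."""
    # Return the resolved arguments so the effect of the defaults is visible.
    return {"data": data, "variable": variable, "bins": bins,
            "range": range, "weight": weight}


# Freeze the arguments that rarely change between plots.
make_plot = partial(plot, "toy-data", bins=20, range=(0, 200), weight="weight")

# Only the variable is passed; everything else comes from the defaults.
default_call = make_plot("higgs_m")

# Keyword arguments given at call time take precedence over the defaults.
override_call = make_plot("tau_pt", bins=12, range=(0, 120))
```

Arguments that are not overridden, such as `weight` in the second call, keep their frozen values.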

[1]:
import pandas as pd
import seaborn as sns

from freeforestml import Variable, Process, Cut, hist, HistogramFactory, McStack, DataStack
from freeforestml import toydata, example_style
example_style()

Setup

Load or generate the toy dataset.

[2]:
df = toydata.get()

Define the processes included in the histogram.

[3]:
p_ztt = Process(r"$Z\rightarrow\tau\tau$", range=(0, 0))
p_sig = Process(r"Signal", range=(1, 1))
p_asimov = Process(r"Asimov", selection=lambda d: d.fpid >= 0)
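The two ways of selecting events above, a range and a callable, can be mimicked in plain pandas. This is a sketch under the assumption that `range=(a, b)` acts as an inclusive cut on the `fpid` column (the column the lambda above uses); the helper names and the small `toy` frame are hypothetical and not part of freeforestml.

```python
import pandas as pd

# Hypothetical miniature dataset; the real toy data has more columns.
toy = pd.DataFrame({"fpid": [0, 0, 1, -1, 1],
                    "weight": [1.0, 0.5, 2.0, 1.0, 1.0]})


def select_range(frame, low, high, column="fpid"):
    """Keep rows whose `column` lies in [low, high] (assumed inclusive)."""
    return frame[(frame[column] >= low) & (frame[column] <= high)]


def select_callable(frame, selection):
    """Keep rows for which the callable returns True."""
    return frame[selection(frame)]


ztt = select_range(toy, 0, 0)                          # fpid == 0
sig = select_range(toy, 1, 1)                          # fpid == 1
asimov = select_callable(toy, lambda d: d.fpid >= 0)   # all non-negative ids
```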

Define stacks. Data is its own stack and should not be stacked on top of the MC prediction.

[4]:
s_bkg = McStack(p_ztt, p_sig)
s_data = DataStack(p_asimov)

Examples

Create a plotting method that has default values for the dataframe, the stacks and the binning.

[5]:
hist_factory = HistogramFactory(df, stacks=[s_bkg, s_data], bins=20, range=(0, 200), selection=None,
     weight="weight")
None
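To make the binning defaults concrete: `bins=20` with `range=(0, 200)` corresponds to 20 equal-width bins of 10 units each, and a weight column contributes per-event weights to the bin contents. The following sketch uses `numpy.histogram` on made-up values, assuming equal-width binning; how freeforestml itself treats values outside the range is not shown here.

```python
import numpy as np

# Made-up values and per-event weights for illustration only.
values = np.array([5.0, 15.0, 15.0, 95.0, 199.0, 250.0])
weights = np.array([1.0, 0.5, 0.5, 2.0, 1.0, 3.0])

# 20 equal-width bins between 0 and 200; numpy ignores values outside the range.
counts, edges = np.histogram(values, bins=20, range=(0, 200), weights=weights)

bin_width = edges[1] - edges[0]   # width of a single bin
total_weight = counts.sum()       # summed weight of the in-range entries
```

With numpy, the entry at 250 falls outside the range and does not contribute to any bin.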

Create a plot for the mass variable. Note that we pass a single argument to the plotting method.

[6]:
v_mmc = Variable(r"$m^H$", "higgs_m", "GeV")
hist_factory(v_mmc)
None
_images/HistogramFactory_14_0.png

Create plots for other variables, this time overriding the default binning.

[7]:
v_tau_pT = Variable(r"$p_\mathrm{T}{\tau}$", "tau_pt", "GeV")
hist_factory(v_tau_pT, bins=12, range=(0, 120))
None
_images/HistogramFactory_16_0.png
[8]:
v_lep_pT = Variable(r"$p_\mathrm{T}{\ell}$", "lep_pt", "GeV")
hist_factory(v_lep_pT, bins=12, range=(0, 120))
None
_images/HistogramFactory_17_0.png