Confusion Matrix

This notebook illustrates how to create a confusion matrix plot directly from a dataset.

[1]:
import pandas as pd
import seaborn as sns

from freeforestml import Variable, Process, Cut, confusion_matrix, HistogramFactory
from freeforestml import toydata, example_style
example_style()

Setup

[2]:
df = toydata.get()
[3]:
p_sig = Process(r"Signal", range=(1, 1))
p_ztt = Process(r"$Z\rightarrow\tau\tau$", range=(0, 0))
[4]:
# Three m_jj regions; each boundary value belongs to exactly one region.
c_low = Cut(lambda d: d.m_jj < 350, label="Low $m^{jj}$")
c_mid = Cut(lambda d: (d.m_jj >= 350) & (d.m_jj < 600), label="Mid $m^{jj}$")
c_high = Cut(lambda d: d.m_jj >= 600, label="High $m^{jj}$")
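
When regions are defined by cuts on a continuous variable, it can be worth checking that values on the boundaries fall into exactly one region. The sketch below does this with plain Python functions instead of Cut objects. The `regions` dictionary and the `matching_regions` helper are hypothetical names introduced for illustration, and the half-open intervals at 350 and 600 are an assumption about the intended binning. A strict `>` on the upper region would leave a value of exactly 600 unassigned.

```python
# Hypothetical stand-ins for the three region cuts, written as plain functions
# of a single m_jj value instead of Cut objects acting on a DataFrame.
regions = {
    "low": lambda m_jj: m_jj < 350,
    "mid": lambda m_jj: 350 <= m_jj < 600,
    "high": lambda m_jj: m_jj >= 600,
}


def matching_regions(m_jj):
    """Return the names of all regions whose cut accepts this m_jj value."""
    return [name for name, passes in regions.items() if passes(m_jj)]


# Probe values on and around both boundaries.
for m_jj in (349.9, 350.0, 599.9, 600.0, 600.1):
    names = matching_regions(m_jj)
    print(m_jj, names)
    # Every value should be claimed by exactly one region.
    assert len(names) == 1
```

The same idea carries over to a DataFrame by summing the boolean masks of all cuts and checking that the sum is one for every event.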

Normalized columns

[5]:
confusion_matrix(df, [p_sig, p_ztt], [c_low, c_mid, c_high],
                 y_label="Region", x_label="Truth Signal", annot=True, weight="weight")
None
[Figure: confusion matrix with normalized columns (_images/ConfusionMatrix_8_0.png)]
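
To make the column normalization concrete, here is a minimal sketch in plain Python. It assumes that columns correspond to truth processes and rows to regions, as the axis labels suggest. The yields are made-up numbers and not taken from the toy dataset, and `normalize_columns` is a hypothetical helper, not part of freeforestml.

```python
# Illustrative weighted yields (made-up numbers, not from the toy dataset).
# Rows: regions (low, mid, high); columns: truth processes (signal, Z->tautau).
yields = [
    [10.0, 60.0],
    [30.0, 30.0],
    [60.0, 10.0],
]


def normalize_columns(matrix):
    """Scale each column so that its entries sum to one."""
    n_columns = len(matrix[0])
    column_sums = [sum(row[j] for row in matrix) for j in range(n_columns)]
    return [[row[j] / column_sums[j] for j in range(n_columns)] for row in matrix]


by_column = normalize_columns(yields)
for row in by_column:
    print(["{:.2f}".format(value) for value in row])
```

Read this way, each cell is the fraction of a given process that ends up in a given region. The sketch does not guard against a column whose total yield is zero.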

Normalized rows

[6]:
confusion_matrix(df, [p_sig, p_ztt], [c_low, c_mid, c_high], normalize_rows=True,
                 y_label="Region", x_label="Truth Signal", annot=True, weight="weight")
None
[Figure: confusion matrix with normalized rows (_images/ConfusionMatrix_10_0.png)]
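
Row normalization can be sketched the same way. Again assuming rows correspond to regions and columns to truth processes, each row is divided by its own total, so a cell gives the share of a region that comes from a given process. The yields are made-up numbers, and `normalize_rows` is a hypothetical helper rather than the library's implementation.

```python
# Illustrative weighted yields (made-up numbers, not from the toy dataset).
# Rows: regions (low, mid, high); columns: truth processes (signal, Z->tautau).
yields = [
    [10.0, 60.0],
    [30.0, 30.0],
    [60.0, 10.0],
]


def normalize_rows(matrix):
    """Scale each row so that its entries sum to one."""
    return [[value / sum(row) for value in row] for row in matrix]


by_row = normalize_rows(yields)
for row in by_row:
    print(["{:.2f}".format(value) for value in row])
```

With these numbers the middle region is an even mix of both processes, while the low and high regions are each dominated by one process. An empty region (row total of zero) would need separate handling.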