{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Histogram Factory"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The number of arguments passed to `hist()` is large and usually a source of code repetation. The `HistogramFactory` is a way to define default argument that can be overridded when creating a histogram."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import pandas as pd\n",
    "import seaborn as sns\n",
    "\n",
    "from freeforestml import Variable, Process, Cut, hist, HistogramFactory, McStack, DataStack\n",
    "from freeforestml import toydata, example_style\n",
    "example_style()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Setup"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Load or geneate toy dataset."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "df = toydata.get()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Define processes included in the histogram."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "p_ztt = Process(r\"$Z\\rightarrow\\tau\\tau$\", range=(0, 0))\n",
    "p_sig = Process(r\"Signal\", range=(1, 1))\n",
    "p_asimov = Process(r\"Asimov\", selection=lambda d: d.fpid >= 0)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Define stacks. Data is it's own stack and should not be stacked on top of the MC prediction."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "s_bkg = McStack(p_ztt, p_sig)\n",
    "s_data = DataStack(p_asimov)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Examples"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Create a default plotting method the has a default value for the dataframe, the stacks and the binning."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "hist_factory = HistogramFactory(df, stacks=[s_bkg, s_data], bins=20, range=(0, 200), selection=None,\n",
    "     weight=\"weight\")\n",
    "None"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Create a plot for the mass variable. Note that we pass a single argument to the plotting method."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "v_mmc = Variable(r\"$m^H$\", \"higgs_m\", \"GeV\")\n",
    "hist_factory(v_mmc)\n",
    "None"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Create a plot for different variables, also overriding the binning."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "v_tau_pT = Variable(r\"$p_\\mathrm{T}{\\tau}$\", \"tau_pt\", \"GeV\")\n",
    "hist_factory(v_tau_pT, bins=12, range=(0, 120))\n",
    "None"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "v_lep_pT = Variable(r\"$p_\\mathrm{T}{\\ell}$\", \"lep_pt\", \"GeV\")\n",
    "hist_factory(v_lep_pT, bins=12, range=(0, 120))\n",
    "None"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.9.2"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}