Utilities for SBND CCnue cross-section analysis.
This package is designed for notebook and script workflows where CAF-derived dataframes
have already been produced (typically via cafpyana) and you want to run selection,
plotting, and uncertainty studies.
- Selection helpers for signal and sideband studies.
- Signal-category definitions for MC truth labeling.
- Histogram utilities with overflow handling.
- Plotting helpers for stacked MC, data overlays, and data/MC ratios.
- Systematic uncertainty tools (covariance, correlation, universe handling, detector-variation helpers).
- I/O helpers for split HDF5 dataframe files (output of cafpyana).
- Common constants and geometry utilities.
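As an illustration of the covariance and correlation items above, here is a minimal NumPy sketch of how a covariance matrix can be built from a set of systematic universes. It is not the code in syst.py: the function name universe_covariance and its arguments are hypothetical, and it assumes each universe is a histogram with the same binning as the central value.

```python
import numpy as np

def universe_covariance(cv_hist, universe_hists):
    """Covariance and correlation matrices from universe histograms.

    cv_hist:        shape (n_bins,), central-value bin contents.
    universe_hists: shape (n_universes, n_bins), one row per universe.
    (Hypothetical helper for illustration, not part of nueana.)
    """
    cv = np.asarray(cv_hist, dtype=float)
    unis = np.asarray(universe_hists, dtype=float)
    diff = unis - cv                       # shift of each universe from the CV
    cov = diff.T @ diff / unis.shape[0]    # average outer product over universes
    sigma = np.sqrt(np.diag(cov))
    denom = np.outer(sigma, sigma)
    # Leave the correlation at zero wherever a bin has zero variance.
    corr = np.divide(cov, denom, out=np.zeros_like(cov), where=denom > 0)
    return cov, corr

cv = np.array([10.0, 20.0])
universes = np.array([[11.0, 22.0], [9.0, 18.0]])
cov, corr = universe_covariance(cv, universes)
```

In this toy input both universes shift the two bins coherently, so the bins come out fully correlated.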
- selection.py: Event selection and signal labeling. Main functions: select, select_sideband, define_signal, define_generic.
- plotting.py: MC/data plotting utilities. Main functions: plot_var, plot_var_pdg, data_plot_overlay, plot_mc_data.
- syst.py: Systematic uncertainty utilities. Main functions: get_syst_hists, get_syst, calc_matrices, get_detvar_systs, get_syst_df, mcstat.
- histogram.py: Histogram wrappers with overflow behavior. Main functions: get_hist1d, get_hist2d.
- io.py: Split-HDF5 dataframe loading and exposure summaries. Main functions: get_n_split, print_keys, load_dfs.
- geometry.py: Detector geometry checks. Main functions: whereTPC.
- constants.py: Analysis dictionaries and default plotting metadata.
- utils.py: DataFrame helpers. Main functions: ensure_lexsorted.
The objects in constants.py are used by the selection, plotting, and systematics code to keep
category definitions and display metadata consistent. Signal colors, labels, and names can all be customized by the user. signal_dict is used by the signal/background definition function in selection.py.
- signal_dict
  - Maps detailed analysis channels to integer IDs used in the signal column.
  - Convention: ID 0 (nueCC) is the signal topology.
  - Includes backgrounds such as numuCCpi0, NCpi0, nonFV, dirt, cosmic, and offbeam.
- signal_labels
  - Human-readable LaTeX labels corresponding to signal_dict order.
  - Used in stacked MC legends (for example by plot_var).
- signal_colors
  - Default color palette corresponding to signal_dict/signal_labels order.
  - Keeps category colors stable across plots.
- generic_dict
  - Coarser category mapping (CCnu, NCnu, nonFV, dirt, cosmic).
  - Used when plotting/labeling with the generic categorization mode.
- generic_labels, generic_colors
  - Display labels and default colors for the generic categories.
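The relationship between these objects can be sketched as three containers that share one ordering. The values below are placeholders, not the contents of constants.py: only the convention that ID 0 is nueCC comes from the description above, while the other IDs, the label strings, the colors, and the helper style_by_id are assumptions made for illustration.

```python
# Placeholder category map: only "nueCC" -> 0 is stated in this README.
example_signal_dict = {
    "nueCC": 0, "numuCCpi0": 1, "NCpi0": 2, "nonFV": 3,
    "dirt": 4, "cosmic": 5, "offbeam": 6,
}
# Labels and colors are listed in the same order as the dictionary entries.
example_signal_labels = [
    r"$\nu_e$ CC", r"$\nu_\mu$ CC $\pi^0$", r"NC $\pi^0$",
    "Non-FV", "Dirt", "Cosmic", "Off-beam",
]
example_signal_colors = ["C0", "C1", "C2", "C3", "C4", "C5", "C6"]

def style_by_id(cat_dict, labels, colors):
    """Bundle name, label, and color per integer ID (hypothetical helper)."""
    if not (len(cat_dict) == len(labels) == len(colors)):
        raise ValueError("category dict, labels, and colors must have equal length")
    return {
        cat_id: {"name": name, "label": label, "color": color}
        for (name, cat_id), label, color in zip(cat_dict.items(), labels, colors)
    }

styles = style_by_id(example_signal_dict, example_signal_labels, example_signal_colors)
```

Checking the lengths up front is one way to catch a customized label or color list that has fallen out of step with the category dictionary.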
Important behavior:
- Because flux is read at import time in constants.py, importing nueana requires that fluxfile exists and is readable in the current environment.
Most users should add cafpyana to their path as well. In a notebook:
    import sys
    sys.path.append("<path/to>/cafpyana")

    import numpy as np
    import nueana as nue

    file = "/path/to/cafpyana/df.df"
    keys = ['mcnu', 'hdr', 'nuecc']
    mc_dfs = nue.load_dfs(file, keys)

    # mc_dfs['nuecc'] should be a CAF-style pandas DataFrame with the expected columns.
    df_final = nue.select(mc_dfs['nuecc'], savedict=False)

    # Define signal labels for MC.
    df_labeled = nue.define_signal(df_final)

    # Make a basic stacked plot with a data overlay.
    # df_data is not defined above; it is assumed to be a data dataframe
    # that has already been loaded and selected.
    bins = np.linspace(0.0, 2.0, 21)
    fig, ax_main, ax_sub, _ = nue.plot_mc_data(
        mc_df=df_labeled,
        data_df=df_data,
        var=("primshw", "shw", "reco_energy", "", "", ""),
        bins=bins,
        xlabel="Reco shower energy [GeV]",
        title="nue candidate selection",
    )

select(...) can return either all intermediate stages as a dictionary (savedict=True) or only the
final dataframe (savedict=False). Default stage names for the nueCC selection:
- preselection
- flash matching
- shower energy
- muon rejection
- conversion gap
- dEdx
- opening angle
- shower length
- shower density
Use stage="..." to stop at an intermediate point.
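The savedict and stage options can be pictured with a toy stand-in for a staged selection. This is not nueana.select: the function toy_select, the column names, and the cut values are all invented for illustration, and whether the real function returns partial dictionaries when both options are combined is not stated here.

```python
import pandas as pd

def toy_select(df, cuts, savedict=False, stage=None):
    """Apply cuts in order; `cuts` maps stage name -> function(df) -> boolean mask.

    Hypothetical stand-in that mimics the savedict/stage pattern only.
    """
    if stage is not None and stage not in cuts:
        raise KeyError(f"unknown stage: {stage}")
    stages = {}
    current = df
    for name, cut in cuts.items():
        current = current[cut(current)]   # keep rows passing this stage
        stages[name] = current
        if name == stage:                 # stop early at the requested stage
            break
    return stages if savedict else current

# Made-up columns and thresholds, chosen only to make the example small.
toy_cuts = {
    "preselection": lambda d: d["in_fv"],
    "shower energy": lambda d: d["reco_energy"] > 0.2,
    "dEdx": lambda d: d["dedx"] < 3.5,
}
toy_df = pd.DataFrame({
    "in_fv": [True, True, False, True],
    "reco_energy": [0.5, 0.1, 0.9, 0.8],
    "dedx": [2.0, 2.0, 2.0, 5.0],
})

all_stages = toy_select(toy_df, toy_cuts, savedict=True)          # dict of dataframes
final_df = toy_select(toy_df, toy_cuts)                           # last stage only
partial_df = toy_select(toy_df, toy_cuts, stage="shower energy")  # stop early
```

Keeping every intermediate dataframe is convenient for cutflow tables and efficiency plots, while returning only the final dataframe uses less memory.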
- constants.py reads the flux ROOT file at import time. If the file is unavailable, importing nueana.constants (or full nueana) will fail.
- Several utilities assume pandas MultiIndex columns with CAF-style naming.
- Many routines expect a signal column to already be defined (use define_signal or define_generic where appropriate).
- Overflow behavior in histograms is enabled by default (overflow=True) and folds values into the last bin.
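The MultiIndex and overflow notes above can be illustrated together with a small sketch. It does not reproduce get_hist1d: the helper hist1d_with_overflow is hypothetical, the toy dataframe is invented, and the six-level layout of the signal column is an assumption that mirrors the var tuple used in the example above.

```python
import numpy as np
import pandas as pd

def hist1d_with_overflow(values, bins, overflow=True):
    """Histogram that optionally folds entries above the top edge into the last bin.

    Hypothetical helper for illustration, not the nueana implementation.
    """
    values = np.asarray(values, dtype=float)
    bins = np.asarray(bins, dtype=float)
    if overflow:
        # Move anything at or beyond the top edge to the center of the last bin.
        last_center = 0.5 * (bins[-2] + bins[-1])
        values = np.where(values >= bins[-1], last_center, values)
    counts, edges = np.histogram(values, bins=bins)
    return counts, edges

# Toy dataframe with CAF-style six-level MultiIndex columns.
var = ("primshw", "shw", "reco_energy", "", "", "")
columns = pd.MultiIndex.from_tuples([var, ("signal", "", "", "", "", "")])
df = pd.DataFrame([[0.3, 0], [1.7, 1], [2.5, 0], [9.0, 2]], columns=columns)

# Routines that need category labels can check for the column up front.
has_signal = "signal" in df.columns.get_level_values(0)

bins = np.linspace(0.0, 2.0, 5)          # edges: 0, 0.5, 1.0, 1.5, 2.0
energies = df[var]                        # full-depth tuple selects one Series
counts_folded, _ = hist1d_with_overflow(energies, bins)
counts_plain, _ = hist1d_with_overflow(energies, bins, overflow=False)
```

With folding enabled, the two entries above 2.0 are counted in the last bin; with overflow=False they are dropped from the histogram.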