neural end-to-end-optimised summary statistics
arxiv.org/abs/2203.05570
Leverages the shoulders of giants (jax and pyhf) to differentiate through a high-energy physics analysis workflow, including the construction of the frequentist profile likelihood.
If you're more of a video person, see this talk given by Nathan on the broader topic of differentiable programming in high-energy physics, which also covers neos.
Some things need to happen first. Click here for more info -- I wrote them up!
Do you want to chat about neos? Join us in Mattermost.
Please cite our newly released paper:
@article{neos,
Author = {Nathan Simpson and Lukas Heinrich},
Title = {neos: End-to-End-Optimised Summary Statistics for High Energy Physics},
Year = {2022},
Eprint = {arXiv:2203.05570},
doi = {10.48550/arXiv.2203.05570},
url = {https://doi.org/10.48550/arXiv.2203.05570}
}
In a Python 3 environment, run the following:
pip install --upgrade pip setuptools wheel
pip install neos
pip install git+http://github.com/scikit-hep/pyhf.git@make_difffable_model_ctor
With this, you should be able to run the demo notebook demo.ipynb on your pc :)
This workflow is as follows:
- From a set of normal distributions with different means, we'll generate four blobs of (x,y) points, corresponding to a signal process, a nominal background process, and two variations of the background from varying the background distribution's mean up and down.
- We'll then feed these points into the previously defined neural network for each blob, and construct a histogram of the output using kernel density estimation. The difference between the two background variations is used as a systematic uncertainty on the nominal background.
- We can then leverage the magic of pyhf to construct an event-counting statistical model from the histogram yields.
- Finally, we calculate the p-value of a test between the nominal signal and background-only hypotheses. This uses the familiar profile likelihood-based test statistic.
This counts as one forward pass of the workflow -- we then optimize the neural network by gradient descent, backpropagating through the whole analysis!
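To make the idea of backpropagating through the analysis concrete, here is a rough JAX sketch of the first two steps plus a gradient-descent loop. It rests on several assumptions: the blob means, network size, bandwidth, binning and learning rate are all hypothetical, and the loss is a simple stand-in (a per-bin s²/(s + b + σ_b²) figure of merit) rather than the profile-likelihood p-value that neos actually differentiates.

```python
import jax
import jax.numpy as jnp
from jax.scipy.stats import norm

N_EVENTS = 500  # events per blob (hypothetical)


def make_blobs(key):
    """Four Gaussian blobs: signal, nominal background, background up/down."""
    means = {  # hypothetical means
        "sig": (-1.0, -1.0),
        "bkg": (1.0, 1.0),
        "bkg_up": (1.5, 1.5),
        "bkg_down": (0.5, 0.5),
    }
    keys = jax.random.split(key, len(means))
    return {
        name: jax.random.normal(k, (N_EVENTS, 2)) + jnp.array(mu)
        for k, (name, mu) in zip(keys, means.items())
    }


def init_params(key, hidden=8):
    k1, k2 = jax.random.split(key)
    return {
        "w1": 0.5 * jax.random.normal(k1, (2, hidden)),
        "b1": jnp.zeros(hidden),
        "w2": 0.5 * jax.random.normal(k2, (hidden, 1)),
        "b2": jnp.zeros(1),
    }


def network(params, x):
    """Tiny one-hidden-layer network mapping (x, y) points to a score in (0, 1)."""
    h = jnp.tanh(x @ params["w1"] + params["b1"])
    return jax.nn.sigmoid(h @ params["w2"] + params["b2"]).ravel()


def kde_hist(scores, edges, bandwidth=0.1):
    """Differentiable histogram: each event is smeared into a Gaussian of
    width `bandwidth`, and each bin collects the Gaussian mass between its edges."""
    cdf = norm.cdf(edges[:, None], loc=scores[None, :], scale=bandwidth)
    return (cdf[1:] - cdf[:-1]).sum(axis=1)


def loss(params, blobs, edges):
    """Stand-in objective (NOT the neos p-value): negative summed
    s^2 / (s + b + sigma_b^2), normalised by the number of events."""
    s, b, up, down = (
        kde_hist(network(params, blobs[name]), edges)
        for name in ("sig", "bkg", "bkg_up", "bkg_down")
    )
    sigma_b = 0.5 * (up - down)  # systematic from the up/down difference
    return -jnp.sum(s**2 / (s + b + sigma_b**2 + 1e-6)) / N_EVENTS


key_data, key_net = jax.random.split(jax.random.PRNGKey(0))
blobs = make_blobs(key_data)
edges = jnp.linspace(0.0, 1.0, 5)  # four bins on the network output
params = init_params(key_net)

value_and_grad = jax.jit(jax.value_and_grad(loss))
initial_loss, _ = value_and_grad(params, blobs, edges)

learning_rate = 0.1
for _ in range(100):
    _, grads = value_and_grad(params, blobs, edges)
    # Plain gradient descent on every parameter of the network.
    params = jax.tree_util.tree_map(
        lambda p, g: p - learning_rate * g, params, grads
    )

final_loss = loss(params, blobs, edges)
signal_hist = kde_hist(network(params, blobs["sig"]), edges)
print("loss before:", float(initial_loss), "after:", float(final_loss))
```

The kernel density estimate is what makes the histogram yields smooth functions of the network weights; a hard-binned histogram would have zero gradient almost everywhere.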
A big thanks to the teams behind jax, fax, jaxopt and pyhf for their software and support.