In [1]:
# classic stuff
%matplotlib inline
import numpy as np
import scipy as sp
import matplotlib.pyplot as plt
import os,sys
import tables
import scipy.stats
import scipy.optimize
import itertools

# some nice formatting things
plt.rc('font', family='serif', size=20)
plt.rc('legend', fontsize = 20)
plt.rc('axes', labelsize = 20)
plt.rc('axes', titlesize = 20)

Model of the data

To model the data we are going to use the detector response tensor, $\mathcal{R}$, and the Honda+Gaisser flux model, $\mathcal{F}$. The expected number of events is \begin{equation} N_{ei,ci} = \sum_{p\in \{\nu,\bar\nu\}}\sum_{et} \mathcal{R}^p_{ei,ci,et}\, \mathcal{F}^p_{ci,et}, \end{equation} where $ei$ is the reconstructed energy proxy bin index, $ci$ the $\cos\theta$ bin index, and $et$ the true neutrino energy bin index. Note that we have added the neutrino and antineutrino contributions.

In [2]:
NominalResponse = tables.open_file("dataExA/nufsgen_nominal_response_array.h5", mode='r',)
In [3]:
HondaGaisser = tables.open_file("dataExA/HondaGaisser.h5", mode='r',)
In [4]:
neutrino_atmospheric_flux = (
    HondaGaisser.root.average_flux_nu_pion[:] +
    HondaGaisser.root.average_flux_nu_kaon[:]
)

antineutrino_atmospheric_flux = (
    HondaGaisser.root.average_flux_nubar_pion[:] +
    HondaGaisser.root.average_flux_nubar_kaon[:]
)
In [5]:
# Stack the neutrino and antineutrino pieces and contract over species (p) and
# true energy (k=et): N_{ei,ci} = sum_p sum_et R^p_{ei,ci,et} F^p_{ci,et}
response_array = np.array([NominalResponse.root.neutrino_response_array,
                           NominalResponse.root.antineutrino_response_array])
flux_array = np.array([neutrino_atmospheric_flux,
                       antineutrino_atmospheric_flux])
ExpectedEventsPerBin = np.einsum('pijk,pjk->ij', response_array, flux_array)
In [6]:
# centers of the true-energy bins, used for the spectral tilt below
energy_bin_centers = (HondaGaisser.root.e_true_bin_edges[:-1] + HondaGaisser.root.e_true_bin_edges[1:])/2.

Introducing uncertainties in the flux via nuisance parameters

We can try to parametrize the uncertainties in the flux by means of what we call nuisance parameters. In this exercise we will only consider uncertainties in the neutrino flux. We will parametrize the neutrino flux in the following way \begin{equation} \phi_{\rm atm}(E_\nu,\cos\theta) = N_0 \left( \phi_\pi + R_{\pi/K}\, \phi_K \right) \left(\frac{E_\nu}{E_0}\right)^{-\Delta\gamma}, \end{equation} where $\phi_\pi$ and $\phi_K$ represent either the neutrino or antineutrino components and we have introduced three nuisance parameters:

overall normalization of the atmospheric flux ($N_0$),

relative pion to kaon contribution ($R_{\pi/K}$),

and the change of the cosmic-ray spectral slope ($\Delta \gamma$).

($E_0$ is just a pivot point, which is degenerate with $N_0$)

In [7]:
# Define a function that takes in model parameters and produces the expected events from the detector
# 2D array with an energy distribution for each zenith angle

def GetExpectedEvents(params):
    if len(params) == 3:
        N0, PIKR, DeltaGamma = params
        E0 = 2020.0  # default pivot energy, degenerate with N0
    else:
        N0, PIKR, DeltaGamma, E0 = params
    # Apply the spectral tilt along the true-energy axis. Note that the code
    # multiplies by (E/E0)^{+DeltaGamma}, the opposite sign convention to the
    # formula above.
    tilt = lambda array: np.apply_along_axis(
        lambda slide: np.multiply(slide,
                                  np.power(energy_bin_centers/E0, DeltaGamma)),
        1, array)

    # pion and kaon contributions, with the kaon piece scaled by PIKR
    neutrino_atmospheric_flux = tilt(HondaGaisser.root.average_flux_nu_pion[:] +
                                     PIKR*HondaGaisser.root.average_flux_nu_kaon[:])
    antineutrino_atmospheric_flux = tilt(HondaGaisser.root.average_flux_nubar_pion[:] +
                                         PIKR*HondaGaisser.root.average_flux_nubar_kaon[:])

    flux_array = N0*np.array([neutrino_atmospheric_flux,
                              antineutrino_atmospheric_flux])
    return np.einsum('pijk,pjk->ij', response_array, flux_array)

Construct realizations from the model

We do not want to start our studies on the data itself. We will first use the model to study how well we can reconstruct the parameters when fluctuations are included. We can construct an ensemble of experiments by introducing Poisson fluctuations.

In [8]:
# Applies Poisson fluctuations to the expectation to simulate a dataset

# This allows us to create fake datasets (pseudo-experiments)

def ConstructRealizationsFromHypothesis(expectation):
    shape=expectation.shape
    e_flat = expectation.flat
    realization = np.zeros(len(e_flat))
    realization[e_flat != 0] = scipy.stats.poisson.rvs(e_flat[e_flat != 0])
    return np.reshape(realization,shape)

# Returns a Poisson realization of the given hypothesis together with its expectation
def test(params):
    exp = GetExpectedEvents(params)
    return (ConstructRealizationsFromHypothesis(exp), exp)

Constructing a Likelihood problem

We are going to construct a Poisson likelihood, which is defined as \begin{equation} \mathcal{L}(N_0,R_{\pi/K},\Delta \gamma) = \left(\prod_i\mathcal{P}_{\rm Poisson}\left(\mathrm{data}_i \mid \mu = \mu_i(N_0,R_{\pi/K},\Delta \gamma)\right)\right)\left(\prod_\eta \pi(\eta)\right), \end{equation} where the index $i=(ei,ci)$ runs over all bins, $\mathcal{P}_{\rm Poisson}$ is the Poisson probability mass function, and $\pi(\eta)$ are the priors on the parameters. Since the likelihood can be a very small number it is better to compute $\log\mathcal{L}$. Notice that this definition of the likelihood includes the priors on the parameters, which is not correct in the Bayesian construction, where they are kept separate; sorry for the bad notation.

$ \Large{\mathcal{L}(\mu)= P(\mathrm{data}\mid\mu)}$

$ \Large{\mathcal{L}(\mu)= \prod_{\rm bins} P(N_i\mid\mu_i)}$

$ \Large{\mathcal{L}(\mu)= \prod_{\rm bins} \frac{e^{-\mu_i}\,\mu_i^{N_i}}{N_i!}}$

In words, this says the likelihood is the probability of observing the data given some model parameters, $\mu$. Since the bins (indexed by $i=(e,\cos\theta)$) are independent, this is the same as multiplying the likelihoods of the individual bins.

The third line follows because the count in each bin is assumed to be Poisson-distributed.

The famous LLH is the "log likelihood," meaning we take the log of the third expression above.

In [9]:
# Poisson probability per bin, plus optional priors on the parameters
# Priors are expected values from theory or other experiments

# params are the hypothesis parameters, exp is the expectation computed from them
def LLH(data, exp, params=None, priors=None):
    stat_LLH = np.sum(scipy.stats.poisson.logpmf(data, exp))
    if priors is None or len(priors) == 0:
        return stat_LLH
    prior_LLH = sum([prior(param) for prior, param in itertools.izip(priors, params)])
    return stat_LLH + prior_LLH

# Without priors this returns the purely statistical LLH, i.e. only data and expectation enter

Reduced and saturated likelihood

It is sometimes useful to introduce the reduced likelihood, defined as \begin{equation} \mathcal{L}_r(\theta) = \mathcal{L}(\theta) - \mathcal{L}_s, \end{equation} where the saturated likelihood $\mathcal{L}_s$ is, in our Poisson construction, the likelihood evaluated with $expectation = observation$ in each bin. It can be shown that for a Poisson likelihood with a large number of events per bin, $-2\mathcal{L}_r$ converges to a $\chi^2$ distribution with dof = # bins $-$ # of parameters.

In [10]:
def chi2(data, exp, params, priors, gauss):
    param_LLH = LLH(data, exp, params, priors)
    # saturated LLH is the BEST POSSIBLE LLH because "exp" = "data"
    sat_LLH = LLH(data, data, [g[0] for g in gauss], priors)
    return -2*(param_LLH - sat_LLH)

Frequentist confidence interval

Constructing a profile likelihood on the parameter of interest

In the frequentist construction we will set a confidence interval on the pion/kaon ratio using Wilks' theorem. In order to do this we will use the profile likelihood technique (not to be confused with the Bayesian notion of marginalization), where the likelihood of the parameter of interest is given by \begin{equation} \log\mathcal{L}(\theta_{\rm interest}) = \max_{\theta_{\rm nuisance}} \log\mathcal{L}(\theta_{\rm interest},\theta_{\rm nuisance}), \end{equation} which we can do numerically by minimizing $-\log\mathcal{L}$.

Finding the frequentist confidence interval

To construct the confidence interval we will use Wilks' theorem, which states that \begin{equation} -2(\mathcal{L} - \max \mathcal{L}) \sim \chi^2(\mathrm{dof}), \end{equation} where $\mathrm{dof}$ is, in this case, the number of parameters of interest (one here, the pion/kaon ratio).

One can define the one sigma interval by then doing \begin{equation} -2(\mathcal{L} - \max \mathcal{L}) = 1 \end{equation} or, equivalently, by finding the root of \begin{equation} \mathcal{L} - \max \mathcal{L} + 1/2 = 0 \end{equation}

In [11]:
# Minimize LLH w.r.t nuisance parameters
def profileLLH(r, data, priors):
    return scipy.optimize.minimize(
        lambda x: -LLH(data, GetExpectedEvents((x[0],r,x[1])), (x[0],r,x[1]), priors),
        np.array([1.0, 0.0]),
        method="L-BFGS-B",
        bounds=np.array([[.01, 2.], [-0.3, 0.3]]),
        jac=False
    )

# Get the LLH profile as a function of R.
# Each returned point is (best-fit N0, R, best-fit DeltaGamma, -LLH at the minimum)
def get_profile(r_ar, data, priors):
    points = [(lambda x: (x['x'][0], r, x['x'][1], x['fun']))(profileLLH(r, data, priors)) for r in r_ar]
    return points

# Fit the -LLH profile to a quadratic and return the 1 sigma range using Wilks' theorem
def get_1s_range(points):
    # points are (R, -LLH); the minimum of -LLH corresponds to the best fit
    min_point = min(enumerate(points), key=lambda p: p[1][1])
    min_i = min_point[0]
    max_LLH = -min_point[1][-1]
    # walk outwards from the minimum until the profile stops rising, to keep
    # the quadratic fit within the well-behaved region around the minimum
    left_bound, right_bound = 0, len(points)-1
    for i in range(min_i + 1, len(points)):
        if points[i][1] < points[i-1][1]:
            right_bound = i
            break
    for i in range(min_i - 1, -1, -1):
        if points[i][1] < points[i+1][1]:
            left_bound = i
            break

    fit_points = points[int(np.floor((left_bound+min_i)/2)):int(np.ceil((right_bound+min_i)/2))+1]
    fit_x = [p[0] for p in fit_points]
    fit_y = [p[1] for p in fit_points]
    quad = lambda x,a,b,c: a*x**2+b*x+c
    fit = scipy.optimize.curve_fit(quad, fit_x, fit_y, (1,0,1))
    a,b,c = fit[0]
    # solve quad(x) = 0.5 - max_LLH, i.e. LLH(x) = max_LLH - 1/2 (the 1 sigma crossings)
    x1 = (-b-np.sqrt(b**2-4*a*(c+max_LLH-0.5)))/(2*a)
    x2 = (-b+np.sqrt(b**2-4*a*(c+max_LLH-0.5)))/(2*a)
    return (min_point[1][0], x1, x2)

Brazilian construction for the parameter confidence interval

Construct an ensemble of realizations under some hypothesis, say N0=1, PIKR=1, DeltaGamma=1. For each realization calculate the confidence interval for PIKR following the procedure outlined above. This gives an ensemble of upper and lower bounds; the median of the lower (upper) bounds gives the expected lower (upper) bound. One can then construct an interval containing the median such that $$\mathcal{I}(k) = \left\{\theta : pdf(\theta) \ge k \right\}$$ and $$\int_\mathcal{I(k)} d\theta ~ pdf(\theta) = 1 - \alpha,$$ where the $pdf$ is constructed out of the ensemble of bounds and $1-\alpha$ is the confidence level of the band.
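
A minimal sketch of this procedure, reusing GetExpectedEvents, ConstructRealizationsFromHypothesis, get_profile and get_1s_range from above, could look as follows. The injected hypothesis, the number of pseudo-experiments and the use of central percentiles (instead of the highest-density construction described above) are illustrative simplifications.

In [ ]:
# Sketch of the Brazilian-band construction. The injected hypothesis and
# n_trials are illustrative choices; percentiles replace the highest-density
# interval described above for simplicity.
def brazilian_band(injected_params, r_ar, priors, n_trials=100):
    truth = GetExpectedEvents(injected_params)
    lower_bounds, upper_bounds = [], []
    for _ in range(n_trials):
        # one pseudo-experiment drawn from the injected hypothesis
        realization = ConstructRealizationsFromHypothesis(truth)
        points = get_profile(r_ar, realization, priors)
        best_R, lo, hi = get_1s_range([(p[1], p[3]) for p in points])
        lower_bounds.append(lo)
        upper_bounds.append(hi)
    lower_bounds, upper_bounds = np.array(lower_bounds), np.array(upper_bounds)
    return {
        # expected (median) lower/upper bound
        'median': (np.median(lower_bounds), np.median(upper_bounds)),
        # central 68% band of each bound distribution
        'band_lower': np.percentile(lower_bounds, [16, 84]),
        'band_upper': np.percentile(upper_bounds, [16, 84]),
    }

# e.g. brazilian_band((1.0, 1.0, 0.0), r_ar, priors, n_trials=50)
# (using the r_ar grid and priors defined further below)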

Bayesian credibility intervals

Constructing a marginalized likelihood

In Bayesian statistics the likelihood is not profiled but rather marginalized, which is defined as $$\mathcal{L}(\theta_{interest}) = \int_{\theta_{nuisance}} d\theta_{nuisance}\, \mathcal{L}(\theta_{interest},\theta_{nuisance})\, \pi(\theta_{nuisance}),$$ where here the likelihood tacitly includes the priors on the parameters (sorry for the bad notation, but I do it so as not to change the notation of $\mathcal{L}$ when switching to the Bayesian approach).
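
A minimal sketch of how this integral could be evaluated numerically on a grid of the nuisance parameters, reusing LLH and GetExpectedEvents from above; the grid ranges and spacing are arbitrary choices, and, following the notation above, the priors are already folded into LLH.

In [ ]:
# Sketch: marginalize the likelihood over the nuisance parameters (N0, DeltaGamma)
# on a coarse grid. Grid ranges/steps are illustrative choices; the constant grid
# cell area is dropped since it only shifts log L by a constant.
def log_marginal_likelihood(r, data, priors,
                            n0_grid=np.linspace(0.5, 1.5, 21),
                            dg_grid=np.linspace(-0.2, 0.2, 21)):
    log_ls = []
    for n0 in n0_grid:
        for dg in dg_grid:
            params = (n0, r, dg)
            log_ls.append(LLH(data, GetExpectedEvents(params), params, priors))
    log_ls = np.array(log_ls)
    # log-sum-exp for numerical stability
    m = log_ls.max()
    return m + np.log(np.sum(np.exp(log_ls - m)))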

Constructing bayesian credibility regions

To construct the Bayesian credibility region on a parameter $\theta$ we need to calculate the posterior, which is given by

$$Posterior(\theta) = pdf(\theta) = k \mathcal{L}(\theta) \pi(\theta)$$

where $k$ is a normalization constant that makes the posterior a pdf, i.e. it integrates to one. The credibility interval is then the highest-density interval $$\mathcal{I}(c) = \left\{\theta : pdf(\theta) \ge c \right\}$$ with the threshold $c$ chosen such that $$\int_\mathcal{I(c)} d\theta ~ pdf(\theta) = 1 - \alpha,$$ where the $pdf$ is the posterior and $1-\alpha$ is the credibility level of the interval.
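
As a sketch under the same assumptions, one could evaluate the posterior of $R_{\pi/K}$ on a grid of values and grow the highest-density region until it contains $1-\alpha$ of the probability; log_marginal_likelihood is the hypothetical helper sketched above (which, as noted, already folds the prior on R into LLH).

In [ ]:
# Sketch: highest-posterior-density credible interval for R from a grid scan.
def credible_interval(r_ar, data, priors, alpha=0.32):
    log_post = np.array([log_marginal_likelihood(r, data, priors) for r in r_ar])
    post = np.exp(log_post - log_post.max())
    post /= post.sum()                      # normalize on the grid
    # include grid points from highest to lowest posterior until 1 - alpha is enclosed
    order = np.argsort(post)[::-1]
    cum = np.cumsum(post[order])
    selected = order[:np.searchsorted(cum, 1.0 - alpha) + 1]
    return r_ar[selected].min(), r_ar[selected].max()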

Loading the data

For this exercise we will use one year of IceCube's public data. Let's load the data in the following way

In [12]:
data = np.genfromtxt("dataExA/observed_events.dat")
# convert the zenith-angle column to cos(zenith) to match the response binning
data[:,1] = np.cos(data[:,1])
In [13]:
data,_,_ = np.histogram2d(data[:,0],
                          data[:,1],
                          bins = [NominalResponse.root.proxy_energy_bin_edges,HondaGaisser.root.costh_reco_bin_edges])

First look at the data.

Make the energy and zenith distributions of the data and compare them with your hypothesis.
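
One possible way to do this is to sum the 2D histograms over one axis to obtain the 1D projections and overlay the nominal expectation from In [5]; the logarithmic x-axis below assumes the proxy-energy binning is logarithmic.

In [ ]:
# Compare the observed energy-proxy and cos(zenith) distributions with the
# nominal expectation ExpectedEventsPerBin.
proxy_edges = NominalResponse.root.proxy_energy_bin_edges[:]
costh_edges = HondaGaisser.root.costh_reco_bin_edges[:]

fig, (ax_e, ax_z) = plt.subplots(1, 2, figsize=(16, 6))

# energy distribution: sum over the cos(theta) axis
ax_e.step(proxy_edges[:-1], data.sum(axis=1), where='post', label='data')
ax_e.step(proxy_edges[:-1], ExpectedEventsPerBin.sum(axis=1), where='post', label='nominal model')
ax_e.set_xscale('log')  # assumes logarithmic proxy-energy bins
ax_e.set_xlabel('energy proxy')
ax_e.set_ylabel('events')
ax_e.legend()

# zenith distribution: sum over the energy axis
ax_z.step(costh_edges[:-1], data.sum(axis=0), where='post', label='data')
ax_z.step(costh_edges[:-1], ExpectedEventsPerBin.sum(axis=0), where='post', label='nominal model')
ax_z.set_xlabel(r'$\cos\theta$')
ax_z.set_ylabel('events')
ax_z.legend()

plt.tight_layout()
plt.show()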

In [14]:
# Construct Gaussian priors for the model parameters (N0, PIKR, DeltaGamma)
gauss = ((1, 0.3), (1, 0.1), (0, 0.05))
priors = (
    lambda x:scipy.stats.norm.logpdf(x, loc=gauss[0][0], scale=gauss[0][1]),    
    lambda x:scipy.stats.norm.logpdf(x, loc=gauss[1][0], scale=gauss[1][1]),
    lambda x:scipy.stats.norm.logpdf(x, loc=gauss[2][0], scale=gauss[2][1])
)

# Define search space for R
r_ar = np.arange(0.9, 1.2, 0.01)

# Get profile points
points = get_profile(r_ar, data, priors)
points_R = [p[1] for p in points]
points_nLLH = [p[3] for p in points]

# Plot the negative LLH profile for R
plt.plot(points_R, points_nLLH)
plt.xlabel('R')
plt.ylabel('-LLH')
plt.show()

# Find the 1-sigma range
result = (lambda x: (x[0], x[1]-x[0], x[2]-x[0]))(get_1s_range([(p[1], p[3]) for p in points]))
print "R = %.3f (%.3f,+%.3f)" % result
R = 1.110 (-0.073,+0.080)
In [ ]: