Diffuse Fit Project

Imports

We need numpy, scipy, scipy.stats, and json

In [ ]:
import numpy as np
import scipy as sp
import scipy.stats
import scipy.optimize  # needed later for sp.optimize.minimize
import json

json.load

Signature: json.load(fp, cls=None, object_hook=None, parse_float=None, parse_int=None, parse_constant=None, object_pairs_hook=None, **kw)
Docstring:
Deserialize fp (a .read()-supporting file-like object containing a JSON document) to a Python object.

Remember to unzip the tarball toy_data.tar.gz
tar -xzvf toy_data.tar.gz

Simulation file hese_toy_simulation.json
Data file hese_toy_data.json

Load the simulation

In [ ]:
f = open("hese_toy_simulation.json", "r")
json_contents = json.load(f)
In [ ]:
simulation_mapping = json_contents["mapping"]
simulation_events = np.array(json_contents["events"])
del json_contents
In [ ]:
print(type(simulation_mapping))

Load the data

In [ ]:
f = open("hese_toy_data.json", "r")
json_contents = json.load(f)
data_mapping = json_contents["mapping"]
data_events = np.array(json_contents["events"])

Define event weighting

In this case the generation probability includes information other than just the spectrum. That information is accounted for in the stored weights.
So to weight our simulation, we only need to concern ourselves with the flux.
Recall that the weight is the new flux divided by the injected flux (times other weighting factors).

$w_i=w\frac{\Phi_{test}(E_i,\theta_i)}{\Phi_{generated}(E_i,\theta_i)}$
In this case, however, we have separate astrophysical and atmospheric weights, so we will have to reweight them separately.
$w_{i,{atmo}}=w_{atmo}\frac{\Phi_{test, {atmo}}(E_i,\theta_i)}{\Phi_{generated, {atmo}}(E_i,\theta_i)}$
$w_{i,{astro}}=w_{astro}\frac{\Phi_{test, {astro}}(E_i,\theta_i)}{\Phi_{generated, {astro}}(E_i,\theta_i)}$
$w_{i}=w_{i,{atmo}}+w_{i,{astro}}$
For the atmospheric component our injected flux is $1.0$ times the expectation
So $w_{i,{atmo}}=N_{atmo} \cdot w_{atmo}$
For the astrophysical component we are using a powerlaw spectrum, and our injected flux is $1.0\cdot \left(\frac{E}{100\,\mathrm{TeV}}\right)^{-2.5}$
So $w_{i,{astro}}=N_{astro} \cdot \frac{\left(\frac{E}{100\,\mathrm{TeV}}\right)^{-\gamma}}{\left(\frac{E}{100\,\mathrm{TeV}}\right)^{-2.5}} \cdot w_{astro}$
Define a function called weight_event below that takes as arguments (events, atmo_norm, astro_norm, astro_gamma) and returns the weights of those events.
Remember to use the energy of the primary neutrino to weight the spectrum.
We are using energy units of GeV, so that 100 TeV == 1e5.

In [ ]:
def weight_event(event, atmo_norm, astro_norm, astro_gamma):
    pass
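One possible sketch of this function, assuming the mapping provides column indices for the primary energy and the two stored weights (the names "energy", "weight_atmo", and "weight_astro" are assumptions here; check simulation_mapping for the actual keys):

```python
import numpy as np

# Assumed column layout; in the notebook these indices would come from
# simulation_mapping, e.g. ENERGY_COL = simulation_mapping["energy"].
ENERGY_COL, ATMO_W_COL, ASTRO_W_COL = 0, 1, 2

def weight_event(events, atmo_norm, astro_norm, astro_gamma):
    """Return the total weight of each event (atmospheric + astrophysical)."""
    energy = events[:, ENERGY_COL]  # primary neutrino energy in GeV
    # Atmospheric: injected flux is 1.0 times the expectation, so just scale.
    w_atmo = atmo_norm * events[:, ATMO_W_COL]
    # Astrophysical: reweight the injected E^-2.5 powerlaw to E^-gamma,
    # pivoting at 100 TeV == 1e5 GeV.
    w_astro = astro_norm * (energy / 1.0e5) ** (2.5 - astro_gamma) \
        * events[:, ASTRO_W_COL]
    return w_atmo + w_astro
```

Note that the ratio of the two powerlaws collapses to a single factor $(E/100\,\mathrm{TeV})^{2.5-\gamma}$, which is 1 everywhere when $\gamma=2.5$.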

Define the binning

We want to bin across all three observables we have (energy, zenith, topology)
In energy we generally use logarithmic bins, because they are appropriate for characterizing the sharply falling powerlaw spectrum. Our fit energy range runs from 60 TeV to 10 PeV: it covers the deposited energy of our sample well, and below 60 TeV we have a poor understanding of the background. We use 20 bins as a compromise: finer binning gives a better fit, but coarser binning is needed because of limited simulation statistics.
energy_bins = np.logspace(np.log10(60.0e3), np.log10(1.0e7), 20+1)
In zenith we use bins that are evenly spaced in cos(theta), because the overburden, and hence the observed flux, scales with cos(theta). As a similar compromise, we have found 10 bins to be appropriate for zenith.
zenith_bins = np.arccos(np.linspace(-1, 1, 10+1))[::-1]
In topology there are two categories, so it is equivalent to just define two bins.
topology_bins = np.linspace(0, 2, 2+1)

In [ ]:
energy_bins = 
zenith_bins = 
topology_bins = 

Function for binning

For our likelihood problem we need to bin both the data and the simulation.
However, binning is expensive, and we only want to do it once.
One easy way to manage this with numpy is to create masks for each bin.
So events_in_bin = all_events[bin_mask]
Define a function that produces a mask for each bin
def make_bin_masks(energies, zeniths, topologies, energy_bins=energy_bins, zenith_bins=zenith_bins, topology_bins=topology_bins):
This function should return an array of masks.

np.digitize

Docstring: digitize(x, bins, right=False)

Return the indices of the bins to which each value in input array belongs.

Each index i returned is such that bins[i-1] <= x < bins[i] if bins is monotonically increasing, or bins[i-1] > x >= bins[i] if bins is monotonically decreasing. If values in x are beyond the bounds of bins, 0 or len(bins) is returned as appropriate. If right is True, then the right bin is closed so that the index i is such that bins[i-1] < x <= bins[i] or bins[i-1] >= x > bins[i] if bins is monotonically increasing or decreasing, respectively.
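A quick illustration of the edge conventions described above, for monotonically increasing bins:

```python
import numpy as np

bins = np.array([0.0, 1.0, 2.0])
x = np.array([-0.5, 0.5, 1.5, 2.5])
# Index 0 marks underflow and len(bins) marks overflow;
# in-range values get indices 1..len(bins)-1.
print(np.digitize(x, bins))  # -> [0 1 2 3]
```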

In [ ]:
def make_bin_masks(energies, zeniths, topologies,
                   energy_bins=energy_bins, zenith_bins=zenith_bins, topology_bins=topology_bins):
    pass
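One way to sketch this with np.digitize is below. The ordering of the returned masks is a choice; any consistent ordering works, as long as data and simulation are binned with the same function.

```python
import numpy as np

energy_bins = np.logspace(np.log10(60.0e3), np.log10(1.0e7), 20 + 1)
zenith_bins = np.arccos(np.linspace(-1, 1, 10 + 1))[::-1]
topology_bins = np.linspace(0, 2, 2 + 1)

def make_bin_masks(energies, zeniths, topologies,
                   energy_bins=energy_bins, zenith_bins=zenith_bins,
                   topology_bins=topology_bins):
    """Return a list of boolean masks, one per (energy, zenith, topology) bin."""
    # np.digitize returns 1..len(bins)-1 for in-range values; shift to 0-based.
    e_idx = np.digitize(energies, energy_bins) - 1
    z_idx = np.digitize(zeniths, zenith_bins) - 1
    t_idx = np.digitize(topologies, topology_bins) - 1
    masks = []
    for ti in range(len(topology_bins) - 1):
        for zi in range(len(zenith_bins) - 1):
            for ei in range(len(energy_bins) - 1):
                masks.append((e_idx == ei) & (z_idx == zi) & (t_idx == ti))
    return masks
```

With 20 energy, 10 zenith, and 2 topology bins this yields 400 masks, and every in-range event appears in exactly one of them.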

Define bin masks

Using the function we just wrote, we can call it to get both the simulation bin masks and the data bin masks.

In [ ]:
bin_masks = 
data_masks = 

Define the likelihood

We want a function that takes the data, simulation, and our physics parameters, and returns the negative log likelihood.
One important piece of this is the expectation from the simulation.

Define expectation function

Let's first define the function that returns the expectation.

In [ ]:
def get_expectation(events, masks, weighting):
    pass
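A minimal sketch, assuming `weighting` is a callable that maps the event array to per-event weights (e.g. a closure over the physics parameters; this calling convention is an assumption):

```python
import numpy as np

def get_expectation(events, masks, weighting):
    """Sum the per-event weights inside each bin mask to get expected counts."""
    weights = weighting(events)
    return np.array([np.sum(weights[mask]) for mask in masks])
```

For example, one could pass weighting = lambda ev: weight_event(ev, atmo_norm, astro_norm, astro_gamma).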

Define likelihood function

$\mathcal{L}(\vec{\theta}|\vec{d})=\prod_i\frac{(\lambda_i)^{k_i}e^{-\lambda_i}}{k_i!}$
$\mathcal{L}(\vec{\theta}|\vec{d})=\prod_i\frac{(\lambda_i)^{k_i}e^{-\lambda_i}}{\Gamma [k_i+1]}$
$\log\mathcal{L}(\vec{\theta}|\vec{d})=\sum_i k_i \cdot \log(\lambda_i) - \lambda_i - \log\Gamma [k_i+1]$

In [ ]:
def llh(data, simulation_events, atmo_norm, astro_norm, astro_gamma):
    # Compute the per-bin expectation `expect` and the per-bin
    # log-likelihood terms `l` here.
    # Bins with no data and no expectation contribute nothing:
    l[np.logical_and(data == 0, expect == 0)] = 0
    l = np.sum(l).real
    return l
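The Poisson piece can be sketched on its own. Here the signature is simplified to take the binned data and a precomputed expectation directly; in the notebook, llh would first build the expectation from get_expectation with the weighting set by the physics parameters. scipy.special.gammaln gives $\log\Gamma(k+1)$ without overflowing for large counts:

```python
import numpy as np
from scipy.special import gammaln

def poisson_llh(data, expect):
    """Poisson log-likelihood of binned counts `data` given expectations `expect`."""
    data = np.asarray(data, dtype=float)
    expect = np.asarray(expect, dtype=float)
    with np.errstate(divide="ignore", invalid="ignore"):
        l = data * np.log(expect) - expect - gammaln(data + 1.0)
    # Bins with no data and no expectation contribute nothing.
    l[np.logical_and(data == 0, expect == 0)] = 0
    return np.sum(l)
```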
In [ ]:
# Test it
print(llh(..., simulation_events, 1.0, 6.0, 2.9))

Get the binned data

In [ ]:
data = 

Minimize the negative log likelihood

sp.optimize

Signature: sp.optimize.minimize(fun, x0, args=(), method=None, jac=None, hess=None, hessp=None, bounds=None, constraints=(), tol=None, callback=None, options=None)
Docstring:
Minimization of scalar function of one or more variables.

In general, the optimization problems are of the form::

minimize f(x) subject to

g_i(x) >= 0,  i = 1,...,m
h_j(x)  = 0,  j = 1,...,p

where x is a vector of one or more variables. g_i(x) are the inequality constraints. h_j(x) are the equality constraints.

Optionally, the lower and upper bounds for each element in x can also be specified using the bounds argument.
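A small usage sketch with a toy objective, a paraboloid standing in for the negative log-likelihood; in the notebook the objective would wrap -llh with the data and simulation bound in:

```python
import numpy as np
import scipy.optimize

def toy_neg_llh(params):
    # Stand-in objective with its minimum at (1.0, 6.0, 2.5).
    atmo_norm, astro_norm, astro_gamma = params
    return ((atmo_norm - 1.0) ** 2 + (astro_norm - 6.0) ** 2
            + (astro_gamma - 2.5) ** 2)

result = scipy.optimize.minimize(toy_neg_llh, x0=[1.0, 1.0, 2.0],
                                 bounds=[(0.0, None), (0.0, None), (1.0, 4.0)])
print(result.x)
```

Passing bounds keeps the normalizations non-negative and restricts gamma to a physically reasonable range; the fitted values are in result.x.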

In [ ]:
result = 
In [ ]:
print(result)
In [ ]:
print('Atmospheric Normalization = ', ...)
print('Astrophysical Normalization = ', ...)
print('Astrophysical Gamma = ', ...)