Besides i3 files, the icetray framework supports other output file formats: hdf5 and root.
Here, we will cover only hdf5 file booking and reading.
In order to be able to handle hdf5 files in ipython or the notebook (e.g. for plotting your data), we need to have the python-tables package installed.
Entering ipython in your shell starts the interactive ipython shell, where you can test the installation:
import tables
If the package is not available, the above line will return an error. In this case you have to install the python-tables package.
If you are on the VM (or on any other linux machine), exit ipython and enter:
sudo apt-get install python-tables
Then try to import tables again in ipython.
The project that allows you to book flat tables like hdf5 tables is tableio.
Additionally, we will need the particular project that handles the hdf format: hdfwriter (if we wanted to write root files, we would need rootwriter correspondingly).
A minimal script that translates user-specified frame objects from i3 files into flat hdf5 tables looks like this:
#!/usr/bin/env python
import sys

from icecube import icetray, dataio
from icecube import tableio, hdfwriter
from I3Tray import *

# first command line argument: output hdf5 file, remaining arguments: input i3 files
outfile = sys.argv[1]
infiles = sys.argv[2:]

tray = I3Tray()
tray.Add("I3Reader", FilenameList=infiles)

# create the table service that writes to the hdf5 file
hdftable = hdfwriter.I3HDFTableService(outfile)

tray.Add(tableio.I3TableWriter, 'hdf1',
         tableservice = hdftable,
         SubEventStreams = ['in_ice'],
         keys = ['I3EventHeader']
         )

tray.Execute(200)
tray.Finish()
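Assuming you saved the script above as book_hdf.py (the script and input file names here are just placeholders), you can run it from the shell with the output file as the first argument, followed by one or more input i3 files:
python book_hdf.py test.h5 my_data.i3.bz2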
First, we have to create the TableService we want to use; it is needed by the I3TableWriter module, which is provided by the tableio project.
We also have to tell the writer which SubEventStreams to use (we encountered in_ice and ice_top).
Lastly, the frame objects we want to book have to be listed in keys.
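You can list as many frame objects in keys as you like; the reconstruction names below are only placeholders for whatever your own files actually contain:
keys = ['I3EventHeader', 'SPEFit2', 'MPEFit']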
Instead of booking particular frame objects via the keys parameter, we can also book all frame objects of a particular dataclasses type, using the types parameter:
from icecube import dataclasses  # needed for the type used below

tray.Add(tableio.I3TableWriter, 'hdf2',
         tableservice = hdftable,
         SubEventStreams = ['in_ice'],
         types = [dataclasses.I3Particle]
         )
This will book all frame objects of type I3Particle. You can have both I3TableWriters in the same script as long as you give the modules different names (here, the ability to name a module comes in handy).
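As a sketch of that, each writer can get its own table service writing to its own output file (the file names are placeholders):
hdftable1 = hdfwriter.I3HDFTableService('header_booking.h5')
hdftable2 = hdfwriter.I3HDFTableService('particle_booking.h5')

tray.Add(tableio.I3TableWriter, 'hdf1',
         tableservice = hdftable1,
         SubEventStreams = ['in_ice'],
         keys = ['I3EventHeader']
         )

tray.Add(tableio.I3TableWriter, 'hdf2',
         tableservice = hdftable2,
         SubEventStreams = ['in_ice'],
         types = [dataclasses.I3Particle]
         )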
The hdfwriter project also provides a python tray segment, I3HDFWriter, that sets up the table service and the writer in a single call (http://code.icecube.wisc.edu/projects/icecube/browser/IceCube/projects/hdfwriter/releases/V15-03-00/python/__init__.py):
tray.Add(hdfwriter.I3HDFWriter, 'hdf',
         Output=outfile,
         CompressionLevel=9,
         SubEventStreams=['in_ice'],
         Keys=['I3EventHeader']
         )
In order to book a frame object, the writer needs to know how to translate it. In particular, the project that contains this translator has to be imported in your script. If it is not, icetray will print errors in the shell, but the program will still execute and exit cleanly:
ERROR (icetray): frame caught exception "unregistered class" while loading class type "I3LogLikelihoodFitParams" at key "SPEFitSingleFitParams" (I3Frame.cxx:961 in I3FrameObjectConstPtr I3Frame::get_impl(__gnu_cxx::hash_map<I3Frame::hashed_str_t, boost::shared_ptr<I3Frame::value_t>, I3Frame::hashed_str_t_hash>::const_reference, bool) const)
ERROR (I3TableWriter): Frame object 'SPEFitSingleFitParams' could not be deserialized and will not be booked. (I3TableWriter.cxx:269 in void I3TableWriter::Convert(I3FramePtr))
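In this example the missing class, I3LogLikelihoodFitParams, comes from the gulliver project, so importing that project at the top of the script makes the object bookable (adjust the import to whatever class your own error message mentions):
from icecube import gulliver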
Not all frame objects have converters to flat table objects implemented. These objects cannot (yet) be booked into hdf5 files, and you will get an error:
FATAL (I3TableWriter): No converter found for 'I3MCTree' of type 'TreeBase::Tree<I3Particle, I3ParticleID, __gnu_cxx::hash<I3ParticleID> >' (I3TableWriter.cxx:96 in bool I3TableWriter::AddObject(std::string, std::string, I3ConverterPtr, I3FrameObjectConstPtr))
Make sure you put a comma after every item in the keys list. If a comma is missing, Python silently concatenates the two adjacent strings into a single key name that does not exist in the frame, so those objects will not be booked, and there is neither a warning nor an error.
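A minimal illustration of this pitfall (the second key name is a placeholder):
keys = ['I3EventHeader' 'SPEFit2']   # missing comma: Python reads this as ['I3EventHeaderSPEFit2']
keys = ['I3EventHeader', 'SPEFit2']  # correct: two separate keys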
hdf5 files can be conveniently read in ipython, the ipython notebook, or python scripts.
We need to import the python tables package:
import tables
Open the file you want to work with:
my_file = tables.openFile('/home/icecube/i3/scripts/test.h5')
(In newer versions of PyTables, openFile has been renamed to open_file.)
Access the objects in the file by starting at the file's "root". Tab completion comes in really handy here, as it shows you which objects (e.g. I3EventHeader) are available.
A booked object can have several fields, each filling its own column of the table.
Let's look at the layout of these columns:
my_file.root.I3EventHeader.coldescrs
The description shows us the data type of each column (e.g. UInt32), its shape (a tuple), its default value, and the position of the column in the table.
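If you just want the column names without the full description, the table also has a colnames attribute:
my_file.root.I3EventHeader.colnames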
Let's print the event IDs stored in the I3EventHeader:
print "Event Nr", my_file.root.I3EventHeader.cols.Event[:10]
This returns an array of event IDs, which we can work with if we assign it to a variable:
event_id = my_file.root.I3EventHeader.cols.Event[:]
We see that each event ID only shows up once, as the EventHeader summarizes full events.
What happens if we print the content of OfflinePulses?
print "Charge", my_file.root.OfflinePulses.cols.charge[:100]
print "Event Nr", my_file.root.OfflinePulses.cols.Event[:100]
Each entry in OfflinePulses corresponds to a single DOM hit, i.e. there are many entries for the same event ID.
This is important to consider when combining different table entries.
Most likely we are more interested in the total charge of the event. It is in principle possible to sum the charges here (see the sketch below), but it is much more convenient to do that already in the script that produced the hdf5 file:
print "Charge", my_file.root.TotalCharge.cols.value[:10]
print "Event Nr", my_file.root.TotalCharge.cols.Event[:10]
%pylab inline
import pylab
import numpy

pylab.figure()

# logarithmic binning: the bin width corresponds to 4 bins between 1500 and 6000 pe
binwidth = (numpy.log10(6000.)-numpy.log10(1500.))/4.
numbins = 15
edges = 10.**numpy.linspace( numpy.log10(10), numpy.log10(10)+binwidth*numbins, numbins+1 )

charge = my_file.root.TotalCharge.cols.value[:]
event_id = my_file.root.TotalCharge.cols.Event[:]

# select events with a total charge above 100 pe and an event ID above 200
cut = (charge[:]>100) & (event_id[:]>200)

#my_hist = pylab.hist(charge, bins = edges, log = True, histtype='stepfilled', edgecolor = "b", color = 'b', alpha = 0.5, label = "atm E-2")
my_hist = pylab.hist(charge[cut], bins = edges, log = True, histtype='stepfilled', edgecolor = "b", color = 'b', alpha = 0.5, label = "atm E-2")
pylab.loglog()
pylab.xlim([10,2000])
pylab.ylim([1,300])
print "Total number of events shown:", sum(my_hist[0])
pylab.ylabel("counts")
pylab.xlabel("total charge (PE)")