Analyzing Events Using ctapipe#

Initially presented @ LST Analysis Bootcamp in Padova, 26.11.2018 by Maximilian Nöthe (@maxnoe) & Kai A. Brügge (@mackaiver)

13 import tempfile
14 import timeit
15 from copy import deepcopy
16
17 import astropy.units as u
18 import matplotlib.pyplot as plt
19 import numpy as np
20 from astropy.coordinates import AltAz, angular_separation
21 from matplotlib.colors import ListedColormap
22 from scipy.sparse.csgraph import connected_components
23 from traitlets.config import Config
24
25 from ctapipe.calib import CameraCalibrator
26 from ctapipe.image import (
27     ImageProcessor,
28     camera_to_shower_coordinates,
29     concentration_parameters,
30     hillas_parameters,
31     leakage_parameters,
32     number_of_islands,
33     timing_parameters,
34     toymodel,
35 )
36 from ctapipe.image.cleaning import tailcuts_clean
37 from ctapipe.io import DataWriter, EventSource, TableLoader
38 from ctapipe.reco import ShowerProcessor
39 from ctapipe.utils.datasets import get_dataset_path
40 from ctapipe.visualization import ArrayDisplay, CameraDisplay
41
42 # %matplotlib inline
45 plt.rcParams["figure.figsize"] = (12, 8)
46 plt.rcParams["font.size"] = 14
47 plt.rcParams["figure.figsize"]
[12.0, 8.0]

General Information#

Design#

  • DL0 → DL3 analysis

  • Currently some R0 → DL2 code to be able to analyze simtel files

  • ctapipe is built upon the Scientific Python Stack, core dependencies are

    • numpy

    • scipy

    • astropy

    • numba

Development#

  • ctapipe is developed as Open Source Software (BSD 3-Clause License) at cta-observatory/ctapipe

  • We use the “Github-Workflow”:

    • Few people (e.g. @kosack, @maxnoe) have write access to the main repository

    • Contributors fork the main repository and work on branches

    • Pull Requests are merged after Code Review and automatic execution of the test suite

  • Early development stage ⇒ backwards-incompatible API changes might and will happen

What’s there?#

  • Reading simtel simulation files

  • Simple calibration, cleaning and feature extraction functions

  • Camera and Array plotting

  • Coordinate frames and transformations

  • Stereo-reconstruction using line intersections

What’s still missing?#

  • Good integration with machine learning techniques

  • IRF calculation

  • Documentation, e.g. formal definitions of coordinate frames

What can you do?#

  • Report issues

    • Hard to get started? Tell us where you are stuck

    • Tell user stories

    • Missing features

  • Start contributing

    • ctapipe needs more workpower

    • Implement new reconstruction features

A simple hillas analysis#

Reading in simtel files#

145 input_url = get_dataset_path("gamma_prod5.simtel.zst")
146
147 # EventSource() automatically detects what kind of file we are giving it,
148 # if already supported by ctapipe
149 source = EventSource(input_url, max_events=5)
150
151 print(type(source))
<class 'ctapipe.io.simteleventsource.SimTelEventSource'>
154 for event in source:
155     print(
156         "Id: {}, E = {:1.3f}, Telescopes: {}".format(
157             event.count, event.simulation.shower.energy, len(event.r0.tel)
158         )
159     )
Id: 0, E = 0.073 TeV, Telescopes: 3
Id: 1, E = 1.745 TeV, Telescopes: 14
Id: 2, E = 1.745 TeV, Telescopes: 4
Id: 3, E = 1.745 TeV, Telescopes: 20
Id: 4, E = 1.745 TeV, Telescopes: 31

Each event is a DataContainer holding several Fields of data, which can be containers or just numbers. Let’s look a one event:

167 event
ctapipe.containers.ArrayEventContainer:
                       index.*: event indexing information with default None
                          r0.*: Raw Data with default None
                          r1.*: R1 Calibrated Data with default None
                         dl0.*: DL0 Data Volume Reduced Data with default None
                         dl1.*: DL1 Calibrated image with default None
                         dl2.*: DL2 reconstruction info with default None
                  simulation.*: Simulated Event Information with default None
                                with type <class
                                'ctapipe.containers.SimulatedEventContainer'>
                     trigger.*: central trigger information with default None
                         count: number of events processed with default 0
                  monitoring.*: container for event-wise monitoring data (MON)
                                with default None
                        muon.*: Container for muon analysis results with default
                                None
(CameraDescription(name=CHEC, geometry=CHEC, readout=CHEC), CameraDescription(name=FlashCam, geometry=FlashCam, readout=FlashCam), CameraDescription(name=LSTCam, geometry=LSTCam, readout=LSTCam), CameraDescription(name=NectarCam, geometry=NectarCam, readout=NectarCam))
173 len(event.r0.tel), len(event.r1.tel)
(31, 31)

Data calibration#

The CameraCalibrator calibrates the event (obtaining the dl1 images).

Event displays#

Let’s use ctapipe’s plotting facilities to plot the telescope images

198 event.dl1.tel.keys()
dict_keys([3, 4, 8, 9, 14, 15, 23, 24, 25, 29, 32, 36, 42, 46, 52, 56, 103, 104, 109, 110, 118, 119, 120, 126, 127, 129, 130, 137, 139, 143, 161])
201 tel_id = 130
(CameraGeometry(name='NectarCam', pix_type=PixelShape.HEXAGON, npix=1855, cam_rot=0.000 deg, pix_rot=40.893 deg, frame=<CameraFrame Frame (focal_length=16.445049285888672 m, rotation=0.0 rad, telescope_pointing=None, obstime=None, location=None)>), ctapipe.containers.DL1CameraContainer:
                         image: Numpy array of camera image, after waveform
                                extraction.Shape: (n_pixel) if n_channels is 1
                                or data is gain selectedelse: (n_channels,
                                n_pixel) with default None
                     peak_time: Numpy array containing position of the peak of
                                the pulse as determined by the extractor.Shape:
                                (n_pixel) if n_channels is 1 or data is gain
                                selectedelse: (n_channels, n_pixel) with default
                                None
                    image_mask: Boolean numpy array where True means the pixel
                                has passed cleaning. Shape: (n_pixel, ) with
                                default None as a 1-D array with dtype bool
                      is_valid: True if image extraction succeeded, False if
                                failed or in the case of TwoPass methods, that
                                the first pass only was returned. with default
                                False
                    parameters: Image parameters with default None with type
                                <class
                                'ctapipe.containers.ImageParametersContainer'>)
array([-0.3856395 ,  1.2851689 ,  1.403967  , ...,  2.6847053 ,
        0.04941979, -1.0566036 ], shape=(1855,), dtype=float32)
214 display = CameraDisplay(geometry)
215
216 # right now, there might be one image per gain channel.
217 # This will change as soon as
218 display.image = dl1.image
219 display.add_colorbar()
NectarCam

Image Cleaning#

228 # unoptimized cleaning levels
229 cleaning_level = {
230     "CHEC": (2, 4, 2),
231     "LSTCam": (3.5, 7, 2),
232     "FlashCam": (3.5, 7, 2),
233     "NectarCam": (4, 8, 2),
234 }
237 boundary, picture, min_neighbors = cleaning_level[geometry.name]
238
239 clean = tailcuts_clean(
240     geometry,
241     dl1.image,
242     boundary_thresh=boundary,
243     picture_thresh=picture,
244     min_number_picture_neighbors=min_neighbors,
245 )
248 fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(15, 5))
249
250 d1 = CameraDisplay(geometry, ax=ax1)
251 d2 = CameraDisplay(geometry, ax=ax2)
252
253 ax1.set_title("Image")
254 d1.image = dl1.image
255 d1.add_colorbar(ax=ax1)
256
257 ax2.set_title("Pulse Time")
258 d2.image = dl1.peak_time - np.average(dl1.peak_time, weights=dl1.image)
259 d2.cmap = "RdBu_r"
260 d2.add_colorbar(ax=ax2)
261 d2.set_limits_minmax(-20, 20)
262
263 d1.highlight_pixels(clean, color="red", linewidth=1)
Image, Pulse Time

Image Parameters#

{'intensity': np.float64(303.9444303512573),
 'kurtosis': np.float64(2.4649477808816265),
 'length': <Quantity 0.14667835 m>,
 'length_uncertainty': <Quantity 0.00509155 m>,
 'phi': <Angle -1.11344351 rad>,
 'psi': <Angle -1.12308339 rad>,
 'psi_uncertainty': <Angle 0.01927114 rad>,
 'r': <Quantity 0.71788425 m>,
 'skewness': np.float64(0.360906379879541),
 'transverse_cog_uncertainty': <Quantity 0.00261556 m>,
 'width': <Quantity 0.02584731 m>,
 'width_uncertainty': <Quantity 0.00111159 m>,
 'x': <Quantity 0.31699941 m>,
 'y': <Quantity -0.64410339 m>}
277 display = CameraDisplay(geometry)
278
279 # set "unclean" pixels to 0
280 cleaned = dl1.image.copy()
281 cleaned[~clean] = 0.0
282
283 display.image = cleaned
284 display.add_colorbar()
285
286 display.overlay_moments(hillas, color="xkcd:red")
NectarCam
{'deviation': 0.792151827050336,
 'intercept': np.float64(21.81257256241495),
 'slope': <Quantity 31.32320575 1 / m>}
ctapipe overview
[<matplotlib.lines.Line2D object at 0x7c2220192960>]
{'intensity_width_1': np.float32(0.0),
 'intensity_width_2': np.float32(0.0),
 'pixels_width_1': np.float64(0.0),
 'pixels_width_2': np.float64(0.0)}
NectarCam
2
{'cog': np.float64(0.2571728392782083),
 'core': np.float64(0.4014042974365615),
 'pixel': np.float64(0.09514053791448862)}

Putting it all together / Stereo reconstruction#

All these steps are now unified in several components configurable through the config system, mainly:

  • CameraCalibrator for DL0 → DL1 (Images)

  • ImageProcessor for DL1 (Images) → DL1 (Parameters)

  • ShowerProcessor for stereo reconstruction of the shower geometry

  • DataWriter for writing data into HDF5

A command line tool doing these steps and writing out data in HDF5 format is available as ctapipe-process

337 image_processor_config = Config(
338     {
339         "ImageProcessor": {
340             "image_cleaner_type": "TailcutsImageCleaner",
341             "TailcutsImageCleaner": {
342                 "picture_threshold_pe": [
343                     ("type", "LST_LST_LSTCam", 7.5),
344                     ("type", "MST_MST_FlashCam", 8),
345                     ("type", "MST_MST_NectarCam", 8),
346                     ("type", "SST_ASTRI_CHEC", 7),
347                 ],
348                 "boundary_threshold_pe": [
349                     ("type", "LST_LST_LSTCam", 5),
350                     ("type", "MST_MST_FlashCam", 4),
351                     ("type", "MST_MST_NectarCam", 4),
352                     ("type", "SST_ASTRI_CHEC", 4),
353                 ],
354             },
355         }
356     }
357 )
358
359 input_url = get_dataset_path("gamma_prod5.simtel.zst")
360 source = EventSource(input_url)
361
362 calibrator = CameraCalibrator(subarray=source.subarray)
363 image_processor = ImageProcessor(
364     subarray=source.subarray, config=image_processor_config
365 )
366 shower_processor = ShowerProcessor(subarray=source.subarray)
367 horizon_frame = AltAz()
368
369 f = tempfile.NamedTemporaryFile(suffix=".hdf5")
370
371 with DataWriter(source, output_path=f.name, overwrite=True, write_dl2=True) as writer:
372     for event in source:
373         energy = event.simulation.shower.energy
374         n_telescopes_r1 = len(event.r1.tel)
375         event_id = event.index.event_id
376         print(f"Id: {event_id}, E = {energy:1.3f}, Telescopes (R1): {n_telescopes_r1}")
377
378         calibrator(event)
379         image_processor(event)
380         shower_processor(event)
381
382         stereo = event.dl2.stereo.geometry["HillasReconstructor"]
383         if stereo.is_valid:
384             print("  Alt: {:.2f}°".format(stereo.alt.deg))
385             print("  Az: {:.2f}°".format(stereo.az.deg))
386             print("  Hmax: {:.0f}".format(stereo.h_max))
387             print("  CoreX: {:.1f}".format(stereo.core_x))
388             print("  CoreY: {:.1f}".format(stereo.core_y))
389             print("  Multiplicity: {:d}".format(len(stereo.telescopes)))
390
391         # save a nice event for plotting later
392         if event.count == 3:
393             plotting_event = deepcopy(event)
394
395         writer(event)
Overwriting /tmp/tmpxlmq768m.hdf5
Id: 4009, E = 0.073 TeV, Telescopes (R1): 3
  Alt: 70.18°
  Az: 0.41°
  Hmax: 14066 m
  CoreX: -113.2 m
  CoreY: 366.5 m
  Multiplicity: 3
Id: 5101, E = 1.745 TeV, Telescopes (R1): 14
  Alt: 70.00°
  Az: 359.95°
  Hmax: 10329 m
  CoreX: -739.5 m
  CoreY: -268.4 m
  Multiplicity: 10
Id: 5103, E = 1.745 TeV, Telescopes (R1): 4
  Alt: 70.21°
  Az: 0.50°
  Hmax: 11526 m
  CoreX: 832.1 m
  CoreY: 945.4 m
  Multiplicity: 2
Id: 5104, E = 1.745 TeV, Telescopes (R1): 20
  Alt: 70.16°
  Az: 0.07°
  Hmax: 10643 m
  CoreX: -518.3 m
  CoreY: -451.5 m
  Multiplicity: 15
Id: 5105, E = 1.745 TeV, Telescopes (R1): 31
  Alt: 70.02°
  Az: 0.13°
  Hmax: 10449 m
  CoreX: -327.7 m
  CoreY: 408.8 m
  Multiplicity: 25
Id: 5108, E = 1.745 TeV, Telescopes (R1): 3
  Alt: 69.95°
  Az: 359.79°
  Hmax: 10409 m
  CoreX: -1304.5 m
  CoreY: 205.6 m
  Multiplicity: 2
Id: 5109, E = 1.745 TeV, Telescopes (R1): 8
  Alt: 69.62°
  Az: 359.93°
  Hmax: 10886 m
  CoreX: -1067.6 m
  CoreY: 480.5 m
  Multiplicity: 3
404 theta = angular_separation(
405     events["HillasReconstructor_az"].quantity,
406     events["HillasReconstructor_alt"].quantity,
407     events["true_az"].quantity,
408     events["true_alt"].quantity,
409 )
410
411 plt.hist(theta.to_value(u.deg) ** 2, bins=25, range=[0, 0.3])
412 plt.xlabel(r"$\theta² / deg²$")
413 None
ctapipe overview

ArrayDisplay#

422 angle_offset = plotting_event.monitoring.pointing.array_azimuth
423
424 plotting_hillas = {
425     tel_id: dl1.parameters.hillas for tel_id, dl1 in plotting_event.dl1.tel.items()
426 }
427
428 plotting_core = {
429     tel_id: dl1.parameters.core.psi for tel_id, dl1 in plotting_event.dl1.tel.items()
430 }
431
432
433 disp = ArrayDisplay(source.subarray)
434
435 disp.set_line_hillas(plotting_hillas, plotting_core, 500)
436
437 plt.scatter(
438     plotting_event.simulation.shower.core_x,
439     plotting_event.simulation.shower.core_y,
440     s=200,
441     c="k",
442     marker="x",
443     label="True Impact",
444 )
445 plt.scatter(
446     plotting_event.dl2.stereo.geometry["HillasReconstructor"].core_x,
447     plotting_event.dl2.stereo.geometry["HillasReconstructor"].core_y,
448     s=200,
449     c="r",
450     marker="x",
451     label="Estimated Impact",
452 )
453
454 plt.legend()
455 # plt.xlim(-400, 400)
456 # plt.ylim(-400, 400)
MonteCarloArray
<matplotlib.legend.Legend object at 0x7c2221eb3020>

Reading the LST dl1 data#

464 loader = TableLoader(f.name)
465
466 dl1_table = loader.read_telescope_events(
467     ["LST_LST_LSTCam"],
468     dl2=False,
469     true_parameters=False,
470 )
473 plt.scatter(
474     np.log10(dl1_table["true_energy"].quantity / u.TeV),
475     np.log10(dl1_table["hillas_intensity"]),
476 )
477 plt.xlabel("log10(E / TeV)")
478 plt.ylabel("log10(intensity)")
479 None
ctapipe overview

Isn’t python slow?#

  • Many of you might have heard: “Python is slow”.

  • That’s trueish.

  • All python objects are classes living on the heap, even integers.

  • Looping over lots of “primitives” is quite slow compared to other languages.

⇒ Vectorize as much as possible using numpy
⇒ Use existing interfaces to fast C / C++ / Fortran code
⇒ Optimize using numba

But: “Premature Optimization is the root of all evil” — Donald Knuth

So profile to find exactly what is slow.

Why use python then?#

  • Python works very well as glue for libraries of all kinds of languages

  • Python has a rich ecosystem for data science, physics, algorithms, astronomy

Example: Number of Islands#

Find all groups of pixels, that survived the cleaning

Let’s create a toy images with several islands;

522 rng = np.random.default_rng(42)
523
524 image = np.zeros(geometry.n_pixels)
525
526
527 for i in range(9):
528     model = toymodel.Gaussian(
529         x=rng.uniform(-0.8, 0.8) * u.m,
530         y=rng.uniform(-0.8, 0.8) * u.m,
531         width=rng.uniform(0.05, 0.075) * u.m,
532         length=rng.uniform(0.1, 0.15) * u.m,
533         psi=rng.uniform(0, 2 * np.pi) * u.rad,
534     )
535
536     new_image, sig, bg = model.generate_image(
537         geometry, intensity=rng.uniform(1000, 3000), nsb_level_pe=5
538     )
539     image += new_image
542 clean = tailcuts_clean(
543     geometry,
544     image,
545     picture_thresh=10,
546     boundary_thresh=5,
547     min_number_picture_neighbors=2,
548 )
551 disp = CameraDisplay(geometry)
552 disp.image = image
553 disp.highlight_pixels(clean, color="xkcd:red", linewidth=1.5)
554 disp.add_colorbar()
LSTCam
558 def num_islands_python(camera, clean):
559     """A breadth first search to find connected islands of neighboring pixels in the cleaning set"""
560
561     # the camera geometry has a [n_pixel, n_pixel] boolean array
562     # that is True where two pixels are neighbors
563     neighbors = camera.neighbor_matrix
564
565     island_ids = np.zeros(camera.n_pixels)
566     current_island = 0
567
568     # a set to remember which pixels we already visited
569     visited = set()
570
571     # go only through the pixels, that survived cleaning
572     for pix_id in np.where(clean)[0]:
573         if pix_id not in visited:
574             # remember that we already checked this pixel
575             visited.add(pix_id)
576
577             # if we land in the outer loop again, we found a new island
578             current_island += 1
579             island_ids[pix_id] = current_island
580
581             # now check all neighbors of the current pixel recursively
582             to_check = set(np.where(neighbors[pix_id] & clean)[0])
583             while to_check:
584                 pix_id = to_check.pop()
585
586                 if pix_id not in visited:
587                     visited.add(pix_id)
588                     island_ids[pix_id] = current_island
589
590                     to_check.update(np.where(neighbors[pix_id] & clean)[0])
591
592     n_islands = current_island
593     return n_islands, island_ids
597 n_islands, island_ids = num_islands_python(geometry, clean)
LSTCam
612 timeit.timeit(lambda: num_islands_python(geometry, clean), number=1000) / 1000
0.001906123042001127
616 def num_islands_scipy(geometry, clean):
617     neighbors = geometry.neighbor_matrix_sparse
618
619     clean_neighbors = neighbors[clean][:, clean]
620     num_islands, labels = connected_components(clean_neighbors, directed=False)
621
622     island_ids = np.zeros(geometry.n_pixels)
623     island_ids[clean] = labels + 1
624
625     return num_islands, island_ids
629 n_islands_s, island_ids_s = num_islands_scipy(geometry, clean)
LSTCam
639 timeit.timeit(lambda: num_islands_scipy(geometry, clean), number=10000) / 10000
0.0003123096999999689

A lot less code, and a factor 3 speed improvement

Finally, current ctapipe implementation is using numba:

651 timeit.timeit(lambda: number_of_islands(geometry, clean), number=100000) / 100000
1.6972994480001945e-05

Total running time of the script: (0 minutes 32.399 seconds)

Gallery generated by Sphinx-Gallery