Note
Go to the end to download the full example code.
Analyzing Events Using ctapipe#
Initially presented @ LST Analysis Bootcamp in Padova, 26.11.2018 by Maximilian Nöthe (@maxnoe) & Kai A. Brügge (@mackaiver)
14 import tempfile
15 import timeit
16 from copy import deepcopy
17
18 import astropy.units as u
19 import matplotlib.pyplot as plt
20 import numpy as np
21 from astropy.coordinates import AltAz, angular_separation
22 from matplotlib.colors import ListedColormap
23 from scipy.sparse.csgraph import connected_components
24 from traitlets.config import Config
25
26 from ctapipe.calib import CameraCalibrator
27 from ctapipe.image import (
28 ImageProcessor,
29 camera_to_shower_coordinates,
30 concentration_parameters,
31 hillas_parameters,
32 leakage_parameters,
33 number_of_islands,
34 timing_parameters,
35 toymodel,
36 )
37 from ctapipe.image.cleaning import tailcuts_clean
38 from ctapipe.io import DataWriter, EventSource, TableLoader
39 from ctapipe.reco import ShowerProcessor
40 from ctapipe.utils.datasets import get_dataset_path
41 from ctapipe.visualization import ArrayDisplay, CameraDisplay
42
43 # %matplotlib inline
46 plt.rcParams["figure.figsize"] = (12, 8)
47 plt.rcParams["font.size"] = 14
48 plt.rcParams["figure.figsize"]
[12.0, 8.0]
General Information#
Design#
DL0 → DL3 analysis
Currently some R0 → DL2 code to be able to analyze simtel files
ctapipe is built upon the Scientific Python Stack, core dependencies are
numpy
scipy
astropy
numba
Development#
ctapipe is developed as Open Source Software (BSD 3-Clause License) at cta-observatory/ctapipe
We use the “Github-Workflow”:
Few people (e.g. @kosack, @maxnoe) have write access to the main repository
Contributors fork the main repository and work on branches
Pull Requests are merged after Code Review and automatic execution of the test suite
Early development stage ⇒ backwards-incompatible API changes might and will happen
What’s there?#
Reading simtel simulation files
Simple calibration, cleaning and feature extraction functions
Camera and Array plotting
Coordinate frames and transformations
Stereo-reconstruction using line intersections
What’s still missing?#
Good integration with machine learning techniques
IRF calculation
Documentation, e.g. formal definitions of coordinate frames
What can you do?#
Report issues
Hard to get started? Tell us where you are stuck
Tell user stories
Missing features
Start contributing
ctapipe needs more workpower
Implement new reconstruction features
A simple hillas analysis#
Reading in simtel files#
146 input_url = get_dataset_path("gamma_prod5.simtel.zst")
147
148 # EventSource() automatically detects what kind of file we are giving it,
149 # if already supported by ctapipe
150 source = EventSource(input_url, max_events=5)
151
152 print(type(source))
<class 'ctapipe.io.simteleventsource.SimTelEventSource'>
155 for event in source:
156 print(
157 "Id: {}, E = {:1.3f}, Telescopes: {}".format(
158 event.count, event.simulation.shower.energy, len(event.r0.tel)
159 )
160 )
Id: 0, E = 0.073 TeV, Telescopes: 3
Id: 1, E = 1.745 TeV, Telescopes: 14
Id: 2, E = 1.745 TeV, Telescopes: 4
Id: 3, E = 1.745 TeV, Telescopes: 20
Id: 4, E = 1.745 TeV, Telescopes: 31
Each event is a DataContainer
holding several Field
s of data,
which can be containers or just numbers. Let’s look a one event:
168 event
ctapipe.containers.ArrayEventContainer:
index.*: event indexing information with default None
r0.*: Raw Data with default None
r1.*: R1 Calibrated Data with default None
dl0.*: DL0 Data Volume Reduced Data with default None
dl1.*: DL1 Calibrated image with default None
dl2.*: DL2 reconstruction info with default None
simulation.*: Simulated Event Information with default None
with type <class
'ctapipe.containers.SimulatedEventContainer'>
trigger.*: central trigger information with default None
count: number of events processed with default 0
pointing.*: Array and telescope pointing positions with
default None
calibration.*: Container for calibration coefficients for the
current event with default None
mon.*: container for event-wise monitoring data (MON)
with default None
muon.*: Container for muon analysis results with default
None
171 source.subarray.camera_types
(CameraDescription(name=CHEC, geometry=CHEC, readout=CHEC), CameraDescription(name=FlashCam, geometry=FlashCam, readout=FlashCam), CameraDescription(name=LSTCam, geometry=LSTCam, readout=LSTCam), CameraDescription(name=NectarCam, geometry=NectarCam, readout=NectarCam))
174 len(event.r0.tel), len(event.r1.tel)
(31, 31)
Data calibration#
The CameraCalibrator
calibrates the event (obtaining the dl1
images).
186 calibrator = CameraCalibrator(subarray=source.subarray)
189 calibrator(event)
Event displays#
Let’s use ctapipe’s plotting facilities to plot the telescope images
199 event.dl1.tel.keys()
dict_keys([3, 4, 8, 9, 14, 15, 23, 24, 25, 29, 32, 36, 42, 46, 52, 56, 103, 104, 109, 110, 118, 119, 120, 126, 127, 129, 130, 137, 139, 143, 161])
202 tel_id = 130
(CameraGeometry(name='NectarCam', pix_type=PixelShape.HEXAGON, npix=1855, cam_rot=0.000 deg, pix_rot=40.893 deg, frame=<CameraFrame Frame (focal_length=16.445049285888672 m, rotation=0.0 rad, telescope_pointing=None, obstime=None, location=None)>), ctapipe.containers.DL1CameraContainer:
image: Numpy array of camera image, after waveform
extraction.Shape: (n_pixel) if n_channels is 1
or data is gain selectedelse: (n_channels,
n_pixel) with default None
peak_time: Numpy array containing position of the peak of
the pulse as determined by the extractor.Shape:
(n_pixel) if n_channels is 1 or data is gain
selectedelse: (n_channels, n_pixel) with default
None
image_mask: Boolean numpy array where True means the pixel
has passed cleaning. Shape: (n_pixel, ) with
default None as a 1-D array with dtype bool
is_valid: True if image extraction succeeded, False if
failed or in the case of TwoPass methods, that
the first pass only was returned. with default
False
parameters: Image parameters with default None with type
<class
'ctapipe.containers.ImageParametersContainer'>)
211 dl1.image
array([-0.3856395 , 1.2851689 , 1.403967 , ..., 2.6847053 ,
0.04941979, -1.0566036 ], dtype=float32)
Image Cleaning#
229 # unoptimized cleaning levels
230 cleaning_level = {
231 "CHEC": (2, 4, 2),
232 "LSTCam": (3.5, 7, 2),
233 "FlashCam": (3.5, 7, 2),
234 "NectarCam": (4, 8, 2),
235 }
238 boundary, picture, min_neighbors = cleaning_level[geometry.name]
239
240 clean = tailcuts_clean(
241 geometry,
242 dl1.image,
243 boundary_thresh=boundary,
244 picture_thresh=picture,
245 min_number_picture_neighbors=min_neighbors,
246 )
249 fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(15, 5))
250
251 d1 = CameraDisplay(geometry, ax=ax1)
252 d2 = CameraDisplay(geometry, ax=ax2)
253
254 ax1.set_title("Image")
255 d1.image = dl1.image
256 d1.add_colorbar(ax=ax1)
257
258 ax2.set_title("Pulse Time")
259 d2.image = dl1.peak_time - np.average(dl1.peak_time, weights=dl1.image)
260 d2.cmap = "RdBu_r"
261 d2.add_colorbar(ax=ax2)
262 d2.set_limits_minmax(-20, 20)
263
264 d1.highlight_pixels(clean, color="red", linewidth=1)
Image Parameters#
{'intensity': 303.9444303512573,
'kurtosis': 2.4649477808816265,
'length': <Quantity 0.14667835 m>,
'length_uncertainty': <Quantity 0.00509155 m>,
'phi': <Angle -1.11344351 rad>,
'psi': <Angle -1.12308339 rad>,
'r': <Quantity 0.71788425 m>,
'skewness': 0.36090637987954094,
'width': <Quantity 0.02584731 m>,
'width_uncertainty': <Quantity 0.00111159 m>,
'x': <Quantity 0.31699941 m>,
'y': <Quantity -0.64410339 m>}
290 timing = timing_parameters(geometry, dl1.image, dl1.peak_time, hillas, clean)
291
292 print(timing)
{'deviation': 0.792151827050336,
'intercept': 21.81257256241495,
'slope': <Quantity 31.32320575 1 / m>}
295 long, trans = camera_to_shower_coordinates(
296 geometry.pix_x, geometry.pix_y, hillas.x, hillas.y, hillas.psi
297 )
298
299 plt.plot(long[clean], dl1.peak_time[clean], "o")
300 plt.plot(long[clean], timing.slope * long[clean] + timing.intercept)
[<matplotlib.lines.Line2D object at 0x7f7468e0af20>]
{'intensity_width_1': 0.0,
'intensity_width_2': 0.0,
'pixels_width_1': 0.0,
'pixels_width_2': 0.0}
2
317 conc = concentration_parameters(geometry, dl1.image, hillas)
318 print(conc)
{'cog': 0.2571728392782083,
'core': 0.4014042974365615,
'pixel': 0.09514053791448862}
Putting it all together / Stereo reconstruction#
All these steps are now unified in several components configurable through the config system, mainly:
CameraCalibrator for DL0 → DL1 (Images)
ImageProcessor for DL1 (Images) → DL1 (Parameters)
ShowerProcessor for stereo reconstruction of the shower geometry
DataWriter for writing data into HDF5
A command line tool doing these steps and writing out data in HDF5
format is available as ctapipe-process
338 image_processor_config = Config(
339 {
340 "ImageProcessor": {
341 "image_cleaner_type": "TailcutsImageCleaner",
342 "TailcutsImageCleaner": {
343 "picture_threshold_pe": [
344 ("type", "LST_LST_LSTCam", 7.5),
345 ("type", "MST_MST_FlashCam", 8),
346 ("type", "MST_MST_NectarCam", 8),
347 ("type", "SST_ASTRI_CHEC", 7),
348 ],
349 "boundary_threshold_pe": [
350 ("type", "LST_LST_LSTCam", 5),
351 ("type", "MST_MST_FlashCam", 4),
352 ("type", "MST_MST_NectarCam", 4),
353 ("type", "SST_ASTRI_CHEC", 4),
354 ],
355 },
356 }
357 }
358 )
359
360 input_url = get_dataset_path("gamma_prod5.simtel.zst")
361 source = EventSource(input_url)
362
363 calibrator = CameraCalibrator(subarray=source.subarray)
364 image_processor = ImageProcessor(
365 subarray=source.subarray, config=image_processor_config
366 )
367 shower_processor = ShowerProcessor(subarray=source.subarray)
368 horizon_frame = AltAz()
369
370 f = tempfile.NamedTemporaryFile(suffix=".hdf5")
371
372 with DataWriter(source, output_path=f.name, overwrite=True, write_dl2=True) as writer:
373 for event in source:
374 energy = event.simulation.shower.energy
375 n_telescopes_r1 = len(event.r1.tel)
376 event_id = event.index.event_id
377 print(f"Id: {event_id}, E = {energy:1.3f}, Telescopes (R1): {n_telescopes_r1}")
378
379 calibrator(event)
380 image_processor(event)
381 shower_processor(event)
382
383 stereo = event.dl2.stereo.geometry["HillasReconstructor"]
384 if stereo.is_valid:
385 print(" Alt: {:.2f}°".format(stereo.alt.deg))
386 print(" Az: {:.2f}°".format(stereo.az.deg))
387 print(" Hmax: {:.0f}".format(stereo.h_max))
388 print(" CoreX: {:.1f}".format(stereo.core_x))
389 print(" CoreY: {:.1f}".format(stereo.core_y))
390 print(" Multiplicity: {:d}".format(len(stereo.telescopes)))
391
392 # save a nice event for plotting later
393 if event.count == 3:
394 plotting_event = deepcopy(event)
395
396 writer(event)
Overwriting /tmp/tmpg5zly3qm.hdf5
Id: 4009, E = 0.073 TeV, Telescopes (R1): 3
Alt: 70.18°
Az: 0.41°
Hmax: 14066 m
CoreX: -113.2 m
CoreY: 366.5 m
Multiplicity: 3
Id: 5101, E = 1.745 TeV, Telescopes (R1): 14
Alt: 70.00°
Az: 359.95°
Hmax: 10329 m
CoreX: -739.5 m
CoreY: -268.4 m
Multiplicity: 10
Id: 5103, E = 1.745 TeV, Telescopes (R1): 4
Alt: 70.21°
Az: 0.50°
Hmax: 11526 m
CoreX: 832.1 m
CoreY: 945.4 m
Multiplicity: 2
Id: 5104, E = 1.745 TeV, Telescopes (R1): 20
Alt: 70.16°
Az: 0.07°
Hmax: 10643 m
CoreX: -518.3 m
CoreY: -451.5 m
Multiplicity: 15
Id: 5105, E = 1.745 TeV, Telescopes (R1): 31
Alt: 70.02°
Az: 0.13°
Hmax: 10449 m
CoreX: -327.7 m
CoreY: 408.8 m
Multiplicity: 25
Id: 5108, E = 1.745 TeV, Telescopes (R1): 3
Alt: 69.95°
Az: 359.79°
Hmax: 10409 m
CoreX: -1304.5 m
CoreY: 205.6 m
Multiplicity: 2
Id: 5109, E = 1.745 TeV, Telescopes (R1): 8
Alt: 69.62°
Az: 359.93°
Hmax: 10886 m
CoreX: -1067.6 m
CoreY: 480.5 m
Multiplicity: 3
400 loader = TableLoader(f.name)
401
402 events = loader.read_subarray_events()
405 theta = angular_separation(
406 events["HillasReconstructor_az"].quantity,
407 events["HillasReconstructor_alt"].quantity,
408 events["true_az"].quantity,
409 events["true_alt"].quantity,
410 )
411
412 plt.hist(theta.to_value(u.deg) ** 2, bins=25, range=[0, 0.3])
413 plt.xlabel(r"$\theta² / deg²$")
414 None
ArrayDisplay#
423 angle_offset = plotting_event.pointing.array_azimuth
424
425 plotting_hillas = {
426 tel_id: dl1.parameters.hillas for tel_id, dl1 in plotting_event.dl1.tel.items()
427 }
428
429 plotting_core = {
430 tel_id: dl1.parameters.core.psi for tel_id, dl1 in plotting_event.dl1.tel.items()
431 }
432
433
434 disp = ArrayDisplay(source.subarray)
435
436 disp.set_line_hillas(plotting_hillas, plotting_core, 500)
437
438 plt.scatter(
439 plotting_event.simulation.shower.core_x,
440 plotting_event.simulation.shower.core_y,
441 s=200,
442 c="k",
443 marker="x",
444 label="True Impact",
445 )
446 plt.scatter(
447 plotting_event.dl2.stereo.geometry["HillasReconstructor"].core_x,
448 plotting_event.dl2.stereo.geometry["HillasReconstructor"].core_y,
449 s=200,
450 c="r",
451 marker="x",
452 label="Estimated Impact",
453 )
454
455 plt.legend()
456 # plt.xlim(-400, 400)
457 # plt.ylim(-400, 400)
<matplotlib.legend.Legend object at 0x7f746a418df0>
Reading the LST dl1 data#
465 loader = TableLoader(f.name)
466
467 dl1_table = loader.read_telescope_events(
468 ["LST_LST_LSTCam"],
469 dl2=False,
470 true_parameters=False,
471 )
474 plt.scatter(
475 np.log10(dl1_table["true_energy"].quantity / u.TeV),
476 np.log10(dl1_table["hillas_intensity"]),
477 )
478 plt.xlabel("log10(E / TeV)")
479 plt.ylabel("log10(intensity)")
480 None
Isn’t python slow?#
Many of you might have heard: “Python is slow”.
That’s trueish.
All python objects are classes living on the heap, even integers.
Looping over lots of “primitives” is quite slow compared to other languages.
But: “Premature Optimization is the root of all evil” — Donald Knuth
So profile to find exactly what is slow.
Why use python then?#
Python works very well as glue for libraries of all kinds of languages
Python has a rich ecosystem for data science, physics, algorithms, astronomy
Example: Number of Islands#
Find all groups of pixels, that survived the cleaning
516 geometry = loader.subarray.tel[1].camera.geometry
Let’s create a toy images with several islands;
523 np.random.seed(42)
524
525 image = np.zeros(geometry.n_pixels)
526
527
528 for i in range(9):
529 model = toymodel.Gaussian(
530 x=np.random.uniform(-0.8, 0.8) * u.m,
531 y=np.random.uniform(-0.8, 0.8) * u.m,
532 width=np.random.uniform(0.05, 0.075) * u.m,
533 length=np.random.uniform(0.1, 0.15) * u.m,
534 psi=np.random.uniform(0, 2 * np.pi) * u.rad,
535 )
536
537 new_image, sig, bg = model.generate_image(
538 geometry, intensity=np.random.uniform(1000, 3000), nsb_level_pe=5
539 )
540 image += new_image
559 def num_islands_python(camera, clean):
560 """A breadth first search to find connected islands of neighboring pixels in the cleaning set"""
561
562 # the camera geometry has a [n_pixel, n_pixel] boolean array
563 # that is True where two pixels are neighbors
564 neighbors = camera.neighbor_matrix
565
566 island_ids = np.zeros(camera.n_pixels)
567 current_island = 0
568
569 # a set to remember which pixels we already visited
570 visited = set()
571
572 # go only through the pixels, that survived cleaning
573 for pix_id in np.where(clean)[0]:
574 if pix_id not in visited:
575 # remember that we already checked this pixel
576 visited.add(pix_id)
577
578 # if we land in the outer loop again, we found a new island
579 current_island += 1
580 island_ids[pix_id] = current_island
581
582 # now check all neighbors of the current pixel recursively
583 to_check = set(np.where(neighbors[pix_id] & clean)[0])
584 while to_check:
585 pix_id = to_check.pop()
586
587 if pix_id not in visited:
588 visited.add(pix_id)
589 island_ids[pix_id] = current_island
590
591 to_check.update(np.where(neighbors[pix_id] & clean)[0])
592
593 n_islands = current_island
594 return n_islands, island_ids
598 n_islands, island_ids = num_islands_python(geometry, clean)
602 cmap = plt.get_cmap("Paired")
603 cmap = ListedColormap(cmap.colors[:n_islands])
604 cmap.set_under("k")
605
606 disp = CameraDisplay(geometry)
607 disp.image = island_ids
608 disp.cmap = cmap
609 disp.set_limits_minmax(0.5, n_islands + 0.5)
610 disp.add_colorbar()
613 timeit.timeit(lambda: num_islands_python(geometry, clean), number=1000) / 1000
0.002224530461000086
617 def num_islands_scipy(geometry, clean):
618 neighbors = geometry.neighbor_matrix_sparse
619
620 clean_neighbors = neighbors[clean][:, clean]
621 num_islands, labels = connected_components(clean_neighbors, directed=False)
622
623 island_ids = np.zeros(geometry.n_pixels)
624 island_ids[clean] = labels + 1
625
626 return num_islands, island_ids
630 n_islands_s, island_ids_s = num_islands_scipy(geometry, clean)
633 disp = CameraDisplay(geometry)
634 disp.image = island_ids_s
635 disp.cmap = cmap
636 disp.set_limits_minmax(0.5, n_islands_s + 0.5)
637 disp.add_colorbar()
640 timeit.timeit(lambda: num_islands_scipy(geometry, clean), number=10000) / 10000
0.00045515157730001193
A lot less code, and a factor 3 speed improvement
Finally, current ctapipe implementation is using numba:
652 timeit.timeit(lambda: number_of_islands(geometry, clean), number=100000) / 100000
2.397498535000068e-05
Total running time of the script: (0 minutes 37.387 seconds)