Introduction

The goal of hadron physics is to understand how the strong interaction binds massless gluons and almost massless quarks into the massive hadrons that form our visible universe. The relevant interactions occur within the confinement domain, which corresponds to an energy range of \(\approx\)100 MeV up to a few tens of GeV. Particles produced in this energy region typically give rise to complicated detector signatures, including high-curvature tracks from low-momentum particles. Furthermore, the curvature of these tracks changes along the trajectory due to energy loss and other interactions with the material; hence, these trajectories cannot be described by simple geometries. Another challenge is the similarity between signatures of interest and background – a feature particularly prevalent at the low and intermediate energies characterising hadron physics. Constructing software algorithms that can handle these features is therefore challenging – but necessary to fully exploit the new and next generations of hadron physics facilities. A prominent example of the latter is the future PANDA experiment [1], where a beam of stored antiprotons and a large-acceptance, multi-purpose detector will open new avenues in exploring the strong interaction.

Hyperons are similar to protons but also contain at least one heavy and unstable strange quark. Due to their weak, self-analysing decays, hyperons provide a precise diagnostic tool to test CP symmetry [2, 3] and electromagnetic structure [4, 5]. Hyperon spectroscopy provides crucial information about how composite systems emerge from strongly interacting quarks and gluons [6, 7], and interactions between hyperons, antihyperons and nuclei provide important pieces to the hyperon puzzle of neutron stars [8]. In PANDA, the antiproton-proton annihilations will enable the production of all known and predicted single-, double- and triple-strange hyperons (Y) in two-body reactions \(\bar{p}p \rightarrow \bar{Y}Y\). This provides a clean, particle-antiparticle symmetric final state that is straightforward to parametrise – a significant advantage for spectroscopic partial wave analyses [9], investigations of interaction dynamics [10, 11] and CP symmetry tests [5]. Simulation studies have demonstrated that PANDA will be a strangeness factory already in its first operation phase [12].

Identifying hyperons has its unique challenges. Ground-state hyperons decay weakly, on a time-scale of \(10^{-10}\) s. This implies that relativistic hyperons travel a distance of a few centimetres or even metres before decaying. Therefore, their charged decay products leave tracks in the detector that start a finite distance away from the beam-target interaction point (IP), i.e. at a displaced vertex. As an example, the characteristic decay length of the \(\Lambda\) hyperon is \(c\tau = 7.89 \text { cm}\). Heavier hyperons, such as the multi-strange \(\Xi ^0\), \(\Xi ^-\), \(\Xi ^*\), \(\Omega ^-\) and \(\Omega ^*\), decay by considerable fractions into states containing a \(\Lambda\) hyperon. Prominent decay channels of charm baryons, such as the \(\Lambda _c^+\), also contain the ground-state strange \(\Lambda\) or \(\Sigma\) hyperons. Hence, successful hyperon analyses rely on the ability to reconstruct tracks from displaced vertices and, at the same time, to properly handle highly curved or even spiralling tracks from low-momentum particles [13]. However, most tracking algorithms are tailored to primary tracks, since the assumption that a track originates from the IP is a powerful constraint that reduces combinatorics and the background, and improves the momentum resolution. This is also the case for the standard PANDA track finder, used in most non-hyperon analyses up to now (see e.g. Refs. [14,15,16]). In recent years, several efforts have been made to develop algorithms that can handle secondary tracks from displaced vertices [17,18,19]. These are all based on classical approaches such as the Hough transformation, recursive annealing filters and Apollonius triplets. In particular, the algorithms explored by Alicke in Ref. [19] improve the reconstruction efficiency for secondary tracks to up to 79%, compared with 45% for the standard primary track finder. Nevertheless, there is room for improvement, and it is notable that Refs. [17,18,19] do not explicitly address low-momentum particles.

Machine-learning techniques are gaining importance in particle tracking, sparked by the Tracking Machine Learning Challenge (TrackML) within the high-energy physics community [20, 21]. In particular, Graph Neural Networks (GNNs) have been found suitable for particle tracking in detectors with non-Euclidean geometries such as ATLAS [22, 23], WASA-FRS at FAIR [24], BESIII [25] and Belle II [26]. Since the central part of the PANDA detector has a non-Euclidean geometry, GNNs are natural candidates for facing the challenge of its track reconstruction. It is crucial to investigate how GNN-based solutions tackle the specific challenges of a low-to-intermediate energy experiment such as PANDA, particularly low-momentum particles with displaced vertices. In this paper, we present a detailed ML-based track reconstruction solution for a non-Euclidean PANDA detector system.

The paper is organised as follows. Sect. 2 briefly introduces the PANDA experiment at FAIR with an emphasis on the Straw Tube Tracker. Sect. 3 gives an overview of track reconstruction using machine learning, including an account of contemporary related work. Sect. 4 presents a detailed description of our methodology, from machine learning architecture designs to the performance evaluation metrics. Sect. 5 gives the final results for three different cases: (i) muon reconstruction using conventional deep learning, (ii) muon reconstruction using geometric deep learning, and (iii) hyperon reconstruction using geometric deep learning. Finally, in Sect. 6 we present the conclusions.

The PANDA Experiment at FAIR

The PANDA (anti-Proton ANnihilation at DArmstadt) experiment is currently under construction at FAIR (Facility for Anti-proton and Ion Research) [27, 28]. The antiproton beam from the High Energy Storage Ring (HESR) will impinge on a fixed, internal hydrogen cluster jet target, a hydrogen or deuterium pellet target (for \(\bar{p}p\) or \(\bar{p}n\) reactions) or foils (for \(\bar{p}A\) interactions). The interaction rate will be 1–20 MHz. A schematic layout of the planned detector is shown in Fig. 1.

Fig. 1 Schematic of the PANDA experiment

The detector will consist of two parts: a Target Spectrometer (TS) employing a solenoid magnet for the detection of particles emitted centrally, and a Forward Spectrometer (FS) with a dipole magnet for forward-going particles. Together, the two spectrometers cover almost the full 4\(\pi\) solid angle. In this work, we focus on the TS that comprises a Micro Vertex Detector (MVD) surrounding the IP, followed by a Straw Tube Tracker (STT) for tracking. A Barrel Detector of Internally Reflected Cherenkov light (DIRC) and a Barrel Time-of-Flight (ToF) detector provide particle identification, and an electromagnetic calorimeter (EMC) will measure the energies of charged and neutral particles. The solenoid magnet will surround the EMC and its iron yoke will act as an absorber. Most particles that traverse the full iron yoke will be muons, and they will be tagged in a dedicated muon system. The full PANDA detector is described in detail in Refs. [1, 12].

In this work, we focus on data collected by the STT, which will consist of 4224 single-channel straw tubes, distributed in 27 layers and six sectors in a hexagonal shape, as shown in Fig. 2 (left). The green-marked tubes (15 to 19 layers) will be arranged parallel to the beam axis to measure the hit position, whereas the blue- and red-marked tubes (8 layers) are tilted, or skewed, with respect to the beam axis by \(\pm 2.9^{\circ }\). The skewed straws enable reconstruction of the \(z\)-component of the tracks [17]. The STT detector will cover polar angles from \(\theta = 22^{\circ }\) to \(\theta = 140^{\circ }\). For a solenoid field of 2 T, particles produced at the IP with a transverse momentum \(p_\text {T}\) of at least 50 MeV/c will reach the innermost layer, while a minimum \(p_\text {T}\) of 100 MeV/c is required to traverse the full STT.
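The link between transverse momentum and radial reach can be cross-checked with a rough estimate. The sketch below assumes an idealised circular trajectory starting at the IP, a uniform field, unit charge, no energy loss, and the standard relation \(p_\text {T}\,[\text{GeV}/c] \approx 0.3\,B\,[\text{T}]\,R\,[\text{m}]\). The function name is illustrative and not part of PandaRoot.

```python
def max_radial_reach(pt_gev: float, b_tesla: float = 2.0) -> float:
    """Largest distance from the beam axis (in m) reached by a unit-charge
    particle produced at the IP, assuming a perfect circle in the xy plane."""
    radius = pt_gev / (0.3 * b_tesla)  # radius of curvature in metres
    return 2.0 * radius                # far side of the circle through the IP


for pt in (0.05, 0.10):
    reach_cm = max_radial_reach(pt) * 1e2
    print(f"pT = {pt * 1e3:.0f} MeV/c -> reach = {reach_cm:.1f} cm")
```

Under these assumptions, the two thresholds correspond to a radial reach of roughly 17 cm and 33 cm. The detector radii are not quoted here, so this is an order-of-magnitude check rather than a statement about the STT geometry.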

Fig. 2 Left: a cross-sectional view of the STT detector with distance to boundaries in mm. The green layers are parallel, whereas blue and red layers are skewed by \(\pm 2.9^{\circ }\) to the \(z\)-axis. Right: zoom-in view of straw tubes in black circles, the isochrone radius in blue circles, and a track in red

When a charged particle traverses the STT detector, it ionises the gas inside the tube, releasing electrons. The electrons then drift toward the centre of the tube, ionising more gas molecules on the way. All free electrons will be collected at an anode wire in the centre, resulting in a signal pulse referred to as a hit. The xy position of the hit is the position of the anode wire. The distance of the closest approach from the particle trajectory to the anode wire is the isochrone radius. Our analysis uses the xy position from the straight straws and the isochrone radii as input data.
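As a geometric illustration of the isochrone radius, the following minimal sketch computes the distance of closest approach between a wire and a track in the xy plane. It assumes the track is a perfect circle with known centre and radius, which ignores energy loss and the measured drift time. All names and numbers are hypothetical.

```python
import math


def isochrone_radius(wire_xy, circle_centre, circle_radius):
    """Distance of closest approach between a circular track and a wire,
    both projected onto the xy plane."""
    d = math.hypot(wire_xy[0] - circle_centre[0], wire_xy[1] - circle_centre[1])
    return abs(d - circle_radius)


# Track through the origin with centre (0, 50) and radius 50 (arbitrary units);
# the wire sits 52 units from the centre, i.e. 2 units off the trajectory.
print(round(isochrone_radius((31.2, 8.4), (0.0, 50.0), 50.0), 6))
```

In the real detector the isochrone radius is derived from the drift time; the sketch only shows the quantity it represents.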

PandaRoot Analysis Framework

We used the PandaRoot analysis framework [29] to produce the simulated data for the analysis. PandaRoot offers tools for event simulation, beginning with the production of Monte Carlo events and continuing with the propagation of particles through detector material, digitisation of signals, reconstruction and calibration, and physics analysis. PandaRoot, a detector-specific framework, is derived from the general-purpose FairRoot framework [30], which in turn is based on the ROOT framework [31]. FairRoot constitutes a base for other detector-specific frameworks within the FAIR software ecosystem and provides a wide range of basic classes that facilitate the customisation of each detector configuration. Furthermore, it provides an event display, database management, an input–output manager, a run manager, and the Virtual Monte Carlo (VMC) interface. The latter enables the selection of several simulation engines. In addition, it uses the task system of ROOT to combine and exchange different algorithms into a simulation chain.

Track Reconstruction with ML

Track reconstruction, a process that labels hits to reconstruct particle trajectories, is essential for all physics analyses performed in nuclear and particle physics. Various algorithms have been developed, most tailored to the experiment and the particles in focus. Factors such as particle multiplicity, point of origin, and momentum are crucial when choosing an algorithm. For instance, in high-energy physics (HEP), particles produced in a beam-target interaction have large transverse momenta \(p_\text {T}\), resulting in mostly straight particle trajectories. In contrast, hadron physics experiments often produce particles with \(p_\text {T}\) as low as 100 MeV/c, leading to highly curved particle trajectories that may intersect multiple other particle trajectories in the detector. This factor alone makes track reconstruction a demanding task. Another factor is that long-lived particles, such as hyperons, decay centimetres or even metres from the beam-target interaction point. On the other hand, the track multiplicity is generally lower in hadron physics than in HEP; it is unusual for one \(\bar{p}p\) annihilation event within the energy region of PANDA to give rise to more than ten particles.

At the heart of every track reconstruction algorithm is pattern recognition. Most classical algorithms [14,15,16] are combinatorial; they recursively try different hit combinations to find particles, which makes them computationally expensive. In this work, we explore an ML-based solution for track reconstruction to address not only the computational challenges but also the aforementioned challenges of track reconstruction in hadron physics experiments.

Related Work

Pattern recognition using neural networks has seen significant advancements in recent years. Within the HEP.TrkX project [32], novel deep learning techniques were developed for track reconstruction to address the challenges of the High-Luminosity LHC (HL-LHC). These solutions considered image-based techniques, such as image segmentation and image captioning, recurrent neural networks (RNNs) and convolutional neural networks (CNNs), applied to pseudo-data from simulations of a planar detector geometry [33]. However, these methods do not scale to realistic detectors with irregular geometries and sparse data. Using the space-point representation of tracking data from a generic barrel detector, RNNs and GNNs were used for track reconstruction with great success [34, 35]. Building upon these developments, the Exa.TrkX project [36], the successor of HEP.TrkX, demonstrated the potential of GNNs for broader particle track and shower reconstruction [37], track seeding and labelling [38], and full-detector analyses [39]. Additional applications of GNNs in particle physics can be found in Refs. [40,41,42]. Almost all these applications use the TrackML data simulated with a generic detector geometry. Applications to more realistic detector geometries are reported in Refs. [23, 43]. Beyond the HL-LHC, GNNs have been used in several other realistic detectors such as GEM detectors [44], straw-tube detectors [45,46,47], drift chambers [48,49,50], and LArTPCs [51, 52]. To keep track of the applications of GNNs in nuclear and particle physics, we refer readers to the HEP ML Living Review [53].

In the PANDA experiment, GNN models have been utilised for edge classification in track reconstruction within the Straw Tube Tracker (STT), building upon the methodology outlined in Ref. [41], with preliminary results presented in Ref. [46]. This paper takes a step further by exploring the application of various neural network architectures to data involving muons and hyperons with detailed studies presented in Ref. [47].

Methodology

In this work, we aim to perform pattern recognition using concepts of nodes, edges, graphs, and deep neural networks. It is natural to consider particle trajectories in a detector as graphs, where the graph nodes represent the detector hits and graph edges represent the possibility of two detector hits coming from the same particle. An edge is labelled as true if the two linked hits are from the same particle and false otherwise. The core idea is to build a graph from the detector hits that includes all true edges and as few false edges as possible. Then, the graph can be classified by fully connected networks (FCNs) or GNNs. One can perform classification on graphs in three different ways: (a) node classification, where the hits are classified as either signal or noise, (b) edge classification, where a link between two hits is classified as either true or false, and (c) graph classification, where a full event is classified as either signal or noise. For track reconstruction, edge classification is a suitable option, where all edges in a graph are labelled with edge scores. The labelled graph is then passed to a clustering algorithm to group hits as track candidates.
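The edge-labelling rule described above can be illustrated with a minimal sketch. It assumes that each hit carries the identifier of the particle that produced it and, as a hypothetical convention, that noise hits carry the identifier 0. The function and variable names are illustrative only.

```python
from typing import List, Tuple


def label_edges(particle_ids: List[int], edges: List[Tuple[int, int]]) -> List[int]:
    """Return 1 for edges linking two hits of the same particle, else 0.

    particle_ids[i] is the true particle id of hit i; id 0 is assumed
    (hypothetically) to mark noise hits, whose edges are always false.
    """
    labels = []
    for src, dst in edges:
        same = particle_ids[src] == particle_ids[dst] and particle_ids[src] != 0
        labels.append(int(same))
    return labels


particle_ids = [1, 1, 2, 2, 0]             # hits 0-1: particle 1, hits 2-3: particle 2, hit 4: noise
edges = [(0, 1), (0, 3), (2, 3), (1, 4)]
print(label_edges(particle_ids, edges))    # [1, 0, 1, 0]
```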

First, we consider two representations of the data from the PANDA STT detector: Euclidean and non-Euclidean. They differ in whether a classification model is itself geometrical or non-geometrical. Second, we use two different reactions to produce events: muon pairs and hadrons from a \(\bar{p}p \rightarrow \bar{\Lambda }\Lambda \rightarrow \bar{p}\pi ^{+}p\pi ^{-}\) reaction at a beam momentum of 1.642 GeV/c. The latter corresponds to an excess energy of \(\sim 73\) MeV with respect to the \(\bar{\Lambda } \Lambda\) production threshold and has been studied in detail before in the context of PANDA with ideal tracking [11]. Our strategy is to first compare both representations using hit data produced by the muons, and then use the best-performing approach to reconstruct hadrons from \(\bar{p}p \rightarrow \bar{\Lambda }\Lambda \rightarrow \bar{p}\pi ^{+}p\pi ^{-}\) reaction. Here, tracks from the final state particles originate from the \(\Lambda\) and \(\bar{\Lambda }\) decay vertices, typically located several centimetres away from the beam-target interaction point. Hence, this reaction provides a benchmark for evaluating the performance of machine learning algorithms in reconstructing tracks from displaced decay vertices.
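The quoted excess energy follows from fixed-target kinematics. The sketch below assumes a proton target at rest and uses the PDG proton and \(\Lambda\) masses; the function name is illustrative.

```python
import math

M_P = 0.938272       # proton mass in GeV/c^2 (PDG value)
M_LAMBDA = 1.115683  # Lambda mass in GeV/c^2 (PDG value)


def excess_energy(p_beam: float) -> float:
    """Excess energy (GeV) above the Lambda-bar Lambda threshold for an
    antiproton beam of momentum p_beam (GeV/c) on a proton at rest."""
    e_beam = math.sqrt(p_beam**2 + M_P**2)                  # beam energy
    sqrt_s = math.sqrt(2.0 * M_P**2 + 2.0 * M_P * e_beam)   # centre-of-mass energy
    return sqrt_s - 2.0 * M_LAMBDA


print(f"{excess_energy(1.642) * 1e3:.1f} MeV")
```

For a beam momentum of 1.642 GeV/c this gives approximately 73 MeV, in agreement with the value stated above.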

The following sections detail the three major steps in our study: data generation and acquisition, the application of deep learning, and the evaluation of reconstructed trajectories. The relevant code used for this work is available at [54].

Data Generation

Each event in the muon data sample consists of five \(\mu ^\pm\) pairs in the momentum range from 100 MeV/c to 1.5 GeV/c, where the muons are produced with a particle gun and isotropically distributed in the STT acceptance. The \(\bar{p}p \rightarrow \bar{\Lambda }\Lambda \rightarrow \bar{p}\pi ^{+}p\pi ^{-}\) reaction is simulated with the generator EvtGen [55]. GEANT4 handles the particle transport through the detector material [56]. For both muons and hyperons, \(10^5\) events are generated and split into \(90\%\) for training, \(5\%\) for validation and \(5\%\) for testing. For inference, a separate prediction sample containing \(2 \cdot 10^3\) events for muons and \(3 \cdot 10^3\) events for hyperons is used to avoid any bias.

Finally, we preprocess the events by creating a hit feature vector for each hit using their positions (\(r, \phi , z\)) and respective isochrone radii and defining true edges between hits with the Monte Carlo truth information.

Deep Learning Pipeline

The deep learning pipeline contains several stages: (1) Graph Construction, (2) Edge Classification, and (3) Graph Segmentation as shown in Fig. 3.

Fig. 3 Sketch of the deep learning pipeline showing the graph construction, edge classification and graph segmentation stages. Image adapted from Exa.TrkX [41]

In the graph construction stage, we use a layerwise heuristic method to build edges by connecting nodes in adjacent layers, starting from the innermost layer and proceeding to the outermost layer of the STT. If a node is missing in one layer, the edge is created with the next available layer. We also restrict graph construction to adjacent sectors of the STT. Since the detector occupancy from hyperons, i.e. the \(\bar{p}p \rightarrow \bar{\Lambda } \Lambda \rightarrow \bar{p}\pi ^{+}p\pi ^{-}\) reaction, is much smaller than that from the muons, we remove the adjacent-sector constraint for hyperons. This stage produces an edge list where each edge is labelled as true or false depending on whether it belongs to a particle or not. Furthermore, all edges resulting from noise, if any, are labelled as false.
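A minimal sketch of such a layerwise construction is given below. It is a simplification under stated assumptions: every hit is connected to all hits in the next populated layer, empty layers are skipped globally, and the sector constraint is omitted. The function name and the input format are hypothetical and not taken from the released code.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def build_layerwise_edges(hit_layers: List[int]) -> List[Tuple[int, int]]:
    """Connect every hit to all hits in the next populated layer.

    hit_layers[i] is the layer index of hit i (0 = innermost). Empty layers
    are skipped, so an edge always points to the next layer that has hits.
    """
    by_layer: Dict[int, List[int]] = defaultdict(list)
    for hit, layer in enumerate(hit_layers):
        by_layer[layer].append(hit)

    populated = sorted(by_layer)  # layer indices, inner to outer
    edges: List[Tuple[int, int]] = []
    for inner, outer in zip(populated, populated[1:]):
        edges.extend((a, b) for a in by_layer[inner] for b in by_layer[outer])
    return edges


# Five hits; layer 2 is empty, so layer 1 connects directly to layer 3.
print(build_layerwise_edges([0, 0, 1, 3, 3]))  # [(0, 2), (1, 2), (2, 3), (2, 4)]
```

Connecting all pairs between neighbouring layers keeps every true edge at the cost of many false ones, which is why a constraint such as the adjacent-sector requirement is useful at high occupancy.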

In the edge classification stage, we train a deep learning model to classify edges as produced in the previous stage. This stage has two modes: Euclidean and non-Euclidean. In the Euclidean mode, each edge is fed separately to an FCN for classification. The relational information between hits beyond one-hop connections is inherently absent in this mode. In the non-Euclidean mode, the full event presented as a graph is fed to a GNN for classification. The idea is that the GNN can better capture the topological features of data.

Our FCN model has six fully connected layers with hidden dimensions of [128, 128, 1024, 1024, 128, 1]. We apply the relu() activation function in hidden layers and the sigmoid() activation function in the final layer for binary classification. In addition, layer norm is applied to the final layer. These hyperparameters are chosen after tuning the FCN model.

The GNN model is the Interaction Graph Neural Network (IGNN) [57] formulated under a message-passing framework [58]. The IGNN consists of three modules: (i) encoder module, (ii) graph module, and (iii) decoder or output module. The encoder module consists of an edge network and a node network, and its task is to encode input node features to a vector of hidden features and to create edge features from neighbouring nodes. In the graph module, aggregated neighbouring edge features are passed to the node network, and the neighbouring node features are passed to the edge network. This is a message-passing step where information is exchanged between nodes and edges. This step is repeated eight times (\(N=8\)), where each step is a hidden layer of the graph module. In addition, residual connections \(\{H_i\}_{i=0}^{N}\) are formed by concatenating the input and the output of each graph module. The final output is then passed to the decoder module, which performs binary classification using the binary cross-entropy loss function. As a result, each edge is assigned an edge score between 0 and 1. A schematic diagram of the IGNN used in this work is shown in Fig. 4.

Fig. 4 Schematic of an IGNN with an encoder module, graph modules illustrating message passing, and a decoder module. The output classifies edges marked as red (low score) and black (high score)

Networks inside IGNN modules are of FCN-type. Each network has a three-layer architecture with nodes of [128, 128, 128] with a ReLU activation function on all layers. The following hyperparameters are used during training:

  • Binary Cross-Entropy (BCE) loss

  • AmsGrad Optimizer: \(\alpha =0.001, \beta _{1}=0.9, \beta _{2}=0.999\), weight_decay=0.01

  • Learning rate: \(\alpha =0.001\)

  • Message-passing steps or layers: 8, Edge aggregation function: sum_max()

The hyperparameters of the IGNN are chosen exactly as in Ref. [41]. We set the batch size to 128 for the FCN and 1 for the IGNN. The networks are trained for 50 epochs, after which the generalisation gap between the training and the validation errors no longer changes significantly. Finally, each edge in the graph is labelled with a probability score, later called the edge score.
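To make the edge aggregation in the message-passing step concrete, the sketch below shows one possible reading of sum_max(): for each node, the element-wise sum and the element-wise maximum of the incoming edge features are concatenated. This interpretation is an assumption made for illustration, and the pure-Python function is not taken from the code of this work.

```python
from typing import List, Optional, Sequence, Tuple


def sum_max_aggregate(num_nodes: int,
                      edges: Sequence[Tuple[int, int]],
                      edge_feats: Sequence[Sequence[float]]) -> List[List[float]]:
    """For each node, concatenate the element-wise sum and maximum of the
    features of its incoming edges (zeros if the node has no incoming edge)."""
    dim = len(edge_feats[0])
    sums = [[0.0] * dim for _ in range(num_nodes)]
    maxs: List[List[Optional[float]]] = [[None] * dim for _ in range(num_nodes)]
    for (_, dst), feat in zip(edges, edge_feats):
        for k, value in enumerate(feat):
            sums[dst][k] += value
            if maxs[dst][k] is None or value > maxs[dst][k]:
                maxs[dst][k] = value
    return [s + [0.0 if m is None else m for m in mx] for s, mx in zip(sums, maxs)]


edges = [(0, 2), (1, 2), (2, 3)]
edge_feats = [[1.0, -1.0], [3.0, 0.5], [2.0, 2.0]]
print(sum_max_aggregate(4, edges, edge_feats)[2])  # [4.0, -0.5, 3.0, 0.5]
```

In a full model the aggregated vector would be combined with the node's own hidden features and passed through the node network; that step is omitted here.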

In the graph segmentation stage, we use the density-based spatial clustering of applications with noise (DBSCAN) algorithm [59] to find connected components of the labelled graph. We use the edge score to define the distance between two nodes. The distance parameter (\(\epsilon _{db}\)) defines the maximum distance at which two nodes are clustered together. The value of \(\epsilon _{db}\) is scanned to find an optimal value where the efficiency of graph segmentation is high.
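The idea of this stage can be sketched with a simplified stand-in for DBSCAN: edges whose score passes a cut are kept, and the connected components of the remaining graph are taken as track candidates. The union-find implementation, the score cut and the function name below are illustrative assumptions, not the procedure used in this work.

```python
from typing import Dict, List, Sequence, Tuple


def segment_graph(num_hits: int,
                  edges: Sequence[Tuple[int, int]],
                  scores: Sequence[float],
                  score_cut: float = 0.5) -> List[List[int]]:
    """Group hits into connected components using only edges with score >= score_cut."""
    parent = list(range(num_hits))

    def find(i: int) -> int:
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving keeps the trees shallow
            i = parent[i]
        return i

    for (a, b), score in zip(edges, scores):
        if score >= score_cut:
            parent[find(a)] = find(b)      # merge the two components

    groups: Dict[int, List[int]] = {}
    for hit in range(num_hits):
        groups.setdefault(find(hit), []).append(hit)
    return list(groups.values())


edges = [(0, 1), (1, 2), (2, 3), (3, 4)]
scores = [0.98, 0.95, 0.10, 0.99]
print(segment_graph(5, edges, scores))  # [[0, 1, 2], [3, 4]]
```

The low-score edge between hits 2 and 3 is dropped, so the chain splits into two track candidates.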

Track Evaluation

Track evaluation ensures that the reconstructed tracks accurately represent true particle trajectories. One method to assess the quality of the track reconstruction algorithm is to calculate the overall tracking efficiency, referred to here as the physics efficiency, and the track purity. Physics efficiency indicates how effectively the tracking algorithm can identify all particle tracks from the detector signals, and depends both on the performance of the algorithm and the efficiency of the detector. Track purity refers to the algorithm’s ability to distinguish true particle tracks from wrongly reconstructed tracks or other types of backgrounds. To evaluate the performance of the tracking algorithm itself, independently of the detector efficiency, a conditional tracking efficiency is defined, i.e. the technical efficiency. When this is evaluated, a minimum number of hits is required, below which the algorithm cannot be expected to reconstruct a track. The tracking metrics are defined using an evaluation scheme that closely follows the one used by the ATLAS community [60]:

  • \(N_{\text {particles}} (\text {selected})\) is the number of generated particles in the detector acceptance, which will be referred to as particles.

  • \(N_{\text {particles}} (\text {selected, matched})\) is the number of particles matched to at least one reconstructed track.

  • \(N_{\text {particles}} (\text {selected, reconstructable})\) is the number of generated particles that leave at least seven hits (\(N_t \ge 7\)) in the detector; they will be referred to as reconstructable particles.

  • \(N_{\text {particles}} (\text {selected, reconstructable, matched})\) is the number of reconstructable particles that are matched to at least one reconstructed track.

  • \(N_{\text {tracks}} (\text {selected})\) is the number of reconstructed tracks containing at least five hits (\(N_r \ge 5\)), which will be referred to as reconstructed tracks.

  • \(N_{\text {tracks}} (\text {selected, matched})\) is the number of reconstructed tracks that are matched to a particle.

A particle is considered matched to a reconstructed track if (i) more than 50% of the hits in the reconstructed track belong to the same true particle, and (ii) more than 50% of the hits of the matched true particle are found in the reconstructed track. This is known as the two-way matching scheme. Furthermore, the reconstructable particles are the selected particles that also have at least seven hits in the detector before performing the track reconstruction.
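The two-way matching criterion can be written compactly. The sketch below assumes that the true particle identifier of each hit in a reconstructed track is known from the Monte Carlo truth; the function name and data layout are hypothetical.

```python
from collections import Counter
from typing import Dict, List, Optional


def match_track(track_hit_pids: List[int],
                particle_nhits: Dict[int, int],
                min_fraction: float = 0.5) -> Optional[int]:
    """Return the id of the particle matched to a reconstructed track, or None.

    track_hit_pids: true particle id of each hit in the reconstructed track.
    particle_nhits: total number of hits left by each true particle.
    """
    if not track_hit_pids:
        return None
    pid, shared = Counter(track_hit_pids).most_common(1)[0]
    total = particle_nhits.get(pid, 0)
    if total == 0:
        return None                          # e.g. noise hits with no true particle
    track_fraction = shared / len(track_hit_pids)   # criterion (i)
    particle_fraction = shared / total               # criterion (ii)
    if track_fraction > min_fraction and particle_fraction > min_fraction:
        return pid
    return None


nhits = {7: 10, 9: 12}
print(match_track([7, 7, 7, 7, 7, 7, 9], nhits))  # 7: fractions 6/7 and 6/10
print(match_track([9, 9, 9, 9, 9, 7], nhits))     # None: only 5/12 of particle 9
```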

The physics efficiency (\(\epsilon _{\text {phys}}\)) is the fraction of particles that match at least one reconstructed track:

$$\begin{aligned} \epsilon _{\text {phys}}&= \frac{N_{\text {particles}} (\text {selected, matched})}{N_{\text {particles}} (\text {selected})} \end{aligned}$$
(1)

The technical efficiency (\(\epsilon _{\text {tech}}\)) is the fraction of reconstructable particles that match at least one reconstructed track:

$$\begin{aligned} \epsilon _{\text {tech}}&= \frac{N_{\text {particles}} (\text {selected, reconstructable, matched})}{N_{\text {particles}} (\text {selected, reconstructable})} \end{aligned}$$
(2)

Finally, the track purity (\(\rho\)) is defined as the fraction of reconstructed tracks that match a selected particle:

$$\begin{aligned} \rho&= \frac{N_{\text {tracks}} (\text {selected, matched})}{N_{\text {tracks}} (\text {selected})} \end{aligned}$$
(3)

In addition, the fake rate (\(\equiv 1-\rho\)), or ghost rate, is defined as the fraction of reconstructed tracks not matching any particle. In contrast, the clone rate is the rate at which a particle is matched to more than one reconstructed track.
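Given the six counts defined above, Eqs. (1) to (3) and the ghost rate reduce to simple ratios. The sketch below uses made-up counts purely for illustration; they are not results of this work, and the function name is hypothetical.

```python
def tracking_metrics(n_selected: int, n_selected_matched: int,
                     n_reconstructable: int, n_reconstructable_matched: int,
                     n_tracks: int, n_tracks_matched: int) -> dict:
    """Physics efficiency, technical efficiency, purity and ghost rate."""
    purity = n_tracks_matched / n_tracks
    return {
        "eff_phys": n_selected_matched / n_selected,
        "eff_tech": n_reconstructable_matched / n_reconstructable,
        "purity": purity,
        "ghost_rate": 1.0 - purity,
    }


# Illustrative counts only (not taken from the paper).
metrics = tracking_metrics(1000, 900, 950, 890, 920, 910)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```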

In addition to the requirement that the reconstructable track has at least 7 STT hits (i.e. \(N_t \ge 7\)), reconstructed tracks require at least 5 STT hits (i.e. \(N_r \ge 5\)) and a matching fraction (MF) greater than 50%.

Results

For a fine-grained understanding of the performance, we investigate how the track efficiencies depend on variables such as transverse momentum \(p_\text {T}\), and the radial distance \(d_0\) between the beam-target interaction point and the decay vertex:

$$\begin{aligned} \quad d_0 = \sqrt{v_x^{2} + v_y^2}. \end{aligned}$$
(4)

where \(v_x\) and \(v_y\) denote the positions of the decay vertex of particles along the x and y axes, respectively.

Muon Reconstruction with GDL

To reconstruct muons, two different approaches are adopted: the Euclidean approach using the FCN, and the non-Euclidean approach using the IGNN. We investigate the performance of the edge labelling and graph segmentation stages, leading to the evaluation of both approaches.

To examine the edge labelling, different evaluation metrics are used. The model output gives the classification probabilities, referred to as edge scores, for each edge in the graph. For edge predictions, an optimal threshold on the edge score is required. Fig. 5 shows the model outputs of FCN and IGNN where edge scores of true (blue) and false (orange) edges are shown without applying a threshold on the model outputs.

Fig. 5 The output of a classification model: FCN (left) and IGNN (right)

IGNN gives better separation power between true and false edges compared to the FCN for a particular threshold value. To quantitatively evaluate model performance, we used the Receiver Operating Characteristic Curve (ROC) and measured Area Under the Curve (AUC). The ROC curve is constructed using the edge classification efficiency (\(\epsilon _E \equiv \text {TPR}\)) and edge classification purity (\(\rho _E \equiv 1 - \text {FPR}\)) for various thresholds, where TPR is the true positive rate and FPR is the false positive rate from the confusion matrix. The ROC curves along with the AUCs for both models are shown in Fig. 6.

Fig. 6 ROC curve with AUC of 0.98748 for FCN (left) and 0.99970 for IGNN (right)

A high AUC value indicates high model performance and vice versa; the values obtained here suggest that both models were trained adequately. Since the ROC curve is constructed at varying threshold values on the model output, one needs to find an optimal threshold value or edge score cut (s). For this purpose, \(\epsilon _E\) and \(\rho _E\) are plotted as a function of the edge score cut, as shown in Fig. 7 for the FCN (left) and the IGNN (right).

Fig. 7 The edge classification efficiency \(\epsilon _E\) and edge classification purity \(\rho _E\) as a function of s for FCN (left) and IGNN (right)

Higher values of s give high edge purity but low edge efficiency, and vice versa; hence, there is a trade-off in choosing a particular value of s. For example, choosing \(s = 0.5\) gives \(\epsilon _E \sim 96\%\) and \(\rho _E \sim 97\%\) for the FCN model, whereas \(\epsilon _E = 99.2\%\) and \(\rho _E = 99.0\%\) for the IGNN. Alternatively, we can examine the signal efficiency (\(\epsilon _{sig}\)) vs the background rejection factor (BRF) at various values of the edge score cut. We define the signal efficiency as the TPR (\(\epsilon _{sig} \equiv \text {TPR}\)), or recall, and the misidentification rate as the FPR from the ROC curve. The BRF is defined as the inverse of the misidentification rate (BRF \(\equiv\) 1/FPR). Figure 8 shows the signal efficiency as a function of the BRF for various values of the edge score cut for the FCN (left) and IGNN (right) models.

Fig. 8 The \(\epsilon _{sig}\) as a function of BRF for various values of s for FCN (left) and IGNN (right) models. The dot represents \(s=0.5\)

The orange curve shows how the signal efficiency and the BRF depend on s, and the black dot represents the edge score cut value of 0.5. With this cut value, we get \(\epsilon _{sig} = 95.5\%\) and \(\text {BRF}=34.8\) for the FCN, and \(\epsilon _{sig} = 99.2\%\) and \(\text {BRF}=101.4\) for the IGNN. Increasing the cut value to 0.7 yields \(\epsilon _{sig} = 93.5\%\) and \(\text {BRF}=43.4\) for the FCN and \(\epsilon _{sig} = 97.7\%\) and \(\text {BRF}=213.5\) for the IGNN. Hence, a higher BRF comes at the cost of a reduced \(\epsilon _{sig}\). Therefore, we chose \(s=0.5\) for further analysis.
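The relation between the edge score cut and these quantities can be illustrated on a toy sample. The sketch below follows the definitions given above (\(\epsilon _{sig} \equiv \text {TPR}\), \(\rho _E \equiv 1 - \text {FPR}\), BRF \(\equiv\) 1/FPR); the scores and labels are invented for illustration and the function name is hypothetical.

```python
from typing import Sequence, Tuple


def edge_metrics(scores: Sequence[float], labels: Sequence[int],
                 cut: float = 0.5) -> Tuple[float, float, float]:
    """Return (TPR, 1 - FPR, BRF) for edges accepted with score >= cut."""
    tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= cut)
    fn = sum(1 for s, y in zip(scores, labels) if y == 1 and s < cut)
    fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= cut)
    tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < cut)
    tpr = tp / (tp + fn)
    fpr = fp / (fp + tn)
    brf = float("inf") if fp == 0 else 1.0 / fpr
    return tpr, 1.0 - fpr, brf


# Toy edge scores: three true edges followed by five false edges.
scores = [0.9, 0.8, 0.4, 0.7, 0.2, 0.1, 0.3, 0.05]
labels = [1, 1, 1, 0, 0, 0, 0, 0]
tpr, one_minus_fpr, brf = edge_metrics(scores, labels, cut=0.5)
print(f"TPR = {tpr:.3f}, 1 - FPR = {one_minus_fpr:.3f}, BRF = {brf:.1f}")
```

These definitions are consistent with the numbers quoted in the text: a BRF of 34.8 corresponds to an FPR of about 2.9%, i.e. \(\rho _E \approx 97\%\), and a BRF of 101.4 to an FPR of about 1.0%, i.e. \(\rho _E \approx 99.0\%\).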

After edge labelling, we look into the graph segmentation using the DBSCAN algorithm. We utilise a prediction dataset comprising \(2 \cdot 10^4\) events for this purpose. The DBSCAN extracts the connected components (Euclidean case) and weakly connected components (non-Euclidean case) of the graphs. This algorithm requires an optimal value of the distance metric (\(\epsilon _{\text {db}}\)) to cluster nodes together. For this purpose, we used different values of \(\epsilon _{\text {db}}\) to find connected components and then performed track evaluation. Figure 9 shows a scan of \(\epsilon _{\text {db}}\) against different tracking metrics for FCN (left) and IGNN (right), where the optimal values of \(\epsilon _{\text {db}}\) are shown as magenta lines with values of 0.20 and 0.25 for FCN and IGNN, respectively.

Fig. 9 The scan of \(\epsilon _{db}\) for the DBSCAN algorithm for FCN (left) and IGNN (right). The magenta lines show the selected values of \(\epsilon _{db}\)

Finally, we extract the track candidates using the optimal values of \(\epsilon _{\text {db}}\) and track evaluation criteria as discussed in Sect. 4.3. These tracking metrics are summarised in Table 1 for FCN and IGNN models as follows:

Table 1 Tracking efficiencies, ghost rate, and clone rate for muons

In the FCN case, we note that the efficiencies are fairly low from a physics perspective; for example, for events with four tracks, a per-track efficiency of \(\approx 77\)% results in a total efficiency of only \(0.77^4 \approx 35\%\), i.e. a three-fold reduction. Hence, there is room for improvement. One major issue for the FCN is handling the large class imbalance, with a ratio of true to false edges of 1:4. The IGNN can handle such imbalances by aggregating neighbourhood relations through message passing, which raises the tracking efficiency from 77.2% to 92.6%, a relative increase of almost \(20\%\). Furthermore, it reduces the ghost rate to an almost negligible level. The clone rate is also reduced, but is still high.
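The effect of the per-track efficiency on the event-level efficiency can be made explicit with a short calculation. The sketch assumes that the tracks of an event are reconstructed independently with the same efficiency, which is a simplification; the function name is illustrative.

```python
def event_efficiency(track_eff: float, n_tracks: int) -> float:
    """Probability that all n_tracks tracks of an event are reconstructed,
    assuming independent and equal per-track efficiencies."""
    return track_eff ** n_tracks


for name, eff in (("FCN", 0.772), ("IGNN", 0.926)):
    print(f"{name}: {event_efficiency(eff, 4):.3f}")
```

For four-track events this gives roughly 0.36 for the FCN and 0.74 for the IGNN, so the gain at the event level is considerably larger than the gain per track.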

To better understand the algorithm’s performance, we investigate how the tracking efficiency depends on the transverse momentum (\(p_\text {T}\)) of the muons. Figure 10 shows the number of particles (selected, selected and matched, reconstructable, and reconstructable and matched) for FCN (left panel) and IGNN (right panel) as a function of \(p_\text {T}\). In Fig. 11, we show the corresponding tracking efficiencies. We conclude that the main loss of tracks occurs at low \(p_\text {T}\), especially below \(p_\text {T} = 0.25\) GeV/c. Particles with such low \(p_\text {T}\) have trajectories with such large curvature that they turn back before traversing the full STT detector. Hence, they are trapped inside the detector: their trajectories spiral in the magnetic field, they lose energy in interactions with the detector material and the gas, and they may also intersect the trajectories of other particles. This is in contrast to high-\(p_\text {T}\) particles, which have rather straight trajectories with fewer intersections and are therefore reconstructed with higher efficiency. The improved performance of the IGNN compared to the FCN for low-\(p_\text {T}\) tracks is striking, as seen in Fig. 11.
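The relation between \(p_\text {T}\) and curvature can be made concrete with the standard formula for a singly charged particle in a uniform solenoidal field, \(r\,[\text{m}] = p_\text {T}\,[\text{GeV}/c] / (0.2998 \cdot B\,[\text{T}])\). The sketch below is illustrative only: the field value of 2 T is an assumption that is not stated in this section, energy loss is neglected, and the function name is hypothetical.

```python
# Conversion constant: GeV/c per (tesla * metre) for a unit-charge particle.
GEV_PER_TESLA_METRE = 0.299792458


def curvature_radius_m(pt_gev, b_tesla, charge=1):
    """Radius (in metres) of the circle traced in the transverse plane."""
    return pt_gev / (GEV_PER_TESLA_METRE * abs(charge) * b_tesla)


B_FIELD_T = 2.0  # assumed solenoid field, not taken from this section

for pt in (0.05, 0.10, 0.25, 0.50, 1.00):
    r = curvature_radius_m(pt, B_FIELD_T)
    # A track starting on the beam axis reaches at most a radial distance 2r.
    print(f"pT = {pt:.2f} GeV/c -> r = {100 * r:5.1f} cm, "
          f"maximum radial reach = {200 * r:5.1f} cm")
```

A track whose maximum radial reach is smaller than the outer radius of the tracker curls back inside it, which is the situation described above for the lowest-\(p_\text {T}\) particles.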

Fig. 10

The number of selected, selected and matched, reconstructable, reconstructable and matched particles as a function of \(p_\text {T}\) for FCN (left) and IGNN (right)

Fig. 11

Tracking efficiencies as a function of \(p_\text {T}\) for FCN (left) and IGNN (right) with reference lines at \(p_\text {T} = 0.25\) GeV/c (vertical) and \(\epsilon = 90\%\) efficiency (horizontal)
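An efficiency curve of this kind is a bin-by-bin ratio of two particle counts. The sketch below assumes that the efficiency in each \(p_\text {T}\) bin is the number of matched particles divided by the number of particles in the chosen reference sample (selected or reconstructable); the exact definitions are those of Sect. 4.3 and are not reproduced here. The toy values and the function name `binned_efficiency` are hypothetical.

```python
def binned_efficiency(pt_values, matched_flags, bin_edges):
    """Return matched / total per pT bin (None for empty bins).

    Bins are half-open intervals [edge_i, edge_{i+1}).
    """
    n_bins = len(bin_edges) - 1
    total = [0] * n_bins
    passed = [0] * n_bins
    for pt, is_matched in zip(pt_values, matched_flags):
        for i in range(n_bins):
            if bin_edges[i] <= pt < bin_edges[i + 1]:
                total[i] += 1
                passed[i] += int(is_matched)
                break
    return [p / t if t else None for p, t in zip(passed, total)]


# Hypothetical toy sample: pT in GeV/c and whether each particle was matched.
toy_pt = [0.05, 0.10, 0.20, 0.30, 0.40, 0.60, 0.80]
toy_matched = [False, True, False, True, True, True, True]
toy_edges = [0.0, 0.25, 0.5, 1.0]

toy_eff = binned_efficiency(toy_pt, toy_matched, toy_edges)
for lo, hi, eff in zip(toy_edges[:-1], toy_edges[1:], toy_eff):
    print(f"{lo:.2f} <= pT < {hi:.2f} GeV/c: efficiency = {eff}")
```

Running the same function once with the selected particles and once with the reconstructable particles as the reference sample would give the two types of efficiency curve, under the assumed definitions.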

Hyperon Reconstruction with GDL

We have performed simulations for training and testing with the reaction \(\bar{p}p \rightarrow \bar{\Lambda }\Lambda \rightarrow \bar{p}\pi ^{+}p\pi ^{-}\) at a beam momentum of \(\bar{p}_{beam} = 1.64\) GeV/c. Since \(\Lambda\) hyperons are neutral, tracking information is obtained from their charged daughters, i.e. protons and pions. The \(\bar{p}p \rightarrow \bar{\Lambda }\Lambda\) reaction has been rigorously studied by the PS185 experiment at LEAR, CERN [61, 62], in particular at this beam momentum [63]. It has been found that in the centre-of-mass system (CMS) of the reaction, the \(\bar{\Lambda }\) antihyperon is emitted predominantly in the forward direction while the \(\Lambda\) hyperon goes backwards. This means that in the lab system of PANDA, the fast \(\bar{\Lambda }\) antihyperon goes into the acceptance of the Forward Spectrometer, while the \(\Lambda\) hyperon is slow and decays inside the Target Spectrometer. Hence, the daughters of the \(\Lambda\) give rise to hits in the STT and can be reconstructed with our algorithm. Of special interest are the daughter pions from \(\Lambda\) decays, since they often have very low momenta (see left panel of Fig. 12). In the decay, the antiprotons and protons (\(\bar{p}, p\)) take the larger share of the momentum, while only a small fraction goes to the pions (\(\pi ^+, \pi ^-\)). These pions are challenging to reconstruct due to the high curvature of their trajectories and their high probability of intersecting the trajectories of other particles. Furthermore, due to the relatively long lifetime of the \(\Lambda\) hyperons, they are expected to decay far from the beam-target interaction point (see right panel of Fig. 12). This makes the \(\bar{p}p \rightarrow \bar{\Lambda }\Lambda \rightarrow \bar{p}\pi ^{+}p\pi ^{-}\) reaction an important benchmark for track reconstruction algorithms with PANDA.
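The kinematic statements above can be illustrated with standard two-body formulas. The sketch below is not part of the simulation chain of this work: the particle masses and the \(\Lambda\) value of \(c\tau\) are rounded reference values inserted here as assumptions, the example \(\Lambda\) momenta in the loop are arbitrary, and all function names are hypothetical.

```python
import math

# Rounded reference values (assumptions, not taken from the text).
M_LAMBDA = 1.115683    # GeV/c^2
M_PROTON = 0.938272    # GeV/c^2
M_PION = 0.139570      # GeV/c^2
CTAU_LAMBDA_CM = 7.89  # cm


def two_body_momentum(m_parent, m1, m2):
    """Daughter momentum (GeV/c) in the rest frame of the decaying parent."""
    term = (m_parent**2 - (m1 + m2)**2) * (m_parent**2 - (m1 - m2)**2)
    return math.sqrt(term) / (2.0 * m_parent)


def cm_energy_fixed_target(p_beam, m_beam, m_target):
    """Centre-of-mass energy (GeV) for a beam hitting a target at rest."""
    e_beam = math.hypot(p_beam, m_beam)
    return math.sqrt(m_beam**2 + m_target**2 + 2.0 * m_target * e_beam)


def mean_decay_length_cm(p, mass, ctau_cm):
    """Mean lab decay length: beta * gamma * c * tau = (p / m) * c * tau."""
    return (p / mass) * ctau_cm


p_star = two_body_momentum(M_LAMBDA, M_PROTON, M_PION)
sqrt_s = cm_energy_fixed_target(1.64, M_PROTON, M_PROTON)

print(f"sqrt(s) = {sqrt_s:.4f} GeV, threshold = {2 * M_LAMBDA:.4f} GeV")
print(f"daughter momentum in the Lambda rest frame = {1000 * p_star:.1f} MeV/c")
# Rest-frame energies: the boost to the lab adds momentum in proportion to
# these energies, so the heavy proton carries most of the Lambda momentum.
print(f"E*(proton) = {math.hypot(p_star, M_PROTON):.4f} GeV, "
      f"E*(pion) = {math.hypot(p_star, M_PION):.4f} GeV")
for p_lambda in (0.5, 1.0):  # arbitrary example momenta in GeV/c
    length = mean_decay_length_cm(p_lambda, M_LAMBDA, CTAU_LAMBDA_CM)
    print(f"p(Lambda) = {p_lambda:.1f} GeV/c -> mean decay length = {length:.1f} cm")
```

Since both daughters share the same small momentum in the \(\Lambda\) rest frame, the pion momentum in the lab stays low, which is consistent with the low-momentum pions discussed in the text.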

Fig. 12

Kinematic properties of \(\bar{p}p \rightarrow \bar{\Lambda }\Lambda \rightarrow \bar{p}\pi ^{+}p\pi ^{-}\) reaction: (left) MC |p| versus \(\theta\) distribution, (right) MC decay vertex distribution of \(\Lambda\) and \(\bar{\Lambda }\)

For the hyperon reconstruction, we applied the same GDL pipeline as in the muon case, with one small difference: in the graph construction stage, the heuristic method for building nodes and edges was not restricted to adjacent sectors. Since the \(\bar{p}p \rightarrow \bar{\Lambda }\Lambda \rightarrow \bar{p}\pi ^{+}p\pi ^{-}\) reaction contains fewer particles per event than the \(5\mu ^+\mu ^-\) case, and since we expect many pions to be emitted at extremely low \(p_\text {T}\) [13], removing this condition increases the amount of data in each event. After edge labelling, we tested \(2 \cdot 10^3\) events during inference. In the graph segmentation stage, we used the DBSCAN method with \(\epsilon _{\text {db}} = 0.15\), obtained by rescanning this parameter in the same way as for Fig. 9, and a minimum number of samples of two to find the connected components in the test events. Using the same track evaluation criteria as in the muon case, we obtain the tracking efficiencies, ghost rate and clone rate given in Table 2.

Table 2 Tracking efficiencies, ghost rate, and clone rate for hyperons

The physics tracking efficiency \(\epsilon _{\text {phys}}\) is about 90%. However, the technical efficiency \(\epsilon _{\text {tech}}\) is significantly higher, and the ghost rate and clone rate are significantly lower compared to the muon case. This performance gain arises because each event contains fewer particles than in the muon case and therefore has fewer track intersections.

Similar to the muon case, we analyse the tracking efficiencies as a function of \(p_\text {T}\) as shown in Fig. 13: the number of particles (left) and tracking efficiencies (right).

Fig. 13

Number of particles (left), and tracking efficiencies (right) as a function of \(p_\text {T}\) with reference lines at \(p_\text {T} = 0.25\) GeV/c (vertical) and \(\epsilon = 90\%\) efficiency (horizontal)

In a large fraction of the events, there are particles with \(p_\text {T} < 0.25\) GeV/c, which are captured in the magnetic field of the PANDA solenoid and therefore remain inside the detector. These particles are primarily pions and form an enhancement at low \(p_\text {T}\) in the left panel of Fig. 13. Protons generally have much larger momenta, and the different kinematics of protons and pions manifest in different lab polar angles. As a result, the track lengths differ, and so does the reconstruction probability. In the right panel of Fig. 13, we see that the physics tracking efficiency \(\epsilon _{\text {phys}}\) has a structure where low-momentum pions and high-momentum protons are relatively well reconstructed, while the efficiency dips in the intermediate region. The technical efficiency has no such structure, which leads to the conclusion that the intermediate \(p_\text {T}\) region has a high content of non-reconstructable tracks.

Next, we investigate tracking efficiency as a function of the radial position of decay vertices (\(\text {d}_0\)), as shown in Fig. 14.

Fig. 14

Number of particles (left), and tracking efficiencies (right) as a function of \(\text {d}_0\)

Most particles (protons, pions) are generated close to the interaction point; however, a considerable fraction is generated up to 14 cm from the beam-target interaction point. From Fig. 14, we conclude that our algorithm also performs well for these kinds of tracks: the physics and technical tracking efficiencies are above 90% for both pions and protons over the full \(\text {d}_0\) range. Moreover, the technical efficiency is about 97%. This is an important finding, as most heavier hyperons (\(\Xi , \Omega\), etc.) decay at \(\text {d}_0 < 15\) cm [13].

Conclusion

In this work, we have successfully applied machine learning to reconstruct particle trajectories in a hadron physics experiment. Our work shows the first use of GNN-based track reconstruction in the straw tube detector with non-Euclidean geometry. The GDL models give promising results, with an overall tracking efficiency \(\ge 90\%\). They can reconstruct pions with \(p_\text {T}\) as low as \(\sim 0.05\) GeV/c and protons with \(p_\text {T}\) as low as \(\sim 0.1\) GeV/c. Further studies show that this method also works well for reconstructing particles from secondary decay vertices up to at least \(\text {d}_0 = 14\) cm away from the IP in the radial direction. Beyond \(\text {d}_0 = 14\) cm, our simulated data contain no decaying \(\Lambda\) hyperons. This is an important result, as heavier hyperons such as \(\Xi ^-\) and \(\Omega ^-\) are expected to decay through intermediate \(\Lambda\) hyperons, with the \(\Lambda\) decay vertices mostly occurring less than 15 cm from the IP. These results are promising for hyperon reconstruction at PANDA and demonstrate the virtues of GNNs for the specific challenges of particle tracking in hadron physics experiments.