
Spiking neurons as predictive controllers of linear systems

Paolo Agliati¹, André Urbano², Pablo Lanillos², Nasir Ahmad¹, Marcel van Gerven¹, Sander Keemink¹

¹Department of Machine Learning and Neural Computing, Donders Institute for Brain, Cognition and Behaviour, Radboud University, Nijmegen, the Netherlands
²Cajal International Neuroscience Center, Spanish National Research Council, Madrid, Spain
Abstract

Neurons communicate with downstream systems via sparse and incredibly brief electrical pulses, or spikes. Using these events, they control various targets such as neuromuscular units, neurosecretory systems, and other neurons in connected circuits. This gave rise to the idea of spiking neurons as controllers, in which spikes are the control signal. Using instantaneous events directly as the control inputs, also called ‘impulse control’, is challenging as it does not scale well to larger networks and has low analytical tractability. Therefore, current spiking control usually relies on filtering the spike signal to approximate analog control. This ultimately means spiking neural networks (SNNs) have to output a continuous control signal, necessitating continuous energy input into downstream systems. Here, we circumvent the need for rate-based representations, providing a scalable method for task-specific spiking control with sparse neural activity. In doing so, we take inspiration from both optimal control and neuroscience theory, and define a spiking rule where spikes are only emitted if they bring a dynamical system closer to a target. From this principle, we derive the required connectivity for an SNN, and show that it can successfully control linear systems. We show that for physically constrained systems, predictive control is required, and the control signal ends up exploiting the passive dynamics of the downstream system to reach a target. Finally, we show that the control method scales to both high-dimensional networks and systems. Importantly, in all cases, we maintain a closed-form mathematical derivation of the network connectivity, the network dynamics and the control objective. This work advances the understanding of SNNs as biologically-inspired controllers, providing insight into how real neurons could exert control, and enabling applications in neuromorphic hardware design.

1 Introduction


Figure 1: (A) Scheme of a feedback controller and a downstream system. The current state of the system $\mathbf{x}(t)$ is given as input to the controller, along with a target value $\mathbf{z}(t)$. The controller outputs a control signal $\mathbf{u}(t)$ as a function of both inputs. $\mathbf{u}(t)$ is then mapped onto the dynamics of the downstream system via the $\mathbf{B}$ matrix. The system's dynamics are governed by the $\mathbf{A}$ matrix. (B-D) Examples of three feedback control paradigms. (B) The continuous LQR uses a gain matrix $\mathbf{K}$ to directly compute the control signal for a given state and target pair. (C) In the filtered spiking case, the output spikes of the network are filtered to obtain $\mathbf{r}(t)$, which is decoded by the matrix $\mathbf{D}$ to form the control signal $\mathbf{u}(t)$ (adapted from \citep{slijkhuisClosedform2023}). (D) In our spiking control paradigm, the output spikes $\mathbf{s}(t)$ represent the control input itself, and are therefore directly mapped onto the system dynamics through the $\mathbf{B}$ matrix.

In the vertebrate nervous system, action potentials are one of the universal principles of communication between neurons \citep{kochBiophysics2004, gollischRapid2008, wolfeSparse2010, boerlinSpikebased2011, gutigspike2014, srivastavaMotor2017}. The spiking behavior of cortical neurons confers efficient and robust computational properties, providing the basis for the realization of the cortex's complex adaptive functions \citep{mainenReliability1995, kochBiophysics2004, bazhenovRole2005, gollischRapid2008, wolfeSparse2010, boerlinSpikebased2011}. At the level of single neurons, spike events are incredibly brief and sparse, and through synaptic communication they can affect the dynamics of neurons in downstream areas. This property, whereby neurons not only communicate information but also modify behavior, laid the basis for viewing neurons as controllers \citep{cisekcomputer1999, ahissarPerception2016} and spikes as control signals. Clear examples of spike-driven control can be found in movement control studies, where spikes allow animals to orient their head in space \citep{zhangRepresentation1996}, as well as control the movement of their eyes \citep{seungHow1996} or limbs \citep{bernikernormative2019}. This control perspective can offer important insights into neural computation, but is also increasingly relevant for neuromorphic applications where spiking networks are used to control downstream plants \citep{stagstedneuromorphic2020, polykretisspiking2022, liuSpiking2023}.

Although the theory and applications of control with spiking networks have recently seen advances, it remains unclear how to generate each spike so as to successfully control a plant's dynamics. Traditionally, control theory relies on optimal control methods, such as the linear quadratic regulator (LQR) \citep{andersonOptimal2007}, which continuously adjusts control inputs to drive a system toward a target state while minimizing a cost function (Fig. 1B). Although effective, these continuous control paradigms are difficult to reconcile with the discrete nature of SNNs. Attempts at bridging this gap include filtered-spikes approaches, which apply classic optimal control principles to SNNs by approximating LQR signals through rate codes or filtered spike trains \citep{boerlinPredictive2013, deneveEfficient2016, huangDynamical2018, guoNeural2021, slijkhuisClosedform2023}, as shown in Fig. 1C. Importantly, in these filtered-spikes approaches, spikes serve as the network's representation of the system state but do not themselves constitute the control signal, which remains continuous in nature. This makes it difficult to trace the effect each single spike has on the trajectory of the controlled system. Alternative approaches to spike-based control in the literature focus on individual neuron-based controllers \citep{mooreneuron2024}, or implement forms of instantaneous control \citep{yangImpulsive1999}. Such impulsive control methods have been widely studied, but they mostly take a control-theoretic perspective, sometimes even bypassing the need for a network \citep{yangStability2007, raffObservers2007, wangImpulsive2014, ouyangImpulsive2020}. Only recently have some works started integrating this framework with theoretical neuroscience concepts, explicitly interpreting the control impulses as the neurons' action potentials \citep{petriAnalysis2024}. Still, network architectures are imposed rather than derived from control objectives. For this reason, the integration of bio-informed SNNs with (feedback) control remains an under-explored area. A new paradigm is needed to account for direct spiking control and its seamless integration with computational neuroscience theories.

Our study investigates how spike events can influence the dynamics of linear systems, using spikes directly as control signals. We consider this problem from the perspective of control theory, by formally deriving how neurons should spike to control a linear dynamical system (Fig. 1). This equates to generating control inputs that can successfully drive the dynamics of a downstream plant to minimize a certain loss associated with the state of the system. When deriving a solution for this problem (Sec. 4.2), we obtain, without needing to introduce further assumptions, a feedback control scenario like the one depicted in Fig. 1A. Our work embeds this instantaneous spiking control within theoretical neuroscience constraints.

Inspired by the neuroscience theory of spike coding networks (SCNs) \citep{boerlinPredictive2013, gutigspike2014, deneveEfficient2016, guoNeural2021}, we derive, from optimal control principles, a network of recurrent integrate-and-fire (IF) neurons with sparse activity and low-rank connectivity. By directly incorporating spikes as control signals (Fig. 1D), our model ensures that each neuron's contribution to the system's trajectory is explicit. Between spike events $\mathbf{s}(t)$, the system evolves in open loop according to its own dynamics, and whenever a control signal is produced, the system switches to closed-loop control (with $\mathbf{s}(t)$ being the control input). Our solution naturally exploits the intrinsic dynamics of the plant, instead of continuously inputting a signal over time. This approach could help explain how neural circuits execute control tasks efficiently while preserving key features of biological networks. In the following sections, we present our framework, analyze the model's performance on a linear dynamical system, and discuss its implications for both neuroscience and control theory.

2 Results

2.1 Formulating the control problem

To investigate the role of spiking activity in controlling linear dynamical systems, we start by revisiting the fundamental components of a control task. We then introduce our approach to modeling spike-based control, and finally explore the consequences of applying this method to physically constrained systems.

2.1.1 General control setup

The control of a generic $K$-dimensional linear dynamical system through an $N$-dimensional control signal, as illustrated in Fig. 1A, can be expressed as:

$\dot{\mathbf{x}} = \mathbf{A}\mathbf{x} + \mathbf{B}\mathbf{u}\,,$ (1)

where $\mathbf{x} \in \mathbb{R}^{K}$ is the system state, $\mathbf{A} \in \mathbb{R}^{K \times K}$ is the system matrix, describing the system's uncontrolled dynamics, $\mathbf{B} \in \mathbb{R}^{K \times N}$ is the input matrix, which maps the control input onto the system, and $\mathbf{u} \in \mathbb{R}^{N}$ is the control input.

In classical control, $\mathbf{u}$ is typically a continuous signal, as in LQR control, where the optimal control input is computed to minimize a quadratic cost function (Fig. 1B). For control systems that make use of neural networks, the control signal can instead be derived from neural activity. In spiking networks, a filtered-spike signal can be employed, where the firing rates of neurons determine the value of the control input at each timestep (Fig. 1C). Here, the spikes aid the network's internal representation of the system, but the control input still approximates a continuous control signal \citep{boerlinPredictive2013, deneveEfficient2016, huangOptimizing2017, huangDynamical2018, guoNeural2021, slijkhuisClosedform2023, mooreneuron2024}. In our work, we instead consider a sparse and discrete control signal, where the input consists of brief pulses or 'kicks' (Fig. 1D) \citep{yangImpulsive1999, yangStability2007, raffObservers2007, wangImpulsive2014, ouyangImpulsive2020, petriAnalysis2024}. The system then becomes:

$\dot{\mathbf{x}} = \mathbf{A}\mathbf{x} + \mathbf{B}\mathbf{s}\,,$ (2)

where $\mathbf{s} \in \mathbb{R}^{N}$ describes the on/off state of each of the $N$ neurons in the network. A spike is emitted every time the voltage $V_i$ of neuron $i$ reaches its threshold $T_i$, resulting in the spike train $s_i(t) = \sum_j \delta(t - t_j)$, with $t_j$ indexing the spike times. Importantly, here $\delta$ is the Dirac delta function, which is zero everywhere except at the spike times $t_j$, where it integrates to one. This is different from other forms of impulsive control such as bang-bang control or chattering control \citep{felgenhauerstability2003}, where the instantaneous input is maintained for a certain period of time.
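To make the contrast with analog control concrete, Eq. (2) can be integrated numerically with each Dirac spike implemented as an instantaneous jump of the state by the corresponding column of $\mathbf{B}$. The sketch below is illustrative only (not the authors' code); the forward-Euler step, the parameter values and the `spike_fn` interface are our assumptions.

```python
import numpy as np

# Minimal sketch of Eq. (2): forward-Euler integration of a linear plant,
# where each spike is a Dirac delta implemented as an instantaneous kick
# of the state by the corresponding column of B. Illustrative only.
def simulate_impulse_control(A, B, x0, spike_fn, dt=1e-3, T=5.0):
    """spike_fn(t, x) returns a neuron index to spike, or None."""
    steps = int(T / dt)
    x = np.asarray(x0, dtype=float).copy()
    xs = np.empty((steps, x.size))
    for k in range(steps):
        t = k * dt
        x = x + dt * (A @ x)      # open-loop drift between spikes
        i = spike_fn(t, x)
        if i is not None:
            x = x + B[:, i]       # instantaneous kick from the delta spike
        xs[k] = x
    return xs

# Toy usage: a damped 2D oscillator kicked rightward whenever x[0] < 0.
A = np.array([[0.0, 1.0], [-1.0, -0.5]])
B = 0.2 * np.eye(2)
traj = simulate_impulse_control(A, B, [1.0, 0.0],
                                lambda t, x: 0 if x[0] < 0.0 else None)
```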

2.1.2 Spike-based control

To formalize our control approach, we define a loss function that expresses the discrepancy between the current state of the linear dynamical system and a desired target state, by taking the squared $L^2$ norm of their difference. Specifically, we consider the loss:

$L = \|\mathbf{x}(t+f) - \mathbf{z}(t+f)\|^{2} + \mu \|\mathbf{s}(t)\|^{2}\,,$ (3)

where $\mathbf{x}(t+f)$ is the system state some time $f$ in the future, $\mathbf{z}(t+f) \in \mathbb{R}^{K}$ is the target state, and $\mu$ is a regularization parameter that determines the spiking cost (Sec. 2.5). We consider both reactive control ($f = 0$) and predictive control ($f > 0$). Classically, such a loss might be minimized by taking the gradient with respect to certain system parameters, or by finding the optimal control input that minimizes this loss over some future time window, as in standard LQR control. However, the use of a spiking control signal makes both of these approaches challenging to solve mathematically. Here, we instead take inspiration from the neuroscientific theory of spike coding networks. SCN theory assumes spikes should only happen when they improve an underlying coding loss, and derives the required spiking dynamics from this assumption. From this principle, spiking neural networks that represent their inputs \citep{boerlinPredictive2013, deneveEfficient2016} and perform a variety of computations \citep{brendelLearning2020, guoNeural2021, slijkhuisClosedform2023} can be defined.
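As a concrete reading of Eq. (3), the predictive loss can be evaluated under the assumption that the plant evolves in open loop between now and $t+f$, so that $\mathbf{x}(t+f) \approx e^{\mathbf{A}f}\mathbf{x}_0$, and that a spike of neuron $i$ adds $\mathbf{b}_i$ to the state before this evolution. The snippet below is a hedged sketch of this comparison; the function name and argument conventions are ours, not the paper's.

```python
import numpy as np
from scipy.linalg import expm

# Sketch of the loss comparison behind Eq. (3), assuming open-loop evolution
# over the horizon f (A_f = e^{A f}) and a spike of neuron i that adds the
# column b_i to the current state x0. Illustrative, not the authors' code.
def predictive_losses(A, B, x0, z, f, mu, i):
    A_f = expm(A * f)
    L_ns = np.sum((A_f @ x0 - z) ** 2)                   # loss without a spike
    L_s = np.sum((A_f @ (x0 + B[:, i]) - z) ** 2) + mu   # loss with a spike + cost
    return L_s, L_ns

# Neuron i should fire only if L_s < L_ns.
```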

Following this same principle, but applied to the problem of controlling dynamical systems, we require that a spike should only happen if it improves the control loss (Eq. (3)). In these SCN-inspired approaches, whenever a neuron in the network spikes, it (directly or indirectly) influences the state of the plant, resulting in a different loss value. Therefore, we can compute two different losses, $L_{\text{s}}$ and $L_{\text{ns}}$, representing the distance to the target in the presence or absence of a spike, respectively. In order to control the plant's dynamics, the network compares the two losses to determine whether inputting a spike is advantageous for reaching the control target. The inequality $L_{\text{s}} < L_{\text{ns}}$ describes how our loss can be improved by the control input, decreasing the distance to the target and helping our network solve the control task. These losses define a spiking condition, reminiscent of the voltage-threshold inequality $\mathbf{V} > \mathbf{T}$ (where $\mathbf{V}$ is the voltage of the neurons and $\mathbf{T}$ is their firing threshold) that allows biological neural networks to produce action potentials. In Sec. 4.2, we show that one can indeed derive voltage and threshold values from our loss comparison. As a consequence, in our framework, the values of both voltage and threshold are composed exclusively of quantities that describe the (current and future) states of the plant. In particular, the voltage of neuron $i$ is expressed by:

$V_i = \mathbf{b}_i^T \mathbf{A}_f^T (\mathbf{z} - \mathbf{A}_f \mathbf{x}_0)\,,$ (4)

where $\mathbf{b}_i$ is the $i$th column of the matrix $\mathbf{B}$, $\mathbf{A}_f \equiv e^{\mathbf{A} f} \in \mathbb{R}^{K \times K}$ is the matrix exponential of the state matrix $\mathbf{A}$ times the future time window $f$, and $\mathbf{x}_0 = \mathbf{x}(t)$ is the current state of the system. The threshold of neuron $i$ is described as:

$T_i = \frac{\mathbf{b}_i^T \mathbf{A}_f^T \mathbf{A}_f \mathbf{b}_i}{2} + \mu\,.$ (5)

A detailed derivation of the values of $\mathbf{V}$ and $\mathbf{T}$ can be found in Sec. 4.2.4. Although a cost $\mathbf{C}$ on each dimension of the system can be introduced to further modulate spiking activity (Sec. 4.2), the values of $V_i$ and $T_i$ in Eq. (4) and Eq. (5) do not consider this weighted case (assuming $\mathbf{C}$ is an identity matrix), for simplicity. Further developing this simple spiking rule leads to a recurrent network of integrate-and-fire neurons, illustrated in Fig. 2, and described by:

$\dot{\mathbf{V}} = -\mathbf{V} + \mathbf{G}(\dot{\mathbf{z}} + \mathbf{z}) - \mathbf{F}\mathbf{x}_0 - \mathbf{\Omega}\mathbf{s}\,,$ (6)

where $\mathbf{V} \in \mathbb{R}^{N}$ represents the membrane potentials, $\mathbf{G} \in \mathbb{R}^{N \times K}$ contains the forward connections that encode the target $\mathbf{z}$, $\mathbf{F} \in \mathbb{R}^{N \times K}$ contains the forward connections that encode the current system state $\mathbf{x}_0$, $\mathbf{\Omega} \in \mathbb{R}^{N \times N}$ represents the recurrent connections, and $\mathbf{s}$ are the spikes. Interestingly, the obtained recurrent network has low-rank connectivity with respect to the dimensions of the downstream system (Sec. 4.2.4). This network structure is not assumed, but rather derived from each neuron's greedy spiking condition given our loss comparison (Sec. 3.1.1). Consequently, although the obtained network is recurrent, each neuron computes its losses without access to network-level information (Sec. 3.1.1). For a full derivation of the network, see Sec. 4.2.
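For intuition, Eqs. (4) and (5) can be evaluated directly from plant quantities, which is all a neuron needs for its local spike decision. The following sketch assumes the unweighted case ($\mathbf{C} = \mathbf{I}$) and is simply another way of writing the comparison of spiking and non-spiking losses in Eq. (3); the variable names are illustrative.

```python
import numpy as np
from scipy.linalg import expm

# Hedged sketch of Eqs. (4)-(5) in the unweighted case (C = I):
# voltages and thresholds expressed purely in terms of plant quantities.
def voltages_and_thresholds(A, B, x0, z, f, mu):
    A_f = expm(A * f)                       # open-loop propagator over f seconds
    AfB = A_f @ B                           # predicted effect of each neuron's kick
    V = AfB.T @ (z - A_f @ x0)              # Eq. (4): one voltage per neuron
    T = 0.5 * np.sum(AfB**2, axis=0) + mu   # Eq. (5): ||A_f b_i||^2 / 2 + mu
    return V, T

# A neuron i is eligible to spike whenever V[i] > T[i].
```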


Figure 2: Scheme of a recurrent spiking network composed of four neurons controlling a downstream system. Inputs and target are mapped onto the network via forward connections: $\mathbf{G}$ maps the target $\mathbf{z}$, while $\mathbf{F}$ maps the system's current state $\mathbf{x}$. $\mathbf{\Omega}$ represents the recurrent weights. The control inputs are the spikes $\mathbf{s}$, which are mapped onto the system via the matrix $\mathbf{B}$.

For this proof-of-concept, we enforce an asynchronous firing scheme: at each timestep, all neurons independently compute their local loss comparisons, but only one is allowed to spike, even if several exceed their threshold. The spiking neuron can be chosen randomly from the suprathreshold set, or, as we do here, selected as the one with the highest voltage above threshold. While for clarity we keep this firing policy in all the shown examples, it is not strictly required for achieving control. This is because networks that operate on local spiking rules can distribute their activity very effectively among neurons \citep{calaimRobustness2021}, which means no specific neuron is more relevant than any other for the computation of the network. Such a simple asynchronous policy is only feasible in zero-delay systems, where the new state of the plant is fed to the network without any time delay. In a system with delays, we could not impose this asynchronous firing. However, we can employ different methods that allow multiple neurons to spike while still being compatible with our local spiking rule, for instance, by scaling each neuron's threshold according to its past firing activity.
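A minimal version of this asynchronous policy, under the zero-delay assumption described above, simply picks the highest-voltage neuron among those above threshold (or none, if no neuron improves the loss); the sketch below is illustrative.

```python
import numpy as np

# Sketch of the asynchronous firing policy: all neurons evaluate their local
# spike condition, but at most one (the suprathreshold neuron with the
# highest voltage) fires per timestep. Illustrative only.
def asynchronous_spike(V, T):
    supra = np.flatnonzero(V > T)
    if supra.size == 0:
        return None                          # no spike improves the loss
    return int(supra[np.argmax(V[supra])])   # highest-voltage winner
```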

2.2 Reactive vs. predictive spiking control

Our control problem is centered around the comparison of losses (distances of the state to a target) under the conditions in which a spike is generated or not (Sec. 2.1.2). By comparing the loss functions in the spiking and non-spiking cases, we derived a firing condition for each neuron. Namely, the neuron's voltage exceeds its threshold and generates a spike only when the effect of this spike on the system is to minimize the loss (reducing the distance between the system's state and the target). Importantly, in our control paradigm, the spiking rule is fully local: each neuron computes its losses, with no access to shared network-level information, and fires only when the inequality $L_{\text{s}} < L_{\text{ns}}$ is satisfied, allowing the dynamics of the plant to unfold independently.

In Sec. 2.1.2 we showed how, starting with a spiking condition based on a loss comparison (spike if $L_{\text{s}} < L_{\text{ns}}$), one can derive the required network structure and dynamics to control a downstream system. Such a spiking rule can be defined relative to the immediate result of a spike, or to the predicted effect that the spike will have on the system. In the first case, spikes are generated based on minimizing the loss function at the current time step. In the latter, spikes are generated based on a forward prediction of the system state, incorporating the expected loss minimization over a short horizon of $f$ seconds. The idea of predictive spiking is also present in SCN-related works, which usually consider the spike's influence on a future state \citep{schwemmerConstructing2015}, or on the difference between past and current states \citep{zeldenrustEfficient2021}. Adapting this predictive spiking idea to our work means that the system evolves in open loop after a spike, only following its intrinsic dynamics, until another spike is able to further improve the predicted loss.

Figure 3: Examples of reactive and predictive spiking. The gray dot represents a target state. (A) Example of a linear dynamical system with its current state (brown dot), and four example states that can be reached with a spike. We employ a reactive spiking rule. The neuron for which $L_{\text{s}} < L_{\text{ns}}$ spikes (green arrow), since its spike immediately shortens the distance between the system's state and the target. The other neurons do not spike (red arrows). (B) Example of a linear dynamical system where the spike decision is taken based on the state of the system $f$ seconds after the spike. In this predictive case, $L_{\text{s}} < L_{\text{ns}}$ is still the spiking condition, but the preferred spiking direction can differ. The condition $L_{\text{s}} < L_{\text{ns}}$ holds for both the upward (green) and rightward (orange) spikes, since the no-spike trajectory (brown) lies further from the target than either of the spiking ones. Here, an asynchronous firing rule (Sec. 2.1.2) would select only one of these neurons to fire; either at random, or by considering the neuron with the highest voltage (green).
Figure 4: Spiking control simulations. The two rows denote reactive and predictive systems. The two columns represent a generic linear system and the physical SMD. All example simulations are run with the simplest possible network of two neurons. (A) On the left side, state-space representation of a reactive spiking control approach for a generic unconstrained system. The gray line corresponds to the changing target. Although the matrix $\mathbf{B}$ allows for spikes in all four directions, the network continuously spikes rightwards, ignoring the intrinsic dynamics of the plant. On the right side, the top figure shows the changes of velocity, position and target over time, while the bottom figure shows the spike control input (each spike scaled by the corresponding value in the matrix $\mathbf{B}$). (B) On the left side, the state-space figure shows that without predictiveness, a physically constrained system never spikes. $\mathbf{B}$ is constrained to only allow spikes on the velocity component, which have no immediate effect on the position. When controlling a physical system with a reactive control approach, the loss in the non-spiking case is always smaller than the loss in the spiking case, and the network never spikes. On the right side, no activity is present. (C) In the predictive, unconstrained example, the system can spike in all four directions but considers its future state. In this case, the network either spikes directly towards the target or spikes upwards, exploiting the system's dynamics. (D) Use of our predictive SNN approach for the control of an SMD. As shown in the state-space representation, as the target position changes, the network spikes up or down, adjusting the velocity of the plant based on its predicted state in $f$ seconds. On the right side, we highlight how the network's activity adjusts to the target. It employs downward spikes to correct for eventual overshooting of the position relative to the target, and produces consecutive upward spikes when the target changes more rapidly.

Reactive and predictive spiking control can lead to significantly different spiking behavior. To illustrate this, we consider both cases for an example linear system, with its dynamics defined by velocity and position (Fig. 3). Each spike produced by a neuron in the controlling network is multiplied by its corresponding column in the matrix $\mathbf{B}$, which maps it onto the dynamics of the system. In Fig. 3 we chose a $\mathbf{B}$ matrix that maps the spikes from four different neurons as vectors in the four cardinal directions. When the system receives a spike control signal in the current timestep, the system state is instantaneously kicked in the corresponding direction. In the reactive case, only the immediate effect of the spike is considered. This means the loss comparison that produces a spike control signal does not consider the future state of the plant, but rather aims to improve the loss right after the spike is emitted. As shown in Fig. 3A, in this arbitrary linear system, adopting a reactive spiking rule results in a spike that kicks the system directly towards the target (green arrow), without accounting for the future dynamics of the plant (which could later move the trajectory further away from it). All other spiking directions (red arrows) do not result in an improved loss right after the spike, and so the neurons whose spikes are mapped in those directions do not fire. On the contrary, predictive spiking control favors the spike that minimizes the loss $f$ seconds into the future, considering how the system would evolve during this time window. In our predictive example (Fig. 3B), either spiking rightward (orange trajectory) or upward (green trajectory) would give a lower loss in $f$ seconds when compared to the non-spiking case (brown trajectory). If we employ an asynchronous firing policy (Sec. 2.1.2), only one of these two neurons is selected for firing. For all the examples in this paper, we employ asynchronous firing and select the neuron with the highest voltage past its threshold (in this example, the one whose spike is mapped onto the green arrow).

2.3 Application to physical systems: spring-mass-damper

To illustrate an application of our framework, we consider, as an example, the control of a spring-mass-damper (SMD) system \citep{guRobust2005}; for schematics see Fig. 1. This is a linear dynamical system composed of a mass attached to a spring (which causes oscillations whenever the mass is moved) and a damper (which dampens the oscillations, giving the system a fixed stable state). The system can be entirely described by its two variables (the position and the velocity of the mass), which together form our system state $\mathbf{x}$. This makes the SMD a very simple system with observable dynamics and clear analytical solutions, while still modeling a physical system with realistic state variables. As such, it is a standard benchmark in the control literature and in spiking control theories \citep{vincentPositioning1989, slijkhuisClosedform2023, ahmadvandNeuromorphic2023}. Furthermore, the SMD as a linear system can be expanded to approximate more complex plants, as shown by its applications in muscle models for motor control and neuro-mechanical studies \citep{byadarhalymodular2012, pazzagliaBalancing2025}. In our proof-of-concept, we make use of a simple target $\mathbf{z}$, which evolves during the simulation, incrementing twice and then stabilizing. In all of our simulations, the target state always refers to the position of the mass, while its velocity can range freely. Although this is the setup we will work with, our algorithm allows us to constrain the values of either the position or the velocity of the mass, by changing the entries in the cost matrix $\mathbf{C}$, as shown in Sec. 4.2. We will now illustrate the differences that emerge when controlling the system using a reactive versus a predictive paradigm, as described in Sec. 2.2 (Fig. 3). We consider both physically unconstrained control (where the control signal can affect both velocity and position) and physically constrained control (where the control signal can only affect the velocity).
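For reference, an SMD in the form of Eq. (1)/(2) can be written with the state $\mathbf{x} = [\text{position}, \text{velocity}]$; the sketch below uses illustrative parameter values (not those of the paper) and a physically constrained $\mathbf{B}$ in which two neurons can only kick the velocity up or down.

```python
import numpy as np

# Sketch of an SMD plant in the form of Eq. (1)/(2), with illustrative
# parameters. The constrained B lets spikes act only on the velocity.
m, k, c = 1.0, 2.0, 0.5               # mass, spring constant, damping
A = np.array([[0.0,    1.0],          # d(position)/dt = velocity
              [-k / m, -c / m]])      # d(velocity)/dt = -(k/m) p - (c/m) v
kick = 0.3                            # kick strength (meta-parameter)
B = np.array([[0.0,   0.0],           # spikes cannot teleport the position...
              [kick, -kick]])         # ...only push the velocity up or down
```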

2.3.1 Reactive control

We first consider a physically unconstrained system with reactive control (Fig. 4A). In this case, the system can successfully be made to track a target (Fig. 4A, top right). This is achieved by continuously spiking directly towards the target, which is always the better choice since the neurons do not consider future states of the plant (as shown in Fig. 3A). In this example, the control is exerted by progressively kicking the mass onto the same position as the evolving target, essentially 'teleporting' it without affecting its velocity. Of course, this is only possible because we are considering a physically unconstrained system, where the mass is allowed to be moved instantly in any direction of the state space. Also note that because of this property, the system ends up in a physically impossible state with a non-zero velocity (green curve in the upper right plot), despite remaining in place according to the position variable. We therefore ask whether a reactive control method could work on physically constrained systems. We apply reactive control to our physically constrained example, running the simulation of an SMD. In this context, applying a spiking control rule means directly influencing the velocity component of the state $\mathbf{x}$, while the position component evolves as a consequence. This reflects a key constraint of physical systems: the position cannot be changed instantaneously, as the mass cannot move from one position to another in no time. Instead, we constrain spiking control to act on the simulated physical system by applying a force that only kicks the velocity, guaranteeing realistic state evolution in accordance with Newtonian mechanics. As shown in Fig. 4B, a reactive approach fails to control a physically constrained system. This is because influencing the velocity of the mass has no instant effect on its position. To appreciate these effects, the network should be able to observe future states of the plant. In short, spiking up or down in state space does not immediately affect the loss, and no spike is produced.

2.3.2 Predictive control

We now apply our predictive spiking algorithm, first to an unconstrained system. In this scenario, the controller is allowed to spike any dimension of the system, but it is also able to predict the plant's future states, and it produces spikes that improve the loss $f$ seconds from the current state. From Fig. 4C, we can see that when exerting control, the position is sometimes directly moved towards the target (as in the reactive case), but optimizing the local losses often equates to kicking the system in its velocity, exploiting the intrinsic dynamics of the plant. Our predictive algorithm thus exploits the intrinsic dynamics of the system even in the absence of physical constraints, although in some instances the mass is still instantaneously moved towards the target. Finally, we apply the same predictive algorithm to the SMD simulation. In this scenario, the spiking network accounts for future states of the system, and is constrained to only spike the velocity either up or down, adjusting to the changing target position. The trajectory approaches the target, and the loss $f$ seconds after a spike is minimized. If the position then departs from the target, the losses of the neurons change their value, eventually causing a neuron to reach its threshold. At this point, the velocity is either spiked up or down, causing the network to trace the target correctly. As shown in Fig. 4D, our predictive spiking rule introduces an element of anticipation, making the control strategy more robust for physically constrained dynamical systems, and allowing for the control of the SMD where reactive control would fail.

2.4 The controller as a recurrent spiking neural network

We now focus on the example activity of a predictive controller of an SMD with a network of four neurons, as shown in Fig. 5. A visual representation of the structure of this network is shown in Fig. 2. In this example, we want to show the network's performance, spikes, voltage traces, and errors. To bring all four neurons to spike, we choose a slightly more complex target $\mathbf{z}$, which spans positive and negative values. The control task and the network are set up in the same way as in the previous SMD example (Fig. 4D). In each timestep of the simulation, the neurons compute spiking and non-spiking losses, accounting for the future state of the system in $f$ seconds and its distance to the target. The distance to the target considered in these losses is represented as the predictive error. When its value is large enough, the neuron's threshold is reached, and a spike is produced to minimize it. In turn, this minimizes the actual error, allowing the position of the mass to approach the target position. The sudden changes in value of the predictive error are caused by the voltage traces of the neurons reaching their threshold, and thus temporally coincide with the spike events. Each spike is a direct control input. It is scaled by the matrix $\mathbf{B}$ by a different amount for each neuron, and mapped as a positive or negative contribution to the velocity in $\mathbf{x}$. Between spike moments, the system evolves as an open-loop system, without any control signal as input, following its intrinsic dynamics. The control loop is closed each time a spike is produced and mapped onto the velocity. Since each spike is modeled as a Dirac delta function, the corresponding change in velocity is instantaneous, and the control naturally exploits the intrinsic dynamics of the plant to minimize the predictive error.

Figure 5: Example activity of a network of four neurons. Each spike plotted as a control signal is modeled as a Dirac delta function that gets mapped onto the plant's dynamics through the matrix $\mathbf{B}$, and instantaneously influences the velocity component of the system (Sec. 2.3). Spike moments and the corresponding voltage traces are shown, as well as the resulting errors. The error trace (deep blue line) describes the current distance between the position and the target state, while the predictive error (light blue line) measures the distance between the target and the predicted position $f$ seconds into the future.

2.5 Varying meta-parameters

Figure 6: Changing two meta-parameters ($\mu$ and $f$) when simulating the SMD with spiking control: (A) Energy usage, calculated as the effected acceleration onto the system in each simulation ($\mathbf{B}\mathbf{s}$), for various combinations of $\mu$ and $f$ (linear scale). (B) Total error, calculated as the cumulative distance from the state $\mathbf{x}$ to the target $\mathbf{z}$ in each simulation, for different combinations of $\mu$ and $f$ (logarithmic scale). (C-F) Different control regimes for different combinations of $\mu$ and $f$; (C): efficient control regime, (D): under-control regime, (E): over-control regime, (F): inefficient control regime.

The control network we employ operates by expressing its neural parameters in terms of elements of the downstream plant, which serves as a proof-of-concept for a spiking network that can directly compute control for a linear dynamical system (Sec. 4.2). However, at this stage, other features of the network are still imposed as meta-parameters to adjust the control to our SMD example. For a detailed list of all parameter settings used in this work, see Sec. 4.1.

Number of neurons ($N$)

The number of neurons in the network heavily influences the control of the plant. For instance, when operating on a system with very simple oscillatory dynamics, such as our SMD, employing more neurons can make the control more robust across settings, but it raises the need to manage the spiking activity among neurons, especially since they do not have access to network-level information (Sec. 4.2). This can be achieved by imposing asynchronous firing or by introducing an adaptive cost on spiking, as mentioned in Sec. 2.1.2. While for our activity example in Fig. 5 we showed the results for four neurons, in all other demonstrations we kept the network as simple as possible and employed only two neurons: spiking the velocity of the plant either up or down by the same amount. Indeed, for systems with more complex dynamics and targets, exerting effective control requires larger networks, as in the coupled oscillator case of Fig. 8, where $N = 500$.

Kick strength ($\mathbf{B}\mathbf{s}$)

The spikes of our SNN are unitary Dirac delta functions that operate as control signals on the dynamics of the plant. These signals are mapped onto the downstream system via the matrix $\mathbf{B}$, which can arbitrarily scale their value, resulting in smaller or larger kicks to the velocity of the system. This is the case for Fig. 5 and Fig. 8, where each neuron kicks the velocity by a different amount. In all other examples, since we refer to the same SMD system with the same dynamics, we keep $\mathbf{B}$ fixed, only allowing the system's velocity to be kicked up or down with the same strength. This keeps our proof-of-concept as clear as possible. Of course, the value of $\mathbf{B}$ holds great relevance when varying the control objective, as it can let the same network control significantly different plants, as well as exert control on the same plant with different levels of accuracy. For this parameter exploration, since we only tackle the same SMD example, we keep a fixed $\mathbf{B}$ that produces kicks of a magnitude compatible with the downstream plant.

Global threshold scaling ($\mu$)

When deriving the network, we can impose a scaling term $\mu$ on the loss to introduce a cost for spiking (Eq. (3)). This term influences the neurons' threshold value (see Sec. 4.2). Scaling the neurons' threshold gives interesting results when controlling the plant (Fig. 6), since it affects the number of spikes emitted as well as the accuracy in tracing the target position.

Predictive window ($f$)

When computing the spiking decision, each neuron compares spiking and non-spiking losses. As mentioned, these are predictive losses: they take into account the effect of a spike on the system $f$ seconds into the future (Sec. 2.2). The width of this future time window is crucial for the control results (Fig. 6). Looking further into the future allows neurons to better assess whether the trajectory will approach or depart from the target, often preventing unnecessary spikes. On the other hand, if the window $f$ is very large, the predicted state used in the loss comparison might approach (or depart from) the target well before the actual state of the plant does. This influences the accuracy with which the network represents the effect of its own control on the system's dynamics, as we explain below.

2.5.1 Control regimes

Different combinations of these meta-parameters can give different results in the control of the plant. Quantifying the optimality of the control strategy in comparison to other paradigms (standard LQR, filtered spikes) is very difficult. As described in Sec. 3.1.1, each neuron in our network follows a greedy spiking rule given the control set, but as in other approaches, this does not necessarily produce the best overall controller. It is nevertheless of interest to vary some control parameters, considering the cases where the changes in target value are traced by the system's state (and therefore the SMD is successfully controlled). Within these cases, we identified different regimes of control for our network. In these explorations we decided to keep the number of neurons ($N = 2$) and the kick strength $\mathbf{B}$ fixed. We therefore vary the future time window $f$, which influences the predicted state on which the network bases its spiking decision, and $\mu$, the global scaling factor for the threshold. As shown in Fig. 6A, networks with a lower $\mu$ tend to spike more, using more energy, as defined by the effected acceleration onto the system (for this system, this equates to the total spike count of the simulation, with each spike scaled by the corresponding value in the matrix $\mathbf{B}$). Given the same $f$, these networks also tend to have a lower error compared to high-$\mu$ networks (Fig. 6B), since a lower threshold allows the network to adjust more often to the target position. In general, Fig. 6B shows that a relatively wide range of parameters allows for control of the plant, giving comparable error measures. Only more extreme parameter combinations (a high threshold scaling, especially with small $f$) result in the network not tracing the target, yielding a significantly higher cumulative error.
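The two summary measures of Fig. 6 can be computed directly from the simulation traces; a sketch is given below, assuming `spikes` is a (timesteps × N) array of emitted spikes and `xs`, `zs` are the state and target trajectories (these array conventions are ours, not the paper's).

```python
import numpy as np

# Hedged sketch of the two summary measures used in Fig. 6.
def energy_usage(spikes, B):
    # total effected acceleration onto the system: sum of |B s| over time
    return float(np.abs(spikes @ B.T).sum())

def total_error(xs, zs):
    # cumulative Euclidean distance between state and target
    return float(np.linalg.norm(xs - zs, axis=1).sum())
```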

Figure 7: Simulation of the control of an SMD with continuous control (LQR), filtered-spikes control, and predictive spiking control: (A) LQR control example: the controller continuously adjusts the position to the target by modulating the velocity. This is shown in the phase plane (first row) and in the activity plot (second row). Each time the target value changes, the position slowly adjusts to it, stabilizing when the target is reached. The control signal (third row) is larger when the target changes value, and the velocity is modulated to adapt to it. During these moments, the mass is the furthest away from the target, which is reflected in the error profile (bottom row). (B) The filtered spikes control is approximating the LQR signal. Therefore, the trajectory in space, the activity over time and consequently the error (first, second, and last rows) have the same profile as in panel (A). The approximation is seen in the control input (third row). Since the system is represented through spikes, the network keeps spiking to trace the LQR signal. (C) In our predictive spiking algorithm, spikes directly influence the velocity of the system. This is shown in the upwards and downwards kicks in state space and in the activity plot (first and second row). On the one hand, this helps the controller adjust faster to changes in target values. On the other hand, when the target is reached, the controller can only orbit its position by spiking. This fundamental difference is also highlighted in the error profile (bottom row). In the control input plot (third row) we can see how the spike count is drastically lower in spike control than in the filtered-spikes one.
Efficient and inefficient control

Within the boundaries of these trade-offs, we identified four regimes (Fig. 6C-F). With an adequate future time window $f$ and relatively low $\mu$, the network can operate in an efficient regime (Fig. 6C), where the target position is followed precisely while keeping a relatively low spike count. If the spike count is lowered any further from this point, the control input becomes too sparse, and the SMD mass no longer closely orbits the target position. On the contrary, decreasing the value of both these meta-parameters yields the inefficient regime, where the target is still traced with comparable precision, but the spike count, and as a consequence the energy input, is considerably higher (Fig. 6F).

Under- and over-control

We then have the case where the network is allowed to look far enough into future states of the system, but $\mu$ is also very high, making it difficult for the neurons to spike enough to trace the target accurately (Fig. 6D). In this under-control regime, the position approximates the target but never manages to reach it. Finally, in the last case, the neurons are relatively free to spike as $\mu$ is low, but they are bound to look very far into future states of the system (Fig. 6E). In this over-control regime, the neurons fire based on a predicted position that is far into the future (large $f$) and thus differs significantly from the actual one. For instance, while the current state of the system might not have reached the target yet, the predicted state is already well past it, which causes the neurons to fire earlier than needed. This over-adjustment makes the position hover slightly past the actual target.

Figure 8: Control simulation of a coupled oscillator system. (A) Scheme of the feedback spiking control for a downstream coupled-oscillator system. The current state of the system $\mathbf{x}(t)$ is given as input to the network, along with target state values $\mathbf{z}(t)$. The SNN outputs spikes $\mathbf{s}(t)$. These represent the control signal $\mathbf{u}(t)$, computed as a function of both inputs. As for the other networks, $\mathbf{u}(t)$ is then mapped onto the dynamics of the downstream system via the $\mathbf{B}$ matrix. This time, the downstream system is composed of $M$ masses connected to each other, such that the acceleration of each mass influences the position of the next one. The plant dynamics are still linear, but they are significantly up-scaled in dimensionality when compared to the single SMD examples. (B) Activity for the control of a system with $M = 10$ coupled masses. Each mass has a different target position over time $\mathbf{z}(t)$. For this example, the SNN makes use of a randomly initialized $\mathbf{B}$ matrix to map the spikes onto the high-dimensional system and guide the position of each mass to its designated target. Note: for clarity, here we do not show the spiked velocity over time as in previous figures. In blue we show the cumulative error of all neurons over time. (C) Spiking activity of the controlling network. We demonstrate the robustness of the control by progressively silencing portions of the $N = 500$ neurons during the simulation. The vertical red lines indicate the timesteps at which neurons are silenced. The network successfully redistributes the spiking activity among the non-silenced neurons and, as a result, the control performance is not affected (as shown in panel B).

2.6 Comparing control approaches

Lastly, we want to showcase the differences in the resulting control for three approaches: continuous LQR control, filtered-spikes control, and our predictive spiking control. As in previous examples, we consider a physically constrained simulation of an SMD, with a progressively changing target. In Fig. 7 we directly compare the same measures for all three methods. As shown in the error trace (bottom rows), the filtered-spikes algorithm (Fig. 7B) approximates the control strategy of the continuous LQR (Fig. 7A). Spiking control (Fig. 7C) has similar increases and decreases of the error (contingent on the changes in the value of the target), although its profile is very different. This is mainly due to its control input not being continuous, which only allows the position to orbit the target with a certain accuracy, without ever settling on the exact target value. Analogous differences are highlighted in the activity over time of these networks (second row). In the continuous and filtered cases, the velocity is not spiked, and the position slowly adjusts to the changes in target, stabilizing once the target is reached. On the contrary, spiking control progressively adjusts the velocity up or down, tracing the target as soon as it changes value. Once the target is reached, the controller orbits the target position by spiking regularly until the end of the simulation. Other fundamental differences are highlighted in the control input (third row), and in the consequent trajectory in state space (first row). As shown, the filtered-spikes controller is able to smoothly control the state of the system by approximating the control input of the LQR. In this sense, the spikes only represent the system's state, while the control signal is still continuous, as it is applied at each timestep of the simulation. This approximation requires the network to spike much more than our spiking control algorithm, which exclusively inputs a kick when it needs to adjust the position towards the target.
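For completeness, the continuous LQR baseline of Fig. 7A can be obtained by solving the continuous-time algebraic Riccati equation for the SMD. The sketch below uses illustrative weights $Q$, $R$ and a single continuous force input on the velocity, and omits the target-tracking (reference/feedforward) term of the full setup; it is not the exact configuration used in our simulations.

```python
import numpy as np
from scipy.linalg import solve_continuous_are

# Hedged sketch of the LQR baseline: state-feedback gain for an SMD with a
# continuous force input on the velocity. Weights are illustrative.
A = np.array([[0.0, 1.0], [-2.0, -0.5]])
B_c = np.array([[0.0], [1.0]])
Q = np.diag([1.0, 0.1])              # penalize mostly the position error
R = np.array([[0.01]])               # relatively cheap control
P = solve_continuous_are(A, B_c, Q, R)
K = np.linalg.solve(R, B_c.T @ P)    # u = -K x (plus a reference term for tracking)
```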

2.7 Controlling coupled oscillators

We now scale up the same algorithm described in the previous sections to control a higher-dimensional linear system. This time, the plant we chose is a coupled oscillator system, where $M$ SMDs are connected in series, such that the position of one mass affects the acceleration of the masses connected to it. The network has to control multiple SMDs, while taking into account the coupled effects between them. For the example shown in Fig. 8, we consider the control of $M = 10$ coupled masses using a network of $N = 500$ neurons. As shown, the control algorithm scales naturally to this more complex linear system: even though the masses are coupled with each other, our local spiking rule is able to coordinate the spikes so as to allow all masses to reach their respective targets (Fig. 8B). Each neuron is assigned an element in the matrix $\mathbf{B}$ (now scaled up to the correct dimensionality based on the number of masses $M$) with a random normalized value. The spike of each neuron then maps either as a positive or negative kick on the velocity of a mass by the corresponding amount. This results in some neurons having a higher kick strength than others, but as before, only the neurons that allow the system to decrease the distance from the target in $f$ seconds will fire. During the simulation we observe that spiking activity is sparse and distributed among neurons (Fig. 8C). We then silence random neurons to test the robustness of the system. After each silencing event, the activity is redistributed among the remaining active neurons and there is no effect on the control performance. This shows that our control algorithm scales to complex linear systems and is robust to cell loss: no specific neuron determines the outcome of the control performance.
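As an illustration of how such a plant can be assembled, the sketch below builds a chain of $M$ coupled spring-mass-dampers as one linear system and draws a random velocity-only $\mathbf{B}$. The nearest-neighbour coupling and the random assignment of neurons to masses are our assumptions, not necessarily the exact construction used here.

```python
import numpy as np

# Hedged sketch: a chain of M coupled SMDs as one linear system with state
# [p_1, v_1, ..., p_M, v_M], assuming nearest-neighbour spring coupling.
def coupled_oscillator_A(M, m=1.0, k=2.0, c=0.5):
    A = np.zeros((2 * M, 2 * M))
    for j in range(M):
        p, v = 2 * j, 2 * j + 1
        A[p, v] = 1.0                # position integrates velocity
        A[v, p] = -2.0 * k / m       # restoring force on mass j
        A[v, v] = -c / m             # damping
        if j > 0:
            A[v, p - 2] = k / m      # coupling to the left neighbour
        if j < M - 1:
            A[v, p + 2] = k / m      # coupling to the right neighbour
    return A

def random_velocity_B(M, N, scale=0.3, seed=0):
    # Each neuron kicks the velocity of one randomly chosen mass with a
    # random (normalized) strength -- an illustrative choice.
    rng = np.random.default_rng(seed)
    B = np.zeros((2 * M, N))
    masses = rng.integers(0, M, size=N)
    B[2 * masses + 1, np.arange(N)] = scale * rng.standard_normal(N)
    return B

A = coupled_oscillator_A(M=10)
B = random_velocity_B(M=10, N=500)
```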

3 Discussion

3.1 Proposed spiking control framework

As described in Sec. 1, biological neurons use spikes for signaling and for the control of downstream systems. Control of dynamical systems has been widely studied and applied in different contexts, including artificial neural networks \citep{boerlinPredictive2013, deneveEfficient2016, huangDynamical2018, slijkhuisClosedform2023, mooreneuron2024}, but these approaches generally do not consider spikes directly as control signals. In this work we remedied this by showing how control can be computed directly through the use of single spike events. This allowed us to obtain a model that uses spikes to actively kick the dynamics of the downstream plant, and exploits its intrinsic dynamics to reach a certain target. The emitted spikes are instantaneous events (modeled as Dirac delta functions, Sec. 2.1.1). The spike generation mechanisms in the network and the mappings to the system need to be adjusted to the various constraints the plant might have in its dynamics (Sec. 2.3). In order for this control paradigm to work on physically constrained systems, we developed a predictive algorithm that takes into consideration the effect of our control inputs on the system within a certain future window, $f$. Our spiking rule is fully local and is computed for each neuron in every timestep of the simulation. Therefore, the target can be reached without a network-level objective, and without any explicit learning paradigm. Of course, introducing learning could still be a very useful addition to adjust the meta-parameters of the network when adapting the control to different plants, or to tackle plants with unknown dynamics (Sec. 3.2). From this simple spiking rule, we obtain, without imposing it, a recurrent network of IF neurons that exerts control by only inputting a spike if it improves the defined loss. This rule does not guarantee the optimality of the control as a whole (as explained in Sec. 3.1.1), but allows the control to work without the need for network-level information. Lastly, we scaled up this framework to control a coupled oscillator system, with multiple connected masses (each with its own target) and a larger network. We showed that our control algorithm is not only easily scalable to complex linear systems and higher-dimensional networks, but also robust to neuron loss.

3.1.1 Assumptions and constraints

Greedy spiking rule

An important clarification about our proposed algorithm is that the spiking rule by which the network produces control signals does not guarantee any optimality of the controller as a whole. At each timestep of the simulation, each neuron makes its spike decision simply by trying to decrease the distance to the target. The neurons therefore produce control signals (spikes) in a "greedy" way, as defined in the work of \citepboerlinPredictive2013. If spiking improves the loss, the neuron always spikes, as that is the only information it has available; no other network-level information is used in the process. In other words, computing the greedy spike decision for each neuron in the current timestep (albeit considering the effect of the spike $f$ seconds ahead) is not equivalent to finding the best set of spikes over a whole future window, let alone over the whole simulation. As a result, our approach does not guarantee that the control actuated by the network is the optimal control solution for reaching a given target. Furthermore, as described in Sec. 2.5.1, many of the meta-parameters of the network are manually set rather than learned or optimized for the control task at hand. In most of our examples, we impose the meta-parameters that best fit the scale of the system to control, to showcase our proof of concept with clarity.

Simplicity of use case

Following similar reasoning, we purposefully chose a simple dynamical system, which is linear, fully observable, and two-dimensional. We also chose a simple target that progressively changes position and then stabilizes. This simplicity allowed us to showcase our results without additional complications, and in Sec. 2.7 we demonstrate how this framework easily scales to any linear dynamical system. However, the limits of our paradigm still need to be tested, by including adaptive (or learned) meta-parameters and by extending our control algorithm to nonlinear systems (Sec. 3.2). Note that even for an optimal controller, these meta-parameters would need to be adjusted to the control problem at hand. The differences in the control approach of our network are highlighted in Fig. 7 and explained in Sec. 2.6, where we control the same SMD system with a standard LQR (an optimal controller), a filtered spiking algorithm, and our spiking controller. As shown, the filtered spiking controller (Fig. 7B) approximates the control signal of an LQR (Fig. 7A), making the resulting trajectories in state space equivalent, while our spiking algorithm (Fig. 7C) deviates significantly from the optimal trajectory in state space. However, with the meta-parameters adjusted correctly, the network still traces the target with good accuracy (as shown in the error profile), while producing significantly fewer spikes than the filtered spiking controller.

3.2 Considerations on spiking control

Relation to prior spike-based control models

Previous works underlined the importance of each spike event in the computation of neural networks \citepboerlinPredictive2013, schwemmerConstructing2015, brendelLearning2020. Such spike-based computational models highlight the relevance of the temporal precision of spike events, where precise spike timing is determined by each neuron's contribution to a desired output. These models exhibit features similar to cortical networks, including irregular spiking and E/I balance \citepschwemmerConstructing2015. Many of these models are developed from SCN theory, which gives them a strong overlap with our framework but also introduces important distinctions (Sec. 4.3). In SCNs, as in our approach, spike activity is used to generate a signal that controls a dynamical system. However, SCNs use this signal to control their own network read-out (thereby influencing the dynamics of their internal state). This read-out can be quite complex and include synaptic kernels, requiring a form of predictive spiking \citepschwemmerConstructing2015, zeldenrustEfficient2021 similar to the one employed in our paradigm. In short, although SCN-related works share common aspects with our approach, they do not explicitly hypothesize that the computational role of spikes could be to control a separate downstream system. In contrast, our work generalizes to the control of any downstream linear system, rather than focusing solely on modulating the network's own internal state. Moreover, when SCN principles are employed to solve realistic control tasks \citephuangDynamical2018, slijkhuisClosedform2023, mooreneuron2024, the emphasis on a temporal, single-spike perspective is lost in favor of a rate-based or filtered-spike approach, which glosses over the role of each spike in the network's computation.

Future outlook

In neuroscience, single spike events used for control could be highly significant when modeling downstream actuator systems. Spike-based control is a promising modeling angle for explaining the control of downstream muscle actuators \citepbernikernormative2019, which ultimately allow movement to occur. This paradigm could be expanded to include models of body orientation in space as well as the control of more detailed models of limbs. Further down the line, spiking control could be a useful approach for interpreting how spikes influence the dynamics of the network that produces them. This has been theorized in previous work on SCNs \citepboerlinPredictive2013, deneveEfficient2016, which suggests that the computational role of spikes in the network is to control its dynamics, and that many physiological features of biological networks are a consequence of this necessity. In this case, the downstream system is the network itself (or a second downstream network), which means that our paradigm could serve as a tool to explain potential roles of the spike code in bio-plausible networks in general. As mentioned above, various works emerged from this initial SCN perspective to investigate the biophysical and computational properties of these networks, and their potential value for interpreting detailed neural models \citepschwemmerConstructing2015, zeldenrustEfficient2021. Not only do these works stem from the common root of SCN-based modeling, they also incorporate elements that we employ in our algorithm, such as different forms of predictive spiking. Revisiting these topics from our control-focused perspective could give interesting insights, for instance into the computational role of E/I balance \citeppodlaskiApproximating2024, or provide fresh perspectives on various properties of SNNs \citepmancooUnderstanding2020. Following a similar path, we could study the effect of bio-plausible additions to our framework, such as noise or network perturbations, analyzing for instance the robustness of our system \citepcalaimRobustness2021.

As described in Sec. 3.1.1, expanding our proof of concept to more complex nonlinear systems could be relevant for investigating the role of spikes in controlling realistic downstream actuators. This could include detailed models of muscle fibers for the control of movement \citepbernikernormative2019, or models of complex downstream networks. Furthermore, it could have use cases for neuromorphic control applications where better exploration of the full state space of the plant is needed. Tackling nonlinearity might require further refinements of the model, for instance by employing local linearization techniques \citepyesildirekFeedback1995 or by simultaneously predicting multiple future states. In the effort to apply this algorithm to more realistic use cases, another required step would be to adjust neural parameters according to the control problem. As mentioned in Sec. 2.5, many important parameters of the network are currently imposed to best fit the control case. Ideally, we would want the same network to be able to control different plants. This could be achieved by applying a discount factor on some parameters, which would allow us to adapt, for instance, the value of $f$ as the simulation progresses, finding the time window that best balances the number of input spikes against the accuracy with which the target is traced. Finally, we could introduce learning on all the parameters of the network, to make our control as flexible as possible across a wide range of downstream systems. Not having to assume completely observable plants would greatly expand the use cases of our algorithm and would allow testing of various neuroscience-derived hypotheses.

4 Methods

4.1 Simulation and network setup

Simulation details

All the simulations in this paper were implemented in Python using the following packages: NumPy (version 1.23.5) for matrix operations and numerical computations; SciPy (version 1.10.1) for solving differential equations and simulating trajectories using solve_ivp; Matplotlib (version 3.7.1) and Seaborn (version 0.13.2) for visualizing results, including raster plots, heatmaps, and system trajectories; and the Python control systems library 'control' (version 0.10.1) for the initialization of the continuous control example.

Time settings

In each single-SMD simulation (Figs. 4 to 7), the total time is $T = 50$ seconds and the timestep is $\delta t = 0.01$ s, dividing each run into $n_T = 5000$ timesteps. For the coupled SMD example (Sec. 2.7, Fig. 8), $T = 100$ s, $\delta t = 0.01$ s, and $n_T = 10000$.

Target settings

For every simulation, the target $\mathbf{z}$ is only defined for the position component of the system's state, and it is initialized at zero. For Figs. 4, 6, and 7, three target values are set: $z_{base} = 5$ from five seconds into the simulation, $z_{base} = 10$ after 15 seconds, and $z_{base} = 15$ after 30 seconds and until the end of the simulation. For Fig. 8 we use similar $z_{base}$ steps, but initialized such that each mass is attributed a different set of steps, which become progressively more positive or negative. In Fig. 5 we use larger and irregular steps, to better show the activity of the four neurons: $z_{base} = -30$ from five seconds into the simulation, $z_{base} = 0$ after 15 seconds, $z_{base} = -25$ after 20 seconds, and $z_{base} = 20$ after 30 seconds and until the end of the simulation. Between each value of $z_{base}$, the parameter $z_{leak} = 0.5$ governs the exponential shape of the target curve.

Neurons and spike mapping

For most runs, the number of neurons $N$ is fixed to two, except for Fig. 5 where $N = 4$. In the scaled-up example of coupled oscillators (Fig. 8), $N = 500$. For most simulations, $\mathbf{B}$ is kept fixed such that each spike kicks the velocity component by $2$ or $-2$ (the values relative to the positions are kept at zero). One exception is the physically unconstrained case of Fig. 4, where $\mathbf{B}$ also allows spikes onto the position by the same amounts ($2$ and $-2$). For Fig. 5, the $\mathbf{B}$ values are sampled from a normal distribution, with the second column (velocity) normalized to have norm $1$; the resulting values are then doubled. The same normalized $\mathbf{B}$ values are quadrupled in the coupled oscillator simulation. In these two cases, the dimensions of matrix $\mathbf{B}$ are adjusted so that each of the $N = 4$ and $N = 500$ neurons has its spikes mapped onto the system.

Future time window and spike cost

The future window $f$ is kept at $0.3$ for the predictive simulations in Figs. 4, 5 and 7. When using reactive control, $f$ is set to zero. For Fig. 6, the $f$ values in the subsequent simulations are linearly spaced from $0.1$ to $1$ (included), with the fourth value rounded up to $0.25$. The spike cost $\mu$ is fixed at $0.3$ for the simulations in Figs. 4 and 7. For Fig. 5, $\mu$ is set to $0.1$. In Fig. 6, the $\mu$ values in the subsequent simulations are linearly spaced from $0.02$ to $1$ (included), with the third value rounded up to $0.25$.

Initial state and system matrix

The initial state of the simulation $\mathbf{x}_0$ (at $t = 0$) is always kept at $[0, 0]$. The system matrix $\mathbf{A}$ is also kept the same for all the single-SMD simulations (since we control the same SMD): the first row is set to $[0, 0.5]$ and the second to $[-0.1, -0.1]$. For Fig. 8, the variable $M = 10$ is introduced to indicate the number of masses to control. To test robustness, we silence 180 random neurons 30 seconds into the simulation, and another 180 after 70 seconds. In this example, $\mathbf{A}$ is scaled up to the dimensionality of the coupled system based on $M$ (and $N$), and coupling values $\gamma = -0.3$ are introduced to make the positions and accelerations of the masses dependent on each other.
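As a reference for these settings, the following sketch builds the single-SMD system matrix and initial state with the values listed above; the coupled-oscillator block structure shown is our own guess at one possible nearest-neighbour layout and is only meant as an illustration.

\begin{verbatim}
import numpy as np

# Single SMD: state x = [position, velocity], values as reported above.
A = np.array([[ 0.0,  0.5],
              [-0.1, -0.1]])
x0 = np.zeros(2)

# Hypothetical coupled-oscillator layout: each mass keeps its own SMD block,
# and gamma couples neighbouring positions into the acceleration equations.
M, gamma = 10, -0.3
A_coupled = np.kron(np.eye(M), A)
for m in range(M - 1):
    A_coupled[2 * m + 1, 2 * (m + 1)] += gamma   # right neighbour's position
    A_coupled[2 * (m + 1) + 1, 2 * m] += gamma   # left neighbour's position
\end{verbatim}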

Continuous and filtered-spiking parameters

The examples of continuous and filtered-spiking control in Fig. 7A,B largely use the same settings as the other single-SMD simulations. In the continuous control case, we introduce an LQR gain matrix $\mathbf{K}$ to compute the control signal, initialized using the control.lqr function from the Python control package. To initialize $\mathbf{K}$, we use the matrix $\mathbf{B}$ that maps the control signal onto the velocity of the system, set to $[0, 0.25]$. We also need a control cost $R = 0.001$ (analogous to $\mu$ in the spiking control case), and a $K \times K$ matrix $\mathbf{Q}$ that introduces a cost on the dimensions of the system (analogous to $\mathbf{C}$ in our spiking paradigm). The first row of $\mathbf{Q}$ is set to $[1, 0]$ and the second to $[0, 1]$. As described in Sec. 4.3, for our filtered spiking control we need a matrix $\mathbf{D} = [1, -1]$ to map the filtered spike traces onto the network read-out. For the example of Fig. 7B, we use a spiking cost $\mu_f = 0.1$ and a decay factor $\lambda = 1$ for the filtered spike traces.
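A minimal sketch of the continuous-control initialization, using the control.lqr call and the parameter values mentioned above (the helper function name is ours):

\begin{verbatim}
import numpy as np
import control

A = np.array([[ 0.0,  0.5],
              [-0.1, -0.1]])
B_lqr = np.array([[0.0], [0.25]])   # control acts on the velocity
Q = np.eye(2)                        # state cost, analogous to C
R = np.array([[0.001]])              # control cost, analogous to mu

K, S, E = control.lqr(A, B_lqr, Q, R)  # gain, Riccati solution, closed-loop eigenvalues

def lqr_control(x, z):
    """Continuous LQR control signal u = -K (x - z)."""
    return -K @ (x - z)
\end{verbatim}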

4.2 Derivation

4.2.1 Network

In our work, we derive a spiking neural network (SNN) of leaky integrate-and-fire (LIF) neurons from control principles. A network of $N$ such neurons is described by the voltage dynamics

\begin{equation}
\dot{\mathbf{V}}(t) = -\lambda\mathbf{V}(t) + \mathbf{F}\mathbf{x}(t) + \mathbf{\Omega}\mathbf{s}(t), \tag{7}
\end{equation}

where $\mathbf{V}(t) \in \mathbb{R}^{N}$ are the neurons' voltages, $\lambda$ determines the membrane leak time constant, $\mathbf{x}(t) \in \mathbb{R}^{K}$ are the inputs, $\mathbf{F} \in \mathbb{R}^{N \times K}$ are the forward weights, $\mathbf{\Omega} \in \mathbb{R}^{N \times N}$ are the recurrent weights, and $\mathbf{s}(t) \in \mathbb{R}^{N}$ are the neural spike trains. A spike is generated whenever a neuron's voltage crosses its threshold $T_i$, and a spike train is described as a sum of Dirac delta functions, $s_i(t) = \sum_{t_j} \delta(t - t_j)$. Whenever a neuron spikes, its voltage is reset to a resting potential $R_i$. From here on we generally leave out the time notation $(t)$ for notational clarity, unless it is important for a specific derivation step.
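As an illustration of how these dynamics could be simulated, the sketch below performs one forward-Euler step of Eq. (7) with a threshold-and-reset rule; the integration order and the $1/\delta t$ approximation of the Dirac delta are our own simulation choices, not a statement of the exact implementation.

\begin{verbatim}
import numpy as np

def step_network(V, x, F, Omega, T, R_rest, lam=1.0, dt=0.01):
    """One forward-Euler step of Eq. (7) followed by threshold and reset."""
    V = V + dt * (-lam * V + F @ x)         # leak plus feed-forward drive
    spiking = V >= T                        # threshold crossings
    s = np.where(spiking, 1.0 / dt, 0.0)    # delta spikes approximated over one step
    V = np.where(spiking, R_rest, V)        # reset spiking neurons to rest
    V = V + dt * (Omega @ s)                # recurrent effect of the emitted spikes
    return V, s
\end{verbatim}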

4.2.2 Dynamical System

We consider a $K$-dimensional linear dynamical system over which we want to directly exert control using spike events. The system is described by the following equation:

\begin{equation}
\dot{\mathbf{x}} = \mathbf{A}\mathbf{x} + \mathbf{B}\mathbf{s}, \tag{8}
\end{equation}

where $\mathbf{x} \in \mathbb{R}^{K}$ represents the state of the system (one value for each dimension of the system), $\mathbf{A} \in \mathbb{R}^{K \times K}$ is a matrix which defines the transformations of $\mathbf{x}$ that give rise to the dynamics (for instance, oscillatory behavior), $\mathbf{B} \in \mathbb{R}^{K \times N}$ is a matrix that maps the spike events onto the system, and $\mathbf{s} \in \mathbb{R}^{N}$ are the spike trains, which represent our control signal.

Since the system is linear, we can solve it analytically (Sec. 4.1) to derive the state of the system at a future time $f$, starting from the current state $\mathbf{x}_0$.

No spike condition

In the absence of any spiking control signal, we obtain:

\begin{equation}
\mathbf{x}_f = e^{\mathbf{A}f}\mathbf{x}_0, \tag{9}
\end{equation}

where $\mathbf{x}_0 = \mathbf{x}(t)$ is the current state of the system, $\mathbf{x}_f = \mathbf{x}(t+f)$ is the predicted state after $f$ seconds, and $e^{\mathbf{A}f}$ is the matrix exponential of $\mathbf{A}$ times the future window $f$. We introduce the shorthand $e^{\mathbf{A}f} = \mathbf{A}_f$ for ease of writing, resulting in:

\begin{equation}
\mathbf{x}_f = \mathbf{A}_f\mathbf{x}_0. \tag{10}
\end{equation}

Neuron $i$ spikes condition

We now consider the effect of a single neuron $i$ spiking on the current state of the system, and how that affects the state of the system $f$ seconds into the future. For the condition where neuron $i$ spikes, the spike vector $\mathbf{s}$ is zero for all elements and one for the element corresponding to the spiking neuron $i$. The operation $\mathbf{B}\mathbf{s}$ in Eq. (8) then amounts to selecting the $i$th column of matrix $\mathbf{B}$, which we denote $\mathbf{b}_i$. We can write the predicted state of the system after a spike as:

\begin{equation}
\mathbf{x}_f = \mathbf{A}_f(\mathbf{x}_0 + \mathbf{b}_i), \tag{11}
\end{equation}

where $\mathbf{b}_i$ is the $i$th column of the matrix $\mathbf{B}$.
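In simulation, both predictions reduce to a single matrix exponential; a minimal sketch (our own helper, using scipy.linalg.expm):

\begin{verbatim}
import numpy as np
from scipy.linalg import expm

def predicted_states(A, B, x0, f):
    """Predicted state f seconds ahead without a spike (Eq. 10)
    and with a single spike from each neuron i (Eq. 11)."""
    A_f = expm(A * f)                   # shorthand A_f = e^{Af}
    x_no_spike = A_f @ x0               # Eq. (10)
    x_spike = A_f @ (x0[:, None] + B)   # Eq. (11), one column per neuron
    return A_f, x_no_spike, x_spike
\end{verbatim}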

4.2.3 Losses

To control the system, we set a target state $\mathbf{z}$ which holds one value to be reached for each dimension of the system. To quantify the distance from this target, we define a loss as the $L^2$ norm of the difference between the target and the predicted state:

\begin{equation}
L = ||\mathbf{z} - \mathbf{x}_f||^2 + \mu||\mathbf{s}||^2, \tag{12}
\end{equation}

where $\mathbf{z} \in \mathbb{R}^{K}$ is the target state, and $\mu$ is a parameter that scales the cost of each spike. In the most general case, we can weight the part of this loss that describes the distance to the target. We introduce the matrix $\mathbf{C}$, which represents a cost on the dimensions of the system. This allows us to penalize deviations along specific dimensions of the system, weighting the input to the controller.

The loss then becomes:

\begin{equation}
L = (\mathbf{z} - \mathbf{x}_f)^T\mathbf{C}(\mathbf{z} - \mathbf{x}_f) + \mu||\mathbf{s}||^2, \tag{13}
\end{equation}

where $\mathbf{C} \in \mathbb{R}^{K \times K}$ is a cost matrix over the dimensions of the system. We can express this loss in the presence and absence of spikes.

Loss with no spike

\begin{equation}
L_{\text{ns}} = (\mathbf{z} - \mathbf{A}_f\mathbf{x}_0)^T\mathbf{C}(\mathbf{z} - \mathbf{A}_f\mathbf{x}_0). \tag{14}
\end{equation}
Loss with spike

\begin{equation}
L_{\text{s}} = (\mathbf{z} - \mathbf{A}_f(\mathbf{x}_0 + \mathbf{b}_i))^T\mathbf{C}(\mathbf{z} - \mathbf{A}_f(\mathbf{x}_0 + \mathbf{b}_i)) + \mu||\mathbf{s}||^2. \tag{15}
\end{equation}

If only neuron $i$ spikes in the current time step, $\mathbf{s}$ is a vector containing only zeros except for the $i$th position, which is one. We name this vector $\mathbf{s}_i$. Under this assumption, we can rewrite $||\mathbf{s}||^2$ as:

\begin{equation}
||\mathbf{s}||^2 = \mathbf{s}_i^T\mathbf{s}_i = 1, \tag{16}
\end{equation}

which would give us the following loss:

\begin{equation}
L_{\text{s}} = (\mathbf{z} - \mathbf{A}_f(\mathbf{x}_0 + \mathbf{b}_i))^T\mathbf{C}(\mathbf{z} - \mathbf{A}_f(\mathbf{x}_0 + \mathbf{b}_i)) + \mu. \tag{17}
\end{equation}
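The greedy spike decision then amounts to evaluating Eqs. (14) and (17) for every neuron; a minimal sketch (function and variable names are ours):

\begin{verbatim}
import numpy as np

def greedy_spike_proposals(A_f, B, C, x0, z, mu):
    """Return, for each neuron i, whether spiking would not increase the loss."""
    err_ns = z - A_f @ x0
    L_ns = err_ns @ C @ err_ns                   # Eq. (14)
    L_s = np.empty(B.shape[1])
    for i in range(B.shape[1]):                  # loop over neurons
        err_s = z - A_f @ (x0 + B[:, i])
        L_s[i] = err_s @ C @ err_s + mu          # Eq. (17)
    return L_s <= L_ns
\end{verbatim}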

4.2.4 Neural Variables

As mentioned in Sec. 3.2, previous works conceptualized spiking neurons as capable of directly encoding dynamical variables, and derived a "greedy" spiking rule where a neuron $i$ fires whenever this results in a decrease of a cost function, which usually measures the distance between the dynamical variable $x$ and its estimate $\hat{x}$, or a target $z$. This is a fully local rule: a neuron's decision to fire depends solely on its current membrane potential and its threshold, which in turn depend on information locally accessible to the neuron at that moment. The neuron does not need global information about the activity of the entire network. This rule guarantees that each spike emitted by a neuron contributes to improving the representation (or the control) of the dynamical variable, and therefore has a clear interpretation within the computational objective of the network. We derive our network starting from this simple firing condition:

\begin{equation}
L_{\text{s}} \leq L_{\text{ns}} \tag{18}
\end{equation}

from which:

\begin{equation}
\begin{split}
&(\mathbf{z} - \mathbf{A}_f\mathbf{x}_0 - \mathbf{A}_f\mathbf{b}_i)^T\mathbf{C}(\mathbf{z} - \mathbf{A}_f\mathbf{x}_0 - \mathbf{A}_f\mathbf{b}_i) + \mu \\
&- (\mathbf{z} - \mathbf{A}_f\mathbf{x}_0)^T\mathbf{C}(\mathbf{z} - \mathbf{A}_f\mathbf{x}_0) \leq 0.
\end{split}
\tag{19}
\end{equation}

Expanding the squares:

\begin{equation}
-2(\mathbf{A}_f\mathbf{b}_i)^T\mathbf{C}(\mathbf{z} - \mathbf{A}_f\mathbf{x}_0) + \mu + (\mathbf{A}_f\mathbf{b}_i)^T\mathbf{C}(\mathbf{A}_f\mathbf{b}_i) \leq 0 \tag{20}
\end{equation}

and rearranging these terms, we obtain the inequality:

\begin{equation}
\frac{\mathbf{b}_i^T\mathbf{A}_f^T\mathbf{C}\mathbf{A}_f\mathbf{b}_i}{2} + \mu \leq \mathbf{b}_i^T\mathbf{A}_f^T\mathbf{C}(\mathbf{z} - \mathbf{A}_f\mathbf{x}_0). \tag{21}
\end{equation}

On the left side of this inequality, all the factors are time-independent. We can attribute these to the threshold of neuron $i$,

\begin{equation}
T_i = \frac{\mathbf{b}_i^T\mathbf{A}_f^T\mathbf{C}\mathbf{A}_f\mathbf{b}_i}{2} + \mu. \tag{22}
\end{equation}

On the right side, all the time-dependent factors can be attributed to the voltage of that neuron:

\begin{equation}
V_i = \mathbf{b}_i^T\mathbf{A}_f^T\mathbf{C}(\mathbf{z} - \mathbf{A}_f\mathbf{x}_0). \tag{23}
\end{equation}

This gives us a spiking condition for neuron $i$: $V_i \geq T_i$.
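Equivalently, the greedy decision can be checked through the derived voltage and threshold; a short sketch of Eqs. (22) and (23) for all neurons at once (names are ours, using the matrix $\mathbf{G}$ defined further below):

\begin{verbatim}
import numpy as np

def voltages_and_thresholds(A_f, B, C, x0, z, mu):
    """Per-neuron V_i (Eq. 23) and T_i (Eq. 22); spiking allowed where V >= T."""
    G = B.T @ A_f.T @ C                                # see Eq. (25) below
    V = G @ (z - A_f @ x0)                             # Eq. (23)
    T = 0.5 * np.einsum('ij,ji->i', G, A_f @ B) + mu   # Eq. (22): diag(G A_f B)/2 + mu
    return V, T, V >= T
\end{verbatim}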

If we do not impose any constraint on the dimensions of the system, the matrix $\mathbf{C}$ becomes an identity matrix $\mathbf{I}$, and we recover the voltage and threshold values described in Eq. (4) and Eq. (5) of Sec. 2.1.2.

Stepping back from the single-neuron perspective and considering instead the voltages of all the neurons in the network, we obtain:

\begin{equation}
\mathbf{V} = \mathbf{B}^T\mathbf{A}_f^T\mathbf{C}(\mathbf{z} - \mathbf{A}_f\mathbf{x}_0). \tag{24}
\end{equation}

From there, we define:

\begin{equation}
\mathbf{G} = \mathbf{B}^T\mathbf{A}_f^T\mathbf{C}, \tag{25}
\end{equation}

obtaining:

\begin{equation}
\mathbf{V} = \mathbf{G}(\mathbf{z} - \mathbf{A}_f\mathbf{x}_0). \tag{26}
\end{equation}

We now consider the rate of change of $\mathbf{V}$, as well as the rate of change of all time-dependent variables:

\begin{equation}
\dot{\mathbf{V}} = \mathbf{G}(\dot{\mathbf{z}} - \mathbf{A}_f\dot{\mathbf{x}}_0), \tag{27}
\end{equation}

where the change of the current state of the system over time, $\dot{\mathbf{x}}_0$, is given by Eq. (8):

\begin{equation}
\dot{\mathbf{x}}_0 = \mathbf{A}\mathbf{x}_0 + \mathbf{B}\mathbf{s}. \tag{28}
\end{equation}

Substituting this into Eq. (27) yields:

\begin{align}
\dot{\mathbf{V}} &= \mathbf{G}(\dot{\mathbf{z}} - \mathbf{A}_f(\mathbf{A}\mathbf{x}_0 + \mathbf{B}\mathbf{s})) \tag{29} \\
&= \mathbf{G}(\dot{\mathbf{z}} - \mathbf{A}_f\mathbf{A}\mathbf{x}_0 - \mathbf{A}_f\mathbf{B}\mathbf{s}) \tag{30} \\
&= \mathbf{G}\dot{\mathbf{z}} - \mathbf{G}\mathbf{A}_f\mathbf{A}\mathbf{x}_0 - \mathbf{G}\mathbf{A}_f\mathbf{B}\mathbf{s}. \tag{31}
\end{align}

We then add and subtract $\mathbf{V}$ on the right side of the equation. We leave the $-\mathbf{V}$ as is, while we substitute the $+\mathbf{V}$ using the definition of $\mathbf{V}$ from Eq. (26), obtaining:

\begin{equation}
\dot{\mathbf{V}} = -\mathbf{V} + \mathbf{G}\dot{\mathbf{z}} - \mathbf{G}\mathbf{A}_f\mathbf{A}\mathbf{x}_0 + \mathbf{G}\mathbf{z} - \mathbf{G}\mathbf{A}_f\mathbf{x}_0 - \mathbf{G}\mathbf{A}_f\mathbf{B}\mathbf{s}; \tag{32}
\end{equation}

we can rewrite this as

\begin{equation}
\dot{\mathbf{V}} = -\mathbf{V} + \mathbf{G}(\dot{\mathbf{z}} + \mathbf{z}) - \mathbf{G}\mathbf{A}_f(\mathbf{A} + \mathbf{I})\mathbf{x}_0 - \mathbf{G}\mathbf{A}_f\mathbf{B}\mathbf{s}. \tag{33}
\end{equation}

We can trace this back to the form of a recurrent network of integrate-and-fire neurons:

\begin{equation}
\dot{\mathbf{V}} = -\mathbf{V} + \mathbf{G}(\dot{\mathbf{z}} + \mathbf{z}) - \mathbf{F}\mathbf{x}_0 - \mathbf{\Omega}\mathbf{s}, \tag{34}
\end{equation}

where $\mathbf{F} = \mathbf{G}\mathbf{A}_f(\mathbf{A} + \mathbf{I})$ and $\mathbf{\Omega} = \mathbf{G}\mathbf{A}_f\mathbf{B} = \mathbf{B}^T\mathbf{A}_f^T\mathbf{C}\mathbf{A}_f\mathbf{B}$.

This full network equation describes the changes in $\mathbf{V} \in \mathbb{R}^{N}$, the membrane potentials. Here, $\mathbf{G} \in \mathbb{R}^{N \times K}$ are the target input weights: forward connections that encode the target $\mathbf{z}$. Considering both the change of the target value over time, $\dot{\mathbf{z}}$, and the actual value of the target, $\mathbf{z}$, allows the network to trace rapid changes of the target. Only encoding the current value of $\mathbf{z}$ would still allow the system to be controlled, but with slower tracing of the target. The matrix $\mathbf{F} \in \mathbb{R}^{N \times K}$ contains the state input weights, the forward connections that encode the current system state $\mathbf{x}_0$; $\mathbf{\Omega} \in \mathbb{R}^{N \times N}$ are the recurrent weights, and $\mathbf{s}$ are the spikes. Starting from a greedy spiking condition, we thus obtained a network of leaky integrate-and-fire (LIF) neurons in the form of Eq. (7). Such a network has a voltage decay term, forward weights that map the target onto the network, forward weights that encode the system state in input, recurrent connectivity, and spikes, and is able to control our linear dynamical system. For a $K$-dimensional system to control and a network of $N$ neurons, $\mathbf{B}$ is a $K \times N$ matrix, while $\mathbf{A}_f$ and $\mathbf{C}$ are $K \times K$ matrices. Thus, $\mathbf{\Omega} = \mathbf{B}^T\mathbf{A}_f^T\mathbf{C}\mathbf{A}_f\mathbf{B}$ is of rank at most $K$. Hence, our spiking control solution uses a network with low-rank connectivity (low-rank with respect to the system that it is controlling).
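The closed-form connectivity can therefore be assembled directly from the system matrices; a minimal sketch (our own helper function, under the definitions above):

\begin{verbatim}
import numpy as np
from scipy.linalg import expm

def network_connectivity(A, B, C, f):
    """Target weights G (Eq. 25), state weights F, and recurrent weights Omega (Eq. 34)."""
    A_f = expm(A * f)
    G = B.T @ A_f.T @ C
    F = G @ A_f @ (A + np.eye(A.shape[0]))
    Omega = G @ A_f @ B        # = B^T A_f^T C A_f B, rank at most K
    return G, F, Omega
\end{verbatim}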

4.3 Relation to spike coding networks

Works on SCNs use a derivation very similar to the one described in Sec. 4.2. In these works, a read-out is usually defined in the following form:

\begin{equation}
\mathbf{y} = \mathbf{D}\mathbf{r}, \tag{35}
\end{equation}

where $\mathbf{D} \in \mathbb{R}^{K \times N}$ contains the decoding or output weights of all neurons, and $\mathbf{y} \in \mathbb{R}^{K}$ is the $K$-dimensional network read-out (sometimes also seen as the network state). $\mathbf{r} \in \mathbb{R}^{N}$ represents some form of filtered spike trains, ranging from simple exponential filtering (such that $\dot{\mathbf{r}} = -\lambda\mathbf{r} + \mathbf{s}$) to more complex post-synaptic response kernels. In both cases, spiking is optimized such that the output of the network progressively tracks its input $\mathbf{x}$, and a loss between these two quantities can be defined:

\begin{equation}
L = ||\mathbf{x} - \mathbf{D}\mathbf{r}||^2. \tag{36}
\end{equation}

Much like in our derivation, voltage dynamics are derived from this loss by considering the rule that a spike should only occur if $L(\text{neuron spikes}) < L(\text{no spike})$, which we took inspiration from for the control case. Works that make use of complex kernels either also employed some prediction of the future state \citepschwemmerConstructing2015, or used a delayed signal to explicitly model the difference between past and current state \citepzeldenrustEfficient2021.
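For comparison with our derivation, the SCN read-out of Eq. (35) with exponential filtering can be written in a few lines (a sketch, with names of our choosing):

\begin{verbatim}
import numpy as np

def scn_readout_step(r, s, D, lam=1.0, dt=0.01):
    """One Euler step of r_dot = -lambda r + s, followed by the read-out y = D r."""
    r = r + dt * (-lam * r + s)
    y = D @ r
    return r, y
\end{verbatim}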

These derivations are largely similar to ours, with the crucial differences that we consider downstream linear systems of a general form, and that previous SCN works have mainly considered a single read-out given by filtered spike trains. In this sense, our work can be seen as a generalization of standard SCN paradigms (which exert ‘control’ mainly over their own read-out), which expands the use of similar principles to explicitly control any downstream linear dynamical system.

4.4 Filtered spiking controller

For the filtered spiking controller in Fig. 7B, we used a more direct extension of SCNs. Here, we consider a network that should produce a read-out matching the optimal control signal expected from an LQR controller. This means our network spikes so as to constrain the loss between the network output and the LQR control signal:

\begin{equation}
L = ||\mathbf{u} - \mathbf{D}\mathbf{r}||^2, \tag{37}
\end{equation}

where $\mathbf{u} = -\mathbf{K}(\mathbf{x} - \mathbf{z})$ is the control signal as it would be generated by an LQR controller. When defining the LQR signal to approximate, we consider a $K$-dimensional system controlled by a $W$-dimensional control signal. The matrix $\mathbf{K} \in \mathbb{R}^{W \times K}$ is then the optimal feedback gain matrix, which can be found by solving an algebraic Riccati equation, given assumptions on the cost of state deviations and control actuation \citepbruntonDatadriven2019. This solution ensures that the control law $\mathbf{u} = -\mathbf{K}(\mathbf{x} - \mathbf{z})$ optimally balances tracking performance and control cost when controlling the linear system. Spiking activity that constrains the loss described in Eq. (37) can be achieved by simply using a standard SCN, while considering $-\mathbf{K}(\mathbf{x} - \mathbf{z})$ as the input instead of just $\mathbf{x}$.

Acknowledgments

This publication is part of the project Dutch Brain Interface Initiative (DBI2) with project number 024.005.022 of the research programme Gravitation, which is financed by the Dutch Ministry of Education, Culture and Science (OCW) via the Dutch Research Council (NWO). The authors thank Luc Selen for advice on the model's design and conceptualization.

References

\printbibliography[heading=none]