
Spiking neurons as predictive controllers of linear systems

Paolo Agliati¹, André Urbano², Pablo Lanillos², Nasir Ahmad¹, Marcel van Gerven¹, Sander Keemink¹

¹Department of Machine Learning and Neural Computing, Donders Institute for Brain, Cognition and Behaviour, Radboud University, Nijmegen, the Netherlands
²Cajal International Neuroscience Center, Spanish National Research Council, Madrid, Spain
Abstract

Neurons communicate with downstream systems via sparse and incredibly brief electrical pulses, or spikes. Using these events, they control various targets such as neuromuscular units, neurosecretory systems, and other neurons in connected circuits. This gave rise to the idea of spiking neurons as controllers, in which spikes are the control signal. Using instantaneous events directly as the control inputs, also called ‘impulse control’, is challenging as it does not scale well to larger networks and has low analytical tractability. Therefore, current spiking control usually relies on filtering the spike signal to approximate analog control. This ultimately means spiking neural networks (SNNs) have to output a continuous control signal, necessitating continuous energy input into downstream systems. Here, we circumvent the need for rate-based representations, providing a scalable method for task-specific spiking control with sparse neural activity. In doing so, we take inspiration from both optimal control and neuroscience theory, and define a spiking rule where spikes are only emitted if they bring a dynamical system closer to a target. From this principle, we derive the required connectivity for an SNN, and show that it can successfully control linear systems. We show that for physically constrained systems, predictive control is required, and the control signal ends up exploiting the passive dynamics of the downstream system to reach a target. Finally, we show that the control method scales to both high-dimensional networks and systems. Importantly, in all cases, we maintain a closed-form mathematical derivation of the network connectivity, the network dynamics and the control objective. This work advances the understanding of SNNs as biologically-inspired controllers, providing insight into how real neurons could exert control, and enabling applications in neuromorphic hardware design.

1 Introduction


Figure 1: (A) Scheme of a feedback controller and a downstream system. The current state of the system $\mathbf{x}(t)$ is given as input to the controller, along with a target value $\mathbf{z}(t)$. The controller outputs a control signal $\mathbf{u}(t)$ as a function of both inputs. $\mathbf{u}(t)$ is then mapped onto the dynamics of the downstream system via the $\mathbf{B}$ matrix. The system's dynamics are governed by the $\mathbf{A}$ matrix. (B-D) Examples of three feedback control paradigms. (B) The continuous LQR uses a gain matrix $\mathbf{K}$ to directly compute the control signal for a given state and target pair. (C) In the filtered spiking case, the output spikes of the network are filtered to obtain $\mathbf{r}(t)$, which is decoded by the matrix $\mathbf{D}$ to form the control signal $\mathbf{u}(t)$ (adapted from \citep{slijkhuisClosedform2023}). (D) In our spiking control paradigm, the output spikes $\mathbf{s}(t)$ represent the control input itself, and are therefore directly mapped onto the system dynamics through the $\mathbf{B}$ matrix.

In the vertebrate nervous system, action potentials are one of the universal principles of communication between neurons \citep{kochBiophysics2004, gollischRapid2008, wolfeSparse2010, boerlinSpikebased2011, gutigspike2014, srivastavaMotor2017}. The spiking behavior of cortical neurons confers efficient and robust computational properties, providing the basis for the realization of the cortex's complex adaptive functions \citep{mainenReliability1995, kochBiophysics2004, bazhenovRole2005, gollischRapid2008, wolfeSparse2010, boerlinSpikebased2011}. At the level of single neurons, spike events are incredibly brief and sparse, and through synaptic communication they can affect the dynamics of neurons in downstream areas. This property, whereby neurons not only communicate information but also modify behavior, laid the basis for viewing neurons as controllers \citep{cisekcomputer1999, ahissarPerception2016} and spikes as control signals. Clear examples of spike-driven control can be found in movement control studies, where spikes allow animals to orient their head in space \citep{zhangRepresentation1996}, as well as control the movement of their eyes \citep{seungHow1996} or limbs \citep{bernikernormative2019}. This control perspective can offer important insights into neural computation, but is also increasingly relevant for neuromorphic applications where spiking networks are used to control downstream plants \citep{stagstedneuromorphic2020, polykretisspiking2022, liuSpiking2023}.

Although the theory and applications of control with spiking networks have recently seen advances, it remains unclear how to generate each spike so as to successfully control a plant's dynamics. Traditionally, control theory relies on optimal control methods, such as the linear quadratic regulator (LQR) \citep{andersonOptimal2007}, which continuously adjusts control inputs to drive a system toward a target state while minimizing a cost function (Fig. 1B). Although effective, these continuous control paradigms are difficult to reconcile with the discrete nature of SNNs. Attempts at bridging this gap include filtered-spikes approaches, which apply classic optimal control principles to SNNs by approximating LQR signals through rate codes or filtered spike trains \citep{boerlinPredictive2013, deneveEfficient2016, huangDynamical2018, guoNeural2021, slijkhuisClosedform2023}, as shown in Fig. 1C. Importantly, in these filtered-spikes approaches, spikes serve as the network's representation of the system state but do not themselves constitute the control signal, which remains continuous in nature. This makes it difficult to trace the effect each single spike has on the trajectory of the controlled system. Alternative approaches to spike-based control in the literature focus on individual neuron-based controllers \citep{mooreneuron2024}, or implement forms of instantaneous control \citep{yangImpulsive1999}. Such impulsive control methods have been widely studied, but they mostly take a control-theoretic perspective, sometimes even bypassing the need for a network \citep{yangStability2007, raffObservers2007, wangImpulsive2014, ouyangImpulsive2020}. Only recently have some works started integrating this framework with theoretical neuroscience concepts, explicitly interpreting the control impulses as the neurons' action potentials \citep{petriAnalysis2024}. Still, network architectures are imposed rather than derived from control objectives. For this reason, the integration of bio-informed SNNs with (feedback) control remains an under-explored area. A new paradigm is needed to account for direct spiking control and its seamless integration with computational neuroscience theories.

Our study investigates how spike events can influence the dynamics of linear systems, using spikes directly as control signals. We consider this problem from the perspective of control theory, by formally deriving how neurons should spike to control a linear dynamical system (Fig. 1). This equates to generating control inputs that can successfully drive the dynamics of a downstream plant to minimize a certain loss associated with the state of the system. When deriving a solution for this problem (Sec. 4.2), we obtain, without needing to introduce further assumptions, a feedback control scenario like the one depicted in Fig. 1A. Our work embeds this instantaneous spiking control within theoretical neuroscience constraints.

Inspired by the neuroscience theory of spike coding networks (SCNs) \citep{boerlinPredictive2013, gutigspike2014, deneveEfficient2016, guoNeural2021}, we derive, from optimal control principles, a network of recurrent integrate-and-fire (IF) neurons with sparse activity and low-rank connectivity. By directly incorporating spikes as control signals (Fig. 1D), our model ensures that each neuron's contribution to the system's trajectory is explicit. Between spike events $\mathbf{s}(t)$, the system evolves in open loop according to its own dynamics, and whenever a control signal is produced, the system switches to closed-loop control (with $\mathbf{s}(t)$ being the control input). Our solution naturally exploits the intrinsic dynamics of the plant, instead of continuously inputting a signal over time. This approach could help explain how neural circuits execute control tasks efficiently while preserving key features of biological networks. In the following sections, we present our framework, analyze the model's performance on a linear dynamical system, and discuss its implications for both neuroscience and control theory.

2 Results

2.1 Formulating the control problem

To investigate the role of spiking activity in controlling linear dynamical systems, we start by revisiting the fundamental components of a control task. We then introduce our approach to modeling spike-based control, and finally explore the consequences of applying this method to physically constrained systems.

2.1.1 General control setup

The control of a generic $K$-dimensional linear dynamical system through an $N$-dimensional control signal, as illustrated in Fig. 1A, can be expressed as:

$\dot{\mathbf{x}} = \mathbf{A}\mathbf{x} + \mathbf{B}\mathbf{u}\,,$ (1)

where $\mathbf{x} \in \mathbb{R}^{K}$ is the system state, $\mathbf{A} \in \mathbb{R}^{K \times K}$ is the system matrix, describing the system's uncontrolled dynamics, $\mathbf{B} \in \mathbb{R}^{K \times N}$ is the input matrix, which maps the control input onto the system, and $\mathbf{u} \in \mathbb{R}^{N}$ is the control input.

In classical control, $\mathbf{u}$ is typically a continuous signal, as in LQR control, where the optimal control input is computed to minimize a quadratic cost function (Fig. 1B). For control systems that make use of neural networks, the control signal can instead be derived from neural activity. In spiking networks, a filtered-spike signal can be employed, where the firing rates of neurons determine the value of the control input at each timestep (Fig. 1C). Here, the spikes aid the network's internal representation of the system, but the control input still approximates a continuous control signal \citep{boerlinPredictive2013, deneveEfficient2016, huangOptimizing2017, huangDynamical2018, guoNeural2021, slijkhuisClosedform2023, mooreneuron2024}. In our work, we instead consider a sparse and discrete control signal, where the input consists of brief pulses or 'kicks' (Fig. 1D) \citep{yangImpulsive1999, yangStability2007, raffObservers2007, wangImpulsive2014, ouyangImpulsive2020, petriAnalysis2024}. The system then becomes:

$\dot{\mathbf{x}} = \mathbf{A}\mathbf{x} + \mathbf{B}\mathbf{s}\,,$ (2)

where $\mathbf{s} \in \mathbb{R}^{N}$ describes the on/off state of each of the $N$ neurons in the network. A spike is emitted every time the voltage $V_i$ of neuron $i$ reaches its threshold $T_i$, resulting in the spike train $s_i(t) = \sum_j \delta(t - t_j)$, with $t_j$ indexing the spike times. Importantly, here $\delta$ is the Dirac delta function, which is zero everywhere except at the spike times $t_j$, where it integrates to one. This is different from other forms of impulsive control such as bang-bang control or chattering control \citep{felgenhauerstability2003}, where the instantaneous input is maintained for a certain period of time.
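To make the contrast with analog control concrete, Eq. (2) can be integrated numerically with each Dirac spike implemented as an instantaneous jump of the state by the corresponding column of $\mathbf{B}$. The sketch below is illustrative only (not the authors' code); the forward-Euler step, the parameter values and the `spike_fn` interface are our assumptions.

```python
import numpy as np

# Minimal sketch of Eq. (2): forward-Euler integration of a linear plant,
# where each spike is a Dirac delta implemented as an instantaneous kick
# of the state by the corresponding column of B. Illustrative only.
def simulate_impulse_control(A, B, x0, spike_fn, dt=1e-3, T=5.0):
    """spike_fn(t, x) returns a neuron index to spike, or None."""
    steps = int(T / dt)
    x = np.asarray(x0, dtype=float).copy()
    xs = np.empty((steps, x.size))
    for k in range(steps):
        t = k * dt
        x = x + dt * (A @ x)      # open-loop drift between spikes
        i = spike_fn(t, x)
        if i is not None:
            x = x + B[:, i]       # instantaneous kick from the delta spike
        xs[k] = x
    return xs

# Toy usage: a damped 2D oscillator kicked rightward whenever x[0] < 0.
A = np.array([[0.0, 1.0], [-1.0, -0.5]])
B = 0.2 * np.eye(2)
traj = simulate_impulse_control(A, B, [1.0, 0.0],
                                lambda t, x: 0 if x[0] < 0.0 else None)
```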

2.1.2 Spike-based control

To formalize our control approach, we define a loss function that expresses the discrepancy between the current state of the linear dynamical system and a desired target state, by taking the squared $L^2$ norm of their difference. Specifically, we consider the loss:

$L = \|\mathbf{x}(t+f) - \mathbf{z}(t+f)\|^{2} + \mu \|\mathbf{s}(t)\|^{2}\,,$ (3)

where $\mathbf{x}(t+f)$ is the system state some time $f$ in the future, $\mathbf{z}(t+f) \in \mathbb{R}^{K}$ is the target state, and $\mu$ is a regularization parameter that determines the spiking cost (Sec. 2.5). We consider both reactive control ($f = 0$) and predictive control ($f > 0$). Classically, such a loss might be minimized by taking the gradient with respect to certain system parameters, or by finding the optimal control input that minimizes this loss over some future time window, as in standard LQR control. However, the use of a spiking control signal makes both of these approaches challenging to solve mathematically. Here, we instead take inspiration from the neuroscientific theory of spike coding networks. SCN theory assumes spikes should only happen when they improve an underlying coding loss, and derives the required spiking dynamics from this assumption. From this principle, spiking neural networks that represent their inputs \citep{boerlinPredictive2013, deneveEfficient2016} and perform a variety of computations \citep{brendelLearning2020, guoNeural2021, slijkhuisClosedform2023} can be defined.
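As a concrete reading of Eq. (3), the predictive loss can be evaluated under the assumption that the plant evolves in open loop between now and $t+f$, so that $\mathbf{x}(t+f) \approx e^{\mathbf{A}f}\mathbf{x}_0$, and that a spike of neuron $i$ adds $\mathbf{b}_i$ to the state before this evolution. The snippet below is a hedged sketch of this comparison; the function name and argument conventions are ours, not the paper's.

```python
import numpy as np
from scipy.linalg import expm

# Sketch of the loss comparison behind Eq. (3), assuming open-loop evolution
# over the horizon f (A_f = e^{A f}) and a spike of neuron i that adds the
# column b_i to the current state x0. Illustrative, not the authors' code.
def predictive_losses(A, B, x0, z, f, mu, i):
    A_f = expm(A * f)
    L_ns = np.sum((A_f @ x0 - z) ** 2)                   # loss without a spike
    L_s = np.sum((A_f @ (x0 + B[:, i]) - z) ** 2) + mu   # loss with a spike + cost
    return L_s, L_ns

# Neuron i should fire only if L_s < L_ns.
```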

Following this same principle, but applied to the problem of controlling dynamical systems, we require that a spike should only happen if it improves the control loss (Eq. (3)). In these SCN-inspired approaches, whenever a neuron in the network spikes, it (directly or indirectly) influences the state of the plant, resulting in a different loss value. Therefore, we can compute two different losses, $L_{\text{s}}$ and $L_{\text{ns}}$, representing the distance to the target in the presence or absence of a spike, respectively. In order to control the plant's dynamics, the network compares the two losses to determine whether inputting a spike is advantageous for reaching the control target. The inequality $L_{\text{s}} < L_{\text{ns}}$ describes how our loss can be improved by the control input, decreasing the distance to the target and helping our network solve the control task. These losses define a spiking condition, reminiscent of the voltage-threshold inequality $\mathbf{V} > \mathbf{T}$ (where $\mathbf{V}$ is the voltage of the neurons and $\mathbf{T}$ is their firing threshold) that allows biological neural networks to produce action potentials. In Sec. 4.2, we show that one can indeed derive voltage and threshold values from our loss comparison. As a consequence, in our framework, the values of both voltage and threshold are composed exclusively of quantities that describe the (current and future) states of the plant. In particular, the voltage of neuron $i$ is expressed by:

$V_i = \mathbf{b}_i^T \mathbf{A}_f^T (\mathbf{z} - \mathbf{A}_f \mathbf{x}_0)\,,$ (4)

where $\mathbf{b}_i$ is the $i$th column of the matrix $\mathbf{B}$, $\mathbf{A}_f \equiv e^{\mathbf{A} f} \in \mathbb{R}^{K \times K}$ is the matrix exponential of the state matrix $\mathbf{A}$ times the future time window $f$, and $\mathbf{x}_0 = \mathbf{x}(t)$ is the current state of the system. The threshold of neuron $i$ is described as:

$T_i = \frac{\mathbf{b}_i^T \mathbf{A}_f^T \mathbf{A}_f \mathbf{b}_i}{2} + \mu\,.$ (5)

A detailed derivation of the values of $\mathbf{V}$ and $\mathbf{T}$ can be found in Sec. 4.2.4. Although a cost $\mathbf{C}$ on each dimension of the system can be introduced to further modulate spiking activity (Sec. 4.2), the values of $V_i$ and $T_i$ in Eq. (4) and Eq. (5) do not consider this weighted case (assuming $\mathbf{C}$ is an identity matrix), for simplicity. Further developing this simple spiking rule leads to a recurrent network of integrate-and-fire neurons, illustrated in Fig. 2, and described by:

$\dot{\mathbf{V}} = -\mathbf{V} + \mathbf{G}(\dot{\mathbf{z}} + \mathbf{z}) - \mathbf{F}\mathbf{x}_0 - \mathbf{\Omega}\mathbf{s}\,,$ (6)

where $\mathbf{V} \in \mathbb{R}^{N}$ represents the membrane potentials, $\mathbf{G} \in \mathbb{R}^{N \times K}$ contains the forward connections that encode the target $\mathbf{z}$, $\mathbf{F} \in \mathbb{R}^{N \times K}$ contains the forward connections that encode the current system state $\mathbf{x}_0$, $\mathbf{\Omega} \in \mathbb{R}^{N \times N}$ represents the recurrent connections, and $\mathbf{s}$ are the spikes. Interestingly, the obtained recurrent network has low-rank connectivity with respect to the dimensions of the downstream system (Sec. 4.2.4). This network structure is not assumed, but rather derived from each neuron's greedy spiking condition given our loss comparison (Sec. 3.1.1). Consequently, although the obtained network is recurrent, each neuron computes its losses without access to network-level information (Sec. 3.1.1). For a full derivation of the network, see Sec. 4.2.
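For intuition, Eqs. (4) and (5) can be evaluated directly from plant quantities, which is all a neuron needs for its local spike decision. The following sketch assumes the unweighted case ($\mathbf{C} = \mathbf{I}$) and is simply another way of writing the comparison of spiking and non-spiking losses in Eq. (3); the variable names are illustrative.

```python
import numpy as np
from scipy.linalg import expm

# Hedged sketch of Eqs. (4)-(5) in the unweighted case (C = I):
# voltages and thresholds expressed purely in terms of plant quantities.
def voltages_and_thresholds(A, B, x0, z, f, mu):
    A_f = expm(A * f)                       # open-loop propagator over f seconds
    AfB = A_f @ B                           # predicted effect of each neuron's kick
    V = AfB.T @ (z - A_f @ x0)              # Eq. (4): one voltage per neuron
    T = 0.5 * np.sum(AfB**2, axis=0) + mu   # Eq. (5): ||A_f b_i||^2 / 2 + mu
    return V, T

# A neuron i is eligible to spike whenever V[i] > T[i].
```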


Figure 2: Scheme of a recurrent spiking network composed of four neurons controlling a downstream system. Inputs and target are mapped onto the network via forward connections: $\mathbf{G}$ maps the target $\mathbf{z}$, while $\mathbf{F}$ maps the system's current state $\mathbf{x}$. $\mathbf{\Omega}$ represents the recurrent weights. The control inputs are the spikes $\mathbf{s}$, which are mapped onto the system via the matrix $\mathbf{B}$.

For this proof-of-concept, we enforce an asynchronous firing scheme: at each timestep, all neurons independently compute their local loss comparisons, but only one is allowed to spike, even if several exceed their threshold. The spiking neuron can be chosen randomly from the suprathreshold set, or, as we do here, selected as the one with the highest voltage above threshold. While for clarity we keep this firing policy in all the shown examples, it is not strictly required for achieving control. This is because networks that operate on local spiking rules can distribute their activity very effectively among neurons \citep{calaimRobustness2021}, which means no specific neuron is more relevant than any other for the computation of the network. Such a simple asynchronous policy is only feasible in zero-delay systems, where the new state of the plant is fed to the network without any time delay. In a system with delays, we could not impose this asynchronous firing. However, we can employ different methods that allow multiple neurons to spike while still being compatible with our local spiking rule, for instance, by scaling each neuron's threshold according to its past firing activity.
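A minimal version of this asynchronous policy, under the zero-delay assumption described above, simply picks the highest-voltage neuron among those above threshold (or none, if no neuron improves the loss); the sketch below is illustrative.

```python
import numpy as np

# Sketch of the asynchronous firing policy: all neurons evaluate their local
# spike condition, but at most one (the suprathreshold neuron with the
# highest voltage) fires per timestep. Illustrative only.
def asynchronous_spike(V, T):
    supra = np.flatnonzero(V > T)
    if supra.size == 0:
        return None                          # no spike improves the loss
    return int(supra[np.argmax(V[supra])])   # highest-voltage winner
```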

2.2 Reactive vs. predictive spiking control

Our control problem is centered around the comparison of losses (distances of the state to a target) under the conditions in which a spike is generated or not (Sec. 2.1.2). By comparing the loss functions in the spiking and non-spiking cases, we derived a firing condition for each neuron. Namely, the neuron's voltage exceeds its threshold and generates a spike only when the effect of this spike on the system is to minimize the loss (reducing the distance between the system's state and the target). Importantly, in our control paradigm, the spiking rule is fully local: each neuron computes its losses, with no access to shared network-level information, and fires only when the inequality $L_{\text{s}} < L_{\text{ns}}$ is satisfied, allowing the dynamics of the plant to unfold independently.

In Sec. 2.1.2 we showed how, starting with a spiking condition based on a loss comparison (spike if $L_{\text{s}} < L_{\text{ns}}$), one can derive the required network structure and dynamics to control a downstream system. Such a spiking rule can be defined relative to the immediate result of a spike, or to the predicted effect that the spike will have on the system. In the first case, spikes are generated based on minimizing the loss function at the current time step. In the latter, spikes are generated based on a forward prediction of the system state, incorporating the expected loss minimization over a short horizon of $f$ seconds. The idea of predictive spiking is also present in SCN-related works, which usually consider the spike's influence on a future state \citep{schwemmerConstructing2015}, or on the difference between past and current states \citep{zeldenrustEfficient2021}. Adapting this predictive spiking idea to our work means that the system evolves in open loop after a spike, only following its intrinsic dynamics, until another spike is able to further improve the predicted loss.

Figure 3: Examples of reactive and predictive spiking. The gray dot represents a target state. (A) Example of a linear dynamical system with its current state (brown dot), and four example states that can be reached with a spike. We employ a reactive spiking rule. The neuron for which $L_{\text{s}} < L_{\text{ns}}$ spikes (green arrow), since its spike immediately shortens the distance between the system's state and the target. The other neurons do not spike (red arrows). (B) Example of a linear dynamical system where the spike decision is taken based on the state of the system $f$ seconds after the spike. In this predictive case, $L_{\text{s}} < L_{\text{ns}}$ is still the spiking condition, but the preferred spiking direction can differ. The condition $L_{\text{s}} < L_{\text{ns}}$ holds for both the upward (green) and rightward (orange) spikes, since the no-spike trajectory (brown) lies further from the target than either of the spiking ones. Here, an asynchronous firing rule (Sec. 2.1.2) would select only one of these neurons to fire; either at random, or by considering the neuron with the highest voltage (green).
Figure 4: Spiking control simulations. The two rows denote reactive and predictive systems. The two columns represent a generic linear system and the physical SMD. All example simulations are run with the simplest possible network of two neurons. (A) On the left side, state-space representation of a reactive spiking control approach for a generic unconstrained system. The gray line corresponds to the changing target. Although the matrix $\mathbf{B}$ allows for spikes in all four directions, the network continuously spikes rightwards, ignoring the intrinsic dynamics of the plant. On the right side, the top figure shows the changes of velocity, position and target over time, while the bottom figure shows the spike control input (each spike scaled by the corresponding value in the matrix $\mathbf{B}$). (B) On the left side, the state-space figure shows that without predictiveness, a physically constrained system never spikes. $\mathbf{B}$ is constrained to only allow spikes on the velocity component, which have no immediate effect on the position. When controlling a physical system with a reactive control approach, the loss in the non-spiking case is always smaller than the loss in the spiking case, and the network never spikes. On the right side, no activity is present. (C) In the predictive, unconstrained example, the system can spike in all four directions but considers its future state. In this case, the network either spikes directly towards the target or spikes upwards, exploiting the system's dynamics. (D) Use of our predictive SNN approach for the control of an SMD. As shown in the state-space representation, as the target position changes, the network spikes up or down, adjusting the velocity of the plant based on its predicted state in $f$ seconds. On the right side, we highlight how the network's activity adjusts to the target. It employs downward spikes to correct for eventual overshooting of the position relative to the target, and produces consecutive upward spikes when the target changes more rapidly.

Reactive and predictive spiking control can lead to significantly different spiking behavior. To illustrate this, we consider both cases for an example linear system, with its dynamics defined by velocity and position (Fig. 3). Each spike produced by a neuron in the controlling network is multiplied by its corresponding column in the matrix $\mathbf{B}$, which maps it onto the dynamics of the system. In Fig. 3 we chose a $\mathbf{B}$ matrix that maps the spikes from four different neurons as vectors in the four cardinal directions. When the system receives a spike control signal in the current timestep, the system state is instantaneously kicked in the corresponding direction. In the reactive case, only the immediate effect of the spike is considered. This means the loss comparison that produces a spike control signal does not consider the future state of the plant, but rather aims to improve the loss right after the spike is emitted. As shown in Fig. 3A, in this arbitrary linear system, adopting a reactive spiking rule results in a spike that kicks the system directly towards the target (green arrow), without accounting for the future dynamics of the plant (which could later move the trajectory further away from it). All other spiking directions (red arrows) do not result in an improved loss right after the spike, and so the neurons whose spikes are mapped in those directions do not fire. On the contrary, predictive spiking control favors the spike that minimizes the loss $f$ seconds into the future, considering how the system would evolve during this time window. In our predictive example (Fig. 3B), either spiking rightward (orange trajectory) or upward (green trajectory) would give a lower loss in $f$ seconds when compared to the non-spiking case (brown trajectory). If we employ an asynchronous firing policy (Sec. 2.1.2), only one of these two neurons is selected for firing. For all the examples in this paper, we employ asynchronous firing and select the neuron with the highest voltage past its threshold (in this example, the one whose spike is mapped onto the green arrow).

2.3 Application to physical systems: spring-mass-damper

To illustrate an application of our framework, we consider, as an example, the control of a spring-mass-damper (SMD) system \citep{guRobust2005}; for schematics see Fig. 1. This is a linear dynamical system composed of a mass attached to a spring (which causes oscillations whenever the mass is moved) and a damper (which dampens the oscillations, giving the system a fixed stable state). The system can be entirely described by its two variables (the position and the velocity of the mass), which together form our system state $\mathbf{x}$. This makes the SMD a very simple system with observable dynamics and clear analytical solutions, while still modeling a physical system with realistic state variables. As such, it is a standard benchmark in the control literature and in spiking control theories \citep{vincentPositioning1989, slijkhuisClosedform2023, ahmadvandNeuromorphic2023}. Furthermore, the SMD as a linear system can be expanded to approximate more complex plants, as shown by its applications in muscle models for motor control and neuro-mechanical studies \citep{byadarhalymodular2012, pazzagliaBalancing2025}. In our proof-of-concept, we make use of a simple target $\mathbf{z}$, which evolves during the simulation, incrementing twice and then stabilizing. In all of our simulations, the target state always refers to the position of the mass, while its velocity can range freely. Although this is the setup we will work with, our algorithm allows us to constrain the values of either the position or the velocity of the mass, by changing the entries in the cost matrix $\mathbf{C}$, as shown in Sec. 4.2. We will now illustrate the differences that emerge when controlling the system using a reactive versus a predictive paradigm, as described in Sec. 2.2 (Fig. 3). We consider both physically unconstrained control (where the control signal can affect both velocity and position) and physically constrained control (where the control signal can only affect the velocity).
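For reference, an SMD in the form of Eq. (1)/(2) can be written with the state $\mathbf{x} = [\text{position}, \text{velocity}]$; the sketch below uses illustrative parameter values (not those of the paper) and a physically constrained $\mathbf{B}$ in which two neurons can only kick the velocity up or down.

```python
import numpy as np

# Sketch of an SMD plant in the form of Eq. (1)/(2), with illustrative
# parameters. The constrained B lets spikes act only on the velocity.
m, k, c = 1.0, 2.0, 0.5               # mass, spring constant, damping
A = np.array([[0.0,    1.0],          # d(position)/dt = velocity
              [-k / m, -c / m]])      # d(velocity)/dt = -(k/m) p - (c/m) v
kick = 0.3                            # kick strength (meta-parameter)
B = np.array([[0.0,   0.0],           # spikes cannot teleport the position...
              [kick, -kick]])         # ...only push the velocity up or down
```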

2.3.1 Reactive control

We first consider a physically unconstrained system with reactive control (Fig. 4A). In this case, the system can successfully be made to track a target (Fig. 4A, top right). This is achieved by continuously spiking directly towards the target, which is always the better choice since the neurons do not consider future states of the plant (as shown in Fig. 3A). In this example, the control is exerted by progressively kicking the mass onto the same position as the evolving target, essentially 'teleporting' it without affecting its velocity. Of course, this is only possible because we are considering a physically unconstrained system, where the mass is allowed to be moved instantly in any direction of the state space. Also note that because of this property, the system ends up in a physically impossible state with a non-zero velocity (green curve in the upper right plot), despite remaining in place according to the position variable. We therefore ask whether a reactive control method could work on physically constrained systems. We apply reactive control to our physically constrained example, running the simulation of an SMD. In this context, applying a spiking control rule means directly influencing the velocity component of the state $\mathbf{x}$, while the position component evolves as a consequence. This reflects a key constraint of physical systems: the position cannot be changed instantaneously, as the mass cannot move from one position to another in no time. Instead, we constrain spiking control to act on the simulated physical system by applying a force that only kicks the velocity, guaranteeing realistic state evolution in accordance with Newtonian mechanics. As shown in Fig. 4B, a reactive approach fails to control a physically constrained system. This is because influencing the velocity of the mass has no instant effect on its position. To appreciate these effects, the network should be able to observe future states of the plant. In short, spiking up or down in state space does not immediately affect the loss, and no spike is produced.

2.3.2 Predictive control

We now apply our predictive spiking algorithm, first to an unconstrained system. In this scenario, the controller is allowed to spike any dimension of the system, but it is also able to predict the plant's future states, and it produces spikes that improve the loss $f$ seconds from the current state. From Fig. 4C, we can see that when exerting control, the position is sometimes directly moved towards the target (as in the reactive case), but optimizing the local losses often equates to kicking the system in its velocity, exploiting the intrinsic dynamics of the plant. Our predictive algorithm thus exploits the intrinsic dynamics of the system even in the absence of physical constraints, although in some instances the mass is still instantaneously moved towards the target. Finally, we apply the same predictive algorithm to the SMD simulation. In this scenario, the spiking network accounts for future states of the system, and is constrained to only spike the velocity either up or down, adjusting to the changing target position. The trajectory approaches the target, and the loss $f$ seconds after a spike is minimized. If the position then departs from the target, the losses of the neurons change their value, eventually causing a neuron to reach its threshold. At this point, the velocity is either spiked up or down, causing the network to trace the target correctly. As shown in Fig. 4D, our predictive spiking rule introduces an element of anticipation, making the control strategy more robust for physically constrained dynamical systems, and allowing for the control of the SMD where reactive control would fail.

2.4 The controller as a recurrent spiking neural network

We now focus on the example activity of a predictive controller of an SMD with a network of four neurons, as shown in Fig. 5. A visual representation of the structure of this network is shown in Fig. 2. In this example, we want to show the network's performance, spikes, voltage traces, and errors. To bring all four neurons to spike, we choose a slightly more complex target $\mathbf{z}$, which spans positive and negative values. The control task and the network are set up in the same way as in the previous SMD example (Fig. 4D). In each timestep of the simulation, the neurons compute spiking and non-spiking losses, accounting for the future state of the system in $f$ seconds and its distance to the target. The distance to the target considered in these losses is represented as the predictive error. When its value is large enough, the neuron's threshold is reached, and a spike is produced to minimize it. In turn, this minimizes the actual error, allowing the position of the mass to approach the target position. The sudden changes in value of the predictive error are caused by the voltage traces of the neurons reaching their threshold, and thus temporally coincide with the spike events. Each spike is a direct control input. It is scaled by the matrix $\mathbf{B}$ by a different amount for each neuron, and mapped as a positive or negative contribution to the velocity in $\mathbf{x}$. Between spike moments, the system evolves as an open-loop system, without any control signal as input, following its intrinsic dynamics. The control loop is closed each time a spike is produced and mapped onto the velocity. Since each spike is modeled as a Dirac delta function, the corresponding change in velocity is instantaneous, and the control naturally exploits the intrinsic dynamics of the plant to minimize the predictive error.

Figure 5: Example activity of a network of four neurons. Each spike plotted as a control signal is modeled as a Dirac delta function that gets mapped onto the plant's dynamics through the matrix $\mathbf{B}$, and instantaneously influences the velocity component of the system (Sec. 2.3). Spike moments and the corresponding voltage traces are shown, as well as the resulting errors. The error trace (deep blue line) describes the current distance between the position and the target state, while the predictive error (light blue line) measures the distance between the target and the predicted position $f$ seconds into the future.

2.5 Varying meta-parameters

Figure 6: Changing two meta-parameters ($\mu$ and $f$) when simulating the SMD with spiking control: (A) Energy usage, calculated as the effected acceleration onto the system in each simulation ($\mathbf{B}\mathbf{s}$), for various combinations of $\mu$ and $f$ (linear scale). (B) Total error, calculated as the cumulative distance from the state $\mathbf{x}$ to the target $\mathbf{z}$ in each simulation, for different combinations of $\mu$ and $f$ (logarithmic scale). (C-F) Different control regimes for different combinations of $\mu$ and $f$; (C): efficient control regime, (D): under-control regime, (E): over-control regime, (F): inefficient control regime.

The control network we employ operates by expressing its neural parameters in terms of elements of the downstream plant, which serves as a proof-of-concept for a spiking network that can directly compute control for a linear dynamical system (Sec. 4.2). However, at this stage, other features of the network are still imposed as meta-parameters to adjust the control to our SMD example. For a detailed list of all parameter settings used in this work, see Sec. 4.1.

Number of neurons ($N$)

The number of neurons in the network heavily influences the control of the plant. For instance, when operating on a system with very simple oscillatory dynamics, such as our SMD, employing more neurons can make the control more robust across settings, but it raises the need to manage the spiking activity among neurons, especially since they do not have access to network-level information (Sec. 4.2). This can be achieved by imposing asynchronous firing or by introducing an adaptive cost on spiking, as mentioned in Sec. 2.1.2. While for our activity example in Fig. 5 we showed the results for four neurons, in all other demonstrations we kept the network as simple as possible and employed only two neurons: spiking the velocity of the plant either up or down by the same amount. Indeed, for systems with more complex dynamics and targets, exerting effective control requires larger networks, as in the coupled oscillator case of Fig. 8, where $N = 500$.

Kick strength ($\mathbf{B}\mathbf{s}$)

The spikes of our SNN are unitary Dirac delta functions that operate as control signals on the dynamics of the plant. These signals are mapped onto the downstream system via the matrix $\mathbf{B}$, which can arbitrarily scale their value, resulting in smaller or larger kicks to the velocity of the system. This is the case for Fig. 5 and Fig. 8, where each neuron kicks the velocity by a different amount. In all other examples, since we refer to the same SMD system with the same dynamics, we keep $\mathbf{B}$ fixed, only allowing the system's velocity to be kicked up or down with the same strength. This keeps our proof-of-concept as clear as possible. Of course, the value of $\mathbf{B}$ holds great relevance when varying the control objective, as it can let the same network control significantly different plants, as well as exert control on the same plant with different levels of accuracy. For this parameter exploration, since we only tackle the same SMD example, we keep a fixed $\mathbf{B}$ that produces kicks of a magnitude compatible with the downstream plant.

Global threshold scaling ($\mu$)

When deriving the network, we can impose a scaling term $\mu$ on the loss to introduce a cost for spiking (Eq. (3)). This term influences the neurons' threshold value (see Sec. 4.2). Scaling the neurons' threshold gives interesting results when controlling the plant (Fig. 6), since it affects the number of spikes emitted as well as the accuracy in tracing the target position.

Predictive window ($f$)

When computing the spiking decision, each neuron compares spiking and non-spiking losses. As mentioned, these are predictive losses: they take into account the effect of a spike on the system $f$ seconds into the future (Sec. 2.2). The width of this future time window is crucial for the control results (Fig. 6). Looking further into the future allows neurons to better assess whether the trajectory will approach or depart from the target, often preventing unnecessary spikes. On the other hand, if the window $f$ is very large, the predicted state used in the loss comparison might approach (or depart from) the target well before the actual state of the plant does. This influences the accuracy with which the network represents the effect of its own control on the system's dynamics, as we explain below.

2.5.1 Control regimes

Different combinations of these meta-parameters can give different results in the control of the plant. Quantifying the optimality of the control strategy in comparison to other paradigms (standard LQR, filtered spikes) is very difficult. As described in Sec. 3.1.1, each neuron in our network follows a greedy spiking rule given the control set, but as in other approaches, this does not necessarily produce the best overall controller. It is nevertheless of interest to vary some control parameters, considering the cases where the changes in target value are traced by the system's state (and therefore the SMD is successfully controlled). Within these cases, we identified different regimes of control for our network. In these explorations we decided to keep the number of neurons ($N = 2$) and the kick strength $\mathbf{B}$ fixed. We therefore vary the future time window $f$, which influences the predicted state on which the network bases its spiking decision, and $\mu$, the global scaling factor for the threshold. As shown in Fig. 6A, networks with a lower $\mu$ tend to spike more, using more energy, as defined by the effected acceleration onto the system (for this system, this equates to the total spike count of the simulation, with each spike scaled by the corresponding value in the matrix $\mathbf{B}$). Given the same $f$, these networks also tend to have a lower error compared to high-$\mu$ networks (Fig. 6B), since a lower threshold allows the network to adjust more often to the target position. In general, Fig. 6B shows that a relatively wide range of parameters allows for control of the plant, giving comparable error measures. Only more extreme parameter combinations (a high threshold scaling, especially with small $f$) result in the network not tracing the target, yielding a significantly higher cumulative error.
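The two summary measures of Fig. 6 can be computed directly from the simulation traces; a sketch is given below, assuming `spikes` is a (timesteps × N) array of emitted spikes and `xs`, `zs` are the state and target trajectories (these array conventions are ours, not the paper's).

```python
import numpy as np

# Hedged sketch of the two summary measures used in Fig. 6.
def energy_usage(spikes, B):
    # total effected acceleration onto the system: sum of |B s| over time
    return float(np.abs(spikes @ B.T).sum())

def total_error(xs, zs):
    # cumulative Euclidean distance between state and target
    return float(np.linalg.norm(xs - zs, axis=1).sum())
```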

Figure 7: Simulation of the control of an SMD with continuous control (LQR), filtered-spikes control, and predictive spiking control: (A) LQR control example: the controller continuously adjusts the position to the target by modulating the velocity. This is shown in the phase plane (first row) and in the activity plot (second row). Each time the target value changes, the position slowly adjusts to it, stabilizing when the target is reached. The control signal (third row) is larger when the target changes value, and the velocity is modulated to adapt to it. During these moments, the mass is the furthest away from the target, which is reflected in the error profile (bottom row). (B) The filtered spikes control is approximating the LQR signal. Therefore, the trajectory in space, the activity over time and consequently the error (first, second, and last rows) have the same profile as in panel (A). The approximation is seen in the control input (third row). Since the system is represented through spikes, the network keeps spiking to trace the LQR signal. (C) In our predictive spiking algorithm, spikes directly influence the velocity of the system. This is shown in the upwards and downwards kicks in state space and in the activity plot (first and second row). On the one hand, this helps the controller adjust faster to changes in target values. On the other hand, when the target is reached, the controller can only orbit its position by spiking. This fundamental difference is also highlighted in the error profile (bottom row). In the control input plot (third row) we can see how the spike count is drastically lower in spike control than in the filtered-spikes one.
Efficient and inefficient control

Within the boundaries of these trade-offs, we identified four regimes (Fig. 6C-F). With an adequate future time window $f$ and relatively low $\mu$, the network can operate in an efficient regime (Fig. 6C), where the target position is followed precisely while keeping a relatively low spike count. If the spike count is lowered any further from this point, the control input becomes too sparse, and the SMD mass no longer closely orbits the target position. On the contrary, decreasing the value of both these meta-parameters yields the inefficient regime, where the target is still traced with comparable precision, but the spike count, and as a consequence the energy input, is considerably higher (Fig. 6F).

Under- and over-control

We then have the case where the network is allowed to look far enough into future states of the system, but $\mu$ is also very high, making it difficult for the neurons to spike enough to trace the target accurately (Fig. 6D). In this under-control regime, the position approximates the target but never manages to reach it. Finally, in the last case, the neurons are relatively free to spike as $\mu$ is low, but they are bound to look very far into future states of the system (Fig. 6E). In this over-control regime, the neurons fire based on a predicted position that is far into the future (large $f$) and thus differs significantly from the actual one. For instance, while the current state of the system might not have reached the target yet, the predicted state is already well past it, which causes the neurons to fire earlier than needed. This over-adjustment makes the position hover slightly past the actual target.

Figure 8: Control simulation of a coupled oscillator system. (A) Scheme of the feedback spiking control for a downstream coupled-oscillator system. The current state of the system $\mathbf{x}(t)$ is given as input to the network, along with target state values $\mathbf{z}(t)$. The SNN outputs spikes $\mathbf{s}(t)$. These represent the control signal $\mathbf{u}(t)$, computed as a function of both inputs. As for the other networks, $\mathbf{u}(t)$ is then mapped onto the dynamics of the downstream system via the $\mathbf{B}$ matrix. This time, the downstream system is composed of $M$ masses connected to each other, such that the acceleration of each mass influences the position of the next one. The plant dynamics are still linear, but they are significantly up-scaled in dimensionality when compared to the single SMD examples. (B) Activity for the control of a system with $M = 10$ coupled masses. Each mass has a different target position over time $\mathbf{z}(t)$. For this example, the SNN makes use of a randomly initialized $\mathbf{B}$ matrix to map the spikes onto the high-dimensional system and guide the position of each mass to its designated target. Note: for clarity, here we do not show the spiked velocity over time as in previous figures. In blue we show the cumulative error of all neurons over time. (C) Spiking activity of the controlling network. We demonstrate the robustness of the control by progressively silencing portions of the $N = 500$ neurons during the simulation. The vertical red lines indicate the timesteps at which neurons are silenced. The network successfully redistributes the spiking activity among the non-silenced neurons and, as a result, the control performance is not affected (as shown in panel B).

2.6 Comparing control approaches

Lastly, we want to showcase the differences in the resulting control for three approaches: continuous LQR control, filtered-spikes control, and our predictive spiking control. As in previous examples, we consider a physically constrained simulation of an SMD, with a progressively changing target. In Fig. 7 we directly compare the same measures for all three methods. As shown in the error trace (bottom rows), the filtered-spikes algorithm (Fig. 7B) approximates the control strategy of the continuous LQR (Fig. 7A). Spiking control (Fig. 7C) has similar increases and decreases of the error (contingent on the changes in the value of the target), although its profile is very different. This is mainly due to its control input not being continuous, which only allows the position to orbit the target with a certain accuracy, without ever settling on the exact target value. Analogous differences are highlighted in the activity over time of these networks (second row). In the continuous and filtered cases, the velocity is not spiked, and the position slowly adjusts to the changes in target, stabilizing once the target is reached. On the contrary, spiking control progressively adjusts the velocity up or down, tracing the target as soon as it changes value. Once the target is reached, the controller orbits the target position by spiking regularly until the end of the simulation. Other fundamental differences are highlighted in the control input (third row), and in the consequent trajectory in state space (first row). As shown, the filtered-spikes controller is able to smoothly control the state of the system by approximating the control input of the LQR. In this sense, the spikes only represent the system's state, while the control signal is still continuous, as it is applied at each timestep of the simulation. This approximation requires the network to spike much more than our spiking control algorithm, which exclusively inputs a kick when it needs to adjust the position towards the target.
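For completeness, the continuous LQR baseline of Fig. 7A can be obtained by solving the continuous-time algebraic Riccati equation for the SMD. The sketch below uses illustrative weights $Q$, $R$ and a single continuous force input on the velocity, and omits the target-tracking (reference/feedforward) term of the full setup; it is not the exact configuration used in our simulations.

```python
import numpy as np
from scipy.linalg import solve_continuous_are

# Hedged sketch of the LQR baseline: state-feedback gain for an SMD with a
# continuous force input on the velocity. Weights are illustrative.
A = np.array([[0.0, 1.0], [-2.0, -0.5]])
B_c = np.array([[0.0], [1.0]])
Q = np.diag([1.0, 0.1])              # penalize mostly the position error
R = np.array([[0.01]])               # relatively cheap control
P = solve_continuous_are(A, B_c, Q, R)
K = np.linalg.solve(R, B_c.T @ P)    # u = -K x (plus a reference term for tracking)
```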

2.7 Controlling coupled oscillators

We now scale up the same algorithm described in the previous sections to control a higher-dimensional linear system. This time, the plant we chose is a coupled oscillator system, where $M$ SMDs are connected in series, such that the position of one mass affects the acceleration of the masses connected to it. The network has to control multiple SMDs, while taking into account the coupled effects between them. For the example shown in Fig. 8, we consider the control of $M = 10$ coupled masses using a network of $N = 500$ neurons. As shown, the control algorithm scales naturally to this more complex linear system: even though the masses are coupled with each other, our local spiking rule is able to coordinate the spikes so as to allow all masses to reach their respective targets (Fig. 8B). Each neuron is assigned an element in the matrix $\mathbf{B}$ (now scaled up to the correct dimensionality based on the number of masses $M$) with a random normalized value. The spike of each neuron then maps either as a positive or negative kick on the velocity of a mass by the corresponding amount. This results in some neurons having a higher kick strength than others, but as before, only the neurons that allow the system to decrease the distance from the target in $f$ seconds will fire. During the simulation we observe that spiking activity is sparse and distributed among neurons (Fig. 8C). We then silence random neurons to test the robustness of the system. After each silencing event, the activity is redistributed among the remaining active neurons and there is no effect on the control performance. This shows that our control algorithm scales to complex linear systems and is robust to cell loss: no specific neuron determines the outcome of the control performance.
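As an illustration of how such a plant can be assembled, the sketch below builds a chain of $M$ coupled spring-mass-dampers as one linear system and draws a random velocity-only $\mathbf{B}$. The nearest-neighbour coupling and the random assignment of neurons to masses are our assumptions, not necessarily the exact construction used here.

```python
import numpy as np

# Hedged sketch: a chain of M coupled SMDs as one linear system with state
# [p_1, v_1, ..., p_M, v_M], assuming nearest-neighbour spring coupling.
def coupled_oscillator_A(M, m=1.0, k=2.0, c=0.5):
    A = np.zeros((2 * M, 2 * M))
    for j in range(M):
        p, v = 2 * j, 2 * j + 1
        A[p, v] = 1.0                # position integrates velocity
        A[v, p] = -2.0 * k / m       # restoring force on mass j
        A[v, v] = -c / m             # damping
        if j > 0:
            A[v, p - 2] = k / m      # coupling to the left neighbour
        if j < M - 1:
            A[v, p + 2] = k / m      # coupling to the right neighbour
    return A

def random_velocity_B(M, N, scale=0.3, seed=0):
    # Each neuron kicks the velocity of one randomly chosen mass with a
    # random (normalized) strength -- an illustrative choice.
    rng = np.random.default_rng(seed)
    B = np.zeros((2 * M, N))
    masses = rng.integers(0, M, size=N)
    B[2 * masses + 1, np.arange(N)] = scale * rng.standard_normal(N)
    return B

A = coupled_oscillator_A(M=10)
B = random_velocity_B(M=10, N=500)
```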

3 Discussion

3.1 Proposed spiking control framework

As described in Sec. 1, biological neurons use spikes for signaling and for the control of downstream systems. Control of dynamical systems has been widely studied and applied in different contexts, including artificial neural networks \citep{boerlinPredictive2013, deneveEfficient2016, huangDynamical2018, slijkhuisClosedform2023, mooreneuron2024}, but these approaches generally do not consider spikes directly as control signals. In this work we remedied this by showing how control can be computed directly through the use of single spike events. This allowed us to obtain a model that uses spikes to actively kick the dynamics of the downstream plant, and exploits its intrinsic dynamics to reach a certain target. The emitted spikes are instantaneous events (modeled as Dirac delta functions, Sec. 2.1.1). The spike generation mechanisms in the network and the mappings to the system need to be adjusted to the various constraints the plant might have in its dynamics (Sec. 2.3). In order for this control paradigm to work on physically constrained systems, we developed a predictive algorithm that takes into consideration the effect of our control inputs on the system within a certain future window, $f$. Our spiking rule is fully local and is computed for each neuron in every timestep of the simulation. Therefore, the target can be reached without a network-level objective, and without any explicit learning paradigm. Of course, introducing learning could still be a very useful addition to adjust the meta-parameters of the network when adapting the control to different plants, or to tackle plants with unknown dynamics (Sec. 3.2). From this simple spiking rule, we obtain, without imposing it, a recurrent network of IF neurons that exerts control by only inputting a spike if it improves the defined loss. This rule does not guarantee the optimality of the control as a whole (as explained in Sec. 3.1.1), but allows the control to work without the need for network-level information. Lastly, we scaled up this framework to control a coupled oscillator system, with multiple connected masses (each with its own target) and a larger network. We showed that our control algorithm is not only easily scalable to complex linear systems and higher-dimensional networks, but also robust to neuron loss.

3.1.1 Assumptions and constraints

Greedy spiking rule

An important clarification about our proposed algorithm is that the spiking rule by which the network produces control signals does not guarantee any optimality of the controller as a whole. At each timestep of the simulation, each neuron makes its spike decision simply by trying to decrease the distance to the target. The neurons therefore produce control signals (spikes) in a "greedy" way, as defined in the work of \citepboerlinPredictive2013. If spiking improves the loss, the neuron always spikes, as that is the only information it has available; no other network-level information is used in the process. In other words, computing the greedy spike decision for each neuron in the current timestep (albeit considering the effect of the spike $f$ seconds ahead) is not equivalent to finding the best set of spikes over a whole future window, let alone over the whole simulation. As a result, our approach does not guarantee that the control actuated by the network is the optimal control solution for reaching a given target. Furthermore, as described in Sec. 2.5.1, many of the meta-parameters of the network are manually set rather than learned or optimized for the control task at hand. In most of our examples, we impose the meta-parameters that best fit the scale of the system to control, to showcase our proof of concept with clarity.

Simplicity of use case

Following similar reasoning, we purposefully chose a simple dynamical system, which is linear, fully observable, and two-dimensional. We also chose a simple target that progressively changes position and then stabilizes. This simplicity allowed us to showcase our results without additional complications, and in Sec. 2.7 we demonstrate how this framework easily scales to any linear dynamical system. However, the limits of our paradigm still need to be tested, by including adaptive (or learned) meta-parameters and by extending our control algorithm to nonlinear systems (Sec. 3.2). Note that even for an optimal controller, these meta-parameters would need to be adjusted to the control problem at hand. The differences in the control approach of our network are highlighted in Fig. 7 and explained in Sec. 2.6, where we control the same SMD system with a standard LQR (an optimal controller), a filtered spiking algorithm, and our spiking controller. As shown, the filtered spiking controller (Fig. 7B) approximates the control signal of an LQR (Fig. 7A), making the resulting trajectories in state space equivalent, while our spiking algorithm (Fig. 7C) deviates significantly from the optimal trajectory in state space. However, with the meta-parameters adjusted correctly, the network still traces the target with good accuracy (as shown in the error profile), while producing significantly fewer spikes than the filtered spiking controller.

3.2 Considerations on spiking control

Relation to prior spike-based control models

Previous works underlined the importance of each spike event in the computation of neural networks \citepboerlinPredictive2013, schwemmerConstructing2015, brendelLearning2020. Such spike-based computational models highlight the relevance of the temporal precision of spike events, where precise spike timing is determined by each neuron's contribution to a desired output. These models exhibit features similar to cortical networks, including irregular spiking and E/I balance \citepschwemmerConstructing2015. Many of these models are developed from SCN theory, which gives them a strong overlap with our framework but also introduces important distinctions (Sec. 4.3). In SCNs, as in our approach, spike activity is used to generate a signal that controls a dynamical system. However, SCNs use this signal to control their own network read-out (thereby influencing the dynamics of their internal state). This read-out can be quite complex and include synaptic kernels, requiring a form of predictive spiking \citepschwemmerConstructing2015, zeldenrustEfficient2021 similar to the one employed in our paradigm. In short, although SCN-related works share common aspects with our approach, they do not explicitly hypothesize that the computational role of spikes could be to control a separate downstream system. In contrast, our work generalizes to the control of any downstream linear system, rather than focusing solely on modulating the network's own internal state. Moreover, when SCN principles are employed to solve realistic control tasks \citephuangDynamical2018, slijkhuisClosedform2023, mooreneuron2024, the emphasis on a temporal, single-spike perspective is lost in favor of a rate-based or filtered-spike approach, which glosses over the role of each spike in the network's computation.

Future outlook

In neuroscience, single spike events used for control could be highly significant when modeling downstream actuator systems. Spike-based control is a promising modeling angle for explaining the control of downstream muscle actuators \citepbernikernormative2019, which ultimately allow movement to occur. This paradigm could be expanded to include models of body orientation in space as well as the control of more detailed models of limbs. Further down the line, spiking control could be a useful approach for interpreting how spikes influence the dynamics of the network that produces them. This has been theorized in previous work on SCNs \citepboerlinPredictive2013, deneveEfficient2016, which suggests that the computational role of spikes in the network is to control its dynamics, and that many physiological features of biological networks are a consequence of this necessity. In this case, the downstream system is the network itself (or a second downstream network), which means that our paradigm could serve as a tool to explain potential roles of the spike code in bio-plausible networks in general. As mentioned above, various works emerged from this initial SCN perspective to investigate the biophysical and computational properties of these networks, and their potential value for interpreting detailed neural models \citepschwemmerConstructing2015, zeldenrustEfficient2021. Not only do these works stem from the common root of SCN-based modeling, they also incorporate elements that we employ in our algorithm, such as different forms of predictive spiking. Revisiting these topics from our control-focused perspective could give interesting insights, for instance into the computational role of E/I balance \citeppodlaskiApproximating2024, or provide fresh perspectives on various properties of SNNs \citepmancooUnderstanding2020. Following a similar path, we could study the effect of bio-plausible additions to our framework, such as noise or network perturbations, analyzing for instance the robustness of our system \citepcalaimRobustness2021.

As described in Sec. 3.1.1, expanding our proof of concept to more complex nonlinear systems could be relevant for investigating the role of spikes in controlling realistic downstream actuators. This could include detailed models of muscle fibers for the control of movement \citepbernikernormative2019, or models of complex downstream networks. Furthermore, it could have use cases for neuromorphic control applications where better exploration of the full state space of the plant is needed. Tackling nonlinearity might require further refinements of the model, for instance by employing local linearization techniques \citepyesildirekFeedback1995 or by simultaneously predicting multiple future states. In the effort to apply this algorithm to more realistic use cases, another required step would be to adjust neural parameters according to the control problem. As mentioned in Sec. 2.5, many important parameters of the network are currently imposed to best fit the control case. Ideally, we would want the same network to be able to control different plants. This could be achieved by applying a discount factor on some parameters, which would allow us to adapt, for instance, the value of $f$ as the simulation progresses, finding the time window that best balances the number of input spikes against the accuracy with which the target is traced. Finally, we could introduce learning on all the parameters of the network, to make our control as flexible as possible across a wide range of downstream systems. Not having to assume completely observable plants would greatly expand the use cases of our algorithm and would allow testing of various neuroscience-derived hypotheses.

4 Methods

4.1 Simulation and network setup

Simulation details

All the simulations in this paper were implemented in Python using the following packages: NumPy (version 1.23.5) for matrix operations and numerical computations; SciPy (version 1.10.1) for solving differential equations and simulating trajectories using solve_ivp; Matplotlib (version 3.7.1) and Seaborn (version 0.13.2) for visualizing results, including raster plots, heatmaps, and system trajectories; and the Python control systems library 'control' (version 0.10.1) for the initialization of the continuous control example.

Time settings

In each single-SMD simulation (Figs. 4 to 7), the total time is $T = 50$ seconds and the timestep is $\delta t = 0.01$ s, dividing each run into $n_T = 5000$ timesteps. For the coupled SMD example (Sec. 2.7, Fig. 8), $T = 100$ s, $\delta t = 0.01$ s, and $n_T = 10000$.

Target settings

For every simulation, the target $\mathbf{z}$ is only defined for the position component of the system's state, and it is initialized at zero. For Figs. 4, 6, and 7, three target values are set: $z_{base} = 5$ from five seconds into the simulation, $z_{base} = 10$ after 15 seconds, and $z_{base} = 15$ after 30 seconds and until the end of the simulation. For Fig. 8 we use similar $z_{base}$ steps, but initialized such that each mass is attributed a different set of steps, which become progressively more positive or negative. In Fig. 5 we use larger and irregular steps, to better show the activity of the four neurons: $z_{base} = -30$ from five seconds into the simulation, $z_{base} = 0$ after 15 seconds, $z_{base} = -25$ after 20 seconds, and $z_{base} = 20$ after 30 seconds and until the end of the simulation. Between each value of $z_{base}$, the parameter $z_{leak} = 0.5$ governs the exponential shape of the target curve.

Neurons and spike mapping

For most runs, the number of neurons $N$ is fixed to two, except for Fig. 5 where $N = 4$. In the scaled-up example of coupled oscillators (Fig. 8), $N = 500$. For most simulations, $\mathbf{B}$ is kept fixed such that each spike kicks the velocity component by $2$ or $-2$ (the values relative to the positions are kept at zero). One exception is the physically unconstrained case of Fig. 4, where $\mathbf{B}$ also allows spikes onto the position by the same amounts ($2$ and $-2$). For Fig. 5, the $\mathbf{B}$ values are sampled from a normal distribution, with the second column (velocity) normalized to have norm $1$; the resulting values are then doubled. The same normalized $\mathbf{B}$ values are quadrupled in the coupled oscillator simulation. In these two cases, the dimensions of matrix $\mathbf{B}$ are adjusted so that each of the $N = 4$ and $N = 500$ neurons has its spikes mapped onto the system.

Future time window and spike cost

The future window $f$ is kept at $0.3$ for the predictive simulations in Figs. 4, 5 and 7. When using reactive control, $f$ is set to zero. For Fig. 6, the $f$ values in the subsequent simulations are linearly spaced from $0.1$ to $1$ (included), with the fourth value rounded up to $0.25$. The spike cost $\mu$ is fixed at $0.3$ for the simulations in Figs. 4 and 7. For Fig. 5, $\mu$ is set to $0.1$. In Fig. 6, the $\mu$ values in the subsequent simulations are linearly spaced from $0.02$ to $1$ (included), with the third value rounded up to $0.25$.

Initial state and system matrix

The initial state of the simulation $\mathbf{x}_0$ (at $t = 0$) is always kept at $[0, 0]$. The system matrix $\mathbf{A}$ is also kept the same for all the single-SMD simulations (since we control the same SMD): the first row is set to $[0, 0.5]$ and the second to $[-0.1, -0.1]$. For Fig. 8, the variable $M = 10$ is introduced to indicate the number of masses to control. To test robustness, we silence 180 random neurons 30 seconds into the simulation, and another 180 after 70 seconds. In this example, $\mathbf{A}$ is scaled up to the dimensionality of the coupled system based on $M$ (and $N$), and coupling values $\gamma = -0.3$ are introduced to make the positions and accelerations of the masses dependent on each other.
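As a reference for these settings, the following sketch builds the single-SMD system matrix and initial state with the values listed above; the coupled-oscillator block structure shown is our own guess at one possible nearest-neighbour layout and is only meant as an illustration.

\begin{verbatim}
import numpy as np

# Single SMD: state x = [position, velocity], values as reported above.
A = np.array([[ 0.0,  0.5],
              [-0.1, -0.1]])
x0 = np.zeros(2)

# Hypothetical coupled-oscillator layout: each mass keeps its own SMD block,
# and gamma couples neighbouring positions into the acceleration equations.
M, gamma = 10, -0.3
A_coupled = np.kron(np.eye(M), A)
for m in range(M - 1):
    A_coupled[2 * m + 1, 2 * (m + 1)] += gamma   # right neighbour's position
    A_coupled[2 * (m + 1) + 1, 2 * m] += gamma   # left neighbour's position
\end{verbatim}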

Continuous and filtered-spiking parameters

The examples of continuous and filtered-spiking control in Fig. 7A,B largely use the same settings as the other single-SMD simulations. In the continuous control case, we introduce an LQR gain matrix $\mathbf{K}$ to compute the control signal, initialized using the control.lqr function from the Python control package. To initialize $\mathbf{K}$, we use the matrix $\mathbf{B}$ that maps the control signal onto the velocity of the system, set to $[0, 0.25]$. We also need a control cost $R = 0.001$ (analogous to $\mu$ in the spiking control case), and a $K \times K$ matrix $\mathbf{Q}$ that introduces a cost on the dimensions of the system (analogous to $\mathbf{C}$ in our spiking paradigm). The first row of $\mathbf{Q}$ is set to $[1, 0]$ and the second to $[0, 1]$. As described in Sec. 4.3, for our filtered spiking control we need a matrix $\mathbf{D} = [1, -1]$ to map the filtered spike traces onto the network read-out. For the example of Fig. 7B, we use a spiking cost $\mu_f = 0.1$ and a decay factor $\lambda = 1$ for the filtered spike traces.
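A minimal sketch of the continuous-control initialization, using the control.lqr call and the parameter values mentioned above (the helper function name is ours):

\begin{verbatim}
import numpy as np
import control

A = np.array([[ 0.0,  0.5],
              [-0.1, -0.1]])
B_lqr = np.array([[0.0], [0.25]])   # control acts on the velocity
Q = np.eye(2)                        # state cost, analogous to C
R = np.array([[0.001]])              # control cost, analogous to mu

K, S, E = control.lqr(A, B_lqr, Q, R)  # gain, Riccati solution, closed-loop eigenvalues

def lqr_control(x, z):
    """Continuous LQR control signal u = -K (x - z)."""
    return -K @ (x - z)
\end{verbatim}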

4.2 Derivation

4.2.1 Network

In our work, we derive a spiking neural network (SNN) of leaky integrate-and-fire (LIF) neurons from control principles. A network of $N$ such neurons is described by the voltage dynamics

\begin{equation}
\dot{\mathbf{V}}(t) = -\lambda\mathbf{V}(t) + \mathbf{F}\mathbf{x}(t) + \mathbf{\Omega}\mathbf{s}(t), \tag{7}
\end{equation}

where $\mathbf{V}(t) \in \mathbb{R}^{N}$ are the neurons' voltages, $\lambda$ determines the membrane leak time constant, $\mathbf{x}(t) \in \mathbb{R}^{K}$ are the inputs, $\mathbf{F} \in \mathbb{R}^{N \times K}$ are the forward weights, $\mathbf{\Omega} \in \mathbb{R}^{N \times N}$ are the recurrent weights, and $\mathbf{s}(t) \in \mathbb{R}^{N}$ are the neural spike trains. A spike is generated whenever a neuron's voltage crosses its threshold $T_i$, and a spike train is described as a sum of Dirac delta functions, $s_i(t) = \sum_{t_j} \delta(t - t_j)$. Whenever a neuron spikes, its voltage is reset to a resting potential $R_i$. From here on we generally leave out the time notation $(t)$ for notational clarity, unless it is important for a specific derivation step.
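As an illustration of how these dynamics could be simulated, the sketch below performs one forward-Euler step of Eq. (7) with a threshold-and-reset rule; the integration order and the $1/\delta t$ approximation of the Dirac delta are our own simulation choices, not a statement of the exact implementation.

\begin{verbatim}
import numpy as np

def step_network(V, x, F, Omega, T, R_rest, lam=1.0, dt=0.01):
    """One forward-Euler step of Eq. (7) followed by threshold and reset."""
    V = V + dt * (-lam * V + F @ x)         # leak plus feed-forward drive
    spiking = V >= T                        # threshold crossings
    s = np.where(spiking, 1.0 / dt, 0.0)    # delta spikes approximated over one step
    V = np.where(spiking, R_rest, V)        # reset spiking neurons to rest
    V = V + dt * (Omega @ s)                # recurrent effect of the emitted spikes
    return V, s
\end{verbatim}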

4.2.2 Dynamical System

We consider a $K$-dimensional linear dynamical system over which we want to directly exert control using spike events. The system is described by the following equation:

\begin{equation}
\dot{\mathbf{x}} = \mathbf{A}\mathbf{x} + \mathbf{B}\mathbf{s}, \tag{8}
\end{equation}

where $\mathbf{x} \in \mathbb{R}^{K}$ represents the state of the system (one value for each dimension of the system), $\mathbf{A} \in \mathbb{R}^{K \times K}$ is a matrix which defines the transformations of $\mathbf{x}$ that give rise to the dynamics (for instance, oscillatory behavior), $\mathbf{B} \in \mathbb{R}^{K \times N}$ is a matrix that maps the spike events onto the system, and $\mathbf{s} \in \mathbb{R}^{N}$ are the spike trains, which represent our control signal.

Since the system is linear, we can solve it analytically (Sec. 4.1) to derive the state of the system at a future time $f$, starting from the current state $\mathbf{x}_0$.

No spike condition

In the absence of any spiking control signal, we obtain:

\begin{equation}
\mathbf{x}_f = e^{\mathbf{A}f}\mathbf{x}_0, \tag{9}
\end{equation}

where $\mathbf{x}_0 = \mathbf{x}(t)$ is the current state of the system, $\mathbf{x}_f = \mathbf{x}(t+f)$ is the predicted state after $f$ seconds, and $e^{\mathbf{A}f}$ is the matrix exponential of $\mathbf{A}$ times the future window $f$. We introduce the shorthand $e^{\mathbf{A}f} = \mathbf{A}_f$ for ease of writing, resulting in:

\begin{equation}
\mathbf{x}_f = \mathbf{A}_f\mathbf{x}_0. \tag{10}
\end{equation}

Neuron $i$ spikes condition

We now consider the effect of a single neuron $i$ spiking on the current state of the system, and how that affects the state of the system $f$ seconds into the future. For the condition where neuron $i$ spikes, the spike vector $\mathbf{s}$ is zero for all elements and one for the element corresponding to the spiking neuron $i$. The operation $\mathbf{B}\mathbf{s}$ in Eq. (8) then amounts to selecting the $i$th column of matrix $\mathbf{B}$, which we denote $\mathbf{b}_i$. We can write the predicted state of the system after a spike as:

\begin{equation}
\mathbf{x}_f = \mathbf{A}_f(\mathbf{x}_0 + \mathbf{b}_i), \tag{11}
\end{equation}

where $\mathbf{b}_i$ is the $i$th column of the matrix $\mathbf{B}$.
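In simulation, both predictions reduce to a single matrix exponential; a minimal sketch (our own helper, using scipy.linalg.expm):

\begin{verbatim}
import numpy as np
from scipy.linalg import expm

def predicted_states(A, B, x0, f):
    """Predicted state f seconds ahead without a spike (Eq. 10)
    and with a single spike from each neuron i (Eq. 11)."""
    A_f = expm(A * f)                   # shorthand A_f = e^{Af}
    x_no_spike = A_f @ x0               # Eq. (10)
    x_spike = A_f @ (x0[:, None] + B)   # Eq. (11), one column per neuron
    return A_f, x_no_spike, x_spike
\end{verbatim}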

4.2.3 Losses

To control the system, we set a target state $\mathbf{z}$ which holds one value to be reached for each dimension of the system. To quantify the distance from this target, we define a loss as the $L^2$ norm of the difference between the target and the predicted state:

\begin{equation}
L = ||\mathbf{z} - \mathbf{x}_f||^2 + \mu||\mathbf{s}||^2, \tag{12}
\end{equation}

where $\mathbf{z} \in \mathbb{R}^{K}$ is the target state, and $\mu$ is a parameter that scales the cost of each spike. In the most general case, we can weight the part of this loss that describes the distance to the target. We introduce the matrix $\mathbf{C}$, which represents a cost on the dimensions of the system. This allows us to penalize deviations along specific dimensions of the system, weighting the input to the controller.

The loss then becomes:

\begin{equation}
L = (\mathbf{z} - \mathbf{x}_f)^T\mathbf{C}(\mathbf{z} - \mathbf{x}_f) + \mu||\mathbf{s}||^2, \tag{13}
\end{equation}

where $\mathbf{C} \in \mathbb{R}^{K \times K}$ is a cost matrix over the dimensions of the system. We can express this loss in the presence and absence of spikes.

Loss with no spike

\begin{equation}
L_{\text{ns}} = (\mathbf{z} - \mathbf{A}_f\mathbf{x}_0)^T\mathbf{C}(\mathbf{z} - \mathbf{A}_f\mathbf{x}_0). \tag{14}
\end{equation}
Loss with spike

\begin{equation}
L_{\text{s}} = (\mathbf{z} - \mathbf{A}_f(\mathbf{x}_0 + \mathbf{b}_i))^T\mathbf{C}(\mathbf{z} - \mathbf{A}_f(\mathbf{x}_0 + \mathbf{b}_i)) + \mu||\mathbf{s}||^2. \tag{15}
\end{equation}

If only neuron $i$ spikes in the current time step, $\mathbf{s}$ is a vector containing only zeros except for the $i$th position, which is one. We name this vector $\mathbf{s}_i$. Under this assumption, we can rewrite $||\mathbf{s}||^2$ as:

\begin{equation}
||\mathbf{s}||^2 = \mathbf{s}_i^T\mathbf{s}_i = 1, \tag{16}
\end{equation}

which would give us the following loss:

\begin{equation}
L_{\text{s}} = (\mathbf{z} - \mathbf{A}_f(\mathbf{x}_0 + \mathbf{b}_i))^T\mathbf{C}(\mathbf{z} - \mathbf{A}_f(\mathbf{x}_0 + \mathbf{b}_i)) + \mu. \tag{17}
\end{equation}
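The greedy spike decision then amounts to evaluating Eqs. (14) and (17) for every neuron; a minimal sketch (function and variable names are ours):

\begin{verbatim}
import numpy as np

def greedy_spike_proposals(A_f, B, C, x0, z, mu):
    """Return, for each neuron i, whether spiking would not increase the loss."""
    err_ns = z - A_f @ x0
    L_ns = err_ns @ C @ err_ns                   # Eq. (14)
    L_s = np.empty(B.shape[1])
    for i in range(B.shape[1]):                  # loop over neurons
        err_s = z - A_f @ (x0 + B[:, i])
        L_s[i] = err_s @ C @ err_s + mu          # Eq. (17)
    return L_s <= L_ns
\end{verbatim}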

4.2.4 Neural Variables

As mentioned in Sec. 3.2, previous works conceptualized spiking neurons as capable of directly encoding dynamical variables, and derived a "greedy" spiking rule where a neuron $i$ fires whenever this results in a decrease of a cost function, which usually measures the distance between the dynamical variable $x$ and its estimate $\hat{x}$, or a target $z$. This is a fully local rule: a neuron's decision to fire depends solely on its current membrane potential and its threshold, which in turn depend on information locally accessible to the neuron at that moment. The neuron does not need global information about the activity of the entire network. This rule guarantees that each spike emitted by a neuron contributes to improving the representation (or the control) of the dynamical variable, and therefore has a clear interpretation within the computational objective of the network. We derive our network starting from this simple firing condition:

\begin{equation}
L_{\text{s}} \leq L_{\text{ns}} \tag{18}
\end{equation}

from which:

\begin{equation}
\begin{split}
&(\mathbf{z} - \mathbf{A}_f\mathbf{x}_0 - \mathbf{A}_f\mathbf{b}_i)^T\mathbf{C}(\mathbf{z} - \mathbf{A}_f\mathbf{x}_0 - \mathbf{A}_f\mathbf{b}_i) + \mu \\
&- (\mathbf{z} - \mathbf{A}_f\mathbf{x}_0)^T\mathbf{C}(\mathbf{z} - \mathbf{A}_f\mathbf{x}_0) \leq 0.
\end{split}
\tag{19}
\end{equation}

Expanding the squares:

\begin{equation}
-2(\mathbf{A}_f\mathbf{b}_i)^T\mathbf{C}(\mathbf{z} - \mathbf{A}_f\mathbf{x}_0) + \mu + (\mathbf{A}_f\mathbf{b}_i)^T\mathbf{C}(\mathbf{A}_f\mathbf{b}_i) \leq 0 \tag{20}
\end{equation}

and rearranging these terms, we obtain the inequality:

\begin{equation}
\frac{\mathbf{b}_i^T\mathbf{A}_f^T\mathbf{C}\mathbf{A}_f\mathbf{b}_i}{2} + \mu \leq \mathbf{b}_i^T\mathbf{A}_f^T\mathbf{C}(\mathbf{z} - \mathbf{A}_f\mathbf{x}_0). \tag{21}
\end{equation}

On the left side of this inequality, all the factors are time-independent. We can attribute these to the threshold of neuron $i$,

\begin{equation}
T_i = \frac{\mathbf{b}_i^T\mathbf{A}_f^T\mathbf{C}\mathbf{A}_f\mathbf{b}_i}{2} + \mu. \tag{22}
\end{equation}

On the right side, all the time-dependent factors can be attributed to the voltage of that neuron:

\begin{equation}
V_i = \mathbf{b}_i^T\mathbf{A}_f^T\mathbf{C}(\mathbf{z} - \mathbf{A}_f\mathbf{x}_0). \tag{23}
\end{equation}

This gives us a spiking condition for neuron $i$: $V_i \geq T_i$.
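Equivalently, the greedy decision can be checked through the derived voltage and threshold; a short sketch of Eqs. (22) and (23) for all neurons at once (names are ours, using the matrix $\mathbf{G}$ defined further below):

\begin{verbatim}
import numpy as np

def voltages_and_thresholds(A_f, B, C, x0, z, mu):
    """Per-neuron V_i (Eq. 23) and T_i (Eq. 22); spiking allowed where V >= T."""
    G = B.T @ A_f.T @ C                                # see Eq. (25) below
    V = G @ (z - A_f @ x0)                             # Eq. (23)
    T = 0.5 * np.einsum('ij,ji->i', G, A_f @ B) + mu   # Eq. (22): diag(G A_f B)/2 + mu
    return V, T, V >= T
\end{verbatim}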

If we do not impose any constraint on the dimensions of the system, the matrix $\mathbf{C}$ becomes an identity matrix $\mathbf{I}$, and we recover the voltage and threshold values described in Eq. (4) and Eq. (5) of Sec. 2.1.2.

Stepping back from the single-neuron perspective and considering instead the voltages of all the neurons in the network, we obtain:

\begin{equation}
\mathbf{V} = \mathbf{B}^T\mathbf{A}_f^T\mathbf{C}(\mathbf{z} - \mathbf{A}_f\mathbf{x}_0). \tag{24}
\end{equation}

From there, we define:

\begin{equation}
\mathbf{G} = \mathbf{B}^T\mathbf{A}_f^T\mathbf{C}, \tag{25}
\end{equation}

obtaining:

\begin{equation}
\mathbf{V} = \mathbf{G}(\mathbf{z} - \mathbf{A}_f\mathbf{x}_0). \tag{26}
\end{equation}

We now consider the rate of change of $\mathbf{V}$, as well as the rate of change of all time-dependent variables:

\begin{equation}
\dot{\mathbf{V}} = \mathbf{G}(\dot{\mathbf{z}} - \mathbf{A}_f\dot{\mathbf{x}}_0), \tag{27}
\end{equation}

where the change of the current state of the system over time, $\dot{\mathbf{x}}_0$, is given by Eq. (8):

\begin{equation}
\dot{\mathbf{x}}_0 = \mathbf{A}\mathbf{x}_0 + \mathbf{B}\mathbf{s}. \tag{28}
\end{equation}

Substituting this into Eq. (27) yields:

\begin{align}
\dot{\mathbf{V}} &= \mathbf{G}(\dot{\mathbf{z}} - \mathbf{A}_f(\mathbf{A}\mathbf{x}_0 + \mathbf{B}\mathbf{s})) \tag{29} \\
&= \mathbf{G}(\dot{\mathbf{z}} - \mathbf{A}_f\mathbf{A}\mathbf{x}_0 - \mathbf{A}_f\mathbf{B}\mathbf{s}) \tag{30} \\
&= \mathbf{G}\dot{\mathbf{z}} - \mathbf{G}\mathbf{A}_f\mathbf{A}\mathbf{x}_0 - \mathbf{G}\mathbf{A}_f\mathbf{B}\mathbf{s}. \tag{31}
\end{align}

We then add and subtract $\mathbf{V}$ on the right side of the equation. We leave the $-\mathbf{V}$ as is, while we substitute the $+\mathbf{V}$ using the definition of $\mathbf{V}$ from Eq. (26), obtaining:

\begin{equation}
\dot{\mathbf{V}} = -\mathbf{V} + \mathbf{G}\dot{\mathbf{z}} - \mathbf{G}\mathbf{A}_f\mathbf{A}\mathbf{x}_0 + \mathbf{G}\mathbf{z} - \mathbf{G}\mathbf{A}_f\mathbf{x}_0 - \mathbf{G}\mathbf{A}_f\mathbf{B}\mathbf{s}; \tag{32}
\end{equation}

we can rewrite this as

\begin{equation}
\dot{\mathbf{V}} = -\mathbf{V} + \mathbf{G}(\dot{\mathbf{z}} + \mathbf{z}) - \mathbf{G}\mathbf{A}_f(\mathbf{A} + \mathbf{I})\mathbf{x}_0 - \mathbf{G}\mathbf{A}_f\mathbf{B}\mathbf{s}. \tag{33}
\end{equation}

We can trace this back to the form of a recurrent network of integrate-and-fire neurons:

\begin{equation}
\dot{\mathbf{V}} = -\mathbf{V} + \mathbf{G}(\dot{\mathbf{z}} + \mathbf{z}) - \mathbf{F}\mathbf{x}_0 - \mathbf{\Omega}\mathbf{s}, \tag{34}
\end{equation}

where $\mathbf{F} = \mathbf{G}\mathbf{A}_f(\mathbf{A} + \mathbf{I})$ and $\mathbf{\Omega} = \mathbf{G}\mathbf{A}_f\mathbf{B} = \mathbf{B}^T\mathbf{A}_f^T\mathbf{C}\mathbf{A}_f\mathbf{B}$.

This full network equation describes the changes in $\mathbf{V} \in \mathbb{R}^{N}$, the membrane potentials. Here, $\mathbf{G} \in \mathbb{R}^{N \times K}$ are the target input weights: forward connections that encode the target $\mathbf{z}$. Considering both the change of the target value over time, $\dot{\mathbf{z}}$, and the actual value of the target, $\mathbf{z}$, allows the network to trace rapid changes of the target. Only encoding the current value of $\mathbf{z}$ would still allow the system to be controlled, but with slower tracing of the target. The matrix $\mathbf{F} \in \mathbb{R}^{N \times K}$ contains the state input weights, the forward connections that encode the current system state $\mathbf{x}_0$; $\mathbf{\Omega} \in \mathbb{R}^{N \times N}$ are the recurrent weights, and $\mathbf{s}$ are the spikes. Starting from a greedy spiking condition, we thus obtained a network of leaky integrate-and-fire (LIF) neurons in the form of Eq. (7). Such a network has a voltage decay term, forward weights that map the target onto the network, forward weights that encode the system state in input, recurrent connectivity, and spikes, and is able to control our linear dynamical system. For a $K$-dimensional system to control and a network of $N$ neurons, $\mathbf{B}$ is a $K \times N$ matrix, while $\mathbf{A}_f$ and $\mathbf{C}$ are $K \times K$ matrices. Thus, $\mathbf{\Omega} = \mathbf{B}^T\mathbf{A}_f^T\mathbf{C}\mathbf{A}_f\mathbf{B}$ is of rank at most $K$. Hence, our spiking control solution uses a network with low-rank connectivity (low-rank with respect to the system that it is controlling).
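The closed-form connectivity can therefore be assembled directly from the system matrices; a minimal sketch (our own helper function, under the definitions above):

\begin{verbatim}
import numpy as np
from scipy.linalg import expm

def network_connectivity(A, B, C, f):
    """Target weights G (Eq. 25), state weights F, and recurrent weights Omega (Eq. 34)."""
    A_f = expm(A * f)
    G = B.T @ A_f.T @ C
    F = G @ A_f @ (A + np.eye(A.shape[0]))
    Omega = G @ A_f @ B        # = B^T A_f^T C A_f B, rank at most K
    return G, F, Omega
\end{verbatim}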

4.3 Relation to spike coding networks

Works on SCNs use a derivation very similar to the one described in Sec. 4.2. In these works, a read-out is usually defined in the following form:

\begin{equation}
\mathbf{y} = \mathbf{D}\mathbf{r}, \tag{35}
\end{equation}

where $\mathbf{D} \in \mathbb{R}^{K \times N}$ contains the decoding or output weights of all neurons, and $\mathbf{y} \in \mathbb{R}^{K}$ is the $K$-dimensional network read-out (sometimes also seen as the network state). $\mathbf{r} \in \mathbb{R}^{N}$ represents some form of filtered spike trains, ranging from simple exponential filtering (such that $\dot{\mathbf{r}} = -\lambda\mathbf{r} + \mathbf{s}$) to more complex post-synaptic response kernels. In both cases, spiking is optimized such that the output of the network progressively tracks its input $\mathbf{x}$, and a loss between these two quantities can be defined:

\begin{equation}
L = ||\mathbf{x} - \mathbf{D}\mathbf{r}||^2. \tag{36}
\end{equation}

Much like in our derivation, voltage dynamics are derived from this loss by considering the rule that a spike should only occur if $L(\text{neuron spikes}) < L(\text{no spike})$, which we took inspiration from for the control case. Works that make use of complex kernels either also employed some prediction of the future state \citepschwemmerConstructing2015, or used a delayed signal to explicitly model the difference between past and current state \citepzeldenrustEfficient2021.
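For comparison with our derivation, the SCN read-out of Eq. (35) with exponential filtering can be written in a few lines (a sketch, with names of our choosing):

\begin{verbatim}
import numpy as np

def scn_readout_step(r, s, D, lam=1.0, dt=0.01):
    """One Euler step of r_dot = -lambda r + s, followed by the read-out y = D r."""
    r = r + dt * (-lam * r + s)
    y = D @ r
    return r, y
\end{verbatim}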

These derivations are largely similar to ours, with the crucial differences that we consider downstream linear systems of a general form, and that previous SCN works have mainly considered a single read-out given by filtered spike trains. In this sense, our work can be seen as a generalization of standard SCN paradigms (which exert ‘control’ mainly over their own read-out), which expands the use of similar principles to explicitly control any downstream linear dynamical system.

4.4 Filtered spiking controller

For the filtered spiking controller in Fig. 7B, we used a more direct extension of SCNs. Here, we consider a network that should produce a read-out matching the optimal control signal expected from an LQR controller. This means our network spikes so as to constrain the loss between the network output and the LQR control signal:

\begin{equation}
L = ||\mathbf{u} - \mathbf{D}\mathbf{r}||^2, \tag{37}
\end{equation}

where $\mathbf{u} = -\mathbf{K}(\mathbf{x} - \mathbf{z})$ is the control signal as it would be generated by an LQR controller. When defining the LQR signal to approximate, we consider a $K$-dimensional system controlled by a $W$-dimensional control signal. The matrix $\mathbf{K} \in \mathbb{R}^{W \times K}$ is then the optimal feedback gain matrix, which can be found by solving an algebraic Riccati equation, given assumptions on the cost of state deviations and control actuation \citepbruntonDatadriven2019. This solution ensures that the control law $\mathbf{u} = -\mathbf{K}(\mathbf{x} - \mathbf{z})$ optimally balances tracking performance and control cost when controlling the linear system. Spiking activity that constrains the loss described in Eq. (37) can be achieved by simply using a standard SCN, while considering $-\mathbf{K}(\mathbf{x} - \mathbf{z})$ as the input instead of just $\mathbf{x}$.

Acknowledgments

This publication is part of the project Dutch Brain Interface Initiative (DBI2) with project number 024.005.022 of the research programme Gravitation, which is financed by the Dutch Ministry of Education, Culture and Science (OCW) via the Dutch Research Council (NWO). The authors thank Luc Selen for advice on the model's design and conceptualization.

References

\printbibliography[heading=none]