Compressed Event Sensing (CES) Volumes for Event Cameras

International Journal of Computer Vision

Abstract

Deep learning has made significant progress in event-driven applications. However, to be compatible with standard vision networks, most approaches aggregate events into grid-like representations, which obscure crucial temporal information and limit overall performance. To address this issue, we propose a novel event representation called compressed event sensing (CES) volumes. CES volumes preserve the high temporal resolution of event streams by leveraging the sparsity of events and the principles of compressed sensing theory. They effectively capture the frequency characteristics of events in low-dimensional representations, which can be accurately decoded back to the raw high-dimensional event signals. In addition, our theoretical analysis shows that, when integrated with a neural network, CES volumes demonstrate greater expressive power under the neural tangent kernel approximation. Through synthetic phantom validation on dense frame regression and two downstream applications, intensity-image reconstruction and object recognition, we demonstrate the superior performance of CES volumes compared to state-of-the-art event representations.
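
To make the encode-decode idea concrete, the following minimal Python sketch compresses a sparse one-dimensional signal with a random Gaussian sensing matrix and recovers it with orthogonal matching pursuit. The matrix Psi, the dimensions, and the OMP decoder are generic compressed-sensing stand-ins chosen for illustration; they are not the CES construction used in the paper.

import numpy as np

rng = np.random.default_rng(0)
n, m, s = 256, 64, 5                     # signal length, compressed length, sparsity

x = np.zeros(n)                          # s-sparse "event" signal over n time bins
x[rng.choice(n, size=s, replace=False)] = rng.standard_normal(s)

Psi = rng.standard_normal((n, m)) / np.sqrt(m)   # random Gaussian sensing matrix
y = Psi.T @ x                            # low-dimensional measurement, length m

def omp(A, y, s):
    """Orthogonal matching pursuit: greedy recovery of an s-sparse signal."""
    residual, support = y.copy(), []
    for _ in range(s):
        support.append(int(np.argmax(np.abs(A.T @ residual))))
        coef, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
        residual = y - A[:, support] @ coef
    x_hat = np.zeros(A.shape[1])
    x_hat[support] = coef
    return x_hat

x_hat = omp(Psi.T, y, s)                 # decode back to the raw signal
print("relative error:", np.linalg.norm(x - x_hat) / np.linalg.norm(x))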


Data Availability Statement

This work does not propose a new dataset. All the datasets we used are publicly available.

References

  • Alonso, I., & Murillo, A. C. (2019). Ev-segnet: Semantic segmentation for event-based cameras. In: CVPRW. https://doi.org/10.1109/cvprw.2019.00205

  • Arora, S., Du, S. S., Hu, W., Li, Z., Salakhutdinov, R. R., & Wang, R. (2019). On exact computation with an infinitely wide neural net. Advances in Neural Information Processing Systems, 32.

  • Bajwa, W. U., Haupt, J., Sayeed, A. M., & Nowak, R. (2010). Compressed channel sensing: A new approach to estimating sparse multipath channels. Proceedings of the IEEE, 98(6), 1058–1076. https://doi.org/10.1109/jproc.2010.2042415

  • Baldwin, R., Liu, R., Almatrafi, M. M., Asari, V. K., & Hirakawa, K. (2022). Time-ordered recent event (tore) volumes for event cameras. TPAMI. https://doi.org/10.1109/tpami.2022.3172212

  • Basarab, A., Liebgott, H., Bernard, O., Friboulet, D., & Kouamé, D. (2013). Medical ultrasound image reconstruction using distributed compressive sampling. In: International symposium on biomedical imaging, pp. 628–631. IEEE. https://doi.org/10.1109/isbi.2013.6556553.

  • Bi, Y., Chadha, A., Abbas, A., Bourtsoulatze, E., & Andreopoulos, Y. (2019). Graph-based object classification for neuromorphic vision sensing. In: ICCV, pp. 491–501. https://doi.org/10.1109/iccv.2019.00058

  • Bietti, A., & Mairal, J. (2019). On the inductive bias of neural tangent kernels. Advances in Neural Information Processing Systems, 32

  • Candès, E. J., Romberg, J., & Tao, T. (2006). Robust uncertainty principles: Exact signal reconstruction from highly incomplete frequency information. Transactions on Information Theory, 52(2), 489–509. https://doi.org/10.1109/tit.2005.862083

  • Candes, E. J., Wakin, M. B., & Boyd, S. P. (2008). Enhancing sparsity by reweighted l1 minimization. Journal of Fourier Analysis and Applications, 14(5), 877–905. https://doi.org/10.1007/s00041-008-9045-x

  • Carvalho, L., Costa, J. L., Mourão, J., & Oliveira, G. (2024). The positivity of the neural tangent kernel. arXiv preprint arXiv:2404.12928.

  • Chen, L., & Xu, S. (2020). Deep neural tangent kernel and Laplace kernel have the same RKHS. arXiv preprint arXiv:2009.10683.

  • Chen, Z., Cao, Y., Quanquan, G., & Zhang, T. (2020). A generalized neural tangent kernel analysis for two-layer neural networks. Advances in Neural Information Processing Systems, 33, 13363–13373.

  • Donoho, D. L. (2006). Compressed sensing. IEEE Transactions on Information Theory, 52(4), 1289–1306. https://doi.org/10.1109/TIT.2006.871582

  • Eldar, Y. C., & Kutyniok, G. (2012). Compressed sensing: theory and applications. Cambridge University Press.

  • Fei-Fei, L., Fergus, R., & Perona, P. (2006). One-shot learning of object categories. TPAMI, 28(4), 594–611. https://doi.org/10.1109/tpami.2006.79

  • Foucart, S., & Rauhut, H. (2013). Restricted isometry property. In: A Mathematical Introduction to Compressive Sensing, pp. 133–174. Springer, New York, NY.

  • Gallego, G., Delbrück, T., Orchard, G., Bartolozzi, C., Taba, B., Censi, A., Leutenegger, S., Davison, A. J., Conradt, J., & Daniilidis, K., et al. (2020). Event-based vision: A survey. TPAMI, 44(1), 154–180. https://doi.org/10.1109/TPAMI.2020.3008413

  • Gehrig, D., Gehrig, M., Hidalgo-Carrió, J., & Scaramuzza, D. (2020). Video to events: Recycling video datasets for event cameras. In: CVPR, pp. 3586–3595. https://doi.org/10.1109/cvpr42600.2020.00364

  • Gehrig, D., Loquercio, A., Derpanis, K. G., & Scaramuzza, D. (2019). End-to-end learning of representations for asynchronous event-based data. In: ICCV, pp. 5633–5643. https://doi.org/10.1109/iccv.2019.00573

  • Gehrig, M., Shrestha, S. B., Mouritzen, D., & Scaramuzza, D. (2020). Event-based angular velocity regression with spiking networks. In: ICRA, pp. 4195–4202. IEEE, https://doi.org/10.1109/icra40945.2020.9197133.

  • He, K., Zhang, X., Ren, S., & Sun, J. (2016). Deep residual learning for image recognition. In: CVPR, pp. 770–778. https://doi.org/10.1109/cvpr.2016.90

  • Hendrycks, D., & Gimpel, K. (2016). Gaussian error linear units (GELUs). arXiv preprint arXiv:1606.08415. https://doi.org/10.48550/arXiv.1606.08415.

  • Huh, D., Sejnowski, T. J. (2018). Gradient descent for spiking neural networks. In: NeurIPS, 31. https://doi.org/10.48550/arXiv.1706.04698.

  • Jacot, A., Gabriel, F., & Hongler, C. (2018). Neural tangent kernel: Convergence and generalization in neural networks. NeurIPS, 31.

  • Jiang, Z., Zhang, Y., Zou, D., Ren, J., Lv, J., & Liu, Y. (2020) Learning event-based motion deblurring. In: CVPR, pp. 3320–3329. https://doi.org/10.1109/cvpr42600.2020.00338.

  • Kingma, D. P., & Ba, J. (2014). Adam: A method for stochastic optimization. ICLR. https://doi.org/10.48550/arXiv.1412.6980.

  • Lagorce, X., Orchard, G., Galluppi, F., Shi, B. E., & Benosman, R. B. (2016). Hots: a hierarchy of event-based time-surfaces for pattern recognition. TPAMI, 39(7), 1346–1359. https://doi.org/10.1109/tpami.2016.2574707

  • Lee, J. H., Delbruck, T., & Pfeiffer, M. (2016). Training deep spiking neural networks using backpropagation. Frontiers in Neuroscience, 10, 508. https://doi.org/10.3389/fnins.2016.00508.

  • Lin, S., Zhang, J., Pan, J., Jiang, Z., Zou, D., Wang, Y., Chen, J., & Ren, J. (2020). Learning event-driven video deblurring and interpolation. In: ECCV, pp. 695–710. Springer. https://doi.org/10.1007/978-3-030-58598-3_41.

  • Liu, C., & Hui, L. (2023). ReLU soothes the NTK condition number and accelerates optimization for wide neural networks. arXiv preprint arXiv:2305.08813.

  • Maqueda, A. I., Loquercio, A., Gallego, G., García, N., Scaramuzza, D. (2018). Event-based vision meets deep learning on steering prediction for self-driving cars. In: CVPR, pp. 5419–5427. https://doi.org/10.1109/cvpr.2018.00568.

  • Mitrokhin, A., Fermüller, C., Parameshwara, C., & Aloimonos, Y. (2018). Event-based moving object detection and tracking. In: IROS, pp. 1–9. IEEE. https://doi.org/10.1109/iros.2018.8593805.

  • Mitrokhin, A., Ye, C., Fermüller, C., Aloimonos, Y., & Delbruck, T. (2019). Ev-imo: Motion segmentation dataset and learning pipeline for event cameras. In: IROS, pp. 6105–6112. IEEE. https://doi.org/10.1109/iros40897.2019.8968520.

  • Mohtashemi, M., Smith, H., Walburger, D., Sutton, F., & Diggans, J. (2010). Sparse sensing DNA microarray-based biosensor: Is it feasible? In: 2010 IEEE Sensors Applications Symposium, pp. 127–130. IEEE. https://doi.org/10.1109/sas.2010.5439412.

  • Mueggler, E., Rebecq, H., Gallego, G., Delbruck, T., & Scaramuzza, D. (2017). The event-camera dataset and simulator: Event-based data for pose estimation, visual odometry, and slam. The International Journal of Robotics Research, 36(2), 142–149. https://doi.org/10.1177/0278364917691115

  • Neil, D., Pfeiffer, M., Liu, S.-C. (2016). Phased lstm: Accelerating recurrent network training for long or event-based sequences. NeurIPS, 29. https://doi.org/10.48550/arXiv.1610.09513.

  • Nguyen, T. L. N., & Shin, Y. (2013). Deterministic sensing matrices in compressive sensing: A survey. The Scientific World Journal, 2013. https://doi.org/10.1155/2013/192795.

  • Nyquist, H. (1928). Certain topics in telegraph transmission theory. Transactions of the American Institute of Electrical Engineers, 47(2), 617–644. https://doi.org/10.1109/5.989875

  • Orchard, G., Jayawant, A., Cohen, G. K., & Thakor, N. (2015). Converting static image datasets to spiking neuromorphic datasets using saccades. Frontiers in Neuroscience, 9, 437. https://doi.org/10.3389/fnins.2015.00437.

  • Orchard, G., Meyer, C., Etienne-Cummings, R., Posch, C., Thakor, N., & Benosman, R. (2015). Hfirst: A temporal approach to object recognition. TPAMI, 37(10), 2028–2040. https://doi.org/10.1109/tpami.2015.2392947

  • Paszke, A., Gross, S., Chintala, S., Chanan, G., Yang, E., DeVito, Z., Lin, Z., Desmaison, A., Antiga, L., & Lerer, A. (2017). Automatic differentiation in PyTorch.

  • Posch, C., Matolin, D., & Wohlgenannt, R. (2010). A qvga 143 db dynamic range frame-free pwm image sensor with lossless pixel-level video compression and time-domain cds. IEEE Journal of Solid-State Circuits, 46(1), 259–275. https://doi.org/10.1109/jssc.2010.2085952

  • Rebecq, H., Gehrig, D., Scaramuzza, D. (2018). ESIM: an open event camera simulator. In: CoRL.

  • Rebecq, H., Horstschaefer, T., & Scaramuzza, D. (2017). Real-time visual-inertial odometry for event cameras using keyframe-based nonlinear optimization. https://doi.org/10.5244/c.31.16

  • Rebecq, H., Ranftl, R., Koltun, V., & Scaramuzza, D. (2019). High speed and high dynamic range video with an event camera. TPAMI, 43(6), 1964–1980. https://doi.org/10.1109/tpami.2019.2963386

  • Saitoh, S., Sawano, Y., et al. (2016). Theory of reproducing kernels and applications. Springer.

  • Schaefer, S., Gehrig, D., & Scaramuzza, D. (2022). Aegnn: Asynchronous event-based graph neural networks. In: CVPR, pp. 12371–12381. https://doi.org/10.1109/cvpr52688.2022.01205.

  • Scheerlinck, C., Barnes, N., & Mahony, R. (2018). Continuous-time intensity estimation using event cameras. In: ACCV, pp. 308–324. Springer. https://doi.org/10.1007/978-3-030-20873-8_20.

  • Schölkopf, B., Smola, A. J. (2002). Learning with kernels: Support vector machines, regularization, optimization, and beyond. MIT press.

  • Seeger, M. (2004). Gaussian processes for machine learning. International Journal of Neural Systems, 14(02), 69–106.

  • Sekikawa, Y., Hara, K., & Saito, H. (2019). Eventnet: Asynchronous recursive event processing. In: CVPR, pp. 3887–3896. https://doi.org/10.1109/cvpr.2019.00401.

  • Sironi, A., Brambilla, M., Bourdis, N., Lagorce, X., & Benosman, R. (2018). Hats: Histograms of averaged time surfaces for robust event-based object classification. In: CVPR, pp. 1731–1740. https://doi.org/10.1109/cvpr.2018.00186.

  • Tancik, M., Srinivasan, P., Mildenhall, B., Fridovich-Keil, S., Raghavan, N., Singhal, U., Ramamoorthi, R., Barron, J., & Ng, R. (2020). Fourier features let networks learn high frequency functions in low dimensional domains. NeurIPS, 33, 7537–7547. https://doi.org/10.1109/mmul.2021.3053698

  • Wainwright, M. J. (2019). High-dimensional statistics: A non-asymptotic viewpoint, vol. 48. Cambridge university press.

  • Wang, L., Ho, Y.-S., & Yoon, K.-J. et al. (2019). Event-based high dynamic range image and very high frame rate video generation using conditional generative adversarial networks. In: CVPR, pp. 10081–10090. https://doi.org/10.1109/cvpr.2019.01032.

  • Wang, Z., Bovik, A. C. Sheikh, H. R., & Simoncelli, E. P. (2004). Image quality assessment: From error visibility to structural similarity. TIP, 13(4), 600–612. https://doi.org/10.1109/tip.2003.819861.

  • Yang, J., Zhang, Q., Ni, B., Li, L., Liu, J., Zhou, M., & Tian, Q. (2019). Modeling point clouds with self-attention and gumbel subset sampling. In: CVPR, pp. 3323–3332. https://doi.org/10.1109/cvpr.2019.00344.

  • Zhang, H., Chen, X.-H., & Xin-Min, W. (2013). Seismic data reconstruction based on cs and fourier theory. Applied Geophysics, 10(2), 170–180. https://doi.org/10.1007/s11770-013-0375-3

  • Zhang, R., Isola, P., Efros, A. A., Shechtman, E., & Wang, O. (2018). The unreasonable effectiveness of deep features as a perceptual metric. In: CVPR, pp. 586–595. https://doi.org/10.1109/cvpr.2018.00068.

  • Zhang, S., Zhang, Y., Jiang, Z., Zou, D., Ren, J., & Zhou, B. (2020). Learning to see in the dark with events. In: ECCV, pp. 666–682. Springer. https://doi.org/10.1007/978-3-030-58523-5_39.

  • Zhao, B., Ding, R., Chen, S., Linares-Barranco, B., & Tang, H. (2014). Feedforward categorization on aer motion events using cortex-like features in a spiking neural network. TNNLS, 26(9), 1963–1978. https://doi.org/10.1109/tnnls.2014.2362542

  • Zhu, A. Z., & Yuan, L. (2018). Ev-flownet: Self-supervised optical flow estimation for event-based cameras. In: Robotics: Science and Systems. https://doi.org/10.15607/rss.2018.xiv.062.

  • Zhu, A. Z., Yuan, L., Chaney, K., & Daniilidis, K. (2019). Unsupervised event-based learning of optical flow, depth, and egomotion. In: CVPR, pp. 989–997. https://doi.org/10.1109/cvpr.2019.00108

Funding

This work was supported in part by the Ministry of Education, Republic of Singapore, through its Start-Up Grant and Academic Research Fund Tier 1 (RG61/22).

Author information

Contributions

Songnan Lin: Conceptualization, Methodology, Software, Writing - original draft preparation. Ye Ma: Methodology, Software, Writing - review and editing. Jing Chen: Writing - review and editing. Bihan Wen: Conceptualization, Writing - review and editing, Supervision.

Corresponding author

Correspondence to Bihan Wen.

Ethics declarations

Conflict of interest

The authors have no conflicts of interest to declare that are relevant to the content of this article.

Code availability

The code for this work will be released upon acceptance.

Additional information

Communicated by Yasuyuki Matsushita.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Proof of Theorem 1

Theorem 1

Given a non-zero distinct s-sparse dataset \({\mathfrak {X}}=\{\vec {x_i}\}_{i=1}^I\), let \({\mathfrak {H}}_a\) and \({\mathfrak {H}}_b\) be the RKHSs associated with the NTKs of fully-connected networks of the same architecture that take \(\Psi _a^T\vec {x}\) and \(\Psi _b^T\vec {x}\) as input, respectively, where \(\Psi _a\) holds the non-degenerate property while \(\Psi _b\) does not. Then the following strict subset inclusion holds:

$$\begin{aligned} {\mathfrak {H}}_b \subsetneq {\mathfrak {H}}_a. \end{aligned}$$
(16)

We first introduce two key ingredients of the proof:

Lemma 1

(Theorem 2.17 in Saitoh et al. (2016)) Let \(K_a, K_b: E \times E \rightarrow {\mathbb {C}}\) be two positive semi-definite kernels. Then the following two statements are equivalent:

  1. 1.

    The Hilbert space \({\mathfrak {H}}_{b}\) is a subset of \({\mathfrak {H}}_{a}\).

  2. 2.

    There exists \(\gamma > 0\) such that

    $$\begin{aligned} K_b \preceq \gamma ^2 K_a. \end{aligned}$$
    (17)
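
On any finite set of points, the domination condition (17) reduces to a matrix statement: \(K_b \preceq \gamma ^2 K_a\) holds iff \(\gamma ^2 {\textbf{K}}_a - {\textbf{K}}_b\) is positive semidefinite. Below is a minimal numeric sketch of this check, using two generic RBF kernels as stand-ins (not NTKs); the points, bandwidths, and \(\gamma \) values are illustrative assumptions.

import numpy as np

rng = np.random.default_rng(1)
X = rng.standard_normal((6, 4))              # six distinct sample points

def rbf(X, bw):                              # generic strictly PD stand-in kernel
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * bw ** 2))

K_a, K_b = rbf(X, 1.0), rbf(X, 0.3)
for gamma in (0.5, 5.0):                     # small gamma fails, large gamma works
    psd = np.linalg.eigvalsh(gamma ** 2 * K_a - K_b).min() >= -1e-10
    print(f"gamma = {gamma}: K_b <= gamma^2 K_a is {psd}")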

Lemma 2

(Proposition 2 in Jacot et al. (2018); Theorem 6 in Carvalho et al. (2024)) For a fully-connected network with a non-polynomial Lipschitz activation function \(\sigma \) and any input dimension \(n_0\), the limiting NTK is strictly positive definite if the number of layers \(L \ge 2\).
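
Lemma 2 can be probed empirically: at large but finite width, the empirical NTK Gram matrix of a two-layer ReLU network (\(L = 2\)) on distinct unit-norm inputs is numerically strictly positive definite. The finite width makes this a heuristic illustration rather than a proof, and the tiny architecture below is our own choice.

import numpy as np

rng = np.random.default_rng(5)
n0, width, I = 4, 4096, 6
X = rng.standard_normal((I, n0))
X /= np.linalg.norm(X, axis=1, keepdims=True)   # inputs on the unit sphere

W1 = rng.standard_normal((width, n0)) / np.sqrt(n0)   # hidden layer
w2 = rng.standard_normal(width) / np.sqrt(width)      # output layer

def grads(x):   # gradient of f(x) = w2 . ReLU(W1 x) w.r.t. (W1, w2)
    pre = W1 @ x
    act = np.maximum(pre, 0.0)
    dW1 = np.outer(w2 * (pre > 0), x)
    return np.concatenate([dW1.ravel(), act])

G = np.stack([grads(x) for x in X])
K = G @ G.T                                     # empirical NTK Gram matrix
print(np.linalg.eigvalsh(K).min() > 0)          # strictly PD (numerically)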

Proof of Theorem 1

According to Lemma 1, to obtain \({\mathfrak {H}}_b \subsetneq {\mathfrak {H}}_a\), we need to show that \(\gamma ^2 K_a - K_b\) is a positive semidefinite kernel for some \(\gamma > 0\), whereas \(\gamma ^2 K_b - K_a\) is not a positive semidefinite kernel for any \(\gamma > 0\).

Consider an arbitrary non-empty subset \(\{\vec {x_i}\}_{i=1}^r \subset {\mathfrak {X}}\) with \(1 \le r \le I\). For the kernels \(K_a\) and \(K_b\), the corresponding \(r \times r\) NTK matrices \({\textbf{K}}_a\) and \({\textbf{K}}_b\) have entries

$$\begin{aligned} {\textbf{K}}_{a}^{i,j} = K_a (\vec {x_i}, \vec {x_j});~~ {\textbf{K}}_{b}^{i,j} = K_b (\vec {x_i}, \vec {x_j}). \end{aligned}$$
(18)

As introduced in the proposed NTK model, deep learning methods first represent events using a sensing matrix \(\Psi \) and then feed the representation into a neural network g. We denote the NTK of the network g by \(K_g(\cdot , \cdot )\). Therefore, for two different sensing matrices \(\Psi _a\) and \(\Psi _b\), the NTKs of the whole networks can be written as

$$\begin{aligned} \begin{aligned} K_{a}(\vec {x_i}, \vec {x_j})&= K_g(\Psi _a^T \vec {x_i}, \Psi _a^T \vec {x_j})\\ K_{b}(\vec {x_i}, \vec {x_j})&= K_g(\Psi _b^T \vec {x_i}, \Psi _b^T \vec {x_j}). \\ \end{aligned} \end{aligned}$$
(19)
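
To make Equation (19) concrete, the sketch below composes a kernel with the sensing step, using an RBF surrogate in place of the true limiting NTK \(K_g\); the surrogate, the dimensions, and the random \(\Psi _a\) are illustrative assumptions only.

import numpy as np

rng = np.random.default_rng(2)
n, m, I = 32, 8, 5
X = rng.standard_normal((I, n))          # dataset {x_i}
Psi_a = rng.standard_normal((n, m))      # sensing matrix Psi_a

def K_g(U, V):                           # surrogate for the network NTK
    d2 = ((U[:, None, :] - V[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * m))

Z_a = X @ Psi_a                          # rows are Psi_a^T x_i
K_a = K_g(Z_a, Z_a)                      # entries K_g(Psi_a^T x_i, Psi_a^T x_j)
print(K_a.shape)                         # (I, I): the NTK matrix of Eq. (18)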

According to Lemma 2, when we adopt the same network settings as in Jacot et al. (2018), the NTK \(K_g\) of the network is strictly positive definite for distinct network inputs.

Since \(\Psi _a\) holds the non-degenerate property described in Equation (11), the compressed representations \(\Psi _a^T \vec {x_i}\) are distinct. Therefore, the NTK matrix \({\textbf{K}}_a\) is positive definite with eigenvalues \(\lambda _a^i > 0\). In contrast, \(\Psi _b\) does not hold this property; that is, there may exist “degenerate” vector pairs \(\vec {x_i}\) and \(\vec {x_j}\) such that \(\Psi _b^T \vec {x_i} = \Psi _b^T \vec {x_j}\), which produces identical values in the \(i\)-th and \(j\)-th rows of the NTK matrix \({\textbf{K}}_b\). Thus, its eigenvalues only satisfy \(\lambda _b^i \ge 0\).
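
This degeneracy argument can be reproduced numerically: constructing \(\vec {x_2} = \vec {x_1} + \vec {v}\) with \(\vec {v}\) in the null space of \(\Psi _b^T\) forces two identical rows in \({\textbf{K}}_b\) and hence a zero eigenvalue. As before, the surrogate kernel and the dimensions are illustrative assumptions.

import numpy as np

rng = np.random.default_rng(3)
n, m, I = 8, 3, 4
Psi_b = rng.standard_normal((n, m))      # m < n, so Psi_b^T has a null space

_, _, Vt = np.linalg.svd(Psi_b.T)
v = Vt[-1]                               # direction with Psi_b^T v = 0
X = rng.standard_normal((I, n))
X[1] = X[0] + v                          # degenerate pair: same compressed code

def K_g(U):                              # surrogate for the network NTK
    d2 = ((U[:, None, :] - U[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * m))

K_b = K_g(X @ Psi_b)
print(np.allclose(K_b[0], K_b[1]))       # True: identical rows
print(np.linalg.eigvalsh(K_b).min())     # ~0: K_b is singular (up to float error)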

Therefore, for each non-empty subset \(\{\vec {x_i}\}_{i=1}^r\) of \({\mathfrak {X}}\), \(\gamma ^2 {\textbf{K}}_a - {\textbf{K}}_b\) is positive semidefinite when

$$\begin{aligned} \gamma = \sqrt{ \frac{\max _i \lambda _b^i}{\min _i \lambda _a^i}}. \end{aligned}$$
(20)
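
This choice of \(\gamma \) guarantees \(\gamma ^2 {\textbf{K}}_a - {\textbf{K}}_b \succeq (\gamma ^2 \min _i \lambda _a^i - \max _i \lambda _b^i) {\textbf{I}} = 0\). The short numeric sketch below confirms Equation (20), as well as the converse step used later in the proof, with generic stand-in matrices (any \({\textbf{K}}_a \succ 0\) and singular \({\textbf{K}}_b \succeq 0\) serve the purpose).

import numpy as np

rng = np.random.default_rng(4)
A = rng.standard_normal((5, 5))
K_a = A @ A.T + 0.1 * np.eye(5)          # strictly positive definite stand-in
B = rng.standard_normal((5, 4))
K_b = B @ B.T                            # PSD but rank-deficient (singular)

gamma = np.sqrt(np.linalg.eigvalsh(K_b).max() / np.linalg.eigvalsh(K_a).min())
print(np.linalg.eigvalsh(gamma ** 2 * K_a - K_b).min() >= -1e-10)   # True: PSD

# Conversely, K_b is singular: for a null vector u of K_b,
# u^T (g^2 K_b - K_a) u = -u^T K_a u < 0, so no g > 0 makes it PSD.
for g in (1.0, 10.0, 1e3):
    print(np.linalg.eigvalsh(g ** 2 * K_b - K_a).min() < 0)         # True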

Now, let \(\gamma _{max}\) be the maximum of such \(\gamma \) over all non-empty subsets of \({\mathfrak {X}}\). By the definition of positive semidefinite kernels (see Definition 12.6 in Wainwright (2019)), \(\gamma _{max}^2 K_a - K_b\) is a positive semidefinite kernel on \({\mathfrak {X}}\). This enables us to apply Lemma 1 to obtain

$$\begin{aligned} {\mathfrak {H}}_b \subseteq {\mathfrak {H}}_a. \end{aligned}$$
(21)

Conversely, since the eigenvalues of \({\textbf{K}}_b\) contain zero in the degenerate case, \(\gamma ^2 {\textbf{K}}_b - {\textbf{K}}_a\) is not positive semidefinite for any \(\gamma > 0\). Thus, for all \(\gamma > 0\), the kernel function \(\gamma ^2 K_b - K_a\) is not a positive semidefinite kernel. According to Lemma 1, \({\mathfrak {H}}_a\) is not a subset of \({\mathfrak {H}}_b\). Combining this with \({\mathfrak {H}}_b \subseteq {\mathfrak {H}}_a\), we conclude that \({\mathfrak {H}}_b \subsetneq {\mathfrak {H}}_a\), which completes the proof. \(\square \)

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Lin, S., Ma, Y., Chen, J. et al. Compressed Event Sensing (CES) Volumes for Event Cameras. Int J Comput Vis 133, 435–455 (2025). https://doi.org/10.1007/s11263-024-02197-2
