Abstract
Despite their superior accuracy compared to simpler linear models, kernelized models can be prohibitively expensive in applications where the training set changes frequently, since training them is computationally intensive. We provide bounds on the change in a kernelized model when its training set has changed, as well as bounds on the predictions of the new, hypothetical model on new data. Our bounds support any kernelized model with \(L_2\) regularization and a convex, differentiable loss. The bounds can be computed incrementally as the data changes, much faster than re-computing the model. We apply our bounds to three applications: active learning, leave-one-out cross-validation, and online learning in the presence of concept drifts. We demonstrate empirically that the bounds are tight, and that the proposed algorithms can reduce costly model re-computations by up to 10 times, without harming accuracy.
Notes
- 1.
Given an objective function of the form \(a \sum _{i \in \mathcal {D}} \ell _i(\beta ) + b \Vert \beta \Vert ^2 \), choosing \(C = \frac{a}{2 b}\) will bring it to the standard form (1); see the worked rescaling after these notes.
- 2.
Our previous work [28] includes similar bounds for linear models, but only provides a sketch of the proof. Here we provide bounds and the full proof for both kernelized and linear models.
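As a worked illustration of the rescaling in note 1, assuming the standard form (1) is \(\min_{\beta} \; C \sum_{i \in \mathcal{D}} \ell_i(\beta) + \frac{1}{2} \Vert \beta \Vert^2\), choosing \(C = \frac{a}{2b}\) gives
\[
a \sum_{i \in \mathcal{D}} \ell_i(\beta) + b \Vert \beta \Vert^2
= 2b \left( \frac{a}{2b} \sum_{i \in \mathcal{D}} \ell_i(\beta) + \frac{1}{2} \Vert \beta \Vert^2 \right)
= 2b \left( C \sum_{i \in \mathcal{D}} \ell_i(\beta) + \frac{1}{2} \Vert \beta \Vert^2 \right),
\]
so the two objectives differ only by the positive constant factor \(2b\) and therefore share the same minimizer.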
References
Bottou, L., Lin, C.J.: Support vector machine solvers. In: Large Scale Kernel Machines, pp. 301–320 (2007)
Boyd, S., Vandenberghe, L.: Convex Optimization (2004)
Cauwenberghs, G., Poggio, T.: Incremental and decremental support vector machine learning. In: NIPS 2000, pp. 388–394 (2000)
Chang, C.C., Lin, C.J.: LIBSVM: a library for support vector machines. ACM Trans. Intell. Syst. Technol. 2, 27:1–27:27 (2011)
Diamond, S., Boyd, S.: CVXPY: a python-embedded modeling language for convex optimization. JMLR 17(83), 1–5 (2016)
Dua, D., Graff, C.: UCI machine learning repository (2017)
Duarte, M.F., Hu, Y.H.: Vehicle classification in distributed sensor networks. J. Parallel Distrib. Comput. 64(7), 826–838 (2004)
Elwell, R., Polikar, R.: Incremental learning of concept drift in nonstationary environments. IEEE Trans. Neural Netw. 22, 1517–1531 (2011)
Fine, S., Scheinberg, K.: Incremental learning and selective sampling via parametric optimization framework for SVM. In: NIPS 2001, pp. 705–711 (2001)
Gabel, M., Keren, D., Schuster, A.: Monitoring least squares models of distributed streams. In: KDD 2015, pp. 319–328 (2015)
Gama, J., Medas, P., Castillo, G., Rodrigues, P.: Learning with drift detection. In: Bazzan, A.L.C., Labidi, S. (eds.) SBIA 2004. LNCS (LNAI), vol. 3171, pp. 286–295. Springer, Heidelberg (2004). https://doi.org/10.1007/978-3-540-28645-5_29
Gama, J., Sebastião, R., Rodrigues, P.P.: On evaluating stream learning algorithms. Mach. Learn. 90(3), 317–346 (2013). https://doi.org/10.1007/s10994-012-5320-9
Hanada, H., Shibagaki, A., Sakuma, J., Takeuchi, I.: Efficiently monitoring small data modification effect for large-scale learning in changing environment. In: AAAI (2018)
Hofmann, T., Schölkopf, B., Smola, A.: Kernel methods in machine learning. Ann. Stat. 36, 1171–1220 (2007)
Jaakkola, T.S., Haussler, D.: Probabilistic kernel regression models. In: Proceedings of the 1999 Conference on AI and Statistics (1999)
Joachims, T.: Estimating the generalization performance of an SVM efficiently. In: ICML 2000, pp. 431–438 (2000)
Kamp, M., Bothe, S., Boley, M., Mock, M.: Communication-efficient distributed online learning with kernels. In: Frasconi, P., Landwehr, N., Manco, G., Vreeken, J. (eds.) ECML PKDD 2016, Part II. LNCS (LNAI), vol. 9852, pp. 805–819. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-46227-1_50
Karasuyama, M., Takeuchi, I.: Multiple incremental decremental learning of support vector machines. In: NIPS 2009, pp. 907–915 (2009)
Kivinen, J., Smola, A.J., Williamson, R.C.: Online learning with kernels. IEEE Trans. Signal Process. 52(8), 2165–2176 (2004)
Lewis, D.D., Gale, W.A.: A sequential algorithm for training text classifiers. In: Proceedings of the 17th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR 1994, pp. 3–12 (1994)
Liu, Y., Jiang, S., Liao, S.: Efficient approximation of cross-validation for kernel methods using Bouligand influence function. In: ICML 2014, pp. I-324–I-332 (2014)
Maalouf, M., Trafalis, T.B., Adrianto, I.: Kernel logistic regression using truncated Newton method. Comput. Manage. Sci. 8(4), 415–428 (2011)
Molinaro, A.M., Simon, R., Pfeiffer, R.M.: Prediction error estimation: a comparison of resampling methods. Bioinformatics 21(15), 3301–3307 (2005)
Okumura, S., Suzuki, Y., Takeuchi, I.: Quick sensitivity analysis for incremental data modification and its application to leave-one-out CV in linear classification problems. In: KDD 2015, pp. 885–894 (2015)
Orabona, F., Keshet, J., Caputo, B.: Bounded kernel-based online learning. J. Mach. Learn. Res. 10, 2643–2666 (2009)
Pan, S.J., Yang, Q.: A survey on transfer learning. TKDE 22(10), 1345–1359 (2010)
Schölkopf, B.: The kernel trick for distances. In: NIPS 2000, pp. 283–289 (2000)
Sivan, H., Gabel, M., Schuster, A.: Online linear models for edge computing. In: ECML PKDD 2019 (2019)
Santosh, K.C., Hegadi, R.S. (eds.): RTIP2R 2018, Part II. CCIS, vol. 1036. Springer, Singapore (2019). https://doi.org/10.1007/978-981-13-9184-2
Tsai, C.H., Lin, C.Y., Lin, C.J.: Incremental and decremental training for linear classification. In: KDD 2014, pp. 343–352 (2014)
Tsang, I.W., Kwok, J.T., Cheung, P.M.: Core vector machines: fast SVM training on very large data sets. J. Mach. Learn. Res. 6, 363–392 (2005)
Vanschoren, J., van Rijn, J.N., Bischl, B., Torgo, L.: OpenML: networked science in machine learning. SIGKDD Explor. 15(2), 49–60 (2013)
Vapnik, V., Chapelle, O.: Bounds on error expectation for support vector machines. Neural Comput. 12(9), 2013–2036 (2000)
Wang, Z., Vucetic, S.: Online passive-aggressive algorithms on a budget. In: Proceedings of Machine Learning Research, vol. 9, pp. 908–915, 13–15 May 2010
Zhang, T.: Leave-one-out bounds for kernel methods. Neural Comput. 15(6), 1397–1437 (2003)
Zhang, Y., Yang, Y.: Cross-validation for selecting a model selection procedure. J. Econom. 187(1), 95–112 (2015)
Copyright information
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Sivan, H., Gabel, M., Schuster, A. (2021). Incremental Sensitivity Analysis for Kernelized Models. In: Hutter, F., Kersting, K., Lijffijt, J., Valera, I. (eds) Machine Learning and Knowledge Discovery in Databases. ECML PKDD 2020. Lecture Notes in Computer Science, vol 12458. Springer, Cham. https://doi.org/10.1007/978-3-030-67661-2_23
DOI: https://doi.org/10.1007/978-3-030-67661-2_23
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-67660-5
Online ISBN: 978-3-030-67661-2
eBook Packages: Computer Science, Computer Science (R0)