
relf: robust regression extended with ensemble loss function


Abstract

Ensemble techniques are powerful approaches that combine several weak learners to build a stronger one. As a meta-learning framework, ensemble techniques can easily be applied to many machine learning methods. Inspired by ensemble techniques, in this paper we propose an ensemble loss function applied to a simple regressor. We then propose a half-quadratic learning algorithm to find the parameters of the regressor and the optimal weights associated with each loss function. Moreover, we show that the proposed loss function is robust in noisy environments and that, for a particular class of loss functions, the ensemble loss is Bayes consistent as well as robust. Experimental evaluations on several data sets demonstrate that the proposed ensemble loss function significantly improves the performance of a simple regressor in comparison with state-of-the-art methods.
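To make the idea concrete, the following is a minimal sketch of how such an ensemble-loss regressor could look: several candidate loss functions are combined with learnable weights, and an alternating procedure (in the spirit of half-quadratic optimization) updates the loss weights and the regressor parameters in turn. The choice of component losses, the softmin weight update, and the IRLS-style refit are all illustrative assumptions, not the authors' actual algorithm.

```python
import numpy as np

# Candidate component losses (assumed; the paper's actual choices may differ).
def sq_loss(r):
    return r ** 2

def abs_loss(r):
    return np.abs(r)

def huber_loss(r, delta=1.0):
    a = np.abs(r)
    return np.where(a <= delta, 0.5 * r ** 2, delta * (a - 0.5 * delta))

LOSSES = [sq_loss, abs_loss, huber_loss]

def ensemble_loss_regression(X, y, n_iter=50, lam=1e-3):
    """Alternating scheme: (1) re-weight the component losses,
    (2) refit a linear regressor under the weighted combination.
    Illustrative only; not the authors' exact half-quadratic update."""
    n, d = X.shape
    w = np.zeros(d)                                   # regressor parameters
    alpha = np.full(len(LOSSES), 1.0 / len(LOSSES))   # loss weights
    for _ in range(n_iter):
        r = X @ w - y                                 # residuals
        # Step 1: softmin weighting -- losses that fit the data better
        # receive larger weights (an assumed update rule).
        avg = np.array([loss(r).mean() for loss in LOSSES])
        alpha = np.exp(-avg) / np.exp(-avg).sum()
        # Step 2: IRLS-style refit -- samples with large combined loss
        # (likely outliers) are down-weighted.
        combined = sum(a * loss(r) for a, loss in zip(alpha, LOSSES))
        s = 1.0 / (combined + 1e-8)
        w = np.linalg.solve((X * s[:, None]).T @ X + lam * np.eye(d),
                            (X * s[:, None]).T @ y)
    return w, alpha
```

As a usage example, calling `w, alpha = ensemble_loss_regression(X, y)` on data with a few grossly corrupted targets should return loss weights `alpha` that favour the more outlier-resistant components, which is the intuition behind the robustness claim above.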




Author information

Corresponding author

Correspondence to Reza Monsefi.


About this article


Cite this article

Hajiabadi, H., Monsefi, R. & Yazdi, H.S. relf: robust regression extended with ensemble loss function. Appl Intell 49, 1437–1450 (2019). https://doi.org/10.1007/s10489-018-1341-9

