Likelihood robust optimization for data-driven problems

  • Original Paper

Abstract

We consider optimal decision-making problems in an uncertain environment. In particular, we consider the case in which the distribution of the input is unknown, yet there is some historical data drawn from that distribution. In this paper, we propose a new type of distributionally robust optimization model, called the likelihood robust optimization (LRO) model, for this class of problems. In contrast to previous work on distributionally robust optimization, which focuses on certain parameters (e.g., mean and variance) of the input distribution, we exploit the historical data and define the accessible distribution set to contain only those distributions under which the observed data achieve a certain level of likelihood. We then formulate the target problem as one of optimizing the expected value of the objective function under the worst-case distribution in that set. Our model avoids the over-conservativeness of some prior robust approaches by ruling out unrealistic distributions while maintaining robustness of the solution for statistically likely outcomes. We present statistical analyses of our model using Bayesian statistics and empirical likelihood theory. Specifically, we characterize the asymptotic behavior of our distribution set and establish the relationship between our model and other distributionally robust models. To test the performance of our model, we apply it to the newsvendor problem and the portfolio selection problem. The test results show that the solutions of our model indeed perform well.
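To make the construction concrete, here is a minimal membership check for a likelihood-based distribution set over finitely many observed outcomes (a sketch only: the counts, the threshold choice, and the function name are illustrative, not taken from the paper):

```python
import numpy as np

# The LRO distribution set over n observed outcomes contains every probability
# vector p on the simplex whose log-likelihood of the historical counts N_i
# reaches a threshold gamma. All numbers below are assumed for illustration.
def in_likelihood_set(p, counts, gamma, tol=1e-9):
    p = np.asarray(p, dtype=float)
    if np.any(p <= 0) or abs(p.sum() - 1.0) > tol:
        return False
    return counts @ np.log(p) >= gamma

counts = np.array([4.0, 3.0, 3.0])       # N_i, with N = 10 observations
p_mle = counts / counts.sum()            # empirical distribution maximizes likelihood
gamma = counts @ np.log(p_mle) - 1.0     # relax the maximum log-likelihood by 1

print(in_likelihood_set(p_mle, counts, gamma))            # True
print(in_likelihood_set([0.8, 0.1, 0.1], counts, gamma))  # False: implausible given the data
```

The LRO model then optimizes the worst-case expected objective over exactly such a set.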


Notes

  1. We note that there is also a vast literature on robust optimization, in which a worst-case parameter (rather than a worst-case distribution) is chosen for each decision made. However, robust optimization is based on a slightly different philosophy than distributionally robust optimization and is usually more conservative. It can be viewed as a special case of distributionally robust optimization in which the distribution set contains only singleton (point-mass) distributions. In view of this, we choose not to include a detailed discussion of this literature in the main text but refer readers to Ben-Tal et al. (2009) and Bertsimas et al. (2011) for comprehensive reviews.

References

  • Ben-Tal A, den Hertog D, De Waegenaere A, Melenberg B, Rennen G (2013) Robust solutions of optimization problems affected by uncertain probabilities. Manage Sci 59(2):341–357

  • Ben-Tal A, El Ghaoui L, Nemirovski A (2009) Robust optimization. Princeton Series in Applied Mathematics. Princeton University Press, Princeton

  • Ben-Tal A, Nemirovski A (1999) Robust solutions of uncertain linear programs. Oper Res Lett 25(1):1–13

  • Bertsimas D, Thiele A (2006) Robust and data-driven optimization: modern decision-making under uncertainty. In: Tutorials in Operations Research, INFORMS

  • Bertsimas D, Brown D, Caramanis C (2011) Theory and applications of robust optimization. SIAM Rev 53(3):464–501

  • Bertsimas D, Gupta V, Kallus N (2013) Data-driven robust optimization. arXiv:1401.0212

  • Bertsimas D, Sim M (2004) The price of robustness. Oper Res 52(1):35–53

  • Bertsimas D, Thiele A (2006) A robust optimization approach to inventory theory. Oper Res 54(1):150–168

  • Boyd S, Vandenberghe L (2004) Convex optimization. Cambridge University Press, Cambridge

  • Calafiore G, El Ghaoui L (2006) On distributionally robust chance-constrained linear programs. J Optim Theory Appl 130(1):1–22

  • Chen X, Sim M, Sun P (2007) A robust optimization perspective on stochastic programming. Oper Res 55(6):1058–1071

  • Delage E (2009) Distributionally robust optimization in context of data-driven problems. PhD thesis

  • Delage E, Ye Y (2008) Distributionally robust optimization under moment uncertainty with application to data-driven problems. Oper Res 58(3):595–612

  • Dupačová J (1987) The minimax approach to stochastic programming and an illustrative application. Stochastics 20(1):73–88

  • Gabrel V, Murat C, Thiele A (2014) Recent advances in robust optimization: an overview. Eur J Oper Res 235(3):471–483

  • Gallego G, Moon I (1993) The distribution free newsboy problem: review and extension. J Oper Res Soc 44(8):825–834

  • Gelman A, Carlin J, Stern HS, Rubin DB (1995) Bayesian data analysis. Chapman and Hall Press, New York

  • Iyengar G (2005) Robust dynamic programming. Math Oper Res 30(2):257–280

  • Johnson N, Kotz S, Balakrishnan N (1994) Continuous univariate distributions, vol 1. Wiley Series in Probability and Statistics

  • Khouja M (1999) The single period newsvendor problem: literature review and suggestions for future research. Omega 27:537–553

  • Luenberger D (1997) Investment science. Oxford University Press, Oxford

  • Mason D, Schuenemeyer J (1983) A modified Kolmogorov–Smirnov test sensitive to tail alternatives. Ann Stat 11(3):933–946

  • Nilim A, El Ghaoui L (2005) Robust control of Markov decision processes with uncertain transition matrices. Oper Res 53(5):780–798

  • Owen A (2001) Empirical likelihood. Chapman and Hall Press, London

  • Pardo L (2005) Statistical inference based on divergence measures. Chapman and Hall Press, London

  • Perakis G, Roels G (2008) Regret in the newsvendor model with partial information. Oper Res 56(1):188–203

  • Popescu I (2007) Robust mean-covariance solutions for stochastic optimization. Oper Res 55(1):98–112

  • Scarf H (1958) A min-max solution of an inventory problem. In: Arrow K, Karlin S, Scarf H (eds) Studies in the mathematical theory of inventory and production. Stanford University Press, California, pp 201–209

  • Scarf H (1959) Bayes solutions of the statistical inventory problem. Ann Math Stat 30(2):490–508

  • Shapiro A, Dentcheva D, Ruszczynski A (2009) Lectures on stochastic programming: modeling and theory. MPS-SIAM Series on Optimization

  • Shapiro A, Kleywegt AJ (2002) Minimax analysis of stochastic programs. Optim Methods Softw 17:523–542

  • So AMS, Zhang J, Ye Y (2009) Stochastic combinatorial optimization with controllable risk aversion level. Math Oper Res 34(3):522–537

  • Wasserman L (2009) All of statistics: a concise course in statistical inference. Springer Texts in Statistics, New York

  • Yue J, Chen B, Wang M (2006) Expected value of distribution information for the newsvendor problem. Oper Res 54(6):1128–1136

  • Žáčková J (1966) On minimax solutions of stochastic linear programming problems. Časopis pro Pěstování Matematiky 91:423–430

  • Zhu Z, Zhang J, Ye Y (2013) Newsvendor optimization with limited distribution information. Optim Methods Softw 28(3):640–667

Acknowledgments

The authors thank Dongdong Ge and Zhisu Zhu for valuable insights and discussions. The authors also thank Erick Delage for insightful discussions and for sharing useful code for the numerical experiments. The research of the first author is supported by the National Science Foundation (NSF) under Grant CMMI-1434541.

Author information

Correspondence to Zizhuo Wang.

Appendix

Proof of Proposition 1

First, we show that problem (4) is a convex optimization problem. To see this, we rewrite (4) in a slightly different but equivalent form:

$$\begin{aligned} \mathop {\text {maximize}}_{\varvec{x},\lambda ,\mu ,\varvec{y}}\quad&\mu +\lambda \left( \gamma +N-\sum _{i=1}^n N_i\log {N_i}\right) -N\lambda \log {\lambda }+\lambda \sum _{i=1}^n N_i\log {y_i}\\ \text {s.t.}\quad&\lambda \ge 0,\quad h(\varvec{x},\varvec{\xi }_i)-\mu \ge y_i\ \ \forall i,\quad \varvec{x}\in D. \end{aligned}$$
(15)

Because the objective is maximized, (15) is an equivalent representation of (4). When \(h(\varvec{x}, \varvec{\xi })\) is concave in \(\varvec{x}\), the constraints are all convex. It remains to verify that the objective function is jointly concave. The first two terms of the objective are linear; therefore, we only need to show that the function \(-N\lambda \log {\lambda } + \lambda \sum _{i=1}^n N_i \log {y_i}\) is jointly concave in \(\lambda \) and \(\varvec{y}\). Consider the Hessian of this function with respect to \(\lambda \) and \(\varvec{y}\):

$$\begin{aligned} H = \begin{pmatrix} -\frac{N}{\lambda } & \frac{N_1}{y_1} & \frac{N_2}{y_2} & \cdots & \frac{N_n}{y_n} \\ \frac{N_1}{y_1} & -\frac{\lambda N_1}{y_1^2} & 0 & \cdots & 0 \\ \frac{N_2}{y_2} & 0 & -\frac{\lambda N_2}{y_2^2} & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ \frac{N_n}{y_n} & 0 & 0 & \cdots & -\frac{\lambda N_n}{y_n^2} \end{pmatrix}. \end{aligned}$$

Now consider the diagonal matrix \(A\) with diagonal entries \(\lambda , y_1,\ldots ,y_n\). Then we have

$$\begin{aligned} A^THA = \begin{pmatrix} -N\lambda & \lambda N_1 & \lambda N_2 & \cdots & \lambda N_n \\ \lambda N_1 & -\lambda N_1 & 0 & \cdots & 0 \\ \lambda N_2 & 0 & -\lambda N_2 & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ \lambda N_n & 0 & 0 & \cdots & -\lambda N_n \end{pmatrix}, \end{aligned}$$

which is a (weakly) diagonally dominant matrix with all negative diagonal entries, and hence negative semidefinite. Since \(\lambda , y_1, \ldots , y_n > 0\), the matrix \(A\) is invertible, so the congruence implies that \(H\) is negative semidefinite as well. Thus the objective function is concave and the optimization problem is a convex optimization problem.
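As a quick numerical illustration of this congruence argument (the positive values of \(\lambda \), \(N_i\), and \(y_i\) below are arbitrary and purely illustrative), one can verify that \(A^THA\) annihilates the all-ones vector and that neither \(A^THA\) nor \(H\) has a positive eigenvalue:

```python
import numpy as np

# Check that H and A^T H A (congruent via the invertible diagonal A) are
# both negative semidefinite for arbitrary positive lam, N_i, y_i.
lam = 0.7
N_i = np.array([4.0, 3.0, 3.0])
y = np.array([1.5, 2.0, 0.8])
N, n = N_i.sum(), len(N_i)

H = np.zeros((n + 1, n + 1))
H[0, 0] = -N / lam
H[0, 1:] = H[1:, 0] = N_i / y
H[1:, 1:] = np.diag(-lam * N_i / y**2)

A = np.diag(np.concatenate(([lam], y)))
M = A.T @ H @ A                     # diagonally dominant with negative diagonal

print(np.linalg.eigvalsh(M).max() <= 1e-9)   # True: A^T H A is NSD
print(np.linalg.eigvalsh(H).max() <= 1e-9)   # True: hence H is NSD
print(np.allclose(M @ np.ones(n + 1), 0.0))  # True: the all-ones vector is in the kernel
```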

Finally, we prove that the worst-case distribution satisfies:

$$\begin{aligned} p_i = \frac{\lambda ^* N_i}{h(\varvec{x}^*, {\varvec{\xi }}_i) - \mu ^*}, \quad \forall i. \end{aligned}$$

Consider the derivative of the Lagrangian function \(L(\varvec{p},\lambda , \mu )\) with respect to \(p_i\). By the KKT conditions, we must have \(p_i\nabla _{p_i} L(\varvec{p}, \lambda , \mu ) = 0\), which implies that \(p_i(h(\varvec{x}^*, \varvec{\xi }_i) -\mu ^*) = \lambda ^* N_i\). Moreover, since (4) contains the term \(\log {(h(\varvec{x}, \varvec{\xi }_i) -\mu )}\), we have \(h(\varvec{x}^*, \varvec{\xi }_i) -\mu ^*\ne 0\), so the expression above is well defined. Thus Proposition 1 is proved. \(\square \)
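For completeness, the convex reformulation (15) can be handed directly to an off-the-shelf conic solver. The sketch below is illustrative only (assumptions: CVXPY with the exponential-cone solver SCS, a toy newsvendor profit \(h(x,\xi _i)=r\min (x,\xi _i)-cx\), and an arbitrary threshold \(\gamma \); none of the data come from the paper's experiments). It rewrites \(-N\lambda \log {\lambda }+\lambda \sum _{i=1}^n N_i\log {y_i}\) as \(-\sum _{i=1}^n N_i\,\mathrm{rel\_entr}(\lambda ,y_i)\), where \(\mathrm{rel\_entr}(a,b)=a\log (a/b)\), so that the objective is recognizably concave, and then recovers the worst-case distribution via the formula in Proposition 1:

```python
import cvxpy as cp
import numpy as np

# Illustrative data (assumed): three distinct demand values and their counts.
xi = np.array([10.0, 20.0, 30.0])   # observed demands xi_i
Ni = np.array([4.0, 3.0, 3.0])      # counts N_i, so N = 10
N, n = Ni.sum(), len(xi)
gamma = Ni @ np.log(Ni / N) - 1.0   # assumed threshold: max log-likelihood minus 1
price, cost = 5.0, 3.0              # toy newsvendor sell price and unit cost

x = cp.Variable(nonneg=True)        # order quantity
lam = cp.Variable(nonneg=True)
mu = cp.Variable()
y = cp.Variable(n)

h = price * cp.minimum(x, xi) - cost * x   # h(x, xi_i), concave in x

# -N*lam*log(lam) + lam*sum_i N_i*log(y_i) == -sum_i N_i*rel_entr(lam, y_i)
entropic = -cp.sum(cp.multiply(Ni, cp.rel_entr(lam * np.ones(n), y)))
objective = mu + lam * (gamma + N - Ni @ np.log(Ni)) + entropic

prob = cp.Problem(cp.Maximize(objective), [h - mu >= y])
prob.solve(solver=cp.SCS)

# Worst-case distribution from Proposition 1: p_i = lam* N_i / (h(x*, xi_i) - mu*)
h_star = price * np.minimum(x.value, xi) - cost * x.value
p = lam.value * Ni / (h_star - mu.value)
print("order quantity:", float(x.value), " worst-case distribution:", p)
```

The recovered \(p\) should sum to one up to solver tolerance, since it is the (feasible) worst-case distribution of the inner problem.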

About this article

Cite this article

Wang, Z., Glynn, P.W. & Ye, Y. Likelihood robust optimization for data-driven problems. Comput Manag Sci 13, 241–261 (2016). https://doi.org/10.1007/s10287-015-0240-3
