DOCUMENT INFORMATION
Basic information
Format | |
---|---|
Pages | 240 |
File size | 2.68 MB |
Content
Posted: 30/08/2020, 07:23
References
Reference | Type | Details | ||
---|---|---|---|---|
1. Agrawal, R.: Sample mean based index policies with O(log n) regret for the multi-armed bandit problem. Adv. Appl. Probab. 27, 1054–1078 (1995) | Book, journal | |||
66. Fu, M.C., Healy, K.J.: Techniques for simulation optimization: an experimental study on an (s, S) inventory system. IIE Trans. 29, 191–199 (1997) | Book, journal | |||
112. Koole, G.: The deviation matrix of the M/M/1/∞ and M/M/1/N queue, with applications to controlled queueing models. In: Proceedings of the 37th IEEE Conference on Decision and Control, pp. 56–59 (1998) | Book, journal | |||
2. Altman, E., Koole, G.: On submodular value functions and complex dynamic programming. Stoch. Models 14, 1051–1072 (1998) | Other | |||
3. Arapostathis, A., Borkar, V.S., Fernández-Gaucherand, E., Ghosh, M.K., Marcus, S.I.: Discrete-time controlled Markov processes with average cost criterion: a survey. SIAM J. Control Optim. 31(2), 282–344 (1993) | Other | |||
4. Auer, P., Cesa-Bianchi, N., Fischer, P.: Finite-time analysis of the multiarmed bandit problem. Mach. Learn. 47, 235–256 (2002) | Other | |||
5. Baglietto, M., Parisini, T., Zoppoli, R.: Neural approximators and team theory for dynamic routing: a receding horizon approach. In: Proceedings of the 38th IEEE Conference on Decision and Control, pp. 3283–3288 (1999) | Other | |||
6. Balakrishnan, V., Tits, A.L.: Numerical optimization-based design. In: Levine, W.S. (ed.) The Control Handbook, pp. 749–758. CRC Press, Boca Raton (1996) | Other | |||
7. Banks, J. (ed.): Handbook of Simulation: Principles, Methodology, Advances, Applications, and Practice. Wiley, New York (1998) | Other | |||
8. Barash, D.: A genetic search in policy space for solving Markov decision processes. In: AAAI Spring Symposium on Search Techniques for Problem Solving Under Uncertainty and Incomplete Information. Stanford University, Stanford (1999) | Other | |||
9. Barto, A., Sutton, R., Anderson, C.: Neuron-like elements that can solve difficult learning control problems. IEEE Trans. Syst. Man Cybern. 13, 835–846 (1983) | Other | |||
10. Benaim, M.: A dynamical system approach to stochastic approximations. SIAM J. Control Optim. 34, 437–472 (1996) | Other | |||
11. Bertsekas, D.P.: Differential training of rollout policies. In: Proceedings of the 35th Allerton Conference on Communication, Control, and Computing (1997) | Other | |||
12. Bertsekas, D.P.: Dynamic Programming and Optimal Control, vol. 1. Athena Scientific, Belmont (2005), vol. 2 (2012) | Other | |||
13. Bertsekas, D.P.: Dynamic programming and suboptimal control: a survey from ASP to MPC. Eur. J. Control 11, 310–334 (2005) | Other | |||
14. Bertsekas, D.P., Castanon, D.A.: Adaptive aggregation methods for infinite horizon dynamic programming. IEEE Trans. Autom. Control 34(6), 589–598 (1989) | Other | |||
15. Bertsekas, D.P., Castanon, D.A.: Rollout algorithms for stochastic scheduling problems. J. Heuristics 5, 89–108 (1999) | Other | |||
16. Bertsekas, D.P., Shreve, S.E.: Stochastic Control: The Discrete Time Case. Academic Press, New York (1978) | Other | |||
17. Bertsekas, D.P., Tsitsiklis, J.N.: Neuro-Dynamic Programming. Athena Scientific, Belmont (1996) | Other | |||
18. Bes, C., Lasserre, J.B.: An on-line procedure in discounted infinite-horizon stochastic optimal control. J. Optim. Theory Appl. 50, 61–67 (1986) | Other | |||