Computational methods for Markov decision problems

UBC Theses and Dissertations

Featured Collection

UBC Theses and Dissertations

Computational methods for Markov decision problems Shin, Moon Chirl

Abstract

In this thesis we study computational methods for finite discounted Markov decision problems and finite discounted parametric Markov decision problems over an infinite horizon. For the former problem our emphasis is on finding methods to significantly reduce the effort required to determine an optimal policy. We discuss the implementation of Porteus' scalar extrapolation methods in the modified policy iteration algorithm and show that the results using only a final scalar extrapolation will be the same as those obtained by applying scalar extrapolation at each iteration and then using a final scalar extrapolation. Action elimination procedures for policy iteration and modified policy iteration algorithms are presented. The purpose of these techniques is to reduce the size of the action space to be searched in the improvement phase of the algorithm. A method for eliminating non-optimal actions for all subsequent iterations using upper and lower bounds on the optimal expected total discounted return is presented along with procedures for eliminating actions that cannot be part of the policy chosen in the improvement phase of the next iteration. A numerical comparison of these procedures on Howard's automobile replacement problem and on a large randomly generated problem suggests that using modified policy iteration together with one of the single iteration elimination procedures will lead to large savings in the computational time for problems with large state spaces. Modifications of the algorithm to reduce storage space are also discussed. For the finite discounted Markov decision problems in which the reward vector is parameterized by a scalar we present an algorithm to determine the optimal policy for each value of the parameter within an interval. The algorithm is based on using approximations of values to resolve difficulties caused by roundoff error. Also, several action elimination procedures are presented for this problem. Bi-criterion Markov decision problems and Markov decision problems with a single constraint are formulated as parametric Markov decision problems. A numerical comparison of algorithms with and without action elimination procedures is carried out on a two criterion version of Howard's automobile replacement problem. The results suggest that the algorithm with one of the action elimination procedures will lead to efficient solution of this problem.

Item Metadata

Title	Computational methods for Markov decision problems
Creator	Shin, Moon Chirl
Publisher	University of British Columbia
Date Issued	1980
Description	In this thesis we study computational methods for finite discounted Markov decision problems and finite discounted parametric Markov decision problems over an infinite horizon. For the former problem our emphasis is on finding methods to significantly reduce the effort required to determine an optimal policy. We discuss the implementation of Porteus' scalar extrapolation methods in the modified policy iteration algorithm and show that the results using only a final scalar extrapolation will be the same as those obtained by applying scalar extrapolation at each iteration and then using a final scalar extrapolation. Action elimination procedures for policy iteration and modified policy iteration algorithms are presented. The purpose of these techniques is to reduce the size of the action space to be searched in the improvement phase of the algorithm. A method for eliminating non-optimal actions for all subsequent iterations using upper and lower bounds on the optimal expected total discounted return is presented along with procedures for eliminating actions that cannot be part of the policy chosen in the improvement phase of the next iteration. A numerical comparison of these procedures on Howard's automobile replacement problem and on a large randomly generated problem suggests that using modified policy iteration together with one of the single iteration elimination procedures will lead to large savings in the computational time for problems with large state spaces. Modifications of the algorithm to reduce storage space are also discussed. For the finite discounted Markov decision problems in which the reward vector is parameterized by a scalar we present an algorithm to determine the optimal policy for each value of the parameter within an interval. The algorithm is based on using approximations of values to resolve difficulties caused by roundoff error. Also, several action elimination procedures are presented for this problem. Bi-criterion Markov decision problems and Markov decision problems with a single constraint are formulated as parametric Markov decision problems. A numerical comparison of algorithms with and without action elimination procedures is carried out on a two criterion version of Howard's automobile replacement problem. The results suggest that the algorithm with one of the action elimination procedures will lead to efficient solution of this problem.
Genre	Thesis/Dissertation
Type	Text
Language	eng
Date Available	2010-03-23
Provider	Vancouver : University of British Columbia Library
Rights	For non-commercial purposes only, such as research, private study and education. Additional conditions apply, see Terms of Use https://open.library.ubc.ca/terms_of_use.
DOI	10.14288/1.0095141
URI	http://hdl.handle.net/2429/22353
Degree	Doctor of Philosophy - PhD
Program	Business Administration - Management Science
Affiliation	Business, Sauder School of; Operations and Logistics (OPLOG), Division of
Degree Grantor	University of British Columbia
Campus	UBCV
Scholarly Level	Graduate
Aggregated Source Repository	DSpace

Item Media

UBC_1980_A1 S46.pdf -- 7.96MB

Item Citations and Data

Rights

For non-commercial purposes only, such as research, private study and education. Additional conditions apply, see Terms of Use https://open.library.ubc.ca/terms_of_use.

Open Collections

UBC Theses and Dissertations

Computational methods for Markov decision problems Shin, Moon Chirl

Abstract

Item Metadata

Item Media

Item Citations and Data

Rights