This paper deals with Markov decision processes (MDPs) on a real state space whose minimum is attained, and that are bounded above by (uncontrolled) stochastically ordered (SO) Markov chains. We consider MDPs with (possibly) unbounded costs, and the quality of each policy is evaluated by means of the objective function known as the {\it average cost}. For this objective function we consider two Markov control models, ${\mathbb{P}}$ and ${\mathbb{P}}_{1}$, which have the same components except for their transition laws: the transition law $q$ of $\mathbb{P}$ is assumed to be unknown, while the transition law $q_{1}$ of ${\mathbb{P}}_{1}$ is a known approximation of $q$. Under certain irreducibility, recurrence and ergodicity conditions imposed on the bounding SO Markov chain (these conditions yield the rate of convergence of the $t$-step transition probability, $t=1,2,\ldots$, to the invariant measure), we estimate the difference between the optimal cost of driving $\mathbb{P}$ and the cost incurred when $\mathbb{P}$ is driven by the optimal policy of ${\mathbb{P}}_{1}$. This difference is called {\it the index of perturbations}, and upper bounds for it are provided in this work. An example illustrating the theory developed here is also given.
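As a schematic illustration (the notation $J$, $f^{*}$, $f_{1}^{*}$ below is ours, not taken from the paper): let $J(f,x)$ denote the long-run expected average cost incurred in $\mathbb{P}$ under a stationary policy $f$ with initial state $x$, let $f^{*}$ be an average-optimal policy for $\mathbb{P}$, and let $f_{1}^{*}$ be an average-optimal policy for the approximating model ${\mathbb{P}}_{1}$. The index of perturbations may then be written as
\[
\Delta(x) \;:=\; J(f_{1}^{*},x) \;-\; J(f^{*},x) \;\ge\; 0,
\]
and the upper bounds obtained in the paper control $\Delta$ in terms of the ergodicity rate of the bounding SO Markov chain.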
Keywords: stochastically ordered Markov chains; Lyapunov condition; invariant probability; average Markov decision processes
AMS: 90C40; 93E20