Mathematics, 21.02.2020 17:58, deonceee4671

In a coin game, you repeatedly toss a biased coin (probability 0.4 for heads, 0.6 for tails). Each head is worth 3 points and each tail is worth 1 point. As long as your total number of points is no more than 7, you can either Toss or Stop; otherwise, you must Stop. When you Stop, your utility is equal to your total points (up to 7), or 0 if your total is 8 points or higher. When you Toss, you receive no utility. There is no discounting (γ = 1).

(a) What are the states and the actions for this MDP? Which states are terminal?
(b) What are the transition function and the reward function for this MDP? Hint: The problem may be simpler to formulate using the general version of rewards: R(s, a, s')
(c) Run value iteration to find the optimal value function V* for the MDP. Show each V_k step (starting from V_0(s) = 0 for all states s). For a reasonable MDP formulation, this should converge in fewer than 10 steps. If you find it too tedious to do by hand, you may write a program to do this for you; however, there may be some benefit in seeing the calculation unfold in front of you.
(d) Using the V* you found, determine the optimal policy for this MDP.
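
For part (c), the question allows writing a program. The following is a minimal sketch under one possible formulation, not the only valid one: the non-terminal states are the totals 0 through 7, every total of 8 or more is collapsed into a bust outcome worth 0, and Stop from total s moves to a terminal state with reward s. The names `value_iteration`, `greedy_policy`, and `MAX_POINTS` are illustrative choices, not part of the problem statement.

```python
# Sketch of value iteration for the coin game (assumed formulation:
# states are totals 0..7; totals >= 8 are "bust" and worth 0;
# Stop from total s ends the game with reward s; gamma = 1).
P_HEAD, P_TAIL = 0.4, 0.6          # head adds 3 points, tail adds 1 point
OUTCOMES = ((P_HEAD, 3), (P_TAIL, 1))
MAX_POINTS = 7


def q_values(V, s):
    """Return (Q(s, Stop), Q(s, Toss)) given the current value table V."""
    stop = float(s)                # R(s, Stop, Done) = s
    toss = sum(p * (V[s + d] if s + d <= MAX_POINTS else 0.0)
               for p, d in OUTCOMES)
    return stop, toss


def value_iteration(max_iters=100):
    """Iterate V_k from V_0 = 0 until two successive tables are identical."""
    V = [0.0] * (MAX_POINTS + 1)
    history = [V]                  # history[k] is V_k
    for _ in range(max_iters):
        new_V = [max(q_values(V, s)) for s in range(MAX_POINTS + 1)]
        history.append(new_V)
        if new_V == V:             # no value changed: converged
            break
        V = new_V
    return history[-1], history


def greedy_policy(V):
    """Pick Toss only where it is strictly better than stopping."""
    policy = {}
    for s in range(MAX_POINTS + 1):
        stop, toss = q_values(V, s)
        policy[s] = "Toss" if toss > stop else "Stop"
    return policy


if __name__ == "__main__":
    V_star, history = value_iteration()
    for k, V_k in enumerate(history):
        print(f"V_{k}:", [round(v, 5) for v in V_k])
    print("policy:", greedy_policy(V_star))
```

Under these assumptions the printed tables stop changing after a handful of sweeps, and the resulting policy is to Toss on totals 0 through 4 and Stop on totals 5 through 7. A different but equally reasonable state encoding (for example, keeping 8, 9, and 10 as separate states) would shift the iteration count by a step or so without changing the values of the shared states.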

