代做Gambler’s Ruin、代写Java，Python程序语言、代写C/C++设计-代写Python编程

联系方式

QQ：99515681
邮箱：99515681@qq.com
工作时间：8:00-23:00
微信：codinghelp

您当前位置：首页 >> Python编程Python编程

代做Gambler’s Ruin、代写Java，Python程序语言、代写C/C++设计

日期：2019-12-20 09:08

(Exercise 9) For this week’s exercise I will provide some code below to help you reach

Part II. Recall the Gambler’s Ruin problem on the integers {0, 1, 2, . . . , 10} with bets

at each stage of an amount a ∈ [1, 2, . . . , x], and a fixed (independent) probability

p of winning each bet. A very useful way to formulate Gambler’s Ruin has already

been seen in the notes:

• to have no rewards at all, because instead we . . .

• place a boundary condition on the value function, essentially enforcing that

Vn(x) = 1 for any x ≥ 10. (This is like receiving a reward of +1 after reaching

an x ≥ 10.)

• Note the reward is obtained after moving, not during the action, so steprewards

are 0 and also independent of the action taken.

What we therefore do is as follows: let VT (x) represent the maximized reward

over T steps following an optimal betting strategy starting with wealth x. Then if

during our T steps we reach a state ≥ 10 we will have collected reward of +1, if

we haven’t reached such a state (including all cases where we have gone bankrupt

already) we have a reward of 0. As discussed in lectures this approach will maximize

the probability of reaching such a state ≥ 10.

In part (h) of Exercise 9 you should use the following ‘shell’ for a function:

findValueFunction <− function ( T , p ) {

steps <− 0

# In s e r t your code here c rea ting V

# In s e r t your code here c rea ting W, pu t ting . . .

# in boundary values

while ( steps < T ) {

# In s e r t code here f o r pu t ting in . . .

# boundary values o f W

# In s e r t loop code here f o r pu t ting . . .

# values i n t o W using Op timali ty . . .

# Equation , V , and also s t o rin g the . . .

# best choice a∗ f o r each s ta te x

V <− W steps <− steps + 1}

return ( l i s t ( value=W, act ions=<your . best . action . vector > )}

findValueFunction(100,0.4), for example, will now work out V100(x) when p = 0.4.

Can you see why? It even tells you which action to optimally take initially.

(The idea here is that the code increments t from 1, with V always containing

Vt−1 and being used (at each x) to work out Vt which is then stored in W. Then t

is incremented, W stored into V and the process repeated. The value of t although

not explicit, is always steps+1.)

3-30

Exercise 9 (Al-Khwarizmi). Part I: Solving Gambler’s Ruin.

Our value function for VT (x) satisfies this optimality equation:

VT (x) = max

1≤a≤x

pVT −1(x + a) + (1 − p)VT −1(x − a),

which holds for all T ≥ 1, and x ∈ {1, 2, 3, . . . , 9}. We also have boundary conditions:

• For all T ≥ 0: VT (0) = 0 and VT (x) = 1 for any x ≥ 10;

• V0(x) = 0 for x ≤ 9;

You will use this Optimality Equation (for each x ∈ {1, 2, . . . , 9}) to calculate Vn

from Vn−1 for each n ≥ 1 until you have found V1000(x). (I suggest you fix p = 0.4

until Part II). Parts (a)-(b)-(c) are intended to be very quick, the exercise really starts

at (d).

(a) Why is X = {0, 1, 2, . . . , 18} the full set of reachable state while following the

game rules? For which values of x ∈ X do you already explicilty know the values

of Vn(x), for every n ≥ 0? (i.e. boundary conditions)

(b) Create a vector, V of length 19, to hold the values of V0(·). Note that in R, the

first element of a vector is called element 1, so V[i ] will need to contain V0(i−1).

values from the boundary conditions. Values not known from the boundary

conditions should be set equal to NA.

(d) Define a function of x, a, p and VT −1 which calculates the right-hand side (RHS)

bracket of the optimality equation for the given values.

(e) Identify the valid choices a ∈ Ax. Construct a loop over all valid (x, a) combinations

(we only need 1 ≤ x ≤ 9, and I suggest an outer loop over x, and inner

loop over a). Test it by printing all (x, a) pairs.

(f) Inside the loop, add a step to call the function defined in (d) for each a ∈ Ax in

order to find the best choice a at each x. Then for each x ∈ {1, 2, . . . , 9} store

the best choice of a in some vector you create.

(g) In your loop above, use the the vector W to store the values of VT (x) calculated

as the best value of the RHS of the optimality equation when given x and VT −1.

(You can use V to store VT −1, and W for VT .)

(h) Copy your code above inside the function shell described in the accompanying

purple block from the notes (see previous page) and ...

Part II: Use your value function function to answer questions... most importantly

comment on the answers!

• Find V1(x), V2(x) and V3(x) for x ∈ {1, 2, . . . , 9}, when p = 0.6. Also finding

the optimal first initial actions.

• How does V1000(5) vary for p ∈ { 110 ,13,12,34}?

• What is the initial optimal betting action for each state when p = 0.4 and

T = 20?

• How do V1000(x) and V999(x) compare?

• Feel free to answer other questions you pose yourself too!

3-31

【返回顶部】【打印本稿】【关闭本页】

【上一篇】：STAT6210代做、代写Data Analysis、代写Python程序语言、Java，c++代写代做

【下一篇】：STAT6210代做、代写Data Analysis、代写Python程序语言、Java，c++代写代做

联系方式

最新辅导

热门辅导

您当前位置：首页 >> Python编程Python编程

代做Gambler’s Ruin、代写Java，Python程序语言、代写C/C++设计

日期：2019-12-20 09:08

相关文章