# solution

Mr. Smith owns a shop and wants to find the optimal inventory management policy using a Markov
decision process. He sells 0, 1, or 2 products per week with probabilities 20%, 40%, and 40%, respectively.
Due to space restrictions, he can hold at most two products at a time. At the end of each week, he
checks the inventory level and places an order to replenish the inventory if necessary. Ordered items are
delivered immediately. The purchasing cost is \$100 per product. The selling price is \$200
per product. The inventory cost is \$50 per product, and it is charged for both the existing inventory and
newly ordered items.
Let $s$ denote the number of products in the inventory at the end of the week, before the order is placed.
Then $s$ represents the state of the inventory management system. Mr. Smith's decision is how many
products to order at the end of each week.

a) Provide all the possible policies, along with descriptions of the states and decisions.
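
One way to approach part a) is to enumerate the policies mechanically. The sketch below assumes that a policy is deterministic and stationary (one fixed order quantity per inventory level) and that an order is feasible only if the inventory after delivery does not exceed the two-product capacity. The names `MAX_INVENTORY`, `feasible_orders`, and `all_policies` are illustrative and do not come from the problem statement.

```python
from itertools import product

MAX_INVENTORY = 2                   # capacity limit from the problem statement
STATES = range(MAX_INVENTORY + 1)   # inventory level before ordering: 0, 1, 2


def feasible_orders(state):
    """Order quantities that keep inventory within capacity (assumption)."""
    return range(MAX_INVENTORY - state + 1)


def all_policies():
    """Every mapping from state to a feasible order quantity."""
    choices = [feasible_orders(s) for s in STATES]
    return [dict(zip(STATES, combo)) for combo in product(*choices)]


policies = all_policies()
for number, policy in enumerate(policies, start=1):
    description = ", ".join(
        f"inventory {s}: order {a}" for s, a in policy.items()
    )
    print(f"Policy {number}: {description}")
```

Under these assumptions there are three feasible orders with an empty inventory, two with one product in stock, and one with two products in stock, so the enumeration yields 3 × 2 × 1 = 6 policies.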

b) Suppose that Mr. Smith orders only 1 product when the inventory is empty; otherwise, he does
not place any order. What is the transition probability matrix of the states under this policy?
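
The transition matrix for part b) can be sketched numerically as follows. This is a minimal sketch under two assumptions that the problem statement does not spell out: the stated probabilities describe weekly demand, and demand that exceeds the available stock is lost, so the inventory never drops below zero. The names `DEMAND_PROBS`, `POLICY`, and `transition_matrix` are illustrative.

```python
# Weekly demand distribution (interpreted from the stated sales probabilities).
DEMAND_PROBS = {0: 0.2, 1: 0.4, 2: 0.4}

# Policy from part b): order 1 product when empty, otherwise order nothing.
POLICY = {0: 1, 1: 0, 2: 0}


def transition_matrix(policy, demand_probs, n_states=3):
    """Row = inventory before ordering, column = inventory one week later."""
    matrix = [[0.0] * n_states for _ in range(n_states)]
    for state in range(n_states):
        stock = state + policy[state]            # order is delivered immediately
        for demand, prob in demand_probs.items():
            next_state = max(stock - demand, 0)  # assumption: excess demand is lost
            matrix[state][next_state] += prob
    return matrix


P = transition_matrix(POLICY, DEMAND_PROBS)
for row in P:
    print([round(p, 2) for p in row])
```

Under these assumptions, the rows for inventory levels 0 and 1 are both (0.8, 0.2, 0), because the week starts with one product in stock in either case, and the row for inventory level 2 is (0.4, 0.4, 0.2).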

