- 1 Introduction
- 2 Average Utility
- 3 Single-Marriage Case

- 4 Multi-Marriage Case

- 5 Optimal Ordering of Supplies

- 6 Software Examples

18

Sequential Statistical Decisions Model, "Bride Problem"

1 Introduction

The Bride problem is a good example of sequential statistical decisions [Wald, 1947,Wald, 1950]. The dynamic programming is a conventional technique to optimize sequential decisions [Bellman, 1957]. Applying the dynamic programming to specific problems, one develops specific algorithms, as usual. The algorithms, similar to these of bride's decision making, can be applied choosing the optimal time to buy durable goods, such as cars, houses, computers, to start new business, e.t.c..

Typical industrial applications of sequential decisions are the optimal stopping rules. For example, one can consider stopping of some nuclear plant as a marriage. One can evaluate stopping cost directly. Risks involved in running the plant, when one observes some deviations from the normal routine, can be considered as risks of waiting for the next proposal. That means that one hopes for something better while risking a bad outcome.

2 Average Utility

Suppose that impressions about the goodness of grooms are independent and identically distributed. Then the probability density of an impression , given the goodness , is

Assume the Gaussian prior probability density of goodness

Here is a prior variance and is a prior mean of goodness. For example, shows an optimistic bride. shows that a bride is pessimistic. Suppose, that a prior probability density of bride impressions is

Here is a variance of impressions around the true goodness .

Assume that both the groom goodness and the bride impression are random variables depending on many independent factors. This explains the Gaussian distributions (18.3) and (18.4). One defines a posterior probability density of goodness , given the impression , by the Bayesian formula [Bayes, 1783]

Here

3 Single-Marriage Case

Denote by the bride's decision about the groom

Suppose that

The last condition means that brides marry, and marry only once. Formally condition (18.8) defines the set of feasible decisions when the -th groom proposes

Here is the marriage index

The marriage index is zero, if the bride is marred. This prevents repeated marriages.

1 Bellman's Equations

Denote by the expected utility function, if the impression of the last groom is

Comparing (18.11) and (18.12) one observes that

This follows from independence assumptions (18.1) and (18.2). Denote by the expected utility, if the impression of the th groom is and a bride is making the optimal decision

Here

Following the same pattern, we define the expected utility, if the impression of the -th groom is and the bride is making the optimal decision

Here

Note, that the utility of accepting a proposal in expressions (18.17) and (18.18) is a function only of impression . It does not depend on the proposal number . That follows from independence assumptions (18.1) and (18.2). Solving these recurrent equations one defines the sequence of optimal decision functions and the expected utilities . This should be done for all possible impressions and for all numbers . One cannot do that in continuous case. Therefore, one uses a discrete approximation

2 Discrete Approximation

From expression (18.16)

From expression (18.19)

Here and . That is a discrete approximation of the recurrent equations. All possible impressions and all numbers are considered. One sets the number of iterations by the accuracy needed.

The results are sequences of optimal decision functions and the expected utilities . These sequences are stored in a set of arrays which define how the optimal decisions and the expected utilities depend on the possible impressions . Doing this, one avoids the repeated calculations. Large arrays is a disadvantage of this procedure.

In a similar way one defines the expected utility if the impression of the -th groom is and the bride is making the optimal decision

The other expressions remains the same.

4 Multi-Marriage Case

1 " Buy-a-PC" Example

Assume that a prior probability density of PC utilities is Gaussian

Here is a prior variance and is a prior mean of PC utilities. Suppose that and that . This means that expected PC utility is increasing linearly. The expected diversity remains the same.

The PC price in $ is denoted by . Suppose that the price of PC depends linearly on the weighted sum

Here parameters are defined in $ per unit and reflect the market prices of CPU, RAM, and HD. Expressing the price as a function of utility :

Here

Assume that one observes the utility exactly. This means that the impression and that impression errors in expression (18.2). That simplifies the problem.

2 Bellman's Equations

The expected utility function is . Here is the utility of a new PC. is the utility of the old PC, to be replaced by the new one. Consider a "horizon" of years. During a year one can change PC only once, if one wishes.

Denote by the maximal expected utility in the year

There are two possible decisions . The decision means to buy a new PC. The utility of the new PC is . The utility of the old one is .

The utility of the decision , to keep the old PC in the last year , is . Here is the price of refusing to buy a new PC. This includes the waiting losses minus the price of new PC in the year defined by (18.30). It is assumed that we abandon the old PC as soon as we obtain the new one. Therefore, one "wins" the price of the new PC by using the old PC. The optimal decision in the last year

Denote by the maximal expected utility in the year .

Here is the maximal expected utility in the year , if the utility of the old PC is .

is a prior probability density of defined by expression (18.28) at a time .

Here is obtained from this equation

The optimal decision in the year

Denote by the maximal expected utility in the year . Then

Here is the maximal expected utility in the year , if the utility of the old PC is .

is a prior probability density of defined by expression (18.28) at a time .

is obtained from the equation

The optimal decision in the year

Here the decision functions depend on two variables defining the corresponding utilities of the new and the old computer. That is difficult for both the calculations and grafical representations.

One applies the discrete approximation such as in Section 3.2. The optimal decision functions depend on two variables and . Therefore, one needs times more calculations to obtain the same accuracy as in the single marriage case (see Section 3.2).

To solve real life problems the "directly unobservable" factors should be included into the "Buy-a-PC" model. For example , one can estimate the expected life-span of PC, given the impression , by the Bayesian formula

Here is the life-span of PC and is the user's impression about its reliability. The impression depends on the warranty, the reputation of the manufacturer and on some subjective factors, too.

To simplify the calculations, the "One-Step-Ahead" approximation is suggested. Here one assumes that the next step is the last one and introduces the parameter .

The one-step-optimal decision

Following the same pattern, one defines the maximal expected utility in -th year

where

Here

The one-step-optimal decision

The one-step-optimal decision is a function of difference that depends on two parameters and . Therefore, one defines it as a function of critical pairs depending on time .

Solving these recurrent equations one defines the sequence of one-step-optimal decision functions and the expected utilities . This should be done for all possible values of and .

The One-Step-Ahead approximation, assuming that , reduces the computing time to a level of single marriage case. However, one needs to estimate the approximation error.

The important application a of sequential statistical decisions is optimal ordering of supplies. Consider, for example, same computer shop that orders units of RAM (Random Access Memory) at fixed times dividing the time into stages of unit length each.

1 Uncertain Demand

Here and , is the number of unsold units.

is the loss due to short supply including fines for breaking supply contracts and/or damages to reputation as an reliable retailer.

All these parameters correspond to the stage , meaning that

where is the demand at the stage ,

is the supply belonging to a feasible set at this stage, and

is the number of unsold RAM units, left from the previous stage .

Here and later we omit the lower indices to make expressions shorter.

Denote by
the probability
of demand
at the stage .

Usually the demand
during the time is described by the Poisson distribution [Tijms, 1994]

Considering stages of unit time and

where is the expected demand at the stage . Then the expected utility at the stage

The maximal expected utility at the last stage

where is a number of unsold RAM units available at the beginning of the -th stage

is a number of RAM units ordered at the beginning of the -th stage,

is the demand at the stage , is a set of feasible supplies at the stage .

Denote by the optimal decision function maximizing the expected utility (18.55) at any fixed .

Denote by the maximal expected utility at the -th stage

Note, that the meaning of parameter depends on the index of the function , for example, , if the index is , and , if this index is , etc.

The maximum of (18.56) is obtained by the optimal decision function

where .

Extending (18.56) to one obtains

Solving recurrent equations (18.55)-(18.57) one defines the sequence of optimal decision functions

and the optimal expected utility functions .

This is done for all possible remaining RAM units and for all stages .

In the software terms that means calculating recurrently the sequence of arrays that shows how the optimal supply and the maximal profit depend on the unsold RAM units , left from the previous stages. This way one reduces computing time considerably, because array access operations need much less time as compared to multi-dimensional optimization.

Denote by a probability that the wholesale price at the stage is . Denote by the expected wholesale price at the stage . Assuming the Gaussian distribution of the wholesale price logarithm, one obtains the lognormal density function [Tijms, 1994]

where means , is the shape parameter, and is the scale parameter. The expected value and the standard of the lognormal distribution

The lognormal density is unimodal with maximum at .

In the case of random wholesale prices , the profit at the -th stage

Here , is the number of unsold units.

Then the expected utility at the stage

The maximal expected utility at the last stage

where is a number of unsold RAM units left at the beginning of -th stage,

is the wholesale price at the stage .

Denote by the optimal decision function maximizing the expected utility (18.63) at any fixed .

Denote by the maximal expected utility at the -th stage

That is achieved by the optimal decision function

where .

Extending to one obtains

Solving recurrent equations (18.63)-(18.65) one defines the sequence of optimal decision functions

and the optimal expected utilities .

This is done for all possible numbers of RAM units available at the beginning of stages .

We call by Market Elasticity the relation of expected demand to the retail price , where is a set of feasible retail prices. A simple approximation is the exponential function

where is the saturation demand, is the elasticity parameter, is the retail price, and is a set of feasible retail prices, all at the stage .

In this case the profit function at the -th stage

Here denotes the difference of retail and wholesale prices at the stage .

The expected utility at the stage

where .

The maximal expected utility at the last stage

Denote by , where denotes , the two-dimensional optimal decision function maximizing the expected utility (18.69) at any fixed .

Denote by the maximal expected utility at the -th stage

That is achieved by the two-dimensional optimal decision function

where means .

Extending (18.70) to one obtains

Solving recurrent equations (18.69)-(18.71) one defines the sequence of decision functions

and the optimal expected profits .

This is done for all possible numbers of RAM units available at the beginning of stages .

In the case of market elasticity, to represent maximal profits and two-dimensional optimal decisions as functions of remaining RAM units one needs to calculate a sequence of arrays defining the sequence of two-dimensional decision functions. This is difficult, therefore, one starts from the simplest case of uncertain demand (see section 5.1), as usual.

Note that in the actual calculations of expressions (18.63)-(18.71) the integrals are replaced by sums and the densities by probabilities using expressions similar to those in the section 3.2.

In the demonstration version the "horizon" means the estimated end of business. In the expert version one uses the "moving horizon" , repeating the optimization after each stage. This way one simplifies calculations and updates data.

The recurrent equations (18.55)-(18.57) include Bride problems as special cases with only two feasible decisions and different utility functions (18.51).

Several interactive versions of the Bride problem are implemented
as Java applets. These, and the corresponding software are
on web-sites (see section 4).

Figure 18.2 shows simulated results of the "optimal marriage".

Figure 18.3 shows the optimal decision function defining how the critical "goodness" of a groom depends on his number.

Figure 18.5 shows the input window of the "Buy-a-Car" problem.

Figure 18.6 shows the output window of the "Buy-a-Car" problem.

Figure 18.7 shows the optimal decision function of the "Buy-a-Car" problem.

The "Buy-a-Car" algorithm is just a first approximation. It should be improved. For example, improving the By-a-PC model one should

- remove the "single marriage" condition (18.8),
- add "Change-a-Car" cost,
- express desirable car properties (see Figure 18.5), using conditions of Pareto optimality [Mockus, 1989b],
- extend the property list by including properties that are not directly observable, such as the reliability of a car,
- keep the "No-Return" rule.

First one selects the problem "Bride-Multi" in the section "Discrete, Dynamic and Linear Programming ". The program [Damasevicius, 2000] starts by clicking the button at the end of the page. Data is entered using the operation page (see Figure 18.8). Optimization starts by clicking the "Calculate" button.

The decision function ( see Figure 18.9) is defined by Bellman equations (18.33)(18.50). The "State-of Nature" generated by Monte Carlo simulation of probability distributions (18.1) (18.2) is also shown in Figure 18.9. The jumps of the decision function correspond to two marriages. In addition, Figure 18.10 shows the best "State-of-Nature" .jonas mockus 2004-03-20