- 1 Introduction
- 2 Convex Version
- 3 Mixed Strategies
- 4 Search for Equilibrium

- 5 One-dimensional Example
- 6 Economic Duel. Nash Model

- 7 Software Example

12 "Duel" Problem,

Differential Game Model

1 Introduction

Consider, as a simple illustration, two objects in space trying to destroy each other. Assume, as a first approximation, that the control parameters are: initial points , "climbing" rates and firing times . These parameters are set before the start for both objects. Denote the control parameters by vectors and , correspondingly. Denote by the lower bounds of these parameters. Denote by the upper bounds. In the illustrative example , and , where . Movements of the objects are described in two-dimensional "hight-time" space by the first-order differential equations

The solution of these equations defines trajectories of the objects

As usual, models of moving objects use homogenuos second-order differential equations, at least

The solution of these equations defines more realistic trajectories of the objects because this includes acceleration, too. These trajectories depend on the roots of the charasteristic equation

Here

The trajectory of the first object

Here .

The constants are defined by initial or boundary conditions. Suppose that the initial higth are fixed by operator of the first flying object. Then

Considering the speed one can write

From here assuming that initial speed

This way we obtained three pairs of linear equations corresponding to three different pairs of roots defining two unknown constants .

The trajectory of the second object is defined in the same way.

Later we shall apply first-order linear differential equations (12.1) and (12.2) as a first approximation to test algorithms of optimization and modeling. Denote by a distance between objects at the moment

Denote by the hitting probability

Here and . The expected utility function of the first object

The expected utility function of the second object

Expressions (12.18) and (12.19) of expected utilities follows from expression (12.17) of hitting probabilities. It is assumed that utilities to hit the hostile object are plus and . Utilities to be hit are minus and . Here denote utilities of the first object and define utilities of the second one.

The expected utility function of the second object

Expected utilities (12.18) and (12.19) are convex functions of probabilities and at the equilibrium point

The convexity of expected utilities (12.18) and (12.19) as functions of probabilities and does not provide their convexity as functions of firing times and and other parameters. The convexity is a part of sufficient equilibrium conditions. Therefore, the equilibrium may not exist.

Expected utilities reach the maximal values at the ends of
intervals of parameters , as usual. For the first
object there are four pure strategies corresponding to various
arrangements of these ends

.

The pure strategies for
the second object:

.

We may search for the pure
equilibrium of bimatrix games (12.20)(12.21)
corresponding to each fixed pair of firing times
using simple discrete algorithm (13.14). However, the
probability to obtain the pure equilibrium is high only for
large bimatrix games [Forgo et al., 1999]. Therefore, we consider
mixed strategies, first.

3 Mixed Strategies

. Probabilities of these events are defined by first four components

of a vector

The fifth component defines the firing time. A vector

has a similar meaning for the second object.

Applying mixed strategies
and

, the utility functions of the first and the second
objects are

Here , where denotes the four events . The matrix

is the winning of the first object, if the pure strategy denoted by the quadruple is used. The matrix

is the winning of the second object, if the pure strategy is used. Expanding expressions (12.23)

Here winnings are defined by expression (12.25). Expression (12.24) illustrates how the function is defined. The function is defined similarly.

One defines bimatrix games (12.23) and (12.24) for each pair of firing times . Therefore, for fixed pair , the same methods of search for the equilibrium points may be used as in the inspector model (13.14). Here winnings and depend on firing times . This is the important difference from the inspector game. One updates bimatrix games while changing firing times.

Therefore, there are only two parameters and to be found by the fraud minimization (see section 4.1). One can do that by applying methods, such as the , to search for the pair that minimizes the fraud.

1 Bimatrix Game Algorithm

There are four linear inequalities, five linear equations and five variables: , because the set has four elements.

A mixed strategy of the second object is fraud irrelevant if

These expressions include inequalities what is usual in the linear programming (LP). Applying LP, the variables and should be expesed as differences of two non-negative variables and . Then from the expressions (12.28), (12.29) one obtains the LP problem:

Here .

Similar linear fraud irrelevant equations were considered in the inspection problem (see expressions (13.42) and (13.43)). Here the fraud irrelevant equations (12.31) and (12.32) depend on two unknown parameters and . Therefore, linear problems (12.31) and (12.32) are just a part of general non-linear problem. We use longer notations, such as and , to show that depend on .

If there is no solution of fraud irrelevant equations, we search for the pure equilibrium strategies using condition (13.14). The Strategy Elimination Algorithm (SE) is a good alternative to obtain equilibrium. A general but difficult way to find the equilibrium is to solve the bilinear problem (13.18) defining the necessary and sufficient equilibrium conditions.

Obtaining equilibrium in mixed strategies or in pure strategies , one reduces the six-dimensional optimization problem to the two-dimensional one.

Here , where are the "contract" firing times. The operator is defined by the fraud condition

Here is the expected winning of the first object. It is the solution of the corresponding bimatrix game at fixed firing times and . is the expected winning of the second object defined as the solution of the bimatrix game at given firing times and . For example, winnings and can be defined by LP problems (12.31)(12.32), or by the Direct Search Algorithm (DSA) (see Section 2.1), or by Strategy Elimination Algorithm (SE) (see Section 2.4).

The equilibrium is achieved, if the minimum (12.36) is zero. The minimization of (12.36) is the task for global optimization algorithms, including the Bayesian ones.

2 Matrix Game Algorithm

and

Here mini-max condition [Owen, 1968]

holds for all firing times . Therefore, a search for the equilibrium is reduced to the condition

The zero-sum equilibrium is reached when the hitting probabilities are equal to 0.5 for both players and their winnings are zero (see the one-dimensional example in the next section).

and the hitting probability is

Here . The control parameters are the firing times and . Assuming that and applying expression (12.20), the expected utility function of the first object is

The expected utility function of the second object follows from expression (12.21)

The equilibrium is reached at the firing moment . The one-dimensional example helps to understand and to test the results of the two-dimensional case.

There are servers providing the same service

where is the profit, is the service price, is the rate of customers, is the running cost, and is the server index

Suppose that arriving customers estimate the average waiting time as the relation of the number of waiting customers to the server capacity that is assumed to be equal to the running cost

A customer goes to the server , if

A customer goes away, if

where is the critical cost. The rate of incoming consumers is fixed

To illustrate general ideas of the dynamic Nash model, we simplify the static one, first.

Assume the customer balance condition.

Here is the customer balance factor. The balance condition follows from (10.4),(10.5), (10.6), if the number of customers is large. The condition is based on the observation that arriving customers prefer servers with lesser sum . From expression (10.7)

From (12.48)

and

Consider a monopolistic situation when only the th server is there. In this case

and

Othervise, customers abandon services . The maximal monopolistic profit

Replacing inequality (12.55) by equality, we eliminate (12.55) and maximize the expression

Now one defines the optimal service price

the optimal service capacity,

and the maximal monopolistic profit

2 Dynamic Nash Model

Denote by a profit accumulated by the th server at a time . Assume that

where defines a profit at a moment

Here is a rate of customers of the th server at a moment . This rate is formally defined as a limit of the fraction

where is a total number of customers arrived during an interval . The zero denotes the starting time of the system. is the end of the operation period, the "horizon". In real life is defined at fixed moments, for example, minutes, hours, days e.t.c.. We consider it as a continuous parameter because we consider a continuous model.

Denote the bankruptcy threshold of an th server by . Denote a bankruptcy moment (if it happens) of the th server by . Then

Otherwise, customer rates are defined by the customer balance condition (12.50)

From (12.59) the profit

Otherwise, the profit is defined by (12.61).

Assume that both servers are planning their dynamic competition in advance. Each server defines trajectories of the service price and the service capacity . These trajectories show how the service prices and capacities depend on the time . Objectives are to maximize profits

Here defines the accumulated profit (12.60) of th server at the end of period .

We illustrate the idea by a simple example. Assume that servers control initial values and rates of change of parameters . Then the trajectories are defined by differential equations

and

The corresponding solutions are

and

The search is similar to that in the Nash model
(see Section 10). We fix the the initial values, the
"Contract-Vector"

. The transformed values, the "Fraud-Vector"

, is
obtained by maximizing the profits of each server . The
maximization is performed assuming that all partners
honors the contract

Here the symbol denotes the accumulated profit (12.65). Formally, condition (12.70) transforms the vector

into the vector . To make expressions shorter denote this transformation by

One may obtain the equilibrium at the fixed point , where

The fixed point exists, if the feasible set is convex and all the profit functions (10.1) are convex [Michael, 1976]. We obtain the equilibrium directly by iterations (12.71) , if the transformation is contracting. If not, then we minimize the square deviation

The equilibrium is achieved, if the minimum (12.73) is zero.

One makes the model more flexible by introducing second derivatives of parameters and . The second derivatives represent accelerations. Assume that servers control the starting values , the rates and accelerations of parameters . Then trajectories are defined by differential equations

and

Here the vector has twelve components.

That means high dimension of the fraud minimization problem (12.73). In addition, one needs to test the convexity of accumulated profits defined by expressions (12.65) as functions of parameters

. If they are convex then one obtains equilibrium by condition (12.73). Otherwise, some specific algorithms should be designed. An example is the inspector problem.

The task window, in Figure 12.1, shows the domain of firing times .

One selects the optimization method from the menu (see Figure 9.7). The method is selected (see Figure 12.2).

The total number of iterations is set to forty. The number of initial iterations is set to five.
The optimization results are in Figure 12.3.

The mode shows how the deviation from the equilibrium depends on the iteration number (see Figure 12.4).

The duel starts by selecting mode in the menu (see Figure 9.8).

To run duel press the button (see Figure 12.5).

- Person vs Person,
- Person vs Computer,
- Computer vs Computer.

field shows parameters generated at random using probabilities defined by equilibrium conditions. The firing times are decided by these conditions, directly.

In the game the person fix parameters of the first object. A window to fix them is in the center of the top Figure 12.7.

Probabilities of parameters of the second object and its firing time are defined by equilibrium conditions (12.33),(12.32), and (12.36). Parameters of the second object, and the actual hitting, are defined by probabilities. Therefore, the outcome is random.Figure 12.8 shows the input of the game.

In this game parameters of both objects are fixed by persons using the input window. The outcome of this game is shown in a form similar to the game (see the bottom Figure 12.7).One sets the speed of moving objects in the window (see Figureduelout1).

Figures 12.10 show windows.

The main future task is to implement flexible economic models. Another task is the implementation of second-order economic duel models. These tasks need additional theoretical investigations and software developments. Creation of entertaining graphical interfaces is important, too.

jonas mockus 2004-03-20