12 "Duel" Problem, Differential Game Model
In all three competition models, the Nash, the Walras, and the Inspector, only steady-state conditions were considered, for simplicity. In real-life competition processes, dynamics are essential. Dynamic competition models may be represented as differential games. As usual, these are difficult for numerical and visual analysis [Friedman, 1971, Galperin and Zheng, 1991, Hajek, 1975, Isaacs, 1965, Krasovski, 1970, von Neumann and Morgenstern, 1953, Petrosyan, 1985, Petrosyan, 1993, Rodin, 1987]. Therefore, one tries to make dynamic competition models as simple as possible.
Consider, as a simple illustration, two objects in space trying to destroy each other. Assume, as a first approximation, that the control parameters are the initial points, the "climbing" rates, and the firing times. These parameters are set before the start for both objects. Denote the control parameters of the two objects by the corresponding vectors, and denote the lower and upper bounds of these parameters accordingly.
Movements of the objects are described in the two-dimensional "height-time" space by first-order differential equations. The solution of these equations defines the trajectories of the objects. As usual, models of moving objects use at least homogeneous second-order differential equations. The solution of these equations defines more realistic trajectories of the objects, because it includes acceleration, too.
These trajectories depend on the roots of the characteristic equation. The constants in the trajectory of the first object are defined by initial or boundary conditions. Suppose that the initial height and the initial speed are fixed by the operator of the first flying object. Considering the speed, one can write the corresponding conditions. This way we obtain three pairs of linear equations, corresponding to three different pairs of roots, that define the two unknown constants.
The trajectory of the second object is defined in the same way.
Later we shall apply the first-order linear differential equations (12.1) and (12.2) as a first approximation to test the algorithms of optimization and modeling.
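The first-order model above yields straight-line trajectories. A minimal sketch, assuming the form dh/dt = v with initial height h0 (the names h0 and v are illustrative; Python is used only as a sketch language here, while the text's implementation is in Java):

```python
# Sketch of the first-order trajectory model (12.1)-(12.2), assuming
# movement in "height-time" space: dh/dt = v, h(0) = h0, so the
# trajectory is the straight line h(t) = h0 + v * t.

def trajectory(h0: float, v: float, t: float) -> float:
    """Height of an object with initial height h0 and climbing rate v."""
    return h0 + v * t

# Example: an object starting at height 1.0, climbing at rate 0.5.
heights = [trajectory(1.0, 0.5, t) for t in (0.0, 1.0, 2.0)]  # → [1.0, 1.5, 2.0]
```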
Denote the distance between the objects at a given moment and the corresponding hitting probability. The expected utility functions of the first and the second object, expressions (12.18) and (12.19), follow from expression (12.17) of the hitting probabilities. It is assumed that the utilities of hitting the hostile object are positive and the utilities of being hit are negative; one pair of utilities refers to the first object and the other pair to the second one.
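The structure of these utilities can be sketched as follows. The additive form below (object i gains a_i with its own hitting probability and loses b_i with the opponent's) is an assumption standing in for the elided expressions (12.18)-(12.19), not a reproduction of them:

```python
def expected_utilities(p1, p2, a1, b1, a2, b2):
    """Assumed additive sketch of the expected utilities: object i
    gains a_i weighted by its hitting probability and loses b_i
    weighted by the opponent's hitting probability.  The exact
    expressions (12.18)-(12.19) of the text may differ."""
    u1 = a1 * p1 - b1 * p2
    u2 = a2 * p2 - b2 * p1
    return u1, u2

# Symmetric case: equal hitting probabilities 0.5 give zero winnings,
# matching the zero-sum equilibrium described later in the text.
u1, u2 = expected_utilities(0.5, 0.5, 1.0, 1.0, 1.0, 1.0)  # → (0.0, 0.0)
```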
Expected utilities (12.18) and (12.19) are not convex functions of their variables. A convex version may be obtained by assuming that an object destroys the enemy if it is not hit itself. Under this assumption, it follows from expressions (12.18) and (12.19) that the expected utility functions of the first and the second object are convex functions of the hitting probabilities at the equilibrium point. However, the convexity of expected utilities (12.18) and (12.19) as functions of the hitting probabilities does not imply their convexity as functions of the firing times and the other parameters. Convexity is a part of the sufficient equilibrium conditions; therefore, the equilibrium may not exist.
Expected utilities reach their maximal values at the ends of the parameter intervals, as usual. For the first object there are four pure strategies, corresponding to the various arrangements of these interval ends; the pure strategies of the second object are defined similarly. We may search for the pure equilibrium of the bimatrix games (12.20), (12.21), corresponding to each fixed pair of firing times, using the simple discrete algorithm (13.14). However, the probability of obtaining a pure equilibrium is high only for large bimatrix games [Forgo et al., 1999]. Therefore, we consider mixed strategies first.
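The pure-equilibrium check for a bimatrix game at a fixed pair of firing times can be sketched as a direct enumeration (a generic sketch, not the specific algorithm (13.14) of the text):

```python
import itertools

def pure_equilibria(A, B):
    """Pure Nash equilibria of a bimatrix game with winning matrices
    A (row player) and B (column player): (i, j) is an equilibrium if
    A[i][j] is maximal in column j and B[i][j] is maximal in row i,
    so neither player gains by deviating unilaterally."""
    m, n = len(A), len(A[0])
    eq = []
    for i, j in itertools.product(range(m), range(n)):
        if A[i][j] >= max(A[k][j] for k in range(m)) and \
           B[i][j] >= max(B[i][l] for l in range(n)):
            eq.append((i, j))
    return eq

# Illustrative 2x2 game with a single pure equilibrium at (1, 1).
eq = pure_equilibria([[3, 0], [5, 1]], [[3, 5], [0, 1]])  # → [(1, 1)]
```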
If expected utilities reach their maximal values at the ends of the parameter intervals, then mixed strategies could be helpful. A mixed strategy means that each object chooses a combination of lower and upper bounds at random. Define those bounds as zeros and ones, for simplicity. There are four such events. The probabilities of these events are defined by the first four components of the strategy vector; the fifth component defines the firing time. A similar vector is defined for the second object.
Applying mixed strategies, the expected utility functions of the first and the second object are sums over the four events. One matrix defines the winnings of the first object when the pure strategy denoted by the corresponding quadruple is used, and another matrix defines the winnings of the second object for that strategy. Expanding expressions (12.23) yields the explicit form, where the winnings are defined by expression (12.25). Expression (12.24) illustrates how the utility function of the first object is defined; the utility function of the second object is defined similarly.
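The expected utility under mixed strategies is bilinear in the two strategy vectors. A minimal sketch (the winning matrix A here is an illustrative stand-in for the matrices of (12.23)-(12.25)):

```python
def mixed_utility(x, y, A):
    """Bilinear expected winning sum_ij x[i] * A[i][j] * y[j] for
    mixed strategies x and y over the pure-strategy events, where A
    is the winning matrix of one object.  Sketch of the structure of
    expressions (12.23)-(12.24)."""
    return sum(x[i] * A[i][j] * y[j]
               for i in range(len(x))
               for j in range(len(y)))

# A pure strategy pair picks out a single matrix element:
w = mixed_utility([1, 0], [0, 1], [[1, 2], [3, 4]])  # → 2
```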
One defines the bimatrix games (12.23) and (12.24) for each pair of firing times. Therefore, for a fixed pair of firing times, the same methods of searching for equilibrium points may be used as in the inspector model (13.14). Here, however, the winnings depend on the firing times; this is the important difference from the inspector game, because one updates the bimatrix games while changing the firing times. Therefore, there are only two parameters, the firing times, to be found by fraud minimization (see Section 4.1). One can do that by applying global search methods to find the pair of firing times that minimizes the fraud.
Consider the bimatrix game (12.23), (12.24) defined by a fixed pair of firing times. We search for the equilibrium of this game using conditions (13.14) and (13.19)-(13.23). Here the matrix is small, thus the proportion of pure-strategy equilibrium points will not be considerable [Forgo et al., 1999]. Therefore, we start by looking for a mixed strategy that makes all the partner's strategies equally profitable for the partner; that makes the partner's fraud irrelevant. At a fixed pair of firing times, a mixed strategy of the first object is "fraud irrelevant" if the following conditions hold.
1 Bimatrix Game Algorithm
These conditions comprise four linear inequalities and five linear equations, reflecting the structure of the mixed-strategy vector. A mixed strategy of the second object is fraud irrelevant under analogous conditions. These expressions include inequalities, which is usual in linear programming (LP). Applying LP, the unbounded variables should be expressed as differences of two non-negative variables. Then from expressions (12.28) and (12.29) one obtains the LP problem.
Similar linear fraud-irrelevant equations were considered in the inspection problem (see expressions (13.42) and (13.43)). Here the fraud-irrelevant equations (12.31) and (12.32) depend on two unknown parameters, the firing times. Therefore, the linear problems (12.31) and (12.32) are just a part of the general non-linear problem. We use longer notations to show that the solutions depend on the firing times. If there is no solution of the fraud-irrelevant equations, we search for pure equilibrium strategies using condition (13.14). The Strategy Elimination Algorithm (SE) is a good alternative for obtaining the equilibrium.
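For a 2x2 winning matrix the fraud-irrelevant ("equalizing") conditions reduce to a small linear system with a closed-form solution; the sketch below illustrates that reduction and is not the LP problems (12.31)-(12.32) themselves:

```python
def fraud_irrelevant_2x2(B):
    """Equalizing mixed strategy for a 2x2 game: choose x = (x1, 1 - x1)
    so that both partner strategies yield the same expected winning v,
    i.e. the partner's fraud becomes irrelevant.  Returns ((x1, x2), v),
    or None if no valid mixture exists in [0, 1]."""
    d = B[0][0] - B[0][1] - B[1][0] + B[1][1]
    if d == 0:
        return None                      # degenerate: no unique mixture
    x1 = (B[1][1] - B[1][0]) / d
    if not 0.0 <= x1 <= 1.0:
        return None                      # no equalizing mixture in [0, 1]
    v = x1 * B[0][0] + (1.0 - x1) * B[1][0]
    return (x1, 1.0 - x1), v

# Matching pennies: the equalizing strategy is (0.5, 0.5) with value 0.
res = fraud_irrelevant_2x2([[1.0, -1.0], [-1.0, 1.0]])  # → ((0.5, 0.5), 0.0)
```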
A general but difficult way to find the equilibrium is to solve the bilinear problem (13.18), which defines the necessary and sufficient equilibrium conditions. Obtaining the equilibrium in mixed or in pure strategies, one reduces the six-dimensional optimization problem to a two-dimensional one.
Here the pair of "contract" firing times is fixed, and the corresponding operator is defined by the fraud. The expected winning of the first object is the solution of the corresponding bimatrix game at the fixed firing times, and the expected winning of the second object is defined as the solution of the bimatrix game at the same firing times. For example, the winnings can be defined by the LP problems (12.31), (12.32), by the Direct Search Algorithm (DSA) (see Section 2.1), or by the Strategy Elimination Algorithm (SE) (see Section 2.4). The equilibrium is achieved if the minimum (12.36) is zero. The minimization of (12.36) is a task for global optimization algorithms, including the Bayesian ones.
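The outer two-dimensional minimization over firing times can be sketched as follows; a plain grid scan stands in here for the Bayesian global optimizer of the text, and `fraud` is a hypothetical callable returning the fraud value for a pair of firing times:

```python
def minimize_fraud(fraud, grid):
    """Outer search of (12.36): scan candidate firing-time pairs and
    keep the one with the smallest fraud.  The equilibrium is reached
    when the returned minimal fraud is zero."""
    return min((fraud(t1, t2), (t1, t2)) for t1 in grid for t2 in grid)

# Illustrative fraud surface with its zero at firing times (1, 2):
best = minimize_fraud(lambda t1, t2: (t1 - 1) ** 2 + (t2 - 2) ** 2,
                      [0, 1, 2])  # → (0, (1, 2))
```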
The bimatrix game algorithm can be applied to both the symmetric and the non-symmetric cases. In the symmetric case the utilities satisfy the zero-sum game condition. Here one obtains the equilibrium by solving the following linear programming problem.
2 Matrix Game Algorithm
Here the minimax condition [Owen, 1968] holds for all firing times. Therefore, the search for the equilibrium is reduced to a simpler condition. The zero-sum equilibrium is reached when the hitting probabilities are equal to 0.5 for both players and their winnings are zero (see the one-dimensional example in the next section).
In the one-dimensional version of the model (12.1)-(12.19), the trajectories are fixed and the only control parameters are the firing times. Assuming symmetric utilities and applying expression (12.20), the expected utility function of the first object is obtained; the expected utility function of the second object follows from symmetry. The equilibrium is reached at the firing moment where the hitting probability is equal to 0.5. The one-dimensional example helps to understand and to test the results of the general model.
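The one-dimensional symmetric case can be sketched numerically: with identical increasing hitting probabilities p(t), the equilibrium firing moment is where p(t) reaches 0.5 (the grid search below is an illustrative stand-in for the analytic condition):

```python
def equilibrium_time(p, t_grid):
    """One-dimensional symmetric duel: with identical, increasing
    hitting probabilities p(t), the zero-sum equilibrium is reached
    where p(t) = 0.5, since both winnings are then zero.  Returns the
    grid point closest to that moment."""
    return min(t_grid, key=lambda t: abs(p(t) - 0.5))

# With p(t) = t, the equilibrium firing moment is t = 0.5.
t_star = equilibrium_time(lambda t: t, [0.0, 0.25, 0.5, 0.75, 1.0])  # → 0.5
```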
By the "Economic Duel" we mean a dynamic case of the Nash model.
In Chapter 10, the static case was described.
There are several servers providing the same service. The profit of a server depends on its service price, its rate of customers, and its running cost. Suppose that arriving customers estimate the average waiting time as the ratio of the number of waiting customers to the server capacity, which is assumed to be equal to the running cost. A customer goes to the server with the least total cost of price plus waiting, and goes away if even that cost exceeds the critical cost. The rate of incoming customers is fixed. To explain the general ideas of the dynamic Nash model, we simplify the static one first.
We express the customer estimate of the expected waiting time accordingly and assume the customer balance condition, with a corresponding customer balance factor. The balance condition follows from (10.4), (10.5), and (10.6) if the number of customers is large. The condition is based on the observation that arriving customers prefer servers with a lesser sum of price and expected waiting time.
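The customer decision rule described above can be sketched as follows (the argument names are illustrative):

```python
def choose_server(prices, waits, critical_cost):
    """Customer decision rule of the Nash model: go to the server with
    the least total cost (price + expected waiting time), or go away
    if even that least cost exceeds the critical cost.  Returns the
    server index, or None if the customer leaves."""
    costs = [p + w for p, w in zip(prices, waits)]
    best = min(range(len(costs)), key=lambda i: costs[i])
    return best if costs[best] <= critical_cost else None

# Server 0 offers the lesser sum of price and waiting time:
picked = choose_server([2.0, 3.0], [1.0, 0.5], 4.0)  # → 0
```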
Consider a monopolistic situation when only one server operates; otherwise, customers would abandon the service. To define the maximal monopolistic profit, we replace inequality (12.55) by an equality, thus eliminating (12.55), and maximize the resulting expression. This defines the optimal service price, the optimal service capacity, and the maximal monopolistic profit.
Considering the static model in the previous section, it was tacitly assumed that a server can operate steadily while making a loss instead of a profit. Now we also consider the case of insolvency. Suppose that a server goes bankrupt if its accumulated losses exceed some credit threshold. The bankrupt server is then eliminated, and the remaining one assumes a monopolistic position.
2 Dynamic Nash Model
Denote the profit accumulated by a server up to a given time, defined as the integral of the momentary profit. Here the rate of customers of a server at a given moment is formally defined as the limit of the fraction of the total number of customers arriving during a shrinking interval. Zero denotes the starting time of the system, and the "horizon" is the end of the operation period. In real life the rate is defined at fixed moments, for example, minutes, hours, or days. We consider it as a continuous parameter because we consider a continuous model.
Denote the bankruptcy threshold of a server and, if bankruptcy happens, its bankruptcy moment. In the non-bankrupt case, customer rates are defined by the customer balance condition (12.50) and, from (12.59), the profit is defined by (12.61).
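A discrete-time sketch of the accumulated profit (12.60) with the bankruptcy rule (the step size dt and the sign convention for the threshold are illustrative assumptions):

```python
def accumulate_profit(profit_rate, horizon, dt, threshold):
    """Discrete sketch of (12.60)-(12.62): accumulate the momentary
    profit over [0, horizon] in steps of dt; if accumulated losses
    exceed the bankruptcy threshold, the server is eliminated at that
    moment.  Returns (accumulated profit, bankruptcy moment or None)."""
    total, t = 0.0, 0.0
    while t < horizon:
        total += profit_rate(t) * dt
        t += dt
        if total < -threshold:
            return total, t              # bankrupt at moment t
    return total, None                   # survived the whole period

# A server losing 1.0 per unit time with credit threshold 3.0
# goes bankrupt at moment 4.0:
outcome = accumulate_profit(lambda t: -1.0, 10.0, 1.0, 3.0)  # → (-4.0, 4.0)
```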
Assume that both servers plan their dynamic competition in advance. Each server defines the trajectories of its service price and its service capacity. These trajectories show how the service prices and capacities depend on time. The objective of each server is to maximize its profit, accumulated according to (12.60), at the end of the period.
We illustrate the idea by a simple example. Assume that the servers control the initial values and the rates of change of the prices and capacities. Then the trajectories are defined by first-order differential equations. The search for the equilibrium is similar to that in the Nash model (see Section 10). We fix the initial values and the rates as a "contract" vector. The transformed values, the "fraud vector", are obtained by maximizing the profit of each server. The maximization is performed assuming that all partners honor the contract.
Here the accumulated profit is defined by (12.65). Transformation (12.70) maps the contract vector into the fraud vector; to make the expressions shorter, we denote this transformation by a single operator. One may obtain the equilibrium at the fixed point of this operator, where the contract vector is reproduced. The fixed point exists if the feasible set is convex and all the profit functions (10.1) are convex [Michael, 1976]. We obtain the equilibrium directly by the iterations (12.71) if the transformation is contracting. If not, then we minimize the square deviation (12.73). The equilibrium is achieved if the minimum (12.73) is zero.
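The iterations (12.71) can be sketched as a generic fixed-point loop; a scalar contract variable is used for illustration, whereas the real contract vector is multi-dimensional:

```python
def fixed_point(T, z0, tol=1e-9, max_iter=1000):
    """Sketch of iterations (12.71): apply the fraud transformation T
    repeatedly.  If T is contracting, the iterates converge to the
    fixed point z = T(z), i.e. the equilibrium; otherwise the square
    deviation (12.73) must be minimized instead."""
    z = z0
    for _ in range(max_iter):
        z_next = T(z)
        if abs(z_next - z) < tol:
            return z_next
        z = z_next
    return None                  # no convergence: fall back to (12.73)

# A contracting map with fixed point 2.0 converges from any start:
z_eq = fixed_point(lambda z: 0.5 * z + 1.0, 0.0)
```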
One makes the model more flexible by introducing second derivatives of the price and capacity parameters; the second derivatives represent accelerations. Assume that the servers control the starting values, the rates, and the accelerations; then the trajectories are defined by second-order differential equations. Here the control vector has twelve components, which means a high dimension of the fraud minimization problem (12.73). In addition, one needs to test the convexity of the accumulated profits, defined by expressions (12.65), as functions of these parameters. If they are convex, then one obtains the equilibrium by condition (12.73); otherwise, some specific algorithms should be designed. An example is the inspector problem.
An interactive duel model is implemented using Java JDK 1.2 [Matulevicius, 1999]. To run the Java applet, go to the web sites (see Section 4), start the applet, and select the task (see Figure 9.6). The task window, shown in Figure 12.1, displays the domain of firing times.
The duel task window.
One selects the optimization method from the menu (see Figure 9.7); here the Bayes method is selected (see Figure 12.2). The total number of iterations is set to forty; the number of initial iterations is set to five.
The Bayes method selection.
The optimization results are shown in Figure 12.3. To start the analysis of results, one presses the corresponding button (see Figure 9.9). The convergence window shows how the deviation from the equilibrium depends on the iteration number (see Figure 12.4).
The convergence window.
The duel starts by selecting the corresponding mode in the menu (see Figure 9.8); to run the duel, one presses the corresponding button (see Figure 12.5). One selects the operation mode in the operation window (see Figure 12.6).
The starting window.
There are three operation modes:
- Person vs Person,
- Person vs Computer,
- a fully automatic mode.
The operation window in Figure 12.6 shows the automatic mode, where the parameters are generated at random using the probabilities defined by the equilibrium conditions; the firing times are decided by these conditions directly.
In the Person vs Computer game, the person fixes the parameters of the first object; a window to fix them is in the center of the top part of the screen. The probabilities of the parameters of the second object and its firing time are defined by the equilibrium conditions (12.33), (12.32), and (12.36). The parameters of the second object, and the actual hitting, are defined by these probabilities; therefore, the outcome is random.
The input (top) and the outcome (bottom) of the Person vs Computer game.
Figure 12.8 shows the input of the Person vs Person game. In this game the parameters of both objects are fixed by the persons using the input window. The outcome of this game is shown in a form similar to the Person vs Computer game (see the bottom part).
The input of the Person vs Person game.
One sets the speed of the moving objects in the speed control window (see Figure 12.9). Figure 12.10 shows the duel windows.
The speed control window.
In the example, one searches for the equilibrium in mixed strategies using the Irrelevant Fraud Algorithm (IF), which is described in Section 2.3. This algorithm solves the linear programming problems (12.31) and (12.32). If no solution is found, one searches for pure equilibrium strategies using the Direct Search Algorithm (DS) (13.14). The Strategy Elimination Algorithm (SE) (see Section 2.4) helps if both IF and DS fail.
The main future task is to implement flexible economic models. Another task is the implementation of second-order economic duel models. These tasks need additional theoretical investigation and software development. The creation of entertaining graphical interfaces is important, too.