## Description

**JOINT OPTIMIZING INFORMATIONAL STRIKE TOOL (JOIST)**

JOIST is a prototype computer program that deals with the optimal tasking of both strike and information assets in a protracted war. Phenomena modeled include damage assessment, surveillance, and target tracking. The optimization is constrained by limits on weapon stocks, attrition, and the capacity of platforms to undertake assigned tasks. The model carefully follows the laws of Probability, using Bayes’ Theorem to process information by revising the marksman’s perception of target state. Realistically scaled problems can be solved in minutes or hours, depending on the degree of precision required.

The technological possibilities that are emerging in the Information Age make many things possible militarily that were not possible just a few years ago, but many of the new information systems are expensive or in short supply. Therefore, these observational assets need to be carefully tasked, just as firepower assets need to be carefully tasked. Passive sensors must be told where and when to look and on what frequencies. Active sensors must be similarly guided, and must be employed especially carefully because of the risk of giving up more information than is gained. Furthermore, although this has not always been the case historically, strike and information assets should be tasked jointly. Information has no value in itself, but only as a guide to subsequent action. Using a shoot-look-shoot strategy with an expensive munition such as a cruise missile does not make sense if the “look” capability is ineffective or urgently employed elsewhere. JOIST attempts to task both kinds of asset jointly; in fact, the “Joint” part of JOIST’s name refers as much to jointness between strike and information asset tasking, as it does to inter-service jointness.

JOIST is a tasking method dedicated to the idea that strike and information assets should be tasked optimally as well as jointly. The word “optimal” is used here in its mathematical sense; that is, an objective function whose meaning is “expected net value gained” is maximized over a specified time period, subject to constraints on the availability of assets such as aircraft sorties, aircraft losses, weapons and sensors. The reader should not be overly impressed by this optimality, since computational necessity forces JOIST to deal only with a simplified version of reality. The number of possible target states and sensor reports is very small, for example. Nonetheless, optimal systems have several important advantages, among them not being at the mercy of arbitrary rules about when information should be sought or how it should be used. Surprisingly complex joint optimal strategies emerge even in JOIST’s simplified setting. JOIST has been tested with realistic data, but not with real data. Realistically scaled problems can be solved in several minutes on a 450 MHz Wintel computer. An example is given below.

Since JOIST is optimizing and considers both information and firepower resources, it is potentially useful for making tradeoffs between those resources. Early indications are that the Information Age is still one in which it pays to know how to shoot: results are generally more sensitive to strike resources than to information resources in the (unclassified) scenarios considered so far. JOIST might also be used operationally as part of an automatic Air Tasking Order construction package, since it deals with optimal asset allocation in a time-extended scenario. The main impediment to accomplishing either of these goals will be the acquisition of accurate data about sensors.

Background. The optimal use of information is difficult to model for two fundamental reasons. First, any policy for using information must take the mathematical form of a function, since the essence of the problem is to determine the best action as a function of the information. In most practical cases the number of candidate functions is extremely large, so the problem of finding the best one tends to be difficult except in special circumstances. Second, the effect of actions on the system to be controlled must also be modeled, since the meaning of the word “optimal” must involve the effect of decision making on the controlled system. For these reasons, models of optimal decision making with information tend to be crude. In practice, many modelers adopt reasonable but non-optimal decision rules to specify which action is to be taken; in other words, they adopt one particular function from among those mentioned above, rather than searching for the best one.

One of the special circumstances where optimal decision policies can be found as a practical matter is the Partially Observable Markov Decision Process (POMDP). As long as the state of the process is not too complicated and evolves in a Markov manner over time, optimal policies can be found without having to list all the possibilities. POMDPs are still difficult to solve, but, with continual advances in computer power and the topical importance that comes with the Information Age, applications have become worth considering. One natural application area is the problem of Bomb Damage Assessment (BDA), since the Markov process representing the state of the target has only a small number of states (two, if one thinks of the target as being either live or dead), and also since the weapons employed to change the state of the target can be very expensive. Castenon (1997) describes one implementation of this idea. Yost (1998) develops a similar model, except that the action costs required in solving a POMDP are replaced by constraints on resources enforced by a Linear Program (LP) in an iterative scheme. Yost and Washburn (2000) describe this as the LP/POMDP marriage. These efforts have been criticized for omitting the possibilities of lost track (targets sometimes move or submerge before they can be struck) and surveillance (military sensors are used to find new targets, as well as deal with old ones). An additional criticism is that even targets that are either “live” or “dead” can be in multiple states as far as observation is concerned. Dead targets sometimes appear to be still alive, for example, thereby inviting further unnecessary strikes unless the truth is exposed by observation. JOIST is intended to be responsive to these concerns.

JOIST models a situation where a collection of targets is to be destroyed or at least reduced in effectiveness by a “marksman.” Targets are of multiple types, each of which is in either state 1, state 2, or state 3 at all times. The states are arbitrary as far as the method is concerned, but have been given specific meanings in what will be described as the “standard scenario.” In the standard scenario, state 1 is a virgin live target, state 2 is a target that is dead but not obviously so, and state 3 is a target that is more or less clearly dead. A strike can in no case decrease a target’s state. In addition, some targets are “tracked,” and some are not. Actions can be taken only against tracked targets, with each action potentially having multiple effects. An action can change the state of a target, and may also provide information (“live,” “dead,” or “no report”) about the new state. An action may also cause a tracked target to become untracked, in which case no further actions may be taken against it, or it may have the surveillance effect of discovering previously untracked targets, which are then added to the list of targets against which future actions may be taken. In principle, a single action could have all of these effects.
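
The strike dynamics of the standard scenario can be sketched as a small Markov transition applied to the marksman's state probabilities. This is only an illustration under stated assumptions, not JOIST's code: the matrix entries and the names `STRIKE`, `apply_strike`, and `never_decreases` are hypothetical, chosen so that a strike never decreases a target's state.

```python
# Hypothetical strike transition matrix for the standard scenario.
# Row i, column j: probability that a target in state i+1 ends in state j+1.
# All numbers are illustrative assumptions, not JOIST data.
STRIKE = [
    [0.4, 0.3, 0.3],  # state 1 (live): stays live, dead but not obviously so, clearly dead
    [0.0, 0.5, 0.5],  # state 2 (dead, not obviously so): may become clearly dead
    [0.0, 0.0, 1.0],  # state 3 (clearly dead): stays in state 3
]


def apply_strike(belief, transition=STRIKE):
    """Push a state-probability vector through one strike."""
    n = len(belief)
    return [sum(belief[i] * transition[i][j] for i in range(n)) for j in range(n)]


def never_decreases(transition):
    """True if every entry below the diagonal is zero (the state cannot go down)."""
    return all(transition[i][j] == 0.0
               for i in range(len(transition)) for j in range(i))


belief = apply_strike([1.0, 0.0, 0.0])  # strike a target known to be live
print(belief)
```

The upper-triangular shape of the matrix is what encodes the rule that a strike cannot move a target to a lower-numbered state.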

Time proceeds in discrete periods, with the marksman choosing some action (possibly null) against each tracked target in each period. For each tracked target, at the end of each period, any information provided by the chosen action is incorporated into the marksman’s estimate of its state, possibly including the knowledge that the target has become untracked. There is a final period after which no further actions can be taken against any target, some limit on the total number of periods being a computational necessity. In practice, we would expect JOIST to be used with a “rolling horizon” where the last period is always a fixed distance from the time of the next action.

The object of JOIST is to provide optimal actions against every target at every opportunity. An objective function is therefore necessary. The JOIST objective is to maximize the expected net gain from the whole campaign. Each terminal state of each target has a reward associated with it, and the campaign reward is the sum of the individual rewards, one for each target that is tracked at the end. In the standard scenario the reward is the target’s value if the terminal state is either 2 or 3, otherwise 0. Each action has a cost associated with it, and the campaign cost is the sum of the costs of all actions taken. The campaign net gain is just the difference between the campaign reward and the campaign cost.
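
As a concrete reading of the reward and cost accounting above, the following minimal sketch computes the net gain realized in a single campaign outcome under the standard-scenario reward rule. JOIST maximizes the expectation of this quantity over all outcomes; the function name `campaign_net_gain`, the tuple layout, and the sample numbers are assumptions made for illustration.

```python
def campaign_net_gain(targets, action_costs):
    """Net gain for one realized campaign outcome.

    targets: list of (value, terminal_state, tracked_at_end) tuples.
    action_costs: costs of all actions taken during the campaign.
    Standard-scenario rule: a target pays its value only if it is still
    tracked at the end and its terminal state is 2 or 3.
    """
    reward = sum(value for value, state, tracked in targets
                 if tracked and state in (2, 3))
    return reward - sum(action_costs)


# Illustrative outcome: one target killed, one still live, one dead but untracked.
outcome = [(30.0, 3, True), (30.0, 1, True), (20.0, 2, False)]
costs = [5.0, 1.0, 1.0]  # hypothetical: one strike and two looks
print(campaign_net_gain(outcome, costs))
```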

JOIST is a constrained optimizer; that is, the choice of actions is limited by constraints on the availability of resources such as aircraft, weapons and sensors. These constraints are joint in the sense that they simultaneously influence the feasible action choices for the entire collection of tracked targets, possibly including targets to be discovered in the future. These constraints are a significant computational complication, since otherwise an independent solution could be made at each target. The basic LP/POMDP solution method is to construct an independent POMDP solution for each target using prices for the resources obtained from the LP, incorporate the solution as a new column of the LP, and then solve the LP again in an iterative loop (see Yost and Washburn (2000)). JOIST uses this method.
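
The pricing step of such an iterative loop can be illustrated with a toy example. The sketch below does not solve a POMDP or an LP; it assumes a short, explicitly enumerated list of candidate policies for one target and a set of resource prices that an LP would supply, and it picks the policy with the best expected reward net of priced resource use. All names (`priced_value`, `best_column`, `CANDIDATES`) and numbers are hypothetical.

```python
# Each candidate policy is summarized by its expected reward and its expected
# consumption of each constrained resource. The numbers are illustrative
# assumptions for a 30-point target, not JOIST data.
CANDIDATES = [
    {"name": "null", "reward": 0.0, "use": {}},
    {"name": "strike_once", "reward": 18.0, "use": {"weapons": 1.0}},
    {"name": "shoot_look_shoot", "reward": 25.2,
     "use": {"weapons": 1.4, "sensor_hours": 1.0}},
]


def priced_value(policy, prices):
    """Expected reward minus expected resource use valued at the given prices."""
    used = sum(prices.get(resource, 0.0) * amount
               for resource, amount in policy["use"].items())
    return policy["reward"] - used


def best_column(candidates, prices):
    """Candidate with the highest priced value; it would become a new LP column."""
    return max(candidates, key=lambda p: priced_value(p, prices))


cheap_sensors = {"weapons": 10.0, "sensor_hours": 2.0}
scarce_sensors = {"weapons": 10.0, "sensor_hours": 5.0}
print(best_column(CANDIDATES, cheap_sensors)["name"])
print(best_column(CANDIDATES, scarce_sensors)["name"])
```

In this toy setting, raising the sensor price shifts the preferred policy away from the one that uses a look, which is the mechanism by which the joint resource constraints influence each target's policy.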

Although JOIST is not a simulation, it does incorporate a post-optimization Monte Carlo simulator for purposes of conducting sanity checks on the policies determined to be “optimal.” A sample output for a target worth 30 points is given in Table 1. The marksman is at each stage required to choose an action that is either null or a strike or a look. The three numbers in parentheses are the state probabilities just before the action is taken, as updated by the rules of Probability (Bayes’ Theorem). The true state is also given for reference in Table 1, so both truth and the marksman’s perception are available for the analyst. The marksman, of course, is not allowed to know the true state.
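
The Bayes' Theorem revision of the three state probabilities can be sketched as follows. The sensor likelihoods are invented for illustration (they are assumptions, not JOIST sensor data), and `bayes_update` and `LIKELIHOOD` are hypothetical names.

```python
# LIKELIHOOD[i][report] = P(report | target is in state i+1); illustrative values.
LIKELIHOOD = [
    {"live": 0.80, "dead": 0.10, "no report": 0.10},  # state 1: live
    {"live": 0.60, "dead": 0.30, "no report": 0.10},  # state 2: dead, not obviously so
    {"live": 0.05, "dead": 0.85, "no report": 0.10},  # state 3: clearly dead
]


def bayes_update(prior, likelihood, report):
    """Posterior state probabilities given a sensor report (Bayes' Theorem)."""
    joint = [p * likelihood[i][report] for i, p in enumerate(prior)]
    total = sum(joint)
    if total == 0.0:
        raise ValueError("report has zero probability under the prior")
    return [x / total for x in joint]


prior = [0.4, 0.3, 0.3]
posterior = bayes_update(prior, LIKELIHOOD, "dead")
print(posterior)  # probability mass shifts toward state 3
```

Because "no report" is assumed here to be equally likely in every state, receiving it leaves the probabilities unchanged; a sensor whose silence depends on the state would be informative even when it reports nothing.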

Look actions do not transform the target state, but do report on it and have the side effect of possibly discovering new targets. Thus, the first two actions in Table 1 (looks by Sensor 1) are not foolish, even though the state is known for certain to be 1 initially, but rather represent failed gambles to discover new targets. The third action (Mission 157 out of a long list of legal strikes) sends the target to state 3, but the marksman does not know that, so he employs Sensor 9. Sensor 9 eventually provides a Dead report, but not before discovering a new target of the same type (the indented text), and so on. In this replication two targets are killed, but the same policy could result in a different outcome (possibly much more complicated than the one shown) in a different replication. JOIST averages over all of these possible outcomes in the process of determining the optimal strategy.
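
The relationship between individual replications and the average over outcomes can be illustrated with a much simpler policy than the one in Table 1. The sketch below assumes a two-state (live or dead) 30-point target, a fixed shoot-look-shoot policy, a perfect look, and made-up kill probability and costs. It compares the mean net gain over many simulated replications with the exact expectation. None of this is JOIST's simulator; every name and parameter is hypothetical.

```python
import random

# Illustrative assumptions: target value, kill probability, and action costs.
VALUE, PK, STRIKE_COST, LOOK_COST = 30.0, 0.6, 5.0, 1.0


def one_replication(rng):
    """Net gain of one shoot-look-shoot replication against a live target."""
    alive = rng.random() >= PK        # first strike
    cost = STRIKE_COST + LOOK_COST    # a (perfect) look always follows it
    if alive:                         # strike again only if the look says live
        alive = rng.random() >= PK
        cost += STRIKE_COST
    reward = 0.0 if alive else VALUE
    return reward - cost


def monte_carlo_mean(n, seed=0):
    """Average net gain over n independent replications."""
    rng = random.Random(seed)
    return sum(one_replication(rng) for _ in range(n)) / n


# Exact expectation of the same policy, for comparison.
exact = VALUE * (1 - (1 - PK) ** 2) - STRIKE_COST * (1 + (1 - PK)) - LOOK_COST
print(exact, monte_carlo_mean(20000))
```

Single replications differ widely (a double miss loses 11 points under these assumptions), while the replication average settles near the expectation that an optimizer of expected net gain works with.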