**6th U.S. Army Conference
on Applied Statistics**

A view on the campus of Rice University

The Sixth U.S. Army Conference on Applied Statistics was hosted by Rice University, 18-20 October 2000. The conference was co-sponsored by the U.S. Army Research Laboratory (ARL), the U.S. Army Research Office (ARO), the United States Military Academy (USMA), the Training and Doctrine Command (TRADOC) Analysis Center-White Sands Missile Range, the Walter Reed Army Institute of Research (WRAIR), and the National Institute of Standards and Technology (NIST). Cooperating organizations included RAND, Los Alamos National Laboratory, George Mason University, the Office of Naval Research, and the Institute for Defense Analyses. Approximately 100 people attended, making this conference the best attended since a 1982 Design of Experiments Conference in Monterey, CA. An outline of the conference program is given below, followed by the abstracts.

**Short Course:** Data, Knowledge, and Information Integration to Support Decision Making, taught by the Statistics Group, Los Alamos National Laboratory, preceded the conference on 16 and 17 October.

**Invited Speakers:**

- Nozer Singpurwalla, George Washington University (Keynote Address)
  *Warranty Contracts and Equilibrium Probabilities*
- Alan Agresti (Florida)
  *Challenges for Categorical Data Analysis in the 21st Century*
- Donald Berry (Texas, MD Anderson Cancer Center)
  *Title unavailable*
- Noel A. C. Cressie (Ohio State)
  *A Spatial-Temporal Statistical Approach to Problems in Command and Control*
- Stuart A. Geman (Brown)
  *Variance and Invariance in Machine Vision*
- Emanuel Parzen (Texas A&M)
  *Quantile/Quartile Plots, Conditional Quantiles, Comparison Distributions*
- Naomi Oreskes (UC San Diego)
  *Conceptual Issues in Model Assessment: What Can We Learn From Past Mistakes?*
- Matthew Caffrey (Air Command and Staff College)
  *Lessons From the History of Wargaming*

**Special Sessions:**

- Biological Warfare, organized by Marek Kimmel of Rice University
  - Marek Kimmel, Rice University
    *Dispersal of Bacterial Pathogens: Scenarios, Models and Accidents*
  - George Weinstock, Department of Molecular Virology & Microbiology, Baylor College of Medicine
    *Use of Genomic Technologies to Decode Bacterial Biological Warfare Agents*
- Digital Government, organized by Edward Wegman of George Mason University
  - Cathy Dippo (BLS)
    *Statistics and a Digital Government for the 21st Century*
  - Alan Karr, National Institute of Statistical Sciences and Sallie Keller-McNulty, Los Alamos National Laboratory
    *Web Dissemination of Disclosure-Limited Analyses of Confidential Data*
  - Wendy Martinez (NSWC)
    *Statistics in Intrusion Detection*
- Highlights of the June 2000 "National Research Council Workshop on Reliability Issues for DoD Systems," organized by Arthur Fries of the Institute for Defense Analyses
  - Francisco Samaniego (UC, Davis)
    *NRC Workshop on Reliability for DoD Systems--An Overview of the Statistical Content*
  - Ernest Seglie (OSD/DOT&E)
    *NRC Workshop on Reliability for DoD Systems--A DoD Perspective*

**Contributed Papers**

- George Hanna (AMSAA)
  *Reliability Described by Belief Functions, A Second Look*
- Yontha Ath (Cal State U, Dominguez Hills) and Milton Sobel (UC, Santa Barbara)
  *Stochastic Properties for Uniformly Optimally Reliable Networks*
- Douglas Frank (Indiana University, PA)
  *Estimating Parameters in a Bimodal Distribution*
- Eugene Dutoit (DBBL, Fort Benning)
  *The Use of Cluster Analysis in the MOUT ACTD*
- MAJ Tom Cioppa and Tom Lucas (NPS)
  *Efficient Search Strategies in High-Dimensional Complex Models*
- Carl Russell (JNTF)
  *Graphical Analysis of Communications Latency in a Large Distributed Simulation*
- Ann Brodeen and Frederick Brundick (ARL), and Malcolm Taylor (OAO Corp)
  *Statistical Augmentation of a Database for Use in Optical Character Recognition Software Evaluation*
- Barbara Broome, Ann Brodeen, and Frederick Brundick (ARL), and Malcolm Taylor (OAO Corp)
  *Exploring the Use of Reading Comprehension Tests in Evaluating Machine Translation Systems*
- Arthur Fries (IDA)
  *Another "New" Approach for "Validating" Simulation Models*
- Barry Bodt, Joan Forester, Charles Hansen, Eric Heilman, Richard Kaste, and Janet O'May (ARL)
  *A Statistical Analysis of Course of Action Generation ?*
- Jock Grynovicki, Kragg Kysor, and Madeline Swann (ARL)
  *A Human Factors Analysis of Two Proposed Tactical Operations Center (TOC) Shelter Designs*
- David Webb (ARL) and Bruce Held (RAND)
  *Modeling of Tank Gun Accuracy Under Two Different Zeroing Methods*
- Mike Danesh (Redstone)
  *System Reliability for Precision Missilery*
- Iris Rivero and Kwang-Jae Kim (Penn State)
  *Analysis of Fuzzy Regression for Modeling Shelf-Life of Gun Propellants*
- Robert Nowak (Rice)
  *Computer Network Tomography for End-to-End Traffic Measurements*
- Rong Chen (Illinois, Chicago), Jun Liu (Stanford) and Xiaodong Wang (Texas A&M)
  *Monte Carlo Filters and Their Applications in Target Tracking and Wireless Communications*
- David Scott (Rice)
  *Clustering and Partial Mixture Estimation*
- Jayaram Sethuraman (Florida State)
  *Modeling Transmission Loss in a Network with a Large Number of Nodes*
- Bernie Harris (Wisconsin-Madison) and Shun-Yi Chen (Tamkang University)
  *Accurate One Sided Tolerance Limits for the Balanced Normal Random Effects Model*
- James Thompson (Rice)
  *Some Thoughts on the Statistical Legacy of John W. Tukey*
- CPT Scott Nestler, United States Military Academy
  *Statistical Analysis of Atmospheric Properties for Estimation of Infrared Radiance of Ballistic Missiles*
- Yuling Cui and Nozer D. Singpurwalla, George Washington University
  *Damage Assessment Using Test Data and Expert Testimonies*

**Events**

- Social: Tuesday, October 17, Duncan Hall
- Wilks Award Banquet: Wednesday, October 18, Duncan Hall

**ABSTRACTS
SIXTH U.S. ARMY CONFERENCE ON APPLIED STATISTICS**

**WEDNESDAY, OCTOBER 18
GENERAL SESSION I (0900 - 1030)**

*Warranty Contracts and Equilibrium Probabilities*

Nozer Singpurwalla, George Washington University

The scenario of warranties is at the interface of philosophy, law, and probability. In this talk we describe a real-life scenario involving litigation pertaining to a breach of warranty and discuss its ramifications from a statistical point of view. We claim that the three interpretations of probability (the objective, the subjective, and the logical) all come into play when designing an optimum warranty that is also just.

*Variance and Invariance in Machine Vision*

Stuart Geman, Brown University

I will propose a computer vision system based upon a collection of scale-invariant composition rules that define a part-whole hierarchy. I will make a connection to some striking invariance properties of natural images. I will suggest a coarse-to-fine computing engine for scene analysis within the compositional framework. I will discuss some experimental results and make some connections to biological vision systems.

**SPECIAL SESSION ON BIOLOGICAL WARFARE (1100 - 1230)**

*Dispersal of Bacterial Pathogens: Scenarios, Models and Accidents*

Marek Kimmel, Rice University

Abstract unavailable

*Use of Genomic Technologies to Decode Bacterial Biological Warfare Agents*

George Weinstock, Department of Molecular Virology & Microbiology, Baylor College of Medicine

Bacteria contain a number of genes that contribute to their ability to cause human infections and to resist the action of antibiotics. However, only a subset of bacteria within a given species contain these genes. Thus, some bacteria of a species contain only a few genes involved in infection, while others contain many such genes. These pathogenicity and antibiotic resistance genes can be transferred between bacteria of the same species and often between bacteria of different species. In addition, these genes can be inserted into bacteria in the laboratory by recombinant DNA techniques, which facilitates the design and construction of bacteria that are highly virulent and also resistant to most antibiotics. Engineered bacteria of this type represent possible biological warfare or terrorism agents. Genomics technologies, such as high-throughput sequencing and DNA chips, are mechanisms by which the constellation of pathogenicity and antibiotic resistance genes of any bacterial isolate can rapidly be assessed.

**CONTRIBUTED SESSION I (1330 - 1500)**

*The Use of Cluster Analysis in the MOUT ACTD*

Eugene Dutoit, Dismounted Battlespace Battle Lab, Fort Benning

As part of the MOUT ACTD, a simulation study was conducted to determine the operational effects, with respect to communication, on clearing a building floor consisting of a series of rooms in a typical urban setting. The Blue force was attacking and the Threat force was defending the floor. Before the force-on-force simulations, the study agency wanted to determine the probable communication locations (or nodes) on the floor that the attacking force would use to coordinate the attack. Subject matter experts (SMEs) were given floor diagrams and a detailed description of the scenario and asked to identify all the possible places where they believed communications would occur. A clustering algorithm was used to determine whether the SMEs were reasonably consistent among themselves as to the location of the subjectively positioned communication nodes. This paper will present the results of this portion of the study and a simple measure of subjective clustering effectiveness.

*A Human Factors Analysis of Two Proposed Tactical Operations Center (TOC) Shelter Designs*

Jock Grynovicki, Kragg Kysor, and Madeline Swann, U.S. Army Research Laboratory

The U.S. Army currently fields the Standard Integrated Command Post System-Extensions (SICPS-E) tent shelter system with its command post vehicles to form a tactical operations center (TOC). Although the currently fielded SICPS-E system contains a common workspace, it does not provide an open architecture within which staffs can better perform their functions. An open architecture allows for the uninterrupted view of Command Information Center (CIC) displays and unimpeded movement of personnel within the shelter. Given that the currently fielded SICPS-E system is not ideal, the issue becomes "What type of form should future TOCs take?"

In early 1997, the Vice Chief of Staff, Army (VCSA) established a policy to address TOC development. The intent was to leverage all of the financial and intellectual efforts from across many communities to focus on systems that provide commanders and their staffs the facilities and information required for making military decisions. Subsequently, the Training and Doctrine Command (TRADOC) Program Integration Office-Army Battle Command System (TPIO-ABCS) was assigned the responsibility for focusing TOC development.

In early 1999, HQ TRADOC and the Army Digitization Office (ADO) requested that a high-level body (i.e., "TOC Summit") address the issues that resulted from the complexity of TOC development. Through the TOC Summit venue, it was recognized that the currently fielded SICPS-E system may not suffice as the sheltering system for future U.S. Army TOCs. The Commander, Combined Arms Center (CAC), directed the TRADOC Analysis Center-Fort Leavenworth (TRAC-FLVN) to provide analyses to inform a November 2000 decision on the form to select for the Army's future division-level and brigade-level TOCs. Consequently, TRAC-FLVN requested that ARL-HRED provide human factors analysis support for the November 2000 decision regarding (1) a Mobile Expandable Container Configuration (MECC) and (2) a Custom Tent design. As part of a "Cognitive Engineering of the Battlefield" Science and Technology Objective (STO), ARL-HRED developed a survey instrument to assess human factors, battlefield management, collaboration, modularity, mobility, and security issues. This paper presents an approach to the statistical data collection, human factors analysis, and quantitative results that will be provided to support the decision makers on the form (i.e., platform or shelter) to select for the U.S. Army's future division-level TOCs.

*Exploring the Use of Reading Comprehension Tests
in Evaluating Machine Translation Systems*

Barbara Broome, Ann Brodeen, and Frederick Brundick, U.S. Army Research
Laboratory and Malcolm Taylor, OAO Corporation

Machine translation (MT) is a computer-based application that seeks to convert the content of a passage provided in one human language to another. The Army, in particular, with its land operations in foreign countries and its use of coalition forces, stands to benefit from translation tools. Defense efforts like the Army's Forward Area Language Converter program and the Defense Advanced Research Projects Agency's Trans-lingual Information Detection, Extraction and Summarization program are examples of ongoing efforts to leverage and develop translation tools to support the soldier at all echelons. In earlier studies, a variety of evaluation techniques were employed. Most involved subjective assessments of translation quality, emphasizing the correctness of syntax, morphology, and semantics. Our research explores the use of reading comprehension tests to establish a baseline in the evaluation of MT technology. This paper describes the results of a pilot study in which we employ a standardized reading comprehension test to assess the effectiveness of a French MT system.

**CONTRIBUTED SESSION II (1330 - 1500)**

*Computer Network Tomography for End-to-End Traffic Measurements*

Robert Nowak and Mark Coates, Rice University

In many situations, the ability to determine whether a packet-based communication
network is performing correctly is vital, but the network must not be overburdened
by probing test traffic. If it is determined that the network is functioning
suboptimally, localization of the dysfunction is a key step towards mitigating
the problem. The fundamental objective of this work is to determine the
extent to which unicast, end-to-end network measurement is capable of determining
internal (not directly measurable) packet losses - the so-called network
tomography problem. The major contributions of this paper are two-fold:
we formulate a measurement procedure for network loss inference based on
end-to-end packet pair measurements, and we develop a statistical modeling
and computation framework for inferring internal network loss characteristics.
Simulation experiments demonstrate the potential of our new framework.

*Modeling Transmission Loss in a Network with a Large Number of Nodes*

Jayaram Sethuraman, Florida State University

Suppose that a signal with an initial strength from an originating node is transmitted through a network with a large number of intermediate nodes. There will be dissipation as well as some boosting of the signal between nodes. We will explore a general probabilistic model for the total loss in transmission, i.e., for the final strength of the signal after passing through a large number of nodes.

To make the problem more mathematical, we assume that a signal has strength $X_0$ at the originating node 0 and is transmitted through a path consisting of nodes $i = 1, 2, \ldots, n$. Denote the strength of the signal at node $i$ by $X_i$, $i = 1, 2, \ldots, n$. The nodes themselves do not have to be on a straight line; they are the nodes along a certain path. The ratio $p_i = X_i / X_{i-1}$ represents the loss/boost factor at node $i$; $p_i < 1$ means that there was a loss and $p_i > 1$ means that there was a boost to the signal at node $i$. We are interested in the strength $X_n$ of the final signal after it comes through node $n$, or more particularly the final loss/boost factor $Z_n = X_n / X_0$. We present a probabilistic model for the loss/boost factors $p_1, \ldots, p_n$ and obtain simple limiting distributions for the final loss/boost factor $Z_n$. In some models, the mean of the final loss/boost factor is 1, indicating that on the average there is no loss. In these cases, one can examine the variance, which we obtain, to devise systems with tolerable amounts of fluctuation while at the same time there is no loss of strength on the average. In other probabilistic models, there will be a loss or boost in the strength of the final signal. This information can be useful in designing robust systems.
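As a rough illustration of the setup above, the following sketch simulates the final factor Z_n, which by the definitions equals the product p_1 * ... * p_n, under one assumed model. The log-normal distribution for each p_i, the parameter `sigma`, and the function names are assumptions made only for this example; the abstract does not state which distributions the talk considers.

```python
import math
import random


def simulate_final_factor(n_nodes, sigma, rng):
    """Return one draw of Z_n = p_1 * ... * p_n.

    Assumption (not from the abstract): each p_i is log-normal,
    p_i = exp(sigma * N_i - sigma**2 / 2) with N_i standard normal,
    so that E[p_i] = 1 (no loss on average at each node).
    """
    log_z = 0.0
    for _ in range(n_nodes):
        log_z += sigma * rng.gauss(0.0, 1.0) - 0.5 * sigma ** 2
    return math.exp(log_z)


def summarize(n_nodes, sigma, n_draws=5000, seed=0):
    """Monte Carlo mean and variance of Z_n, plus the exact variance
    exp(n * sigma**2) - 1 that holds under the log-normal assumption."""
    rng = random.Random(seed)
    draws = [simulate_final_factor(n_nodes, sigma, rng) for _ in range(n_draws)]
    mean = sum(draws) / n_draws
    var = sum((z - mean) ** 2 for z in draws) / (n_draws - 1)
    exact_var = math.exp(n_nodes * sigma ** 2) - 1.0
    return mean, var, exact_var


if __name__ == "__main__":
    # Per-node noise that shrinks like 1/sqrt(n) keeps the variance of Z_n
    # bounded as the number of nodes grows.
    n = 100
    mean, var, exact_var = summarize(n, sigma=0.3 / math.sqrt(n))
    print(f"mean={mean:.3f} var={var:.4f} exact_var={exact_var:.4f}")
```

Under this assumed model the mean of Z_n stays at 1 while its variance grows with n times sigma squared, which is the kind of trade-off between average strength and fluctuation that the abstract describes.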

*Monte Carlo Filters and Their Applications in Target Tracking and Wireless Communications*

Rong Chen, University of Illinois, Chicago, Jun Liu, Stanford University and Xiaodong Wang, Texas A&M University

The Monte Carlo filter is a sequential imputation method designed for on-line signal extraction in nonlinear/non-Gaussian dynamic systems and state-space models. We provide a general framework and several special Monte Carlo filtering algorithms and discuss their wide applications, particularly in target tracking and digital wireless communications.

**CONTRIBUTED SESSION III (1530 - 1700)**

*Clustering and Partial Mixture Estimation*

David Scott, Rice University

The use of density estimation to find clusters in data is supplementing ad hoc hierarchical methodology. Examples include finding high-density regions, finding modes in a kernel density estimator, and the mode tree. Alternatively, a mixture model may be fit and the mixture components associated with individual clusters. A high-dimensional mixture model with many components is difficult to fit in practice. Here, we describe a new algorithm that estimates a subset of the complete model. In particular, we demonstrate how to fit one component at a time and how the fits may be organized to reveal the complete clustering model.

*Estimating Parameters in a Bimodal Distribution*

Douglas Frank, Indiana University, Pennsylvania

We have data from a mixture of two populations with unknown means. The source of each datum cannot be identified. We assume the fraction of data from one population is an unknown parameter p. We show methods of estimating the parameter p as well as the means and variances of the mixed populations. The problem is phrased in terms of fitting bimodal test scores but has several possible military applications. For instance, we may be receiving fire from two enemy weapon systems with differing rates of fire or kill ratios. With this procedure, we can estimate the number of each type of weapon as well as its capabilities.
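One common way to estimate p, the two means, and the two variances is the EM algorithm for a two-component normal mixture. The sketch below is only an illustration under that assumption: the abstract does not say which estimation methods the talk uses, and the normality assumption, the starting values, and the function names are choices made for this example.

```python
import math
import random


def normal_pdf(x, mean, sd):
    """Density of a normal distribution with the given mean and sd."""
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))


def fit_two_component_mixture(data, n_iter=200):
    """EM estimates for the mixture p * N(m1, s1^2) + (1 - p) * N(m2, s2^2).

    Returns (p, (m1, s1), (m2, s2)), with component 1 started at the
    lower half of the sorted data.
    """
    xs = sorted(data)
    n = len(xs)
    half = n // 2
    # Crude starting values: split the sorted data in half.
    m1 = sum(xs[:half]) / half
    m2 = sum(xs[half:]) / (n - half)
    overall_mean = sum(xs) / n
    s1 = s2 = math.sqrt(sum((x - overall_mean) ** 2 for x in xs) / n)
    p = 0.5
    for _ in range(n_iter):
        # E-step: posterior probability that each point came from component 1.
        resp = []
        for x in xs:
            a = p * normal_pdf(x, m1, s1)
            b = (1.0 - p) * normal_pdf(x, m2, s2)
            resp.append(a / max(a + b, 1e-300))
        # M-step: weighted means, standard deviations, and mixing fraction.
        w1 = sum(resp)
        w2 = n - w1
        m1 = sum(r * x for r, x in zip(resp, xs)) / w1
        m2 = sum((1.0 - r) * x for r, x in zip(resp, xs)) / w2
        s1 = max(math.sqrt(sum(r * (x - m1) ** 2 for r, x in zip(resp, xs)) / w1), 1e-6)
        s2 = max(math.sqrt(sum((1.0 - r) * (x - m2) ** 2 for r, x in zip(resp, xs)) / w2), 1e-6)
        p = w1 / n
    return p, (m1, s1), (m2, s2)


if __name__ == "__main__":
    # Synthetic "test scores": 30% from a lower-scoring group (made-up values).
    rng = random.Random(7)
    scores = [rng.gauss(60, 5) for _ in range(300)]
    scores += [rng.gauss(85, 5) for _ in range(700)]
    print(fit_two_component_mixture(scores))
```

EM can converge to a local optimum, so in practice several starting values are usually tried; when the two populations overlap heavily, the estimates become unstable.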

*Accurate One Sided Tolerance Limits for the
Balanced Normal Random Effects Model*

Bernie Harris, University of Wisconsin, Madison and Shun-Yi Chen, Tamkang
University

Let $X_{ij} = \mu + b_i + e_{ij}$, $i = 1, 2, \ldots, I$; $j = 1, 2, \ldots, J$, where $X_{ij}$ is the $j$th observation from the $i$th batch. The $b_i$'s and the $e_{ij}$'s are mutually independent normally distributed random variables with $E(b_i) = E(e_{ij}) = 0$ and standard deviations $\sigma_b$ and $\sigma_w$, respectively. A procedure is given for obtaining lower tolerance limits, which are shown to have coverage virtually equal to the nominal coverage. This procedure eliminates the conservatism of some previously proposed techniques. Some numerical comparisons with procedures proposed by G. H. Lemon (1977) and R. W. Mee and D. B. Owen (1983) are given.

**CONTRIBUTED SESSION IV (1530 - 1700)**

*Statistical Augmentation of a Database for Use in Optical Character Recognition Software Evaluation*

Ann Brodeen and Frederick Brundick, Army Research Laboratory, and Malcolm Taylor, OAO Corporation

In this paper we consider a statistical approach to augmenting a limited database of groundtruth documents for use in the evaluation of optical character recognition software. A modified moving-blocks bootstrap procedure is used to construct surrogate documents for this purpose. The surrogates prove to serve effectively and, in some regards, to be indistinguishable from groundtruth. The proposed method is statistically validated.
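For readers unfamiliar with the technique, the sketch below shows the standard moving-blocks bootstrap applied to a sequence of tokens. It is not the modified procedure of the paper, whose details the abstract does not give; the function name, the block length, and the sample text are assumptions for illustration.

```python
import random


def moving_blocks_bootstrap(sequence, block_length, rng=None):
    """Resample `sequence` by joining randomly chosen overlapping blocks.

    Standard (unmodified) moving-blocks bootstrap: every run of
    `block_length` consecutive items is a candidate block; blocks are
    drawn with replacement and concatenated until the original length
    is reached, then the result is trimmed to that length.
    """
    n = len(sequence)
    if not 1 <= block_length <= n:
        raise ValueError("block_length must be between 1 and len(sequence)")
    rng = rng or random.Random()
    n_starts = n - block_length + 1  # number of candidate blocks
    resampled = []
    while len(resampled) < n:
        start = rng.randrange(n_starts)
        resampled.extend(sequence[start:start + block_length])
    return resampled[:n]


if __name__ == "__main__":
    # Made-up sample text standing in for a groundtruth document.
    tokens = "the quick brown fox jumps over the lazy dog near the river bank".split()
    surrogate = moving_blocks_bootstrap(tokens, block_length=3, rng=random.Random(1))
    print(" ".join(surrogate))
```

Resampling whole blocks rather than single tokens preserves short-range dependence, such as character or word sequences, inside each block, which is the usual reason for preferring it to an ordinary bootstrap on dependent data.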

*Another "New" Approach for "Validating"
Simulation Models*

Arthur Fries, Institute for Defense Analyses

When test data are sparse and widely scattered across numerous experimental factors, the issue of validating any complementary simulation modeling is problematic. We focus on the extreme case--limited testing within a vast multi-dimensional design space, and only a single replicate per test--although results readily generalize. Our only assumptions are that, for each test, we can measure the associated experimental factor values and input these into the model to generate extensive simulated data. Under the null hypothesis that the model accurately portrays reality, this "distribution" of outcomes facilitates calculation of a p-value. Fisher's combined probability test, for synthesizing results from different experiments, is then applied to obtain an overall characterization of the degree of simulation model "validity". Other variants of this combination methodology are also discussed, including generalizations to goodness-of-fit tests for a uniform distribution. Unique aspects of the model validation problem, vice the standard statistical hypothesis testing regime, are also noted.
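A minimal sketch of the combination step follows. Each test yields a Monte Carlo p-value from its own set of simulated outcomes, and Fisher's method combines them. The two-sided definition of extremeness, the toy Gaussian stand-in for the simulation model, and all names and numbers are assumptions for illustration, not the procedure of the paper.

```python
import math
import random


def simulation_p_value(observed, simulated):
    """Two-sided Monte Carlo p-value of `observed` against model outputs.

    Uses the (r + 1) / (m + 1) convention so the p-value is never zero.
    Measuring extremeness as distance from the simulated median is an
    illustrative choice, not something the abstract specifies.
    """
    ordered = sorted(simulated)
    median = ordered[len(ordered) // 2]
    distance = abs(observed - median)
    r = sum(1 for s in simulated if abs(s - median) >= distance)
    return (r + 1) / (len(simulated) + 1)


def fisher_combined_p_value(p_values):
    """Fisher's method: -2 * sum(ln p_i) is chi-square with 2k degrees of
    freedom when all k null hypotheses hold and the tests are independent."""
    k = len(p_values)
    statistic = -2.0 * sum(math.log(p) for p in p_values)
    # Chi-square survival function for even degrees of freedom 2k:
    # exp(-x/2) * sum_{j=0}^{k-1} (x/2)^j / j!
    half = statistic / 2.0
    term = 1.0
    total = 1.0
    for j in range(1, k):
        term *= half / j
        total += term
    return statistic, math.exp(-half) * total


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical setup: three tests with one observed outcome each, and a
    # toy stand-in "simulation model" that returns Gaussian outcomes.
    observed_outcomes = [0.4, -1.1, 2.3]
    p_values = []
    for observed in observed_outcomes:
        simulated = [rng.gauss(0.0, 1.0) for _ in range(999)]
        p_values.append(simulation_p_value(observed, simulated))
    statistic, overall_p = fisher_combined_p_value(p_values)
    print(p_values, statistic, overall_p)
```

Because Monte Carlo p-values are discrete, they are only approximately uniform under the null hypothesis, so the chi-square reference distribution is itself an approximation here.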

*Graphical Analysis of Communications Latency in a Large Distributed Simulation*

Carl Russell, Joint National Test Facility

The Theater Missile Defense System Exerciser (TMDSE) is a large geographically distributed simulation developed by the Ballistic Missile Defense Organization (BMDO). TMDSE is being used to investigate Joint Data Network (JDN) interoperability between the members of the Theater Missile Defense Family of Systems as they develop. This paper shows how simple statistical graphics implemented on a moderately-high resolution printer can produce rich, easily understood analyses of simulation performance.

**WILKS BANQUET PRESENTATION**

*Lessons From the History of Wargaming*

Matthew Caffrey (Air Command and Staff College)

It has been said that experience is learning from your mistakes, while wisdom is learning from the mistakes of others. For almost 200 years modern wargaming has been providing decisive insights and tragic mirages. Do we have the wisdom to use wargaming more effectively by learning from our predecessors' experiences? Professor Caffrey, Air Command and Staff College, Air University, looks back at the history of wargaming and suggests lessons for the future.

**THURSDAY, OCTOBER 19
GENERAL SESSION II (0830 - 1000)**

*Challenges for Categorical Data Analysis in the 21st Century*

Alan Agresti, University of Florida

As we begin the 21st century, the state-of-the-art in categorical data analysis, as in all branches of statistics, is vastly different than at the start of the 20th century. This article focuses on some of the primary developments of the past ten to twenty years and discusses areas likely to see significant progress in coming years. Topics on which I focus are methods for ordered categorical responses, repeated measurement and clustered data, and small samples and sparse data. As the literature evolves, the variety of options for data analysis continues to increase dramatically. As a consequence, perhaps the greatest challenge now is maintaining adequate communication among statisticians and between statisticians and other scientists.

*A Spatial-Temporal Statistical Approach to Problems in Command and Control*

Noel A. C. Cressie (Ohio State University)

There are considerable difficulties in the integration, visualization, and overall management of battlespace information. One problem that we see as being very important is the combination of (typically digital) information from multiple sources in a dynamically evolving environment. In this paper, we present a spatial-temporal statistical approach to estimating the constantly changing battlefield, based on noisy data from multiple sources. The potential danger from an enemy's weapons is examined in the spatial domain and is extended to incorporate the temporal dimension. Statistical methods for estimating danger fields are discussed, and an application is given to a data set generated by a simple object-oriented combat-simulation program that we have developed. This research was carried out by a group of Ohio State University statisticians supported by ONR's Probability and Statistics Program.

**SPECIAL SESSION ON THE DIGITAL GOVERNMENT (1030 - 1200)**

*Statistics and a Digital Government for the
21st Century*

Cathy Dippo, Bureau of Labor Statistics

Fostering information technology research for, and technology transfer to, non-R&D government agencies is the focus of a new Digital Government program at the National Science Foundation. For four years, the major Federal statistical agencies have been involved in developing the program and in recruiting academic research partners to apply for NSF grants. Our efforts have been so successful that $4.2 million of the first year's program funds were awarded in 4 grants for research related to statistics. Several more grants have just been awarded as part of the second round of awards. The partnerships with information technology academic researchers have also led to their applying for grants under NSF's Information Technology Research (ITR) program. Some of the partnerships were developed through participation in what is now called the Digital Government Consortium, which is about to change its membership rules to encourage broader participation by non-R&D agencies.

*Web Dissemination of Disclosure-Limited Analyses
of Confidential Data*

Alan Karr, National Institute of Statistical Sciences

Sallie Keller-McNulty, Los Alamos National Laboratory

This talk describes Web-based systems, under development at NISS, for dissemination of disclosure-limited statistical analyses of confidential data (for example, data collected by Federal agencies such as the Bureau of Labor Statistics). The essential features of these systems are that: (1) The results of statistical analyses are disseminated (as opposed to restricted access via data centers or dissemination of public use microdata); (2) Disclosure risk for each query is computed dynamically, in light of previous queries; (3) Strategies may be applied to reduce risk; (4) Analyses are possible that integrate user data with the confidential data.

*Statistics in Intrusion Detection*

Wendy Martinez, Naval Surface Warfare Center

This talk will focus on several recent efforts in the application of statistics to computer security. These efforts have focused on the application of modern statistical visualization procedures to network traffic data, the inference of machine functionality via cluster analysis, and the application of importance sampling to n-gram based intrusion detection. This work is currently being funded by the Office of Naval Research.

**LUNCHEON PRESENTATION (1245 - 1315)**

*Some Thoughts on the Statistical Legacy of John W. Tukey*

James Thompson, Noah Harding Professor of Statistics, Rice University

John Tukey died on July 26, 2000 at the age of 85. Only R.A. Fisher might
be said to have had influence comparable to John's on the position of statistics
and science in the modern world. A radical empiricist, Tukey questioned
everything. He doubted the ubiquity of the Gaussian distribution in real
world phenomena, and he proposed ways in which data analysis could be carried
out with a minimum of prior assumptions. His seminal work in time series
analysis was, perhaps, his earliest work in the area which would come to
be called Exploratory Data Analysis. His robustness work showed us what
could be done with a minimalist approach to assumptions about the generating
processes underlying data. In a meeting in 1978, he met a former Princeton
undergraduate who handed him a business card which bore below the former
undergraduate's name the title "Professional Skeptic". And Tukey
remarked on this as "something that heartened me about what we teach
at Princeton." John saw the importance that high speed computing would
bring to data analysis, and, in many ways, he should be regarded as a founder
of computer science. He gave us such terms as *software* and *bit*.
In many ways, the views of Tukey were found unpalatable by the leaders
of the statistical community early on. I recall that one of his papers required
a five year "review" before being deemed worthy of the *Annals*.
And yet today, who would question that the philosophy of Tukey has emerged
as one of the most significant in contemporary statistics? Where has Tukey
brought us statisticians to in the Twenty-First Century? This paper proposes
to offer some conjectures.

**CLINICAL SESSION I (1330 - 1530)**

*A Statistical Analysis of Course of Action Generation
?*

Barry Bodt, Joan Forester, Charles Hansen, Eric Heilman, Richard Kaste,
and Janet O'May, Army Research Laboratory

A recent approach to course of action (COA) generation employs a genetic algorithm (Fox-GA) to generate a pool of diverse, high quality courses of action. This pool is intended as an initial set which planners can adjust before nominating a subset to the commander. Evaluation of the recommended COAs by Fox-GA has to date involved either expert opinion as to suitability or results from the abstract wargame internal to Fox-GA. (A single scenario is simulated using different COAs over terrain at the National Training Center, Fort Irwin.) Further, since the concept prototype was a component of a larger research program in military displays, much of the emphasis in Fox-GA development has been on perfecting usability.

In a limited scenario, we seek to examine Fox-GA with respect to two basic questions: (1) how do user controls on Fox-GA affect the quality of the COA reflected in the battle outcome, and (2) how well do Fox-GA recommended COAs perform in a more widely accepted simulation, Modular Semi-Automated Forces (ModSAF)? Some information regarding the variability of ModSAF results also factors into the discussion.

The authors' principal concern is how to choose a reasonable response measure to support evaluation within Fox-GA and comparison with ModSAF. A few measures are proposed. Within the context of the questions we seek to answer about Fox-GA, what advice can the panel offer?

*Efficient Search Strategies in High-Dimensional Complex Models*

MAJ Tom Cioppa and Tom Lucas, Naval Postgraduate School

Experimental designs such as full and fractional factorial, Latin-hypercube, super-saturated, and Plackett-Burman designs are quite effective in identifying main effects or significant interactions when the number of factors is relatively small. Methodologies such as sequential bifurcation and frequency domain experiments are also effective at identifying main effects, but assume negligible interactions. As models become increasingly complex, especially in combat simulations, these designs may not be effective since interactions can be highly significant. Finding general patterns of behavior is complicated by the difficulties associated with exploring high-dimensional non-linear model surfaces. Existing search tools often restrict analysts to comprehensively exploring a tiny hyperplane of model space, finding extreme points, or varying many factors simultaneously by confounding main effects with interactions. To improve the ability to explore models, the goal is to develop new algorithms and supporting theory for searching high-dimensional spaces, using strategies that automatically look across a breadth of factors and adaptively focus on the effects and interactions that are significant. These variable-resolution sequential designs may combine traditional designs (including full and fractional factorials, Latin-hypercube, etc.) with newer approaches such as random perturbations, group screening searches, and frequency domain experiments. An initial approach is to develop the proposed search strategy and supporting theory, and then apply the methodology to an agent-based simulation to examine command and control issues where interactions are known to be significant.

*Statistical Analysis of Atmospheric Properties
for Estimation of Infrared Radiance of Ballistic Missiles*

CPT Scott Nestler, United States Military Academy

Atmospheric properties, like temperature and density, can greatly affect the amount of IR energy that is reflected off an incoming ballistic missile. While many models to predict mean atmospheric conditions exist, there are no global models that account for the variability in these properties. This shortcoming makes it difficult to assess uncertainty due to atmospheric conditions. For this reason, a model that is adjusted for known extreme values is needed for use in describing the global behavior of atmospheric parameters. This study is in support of the Missile and Space Intelligence Center's development of a Bounded Earth Atmospheric Model (BEAM). This study will attempt to create such a model through statistical analyses on an existing atmospheric model. It is expected that BEAM will primarily be used by designers of IR sensors used in missile defense systems.

**CONTRIBUTED SESSION V (1330 - 1530)**

*Stochastic Properties for Uniformly Optimally
Reliable Networks*

Yontha Ath, California State University, Dominguez Hills and Milton Sobel,
University of California, Santa Barbara

The field of Optimal Network Reliability seems to be a pretty hot topic among a small group of engineering people. Although it is highly statistical in nature, few statisticians have become interested in it. It has many applications, e.g., to military environments as well as to local area computer networks that operate in parallel. Such a network is modelled by a probabilistic graph G in which the points or nodes (i.e., communication centers, satellites, telephones, missile sites or computer stations) are perfectly reliable but the edges (representing telephone lines, communication links such as microwaves or multiply-connected cables) operate independently of one another, each with a given common known probability p (0<p<1). The network G is in an operating state iff the surviving edges induce (contain or consist of) a spanning connected subgraph of the given graph G. We consider only the all-terminal reliability. For an undirected network with n nodes and e edges, what is the most reliable network for the given pair (n,e)? It is quite interesting that a unique optimal graph does exist for many of the pairs (n,e) studied thus far. In this talk, we present several new conjectures for a graph to be uniformly optimally reliable (UOR). We also present some new counterexamples to the known conjectures on UOR graphs in the literature. Finally, we apply Dirichlet methodology to the above network problems. For example, we consider different models for the lifetime of the edges and we are interested in the expectation and variance of the resulting (random) time until the (all-terminal communication) system gets severed (i.e., the network gets disconnected).
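The all-terminal reliability defined above can be computed by brute force for small graphs. The sketch below enumerates every edge subset and sums the probabilities of those that leave all nodes connected. The function names and the two example graphs are illustrative assumptions, not taken from the talk.

```python
from itertools import combinations

def is_connected(n, edges):
    """True if the graph on nodes 0..n-1 with the given edges is connected."""
    adj = {v: [] for v in range(n)}
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    seen, stack = {0}, [0]
    while stack:
        for w in adj[stack.pop()]:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return len(seen) == n

def all_terminal_reliability(n, edges, p):
    """Probability that the surviving edges connect all n nodes.

    Each edge survives independently with probability p. This enumerates
    all edge subsets, so it is only suitable for small e.
    """
    e = len(edges)
    total = 0.0
    # A connected spanning subgraph needs at least n - 1 edges.
    for k in range(n - 1, e + 1):
        for subset in combinations(edges, k):
            if is_connected(n, subset):
                total += p**k * (1 - p)**(e - k)
    return total

# Two hypothetical graphs with (n, e) = (4, 4):
cycle = [(0, 1), (1, 2), (2, 3), (3, 0)]    # 4-cycle
pendant = [(0, 1), (1, 2), (2, 0), (2, 3)]  # triangle plus a pendant edge

r_cycle = all_terminal_reliability(4, cycle, 0.9)
r_pendant = all_terminal_reliability(4, pendant, 0.9)
```

For these two graphs the reliabilities are p^4 + 4p^3(1-p) and p^4 + 3p^3(1-p) respectively, so the cycle is at least as reliable for every p. This kind of dominance over all p is what "uniformly" refers to in UOR.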

*Reliability Described by Belief Functions, A
Second Look*

George Hanna, Army Materiel Systems Analysis
Activity

Glenn Shafer and A. M. Breipohl published the paper "Reliability Described by Belief Functions" in the 1979 Proceedings of the Annual Reliability and Maintainability Symposium of the IEEE. The paper uses Dempster-Shafer belief functions to assess the reliability of a circuit based on indirect data. The weakest aspect of the method is the construction of the belief functions. The belief functions are assumed to be consonant and one-sided, i.e., non-zero belief is associated only with statements of the form "the reliability exceeds a particular value." Surely there will be occasions when the evidence provides support for two-sided statements about reliability. Furthermore, the assignment of confidence levels to belief values, though not objectionable, is not particularly compelling.

Since the time of the paper, two developments have made it worthwhile to reconsider Dempster-Shafer belief functions for assessment of reliability. The first is the cost to the Army of obtaining direct data to support verification of reliability requirements for weapons under development. Some items, such as missiles, are inherently expensive to test. For other items, money available for testing may be limited due to the general reduction in Army funding. Even when the Army can afford direct verification tests of a developmental system, there is a strong desire to reduce the cost of verification. Consequently, there is high-level interest in new methods to assess both reliability and other performance characteristics using other than direct system test data. In fact, the Army has sponsored development of a methodology putatively based on Dempster-Shafer theory to assess reliability using other than direct data. The second development is one of the reinterpretations of the Dempster-Shafer theory, specifically the interpretation described in the book "A Mathematical Theory of Hints: An Approach to the Dempster-Shafer Theory of Evidence," written by Jurg Kohlas and Paul-Andre Monney and published in 1995. The Kohlas and Monney approach to statistical inference employs a process model to relate an observable outcome to an unobservable random variable and a parameter of interest. Process models lead naturally to Dempster-Shafer belief functions by providing a literal meaning to the concept of a basic probability assignment, which was introduced in Shafer's book, "A Mathematical Theory of Evidence," and apparently named because of its formal properties.

The presentation will introduce the process model approach to developing belief functions. The belief function for binomial data based on a stress-strength process model will be discussed in detail. A simple model relating indirect and direct data will be used to illustrate evidentiary combination. The final part of the presentation will consider some of the obstacles to the use of process models and Dempster-Shafer theory for reliability assessment.
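For readers unfamiliar with evidentiary combination, here is a minimal sketch of the classical Dempster rule for combining two basic probability assignments. It is not the presenter's process model. The three-level reliability frame and all mass values are hypothetical, chosen only to show the arithmetic.

```python
def combine(m1, m2):
    """Dempster's rule: combine two basic probability assignments.

    Each assignment maps a frozenset (a subset of the frame) to its mass.
    Mass falling on the empty intersection is conflict and is normalized away.
    """
    combined = {}
    conflict = 0.0
    for b, wb in m1.items():
        for c, wc in m2.items():
            a = b & c
            if a:
                combined[a] = combined.get(a, 0.0) + wb * wc
            else:
                conflict += wb * wc
    if conflict >= 1.0:
        raise ValueError("totally conflicting evidence cannot be combined")
    return {a: w / (1.0 - conflict) for a, w in combined.items()}

def belief(m, a):
    """Belief in a: total mass committed to subsets of a."""
    return sum(w for b, w in m.items() if b <= a)

# Hypothetical frame of reliability levels and hypothetical masses.
FRAME = frozenset({"low", "medium", "high"})
AT_LEAST_MEDIUM = frozenset({"medium", "high"})

# Indirect data: one-sided support for "reliability is at least medium".
indirect = {AT_LEAST_MEDIUM: 0.6, FRAME: 0.4}
# Direct test data: some support for "high", some for "low".
direct = {frozenset({"high"}): 0.5, frozenset({"low"}): 0.2, FRAME: 0.3}

pooled = combine(indirect, direct)
bel_at_least_medium = belief(pooled, AT_LEAST_MEDIUM)
```

In this example the conflicting mass is 0.6 x 0.2 = 0.12, and the pooled belief that reliability is at least medium is 0.68 / 0.88, about 0.77. The mass left on the whole frame represents evidence that supports no specific statement.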

*Damage Assessment Using Test Data and Expert
Testimonies*

Yuling Cui and Nozer D. Singpurwalla, George Washington University

This talk lays out a general framework for damage assessment using test data and expert testimonies. Here test data are binary, while expert testimonies arise in the form of informed judgments, science-based models, or both. Our proposed approach is "normative", in the sense that it is based entirely on the calculus of probability. This is ongoing work.

*System Reliability for Precision Missilery*

Mike Danesh, Aviation and Missile Research, Development and Engineering
Center

This paper presents a top-level summary for the development of a complex missile system. It is based upon a systematic approach that provides insight into the integration of scientific, engineering and mathematical technologies necessary for development of a robust missile system. It has been developed based upon a decade of experience with the PATRIOT system and has been endorsed by established experts.

As part of this talk, a mathematical tool called ExCAP (Confidence Expanded Assessment Package) will also be presented. ExCAP expands on conventional assessment methods by adding capabilities developed in the field of evidential reasoning.

**SPECIAL SESSION ON RELIABILITY (1600 - 1730)**

*NRC Workshop on Reliability for DoD Systems--An
Overview of the Statistical Content*

Francisco Samaniego, University of California,
Davis

The National Research Council of the National Academy of Sciences hosted a DoD-sponsored workshop on Reliability in June, 2000. Invited speakers addressed issues in the areas of system design and performance monitoring, reliability growth, techniques for combining data from related experiments, approaches to the analysis of field-performance data, fatigue modeling and related inferences, software reliability issues, and reliability economics. From my perspective as Chair of the Organizing Committee and of the workshop itself, I will discuss the highlights among the statistical ideas, approaches, visions and recommendations presented at the workshop. Some of their implications for the development of a new "Reliability, Availability and Maintainability (RAM) Primer" will also be discussed.

*NRC Workshop on Reliability for DoD Systems--A
DoD Perspective*

Ernest Seglie, Office of the Secretary of Defense, Operational Test &
Evaluation

In 1998 the National Research Council issued a study of *Statistics, Testing and Defense Acquisition*. It recommended that, "The Department of Defense and the military services should give increased attention to their reliability, availability, and maintainability data collection and analysis procedures because deficiencies continue to be responsible for many of the current field problems and concerns about military readiness." (Recommendation 7.1, Page 105)

In response to this recommendation and other more pointed observations on the state of current practice in DoD compared to industry best practices, the Department asked the National Academies to host a Workshop on Reliability. A second workshop on software-intensive systems will be held next year. The first workshop was held 9 and 10 June 2000 at the National Academy in Washington. Speakers and discussants were from industry, academia, and the Department of Defense. The suggestions from the workshop will be part of an effort to implement the recommendation to produce a "new battery of military handbooks containing a modern treatment of all pertinent topics in the fields of reliability and life testing" (Recommendation 7.12, Page 126). DoD Handbook 5235.1-H, Test and Evaluation of System Reliability, Availability, and Maintainability: A Primer, was last updated in 1982.

Implications of this effort for providing reliable systems for the warfighter will be addressed.

**FRIDAY, OCTOBER 20
GENERAL SESSION III (0830 - 1000)**

*Quantile/Quartile Plots, Conditional Quantiles,
Comparison Distributions*

Emanuel Parzen, Texas A&M University

This talk proposes many new quantile domain statistical methods for data analysis, data modeling, and data mining. As an alternative to modeling means, variances, correlations, and conditional means (regression), we model distributions, conditional distributions, and rank transforms of the observed data as described by quantile functions Q(u), 0<u<1, conditional quantile functions, and conditional comparison distributions. For bivariate data and two samples we propose non-parametric estimation of conditional quantiles in a new way that provides a unification of classical and modern (parametric and non-parametric) statistical methods. To identify distributions fitting one sample we propose Quantile/Quartile plots, which plot the quantile/quartile identification function QI(u)=(Q(u)-QM)/QD, where the mid-quartile QM=.5(Q(.25)+Q(.75)) and the quartile deviation QD=.5(Q(.75)-Q(.25)) estimate location and scale.
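The identification function QI(u) can be evaluated on a sample as in the sketch below, which uses the definitions of QM and QD exactly as stated in the abstract. The abstract does not say how Q(u) is estimated, so the linear-interpolation sample quantile used here is an assumption, as are the function names and the toy data.

```python
import math

def sample_quantile(data, u):
    """Sample quantile Q(u) by linear interpolation between order statistics.

    The estimator is an assumption; the abstract does not specify one.
    """
    xs = sorted(data)
    h = (len(xs) - 1) * u
    lo = math.floor(h)
    hi = min(lo + 1, len(xs) - 1)
    return xs[lo] + (h - lo) * (xs[hi] - xs[lo])

def quantile_quartile_id(data, u):
    """QI(u) = (Q(u) - QM) / QD with QM and QD as defined in the abstract."""
    q25 = sample_quantile(data, 0.25)
    q75 = sample_quantile(data, 0.75)
    qm = 0.5 * (q25 + q75)  # mid-quartile (location)
    qd = 0.5 * (q75 - q25)  # quartile deviation (scale)
    return (sample_quantile(data, u) - qm) / qd

# Toy data: the values 1..9, and a shifted and rescaled copy.
data = list(range(1, 10))
rescaled = [10 + 3 * x for x in data]

qi_curve = [quantile_quartile_id(data, k / 10) for k in range(11)]
```

With these definitions QI(.25) = -1 and QI(.75) = +1 for any sample, and QI is unchanged when the data are shifted or rescaled. The shape of the curve therefore reflects distributional form rather than location or scale.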

*Innovative Bayesian Designs in Clinical Trials*

Donald Berry (University of Texas, MD Anderson Cancer Center)

I will give some background on Bayesian designs for clinical trials and four examples of Bayesian designs used in actual trials. These examples embody the principles of early stopping, allocating treatments to maximize benefit to patients in and out of the trial, and two variations on a theme (a drug for stroke and another for non-small cell lung cancer): seamless phases II and III with sequential sampling and using surrogate endpoints.

**CONTRIBUTED SESSION VI (1015 - 1115)**

*Modeling of Tank Gun Accuracy Under Two Different
Zeroing Methods*

David Webb, Army Research Laboratory and
Bruce Held, RAND

The accuracy of weapon systems firing unguided ammunition relies on understanding numerous sources of error and correcting for them. Random errors are addressed through weapon and ammunition design, production, maintenance, and firing processes that minimize the magnitude of these error sources. Bias errors are different in that their magnitude and direction can often be estimated and then accounted for while aiming the weapon. In tanks, estimating the bias error is a two-part process. First, boresighting is a process that measures the offset between the fire control system of the tank and the cannon to compute an aiming correction. Second, the actual firing process imparts a deviation to the flight path of the projectile that is different from what is expected by measuring the pointing direction of the cannon prior to initiating the shot process. This deviation is often referred to as 'jump'. Correcting for jump, which normally has a large, nonrandom component, is called zeroing. Zeroing can be accomplished a number of ways, but the two most common in tank gunnery are individual zero and fleet zero. Tanks are individually zeroed when the jump is estimated, usually through actual firing records, and a unique correction is applied to that tank. Fleet zero is a process by which an average jump value is estimated for a number (fleet) of tanks, and the correction implied by that average jump is applied to every tank in the fleet. There are a number of advantages and disadvantages to each of these processes, but the relative accuracy benefits depend on the magnitude of the error sources that make up the total error budget for the fleet of tanks. Although this budget comprises numerous error sources, reasonable accuracy estimates are attainable with basic models of the predominant factors. This presentation illustrates models for first-shot accuracy under both zeroing processes to help determine under what conditions each is preferable.
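The trade-off between the two zeroing processes can be illustrated with a simple variance-components sketch. This is not the presenters' model. It assumes, for one axis only, that jump is the sum of independent zero-mean tank-to-tank, occasion-to-occasion, and round-to-round components. It also assumes that zeroing and the later first shot happen on different occasions, and that the fleet zero is estimated from tanks other than the one firing. All function names and numeric values are hypothetical.

```python
import math

def individual_zero_sd(sd_occasion, sd_round, n_zero_rounds):
    """First-shot error SD after zeroing this tank with n_zero_rounds rounds.

    The tank's own bias cancels. What remains is the occasion effect at
    zeroing and at firing, the round error of the shot, and the round
    error averaged into the zero.
    """
    var = 2 * sd_occasion**2 + sd_round**2 * (1 + 1 / n_zero_rounds)
    return math.sqrt(var)

def fleet_zero_sd(sd_tank, sd_occasion, sd_round, n_tanks, n_rounds_per_tank):
    """First-shot error SD when the correction is the average jump of a sample.

    The sample is n_tanks other tanks, each firing n_rounds_per_tank rounds
    on one occasion. The firing tank's own bias does not cancel.
    """
    var = ((sd_tank**2 + sd_occasion**2) * (1 + 1 / n_tanks)
           + sd_round**2 * (1 + 1 / (n_tanks * n_rounds_per_tank)))
    return math.sqrt(var)

# Hypothetical error budget (units arbitrary, e.g. mils).
sd_occasion, sd_round = 0.2, 0.3
individual = individual_zero_sd(sd_occasion, sd_round, n_zero_rounds=3)
fleet_similar_tanks = fleet_zero_sd(0.0, sd_occasion, sd_round, 10, 3)
fleet_varied_tanks = fleet_zero_sd(0.5, sd_occasion, sd_round, 10, 3)
```

Under these assumptions fleet zero gives the smaller error when tanks are nearly identical, because it averages over more rounds and pays the occasion effect only once. Individual zero gives the smaller error when tank-to-tank variation dominates.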

*Analysis of Fuzzy Regression for Modeling Shelf-Life
of Gun Propellants*

Iris Rivero and Kwang-Jae Kim, Penn State
University

Shelf-life estimation of gun propellants is a situation in which the data describing the factors that influence propellant behavior are not clearly established and are neither sufficient nor representative of the variation from lot to lot. This research applies the concept of fuzzy regression analysis to model the life of the gun propellant and determines the distribution behavior that best fits the relationship. Fuzzy regression takes into account such aspects as a non-linear relationship between factors, and variation that might exist between lots. A simulation study found that when data quality is poor, fuzzy regression should not be used, and classical statistical regression analysis should be performed instead.

**GENERAL SESSION IV (1115 - 1200)**

*Conceptual Issues in Model Assessment: What
Can We Learn From Past Mistakes?*

Naomi Oreskes, University of California, San Diego

In recent years, there has been growing recognition that complex models of natural systems cannot be validated, and that the term validation is misleading from both scientific and regulatory standpoints. From a regulatory standpoint, problems arise because of differences in the way the term validation is interpreted by expert and lay communities. From a scientific standpoint, problems arise when we assume that model validation provides confirmation of the underlying scientific conceptualization.

Most efforts at model validation concentrate on comparing model output with the natural world. While such comparisons can be useful, they do not provide adequate basis for confidence in the accuracy of the model. There have been many cases in the history of science of models that made accurate, quantitative predictions, but were later shown to be conceptually flawed. This paper examines three examples. In each case, the conceptual flaws were not apparent to their designers and users, yet appear obvious in retrospect. Furthermore, because the flaws were conceptual, quantitative assessment of model accuracy would not have revealed the underlying problems. Hindsight suggests that conceptually flawed models may still be useful for the immediate predictive problems for which they were designed, but they are not reliable for understanding processes and structures. A well-confirmed model may thus be acceptable for a design or problem-solving purpose, as long as that purpose does not require comprehension of underlying causes.