Table of Contents

Keynote Address

Considerations of Inspections for Homeland Security with Cross Linkages to Quality Control, Game Theory, and Stochastic Simulation (Keynote Address Abstract)

James R. Thompson


Models of Homeland Security with Borrowings from SPC and Game Theory (Keynote Paper), James R. Thompson


Short Courses


Statistical Methods in Computer Security (Abstract and Presentation)

David J. Marchette


Topics in Computational Statistics with MATLAB (Abstract and Presentation)

Jeffrey L. Solka



Technology Transfer in Industry using an R-COM Interface (Abstract)

Scott Vander Wiel, John Chambers, Suresh Goyal, David James


Inter-System Interfaces for S (Abstract)

Duncan Temple Lang


Sparse Linear Algebra for the R Language (Abstract and Presentation)

Pin Ng, Roger Koenker



Exploratory Detection of Differential Gene Expression (Abstract)

John D. Storey


Statistical Issues in the Design of Microarray Experiments (Abstract)

Jean Yee Hwa Yang, Terry Speed


Exploration, Normalization, Summaries and Software of Affymetrix GeneChip Probe Level Data (Abstract and Presentation), Rafael Irizarry

Security and Infrastructure Protection: Interface of Computing Science and Statistics to the Rescue

Statistical Opportunities in Network Security (Paper and Presentation)

David J. Marchette


Using Statistics to Detect and Thwart Denial of Service Attacks (Abstract)

Carla Brodley


Maximizing Quality and Value in Security: Challenges for Computer Scientists, Statisticians and Their Clients (Abstract), Arnold F. Goodman


Usability and Accessibility of Visualization Tools for Health Statistics

Usability Testing of Map Designs (Paper)

Linda Williams Pickle


Accessible Graphics on the World Wide Web (Extended Abstract and Presentation)

Dan J. Grauman


Java-based Dynamic Linked Micromap Plots (Abstract and Presentation)

Jim X. Chen, Xusheng Wang, Daniel B. Carr, B. Sue Bell, and Linda W. Pickle


Web Design and Usability Guidelines: An Evidence-based Approach (Abstract)

Sanjay Koyani


Statistical Graphs

Graph-Theoretic Latent Class Discovery and its Robustness to Minimal Discriminating Set Choice (Abstract and Presentation)

Jeff Solka, Carey Priebe, David Marchette


Asymptotic Theory for the Domination Number of Random Class Cover Catch Digraphs (Paper), John C. Wierman and Pengfei Xiang


Bayesian Models for Sparse Edge Weighted Directed Graphs (Abstract and Presentation)

Deepak Agarwal


Sensitivity Analysis of Graph Distance Measures (Paper)

Julian Sorensen, Peter Dickinson and Martina Schubert


Latent Variable Models for Link Analysis of Similarity Data (Paper)

Juan K. Lin

Financial Risk and Fraud Detection

Audit at a Crossroads (Paper)

Conan C. Albrecht


Credit Scoring Using Bureau and Other Data (Abstract)

Garry K. Ottosen


Computational Statistics with Spreadsheets Towards Efficiency, Reproducibility and Security (Paper)

G. Aydinli, W. Härdle, E. Neuwirth


Some Loans are More Equal than Others: Third-Party Originations and Defaults in the Subprime Mortgage Industry (Abstract and Presentation)

Scott D. Grimshaw, Grant R. McQueen, William P. Alexander, Barrett A. Slade


Computationally Challenging Statistical Methods in Genetics

Enumeration and Simulation of Marriage Node Graphs on Zero Loop Pedigrees (Abstract)

Alun Thomas and Chris Cannings


Statistical and Computational Issues in Mapping Genes in Animal Populations (Abstract)

Fengxing Du


Computations in Animal Breeding (Abstract and Presentation)

Ignacy Misztal and Romdhane Rekaya


Statistical Analysis and Probabilistic Modeling of Internet Traffic

Statistical Clustering of Internet Communication Patterns (Abstract and Presentation)

Felix Hernandez-Campos, A. B. Nobel, F. D. Smith, K. Jeffay


A Solution to the Bandwidth Allocation Problem for the Internet (Abstract)

William S. Cleveland, Jin Cao and Don X. Sun


The Joint Distribution of Internet Flow Sizes and Durations (Abstract and Presentation)

Cheolwoo Park and J. S. Marron


Data Analysis and Visualization of Graph Data

Using Graphs to Explore Communication Networks (Abstract)

Chris Volinsky


Graphs and EDA in Computational Biology (Abstract)

Robert Gentleman


Visual Exploration of Graph Data (Abstract and Presentation)

Deborah F. Swayne, Duncan Temple Lang and Andreas Buja


Computational Methods

Cost-Sensitive Classifier Selection Using the ROC Convex Hull Method (Paper and Presentation)

Ross Bettinger


A New Imagery Classification Method Using Spatial Covariance Information (Paper)

James A. Shine and Daniel B. Carr


A Risk-Utility Framework for Data Swapping (Abstract and Presentation)

Shanti Gomatam, Alan Karr and Ashish Sanil


Optimal Stopping of a Risk Process with Interest Rates (Paper and Presentation)

Bogdan Krzysztof Muciek


A Likelihood Approach for Determining Cluster Number (Paper)

William D. Shannon, Tsvika Klein and Robert Culverhouse


Advances in Face/Pattern Recognition

Obtaining Smooth Directional Field Estimates for Fingerprint Images (Abstract)

Sarat C. Dass


Fast Face Detection with a Boosted CCCD Classifier (Paper)

Diego A. Socolinsky, Joshua D. Neuheisel, Carey E. Priebe, Jason DeVinney, David Marchette


Computer-Intensive Statistical Methods

Parametric Bootstrap Confidence Intervals in Small Area Estimation Problems (Abstract)

Snigdhansu Chatterjee and P. Lahiri


Nonparametric Tail Index Estimation (Abstract)

Tucker McElroy and Dimitris Politis


Bayesian Inference in Single-Layer Neural Networks (Abstract)

Robert Paige


Selecting Optimal Block Lengths for Block Bootstrap Methods (Paper)

S. N. Lahiri

Computational Statistics and NSA

Statistical Analysis of Massive Data Streams: Overview of a CATS Workshop (Abstract)

Sallie Keller-McNulty


A-Family Priors: Smoothing Multinomial Data (Abstract)

Jeff Benedict


Model Building and Diagnostics for Massive Data Sets (Abstract)

David W. Scott


Graphics and Visualization

Visual Data Mining for Quantized Spatial Data (Abstract and Presentation)

Amy Braverman


Interactive Federal Statistical Data on the Web Using “NVIZN” (Paper and Presentation)

Jon Hurst, Jürgen Symanzik and Lacey Gunter


Visualizing Random Forests (Abstract)

Adele Cutler and Leo Breiman


Interactive Spinograms (Abstract)

Heike Hofmann and Martin Theus


Scatterplots for Massive Datasets (Abstract and Presentation)

Martin Theus, Di Cook and Heike Hofmann


Sensors for Biological Threats

Identification of Bio-warfare Agents and Other Applications of Molecular Biology (Abstract)

Todd Ritter


Rapid Microbe Detection with Fluidized Bed Capture and Concentration (Abstract)

Bart Weimer and Marie Walsh


Early Diagnosis of Biological Threats: Progress and Challenges (Abstract)

Stephen S. Morse


Best of KDD-2002

Customer Lifetime Value Modeling and its Use for Retention Planning (Abstract)

Saharon Rosset, Einat Neumann, Uri Eick, Nurit Vatnik

Doing Something Useless Slightly Faster: The State of the Art in Time Series Data Mining (Abstract)

Eamonn Keogh


Homeland Security and Related Issues

Areas of Homeland Security: At the Computational Statistical Interface (Abstract)

Deborah Leishman


Rule-Based Anomaly Pattern Detection for Detecting Disease Outbreaks (Abstract)

Andrew Moore


Pointers from Research on Data Confidentiality and Data Quality (Abstract and Presentation)

Ashish Sanil


Modern Text Processing and Distribution

The Journal of Statistical Software (Abstract and Presentation)

Jan de Leeuw


Preparing Electronic Books (Abstract and Presentation)

Edward J. Wegman and Amy Braverman


Electronic Books for Experts and Users (Paper and Presentation)

Zdeněk Hlávka


Environmental Statistics

A Spatial Model for Chronic Wasting Disease in Rocky Mountain Mule Deer (Paper)

Christopher H. Mehl and Craig J. Johns


Spatial Statistics in the Presence of Location Error (Abstract and Presentation)

John Kornak and Noel Cressie


Separating Signal from Noise in Global Warming (Paper)

Bert W. Rust


Predictive Mapping of Forest Characteristics for Fire Risk Assessment (Abstract)

Gretchen Moisen, Tracey Frescino, Cheng Huang, Jim Vogelmann, Zhiliang Zhu


Nonparametric Modeling of Soil Characteristics of Crop Models (Abstract)

Stephan R. Sain and Doug Nychka


Social Networks and Statistics

Random-Effects Models for Network Dependence (Abstract)

Peter Hoff


Ultra-Robust and Scalable Networks Based on Hierarchies (Abstract and Presentation)

Peter Dodds, Duncan Watts and Charles Sabel


Statistical Models, Degeneracy and Inference for Social Networks (Abstract)

Mark S. Handcock


Best of the International Association for Statistical Computing

Incremental Tree-Based Missing Data Imputation with Lexicographic Ordering (Paper and Presentation)

Claudio Conversano and Roberta Siciliano


Many Faces of a Tree (Paper)

Simon Urbanek


A Wildlife Simulation Package (WiSP) (Paper and Presentation)

Walter Zucchini, David Borchers, Stefan Kirchfeld and Martin Erdelmeier


Digital Government Research in Support of Federal Statistics

Using Ontology as Generalized Metadata Schema for Access to Distributed Heterogeneous Data Sources (Abstract)

Edward Hovy


Interfaces to a Statistical Knowledge Network (Abstract)

Gary Marchionini


New Approaches to Mobile Computing for Field Data Collection (Abstract)

Sarah Nusser


Smoothing and Nonparametric Feature Detection

A SiZer Analysis of IP Flow Start Times (Abstract)

J. S. Marron, Felix Hernandez-Campos and F. D. Smith


Longitudinal Kernel Regression (Abstract)

Naisyin Wang, Raymond J. Carroll, Xihong Lin and Ziding Feng


Semiparametric Regression Smoothing and Feature Detection in Time Series (Paper)

Michael G. Schimek

Safety and Security


Traffic Safety Analysis: A Data Mining Approach (Paper)

J. Michael Hardin, Michael Conerly and Wade Watkins


Bayesian Inductively Learned Modules for Safety Critical Systems (Paper)

Jonathan E. Fieldsend, Trevor C. Bailey, Richard M. Everson, Wojtek J. Krzanowski, Derek Partridge and Vitaly Schetinin


Waypoint Analysis for Command and Control (Abstract)

Mark Irwin, David Wendt and Noel Cressie


Continually Improving Stream Analysis for Network Security (Paper and Presentation)

Nancy J. McMillan, Douglas D. Mooney and David A. Burgoon


A Micro-Scale Epidemiological Simulation for Management of Disease (Abstract)

Sid Baccam, Stephen Eubank and Catherine Macken


Public Health Preparedness and Response in Crisis

Using Design-Based Adaptive Sampling Procedures in Site Decontamination (Paper)

Myron J. Katzoff, Abera Wouhib and Joe Fred Gonzalez, Jr


Game Theory and Risk Analysis for the Smallpox Threat (Abstract and Presentation)

David Banks


Best of the Journal of Computational and Graphical Statistics

Penalized Survival Models and Frailty (Abstract)

V. Shane Pankratz, Patricia M. Grambsch and Terry M. Therneau


Adaptive Order Selection for Spline Smoothing (Abstract)

Randy Eubank, Chunfeng Huang and Suojin Wang


An Adaptive Spatial Scan Density Estimation Method (Abstract)

Ramani S. Pilla, Peng Tao and Carey Priebe


Infrastructure Security

Energy Infrastructure Vulnerability Assessments (Abstract)

Jeff Dagle


PNNL and International Border Security (Abstract)

William C. Cliff


Electricity Infrastructure Security (Abstract)

Thomas Kropp


Statistical Computing (Refereed Papers)

RGL: A R-Library for 3D Visualization with OpenGL (Paper and Presentation)

Daniel Adler, Oleg Nenadić and Walter Zucchini


Constructive Ensembles for Time Series in Econometrics and Finance (Paper)

H. D. Vinod


The Quickest Sequential Detection of Intrusions in Computer Networks (Abstract)

Boris Rozovskii, Rudolf Blazek, Hongjoong Kim and Alexander Tartakovsky


Implementing Legacy Statistical Algorithms in a Spreadsheet Environment (Paper and Presentation)

Stephen W. Liddle and John S. Lawson



A Bayesian Mixture Model for Differential Gene Expression (Abstract and Presentation)

Kim-Anh Do, Peter Mueller and Feng Tang


A Simple Approach to Accommodating Interactive and Batch Processes on a Bioinformatics Cluster (Abstract)

Warren M. Snelling, John W. Keele and Gregory P. Harhay


Selecting an Optimal Rejection Region for Multiple Testing (Paper)

David R. Bickel


Statistical Methods for Spot Detection with Macroarray Data (Paper)

Yi Xie, Adele Cutler, Bart Weimer and Andrejus Parfionovas


Statistical Issues in Computer Security

Email Worm Propagation on Random Graphs (Paper)

Stephan Bohacek


User Profiling for Intrusion Detection in Windows NT (Paper)

Tom Goldring


A Stochastic Model of Computer Intrusions for Evaluations and Exercises (Abstract and Presentation)

Robert P. Goldman


Multi-Level Monitoring and Fuzzy Clustering to Detect Cyber Attacks (Abstract)

Dipankar Dasgupta, Jonatan Gomez and Fabio Gonzalez


Interactive GeoGraphics for the Web

Integrated Climate Database (Abstract)

Dan Dansereau and Robert R. Gillies


Web Cartography for Municipal Government: An Accessibility Case Study (Abstract)

Robert Edsall


Design and Statistical Computation

Application of Simulated Annealing to D-optimal Design for Polynomial Regression with Correlated Observations (Abstract)

Zewen Zhu and Daniel C. Coster


Experimental Design for Body Image Testing (Paper)

Craig J. Johns and Russel L. Boice


A Stabilized Lugannani-Rice Formula (Paper)

George R. Terrell


Computation of the Normalization Constant for Exponentially Weighted Dirichlet Distribution Integrals (Paper)

Alan Genz and Paul Joyce


A Plot for Visualizing Multivariate Data (Paper and Presentation)

Rida E. A. Moustafa


Nonparametrics (Refereed Papers)

Estimating Partially Linear Models Using Wavelets: A Nonlinear Backfitting Algorithm (Paper)

Leming Qu


Nonlinear Smoothers in Two Dimensions for Environmental Data (Paper)

Karen Kafadar and Max D. Morris


A Comparison of Filter and Wrapper Methods for Feature Selection in Supervised Classification (Paper)

Edgar Acuña


Novel Methods for Multivariate Ordinal Data Applied to Olympic Medals, Risk Profiles, Genomic Pathways, Genetic Haplotypes, Pattern Similarity, and Array Normalization (Paper)

Knut M. Wittkowski


Data Management for Statistical Databases

Database Technology for Statistical Data (Abstract)

Arie Shoshani


Metadata Usage in Statistical Computing (Paper)

Wilfried Grossmann


Data Structures for HIV and AIDS Notification and Analysis (Abstract and Presentation)

Andrew Westlake



Multiscale ‘Spatial’ Analysis of Network Data: Putting Wavelets on Graphs (Abstract)

Eric D. Kolaczyk


Social Networks and Computer Networks (Abstract and Presentation)

John Rigsby and Jeff Solka


Simultaneous Selection of Features and Metric for Optimal Nearest Neighbor Classification (Paper and Presentation)

David A. Johannsen, Edward J. Wegman, Jeffrey L. Solka and Carey E. Priebe


Prediction of Catastrophic Events

Detecting Features in Seismic and Geodetic Data (Abstract)

Andrea Donnellan, Robert Granat and John Rundle


Predicting Damaging Climate Events: Methods, Examples, and Public Reaction (Abstract)

David W. Pierce and Tim P. Barnett


Predicting and Comprehending Asteroid Impacts (Abstract and Presentation)

Clark R. Chapman


Graphics for Bio and Chem Informatics

Graphic-Centric, Computationally-Efficient Recursive Partitioning (Abstract)

James Vivian, S. Stan Young and Christophe Lambert


Applications of Computational Geometry, Statistical Analysis and Graphics to the Study of Molecular Systems (Abstract)

Daniel B. Carr and Iosif Vaisman


Statistical Methods to Compress and Query Large Databases and Data


Data Stream Algorithmics (Abstract)

S. Muthukrishnan

Efficient Processing of Massive Data Streams for Mining and Monitoring (Abstract and Presentation)

Mirek Riedewald, Johannes Gehrke, Alan Demers, Abhinandan Das, Alin



Wavelets for Efficient Querying of Large Multidimensional Data Sets (Abstract and Presentation)

Cyrus Shahabi


Data Mining Combat Simulations

Data Mining Combat Simulations: An Emerging Opportunity (Abstract and Presentation)

Barry A. Bodt


Regression Tree Analysis of Battle Simulation Data (Abstract)

Wei-Yin Loh


Robust Modeling Based on L2E Applied to Combat Simulation Data (Paper)

David B. Kim


Discovery of Battle States Knowledge from Multi-Dimensional Time Series Data (Abstract)

T. W. Liao, B. Bodt, J. Forester, C. Hansen, E. Heilman, C. Kaste, J. O’May


Forests at Risk

Satellite to the Public in Near Real-Time: Providing Active Wildfire Information with MODIS Rapid Response (Abstract)

Mark Finco, Brad Quayle, Rob Sohlberg and Jacques Descloitres


Identifying “Redtops”: Classification of Satellite Imagery for Tracking Mountain Pine Beetle Progression through a Pine Forest (Paper)

Richard Cutler, Leslie Brown, James Powell, Barbara Bentz and Adele Cutler


Design Attributes for Sampling Rare Ecological Events in Forest Ecosystems: Lichens in the Pacific Northwest (Abstract)

Thomas C. Edwards and Richard Cutler


Nonparametric Methods and Applications

Multivariate Density Estimation with Permuted Variable-Values (Abstract and Presentation)

Sridevi Parise, Padhraic Smyth and Sergey Kirshner


Computational Challenges in Computing Nearest Neighbor Estimates of Entropy for Large Molecules (Paper and Presentation)

E. James Harner, Harshinder Singh, Shengqiao Li and Jun Tan


Mixture Transitions for Edge Preservation in Kalman Filtering (Abstract)

Mark Fitzgerald


Statistical Learning Theory and Statistics: Embracing New Technologies (Abstract)

Kevin Watanabe


Using a LOESS Smoother to Estimate the Parameters of an Angular Dependent Distribution of HRR Data (Abstract)

Bradley C. Wallet, Robert W. Hawley and Troy L. Klein


The following paper was inadvertently omitted from the 2001 Proceedings Volume 33

Interval Computation of Gamma Probabilities and their Inverses (Paper)

Trong Wu