
DEPARTMENT OF STATISTICAL SCIENCE
GENERAL INFORMATION
The University-wide Department of Statistical Science at Cornell coordinates activities in statistics and probability at the undergraduate, graduate, and research levels.
Students interested in graduate study in statistics and probability can apply to the Graduate Field of Statistics or to one of the other graduate fields of study that offer related course work. Students in the Field of Statistics plan their graduate program with the assistance of their Special Committee. For detailed information on opportunities for graduate study, students should contact the director of graduate studies, 610 Rhodes Hall. The department also offers an undergraduate program through the Biometrics Unit in the College of Agriculture and Life Sciences. Undergraduate majors and certificate programs are currently under development for other colleges. For information, contact the undergraduate coordinator, Professor Steven Schwager (424 Warren Hall, 255-1644). Statistics courses offered by the department listed below will fill distribution requirements in many of the colleges.
A free consulting service is offered through the Biometrics Unit in the College of Agriculture and Life Sciences. Statistical computing consulting is available through a separate organization, the Office of Statistical Consulting, B21 Savage Hall, 255-1926.
The department is organized into four units: Biometrics, Engineering Statistics, Mathematical Statistics and Probability, and Social Statistics. The areas covered include agricultural statistics; biostatistics; economic and social statistics; epidemiology; manufacturing statistics, quality control and reliability; probability theory; sampling theory; statistical computing; statistical design; statistical theory; and stochastic processes and their applications.
Descriptions of undergraduate and graduate courses are listed below.
Biometrics Unit
Engineering Statistics Unit
Mathematical Statistics and Probability Unit
Social Statistics Unit
Related Courses in Other Departments
Major concepts and approaches of statistics are presented at an introductory level. Three broad areas are covered: collecting data, organizing data, and drawing conclusions from data. Topics include sampling, statistical experimentation and design, measurement, tables, graphs, measures of center and spread, probability, the normal curve, confidence intervals, and statistical tests.
Statistical methods are developed and used to analyze data arising from the biological sciences. Topics include point and confidence interval estimation, hypothesis testing, t-tests, correlation, simple linear regression, and analysis of variance and multiple regression. Statistical computing is taught and used throughout the course. Emphasis is on proper use of statistical methodology and interpretation of statistical analyses.
Students attend a weekly seminar, the Biometrics Unit Discussion Series. Can be taken concurrently with BTRY 600 only with permission of instructor. Students may take the course at most twice.
An introduction to probability theory: foundations, combinatorics, random variables and their probability distributions, expectations, generating functions, and limit theory. Biological and statistical applications are the focus. Can serve as either a one-semester introduction to probability or a foundation for a course in the theory of statistics.
The concepts developed in BTRY 408 are applied to provide an introduction to the classical theory of parametric statistical inference. Topics include sampling distributions, parameter estimation, hypothesis testing, and linear regression. Students seeking applied courses in statistical methodology should consider BTRY 601-602 or BTRY 215.
A course of lectures selected by the faculty. Because topics usually change from year to year, this course may be repeated for credit.
Participation in the Biometrics Unit consulting service: faculty-supervised statistical consulting with researchers from other disciplines. Discussion sessions for joint consideration of selected consultations encountered during previous weeks.
Consists of individual tutorial study selected by the faculty. Because topics usually change from year to year, this course may be repeated for credit.
Statistical methods are developed and used to analyze data arising from a wide variety of applications. Topics include descriptive statistics, point and interval estimation, hypothesis testing, inference for a single population, comparisons between two populations, one- and two-way analysis of variance, comparisons among population means, analysis of categorical data, and correlation and regression analysis. Interactive computing is introduced through MINITAB statistical software. Emphasis is on basic principles and criteria for selection of statistical techniques.
A continuation of BTRY 601. Emphasis is on the use of multiple regression analysis, analysis of variance, and related techniques to analyze data in a variety of situations. Topics include an introduction to data collection techniques; least squares estimation; multiple regression; model selection techniques; detection of influential points; goodness-of-fit criteria; principles of experimental design; analysis of variance for a number of designs, including multi-way factorial, nested, and split plot designs; comparing two or more regression lines; and analysis of covariance. Emphasis is on appropriate design of studies prior to data collection, and the appropriate application and interpretation of statistical techniques. For practical applications, computing is done with the MINITAB and SAS statistical packages.
Categorical data analysis, including logistic regression, loglinear models, stratified tables, matched pairs analysis, polytomous response and ordinal data. Applications in biomedical and social sciences.
Applications of experimental design including such advanced designs as split plot, incomplete blocks, fractional factorials. Use of the computer for both design and analysis will be stressed, with emphasis on solutions of real data problems.
Nonparametric and distribution-free alternatives to normal-theory testing procedures are presented; sign or rank tests for one or two populations; analyses for completely randomized and randomized blocks designs, comparisons among several means; correlation and regression; goodness-of-fit; and tests based on randomization of the data.
This course will develop skills in the preparation and interpretation of epidemiological data by discussing current research topics and issues.
Mathematical and statistical analysis of populations and communities: theory and methods. Spatial and temporal pattern analysis, deterministic and stochastic models of population dynamics. Model formulation, parameter estimation, and simulation and analytical techniques.
This course is a discussion group focusing on statistical problems arising in the environmental sciences. These issues are explored in a number of different ways, such as student presentations of research papers, directed readings, and outside speakers.
Statistical and mathematical topics of current interest in molecular biology: genetic mapping, physical mapping, DNA sequence analysis, phylogenetic inference, population modeling. Topics may vary. The course may be repeated for credit.
Analysis of variance and estimation procedures for unequal-subclass-numbers data. Cell means models for the 1-way classification, nested classifications, and the 2-way crossed classification, both with and without interactions; introduction to multinormal variables and the distribution of quadratic forms. The general linear model (in matrix and vector form), estimable functions, and testable hypotheses. Overparameterized models, restricted models, multifactor cases, covariables, computing.
This course should give students a working knowledge of basic probability and statistics and their application to engineering. Computer analysis of data and simulation are emphasized. Topics include random variables, probability distributions, expectation, estimation, testing, experimental design, quality control, and regression.
Introduction to the theory of probability as a basis for modeling random phenomena and signals, calculating the response of systems, and making estimates, inferences, and decisions in the presence of chance and uncertainty. Applications will be given in such areas as communications and device modeling. Topics include probability; characteristic functions; nonlinear transformations of data; expectation and correlation; and the central limit theorem.
This second course in probability and statistics provides a rigorous foundation in theory combined with the methods for modeling, analyzing, and controlling randomness in engineering problems. Probabilistic ideas are used to construct models for engineering problems, and statistical methods are used to test and estimate parameters for these models. Specific topics include random variables; probability distributions; density functions; expectation and variance; multidimensional random variables; important distributions, including the normal, Poisson, and exponential; hypothesis testing; confidence intervals; and point estimation using maximum likelihood and the method of moments.
Basic concepts and techniques of random processes are used to construct models for a variety of problems of practical interest. Topics include the Poisson process, Markov chains, renewal theory, models for queuing and reliability.
Introduction to models for random signals in discrete and continuous time, Markov chains, Poisson process, queuing processes, power spectral densities, Gaussian random process. Response of linear systems to random signals. Elements of estimation and inference as they arise in communications and digital signal processing systems.
Linear models; estimation and testing; confidence sets; diagnostics and residual analysis; variable selection and modeling.
Experimental design to improve industrial products and manufacturing processes. Randomization. Blocking. Fractional factorials. Orthogonal arrays. Nested designs.
A strong familiarity with linear algebra is assumed. An introduction to the theory of algebraic error-control codes. Topics include: Hamming codes, group codes, the standard array minimum-distance decoding, cyclic codes, and the dual of a linear block code. Hamming and Singleton bounds for error-correcting codes. The construction and decoding of Bose-(Ray) Chaudhuri-Hocquenghem (BCH) and Reed-Solomon (RS) codes. Computer methods for the study of the structure and algorithms for error-control are used.
Fundamental results of information theory with application to storage, compression, and transmission of data. Entropy and other information measures. Block and variable length codes. Channel capacity and rate-distortion functions. Coding theorems and converses for classical and multiterminal configurations. Gaussian sources and channels.
Classical line-switched communication networks: point-process models for offered traffic; blocking and queuing analyses. Stability, throughput, and delay of distributed algorithms for packet-switched transmission of data over local-area and wide-area nets: using various protocols, TDMA. Flow control and capacity assignment algorithms for wideband circuit-switched and ATM networks.
An introduction to those methods of making rational decisions and inferences and of forming estimates that are central to problems of communications, detection, pattern recognition, and statistical signal processing. Topics include Bayes, minimax, and Neyman-Pearson decision theories; Bayes and maximum likelihood point estimation; the Cramér-Rao bound; efficient and consistent estimation; spectral estimation; and robust models for signal extraction.
Artificial neural networks are brainlike in being formed out of many highly interconnected nonlinear memoryless elements. Probability theory will provide our primary analytical approach to design and analysis of neural networks. The course will cover mathematical and computer-based design capabilities of feed-forward nets (multilayer perceptrons) that can serve as pattern classifiers.
Basic concepts and techniques of random processes are used to construct models for a variety of problems of practical interest. Topics include the Poisson process, Markov chains, renewal theory, models for queuing and reliability.
This second course in probability and statistics provides a rigorous foundation in theory combined with the methods for modeling, analyzing, and controlling randomness in engineering problems. Probabilistic ideas are used to construct models for engineering problems, and statistical methods are used to test and estimate parameters for these models. Specific topics include random variables; probability distributions; density functions; expectation and variance; multidimensional random variables; important distributions, including the normal, Poisson, and exponential; hypothesis testing; confidence intervals; and point estimation using maximum likelihood and the method of moments.
Basic queuing models. Little's Law, PASTA property. Markovian and non-Markovian queues. Optimization of queues. Polling queues: exhaustive and gated service. Jackson queuing networks. Open networks and closed networks. Product-form queuing networks.
The first part of this course treats regression methods to model seasonal and non-seasonal data. After that, Box-Jenkins models, which are versatile, widely used, and applicable to nonstationary and seasonal time series, are covered in detail. The various stages of model identification, estimation, diagnostic checking, and forecasting are treated. Analysis of real data is carried out. Assignments require computer work with a time-series package.
Analysis of data from reliability, fatigue, and life-testing studies in engineering; biomedical applications. Survival distributions, hazard rate, censoring. Life tables. Estimation and hypothesis testing. Standards. Goodness of fit, hazard plotting. Covariance analysis, accelerated life testing. Multiple decrement models, competing risks. Fault tree analysis, reliability of systems.
Concepts and methods for process and acceptance control. Control charts for variables and attributes. Process capability analysis. Acceptance sampling. Continuous sampling plans. Life tests. Use of experimental design and Taguchi methods for off-line control.
Digital computer programs to simulate the operation of complex discrete systems in time. Modeling, program organization, pseudo-random-variable generation, simulation languages, statistical considerations, applications to a variety of problem areas.
An introduction to stochastic processes that presents the basic theory together with a variety of applications. Topics include Markov processes, renewal theory, random walks, branching processes, Brownian motion, stationary processes, martingales, and point processes.
Sample spaces, events, sigma fields, probability measures, set induction, independence, random variables, expectation, review of important distributions and transformation techniques, convergence concepts, laws of large numbers and asymptotic normality, conditioning.
Review of distribution theory of special interest in statistics: normal, chi-square, binomial, Poisson, t, and F; introduction to statistical decision theory; sufficient statistics; theory of minimum variance unbiased point estimation; maximum likelihood and Bayes estimation; basic principles of hypothesis testing, including Neyman-Pearson Lemma and likelihood ratio principle; confidence interval construction; introduction to linear models.
Statistical inference based on the general linear model; least-squares estimators and their optimality properties; likelihood ratio tests and corresponding confidence regions; simultaneous inference. Applications in regression analysis and ANOVA models. Variance components and mixed models. Use of the computer as a tool for statistics is stressed.
This course is a discussion group focusing on statistical problems arising in the environmental sciences. These issues are explored in a number of different ways, such as student presentations of research papers, directed readings, and outside speakers.
This introductory statistics course will discuss techniques for analyzing data occurring in the real world and the mathematical and philosophical justification for these techniques. Topics include population and sample distributions; the central limit theorem; statistical theories of point estimation, confidence intervals, and hypothesis testing; the linear model; and the least squares estimator. The course concludes with a discussion of tests and estimates for regression and analysis of variance (if time permits). The computer will be used to demonstrate some aspects of the theory, such as sampling distributions and the central limit theorem. In the lab portion of the course, students will learn and use computer-based methods for implementing the statistical methodology presented in the lectures. (No previous familiarity with the computer is presumed.)
May be used as a terminal course in basic probability. Intended primarily for those who will continue with Mathematics 472. Topics include combinations, important probability laws, expectations, moments, moment-generating functions, limit theorems. Emphasis is on diverse applications and on development of use in statistical applications. See also the description of Mathematics 571.
Classical and recently developed statistical procedures are discussed in a framework that emphasizes the basic principles of statistical inference and the rationale underlying the choice of these procedures in various settings. These settings include problems of estimation, hypothesis testing, and large sample theory.
This is a second-semester undergraduate course on probability. It covers topics from renewal theory, martingales, discrete and continuous time Markov chains, Brownian motion and related diffusion processes, and applications to queuing theory and finance. Theoretical as well as applied aspects of the subject will be emphasized.
Properties and examples of probability spaces. Sample space, random variables, and distribution functions. Expectation and moments. Independence, Borel-Cantelli lemma, zero-one law. Convergence of random variables, probability measures, and characteristic functions. Law of large numbers. Selected limit theorems for sums of independent random variables. Markov chains, recurrent events. Ergodic and renewal theorems. Martingale theory. Brownian motion and processes with independent increments.
Topics include an introduction to the theory of point estimation, consistency, efficiency, sufficiency, and the method of maximum likelihood. Convexity and basic concepts of decision theory are introduced. Concepts of sequential methods may be discussed.
An introduction to the basic concepts of statistics and data analysis. Descriptive methods, mathematical models and inference procedures for univariate and bivariate data. Basic statistical designs, and introduction to probability and applications of the Binomial and Normal distributions. Estimation, confidence intervals and tests of significance for a population mean and proportion, simple linear regression, correlation and two-way contingency tables. Students are instructed on the use of a statistics computer package at the beginning of the term and use it for weekly assignments.
A second course in statistics. Applications of statistical data analysis techniques, particularly to the social sciences. Topics include: statistical inference; simple linear regression; multiple linear regression; elements of time series analysis; and sample-survey design. Computer packages are used throughout the course.
Theory and application of statistical sampling, especially in regard to sample design, cost, estimation of population quantities, and error estimation. Assessment of nonsampling errors. Discussion of applications to social and biological sciences and to business problems. Course includes an applied project.
Matrix algebra is a necessary tool for statistics courses such as regression and multivariate analysis and for other "research methods" courses in various other disciplines. One goal of this course is to provide students in various fields of knowledge with a basic understanding of matrix algebra in a language they can easily understand. Topics include special types of matrices; matrix calculations; linear dependence and independence; vector geometry; matrix reduction (trace, determinant, norms); matrix inversion; linear transformation; eigenvalues; matrix decompositions; ellipsoids and distances; some applications of matrices.
First the matrix algebra necessary to analyze regression models is reviewed. Then, multiple linear regression, analysis of variance, nonlinear regression, and linear logistic regression models are covered. For these models, least squares and maximum likelihood estimation, hypothesis testing, model selection, and diagnostic procedures are considered. Illustrative examples are taken from the social sciences. Computer packages are used. Course includes an applied project.
Techniques of multivariate statistical analysis are discussed and illustrated by examples from various fields. We emphasize application, but theory will not be ignored. Deviation from assumptions and the rationale for choices among techniques are discussed. Students are expected to learn how to thoroughly analyze real-life data sets using computer-packaged programs. Participants should have some knowledge of matrix notation. Topics include: multivariate normal distribution; sample geometry and multivariate distances; inference about a mean vector; comparison of several multivariate means, variances, and covariances; detection of multivariate outliers; principal component analysis; factor analysis; canonical correlation analysis; discriminant analysis; and multivariate multiple regression.
An advanced undergraduate and beginning graduate course. Includes treatment of association between qualitative variates, rank-order methods, and other nonparametric statistical techniques, including those related to chi-squared.
A first course in statistics for graduate students in the social sciences. Descriptive statistics, probability and sampling distributions, estimation, hypothesis testing, simple linear regression and correlation. Students are instructed in the use of a statistics computer package at the beginning of the term and use it for weekly assignments.
A second course in statistics that emphasizes applications to the social sciences. Topics include simple linear regression; multiple linear regression (theory, model building, and model diagnostics); and the analysis of variance. Computer packages are used extensively.
An advanced survey of modern data analysis methods. Topics include exploratory data analysis, data reexpression, philosophy of data analysis, robust methods, statistical graphics, regression methods, and diagnostics. Extensive outside readings cover recent and historical work. Participants should have some knowledge of multiple regression, including the use of matrices, and some experience using a computer.
A survey of new aspects of statistical computing. Topics include: basic numerical methods, numerical linear algebra, nonlinear statistical methods, numerical integration and approximation, smoothing and density estimation. Additional special topics may include: Monte Carlo methods, statistical graphics, computing-intensive methods, parallel computation, computing environments. Designed for graduate students in the statistical sciences and related fields interested in new advances. Students may be asked to write programs in a programming language of their choice.
An introduction to a variety of statistical techniques that assign objects to categories on the basis of observed characteristics of the objects. Course topics include (but are not limited to): discriminant analysis and its extensions and variations; nearest neighbor methods; classification and regression trees (CART); neural networks for classification; and estimation of error of classification rules.
This course covers the following topics: loss functions and utility theory, prior information and subjective probability, coherency, basic Bayesian inference, empirical Bayesian inference, robust Bayesian inference, Bayesian computations, ancillarity, conditional properties of statistical procedures, and Barndorff-Nielsen's exact likelihood theory.
Provides a comprehensive introduction to the general structural equation system, commonly known as the "LISREL model." One purpose of the course is to demonstrate the generality of this model. Rather than treating path analysis, recursive and nonrecursive models, classical econometrics, and confirmatory factor analysis as distinct and unique, we will treat them as special cases of a common model. Another goal of the course is to emphasize the application of these techniques.
This is an interdisciplinary course for students in applied mathematics, computer science, statistics, and other related fields of applications such as medical, engineering, and social sciences. Topics include: components of expert systems, rule-based expert systems, probability-based expert systems, uncertainty measures, dependency models, Bayesian and Markov networks, propagation of uncertainties, learning structure from data, and examples of applications. Students will use computer software to gain experience.
This course is a continuation of Economics 519 (Econometrics I) covering (1) statistics: estimation theory, least squares methods, method of maximum likelihood, generalized method of moments, theory of hypothesis testing, asymptotic test theory, and nonnested hypothesis testing; and (2) econometrics: the general linear model, generalized least squares, specification tests, instrumental variables, dynamic regression models, linear simultaneous equation models, nonlinear models, and applications.
This course gives the probabilistic and statistical background for meaningful application of econometric techniques. Topics to be covered are (1) probability theory: probability spaces, random variables, distributions, moments, transformations, conditional distributions, distribution theory and the multivariate normal distribution, convergence concepts, laws of large numbers, central limit theorems, and Monte Carlo simulation; and (2) statistics: sample statistics, sufficiency, and exponential families of distributions. Further topics in statistics will be considered in Economics 520.
A course on regression for students in statistical sciences and related fields. Attempts to narrow the gap between the theory and practical application of the linear regression model. Classical and recently developed statistical procedures are discussed. Students will be expected to read articles and thoroughly analyze real-life data sets using computer-packaged programs. Topics include the role of variables in the regression equation, regression diagnostics (outliers, leverage points, influential observations), generalized linear models, errors in variables, and multicollinearity.
Sampling theory from the viewpoint of mathematical statistics. The first part of the course focuses on the classical or "design" approach; the second part on the more recent "model-based" approach. Attention is paid to recent progress in the field.
The statistical analysis of life history data is playing an increasing role in the social, natural, and physical sciences. We will formulate and solve various practical problems in the statistical analysis of life history data using the modern theory of stochastic processes. We will examine the martingale dynamics for point processes relevant to life history data. Both parametric and nonparametric inference for multiplicative intensity models will be considered. The large sample properties of the proposed procedures will be discussed in detail using recent extensions of functional central limit theorems for martingales.
Recent research has revealed vast territories of distribution theory that are unfamiliar to most statisticians. Provides an introduction to three topics underlying this "modern" theory: infinite divisibility, decomposability, and stability; characterization of distributions; extensions of univariate distributions to multivariate distributions.
In most statistical models, exact distribution theory for testing hypotheses or constructing confidence intervals is either unavailable or computationally cumbersome. Inferences are routinely performed by using large-sample approximations to the distributions of test statistics. This course provides a survey of some recent higher-order asymptotic approximations for likelihood-based methods of inference.
A course in practical consulting on "real-world" statistical problems. Under the supervision of the instructor(s), students will hear problems presented by clients (usually faculty and graduate students from other fields) and will collaborate in proposing a statistical model, analyzing data, and interpreting results. Statistical computing will be used as needed.
Advanced topics in econometrics, such as asymptotic estimation and test theory, robust estimation, Bayesian inference, advanced topics in time-series analysis, errors in variable and latent variable models, qualitative and limited dependent variables, aggregation, panel data, and duration models.
This course covers traditional and current time series techniques that are widely used in econometrics. Topics include the theory of stationary stochastic processes, including univariate ARMA(p,q) models, spectral density analysis, and vector autoregressive models; parametric and semi-parametric estimation; current developments in distributional theory; and estimation and testing in models with integrated regressors, including unit root tests, cointegration, and permanent vs. transitory components.
Advanced topics in econometrics, such as asymptotic estimation and test theory, robust estimation, Bayesian inference, advanced topics in time-series analysis, errors in variable and latent variable models, qualitative and limited dependent variables, aggregation, panel data, and duration models.