Seminars

2018-11-29 - Bruno Sanso

Presenter:

Bruno Sanso

Title:

Affiliation:

University of California Santa Cruz

Date:

November 29, 2018

Abstract:

Website:

Dr. Sanso's Website

2018-11-15 - Margie Rosenberg

Presenter:

Margie Rosenberg

Title:

Affiliation:

University of Wisconsin-Madison

Date:

November 15, 2018

Abstract:

Website:

Dr. Rosenberg's Website

2018-11-08 - Terrance Savitsky

Presenter:

Terrance Savitsky

Title:

Affiliation:

Bureau of Labor Statistics

Date:

November 8, 2018

Abstract:

Website:

2018-10-25 - Dustin Harding

Presenter:

Dustin Harding

Title:

Affiliation:

UVU

Date:

October 25, 2018

Abstract:

Website:

Dr. Harding's Website

2018-10-18 - Abel Rodriguez

Presenter:

Abel Rodriguez

Title:

Affiliation:

UC Santa Cruz

Date:

October 18, 2018

Abstract:

Website:

Dr. Rodriguez's Website

2018-09-20 - Scott Grimshaw - Going Viral, Binge Watching, and Attention Cannibalism

Presenter:

Dr. Scott Grimshaw

Title:

Going Viral, Binge Watching, and Attention Cannibalism

Affiliation:

BYU

Date:

September 20, 2018

Abstract:

Since digital entertainment is often described as viral, this paper uses the vocabulary and statistical methods of disease modeling to analyze viewer data from an experiment at BYUtv where a program's premiere was exclusively digital. Onset time, the days from the program premiere to a viewer watching the first episode, is modeled using a changepoint between epidemic viewing, with a non-constant hazard rate, and endemic viewing, with a constant hazard rate. Finish time, the days from onset to a viewer watching all episodes, uses an expanded negative binomial hurdle model to reflect the characteristics of binge watching. The hurdle component models binge racing, where a viewer watches all episodes on the same day as onset. One reason binge watching appeals to viewers is that they can focus attention on a single program's story line and characters before moving on to a second program. This translates to a competing risks model that has implications for scheduling digital premieres. Attention cannibalism occurs when a viewer takes a long time watching their first-choice program and then either never watches a second program or delays watching it until much later. Scheduling a gap between premieres reduces attention cannibalism.
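The hurdle structure can be made concrete with a small sketch. Below is a minimal, illustrative negative binomial hurdle log-likelihood for finish times, in which the hurdle component captures binge racing (finishing all episodes on the onset day); the function name, parameterization, and example data are assumptions for illustration, not the paper's exact specification.

```python
import numpy as np
from scipy import stats

def hurdle_nb_loglik(t, pi, mu, size):
    """t: finish times in days (0 = binge racing); pi: P(binge racing);
    mu, size: mean and dispersion of the negative binomial component."""
    t = np.asarray(t)
    racing = t == 0
    p = size / (size + mu)                 # scipy's (n, p) parameterization
    nb = stats.nbinom(size, p)
    # zero-truncated negative binomial for viewers who did not binge race
    log_pos = nb.logpmf(t[~racing]) - np.log1p(-nb.pmf(0))
    return racing.sum() * np.log(pi) + (~racing).sum() * np.log1p(-pi) + log_pos.sum()

# Finish times for eight hypothetical viewers, two of whom binge raced
print(hurdle_nb_loglik([0, 0, 1, 2, 2, 5, 9, 30], pi=0.25, mu=6.0, size=1.2))
```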

Website:

Dr. Grimshaw's website

2018-04-12 - Cristian Tomasetti - Cancer etiology, evolution, and early detection

Presenter:

Dr. Cristian Tomasetti

Title:

Cancer etiology, evolution, and early detection

Affiliation:

Johns Hopkins University School of Medicine

Date:

Apr 12, 2018

Abstract:

The standard paradigm in cancer etiology is that inherited factors and lifestyle or environmental exposures are the causes of cancer. I will present recent findings indicating that a third cause, never considered before, plays a large role: "bad luck," i.e., the pure chance involved in DNA replication when cells divide. Novel mathematical and statistical methodologies for distinguishing among these causes will also be introduced. I will then conclude with a new approach for the early detection of cancer.

Website:

Dr. Tomasetti's Website

2018-03-29 - H. Dennis Tolley - What's the Likelihood?

Presenter:

H. Dennis Tolley

Title:

What's the Likelihood?

Affiliation:

BYU

Date:

Mar 29, 2018

Abstract:

The likelihood function plays a major role in both frequentist and Bayesian methods of data analysis. Non-parametric Bayesian models also rely heavily on the form of the likelihood. Despite its heuristic foundation, the likelihood has several desirable large-sample statistical properties that prompt its use among frequentists. Additionally, there are other important facets of the likelihood that warrant its formulation in many circumstances. As fundamental as the likelihood is, however, beginning students are given only a cursory introduction to formulating it. This seminar illustrates the formulation of the likelihood for a family of statistical problems common in the physical sciences. By examining the basic scientific principles associated with an experimental set-up, we show the step-by-step construction of the likelihood, starting with the discrete random walk model as a paradigm. The resulting likelihood is the solution to a stochastic differential equation. Elementary applications of the likelihood are illustrated.
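As a concrete instance of the random-walk paradigm mentioned above, the sketch below builds the likelihood of the up-step probability for an observed discrete random walk; the path, helper name, and grid search are hypothetical illustrations, and the physical-science set-ups treated in the seminar are considerably richer.

```python
import numpy as np

def random_walk_loglik(path, p):
    """Log-likelihood of step probability p for an observed +/-1 random walk."""
    steps = np.diff(path)
    n_up = np.sum(steps == 1)
    n_down = np.sum(steps == -1)
    return n_up * np.log(p) + n_down * np.log(1 - p)

path = [0, 1, 2, 1, 2, 3, 4, 3, 4, 5]       # observed positions
# Evaluate on a grid; the maximizer is simply the fraction of up-steps (7/9)
grid = np.linspace(0.01, 0.99, 99)
print(grid[np.argmax([random_walk_loglik(path, p) for p in grid])])
```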

Website:

Dr. Tolley's website

2018-03-22 - Matthew Heaton - Methods for Analyzing Large Spatial Data: A Review and Comparison

Presenter:

Dr. Matthew Heaton

Title:

Methods for Analyzing Large Spatial Data: A Review and Comparison

Affiliation:

BYU

Date:

Mar 22, 2018

Abstract:

The Gaussian process is an indispensable tool for spatial data analysts. The onset of the “big data” era, however, has led to the traditional Gaussian process being computationally infeasible for modern spatial data. As such, various alternatives to the full Gaussian process that are more amenable to handling big spatial data have been proposed. These modern methods often exploit low-rank structures and/or multi-core and multi-threaded computing environments to facilitate computation. This study provides, first, an introductory overview of several methods for analyzing large spatial data. Second, this study describes the results of a predictive competition among the described methods as implemented by different groups with strong expertise in the methodology. Specifically, each research group was provided with two training datasets (one simulated and one observed) along with a set of prediction locations. Each group then wrote their own implementation of their method to produce predictions at the given locations, and each implementation was subsequently run on a common computing environment. The methods were then compared in terms of various predictive diagnostics.
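To give a flavor of the low-rank idea mentioned above, here is a minimal sketch of a predictive-process-style approximation in which an n x n covariance matrix is built from a small set of knots, so all expensive linear algebra reduces to k x k operations; the kernel, knot placement, and dimensions are illustrative assumptions, and the competition methods differ in many details.

```python
import numpy as np

def exp_cov(x1, x2, range_=0.2, sill=1.0):
    """Exponential covariance between 1-d location vectors x1 and x2."""
    return sill * np.exp(-np.abs(x1[:, None] - x2[None, :]) / range_)

n, k = 2000, 50
locs = np.random.rand(n)             # observation locations on [0, 1]
knots = np.linspace(0, 1, k)         # small set of knot locations

C_nk = exp_cov(locs, knots)          # n x k cross-covariance
C_kk = exp_cov(knots, knots)         # k x k knot covariance
# Low-rank surrogate for the full n x n covariance: C_nk C_kk^{-1} C_kn
C_lowrank = C_nk @ np.linalg.solve(C_kk, C_nk.T)
print(C_lowrank.shape)               # (2000, 2000), built from rank-k pieces
```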

Website:

Dr. Heaton's website

2018-03-15 - Timothy Hanson - A unified framework for fitting Bayesian semiparametric models to arbitrarily censored spatial survival data

Presenter:

Timothy Hanson

Title:

A unified framework for fitting Bayesian semiparametric models to arbitrarily censored spatial survival data

Affiliation:

Medtronic

Date:

Mar 15, 2018

Abstract:

A comprehensive, unified approach to modeling arbitrarily censored spatial survival data is presented for the three most commonly used semiparametric models: proportional hazards, proportional odds, and accelerated failure time. Unlike many other approaches, all manner of censored survival times are simultaneously accommodated, including uncensored, interval-censored, current-status, left- and right-censored, and mixtures of these. Left-truncated data are also accommodated, leading to models for time-dependent covariates. Both georeferenced (location observed exactly) and areally observed (location known up to a geographic unit such as a county) spatial locations are handled. Variable selection is also incorporated. Model fit is assessed with conditional Cox-Snell residuals, and model choice is carried out via LPML and DIC. Baseline survival is modeled with a novel transformed Bernstein polynomial prior. All models are fit via new functions that call efficient compiled C++ in the R package spBayesSurv. The methodology is broadly illustrated with simulations and real data applications. An important finding is that proportional odds and accelerated failure time models often fit significantly better than the commonly used proportional hazards model.
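For intuition about how mixed censoring types enter a proportional hazards likelihood, the sketch below evaluates a PH log-likelihood using S(t|x) = S0(t)^{exp(x'beta)}, with a Weibull baseline chosen purely for concreteness; this is an illustrative stand-in, not the spBayesSurv implementation, which uses a transformed Bernstein polynomial baseline and compiled C++.

```python
import numpy as np

def weibull_surv(t, shape, scale):
    """Baseline survival S0(t) for a Weibull(shape, scale)."""
    return np.exp(-(np.asarray(t, dtype=float) / scale) ** shape)

def ph_loglik(left, right, x, beta, shape=1.5, scale=10.0):
    """Mixed-censoring PH log-likelihood. Each subject contributes via the
    interval (left, right]: uncensored -> left == right, right-censored ->
    right = inf, left-censored -> left = 0, otherwise interval-censored."""
    eta = np.exp(x * beta)                        # proportional hazards effect
    ll = 0.0
    for l, r, e in zip(left, right, eta):
        if l == r:                                # exact event time: log f(t|x)
            h = e * (shape / scale) * (l / scale) ** (shape - 1)
            ll += np.log(h) + e * np.log(weibull_surv(l, shape, scale))
        else:                                     # censored: log{S(l|x) - S(r|x)}
            Sl = weibull_surv(l, shape, scale) ** e
            Sr = 0.0 if np.isinf(r) else weibull_surv(r, shape, scale) ** e
            ll += np.log(Sl - Sr)
    return ll

# uncensored at t=3, interval-censored in (2, 5], right-censored at 8
print(ph_loglik([3.0, 2.0, 8.0], [3.0, 5.0, np.inf],
                x=np.array([0.0, 1.0, 0.5]), beta=0.3))
```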

Website:

Dr. Hanson's LinkedIn

2018-03-08 - Daniel Nettleton - Random Forest Prediction Intervals

Presenter:

Dr. Daniel Nettleton

Title:

Random Forest Prediction Intervals

Affiliation:

Iowa State University

Date:

Mar 8, 2018

Abstract:

Breiman's seminal paper on random forests has more than 30,000 citations according to Google Scholar. The impact of Breiman's random forests on machine learning, data analysis, data science, and science in general is difficult to measure but unquestionably substantial. The virtues of random forest methodology include no need to specify functional forms relating predictors to a response variable, capable performance for low-sample-size high-dimensional data, general prediction accuracy, easy parallelization, few tuning parameters, and applicability to a wide range of prediction problems with categorical or continuous responses. Like many algorithmic approaches to prediction, random forests are typically used to produce point predictions that are not accompanied by information about how far those predictions may be from true response values. From the statistical point of view, this is unacceptable; a key characteristic that distinguishes statistically rigorous approaches to prediction from others is the ability to provide quantifiably accurate assessments of prediction error from the same data used to generate point predictions. Thus, we develop a prediction interval -- based on a random forest prediction -- that gives a range of values that will contain an unknown continuous univariate response with any specified level of confidence. We illustrate our proposed approach to interval construction with examples and demonstrate its effectiveness relative to other approaches for interval construction using random forests.
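One common way to turn a random forest point prediction into an interval is to add quantiles of the out-of-bag residuals to each prediction, sketched below; this is a generic construction for illustration, and the interval proposed in the seminar may differ in its details.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor

X, y = make_regression(n_samples=500, n_features=5, noise=10.0, random_state=0)
rf = RandomForestRegressor(n_estimators=500, oob_score=True, random_state=0).fit(X, y)

# Out-of-bag residuals estimate the distribution of prediction errors
# from the same data used to build the forest
oob_resid = y - rf.oob_prediction_
lo, hi = np.quantile(oob_resid, [0.025, 0.975])

# 95% prediction intervals around the point predictions
x_new = X[:3]
pred = rf.predict(x_new)
print(np.column_stack([pred + lo, pred + hi]))
```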

Website:

Dr. Nettleton's website

2018-02-22 - Robert Richardson - Non-Gaussian Translation Processes

Presenter:

Robert Richardson

Title:

Non-Gaussian Translation Processes

Affiliation:

BYU

Date:

Feb 22, 2018

Abstract:

A non-Gaussian translation process is a method used in some engineering applications in which a stochastic process is constructed with non-Gaussian marginal distributions. It can be considered a hierarchical copula model in which the correlation structure of the process is defined separately from the marginal distributional characteristics. This approach also yields a simple likelihood function for the finite-dimensional distributions of the stochastic process. These processes will be shown, in a few applications, either to perform tasks that could not be done previously or to perform them much more efficiently, including non-Gaussian option pricing, general multivariate stable spatial processes, and non-Gaussian spatio-temporal dynamic modeling.
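The translation construction itself is compact: a standard Gaussian process is pushed through the normal CDF and then through the inverse CDF of the desired marginal, so the correlation structure and the non-Gaussian marginals are specified separately. The sketch below assumes an exponential correlation and a gamma marginal purely for illustration.

```python
import numpy as np
from scipy import stats

n = 200
s = np.linspace(0, 1, n)                             # 1-d spatial locations
C = np.exp(-np.abs(s[:, None] - s[None, :]) / 0.1)   # exponential correlation

# Standard Gaussian process Z(s) with the chosen correlation structure
z = np.linalg.cholesky(C + 1e-10 * np.eye(n)) @ np.random.standard_normal(n)

# Translate: U = Phi(Z) is uniform(0,1) at each site; applying a gamma
# quantile function makes every marginal gamma while dependence comes from C
y = stats.gamma(a=2.0, scale=1.5).ppf(stats.norm.cdf(z))
print(y.mean(), y.var())   # roughly the gamma marginal's mean 3.0 and var 4.5
```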

Website:

Dr. Richardson's Website

2018-02-15 - Jeffery Tessem - How to make more beta cells: exploring molecular pathways that increase functional beta cell mass as a cure for Type 1 and Type 2 diabetes

Presenter:

Dr. Jeffery S Tessem

Title:

How to make more beta cells: exploring molecular pathways that increase functional beta cell mass as a cure for Type 1 and Type 2 diabetes

Affiliation:

Department of Nutrition, Dietetics and Food Science at BYU

Date:

Feb 15, 2018

Abstract:

Both Type 1 diabetes (T1D) and Type 2 diabetes (T2D) are caused by a relative insufficiency of functional β-cell mass. Current therapeutic options for diabetes include daily insulin injections to maintain normoglycemia, pharmacological agents to stimulate β-cell function and enhance insulin sensitivity, or islet transplantation. A major obstacle to greater application of islet transplantation therapy is the scarcity of human islets. Thus, new methods for expansion of β-cell mass, applied in vitro to generate the large numbers of human islet cells needed for transplantation, or in situ to induce expansion of the patient's remaining β-cells, could have broad therapeutic implications for this disease. To this end, our lab is interested in delineating the molecular pathways that increase β-cell proliferation, enhance glucose-stimulated insulin secretion, and protect against β-cell death.

Website:

Dr. Tessem's Website

2018-02-08 - Chris Groendyke - Bayesian Inference for Contact Network Models using Epidemic Data

Presenter:

Chris Groendyke

Title:

Bayesian Inference for Contact Network Models using Epidemic Data

Affiliation:

Robert Morris University

Date:

Feb 8, 2018

Abstract:

I will discuss how network models can be used to study the spread of epidemics through a population and, in turn, what epidemics can tell us about the structure of this population. I apply a Bayesian methodology to data from a disease presumed to have spread across a contact network in a population in order to perform inference on the parameters of the underlying network and disease models. Using a simulation study, I will discuss the strengths, weaknesses, and limitations of these models, and the data required for this type of inference. Finally, I will describe an analysis of an actual measles epidemic that spread through the town of Hagelloch, Germany, in 1861, and share the conclusions it allows us to draw regarding the population structure.
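A minimal forward model of the kind this inference targets is an epidemic simulated over a contact network, sketched below with an Erdős–Rényi network and discrete-time SIR dynamics; both are simplifying assumptions, as the seminar's network and disease models are richer.

```python
import numpy as np

def simulate_sir_on_network(n=100, p_edge=0.05, p_transmit=0.3, t_infectious=3, seed=0):
    rng = np.random.default_rng(seed)
    adj = rng.random((n, n)) < p_edge
    adj = np.triu(adj, 1)
    adj = adj | adj.T                            # undirected contact network
    state = np.zeros(n, dtype=int)               # 0 = S, 1 = I, 2 = R
    clock = np.zeros(n, dtype=int)               # days spent infectious
    state[0] = 1                                 # index case
    while (state == 1).any():
        infectious = np.where(state == 1)[0]
        for i in infectious:
            contacts = np.where(adj[i] & (state == 0))[0]
            newly = contacts[rng.random(contacts.size) < p_transmit]
            state[newly] = 1                     # transmission along edges
        clock[infectious] += 1
        state[infectious[clock[infectious] >= t_infectious]] = 2  # recovery
    return (state == 2).sum()                    # final epidemic size

print(simulate_sir_on_network())
```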

Website:

Chris's Website

2018-02-01 - Larry Baxter - Structure in Prior PDFs and Its Effect on Bayesian Analysis

Presenter:

Larry Baxter

Title:

Structure in Prior PDFs and Its Effect on Bayesian Analysis

Affiliation:

BYU

Date:

Feb 1, 2018

Abstract:

Bayesian statistics formalizes a procedure for combining established (prior) statistical knowledge with current knowledge to produce a posterior statistical description that presumably is better than either the prior or the new knowledge by itself. Two common applications of this theory involve (a) combining established (literature) estimates of model parameters with new data to produce better parameter estimates, and (b) estimating model prediction confidence bands. Frequently, the prior information includes reasonable parameter estimates, poorly quantified and often subjective parameter uncertainty estimates, and no information regarding how the values of one parameter affect the confidence intervals of other parameters. All three of these parameter characteristics affect Bayesian analysis. The first two receive a great deal of attention. The third characteristic, the dependence of model parameters on one another, creates structure in the prior pdfs. This structure strongly influences Bayesian results, often to an extent that rivals or surpasses that of the best estimates of parameter uncertainty. Nevertheless, Bayesian analyses commonly ignore this structure.

Structure stems primarily from the form of the model and, in linear models, does not depend on the observations themselves. Most models produce correlated parameters when applied to real-world engineering and science data. The most common example of structure is parameter correlation. Linear models produce linear parameter correlations that depend on the magnitude of the independent variable under analysis and that, in most practical applications, yield large correlation coefficients, often close to unity. Nonlinear models also generally have correlated parameters. However, the correlations can be nonlinear, even discontinuous, and generally involve more complexity than linear-model parameter correlations. Parameter correlations profoundly affect the results of Bayesian parameter estimation and prediction uncertainty. Properly incorporated structure produces Bayesian results that powerfully illustrate the strength and potential contribution of the theory. Bayesian analyses that ignore such structure produce poor or even nonsensical results, often significantly worse than a superficial guess.

This seminar demonstrates the importance of prior structure in both parameter estimation and uncertainty quantification using real data from typical engineering systems. Perhaps most importantly, the discussion illustrates methods of incorporating parameter structure for any given model that do not rely on observations. These methods quantify parameter structure, including the lack of structure, for linear and nonlinear models.
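The linear-model case described above can be seen in a few lines: the parameter covariance is proportional to (X^T X)^{-1}, which depends only on the design, and an uncentered regressor yields a near-unity correlation between intercept and slope. The design values below are illustrative.

```python
import numpy as np

x = np.linspace(100, 110, 20)               # independent variable far from zero
X = np.column_stack([np.ones_like(x), x])   # design matrix: intercept + slope

# Parameter covariance is proportional to (X^T X)^{-1}; no responses needed
cov = np.linalg.inv(X.T @ X)
corr = cov[0, 1] / np.sqrt(cov[0, 0] * cov[1, 1])
print(corr)                                 # close to -1: near-unity correlation
```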

Website:

Larry's Website

2018-01-18 - Brad Barney - Growing Curve Methodology with Application to Neonatal Growth Curves

Presenter:

Brad Barney

Title:

Growing Curve Methodology with Application to Neonatal Growth Curves

Affiliation:

BYU

Date:

Jan 18, 2018

Abstract:

As part of postnatal care, newborns are routinely monitored to assess the stability and adequacy of their growth. Interest lies in learning about typical postnatal growth, especially for preterm infants. We briefly consider some general methodological strategies currently employed to parsimoniously construct growth curves for use in medical practice. We present original results using existing methodology known as generalized additive models for location, scale and shape (GAMLSS). We also expand existing methodology on the Bayesian analogue of GAMLSS, known as structured additive distributional regression. In particular, we hierarchically model weight and length jointly, from which we are able to induce a time-varying distribution for Body Mass Index.
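The inducing step at the end of the abstract can be illustrated directly: given draws from a joint (weight, length) model at a fixed age, each draw maps to a BMI value, and percentiles of those values trace the induced BMI distribution. The bivariate lognormal and its parameters below are purely illustrative assumptions, not the fitted model.

```python
import numpy as np

rng = np.random.default_rng(1)
# Joint draws of log-weight (kg) and log-length (cm) at one postnatal age
mean = [np.log(3.5), np.log(51.0)]
cov = [[0.03, 0.012], [0.012, 0.01]]        # positive weight-length dependence
log_w, log_l = rng.multivariate_normal(mean, cov, size=10000).T

# BMI = weight / height^2, with length converted from cm to m
bmi = np.exp(log_w) / (np.exp(log_l) / 100.0) ** 2
print(np.percentile(bmi, [3, 50, 97]))      # induced BMI percentiles
```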

Co-Authors:

Adrienne Williamson, Josip Derado, Gregory Saunders, Irene Olsen, Reese Clark, Louise Lawson, Garritt Page, and Miguel de Carvalho

Website:

Brad's page