Sir Ronald Fisher once had a conversation with a woman who claimed to be able to tell whether the tea or the milk had been added first to a cup. Fisher, being interested in probability, decided to test her claim empirically by presenting her with 8 randomly ordered cups of tea: 4 with milk added first and 4 with tea added first. The lady was then asked to select the 4 cups prepared with one method, and she was allowed to compare the cups directly (e.g., by tasting them sequentially or in pairs).

The lady identified each cup correctly. Do we believe that this could happen by random chance alone?
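We can answer this question with a simulation. The sketch below (a minimal illustration, not Fisher's original analysis) assumes the lady cannot actually distinguish the cups, so her 4 selections amount to picking 4 of the 8 cups at random; we then estimate how often such random guessing identifies all 4 milk-first cups:

```python
import random

# 8 cups: 4 milk-first ("M") and 4 tea-first ("T").
cups = ["M"] * 4 + ["T"] * 4

trials = 100_000
all_correct = 0
for _ in range(trials):
    guesses = random.sample(cups, 4)  # under H0: 4 cups chosen at random
    if guesses.count("M") == 4:       # all 4 picks happen to be milk-first
        all_correct += 1

phat = all_correct / trials
print(phat)
```

The estimate should be close to $1/\binom{8}{4} = 1/70 \approx 0.014$, so getting all 8 cups right by guessing alone is quite unlikely.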

When we use simulations to examine a hypothesis, we create a distribution that (over many, many simulations) begins to look like our theoretical distribution. This means that simulation-based tests and theory-based tests should come to similar conclusions most of the time. In fact, theory-based tests require some additional assumptions that simulation-based tests do not, so in many cases simulation-based tests work even when theory-based tests do not.

In one-sample tests of categorical variables, we typically want to know whether the proportion of successes (the quantity we're interested in) is equal to a specific value (that is, $\pi = 0.5$ or something of that sort). Our population parameter, $\pi$, represents the unknown population quantity, and our sample statistic, $\hat p$, represents what we know about the value of $\pi$.

In these tests, our null hypothesis is that $\pi = a$, where $a$ is chosen relative to the problem. Often, $a$ is equal to 0.5, because that value usually corresponds to random chance.

When simulating these experiments, we will often use a coin flip (for random chance) or a spinner (for other values of $\pi$) to generate data.
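A "spinner" is just a mechanism that produces a success with probability $\pi$ on each trial. The sketch below shows the idea with made-up numbers (the null value $\pi = 0.25$, sample size $n = 40$, and observed $\hat p = 0.40$ are all illustrative, not from the text): simulate many samples under the null hypothesis and count how often the simulated $\hat p$ is at least as large as the observed one.

```python
import random

pi_null = 0.25        # hypothesized proportion under H0 (illustrative)
n = 40                # hypothetical sample size
observed_phat = 0.40  # hypothetical observed sample proportion

trials = 10_000
count_extreme = 0
for _ in range(trials):
    # one "spin" per observation: success with probability pi_null
    successes = sum(random.random() < pi_null for _ in range(n))
    if successes / n >= observed_phat:
        count_extreme += 1

pval = count_extreme / trials
print(pval)  # approximate one-sided p-value
```

For a null value of $\pi = 0.5$, the same loop amounts to flipping $n$ fair coins, which is why a coin flip suffices for random-chance nulls.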

One-sample experiments with a continuous variable cannot easily be simulated, because we do not usually know enough about the characteristics of the population to generate data from it. Instead, we use theory-based tests for continuous one-sample data.

In a two-sample test, there are two groups of participants which are assigned different treatments. The goal is to see how the two treatments differ. Because there are two groups, the mathematical formula for calculating the standardized statistic is slightly more complicated (because the variability of $\overline{X}_A - \overline{X}_B$ is a bit more complicated), but in the end that statistic is compared to a similar reference distribution.

The statistic calculated will be $\overline x_1 - \overline x_2$. We will use a null hypothesis of $\mu_1 - \mu_2 = 0$.
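Unlike the one-sample continuous case, the two-sample comparison can be simulated with a permutation (re-randomization) test: if the null hypothesis is true, the group labels are arbitrary, so we can shuffle them and recompute the difference in means many times. A minimal sketch, with made-up group data:

```python
import random

# Illustrative data for two treatment groups (not from the text).
group_a = [12.1, 9.8, 11.5, 10.7, 13.0, 10.2]
group_b = [9.5, 8.7, 10.1, 9.9, 8.4, 9.0]

observed = sum(group_a)/len(group_a) - sum(group_b)/len(group_b)

pooled = group_a + group_b
n_a = len(group_a)

trials = 10_000
count_extreme = 0
for _ in range(trials):
    random.shuffle(pooled)                    # re-randomize the group labels
    sim_a, sim_b = pooled[:n_a], pooled[n_a:]
    diff = sum(sim_a)/len(sim_a) - sum(sim_b)/len(sim_b)
    if abs(diff) >= abs(observed):            # two-sided comparison
        count_extreme += 1

pval = count_extreme / trials
print(observed, pval)
```

The shuffled differences form the reference distribution; the p-value is the fraction of them at least as extreme as the observed $\overline x_1 - \overline x_2$.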

When we have data that consists of two continuous variables, we generally use linear regression to fit a regression line to the data. This line minimizes the errors in $y$, and is sometimes called the least squares regression line.

The regression line, $\hat{y} = a x + b$, consists of a slope and an intercept. If there is no linear relationship between $x$ and $y$, then we would expect $a = 0$.

We can use hypothesis testing to assess whether the value of $a$ is likely to have occurred by random chance if there is no relationship between $x$ and $y$ using a hypothesis test just like we used in previous sections.

The statistic calculated will be the slope of the line, $a$. We will use a null hypothesis of $a = 0$.
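The same permutation idea applies here: if there is no relationship between $x$ and $y$, shuffling the $y$ values should not systematically change the fitted slope. The sketch below (with illustrative data) computes the least squares slope, then shuffles $y$ many times to see how often a slope at least as extreme arises by chance:

```python
import random

# Illustrative (x, y) data with an apparent linear trend.
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [2.1, 2.9, 4.2, 4.8, 6.1, 6.8, 8.2, 8.9]

def slope(x, y):
    """Least squares slope: sum((x-xbar)(y-ybar)) / sum((x-xbar)^2)."""
    n = len(x)
    mx, my = sum(x)/n, sum(y)/n
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx)**2 for xi in x)
    return sxy / sxx

observed = slope(xs, ys)

trials = 2_000
count_extreme = 0
shuffled = ys[:]
for _ in range(trials):
    random.shuffle(shuffled)                       # break any x-y association
    if abs(slope(xs, shuffled)) >= abs(observed):  # two-sided comparison
        count_extreme += 1

pval = count_extreme / trials
print(observed, pval)
```

A small p-value indicates that a slope this far from 0 would rarely occur if $x$ and $y$ were unrelated, which is evidence against the null hypothesis $a = 0$.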