Probability and Statistics Review Summary & Study Notes

These study notes provide a concise summary of Probability and Statistics Review, covering key concepts, definitions, and examples to help you review quickly and study effectively.

Chapter 7 Lesson 7.1: Probability 🧮

This source explains how to list possible outcomes, identify favorable outcomes, compute probability as a ratio, and use relative frequency to estimate probability.
It shows how to interpret probability statements in words (likely, unlikely) and includes example experiments (spinner, weather chances, bottle flips).

Basic building blocks (start very small)

An "experiment" is any action with observable results (for example, spinning a spinner or flipping a bottle).
An "outcome" is one possible result of an experiment (for example, landing on 1 or landing upright).
The sample space is the full list of all possible outcomes for an experiment.
A "favorable outcome" is an outcome that matches the event we're interested in (for example, even numbers when asking for an even result).

How probability is formed (one idea at a time)

Probability measures how likely an event is by comparing favorable outcomes to all possible outcomes.
How to compute it:
1. Divide the number of favorable outcomes by the number of possible outcomes.
2. Write the quotient as a fraction, a decimal, or a percent.
3. This ratio is the probability of the event. It applies when all outcomes are equally likely.
If we repeat an experiment many times and record results, the fraction of times an event occurs is called the relative frequency and estimates its probability.
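
As a minimal sketch of the ratio described above, the following Python snippet writes one probability in all three forms. The function name `probability` and the counts (3 favorable outcomes out of 8) are hypothetical, chosen only for illustration, and the outcomes are assumed to be equally likely.

```python
from fractions import Fraction


def probability(favorable: int, possible: int) -> Fraction:
    """Return favorable / possible as an exact fraction (equally likely outcomes assumed)."""
    if possible <= 0 or not 0 <= favorable <= possible:
        raise ValueError("need 0 <= favorable <= possible and possible > 0")
    return Fraction(favorable, possible)


# Hypothetical counts, not taken from the notes: 3 favorable outcomes out of 8.
p = probability(3, 8)
as_decimal = float(p)        # the same ratio as a decimal
as_percent = float(p * 100)  # the same ratio as a percent

print(f"fraction: {p}, decimal: {as_decimal}, percent: {as_percent}%")
```

Using `Fraction` keeps the ratio exact until the last step, which avoids rounding surprises when converting to a percent.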

Interpret likelihood in everyday words

Near 0%: "impossible" or "very unlikely." 0% means it can't happen.
Between 0% and 25%: "unlikely."
Around 50%: "equally likely" or 50–50.
Around 75%: "likely."
Near 100%: "almost certain." 100% means it must happen.

Worked examples from this source (step-by-step)

Spinner outcomes

Problem: List possible outcomes and answer event questions.
Step 1: List the outcomes shown: 1, 2, 1, 3, 1, 4 (six outcomes total).
a) How many possible outcomes? 6.
b) Favorable outcomes for "even number"? The evens shown are 2 and 4, so there are 2 favorable outcomes.
c) How many ways to spin a number less than 2? Outcomes less than 2 are the three 1's, so 3 ways.
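
The three counts above can be checked with a short Python sketch. The list `sections` simply copies the outcomes from Step 1; the variable names are illustrative.

```python
# Section labels as listed in Step 1.
sections = [1, 2, 1, 3, 1, 4]

possible_outcomes = len(sections)                      # part (a)
even_outcomes = [s for s in sections if s % 2 == 0]    # part (b)
ways_less_than_2 = sum(1 for s in sections if s < 2)   # part (c)

print(possible_outcomes, even_outcomes, ways_less_than_2)
```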

Weather-likelihood interpretation

Given probabilities: rain 80%, thunderstorms 50%, hail 15%.
Interpretations:
- Rain 80% → close to 75% → "likely" it will rain.
- Thunderstorms 50% → "equally likely" to happen or not.
- Hail 15% → between 0% and 25% → "unlikely."

Bottle-flip relative frequency

Problem: A bottle landed upright 2 times in 25 flips. Describe the likelihood that the next flip lands upright.
Solution steps:
1. Compute relative frequency = number of times event occurs / total trials = 2 / 25.
2. Convert to decimal: 2 / 25 = 0.08.
3. Convert to percent: 0.08 × 100 = 8%.
4. Interpretation: 8% is between 0% and 25%, so it is "unlikely" the bottle will land upright on the next flip.
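
The four steps above can be sketched in Python as follows. The variable names are illustrative, and the final check only encodes the "between 0% and 25%" band used in step 4, not a general rule for every percent.

```python
from fractions import Fraction

upright, flips = 2, 25  # counts from the problem statement

relative_frequency = Fraction(upright, flips)
percent = float(relative_frequency * 100)

# Step 4 uses the "between 0% and 25%" band from the likelihood list.
is_unlikely = 0 < percent < 25

print(f"{relative_frequency} = {float(relative_frequency)} = {percent}%, unlikely: {is_unlikely}")
```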

Key terms to memorize

probability — the chance of an event, explained as favorable outcomes over possible outcomes.
outcome — a single result of an experiment.
relative frequency — observed fraction of times an event happened in repeated trials.

Chapter 7 Lesson 7.2: Experimental and Theoretical Probability ⚖️

This source contrasts probability from actual trials (experimental probability) with probability predicted by theory (theoretical probability).
It gives examples: coin/penny spins, selecting vowels, rolling a number cube, and predicting marble counts from sample draws.

Small pieces first: define the two probabilities

When you compute probability by counting from actual trials, that is experimental probability: number of times event occurred divided by total trials.
When you compute probability by reasoning about equally likely outcomes (before doing trials), that is theoretical probability: number of favorable outcomes divided by number of possible outcomes.

Example: Spinning a penny 25 times (experimental)

Data: Heads occurred 6 times out of 25 spins.
Steps:
1. Experimental probability = number of heads / total spins = 6 / 25.
2. Decimal: 6 / 25 = 0.24.
3. Percent: 0.24 × 100 = 24%.
Conclusion: Experimental P(heads) = 6/25 = 0.24 = 24%.

Example: Choosing a vowel from 7 letters (theoretical)

Information: 3 vowels out of 7 total letters.
Steps:
1. Theoretical probability = favorable / possible = 3 / 7.
2. Decimal ≈ 0.4286, percent ≈ 43%.
Conclusion: Theoretical P(vowel) = 3/7 ≈ 43%.

Example: Rolling a number cube 300 times (compare experimental vs theoretical)

Experimental count: ones 48, threes 50, fives 49 → total odd = 48 + 50 + 49 = 147.
Steps for experimental probability:
1. Experimental P(odd) = 147 / 300.
2. Decimal: 147 / 300 = 0.49 → 49%.
Theoretical probability of odd on a six-sided fair cube:
1. Favorable outcomes = three odd faces (1, 3, 5) out of 6.
2. Theoretical P(odd) = 3 / 6 = 1/2 = 0.5 → 50%.
Conclusion: Experimental 49% is close to theoretical 50%.
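
A short Python sketch of this comparison, using only the counts given above (the counts for even faces are not reported, so they are left out). The theoretical value assumes a fair cube, as in the example.

```python
from fractions import Fraction

# Odd-face counts reported for the 300 rolls (even-face counts are not given).
odd_counts = {1: 48, 3: 50, 5: 49}
total_rolls = 300

experimental = Fraction(sum(odd_counts.values()), total_rolls)
theoretical = Fraction(3, 6)  # faces 1, 3, 5 out of 6, assuming a fair cube
gap = abs(theoretical - experimental)

print(f"experimental {float(experimental):.0%}, theoretical {float(theoretical):.0%}, gap {float(gap):.0%}")
```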

Example: Predicting marbles in a bag from sample draws (with replacement)

Given: A bag has 50 marbles of unknown colors. You draw a marble 30 times, with replacement, and observe red 9 times.
Steps:
1. Experimental probability of red = 9 / 30 = 0.3.
2. Predict red marbles in bag = 0.3 × 50 = 15.
Conclusion: Expect about 15 red marbles in the bag.
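
The prediction can be reproduced with a few lines of Python. This is only a sketch of the scaling step; the result is an estimate, since a different set of 30 draws could give a different count of reds.

```python
from fractions import Fraction

draws, red_draws = 30, 9   # sample results from the example
marbles_in_bag = 50

p_red = Fraction(red_draws, draws)       # experimental probability of red
predicted_red = p_red * marbles_in_bag   # scale up to the whole bag

print(f"P(red) = {float(p_red)}, predicted red marbles = {predicted_red}")
```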

Key terms to memorize

experimental probability — probability estimated from actual trials.
theoretical probability — probability computed from equally likely outcomes and reasoning.
probability — general idea of chance (favorable / total).

Chapter 8 Lesson 8.1: Samples and Populations 🧑‍🤝‍🧑

This source defines what a population and a sample are, explains how samples give information about populations, and distinguishes unbiased vs biased samples.
It uses school and town examples to show valid vs invalid conclusions from samples.

Atomic definitions first

A "population" is the entire group you care about (all people or objects under study).
A "sample" is a part of the population that you actually examine to learn about the whole.
In the rest of these notes, "sample" always means the part that is examined and "population" means the whole group.

Why samples are used

Examining every member of a population can be hard or impossible, so we study a sample and generalize carefully.

What makes a good (unbiased) sample

An unbiased sample is representative of the population because it is chosen randomly and is large enough to show real patterns.
A biased sample is not representative because some groups are favored or excluded.

Examples and reasoning (step-by-step)

Which sample is unbiased for estimating how many students ride the bus?

Options:
- A: 4 students in the hallway (too small).
- B: all students on the soccer team (not random).
- C: 50 twelfth-graders chosen at random (not representative of all grades).
- D: 100 students chosen at random during lunch.
Conclusion: D is unbiased because it is selected at random and is a large sample.
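
One way to pick students at random, in the spirit of option D, is to draw names from a roster so that every student has the same chance of being chosen. The sketch below assumes a hypothetical roster of 600 students; the notes do not give the school's size, and the roster, names, and seed are illustrative only.

```python
import random

# Hypothetical roster: the notes do not give the school's size or student names.
roster = [f"student_{i:03d}" for i in range(1, 601)]

rng = random.Random(2024)           # fixed seed only so the sketch is repeatable
sample = rng.sample(roster, k=100)  # each student is equally likely to be chosen, no repeats

print(sample[:5])
```

Sampling without repeats from the full roster avoids the problems in options B and C, where only one team or one grade could ever be picked.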

Landfill opinion surveys (validity of conclusions)

Scenario A: Survey 100 residents who live closest to the new landfill.
- Reasoning: People living closest may have systematic views different from town overall → sample is biased → conclusion not valid for town.
Scenario B: Survey 100 residents at random.
- Reasoning: Random selection reduces systematic bias and is large enough → sample is unbiased → conclusion more likely valid.

Key terms to memorize

population — the whole group you want to know about.
sample — a subset of the population used to estimate properties of the whole.
unbiased sample — a random, representative, and sufficiently large sample.
biased sample — a sample that systematically favors some part of the population.

Chapter 8 Lesson 8.2: Using Random Samples to Describe a Population 📊

This source shows how to use sample proportions and sample means to estimate population counts and averages, and how to describe the center and variation of multiple estimates.
It includes examples: estimating number of students who prefer pop music, and estimating mean hours worked from multiple school samples.

Start with the core idea

A sample gives an estimate of a population value.
The process: to estimate a count, find the sample proportion and multiply it by the population size; to estimate an overall mean, average the sample means.

Example 1: Estimating how many students prefer pop music

Population size: 840 students.
Each person surveys 20 students; compute proportion preferring pop then scale to 840.
Steps and results for each sample (showing method):
1. You: pop count = 13 out of 20 → proportion = 13/20 = 0.65 → estimate = 0.65 × 840 = 546.
2. Friend A: pop count = 8 out of 20 → proportion = 8/20 = 0.40 → estimate = 0.40 × 840 = 336.
3. Friend B: pop count = 10 out of 20 → proportion = 10/20 = 0.50 → estimate = 0.50 × 840 = 420.
4. Friend C: pop count = 10 out of 20 → proportion = 10/20 = 0.50 → estimate = 0.50 × 840 = 420.
5. Friend D: pop count = 9 out of 20 → proportion = 9/20 = 0.45 → estimate = 0.45 × 840 = 378.
Describing center and variation:
- Median of estimates = 420 students (the middle value of the ordered estimates 336, 378, 420, 420, 546).
- Range = largest − smallest = 546 − 336 = 210 students.
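
The five estimates, their median, and their range can be reproduced with the Python sketch below. The dictionary keys and variable names are illustrative; the counts are the ones listed in the example.

```python
from statistics import median

population = 840
sample_size = 20
# Number of students (out of 20) who preferred pop in each survey.
pop_counts = {"You": 13, "Friend A": 8, "Friend B": 10, "Friend C": 10, "Friend D": 9}

# count * population / sample_size is the sample proportion scaled to the school.
estimates = {name: count * population / sample_size for name, count in pop_counts.items()}

center = median(estimates.values())
spread = max(estimates.values()) - min(estimates.values())

print(estimates, center, spread)
```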

Example 2: Estimating mean hours worked from six samples

Each sample surveys 10 students with part-time jobs; compute mean for each sample, then combine.
Given sample means computed in the source:
- Sample A mean = 7 hours.
- Sample B mean = 7.3 hours.
- Sample C mean = 7.7 hours.
- Sample D mean = 5 hours.
- Sample E mean = 8 hours.
- Sample F mean = 9 hours.
Steps to get overall estimate:
1. Add the six sample means: 7 + 7.3 + 7.7 + 5 + 8 + 9 = 44.
2. Divide by 6 samples: 44 / 6 ≈ 7.333... → 7.3 hours (rounded).
Describe variation among estimates:
- Range = 9 − 5 = 4 hours.
- The estimates cluster near 7–8 hours, with one lower sample (5) and one higher (9).
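
A brief Python sketch of the same calculation, using the six sample means listed above. The individual survey responses are not given in these notes, so the sketch starts from the means themselves.

```python
from statistics import mean

# Sample means in hours, copied from the list above.
sample_means = {"A": 7, "B": 7.3, "C": 7.7, "D": 5, "E": 8, "F": 9}

overall_estimate = mean(list(sample_means.values()))
spread = max(sample_means.values()) - min(sample_means.values())

print(f"overall estimate: {overall_estimate:.1f} hours, range: {spread} hours")
```

Averaging the sample means is reasonable here because every sample has the same size (10 students).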

Key terms to memorize

estimate — a population value guessed from sample data by scaling proportions or averaging sample means.
sample mean — average of values in a sample, used to estimate the population mean.
range — difference between largest and smallest estimate (measure of variation).

Chapter 8 Lesson 8.3: Comparing Populations (Center and Variation) ⚖️📈

This source teaches how to compare two datasets using measures of center (median or mean) and measures of variation (IQR or MAD), and how to interpret which dataset is more likely to contain certain values.
It shows decisions for skewed distributions (use median & IQR) and approximately symmetric distributions (use mean & MAD).

Start with tiny definitions

The center is a single value that represents where the middle of the data lies.
- Use the median for skewed distributions (the middle value when data are ordered).
- Use the mean for roughly symmetric distributions (average of all values).
Variation measures how spread out values are.
- Use the IQR (interquartile range) for skewed data; IQR = Q3 − Q1.
- Use the MAD (mean absolute deviation) for symmetric data; MAD = average of absolute deviations from the mean.
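
As a rough illustration of the two variation measures, the Python sketch below computes the IQR and the MAD for a small hypothetical data set that does not come from the notes. The helper names `iqr` and `mad` are illustrative. Quartiles are found as the medians of the lower and upper halves of the ordered data, which is one common classroom convention; other conventions can give slightly different quartiles.

```python
from statistics import mean, median


def iqr(values):
    """Q3 - Q1, with quartiles taken as medians of the lower and upper halves."""
    ordered = sorted(values)
    half = len(ordered) // 2          # the middle value is skipped when the count is odd
    if half == 0:
        raise ValueError("need at least two values")
    return median(ordered[-half:]) - median(ordered[:half])


def mad(values):
    """Mean absolute deviation: average distance of the values from their mean."""
    center = mean(values)
    return mean(abs(v - center) for v in values)


# Hypothetical data, not from the notes.
data = [2, 4, 6, 8, 10, 12, 14, 16]

print(f"IQR = {iqr(data)}, MAD = {mad(data)}")
```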

Comparing Dataset A and Dataset B (skewed → median & IQR)

Given:
- Dataset A: Median = 60, IQR = 80 − 30 = 50.
- Dataset B: Median = 90, IQR = 100 − 80 = 20.
Questions and reasoning:
1. Which dataset has greater center? Dataset B (median 90 vs 60).
2. Which dataset has greater variation? Dataset A (IQR 50 vs 20).
3. Which is more likely to contain a value of 95?
  - Dataset A: about 25% of its values lie between 80 and 130 (the top quarter, above Q3 = 80).
  - Dataset B: about 50% of its values lie between 80 and 100 (the middle half, from Q1 = 80 to Q3 = 100).
  - Conclusion: Dataset B is more likely to contain 95 because a larger portion of its values lie near 95.
4. Which dataset is more likely to contain a value that differs from the center by at least 30?
  - Since Dataset A has larger IQR, values are more spread out and it's more likely to find values differing by ≥ 30 from the center → Dataset A.

Using mean and MAD for symmetric distributions (dotplot example)

Given:
- Mean of dataset 1 = 54, Mean of dataset 2 = 28.
- MAD for both ≈ 16.
Compute difference in means as multiple of MAD:
1. Difference in means = 54 − 28 = 26.
2. Divide by MAD: 26 / 16 ≈ 1.6.
3. Interpretation: The means differ by about 1.6 times the MAD, so the centers differ by more than one typical deviation.
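
The same comparison as a minimal Python sketch, using the means and the approximate MAD given above (variable names are illustrative):

```python
mean_1, mean_2 = 54, 28
mad = 16  # both data sets have a MAD of about 16

difference = mean_1 - mean_2
ratio = difference / mad

print(f"difference = {difference}, about {ratio:.1f} times the MAD")
```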

Key terms to memorize

median — middle value, use for skewed data.
IQR — interquartile range, measure of spread for skewed data.
MAD — mean absolute deviation, typical distance from the mean for symmetric data.
mean — arithmetic average, use for symmetric data.
