- Videos: 197
- Views: 32,339,177
jbstatistics
Added Nov 27, 2011
Jeremy Balka's statistics channel, containing some introductory statistics videos.
(I am an associate professor in the Department of Mathematics and Statistics at the University of Guelph. I have a PhD in statistics, and have taught introductory statistics courses on many occasions.)
I try to be concise, and use real data in the vast majority of cases. Some of these videos move at a quick pace, especially my earlier ones, and they are not designed to be lecture replacements. They may serve to clarify some concepts after lecture. If you want to get ahead, watch the relevant videos before lecture to get an introduction to the topic.
The videos are pitched at the level of an applied introductory statistics course to non-statistics majors.
I'll be updating this channel with new videos until I've filled out a full introductory statistics course (and I may possibly expand after that).
A complete list of videos, organized by topic, can be found at www.jbstatistics.com.
You sir, are no normal distribution
Not even close. Lots of this sort of stuff in textbooks as well. WT...H?
(And just in case anybody thinks that I think I'm in that far right tail --- I'm definitely the midwit here.)
For the normal distribution, the ratio of area between 0 and 1 SD from the mean to that of the area between 1 and 2 SD from the mean is about 2.51. With rounding, the figure states it as 34/14, which is about 2.43. The ratio of the actual areas seen in this plot is about 3.75 (50% larger than it should be). Not even close.
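The ratios quoted above can be checked with a few lines of Python. This is a minimal sketch using only the standard library; the value 3.75 for the plotted areas is taken from the text as given, not measured here.

```python
from statistics import NormalDist

z = NormalDist()  # standard normal: mean 0, SD 1

# Areas under the normal curve, in SD units from the mean
area_0_to_1 = z.cdf(1) - z.cdf(0)   # about 0.3413
area_1_to_2 = z.cdf(2) - z.cdf(1)   # about 0.1359

exact_ratio = area_0_to_1 / area_1_to_2   # about 2.51
rounded_ratio = 34 / 14                   # ratio of the rounded percentages
plot_ratio = 3.75                         # ratio reported for the drawn figure

print(f"exact ratio:          {exact_ratio:.2f}")
print(f"rounded (34/14):      {rounded_ratio:.2f}")
print(f"plot vs exact factor: {plot_ratio / exact_ratio:.2f}")
```

The last line gives a factor of roughly 1.5, which is where the "50% larger than it should be" figure comes from.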
Views: 3,201
Videos
1000 normal QQ plots in 30 minutes (n = 25, sampling from N(10,5^2))
1.1K views · 4 months ago
Title, essentially. Properly interpreting normal QQ plots takes some experience, and part of that experience is developing a feel for what natural variability looks like on a normal QQ plot (when sampling from a normal distribution). Each of these 1000 plots is based on a sample of size 25 from a normal distribution with a mean of 10 and standard deviation of 5. Created with R’s qqnorm and qqli...
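For readers without R at hand, the points of one such normal QQ plot can be computed as in the sketch below. It is written in Python rather than R, the seed is arbitrary, and the plotting positions (i - 0.5)/n are an assumption chosen to match what R's qqnorm is generally documented to use for n > 10.

```python
import random
from statistics import NormalDist

def normal_qq_points(sample):
    """Return (theoretical quantile, ordered sample value) pairs."""
    n = len(sample)
    std_normal = NormalDist()
    # Plotting positions (i - 0.5)/n, i = 1..n (assumed, see lead-in)
    theoretical = [std_normal.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(theoretical, sorted(sample)))

rng = random.Random(1)                          # arbitrary seed, for repeatability
sample = [rng.gauss(10, 5) for _ in range(25)]  # n = 25 from N(10, 5^2)
points = normal_qq_points(sample)

for q, x in points[:3]:
    print(f"theoretical {q:6.3f}   sample {x:6.2f}")
```

Rerunning with different seeds and plotting the pairs gives a feel for how much a QQ plot wobbles even when the data really are normal.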
What I envision a virtual work meeting at D2L (Brightspace) to be like (jk, obv)
8K views · 2 years ago
Todd meets with his supervisor at D2L (Brightspace) to discuss priorities going forward. (I'll get back to stats videos soon. This one's mainly for fun, but inspired by a decade of frustration. In case it's not obvious, I play both parts.) D2L (Brightspace) quizzes do not allow an instructor to change the answer to a numeric response (arithmetic) question and regrade quiz attempts. There's a re...
Basic Probability: The Multiplication Rule
34K views · 3 years ago
An introduction to the multiplication rule. (I assume the viewer has an understanding of conditional probability and independence, but I do a very brief review of those concepts.) I introduce the multiplication rule for two events, work through a simple example, then discuss the multiplication rule for 3 events and how it generalizes to any number of events. I end with an example using the mult...
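A small sketch of the multiplication rule, P(A and B) = P(A) P(B | A), and its extension to three events. The card-drawing example is an illustration chosen here and is not necessarily the one used in the video.

```python
from fractions import Fraction

def multiplication_rule(conditional_probs):
    """Multiply P(A1), P(A2 | A1), P(A3 | A1 and A2), ... together."""
    result = Fraction(1)
    for p in conditional_probs:
        result *= p
    return result

# Drawing aces without replacement from a standard 52-card deck
p_two_aces = multiplication_rule([Fraction(4, 52), Fraction(3, 51)])
p_three_aces = multiplication_rule(
    [Fraction(4, 52), Fraction(3, 51), Fraction(2, 50)]
)

print(p_two_aces)    # 1/221
print(p_three_aces)  # 1/5525
```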
Linear Transformations (in a Descriptive Statistics Setting)
12K views · 3 years ago
I discuss linear transformations, in the context of descriptive statistics. I discuss what a linear transformation is, give an example, discuss the effect of the linear transformation on various summary statistics, and work through a numerical example. The temperature data is from here: climate.weather.gc.ca/climate_data/daily_data_e.html?StationID=51459&timeframe=2&StartYear=1840&EndYear=2021&...
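As a rough illustration of the effect of a linear transformation y = a + bx on summary statistics, the sketch below converts temperatures from Celsius to Fahrenheit. The data values are hypothetical, not the Environment Canada data referenced above.

```python
from statistics import mean, median, stdev

celsius = [-3.5, 0.0, 2.1, 7.4, 12.8, 15.0]  # hypothetical temperatures
a, b = 32.0, 1.8                             # Fahrenheit = 32 + 1.8 * Celsius
fahrenheit = [a + b * c for c in celsius]

# Measures of location are shifted and scaled: a + b * (statistic)
print(mean(fahrenheit), a + b * mean(celsius))
print(median(fahrenheit), a + b * median(celsius))

# Measures of spread are only scaled: |b| * (statistic)
print(stdev(fahrenheit), abs(b) * stdev(celsius))
```

Each printed pair should agree up to floating-point rounding: the additive constant moves the mean and median but has no effect on the standard deviation.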
An Introduction to Boxplots
8K views · 3 years ago
An introduction to boxplots. (I do not carry out any calculations, this video is about interpreting boxplots.) I discuss the basics, an applied example, give a few illustrations of histograms and boxplots under symmetry and skewness, then briefly discuss how large samples often lead to a large number of outliers in a boxplot. Source of the jumping fish example: Brunt et al. (2016). Amphibious f...
On average, what proportion of sample means would a randomly selected 95% CI for mu capture?
11K views · 5 years ago
This one's inspired by a common confidence interval misinterpretation. (This is a bit of a different video for me, and if you're just looking for help with specific topics in a statistics course, you may not find it helpful. But there's some good stuff in here.) Here I address what might seem at first like a bit of a strange or uninformative question: In repeated sampling from a normally distribu...
Deriving the mean and variance of the least squares slope estimator in simple linear regression
98K views · 5 years ago
I derive the mean and variance of the sampling distribution of the slope estimator (beta_1 hat) in simple linear regression (in the fixed X case). I discuss the typical model assumptions, and discuss where we use them as I carry out the derivations. The derivations are carried out using summation notation (no matrices). At the end, I briefly discuss the normality assumption, and how that leads ...
The Law of Total Probability
158K views · 5 years ago
I discuss the Law of Total Probability. I begin with some motivating plots, then move on to a statement of the law, then work through two examples.
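A minimal sketch of the Law of Total Probability, P(A) = sum over i of P(A | B_i) P(B_i), where B_1, ..., B_k partition the sample space. The three-machine defect example and its numbers are hypothetical and are not taken from the video.

```python
from fractions import Fraction

def total_probability(priors, conditionals):
    """P(A) = sum of P(A | B_i) * P(B_i) over a partition B_1..B_k."""
    if sum(priors) != 1:
        raise ValueError("P(B_i) must sum to 1 for a partition")
    return sum(p * c for p, c in zip(priors, conditionals))

# Hypothetical: three machines make 50%, 30%, and 20% of all items,
# with defect rates of 1%, 2%, and 5% respectively.
machine_share = [Fraction(1, 2), Fraction(3, 10), Fraction(1, 5)]
defect_rate = [Fraction(1, 100), Fraction(2, 100), Fraction(5, 100)]

p_defective = total_probability(machine_share, defect_rate)
print(p_defective)  # 21/1000
```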
P(A) = P(A and B) + P(A and Bc)
22K views · 5 years ago
A quick video to illustrate that P(A) = P(A and B) + P(A and Bc), and work through a simple conditional probability example that makes use of this identity. I know from experience that some of my students have trouble seeing this (especially when it comes up in a formulaic approach to a problem), and I wanted to have a video that I could point them to. My Law of Total Probability video is here: r...
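The identity can be checked by brute force on a small sample space. The sketch below uses one roll of a fair die, with events chosen here purely for illustration.

```python
from fractions import Fraction

sample_space = set(range(1, 7))  # one roll of a fair six-sided die

def prob(event):
    """Probability of an event under equally likely outcomes."""
    return Fraction(len(event & sample_space), len(sample_space))

A = {2, 4, 6}             # roll is even
B = {4, 5, 6}             # roll is greater than 3
B_c = sample_space - B    # complement of B

# P(A) splits into the part of A inside B and the part outside B
print(prob(A), prob(A & B) + prob(A & B_c))   # 1/2 1/2

# The same pieces give a conditional probability: P(B | A) = P(A and B) / P(A)
p_B_given_A = prob(A & B) / (prob(A & B) + prob(A & B_c))
print(p_B_given_A)                            # 2/3
```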
Deriving the least squares estimators of the slope and intercept (simple linear regression)
224K views · 5 years ago
I derive the least squares estimators of the slope and intercept in simple linear regression (using summation notation, and no matrices). I assume that the viewer has already been introduced to the linear regression model, but I do provide a brief review in the first few minutes. I assume that you have a basic knowledge of differential calculus, including the power rule and the chain rule. If y...
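The end result of that derivation is the pair of standard formulas: slope = S_xy / S_xx and intercept = y-bar minus slope times x-bar. A minimal sketch in summation form, with made-up data, looks like this:

```python
def least_squares(x, y):
    """Least squares intercept and slope for simple linear regression."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    slope = s_xy / s_xx                 # beta_1 hat
    intercept = y_bar - slope * x_bar   # beta_0 hat
    return intercept, slope

# Hypothetical data, for illustration only
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
intercept, slope = least_squares(x, y)
print(f"intercept = {intercept:.2f}, slope = {slope:.2f}")

# With these estimates the residuals sum to zero (up to rounding)
residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
```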
Proof that if events A and B are independent, so are Ac and B (and A and Bc)
63K views · 5 years ago
Here I prove that if events A and B are independent, so are A complement and B. (And A and B complement, of course, since which event we call A and which we call B is arbitrary.) Looking for a proof that if A and B are independent, so are their complements? That's here: ruclips.net/video/bnDpZNlVZ3k/видео.html
Proof that if two events are independent, so are their complements.
48K views · 5 years ago
Just getting warmed up. Here I prove that if events A and B are independent, so are Ac and Bc. I make use of De Morgan's Laws, without offering a formal proof of that part (but I do provide a brief Venn diagram justification of the needed bit). More probability and statistics videos will follow.
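The following is a numerical check on one example, not a proof. It uses a fair die with events chosen here so that A and B are independent, then confirms that independence carries over to the complements and that the needed De Morgan identity holds.

```python
from fractions import Fraction

sample_space = frozenset(range(1, 7))  # one roll of a fair die

def prob(event):
    return Fraction(len(event), len(sample_space))

def independent(e1, e2):
    """Events are independent when P(E1 and E2) = P(E1) * P(E2)."""
    return prob(e1 & e2) == prob(e1) * prob(e2)

A = frozenset({2, 4, 6})   # even roll, P(A) = 1/2
B = frozenset({1, 2})      # roll of 1 or 2, P(B) = 1/3
A_c = sample_space - A
B_c = sample_space - B

print(independent(A, B))      # True: P(A and B) = 1/6 = (1/2)(1/3)
print(independent(A_c, B))    # True
print(independent(A, B_c))    # True
print(independent(A_c, B_c))  # True

# De Morgan: (A or B) complement equals (A complement) and (B complement)
de_morgan_holds = (sample_space - (A | B)) == (A_c & B_c)
```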
Independent Events (Basics of Probability: Independence of Two Events)
246K views · 6 years ago
An introduction to the concept of independent events, pitched at a level appropriate for the probability section of a typical introductory statistics course. I give the definition of independence, work through some simple examples, and attempt to illustrate the meaning of independence in various ways. (Note: I use the phrase "not independent" rather than "dependent" almost exclusively. There is...
Conditional Probability Example Problems
230K views · 6 years ago
Conditional probability example problems, pitched at a level appropriate for a typical introductory statistics course. I assume that viewers have already been introduced to the concepts of conditional probability and independence, but I do review the concepts along the way. I work through some problems with the conditional probability formula explicitly, and some using the reduced sample space ...
Basics of Probability: Unions, Intersections, and Complements
254K views · 6 years ago
Don't watch this! (A t test example where nearly everything I say is wrong)
7K views · 6 years ago
De Morgan's Laws (in a probability context)
113K views · 6 years ago
An Introduction to Conditional Probability
297K views · 6 years ago
Are mutually exclusive events independent?
113K views · 6 years ago
What Does Independence Look Like on a Venn Diagram?
87K views · 9 years ago
The Expected Value and Variance of Discrete Random Variables
352K views · 9 years ago
An Introduction to Discrete Random Variables and Discrete Probability Distributions
349K views · 9 years ago
Inference for the Ratio of Variances: How Robust are These Procedures?
7K views · 10 years ago
Inference for a Variance: How Robust are These Procedures?
7K views · 10 years ago
The Sampling Distribution of the Ratio of Sample Variances
11K views · 10 years ago
Inference for Two Variances: An Example of a Confidence Interval and a Hypothesis Test
13K views · 10 years ago
The Sampling Distribution of the Sample Variance
62K views · 10 years ago
Deriving a Confidence Interval for the Ratio of Two Variances
27K views · 10 years ago
An Introduction to Inference for the Ratio of Two Variances
29K views · 10 years ago
really interesting video😇
This video changed my life...thanks brother
Excellent video! Great explanation.
good
Hi JB! I'm hoping that you see this one day. I've never been able to make the connection between the Law of Large Numbers, the Central Limit Theorem, a test statistic, and hypothesis tests. LLN says that as my sample size gets larger, the sample mean will converge to the population mean. CLT says that for sufficiently large sample sizes, the sampling distribution of sample means will approximate normal, and the mean of the sampling distribution converges to the population mean. This is applicable to various distributions but not all (e.g., Cauchy). But how is this connected to a test statistic and hypothesis testing? For example, suppose that a factory produced soap bars distributed normally with mean 2 and standard deviation 0.25. Someone was asked to randomly sample 60 soap bars, but we're not sure if the soap bars come from this factory. We could take a sample mean of our 60 soap bars and compare it to the population mean, right? If I recall correctly, we'd generate a null distribution with a mean of 2 and standard deviation of 0.25 since that's what we know about our parent population that we're interested in. Are we then creating a test statistic and seeing where it lies within the null distribution? Do I have that all correct? But the thing is, we only have one dataset (one sample of size 60). How are we able to use CLT? To me, it's almost as if we're generating more information than we have when talking about the distribution of the test statistic. Furthermore, if that's all correct, where exactly do the LLN and CLT come into play? If you're able to reply, that'd be awesome. Thanks for your videos nonetheless!
Hi Nathan. These are big questions that don't have a 30 second answer so I may have trouble getting to them. But for this part: "But the thing is, we only have one dataset (one sample of size 60). How are we able to use CLT? " This is a very common point of confusion, and many, many, many highly viewed videos on RUclips really botch this and perpetuate the misunderstanding. We don't need to repeatedly sample for the CLT to have meaning. The CLT has implications for a single sample in general. The "Suppose we repeatedly sample..." argument is supposed to be illustrative, and I use it myself, but the more decades I put behind me the more I think this just screws people up. The CLT tells us about the probability distribution of the random variable that is the mean (or related things, like a sum). It absolutely applies even though we have a single sample, since it's just speaking to the probability distribution of that statistic. The repeated sampling argument is just an illustrative argument, not something that has to happen for the CLT to have practical meaning. The LLN and CLT are very fundamental things to probability and statistics, and they have implications for hypothesis testing, but asking how they relate doesn't really work as they are very different things. The LLN and CLT are fundamental mathematical theorems, whereas hypothesis testing is a technique that helps us answer questions in certain spots. The biggest thing the CLT brings to the table in applied statistics is that it allows us to use well-developed methods based on the normal distribution, when the distribution of our data is not normal or is unknown, provided we have a large enough sample size.
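To make the point about the distribution of the sample mean concrete, here is a small simulation sketch. The repeated sampling is only a way to look at that distribution; as the reply says, it is not something that has to happen for the CLT to apply. The exponential population, the seed, and the number of repetitions are all arbitrary choices made for this illustration.

```python
import random
from math import sqrt

rng = random.Random(2024)      # arbitrary seed, for repeatability
n = 60                         # size of each sample
reps = 5000                    # number of simulated samples (illustrative)

pop_mean, pop_sd = 1.0, 1.0    # exponential with rate 1: mean 1, SD 1
std_error = pop_sd / sqrt(n)   # SD of the sample mean

# Each entry is the mean of one sample of size n from a skewed population
means = [sum(rng.expovariate(1.0) for _ in range(n)) / n for _ in range(reps)]

# If the sample mean is approximately normal, about 95% of the means
# should fall within 1.96 standard errors of the population mean.
within = sum(abs(m - pop_mean) <= 1.96 * std_error for m in means) / reps
print(f"proportion within 1.96 SE: {within:.3f}")
```

The population is strongly right-skewed, yet the proportion should come out close to 0.95, which is what normal-based methods rely on for a single sample of this size.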
So is this p hacking? And a two tailed test would have been more appropriate?
While I personally lean towards two-tailed alternatives in the vast majority of situations, I think the use of a one-tailed procedure is reasonable in this situation. Before collecting the data, there was a strong belief (based on previous studies and information) that puerarin would have a tendency to reduce alcohol consumption. So it wasn't just a "ooooh, I think my new drug is better" sort of argument. (Or even worse, using the data to inform the choice of alternative.) Abusing the use of a one-tailed test can be a form of p-hacking, sure, but I don't think that happened in this study.
What an absolute God
What is the relationship between the Law of Large Numbers, the Central Limit Theorem, a test statistic, and hypothesis testing? The LLN says that as you draw more i.i.d. observations, the sample mean will converge to the population mean. The CLT then says that as the number of i.i.d. observations increases, the distribution of the sample mean approaches a normal distribution. How are these concepts all connected?
Thank you so much
Thank you very much
Really cleared up confusion.
Thank you very much. Now I understand the central limit theorem. It is the basis.
The best statistics teacher
Does this mean that the formula for the variance is the same for discrete and continuous variables?
Yes, in the sense that for any random variable X, Var(X) = E[(X - mu)^2].
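A short sketch of that point: the definition Var(X) = E[(X - mu)^2] is the same in both cases, and only the way the expectation is computed changes (a sum for a discrete variable, an integral for a continuous one). The fair die and the Uniform(0, 1) distribution are examples chosen here, and the integral is approximated with a simple midpoint rule.

```python
from fractions import Fraction

# Discrete case: fair six-sided die, expectation is a sum over the pmf
die_pmf = {x: Fraction(1, 6) for x in range(1, 7)}
mu_die = sum(x * p for x, p in die_pmf.items())
var_die = sum((x - mu_die) ** 2 * p for x, p in die_pmf.items())
print(mu_die, var_die)   # 7/2 35/12

# Continuous case: Uniform(0, 1), expectation is an integral over the pdf,
# approximated here by a midpoint Riemann sum.
def expectation(g, pdf, a, b, steps=10_000):
    h = (b - a) / steps
    return sum(g(a + (i + 0.5) * h) * pdf(a + (i + 0.5) * h) for i in range(steps)) * h

uniform_pdf = lambda x: 1.0
mu_unif = expectation(lambda x: x, uniform_pdf, 0.0, 1.0)
var_unif = expectation(lambda x: (x - mu_unif) ** 2, uniform_pdf, 0.0, 1.0)
print(round(mu_unif, 4), round(var_unif, 4))   # close to 1/2 and 1/12
```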
Excellent!
what the sigma
By far, this is the best explanation I have seen of this topic. Congratulations!
Do you have a video about the sample distribution? I suggest making one.
I could not understand this from my stats textbook even after trying for 30 whole mins (even the formula didn't make much sense to me without proper explanation) and it only took me 7 mins to understand it so well here.. thanks for this honestly.
From what table?
The statistics and probability professor of our university taught us continuous probability distribution while we still don't know Integral calculus! Is it not possible to learn continuous probability distribution without integral?
I think it's fine to teach continuous probability distributions without explicitly discussing integration. e.g. Using vague statements like "Probabilities are areas under curves, and we can find those areas with mathematical techniques, and software has incorporated those mathematical techniques for us." I think someone can develop a very good understanding of statistics from that perspective. Keep in mind that most of the continuous distributions we use in statistical inference (e.g. normal, t, F, chi-square) do not have closed-form cumulative distribution functions and must be integrated numerically. So whether someone knows integral calculus or not, in the end areas are found using software. For a full and deeper understanding of what's going on? Sure, knowledge of integral calculus is meaningful. At a level to achieve an understanding of applied statistics? I don't think knowledge of integral calculus is necessary.
@@jbstatistics Thanks for the quick reply🙏 I'm actually preparing for an exam, and when I got to the topic of continuous probability distribution, I got a little confused because of the integral. I meant more to solve exam questions related to continuous probability distribution than to have a theoretical understanding. It seems that to reach the final answer, you need to know integral calculus (or have an advanced calculator).
@@Didier-cu6cb For any of those distributions I named, and many others, there is no closed form solution for the integral. It does not matter how great of an integrator one is, the integrals can be solved only by numerical techniques. We do that using software. The world's greatest integrator and a person knowing no calculus whatsoever solve the problem in the same way: By asking software for the appropriate area.
@@jbstatistics Right. Thank you for your response & content.
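As a rough picture of what "asking software for the appropriate area" involves, the sketch below integrates the standard normal density numerically with Simpson's rule and compares the result with Python's built-in normal CDF. Simpson's rule is just one simple numerical technique picked for illustration; statistical software generally uses more refined methods.

```python
from math import exp, pi, sqrt
from statistics import NormalDist

def normal_pdf(x):
    """Standard normal density."""
    return exp(-x * x / 2) / sqrt(2 * pi)

def simpson(f, a, b, n=1000):
    """Approximate the integral of f over [a, b] with composite Simpson's rule."""
    if n % 2:            # Simpson's rule needs an even number of subintervals
        n += 1
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

area = simpson(normal_pdf, -1.96, 1.96)
reference = NormalDist().cdf(1.96) - NormalDist().cdf(-1.96)
print(f"numerical area: {area:.6f}")
print(f"library value:  {reference:.6f}")
```

Both values are approximately 0.95, the familiar area within 1.96 standard deviations of the mean.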
Thank you, sir, for sacrificing for us with these perfect teachings.
For real, YouTube has contributed almost 80% of my understanding of calculated course units.
🎉🎉🎉🎉🎉 fantastic
Great bro
best of the best !
Thanks a lot, it really helped
THANK YOU SO MUCH from Jordan 🇯🇴 🌹
Why did we multiply by 2 here ruclips.net/video/Xi33dGcZCA0/видео.htmlsi=i0pT0C9uJvh1oJle&t=288? For that case you should have considered only one side and not both sides, as 776 was less than 780.
I suggest you limit the "should haves" until you have a deeper understanding of the situation. If your logic here was valid, then there would be no such thing as a two-sided alternative. The value of the statistic will always be on one side or the other, so if we let that determine things we'd always have a one-sided alternative. So your logic can't be valid, or nobody would ever speak of a two-sided alternative. The choice of alternative ******never never never never never ever ever ever ever ever ever**** depends on the value that we see in the sample that we're using to carry out the test. If we do that we're distorting the math and then making false statements in the end. The choice of alternative depends on the nature of the situation at hand. One should be able to write out the appropriate hypotheses without ever looking at the sample data. If the sample data has influenced your choice of hypothesis, then you've violated a fundamental requirement of hypothesis testing.
You are amazing!!
very helpful, even almost 10 years after upload! thanks
People would absolutely lose their shit these days if u didnt include trans people in ur example
I love having equations explained to me in a way I understand. Thank you
I couldn't solve for the expected value of the negative binomial distribution. Is the expression for the expected value (the mean here) correct?
least square method vs
If I draw out a random sample of 28 females with a sample mean x bar from the obv, there is a 5 % chance that the Confidence interval does not contain the Population mean. right? assuming CI of 95%
I really started to hate my school's educational system. All it takes is 2 minutes of the 7 minute video for me to grasp the idea, and here I am losing tons of hours staring at meaningless pdfs, shame on absolute zero effort on teaching. I really appreciate your video! You saved my life and my interest in statistics by posting this video 11 years ago.
No one people like data analytics
How am I here 11 years later 😭
Excellent. Thank you.
Great explanation, while reading in wiki i realized that we have 2 diff type of Geometric distributions, 1. random variable is no. of trials for 1st success 2. random var is no. of failures to see 1st success. which is very important while conducting the experiment, which i think missed in this current video lecture. thanks.
I think bringing that up in an introductory video on the geometric distribution does more harm than good. It's an extra layer of confusion that people don't need at first. The difference in the random variables, the difference in the means, describing why the variances are the same...it just takes away from the big picture of what the geometric distribution does for us. Sure, I bring it up elsewhere, especially as I use R in my courses and R uses the other definition of the r.v., but I think it would cause more confusion than it's worth in an intro video. Once one is understood, the other comes naturally.
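The relationship between the two definitions can be checked numerically. In the sketch below, `start=1` counts trials up to and including the first success and `start=0` counts failures before the first success (the convention the reply attributes to R). The value p = 0.25 and the truncation point of the infinite sums are arbitrary choices for illustration.

```python
def geometric_moments(p, start, terms=5000):
    """Mean and variance of a geometric random variable.

    start=1: number of trials up to and including the first success.
    start=0: number of failures before the first success.
    The infinite sums are truncated at `terms`, which is ample for p = 0.25.
    """
    mean = 0.0
    second_moment = 0.0
    for failures in range(terms):
        prob = (1 - p) ** failures * p   # `failures` failures, then a success
        value = failures + start
        mean += value * prob
        second_moment += value ** 2 * prob
    return mean, second_moment - mean ** 2

p = 0.25
trials_mean, trials_var = geometric_moments(p, start=1)
failures_mean, failures_var = geometric_moments(p, start=0)

print(trials_mean, 1 / p)                  # means differ by exactly 1 ...
print(failures_mean, (1 - p) / p)
print(trials_var, failures_var, (1 - p) / p ** 2)   # ... variances agree
```

Shifting a random variable by a constant moves its mean but leaves its variance unchanged, which is why only the means differ.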
F-this, but in the best way possible! 👍
he speaks like he is from texas
I already knew this but after watching the video, I understood it. Thank you so much.
Le me watching it in 2024🥲
tq for the df point exam in 2hrs
Man don't you love it when someone explaining something says 'But you should already know how to do this part" Go fuck yourself.
I have learnt something. Thanks a lot