jbstatistics
  • Videos: 197
  • Views: 32,339,177
You sir, are no normal distribution
Not even close. Lots of this sort of stuff in textbooks as well. WT...H?
(And just in case anybody thinks that I think I'm in that far right tail --- I'm definitely the midwit here.)
For the normal distribution, the ratio of area between 0 and 1 SD from the mean to that of the area between 1 and 2 SD from the mean is about 2.51. With rounding, the figure states it as 34/14, which is about 2.43. The ratio of the actual areas seen in this plot is about 3.75 (50% larger than it should be). Not even close.
Views: 3,201
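
The ratios quoted in the description can be checked with a few lines of Python. This is only a sketch using the standard library's `statistics.NormalDist`; the value 3.75 is taken from the description as given, not measured from the figure, and the variable names are illustrative.

```python
from statistics import NormalDist

z = NormalDist()  # standard normal: mean 0, SD 1

# Areas under the standard normal curve
area_0_to_1 = z.cdf(1) - z.cdf(0)   # between the mean and 1 SD above it
area_1_to_2 = z.cdf(2) - z.cdf(1)   # between 1 SD and 2 SD above the mean

true_ratio = area_0_to_1 / area_1_to_2   # exact-area ratio for a normal curve
rounded_ratio = 34 / 14                  # ratio using the rounded percentages
plot_ratio = 3.75                        # ratio stated in the description for the drawn plot

# How much larger the plotted ratio is than the true one (as a fraction)
excess = plot_ratio / true_ratio - 1

print(f"true ratio    = {true_ratio:.2f}")
print(f"rounded ratio = {rounded_ratio:.2f}")
print(f"excess        = {excess:.0%}")
```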

Videos

1000 normal QQ plots in 30 minutes (n = 25, sampling from N(10,5^2))
1.1K views · 4 months ago
Title, essentially. Properly interpreting normal QQ plots takes some experience, and part of that experience is developing a feel for what natural variability looks like on a normal QQ plot (when sampling from a normal distribution). Each of these 1000 plots is based on a sample of size 25 from a normal distribution with a mean of 10 and standard deviation of 5. Created with R’s qqnorm and qqli...
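
For readers who want to generate similar plots themselves, here is a rough Python sketch of how the points on a normal QQ plot can be computed for one sample of size 25 from N(10, 5^2). The video used R; the helper name `normal_qq_points` and the plotting positions (i - 0.5)/n are assumptions made for this illustration, and no plotting library is used.

```python
import random
from statistics import NormalDist

def normal_qq_points(sample):
    """Return (theoretical quantile, ordered observation) pairs for a normal QQ plot.

    Plotting positions (i - 0.5) / n are one common convention; others exist.
    """
    n = len(sample)
    std_normal = NormalDist()
    probs = [(i - 0.5) / n for i in range(1, n + 1)]
    theoretical = [std_normal.inv_cdf(p) for p in probs]
    return list(zip(theoretical, sorted(sample)))

rng = random.Random(1)                           # fixed seed so the run is repeatable
sample = [rng.gauss(10, 5) for _ in range(25)]   # n = 25 from N(10, 5^2)
points = normal_qq_points(sample)

for q, x in points[:3]:
    print(f"theoretical {q:6.3f}   observed {x:6.3f}")
```

Rerunning with different seeds gives a feel for how much a QQ plot wobbles around a straight line even when the data really are normal.
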
What I envision a virtual work meeting at D2L (Brightspace) to be like (jk, obv)
8K views · 2 years ago
Todd meets with his supervisor at D2L (Brightspace) to discuss priorities going forward. (I'll get back to stats videos soon. This one's mainly for fun, but inspired by a decade of frustration. In case it's not obvious, I play both parts.) D2L (Brightspace) quizzes do not allow an instructor to change the answer to a numeric response (arithmetic) question and regrade quiz attempts. There's a re...
Basic Probability: The Multiplication Rule
34K views · 3 years ago
An introduction to the multiplication rule. (I assume the viewer has an understanding of conditional probability and independence, but I do a very brief review of those concepts.) I introduce the multiplication rule for two events, work through a simple example, then discuss the multiplication rule for 3 events and how it generalizes to any number of events. I end with an example using the mult...
Linear Transformations (in a Descriptive Statistics Setting)
12K views · 3 years ago
I discuss linear transformations, in the context of descriptive statistics. I discuss what a linear transformation is, give an example, discuss the effect of the linear transformation on various summary statistics, and work through a numerical example. The temperature data is from here: climate.weather.gc.ca/climate_data/daily_data_e.html?StationID=51459&timeframe=2&StartYear=1840&EndYear=2021&...
An Introduction to Boxplots
8K views · 3 years ago
An introduction to boxplots. (I do not carry out any calculations; this video is about interpreting boxplots.) I discuss the basics, an applied example, give a few illustrations of histograms and boxplots under symmetry and skewness, then briefly discuss how large samples often lead to a large number of outliers in a boxplot. Source of the jumping fish example: Brunt et al. (2016). Amphibious f...
On average, what proportion of sample means would a randomly selected 95% CI for mu capture?
11K views · 5 years ago
This one's inspired by a common confidence interval misinterpretation. (This is a bit of a different video for me, and if you're just looking for help with specific topics in a statistics course, you may not find it helpful. But there's some good stuff in here.) Here I address what might seem at first like a bit of a strange or uninformative question: In repeated sampling from a normally distribu...
Deriving the mean and variance of the least squares slope estimator in simple linear regression
98K views · 5 years ago
I derive the mean and variance of the sampling distribution of the slope estimator (beta_1 hat) in simple linear regression (in the fixed X case). I discuss the typical model assumptions, and discuss where we use them as I carry out the derivations. The derivations are carried out using summation notation (no matrices). At the end, I briefly discuss the normality assumption, and how that leads ...
The Law of Total Probability
158K views · 5 years ago
I discuss the Law of Total Probability. I begin with some motivating plots, then move on to a statement of the law, then work through two examples.
P(A) = P(A and B) + P(A and Bc)
22K views · 5 years ago
A quick video to illustrate that P(A) = P(A and B) + P(A and Bc), and work through a simple conditional probability example that makes use of this identity. I know from experience that some of my students have trouble seeing this (especially when it comes up in a formulaic approach to a problem), and I wanted to have a video that I could point them to. My Law of Total Probability video is here: r...
Deriving the least squares estimators of the slope and intercept (simple linear regression)
224K views · 5 years ago
I derive the least squares estimators of the slope and intercept in simple linear regression (using summation notation, and no matrices). I assume that the viewer has already been introduced to the linear regression model, but I do provide a brief review in the first few minutes. I assume that you have a basic knowledge of differential calculus, including the power rule and the chain rule. If y...
Proof that if events A and B are independent, so are Ac and B (and A and Bc)
63K views · 5 years ago
Here I prove that if events A and B are independent, so are A complement and B. (And A and B complement, of course, since which event we call A and which we call B is arbitrary.) Looking for a proof that if A and B are independent, so are their complements? That's here: ruclips.net/video/bnDpZNlVZ3k/видео.html
Proof that if two events are independent, so are their complements.
48K views · 5 years ago
Just getting warmed up. Here I prove that if events A and B are independent, so are Ac and Bc. I make use of De Morgan's Laws, without offering a formal proof of that part (but I do provide a brief Venn diagram justification of the needed bit). More probability and statistics videos will follow.
Independent Events (Basics of Probability: Independence of Two Events)
246K views · 6 years ago
An introduction to the concept of independent events, pitched at a level appropriate for the probability section of a typical introductory statistics course. I give the definition of independence, work through some simple examples, and attempt to illustrate the meaning of independence in various ways. (Note: I use the phrase "not independent" rather than "dependent" almost exclusively. There is...
Conditional Probability Example Problems
230K views · 6 years ago
Conditional probability example problems, pitched at a level appropriate for a typical introductory statistics course. I assume that viewers have already been introduced to the concepts of conditional probability and independence, but I do review the concepts along the way. I work through some problems with the conditional probability formula explicitly, and some using the reduced sample space ...
Basics of Probability: Unions, Intersections, and Complements
254K views · 6 years ago
Don't watch this! (A t test example where nearly everything I say is wrong)
7K views · 6 years ago
De Morgan's Laws (in a probability context)
113K views · 6 years ago
An Introduction to Conditional Probability
297K views · 6 years ago
Are mutually exclusive events independent?
113K views · 6 years ago
What Does Independence Look Like on a Venn Diagram?
87K views · 9 years ago
The Expected Value and Variance of Discrete Random Variables
352K views · 9 years ago
An Introduction to Discrete Random Variables and Discrete Probability Distributions
349K views · 9 years ago
Inference for the Ratio of Variances: How Robust are These Procedures?
7K views · 10 years ago
Inference for a Variance: How Robust are These Procedures?
7K views · 10 years ago
The Sampling Distribution of the Ratio of Sample Variances
11K views · 10 years ago
Inference for Two Variances: An Example of a Confidence Interval and a Hypothesis Test
13K views · 10 years ago
The Sampling Distribution of the Sample Variance
62K views · 10 years ago
Deriving a Confidence Interval for the Ratio of Two Variances
27K views · 10 years ago
An Introduction to Inference for the Ratio of Two Variances
29K views · 10 years ago

Comments

  • @lasithaamarasinghe9251 · 4 hours ago

    really interesting video😇

  • @MogauTau · 1 day ago

    This video changed my life...thanks brother

  • @galileodelcurto9300 · 2 days ago

    Excellent video! Great explanation.

  • @kkkkkk-or7js · 2 days ago

    good

  • @nathannguyen2041 · 3 days ago

    Hi JB! I'm hoping that you see this one day. I've never been able to make the connection between the Law of Large Numbers, the Central Limit Theorem, a test statistic, and hypothesis tests. LLN says that as my sample size gets larger, the sample mean will converge to the population mean. CLT says that for sufficiently large sample sizes, the sampling distribution of sample means will be approximately normal, and the mean of the sampling distribution converges to the population mean. This is applicable to various distributions but not all, e.g., Cauchy. But how is this connected to a test statistic and hypothesis testing? For example, suppose that a factory produced soap bars distributed normally with mean 2 and standard deviation 0.25. Someone was asked to randomly sample 60 soap bars, but we're not sure if the soap bars come from this factory. We could take a sample mean of our 60 soap bars and compare it to the population mean, right? If I recall correctly, we'd generate a null distribution with a mean of 2 and standard deviation of 0.25 since that's what we know about the parent population that we're interested in. Are we then creating a test statistic and seeing where it lies within the null distribution? Do I have that all correct? But the thing is, we only have one dataset (one sample of size 60). How are we able to use CLT? To me, it's almost as if we're generating more information than we have when talking about the distribution of the test statistic. Furthermore, if that's all correct, where exactly do LLN and CLT come into play? If you're able to reply, that'd be awesome. Thanks for your videos nonetheless!

    • @jbstatistics · 20 hours ago

      Hi Nathan. These are big questions that don't have a 30 second answer so I may have trouble getting to them. But for this part: "But the thing is, we only have one dataset (one sample of size 60). How are we able to use CLT? " This is a very common point of confusion, and many, many, many highly viewed videos on RUclips really botch this and perpetuate the misunderstanding. We don't need to repeatedly sample for the CLT to have meaning. The CLT has implications for a single sample in general. The "Suppose we repeatedly sample..." argument is supposed to be illustrative, and I use it myself, but the more decades I put behind me the more I think this just screws people up. The CLT tells us about the probability distribution of the random variable that is the mean (or related things, like a sum). It absolutely applies even though we have a single sample, since it's just speaking to the probability distribution of that statistic. The repeated sampling argument is just an illustrative argument, not something that has to happen for the CLT to have practical meaning. The LLN and CLT are very fundamental things to probability and statistics, and they have implications for hypothesis testing, but asking how they relate doesn't really work as they are very different things. The LLN and CLT are fundamental mathematical theorems, whereas hypothesis testing is a technique that helps us answer questions in certain spots. The biggest thing the CLT brings to the table in applied statistics is that it allows us to use well-developed methods based on the normal distribution, when the distribution of our data is not normal or is unknown, provided we have a large enough sample size.
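
A small simulation can make the reply's point concrete. The sketch below is an illustration only: the exponential population, its mean of 2, and the 5,000 repetitions are assumptions chosen for the example, not anything from the video. The repetitions are just a way of approximating the probability distribution of the single statistic (the mean of one sample of size 60); in practice only one sample is ever drawn.

```python
import math
import random

rng = random.Random(0)     # fixed seed so the run is repeatable

n = 60                     # size of the one sample we would actually collect
pop_mean = 2.0             # hypothetical population mean
pop_sd = 2.0               # for an exponential population, SD equals the mean
se = pop_sd / math.sqrt(n) # standard deviation of the sample mean

def sample_mean():
    """Mean of one sample of size n from a (skewed) exponential population."""
    return sum(rng.expovariate(1 / pop_mean) for _ in range(n)) / n

# Approximate the distribution of the sample mean by simulation
means = [sample_mean() for _ in range(5000)]

# If the normal approximation from the CLT is reasonable, about 95% of
# sample means should fall within 1.96 standard errors of the population mean.
inside = sum(abs(m - pop_mean) <= 1.96 * se for m in means) / len(means)
average_of_means = sum(means) / len(means)

print(f"fraction within 1.96 SE: {inside:.3f}")
print(f"average of sample means: {average_of_means:.3f}")
```

Even though the population here is far from normal, the normal-based statement about the sample mean holds closely, which is what lets normal-theory methods be applied to a single sample.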

  • @andrews9719 · 4 days ago

    So is this p hacking? And a two tailed test would have been more appropriate?

    • @jbstatistics · 4 days ago

      While I personally lean towards two-tailed alternatives in the vast majority of situations, I think the use of a one-tailed procedure is reasonable in this situation. Before collecting the data, there was a strong belief (based on previous studies and information) that puerarin would have a tendency to reduce alcohol consumption. So it wasn't just an "ooooh, I think my new drug is better" sort of argument. (Or even worse, using the data to inform the choice of alternative.) Abusing the use of a one-tailed test can be a form of p-hacking, sure, but I don't think that happened in this study.

  • @caesaaar · 4 days ago

    What an absolute God

  • @nathannguyen2041 · 5 days ago

    What is the relationship between the Law of Large Numbers, the Central Limit Theorem, a test statistic, and hypothesis testing? The LLN says that as you draw more i.i.d. observations, the sample mean will converge to the population mean. The CLT then says that as the number of i.i.d. observations increases, the distribution of the sample mean approaches a normal distribution. How are these concepts all connected?

  • @YourHeartFeelings · 6 days ago

    Thank you so much

  • @YourHeartFeelings · 6 days ago

    Thank you very much

  • @datsmydab-minecraft-and-mo5666 · 6 days ago

    Really cleared up confusion.

  • @YourHeartFeelings · 7 days ago

    Thank you very much. Now I understand the central limit theorem. It is the basis.

  • @AkashKumar-lr6hc · 8 days ago

    The best statistics teacher

  • @rimshazaidi2375 · 9 days ago

    Does this mean that the formula for the variance is the same for discrete and continuous variables?

    • @jbstatistics · 9 days ago

      Yes, in the sense that for any random variable X, Var(X) = E[(X - mu)^2].
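
Written out, the single definition in the reply is computed with a sum in the discrete case and an integral in the continuous case. The symbols p (probability mass function) and f (probability density function) are notation assumed here for illustration:

```latex
\operatorname{Var}(X) = E\left[(X-\mu)^2\right] =
\begin{cases}
\displaystyle\sum_{x} (x-\mu)^2\, p(x), & X \text{ discrete with pmf } p,\\[6pt]
\displaystyle\int_{-\infty}^{\infty} (x-\mu)^2 f(x)\, dx, & X \text{ continuous with pdf } f.
\end{cases}
```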

  • @monicaarrudadealmeida8112 · 9 days ago

    Excellent!

  • @halo_582 · 10 days ago

    what the sigma

  • @carlosdominguez7088 · 10 days ago

    By far, this is the best explanation I have seen of this topic. Congratulations!

  • @Didier-cu6cb · 12 days ago

    Do you have a video about the sample distribution? I suggest making one.

  • @aarushialreja2089 · 12 days ago

    I could not understand this from my stats textbook even after trying for 30 whole mins (even the formula didn't make much sense to me without a proper explanation), and it only took me 7 mins to understand it so well here. Thanks for this, honestly.

  • @alexzheng982 · 12 days ago

    From what table?

  • @Didier-cu6cb · 13 days ago

    The statistics and probability professor at our university taught us continuous probability distributions while we still don't know integral calculus! Is it not possible to learn continuous probability distributions without integrals?

    • @jbstatistics · 13 days ago

      I think it's fine to teach continuous probability distributions without explicitly discussing integration. e.g. Using vague statements like "Probabilities are areas under curves, and we can find those areas with mathematical techniques, and software has incorporated those mathematical techniques for us." I think someone can develop a very good understanding of statistics from that perspective. Keep in mind that most of the continuous distributions we use in statistical inference (e.g. normal, t, F, chi-square) do not have closed-form cumulative distribution functions and must be integrated numerically. So whether someone knows integral calculus or not, in the end areas are found using software. For a full and deeper understanding of what's going on? Sure, knowledge of integral calculus is meaningful. At a level to achieve an understanding of applied statistics? I don't think knowledge of integral calculus is necessary.

    • @Didier-cu6cb · 13 days ago

      @jbstatistics Thanks for the quick reply🙏 I'm actually preparing for an exam, and when I got to the topic of continuous probability distribution, I got a little confused because of the integral. I meant more to solve exam questions related to continuous probability distribution than to have a theoretical understanding. It seems that to reach the final answer, you need to know integral calculus (or have an advanced calculator).

    • @jbstatistics · 13 days ago

      @Didier-cu6cb For any of those distributions I named, and many others, there is no closed form solution for the integral. It does not matter how great of an integrator one is, the integrals can be solved only by numerical techniques. We do that using software. The world's greatest integrator and a person knowing no calculus whatsoever solve the problem in the same way: By asking software for the appropriate area.

    • @Didier-cu6cb · 12 days ago

      @jbstatistics Right. Thank you for your response & content.
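
To illustrate what "asking software for the appropriate area" can look like, here is a minimal Python sketch that finds a normal-curve area two ways: by a simple numerical rule (composite Simpson's rule) and by the standard library's `statistics.NormalDist`. Simpson's rule is used here only as a generic textbook method; it is not a claim about how any particular statistics package computes these areas, and the function names are made up for the example.

```python
import math
from statistics import NormalDist

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Density of a normal distribution with mean mu and SD sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))

def simpson(f, a, b, n=1000):
    """Approximate the area under f between a and b with composite Simpson's rule."""
    if n % 2:            # Simpson's rule needs an even number of subintervals
        n += 1
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

# P(-1.96 < Z < 1.96) for a standard normal Z
area_numeric = simpson(normal_pdf, -1.96, 1.96)
area_library = NormalDist().cdf(1.96) - NormalDist().cdf(-1.96)

print(f"numerical integration: {area_numeric:.6f}")
print(f"library cdf:           {area_library:.6f}")
```

Neither route requires the user to integrate anything by hand, which is the point made in the replies above.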

  • @hayatimulongo · 13 days ago

    Thank you, sir, for sacrificing for us with these perfect teachings

  • @hayatimulongo · 13 days ago

    For real, YouTube has contributed almost 80% to my understanding of calculated course units.

  • @racimeexe9868 · 14 days ago

    🎉🎉🎉🎉🎉 fantastic

  • @NzogeRamsey · 16 days ago

    Great bro

  • @gitgosc7075 · 18 days ago

    best of the best !

  • @LucyMburu605 · 18 days ago

    Thanks a lot, it really helped

  • @yazanziad6718 · 19 days ago

    THANK YOU SO MUCH from Jordan 🇯🇴 🌹

  • @harsharangapatil2423 · 19 days ago

    Why did we multiply by 2 here ruclips.net/video/Xi33dGcZCA0/видео.htmlsi=i0pT0C9uJvh1oJle&t=288? For that case you should have considered only one side and not both sides, as 776 was less than 780.

    • @jbstatistics · 19 days ago

      I suggest you limit the "should haves" until you have a deeper understanding of the situation. If your logic here was valid, then there would be no such thing as a two-sided alternative. The value of the statistic will always be on one side or the other, so if we let that determine things we'd always have a one-sided alternative. So your logic can't be valid, or nobody would ever speak of a two-sided alternative. The choice of alternative ******never never never never never ever ever ever ever ever ever**** depends on the value that we see in the sample that we're using to carry out the test. If we do that we're distorting the math and then making false statements in the end. The choice of alternative depends on the nature of the situation at hand. One should be able to write out the appropriate hypotheses without ever looking at the sample data. If the sample data has influenced your choice of hypothesis, then you've violated a fundamental requirement of hypothesis testing.
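
The following sketch shows, for a generic z test, why the two-sided p-value involves doubling a tail area and why that choice is made by naming the alternative up front rather than by looking at which side the statistic landed on. The test statistic value -1.5 is hypothetical and is not the value from the video (which may use a different test); the function name and the alternative labels are also assumptions for this illustration.

```python
from statistics import NormalDist

def z_test_p_value(z, alternative):
    """p-value for an observed z statistic under a pre-specified alternative.

    alternative is chosen from the nature of the problem, before seeing data:
    "two-sided", "less", or "greater".
    """
    std_normal = NormalDist()
    if alternative == "two-sided":
        # Area in both tails beyond |z|: double the one-tail area
        return 2 * (1 - std_normal.cdf(abs(z)))
    if alternative == "less":
        return std_normal.cdf(z)
    if alternative == "greater":
        return 1 - std_normal.cdf(z)
    raise ValueError("alternative must be 'two-sided', 'less', or 'greater'")

z = -1.5  # hypothetical observed test statistic (sample mean below the null value)

p_two = z_test_p_value(z, "two-sided")
p_less = z_test_p_value(z, "less")

print(f"two-sided p-value: {p_two:.4f}")
print(f"one-sided (less):  {p_less:.4f}")
```

Picking "less" only after seeing a negative z would halve the reported p-value relative to the two-sided analysis that was actually appropriate, which is the distortion the reply warns about.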

  • @normanremedios8190 · 21 days ago

    You are amazing!!

  • @gwineafowl6946 · 21 days ago

    very helpful, even almost 10 years after upload! thanks

  • @alirazi9198 · 21 days ago

    People would absolutely lose their shit these days if u didnt include trans people in ur example

  • @adammontgomery7980 · 22 days ago

    I love having equations explained to me in a way I understand. Thank you

  • @CollegeDiaries-hl9st · 24 days ago

    I couldn't solve for the expected value of the negative binomial distribution. Is the expression for the expected value (the mean here) correct?

  • @forheuristiclifeksh7836 · 25 days ago

    least squares method vs

  • @Foram-mz9uo · 27 days ago

    If I draw out a random sample of 28 females with a sample mean x bar from the obv, there is a 5% chance that the confidence interval does not contain the population mean, right? Assuming a 95% CI.

  • @guzelbora · 27 days ago

    I really started to hate my school's educational system. All it takes is 2 minutes of the 7-minute video for me to grasp the idea, and here I am losing tons of hours staring at meaningless PDFs. Shame on the absolute zero effort put into teaching. I really appreciate your video! You saved my life and my interest in statistics by posting this video 11 years ago.

  • @lowerterror7993 · 29 days ago

    No one people like data analytics

  • @Jane-jd1fs · 29 days ago

    How am I here 11 years later 😭

  • @neviswarren · 1 month ago

    Excellent. Thank you.

  • @kiranthota5137 · 1 month ago

    Great explanation. While reading the wiki I realized that there are 2 different types of geometric distribution: 1. the random variable is the number of trials until the 1st success; 2. the random variable is the number of failures before the 1st success. This is very important when conducting the experiment, and I think it was missed in this video lecture. Thanks.

    • @jbstatistics · 1 month ago

      I think bringing that up in an introductory video on the geometric distribution does more harm than good. It's an extra layer of confusion that people don't need at first. The difference in the random variables, the difference in the means, describing why the variances are the same...it just takes away from the big picture of what the geometric distribution does for us. Sure, I bring it up elsewhere, especially as I use R in my courses and R uses the other definition of the r.v., but I think it would cause more confusion than it's worth in an intro video. Once one is understood, the other comes naturally.
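
For anyone curious about the two definitions mentioned in this thread, the sketch below compares them numerically. It assumes the standard forms: X counts the trials up to and including the first success, and Y = X - 1 counts the failures before it. The success probability 0.3 and the truncation of the infinite sums at 500 terms are choices made only for this illustration.

```python
def pmf_trials(x, p):
    """P(X = x), where X = number of trials up to and including the first success (x = 1, 2, ...)."""
    return (1 - p) ** (x - 1) * p

def pmf_failures(y, p):
    """P(Y = y), where Y = number of failures before the first success (y = 0, 1, ...)."""
    return (1 - p) ** y * p

def mean_and_variance(pmf, support, p):
    """Mean and variance computed directly from the pmf over a (truncated) support."""
    mean = sum(x * pmf(x, p) for x in support)
    var = sum((x - mean) ** 2 * pmf(x, p) for x in support)
    return mean, var

p = 0.3  # hypothetical success probability

# Truncate the infinite sums; the neglected tail is negligible for p = 0.3
mean_t, var_t = mean_and_variance(pmf_trials, range(1, 500), p)
mean_f, var_f = mean_and_variance(pmf_failures, range(0, 500), p)

print(f"trials:   mean = {mean_t:.4f}, variance = {var_t:.4f}")
print(f"failures: mean = {mean_f:.4f}, variance = {var_f:.4f}")
```

The means differ by exactly 1 (1/p versus (1 - p)/p) because Y = X - 1, and the variances agree at (1 - p)/p^2 because subtracting a constant does not change spread.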

  • @PunmasterSTP · 1 month ago

    F-this, but in the best way possible! 👍

  • @tgturbo7958 · 1 month ago

    he speaks like he is from texas

  • @thereaper_xxx · 1 month ago

    I already knew this but after watching the video, I understood it. Thank you so much.

  • @yaadrakhna6291 · 1 month ago

    Le me watching it in 2024🥲

  • @slayvenom5900 · 1 month ago

    tq for the df point exam in 2hrs

  • @davehansen7321 · 1 month ago

    Man don't you love it when someone explaining something says "But you should already know how to do this part" Go fuck yourself.

  • @user-in2ws8cq5n · 1 month ago

    I have learnt something. Thanks a lot