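Assume we have a single predictor variable $X$ and a quantitative response $Y$. We assume that there is a relation between them s.t.
$$Y \approx \beta_0 + \beta_1 X$$
Having some data, we can calculate the estimates $\hat\beta_0$ and $\hat\beta_1$ and then use them to predict the value $\hat y = \hat\beta_0 + \hat\beta_1 x$ for a given $x$. But how do we estimate the coefficients?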
Least Squares Method
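I studied this formally in the Linear Algebra course and will make a separate note on it later (Least Squares Method), but generally this is how we approach it here. Assume that we have some $\hat\beta_0$ and $\hat\beta_1$, then the predicted value for the $i$-th data point $x_i$ is $\hat y_i = \hat\beta_0 + \hat\beta_1 x_i$. Its difference from the real value is the residual $e_i = y_i - \hat y_i$. We define the residual sum of squares (RSS) as the sum of these $e_i^2$:
$$\mathrm{RSS} = \sum_{i=1}^{n} e_i^2 = \sum_{i=1}^{n} (y_i - \hat\beta_0 - \hat\beta_1 x_i)^2$$
So we look for the betas that minimize RSS, and the formula for them is
$$\hat\beta_1 = \frac{\sum_{i=1}^{n} (x_i - \bar x)(y_i - \bar y)}{\sum_{i=1}^{n} (x_i - \bar x)^2}, \qquad \hat\beta_0 = \bar y - \hat\beta_1 \bar x$$
where $\bar x$ and $\bar y$ are the sample means.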
- Try proving it using some calculus
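
To make the closed-form least squares estimates concrete, here is a minimal sketch in plain Python. The function name `fit_simple_ols` and the toy data are made up for illustration; they are not from the book.

```python
def fit_simple_ols(xs, ys):
    """Return (beta0_hat, beta1_hat) that minimize RSS for y ≈ beta0 + beta1 * x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Numerator and denominator of the slope formula
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    beta1_hat = sxy / sxx
    beta0_hat = y_bar - beta1_hat * x_bar
    return beta0_hat, beta1_hat


# Made-up toy data, only for illustration
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 5.0, 4.0, 5.0]
beta0_hat, beta1_hat = fit_simple_ols(xs, ys)
print(beta0_hat, beta1_hat)
```

For this toy data the means are $\bar x = 3$ and $\bar y = 4$, so the slope is $6/10 = 0.6$ and the intercept is $4 - 0.6 \cdot 3 = 2.2$ (up to floating-point rounding).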
Standard errors
Okay, we calculate the betas, but how do we know that they’re accurate? Or how do we know how accurate they are? Standard errors come to help. The general idea is that if the true population really is described by a linear dependence, then $\hat\beta_0$ and $\hat\beta_1$ are our best estimates given the sample data. And if we had many other samples and averaged the estimates obtained from them, then we would really get the true betas (without hat). This means that linear regression does not systematically over- or under-estimate the true parameters. Anyway, the formulas for standard errors are on page 75 (I’m too lazy to type them here), but we should rather be interested in how to use them.
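
For reference, a small sketch of how the standard errors could be computed. It uses the usual textbook formulas for simple linear regression, $\mathrm{SE}(\hat\beta_1)^2 = \sigma^2 / \sum (x_i - \bar x)^2$ and $\mathrm{SE}(\hat\beta_0)^2 = \sigma^2 \left[ 1/n + \bar x^2 / \sum (x_i - \bar x)^2 \right]$, with $\sigma^2$ estimated by $\mathrm{RSS}/(n-2)$. I am assuming these match the ones on page 75; the function name and the data are made up.

```python
import math


def standard_errors(xs, ys):
    """Return (se_beta0, se_beta1) for the simple linear regression of ys on xs."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    # Least squares estimates
    beta1_hat = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sxx
    beta0_hat = y_bar - beta1_hat * x_bar
    # Estimate the error variance sigma^2 by RSS / (n - 2)
    rss = sum((y - (beta0_hat + beta1_hat * x)) ** 2 for x, y in zip(xs, ys))
    sigma2_hat = rss / (n - 2)
    se_beta1 = math.sqrt(sigma2_hat / sxx)
    se_beta0 = math.sqrt(sigma2_hat * (1.0 / n + x_bar ** 2 / sxx))
    return se_beta0, se_beta1


# Made-up toy data, only for illustration
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 5.0, 4.0, 5.0]
se_beta0, se_beta1 = standard_errors(xs, ys)
print(se_beta0, se_beta1)
```

Note that more spread in the $x_i$ (a larger $\sum (x_i - \bar x)^2$) makes both standard errors smaller, which fits the intuition that the slope is easier to pin down when the predictor covers a wide range.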
Logically, if there were no linear dependence, then $\hat\beta_1$ should be close to zero.
1) But how close is close enough?
2) Recall how to calculate the 95% confidence interval using standard errors (SE).
3) The t-statistic is given by $t = \frac{\hat\beta_1 - 0}{\mathrm{SE}(\hat\beta_1)}$, but what does it measure?
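
A hedged sketch of how the standard error gets used: the t-statistic counts how many standard errors the estimate lies away from zero, and an approximate 95% confidence interval is the estimate plus or minus about two standard errors. The helper names and the input numbers below are made up; the multiplier 2 is only a rough approximation (the exact one is a quantile of the t-distribution with $n - 2$ degrees of freedom, which is noticeably larger for small $n$).

```python
def t_statistic(beta_hat, se):
    """Number of standard errors that beta_hat is away from 0."""
    return (beta_hat - 0.0) / se


def approx_ci95(beta_hat, se):
    """Approximate 95% confidence interval: beta_hat ± 2 * SE (rough rule of thumb)."""
    return beta_hat - 2.0 * se, beta_hat + 2.0 * se


# Hypothetical estimate and standard error, only for illustration
beta1_hat = 0.6
se_beta1 = 0.25
t = t_statistic(beta1_hat, se_beta1)
low, high = approx_ci95(beta1_hat, se_beta1)
print(t, (low, high))
```

A large $|t|$ (equivalently, a confidence interval that stays away from zero) is evidence against "no linear dependence".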
RSE and $R^2$ statistic
Okay, let’s assume that now we’re confident that there is some dependence, but this still leaves the question: how accurate is the linear model? One way to answer is to calculate the residual standard error (RSE), $\mathrm{RSE} = \sqrt{\mathrm{RSS}/(n-2)}$. It basically gives the average deviation of the real values from the predicted ones. So, if it’s large, that’s bad.
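Whether an RSE is good or bad depends on the context of the data, because it is measured in the units of $Y$. So, it is wise to compare it to the mean $\bar y$. Another way, which gives a universal value between 0 and 1, is $R^2$:
$$R^2 = \frac{\mathrm{TSS} - \mathrm{RSS}}{\mathrm{TSS}} = 1 - \frac{\mathrm{RSS}}{\mathrm{TSS}}$$
where TSS is the total sum of squares $\sum_{i=1}^{n} (y_i - \bar y)^2$.
4) Explain intuitively the meaning of $R^2$.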
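
To see both accuracy measures side by side, here is a minimal sketch that computes RSE and $R^2$ from observed and predicted values. The function name and the numbers are made up for illustration; the $n - 2$ in the RSE assumes a simple linear regression with two estimated coefficients.

```python
import math


def rse_and_r2(ys, y_hats):
    """Return (RSE, R^2) for observed values ys and predictions y_hats."""
    n = len(ys)
    y_bar = sum(ys) / n
    rss = sum((y - y_hat) ** 2 for y, y_hat in zip(ys, y_hats))  # unexplained variation
    tss = sum((y - y_bar) ** 2 for y in ys)                      # total variation
    rse = math.sqrt(rss / (n - 2))
    r2 = 1.0 - rss / tss
    return rse, r2


# Made-up observations and the predictions of the line y = 2.2 + 0.6 * x at x = 1..5
ys = [2.0, 4.0, 5.0, 4.0, 5.0]
y_hats = [2.8, 3.4, 4.0, 4.6, 5.2]
rse, r2 = rse_and_r2(ys, y_hats)
print(rse, r2)
```

Here RSS is 2.4 and TSS is 6, so $R^2 = 1 - 2.4/6 = 0.6$: the line accounts for about 60% of the variation of the response around its mean.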