Here, I get why you can drop the first term. But why the last term? Why is it a constant?
Please see the discussion here!
I’m going to confess something: I wrote the second equation in that screenshot, and it took me a while to convince myself it was correct, and then I immediately forgot why it was correct.
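The answer is that we don’t know $\sigma^2$ in advance, so we use the empirical estimate from the residuals, which is just $\hat\sigma^2 = \frac{1}{N}\sum_{i=1}^N (y_i - \tilde y_i)^2$. That means the sum of squared residuals is exactly $N\hat\sigma^2$, and dividing it by $2\hat\sigma^2$ leaves $N/2$ no matter which model you fit. So the last term is a constant: it’s equal to $N/2$ (up to sign).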
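If it helps, here is a quick numerical check of that claim. It's only a sketch: the toy data, the polynomial models, and the helper name `last_term` are made up for illustration and are not from the tutorial. It assumes the last term has the form $\frac{1}{2\hat\sigma^2}\sum_i (y_i - \tilde y_i)^2$, with $\hat\sigma^2$ estimated from the same residuals.

```python
import numpy as np


def last_term(y, y_pred):
    """Residual term of the Gaussian log-likelihood (sign dropped),
    with sigma^2 replaced by its empirical estimate from the residuals."""
    residuals = y - y_pred
    sigma2_hat = np.mean(residuals ** 2)  # (1/N) * sum of squared residuals
    return np.sum(residuals ** 2) / (2 * sigma2_hat)


# Hypothetical toy data: a noisy quadratic.
rng = np.random.default_rng(0)
N = 50
x = np.linspace(-1, 1, N)
y = 2 * x - x ** 2 + rng.normal(scale=0.3, size=N)

# Fit polynomials of several orders and evaluate the term for each fit.
terms = {}
for order in range(1, 5):
    coeffs = np.polyfit(x, y, order)
    y_pred = np.polyval(coeffs, x)
    terms[order] = last_term(y, y_pred)

for order, value in terms.items():
    print(f"order {order}: last term = {value:.6f}, N/2 = {N / 2}")
```

Every model gives the same value, because the residuals cancel between the numerator and $\hat\sigma^2$. The term carries no information for comparing models, which is why it can be dropped.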