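The quadratic cost function comes from the way the LQR problem is solved. When you apply LQR to a system, you obtain the optimal sequence of actions by minimising the cost function. With a quadratic cost and suitable weights, that function is convex, so the minimum you find is the global one. If the cost function is not quadratic (which would be the case if the actions were only summed up), you have no guarantee of finding a global minimum: a plain sum of the actions, for instance, can be made arbitrarily low by choosing ever larger negative actions.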
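
For reference, one common discrete-time way of writing such a cost is sketched below. The notation is an assumption on my part and may differ from the one in your source: $x_k$ is the deviation from the objective at step $k$, $u_k$ is the action, and $Q$ and $R$ are weighting matrices.

$$
J = \sum_{k=0}^{N} \left( x_k^\top Q\, x_k + u_k^\top R\, u_k \right), \qquad Q \succeq 0, \quad R \succ 0
$$

With $Q$ positive semidefinite and $R$ positive definite, every term is non-negative, so $J$ is bounded below by zero and is convex in the actions.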

To give a more intuitive explanation, let's consider the meaning of each term of the cost function. As you said, the first one penalises deviation from the objective (whether negative or positive): you want the system to be as close as possible to the objective. The second term penalises *energy consumption*: you don't want to spend too much energy to reach the objective. Large actions (in the example, thrusts of the cat's jetpack) spend a lot of energy, whatever their sign.
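
To make the sign argument concrete, here is a minimal scalar sketch. The function names, the weights `q` and `r`, and the trajectory values are all made up for illustration; they are not taken from the example in the question.

```python
def quadratic_cost(errors, actions, q=1.0, r=0.1):
    """Scalar quadratic cost: sum of q*e^2 + r*u^2 over the trajectory."""
    return sum(q * e * e + r * u * u for e, u in zip(errors, actions))


def summed_cost(errors, actions, q=1.0, r=0.1):
    """Non-quadratic alternative: errors and actions are only summed up."""
    return sum(q * e + r * u for e, u in zip(errors, actions))


# Hypothetical trajectory: the error oscillates around the objective
# and the thrusts alternate in sign.
errors = [1.0, -1.0, 0.5, -0.5]
actions = [2.0, -2.0, 1.0, -1.0]
flipped = [-u for u in actions]  # same thrust magnitudes, opposite signs

print("quadratic:", quadratic_cost(errors, actions))
print("quadratic, flipped thrusts:", quadratic_cost(errors, flipped))
print("summed:", summed_cost(errors, actions))
```

The quadratic cost is the same for `actions` and `flipped`, because only the magnitude of each thrust matters. The summed cost, on the other hand, lets positive and negative terms cancel, so this clearly poor trajectory looks almost free.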

In summary, when you perform LQR (and LQG as well), you want to penalise both large errors with respect to the objective and large energy consumption in reaching it.