If you have two lines running through a set of points, how do you know which one is closer to the points (that is, which is the better fit)?
This is what you do:
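For each point, you measure how far the point is above or below line A (the point's y-value minus the line's y-value at the same x) - this is the "residual". It is positive if the point is above the line and negative if it is below.
You square each residual, i.e. multiply it by itself. (This gives a number that is never negative.)
You add up all the squared residuals.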
This gives you the sum of the squares of the residuals for line A.
You then do the same for line B.
Whichever line has the lower sum (the value closer to zero) is the better fit.
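The steps above can be sketched in a few lines of Python. This is only an illustration: the sample points, the two lines (A: y = x + 1, B: y = 0.5x + 2) and the function name sum_of_squared_residuals are made up for the example, and each residual is taken as the point's y-value minus the line's y-value at the same x.

```python
def sum_of_squared_residuals(points, slope, intercept):
    """Add up the squared residuals of the points from the line y = slope*x + intercept."""
    total = 0.0
    for x, y in points:
        residual = y - (slope * x + intercept)  # how far the point is above (+) or below (-) the line
        total += residual ** 2                  # squaring makes every contribution non-negative
    return total

# Made-up sample data, purely for illustration
points = [(1, 2), (2, 3), (3, 5), (4, 4)]

ssr_a = sum_of_squared_residuals(points, slope=1.0, intercept=1.0)  # line A: y = x + 1
ssr_b = sum_of_squared_residuals(points, slope=0.5, intercept=2.0)  # line B: y = 0.5x + 2

# The line with the lower sum is the better fit
better = "A" if ssr_a < ssr_b else "B"
print(ssr_a, ssr_b, better)  # prints: 2.0 2.5 A
```

Here line A has residuals 0, 0, 1 and -1 (sum of squares 2), while line B has residuals -0.5, 0, 1.5 and 0 (sum of squares 2.5), so line A is the better fit of the two.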
Note that you have to square the residuals to make sure that none of the values is negative. Otherwise a large positive residual could cancel out a large negative residual. Squaring also means that a large negative residual counts exactly as much towards the sum as a large positive residual of the same size. Finally, by squaring (rather than, say, taking the absolute value of the residuals), a point a long way from the line has a disproportionately large effect on the sum: a residual ten times as big adds a hundred times as much.
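A small sketch of both effects, using made-up residual values: first the cancellation you get without squaring, then how squaring weights a distant point compared with taking the absolute value.

```python
# Two made-up residuals of equal size and opposite sign
residuals = [5.0, -5.0]

plain_sum = sum(residuals)                    # 0.0: the two misses cancel out
squared_sum = sum(r ** 2 for r in residuals)  # 50.0: both misses are counted

# Compare a near point (residual 1) with a far point (residual 10)
small, large = 1.0, 10.0
abs_ratio = abs(large) / abs(small)    # 10.0: with absolute values the far point counts 10 times as much
sq_ratio = large ** 2 / small ** 2     # 100.0: with squares it counts 100 times as much

print(plain_sum, squared_sum, abs_ratio, sq_ratio)  # prints: 0.0 50.0 10.0 100.0
```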
Now, let's assume that all the points actually lie on the line. The value of each residual is 0, squared it is 0, and the sum of squares of the residuals is 0. That is the lowest possible value. And that is the best possible fit.
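As a quick check of the perfect-fit case, here is a sketch with made-up points that all lie exactly on the line y = 2x + 1 (the points and the line are assumptions chosen for the example):

```python
# Made-up points that lie exactly on the line y = 2x + 1
points = [(0, 1), (1, 3), (2, 5)]
slope, intercept = 2, 1

# Each residual is the point's y-value minus the line's y-value at the same x
residuals = [y - (slope * x + intercept) for x, y in points]
ssr = sum(r ** 2 for r in residuals)

print(residuals, ssr)  # prints: [0, 0, 0] 0
```

Since a square can never be negative, no line can give a sum below 0.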