The method of least squares was introduced by Legendre in his work Nouvelles Méthodes pour la Détermination des Orbites des Comètes, published in 1805. The method was easy to apply apart from the labor of the computations involved. What it lacked was a theoretical foundation. Here follows a summary of the various arguments put forward to justify the use of least squares.
Robert Ellis has given a detailed comparison of the so-called proofs of the method of least squares in "On the Method of Least Squares" which appeared in the Transactions of the Cambridge Philosophical Society in 1844, pp. 204-219. He treats the proofs of Gauss, Laplace and Ivory.
Cleveland Abbe discovered the proof by Adrain and reported this in "A Historical Note on the Method of Least Squares" which appeared in the American Journal of Science and Arts 1: 411-415 (1871).
J.W.L. Glaisher has contributed "On the Law of Facility of Errors of Observation, and on the Method of Least Squares", Memoirs of the Royal Astronomical Society. Vol. XXXIX (1872) pp. 75-124. In this he first examines the proofs of Adrain. Glaisher also groups proofs according to the following scheme:
This was followed by a study by Mansfield Merriman, who gave a chronology of proofs of the method of least squares in The Analyst for March 1877, Volume IV, No. 2, pp. 33-36. This journal and his paper are available through JSTOR. In addition, from the Transactions of the Connecticut Academy, Vol. IV, 1877, we have "A List of Writings relating to the Method of Least Squares, with historical and critical notes." This document lists 408 memoirs, books and parts of books related to the Theory of Errors.
Comments are derived from the aforementioned papers.
Robert Adrain, "Research concerning the probabilities of the errors which happen in making observations." The Analyst (1808) No. IV, pp. 93-109. Two proofs are contained therein. The first is on pages 93-95 and has been reprinted by Cleveland Abbe (see below) and the second lies on pages 96-97. Merriman himself reprints the gist of it in his 1877 paper. Herschel's proof is similar. See also Ellis 1850 below.
Carl Gauss, Theoria Motus Corporum Coelestium in Sectionibus Conicis Solem Ambientium, 1809, pp. 205-224. Charles Henry Davis made an English translation published as Theory of the Motion of the Heavenly Bodies Moving about the Sun in Conic Sections (1857).
Gauss assumes that the arithmetic mean is the most probable result for a sequence of observations of a quantity. The only law of error consistent with this assumption is the Gaussian (normal) distribution, from which the method of least squares follows.
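The chain of reasoning just described can be sketched in modern notation. This is only an outline under stated assumptions, not a transcription of Gauss's text: the symbols x_i (observations), mu (the quantity observed), e_i (errors), phi (the error density, assumed smooth) and h (a precision constant) are introduced here for illustration, and the postulate about the mean is assumed to hold for every sample and every sample size.

```latex
% Sketch of the argument; all notation is assumed for illustration.
\begin{align*}
&\text{Most probable value of } \mu:\quad
  \frac{d}{d\mu}\sum_{i=1}^{n}\log\varphi(x_i-\mu)=0
  \;\Longleftrightarrow\;
  \sum_{i=1}^{n}\psi(x_i-\mu)=0,\qquad \psi=\frac{\varphi'}{\varphi}.\\
&\text{Postulate: the solution is } \mu=\bar{x}=\tfrac{1}{n}\sum_i x_i
  \text{ for every sample, so }
  \sum_i\psi(e_i)=0 \text{ whenever } \sum_i e_i=0.\\
&\text{This forces } \psi(e)=k\,e, \text{ hence }
  \varphi(e)=C\,e^{k e^{2}/2};\ \text{integrability requires } k<0.\\
&\text{Writing } k=-2h^{2}:\quad
  \varphi(e)=\frac{h}{\sqrt{\pi}}\,e^{-h^{2}e^{2}}.\\
&\text{Joint density of independent errors:}\quad
  \prod_{i=1}^{n}\varphi(e_i)=\Bigl(\frac{h}{\sqrt{\pi}}\Bigr)^{n}
  \exp\Bigl(-h^{2}\sum_{i=1}^{n}e_i^{2}\Bigr),\\
&\text{which is largest exactly when } \sum_{i=1}^{n}e_i^{2} \text{ is smallest.}
\end{align*}
```

The last line is the least squares criterion: under the normal law, the most probable values of the unknowns are those that minimize the sum of squared errors.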
Ellis states that details of Gauss's reasoning may be found in the paper by Bessel, "Bestimmung der Axen des elliptischen Rotationssphäroids, welches den vorhandenen Messungen von Meridianbögen der Erde am meisten entspricht," originally published in Astronomische Nachrichten 14, Nr. 333. Its translation as "Determination of the Axes of the Elliptic Spheroid of Revolution which most nearly corresponds with the existing Measurements of Arcs of the Meridian" is in Taylor's Scientific Memoirs Vol. 2 on pages 387-400.
We note that Encke also followed the same demonstration in the paper "Über die Methode der Kleinsten Quadrate", Berliner Astronomisches Jahrbuch (1834) pages 253-308, of which a translation appears in the same Scientific Memoirs as "On the Method of Least Squares" on pages 317-369.
Augustus De Morgan discusses the theory of the arithmetic mean in "On the Theory of Errors of Observation" Cambridge Philosophical Transactions, Vol. X, (1864) pp. 409-42.
Schiaparelli provides a justification for the use of the arithmetic mean in "Sur le principe de la moyenne arithmétique" in Astronomische Nachrichten Vol. LXXXVII (1875), Nr. 2068, columns 55-58. A copy may be downloaded from the SAO/NASA Astrophysics Data System.
Pierre Laplace, "Mémoire sur les approximations des formules qui sont fonctions de très-grands nombres, et sur leur application aux probabilités" and supplement, "Mémoire sur les approximations des formules qui sont fonctions de très-grands nombres, et sur leur application aux probabilités (suite)". Mém. l'Institut France 1809 (1810), 353-415, 559-565. For the proof see pages 383-389 and 559-565.
Laplace reproduces the proof in the Théorie Analytique des Probabilités, Chapter IV (pp. 309-354 or paragraphs 18-24). His second demonstration may be found on pp.
Laplace shows that the method of least squares follows if all observations follow the same law of error and the number of observations increases without limit. He limits himself to two unknowns. Of course, the proof fails if the law of error is the Cauchy distribution. For this, see Poisson "Sur la probabilité des résultats moyens des observations." Connaissance des Temps, 1827, pp. 273-302.
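The remark about the Cauchy distribution can be illustrated numerically. The following is a minimal sketch, not part of any of the cited proofs: it compares how the spread of the arithmetic mean changes with the number of observations under a normal law and under a Cauchy law. The function names, the sample sizes (10 and 1000), the number of trials (400) and the use of the interquartile range as the measure of spread are all choices made here for illustration.

```python
import math
import random
import statistics


def normal_draw(rng):
    # One error from a standard normal law.
    return rng.gauss(0.0, 1.0)


def cauchy_draw(rng):
    # One error from a standard Cauchy law, by inverting its distribution function.
    return math.tan(math.pi * (rng.random() - 0.5))


def spread_of_means(draw, n_obs, n_trials, rng):
    """Interquartile range of the arithmetic mean of n_obs errors over n_trials repetitions."""
    means = [
        statistics.fmean([draw(rng) for _ in range(n_obs)])
        for _ in range(n_trials)
    ]
    q1, _, q3 = statistics.quantiles(means, n=4)
    return q3 - q1


rng = random.Random(0)  # fixed seed so that repeated runs agree
normal_spread = {n: spread_of_means(normal_draw, n, 400, rng) for n in (10, 1000)}
cauchy_spread = {n: spread_of_means(cauchy_draw, n, 400, rng) for n in (10, 1000)}

for n in (10, 1000):
    print(f"n = {n:4d}   normal: {normal_spread[n]:.4f}   Cauchy: {cauchy_spread[n]:.4f}")
```

Under the normal law the spread of the mean shrinks as observations accumulate, while under the Cauchy law it does not, because the mean of Cauchy errors has the same distribution as a single error. An argument that relies on the number of observations increasing without limit therefore gains nothing in that case.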
Ellis gave the extension to any number of unknowns in 1844. Glaisher simplifies the argument of Laplace in "Remarks on certain portions of Laplace's Proof of the Method of Least Squares", Philosophical Magazine Vol. 43, 4th Series, 1864 and again in 1872.
Todhunter also extends the method of Laplace in "On the Method of Least Squares", Transactions of the Cambridge Philosophical Society, Vol. 11 (1871). pp. 219-238.
Carl Gauss, "Theoria combinationis observationum erroribus minimis obnoxiae," Comm. Soc. Göttingen, Vol. V (1823), pp. 33-90. An English translation, based on the French translation of Bertrand, was made by Hale F. Trotter.
Here Gauss assumes that the importance of an error varies as the square of its magnitude. He has been accused of petitio principii, or begging the question. The mean value of the sum of the squares is taken as a measure of precision. Merriman believes this argument to be followed only by Helmert in 1872.
James Ivory, "On the method of Least Squares," Tilloch's Philosophical Magazine, Vol. LXV (1825), pp. 3-10, 81-88, 161-168, and Tilloch's Philosophical Magazine, Vol. LXVIII (1826), pp. 161-165.
Ellis claims that Ivory gave three arguments. Glaisher finds four. The first argument (page 5) rests on an analogy to the condition of equilibrium which leads to the method of least squares. The second (pages 6-7) is based on minimizing the mean square error (the measure of precision) when several sets of observations are made. The third is a variant of this in that one minimizes the measure of precision among a set of observations. The last argument is based upon a symmetric law of error and independent errors in observations. In this case he claims the method of least squares follows from the equations of condition.
Hagen, Grundzüge der Wahrscheinlichkeits-Rechnung, 1837, second edition 1867. An error is the sum of indefinitely many elementary errors of equal magnitude, each equally likely to be positive or negative. That is, the number of positive elementary errors is binomially distributed with parameter p = 0.5, so that the distribution of an error is asymptotically normal. An exposition by Charles Kummel in English may be found in The Analyst Vol. III for 1876 on pages 133-140, 165-178.
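Hagen's hypothesis lends itself to a direct numerical check. The sketch below is an illustration only and is not taken from Hagen or Kummel: it assumes elementary errors of size +1 or -1 (the unit is arbitrary), computes the exact distribution of their sum from the binomial coefficients, and compares it with the normal law of the same variance. All function names are hypothetical.

```python
import math


def error_pmf(n):
    """Exact law of the sum of n elementary errors, each +1 or -1 with probability 1/2."""
    # k positive elementary errors out of n give a total error of 2k - n.
    return {2 * k - n: math.comb(n, k) / 2 ** n for k in range(n + 1)}


def normal_approx(e, n):
    """Normal density with mean 0 and variance n, scaled by the lattice spacing of 2."""
    return 2.0 * math.exp(-e * e / (2.0 * n)) / math.sqrt(2.0 * math.pi * n)


def max_abs_gap(n):
    """Largest absolute difference between the exact law and the normal approximation."""
    return max(abs(p - normal_approx(e, n)) for e, p in error_pmf(n).items())


for n in (10, 100, 1000):
    print(f"n = {n:4d}   largest gap = {max_abs_gap(n):.2e}")
```

The factor of 2 in normal_approx accounts for the fact that the possible totals are spaced two units apart. The largest gap falls as the number of elementary errors grows, which is the sense in which the distribution of an error is asymptotically normal.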
Friedrich Bessel, "Untersuchungen über die Wahrscheinlichkeit der Beobachtungsfehler." Astronomische Nachrichten, 1838, Vol. XV, col. 369-404. A copy may be downloaded from the SAO/NASA Astrophysics Data System. Bessel demonstrates, by providing concrete examples, that the normal distribution of errors cannot be assumed a priori. However, the error law is closely approximated if an error is the result of many causes and no cause dominates.
Donkin, An essay on the Theory of the Combination of Observations (1844) Ashmolean Society and in translation as "Sur la Théorie de la Combinaison des Observations" in Liouville's Jour. Math. Vol. XV, (1855) pp. 297-322. Merriman asserts that the reasoning is neither clear nor rigorous.
John Herschel, "Quetelet on Probabilities." Edinburgh Review (1850) Vol. XCII, pp. 1-57. The proof lies on pages 19 and 20. It is the same as that given by Adrain. For a review of his review, see Robert Ellis in the Philosophical Magazine (1850), Vol. 37, pp. 321-328. Ellis puts the argument into mathematical language. See also Glaisher 1872. George Boole defends Herschel with "On the Application of the Theory of Probabilities to the Question of the Combination of Testimonies or Judgments," Transactions of the Royal Society of Edinburgh, Vol. XXI (1857).
Peter Guthrie Tait, "On the Law of Frequency of Error," Transactions of the Royal Society of Edinburgh, Vol. XXIV (1865). Reprinted in Scientific Papers Vol. 1, 1898.
Donkin, "On an analogy relating to the theory of probability and on the principle of least squares," Quarterly Jour. Math., 1857, Vol. I, pp. 152-162. An analysis of the reasoning is given in Glaisher 1872.
Crofton, "On the Proof of the Law of Errors of Observations." Phil. Trans. 1870, pp. 175-188. Here an error is the result of a large number of small errors positive and negative but not equally probable.
Several papers on the history of least squares have been published subsequently. These are
The most comprehensive enumeration to date of research in this area was compiled by H. Leon Harter. It was published as a series of articles in the International Statistical Review from 1974 through 1976.