Linear Regression


C++ For Scientists and Engineers


Last Modified: November 25, 1998


This web page presents original C++ code implementing a linear regression analysis. Comments or requests concerning this page can be sent by email to David@Swaim.com.

Scientists and engineers often gather experimental data for analysis. This data frequently needs to be fit to a theoretical curve to determine the proper coefficients. The simplest "curve" is a straight line, the equation of which is:

y = a + bx

where y is the dependent coordinate and x is the independent coordinate. The coefficient a is the value of y when x is zero and is called the y intercept. The coefficient b is the slope of the line, that is, the amount y changes per unit change in x. The slope determines the angle of the line when it is plotted on a graph.

The method of least squares can be used to fit experimental data to a theoretical curve. The simplest form of least squares is linear regression, which fits data to a straight line. The code contained in linreg.h is a C++ class that performs a linear regression analysis on a set of data.
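
For readers who want to see the arithmetic behind the fit, here is a minimal sketch of the usual closed-form least-squares solution for y = a + bx. It is not the code from linreg.h; the names LineFit and fitLine are hypothetical and exist only for this illustration.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper type, not part of linreg.h.
struct LineFit {
    double a;  // y intercept
    double b;  // slope
};

// Least-squares fit of y = a + b*x to n points (hypothetical helper).
// Requires n >= 2 and at least two distinct x values.
LineFit fitLine(const double* x, const double* y, std::size_t n)
{
    assert(n >= 2);
    double sumX = 0.0, sumY = 0.0, sumXY = 0.0, sumXX = 0.0;
    for (std::size_t i = 0; i < n; ++i) {
        sumX  += x[i];
        sumY  += y[i];
        sumXY += x[i] * y[i];
        sumXX += x[i] * x[i];
    }
    const double denom = n * sumXX - sumX * sumX;  // zero if all x are equal
    assert(denom != 0.0);

    LineFit fit;
    fit.b = (n * sumXY - sumX * sumY) / denom;  // slope
    fit.a = (sumY - fit.b * sumX) / n;          // intercept
    return fit;
}
```

The slope is the sum of cross deviations divided by the sum of squared x deviations, and the intercept then follows from the requirement that the line pass through the mean point of the data.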

Some interesting points concerning the class definition follow:

First, there are two classes in this header file. The first is a class to define a point on a plane. This class, Point2D, is self-contained in the header file. No further implementation code is needed. This class is strictly for use by the LinearRegression class in inputting data points for the calculation. Better implementations of point classes exist in other places.

The second class that is defined is the LinearRegression class itself. This class accepts data points and calculates the coefficients for the straight line. I have overloaded the insertion operator to print the coefficients to a standard C++ output stream.

The LinearRegression class is not self-contained in the header file. Its implementation code is found in the linreg.cpp file.

To use the LinearRegression class, add #include "linreg.h" to your program, then compile the linreg.cpp file and link it with your program (for example, as part of the same project).

To perform a linear regression, first declare an instance of LinearRegression. At instantiation you may load an entire set of experimental data, either as an array of points or as an array of x coordinates and an array of y coordinates; in both cases you must specify the number of points in the arrays. You need not load all the points at instantiation, however. You may instead instantiate an empty LinearRegression and populate it one point at a time using the addXY() or addPoint() member functions. You can see the result of the regression at any time by using the overloaded insertion operator to print the coefficients, or you can get the coefficient of correlation, the coefficient of determination, or the standard error of the estimate using the appropriate "get" member function.
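
One way a class can support adding points one at a time and reporting results at any moment is to keep running sums. The sketch below illustrates that idea only. It is not the contents of linreg.h: the class name SketchRegression and the members slope(), intercept(), correlation(), determination(), and stdErrorEst() are assumptions, and the n - 2 divisor in the standard error is the common textbook convention, which the original class may or may not use.

```cpp
#include <cassert>
#include <cmath>
#include <ostream>

// Hypothetical stand-in for LinearRegression. Only the names addXY() and
// items() come from the text; every other name here is assumed.
class SketchRegression {
public:
    SketchRegression()
        : n_(0), sumX_(0.0), sumY_(0.0), sumXX_(0.0), sumYY_(0.0), sumXY_(0.0) {}

    // Add one observation by updating the running sums.
    void addXY(double x, double y)
    {
        ++n_;
        sumX_  += x;      sumY_  += y;
        sumXX_ += x * x;  sumYY_ += y * y;  sumXY_ += x * y;
    }

    long items() const { return n_; }

    double slope() const { assert(n_ >= 2); return sxy() / sxx(); }
    double intercept() const { return (sumY_ - slope() * sumX_) / n_; }

    // Coefficient of correlation r and coefficient of determination r^2.
    // Both are undefined (NaN) when all x or all y values are equal.
    double correlation() const
    {
        assert(n_ >= 2);
        return sxy() / std::sqrt(sxx() * syy());
    }
    double determination() const { const double r = correlation(); return r * r; }

    // Standard error of the estimate with n - 2 degrees of freedom (assumed).
    double stdErrorEst() const
    {
        assert(n_ >= 3);
        double sse = syy() - slope() * sxy();  // residual sum of squares
        if (sse < 0.0) sse = 0.0;              // guard against rounding error
        return std::sqrt(sse / (n_ - 2));
    }

private:
    // Sums of squared deviations and cross deviations about the means.
    double sxx() const { return sumXX_ - sumX_ * sumX_ / n_; }
    double syy() const { return sumYY_ - sumY_ * sumY_ / n_; }
    double sxy() const { return sumXY_ - sumX_ * sumY_ / n_; }

    long n_;
    double sumX_, sumY_, sumXX_, sumYY_, sumXY_;
};

// Print the coefficients in the form y = a + bx (the format is an assumption).
std::ostream& operator<<(std::ostream& out, const SketchRegression& lr)
{
    return out << "y = " << lr.intercept() << " + " << lr.slope() << "x";
}
```

Because only the sums are stored, every statistic can be recomputed after each addXY() call without revisiting earlier points, and an object of this kind can be written to any output stream, for example with std::cout << lr.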

One of the main reasons to do curve fitting is so you can estimate the values of points you have not measured. The LinearRegression class allows this through the use of the estimateY() member function.
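
As a rough illustration of what such an estimate involves, the sketch below evaluates the fitted line at a new x and reports whether that x lies outside the measured range. The struct FittedLine and the function isExtrapolation() are hypothetical, and the free function estimateY() here only borrows the name of the member function; the signature of the real one is not shown on this page.

```cpp
#include <cassert>

// Hypothetical record of a fitted line y = a + b*x and the x range of the
// data it was fitted to. Not part of linreg.h.
struct FittedLine {
    double a;     // y intercept
    double b;     // slope
    double minX;  // smallest measured x
    double maxX;  // largest measured x
};

// Estimate y at an x that was not measured.
double estimateY(const FittedLine& line, double x)
{
    return line.a + line.b * x;
}

// True when x lies outside the measured range, where the straight-line
// model has not been checked against any data.
bool isExtrapolation(const FittedLine& line, double x)
{
    assert(line.minX <= line.maxX);  // the measured range must be well formed
    return x < line.minX || x > line.maxX;
}
```

Estimates between measured points are usually more trustworthy than estimates beyond them, which is why the range check may be worth keeping next to the estimate.
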
The following listing is a test program utilizing the LinearRegression class:

/*
    Test driver to test linreg.h linear regression class
*/
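#include <iostream>   // standard headers; the pre-standard <iostream.h> forms are obsolete
#include <iomanip>
#include "linreg.h"

using std::cout;
using std::endl;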

double x[] = { 71,  73,  64,  65,  61,  70,  65,  72,  63,  67,  64};
double y[] = {160, 183, 154, 168, 159, 180, 145, 210, 132, 168, 141};

Point2D p[] = { Point2D(71, 160), Point2D(73, 183), Point2D(64, 154),
                Point2D(65, 168), Point2D(61, 159), Point2D(70, 180),
                Point2D(65, 145), Point2D(72, 210), Point2D(63, 132),
                Point2D(67, 168), Point2D(64, 141)};

int main()
{
    cout << "Linear Regression Test\n" << endl;

    LinearRegression lr(x, y, 11);  // create with two arrays
    cout << "Number of x,y pairs = " << lr.items() << endl;
    cout << lr << endl;
    cout << "Coefficient of Determination = "
         << lr.getCoefDeterm() << endl;
    cout << "Coefficient of Correlation = "
         << lr.getCoefCorrel() << endl;
    cout << "Standard Error of Estimate = "
         << lr.getStdErrorEst() << endl;

    cout << "\nLinear Regression Test Part 2 (using Point2Ds)\n" << endl;

    LinearRegression lr2(p, 11);  // create with array of points
    cout << "Number of x,y pairs = " << lr2.items() << endl;
    cout << lr2 << endl;
    cout << "Coefficient of Determination = "
         << lr2.getCoefDeterm() << endl;
    cout << "Coefficient of Correlation = "
         << lr2.getCoefCorrel() << endl;
    cout << "Standard Error of Estimate = "
         << lr2.getStdErrorEst() << endl;

    cout << "\nLinear Regression Test Part 3 (empty instance)\n" << endl;

    LinearRegression lr3;   // empty instance of linear regression

    for (int i = 0; i < 11; i++)
        lr3.addPoint(p[i]);

    cout << "Number of x,y pairs = " << lr3.items() << endl;
    cout << lr3 << endl;
    cout << "Coefficient of Determination = "
         << lr3.getCoefDeterm() << endl;
    cout << "Coefficient of Correlation = "
         << lr3.getCoefCorrel() << endl;
    cout << "Standard Error of Estimate = "
         << lr3.getStdErrorEst() << endl;
}

The above code should give you a feel for the use of the LinearRegression class.

This class is also useful for fitting other families of curves. By using inheritance you can easily make this class the "engine" for fitting data to exponential and geometric curves. In fact, I have derived classes from LinearRegression which fit data to exponential and geometric curves. I hope to have web pages on these in the future.
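
The derived classes mentioned here are not shown on this page, so the following is only a sketch of one common way to do it. An exponential curve y = A exp(Bx) becomes the straight line ln y = ln A + Bx after taking logarithms, so a derived class can transform each y before handing it to the linear fit. The names SketchLinearFit, SketchExponentialFit, coefA(), and coefB() are hypothetical.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical minimal base class standing in for LinearRegression.
class SketchLinearFit {
public:
    SketchLinearFit() : n_(0), sumX_(0.0), sumY_(0.0), sumXX_(0.0), sumXY_(0.0) {}
    virtual ~SketchLinearFit() {}

    // Add one observation by updating the running sums.
    virtual void addXY(double x, double y)
    {
        ++n_;
        sumX_ += x;  sumY_ += y;  sumXX_ += x * x;  sumXY_ += x * y;
    }

    // Slope and intercept of the least-squares line through the stored sums.
    double slope() const
    {
        assert(n_ >= 2);
        return (n_ * sumXY_ - sumX_ * sumY_) / (n_ * sumXX_ - sumX_ * sumX_);
    }
    double intercept() const { return (sumY_ - slope() * sumX_) / n_; }

private:
    long n_;
    double sumX_, sumY_, sumXX_, sumXY_;
};

// Fits y = A * exp(B*x) by running the linear fit on the pairs (x, ln y).
class SketchExponentialFit : public SketchLinearFit {
public:
    void addXY(double x, double y)
    {
        assert(y > 0.0);  // ln y is undefined for y <= 0
        SketchLinearFit::addXY(x, std::log(y));
    }
    double coefA() const { return std::exp(intercept()); }  // A = e^a
    double coefB() const { return slope(); }                // B = b
    double estimateY(double x) const { return coefA() * std::exp(coefB() * x); }
};
```

Note that this approach minimizes the squared error in ln y rather than in y itself, so it weights small y values more heavily than a direct nonlinear fit would. A geometric (power) curve y = A x^B can be handled the same way by taking the logarithm of both x and y.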

Who do I blame for this Web Page?


This web page was written by David C. Swaim II, who is solely responsible for its content. I hope to have the source code for this and other classes available by FTP soon. In the meantime you can get the code by emailing me at David@Swaim.com.

Copyright © 1997 by David C. Swaim II, all rights reserved.