Regression Analysis with
Diagnostic Tools for Predictions





This site is part of the JavaScript E-labs learning objects for decision making. Other JavaScript tools in this series are categorized under different areas of application in the MENU section on this page.

Professor Hossein Arsham   


Regression models rest on certain conditions that must be verified if the model is to fit the data well and predict accurately. This site provides the diagnostic tools needed for that verification and for choosing the right remedies, such as data transformation.

Before using this JavaScript, construct the scatter-diagram of your data. If visual inspection of the scatter-diagram gives you no reason to reject the "linearity condition", then you may use this JavaScript.

Enter up to 84 sample data pairs (X, Y), and then click the Calculate button. Blank boxes are not included in the calculations, but zeros are.
To perform serial-residual analysis, you must enter the values of the independent variable X in increasing order.
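The data-entry rules above can be sketched in a few lines of Python. This is only an illustration, not the page's own code: the function name `clean_pairs` is hypothetical, dropping a pair when either of its boxes is blank is an assumption, and sorting by X is a convenience added here (the page itself asks you to enter X in increasing order).

```python
def clean_pairs(raw_pairs):
    """Drop pairs with a blank entry, keep zeros, and sort the rest by X."""
    pairs = []
    for x_text, y_text in raw_pairs:
        # Assumption: a blank box (empty or whitespace only) excludes the pair.
        if x_text.strip() == "" or y_text.strip() == "":
            continue
        # "0" is a legitimate value and is kept.
        pairs.append((float(x_text), float(y_text)))
    # Serial-residual analysis expects X in increasing order.
    pairs.sort(key=lambda pair: pair[0])
    return pairs


raw = [("3", "7.1"), ("", ""), ("0", "1.9"), ("1", "4.2")]
cleaned = clean_pairs(raw)
print(cleaned)
```

Here the blank pair is skipped, the pair with X = 0 is retained, and the remaining pairs come back ordered by X.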

Notice: When entering your data, use the Tab key to move from cell to cell in the data matrix, not the arrow or Enter keys.

Predictions by Regression: A confidence interval provides a useful way of assessing the quality of a prediction. In prediction by regression, one or more of the following constructions are often of interest:

  1. A confidence interval for a single future value of Y corresponding to a chosen value of X.
  2. A confidence interval for a single point on the line.
  3. A confidence region for the line as a whole.

Confidence Interval Estimate for a Future Value: This confidence interval is used to evaluate the accuracy of a single (future) value of Y corresponding to a chosen value of X (say, X0). This JavaScript provides a confidence interval for an estimated value of Y corresponding to X0 at a desired confidence level 1 - α.

Confidence Interval Estimate for a Single Point on the Line: If a particular value of the predictor variable (say, X0) is of special importance, a confidence interval on the value of the criterion variable (i.e., the average Y at X0) corresponding to X0 may be of interest. This JavaScript provides a confidence interval on the estimated value of Y corresponding to X0 at a desired confidence level 1 - α.

It is of interest to compare these two kinds of confidence interval. The first kind is wider, reflecting the lower accuracy of estimating a single future value of Y rather than the mean value of Y estimated by the second kind. The second kind of confidence interval can also be used to identify any outliers in the data.
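The comparison between the two intervals can be made concrete with a minimal sketch using the standard textbook formulas for simple linear regression. The page does not show its own computation, so the function names `fit_line` and `intervals_at`, the sample data, and the choice to pass the Student-t critical value in as `t_crit` (read from a t table with n - 2 degrees of freedom) are all assumptions made for illustration.

```python
import math


def fit_line(xs, ys):
    """Least-squares fit; returns slope, intercept, residual std. error, mean of X, Sxx."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    s = math.sqrt(sse / (n - 2))  # residual standard error, n - 2 degrees of freedom
    return slope, intercept, s, x_bar, sxx


def intervals_at(x0, xs, ys, t_crit):
    """Intervals at x0 for the mean of Y and for a single future Y."""
    n = len(xs)
    slope, intercept, s, x_bar, sxx = fit_line(xs, ys)
    y_hat = intercept + slope * x0
    spread = 1.0 / n + (x0 - x_bar) ** 2 / sxx
    half_mean = t_crit * s * math.sqrt(spread)          # point on the line
    half_future = t_crit * s * math.sqrt(1.0 + spread)  # single future value
    return {
        "mean": (y_hat - half_mean, y_hat + half_mean),
        "future": (y_hat - half_future, y_hat + half_future),
    }


# Hypothetical data; 3.182 is the two-sided 95% t value for 3 degrees of freedom.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
result = intervals_at(3.0, xs, ys, t_crit=3.182)
print(result)
```

The extra `1.0 +` under the square root is the only difference between the two half-widths, which is why the interval for a single future value is always the wider of the two.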

Confidence Region for the Regression Line as a Whole: When the entire line is of interest, a confidence region permits one to make simultaneous confidence statements about estimates of Y for a number of values of the predictor variable X. For the region to cover the range of interest of X adequately, the data set usually must contain more than 10 pairs of observations.
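The page does not state which construction the JavaScript uses for this region. One standard choice, given here only as an assumption, is the Working-Hotelling band. With n data pairs, fitted value \hat{y}(x), residual standard error s, sample mean \bar{x}, and S_{xx} = \sum_i (x_i - \bar{x})^2, the band at confidence level 1 - \alpha is:

```latex
\hat{y}(x) \;\pm\; \sqrt{2\,F_{\alpha;\,2,\,n-2}}\;\, s\,
\sqrt{\frac{1}{n} + \frac{(x - \bar{x})^{2}}{S_{xx}}}
```

Here F_{\alpha; 2, n-2} denotes the upper-\alpha critical value of the F distribution with 2 and n - 2 degrees of freedom. The band has the same shape as the interval for a single point on the line, but its multiplier is larger than the corresponding t critical value, because it must hold for all values of X at once.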

In all cases the JavaScript provides the results for the nominal data. For other values of X, one may use computational methods directly, a graphical method, or linear interpolation to obtain approximate results. These approximations err in the safe direction, i.e., they are slightly wider than the exact values.

Linear Interpolation: To estimate the lower (and upper) limits at a given value X, one may interpolate linearly between the two known points neighboring X, say XL and XU, as follows:

The approximate lower limit at X is:

LL(XL) + [ LL(XU) – LL(XL) ] × [ X – XL ] / [ XU – XL ]

Similarly the upper limit at X is:

UL(XL) + [ UL(XU) – UL(XL) ] × [ X – XL ] / [ XU – XL ]

The resulting approximation is conservative; therefore it is on the safe side.
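The interpolation formulas above translate directly into code. The sketch below is illustrative only: the function name `interpolate_limit` and the numeric limits in the example are hypothetical, and the same function serves for both the lower and the upper limit.

```python
def interpolate_limit(x, x_l, x_u, limit_l, limit_u):
    """Linearly interpolate a confidence limit at x between known points x_l and x_u.

    limit_l and limit_u are the limits (both lower or both upper)
    already known at x_l and x_u.
    """
    if x_l == x_u or not (x_l <= x <= x_u):
        raise ValueError("x must lie between two distinct neighboring points")
    return limit_l + (limit_u - limit_l) * (x - x_l) / (x_u - x_l)


# Hypothetical lower limits: LL(2) = 3.0 and LL(4) = 7.0; estimate LL(3).
lower_at_3 = interpolate_limit(3.0, 2.0, 4.0, 3.0, 7.0)
print(lower_at_3)  # 5.0
```

The guard clause rejects extrapolation, since the conservative property described above is only claimed for values of X between the two known neighbors.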



 
[Data-entry grid: 84 numbered entries, each with a Variable X box and a Variable Y box.]
 
Enter a Confidence Level:
Mean(X), Mean(Y)
Variance(X), Variance(Y)
Slope and its standard error
Intercept and its standard error
Correlation and its standard error
F-statistic and its P-value
Linearity Condition:
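Most of the summary quantities listed in the output panel above can be reproduced with standard formulas for simple linear regression. The sketch below is an assumption about what is computed, not the page's own code: the function name `regression_summary` is hypothetical, variances use the n - 1 divisor, the standard error of the correlation uses one common textbook formula, and the P-value is omitted because the Python standard library has no F distribution.

```python
import math


def regression_summary(xs, ys):
    """Summary statistics for a simple linear regression of Y on X."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    mse = sse / (n - 2)  # residual mean square
    r = sxy / math.sqrt(sxx * syy)
    return {
        "mean_x": x_bar,
        "mean_y": y_bar,
        "var_x": sxx / (n - 1),  # assumption: sample variance
        "var_y": syy / (n - 1),
        "slope": slope,
        "se_slope": math.sqrt(mse / sxx),
        "intercept": intercept,
        "se_intercept": math.sqrt(mse * (1.0 / n + x_bar ** 2 / sxx)),
        "correlation": r,
        "se_correlation": math.sqrt((1.0 - r ** 2) / (n - 2)),  # assumption
        "f_stat": (syy - sse) / mse,  # regression mean square / residual mean square
    }


# Hypothetical data for illustration.
summary = regression_summary([1.0, 2.0, 3.0, 4.0, 5.0], [2.1, 3.9, 6.2, 7.8, 10.1])
for name, value in summary.items():
    print(f"{name}: {value:.4f}")
```

With a single predictor, the F-statistic equals the square of the slope divided by its standard error, which gives a quick consistency check on the output.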