APPLICATION OF PRINCIPAL COMPONENT
ANALYSIS TO DECISION
SUPPORT SYSTEM
By
SHAOKAIWEN
Master of Science
Beijing Agricultural University
Beijing, China
1988
Doctor of Philosophy
Oklahoma State University
Stillwater, Oklahoma
1996
Submitted to the Faculty of the
Graduate College of the
Oklahoma State University
in partial fulfillment of
the requirements for
the Degree of
MASTER OF SCIENCE
July, 1997
OKLAHOMA STATE UNIVERSITY
APPLICATION OF PRINCIPAL COMPONENT
ANALYSIS TO DECISION
SUPPORT SYSTEM
Thesis Approved:
Dean of the Graduate College
ACKNOWLEDGMENTS
I would like to thank Dr. John P. Chandler for awarding me an assistantship and
giving me the opportunity to pursue my graduate studies in the Computer Science
Department at Oklahoma State University. Thanks are also extended to my major advisor,
Dr. K. M. George, for his guidance, advice, assistance and encouragement throughout my
master's program. Appreciation is likewise expressed to Dr. William D. Warde for his
encouragement and help throughout my graduate studies and for his service on my graduate
committee. Thanks are also extended to Dr. Dave S. Buchanan for his encouragement
and for letting me take many computer courses while studying for my Ph.D. in
Animal Science, and to Dr. Blayne E. Mayfield for consulting with me while I developed the
package.
My thanks go to all of my fellow graduate students, who enriched
my education at OSU.
My appreciation also extends to Mrs. Anna Ventris for preparing so much paperwork
for me during my studies in the Computer Science Department.
To my parents, Xibin Wen and Qinxiu Jiang, I humbly acknowledge their love,
understanding and support. Finally, my greatest thanks go to my lovely wife, Rui Zhang,
for her love, encouragement, understanding, support and many sacrifices she made so that
I could complete my education.
TABLE OF CONTENTS
Chapter
I. INTRODUCTION
II. REVIEW OF THE LITERATURE
    Principal Component Analysis and Eigenvalues of Covariance Matrix
    Correlation Matrix vs. Covariance Matrix
    Application of Principal Component Analysis
    Computation of Eigenvalue and Eigenvector
        Power Method
        Hessenberg and Inverse Power Method
            Bisection Method for Symmetric Hessenberg
            QR Algorithm for Nonsymmetric Hessenberg
        Inverse Power Method
    Sensitivity Analysis of the Estimates and Decision System
        Roundoff Error
        Influence of Outliers and Influential Observations on Estimates
        Sampling Error
        Sensitivity of Decision Systems
III. DESIGN AND IMPLEMENTATION
    Classes
        RowClass
        TableClass
        Matrix
        SquareMatrix
        Sensitivity
    Software Architecture
    Abstract Level Algorithm
    Key Algorithms
IV. RESULTS AND DISCUSSION
V. CONCLUSION AND FUTURE WORK
REFERENCES
LIST OF TABLES
Table
1. Principal components based on the covariance matrix for five variables
2. Principal components based on the covariance matrix for five variables after the
   units of cost changed
3. Principal components based on the correlation matrix for five variables
LIST OF FIGURES
Figure
1. The simple view of PCA
2. The relationship among the classes in the project
3. Abstract level control flow
4. The window before loading data
5. The file open window
6. Data loading window
7. The window before calculating correlation
8. Correlation coefficient window
9. The window before estimating eigenvalue and eigenvector
10. Eigenvalue and eigenvector window
11. The window before calculating rank value
12. The data and rank window
13. The window before calculating rank interval
14. Weight choose window
15. Data and rank interval window
16. Data sort dialog and window
17. Sorted data and rank window
18. The window before calculating weight interval
19. Data, rank interval and weight interval window
CHAPTER I
Introduction
Decision support systems are software systems used to develop insight into
system behavior and to help managers make effective plans and decisions. Simulation
and modeling are the basic tools used to simplify the problem, abstract
system behavior, state and explore the relationships among the components of the system,
understand the system's essence and behavior, predict results, and use this knowledge to help
the decision maker make high-quality decisions. One type of decision support system
addresses the problem of selecting one choice from many alternatives [George, 1996]. In other
words, the problem is to evaluate and rank a finite number of alternatives with respect to
a finite number of criteria. Rank computation depends on the values of the criteria
variables and on their weights, which directly determine the influence of each variable.
How to weight each criterion, and how the weights influence the preference among the
alternatives, is a very important part of decision research. Much research has been done in
this area, but most of it is subjective. The best weights should reflect the
information in the data set and the system behavior.
Principal component analysis (PCA) can reduce the dimensionality of a data set
and simplify interrelated variables while retaining most of the information present
in the data set. Much research has indicated that principal component analysis has an
intuitively satisfying interpretation and has illustrated its application in areas where
judgments are not easy to come by [Ahamad, 1967; Bailey, 1956; Cahalan, 1983; Chang,
1988; Cochran and Horne, 1977; Dawkins, 1989; Jolicoeur, 1959; Jolicoeur and
Mosimann, 1960; Kloek and Mennes, 1960; Lee and Chang, 1976; Rao, 1964; Sloan,
1983; Wold, 1976]. Dawkins [Dawkins, 1989] used the first principal component of the
national track records to rank world track
performance. However, principal components are influenced by roundoff error, sample data
variation and sampling error. How does the rank value change when the weights are changed,
and what are the intervals of the weights within which the final ranking of the
alternatives does not change? The objective of this research is to explore the application
of PCA in decision support systems; to investigate the model behavior under small
changes in its assumptions and parameters; to understand the key variables and the
relationships that most affect the model solutions and corresponding decisions; and to
validate the model and find better and more robust solutions for some particular problems. A
decision support system is implemented as part of this research. It is implemented using
the MS Visual C++ programming language under the MS Windows 95 environment. The
system provides a graphical user interface (GUI) to view results.
The remainder of this thesis is organized as follows. The computation of PCA, the
applications of PCA, and the sensitivity analysis of decision systems are reviewed in Chapter
2. The design and implementation of the system, including the process,
classes, architecture and key algorithms, are explained in Chapter 3. The results
and the interface are shown in Chapter 4. Chapter 5 gives the conclusion and some
directions for future work.
CHAPTER II
Literature Review
As mentioned in the previous chapter, decision support systems are software systems
used to develop insight into system behavior, which in turn helps people make
effective plans and decisions. Generally, mathematical models are used to simplify the
problem, describe the essence of the problem, and state the relationships among decision
variables, intermediate variables and outcomes. In large, interrelated data information
systems, reducing the dimension and simplifying the interrelated variables is the first
and also the most important step in interpreting the data, and thus in helping people make
the right decisions. In this literature review, the author explains the theory of principal
component analysis and its application in decision support systems.
Principal Component Analysis and Eigenvalues of Covariance Matrix
The central idea of principal component analysis (PCA) is to reduce the
dimension of a data set which consists of a large number of interrelated variables, while
retaining as much as possible of the information present in the data set [Jolliffe, 1986].
Consider Figure 1, which shows 15 observations on two highly correlated variables, X
and Y. There is considerable variation in both variables, though rather more in the
direction of X than Y. If these observations are expressed in another coordinate
system, that is, if the points are projected onto D1 and D2, then we find that the variation in
D1 is increased and the variation in D2 is decreased. If the observations differ in X and
Y, then they differ in D1, but maybe not in D2. Also, if the observations
differ in D1, then they differ in X and Y. D1 is also an important direction
because if two points are close on D1, then it is likely that they were close before they were
projected onto D1. This is not the case for D2: many points which are close on D2 may
originally be quite far apart. So the projections of the points onto D1 are good representations
of the observations, because they retain most of the information in the
original data set. Why is D1 better than D2 in expressing the data? Because D1
preserves the variation of the data.

Figure 1. The simple view of PCA

In general, if the original data are expressed as
vectors in a p-dimensional space, then the transformed data are vectors in a subspace of
the p-dimensional space. If x is a vector of variables, then d1 (the projections of the
observations on D1) can be expressed as a linear combination of the components of x, as
shown in equation (1) below:

$$d_1 = \alpha_1^T x \qquad (1)$$

$$\mathrm{Var}(d_1) = \alpha_1^T \Sigma \alpha_1 \qquad (2)$$

where
$d_1$ is a scalar (the value along one direction of the transformed coordinate system),
$\alpha_1^T$ is a 1 x p vector, and
$\Sigma$ is the p x p population covariance matrix of x.
The objective of PCA is to find the direction such that, after the points are
projected onto it, the variance of the projected points is maximized. In other words,
$\alpha_1^T \Sigma \alpha_1$ is maximized. This maximum will not be achieved for finite $\alpha_1$, so a
normalization constraint must be imposed on $\alpha_1$. The most convenient constraint here is
$\alpha_1^T \alpha_1 = 1$.
To maximize $\alpha_1^T \Sigma \alpha_1$ subject to $\alpha_1^T \alpha_1 = 1$, use the technique of Lagrange
multipliers [Jolliffe, 1986] and maximize

$$\alpha_1^T \Sigma \alpha_1 - \lambda(\alpha_1^T \alpha_1 - 1) \qquad (3)$$

where $\lambda$ is a Lagrange multiplier. Differentiation with respect to $\alpha_1$ gives

$$\Sigma \alpha_1 - \lambda \alpha_1 = 0, \quad \text{or} \quad (\Sigma - \lambda I_p)\alpha_1 = 0 \qquad (4)$$

where 0 is the p x 1 vector with every element equal to 0, and $I_p$ is the p x p identity
matrix. So $\lambda$ is an eigenvalue of $\Sigma$ and $\alpha_1$ is the corresponding eigenvector. Note that
the quantity to be maximized is

$$\alpha_1^T \Sigma \alpha_1 = \alpha_1^T \lambda \alpha_1 = \lambda \alpha_1^T \alpha_1 = \lambda \qquad (5)$$

so $\lambda$ must be as large as possible. Thus, $\alpha_1$ is the eigenvector corresponding to the largest
eigenvalue of $\Sigma$, and $\mathrm{Var}(\alpha_1^T x) = \alpha_1^T \mathrm{Var}(x)\,\alpha_1 = \alpha_1^T \Sigma \alpha_1 = \lambda_1$, the largest eigenvalue.
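As a small worked illustration of the result above, consider a 2 x 2 covariance matrix. The matrix is a made-up example chosen for easy arithmetic; it is not data from this thesis.

$$\Sigma = \begin{bmatrix} 2 & 1 \\ 1 & 2 \end{bmatrix}, \qquad \det(\Sigma - \lambda I_2) = (2 - \lambda)^2 - 1 = 0 \;\Rightarrow\; \lambda_1 = 3,\ \lambda_2 = 1.$$

Solving $(\Sigma - 3 I_2)\alpha_1 = 0$ under the constraint $\alpha_1^T \alpha_1 = 1$ gives

$$\alpha_1 = \frac{1}{\sqrt{2}} \begin{bmatrix} 1 \\ 1 \end{bmatrix}, \qquad \alpha_1^T \Sigma \alpha_1 = \frac{1}{2}(2 + 1 + 1 + 2) = 3 = \lambda_1.$$

So, for this example, the first principal component is the normalized sum of the two variables, and its variance equals the largest eigenvalue.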
A similar theory applies to samples. For example, if n observations are
collected from a population of p-dimensional random variables, let X represent the n
observations of the p-dimensional random variables; then the projections of the n points onto the
direction $\alpha_1$ are

$$D_1 = X \alpha_1 \qquad (6)$$

where
$D_1$ is the n x 1 vector of the projections of X onto D1,
X is the n x p matrix which represents the original sample data, and
$\alpha_1$ is the projection direction.

(7)

So the variance of $D_1$ is

$$\mathrm{Var}(D_1) = \alpha_1^T \Sigma \alpha_1 \qquad (8)$$

where $\Sigma$ is the p x p covariance matrix of the sample X. Formula (8) has the same form as
formula (2). Therefore, the procedure for the population covariance matrix can be used to derive the PCs
for the sample covariance matrix. From this we can see that the essence of PCA is
actually to estimate the eigenvalues and eigenvectors of the covariance matrix.
$\Sigma$ is a p x p symmetric matrix, so there exists a p x p orthogonal matrix P such
that $P^T \Sigma P = D$, where D is a diagonal matrix whose diagonal elements are the eigenvalues
of $\Sigma$ and the columns of P are the normalized eigenvectors of $\Sigma$. The i-th column of P
corresponds to the i-th PC, with variance equal to the i-th diagonal element of D, for i = 1, ..., p
[Moser, 1996]. The variance of x (the original data) is the trace of $\Sigma$, and

$$\mathrm{tr}(\Sigma) = \mathrm{tr}(\Sigma I) = \mathrm{tr}(\Sigma P P^T) = \mathrm{tr}(P^T \Sigma P) = \mathrm{tr}(D) = \sum_{i=1}^{p} \lambda_i$$

so the variance of the original data is equal to the sum of the eigenvalues. For any integer
q ($1 \le q \le p$), if the eigenvectors of the q largest eigenvalues are used as the linear
transformation matrix, then the variance of the transformed variables is maximized.
So the task of PCA is to find the q largest eigenvalues and their corresponding
eigenvectors.
In general, let X be a set of points, $\Sigma$ be the covariance matrix of X, the rank of $\Sigma$
be p, and $D_1, D_2, \ldots, D_p$ be the eigenvectors of $\Sigma$ corresponding to eigenvalues $\lambda_1, \lambda_2, \ldots, \lambda_p$,
where $\lambda_1 \ge \lambda_2 \ge \ldots \ge \lambda_p$. Let $\|D_i\| = 1$ for all i. The properties of the $D_i$'s are summarized
as follows:
1) All $D_i$'s are mutually orthonormal. That is, $D_i^T D_j = 0$ for $i \ne j$ and $D_i^T D_i = 1$.
2) If the set X of points is projected onto $D_i$, then the variance of the projected points
on $D_i$ is $\lambda_i$.
3) Among all possible directions, $D_1$ is the direction which will produce the largest
variance by projecting points onto it. $D_2$ is the direction in the space perpendicular to
$D_1$ which will produce the second largest variance by projecting points onto it. In general, $D_i$
($1 \le i \le p$) is the direction in the space perpendicular to $D_1, \ldots, D_{i-1}$ which will produce
the i-th largest variance by projecting points onto it. Because $q \le p$ and $D_i$ and $D_j$, $i \ne j$,
are perpendicular to each other, the PCA objectives, reducing the dimension and
simplifying the data, are attained.
Correlation Matrix vs. Covariance Matrix
In practice, it often occurs that different elements of the original data set are
measured in completely different types of units, which in turn results in widely different
variances among the variables. For example, in the NSN data [George, 1996], the standard
deviations of MANHR, FAILURES, COST, CANN and MIC_HRS are 1315.3927,
70.0615, 47539.789, 17.3814 and 7347.9464, respectively. The principal components
based on the covariance matrix are given in Table 1.
Table 1. Principal Components Based on the Covariance Matrix for Five Variables

Component Number        1         2         3         4         5
MANHR (X1)         0.0275    0.0007    0.9976    0.0570    0.0274
FAILURES (X2)      0.0012    0.0001    0.0539    0.9934    0.1009
COST (X3)          0.9995    0.0159    0.0275    0.0003    0.0006
CANN (X4)          0.0003    0.0002    0.0329    0.0993    0.9945
MIC_HRS (X5)       0.0159    0.9999    0.0011    0.0002    0.0002
Eigenvalue         2.26E9    5.47E7    1.69E4   1614.75   71.7292

The first component is a slight perturbation of the single variable COST, which
has the largest standard deviation; the second component is almost the same as the
variable MIC_HRS, which has the second highest standard deviation; the third component is
also almost the same as the variable MANHR, which has the third highest standard deviation;
and so on. Also, the eigenvalues of the components almost equal the variances of the
corresponding variables. The variance of COST is 2.26E9, and the first eigenvalue is also
2.26E9. Thus the five components for the covariance matrix tell us almost nothing
apart from the order of sizes of the variances of the original variables. Moreover, even within the
same data set, if the units of COST are changed to
thousands of dollars, then its variance changes to 2.26E3, and the PCs change
correspondingly (Table 2). So one drawback of PCA based on
the covariance matrix is the sensitivity of the PCs to the units of measurement used for
each element [Jolliffe, 1986]. Another drawback of the covariance matrix is that, because
the variances differ so widely, the covariances among the variables are relatively small,
so information in the covariances can be lost to roundoff error in the
inherently inexact arithmetic of the computer. Weighted covariance matrices are
used to eliminate this shortcoming. Most of the time standardized variables (the original
data divided by the standard deviation of each variable) are used. The covariance matrix
of the standardized variables is then the correlation matrix of the original variables. The
principal components for the NSN data set using the correlation matrix are listed in Table 3.
The first component has moderate-sized coefficients for four of the five variables. The
other components, except for the second, also have moderate-sized coefficients for several
variables. The eigenvalue of the first PC for the correlation matrix shows that a
nontrivial linear function of the standardized variables accounts for 71% of the total
variation in the standardized variables, which is less than the 94% of the total variation
in the original variables accounted for by the first PC of the covariance matrix.

Table 2. Principal Components Based on the Covariance Matrix for Five
Variables after the Units of Cost Changed

Component Number        1         2         3         4         5
MANHR (X1)         0.0191    0.9981    0.0445    0.0015    0.0365
FAILURES (X2)      0.0006    0.0438    0.9937    0.1010    0.0189
COST (X3)          0.0007    0.0359    0.0043    0.2097    0.9771
CANN (X4)          0.0004    0.0107    0.1023    0.9725    0.2088
MIC_HRS (X5)       0.9982    0.0192    0.0002    0.0002    0.0000
Eigenvalue         5.53E7    1.72E6   1617.15     82.04     19.29

Table 3. Principal Components Based on the Correlation Matrix for Five Variables

Component Number        1         2         3         4         5
MANHR (X1)         0.5142    0.0546    0.4582    0.0438    0.7217
FAILURES (X2)      0.4831    0.0880    0.4989    0.7141    0.0092
COST (X3)          0.5106    0.0609    0.5085    0.0113    0.6906
CANN (X4)          0.4841    0.0357    0.5306    0.6932   0.04732
MIC_HRS (X5)       0.0851    0.9921    0.0313    0.0866    0.0002
Eigenvalue         3.5511    0.9881    0.2652    0.1912    0.0042

All the properties derived for the covariance matrix remain valid for the correlation matrix,
except that we are now considering PCs of the standardized variables instead of the
original variables [Jolliffe, 1988]. Although the PCs for the correlation matrix are derived from
the standardized variables, the eigenvalues and eigenvectors of the correlation matrix have
no simple relationship with those of the corresponding covariance matrix. The PCs for the covariance
and correlation matrices do not give equivalent information, nor can they be derived directly from
each other [Jolliffe, 1986]. If the units of all the variables are the same, the covariance matrix is
preferred for PCs.
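The unit sensitivity of the covariance matrix, and the fact that standardizing the variables leads to the correlation matrix, can be checked numerically. The following is a minimal Python sketch, not the thesis implementation (which was written in Visual C++). The two-column data set and all function names are hypothetical and chosen only for illustration.

```python
import math

def covariance_matrix(rows):
    """Sample covariance matrix (divisor n - 1) of a list of observations."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    return [[sum((r[i] - means[i]) * (r[j] - means[j]) for r in rows) / (n - 1)
             for j in range(p)] for i in range(p)]

def standard_deviations(rows):
    cov = covariance_matrix(rows)
    return [math.sqrt(cov[j][j]) for j in range(len(cov))]

def correlation_matrix(rows):
    cov = covariance_matrix(rows)
    sd = standard_deviations(rows)
    p = len(cov)
    return [[cov[i][j] / (sd[i] * sd[j]) for j in range(p)] for i in range(p)]

def standardize(rows):
    """Divide each variable by its standard deviation, as described in the text."""
    sd = standard_deviations(rows)
    return [[r[j] / sd[j] for j in range(len(sd))] for r in rows]

# Hypothetical observations: (man-hours, cost in dollars).
data = [[1200.0, 45000.0], [1500.0, 52000.0], [900.0, 30000.0], [1700.0, 61000.0]]
# The same observations with cost expressed in thousands of dollars.
rescaled = [[r[0], r[1] / 1000.0] for r in data]

cov_dollars = covariance_matrix(data)
cov_thousands = covariance_matrix(rescaled)
corr_dollars = correlation_matrix(data)
corr_thousands = correlation_matrix(rescaled)
cov_of_standardized = covariance_matrix(standardize(data))

print("variance of cost (dollars):   ", cov_dollars[1][1])
print("variance of cost (thousands): ", cov_thousands[1][1])
print("correlation (dollars):        ", corr_dollars[0][1])
print("correlation (thousands):      ", corr_thousands[0][1])
```

In this sketch the variance of the cost column shrinks by a factor of one million when the units change, while the correlation matrix is unchanged and coincides with the covariance matrix of the standardized data.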
Application of Principal Component Analysis
The beginnings of principal component analysis are probably to be found in the
work of Karl Pearson in 1901 [Johnson and Wichern, 1982]. The statistical properties of
principal components were investigated in detail by Hotelling in 1933 [Jolliffe, 1986].
Many researchers [Anderson, 1984; Jolliffe, 1986] have given comprehensive
expositions. Since then, PCA has been applied in agriculture, biology, chemistry,
climatology, demography, ecology, economics, food research, geology, psychology,
quality control and other areas [Ahamad, 1967; Bailey, 1956; Cahalan, 1983; Chang,
1988; Cochran and Horne, 1977; Dawkins, 1989; Jolicoeur, 1959; Jolicoeur and
Mosimann, 1960; Kloek and Mennes, 1960; Lee and Chang, 1976; Rao, 1964; Sloan,
1983; Wold, 1976]. In the following paragraphs, the author reviews some typical
applications of PCA.
Principal component analysis combined with factor analysis has been used to
interpret data in biology, economics and other areas [Johnson and Wichern, 1982;
Jolliffe, 1986]. In biology, the growth of animals is determined by two different
unobservable factors: genetic and environmental. Bailey [Bailey, 1956], using principal
component analysis combined with factor analysis, successfully explained the morphogenetic
changes of mice according to observable characters (size and weight). Principal
components have also been used as an intermediate step in discriminant analysis, cluster
analysis and canonical correlation analysis [Duchene and Leclercq, 1988; Jeffers, 1967;
Jolliffe, 1986; Johnson and Wichern, 1982; Lachenbruch, 1975; Sloan, 1983]. In these
applications, principal component analysis is used to reduce the data. Dawkins
[Dawkins, 1989] used principal component analysis to find the first principal component,
which was used to rank world track performance based on the national track records.
The analysis has an intuitively satisfying interpretation and illustrates well the application
of principal component analysis in areas where judgments are not easy to come by.
Principal component analysis has been used in allocating multi-attribute records
to several disks so as to achieve a high degree of concurrency of disk access when
responding to partial-match queries [Chang, 1988]. The first principal component of a
record-query incidence matrix was used to rank the records, and then similar records were
allocated to different disks. It was found that the average response time of retrieval was
less than that for random allocation. This method is well suited to parallel searching.
Principal component analysis has also been used in multi-key searching [Lee and
Chang, 1976]. When the records are in the form of vectors and each key is in numerical
form, principal component analysis can be used to create new keys from a set of old keys.
These new keys were useful in narrowing down the search domain. The first principal
component could be viewed as a hashing address for the best-match searching problem.
Instead of having to read in all the prototypes, one only had to read a few samples,
resulting in a tremendous saving in secondary storage access time.
Computation of Eigenvalue and Eigenvector
Due to differences in the properties of the matrix (symmetric or asymmetric,
sparse or dense), the requirements and flexibility of the calculation, and
hardware and software availability, many methods (power iteration, Hessenberg and QR
methods, the QZ algorithm, the Jacobi method, divide-and-conquer methods and Lanczos
methods) have been developed to compute eigenvalues and eigenvectors [Golub and Van
Loan, 1993; Johnson and Arnold, 1989]. Here the author explains three methods: the
power method, the QR method and the bisection method with Hessenberg form.
1. Power Method
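The easiest method for calculating the largest eigenvalue and corresponding
eigenvector is the power method, which uses an iterative procedure to estimate the dominant
eigenvalue of a matrix. Suppose that A is an (n x n) matrix and A has eigenvalues $\lambda_1, \lambda_2,
\lambda_3, \ldots, \lambda_n$, with corresponding eigenvectors $u_1, u_2, u_3, \ldots, u_n$, so

$$A u_i = \lambda_i u_i, \qquad i = 1, 2, \ldots, n.$$

We assume that $\{u_1, u_2, u_3, \ldots, u_n\}$ is a set of linearly independent vectors. Let us choose
some initial vector $v_0$, where $v_0 \ne \theta$ (the zero vector), and let $v_j = A v_{j-1}$. By our linear independence
assumption we know that $v_0$ can be expressed in the form

$$v_0 = a_1 u_1 + a_2 u_2 + a_3 u_3 + \ldots + a_n u_n.$$

Since $v_1 = A v_0$,

$$v_1 = a_1 A u_1 + a_2 A u_2 + a_3 A u_3 + \ldots + a_n A u_n = a_1 \lambda_1 u_1 + a_2 \lambda_2 u_2 + a_3 \lambda_3 u_3 + \ldots + a_n \lambda_n u_n$$

and, in general,

$$v_k = a_1 \lambda_1^k u_1 + a_2 \lambda_2^k u_2 + \ldots + a_n \lambda_n^k u_n = \lambda_1^k \left[ a_1 u_1 + a_2 (\lambda_2/\lambda_1)^k u_2 + \ldots + a_n (\lambda_n/\lambda_1)^k u_n \right].$$

Now suppose the eigenvalues are ordered so that $|\lambda_1| \ge |\lambda_2| \ge |\lambda_3| \ge \ldots \ge |\lambda_n|$.
If $|\lambda_1| > |\lambda_2|$, then the terms $(\lambda_i/\lambda_1)^k$ are small for large k, where $2 \le i \le n$.
If $a_1 \ne 0$, then

$$v_k \approx \lambda_1^k a_1 u_1.$$

To obtain an estimate of $\lambda_1$, we use two vectors $v_k$ and $v_{k+1}$ calculated iteratively,
where we expect that

$$v_{k+1} \approx \lambda_1 v_k.$$

Now if we form the quotient

$$\beta_k = \frac{w^T v_{k+1}}{w^T v_k}$$

where w is any vector such that $w^T u_1 \ne 0$, then

$$\beta_k = \frac{w^T v_{k+1}}{w^T v_k} \approx \frac{\lambda_1^{k+1} a_1 w^T u_1}{\lambda_1^{k} a_1 w^T u_1} = \lambda_1.$$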
The approximation in the above formula is the essence of the power method.
With respect to the above formula, we note that a reasonable choice for w is the vector
$v_k$ itself. This choice leads to the approximation

$$\beta_k = \frac{v_k^T v_{k+1}}{v_k^T v_k}.$$

It can be shown that if $a_1 \ne 0$ and $|\lambda_1| > |\lambda_2|$, then the limit of $\beta_k$ is

$$\lim_{k \to \infty} \beta_k = \lambda_1.$$

Generally, scaling is used to find the eigenvector and to avoid overflow. The
scaling is shown below:

$$z_{k+1} = A v_k, \qquad v_{k+1} = \frac{z_{k+1}}{\|z_{k+1}\|}.$$

In summary, the power method proceeds as follows:
1. Guess the initial vector $z_0$ and set $v_0 = z_0 / \|z_0\|$.
2. Form the sequence $z_k = A v_{k-1}$, $v_k = z_k / \|z_k\|$, k = 1, 2, ...
3. For each k, calculate $\beta_{k-1} = v_{k-1}^T z_k$.
Then $\beta_k$ converges to the dominant eigenvalue, that is, the eigenvalue largest in absolute value, of the
matrix A.
The power method has several severe restrictions and shortcomings. The choice
of the initial vector $v_0$ must ensure that $a_1 \ne 0$. The method is only suitable when A
has a single dominant eigenvalue, $|\lambda_1| > |\lambda_2|$. Also, the method finds just the dominant
eigenvalue and the corresponding eigenvector. In practice, the usefulness of the power
method depends on the ratio $|\lambda_2| / |\lambda_1|$, since it dictates the rate of convergence. Moreover,
the method is useful mainly in applications where only the dominant eigenvalue and eigenvector are
desired. Note that the only thing required to implement the power method is a subroutine
capable of computing matrix-vector products. It is not necessary to store A in an n-by-n
array. For this reason, the algorithm can be of interest when A is large and sparse and
when there is a sufficient gap between $|\lambda_1|$ and $|\lambda_2|$ [Golub and Van Loan, 1993].
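The three summary steps of the power method can be sketched in a few lines of Python. This is an illustrative sketch only, not the thesis code. The function names, the stopping tolerance and the 2 x 2 test matrix are assumptions introduced here.

```python
import math

def mat_vec(A, v):
    """Matrix-vector product, the only operation the power method needs."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def norm(v):
    return math.sqrt(sum(x * x for x in v))

def power_method(A, z0, tol=1e-12, max_iter=1000):
    """Estimate the dominant eigenvalue and eigenvector of A.

    Follows the scaled iteration z_k = A v_{k-1}, v_k = z_k / ||z_k||,
    with beta_{k-1} = v_{k-1}^T z_k as the eigenvalue estimate.
    """
    v = [x / norm(z0) for x in z0]            # step 1: v_0 = z_0 / ||z_0||
    beta = None
    for _ in range(max_iter):
        z = mat_vec(A, v)                     # step 2: z_k = A v_{k-1}
        beta_new = sum(vi * zi for vi, zi in zip(v, z))   # step 3
        size = norm(z)
        if size == 0.0:
            raise ValueError("A v is the zero vector; choose another start")
        v = [x / size for x in z]
        if beta is not None and abs(beta_new - beta) < tol:
            return beta_new, v
        beta = beta_new
    return beta, v

# Hypothetical example: eigenvalues are 3 and 1, dominant eigenvector (1, 1)/sqrt(2).
A = [[2.0, 1.0], [1.0, 2.0]]
dominant_value, dominant_vector = power_method(A, [1.0, 0.0])
print(dominant_value, dominant_vector)
```

The start vector (1, 0) has a nonzero component along the dominant eigenvector, which is the condition $a_1 \ne 0$ discussed above. The convergence rate in this example is governed by the ratio 1/3.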
2. Hessenberg and Inverse Power Method
Another basic method is to reduce A to Hessenberg form, H, find estimates
of the eigenvalues of A from H, and then apply the inverse power method to the original
matrix A to refine these estimates by iteration.
We know that the eigenvalues are calculated using

$$p(t) = \det(A - tI)$$

The roots of p(t) = 0 are the eigenvalues of A. The polynomial p(t) may be difficult to
calculate. If S is a nonsingular n x n matrix and $B = S^{-1} A S$, then the eigenvalues of B are
given by the roots of

$$\det(B - tI) = \det(S^{-1} A S - tI) = \det(S^{-1} A S - t S^{-1} S)$$
$$= \det[S^{-1}(A - tI)S] = \det(S^{-1}) \det(A - tI) \det(S)$$
$$= \det(S^{-1}) \det(S) \det(A - tI) = \det(A - tI),$$

which are the same as those of A. So A is generally reduced to Hessenberg form, a
simpler form from which to compute the eigenvalues of A. Householder transformations are used to
transform A into Hessenberg form. The Householder matrix is

$$Q = I - \frac{2 u u^T}{u^T u}$$

where u is a nonzero vector in $R^n$ and I is the (n x n) identity matrix.
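The matrix Q is symmetric, and

$$QQ = \left(I - \frac{2uu^T}{u^Tu}\right)\left(I - \frac{2uu^T}{u^Tu}\right) = I - \frac{2uu^T}{u^Tu} - \frac{2uu^T}{u^Tu} + \left(\frac{2uu^T}{u^Tu}\right)\left(\frac{2uu^T}{u^Tu}\right) = I - \frac{4uu^T}{u^Tu} + \frac{4u(u^Tu)u^T}{(u^Tu)^2} = I.$$

So Q is orthogonal, and

$$Qx = x - \gamma u$$

where $\gamma = 2u^Tx / u^Tu$ and x is a nonzero vector.
For any nonzero vector $v = [v_1, v_2, v_3, \ldots, v_n]^T$,
choose

$$u_i = 0 \ \text{for}\ i = 1, 2, \ldots, k-1, \qquad u_k = v_k - s,$$

where

$$s = \pm\left(v_k^2 + v_{k+1}^2 + v_{k+2}^2 + \ldots + v_n^2\right)^{0.5},$$

the sign of s being chosen so that $v_k s \le 0$,
and

$$u_i = v_i \ \text{for}\ i = k+1, k+2, \ldots, n.$$

Then

$$Qv = v - u = [v_1, v_2, \ldots, v_{k-1}, s, 0, 0, \ldots, 0]^T$$

so using this method the matrix A can be transformed to Hessenberg form.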
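The choice of u described above can be checked on a small vector. The Python sketch below is illustrative only. The function names and the test vector are assumptions introduced here, and the position k is 0-based in the code while the text counts from 1.

```python
import math

def householder_vector(v, k):
    """Build u so that Q = I - 2 u u^T / (u^T u) zeroes v below position k.

    k is a 0-based index here (the text uses 1-based indices).
    Returns (u, s) with s = -/+ sqrt(v_k^2 + ... + v_n^2), sign chosen so v_k * s <= 0.
    """
    tail = math.sqrt(sum(x * x for x in v[k:]))
    s = -tail if v[k] > 0 else tail
    u = [0.0] * len(v)
    u[k] = v[k] - s
    for i in range(k + 1, len(v)):
        u[i] = v[i]
    return u, s

def apply_householder(u, x):
    """Compute Q x = x - gamma * u with gamma = 2 u^T x / (u^T u), without forming Q."""
    uu = sum(a * a for a in u)
    if uu == 0.0:
        return list(x)                      # u = 0 means there is nothing to reflect
    gamma = 2.0 * sum(a * b for a, b in zip(u, x)) / uu
    return [xi - gamma * ui for xi, ui in zip(x, u)]

# Hypothetical example: keep the first entry, zero everything below the second.
v = [1.0, 3.0, 4.0]
u, s = householder_vector(v, 1)
reflected = apply_householder(u, v)
print(u, s, reflected)
```

For this vector the tail norm is 5, so s = -5 and u = (0, 8, 4). The reflected vector is (1, -5, 0), which matches the form $[v_1, \ldots, v_{k-1}, s, 0, \ldots, 0]^T$ given above.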
Estimating the eigenvalues of a symmetric H is easier than for a
nonsymmetric H. Generally, the bisection method is used to estimate the eigenvalues of a
symmetric H and the QR method is used for a nonsymmetric H. The bisection and QR algorithms are
explained below:
I) Bisection Method
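Suppose H is an (n x n) symmetric Hessenberg matrix. Then H is tridiagonal and
has the form

$$H_n = \begin{bmatrix}
d_1 & b_1 & 0 & \cdots & 0 & 0 \\
b_1 & d_2 & b_2 & \cdots & 0 & 0 \\
0 & b_2 & d_3 & \cdots & 0 & 0 \\
\vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\
0 & 0 & 0 & \cdots & d_{n-1} & b_{n-1} \\
0 & 0 & 0 & \cdots & b_{n-1} & d_n
\end{bmatrix}$$

Suppose we define the sequence of polynomials [Jacob, 1995]

$$p_0(t) = 1$$
$$p_1(t) = d_1 - t = \det(H_1 - tI)$$
$$p_2(t) = (d_2 - t)\,p_1(t) - b_1^2\,p_0(t) = \det(H_2 - tI)$$
$$\vdots$$
$$p_i(t) = (d_i - t)\,p_{i-1}(t) - b_{i-1}^2\,p_{i-2}(t) = \det(H_i - tI)$$
$$\vdots$$
$$p_n(t) = (d_n - t)\,p_{n-1}(t) - b_{n-1}^2\,p_{n-2}(t) = \det(H_n - tI)$$

where $H_i$ denotes the leading i x i submatrix of $H_n$.
$p_n(t)$ is the characteristic polynomial of H.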
If the subdiagonal entries $b_1, b_2, \ldots, b_{n-1}$ are all nonzero, then the algorithm is as
follows:
1. Let c be some real number.
2. Calculate the values of $p_0(c), p_1(c), p_2(c), \ldots, p_n(c)$.
3. Let N(c) be the number of agreements in sign in the sequence $p_0(c), p_1(c), p_2(c), \ldots,
p_n(c)$.
4. N(c) is equal to the number of roots of $p_n(t) = 0$ that are in the interval $[c, \infty)$.
In the event that $p_k(c) = 0$ for some k, we take the sign of $p_k(c)$ to be that of $p_{k-1}(c)$.
To use the above algorithm for computational purposes, we would first determine an
interval [a, b] that contains all the roots of $p_n(t) = 0$; generally

$$a = -\|H\|, \qquad b = \|H\|$$

where $\|H\|$ is a norm of H, which bounds the maximum range between a and b.
Next, let c be the midpoint of [a, b]. If N(c) > N(b), then there is at least one eigenvalue
in [c, b]. Let d be the midpoint of [c, b]. If N(d) > N(b), then there is at least one
eigenvalue in [d, b]; on the other hand, if N(d) = N(b), then any eigenvalue in [c, b] must
be in [c, d]. In this fashion, by repeatedly halving and testing subintervals, we can determine
a small subinterval [r, s] that contains N(r) - N(s) eigenvalues of H. This process can be
terminated when we have determined k small subintervals, $I_1, I_2, I_3, \ldots, I_k$, whose union
contains all the eigenvalues of H. The midpoints of the subintervals are the estimates of the
eigenvalues.
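The sign-agreement count N(c) and the halving process can be sketched as follows. This Python sketch is a simplified variant, not the thesis code: rather than isolating all subintervals at once, it uses N(c) to home in on the k-th largest eigenvalue. The function names, the tolerance, the use of the maximum absolute row sum as the norm, and the 3 x 3 example are assumptions introduced here.

```python
def sturm_count(d, b, c):
    """N(c): number of sign agreements in p_0(c), ..., p_n(c).

    d holds the diagonal entries d_1..d_n and b the off-diagonal entries
    b_1..b_{n-1} of a symmetric tridiagonal matrix. A zero p_k(c) takes the
    sign of p_{k-1}(c). N(c) equals the number of eigenvalues in [c, inf).
    """
    p_prev, p_curr = 1.0, d[0] - c
    sign_prev = 1
    sign_curr = sign_prev if p_curr == 0 else (1 if p_curr > 0 else -1)
    count = 1 if sign_curr == sign_prev else 0
    for i in range(1, len(d)):
        p_next = (d[i] - c) * p_curr - b[i - 1] ** 2 * p_prev
        sign_next = sign_curr if p_next == 0 else (1 if p_next > 0 else -1)
        if sign_next == sign_curr:
            count += 1
        p_prev, p_curr, sign_curr = p_curr, p_next, sign_next
    return count

def kth_largest_eigenvalue(d, b, k, tol=1e-10):
    """Bisection on [-||H||, ||H||] using the maximum absolute row sum as the norm."""
    n = len(d)
    bound = max(abs(d[i])
                + (abs(b[i - 1]) if i > 0 else 0.0)
                + (abs(b[i]) if i < n - 1 else 0.0)
                for i in range(n))
    lo, hi = -bound, bound
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if sturm_count(d, b, mid) >= k:
            lo = mid        # at least k eigenvalues lie in [mid, inf)
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Hypothetical example: eigenvalues are 2 + sqrt(2), 2 and 2 - sqrt(2).
diag = [2.0, 2.0, 2.0]
off = [-1.0, -1.0]
estimates = [kth_largest_eigenvalue(diag, off, k) for k in (1, 2, 3)]
print(estimates)
```

The recurrence is evaluated directly, so for large matrices the values of $p_i(c)$ may overflow or underflow. A production implementation would rescale the sequence, which this sketch omits.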
II). QR algorithm for nonsymmetric Hessenberg matrix
For a given (n x n) Hessenberg matrix H, let $H^{(1)} = H$. For each positive integer
k, using the same algorithm that transforms an (n x n) matrix to Hessenberg form,
transform the matrix $H^{(k)}$ into an upper triangular matrix $R^{(k)}$. Then the matrix $H^{(k)}$ can be
written as:
\[ H^{(k)} = Q^{(k)} R^{(k)} \tag{1} \]
where $Q^{(k)}$ is an orthogonal matrix [Jacob, 1995] and
\[ Q^{(k)} = Q_1^{(k)} Q_2^{(k)} Q_3^{(k)} \cdots Q_{n-1}^{(k)} \tag{2} \]
Then set
\[ H^{(k+1)} = R^{(k)} Q^{(k)} \tag{3} \]
From (1), we can get
\[ R^{(k)} = Q_{n-1}^{(k)} Q_{n-2}^{(k)} \cdots Q_2^{(k)} Q_1^{(k)} H^{(k)} = (Q^{(k)})^{-1} H^{(k)} \]
and
\[ H^{(k+1)} = R^{(k)} Q^{(k)} = (Q^{(k)})^{-1} H^{(k)} Q^{(k)} \]
so $H^{(k+1)}$ is similar to $H^{(k)}$. When the above procedure is repeated, $H^{(k)}$ will converge to
an upper-triangular matrix with the eigenvalues of H on the diagonal.
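The iteration H(k+1) = R(k)Q(k) can be sketched as below. As an assumption made only to keep the example short, the factorization uses modified Gram-Schmidt rather than the Householder-type transformations the text refers to, and no shifts are applied; `qrStep` and `qrEigenvalues` are hypothetical names.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

// One QR step: factor H = QR by modified Gram-Schmidt, then return RQ,
// which equals Q^{-1} H Q and is therefore similar to H.
Matrix qrStep(const Matrix& H) {
    const std::size_t n = H.size();
    Matrix Q(n, std::vector<double>(n, 0.0));
    Matrix R(n, std::vector<double>(n, 0.0));
    for (std::size_t j = 0; j < n; ++j) {
        std::vector<double> v(n);
        for (std::size_t i = 0; i < n; ++i) v[i] = H[i][j];
        for (std::size_t k = 0; k < j; ++k) {  // remove components along earlier columns of Q
            double dot = 0.0;
            for (std::size_t i = 0; i < n; ++i) dot += Q[i][k] * v[i];
            R[k][j] = dot;
            for (std::size_t i = 0; i < n; ++i) v[i] -= dot * Q[i][k];
        }
        double norm = 0.0;
        for (double x : v) norm += x * x;
        norm = std::sqrt(norm);
        assert(norm > 0.0);  // this sketch requires H to have full rank
        R[j][j] = norm;
        for (std::size_t i = 0; i < n; ++i) Q[i][j] = v[i] / norm;
    }
    Matrix next(n, std::vector<double>(n, 0.0));
    for (std::size_t i = 0; i < n; ++i)
        for (std::size_t k = 0; k < n; ++k)
            for (std::size_t j = 0; j < n; ++j) next[i][j] += R[i][k] * Q[k][j];
    return next;
}

// Repeat the step and read the eigenvalue estimates off the diagonal.
std::vector<double> qrEigenvalues(Matrix H, int steps) {
    for (int s = 0; s < steps; ++s) H = qrStep(H);
    std::vector<double> diag(H.size());
    for (std::size_t i = 0; i < H.size(); ++i) diag[i] = H[i][i];
    return diag;
}
```

On the symmetric matrix [[2, 1], [1, 2]] the diagonal approaches the eigenvalues 3 and 1. A matrix with complex eigenvalues would not settle to a triangular form under this unshifted sketch.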
Unfortunately, the above approach (Hessenberg form) may lead to severe errors
due to roundoff during the process of reducing the matrix to Hessenberg form. To
overcome this difficulty, the inverse power method is applied to the original matrix to refine
the estimates by iteration. The inverse power method is explained below:
3) Inverse Power Method for the Eigenvalue Problem
The inverse power method is nothing more than the power method applied to the
matrix $(A - \alpha I)^{-1}$ [Johnson et al., 1989]. If $\alpha$, estimated from H, is a reasonably good
estimate of an eigenvalue $\lambda$ of A, then several steps of the inverse power method will give
a very accurate estimate of $\lambda$ and a corresponding eigenvector. If $\lambda$ is an eigenvalue of A,
then
\[ Au = \lambda u \]
\[ Au - \alpha u = \lambda u - \alpha u \]
\[ (A - \alpha I)u = (\lambda - \alpha)u \]
Since $\alpha$ is not an eigenvalue of A, $(A - \alpha I)$ is nonsingular, and we can write
\[ (A - \alpha I)^{-1} u = \frac{1}{\lambda - \alpha}\, u \]
so $1/(\lambda - \alpha)$ is an eigenvalue of $(A - \alpha I)^{-1}$ and u is a corresponding eigenvector. Suppose
A has eigenvalues $\lambda_1, \lambda_2, \ldots, \lambda_n$ and $\alpha_i$ is a good estimate of $\lambda_i$ ($1 \le i \le n$). The eigenvalues
of $(A - \alpha_1 I)^{-1}$ are $\mu_1, \mu_2, \ldots, \mu_n$, where
\[ \mu_i = \frac{1}{\lambda_i - \alpha_1} \]
If $\alpha_1 \approx \lambda_1$, then $\mu_1$ is the dominant eigenvalue of $(A - \alpha_1 I)^{-1}$, and the power method can be used
to compute $\mu_1$.
The same procedure can be used to compute the other eigenvalues and their
corresponding eigenvectors.
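A compact sketch of the inverse power iteration follows. It is an illustration under assumptions rather than the thesis implementation: `solve`, `inversePower` and `EigenPair` are hypothetical names, each step solves (A - alpha I) w = u by Gaussian elimination instead of forming the inverse, the start vector (1, 2, ..., n) is arbitrary, and the final Rayleigh quotient assumes a symmetric A such as a correlation matrix.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

using Vec = std::vector<double>;
using Mat = std::vector<Vec>;

// Solve M x = rhs by Gaussian elimination with partial pivoting.
Vec solve(Mat M, Vec rhs) {
    const std::size_t n = M.size();
    for (std::size_t c = 0; c < n; ++c) {
        std::size_t p = c;
        for (std::size_t r = c + 1; r < n; ++r)
            if (std::fabs(M[r][c]) > std::fabs(M[p][c])) p = r;
        assert(M[p][c] != 0.0);  // singular: alpha is exactly an eigenvalue
        std::swap(M[c], M[p]);
        std::swap(rhs[c], rhs[p]);
        for (std::size_t r = c + 1; r < n; ++r) {
            const double f = M[r][c] / M[c][c];
            for (std::size_t k = c; k < n; ++k) M[r][k] -= f * M[c][k];
            rhs[r] -= f * rhs[c];
        }
    }
    Vec x(n);
    for (std::size_t i = n; i-- > 0;) {  // back substitution
        double s = rhs[i];
        for (std::size_t k = i + 1; k < n; ++k) s -= M[i][k] * x[k];
        x[i] = s / M[i][i];
    }
    return x;
}

struct EigenPair {
    double value;
    Vec vector;
};

// Power method applied to (A - alpha I)^{-1}; converges to the eigenpair of A
// whose eigenvalue is closest to the estimate alpha.
EigenPair inversePower(const Mat& A, double alpha, int steps) {
    const std::size_t n = A.size();
    Mat shifted = A;
    for (std::size_t i = 0; i < n; ++i) shifted[i][i] -= alpha;
    Vec u(n);
    for (std::size_t i = 0; i < n; ++i) u[i] = static_cast<double>(i + 1);
    for (int s = 0; s < steps; ++s) {
        Vec w = solve(shifted, u);
        double norm = 0.0;
        for (double x : w) norm += x * x;
        norm = std::sqrt(norm);
        for (std::size_t i = 0; i < n; ++i) u[i] = w[i] / norm;
    }
    double lambda = 0.0;  // Rayleigh quotient u^T A u for the unit vector u
    for (std::size_t i = 0; i < n; ++i)
        for (std::size_t j = 0; j < n; ++j) lambda += u[i] * A[i][j] * u[j];
    return EigenPair{lambda, u};
}
```

With A = [[2, 1], [1, 2]] and the rough estimate alpha = 2.9, a few steps recover the eigenvalue 3 and an eigenvector proportional to (1, 1).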
Sensitivity Analysis of the Estimates and Decision Systems
Sensitivity analysis consists of identifying the relatively sensitive parameters (i.e.,
those which cannot be changed without changing the outcome), trying to estimate those
parameters more closely, and then selecting a solution which remains a good one over the
range of likely values of the sensitive parameters [Hillier and Lieberman, 1986]. Due to
roundoff errors and the finite number of iteration steps used in estimating parameters,
the eigenvalues and their corresponding eigenvectors will not be exact. Outliers
and influential observations, sampling error, and even man-made errors during the collection of
data make the estimates more questionable still. Applying these
estimates in decision support systems (here, a ranking decision system) may then
change the final ranking of the alternatives. In the following sections the author will
briefly explain the influence of roundoff error, outliers and influential data on the
estimates and their influence on the decision system.
1). Roundoff Error
The advent of the computer has greatly increased the range of problems in which
matrix theory and linear algebra can be applied to find solutions. However, every
computer has computational limitations which result in a potential source of error for
every arithmetic operation in the computer [Johnson et al., 1989]. In particular, when a
matrix is reduced to Hessenberg form, roundoff error will occur, and the Hessenberg
matrix found by the machine will not be quite what it would be if exact arithmetic were
used. So the eigenvalues which are estimated from the Hessenberg form of A are not the
same as those of the original matrix A (and may differ substantially from the eigenvalues
of A). For example:
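\[
H = \begin{pmatrix}
1 & 0 & 0 & \cdots & 0 & 0 \\
1 & 1 & 0 & \cdots & 0 & 0 \\
0 & 1 & 1 & \cdots & 0 & 0 \\
\vdots & & \ddots & \ddots & & \vdots \\
0 & 0 & 0 & \cdots & 1 & 0 \\
0 & 0 & 0 & \cdots & 1 & 1
\end{pmatrix},
\qquad
H + E = \begin{pmatrix}
1 & 0 & 0 & \cdots & 0 & \varepsilon \\
1 & 1 & 0 & \cdots & 0 & 0 \\
0 & 1 & 1 & \cdots & 0 & 0 \\
\vdots & & \ddots & \ddots & & \vdots \\
0 & 0 & 0 & \cdots & 1 & 0 \\
0 & 0 & 0 & \cdots & 1 & 1
\end{pmatrix}
\]
Then
\[ \det(H - tI) = (1 - t)^n \]
and
\[ \det(H + E - tI) = (1 - t)^n + (-1)^{n+1}\varepsilon \]
Suppose n = 10 and $\varepsilon = 2^{-10}$. Then the eigenvalue of H is equal to 1 with multiplicity 10,
but the eigenvalues of H + E satisfy $(1 - t)^{10} = 2^{-10}$, so each lies at distance 0.5 from 1;
the real ones are 1.5 and 0.5. A change in H of amount $2^{-10}$ produces a
50% change in the eigenvalue. But not every perturbation of the entries in H will lead to such a
large change in the eigenvalues. Golub and Van Loan [Golub and Van Loan, 1993] have
done a comprehensive analysis of perturbation theory for eigenvalues and eigenvectors.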
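The n = 10 example can be checked numerically. The sketch below builds H + E - tI for the bidiagonal matrix with ones on the diagonal and subdiagonal and the perturbation in the upper-right corner, and evaluates its determinant by elimination. The function name `perturbedDet` is hypothetical, and the layout of H is the one assumed in this example.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

// det(H + E - tI), where H has ones on the diagonal and subdiagonal and
// E has eps in the upper-right corner (eps = 0 gives det(H - tI)).
double perturbedDet(std::size_t n, double eps, double t) {
    assert(n >= 2);
    Matrix M(n, std::vector<double>(n, 0.0));
    for (std::size_t i = 0; i < n; ++i) {
        M[i][i] = 1.0 - t;
        if (i > 0) M[i][i - 1] = 1.0;
    }
    M[0][n - 1] += eps;
    double det = 1.0;
    for (std::size_t c = 0; c < n; ++c) {  // elimination with partial pivoting
        std::size_t p = c;
        for (std::size_t r = c + 1; r < n; ++r)
            if (std::fabs(M[r][c]) > std::fabs(M[p][c])) p = r;
        if (M[p][c] == 0.0) return 0.0;
        if (p != c) {
            std::swap(M[p], M[c]);
            det = -det;  // a row swap flips the sign of the determinant
        }
        det *= M[c][c];
        for (std::size_t r = c + 1; r < n; ++r) {
            const double f = M[r][c] / M[c][c];
            for (std::size_t k = c; k < n; ++k) M[r][k] -= f * M[c][k];
        }
    }
    return det;
}
```

With n = 10 and eps = 1/1024 the determinant vanishes at t = 0.5 and t = 1.5, so both are eigenvalues of H + E, while without the perturbation det(H - 0.5 I) is 1/1024 and 0.5 is not an eigenvalue of H.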
2). Influence of Outliers and Influential Observations on Estimates
During the process of data collection, some atypical factors (systematic and
random errors) may influence the values in the data set, which can, but need not, have a
disproportionate effect on the PCs. If PCA is used blindly, then the results can be largely
determined by a few influential observations [Jolliffe, 1986].
Outliers are generally viewed as observations which are a long way from, or
inconsistent with, the remainder of the data [Jolliffe, 1986]. There are two kinds of
outliers: observations that are extreme on the original variables, and observations that do not conform
with the correlation structure of the remainder of the data. It is impossible to detect the
second kind of outlier by looking solely at the original variables one at a time. Numerous
procedures have been suggested for detecting outliers with respect to a single variable
[Jolliffe, 1986]. Generally, the PCs themselves are used to detect potential outliers.
Gnanadesikan and Kettenring [Gnanadesikan and Kettenring, 1972] found that the outliers
which inflate variance and covariance can be detected from a plot of the first few PCs.
By contrast, the last few PCs may detect observations which violate the correlation
structure imposed by the bulk of the data and which are not apparent with respect to the original
variables [Jolliffe, 1986]. But in a small sample data set, the best way to detect outliers is
to compute the PCs leaving out one (or more) observation(s) [Jolliffe, 1986]. Other
possible methods for detecting outliers are test statistics [Gnanadesikan and
Kettenring, 1972; Hawkins, 1974; Jolliffe, 1986].
Outliers whose removal has a large effect are called influential observations.
Whether or not an observation is influential depends on the analysis being done on the
data set; observations which are influential for one type of analysis or parameter of
interest may not be so for a different analysis or parameter. There are two methods which
can be used to detect the influence of observations. One is removal of the observations;
the other is to use an influence function [Jolliffe, 1986]. The two methods match each
other very well [Jolliffe, 1986]. Jolliffe [Jolliffe, 1986] also found that observations
which were most influential for a particular eigenvalue need not be so for the
corresponding eigenvector, and vice versa. An observation may be influential for a PC of the
covariance matrix but not for a PC of the correlation matrix. An observation may be
influential for only one PC of the covariance matrix, but more than one eigenvalue of the correlation
matrix is likely to be affected because the sum of the eigenvalues remains the same.
3). Sampling Error
Due to sampling variation, the eigenvalues and eigenvectors from the sample
covariance matrix will differ from their underlying population counterparts. Some
research on the sampling distribution of the eigenvalues and eigenvectors has been done
[Anderson, 1984]. Anderson [1963] developed a "large sample distribution theory" for
the eigenvalues and eigenvectors.
4). Sensitivity of Decision Systems
Principal component analysis was applied in one type of decision system which
evaluates and ranks a finite number of alternatives with respect to a finite number of
criteria [George, 1996]. Rank computation depends only on the values of the critical
variables. Therefore the computed weights of the critical variables directly determine the
influence of the variables and their contribution to the rank computation.
How to weight each criterion and how the weight influences the preference among the
alternatives is a very important aspect of decision support system research. The best
weight values should reflect the information in the data. Principal component analysis is a good
way to evaluate each criterion objectively [Dawkins, 1989]. But principal components are
influenced by roundoff error, sample data variation and sampling error. How does the
rank value change when the weight changes, and what are the intervals of the weights in
which the final ranking of the alternatives does not change? For example, consider n
alternatives with m criteria. Let the (m x 1) column vector $A_i$ denote the values for the ith
record and let the (m x 1) column vector W represent the weight values of the criteria. Then
the ranking value for each alternative is
\[ \frac{A_i^T W}{\mathbf{1}^T W} \]
where $\mathbf{1}$ is the (m x 1) column vector with value of 1 for each element. The relationship
of weight value and ranking can be formulated as below:
When the weight value for criterion j is $W_j \in [w_{jl}, w_{ju}]$, the interval of the ranking
value for each alternative is $[A_{il}, A_{iu}]$, where
\[ A_{iu} = \max \frac{A_i^T W}{\mathbf{1}^T W} \quad \text{and} \quad A_{il} = \min \frac{A_i^T W}{\mathbf{1}^T W} \]
In fact the above two formulae are linear fractional programming problems. $A_{iu}$ and
$A_{il}$ are not necessarily attained at the upper-bound or lower-bound values of W. Much research has
been done on solving this problem [Ben-Israel, 1968; Ben-Israel and Robers, 1970;
Charnes and Cooper, 1962, 1973; Zionts, 1968]. The detailed derivation and proof will
be omitted in this thesis. Ben-Israel and Charnes [Ben-Israel and Charnes, 1968]
proved that the maximum and minimum values are located at the vertices of the convex
volume (denominator).
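Because the extreme values are located at vertices, a brute-force check is possible for small m: evaluate the ratio at every combination of lower and upper weight bounds. The sketch below does exactly that; it is an illustration, not the algorithm used in the package, and the names `RankInterval` and `rankIntervalByVertices` are hypothetical. It assumes positive weights and fewer than 20 criteria, since it visits 2^m vertices.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

struct RankInterval {
    double low;
    double high;
};

// Evaluate (a^T w) / (1^T w) at every vertex of the box wLow <= w <= wHigh
// and keep the smallest and largest ratio.
RankInterval rankIntervalByVertices(const std::vector<double>& a,
                                    const std::vector<double>& wLow,
                                    const std::vector<double>& wHigh) {
    const std::size_t m = a.size();
    assert(m > 0 && m < 20 && wLow.size() == m && wHigh.size() == m);
    RankInterval result{std::numeric_limits<double>::infinity(),
                        -std::numeric_limits<double>::infinity()};
    for (unsigned long mask = 0; mask < (1UL << m); ++mask) {
        double num = 0.0;
        double den = 0.0;
        for (std::size_t j = 0; j < m; ++j) {
            // Bit j of mask selects the upper or lower bound for weight j.
            const double w = ((mask >> j) & 1UL) ? wHigh[j] : wLow[j];
            num += a[j] * w;
            den += w;
        }
        assert(den > 0.0);  // weights are assumed positive
        result.low = std::min(result.low, num / den);
        result.high = std::max(result.high, num / den);
    }
    return result;
}
```

For a = (1, 0) with both weights in [1, 2], the ratio is 1/2 when both weights sit at their lower bounds and also when both sit at their upper bounds, yet the maximum 2/3 and the minimum 1/3 occur at the mixed vertices (2, 1) and (1, 2).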
Sometimes, we may be interested in determining the intervals with the restriction
that the final ranking of the alternatives does not change. Another option is the following:
with the restriction that the final top 100, 50, 20 or 10 items in the ranking do not change,
what percentage (d) of the weight W can be changed?
Let
where d is the percentage of the weight value that can be changed without changing
the rank.
Then, the largest value of d satisfies the expression:
where i is the ith alternative in the ranking.
Also, considering a subset of the alternatives in which the change of the final
ranking values is allowed, in what intervals are the weights allowed to vary, and how will
these modifications affect the final ranking values in the entire set of alternatives? A
similar linear fractional programming problem can be used to solve the above problems.
Up to this point, the author has briefly explained the computation and application of
PCA and the sensitivity analysis of decision support systems from a theoretical point of
view. In the next chapter, the design and implementation of the software will be given.
CHAPTER III
Design and Implementation
In this thesis, the author has implemented a software package to compute rankings
based on PCA. The software also implements sensitivity analysis.
As mentioned earlier, there are many methods to estimate eigenvalues and
eigenvectors and to solve linear fractional programming problems. All the methods are
problem-dependent. So, the author has selected algorithms and data structures based on
previous experience. For the data set representation, an observation is viewed as a class
(RowClass). Intuitively, an observation is a row in a matrix, and so the matrix can be
treated as a collection of instances of RowClass.
As to the computation of eigenvalues and eigenvectors, the bisection method and
the inverse power method were used because of their accuracy. The
correlation matrix was transformed to a Hessenberg matrix by using Householder
transformations, and then the estimates of the eigenvalues were calculated using the
bisection method. The estimates of the eigenvalues and their corresponding eigenvectors
were refined by using the inverse power method.
The computation for linear fractional programming varies depending on the
conditions of the denominator and numerator. Meszaros and Rapcsak [Meszaros and
Rapcsak, 1996] provided a simplex iteration method to do sensitivity analysis. This
algorithm requires O(n log n) arithmetic operations. Ben-Israel and Charnes [Ben-Israel
and Charnes, 1968] proved that the maximum and minimum values of the linear
fractional programming problem are located at the vertices of the convex volume. An
algorithm based on the above fact is implemented in this software. Several test data sets were
used to verify the correctness of this implementation.
The remainder of this chapter gives the design and implementation. The software
is implemented as a "project" in MS Visual C++. The software design is described in terms of
C++ classes. Their relationship is also shown as a graph. Key algorithms are also
described.
1. Classes
The project implements the following classes: application class, document class,
main frame class, view class, some dialog classes, row class, table class, matrix class,
square matrix class and sensitivity class. The main framework of the first five classes is
generated by using the AppWizard and ClassWizard provided by the MS Visual C++
environment. The row class is designed to represent the data for each record, and the table
class represents the whole data set. The matrix class is used to manipulate and manage
the data set, for example, multiplication of the data set. The square matrix class is used to
estimate the eigenvalues and eigenvectors of the correlation matrix. The attributes and
methods for the last five classes are listed below:
a. RowClass
Instance Variables:
Array (double *)        // Address of an array.
Length (unsigned)       // Length of the array.
Methods:
RowClass(void);
RowClass(unsigned N);   // Initialize array length to N.
RowClass(const RowClass& OrigRow);
/* This is the copy constructor. The value of Length is set to OrigRow.Length, and the
Array member contains the address of an array that is a copy of OrigRow's array
(or the NULL address). */
~RowClass(void);        // Destructor method.
double& operator[](unsigned i);
/* This function performs the subscript operation on a RowClass object. It returns
the element of the array pointed to by the instance variable Array whose index is i.
*/
RowClass& operator=(const RowClass& RowObj);
/* This function assigns to an instance of RowClass a distinct copy of RowObj. */
friend ostream& operator<<(ostream&, const RowClass& RowObj);
/* Prints the RowObj to the output stream. */
friend istream& operator>>(istream&, RowClass& RowObj);
/* Reads a row from the input stream into the RowObj. */
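The copy constructor, assignment operator and destructor described above together give each row its own private array. A minimal sketch of that ownership pattern is shown below. The class name `Row` and its members are hypothetical stand-ins, not the actual RowClass source, and the stream operators are omitted.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>

// Hypothetical stand-in for RowClass: owns a heap array of doubles.
class Row {
public:
    explicit Row(unsigned n = 0)
        : length_(n), array_(n ? new double[n]() : nullptr) {}  // zero-filled

    // Copy constructor: allocate a distinct array and copy the elements.
    Row(const Row& other)
        : length_(other.length_),
          array_(other.length_ ? new double[other.length_] : nullptr) {
        for (unsigned i = 0; i < length_; ++i) array_[i] = other.array_[i];
    }

    // Assignment: build a copy first, then swap it in (safe for self-assignment).
    Row& operator=(const Row& other) {
        if (this != &other) {
            Row copy(other);
            std::swap(length_, copy.length_);
            std::swap(array_, copy.array_);
        }
        return *this;
    }

    ~Row() { delete[] array_; }  // release the owned array

    double& operator[](unsigned i) {
        assert(i < length_);     // subscript must be in range
        return array_[i];
    }

    unsigned length() const { return length_; }

private:
    unsigned length_;
    double* array_;
};
```

Changing an element of a copied row leaves the original untouched, which is the "distinct copy" behaviour the listing asks for. In current C++ a `std::vector<double>` member would provide the same semantics without hand-written memory management.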
b. TableClass
protected:
Instance variables:
RowNum (unsigned)       // Number of rows in the table
ColNum (unsigned)       // Number of columns in the table
ChangeRate (double)     // Change in weight
weight (double)         // Weight value
Grid (RowClass *)       // Address of a table
public:
Methods:
TableClass(unsigned NumRows, unsigned NumCols, double InitVal);
TableClass(void);
/* These two constructors set RowNum to NumRows and ColNum to NumCols (their
default values are zero). */
TableClass(const TableClass& Original);
/* This copy constructor returns a copy of the original object. */
~TableClass(void);      // Destructor method, deallocates storage.
RowClass& operator[](unsigned i);
/* This method returns the ith row of an instance of TableClass. */
TableClass& operator=(const TableClass& Tan);
/* The RowNum and ColNum of the target object are set to Tan.RowNum and
Tan.ColNum, respectively. Its data members contain copies of those of Tan.
*/
double RowSum(unsigned r) const;
/* This method computes the sum of the elements of the rth row. */
double ColSum(unsigned r) const;
/* This function computes the sum of the elements of the rth column. */
/* Other "get" and "set" methods for the data members are also included in
TableClass. */
Boolean Load(const strings& FileName);
/* This function loads the data in the file referred to by FileName into the instance
variables of the receiver. If the load is successful, the function returns the value true;
otherwise it returns the value false. */
Boolean Write(const strings& FileName) const;
/* This function writes the data from the TableClass to the file specified by
FileName. */
void WeightValue(double* weight);
/* This function sets the weight value for each criterion. */
void CalculateRankVal(void);
/* This function calculates the rank value for each record. */
void QuickSort(unsigned i);
/* This function sorts the table using the ith column as key. */
friend istream& operator>>(istream& In, TableClass& InTab);
/* The table InTab is initialized appropriately with the number of rows and
columns needed to store the input value. */
friend ostream& operator<<(ostream& out, const TableClass& T);
/* This function overloads the output operator for the TableClass object. */
c. class matrix : TableClass
friend matrix& operator+(const matrix& mat1, const matrix& mat2);
/* This function adds two matrices. */
friend matrix& operator-(const matrix& mat1, const matrix& mat2);
/* This function subtracts the matrix named mat2 from the matrix named mat1. */
friend matrix& operator*(const matrix& mat1, const matrix& mat2);
/* This function multiplies two matrices. */
matrix& operator**(double k) const;
/* This function transforms a matrix into another matrix whose elements are kth
powers of the original elements. */
matrix& operator/(double k) const;
/* This function returns a matrix whose elements are kth roots of the
corresponding elements of the matrix it receives. */
d. SquareMatrix : public matrix
Row& DomEigenVect(void) const;
/* This function computes the eigenvector corresponding to the dominant
eigenvalue. */
double DomEigVal(void) const;
/* This function computes the dominant eigenvalue. */
long double Det(void) const;
/* This function computes the determinant of a matrix. */
SquareMatrix& Diag(void) const;
/* This function transforms a square matrix into a new square matrix whose
diagonal elements are the same as the original matrix and whose off-diagonal
elements are all zeros. */
e. class Sensitivity
Instance Variables:
RowClass* Weight;       // The weight for attributes.
RowClass* RankValue, *HighRank, *LowRank;
// The rank, high rank and low rank values for a record.
TableClass *data;       // Pointer to the start address of the data set.
SquarMat *correlation;  // Pointer to the correlation matrix of the data set.
TableElement range;
// Change in weight with the restriction that the final rank will not be changed.
Methods:
Sensitivity(void);
/* This function constructs the sensitivity class with the default values zero for
scalar instance variables and NULL for address instance variables. */
~Sensitivity();         // Destructor method, deallocates storage.
/* Two functions "set" and "get" are also defined to set and get instance variables
of the receiver. */
void CalculatePC();
// This function is used to calculate the rank for each record.
void CalculatePCInterval();
/* This function is used to estimate the maximum and minimum rank values for
each record with the restriction that the weights can be changed within a range. */
void CalculateMaxWeight();
/* This function is used to estimate the maximum range with the restriction that
the final rank will not be changed. */
2. Software architecture
The relationship among the classes and their information exchange is shown in Figure 2.
Figure 2 is based on the classes described in the previous section.
[Figure 2: class diagram linking Class Matrix, Class SquareMat, Class Sensitivity, class
mainframe, class document, class view and class dialog through "has a", "is a" and
"information exchange" relationships]
Figure 2. The relationship among the classes in the project
Each observation is stored in a row, and the data set has many observations which
are stored in a table. The estimated correlation matrix is stored in a square matrix
represented by the class SquareMatrix. Weight values and rank values are calculated and
stored in rows.
3. Abstract Level Algorithm
An abstract view of the software control flow is shown in Figure 3.
Load the Data
-> Compute Covariance Matrix
-> Compute Correlation Matrix
-> Compute Hessenberg Form ($QAQ^{-1}$)
-> Estimate Eigenvalues (Bisection Method)
-> Compute Eigenvalue and Eigenvector (Inverse Power)
-> Calculate the Rank
-> Perform Sensitivity Analysis
Figure 3. Abstract level control flow
A TableClass variable is declared to store the data in the table. The covariance
matrix and correlation matrix are calculated for the data set. Then the eigenvalues and
eigenvectors are estimated for the correlation matrix and used to calculate the
rank values. The two kinds of sensitivity analysis described in Chapter 2 are done using
the algorithms given in the next section.
4. Key algorithms
a. Rank value interval calculation
(adopted from Meszaros and Rapcsak [Meszaros and Rapcsak, 1996] and modified):
Input: the data set and weight value and their range for each critical variable.
Output: the data set with rank value, high rank value, low rank value.
for i ← 1 to n {                    // n alternatives
    W ← V^l;                        // V^l is the low bound weight value
    G ← A_i^T W;                    // A_i is the standard data for alternative i and G is a
                                    // scalar.
    H ← 1^T W;                      // H is a scalar, the sum of weight value
    Sorting the components of A_i, determine a permutation p of (1, 2, ..., m)
    such that the sequence {A_i(p)} is monotone nonincreasing (m is the number of
    attributes).
    for j ← 1 to m {                // Evaluate each criterion for each alternative
        set φ ← G/H;
        k ← p(j);                   // The jth largest component of A_i
        if A_i(k) ≤ φ then break;   // φ is the maximum value A_iu
        else {
            W(k) ← V^u(k);          // V^u(k) is the kth upper bound weight value.
            G ← G + A_i(k) * (V^u(k) - V^l(k));
            H ← H + (V^u(k) - V^l(k));
        }
    }
}
This algorithm calculates max((A_i^T W)/(1^T W)); min((A_i^T W)/(1^T W)) is
estimated by changing the sign of A_i.
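A sketch of this greedy procedure for a single alternative is given below. It is one reading of the pseudocode under stated assumptions, not the package's source: the names `maxRankValue` and `minRankValue` are hypothetical, G and H are initialised from the lower-bound weights, and the lower bounds are assumed to have a positive sum.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Largest value of (a^T w) / (1^T w) over vLow <= w <= vHigh.
double maxRankValue(const std::vector<double>& a,
                    const std::vector<double>& vLow,
                    const std::vector<double>& vHigh) {
    const std::size_t m = a.size();
    assert(vLow.size() == m && vHigh.size() == m);
    double G = 0.0;  // numerator a^T w, starting from the lower bounds
    double H = 0.0;  // denominator 1^T w
    for (std::size_t j = 0; j < m; ++j) {
        assert(vLow[j] <= vHigh[j]);
        G += a[j] * vLow[j];
        H += vLow[j];
    }
    assert(H > 0.0);
    // Permutation p that orders the components of a from largest to smallest.
    std::vector<std::size_t> p(m);
    std::iota(p.begin(), p.end(), std::size_t{0});
    std::sort(p.begin(), p.end(),
              [&a](std::size_t x, std::size_t y) { return a[x] > a[y]; });
    for (std::size_t k : p) {
        if (a[k] <= G / H) break;  // raising this weight cannot increase the ratio
        const double delta = vHigh[k] - vLow[k];
        G += a[k] * delta;         // move weight k to its upper bound
        H += delta;
    }
    return G / H;
}

// The minimum is obtained by changing the sign of a.
double minRankValue(const std::vector<double>& a,
                    const std::vector<double>& vLow,
                    const std::vector<double>& vHigh) {
    std::vector<double> neg(a.size());
    std::transform(a.begin(), a.end(), neg.begin(), [](double x) { return -x; });
    return -maxRankValue(neg, vLow, vHigh);
}
```

For a = (1, 0) with both weights in [1, 2], only the first weight is raised, giving a maximum of 2/3; the sign change yields a minimum of 1/3. The sort dominates the cost, which matches the O(n log n) operation count quoted for the method.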
b. Interval weight value estimation
(adopted from Meszaros and Rapcsak [Meszaros and Rapcsak, 1996] and modified):
Input: the data set with rank value.
Output: the data set with rank value and degree of tolerable weight change.
Sort the rank value in monotone nonincreasing order. Rank(i) ≥ Rank(i+1)
λ_min ← 100;
for i ← 1 to n - 1 {
    D_i ← A_i - A_{i+1};            // A_i is the standard data of the alternative with
                                    // rank i.
    G_i ← ABS(D_i);                 // Each component in G_i is greater than or
                                    // equal to zero.
    λ ← (D_i^T V / G_i^T V) × 100;  // V is the weight value.
    if λ ≤ λ_min then λ_min ← λ;
}
c. Matrix inverse computation
Input: Square matrix A of dimension n
Output: Inverse of A if it exists
Check the matrix's dimension;
D = det(A);
if (D == 0) return error;
temp = A || I                       // A || B means the matrix [A B]
                                    // I is the identity matrix of dimension n.
for (unsigned i = 0; i < dimension; i++) {
    find the partial pivot value;
    if (pivot row != i) then swap the rows;
    pivot = temp[i][i];
    for (unsigned j = i; j < 2*RowNum; j++)
        temp[i][j] = temp[i][j] / pivot;
    for (j = 0; j < RowNum; j++) {
        if (j == i) continue;
        pivot = temp[j][i] * (-1);
        for (unsigned k = 0; k < 2*RowNum; k++)
            temp[j][k] += temp[i][k] * pivot;
    }
}
// The right half of the matrix temp is the inverse of the original matrix.
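The matrix inverse computation can be written out as the following self-contained sketch. It is illustrative only: `invert` is a hypothetical free function rather than a method of the thesis classes, and instead of computing the determinant first it reports failure when no usable pivot is found, with the threshold 1e-12 chosen arbitrarily for this example.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

// Gauss-Jordan inversion on the augmented matrix [A I].
// Returns false (leaving `inverse` untouched) when A appears singular.
bool invert(const Matrix& A, Matrix& inverse) {
    const std::size_t n = A.size();
    for (const auto& row : A) assert(row.size() == n);  // A must be square
    Matrix temp(n, std::vector<double>(2 * n, 0.0));
    for (std::size_t i = 0; i < n; ++i) {
        for (std::size_t j = 0; j < n; ++j) temp[i][j] = A[i][j];
        temp[i][n + i] = 1.0;
    }
    for (std::size_t i = 0; i < n; ++i) {
        // Partial pivoting: pick the largest entry in column i at or below row i.
        std::size_t pivotRow = i;
        for (std::size_t r = i + 1; r < n; ++r)
            if (std::fabs(temp[r][i]) > std::fabs(temp[pivotRow][i])) pivotRow = r;
        if (std::fabs(temp[pivotRow][i]) < 1e-12) return false;
        if (pivotRow != i) std::swap(temp[pivotRow], temp[i]);
        const double pivot = temp[i][i];
        for (std::size_t j = i; j < 2 * n; ++j) temp[i][j] /= pivot;
        for (std::size_t j = 0; j < n; ++j) {
            if (j == i) continue;
            const double factor = -temp[j][i];
            for (std::size_t k = 0; k < 2 * n; ++k) temp[j][k] += temp[i][k] * factor;
        }
    }
    // The right half of temp now holds the inverse.
    inverse.assign(n, std::vector<double>(n, 0.0));
    for (std::size_t i = 0; i < n; ++i)
        for (std::size_t j = 0; j < n; ++j) inverse[i][j] = temp[i][n + j];
    return true;
}
```

For A = [[2, 1], [1, 2]] the result is (1/3) [[2, -1], [-1, 2]], and for the singular matrix [[1, 2], [2, 4]] the function returns false.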
CHAPTER IV
Results and Discussion
The data published by George [1996] were used as a sample to test the program.
The first PC accounted for 3.54/5 = 70.8% of the variation in the standard variables. The
weight values of the standard variables are 0.513, 0.484, 0.513, 0.482 and 0.084 for
MAN_HRO, FAILURESO, COSTO, CANNO and MIC_HRSO respectively. In fact the
first PC is essentially an average of the first 4 standardized variables (the original value of each
variable divided by its standard deviation). The importance (i.e., the correlation
between the variable and the first PC [Sarkar, handout in STAT5063, 1997]) of each
standard variable is 0.965, 0.911, 0.965, 0.907 and 0.158 respectively. Also, for the
sample data set, when the weights vary within 0.4%, the final rank will not change.
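The two derived quantities can be reproduced from the reported eigenvalue and weights. The sketch below relies on two standard properties of PCA on a correlation matrix, stated here as assumptions about how the figures were obtained: the eigenvalues sum to the number of variables, and the correlation between a standardized variable and a PC equals its weight multiplied by the square root of that PC's eigenvalue. The function names are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Share of the total variance explained by one PC of a p x p correlation
// matrix, whose eigenvalues sum to p.
double varianceShare(double eigenvalue, std::size_t p) {
    assert(p > 0);
    return eigenvalue / static_cast<double>(p);
}

// Correlation between each standardized variable and the PC:
// eigenvector component (weight) times the square root of the eigenvalue.
std::vector<double> importance(const std::vector<double>& weights, double eigenvalue) {
    assert(eigenvalue >= 0.0);
    std::vector<double> r;
    r.reserve(weights.size());
    for (double w : weights) r.push_back(w * std::sqrt(eigenvalue));
    return r;
}
```

With eigenvalue 3.54 and p = 5 the share is 0.708, and the weights 0.513, 0.484, 0.513, 0.482 and 0.084 give importances that round to 0.965, 0.911, 0.965, 0.907 and 0.158, in agreement with the values reported above.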
Figures 4 through 19 illustrate the interface provided by the software. They also
show the results obtained using the test data. Figures 4 and 5 illustrate how to load the
input data. Figure 6 shows the input data. Figures 7 through 11 show intermediate steps
that can be viewed if the user prefers to view them (in the current implementation, the
user is required to go through all steps). Figure 12 shows the ranked data. Figure 13
illustrates the pull-down window interface for sensitivity analysis. Figure 14
illustrates the window for specifying the weight range for sensitivity analysis. Figure 15
shows the minimum and maximum rank values for the change specified for the weight values
in Figure 14. Figures 16 and 17 show the sorting facility. Figures 18 and 19 show, for each
item, the percent of weight value for which the ranking will not change.
Figure 4. The window before loading data
Figure 5. The File open window
[Screenshot: the loaded data set displayed in five columns: MAN_HR, FAILURE, COST, CANN and MIC_HRS]
Figure 6. Data loading window
[Screenshot: the loaded data set, shown before the correlation matrix is calculated]
Figure 7. The window before calculating correlation
[Screenshot: the 5 x 5 correlation matrix for the data]
Figure 8. Correlation coefficient window
[Screenshot: the 5 x 5 correlation matrix, shown before the eigenvalues and eigenvectors are estimated]
Figure 9. The window before estimating eigenvalue and eigenvector
[Screenshot: the eigenvectors and eigenvalues for the data]
Figure 10. Eigenvalue and eigenvector window
[Screenshot: the eigenvectors and eigenvalues, shown before the rank values are calculated]
Figure 11. The window before calculating rank value
[Screenshot: the data set with columns MAN_HR, FAILURE, COST, CANN, MIC_HRS and Rank]
Figure 12. The data and rank window
[Screenshot: the ranked data with the Sensitivity Analysis menu pulled down, showing the "Interval for Rank" item]
Figure 13. The window before calculating rank interval
~ ..... : ... ,". ': ...... '.; ..
I1AH HR F.ULURE COST CAHH HIC HRS Rank
61J35: II 317 .11 23276".81 68." 7538." 1 .... 9
3960.1111 1711.11 1 ... 32 ...... 111 "'7." 1166.1' '.91
1.1111 11.111 1.'1 I.'. 1J353 .... ••• 2877.1111 186.1111 1111l1151.21 55.'1 936.11 1.82
1151.110 196.111 381101.211 11.'11 9581.111 ..... 2
3327.00 91.111 121330."1 32." 111.011 '.67
1861.1111 11111.11 67323.21 "'9." 1.811 8.62
11119 .l1li 181.011 361195.50 6.'. 3 ... 6 .... 1111 '.36
2521.811 118.111 91131.51 9.8' 11318.81 .... 2
1957.l1li 1011.1111
1511.011 104.011
16811.1111 73 .IIU
8".11 109 .11
1561.l1li 79.111
1560.1111 19.11
18211.11 811.011
758.11 51.1111
1.1111 11.111
662.111 811.110
827.111 58.1111
1368.1111 211.111 "'91173.31 .... 21&11.811 1.20
875.111 51.01 3163".311 11.ID 25.ID '.22
D.II' 0.1111 8.'11 •••• 1.716 •• 0 1.112
317." 38.011 111&73.11 21.'1 3119.811 1.18
266.1' 31.11 9628.58 18.'" 289'.811 '.15
266.11 33.1111 9639.31 11.11 2593." 1.12
6119.01 1B.OD 231181.68 11.11 663." 8.15
1.1111 1.1111 1.80 0.'1 551&2.11 1.11
1J311.111 18.111 15567.68 18.1' 8.111 '.12
222.111 3S.aa 81137.10 B.III 1.111 1.111
669.1111 2.1111 211215.111 1.11 1.81 1.19
223.11 33.1111 .162.31 1.111 123.111 '.17
15".'1 6.1111 5562.91 11." 16 .... 1111 1.17
"~~t • ".    11
Figure 14. Weight choose window
Figure 15. Data and rank interval window
Figure 16. Data sort dialog and window
Figure 17. Sorted data and rank window
Figure 18. The window before calculating weight interval
Figure 19. Data, rank interval and weight interval window
CHAPTER V
Conclusion and Future Work
In this thesis, a decision support system based on principal component analysis
(PCA) was developed. The software is implemented in MS Visual C++ and provides a
GUI for viewing results. It loads data from a file, ranks the records, and provides
methods for performing sensitivity analysis on the ranking.
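The load-and-rank step summarized above can be illustrated with a minimal sketch. It is not the thesis implementation: it assumes that records are ranked by their score on the first principal component of the correlation matrix, and it finds that component by power iteration. The names standardize, firstComponent, pcaScores and rankOrder are hypothetical, and file loading and the sensitivity analysis are not covered.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <numeric>
#include <vector>

using Matrix = std::vector<std::vector<double>>;  // rows = records, columns = variables

// Standardize each column to zero mean and unit sample standard deviation.
// A constant column is left at zero so it cannot influence the scores.
Matrix standardize(const Matrix& data) {
    const std::size_t n = data.size(), p = data[0].size();
    Matrix z(n, std::vector<double>(p, 0.0));
    for (std::size_t j = 0; j < p; ++j) {
        double mean = 0.0;
        for (std::size_t i = 0; i < n; ++i) mean += data[i][j];
        mean /= static_cast<double>(n);
        double ss = 0.0;
        for (std::size_t i = 0; i < n; ++i) {
            const double d = data[i][j] - mean;
            ss += d * d;
        }
        const double sd = std::sqrt(ss / static_cast<double>(n - 1));
        if (sd > 0.0)
            for (std::size_t i = 0; i < n; ++i) z[i][j] = (data[i][j] - mean) / sd;
    }
    return z;
}

// Dominant eigenvector of the correlation matrix of standardized data z,
// approximated by power iteration (assumption: the start vector is not
// orthogonal to the dominant eigenvector).
std::vector<double> firstComponent(const Matrix& z, int iterations = 500) {
    const std::size_t n = z.size(), p = z[0].size();
    Matrix r(p, std::vector<double>(p, 0.0));
    for (std::size_t a = 0; a < p; ++a)
        for (std::size_t b = 0; b < p; ++b) {
            for (std::size_t i = 0; i < n; ++i) r[a][b] += z[i][a] * z[i][b];
            r[a][b] /= static_cast<double>(n - 1);
        }
    std::vector<double> v(p, 1.0 / std::sqrt(static_cast<double>(p)));
    for (int it = 0; it < iterations; ++it) {
        std::vector<double> w(p, 0.0);
        for (std::size_t a = 0; a < p; ++a)
            for (std::size_t b = 0; b < p; ++b) w[a] += r[a][b] * v[b];
        const double norm =
            std::sqrt(std::inner_product(w.begin(), w.end(), w.begin(), 0.0));
        if (norm == 0.0) break;
        for (std::size_t a = 0; a < p; ++a) v[a] = w[a] / norm;
    }
    // Fix the sign so that the loadings sum to a non-negative value.
    if (std::accumulate(v.begin(), v.end(), 0.0) < 0.0)
        for (double& x : v) x = -x;
    return v;
}

// Score of each record on the first principal component.
std::vector<double> pcaScores(const Matrix& data) {
    assert(data.size() >= 2 && !data[0].empty());
    const Matrix z = standardize(data);
    const std::vector<double> v = firstComponent(z);
    std::vector<double> scores(z.size(), 0.0);
    for (std::size_t i = 0; i < z.size(); ++i)
        scores[i] = std::inner_product(z[i].begin(), z[i].end(), v.begin(), 0.0);
    return scores;
}

// Record indices ordered from highest to lowest score.
std::vector<std::size_t> rankOrder(const std::vector<double>& scores) {
    std::vector<std::size_t> idx(scores.size());
    std::iota(idx.begin(), idx.end(), std::size_t{0});
    std::stable_sort(idx.begin(), idx.end(),
                     [&scores](std::size_t a, std::size_t b) { return scores[a] > scores[b]; });
    return idx;
}
```

Calling pcaScores on a records-by-variables matrix and passing the result to rankOrder gives the record indices from highest to lowest score. Standardizing first keeps variables measured on large scales, such as cost, from dominating the component.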
Results on the sample data suggest that this method offers an objective way to
evaluate and interpret the data, giving a manager accurate information on which to
base effective decisions. Because access to actual data sets was limited, the
author could not test the model extensively. Further tests and enhancements
are suggested as future work.
References
Ahamad, B. An Analysis of Crimes by the Method of Principal Components. Appl.
Statist., 16, 17-35, 1967.
Anderson, T. W. Asymptotic Theory for Principal Component Analysis. Annals of
Mathematical Statistics, 34, 122-148, 1963.
Anderson, T. W. An Introduction to Multivariate Statistical Analysis, New York: John
Wiley, 1984.
Bailey, D. W. A Comparison of Genetic and Environmental Principal Components of
Morphogenesis in Mice. Growth, 20, 63-74, 1956.
Beliveau, J. G., S. Cogan, G. Lallement and F. Ayer. Iterative Least-Squares Calculation
for Modal Eigenvector Sensitivity. AIAA Journal, Vol. 34 (2), 385-391, 1996.
Ben-Israel, A. and P. D. Robers. A Decomposition Method For Interval Linear
Programming. Management Science, Vol. 16, 374-387, 1970.
Ben-Israel, A. and A. Charnes. An Explicit Solution of A Special Class of Linear
Programming Problems. Operations Research 16, 1166-1175, 1968.
Bru, M. F. Diffusions of Perturbed Principal Component Analysis. J. of Multivariate
Analysis 29, 127-136, 1989.
Cahalan, R. F. EOF Spectral Estimation in Climate Analysis. Second International
Meeting on Statistical Climatology, Preprints, 4.5.1-4.5.7, 1983.
Caussinus, H. and L. Ferre. Comparing the Parameters of a Model for Several Units by
Means of Principal Component Analysis. Computational Statistics & Data
Analysis. 13, 269-280, 1992.
Chang, C. C. Application of Principal Component Analysis to Multi-Disk Concurrent
Accessing. BIT 28, 205-214, 1988.
Charnes, A. and W. W. Cooper. An Explicit General Solution in Linear Fractional
Programming. Naval Research Logistics Quarterly 20, 449-467, 1973.
Charnes, A. and W. W. Cooper. Programming with Linear Fractional Functionals. Nav.
Res. Log. Quart. 9, 181-186, 1962.
Cochran, R. N. and Horne, F. H. Statistically weighted principal component analysis of
rapid scanning wavelength kinetics experiments. Anal. Chem., 49, 846-853, 1977.
Dauxois, J., A. Pousse and Y. Romain. Asymptotic Theory for the Principal Component
Analysis of a Random Function: Some Applications to Statistical Inference. J. of
Multivariate Analysis 12, 136-154, 1982.
Dawkins, B. Multivariate Analysis of National Track Records. The American
Statistician, 43, 110-115, 1989.
Duchene, J. and S. Leclercq. An Optimal Transformation for Discriminant and Principal
Component Analysis. IEEE Transactions on Pattern Analysis and Machine
Intelligence. 10, 978-983, 1988.
George, K. M. Computer Method for Sustainability Ranking. AFOSR report, 1996.
Gnanadesikan, R. and J. R. Kettenring. Robust estimates, residuals, and outlier detection
with multiresponse data. Biometrics, 28, 81-124, 1972.
Golub, G. H. and C. F. Van Loan. Matrix Computations. Second edition. The Johns
Hopkins University Press, Baltimore, 1993.
Green, B. F. Parameter sensitivity in multivariate methods. J. Multiv. Behav. Res., 12,
263-287, 1977.
Hawkins, D. M. The detection of errors in multivariate data using principal component
analysis. Appl. Statist., 69, 340-344, 1974.
Hillier, F. S. and G. J. Lieberman. Introduction to Operations Research. Holden-Day,
Inc., Oakland, CA, 1986.
Hotelling, H. Analysis of a Complex of Statistical Variables into Principal Components.
Journal of Educational Psychology, 24, 417-441, 1933.
Jacob, B. Linear Functions and Matrix Theory. Springer-Verlag, 1995.
Jeffers, J. N. R. Two case studies in the application of principal component analysis.
Appl. Statist., 16, 225-236, 1967.
Johnson, L. W., R. D. Riess and J. T. Arnold. Introduction to Linear Algebra.
Addison-Wesley Pub., 1989.
Johnson, R. A. and D. W. Wichern. Applied Multivariate Statistical Analysis. Prentice-Hall,
Inc., Englewood Cliffs, 1982.
Jolicoeur, P. Multivariate Geographical Variation in the Wolf Canis Lupus L. Evolution,
13, 283-299, 1959.
Jolicoeur, P. and J. E. Mosimann. Size and Shape Variation in the Painted Turtle: A
Principal Component Analysis. Growth, 24, 339-354, 1966.
Jolliffe, I. T. Principal Component Analysis. Springer-Verlag, New York, 1986.
King, B. Market and Industry Factors in Stock Price Behavior. Journal of Business, 39,
139-190, 1966.
Kloek, T. and L. B. M. Mennes. Simultaneous equations estimations based on principal
components of predetermined variables. Econometrica, 28, 45-61, 1960.
Korhonen, P. J. Subjective Principal Component Analysis. Computational Statistics &
Data Analysis 2, 243-255, 1984.
Krzanowski, W. J. Sensitivity of principal components. J. Roy. Statist. Soc. B, 46,
558-563, 1984.
Krzanowski, W. J. Cross-Validatory Choice in Principal Component Analysis; Some
Sampling Results. J. Statist. Comput. Simul., Vol. 18, 299-314, 1983.
Lachenbruch, P. A. Discriminant Analysis. New York: Hafner Press, 1975.
Lee, R. C. T., Y. H. Chin and S. C. Chang. Application of Principal Component Analysis
to Multikey Searching. IEEE Transactions on Software Engineering. 2, 185-193,
1976.
Maxwell, A. E. Multivariate Analysis in Behavioural Research. London: Chapman and
Hall, 1977.
Meszaros, G. and T. Rapcsak. On Sensitivity Analysis for a Class of Decision Systems.
Decision Support Systems 16, 231-240, 1996.
Moser, K. B. Linear Models: A Mean Model Approach. Academic Press, 1996.
Rao, C. R. The Use and Interpretation of Principal Component Analysis in Applied
Research. Sankhya A, 26, 329-358, 1964.
Rao, C. R. Linear Statistical Inference and Its Applications, New York: John Wiley,
1973.
Sloan, M. A. G. Using Principal Component Analysis Prior to Agglomerative
Hierarchical Clustering Methods. Ph.D. thesis, Oklahoma State University, 1983.
Steiger, D. M. Beyond What-If: Enhancing Model Analysis In a Decision Support
System. Ph.D. thesis, Oklahoma State University, 1993.
Wold, S. Pattern Recognition by Means of Disjoint Principal Components Models. Patt.
Recog., 8, 127-139, 1976.
Zionts, S. Programming with Linear Fractional Functionals. Nav. Res. Log. Quart. 15,
449-452, 1968.
VITA
Shaokai Wen
Candidate for the Degree of
Master of Science
Thesis: APPLICATION OF PRINCIPAL COMPONENT ANALYSIS TO DECISION
SUPPORT SYSTEM
Major Field: Computer Science
Biographical: Born in Ningxing, Hunan Province, China, March 2, 1964, the eldest son
of Wen Xibin and Jiang Qinxiu. Married to Zhang Rui, August 1, 1990.
Education: Graduated in July, 1981 from Ningxing 7th High School in Hunan,
China. Received a Bachelor of Agriculture Science degree in Animal
Science from Hunan Agricultural University in May, 1985, Hunan,
China; received a Master of Animal Science degree from Beijing
Agricultural University, Beijing, China in May, 1988; received the
Doctor of Philosophy degree in Animal Breeding and Reproduction at
Oklahoma State University, Stillwater in July, 1996; completed the
requirements of the Master of Science at Oklahoma State University,
Stillwater in July, 1997.
Professional Experience: Teaching assistant from September 1988 - January
1987 in Beijing Agricultural University, Beijing, China; research
assistant from July 1985 - June 1988 in Beijing Agricultural University,
Beijing, China; animal scientist from July 1988 - July 1990 in the Chinese
Academy of Science, Beijing, China; assistant research professor from
June 1988 - December 1992 in the Chinese Academy of Science, Beijing,
China; research assistant from January 1993 - June 1996 in Oklahoma
State University, Stillwater, Oklahoma; teaching assistant from August
1996 - May 1997 in Oklahoma State University, Stillwater, Oklahoma.
Professional Organizations: Chinese Society of Animal Science; American Society of
Animal Science; Gamma Sigma Delta