



COMPARING THE PERFORMANCE OF ORDINAL LOGISTIC REGRESSION AND ARTIFICIAL NEURAL NETWORK WHEN ANALYZING ORDINAL DATA

By
AISYAH LARASATI

Bachelor of Science in Industrial Engineering
Sepuluh Nopember Institute of Technology
Surabaya, Indonesia
1999

Master of Science in Industrial Engineering
Sepuluh Nopember Institute of Technology
Surabaya, Indonesia
2003

Submitted to the Faculty of the Graduate College of the Oklahoma State University in partial fulfillment of the requirements for the Degree of DOCTOR OF PHILOSOPHY
July, 2012

Dissertation Approved:
Dr. Camille F. DeYong
Dissertation Adviser
Dr. David B. Pratt
Dr. William J. Kolarik
Dr. Melinda H. McCann
Dr. Lisa Slevitch
Outside Committee Member
Dr. Sheryl A. Tucker
Dean of the Graduate College

TABLE OF CONTENTS

I. INTRODUCTION
  1.1 Background
  1.2 Problem Statement
  1.3 Purpose
  1.4 Test Case: The Service Profit Chain in Training Restaurants
  1.5 Summary of the Research Gaps
  1.6 Organization of the Study

II. REVIEW OF LITERATURE
  2.1 Introduction
  2.2 Method for Analyzing Ordinal Data
    2.2.1 Ordinal Logistic Regression (OLR) Model
    2.2.2 Artificial Neural Network (ANN) Model
    2.2.3 Performance Metrics
    2.2.4 Statistical Test to Compare the OLR and ANN Models
  2.3 Generating Correlated Ordinal Data
  2.4 Generating Correlation Coefficient
  2.5 Training Restaurant
  2.6 The Service Profit Chain
    2.6.1 Link between Employee and Customer Satisfaction
    2.6.2 Link between Customer Satisfaction and Organization’s Success Measures
    2.6.3 Link between Employee Satisfaction and Organization’s Success Measures
  2.7 Employee Satisfaction

III. RESEARCH METHODOLOGY
  3.1 Introduction
  3.2 Research Step 1: Conceptual Frameworks
  3.3 Research Step 2: Data Collection Plan
    3.3.1 Initial Instrument and Pretest
    3.3.2 Pilot Test
    3.3.3 Instrument Validity
    3.3.4 Student Instrument
    3.3.5 Instructor Instrument
  3.4 Research Step 3: Generating Simulated Data
    3.4.1 Procedure to Generate Ordinal Correlated Data
    3.4.2 Procedure to Generate Random Marginal Probabilities
    3.4.3 Procedure to Generate the Correlation Coefficient and Correlation Matrices
    3.4.4 Procedure to Validate Generated Data
  3.5 Research Step 4: Build Model
    3.5.1 Artificial Neural Network
    3.5.2 Ordinal Logistic Regression
    3.5.3 Comparing Model Performance
  3.6 Summary

IV. THE ORDINAL LOGISTIC REGRESSION AND ARTIFICIAL NEURAL NETWORK WITH ONE INPUT VARIABLE
  4.1 Introduction
  4.2 Preparation Steps
  4.3 Validating Algorithm to Generate Correlated Ordinal Data
  4.4 Scenario 1
  4.5 Scenario 2
  4.6 Scenario 3
  4.7 Misclassification Rates Comparison
  4.8 Summary

V. THE ORDINAL LOGISTIC REGRESSION AND ARTIFICIAL NEURAL NETWORK WITH THREE INPUT VARIABLES
  5.1 Introduction
  5.2 Preparation Steps
  5.3 Validating Algorithm to Generate Correlated Ordinal Data
  5.4 Scenario 1
  5.5 Scenario 2
  5.6 Scenario 3
  5.7 Misclassification Rates Comparison
  5.8 Choosing a Model
  5.9 Summary

VI. SUMMARY, CONCLUSION AND FUTURE WORK
  6.1 Summary
  6.2 Conclusion
  6.3 Future Work

REFERENCES
APPENDICES

LIST OF TABLES

Table 1.1 Classification of measurement scale
Table 2.1 Comparison of training restaurants and general type of restaurants
Table 2.2 Constructs of employee satisfaction
Table 2.3 Constructs of student performance
Table 3.1 Reliability Alpha on pilot data
Table 3.2 Student questionnaire items
Table 3.3 Instructor questionnaire items
Table 3.4 The distribution of random marginal probabilities
Table 4.1 Correlation coefficient between student overall satisfaction and performance
Table 4.2 Cross tabulated data from Fajar Teaching Restaurant
Table 4.3 Cross tabulated on the first generated correlated ordinal data
Table 4.4 Mean rank for student overall satisfaction and performance
Table 4.5 Mean rank test statistics
Table 4.6 Descriptive statistics of misclassification rates from Scenario 1 (one input variable)
Table 4.7 Descriptive statistics of misclassification rates from Scenario 2 (one input variable)
Table 4.8 The rules to generate marginal probabilities
Table 4.9 Student performance marginal probability distributions
Table 4.10 Student overall satisfaction marginal probability distributions
Table 4.11 Descriptive statistics of misclassification rates from Scenario 3 (one input variable)
Table 5.1 Gamma correlation coefficient from Taylors’ data
Table 5.2 Gamma correlation coefficient from Fajar Teaching Restaurant data
Table 5.3 Cross tabulated data of “understanding what to do”
Table 5.4 Cross tabulated data of “opportunity to develop skill”
Table 5.5 Cross tabulated data of “enthusiastic feeling”
Table 5.6 Mean rank for student overall satisfaction and its three determinants
Table 5.7 Mean rank test statistics
Table 5.8 Gamma correlation coefficients between variables used in Scenario 1 (three input variables)
Table 5.9 The descriptive statistics of misclassification rates for Scenario 1 (three input variables)
Table 5.10 Gamma correlation coefficients between variables used in Scenario 2 (three input variables)
Table 5.11 The descriptive statistics of misclassification rates for Scenario 2 (three input variables)
Table 5.12 The rules to generate marginal probabilities
Table 5.13 Marginal probability distributions input variable 1
Table 5.14 Marginal probability distributions input variable 2
Table 5.15 Marginal probability distributions input variable 3
Table 5.16 Marginal probability distributions output variable
Table 5.17 Generated correlated coefficient intervals
Table 5.18 The descriptive statistics of misclassification rates for Scenario 3 (three input variables)
Table 6.1 Summary of the best guessestimate models

LIST OF FIGURES

Figure 2.1 Information processing in ANN with backpropagation algorithm
Figure 2.2 A confusion matrix representation for seven class classification problem
Figure 2.3 A taxonomy of statistical test in comparing algorithms
Figure 2.4 The links in the Service Profit Chain
Figure 3.1 The framework of the research methodology
Figure 3.2 The conceptual framework of the study
Figure 4.1 Marginal probability distributions of input and output data in Taylors’ Dining (one input variable)
Figure 4.2 Marginal probability distributions of input and output data in FTR (one input variable)
Figure 4.3 The distribution of the generated correlation coefficients
Figure 5.1 Marginal probability distributions from Taylors’ data
Figure 5.2 Marginal probability distributions from FTR data set

CHAPTER I
INTRODUCTION

1.1 Background

Service industries measure their performance with respect to customer satisfaction using multiple techniques, including customer surveys (Allen & Seaman, 2007). Surveys are also used to measure employee satisfaction, job performance and other facets of the internal service quality of an organization. Typically, the types of information collected from surveys are related to descriptive, behavioral and attitudinal attributes of the respondents (Rea & Parker, 2005). Socioeconomic data of the respondents (such as income, age, and ethnicity) is an example of descriptive information collected from a survey. Survey questions about respondent behavior, such as utilization of various resources and facilities, are designed to document the respondents’ patterns of behavior while they are using the facilities. The respondents’ stated attitudes about various conditions related to the services they used are also commonly found in survey studies.
Organizations use this descriptive, behavioral, and attitudinal information from surveys to determine what types of services should be offered or withdrawn, which factors most strongly govern respondents’ satisfaction with the provided services, how various work environments influence productivity, and many other essential decisions. Thus, survey research has become critically important for business decision-making (Allen & Seaman, 2007; Rea & Parker, 2005).

Stevens’ classification of measurement scales (Stevens, 1946) sorts data collected from surveys into four types of scales: nominal, ordinal, interval and ratio. A nominal scale refers to categories that have no inherent ordering, such as gender (male and female), favorite colors (blue, white, and black), and seasons (fall, spring, summer and winter). An ordinal scale preserves rank ordering in the categories, but no measure of distance between categories is possible because the distances between categories are not necessarily equal. Some examples of ordinal data are variables describing stages of cancer (I, II, III), the quality of waiting service (poor, acceptable, excellent), and customer satisfaction with a service delivery (very dissatisfied, dissatisfied, neutral, satisfied, and very satisfied). The distance between “neutral” and “satisfied” may not be the same as the distance between “satisfied” and “very satisfied.” An interval scale has the same characteristics as an ordinal scale, but the distances between any points are consistent. However, an interval scale does not have an absolute zero. An example of interval data is temperature in degrees Fahrenheit (°F), since 0°F is arbitrary and negative values can be used. Ratio data has all the characteristics of interval data and, in addition, has an absolute zero. Examples of ratio data are a person’s weight and height.
In summary, a nominal scale allows differentiation between responses by categorizing only, while an ordinal scale enables the researcher to determine the rank order of preferences without using the distance between any points in the scale. In contrast, an interval scale is able to measure the distance between responses. A ratio scale is the highest level of measurement since it has an absolute (as opposed to an arbitrary) zero point.

Stevens (1946) also outlines the statistical procedures that are permissible for each type of scale, in which the permissible statistics for each type of scale include all those of its predecessors. The permissible statistics for nominal data should be limited to the mode, the number of cases, and the contingency correlation. The permissible statistics for ordinal data include all statistics for nominal data plus the median and percentiles, while those for interval data include all the statistics for ordinal data and also allow calculation of the mean, standard deviation, and product-moment correlation. A ratio scale preserves all of the permissible statistics of the other scales while also allowing the coefficient of variation. According to Stevens (1946), performing data analysis without considering the type of measurement scale can lead to meaningless results. Table 1.1 shows Stevens’ classification of measurement scale.

The vast majority of surveys use Likert scales as the rating format (Allen & Seaman, 2007). The Likert scale is used to measure respondents’ attitudes toward a given statement. Although the Likert scale is commonly constructed as a five-point scale, some researchers recommend the use of a seven-point scale in order to achieve more reliable results (Allen & Seaman, 2007; Jamieson, 2004). Sometimes the scale is set to a four-point scale or another even number of points in order to force a respondent to make a choice by eliminating the “neutral” option.
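As a minimal sketch of the permissible-statistics idea described above, the following Python snippet treats responses to a five-point Likert item as ordered categories. The response data, the function names, and the two numeric codings are illustrative assumptions and do not come from this study. The snippet computes the mode and the median category using rank order only, and then shows that the mean depends on which rank-preserving numeric coding happens to be chosen.

```python
from collections import Counter

# Ordered categories of a hypothetical five-point Likert item (lowest to highest).
LEVELS = [
    "strongly disagree",
    "disagree",
    "neither disagree nor agree",
    "agree",
    "strongly agree",
]


def ordinal_mode(responses):
    """Most frequent category; permissible for nominal and ordinal data."""
    return Counter(responses).most_common(1)[0][0]


def ordinal_median(responses, levels=LEVELS):
    """Median category (lower median when n is even), using rank order only."""
    rank = {level: position for position, level in enumerate(levels)}
    ordered = sorted(responses, key=lambda r: rank[r])
    return ordered[(len(ordered) - 1) // 2]


def coded_mean(responses, codes):
    """Mean of numeric codes; meaningful only if the codes are truly interval."""
    return sum(codes[r] for r in responses) / len(responses)


# Made-up responses, for illustration only.
responses = [
    "agree",
    "disagree",
    "strongly agree",
    "agree",
    "neither disagree nor agree",
    "disagree",
    "agree",
]

# Two hypothetical codings that both preserve the rank order of the categories.
equal_codes = dict(zip(LEVELS, [1, 2, 3, 4, 5]))
stretched_codes = dict(zip(LEVELS, [1, 2, 3, 6, 10]))

print(ordinal_mode(responses))                  # does not depend on any coding
print(ordinal_median(responses))                # does not depend on any coding
print(coded_mean(responses, equal_codes))       # depends on the chosen coding
print(coded_mean(responses, stretched_codes))   # differs from the line above
```

For these made-up responses the mode and the median category are both “agree” under any rank-preserving coding, whereas the mean is about 3.43 with the equally spaced codes and 5.0 with the stretched codes. This is one way to see why the mean is not listed among the permissible statistics for ordinal data.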
Table 1.1 Classification of measurement scale (Stevens, 1946)

Scale      Basic empirical operation                                Permissible statistics
Nominal    Determination of equality                                Number of cases; mode; contingency correlation
Ordinal    Determination of greater than or less than               Median; percentiles; rank-order correlation
Interval   Determination of equality of intervals or differences    Mean; standard deviation; product-moment correlation
Ratio      Determination of equality of ratios                      Coefficient of variation

The Likert scale often ranges from least to most in order to capture the intensity of a respondent’s feeling toward a given item (Turk, Uysal, Hammit, & Vaske, 2011). For example, respondents are asked to indicate their degree of agreement with a particular statement, and they may express their agreement as “strongly disagree,” “disagree,” “neither disagree nor agree,” “agree,” and “strongly agree.” The response categories in the Likert scale have a rank order. Although the numbers 1, 2, 3, 4, and 5 may be assigned to the respective response categories, the distances between adjacent categories are not necessarily equal. For example, the distance between “1=strongly disagree” and “2=disagree” may not be assumed to be the same as the distance between “2=disagree” and “3=neither disagree nor agree.” Thus, the Likert scale should be categorized as an ordinal scale (Allen & Seaman, 2007; Jamieson, 2004).

Ordinal data has been widely utilized in education, health, behavioral and social studies. In the social and behavioral sciences, an ordinal scale is often used to measure attitudes and opinions.
For example, employees could be asked to rate their overall job satisfaction using ordered categories such as “strongly dissatisfied,” “dissatisfied,” “neutral,” “satisfied,” and “strongly satisfied.” This measure of overall job satisfaction is ordinal because employees who choose “satisfied” express a more positive feeling toward their job than those who choose “neutral.” The rank order is clear even though the difference between “satisfied” and “neutral” cannot be measured numerically and certainly cannot be assumed to be equal to other intervals. Ordinal data is different from interval data because the absolute distances between levels in ordinal data are unknown even though the rank order of the levels is clearly defined. Nominal and ordinal data are both categorical data, but nominal data does not involve a rank order.

In general, data analyses for nominal, interval, and ratio data are clearly defined, but this is not the case with data analysis for ordinal data. Many studies treat ordinal data as interval data (Knapp, 1990; Mayer, 1971; Velleman & Leland, 1993). Underlying this might be the fact that parametric tests with interval data are considered easier to interpret and more informative than nonparametric tests (Allen & Seaman, 2007; Chimka & Wolfe, 2009). However, treating ordinal data as interval data may misrepresent the results and lead to poor decision-making, since such treatment introduces substantial bias by assuming equal intervals between points of the ordinal data and by relying on other distributional assumptions that are rarely fulfilled by ordinal data.

A study conducted by Hastie, Botha, and Schnitzler (1989) shows that treating ordinal output data as interval data results in a statistically significant interaction between independent variables. However, when this ordinal output data is analyzed as ordinal data, the interaction is not statistically significant.
Therefore, many researchers recommend against analyzing ordinal data as interval data, in order to better detect meaningful effects of input variables on the response variable. Thus, analyzing ordinal data using methods that maintain the rank order of ordinal data without assuming equal distances between categories provides more valuable and useful results for further investigation and decision-making (Gregoire & Driver, 1987; Jamieson, 2004; Mayer, 1971).

Multiple statistical methods are available to analyze ordinal data. These methods can follow a model-based approach, such as models for cumulative response probabilities, or a non-model-based approach, such as a nonparametric method based on ranking. A model-based approach is commonly used to test causal relationships, while a non-model-based approach tends to be used for making inferences related to association/correlation measures. A common model-based method used to analyze ordinal data is the Ordinal Logistic Regression (OLR) model (further explanation of the OLR model is presented in subsection 2.2.1). Several approaches are available to build the OLR model, such as the cumulative link model, the adjacent categories model, and the continuation ratio model. The most commonly used among these three approaches is the cumulative OLR model (Agresti, 2010; Tutz, 2012).

In addition to statistical models, several machine-learning algorithms are also available to analyze ordinal data, such as an Artificial Neural Network (ANN) model, a decision tree model, and a Support Vector Machine (SVM) model. An ANN model is a computational model that is inspired by the properties of biological neurons. The term ANN model in this study refers to a multilayer perceptron (MLP) ANN, an artificial neural network that is comprised of input, hidden and output layers.
The hidden layer is the key to an ANN model since it contains the summation and transfer function of each node (further explanation of ANN is presented in subsection 2.2.2). A decision tree model presents a classification rule as a tree in which different subsets of variables are used at different levels of the tree. The classification rule in the tree defines the decision boundary. An SVM model functions as a pattern classification method by finding the optimal separating hyperplane for either linear or nonlinear data. The optimization process in an SVM model relies on the kernel function used in the model.

Among these three techniques (ANN, decision tree and SVM), the ANN model has the most in common with the regression model. Comparisons between the ANN model and the logistic regression model for classification or prediction problems with binary response data have been conducted extensively (Deng, Chen, & Pei, 2008; Karlaftis & Vlahogianni, 2011; Paliwal & Kumar, 2009). However, none of the previous studies have compared the performance of OLR and ANN models in analyzing ordinal data.

1.2 Problem Statement

Analyzing ordinal data using methods that maintain the rank order of ordinal data and do not assume equal distances between categories promises meaningful and useful results for decision-making. Although some previous studies have applied the OLR or ANN models to analyze ordinal data, the existing research focuses on comparing the performance of the logistic regression and ANN models for classification of binary responses. None of the existing studies compares the performance of the ANN and OLR models in analyzing ordinal data under different marginal probability distributions and correlation coefficients.
Understanding the impact of different combinations of marginal probability distributions and correlation coefficients on ANN and OLR performance could help provide guidance for selecting an appropriate model and parameters in order to build a better model to analyze ordinal data. This can, in turn, lead to more efficient and value-added decision-making.

1.3 Purpose

The purpose of this study is to compare the application of the OLR and ANN models to analyze ordinal data using different scenarios by varying the combinations of the marginal probability distribution and correlation coefficients. This study attempts to provide guidance for model selection for various combinations of marginal distribution and correlation coefficient when analyzing ordinal data. The specific objectives of this study are to:

1. Develop the OLR and ANN models to represent a relationship between one predictor and one response variable with various combinations of marginal probability and correlation coefficients.
2. Develop the OLR and ANN models to represent a relationship between three predictors and one response variable with different combinations of marginal probabilities and correlation coefficients.
3. Compare the models’ accuracy.
4. Evaluate the models and summarize the results for use in model selection for each scenario.

1.4 Test Case: The Service Profit Chain in Training Restaurants

In order to compare the performance of the Artificial Neural Network (ANN) and Ordinal Logistic Regression (OLR) models in analyzing ordinal data, data is collected from two training restaurants by using student satisfaction surveys and instructor evaluations of student job performance. The collected data is used as the source to determine marginal probabilities and correlation coefficients for simulations. Two groups of data are generated in the simulations. The first group of data consists of two variables (one input and one outcome variable).
The input variable is the instructor evaluation of student job performance, while the output variable is the student overall satisfaction based on student attitudes and perceptions. The second group of data consists of four variables (three input variables and one outcome variable), namely three determinants of student satisfaction and the student overall satisfaction. Both the OLR and ANN models are built using each data set generated from the simulation and each data set collected from the survey. Finally, this study compares the misclassification rate (the proportion of disagreement between the predicted outcome and the actual outcome) resulting from the OLR and ANN models.

The service sector has been growing rapidly in the past two decades. One of the largest private-sector employers in the United States is the restaurant industry. This industry provides many career opportunities for college students pursuing degrees in hospitality and restaurant management, as well as in the culinary arts. Currently, there are approximately 261 schools that offer degrees in the culinary arts and culinary management in the United States (Hertzman & Ackerman, 2010). As of June 2011, the Accreditation Commission for Programs in Hospitality Administration (ACPHA) has granted accreditation for 55 hospitality programs in the US (chrie.org, 2012). One of the most important facilities in those programs is the training restaurant, since the learning process in the training restaurant improves the skill and critical thinking required for the restaurant industry (Gustafson, Love, & Montgomery, 2005).

The case study for this research uses the service-profit chain framework as a platform to build OLR and ANN models. The Service Profit Chain (SPC) is a comprehensive framework of the relationships among employees, customers, and profitability introduced by Heskett, Jones, Loveman, Sasser Jr, and Schlesinger (1994).
The framework links employee satisfaction with the value of the product and service delivered to create customer satisfaction, and then assesses the effect on profitability. The information gained from examining the internal links of the SPC concept in a training restaurant, which involves student satisfaction and job performance during the learning process in the training restaurant, can provide valuable input to improve restaurant performance and customer satisfaction. Although the training restaurant has an important role in the effectiveness of hospitality and culinary programs in preparing students to enter the restaurant industry, this type of training facility has received less attention in the literature (Alexander, Lynch, & Murray, 2009; Nies, 1993). Thus, this exploratory study may help add to the body of knowledge concerning the utilization of training restaurants in education.

1.5 Summary of the Research Gaps

Ordinal data is rank-ordered data commonly used in social and behavioral studies as well as in educational and health studies. This type of data is different from interval data because the distances between categories are not necessarily equal. Ordinal data is also different from nominal data because of its rank-ordered property. Despite the distinctive properties of ordinal data, many studies continue analyzing ordinal data using methods that only work properly with interval or nominal data (Agresti, 2010; Hastie et al., 1989; Mayer, 1971).

In recent years, regression and ANN models have been considered competing model-building techniques in the literature. Many studies have been conducted to compare and contrast the use of regression and ANN models in the area of prediction and classification problems (Deng et al., 2008; Karlaftis & Vlahogianni, 2011; Luengo, García, & Herrera, 2009; Paliwal & Kumar, 2009). However, none of those studies focus on the use of the OLR and ANN models as model-building techniques for ordinal data.
This study compares the performance of the OLR and ANN models by using survey data collected from two training restaurants and artificial data generated through simulation. Artificial data is randomly generated based on marginal probabilities and correlation coefficients. Although some studies that compare regression and ANN models also use simulation to generate data, none of them generates data as correlated ordinal data. Instead, a random uniform distribution is utilized (Cardoso & Da Costa, 2007; Jianlin, Zheng, & Pollastri, 2008).

This study builds the OLR and ANN models to explore two relationships in the internal link as explained in the Service Profit Chain (SPC) concept. The case study uses the internal link of the SPC because this link reflects the effectiveness of the learning process in the training restaurant. Also, the number of previous studies that explore the internal link of the SPC is much smaller than that of studies which explore the external link. The internal links comprise 1) the relationship between employee satisfaction and employee performance and 2) the relationship between employee satisfaction and the determinant factors of employee satisfaction, such as clarity of job descriptions, self-motivation, reward, recognition, and many others. Currently, no study has been conducted to compare the OLR and ANN by testing the internal links of the SPC in a training restaurant setting.

1.6 Organization of the Study

Chapter I delivers an overview of the main topic under study and the rationale for the need of such a study. The problem statement, purpose, test case for the study, and the research gaps that the study aims to fill are also stated. Chapter II provides a review of literature relevant to the development of the study. The methodology and procedures used in the study, including the process for developing the instruments used to collect data, are presented in Chapter III.
Chapter IV provides the process used to compare the OLR and ANN models with one independent variable and presents the results gained from the comparison. The chapter also explains the simulation process used to generate data with specific marginal probabilities and correlation structure. The results of comparing OLR and ANN models with three independent variables are presented in Chapter V. The last chapter, Chapter VI, contains a summary, conclusions, and recommendations for future research.

CHAPTER II

REVIEW OF LITERATURE

2.1 Introduction

The first part of this chapter explains the two methods used to analyze ordinal data: the Ordinal Logistic Regression (OLR) and Artificial Neural Network (ANN) models. The next part of this chapter presents the methods used in the study to perform simulations needed to generate artificial data. It also includes the relevant correlation setups, including detailed algorithms used to generate random marginal probabilities, correlation matrices, and correlated ordinal data. The performance metrics and hypothesis testing used to compare the OLR and ANN models are also explained. The last section provides a review of relevant literature about the structure and function of training restaurants, the service-profit chain (SPC), and employee satisfaction, which provide the research framework for the case study.

2.2 Methods for Analyzing Ordinal Data

An ordinal scale is commonly used to gather data about subjective responses in many behavioral studies. For example, some studies explore employee and customer satisfaction and their determinants. Although the variables are measured in ordinal scales, some researchers tend to treat them as continuous variables and to analyze them using linear regression models.
For instance, Eskildsen and Nussler (2000) built a linear regression model to predict employee satisfaction in several companies in Denmark, whilst Gustafsson and Johnson (2004) applied a linear regression model to determine attribute importance in a service satisfaction model. Analyzing ordinal data using any model that assumes equal distances between categories of such data may produce meaningless results (Agresti, 2010; Mayer, 1971; Tutz, 2012).

The Ordinal Logistic Regression (OLR) and Artificial Neural Network (ANN) models are two analytical methods which are appropriate for analyzing ordinal data. Compared to the ANN model, the OLR is easier to interpret and can be statistically tested. On the other hand, the ANN has a higher capability to deal with any nonlinear functions and any data distribution as well as multicollinearity within input variables (Lin, 2007). Many studies that compare statistical methods and the ANN model to predict overall customer or employee satisfaction show that the ANN model results in a lower standard deviation and misclassification rate than statistical methods (West, Brockett, & Golden, 1997; Gronholdt & Martensen, 2005). However, all of those studies treat the respondents' responses either as interval or nominal data, although the responses are measured with Likert-type scales. Ignoring the rank order of ordinal data by treating such data as nominal scale, or assuming equal distances between categories of ordinal data in order to analyze such data as interval data, may lead to meaningless findings (Ananth & Kleinbaum, 1997; Jamieson, 2004; Tutz, 2012).

2.2.1 Ordinal Logistic Regression (OLR) Model

Regression modeling is a model-based approach that is useful to investigate the relationship between multiple independent variables and a dependent variable, as well as to examine the effect of independent variables on a dependent variable (Chen & Hughes, 2004).
Linear regression and logistic regression are two common regression models used in many previous studies. The decision to choose either linear regression or logistic regression depends on the measurement scale of the dependent variable. When a dependent variable is on a continuous scale, a linear regression is more appropriate. On the other hand, a logistic regression performs better with binary variables. However, a logistic regression model should not be used to analyze ordinal data, since this model attains only 50% to 75% of the asymptotic relative efficiency (the limit of the ratio of the sample size required) compared to an ordinal logistic regression (with a cumulative-logit link) for a five-level category dependent variable (Armstrong & Sloan, 1989).

An Ordinal Logistic Regression (OLR) model is an extension of a logistic regression that is capable of handling data on ordinal scales. Basically, a logistic regression is used to investigate the relationship between independent and dependent variables, in which the dependent variable is a binary/dichotomous variable. However, a logistic regression can be modified to analyze nominal or ordinal data by changing the link function from simple logistic to cumulative logits (Lawson & Montgomery, 2006). Thus, when a dependent variable is on an ordinal scale, the use of an ordinal regression is more appropriate than a multiple regression (Lundahl, Vegholm, & Silver, 2009; McCullagh, 1980).

Other than the OLR, Clogg and Shihadeh (1994) explain that the log-linear model and measures of association are also appropriate methods to analyze ordinal data. These three methods produce similar results, since all of these methods maintain the rank order of the ordinal data and do not assume equal distances between categories of such data.
However, when ordinal data is analyzed by using a method that does not consider the rank order of the data, such as a logistic regression model, differences in the results may occur (Clogg & Shihadeh, 1994; Tutz, 2012).

Several cumulative link functions are available to build an OLR model, such as the cumulative logit, probit, cauchit, complementary log-log, and the related log-log link (Agresti, 2010). The decision to choose one link over the others depends upon the distribution of the dependent variable. The most commonly used link function in the OLR model is the cumulative logit (Clogg & Shihadeh, 1994; Fullerton, 2009). When an OLR model with the cumulative logit link is applied to a dependent variable with k levels, the model incorporates k − 1 logits into a single model. Thus, the function can be written as:

$$\operatorname{logit}[P(y_i \le j \mid x_i)] = \ln\left[\frac{P(y_i \le j \mid x_i)}{1 - P(y_i \le j \mid x_i)}\right] = \alpha_j + \beta' x_i \tag{2.1}$$

where j = 1, …, k − 1, $\beta$ indicates the effect of the independent variables, $x_i$ denotes the column vector of the values of the independent variables, and $y_i$ denotes the response level of the dependent variable. Based on Equation 2.1, the effect of $\beta$ is the same for each cumulative logit.

If $\pi_1, \ldots, \pi_k$ denote the marginal probabilities of the k levels of a dependent variable, then the cumulative logit can be determined as:

$$\operatorname{logit}[P(Y \le j)] = \ln\left[\frac{\pi_1 + \cdots + \pi_j}{\pi_{j+1} + \cdots + \pi_k}\right], \quad j = 1, \ldots, k - 1. \tag{2.2}$$

The cumulative logit link is a symmetric function; thus, this link is preferred when the ordinal data of the response variable is evenly distributed among all category levels. If the ordinal data being analyzed tend to be distributed on the higher response levels, such as 'very satisfied' on a satisfaction rating, the complementary log-log link function is generally used to build the OLR model (Chen & Hughes, 2004). The complementary log-log link function can be written as:

$$\ln\{-\ln[1 - P(Y \le j)]\} = \alpha_j + \beta' x_i. \tag{2.3}$$

With the complementary log-log link function (shown in Equation 2.3), P(Y ≤ j) moves toward 1.0 at a higher rate than it moves toward 0.0 (Chen & Hughes, 2004).
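The two link functions can be illustrated numerically. The following is a minimal sketch, not the procedure used in this study: it computes the cumulative logits of Equation 2.2 and the left-hand side of Equation 2.3 directly from a set of marginal probabilities. The function names and the five-level marginal distribution are hypothetical values chosen only for illustration.

```python
import math


def cumulative_probs(marginals):
    """Return P(Y <= j) for j = 1..k-1 from marginal probabilities pi_1..pi_k."""
    running_total = 0.0
    cumulative = []
    for p in marginals[:-1]:  # the last level is skipped because P(Y <= k) = 1
        running_total += p
        cumulative.append(running_total)
    return cumulative


def cumulative_logits(marginals):
    """Cumulative logit link: ln[P(Y <= j) / (1 - P(Y <= j))]."""
    return [math.log(c / (1.0 - c)) for c in cumulative_probs(marginals)]


def complementary_log_log(marginals):
    """Complementary log-log link: ln{-ln[1 - P(Y <= j)]}."""
    return [math.log(-math.log(1.0 - c)) for c in cumulative_probs(marginals)]


# Hypothetical five-level satisfaction item skewed toward the higher levels.
marginals = [0.05, 0.10, 0.15, 0.30, 0.40]
for j, (logit, cll) in enumerate(
    zip(cumulative_logits(marginals), complementary_log_log(marginals)), start=1
):
    print(f"j={j}: cumulative logit={logit:.3f}, complementary log-log={cll:.3f}")
```

Both transformations yield k − 1 increasing values, one per cut point between adjacent response levels, which is why a k-level outcome contributes k − 1 intercepts to the model.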
Therefore, the complementary log-log link function is more suitable when the outcome data is predominantly distributed on the higher levels.

To interpret OLR results, a researcher should consider the signs and coefficients used in the model. The signs represent the existence of negative or positive effects of the independent variables on the ordinal outcome. The intercept parameter, α, refers to the estimated ordered logits for the adjacent levels of the dependent variable. The coefficient, β, indicates that a one-unit change in the independent variable results in a change of the odds of the event occurring by a factor of e^β, holding other independent variables constant (Fullerton, 2009).

2.2.2 Artificial Neural Network (ANN) Model

An Artificial Neural Network (ANN) is an information-processing model that is inspired by brain function. The key characteristic of the ANN is its capability to model complexity and uncertainty. The ANN model often performs better than traditional statistical techniques, since this technique does not require the assumptions of traditional statistical techniques, such as linearity, absence of multicollinearity, and normally distributed data (Garver, 2002; Lin, 2007; Nisbet, Elder, & Miner, 2009).

ANN models are built through an iterative process in which the model learns the pattern of complex relationships between input and output. The simplest form of a neural network consists of three layers: input, hidden, and output. The first layer is comprised of one or more processing elements (PEs) that represent independent (predictor) variables, while the output layer contains one or more PEs that are referred to as dependent (outcome) variables. The output layer consists of several PEs that represent the model's classification decisions. Each PE represents one class of output. The hidden layer in the model connects the input and output layers. In general, there can be one or more hidden layers between the input and output layers.
The key element in the ANN is the connection weights (Turban, Sharda, & Delen, 2011). The connection weights represent the relative weight of each input to the next processing element in the hidden layer and output layer. The weights also express how the processing element learns the pattern of information given to the network. Other important elements in the ANN are the summation and transfer functions. The summation function calculates the weighted sum of all processing elements in the input layer that enter each processing element in the hidden layer. The summation function multiplies each input value by its weight and sums the values to get the weighted sum. This function is also referred to as an activation function of each processing element in the input layer. Based on this summation function, an ANN model may or may not use a PE in the input layer when determining a PE in the subsequent layer. In addition, the transfer function determines how the network combines input from each PE in the hidden layer that enters the PEs in the output layer.

Figure 2.1 Information processing in MLP ANN with backpropagation algorithm (Mehrotra, Mohan, & Ranka, 1997)
[Figure: PEs in the input, hidden, and output layers are linked by connection weights Wij through the summation and transfer functions; Error = Desired – Predicted Outcome.]

The focus of this study is on the multilayer perceptron (MLP) ANN, or feedforward neural network, with a backpropagation algorithm, the most commonly used neural network for classification problems (Mehrotra et al., 1997; Perlovsky, 2001). The backpropagation MLP ANN, as shown in Figure 2.1, is a type of ANN that adjusts the connection weights by minimizing the error between the desired output and the predicted outcome produced by the network. An ANN with this algorithm is trained by giving input and output data to the network.
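The roles of the connection weights, the summation function, and the transfer function can be sketched for a single forward pass through a small MLP. This is an illustrative sketch only: the logistic (sigmoid) transfer function, the omission of bias terms, the layer sizes, and every weight value below are assumptions made for the example, not settings taken from this study.

```python
import math


def summation(inputs, weights):
    """Summation function: weighted sum of the values entering one PE."""
    return sum(x * w for x, w in zip(inputs, weights))


def transfer(z):
    """Transfer function; a logistic sigmoid is assumed here."""
    return 1.0 / (1.0 + math.exp(-z))


def layer_forward(inputs, weight_rows):
    """Compute the output of every PE in a layer (one weight row per PE)."""
    return [transfer(summation(inputs, row)) for row in weight_rows]


def forward(inputs, hidden_weights, output_weights):
    """Feed the inputs through one hidden layer and then the output layer."""
    hidden = layer_forward(inputs, hidden_weights)
    return layer_forward(hidden, output_weights)


# Hypothetical network: 2 input PEs, 3 hidden PEs, 2 output PEs.
hidden_weights = [[0.4, -0.2], [0.1, 0.3], [-0.5, 0.6]]
output_weights = [[0.7, -0.1, 0.2], [-0.3, 0.5, 0.4]]

predicted = forward([1.0, 0.5], hidden_weights, output_weights)
desired = [1.0, 0.0]
# Error = Desired - Predicted Outcome, the quantity backpropagation minimizes.
errors = [d - p for d, p in zip(desired, predicted)]
print(predicted, errors)
```

Training with backpropagation would repeat this forward pass and adjust `hidden_weights` and `output_weights` in the direction that reduces the error; that update step is omitted here.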
During the training period, the network learns the data patterns between the input and output and adjusts its connection weights to minimize error. Once trained, the connection weights are retained and remain available to determine output values for any new input fed to the network.

Each PE in the hidden layer transfers several PEs from the input layer to the subsequent layers by using summation and transfer functions. Thus, the connection weights in the ANN model are difficult to explain (Dreiseitl & Ohno-Machado, 2002; Turban et al., 2011). Using more hidden layers in an ANN model results in more complex connection weights and interdependencies (West, Brockett, & Golden, 1997). Another potential drawback of an ANN model is the possibility of the model reaching a local minimum of the error rate, since the iteration process depends on the sample used to learn the pattern when the network is trained. Thus, a validation data set is needed to mitigate this potential weakness (West et al., 1997).

2.2.3 Performance Metrics

The performance of a predictive model is frequently measured in terms of an error (Mehrotra et al., 1997). The nature of the problem determines the choice of the error measure. In classification problems, such as the application of a predictive model for nominal and ordinal outcome variables, one of the common measures of error is the misclassification rate (Mehrotra et al., 1997; Webb & Copsey, 2011). A smaller misclassification rate indicates better model performance. A misclassification rate can be calculated as:

$$\text{Misclassification rate} = \frac{\text{number of misclassified samples}}{\text{total number of samples}}. \tag{2.4}$$

For an ordinal outcome variable with many categories, the misclassification rate refers to the total number of misclassified samples of the outcome categories predicted by a model versus the actual categories for all classes. Some analytical packages such as IBM SPSS Modeler and SAS Enterprise Miner present a confusion matrix to express the performance of a model being used for analysis.
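As a small worked example of Equation 2.4, the sketch below tabulates a confusion matrix and derives the misclassification rate from it. The function names and the eight observations on a three-level outcome are hypothetical; rows index the actual category and columns the predicted category.

```python
def confusion_matrix(actual, predicted, n_classes):
    """Count cases by (actual, predicted) category; categories are coded 1..n_classes."""
    matrix = [[0] * n_classes for _ in range(n_classes)]
    for a, p in zip(actual, predicted):
        matrix[a - 1][p - 1] += 1  # row = actual category, column = predicted category
    return matrix


def misclassification_rate(matrix):
    """Share of cases that fall off the main diagonal of the confusion matrix."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return (total - correct) / total


# Hypothetical data: eight cases on a three-level ordinal outcome.
actual = [1, 2, 3, 3, 2, 1, 3, 2]
predicted = [1, 2, 3, 2, 2, 1, 3, 3]

matrix = confusion_matrix(actual, predicted, n_classes=3)
rate = misclassification_rate(matrix)
print(matrix)
print(rate)
```

In this toy data set two of the eight cases are assigned to the wrong category, so the rate is 2/8. Note that the measure treats every off-diagonal cell alike, regardless of how far the predicted category lies from the actual one.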
A confusion matrix has an appearance similar to that of a contingency table. Each column of this matrix represents the number of cases in an outcome category predicted by a model, while each row represents the number of cases in an actual category. Figure 2.2 shows the confusion matrix resulting from a seven-class classification problem (the outcome variable is a seven-point Likert scale). Thus, the confusion matrix has a dimension of 7 x 7. Each cell in the confusion matrix indicates the number of misclassified or true-classified samples. When the outcome category of a sample predicted by a model is not the same as the actual category, the sample is counted as misclassified. Otherwise, the sample is counted as true-classified.

Actual Category (Class) | Outcome Category (Class) Predicted by a Model
                        | 1        | 2        | 3        | 4        | 5        | 6        | 7
1                       | True     | Misclass | Misclass | Misclass | Misclass | Misclass | Misclass
2                       | Misclass | True     | Misclass | Misclass | Misclass | Misclass | Misclass
3                       | Misclass | Misclass | True     | Misclass | Misclass | Misclass | Misclass
4                       | Misclass | Misclass | Misclass | True     | Misclass | Misclass | Misclass
5                       | Misclass | Misclass | Misclass | Misclass | True     | Misclass | Misclass
6                       | Misclass | Misclass | Misclass | Misclass | Misclass | True     | Misclass
7                       | Misclass | Misclass | Misclass | Misclass | Misclass | Misclass | True

Figure 2.2 A confusion matrix representation for a seven-class classification problem

2.2.4 Statistical Test to Compare the OLR and ANN Models

Determining which type of statistical test to use to compare two or more models is one of the critical problems in this study. Many studies that compare machine learning algorithms and statistical models use different types of statistical tests, such as McNemar's test, the Wilcoxon signed-rank test, the Quasi F test, and hypothesis testing on the average performance, to determine which model (algorithm) performs better for the problem that is being investigated (Dietterich, 1998). A taxonomy that helps to determine the statistical test to be used to compare different models (algorithms) is shown in Figure 2.3.
Figure 2.3 A taxonomy of statistical tests in comparing algorithms (Dietterich, 1998)
[Figure: A taxonomy of statistical questions. Single domain, analyze classifiers: predict classifier accuracy (large sample 1, small sample 2), choose between classifiers (large sample 3, small sample 4). Single domain, analyze algorithms: predict algorithm accuracy (large sample 5, small sample 6), choose between algorithms (large sample 7, small sample 8). Multiple domains: 9.]

This study follows condition number 5, which suggests 1) running the algorithm on each training data set of size m, 2) testing the resulting frozen model (classifier) on the testing data set, and 3) comparing the algorithms' accuracy based on the average performance (Dietterich, 1998). These suggestions are similar to the procedure undertaken in this study, which builds the ANN and OLR models using n training data sets of size m. In this study, each model is trained on each training data set and the resulting classifiers are tested on n testing data sets. The average accuracy or misclassification on the test data sets predicts the performance of the ANN and OLR models. Then, a hypothesis test on the mean is used to compare the average accuracy or misclassification obtained from the testing data sets.

One test procedure for investigating the difference between population means μ1 and μ2 is based on the assumption that the population distributions are normal and the values of the population variances are known to the investigator. However, both of these assumptions are unnecessary if the test procedure is performed on large sample sizes (Devore, 2008). When this test procedure is applied to compare the average misclassification rates from two algorithms, i.e., model 1 and model 2, the hypothesis testing can be expressed as the following:

$$H_0: \mu_1 - \mu_2 = 0, \qquad H_a: \mu_1 - \mu_2 \neq 0, \qquad z = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_1^2}{m} + \dfrac{s_2^2}{n}}}, \tag{2.5}$$

where
$\mu_1$ = the true mean misclassification rate for model 1
$\mu_2$ = the true mean misclassification rate for model 2
$\bar{x}_1$ = the sample average of the misclassification rate for model 1
$\bar{x}_2$ = the sample average of the misclassification rate for model 2
$s_1^2$ = the sample variance for model 1
$s_2^2$ = the sample variance for model 2
$m$ = the number of samples for model 1
$n$ = the number of samples for model 2

These tests are usually appropriate if both m and n are more than 40. H0 is rejected if the p-value is smaller than the desired type I error. If H0 is rejected, the result confirms that there is a statistically significant difference between the mean misclassification rates resulting from model 1 and model 2. Otherwise, H0 fails to be rejected, which means the misclassification rate resulting from model 1 is not statistically significantly different from the one resulting from model 2.

2.3 Generating Correlated Ordinal Data

In order to evaluate and compare the performance of two models with a small data size, simulation is used to generate artificial data (Ibrahim & Suliadi, 2011). Additionally, if the artificial data is generated based on a particular data set in which the responses within a specific subject (respondent) are correlated and the responses between subjects are independent, then the artificial data are classified as correlated ordinal data and are commonly generated based on the marginal probabilities and the correlation coefficient (Demirtas, 2006; Ibrahim & Suliadi, 2011; Lee, 1997). Many studies discuss procedures to generate correlated binomial data based on the marginal probabilities and correlation coefficient, but only a few algorithms are available to generate correlated ordinal data. Some methods to generate ordinal data are developed from methods to generate binomial data (Lee, 1997; Sebastian, Dominik, & Friedrich, 2011). Several algorithms have been proposed to generate correlated ordinal data.
A technique proposed by Gange (1995) uses the iterative proportional fitting algorithm for generating correlated ordinal data. This method determines the marginal joint distribution based on the log-linear model. However, this method requires intensive computation, even for a small number of variables (Demirtas, 2006; Ibrahim & Suliadi, 2011). Another method, proposed by Lee (1997), simulates correlated ordinal data using a convex combination and Archimedean copulas approach and computes the correlation coefficient using Goodman-Kruskal's coefficient. This approach does not require the same intensive level of calculation as the one suggested by Gange (1995), so any number of categories and variables can be handled easily using this method. Unfortunately, this method cannot handle a negative correlation coefficient. Biswas (2004) generates correlated ordinal data for a specific type of correlation (autoregressive-type correlation). This method requires the variables to be independent and identically distributed. Thus, this method is very restrictive.

Another algorithm that has relatively high flexibility is suggested by Demirtas (2006). This algorithm uses the generation of binary data as the intermediate step and computes correlation using Pearson's product-moment correlation coefficient. Ordinal values of the original data are collapsed into binary values. Then, iterative calculations are conducted to compute the binary correlation and convert the binary data into ordinal data based on the original marginal distribution. A shortcoming of this method is its inability to handle negative correlations.

Based on the pros and cons of the available algorithms to generate correlated ordinal data, the decision to choose one algorithm over the others depends on the type of correlation coefficient. If the simulated variables could have a negative correlation coefficient, then the method proposed by Gange (1995) is the preferred algorithm.
In circumstances when simulated variables have an autoregressive-type correlation, the algorithm introduced by Biswas (2004) is the preferred choice. Alternatively, when simulated variables have positive correlation coefficients, either the algorithm proposed by Demirtas (2006) or the one proposed by Lee (1997) can be used. The difference between these two algorithms is the type of correlation used in the simulation. Demirtas (2006) applies Pearson's product-moment correlation coefficient and Lee (1997) applies the Gamma correlation coefficient. This study uses the convex combination algorithm proposed by Lee, since this algorithm requires a simple calculation and uses the Gamma correlation coefficient, a type of correlation that is suitable for ordinal data.

Three main steps to generate correlated ordinal data using the convex combination algorithm proposed by Lee (1997) are 1) finding the extreme table, 2) finding the joint distribution, and 3) applying the inversion algorithm. The extreme table is used to check whether the preferred Gamma correlation is achievable with the given marginal probabilities. The joint distribution is determined by applying linear programming to the convex combination of the extreme table. The last step is to generate the correlated ordinal observations by applying the inversion algorithm.

2.4 Generating Correlation Coefficients

A simulation to generate correlated ordinal data requires marginal probabilities and correlation coefficients. The correlation coefficients for correlated ordinal data are commonly presented in a correlation matrix. Since a correlation matrix has to be symmetric and positive semidefinite, a certain algorithm is needed to ensure the fulfillment of this requirement when correlation coefficients are generated. Let rij be the correlation coefficient between xi and xj, where x1, x2, …, xn are random variables. A correlation matrix is a symmetric and positive semidefinite matrix form of rij.
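The requirements on a correlation matrix can be checked mechanically. The sketch below is one possible check, not code from this study: it verifies symmetry, a unit diagonal, entries within [−1, 1], and positive semidefiniteness, the last by confirming that every principal minor is nonnegative. The function names, the tolerance, and the two example matrices are hypothetical, and the determinant routine is only practical for the small matrices considered here.

```python
from itertools import combinations


def determinant(m):
    """Determinant by Laplace expansion along the first row (small matrices only)."""
    n = len(m)
    if n == 1:
        return m[0][0]
    total = 0.0
    for col in range(n):
        minor = [row[:col] + row[col + 1:] for row in m[1:]]
        total += (-1) ** col * m[0][col] * determinant(minor)
    return total


def is_correlation_matrix(r, tol=1e-12):
    """Check symmetry, unit diagonal, bounded entries, and nonnegative principal minors."""
    n = len(r)
    for i in range(n):
        if abs(r[i][i] - 1.0) > tol:
            return False
        for j in range(n):
            if abs(r[i][j] - r[j][i]) > tol or abs(r[i][j]) > 1.0 + tol:
                return False
    # A symmetric matrix is positive semidefinite when all principal minors are >= 0.
    for size in range(2, n + 1):
        for idx in combinations(range(n), size):
            sub = [[r[i][j] for j in idx] for i in idx]
            if determinant(sub) < -tol:
                return False
    return True


# Hypothetical examples: the second matrix has pairwise-valid entries but is not
# positive semidefinite, so it cannot be the correlation matrix of any variables.
valid = [[1.0, 0.5, 0.2], [0.5, 1.0, 0.3], [0.2, 0.3, 1.0]]
invalid = [[1.0, 0.9, -0.9], [0.9, 1.0, 0.9], [-0.9, 0.9, 1.0]]
print(is_correlation_matrix(valid), is_correlation_matrix(invalid))
```

The second example shows why each coefficient cannot simply be drawn independently: every entry lies in [−1, 1], yet the three coefficients are mutually inconsistent.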
All entries in a correlation matrix have a value in [−1, 1], and the diagonal entries are equal to one.

One method to generate correlation matrices is by randomly generating correlation matrices without considering particular settings (Budden, Hadavas, Hoffman, & Pretz, 2007; Joe, 2006; Olkin, 1981). In this method, correlation matrices are randomly generated based on the upper and lower bounds set for each entry, which are not consistently [−1, 1], in order to guarantee that the matrices are positive semidefinite and their diagonal entries are equal to one. The application of this approach to generate a p-dimensional correlation matrix R enables the first entries to be independently generated in the interval [−1, 1] and the remaining entries (except the diagonal entries) to be constrained to a specific interval. This specific interval depends upon the value of the first entries and the sequence of the partial correlation being generated. Consider 4 x 4 correlation matrices. The correlation matrix is in the form of

$$R = \begin{bmatrix} 1 & r_{12} & r_{13} & r_{14} \\ r_{12} & 1 & r_{23} & r_{24} \\ r_{13} & r_{23} & 1 & r_{34} \\ r_{14} & r_{24} & r_{34} & 1 \end{bmatrix}.$$

The following procedure is the detailed formula to randomly generate 4 x 4 correlation matrices without considering particular settings, as suggested by Budden et al. (2007). The first step in generating correlation matrices is to generate the correlation coefficients r12, r13, and r14, which can be randomly generated ~ U(−1, 1). The second step is to determine the lower and upper limits of the other correlation coefficients in order to ensure that the generated matrices are symmetric and positive semidefinite. A symmetric matrix is positive semidefinite if and only if the matrix and all of its principal (symmetric) submatrices have a nonnegative determinant. It means that if C is a correlation matrix, then det C ≥ 0 and every submatrix of the form

$$\begin{bmatrix} 1 & r_{ij} & r_{ik} \\ r_{ij} & 1 & r_{jk} \\ r_{ik} & r_{jk} & 1 \end{bmatrix}$$

is also a correlation matrix for i, j, k ∈ {1, 2, 3, 4}, with no two of i, j, and k equal.
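The two-step idea can be sketched in code. The following is a simplified illustration in the spirit of the procedure described above, not a reproduction of the formulas in Budden et al. (2007). The limits for r23 and r24 follow from requiring a nonnegative determinant for the corresponding 3 x 3 submatrix; the limits for r34 are derived here from the Schur complement of the upper-left 2 x 2 block, which is an assumption of this sketch rather than a formula taken from the source. All function names are hypothetical.

```python
import math
import random


def bounds_3x3(r_ij, r_ik):
    """Range of r_jk for which the 3x3 submatrix on (i, j, k) has det >= 0."""
    half_width = math.sqrt((1.0 - r_ij ** 2) * (1.0 - r_ik ** 2))
    center = r_ij * r_ik
    return center - half_width, center + half_width


def random_correlation_4x4(rng):
    """Draw one 4x4 correlation matrix entry by entry within feasible limits."""
    # Step 1: the first-row coefficients are drawn freely from U(-1, 1).
    r12 = rng.uniform(-1.0, 1.0)
    while abs(r12) >= 1.0:  # guard the division by (1 - r12^2) below
        r12 = rng.uniform(-1.0, 1.0)
    r13 = rng.uniform(-1.0, 1.0)
    r14 = rng.uniform(-1.0, 1.0)

    # Step 2: constrain the remaining coefficients.
    r23 = rng.uniform(*bounds_3x3(r12, r13))
    r24 = rng.uniform(*bounds_3x3(r12, r14))

    # Limits for r34 from the Schur complement of the block on variables 1 and 2
    # (assumption of this sketch): (r34 - c)^2 <= s33 * s44.
    d = 1.0 - r12 ** 2
    c = (r13 * r14 - r12 * r13 * r24 - r12 * r23 * r14 + r23 * r24) / d
    s33 = 1.0 - (r13 ** 2 - 2.0 * r12 * r13 * r23 + r23 ** 2) / d
    s44 = 1.0 - (r14 ** 2 - 2.0 * r12 * r14 * r24 + r24 ** 2) / d
    half_width = math.sqrt(max(s33, 0.0) * max(s44, 0.0))
    r34 = rng.uniform(c - half_width, c + half_width)

    return [
        [1.0, r12, r13, r14],
        [r12, 1.0, r23, r24],
        [r13, r23, 1.0, r34],
        [r14, r24, r34, 1.0],
    ]


rng = random.Random(2012)  # fixed seed so the illustration is repeatable
for row in random_correlation_4x4(rng):
    print([round(value, 3) for value in row])
```

Because each later coefficient is drawn inside limits that depend on the earlier ones, the entries are not identically distributed; the sketch only guarantees a valid matrix, not any particular distribution over correlation matrices.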
Three limits on the possible range of the other correlation coefficients (r23, r24, r34) are determined to ensure the symmetric and positive semidefinite requirement, in addition to the symmetry condition of a correlation matrix, rij = rji.

Another method is to randomly generate correlation matrices with particular settings, such as eigenvalues or expected values, and the distribution of entries (Marsaglia & Olkin, 1984). Among the available methods that generate correlation matrices based on the distribution of the entries, the Wishart distribution is the most commonly used distribution for generating a correlation matrix (Gentle, 2003). Although the Wishart distribution is initially known as the probability distribution of the covariance matrix, many studies have applied the Wishart distribution to generate correlation matrices, since a correlation matrix can be calculated from a covariance matrix. The elements of a correlation matrix can be obtained by dividing the (i, j) element of the covariance matrix by the square root of the product of the ith diagonal element and the jth diagonal element of the covariance matrix (Gentle, 2003). In addition, the dimension p of the correlation matrices and the mean of the randomly generated matrices should be known a priori in order to generate correlation matrices based on the Wishart distribution.

This study compares the performance of the OLR and ANN models to analyze ordinal data by fitting ordinal data collected from two training restaurants to both models. The OLR and ANN models are built to analyze the internal link of the Service Profit Chain (SPC). The concept of the SPC and the training restaurant is used as the framework and research basis for the case study. The following subsections present a review of relevant literature about the concept of training restaurants, the service profit chain, and employee satisfaction.
2.5 Training Restaurant

Training restaurants, production kitchens, and industrial training placements provide practical elements and vocational settings in food and beverage management curricula. Training restaurants function as learning environments to deliver a mix of practical leadership and management skills to students. In this type of restaurant, students not only learn food production and service, but they also learn managerial skills and techniques (Alexander, 2007). Therefore, students are required to fulfill different responsibilities (either in the kitchen area or in the service area) during their practical activities in training restaurants. For instance, a student who makes salad on one particular day may become a team captain or a waiter on another day.

Although the main purpose of training restaurants is not to generate profit, training restaurants are required to generate revenue to cover their operational costs (Alexander et al., 2009). Hospitality departments that operate training restaurants expect the training restaurants to become more cost-effective so that the department is able to reduce its subsidy and the restaurant can gradually achieve financial autonomy. Achieving a condition without any subsidy means that a training restaurant has been successful in creating a realistic learning condition, effectively mixing training and profit making. Therefore, training restaurants should not only be treated and managed as laboratories, but also as business entities. The summary of training restaurant characteristics and a comparison to profit-oriented restaurants is presented in Table 2.1.
Table 2.1 Comparisons of training restaurants and profit-oriented restaurants

             | Profit-oriented Restaurant | Training Restaurant
Main Purpose | Profit generating          | Learning media and revenue generating
Employee     | Regular paid employee      | Students
             | Relatively fixed position  | Rolling position/responsibility
             | Unpredictable turnover     | Periodic turnover rate

The unique characteristics of training restaurants may present obstacles to these restaurants gaining profit. According to Nies (1993), more than half of the training restaurants owned by various schools in the US are located inside the school area and operated within limited hours during the school's instructional period. These characteristics may create limited access for the public to dine in training restaurants. In addition, training restaurants experience frequent and predictable turnover because different groups of students operate the restaurants for each instructional period (semester/quarter). A high turnover rate requires the restaurants to find creative ways to maintain good relationships with their customers, since the familiarity that commonly supports good relationships between frontline employees and customers is diminished.

2.6 The Service Profit Chain

Heskett et al. (1994) introduce the Service Profit Chain (SPC) as a comprehensive framework of relationships between employee, customer, and profitability. In a service industry, the theory posits that internal service quality influences employee satisfaction. Internal service quality refers to employees' perceptions of their working environment, various aspects of their job, and their relationships with peers and supervisors. A satisfied employee tends to deliver better service and product value to the customer. A higher perceived service and product value leads to higher customer satisfaction. In turn, a satisfied customer tends to be a loyal customer. By having a loyal customer, an organization experiences higher growth and profit levels.
This proposition is supported by empirical studies from various service companies, such as Southwest Airlines and Taco Bell. Figure 2.4 illustrates the proposition of this concept.

Figure 2.4 The links in the Service Profit Chain (Heskett et al., 1994)
[Figure elements: Internal Service Quality, Employee Satisfaction, External Service Value, Customer Satisfaction, Customer Loyalty, Growth, Profitability; group labels: Internal/Employee, External/Customer, Organization's Success Measures]

The SPC is recognized by many researchers as the best model to guide service organizations in achieving higher organizational performance (Herington & Johnson, 2010). Many empirical studies test some of the linkages and their results strengthen specific aspects of this framework. For example, Maritz and Nieman (2008) examine the relationships between the service profit chain initiatives (represented by retention and sales volume) and service quality dimensions, whereas Gelade and Young (2005) find that customer satisfaction mediates the relationship between employee attitudes and organizational performance.

2.6.1 Link between Employee and Customer Satisfaction

Many studies demonstrate a positive correlation between customer satisfaction and employee satisfaction (Chi & Gursoy, 2009; Koys, 2003). Other studies show that the relationship between customer satisfaction and employee satisfaction gets stronger if the employees have higher loyalty (Gelade & Young, 2005; Schlesinger & Zornitsky, 1991). Furthermore, Gelade and Young (2005) suggest that positive employee experience, as demonstrated by positive attitudes such as satisfaction and commitment and by positive evaluations of organizational climate, is closely related to high levels of customer satisfaction. Thus, employees that have positive feelings about their workplace deliver positive effects when they carry out their work. This emotion is perceived and absorbed by the customer. As a result, customers experience pleasant service encounters.
2.6.2 Link between Customer Satisfaction and Organization's Success Measures

The Service Profit Chain (SPC) suggests that profit and other measures of success used in an organization are positively correlated with customer satisfaction (Heskett & Sasser, 2010). This SPC proposition is supported by other studies which find that customer satisfaction is positively correlated with nonfinancial performance (Schneider, 1991; Tornow & Wiley, 1991) and with financial performance as well (Anderson, Fornell, & Lehmann, 1994; Rust & Zaborik, 1993). The types of financial and nonfinancial measures chosen in a study depend on a company's operation. For example, Tornow and Wiley (1991) use two nonfinancial indicators (right first time, on time) and three financial indicators (contract retention, revenue retention and service gross profit) to test the relationship between customer satisfaction and organizational performance in a computer service company.

From another perspective, Anderson and Mittal (2000) suggest that the relationship between satisfaction and repurchase in the retail industry is nonlinear. In that case, dissatisfaction has a greater impact on repurchase intent than satisfaction, and the impact of satisfaction on repurchase intent is greater at the extremes. In addition, they also show that at a certain point, the increased cost to improve customer satisfaction is likely to outweigh the beneficial effects of further customer satisfaction. Therefore, diminishing returns apply when relating customer satisfaction to profitability.

2.6.3 Link between Employee Satisfaction and Organization's Success Measures

Some studies find that sales and profitability as measures of business performance have a significant relationship with employee satisfaction and employee retention. Reichheld (1993) explains that a loyal employee tends to establish good relationships with customers.
In turn, these relationships will increase customer loyalty, and as a result, increase profitability. Thus, in service industries, employee retention has a significant role because it has a positive relationship with customer retention (Reichheld, 1993). Similarly, Koys (2001) studied this relationship in some outlets of a restaurant chain and found that there was a significant relationship between employee satisfaction and financial performance.

In contrast, Bernhardt et al. (2000) and Chi and Gursoy (2009) found that there is no significant relationship between employee satisfaction and financial performance. Similarly, a study of employee perception and business performance using a meta-analysis finds that there is only a small relationship between business unit productivity and profitability, and employee engagement (Harter, Schmidt, & Hayes, 2002). This study explains that customer satisfaction mediates the relationship between employee satisfaction and profitability; thus, there is only either a small relationship or even a nonsignificant relationship between employee satisfaction and profitability (Harter, Schmidt, & Hayes, 2002).

2.7 Employee Satisfaction

Disposition (temperament), work environment and culture are key determinants of employee satisfaction according to Saari and Judge (2004). Disposition includes employee personality traits, core self-evaluation, the perception of the job itself, extraversion and conscientiousness. Even though organizations cannot directly influence employee personalities, the use of appropriate selection methods and good alignment between employees and job tasks help to ensure that people are selected for, and placed into, jobs most appropriate for them. In addition, job variation, job range/scope and autonomy of the job are required to ensure the work environment remains interesting and challenging (Love & O'Hara, 1987).
Four areas of cross-cultural differences among the employees are individualism versus collectivism, uncertainty avoidance versus risk taking, power distance or the extent to which power is unequally distributed, and achievement oriented or non-achievement oriented. Because of the potential for cross-cultural misinterpretation, managers should be aware of, and adjust for, cultural factors that influence employee attitude and satisfaction (Saari & Judge, 2004).

Another study conducted by Gostick and Elton (2007) explores the relationship between employee satisfaction and employee engagement or employee involvement in an organization. The study measures employee engagement based on employee perception toward the opportunity to do satisfying work, acceptance of opinion by the manager, feeling accepted as a team member by peers and supervisors, and the manager's recognition (Gostick & Elton, 2007).

Internal service quality is also suggested as a determinant factor of employee satisfaction (Fitzsimmons & Fitzsimmons, 2008). According to these authors, internal service quality is related to employee perceived value toward selection and development programs, rewards and recognition, access to information to serve the customers, workplace technology, and job design.

Previous studies explore the determinants of employee satisfaction in dining services by using the same constructs as employee satisfaction studies in other areas (Gazzoli, Hancer, & Park, 2010; Salanova, Agut, & Peiro, 2005; Susskind, Kacmar, & Borchgrevink, 2007; Tepeci & Bartlett, 2002). Salanova et al. (2005) use autonomy, organizational resources (such as technology and training offered), engagement, and service climate as employee satisfaction drivers.
In addition, other factors such as role conflict, physical work environment, relationship with peer workers, relationship with superior, and dispositional influence are used as employee satisfaction drivers (Gelade & Young, 2005; Martensen & Granholdt, 2001; Matzler, Fuchs, & Schubert, 2004; Maxham, Netemeyer, & Lichtenstein, 2008; Salanova et al., 2005; Timothy & Chester, 2004). Based on the previous research, this study uses the constructs shown in Table 2.2 to develop the student questionnaires used in the survey.

Table 2.2 Constructs of employee satisfaction

Dimensions               Constructs/Dimension
Internal Determinants    - Dispositional influence/self-motivation (Gelade & Young, 2005; Saari & Judge, 2004)
External Determinants    - Development of competencies, engagement (Salanova et al., 2005)
                         - Superior relationships, working condition, peer relations (Martensen & Granholdt, 2001)
                         - Job clarity, recognition, reward (Saari & Judge, 2004)

Based on all of these perspectives, the determinants of employee satisfaction can be classified into two groups: internal and external. The internal determinants come from within the employees themselves, while the external determinants are triggered by the work and organizational conditions. The internal determinants come from the subjective characteristics of employees, which can be either created before they work in the company or after they join the company. On the other hand, the external determinants come from the work environment, which can be influenced by the internal service quality, work conditions, coworkers, leaders and subordinates.

The SPC concept posits that satisfied employees tend to have a better performance when they serve a customer. In the training restaurant setting, the employees are the students, who work in the restaurant during a particular semester/quarter as part of a course.
The students who work in training restaurants are required to rotate through positions, such as serving customers, greeting and directing, and managing the operation of the day. Thus, the students are expected to understand all of the products offered and the procedures used during the operation, as well as to become skilled at delivering service and managing a restaurant (Maxham et al., 2008; Alexander et al., 2009). Based on the previous research, this study uses the constructs shown in Table 2.3 to develop the instructor questionnaires used in the survey.

Table 2.3 Constructs of student performance

Dimensions                          Constructs/Dimension
Students In-Role Performance        - Knowledge of product, knowledge of procedure (Maxham et al., 2008)
                                    - Production skill, service skill, managerial skill (Alexander et al., 2009)
Employee Extra-Role Performance     - Intention to satisfy customer, intention to go beyond duty (Maxham et al., 2008)

CHAPTER III

RESEARCH METHODOLOGY

3.1 Introduction

This chapter describes the research procedures designed to compare the performance of the Ordinal Logistic Regression (OLR) and Artificial Neural Network (ANN) models when analyzing ordinal data. In this study, the OLR and ANN models are used to test two relationships in the Service Profit Chain (SPC): the relationship between employee perceived value of the internal and external determinants of employee satisfaction and employee overall satisfaction, and the relationship between employee overall satisfaction and job performance. Before building the OLR and ANN models, the study undertakes some preparatory steps, such as checking for missing values and outliers as well as examining data distributions. Since the total number of students who work at the sampled training restaurants is relatively small (n < 30), this study generates additional correlated ordinal data using simulations to build the OLR and ANN models.
The preferred model is the one with the lowest averaged misclassification rate, which is calculated as the proportion of disagreement between the predicted outcome from a model and the actual outcome from a testing data set.

The first step in this research is to create a conceptual framework in order to analyze possible relationships between student overall satisfaction and job performance in a training restaurant by applying the internal link of the Service Profit Chain (SPC) model. This step includes exploring factors that may affect student overall satisfaction and job performance.

The second step is to design a data collection plan for use in two different training restaurants, Taylors' Dining Room at Oklahoma State University, USA, and Fajar Teaching Restaurant at Universitas Negeri Malang, Indonesia.

The next step is to generate simulated data that have marginal probability distributions and correlation coefficients similar to data collected from the surveys at both training restaurants. Additional sets of data are also generated using random marginal probabilities and correlation coefficients. Two groups of data are generated in the simulation. The first group consists of two variables (one input and one outcome variable) and refers to the effect of employee overall satisfaction on job performance. The second group consists of four variables (three input variables and one outcome variable) and refers to the effect of student perceived value of three determinants of employee satisfaction on student overall satisfaction. Data generated using simulations are split into two data sets, a training and a testing data set. Each training or testing data set consists of 50 paired data points (predictor and outcome). Both the OLR and ANN models are fitted to training data sets and used as classifiers (frozen models). The models resulting from this step are used to predict the outcome category of all predictor data points in the testing data sets.
The performance of the OLR and ANN models is measured by the misclassification rate, the proportion of disagreement between the predicted outcome from a model and the actual outcome from a testing data set. The last step in this research is to compare the mean misclassification rates resulting from the constructed OLR and ANN models. The framework of the overall methodology used in this research is presented in Figure 3.1.

Figure 3.1 The framework of the research methodology

Step 1: Develop Conceptual Framework
- Develop conceptual model of student-employee satisfaction and student performance in training restaurants
- Develop list of constructs and items that influence employee satisfaction and performance in restaurant service industry

Step 2: Design Data Collection Plan
- Design survey instruments
- Determine scale of measurement
- Develop sampling plan and survey administration plan
- Obtain IRB approval

Step 3: Generate Simulated Data
- Determine marginal probabilities and correlation coefficients
- Generate random marginal probabilities
- Generate random correlation coefficients
- Generate simulated data based on marginal probabilities and correlation coefficients

Step 4: Build Model
- Build ordinal logistic regression and artificial neural network model
- Set model evaluation metric
- Record misclassification rate for each model
- Compare misclassification rates

3.2 Research Step 1: Conceptual Frameworks

This study follows the proposition from previous literature regarding the effect of employee perceived value of the internal and external determinants of employee satisfaction on employee overall satisfaction and the effect of employee overall satisfaction on job performance. The conceptual framework of this study is illustrated in Figure 3.2.

Figure 3.2 The conceptual framework of the study

The propositions are:
1: Student perceived value of employee satisfaction determinants affects overall satisfaction.
2: Student overall satisfaction affects job performance.
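
The evaluation metric defined in Section 3.1 can be illustrated with a minimal Python sketch. It is not the software used in the study, and the function names `misclassification_rate` and `mean_misclassification_rate` are hypothetical; the sketch only assumes that predicted and actual outcomes are available as equal-length sequences of category labels on the seven-point scale.

```python
from typing import Sequence


def misclassification_rate(predicted: Sequence[int], actual: Sequence[int]) -> float:
    """Proportion of testing points whose predicted category differs from the actual one."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if len(actual) == 0:
        raise ValueError("at least one data point is required")
    disagreements = sum(1 for p, a in zip(predicted, actual) if p != a)
    return disagreements / len(actual)


def mean_misclassification_rate(rates: Sequence[float]) -> float:
    """Average the per-run rates, e.g. one rate per simulation run."""
    if len(rates) == 0:
        raise ValueError("at least one rate is required")
    return sum(rates) / len(rates)


# Illustrative values only (not data from the study): one of four predictions is wrong.
example_rate = misclassification_rate([1, 2, 3, 4], [1, 2, 3, 7])
```

Under this definition, the model with the smaller mean rate across simulation runs would be the preferred one.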
3.3 Research Step 2: Data Collection Plan

This study conducted surveys to collect data. Based on the two categories of respondents who filled out the questionnaires, two types of instruments were used in this study: a student-employee instrument and an instructor instrument. The questions used in these instruments were based on previous studies in order to ensure the questions had both validity and reliability. The student-employee instrument contained nine constructs/dimensions identified by Salanova et al. (2005), Martensen and Granholdt (2001), and Saari and Judge (2004), while the instructor instrument contained questions identified by Maxham et al. (2008) and Alexander et al. (2009). Both instruments only contained closed-ended questions. A list of constructs used in the student instrument is shown in Table 2.2.

3.3.1 Initial Instrument and Pretest

Before applying for Institutional Review Board (IRB) permission, the initial instruments were finalized. The initial instruments contained the following sections: 1) Brief explanation of the research project, including the title and the objective; 2) Confidentiality of the participants, procedure and risks, contact information and the expected length of time to take the survey; 3) Questionnaires. After the development of the initial instruments, a comprehensive discussion with faculty members was conducted to receive any feedback related to the order of the questions, language, general structure of questionnaire items, and the appearance of the instruments. The constructs and items used in the student and instructor questionnaires are listed in subsections 3.3.4 and 3.3.5, respectively. The IRB approval to conduct surveys at FTR and Taylors' Dining can be found in Appendices 2a and 2b.
Additionally, the questionnaires used in the survey at Taylors' Dining and FTR can be found in Appendices 3a, 3b, 3c, 3d, 3e and 3f.

3.3.2 Pilot Test

A pilot test of the student instrument was administered to ten students who were taking the Managing Café class in the Culinary Program at the Universitas Negeri Malang. The purpose of the pilot test was to assess the length of time needed to complete the survey as well as to conduct face validity and initial reliability analyses. The study examined reliability based on internal consistency measures using Cronbach's Alpha test. Data collected from the pilot test is shown in Appendix 2. The obtained alpha for each construct shown in Table 3.1 was higher than 0.7, the recommended value of alpha for a reliable scale (Turk et al., 2011). Thus, the alphas obtained indicated that the constructs in the instrument had acceptable inter-item reliability.

Table 3.1 Reliability Alpha on pilot data

Construct                                   Number of items    Cronbach's Alpha
Development of competencies                 6 items            0.816
Recognition                                 3 items            0.714
Working condition                           4 items            0.721
Reward                                      6 items            0.790
Engagement                                  5 items            0.850
Peer relationship                           4 items            0.777
Superior relationship                       6 items            0.855
Job clarity                                 5 items            0.741
Dispositional influence/self-motivation     3 items            0.738

3.3.3 Instrument Validity

Validity indicates the ability of an instrument to measure the intended concepts (Turk et al., 2011). The study evaluated the validity of the instrument by investigating the face validity of the instrument. Face validity, a basic index of content validity, indicates the degree to which the items in the instrument appear to measure the intended concept (Turk et al., 2011). To ensure the face validity of the instruments, the research advisor and the outside committee member provided feedback on the initial instrument. This repetitive process resulted in rewording some questions. The manager of each training restaurant also provided some comments on the instruments.
These comments created differences between the student instruments used in the Fajar Teaching Restaurant and Taylors' Dining Room. For example, there are no questions related to compensation for students at Taylors' Dining since students work in this restaurant as part of a class. However, there are two questions related to compensation for students at the other training restaurant since they are paid for their work. The manager in Taylors' Dining also recommended deleting some questions in the student instrument because of the repetitiveness of the questions. For example, the FTR survey contains four questions related to how the students were rewarded, while the Taylors' Dining survey contains only two. As a result, the student instrument used in FTR has more questions (42 questions) than the one used in Taylors' Dining (29 questions). The other difference is related to the preferred terminology for the student employee. The managers of FTR and Taylors' Dining recommended using “employee” and “student lab,” respectively, as the term that refers to student employees in the questionnaire. The pilot test revealed that the instrument did not cause problems in terms of the clarity of the questions and language.

3.3.4 Student Instrument

The student instrument measures the students' perceived value of some factors that influence their overall satisfaction as student-employees in the training restaurant. The student instrument consists of two sections. The first section contains 42 items identified by Salanova et al. (2005), Martensen and Granholdt (2001), and Saari and Judge (2004) and uses a seven-point Likert scale. In this part, '1' indicates that the student “strongly disagrees” with the statement on the instrument, while '7' represents strong agreement with the statement being asked. The statements in this section evaluate the student perceived value of internal service factors as well as external factors that may influence his/her satisfaction.
The second section intends to measure student overall satisfaction. This section has two questions and uses a seven-point Likert scale. In this section, '1' indicates that the student is “very dissatisfied” with his/her working experience during the lab session at the restaurant, while '7' indicates that the student is “very satisfied.” At the end, the student is asked to write down his/her name so that his/her responses can be paired up with the instructor's responses related to his/her job performance. Table 3.2 presents the constructs and items used in the student questionnaire. See Appendices 3a and 3c for the student instrument used in Taylors' Dining and FTR.

Target Population. The target population for this instrument was student-employees in the training restaurants. The study employed convenience sampling to collect data. The samples were all students who worked in the Taylors' Dining and FTR during the survey period.

Sample size. There were 28 student-employees at Taylors' Dining Room and 24 student-employees at Fajar Teaching Restaurant.

Survey Administration. This study administered the surveys by distributing the instrument to all student-employees before the morning briefing. After filling out the instrument, student-employees returned the instrument to the front desk.

Table 3.2 Student questionnaire items

Constructs and items*

Reward (6 items)
Q1a. I am fairly rewarded for the experience I have
Q1b. I am fairly rewarded for the stresses of my job
Q1c. I am fairly rewarded for the effort I put forth
Q1d. I am fairly rewarded for the work I have performed well
Q22. The pay system is based on achievement
Q23. The pay system is transparent

Engagement (5 items)
Q2a. When decisions about employee are made at FTR, complete information is collected for making those decisions
Q2b. When decisions about employee are made at FTR, all sides affected by the decisions are presented
Q2c. When decisions about employee are made at FTR, the decisions are made in timely fashion
Q2d. When decisions about employee are made at FTR, useful feedback about the decision and their implementation are provided
Q20. My manager involves me in planning the work of my team

Superior relationship (7 items)
Q2e. My supervisor/manager treat me with respect and dignity
Q2f. My supervisor/manager works very hard to be fair
Q2g. My supervisor/manager shows concern for my rights as a student employee
Q10. I know how the instructor evaluates my performance.
Q13. My superior is trustworthy
Q24. My supervisor gives me feedback when I perform poorly

Development of competencies (6 items)
Q4. My job provides me the opportunity to develop a wide range of my skills
Q6. My job allows me to utilize the full range of my educational training
Q7. The training I have received has prepared me well for the work I do
Q8. I believe I have the opportunity for personal development at FTR
Q30. Employees in our organization have knowledge of the job to deliver superior quality product and service
Q31. Employees in our organization have the skill to deliver superior quality work and service

Recognition (2 items)
Q5. My job is important to the success of this restaurant
Q32. Employees receive recognition for delivery of superior product and service
Q25. My supervisor gives me feedback when I do a better job than average

Working condition (4 items)
Q14. I have sufficient authority to do my job well
Q21. Work environment is pleasant
Q26. I have autonomy to decide the order of tasks I perform
Q33. Employees are provided with tools, technology and other resources to support the delivery of quality product and service

Peer relationship (4 items)
Q15. Most employees that I worked with are likeable
Q16. Employees are team oriented
Q18. People are treated with respect in my team, regardless of their job
Q19. The people in my teams are willing to help each other, even if it means doing something outside their usual duties

Job clarity (5 items)
Q3. I understand what I have to do on my job.
Q9. I am able to satisfy the conflicting demands of various people I work with.
Q11. I know what the people I work with expect of me.
Q12. I feel that I can get information needed to carry out on my job.
Q17. I have a clear understanding of the goals and objectives of this restaurant as a whole

Dispositional influence/self-motivation (3 items)
Q27. I am enthusiastic about my job
Q28. I am proud of the work I do
Q29. I feel happy when I am working hard

*Items written in Italic were removed for Taylors'

3.3.5 Instructor Instrument

Another type of instrument used in this study is the instructor evaluation. This questionnaire has three parts. The first section has seven questions identified by Maxham et al. (2008) and Alexander et al. (2009). This section aims to measure student performance during the working period at the training restaurant, which includes knowledge of product, knowledge of procedure, production skill, service skill, and managerial skill. This section uses a seven-point Likert scale, in which '1' indicates that a student has a poor performance and '7' indicates that a student has an excellent performance. The second section has two questions and aims to measure the student's intent to go beyond the minimum requirement. This second section used a seven-point Likert scale, in which '1' indicates student has very low intent to go beyond the minimum requirement and '7' indicates very high intent. The third section, which contains two questions, measures student effort level to satisfy customers based on how often this attribute is observed in the student's daily work. This section used a seven-point Likert scale, in which '1' indicates that the student never puts effort to satisfy customers and '7' indicates that the student always tries to satisfy customers.
Table 3.3 presents the constructs and items used in the instructor instrument. See Appendices 3b and 3d for the complete instructor instrument used in Taylors' Dining and FTR. The items listed in the instructor instrument were the same for both training restaurants.

Target Population. The target population for this type of instrument was the instructors who were responsible for supervising all students who operated each restaurant. The instructors evaluated the job performance of each student based on his/her production and service skill during the lab session at the training restaurant. The study conducted convenience sampling to collect instructor evaluations. The samples were all instructors who supervised the students in Taylors' Dining and FTR during the survey period.

Sample size. Only one instructor supervised each training restaurant.

Survey Administration. The study administered the survey by distributing a list of performance measurement items to the instructors during the last week of the survey period. The instructors then assessed each student's performance.

Table 3.3 Instructor questionnaire items

Constructs and items

Students In-Role Performance (8 items)
Q1a. How do you rate this student in terms of performance with regard to knowledge of the restaurant products?
Q1b. How do you rate this student in terms of performance with regard to knowledge of opening procedures?
Q1c. How do you rate this student in terms of performance with regard to knowledge of closing procedures?
Q1d. How do you rate this student in terms of performance with regard to all required tasks specified in his/her role as a student in a laboratory?
Q3a. How do you rate this student in terms of performance with regard to production skill?
Q3b. How do you rate this student in terms of performance with regard to service skill?
Q3c. How do you rate this student in terms of performance with regard to managerial skill?
Q2. How do you rate this student in terms of overall performance?

Students Extra-Role Performance (3 items)
Q4. How do you rate this student's intention to go above and beyond “the call of duty”?
Q5. How do you rate this student's intention to voluntarily do extra or non-required work in order to help customer?
Q6. How often did the student willingly go out of his/her way to make a customer satisfied?

3.4 Research Step 3: Generating Simulated Data

A common method to test the performance of statistical and/or machine learning models with a small sample size is to perform a simulation study on generated artificial data. In this study, a student's responses within the student-employee questionnaire were assumed to be correlated, while the responses between any two student surveys were assumed to be independent. Additionally, responses within an instructor's questionnaire for any given student were also assumed to be correlated, while the instructor's evaluations for different students were assumed to be independent. The simulated data were generated to mimic the students' responses and the instructors' evaluations that were collected from the surveys. Therefore, this study generated ordinal correlated data to test the performance of the OLR and ANN models in order to mimic the assumption of data collected from the survey, which were correlated within subjects and independent between subjects.

There were two groups of data sets generated in this study. The first one consisted of three predictor variables and one outcome variable, while the second one consisted of one predictor variable and one outcome variable. The first data set referred to the link between student-employee perceived value of employee satisfaction determinants and overall satisfaction, while the second data set referred to the link between student-employee overall satisfaction and job performance.
Since there were only 24 and 28 student responses collected from Fajar Teaching Restaurant and Taylors' Dining, respectively, this study used only 3 of the 42 items listed as employee satisfaction determinants as the predictor variables in the first data set. The purpose of using only three items is to follow the rule of thumb suggested by Peng, Lee, and Ingersoll (2002) and Churchill and Brown (2007) regarding the ratio of predictors to observations, which is 1:10. The study selected the input variables based on the gamma correlation coefficient as suggested by Guyon and Elisseeff (2003). The three employee satisfaction determinants that had the highest Goodman-Kruskal gamma correlation coefficient with the student-employee overall satisfaction were chosen as the predictor variables in the first data set. The study uses the Goodman-Kruskal gamma to express the correlation coefficient because this coefficient is a common measure of correlation between ordinal variables when there is a large number of ties in the data set, as in this case study (Lee, 1997). The three predictor variables for the first data set from Taylors' Dining were “understanding what to do,” “enthusiastic feeling” and “opportunity to develop skill.” The predictor variables for the data set from FTR were “understanding what to do,” “proud to be a worker” and “opportunity to develop skill.”

Three scenarios were carried out to generate each group of data sets: 1) Using marginal probabilities and correlation coefficients obtained from the Taylors' Dining Room data set; 2) Using marginal probabilities and correlation coefficients obtained from the Fajar Teaching Restaurant data set; and 3) Using randomly generated marginal probabilities and correlation coefficients to simulate a more general case.
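
The gamma-based selection of predictor variables described above can be illustrated with a minimal Python sketch. It is not the software used in the study; the function names `goodman_kruskal_gamma` and `top_predictors`, the item labels, and the response values are all hypothetical. The sketch uses the standard definition of gamma, (C - D) / (C + D), where C and D are the numbers of concordant and discordant pairs and tied pairs are ignored.

```python
from typing import Dict, List, Sequence


def goodman_kruskal_gamma(x: Sequence[int], y: Sequence[int]) -> float:
    """Gamma = (C - D) / (C + D); pairs tied on x or on y are ignored."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    concordant = discordant = 0
    for i in range(len(x)):
        for j in range(i + 1, len(x)):
            direction = (x[i] - x[j]) * (y[i] - y[j])
            if direction > 0:
                concordant += 1
            elif direction < 0:
                discordant += 1
    if concordant + discordant == 0:
        raise ValueError("gamma is undefined when every pair is tied")
    return (concordant - discordant) / (concordant + discordant)


def top_predictors(items: Dict[str, Sequence[int]], outcome: Sequence[int], k: int = 3) -> List[str]:
    """Rank candidate items by their gamma with the outcome and keep the k highest."""
    ranked = sorted(items, key=lambda name: goodman_kruskal_gamma(items[name], outcome), reverse=True)
    return ranked[:k]


# Hypothetical seven-point responses, for illustration only.
overall = [5, 6, 7, 4, 6, 7]
candidates = {
    "item_a": [5, 6, 7, 4, 6, 7],  # moves with the outcome
    "item_b": [7, 6, 5, 7, 6, 4],  # moves against the outcome
    "item_c": [5, 5, 7, 4, 6, 6],
}
selected = top_predictors(candidates, overall, k=2)
```

Because tied pairs drop out of both C and D, the coefficient remains usable when many respondents choose the same category, which is the property the text cites as the reason for choosing gamma.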
For each scenario, 1,000 simulation runs, the number of simulations suggested by Dietterich (1998), were performed in order to account for training and testing data variation and internal randomness. Each simulation run generated 100 data points, which consisted of 50 training data points and 50 testing data points. Both the Ordinal Logistic Regression (OLR) and Artificial Neural Network (ANN) models were built on the training data generated in each run. These two models were then used to predict the outcome from the predictor variables in the testing data set. The last step was to calculate the misclassification rate as the proportion of disagreement between the outcome predicted by the model and the actual outcome in the testing data set. Smaller misclassification rates were preferred.

3.4.1 Procedure to Generate Ordinal Correlated Data

This study applied the convex combination method suggested by Lee (1997) to generate correlated ordinal data based on the marginal probabilities and correlation coefficient. The simulations to generate the data were carried out using SAS 9.3. The correlation coefficient used in the simulation was expressed as Goodman-Kruskal's gamma correlation. According to Ibrahim and Suliadi (2011), the convex combination method required less computation than the iterative proportional fitting method proposed by Gange (1995) and provided more flexibility than the method provided by Biswas (2004). The convex combination method was carried out in two stages. The first stage was finding the joint distribution based on the marginal distribution and gamma correlation coefficient, and the second stage was generating ordinal random values by using the inversion algorithm. To validate the results generated from the convex combination method, this study conducted a mean rank test to compare the results with the desired marginal probabilities and correlation coefficients.
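The two stages described above can be sketched for the two-variable case. This is a minimal illustration in Python, not the SAS 9.3 program used in the study: the function names (`extreme_table`, `joint_for_gamma`, `sample_pairs`) are hypothetical, the extreme tables are built with a northwest-corner allocation, and bisection is an assumed stand-in for whatever search locates the mixing weight. The example margins and the target gamma of 0.57 are the FTR values reported in Tables 4.1 and 4.2.

```python
import bisect
import random
from itertools import accumulate


def gamma(table):
    """Goodman-Kruskal gamma of a two-way table of counts or probabilities."""
    conc = disc = 0.0
    for i, row in enumerate(table):
        for j, cell in enumerate(row):
            for lower in table[i + 1:]:
                conc += cell * sum(lower[j + 1:])  # pairs ordered the same way
                disc += cell * sum(lower[:j])      # pairs ordered oppositely
    return (conc - disc) / (conc + disc)


def extreme_table(p_row, p_col, maximal=True):
    """Joint table with the given margins and the most extreme association."""
    cols = list(p_col) if maximal else list(reversed(p_col))
    rem_r, rem_c = list(p_row), list(cols)
    table = [[0.0] * len(cols) for _ in p_row]
    i = j = 0
    while i < len(rem_r) and j < len(rem_c):
        mass = min(rem_r[i], rem_c[j])  # northwest-corner allocation
        table[i][j] = mass
        rem_r[i] -= mass
        rem_c[j] -= mass
        if rem_r[i] <= 1e-12:
            i += 1
        if rem_c[j] <= 1e-12:
            j += 1
    # For the minimal table the columns were reversed; undo that here.
    return table if maximal else [row[::-1] for row in table]


def joint_for_gamma(p_row, p_col, target, iters=60):
    """Stage 1: mix the extreme tables so the result has the target gamma."""
    t_max = extreme_table(p_row, p_col, maximal=True)
    t_min = extreme_table(p_row, p_col, maximal=False)

    def mix(lam):
        return [[lam * a + (1.0 - lam) * b for a, b in zip(ra, rb)]
                for ra, rb in zip(t_max, t_min)]

    lo, hi = 0.0, 1.0  # gamma is -1 at lambda = 0 and +1 at lambda = 1
    for _ in range(iters):
        mid = (lo + hi) / 2.0
        if gamma(mix(mid)) < target:
            lo = mid
        else:
            hi = mid
    return mix(hi)


def sample_pairs(joint, row_levels, col_levels, n, rng):
    """Stage 2: inversion sampling of (row, column) pairs from the joint table."""
    cells = [(r, c) for r in row_levels for c in col_levels]
    cum = list(accumulate(p for row in joint for p in row))
    draws = []
    for _ in range(n):
        u = rng.random() * cum[-1]
        idx = min(bisect.bisect_right(cum, u), len(cells) - 1)
        draws.append(cells[idx])
    return draws


# Margins taken from the FTR cross-tabulation (satisfaction 4-7, performance 5-7).
p_sat = [2 / 24, 5 / 24, 9 / 24, 8 / 24]
p_perf = [5 / 24, 11 / 24, 8 / 24]
joint = joint_for_gamma(p_sat, p_perf, target=0.57)
pairs = sample_pairs(joint, [4, 5, 6, 7], [5, 6, 7], n=100, rng=random.Random(1))
```

Because both extreme tables share the same margins, every mixture of them does too, so only the association changes with the mixing weight. The sketch does not cover the multivariate case, for which the study's procedure relies on linear programming.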
The procedure to find the joint distribution can be summarized as follows:
1. Identify two extreme tables, the maximal table ($\pi_{max}$, corresponding to the largest attainable correlation) and the minimal table ($\pi_{min}$, corresponding to the smallest attainable correlation).
2. Find $\lambda$ by considering the joint distribution table $\pi(\lambda) = \lambda \pi_{max} + (1 - \lambda)\pi_{min}$ with $0 \le \lambda \le 1$. As long as $\lambda$ can be identified, the joint distribution exists.
3. Find joint distributions that meet the univariate and bivariate margins using linear programming.

3.4.2 Procedure to Generate Random Marginal Probabilities

Random marginal probabilities were generated following the uniform distribution provided in IBM SPSS Statistics. Since data collected from the training restaurants were on a seven-point Likert scale, the study generated the marginal probability for each category response based on the following distribution (see Table 3.4):

Table 3.4 The distribution of random marginal probabilities

Category response level     Rule to generate marginal probability
Category level 7, p7        $p_7 \sim U(0, 1)$
Category level 6, p6        $p_6 \sim U(0, 1 - p_7)$
Category level 5, p5        $p_5 \sim U(0, 1 - (p_6 + p_7))$
Category level 4, p4        $p_4 \sim U(0, 1 - (p_5 + p_6 + p_7))$
Category level 3, p3        $p_3 \sim U(0, 1 - (p_4 + p_5 + p_6 + p_7))$
Category level 2, p2        $p_2 \sim U(0, 1 - (p_3 + p_4 + p_5 + p_6 + p_7))$
Category level 1, p1        $p_1 = 1 - (p_2 + p_3 + p_4 + p_5 + p_6 + p_7)$

where $p_i$ denotes the proportion of responses in category $i$.

The study started generating the marginal probabilities with the highest category response in order to give the higher category responses more flexibility to vary, since survey data are commonly negatively skewed. The study generated the marginal probabilities following the rules presented in Table 3.4, which were developed after discussion with the committee member to ensure random and reasonable marginal distributions in the simulated data.
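The rules in Table 3.4 amount to drawing each probability uniformly from whatever mass the higher categories have left. A minimal Python sketch follows; the study itself used the uniform generator in IBM SPSS Statistics, and the function name `random_marginals` is a hypothetical one chosen for illustration.

```python
import random


def random_marginals(levels=7, rng=None):
    """Draw marginal probabilities from the highest category down to the lowest."""
    rng = rng or random.Random()
    probs = {}
    remaining = 1.0
    for level in range(levels, 1, -1):
        # p_level ~ U(0, mass not yet assigned to higher categories)
        p = rng.uniform(0.0, remaining)
        probs[level] = p
        remaining -= p
    probs[1] = remaining  # the lowest category takes whatever mass is left
    return [probs[k] for k in range(1, levels + 1)]


marginals = random_marginals(rng=random.Random(2012))
```

The returned list is ordered from category 1 to category 7. Since the highest category is drawn first from the full unit interval, it tends to receive the most mass, which is consistent with the negatively skewed shape the rules were designed to allow.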
3.4.3 Procedure to Generate the Correlation Coefficient and Correlation Matrices

A single correlation coefficient used to correlate student-employee overall satisfaction and job performance was generated following the uniform distribution provided in IBM SPSS Statistics. The lower limit of the correlation coefficient was set at 0.27, based on the lower 95% bound of the correlation coefficient between employee satisfaction and job performance in previous research conducted by Judge, Thoresen, Bono, and Patton (2001). The upper limit was set at 0.96, the highest correlation coefficient between employee satisfaction and job performance found in the literature (Judge et al., 2001). After establishing the lower and upper limits, the correlation coefficient was generated as $r \sim U(0.27, 0.96)$.

Random correlation matrices were needed to generate data sets with three predictor variables and one outcome variable, which represented the relationship between three student-employee satisfaction determinants and overall satisfaction. To ensure that the generated random matrices conformed to the characteristics of correlation matrices (symmetric and positive semidefinite), this study generated 4 × 4 correlation matrices following the algorithm suggested by Budden et al. (2007). In this algorithm, $r_{ij}$ is the correlation coefficient between $x_i$ and $x_j$, and $x_1, x_2, \ldots, x_n$ are random variables, where $n$ is the total number of random variables. For $i = 1$ and $j = 2, 3, 4$, three correlation coefficients ($r_{12}$, $r_{13}$, and $r_{14}$) could be randomly generated using a uniform $(-1, 1)$ distribution. The other correlation coefficients ($r_{23}$, $r_{24}$ and $r_{34}$) should be randomly chosen from the intervals provided by the algorithm to ensure the symmetry and positive semidefiniteness of the matrices. Since this study found that all variables were positively correlated with each other, $r_{1j}$ was restricted to non-negative values, where $j = 2, 3, 4$.
Additionally, the minimum values of $r_{23}$, $r_{24}$ and $r_{34}$ were set at 0 and the maximum values followed the upper limit given by the algorithm.

3.4.4 Procedure to Validate Generated Data

The study performed a mean rank test, a nonparametric rank-based test for ordered categorical responses, to determine whether the generated data had an identical distribution to the original data. This test was performed to ensure that the algorithm used to generate correlated ordinal data worked properly. The study conducted the Wilcoxon test and the Mann-Whitney test to validate the generated data, since these are the most commonly used rank tests for ordered categorical data (Agresti, 2010; Leech, Barrett, & Morgan, 2011).

3.5 Research Step 4: Build Model

This study used two model-building techniques, the Ordinal Logistic Regression (OLR) and Artificial Neural Network (ANN), to test two relationships in the service-profit chain. Before constructing the OLR and ANN models, the study carried out some preparation steps, such as checking for missing values and outliers as well as calculating skewness and kurtosis. Since the total number of students who worked at the training restaurants was relatively small, this study also used data generated from a simulation to build the OLR and ANN models. The performance of the OLR and ANN models was measured by the misclassification rate. The model with the lowest misclassification rate was preferred.

3.5.1 Artificial Neural Network

Within the ANN model, a specific activation function is used to connect two layers (input and output layer) in the model. The number of nodes in the input and output layers is used to determine the number of nodes in the hidden layer. The type of activation function used in the model depends on the outcome range in the output layer. Other aspects to be considered during the building process are the network architecture and topology, and the learning algorithm. This study built the ANN models using IBM SPSS Modeler 14.2.
Based on the options available in this software package, the steps carried out to build the ANN model can be explained as follows:
1. Determine the objective: build a new model.
2. Determine the type of network architecture: a multilayer perceptron (MLP).
3. Determine the number of nodes in the hidden layer.
4. Determine stopping rules.
5. Determine the percentage of records used for an overfit prevention set.

3.5.2 Ordinal Logistic Regression

The OLR model is an extension of logistic regression used to analyze ordinal data. The OLR method is the most appropriate and practical technique to analyze the effect of independent variables on a rank-ordered dependent variable because the dependent variable cannot be assumed to be normally distributed or to be interval data (Lawson & Montgomery, 2006). The OLR model fit depends on the number of independent variables and the link function selected during the model-building phase. This study built the OLR models using IBM SPSS Modeler 14.2. Based on the options available in IBM SPSS Modeler 14.2, the steps to build the OLR models can be explained as follows:
1. Determine whether the intercept is included in the model or not.
2. Specify the link function.
3. Specify the parameter estimation method.
4. Determine the scale parameter estimation method.
5. Specify the iteration rule to control the parameters for model convergence.

3.5.3 Comparing Model Performance

This study used the misclassification rate to measure the performance of the constructed OLR and ANN models. The misclassification rate was measured as the ratio of the total number of wrong classifications across all classes to the total number of data points used in the model. For example, since the variables used in this study were on a seven-point Likert scale, the misclassification rate was calculated from the total number of wrong classifications across response categories one to seven.
A misclassification occurred when the category predicted by the model was not the same as the actual category in the testing data. A lower misclassification rate indicates better model performance. In IBM SPSS Modeler 14.2, the misclassification rate is presented along with the confusion matrix. This matrix has an appearance similar to a contingency table and contains information on the actual and predicted classifications made by the specified model. The dimension of this matrix depends on the number of actual and predicted category responses.

By using data generated from the simulation, this study built 1,000 OLR and ANN models to compare the misclassification rates obtained from each model. There were 1,000 $x_1$ and $x_2$ values calculated from each model, where $x_1$ and $x_2$ referred to the misclassification rates resulting from the OLR and ANN models, respectively. The number of misclassification rates collected from each model was large enough (n > 30) to apply the central limit theorem to test the difference between the average misclassification rates resulting from the OLR and ANN models. Based on the central limit theorem, the assumption of a normally distributed population was unnecessary since the test was performed on large sample sizes (Devore, 2008). Since the population variance was unknown, the test used the sample variance. The hypothesis test was as follows:

$$H_0: \mu_1 = \mu_2, \qquad H_1: \mu_1 \neq \mu_2,$$

and

$$Z = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_1^2}{1000} + \dfrac{s_2^2}{1000}}} \qquad (3.1)$$

where
$\mu_1$ = the true mean misclassification rate for the ordinal logistic regression model
$\mu_2$ = the true mean misclassification rate for the artificial neural network model
$\bar{x}_1$ = the sample average of the misclassification rates resulting from the OLR model
$\bar{x}_2$ = the sample average of the misclassification rates resulting from the ANN model
$s_1^2$ = the sample variance of the misclassification rates resulting from the OLR model
$s_2^2$ = the sample variance of the misclassification rates resulting from the ANN model

For α = 0.05, α/2 = 0.025, $Z_{\alpha/2} = -1.96$ and $Z_{1-\alpha/2} = 1.96$ (two-sided test).
$H_0$ is rejected if the p-value is smaller than the desired type I error (α). If $H_0$ is rejected, the study concludes that there is a statistically significant difference between the mean misclassification rates resulting from the OLR and ANN models. Otherwise, the study fails to reject $H_0$, which means the mean misclassification rate resulting from the OLR model is not statistically significantly different from the one resulting from the ANN model.

3.6 Summary

This chapter presents the detailed procedures used to compare the Ordinal Logistic Regression (OLR) and Artificial Neural Network (ANN) models for analyzing ordinal data. These procedures can be grouped into four steps. The first step is to develop the framework model. The study uses the internal links of the Service Profit Chain (SPC) as the framework to compare the OLR and ANN models. The internal links used in this study consist of two causal links: the link between employee perceived value of the internal and external determinants of employee satisfaction and employee overall satisfaction, and the link between employee overall satisfaction and job performance.

Based on the framework outlined in the previous step, the second step is to design a data collection plan. The study conducts surveys in two training restaurants, Taylors' Dining Room at Oklahoma State University, USA and Fajar Teaching Restaurant (FTR) at Universitas Negeri Malang, Indonesia. Students and instructors are the respondents for the surveys.

The third step is to generate correlated ordinal data using the simulation method proposed by Lee (1997). The simulated data are generated based upon marginal probabilities and correlation coefficients similar to those of the data collected from Taylors' Dining (scenario 1) and FTR (scenario 2), while the last simulated data have random marginal probabilities and random correlation coefficients (scenario 3). The simulated data in this study can be grouped into two sets.
The first one is needed to test the relationship between student overall satisfaction and job performance. This data set consists of one input variable and one output variable. The other one is used to test the relationship between three determinants of student overall satisfaction and the student overall satisfaction. This data set consists of four variables: three determinants of student overall satisfaction (input) and student overall satisfaction (output). For each set, the correlated ordinal data are generated from 1,000 simulation runs with 100 observations (50 training and 50 testing data points) in each run.

The last step is to build the OLR and ANN models using each training data set generated from the simulations as explained previously. The performance of the OLR and ANN models is compared based on the mean of the misclassification rates from the testing data set. The mean of the misclassification rates is calculated as the average of the proportion of disagreement between the outcome predicted by the model and the actual outcome in the testing data. A hypothesis test on the mean of the misclassification rates is used to identify conditions in which the OLR model outperforms the ANN model and vice versa.

CHAPTER IV

THE ORDINAL LOGISTIC REGRESSION AND ARTIFICIAL NEURAL NETWORK WITH ONE INPUT VARIABLE

4.1 Introduction

This chapter presents the Ordinal Logistic Regression (OLR) and Artificial Neural Network (ANN) models that were built using one input variable. The input variable in this case was the student overall satisfaction and the output variable was the student performance. The input variable was obtained from the student instrument, while the output variable was obtained from the instructor instrument. To compare the performance of the OLR and ANN models, three scenarios were designed.
The first scenario was to build both models using simulated data that had marginal probability distributions and a correlation coefficient similar to the data collected from the survey at Taylors' Dining. The second scenario was to construct both models using simulated data that had marginal probability distributions and correlation coefficients similar to the data collected from the surveys at Fajar Teaching Restaurant (FTR), while the last scenario was to build both models using randomly generated correlated ordinal data based on random marginal probabilities and correlation coefficients.

4.2 Preparation Steps

Before constructing the models, a review was performed to determine if there were any missing values in any data set. The initial check showed that there were no missing values in the data collected from either restaurant, Taylors' Dining or FTR. There were 24 and 28 student responses from FTR and Taylors' Dining, respectively. In addition, there were 24 and 28 responses received from the instructors who evaluated the student performance in each restaurant.

The study also explored the marginal probabilities of each collected data set. As shown in Figures 4.1 and 4.2, the distributions of the student overall satisfaction and student performance data from both restaurants were negatively skewed. This meant that most students rated their overall satisfaction as student lab as “neutral” or higher, and most students were assessed as having good performance or higher by the instructor. The skewness values of student overall satisfaction data collected from Taylors' Dining and FTR were −1.447 and −0.566, respectively. Additionally, the skewness values of student performance data collected from Taylors' Dining and FTR were −0.955 and −0.208, respectively. The skewness values indicated that the student overall satisfaction and performance data collected from Taylors' Dining were more negatively skewed than the data collected from FTR.
The kurtosis values of student overall satisfaction data collected from Taylors' Dining and FTR were 1.993 and −0.507, respectively. The kurtosis values indicated the “peakedness” (positive kurtosis) and flatness (negative kurtosis) of the student overall satisfaction data collected from Taylors' Dining and FTR, respectively.

[Figure 4.1 Marginal probability distributions of input and output data in Taylors' Dining (one input variable). Horizontal axis: response level (4 to 7); vertical axis: marginal probabilities (%); series: student overall satisfaction and student performance.]

[Figure 4.2 Marginal probability distributions of input and output data in FTR (one input variable). Horizontal axis: response level (4 to 7); vertical axis: marginal probabilities (%); series: student overall satisfaction and student performance.]

To be able to construct OLR and ANN models, each student's response on the overall satisfaction statement was paired with the student performance assessment by the instructor. All students in FTR put their names on the questionnaire, while seven out of twenty-eight students in Taylors' Dining did not put their names on the surveys. Thus, the study was not able to calculate the correlation coefficient for the full set of data collected from Taylors' Dining. Instead, the correlation coefficient between student overall satisfaction and student performance in Taylors' Dining was assumed to be similar to the correlation coefficient obtained from FTR. The gamma correlation coefficients between student overall satisfaction and performance based on data collected from FTR and on data collected from Taylors' Dining (excluding students' responses without a name) are 0.57 and 0.63, respectively. Thus, the correlation coefficients from both training restaurants were assumed to be comparable.
The correlation coefficient between student overall satisfaction and performance based on data collected from FTR is shown in Table 4.1. The results in Table 4.1 show that the obtained Gamma (a correlation coefficient for ordinal scale) is .570 with a significance level of .008, which means student overall satisfaction is positively correlated with student performance, assuming α = .01. On the other hand, the obtained Pearson coefficient (a correlation coefficient for interval scale) is .438 with a significance level of .032, which means that the correlation is not statistically significant at α = .01. These results indicate that treating the same data as interval rather than ordinal may result in a different correlation coefficient and significance level. The study uses the obtained Gamma correlation coefficient to generate correlated ordinal data for scenario 1 (Taylors' Dining Room's scenario) and scenario 2 (Fajar Teaching Restaurant's scenario) in order to treat the ordinal data with a relevant ordinal analysis.

Table 4.1 Correlation coefficient between student overall satisfaction and performance

Measure                              Value    Approx. Sig.
Ordinal by Ordinal: Gamma            .570     .008
Interval by Interval: Pearson's R    .438     .032
N of Valid Cases                     24

4.3 Validating Algorithm to Generate Correlated Ordinal Data

As explained in section 4.2, some students in Taylors' Dining did not put their names on the questionnaire, so their responses could not be paired with the instructor responses. This study therefore used data collected from FTR to validate the algorithm applied to generate correlated ordinal data. Cross-tabulated data from FTR and its initial simulated data set are shown in Tables 4.2 and 4.3. The results in Tables 4.2 and 4.3 show by inspection that the difference between the marginal probabilities for each response category in the data obtained from FTR and from the simulation ranges from 0.7% to 9.5%.
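As a check on the Gamma value reported in Table 4.1, the coefficient can be recomputed from the FTR cell counts listed in Table 4.2. The sketch below is plain Python rather than the SPSS procedure used in the study, and the function name `goodman_kruskal_gamma` is chosen here for illustration. Gamma is the difference between the numbers of concordant and discordant pairs divided by their sum, with tied pairs ignored.

```python
def goodman_kruskal_gamma(table):
    """Gamma = (C - D) / (C + D) for a two-way table with ordered rows and columns."""
    concordant = discordant = 0
    for i, row in enumerate(table):
        for j, count in enumerate(row):
            for lower_row in table[i + 1:]:
                # Cells below and to the right rank higher on both variables.
                concordant += count * sum(lower_row[j + 1:])
                # Cells below and to the left rank higher on one, lower on the other.
                discordant += count * sum(lower_row[:j])
    return (concordant - discordant) / (concordant + discordant)


# Rows: student overall satisfaction 4, 5, 6, 7.
# Columns: instructor rating of student performance 5, 6, 7.
ftr_counts = [
    [1, 1, 0],
    [1, 4, 0],
    [2, 4, 3],
    [1, 2, 5],
]

ftr_gamma = goodman_kruskal_gamma(ftr_counts)
print(round(ftr_gamma, 3))  # 0.57
```

For these counts there are 106 concordant and 29 discordant pairs, so gamma is 77/135, which rounds to the .570 shown in Table 4.1.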
Table 4.2 Cross-tabulated data from Fajar Teaching Restaurant
(cell entries: count, with % of total in parentheses)

Student overall       Instructor perception toward student performance
satisfaction          5             6              7              Total
4.00                  1 (4.2%)      1 (4.2%)       0 (.0%)        2 (8.3%)
5.00                  1 (4.2%)      4 (16.7%)      0 (.0%)        5 (20.8%)
6.00                  2 (8.3%)      4 (16.7%)      3 (12.5%)      9 (37.5%)
7.00                  1 (4.2%)      2 (8.3%)       5 (20.8%)      8 (33.3%)
Total                 5 (20.8%)     11 (45.8%)     8 (33.3%)      24 (100.0%)

Table 4.3 Cross-tabulated data of the first generated correlated ordinal data set
(cell entries: count, with % of total in parentheses)

Student overall       Instructor perception toward student performance
satisfaction          5.00          6.00           7.00           Total
4.00                  5 (5.0%)      2 (2.0%)       2 (2.0%)       9 (9.0%)
5.00                  8 (8.0%)      4 (4.0%)       1 (1.0%)       13 (13.0%)
6.00                  5 (5.0%)      35 (35.0%)     7 (7.0%)       47 (47.0%)
7.00                  4 (4.0%)      10 (10.0%)     17 (17.0%)     31 (31.0%)
Total                 22 (22.0%)    51 (51.0%)     27 (27.0%)     100 (100.0%)

To determine whether the mean ranks of the survey data and the simulated data were statistically different, a mean rank test was also carried out. The mean ranks for the survey data (data collected from FTR) and the simulated data are shown in Table 4.4, while the Wilcoxon test and Mann-Whitney test results are shown in Table 4.5.

Table 4.4 Mean rank for student overall satisfaction and performance

Variable                                        Group             N      Mean Rank    Sum of Ranks
Student overall satisfaction                    Survey data       24     61.27        1470.50
                                                Simulated data    100    62.80        6279.50
                                                Total             124
Instructor evaluation on student performance    Survey data       24     65.40        1569.50
                                                Simulated data    100    61.81        6180.50
                                                Total             124

Table 4.4 shows that the mean rank of the student overall satisfaction variable from the survey data is lower than the one from the simulated data, while the mean rank of the student performance variable from the survey data is higher than the one from the simulated data. Assuming α = 0.01, the asymptotic significance values for the student overall satisfaction and student performance, as shown in Table 4.5, are 0.842 and 0.632, respectively.
Both of these significance values are greater than the specified α. Thus, there is no significant difference between the mean ranks of FTR's student overall satisfaction and student performance data and those of the simulated data. These results suggest that the algorithm used to generate these correlated ordinal data is valid and can be used for further analyses.

Table 4.5 Mean rank test statistics

Statistic                  Student overall satisfaction    Student performance
Mann-Whitney U             1170.500                        1130.500
Wilcoxon W                 1470.500                        6180.500
Z                          .199                            .479
Asymp. Sig. (2-tailed)     .842                            .632

4.4 Scenario 1

This scenario generated data with marginal probabilities similar to the data collected from Taylors' Dining. As mentioned in section 4.2, the correlation coefficient used in this scenario was assumed to be similar to that of the data collected from Fajar Teaching Restaurant. The study performed 1,000 runs of the simulation to generate 1,000 data sets with 100 observations in each data set. The 100 observations were then split into two sets: 50 observations were used as a training data set and the others were used as a testing data set.

The marginal probabilities of student overall satisfaction and student performance, as shown in Figure 4.1, were negatively skewed, which meant that the data were likely to be distributed among the higher response levels. Therefore, a cumulative log-log function is more appropriate for use as the OLR link function than the other available cumulative functions such as cumulative logit or probit (Agresti, 2010; Chen & Hughes, 2004).

The study used the multilayer perceptron (MLP) as the network architecture in the ANN model since this architecture is more appropriate for predictive classification problems (Turban, Sharda, & Delen, 2011). The automatic option available in IBM SPSS Modeler was chosen to set the hidden layer since the automated neural networks in IBM SPSS were very powerful (Nisbet et al., 2009).
This option let the software determine the number of nodes in the hidden layer that made the model fit the data set best. The biggest benefit of using the automatic option was that the software automatically searched over the decision surface with different initial learning rates, different momentum, and different numbers of hidden layers in order to get the best parameters for the model (Nisbet et al., 2009). The study allocated 30% of the data set as an overfit prevention data set, which was used to track errors during the training process in order to prevent an overfitted model. The descriptive statistics of the misclassification rates for the OLR and ANN models for scenario 1 are shown in Table 4.6.

Table 4.6 Descriptive Statistics of Misclassification Rates from Scenario 1 (one input variable)

                               N       Range    Min    Max    Mean     Std. Deviation
OLR misclassification rate     1000    .44      .22    .66    .4536    .07539
ANN misclassification rate     1000    .42      .24    .66    .4556    .07420
Valid N (listwise)             1000

Table 4.6 indicates that the mean misclassification rates obtained from the OLR and ANN models (.4536 and .4556) were close and that the maximum values (.66) were the same. Additionally, there were only small differences between the ranges and standard deviations resulting from both models.

4.5 Scenario 2

This scenario generated data with marginal probabilities and a correlation coefficient similar to the data collected from Fajar Teaching Restaurant. The study also performed simulations similar to those explained in Scenario 1. The marginal probabilities of the student overall satisfaction and the student performance, as shown in Figure 4.2, were negatively skewed. This meant that the data were likely to be distributed on the higher response levels. Thus, the cumulative log-log function was more appropriate for use as the OLR link function than the other available cumulative functions such as cumulative logit or probit (Agresti, 2010; Chen & Hughes, 2004).
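As a rough illustration of the large-sample test described in section 3.5.3, the scenario 1 summary statistics in Table 4.6 are enough to compute the test statistic. The Python sketch below assumes the two sets of 1,000 misclassification rates are treated as independent samples, as in equation (3.1); the function name `two_sample_z` is hypothetical, and the raw rates that a paired analysis would need are not available in this chapter.

```python
import math


def two_sample_z(mean1, sd1, n1, mean2, sd2, n2):
    """Large-sample z statistic and two-sided p-value for a difference in means."""
    standard_error = math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    z = (mean1 - mean2) / standard_error
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_value


# Scenario 1 values from Table 4.6: OLR first, ANN second.
z_stat, p_value = two_sample_z(0.4536, 0.07539, 1000, 0.4556, 0.07420, 1000)
reject_h0 = p_value < 0.05
```

With these inputs the statistic lies well inside the interval from -1.96 to 1.96, so the sketch does not reject the null hypothesis of equal mean misclassification rates for scenario 1 at α = 0.05.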
The ANN models for scenario 2 were built using the same approach as scenario 1. This scenario also applied the multilayer perceptron (MLP) network architecture and the automatic option in the hidden layer setting because the automated neural networks provided by IBM SPSS Modeler were very powerful according to Nisbet et al. (2009). To prevent obtaining an overfit model, the study also allocated 30% of the data set as an overfit prevention data set.

The descriptive statistics of misclassification rates for the OLR and ANN models for scenario 2 are shown in Table 4.7. This table shows that the range, minimum, and maximum values of the misclassification rates obtained from the OLR and ANN models were exactly the same. The mean misclassification rate from the OLR models was slightly lower than the one from the ANN models. Additionally, only a small difference was found between the standard deviations of the misclassification rates that resulted from both models.

Table 4.7 Descriptive Statistics of Misclassification Rates from Scenario 2 (one input variable)

                               N       Range    Min.    Max.    Mean     Std. Deviation
OLR misclassification rate     1000    .44      .20     .64     .4033    .07595
ANN misclassification rate     1000    .44      .20     .64     .4065    .07500
Valid N (listwise)             1000

4.6 Scenario 3

Scenario 3 generated ordinal correlated data based on random marginal probabilities and correlation coefficients using the uniform random generator available in IBM SPSS Statistics 19.0. The random number generator in IBM SPSS has a period of $2^{32}$, which means that the software can generate $2^{32}$ random numbers with a uniform distribution before it begins to repeat itself (McCullough, 1999). A previous study suggested that a random number generator with a period of $2^{31}$ is acceptable to generate 1,000 data points (L'Ecuyer & Hellekalek, 1998). Another study conducted by Knuth (1997) suggested that a more modest period of $2^{31}$ could be used to generate one million random numbers.
Therefore, the random number generator provided by IBM SPSS Statistics 19.0 is acceptable for generating the random numbers needed in 1,000 runs of the simulation.

As explained in section 3.4.3, the lower limit of the correlation coefficient was set at 0.27 and the upper limit was set at 0.96. These limits were determined based upon the lower 95% bound of the correlation coefficient between employee satisfaction and job performance in the previous research conducted by Judge et al. (2001). Given the lower and upper limits, the correlation coefficient was generated following $r \sim U(0.27, 0.96)$.

The distribution of the generated correlation coefficients used in this scenario is shown in Figure 4.3. This figure shows that the generated correlation coefficients are fairly evenly distributed among all intervals. The first and the last intervals were the two intervals in which the generated correlation coefficients were most highly concentrated.

[Figure 4.3 The distribution of the generated correlation coefficients. Horizontal axis: correlation coefficient interval; vertical axis: frequency (0 to 120).]

The rules shown in Table 4.8 were used to generate marginal probabilities for both the student overall satisfaction and the student performance variables and were developed following the discussion with the committee member to ensure of the pro
Title  Comparing the Performance of Ordinal Logistic Regression and Artificial Neural Network when Analyzing Ordinal Data 
Date  2012-07-01
Author  Larasati, Aisyah 
Keywords  Artificial Neural Network, Ordinal Data, Ordinal Logistic Regression, Simulation 
Department  Industrial Engineering & Management 
Full Text Type  Open Access 
Abstract  The purpose of this study is to compare the performance of the Ordinal Logistic Regression (OLR) and Artificial Neural Network (ANN) models when analyzing ordinal data using different scenarios by varying the combinations of the marginal probability distributions and correlation coefficients. Two internal links in the Service Profit Chain (SPC), the relationship between employee perceived value of the internal and external determinants of employee satisfaction and employee overall satisfaction, and the relationship between employee overall satisfaction and job performance, are used as a framework to build the OLR and ANN models. Ordinal data collected from surveys at two training restaurants (Taylors' Dining at Oklahoma State University, USA and Fajar Teaching Restaurant at Universitas Negeri Malang, Indonesia) and simulated correlated ordinal data are fitted to the OLR and ANN models in order to compare the mean of misclassification rates from each model. A model with a lower misclassification rate is preferred. The application of the OLR and ANN models to analyze a causal relationship between one input variable and one output variable results in no statistically significant difference between the means of the misclassification rates resulting from both models for all three scenarios tested. On the other hand, the application of the OLR and ANN models to analyze a causal relationship between three input variables and one output variable results in a statistically significant difference between the means of the misclassification rates resulting from both models for all three scenarios tested. The OLR model outperforms the ANN model when it is used to analyze ordinal data that has similar marginal probabilities and correlation coefficients to Taylors' data.
In contrast, the ANN model outperforms the OLR model when it is used to analyze ordinal data that has marginal probabilities and correlation coefficients either similar to FTR's data or randomly distributed. These results suggest that the complexity of the problem, which is represented by the number of input variables (attributes), and the complexity of the data structures, which is represented by the correlation coefficient and marginal probability distribution including the kurtosis, should be considered before fitting data sets to either the OLR or ANN models. 
Note  Dissertation 
Rights  © Oklahoma Agricultural and Mechanical Board of Regents 
Transcript  COMPARING THE PERFORMANCE OF ORDINAL LOGISTIC REGRESSION AND ARTIFICIAL NEURAL NETWORK WHEN ANALYZING ORDINAL DATA

By AISYAH LARASATI

Bachelor of Science in Industrial Engineering, Sepuluh Nopember Institute of Technology, Surabaya, Indonesia, 1999
Master of Science in Industrial Engineering, Sepuluh Nopember Institute of Technology, Surabaya, Indonesia, 2003

Submitted to the Faculty of the Graduate College of the Oklahoma State University in partial fulfillment of the requirements for the Degree of DOCTOR OF PHILOSOPHY, July, 2012

Dissertation Approved:
Dr. Camille F. DeYong, Dissertation Adviser
Dr. David B. Pratt
Dr. William J. Kolarik
Dr. Melinda H. McCann
Dr. Lisa Slevitch, Outside Committee Member
Dr. Sheryl A. Tucker, Dean of the Graduate College

TABLE OF CONTENTS

I. INTRODUCTION
   1.1 Background
   1.2 Problem Statement
   1.3 Purpose
   1.4 Test Case: The Service Profit Chain in Training Restaurants
   1.5 Summary of the Research Gaps
   1.6 Organization of the Study

II. REVIEW OF LITERATURE
   2.1 Introduction
   2.2 Method for Analyzing Ordinal Data
      2.2.1 Ordinal Logistic Regression (OLR) Model
      2.2.2 Artificial Neural Network (ANN) Model
      2.2.3 Performance Metrics
      2.2.4 Statistical Test to Compare the OLR and ANN Models
   2.3 Generating Correlated Ordinal Data
   2.4 Generating Correlation Coefficient
   2.5 Training Restaurant
   2.6 The Service Profit Chain
      2.6.1 Link between Employee and Customer Satisfaction
      2.6.2 Link between Customer Satisfaction and Organization's Success Measures
      2.6.3 Link between Employee Satisfaction and Organization's Success Measures
   2.7 Employee Satisfaction

III. RESEARCH METHODOLOGY
   3.1 Introduction
   3.2 Research Step 1: Conceptual Frameworks
   3.3 Research Step 2: Data Collection Plan
      3.3.1 Initial Instrument and Pretest
      3.3.2 Pilot Test
      3.3.3 Instrument Validity
      3.3.4 Student Instrument
      3.3.5 Instructor Instrument
   3.4 Research Step 3: Generating Simulated Data
      3.4.1 Procedure to Generate Ordinal Correlated Data
      3.4.2 Procedure to Generate Random Marginal Probabilities
      3.4.3 Procedure to Generate the Correlation Coefficient and Correlation Matrices
      3.4.4 Procedure to Validate Generated Data
   3.5 Research Step 4: Build Model
      3.5.1 Artificial Neural Network
      3.5.2 Ordinal Logistic Regression
      3.5.3 Comparing Model Performance
   3.6 Summary

IV. THE ORDINAL LOGISTIC REGRESSION AND ARTIFICIAL NEURAL NETWORK WITH ONE INPUT VARIABLE
   4.1 Introduction
   4.2 Preparation Steps
   4.3 Validating Algorithm to Generate Correlated Ordinal Data
   4.4 Scenario 1
   4.5 Scenario 2
   4.6 Scenario 3
   4.7 Misclassification Rates Comparison
   4.8 Summary

V. THE ORDINAL LOGISTIC REGRESSION AND ARTIFICIAL NEURAL NETWORK WITH THREE INPUT VARIABLES
   5.1 Introduction
   5.2 Preparation Steps
   5.3 Validating Algorithm to Generate Correlated Ordinal Data
   5.4 Scenario 1
   5.5 Scenario 2
   5.6 Scenario 3
   5.7 Misclassification Rates Comparison
   5.8 Choosing a Model
   5.9 Summary

VI. SUMMARY, CONCLUSION AND FUTURE WORK
   6.1 Summary
   6.2 Conclusion
   6.3 Future Work

REFERENCES
APPENDICES

LIST OF TABLES

Table 1.1 Classification of measurement scale
Table 2.1 Comparison of training restaurants and general type of restaurants
Table 2.2 Constructs of employee satisfaction
Table 2.3 Constructs of student performance
Table 3.1 Reliability Alpha on pilot data
Table 3.2 Student questionnaire items
Table 3.3 Instructor questionnaire items
Table 3.4 The distribution of random marginal probabilities
Table 4.1 Correlation coefficient between student overall satisfaction and performance
Table 4.2 Cross tabulated data from Fajar Teaching Restaurant
Table 4.3 Cross tabulated on the first generated correlated ordinal data
Table 4.4 Mean rank for student overall satisfaction and performance
Table 4.5 Mean rank test statistics
Table 4.6 Descriptive statistics of misclassification rates from Scenario 1 (one input variable)
Table 4.7 Descriptive statistics of misclassification rates from Scenario 2 (one input variable)
Table 4.8 The rules to generate marginal probabilities
Table 4.9 Student performance marginal probability distributions
Table 4.10 Student overall satisfaction marginal probability distributions
Table 4.11 Descriptive statistics of misclassification rates from Scenario 3 (one input variable)
Table 5.1 Gamma correlation coefficient from Taylors' data
Table 5.2 Gamma correlation coefficient from Fajar Teaching Restaurant data
Table 5.3 Cross tabulated data of “understanding what to do”
Table 5.4 Cross tabulated data of “opportunity to develop skill”
Table 5.5 Cross tabulated data of “enthusiastic feeling”
Table 5.6 Mean rank for student overall satisfaction and its three determinants
Table 5.7 Mean rank test statistics
Table 5.8 Gamma correlation coefficients between variables used in Scenario 1 (three input variables)
Table 5.9 The descriptive statistics of misclassification rates for Scenario 1 (three input variables)
Table 5.10 Gamma correlation coefficients between variables used in Scenario 2 (three input variables)
Table 5.11 The descriptive statistics of misclassification rates for Scenario 2 (three input variables)
Table 5.12 The rules to generate marginal probabilities
Table 5.13 Marginal probability distributions input variable 1
Table 5.14 Marginal probability distributions input variable 2
Table 5.15 Marginal probability distributions input variable 3
Table 5.16 Marginal probability distributions output variable
Table 5.17 Generated correlated coefficient intervals
Table 5.18 The descriptive statistics of misclassification rates for Scenario 3 (three input variables)
Table 6.1 Summary of the best guessestimate models

LIST OF FIGURES

Figure 2.1 Information processing in ANN with backpropagation algorithm
Figure 2.2 A confusion matrix representation for seven class classification problem
Figure 2.3 A taxonomy of statistical test in comparing algorithms
Figure 2.4 The links in the Service Profit Chain
Figure 3.1 The framework of the research methodology
Figure 3.2 The conceptual framework of the study
Figure 4.1 Marginal probability distributions of input and output data in Taylors' Dining (one input variable)
Figure 4.2 Marginal probability distributions of input and output data in FTR (one input variable)
Figure 4.3 The distribution of the generated correlation coefficients
Figure 5.1 Marginal probability distributions from Taylors' data
Figure 5.2 Marginal probability distributions from FTR data set

CHAPTER I

INTRODUCTION

1.1 Background

Service industries measure their performance with respect to customer satisfaction using multiple techniques, including customer surveys (Allen & Seaman, 2007). Surveys are also used to measure employee satisfaction, job performance and other facets of the internal service quality of an organization. Typically, the types of information collected from surveys are related to descriptive, behavioral and attitudinal attributes of the respondents (Rea & Parker, 2005). Socioeconomic data of the respondents (such as income, age, and ethnicity) is an example of descriptive information collected from a survey. Survey questions about respondent behavior, such as utilization of various resources and facilities, are designed to document the respondents' patterns of behavior while they are using the facilities. The respondents' stated attitudes about various conditions related to the services they used are also commonly found in survey studies.
Organizations use this descriptive, behavioral, and attitudinal information from surveys to determine what types of services should be offered or withdrawn, which factors most strongly govern respondents' satisfaction with the provided services, how various work environments influence productivity, and many other essential decisions. Thus, survey research has become of critical importance for business decision-making (Allen & Seaman, 2007; Rea & Parker, 2005).

Stevens' classification of measurement scale (Stevens, 1946) classifies data collected from surveys into four types of scales: nominal, ordinal, interval and ratio. A nominal scale refers to categories with no ordering among them, such as gender (male and female), favorite colors (blue, white, and black), and seasons (fall, spring, summer and winter). An ordinal scale preserves rank ordering in the categories, but no measures of distance between categories are possible because the distances between categories are not necessarily equal. Some examples of ordinal data are variables describing stages of cancer (I, II, III), the quality of waiting service (poor, acceptable, excellent), and customer satisfaction with a service delivery (very dissatisfied, dissatisfied, neutral, satisfied, and very satisfied). The distance between “neutral” and “satisfied” may not be the same as the distance between “satisfied” and “very satisfied.” An interval scale has the same characteristics as an ordinal scale, but the distances between any points are consistent. However, an interval scale does not have an absolute zero. An example of interval data is temperature in degrees Fahrenheit (°F), since 0 °F is arbitrary and negative values can be used. Ratio data has all the characteristics of interval data and, in addition, an absolute zero. Examples of ratio data are a person's weight and height.
In summary, a nominal scale allows differentiation between responses by categorizing only, while an ordinal scale enables the researcher to determine the rank-order of preferences without using the distance between any points in the scale. In contrast, an interval scale is able to measure the distance between responses. A ratio scale is the highest level of measurement since it has an absolute (as opposed to an arbitrary) zero point.

Stevens (1946) also outlines the statistical procedures that are permissible for each type of scale; the permissible statistics for each type of scale include all of those permitted for its predecessors. The permissible statistics for nominal data should be limited to the mode, the number of cases, and the contingency correlation. The permissible statistics for ordinal data include all statistics for nominal data plus the median and percentiles, while those for interval data include all the statistics for ordinal data and also allow calculation of the mean, standard deviation, and product-moment correlation. A ratio scale preserves all of the permissible statistics of the other scales while also allowing the coefficient of variation. According to Stevens (1946), performing data analysis without considering the type of measurement scale can lead to meaningless results. Table 1.1 shows Stevens' classification of measurement scale.

The vast majority of surveys use Likert scales as the rating format (Allen & Seaman, 2007). The Likert scale is used to measure respondents' attitudes toward a given statement. Although the Likert scale is commonly constructed as a five-point scale, some researchers recommend the use of a seven-point scale in order to achieve higher reliability (Allen & Seaman, 2007; Jamieson, 2004). Sometimes the scale is set to a four-point scale, or another even number of points, in order to force a respondent to make a choice by eliminating the “neutral” option.
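As a hedged illustration of why Stevens (1946) limits ordinal data to order-based statistics such as the median and percentiles, the following minimal Python sketch compares two groups of five-category responses. The two response groups and the two numeric codings are hypothetical and are not taken from the study's data; the only assumption is that any order-preserving coding of the categories is equally defensible for an ordinal scale.

```python
from statistics import mean, median_low

# Five ordered response categories; only their order is meaningful.
LEVELS = ["strongly disagree", "disagree", "neither", "agree", "strongly agree"]

# Two hypothetical, order-preserving numeric codings of the same categories.
EQUAL_CODES = [1, 2, 3, 4, 5]       # conventional equally spaced codes
STRETCHED_CODES = [1, 2, 3, 4, 10]  # same order, unequal spacing

# Hypothetical responses stored as category ranks (0 = lowest, 4 = highest).
group_a = [0, 0, 4, 4, 4]
group_b = [3, 3, 3, 3, 3]


def mean_score(ranks, codes):
    """Mean of the numeric codes: depends on the (arbitrary) spacing."""
    return mean([codes[r] for r in ranks])


def median_category(ranks):
    """Median category: depends only on the order of the categories."""
    return LEVELS[median_low(ranks)]


for name, codes in (("equal", EQUAL_CODES), ("stretched", STRETCHED_CODES)):
    print(name, mean_score(group_a, codes), mean_score(group_b, codes))

print(median_category(group_a), median_category(group_b))
```

Under the equally spaced coding, group B has the higher mean; under the stretched coding, group A does. The median category of each group is the same under both codings, which is the sense in which the median is a permissible statistic for ordinal data while the mean is not.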
Table 1.1 Classification of measurement scale (Stevens, 1946)

Scale      Basic empirical operation                               Permissible statistics
Nominal    Determination of equality                               Number of cases; Mode; Contingency correlation
Ordinal    Determination of greater than or less than              Median; Percentiles; Rank-order correlation
Interval   Determination of equality of intervals or differences   Mean; Standard deviation; Product-moment correlation
Ratio      Determination of equality of ratios                     Coefficient of variation

The Likert scale often ranges from least to most in order to capture a respondent's feeling of intensity toward a given item (Turk, Uysal, Hammit, & Vaske, 2011). For example, respondents are asked to indicate their degree of agreement with a particular statement, and they may express their agreement as “strongly disagree,” “disagree,” “neither disagree nor agree,” “agree,” and “strongly agree.” The response categories in the Likert scale have a rank-order. Although the numbers 1, 2, 3, 4, and 5 may be assigned to the respective response categories, the distances between adjacent categories cannot be assumed to be equal. For example, the distance between “1=strongly disagree” and “2=disagree” may not be assumed to be the same as the distance between “2=disagree” and “3=neither disagree nor agree.” Thus, the Likert scale should be categorized as an ordinal scale (Allen & Seaman, 2007; Jamieson, 2004).

Ordinal data has been widely utilized in education, health, behavioral and social studies. In the social and behavioral sciences, an ordinal scale is often used to measure attitudes and opinions.
For example, employees could be asked to rate their overall job satisfaction using ordered categories such as “strongly dissatisfied,” “dissatisfied,” “neutral,” “satisfied,” and “strongly satisfied.” This measure of overall job satisfaction is ordinal because employees who choose “satisfied” experience a more positive feeling toward their job than those who choose “neutral.” The rank-order is clear even though the difference between “satisfied” and “neutral” cannot be measured numerically and certainly cannot be assumed to be equal to other intervals. Ordinal data is different from interval data because the absolute distances between the levels in ordinal data are unknown even though the rank-order of the levels is clearly defined. Nominal and ordinal data are both categorical data, but nominal data does not involve a rank-order.

In general, data analyses for nominal, interval, and ratio data are clearly defined, but this is not the case with data analysis for ordinal data. Many studies treat ordinal data as interval data (Knapp, 1990; Mayer, 1971; Velleman & Leland, 1993). Underlying this might be the fact that parametric tests with interval data are considered easier to interpret and provide more meaningful information than nonparametric tests (Allen & Seaman, 2007; Chimka & Wolfe, 2009). However, treating ordinal data as interval data may misrepresent the results and lead to poor decision-making, since such treatment introduces substantial bias by assuming equal intervals between the points of the ordinal data and by relying on other distributional assumptions that ordinal data rarely fulfill. A study conducted by Hastie, Botha, and Schnitzler (1989) shows that treating ordinal output data as interval data results in a statistically significant interaction between independent variables. However, when this ordinal output data is analyzed as ordinal data, the interaction is not statistically significant.
Therefore, many researchers recommend against analyzing ordinal data as interval data, in order to improve the ability to detect meaningful trends of input variables on the response variable. Thus, analyzing ordinal data using methods that are able to maintain the rank-order of ordinal data without assuming equal distances between categories provides more valuable and useful results for further investigation and decision-making (Gregoire & Driver, 1987; Jamieson, 2004; Mayer, 1971).

Multiple analytical statistical methods are available to analyze ordinal data. These methods can follow a model-based approach, such as models for cumulative response probabilities, or a non-model-based approach, such as a nonparametric method based on ranking. A model-based approach is commonly used to test causal relationships, while a non-model-based approach tends to be used for making inferences related to association/correlation measures. A common model-based method used to analyze ordinal data is an Ordinal Logistic Regression (OLR) model (further explanation of the OLR model is presented in subsection 2.2.1). Several approaches are available to build the OLR model, such as the cumulative link model, the adjacent categories model, and the continuation ratio model. The most commonly used among these three approaches is the cumulative OLR model (Agresti, 2010; Tutz, 2012).

In addition to statistical models, several machine-learning algorithms are also available to analyze ordinal data, such as an Artificial Neural Network (ANN) model, a decision tree model, and a Support Vector Machine (SVM) model. An ANN model is a computational model that is inspired by the properties of biological neurons. The term ANN model used in this study refers to a multilayer perceptron (MLP) ANN, an artificial neural network that is comprised of input, hidden and output layers.
The hidden layer is the key component of an ANN model since it contains the summation and transfer function of each node (further explanation of ANN is presented in subsection 2.2.2). A decision tree model presents a classification rule as a tree in which different subsets of variables are used at different levels of the tree. The classification rule in the tree defines the decision boundary. An SVM model functions as a pattern classification method by finding the optimal separating hyperplane for either linear or nonlinear data. The optimization process in an SVM model relies on the kernel function used in the model.

Among these three techniques (ANN, decision tree and SVM), the ANN model has more similarities with the regression model than the other models. Comparisons between the ANN model and the logistic regression model for classification or prediction problems with binary response data have been conducted extensively (Deng, Chen, & Pei, 2008; Karlaftis & Vlahogianni, 2011; Paliwal & Kumar, 2009). However, none of the previous studies have compared the performance of OLR and ANN models to analyze ordinal data.

1.2 Problem Statement

Analyzing ordinal data using methods that maintain the rank-order of ordinal data and do not assume equal distances between categories promises meaningful and useful results for decision-making. Although some previous studies have applied the OLR or ANN models to analyze ordinal data, the existing research focuses on comparing the performance of the logistic regression and ANN models for classification of binary responses. None of the existing studies compares the performance of the ANN and OLR models to analyze ordinal data under different marginal probability distributions and correlation coefficients.
Understanding the impact of different combinations of marginal probability distributions and correlation coefficients on the ANN and OLR performance could help provide a guide for selecting an appropriate model and parameters in order to build a better model to analyze ordinal data. This can, in turn, lead to more efficient and value-added decision-making.

1.3 Purpose

The purpose of this study is to compare the application of the OLR and ANN models to analyze ordinal data using different scenarios by varying the combinations of the marginal probability distribution and correlation coefficients. This study attempts to provide the best guidance for model selection for various combinations of marginal distribution and correlation coefficient to analyze ordinal data. The specific objectives of this study are to:

1. Develop the OLR and ANN models to represent a relationship between one predictor and one response variable with various combinations of marginal probability and correlation coefficients.
2. Develop the OLR and ANN models to represent a relationship between three predictors and one response variable with different combinations of marginal probabilities and correlation coefficients.
3. Compare the models' accuracy.
4. Evaluate the models and summarize the results for use in model selection for each scenario.

1.4 Test Case: The Service Profit Chain in Training Restaurants

In order to compare the performance of the Artificial Neural Network (ANN) and Ordinal Logistic Regression (OLR) models to analyze ordinal data, data is collected from two training restaurants by using student satisfaction surveys and instructor evaluations of student job performance. Collected data is used as the source to determine marginal probabilities and correlation coefficients for simulations. Two groups of data are generated in the simulations. The first group of data consists of two variables (one input and one outcome variable).
The input variable is the instructor evaluations of student job performance, while the output variable is the student overall satisfaction based on student attitudes and perceptions. The second group of data consists of four variables (three input and one outcome variable), which refers to three determinants of student satisfaction and the student overall satisfaction. Both the OLR and ANN models are built using each data set generated from the simulation and each data set collected from the survey. Finally, this study compares the misclassification rate (the proportion of disagreement between the predicted outcome and the actual outcome) resulting from the OLR and ANN models.

The service sector has been growing rapidly in the past two decades. One of the largest private-sector employers in the United States is the restaurant industry. This industry provides many career opportunities for college students pursuing degrees in hospitality, restaurant management, as well as in the culinary arts. Currently, there are approximately 261 schools that offer degrees in the culinary arts and culinary management in the United States (Hertzman & Ackerman, 2010). As of June 2011, the Accreditation Commission for Programs in Hospitality Administration (ACPHA) has granted accreditation for 55 hospitality programs in the US (chrie.org, 2012). One of the most important facilities in those programs is the training restaurant, since the learning process in the training restaurant improves the skill and critical thinking required for the restaurant industry (Gustafson, Love, & Montgomery, 2005).

The case study for this research uses the service-profit chain framework as a platform to build OLR and ANN models. The Service Profit Chain (SPC) is a comprehensive framework of the relationships between employee, customer, and profitability introduced by Heskett, Jones, Loveman, Sasser Jr, and Schlesinger (1994).
The framework links employee satisfaction with the value of the product and service delivered to create customer satisfaction, and then assesses the effect on profitability. The information gained from examining the internal links of the SPC concept in a training restaurant, which involves student satisfaction and job performance during the learning process in the training restaurant, can provide valuable input to improve restaurant performance and customer satisfaction. Although the training restaurant has an important role in the effectiveness of hospitality and culinary programs in preparing students to enter the restaurant industry, this type of training facility has received less attention in the literature (Alexander, Lynch, & Murray, 2009; Nies, 1993). Thus, this exploratory study may help add to the body of knowledge governing the utilization of training restaurants in education.

1.5 Summary of the Research Gaps

Ordinal data is rank-ordered data commonly used in social and behavioral studies as well as in educational and health studies. This type of data is different from interval data because the distance between each category is not necessarily equal. Ordinal data is also different from nominal data because of its rank-ordered property. Despite the distinctive properties of ordinal data, many studies continue analyzing ordinal data using methods that only work properly with interval or nominal data (Agresti, 2010; Hastie et al., 1989; Mayer, 1971).

In recent years, regression and ANN models have been considered competing model-building techniques in the literature. Many studies have been conducted to compare and contrast the use of regression and ANN models in the area of prediction and classification problems (Deng et al., 2008; Karlaftis & Vlahogianni, 2011; Luengo, García, & Herrera, 2009; Paliwal & Kumar, 2009). However, none of those studies focuses on the use of the OLR and ANN models as a model-building technique for ordinal data.
This study compares the performance of the OLR and ANN models by using survey data collected from two training restaurants and artificial data generated through simulation. Artificial data is randomly generated based on marginal probabilities and correlation coefficients. Although some studies that compare regression and ANN models also use simulation to generate data, none of them generates correlated ordinal data. Instead, a random uniform distribution is utilized (Cardoso & Da Costa, 2007; Jianlin, Zheng, & Pollastri, 2008).

This study builds the OLR and ANN models to explore two relationships in the internal link as explained in the Service Profit Chain (SPC) concept. The case study uses the internal link of the SPC because this link reflects the effectiveness of the learning process in the training restaurant. Also, far fewer previous studies explore the internal link of the SPC than explore the external link. The internal links comprise 1) the relationship between employee satisfaction and employee performance and 2) the relationship between employee satisfaction and the determinant factors of employee satisfaction, such as clarity of job descriptions, self-motivation, reward, recognition, and many others. Currently, no study has been conducted to compare the OLR and ANN by testing the internal links of the SPC in a training restaurant setting.

1.6 Organization of the Study

Chapter I delivers an overview of the main topic under study and the rationale for the need of such a study. The problem statement, purpose, test case for the study, and the research gaps that the study aims to fill are also stated. Chapter II provides a review of literature relevant to the development of the study. The methodology and procedures used in the study, including the process for developing the instruments used to collect data, are presented in Chapter III.
Chapter IV provides the process used to compare the OLR and ANN models with one independent variable and presents the results gained from the comparison. The chapter also explains the simulation process used to generate data with specific marginal probabilities and correlation structure. The results of comparing OLR and ANN models with three independent variables are presented in Chapter V. The last chapter, Chapter VI, contains a summary, conclusions, and recommendations for future research.

CHAPTER II

REVIEW OF LITERATURE

2.1 Introduction

The first part of this chapter explains the two methods used to analyze ordinal data: the Ordinal Logistic Regression (OLR) and Artificial Neural Network (ANN) models. The next part of this chapter presents the methods used in the study to perform the simulations needed to generate artificial data. It also includes the relevant correlation setups, including detailed algorithms used to generate random marginal probabilities, correlation matrices, and correlated ordinal data. The performance metrics and hypothesis testing used to compare the OLR and ANN models are also explained. The last section provides a review of relevant literature about the structure and function of training restaurants, the service-profit chain (SPC), and employee satisfaction, which provide the research framework for the case study.

2.2 Methods for Analyzing Ordinal Data

An ordinal scale is commonly used to gather data about subjective responses in many behavioral studies. For example, some studies explore employee and customer satisfaction and their determinants. Although the variables are measured on ordinal scales, some researchers tend to treat them as continuous variables and to analyze them using linear regression models.
For instance, Eskildsen and Nussler (2000) built a linear regression model to predict employee satisfaction in several companies in Denmark, whilst Gustafsson and Johnson (2004) applied a linear regression model to determine attribute importance in a service satisfaction model. Analyzing ordinal data using any model that assumes equal distances between categories of such data may produce meaningless results (Agresti, 2010; Mayer, 1971; Tutz, 2012).

The Ordinal Logistic Regression (OLR) and Artificial Neural Network (ANN) models are two analytical methods that are appropriate for analyzing ordinal data. Compared to the ANN model, the OLR model is easier to interpret and can be statistically tested. On the other hand, the ANN has a higher capability to deal with nonlinear functions, arbitrary data distributions, and multicollinearity among input variables (Lin, 2007). Many studies that compare statistical methods and the ANN model to predict overall customer or employee satisfaction show that the ANN model results in a lower standard deviation and misclassification rate than statistical methods (West, Brockett, & Golden, 1997; Gronholdt & Martensen, 2005). However, all of those studies treat the respondents' responses as either interval or nominal data, although the responses are measured with Likert-type scales. Ignoring the rank order of ordinal data by treating such data as nominal, or assuming equal distances between categories in order to analyze such data as interval data, may lead to meaningless findings (Ananth & Kleinbaum, 1997; Jamieson, 2004; Tutz, 2012).

2.2.1 Ordinal Logistic Regression (OLR) Model

Regression modeling is a model-based approach that is useful to investigate the relationship between multiple independent variables and a dependent variable, as well as to examine the effect of independent variables on a dependent variable (Chen & Hughes, 2004).
Linear regression and logistic regression are two common regression models used in many previous studies. The decision to choose either linear regression or logistic regression depends on the measurement scale of the dependent variable. When a dependent variable is on a continuous scale, a linear regression is more appropriate. On the other hand, a logistic regression performs better with binary variables. However, a logistic regression model should not be used to analyze ordinal data, since for a five-level dependent variable this model attains only 50% to 75% of the asymptotic relative efficiency (the limit of the ratio of the sample sizes required) of an ordinal logistic regression with a cumulative-logit link (Armstrong & Sloan, 1989).

An Ordinal Logistic Regression (OLR) model is an extension of a logistic regression that is capable of handling data on ordinal scales. Basically, a logistic regression is used to investigate the relationship between independent and dependent variables, in which the dependent variable is a binary (dichotomous) variable. However, a logistic regression can be modified to analyze nominal or ordinal data by changing the link function from simple logistic to cumulative logits (Lawson & Montgomery, 2006). Thus, when a dependent variable is on an ordinal scale, the use of an ordinal regression is more appropriate than a multiple regression (Lundahl, Vegholm, & Silver, 2009; McCullagh, 1980).

Other than the OLR, Clogg and Shihadeh (1994) explain that the log-linear model and measures of association are also appropriate methods to analyze ordinal data. These three methods produce similar results, since all of them maintain the rank order of the ordinal data and do not assume equal distances between categories of such data.
However, when ordinal data is analyzed by using a method that does not consider the rank order of the data, such as a logistic regression model, differences in the results may occur (Clogg & Shihadeh, 1994; Tutz, 2012).

Several cumulative link functions are available to build an OLR model, such as the cumulative logit, probit, cauchit, complementary log-log, and the related log-log link (Agresti, 2010). The decision to choose one link over the others depends upon the distribution of the dependent variable. The most commonly used link function in the OLR model is the cumulative logit (Clogg & Shihadeh, 1994; Fullerton, 2009). When an OLR model with the cumulative logit link is applied to a dependent variable with k levels, the model incorporates k-1 logits into a single model. Thus, the function can be written as:

$$\operatorname{logit}\left[P(y_i \le j \mid x_i)\right] = \ln\frac{P(y_i \le j \mid x_i)}{1 - P(y_i \le j \mid x_i)} = \alpha_j + \beta' x_i \qquad (2.1)$$

where $j = 1, \ldots, k-1$, $\beta$ indicates the effect of the independent variables, $x_i$ denotes the column vector of the values of the independent variables, and $y_i$ denotes the response level of the dependent variable. Based on Equation 2.1, the effect of $\beta$ is the same for each cumulative logit.

If $\pi_1, \ldots, \pi_k$ denote the marginal probabilities of each of the k levels of a dependent variable, then the cumulative logit can be determined as:

$$\operatorname{logit}\left[P(Y \le j)\right] = \ln\frac{\pi_1 + \cdots + \pi_j}{\pi_{j+1} + \cdots + \pi_k}. \qquad (2.2)$$

The cumulative logit link is a symmetric function; thus this link is preferred when the ordinal data of the response variable is evenly distributed among all category levels. If the ordinal data being analyzed tend to be concentrated on the higher response levels, such as 'very satisfied' on a satisfaction rating, the complementary log-log link function is generally used to build the OLR model (Chen & Hughes, 2004). The complementary log-log link function can be written as:

$$\ln\left[-\ln\left(1 - P(y_i \le j \mid x_i)\right)\right] = \alpha_j + \beta' x_i. \qquad (2.3)$$

With the complementary log-log link function (shown in Equation 2.3), P(Y ≤ j) moves toward 1.0 at a higher rate than it moves toward 0.0 (Chen & Hughes, 2004).
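As an illustration of how the two link functions turn a linear predictor into category probabilities, the following is a minimal Python sketch. It assumes the linear predictor has the form α_j + β'x with increasing cut-points α_j; the function names and the numeric values are hypothetical and are not taken from the study.

```python
import math

def cumulative_probs(alphas, beta, x, link="logit"):
    """Return P(Y <= j | x) for j = 1..k-1 under a cumulative link model.

    Assumes the linear predictor alpha_j + beta'x (a sign convention chosen
    for this sketch) and increasing cut-points alpha_1 < ... < alpha_{k-1}.
    """
    eta = sum(b * xi for b, xi in zip(beta, x))
    probs = []
    for alpha in alphas:
        z = alpha + eta
        if link == "logit":
            # Inverse of the cumulative logit link
            probs.append(1.0 / (1.0 + math.exp(-z)))
        elif link == "cloglog":
            # Inverse of the complementary log-log link
            probs.append(1.0 - math.exp(-math.exp(z)))
        else:
            raise ValueError("unsupported link: " + link)
    return probs

def category_probs(cumulative):
    """Difference the cumulative probabilities to obtain P(Y = j), j = 1..k."""
    bounds = [0.0] + list(cumulative) + [1.0]
    return [bounds[j + 1] - bounds[j] for j in range(len(bounds) - 1)]

# Hypothetical four-level outcome with one predictor
cum = cumulative_probs([-1.0, 0.0, 1.5], [0.8], [0.5], link="logit")
print(category_probs(cum))
```

Evaluating the inverse complementary log-log function at z = 3 and z = -3 shows the asymmetry numerically: the cumulative probability is far closer to 1.0 at z = 3 than it is to 0.0 at z = -3, whereas the logit link is symmetric around 0.5.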
Therefore, this link function is more suitable when the outcome data is predominantly distributed on the higher levels.

To interpret OLR results, a researcher should consider the signs and coefficients used in the model. The signs represent the existence of negative or positive effects of the independent variables on the ordinal outcome. The intercept parameter, α, refers to the estimated ordered logits for the adjacent levels of the dependent variable. The coefficient, β, indicates that a one-unit change in the independent variable changes the odds of the event occurring by a factor of $e^{\beta}$, holding the other independent variables constant (Fullerton, 2009).

2.2.2 Artificial Neural Network (ANN) Model

An Artificial Neural Network (ANN) is an information-processing model that is inspired by brain function. The key characteristic of the ANN is its capability to model complexity and uncertainty. The ANN model often performs better than traditional statistical techniques, since it does not require the assumptions of those techniques, such as linearity, absence of multicollinearity, and normally distributed data (Garver, 2002; Lin, 2007; Nisbet, Elder, & Miner, 2009).

ANN models are built through an iterative process in which the model learns the pattern of complex relationships between input and output. The simplest form of a neural network consists of three layers: input, hidden, and output. The first layer is comprised of one or more processing elements (PEs) that represent independent (predictor) variables, while the output layer contains one or more PEs that are referred to as dependent (outcome) variables. The output layer consists of several PEs that represent the model's classification decisions. Each PE represents one class of output. The hidden layer in the model connects the input and output layers. In general, there can be one or more hidden layers between the input and output layers.
The key element in the ANN is the connection weights (Turban, Sharda, & Delen, 2011). The connection weights represent the relative weight of each input to the next processing element in the hidden layer and output layer. The weights also express how the processing element learns the pattern of information given to the network. Other important elements in the ANN are the summation and transfer functions. The summation function calculates the weighted sum of all processing elements in the input layer that enter each processing element in the hidden layer. The summation function multiplies each input value by its weight and sums the values to get the weighted sum. This function is also referred to as an activation function of each processing element in the input layer. Based on this summation function, an ANN model may or may not use a PE in the input layer when determining a PE in the subsequent layer. In addition, the transfer function determines how the network combines input from each PE in the hidden layer that enters into the PEs in the output layer.

The focus of this study is on multilayer perceptron (MLP) ANNs, or feedforward neural networks with a backpropagation algorithm, the most commonly used neural networks for classification problems (Mehrotra et al., 1997; Perlovsky, 2001). The backpropagation MLP ANN, as shown in Figure 2.1, is a type of ANN that adjusts the connection weights by minimizing the error between the desired output and the predicted outcome produced by the network. An ANN with this algorithm is trained by giving input and output data to the network.

[Figure 2.1: input layer, hidden layer, and output layer of processing elements (PEs) linked by connection weights Wij through summation and transfer functions; Error = Desired - Predicted Outcome]

Figure 2.1 Information processing in MLP ANN with backpropagation algorithm (Mehrotra, Mohan, & Ranka, 1997)
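The roles of the summation and transfer functions can be sketched as a single forward pass through a small MLP. The sketch below is illustrative only: the sigmoid and softmax transfer functions, the bias terms, and all weight values are assumptions made for this example and are not the configuration used in the study.

```python
import math

def sigmoid(z):
    """Logistic transfer function (an assumed choice for the hidden layer)."""
    return 1.0 / (1.0 + math.exp(-z))

def softmax(scores):
    """Turn output-layer sums into class probabilities (assumed choice)."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def weighted_sums(inputs, weights, biases):
    """Summation function: one weighted sum per PE in the next layer."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]

def forward(x, w_hidden, b_hidden, w_out, b_out):
    """Input layer -> hidden layer -> output layer, one probability per class."""
    hidden = [sigmoid(s) for s in weighted_sums(x, w_hidden, b_hidden)]
    return softmax(weighted_sums(hidden, w_out, b_out))

def output_error(desired_class, probs):
    """Error = desired - predicted, with the desired class coded as 1, others 0."""
    desired = [1.0 if j + 1 == desired_class else 0.0 for j in range(len(probs))]
    return [d - p for d, p in zip(desired, probs)]

# Hypothetical network: 2 input PEs, 3 hidden PEs, 3 output classes
x = [0.2, 0.7]
w_hidden = [[0.5, -0.3], [0.1, 0.8], [-0.6, 0.4]]
b_hidden = [0.0, 0.0, 0.0]
w_out = [[0.2, -0.1, 0.3], [0.0, 0.5, -0.2], [-0.3, 0.1, 0.4]]
b_out = [0.0, 0.0, 0.0]
probs = forward(x, w_hidden, b_hidden, w_out, b_out)
print(probs, output_error(2, probs))
```

Backpropagation would use the error vector returned by `output_error` to adjust `w_out` and `w_hidden`; that weight-update step is omitted here to keep the sketch short.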
During the training period, the network learns the data patterns between the input and output and adjusts its connection weights to minimize error. Once trained, the connection weights are retained and remain available to determine output values for any new input fed to the network.

Each PE in the hidden layer transfers several PEs from the input layer to the subsequent layers by using summation and transfer functions. Thus, the connection weights in the ANN model are difficult to explain (Dreiseitl & Ohno-Machado, 2002; Turban et al., 2011). Using more hidden layers in an ANN model results in more complex connection weights and interdependencies (West, Brockett, & Golden, 1997). Another potential drawback of an ANN model is the possibility of the model converging to a local minimum of the error, since the iteration process depends on the sample used to learn the pattern when the network is trained. Thus, a validation data set is needed to mitigate this potential weakness (West et al., 1997).

2.2.3 Performance Metrics

The performance of a predictive model is frequently measured in terms of an error (Mehrotra et al., 1997). The nature of the problem determines the choice of the error measure. In classification problems, such as the application of a predictive model to nominal and ordinal outcome variables, one of the common measures of error is the misclassification rate (Mehrotra et al., 1997; Webb & Copsey, 2011). A smaller misclassification rate indicates better model performance. A misclassification rate can be calculated as:

$$\text{Misclassification rate} = \frac{\text{number of misclassified samples}}{\text{total number of samples}}. \qquad (2.4)$$

For an ordinal outcome variable with many categories, the misclassification rate refers to the total number of samples, across all classes, for which the outcome category predicted by a model differs from the actual category. Some analytical packages, such as IBM SPSS Modeler and SAS Enterprise Miner, present a confusion matrix to express the performance of a model being used for analysis.
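A minimal sketch of how a confusion matrix and the misclassification rate can be computed is given below. It assumes that rows index the actual category and columns index the predicted category, with categories coded 1 to k; the function names and the sample data are hypothetical.

```python
def confusion_matrix(actual, predicted, k):
    """Count cases by (actual, predicted) category; categories are coded 1..k."""
    matrix = [[0] * k for _ in range(k)]
    for a, p in zip(actual, predicted):
        matrix[a - 1][p - 1] += 1  # row = actual class, column = predicted class
    return matrix

def misclassification_rate(matrix):
    """Share of cases that fall off the main diagonal of the confusion matrix."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return (total - correct) / total

# Hypothetical seven-point Likert outcomes for seven respondents
actual = [1, 2, 3, 3, 5, 7, 7]
predicted = [1, 2, 4, 3, 5, 6, 7]
cm = confusion_matrix(actual, predicted, 7)
print(misclassification_rate(cm))
```

In this hypothetical sample, two of the seven cases fall off the diagonal, so the rate equals 2/7.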
A confusion matrix has an appearance similar to that of a contingency table. Each column of this matrix represents the number of cases in an outcome category predicted by a model, while each row represents the number of cases in an actual category. Figure 2.2 shows the confusion matrix resulting from a seven-class classification problem (the outcome variable is a seven-point Likert scale). Thus, the confusion matrix has a dimension of 7x7. Each cell in the confusion matrix indicates the number of misclassified or true-classified samples. When the outcome category of a sample predicted by a model is not the same as the actual category, the sample is counted as misclassified. Otherwise, the sample is counted as true-classified.

                      Outcome Category (Class) Predicted by a Model
Actual Category
(Class)           1         2         3         4         5         6         7
1                 True      Misclass  Misclass  Misclass  Misclass  Misclass  Misclass
2                 Misclass  True      Misclass  Misclass  Misclass  Misclass  Misclass
3                 Misclass  Misclass  True      Misclass  Misclass  Misclass  Misclass
4                 Misclass  Misclass  Misclass  True      Misclass  Misclass  Misclass
5                 Misclass  Misclass  Misclass  Misclass  True      Misclass  Misclass
6                 Misclass  Misclass  Misclass  Misclass  Misclass  True      Misclass
7                 Misclass  Misclass  Misclass  Misclass  Misclass  Misclass  True

Figure 2.2 A confusion matrix representation for a seven-class classification problem

2.2.4 Statistical Test to Compare the OLR and ANN models

Determining which type of statistical test to use to compare two or more models is one of the critical problems in this study. Many studies that compare machine learning algorithms and statistical models use different types of statistical tests, such as McNemar's test, the Wilcoxon signed-rank test, the Quasi F test, and hypothesis testing on the average performance, to determine which model (algorithm) performs better for the problem that is being investigated (Dietterich, 1998). A taxonomy that helps to determine the statistical test to be used to compare different models (algorithms) is shown in Figure 2.3.
A taxonomy of statistical questions
  Single domain
    Analyze classifiers
      Predict classifier accuracy: large sample (1), small sample (2)
      Choose between classifiers: large sample (3), small sample (4)
    Analyze algorithms
      Predict algorithm accuracy: large sample (5), small sample (6)
      Choose between algorithms: large sample (7), small sample (8)
  Multiple domains (9)

Figure 2.3 A taxonomy of statistical tests in comparing algorithms (Dietterich, 1998)

This study follows condition number 5, which suggests 1) building the algorithm on each training data set of size m, 2) testing the resulting frozen model (classifier) on the testing data set, and 3) comparing the algorithms' accuracy based on the average performance (Dietterich, 1998). These suggestions are similar to the procedure undertaken in this study, which builds the ANN and OLR models using n training data sets of size m. In this study, each model is trained on each training data set and the resulting classifiers are tested on n testing data sets. The average accuracy or misclassification rate on the test data sets predicts the performance of the ANN and OLR models. Then, a hypothesis test on the mean is used to compare the average accuracy or misclassification rate obtained from the testing data sets.

One test procedure for investigating the difference between population means μ1 and μ2 is based on the assumption that the population distributions are normal and the values of the population variances are known to the investigator. However, both of these assumptions are unnecessary if the test procedure is performed on large sample sizes (Devore, 2008). When this test procedure is applied to compare the average misclassification rate from two algorithms, i.e.
model 1 and model 2, the hypothesis test can be expressed as follows:

$$H_0: \mu_1 - \mu_2 = 0, \qquad H_a: \mu_1 - \mu_2 \ne 0, \qquad z = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_1^2}{m} + \dfrac{s_2^2}{n}}}, \qquad (2.5)$$

where
$\mu_1$ = the true mean misclassification rate for model 1
$\mu_2$ = the true mean misclassification rate for model 2
$\bar{x}_1$ = the sample average of the misclassification rate for model 1
$\bar{x}_2$ = the sample average of the misclassification rate for model 2
$s_1^2$ = the sample variance for model 1
$s_2^2$ = the sample variance for model 2
$m$ = the number of samples for model 1
$n$ = the number of samples for model 2

These tests are usually appropriate if both m and n are more than 40. $H_0$ is rejected if the p-value is smaller than the desired type I error. If $H_0$ is rejected, the result confirms that there is a statistically significant difference between the mean misclassification rates resulting from model 1 and model 2. Otherwise, $H_0$ fails to be rejected, which means the misclassification rate resulting from model 1 is not statistically significantly different from the one resulting from model 2.

2.3 Generating Correlated Ordinal Data

In order to evaluate and compare the performance of two models with a small data size, simulation is used to generate artificial data (Ibrahim & Suliadi, 2011). Additionally, if the artificial data is generated based on a particular data set in which the responses within a specific subject (respondent) are correlated and the responses between subjects are independent, then the artificial data are classified as correlated ordinal data and are commonly generated based on the marginal probabilities and the correlation coefficient (Demirtas, 2006; Ibrahim & Suliadi, 2011; Lee, 1997). Many studies discuss procedures to generate correlated binomial data based on the marginal probabilities and correlation coefficient, but only a few algorithms are available to generate correlated ordinal data. Some methods to generate ordinal data are developed from methods to generate binomial data (Lee, 1997; Sebastian, Dominik, & Friedrich, 2011).

Several algorithms have been proposed to generate correlated ordinal data.
A technique proposed by Gange (1995) uses the iterative proportional fitting algorithm for generating correlated ordinal data. This method determines the marginal joint distribution based on the log-linear model. However, this method requires intensive computation, even for a small number of variables (Demirtas, 2006; Ibrahim & Suliadi, 2011). Another method proposed by Lee (1997) simulates correlated ordinal data using a convex combination and Archimedean copulas approach and computes the correlation coefficient using Goodman and Kruskal's coefficient. This approach does not require the same intensive level of calculation as the one suggested by Gange (1995), so any number of categories and variables can be handled easily using this method. Unfortunately, this method cannot handle a negative correlation coefficient. Biswas (2004) generates correlated ordinal data for a specific type of correlation (autoregressive-type correlation). This method requires the variables to be independent and identically distributed. Thus, this method is very restrictive.

Another algorithm that has relatively high flexibility is suggested by Demirtas (2006). This algorithm uses the generation of binary data as the intermediate step and computes correlation using Pearson's product-moment correlation coefficient. Ordinal values of the original data are collapsed into binary values. Then, iterative calculations are conducted to compute the binary correlation and convert the binary data into ordinal data based on the original marginal distribution. A shortcoming of this method is its inability to handle negative correlations.

Based on the pros and cons of the available algorithms to generate correlated ordinal data, the decision to choose one algorithm over the others depends on the type of correlation coefficient. If the simulated variables could have a negative correlation coefficient, then the method proposed by Gange (1995) is the preferred algorithm.
In circumstances when simulated variables have an autoregressive-type correlation, the algorithm introduced by Biswas (2004) is the preferred choice. Alternatively, when simulated variables have positive correlation coefficients, either the algorithm proposed by Demirtas (2006) or that of Lee (1997) can be used. The difference between the two algorithms is the type of correlation used in the simulation. Demirtas (2006) applies Pearson's product-moment correlation coefficient and Lee (1997) applies the Gamma correlation coefficient. This study uses the convex combination algorithm proposed by Lee, since this algorithm requires only a simple calculation and uses the Gamma correlation coefficient, a type of correlation that is suitable for ordinal data.

The three main steps to generate correlated ordinal data using the convex combination algorithm proposed by Lee (1997) are 1) finding the extreme table, 2) finding the joint distribution, and 3) applying the inversion algorithm. The extreme table is used to check whether the preferred Gamma correlation is achievable with the given marginal probabilities. The joint distribution is determined by applying linear programming to the convex combination of the extreme table. The last step applies the inversion algorithm to generate the correlated ordinal observations.

2.4 Generating Correlation Coefficients

A simulation to generate correlated ordinal data requires marginal probabilities and correlation coefficients. The correlation coefficients for correlated ordinal data are commonly presented in a correlation matrix. Since a correlation matrix has to be symmetric and positive semidefinite, an algorithm is needed to ensure that this requirement is fulfilled when correlation coefficients are generated. Let $r_{ij}$ be the correlation coefficient between $x_i$ and $x_j$, where $x_1, x_2, \ldots, x_n$ are random variables. A correlation matrix is a symmetric and positive semidefinite matrix form of $r_{ij}$.
All entries in a correlation matrix have a value in the interval [-1, 1], and the diagonal entries are equal to one.

One method to generate correlation matrices is by randomly generating correlation matrices without considering particular settings (Budden, Hadavas, Hoffman, & Pretz, 2007; Joe, 2006; Olkin, 1981). In this method, correlation matrices are randomly generated based on the upper and lower bounds set for each entry, which are not always [-1, 1], in order to guarantee that the matrices are positive semidefinite and that their diagonal entries are equal to one. Applying this approach to generate a p-dimensional correlation matrix R allows the first entries to be generated independently in the interval [-1, 1], while the remaining entries (except the diagonal entries) are constrained to a specific interval. This specific interval depends upon the value of the first entries and the sequence of the partial correlations being generated.

Consider 4x4 correlation matrices. The correlation matrix is in the form of

$$R = \begin{pmatrix} 1 & r_{12} & r_{13} & r_{14} \\ r_{12} & 1 & r_{23} & r_{24} \\ r_{13} & r_{23} & 1 & r_{34} \\ r_{14} & r_{24} & r_{34} & 1 \end{pmatrix}$$

The following procedure randomly generates 4x4 correlation matrices without considering particular settings, as suggested by Budden et al. (2007). The first step in generating correlation matrices is to generate the correlation coefficients $r_{12}$, $r_{13}$, and $r_{14}$, each of which can be randomly generated from U(-1, 1). The second step is to determine the lower and upper limits of the other correlation coefficients in order to ensure that the generated matrices are symmetric and positive semidefinite. A symmetric matrix is positive semidefinite if and only if the matrix and all of its principal (symmetric) submatrices have a nonnegative determinant. This means that if C is a correlation matrix, then det C ≥ 0 and every submatrix of the form

$$\begin{pmatrix} 1 & r_{ij} & r_{ik} \\ r_{ij} & 1 & r_{jk} \\ r_{ik} & r_{jk} & 1 \end{pmatrix}$$

is also a correlation matrix for $i, j, k \in \{1, 2, 3, 4\}$, with no two of i, j, and k equal.
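To make the determinant condition concrete for a single 3x3 submatrix, the sketch below solves det ≥ 0 for the third coefficient once the first two are fixed. The closed-form interval follows from expanding the 3x3 determinant, 1 + 2·r12·r13·r23 − r12² − r13² − r23² ≥ 0, as a quadratic in r23. It covers only the (1, 2, 3) submatrix; the additional limits needed for r24 and r34 in the full 4x4 procedure of Budden et al. (2007) are not reproduced here, and the function names are hypothetical.

```python
import math
import random

def det3(r12, r13, r23):
    """Determinant of the 3x3 correlation matrix built from three coefficients."""
    return 1 + 2 * r12 * r13 * r23 - r12 ** 2 - r13 ** 2 - r23 ** 2

def r23_bounds(r12, r13):
    """Interval of r23 values for which det3(r12, r13, r23) >= 0."""
    half_width = math.sqrt((1 - r12 ** 2) * (1 - r13 ** 2))
    center = r12 * r13
    return center - half_width, center + half_width

def random_corr3(rng):
    """Draw r12, r13 ~ U(-1, 1), then r23 uniformly within its feasible interval."""
    r12 = rng.uniform(-1, 1)
    r13 = rng.uniform(-1, 1)
    low, high = r23_bounds(r12, r13)
    r23 = rng.uniform(low, high)
    return [[1.0, r12, r13],
            [r12, 1.0, r23],
            [r13, r23, 1.0]]

rng = random.Random(2012)  # fixed seed so the illustration is repeatable
matrix = random_corr3(rng)
print(matrix, det3(matrix[0][1], matrix[0][2], matrix[1][2]))
```

When r12 = r13 = 0 the feasible interval is the full range [-1, 1], and it collapses to a single point as either coefficient approaches ±1, which illustrates why the bounds on later entries are generally narrower than [-1, 1].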
Three limits on the possible range of the remaining correlation coefficients ($r_{23}$, $r_{24}$, $r_{34}$) are then determined to satisfy the positive semidefinite requirement, in addition to the symmetry condition of a correlation matrix, $r_{ij} = r_{ji}$.

Another method is to randomly generate correlation matrices with particular settings, such as eigenvalues or expected values, and the distribution of entries (Marsaglia & Olkin, 1984). Among the available methods that generate correlation matrices based on the distribution of the entries, the Wishart distribution is the most commonly used (Gentle, 2003). Although the Wishart distribution is originally known as the probability distribution of the covariance matrix, many studies have applied it to generate correlation matrices, since a correlation matrix can be calculated from a covariance matrix. The elements of a correlation matrix can be obtained by dividing the (i, j) element of the covariance matrix by the square root of the product of the ith and jth diagonal elements of the covariance matrix (Gentle, 2003). In addition, the dimension p of the correlation matrices and the mean of the randomly generated matrices should be known a priori in order to generate correlation matrices based on the Wishart distribution.

This study compares the performance of the OLR and ANN models in analyzing ordinal data by fitting ordinal data collected from two training restaurants to both models. The OLR and ANN models are built to analyze the internal link of the Service Profit Chain (SPC). The concepts of the SPC and the training restaurant are used as the framework and research basis for the case study. The following subsections review relevant literature about the concept of training restaurants, the service profit chain, and employee satisfaction.
2.5 Training Restaurant

Training restaurants, production kitchens, and industrial training placements provide practical elements and vocational settings in food and beverage management curricula. Training restaurants function as learning environments to deliver a mix of practical leadership and management skills to students. In this type of restaurant, students not only learn food production and service, but they also learn managerial skills and techniques (Alexander, 2007). Therefore, students are required to fulfill different responsibilities (either in the kitchen area or in the service area) during their practical activities in training restaurants. For instance, a student who makes salad on one particular day may become a team captain or a waiter on another day.

Although the main purpose of training restaurants is not to generate profit, training restaurants are required to generate revenue to cover their operational costs (Alexander et al., 2009). Hospitality departments that operate training restaurants expect the training restaurants to become more cost-effective so that the department is able to reduce its subsidy, and the restaurant can gradually achieve financial autonomy. Operating without any subsidy means that a training restaurant has been successful in creating a realistic learning condition, effectively mixing training and profit making. Therefore, training restaurants should be treated and managed not only as laboratories, but also as business entities. The summary of training restaurant characteristics and a comparison to profit-oriented restaurants is presented in Table 2.1.
Table 2.1 Comparisons of training restaurants and profit-oriented restaurants

                Profit-oriented Restaurant     Training Restaurant
Main Purpose    Profit generating              Learning media and revenue generating
Employee        Regular paid employees         Students
                Relatively fixed position      Rolling position/responsibility
                Unpredictable turnover         Periodic turnover rate

The unique characteristics of training restaurants may present obstacles to these restaurants gaining profit. According to Nies (1993), more than half of the training restaurants owned by various schools in the US are located inside the school area and operate within limited hours during the school's instructional period. These characteristics may limit public access to dining in training restaurants. In addition, training restaurants experience frequent and predictable turnover because different groups of students operate the restaurants for each instructional period (semester/quarter). A high turnover rate requires the restaurants to find creative ways to maintain good relationships with their customers, since the familiarity that commonly supports good relationships between frontline employees and customers is diminished.

2.6 The Service Profit Chain

Heskett et al. (1994) introduce the Service Profit Chain (SPC) as a comprehensive framework of relationships between employee, customer, and profitability. In a service industry, the theory posits that internal service quality influences employee satisfaction. Internal service quality refers to employees' perceptions of their working environment, various aspects of their job, and their relationships with peers and supervisors. A satisfied employee tends to deliver better service and product value to the customer. A higher perceived service and product value leads to higher customer satisfaction. In turn, a satisfied customer tends to be a loyal customer. By having loyal customers, an organization experiences higher growth and profit levels.
This proposition is supported by empirical studies from various service companies, such as Southwest Airlines and Taco Bell. Figure 2.4 illustrates the proposition of this concept.

Figure 2.4 The links in the Service Profit Chain (Heskett et al., 1994)
[Figure: linked boxes labeled Internal Service Quality, Employee Satisfaction, External Service Value, Customer Satisfaction, Customer Loyalty, Growth, and Profitability, under the group labels Internal/Employee, External/Customer, and Organization's Success Measures]

The SPC is recognized by many researchers as the best model to guide service organizations in achieving higher organizational performance (Herington & Johnson, 2010). Many empirical studies test some of the linkages and their results strengthen specific aspects of this framework. For example, Maritz and Nieman (2008) examine the relationships between the service profit chain initiatives (represented by retention and sales volume) and service quality dimensions, whereas Gelade and Young (2005) find that customer satisfaction mediates the relationship between employee attitudes and organizational performance.

2.6.1 Link between Employee and Customer Satisfaction

Many studies demonstrate a positive correlation between customer satisfaction and employee satisfaction (Chi & Gursoy, 2009; Koys, 2003). Other studies show that the relationship between customer satisfaction and employee satisfaction gets stronger if the employees have higher loyalty (Gelade & Young, 2005; Schlesinger & Zornitsky, 1991). Furthermore, Gelade and Young (2005) suggest that positive employee experience, as demonstrated by positive attitudes such as satisfaction and commitment and by positive evaluations of organizational climate, is closely related to high levels of customer satisfaction. Thus, employees who have positive feelings about their workplace deliver positive effects when they carry out their work. This emotion is perceived and absorbed by the customer. As a result, customers experience pleasant service encounters.

2.6.2 Link between Customer Satisfaction and Organizational Success Measures

The Service Profit Chain (SPC) suggests that profit and other measures of success used in an organization are positively correlated with customer satisfaction (Heskett & Sasser, 2010). This SPC proposition is supported by other studies which find that customer satisfaction is positively correlated with nonfinancial performance (Schneider, 1991; Tornow & Wiley, 1991) and with financial performance as well (Anderson, Fornell, & Lehmann, 1994; Rust & Zaborik, 1993). The types of financial and nonfinancial measures chosen in a study depend on a company's operation. For example, Tornow and Wiley (1991) use two nonfinancial indicators (right first time, on time) and three financial indicators (contract retention, revenue retention and service gross profit) to test the relationship between customer satisfaction and organizational performance in a computer service company.

From another perspective, Anderson and Mittal (2000) suggest that the relationship between satisfaction and repurchase in the retail industry is nonlinear. In that case, dissatisfaction has a greater impact on repurchase intent than satisfaction, and the impact of satisfaction on repurchase intent is greater at the extremes. In addition, they also show that at a certain point, the increased cost to improve customer satisfaction is likely to outweigh the beneficial effects of further customer satisfaction. Therefore, diminishing returns apply when relating customer satisfaction to profitability.

2.6.3 Link between Employee Satisfaction and Organization's Success Measures

Some studies find that sales and profitability as measures of business performance have a significant relationship with employee satisfaction and employee retention. Reichheld (1993) explains that a loyal employee tends to establish good relationships with customers.
In turn, these relationships will increase customer loyalty, and as a result, increase profitability. Thus, in service industries, employee retention has a significant role because it has a positive relationship with customer retention (Reichheld, 1993). Similarly, Koys (2001) studied this relationship in some outlets of a restaurant chain and found that there was a significant relationship between employee satisfaction and financial performance.

In contrast, Bernhardt et al. (2000) and Chi and Gursoy (2009) found that there is no significant relationship between employee satisfaction and financial performance. Similarly, a meta-analysis of employee perception and business performance finds that there is only a small relationship between business unit productivity and profitability on the one hand and employee engagement on the other (Harter, Schmidt, & Hayes, 2002). This study explains that customer satisfaction mediates the relationship between employee satisfaction and profitability; thus, there is either only a small relationship or even a nonsignificant relationship between employee satisfaction and profitability (Harter, Schmidt, & Hayes, 2002).

2.7 Employee Satisfaction

Disposition (temperament), work environment and culture are key determinants of employee satisfaction according to Saari and Judge (2004). Disposition includes employee personality traits, core self-evaluation, the perception of the job itself, extraversion and conscientiousness. Even though organizations cannot directly influence employee personalities, the use of appropriate selection methods and good alignment between employees and job tasks help to ensure that people are selected for, and placed into, jobs most appropriate for them. In addition, job variation, job range/scope and autonomy of the job are required to ensure the work environment remains interesting and challenging (Love & O'Hara, 1987).
Four areas of cross-cultural differences among the employees are individualism versus collectivism, uncertainty avoidance versus risk taking, power distance or the extent to which power is unequally distributed, and achievement-oriented versus non-achievement-oriented. Because of the potential for cross-cultural misinterpretation, managers should be aware of, and adjust for, cultural factors that influence employee attitude and satisfaction (Saari & Judge, 2004).

Another study conducted by Gostick and Elton (2007) explores the relationship between employee satisfaction and employee engagement or employee involvement in an organization. The study measures employee engagement based on employee perception of the opportunity to do satisfying work, acceptance of opinion by the manager, feeling accepted as a team member by peers and supervisors, and the manager's recognition (Gostick & Elton, 2007). Internal service quality is also suggested as a determinant factor of employee satisfaction (Fitzsimmons & Fitzsimmons, 2008). According to these authors, internal service quality is related to employee perceived value of selection and development programs, rewards and recognition, access to information to serve the customers, workplace technology, and job design.

Previous studies explore the determinants of employee satisfaction in dining services by using the same constructs as employee satisfaction studies in other areas (Gazzoli, Hancer, & Park, 2010; Salanova, Agut, & Peiro, 2005; Susskind, Kacmar, & Borchgrevink, 2007; Tepeci & Bartlett, 2002). Salanova et al. (2005) use autonomy, organizational resources (such as technology and training offered), engagement, and service climate as employee satisfaction drivers.
In addition, other factors such as role conflict, physical work environment, relationship with peer workers, relationship with superior, and dispositional influence are used as employee satisfaction drivers (Gelade & Young, 2005; Martensen & Granholdt, 2001; Matzler, Fuchs, & Schubert, 2004; Maxham, Netemeyer, & Lichtenstein, 2008; Salanova et al., 2005; Timothy & Chester, 2004). Based on the previous research, this study uses the constructs shown in Table 2.2 to develop the student questionnaires used in the survey.

Table 2.2 Constructs of employee satisfaction

Dimensions               Constructs/Dimension
Internal Determinants    - Dispositional influence/self-motivation (Gelade & Young, 2005; Saari & Judge, 2004)
External Determinants    - Development of competencies, engagement (Salanova et al., 2005)
                         - Superior relationships, working condition, peer relations (Martensen & Granholdt, 2001)
                         - Job clarity, recognition, reward (Saari & Judge, 2004)

Based on all of these perspectives, the determinants of employee satisfaction can be classified into two groups: internal and external. The internal determinants come from within the employees themselves, while the external determinants are triggered by the work and organizational conditions. The internal determinants come from the subjective characteristics of employees, which can be either created before they work in the company or after they join the company. On the other hand, the external determinants come from the work environment, which can be influenced by the internal service quality, work conditions, coworkers, leaders and subordinates.

The SPC concept posits that satisfied employees tend to have a better performance when they serve a customer. In the training restaurant setting, the employees are the students, who work in the restaurant during a particular semester/quarter as part of a course.
The students who work in training restaurants are required to rotate through positions, such as serving customers, greeting and directing, and managing the operation of the day. Thus, the students are expected to understand all of the products offered and the procedures followed during the operation, as well as to become skilled at delivering service and managing a restaurant (Maxham et al., 2008; Alexander et al., 2009). Based on the previous research, this study uses the constructs shown in Table 2.3 to develop the instructor questionnaires used in the survey.

Table 2.3 Constructs of student performance

Dimensions                         Constructs/Dimension
Students In-Role Performance       - Knowledge of product, knowledge of procedure (Maxham et al., 2008)
                                   - Production skill, service skill, managerial skill (Alexander et al., 2009)
Employee Extra-Role Performance    - Intention to satisfy customer, intention to go beyond duty (Maxham et al., 2008)

CHAPTER III

RESEARCH METHODOLOGY

3.1 Introduction

This chapter describes the research procedures designed to compare the performance of the Ordinal Logistic Regression (OLR) and Artificial Neural Network (ANN) models when analyzing ordinal data. In this study, the OLR and ANN models are used to test two relationships in the Service Profit Chain (SPC): (1) the relationship between employee perceived value of the internal and external determinants of employee satisfaction and employee overall satisfaction, and (2) the relationship between employee overall satisfaction and job performance. Before building the OLR and ANN models, the study undertakes some preparatory steps, such as checking for missing values and outliers as well as examining data distributions. Since the total number of students who work at each of the sampled training restaurants is relatively small (n < 30), this study generates additional correlated ordinal data using simulations to build the OLR and ANN models.
The preferred model is the one with the lowest averaged misclassification rate, which is calculated as the proportion of disagreement between the predicted outcome from a model and the actual outcome from a testing data set.

The first step in this research is to create a conceptual framework in order to analyze possible relationships between student overall satisfaction and job performance in a training restaurant by applying the internal link of the Service Profit Chain (SPC) model. This step includes exploring factors that may affect student overall satisfaction and job performance. The second step is to design a data collection plan for use in two different training restaurants, Taylors' Dining Room at Oklahoma State University, USA, and Fajar Teaching Restaurant at Universitas Negeri Malang, Indonesia.

The next step is to generate simulated data that have marginal probability distributions and correlation coefficients similar to data collected from the surveys at both training restaurants. Additional sets of data are also generated using random marginal probabilities and correlation coefficients. Two groups of data are generated in the simulation. One group consists of two variables (one input and one outcome variable) and refers to the effect of employee overall satisfaction on job performance. The other group consists of four variables (three input variables and one outcome variable) and refers to the effect of student perceived value of three determinants of employee satisfaction on student overall satisfaction.

Data generated using simulations are split into two data sets, a training data set and a testing data set. Each training or testing data set consists of 50 paired data points (predictors and outcome). Both the OLR and ANN models are fitted to training data sets and used as classifiers (frozen models). The models resulting from this step are used to predict the outcome category of all predictor data points in the testing data sets.
The performance of the OLR and ANN models is measured by the misclassification rate, the proportion of disagreement between the predicted outcome from a model and the actual outcome from a testing data set. The last step in this research is to compare the mean misclassification rates resulting from the constructed OLR and ANN models. The framework of the overall methodology used in this research is presented in Figure 3.1.

Figure 3.1 The framework of the research methodology

Step 1: Develop Conceptual Framework
- Develop conceptual model of student-employee satisfaction and student performance in training restaurants
- Develop list of constructs and items that influence employee satisfaction and performance in restaurant service industry

Step 2: Design Data Collection Plan
- Design survey instruments
- Determine scale of measurement
- Develop sampling plan and survey administration plan
- Obtain IRB approval

Step 3: Generate Simulated Data
- Determine marginal probabilities and correlation coefficients
- Generate random marginal probabilities
- Generate random correlation coefficients
- Generate simulated data based on marginal probabilities and correlation coefficients

Step 4: Build Model
- Build ordinal logistic regression and artificial neural network model
- Set model evaluation metric
- Record misclassification rate for each model
- Compare misclassification rates

3.2 Research Step 1: Conceptual Frameworks

This study follows the proposition from previous literature regarding the effect of employee perceived value of the internal and external determinants of employee satisfaction on employee overall satisfaction and the effect of employee overall satisfaction on job performance. The conceptual framework of this study is illustrated in Figure 3.2.

Figure 3.2 The conceptual framework of the study
[Figure: Student perceived value of internal & external determinants of employee satisfaction -> Student overall satisfaction -> Job performance]

The propositions are:
1: Student perceived value of employee satisfaction determinants affects overall satisfaction.
2: Student overall satisfaction affects job performance.

3.3 Research Step 2: Data Collection Plan

This study conducted surveys to collect data. Based on the two categories of respondents who filled out the questionnaires, two types of instruments were used in this study: a student-employee instrument and an instructor instrument. The questions used in these instruments were based on previous studies in order to ensure the questions had both validity and reliability. The student-employee instrument contained nine constructs/dimensions identified by Salanova et al. (2005), Martensen and Granholdt (2001), and Saari and Judge (2004), while the instructor instrument contained questions identified by Maxham et al. (2008) and Alexander et al. (2009). Both instruments contained only closed-ended questions. A list of constructs used in the student instrument is shown in Table 2.2.

3.3.1 Initial Instrument and Pretest

Before applying for Institutional Review Board (IRB) permission, the initial instruments were finalized. The initial instruments contained the following sections: 1) Brief explanation of the research project, including the title and the objective; 2) Confidentiality of the participants, procedure and risks, contact information and the expected length of time to take the survey; 3) Questionnaires. After the development of the initial instruments, a comprehensive discussion with faculty members was conducted to receive feedback related to the order of the questions, language, general structure of questionnaire items, and the appearance of the instruments. The constructs and items used in the student and instructor questionnaires are listed in Subsections 3.3.4 and 3.3.5, respectively. The IRB approval to conduct surveys at FTR and Taylors' Dining can be found in Appendices 2a and 2b.
Additionally, the questionnaires used in the survey at Taylors' Dining and FTR can be found in Appendices 3a, 3b, 3c, 3d, 3e and 3f.

3.3.2 Pilot Test

A pilot test of the student instrument was administered to ten students who were taking the Managing Café class in the Culinary Program at the Universitas Negeri Malang. The purpose of the pilot test was to assess the length of time needed to complete the survey as well as to conduct face validity and initial reliability analyses. The study examined reliability based on internal consistency measures using Cronbach's Alpha test. Data collected from the pilot test is shown in Appendix 2. The obtained alpha for each construct shown in Table 3.1 was higher than 0.7, the recommended value of alpha for a reliable scale (Turk et al., 2011). Thus, the alphas obtained indicated that the constructs in the instrument had acceptable inter-item reliability.

Table 3.1 Reliability Alpha on pilot data

Construct                                  Number of items    Cronbach's Alpha
Development of competencies                6 items            0.816
Recognition                                3 items            0.714
Working condition                          4 items            0.721
Reward                                     6 items            0.790
Engagement                                 5 items            0.850
Peer relationship                          4 items            0.777
Superior relationship                      6 items            0.855
Job clarity                                5 items            0.741
Dispositional influence/self-motivation    3 items            0.738

3.3.3 Instrument Validity

Validity indicates the ability of an instrument to measure the intended concepts (Turk et al., 2011). The study evaluated the validity of the instrument by investigating its face validity. Face validity, a basic index of content validity, indicates the degree to which the items in the instrument appear to measure the intended concept (Turk et al., 2011). To ensure the face validity of the instruments, the research advisor and the outside committee member provided feedback on the initial instrument. This repetitive process resulted in rewording some questions. The manager of each training restaurant also provided some comments on the instruments.
These comments created differences between the student instruments used in the Fajar Teaching Restaurant and Taylors' Dining Room. For example, there are no questions related to compensation for students at Taylors' Dining since students work in this restaurant as part of a class. However, there are two questions related to compensation for students at the other training restaurant since they are paid for their work. The manager in Taylors' Dining also recommended deleting some questions in the student instrument because of the repetitiveness of the questions. For example, the FTR survey contains four questions related to how the students were rewarded, while the Taylors' Dining survey contains only two. As a result, the student instrument used in FTR has more questions (42 questions) than the one used in Taylors' Dining (29 questions). The other difference is related to the preferred terminology for the student employee. The managers of FTR and Taylors' Dining recommended using “employee” and “student lab,” respectively, as the term that refers to student employees in the questionnaire. The pilot test revealed that the instrument did not cause problems in terms of the clarity of the questions and language.

3.3.4 Student Instrument

The student instrument measures the students' perceived value of some factors that influence their overall satisfaction as student-employees in the training restaurant. The student instrument consists of two sections. The first section contains 42 items identified by Salanova et al. (2005), Martensen and Granholdt (2001), and Saari and Judge (2004) and uses a seven-point Likert scale. In this part, '1' indicates that the student “strongly disagrees” with the statement on the instrument, while '7' represents strong agreement with the statement being asked. The statements in this section evaluate the student perceived value of internal service factors as well as external factors that may influence his/her satisfaction.
The second section intends to measure student overall satisfaction. This section has two questions and uses a seven-point Likert scale. In this section, '1' indicates that the student is “very dissatisfied” with his/her working experience during the lab session at the restaurant, while '7' indicates that the student is “very satisfied.” At the end, the student is asked to write down his/her name so that his/her responses can be paired up with the instructor's responses related to his/her job performance. Table 3.2 presents the constructs and items used in the student questionnaire. See Appendices 3a and 3c for the student instrument used in Taylors' Dining and FTR.

Target Population. The target population for this instrument was student-employees in the training restaurants. The study employed convenience sampling to collect data. The samples were all students who worked in the Taylors' Dining and FTR during the survey period.

Sample size. There were 28 student-employees at Taylors' Dining Room and 24 student-employees at Fajar Teaching Restaurant.

Survey Administration. This study administered the surveys by distributing the instrument to all student-employees before the morning briefing. After filling out the instrument, student-employees returned the instrument to the front desk.

Table 3.2 Student questionnaire items

Constructs and items*

Reward (6 items)
  Q1a. I am fairly rewarded for the experience I have
  Q1b. I am fairly rewarded for the stresses of my job
  Q1c. I am fairly rewarded for the effort I put forth
  Q1d. I am fairly rewarded for the work I have performed well
  Q22. The pay system is based on achievement
  Q23. The pay system is transparent

Engagement (5 items)
  Q2a. When decisions about employee are made at FTR, complete information is collected for making those decisions
  Q2b. When decisions about employee are made at FTR, all sides affected by the decisions are presented
  Q2c. When decisions about employee are made at FTR, the decisions are made in timely fashion
  Q2d. When decisions about employee are made at FTR, useful feedback about the decision and their implementation are provided
  Q20. My manager involves me in planning the work of my team

Superior relationship (7 items)
  Q2e. My supervisor/manager treat me with respect and dignity
  Q2f. My supervisor/manager works very hard to be fair
  Q2g. My supervisor/manager shows concern for my rights as a student employee
  Q10. I know how the instructor evaluates my performance
  Q13. My superior is trustworthy
  Q24. My supervisor gives me feedback when I perform poorly

Development of competencies (6 items)
  Q4. My job provides me the opportunity to develop a wide range of my skills
  Q6. My job allows me to utilize the full range of my educational training
  Q7. The training I have received has prepared me well for the work I do
  Q8. I believe I have the opportunity for personal development at FTR
  Q30. Employees in our organization have knowledge of the job to deliver superior quality product and service
  Q31. Employees in our organization have the skill to deliver superior quality work and service

Recognition (2 items)
  Q5. My job is important to the success of this restaurant
  Q32. Employees receive recognition for delivery of superior product and service
  Q25. My supervisor gives me feedback when I do a better job than average

Working condition (4 items)
  Q14. I have sufficient authority to do my job well
  Q21. Work environment is pleasant
  Q26. I have autonomy to decide the order of tasks I perform
  Q33. Employees are provided with tools, technology and other resources to support the delivery of quality product and service

Peer relationship (4 items)
  Q15. Most employees that I worked with are likeable
  Q16. Employees are team oriented
  Q18. People are treated with respect in my team, regardless of their job
  Q19. The people in my teams are willing to help each other, even if it means doing something outside their usual duties

Job clarity (5 items)
  Q3. I understand what I have to do on my job
  Q9. I am able to satisfy the conflicting demands of various people I work with
  Q11. I know what the people I work with expect of me
  Q12. I feel that I can get information needed to carry out on my job
  Q17. I have a clear understanding of the goals and objectives of this restaurant as a whole

Dispositional influence/self-motivation (3 items)
  Q27. I am enthusiastic about my job
  Q28. I am proud of the work I do
  Q29. I feel happy when I am working hard

*Items written in italics were removed for Taylors'

3.3.5 Instructor Instrument

Another type of instrument used in this study is the instructor evaluation. This questionnaire has three parts. The first section has seven questions identified by Maxham et al. (2008) and Alexander et al. (2009). This section aims to measure student performance during the working period at the training restaurant, which includes knowledge of product, knowledge of procedure, production skill, service skill, and managerial skill. This section uses a seven-point Likert scale, in which '1' indicates that a student has a poor performance and '7' indicates that a student has an excellent performance. The second section has two questions and aims to measure the student's intent to go beyond the minimum requirement. This second section uses a seven-point Likert scale, in which '1' indicates that the student has very low intent to go beyond the minimum requirement and '7' indicates very high intent. The third section, which contains two questions, measures student effort level to satisfy customers based on how often this attribute is observed in the student's daily work. This section uses a seven-point Likert scale, in which '1' indicates that the student never puts effort into satisfying customers and '7' indicates that the student always tries to satisfy customers.
Table 3.3 presents the constructs and items used in the instructor instrument. See Appendices 3b and 3d for the complete instructor instrument used in Taylors' Dining and FTR. The items listed in the instructor instrument were the same for both training restaurants.

Target Population. The target population for this type of instrument was the instructors who were responsible for supervising all students who operated each restaurant. The instructors evaluated the job performance of each student based on his/her production and service skill during the lab session at the training restaurant. The study used convenience sampling to collect instructor evaluations. The samples were all instructors who supervised the students in Taylors' Dining and FTR during the survey period.

Sample size. Only one instructor supervised each training restaurant.

Survey Administration. The study administered the survey by distributing a list of performance measurement items to the instructors during the last week of the survey period. The instructors then assessed each student's performance.

Table 3.3 Instructor questionnaire items

Constructs and items

Students In-Role Performance (8 items)
  Q1a. How do you rate this student in terms of performance with regard to knowledge of the restaurant products?
  Q1b. How do you rate this student in terms of performance with regard to knowledge of opening procedures?
  Q1c. How do you rate this student in terms of performance with regard to knowledge of closing procedures?
  Q1d. How do you rate this student in terms of performance with regard to all required tasks specified in his/her role as a student in a laboratory?
  Q3a. How do you rate this student in terms of performance with regard to production skill?
  Q3b. How do you rate this student in terms of performance with regard to service skill?
  Q3c. How do you rate this student in terms of performance with regard to managerial skill?
  Q2. How do you rate this student in terms of overall performance?

Students Extra-Role Performance (3 items)
  Q4. How do you rate this student's intention to go above and beyond “the call of duty”?
  Q5. How do you rate this student's intention to voluntarily do extra or nonrequired work in order to help customer?
  Q6. How often did the student willingly go out of his/her way to make a customer satisfied?

3.4 Research Step 3: Generating Simulated Data

A common method to test the performance of statistical and/or machine learning models with a small sample size is to perform a simulation study on generated artificial data. In this study, a student's responses within the student-employee questionnaire were assumed to be correlated, while the responses between any two student surveys were assumed to be independent. Additionally, responses within an instructor's questionnaire for any given student were also assumed to be correlated, while the instructor's evaluations for different students were assumed to be independent. The simulated data was generated to mimic the students' responses and the instructors' evaluations that were collected from the surveys. Therefore, this study generated correlated ordinal data to test the performance of the OLR and ANN models, mirroring the assumption that the survey data were correlated within subjects and independent between subjects.

There were two groups of data sets generated in this study. The first one consisted of three predictor variables and one outcome variable, while the second one consisted of one predictor variable and one outcome variable. The first data set referred to the link between student-employee perceived value of employee satisfaction determinants and overall satisfaction, while the second data set referred to the link between student-employee overall satisfaction and job performance.
Since there were only 24 and 28 student responses collected from Fajar Teaching Restaurant and Taylors' Dining, respectively, this study used only 3 out of the 42 items listed as employee satisfaction determinants as the predictor variables in the first data set. The purpose of using only three items is to follow the rule of thumb suggested by Peng, Lee, and Ingersoll (2002) and Churchill and Brown (2007) regarding the ratio of predictors to observations, which is 1:10. The study selected the input variables based on the gamma correlation coefficient as suggested by Guyon and Elisseeff (2003). The three employee satisfaction determinants that had the highest Goodman-Kruskal gamma correlation coefficient with the student-employee overall satisfaction were chosen as the predictor variables in the first data set. The study uses the Goodman-Kruskal gamma to express the correlation coefficient because this coefficient is commonly used to measure correlation between ordinal variables when there is a large number of ties in the data set, as in this case study (Lee, 1997). The three predictor variables for the first data set from Taylors' Dining were “understanding what to do,” “enthusiastic feeling” and “opportunity to develop skill.” The predictor variables for the data set from FTR were “understanding what to do,” “proud to be a worker” and “opportunity to develop skill.”

Three scenarios were carried out to generate each group of data sets:
1) Using marginal probabilities and correlation coefficients obtained from the Taylors' Dining Room data set;
2) Using marginal probabilities and correlation coefficients obtained from the Fajar Teaching Restaurant data set; and
3) Using randomly generated marginal probabilities and correlation coefficients to simulate a more general case.
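
As an illustration of the selection rule described above, the following Python sketch computes the Goodman-Kruskal gamma between each candidate item and the outcome and keeps the items with the largest coefficients. It is not the code used in this study; the function names goodman_kruskal_gamma and top_predictors and the small response vectors are hypothetical and serve only to show the calculation, in which tied pairs are ignored.

```python
from itertools import combinations


def goodman_kruskal_gamma(x, y):
    """Gamma = (C - D) / (C + D), where C and D count concordant and
    discordant pairs of observations; tied pairs are ignored."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:
            concordant += 1
        elif direction < 0:
            discordant += 1
    return (concordant - discordant) / (concordant + discordant)


def top_predictors(items, outcome, k=3):
    """Return the names of the k items with the highest gamma against the outcome."""
    scores = {name: goodman_kruskal_gamma(values, outcome)
              for name, values in items.items()}
    return sorted(scores, key=scores.get, reverse=True)[:k]


# Hypothetical seven-point Likert responses from six respondents (not study data).
overall_satisfaction = [5, 6, 7, 4, 6, 7]
candidate_items = {
    "item_a": [5, 6, 7, 4, 5, 7],
    "item_b": [3, 6, 6, 5, 6, 7],
    "item_c": [7, 5, 4, 6, 6, 5],
    "item_d": [4, 6, 7, 4, 6, 6],
}
print(top_predictors(candidate_items, overall_satisfaction, k=3))
```

Ranking by the signed coefficient follows the wording "highest" in the text; ranking by absolute value would be an alternative if negatively associated items were also of interest.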
For each scenario, 1,000 simulation runs, the number suggested by Dietterich (1998), were performed to account for variation in the training and testing data and for internal randomness. Each run generated 100 data points, consisting of 50 training data points and 50 testing data points. The training data generated in each run were used to build both the Ordinal Logistic Regression (OLR) and Artificial Neural Network (ANN) models. These two models were then used to predict the outcome from the predictor variables in the testing data set. The last step was to calculate the misclassification rate as the proportion of disagreement between the outcome predicted by the model and the actual outcome in the testing data set. Smaller misclassification rates were preferred.

3.4.1 Procedure to Generate Ordinal Correlated Data

This study applied the convex combination method suggested by Lee (1997) to generate correlated ordinal data based on the marginal probabilities and the correlation coefficient. The simulations to generate the data were carried out using SAS 9.3. The correlation coefficient used in the simulation was expressed as Goodman and Kruskal's gamma. According to Ibrahim and Suliadi (2011), the convex combination method requires less computation than the iterative proportional fitting method proposed by Gange (1995) and provides more flexibility than the method provided by Biswas (2004). The convex combination method was carried out in two stages. The first stage found the joint distribution based on the marginal distributions and the gamma correlation coefficient; the second stage generated ordinal random values using the inversion algorithm. To validate the results of the convex combination method, this study conducted a mean rank test to compare the generated data with the desired marginal probabilities and correlation coefficients.
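The two stages just described can be sketched for the bivariate case as follows. This is an illustrative Python sketch, not the SAS 9.3 program used in the study, and it rests on several assumptions that are not stated in the source: the extreme tables are built with a greedy pairing of the margins, the mixing weight is found by bisection, and all function names (`gamma_of`, `extreme_table`, `mix`, `joint_for_gamma`, `sample_pairs`) are hypothetical. The margins and the target gamma are taken from the FTR values in Tables 4.1 and 4.2.

```python
import random


def gamma_of(table):
    """Goodman-Kruskal gamma of an ordered two-way table (counts or probabilities)."""
    conc = disc = 0.0
    for i, row in enumerate(table):
        for j, a in enumerate(row):
            for lower in table[i + 1:]:
                for l, b in enumerate(lower):
                    if l > j:
                        conc += a * b
                    elif l < j:
                        disc += a * b
    return (conc - disc) / (conc + disc)


def extreme_table(p, q, reverse=False):
    """Greedy coupling of margins p (rows) and q (columns).

    reverse=False pairs low levels with low levels (maximal table);
    reverse=True pairs low levels with high levels (minimal table).
    """
    order = list(range(len(q)))
    if reverse:
        order.reverse()
    left_p, left_q = list(p), list(q)
    table = [[0.0] * len(q) for _ in p]
    i = c = 0
    while i < len(p) and c < len(q):
        j = order[c]
        mass = min(left_p[i], left_q[j])
        table[i][j] += mass
        left_p[i] -= mass
        left_q[j] -= mass
        if left_p[i] <= 1e-12:
            i += 1
        if left_q[j] <= 1e-12:
            c += 1
    return table


def mix(lam, t_max, t_min):
    """Convex combination lam * t_max + (1 - lam) * t_min, cell by cell."""
    return [[lam * a + (1 - lam) * b for a, b in zip(ra, rb)]
            for ra, rb in zip(t_max, t_min)]


def joint_for_gamma(p, q, target, tol=1e-12):
    """Stage 1: bisect on lam until the mixed table has the target gamma."""
    t_max, t_min = extreme_table(p, q), extreme_table(p, q, reverse=True)
    lo, hi = 0.0, 1.0  # gamma is -1 at lam = 0 and +1 at lam = 1
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if gamma_of(mix(mid, t_max, t_min)) < target:
            lo = mid
        else:
            hi = mid
    return mix((lo + hi) / 2, t_max, t_min)


def sample_pairs(joint, n, rng):
    """Stage 2: inversion sampling of (row, column) index pairs from the joint table."""
    cells = [(i, j, pr) for i, row in enumerate(joint)
             for j, pr in enumerate(row) if pr > 0]
    pairs = []
    for _ in range(n):
        u, cum = rng.random(), 0.0
        for i, j, pr in cells:
            cum += pr
            if u < cum:
                break
        pairs.append((i, j))
    return pairs


# Margins from the FTR cross-tabulation (Table 4.2); target gamma from Table 4.1.
p_satisfaction = [2 / 24, 5 / 24, 9 / 24, 8 / 24]   # levels 4, 5, 6, 7
q_performance = [5 / 24, 11 / 24, 8 / 24]           # levels 5, 6, 7
joint = joint_for_gamma(p_satisfaction, q_performance, target=0.57)
pairs = sample_pairs(joint, 100, random.Random(1))
```

Because every convex combination of two tables with the same margins keeps those margins, only the association changes with the mixing weight. The sketch covers one predictor and one outcome; the multivariate case, which the study handles with linear programming, is not attempted here.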
The procedure to find the joint distribution can be summarized as follows:
1. Identify two extreme tables, the maximal table ($\pi_{max}$, corresponding to the largest attainable gamma) and the minimal table ($\pi_{min}$, corresponding to the smallest attainable gamma).
2. Find $\lambda$, with $0 \le \lambda \le 1$, by considering the joint distribution table $\pi(\lambda) = \lambda \pi_{max} + (1 - \lambda)\pi_{min}$. As long as $\lambda$ can be identified, $\pi(\lambda)$ exists.
3. Find joint distributions that meet the univariate and bivariate margins using linear programming.

3.4.2 Procedure to Generate Random Marginal Probabilities

Random marginal probabilities were generated from the uniform distribution provided in IBM SPSS Statistics. Since the data collected from the training restaurants were on a seven-point Likert scale, the study generated the marginal probability for each response category based on the following distribution (see Table 3.4):

Table 3.4 The distribution of random marginal probabilities

Category response level      Rule to generate marginal probability
Category level 7, p7         $p_7 \sim U(0, 1)$
Category level 6, p6         $p_6 \sim U(0, 1 - p_7)$
Category level 5, p5         $p_5 \sim U(0, 1 - (p_6 + p_7))$
Category level 4, p4         $p_4 \sim U(0, 1 - (p_5 + p_6 + p_7))$
Category level 3, p3         $p_3 \sim U(0, 1 - (p_4 + p_5 + p_6 + p_7))$
Category level 2, p2         $p_2 \sim U(0, 1 - (p_3 + p_4 + p_5 + p_6 + p_7))$
Category level 1, p1         $p_1 = 1 - (p_2 + p_3 + p_4 + p_5 + p_6 + p_7)$

where $p_i$ denotes the proportion of responses in category $i$.

The study started generating the marginal probabilities with the highest response category in order to give the higher response categories more flexibility to vary, since survey data are commonly negatively skewed. The rules presented in Table 3.4 were developed after discussion with the committee member to ensure random and reasonable marginal distributions in the simulated data.
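The rules in Table 3.4 can be sketched in a few lines. This is a minimal illustration in Python rather than the IBM SPSS Statistics syntax used in the study; the function name `random_marginals` and the seed are hypothetical.

```python
import random


def random_marginals(n_levels=7, rng=random):
    """Draw p7 first, then p6 down to p2, each from U(0, remaining mass).

    The lowest category, p1, receives whatever probability is left,
    so the result always sums to one.
    """
    remaining = 1.0
    probs = {}
    for level in range(n_levels, 1, -1):
        probs[level] = rng.uniform(0.0, remaining)
        remaining -= probs[level]
    probs[1] = remaining
    return [probs[level] for level in range(1, n_levels + 1)]


rng = random.Random(2012)          # arbitrary seed, for repeatability only
marginals = random_marginals(rng=rng)
print([round(p, 3) for p in marginals])
```

Because each draw is bounded by the mass still available, the categories drawn first (the highest response levels) tend to receive the largest probabilities, which produces the negatively skewed shapes the study intends.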
3.4.3 Procedure to Generate the Correlation Coefficient and Correlation Matrices

A single correlation coefficient, used to correlate student-employee overall satisfaction and job performance, was generated from the uniform distribution provided in IBM SPSS Statistics. The lower limit of the correlation coefficient was set at 0.27, based on the lower 95% bound of the correlation coefficient between employee satisfaction and job performance in previous research conducted by Judge, Thoresen, Bono, and Patton (2001). The upper limit was set at 0.96, the highest correlation coefficient between employee satisfaction and job performance found in the literature (Judge et al., 2001). After establishing the lower and upper limits, the correlation coefficient was generated as $r \sim U(0.27, 0.96)$.

Random correlation matrices were needed to generate data sets with three predictor variables and one outcome variable, which represented the relationship between three student-employee satisfaction determinants and overall satisfaction. To ensure that the generated random matrices conformed to the characteristics of correlation matrices (symmetric and positive semidefinite), this study generated 4 x 4 correlation matrices following the algorithm suggested by Budden et al. (2007). In this algorithm, $r_{ij}$ is the correlation coefficient between $x_i$ and $x_j$, where $x_1, x_2, \ldots, x_n$ are random variables and $n$ is the total number of random variables. For $i = 1$ and $j = 2, 3, 4$, three correlation coefficients ($r_{12}$, $r_{13}$, and $r_{14}$) could be randomly generated using a uniform $(-1, 1)$ distribution. The other correlation coefficients ($r_{23}$, $r_{24}$, and $r_{34}$) should be randomly chosen from the intervals provided by the algorithm to ensure the symmetry and positive semidefiniteness of the matrices. Since this study found that all variables were positively correlated with each other, $r_{1j} \sim U(0, 1)$, where $j = 2, 3, 4$.
Additionally, the minimum values of $r_{23}$, $r_{24}$, and $r_{34}$ were set at 0, and the maximum values followed the upper limits given by the algorithm.

3.4.4 Procedure to Validate Generated Data

The study performed a mean rank test, a nonparametric rank-based test for ordered categorical responses, to determine whether the generated data had the same distribution as the original data. This test was performed to ensure that the algorithm used to generate correlated ordinal data worked properly. The study conducted the Wilcoxon test and the Mann-Whitney test to validate the generated data, since these are the most commonly used rank tests for ordered categorical data (Agresti, 2010; Leech, Barrett, & Morgan, 2011).

3.5 Research Step 4: Build Model

This study used two model-building techniques, Ordinal Logistic Regression (OLR) and Artificial Neural Network (ANN), to test two relationships in the service-profit chain. Before constructing the OLR and ANN models, the study carried out several preparation steps, such as checking for missing values and outliers and calculating skewness and kurtosis. Since the total number of students who worked at the training restaurants was relatively small, this study also used data generated from a simulation to build the OLR and ANN models. The performance of the OLR and ANN models was measured by the misclassification rate. The model with the lowest misclassification rate was preferred.

3.5.1 Artificial Neural Network

Within the ANN model, a specific activation function is used to connect two layers (the input and output layers). The number of nodes in the input and output layers is used to determine the number of nodes in the hidden layer. The type of activation function used in the model depends on the outcome range in the output layer. Other aspects to be considered during the building process are the network architecture and topology, and the learning algorithm. This study built the ANN models using IBM SPSS Modeler 14.2.
Based on the options available in this software package, the steps carried out to build the ANN model were as follows:
1. Determine the objective: build a new model.
2. Determine the type of network architecture: a multilayer perceptron (MLP).
3. Determine the number of nodes in the hidden layer.
4. Determine the stopping rules.
5. Determine the percentage of records used for an overfit prevention set.

3.5.2 Ordinal Logistic Regression

The OLR model is an extension of logistic regression used to analyze ordinal data. The OLR method is the most appropriate and practical technique for analyzing the effect of independent variables on a rank-ordered dependent variable because the dependent variable cannot be assumed to be normally distributed or to be interval data (Lawson & Montgomery, 2006). The OLR model fit depends on the number of independent variables and on the link function selected during the model-building phase. This study built the OLR models using IBM SPSS Modeler 14.2. Based on the options available in IBM SPSS Modeler 14.2, the steps to build the OLR models were as follows:
1. Determine whether the intercept is included in the model or not.
2. Specify the link function.
3. Specify the parameter estimation method.
4. Determine the scale parameter estimation method.
5. Specify the iteration rule to control the parameters for model convergence.

3.5.3 Comparing Model Performance

This study used the misclassification rate to measure the performance of the constructed OLR and ANN models. The misclassification rate was measured as the ratio of the total number of wrong classifications across all classes to the total number of observations used in the model. For example, since the variables used in this study were measured on a seven-point Likert scale, the misclassification rate was calculated from the total number of wrong classifications summed over response categories one to seven, divided by the total number of observations.
A misclassification occurred when the category predicted by the model was not the same as the actual category in the testing data. A lower misclassification rate indicates better model performance. In IBM SPSS Modeler 14.2, the misclassification rate is presented along with the confusion matrix. This matrix has an appearance similar to a contingency table and contains information on the actual and predicted classifications made by the specified model. The dimension of this matrix depends on the number of actual and predicted response categories.

Using the data generated from the simulation, this study built 1,000 OLR and 1,000 ANN models to compare the misclassification rates obtained from each model. There were 1,000 $x_1$ and $x_2$ values calculated, where $x_1$ and $x_2$ referred to the misclassification rates resulting from the OLR and ANN models, respectively. The number of misclassification rates collected from each model was large enough (n > 30) to apply the central limit theorem to test the difference between the average misclassification rates resulting from the OLR and ANN models. Based on the central limit theorem, the assumption of normally distributed populations was unnecessary since the test was performed on large samples (Devore, 2008). Since the population variance was unknown, the test used the sample variance. The hypothesis test was as follows:

$$H_0: \mu_1 = \mu_2 \quad \text{and} \quad H_1: \mu_1 \neq \mu_2$$

$$Z = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_1^2}{1000} + \dfrac{s_2^2}{1000}}} \qquad (3.1)$$

where
$\mu_1$ = the true mean misclassification rate for the ordinal logistic regression model
$\mu_2$ = the true mean misclassification rate for the artificial neural network model
$\bar{x}_1$ = the sample average of the misclassification rates resulting from the OLR model
$\bar{x}_2$ = the sample average of the misclassification rates resulting from the ANN model
$s_1^2$ = the sample variance of the misclassification rates resulting from the OLR model
$s_2^2$ = the sample variance of the misclassification rates resulting from the ANN model

For $\alpha = 0.05$, $\alpha/2 = 0.025$, so that $Z_{\alpha/2} = 1.96$ and $Z_{1-\alpha/2} = -1.96$ (two-sided test).
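The two calculations described in this subsection, the misclassification rate from a confusion matrix and the large-sample Z statistic in Equation 3.1, can be sketched as follows. This is an illustrative Python sketch, not output from IBM SPSS Modeler. The function names and the small 3 x 3 confusion matrix are hypothetical. The summary statistics passed to the Z test are the scenario 1 values listed in Table 4.6, used here only as example inputs.

```python
import math


def misclassification_rate(confusion):
    """Share of cases off the main diagonal of a square confusion matrix.

    Rows are actual categories and columns are predicted categories.
    """
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    return (total - correct) / total


def two_sample_z(mean1, sd1, n1, mean2, sd2, n2):
    """Z statistic of Equation 3.1 and its two-sided p-value."""
    standard_error = math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    z = (mean1 - mean2) / standard_error
    p_value = math.erfc(abs(z) / math.sqrt(2))  # equals 2 * (1 - Phi(|z|))
    return z, p_value


# Hypothetical confusion matrix for 50 testing data points and three categories.
example_confusion = [
    [8, 3, 1],
    [4, 15, 5],
    [1, 6, 7],
]
rate = misclassification_rate(example_confusion)   # 20 of 50 cases are wrong

# Means and standard deviations as listed in Table 4.6 (OLR first, ANN second).
z, p_value = two_sample_z(0.4536, 0.07539, 1000, 0.4556, 0.07420, 1000)
print(rate, round(z, 2), round(p_value, 2))
```

With these inputs the Z statistic is about -0.60 and the two-sided p-value is about 0.55, so the sketch would not reject the null hypothesis at α = 0.05. The p-value is obtained from the complementary error function so that no external statistics library is needed.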
$H_0$ is rejected if the p-value is smaller than the desired type I error rate ($\alpha$). If $H_0$ is rejected, the study concludes that there is a statistically significant difference between the mean misclassification rates resulting from the OLR and ANN models. Otherwise, the study fails to reject $H_0$, which means that the mean misclassification rate resulting from the OLR models is not statistically significantly different from the one resulting from the ANN models.

3.6 Summary

This chapter presents the detailed procedures used to compare the Ordinal Logistic Regression (OLR) and Artificial Neural Network (ANN) models for analyzing ordinal data. These procedures can be grouped into four steps. The first step is to develop the framework model. The study uses the internal links of the Service Profit Chain (SPC) as the framework to compare the OLR and ANN models. The internal links used in this study consist of two causal links: the link between employee perceived value of the internal and external determinants of employee satisfaction and employee overall satisfaction, and the link between employee overall satisfaction and job performance.

Based on the framework outlined in the first step, the second step is to design a data collection plan. The study conducts surveys in two training restaurants, Taylors' Dining Room at Oklahoma State University, USA, and Fajar Teaching Restaurant (FTR) at Universitas Negeri Malang, Indonesia. Students and instructors are the respondents for the surveys.

The third step is to generate correlated ordinal data using the simulation method proposed by Lee (1997). The simulated data are generated from marginal probabilities and correlation coefficients similar to those of the data collected from Taylors' Dining (scenario 1) and FTR (scenario 2), while the last set of simulated data has random marginal probabilities and random correlation coefficients (scenario 3). The simulated data in this study can be grouped into two sets.
The first set is needed to test the relationship between student overall satisfaction and job performance. This data set consists of one input variable and one output variable. The other set is used to test the relationship between three determinants of student overall satisfaction and student overall satisfaction itself. This data set consists of four variables: three determinants of student overall satisfaction (input) and student overall satisfaction (output). For each set, the correlated ordinal data are generated from 1,000 simulation runs with 100 observations (50 training and 50 testing data points) in each run.

The last step is to build the OLR and ANN models using each training data set generated from the simulations, as explained previously. The performance of the OLR and ANN models is compared based on the mean of the misclassification rates from the testing data sets. The mean of the misclassification rates is calculated as the average of the proportion of disagreement between the outcome predicted by the model and the actual outcome in the testing data. A hypothesis test on the mean of the misclassification rates is used to identify conditions in which the OLR model outperforms the ANN model and vice versa.

CHAPTER IV

THE ORDINAL LOGISTIC REGRESSION AND ARTIFICIAL NEURAL NETWORK WITH ONE INPUT VARIABLE

4.1 Introduction

This chapter presents the Ordinal Logistic Regression (OLR) and Artificial Neural Network (ANN) models that were built using one input variable. The input variable in this case was the student overall satisfaction and the output variable was the student performance. The input variable was obtained from the student instrument, while the output variable was obtained from the instructor instrument. To compare the performance of the OLR and ANN models, three scenarios were designed.
The first scenario was to build both models using simulated data with marginal probability distributions and a correlation coefficient similar to those of the data collected from the survey at Taylors' Dining. The second scenario was to construct both models using simulated data with marginal probability distributions and a correlation coefficient similar to those of the data collected from the survey at Fajar Teaching Restaurant (FTR). The last scenario was to build both models using randomly generated correlated ordinal data based on random marginal probabilities and correlation coefficients.

4.2 Preparation Steps

Before constructing the models, a review was performed to determine whether there were any missing values in any data set. The initial check showed that there were no missing values in the data collected from either restaurant, Taylors' Dining or FTR. There were 24 and 28 student responses from FTR and Taylors' Dining, respectively. In addition, there were 24 and 28 responses received from the instructors who evaluated the student performance in each restaurant.

The study also explored the marginal probabilities of each collected data set. As shown in Figures 4.1 and 4.2, the distributions of the student overall satisfaction and student performance data from both restaurants were negatively skewed. This meant that most students rated their overall satisfaction with the student lab as “neutral” or higher, and most students were assessed by the instructor as having good performance or higher. The skewness values of the student overall satisfaction data collected from Taylors' Dining and FTR were -1.447 and -0.566, respectively. Additionally, the skewness values of the student performance data collected from Taylors' Dining and FTR were -0.955 and -0.208, respectively. The skewness values indicated that the student overall satisfaction and performance data collected from Taylors' Dining were more negatively skewed than those collected from FTR.
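The FTR skewness values can be checked from the marginal counts in Table 4.2. The sketch below assumes the bias-adjusted sample skewness formula, $G_1 = \frac{n}{(n-1)(n-2)} \sum (x - \bar{x})^3 / s^3$; the source does not state which formula was used, and the function name `sample_skewness` is hypothetical.

```python
import math


def sample_skewness(counts_by_level):
    """Bias-adjusted sample skewness from a {response level: count} table."""
    n = sum(counts_by_level.values())
    mean = sum(level * c for level, c in counts_by_level.items()) / n
    sum_sq = sum(c * (level - mean) ** 2 for level, c in counts_by_level.items())
    sum_cube = sum(c * (level - mean) ** 3 for level, c in counts_by_level.items())
    s = math.sqrt(sum_sq / (n - 1))          # sample standard deviation
    return n / ((n - 1) * (n - 2)) * sum_cube / s ** 3


# Marginal counts from the FTR cross-tabulation (Table 4.2), n = 24.
ftr_satisfaction = {4: 2, 5: 5, 6: 9, 7: 8}
ftr_performance = {5: 5, 6: 11, 7: 8}

print(round(sample_skewness(ftr_satisfaction), 3))  # -0.566
print(round(sample_skewness(ftr_performance), 3))   # -0.208
```

Under this assumption the computed values match the magnitudes reported for FTR and carry a negative sign, consistent with the negatively skewed distributions described above.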
The kurtosis values of the student overall satisfaction data collected from Taylors' Dining and FTR were 1.993 and -0.507, respectively. The kurtosis values indicated the “peakedness” (positive kurtosis) of the student overall satisfaction data collected from Taylors' Dining and the flatness (negative kurtosis) of the data collected from FTR.

Figure 4.1 Marginal probability distributions of input and output data in Taylors' Dining (one input variable). The chart plots marginal probabilities (%) by response level (4 to 7) for student overall satisfaction and student performance.

Figure 4.2 Marginal probability distributions of input and output data in FTR (one input variable). The chart plots marginal probabilities (%) by response level (4 to 7) for student overall satisfaction and student performance.

To construct the OLR and ANN models, each student's response to the overall satisfaction statement was paired with the instructor's assessment of that student's performance. All students in FTR put their names on the questionnaire, while seven of the twenty-eight students in Taylors' Dining did not put their names on the surveys. Thus, the study was not able to calculate the correlation coefficient for the full data set collected from Taylors' Dining. Instead, the correlation coefficient between student overall satisfaction and student performance in Taylors' Dining was assumed to be similar to the correlation coefficient obtained from FTR. The gamma correlation coefficients between student overall satisfaction and performance based on the data collected from FTR and from Taylors' Dining (excluding the students' responses without names) were 0.57 and 0.63, respectively. Thus, the correlation coefficients from the two training restaurants were assumed to be comparable.
The correlation coefficient between student overall satisfaction and performance based on the data collected from FTR is shown in Table 4.1. The results in Table 4.1 show that the obtained gamma (a correlation coefficient for ordinal scales) is .570 with a significance level of .008, which means that student overall satisfaction is positively correlated with student performance, assuming α = .01. On the other hand, the obtained Pearson coefficient (a correlation coefficient for interval scales) is .438 with a significance level of .032, which means that this correlation is not statistically significant at α = .01. These results indicate that treating ordinal data as different scales, either interval or ordinal, may result in a different correlation coefficient and significance level. The study uses the obtained gamma correlation coefficient to generate correlated ordinal data for scenario 1 (the Taylors' Dining Room scenario) and scenario 2 (the Fajar Teaching Restaurant scenario) in order to treat the ordinal data with a relevant ordinal analysis.

Table 4.1 Correlation coefficient between student overall satisfaction and performance

                                      Value    Approx. Sig.
Ordinal by Ordinal     Gamma          .570     .008
Interval by Interval   Pearson's R    .438     .032
N of Valid Cases                      24

4.3 Validating Algorithm to Generate Correlated Ordinal Data

As explained in section 4.2, some students in Taylors' Dining did not put their names on the questionnaire, so their responses could not be paired with the instructor responses. This study therefore used the data collected from FTR to validate the algorithm applied to generate correlated ordinal data. The cross-tabulated data from FTR and from its initial simulated data set are shown in Tables 4.2 and 4.3. By inspection, Tables 4.2 and 4.3 show that the difference between the marginal probabilities for each response category in the data obtained from FTR and from the simulation ranges from 0.7% to 9.5%.
Table 4.2 Cross tabulated data from Fajar Teaching Restaurant

Student overall                  Instructor perception toward student performance
satisfaction                     5         6         7         Total
4.00    Count                    1         1         0         2
        % of Total               4.2%      4.2%      .0%       8.3%
5.00    Count                    1         4         0         5
        % of Total               4.2%      16.7%     .0%       20.8%
6.00    Count                    2         4         3         9
        % of Total               8.3%      16.7%     12.5%     37.5%
7.00    Count                    1         2         5         8
        % of Total               4.2%      8.3%      20.8%     33.3%
Total   Count                    5         11        8         24
        % of Total               20.8%     45.8%     33.3%     100.0%

Table 4.3 Cross tabulated data of the first generated correlated ordinal data set

Student overall                  Instructor perception toward student performance
satisfaction                     5.00      6.00      7.00      Total
4.00    Count                    5         2         2         9
        % of Total               5.0%      2.0%      2.0%      9.0%
5.00    Count                    8         4         1         13
        % of Total               8.0%      4.0%      1.0%      13.0%
6.00    Count                    5         35        7         47
        % of Total               5.0%      35.0%     7.0%      47.0%
7.00    Count                    4         10        17        31
        % of Total               4.0%      10.0%     17.0%     31.0%
Total   Count                    22        51        27        100
        % of Total               22.0%     51.0%     27.0%     100.0%

To determine whether the mean ranks of the survey data and the simulated data were statistically different, a mean rank test was also carried out. The mean ranks for the survey data (data collected from FTR) and the simulated data are shown in Table 4.4, while the Wilcoxon test and Mann-Whitney test results are shown in Table 4.5.

Table 4.4 Mean rank for student overall satisfaction and performance

Variable                                       Group            N      Mean Rank   Sum of Ranks
Student overall satisfaction                   Survey data      24     61.27       1470.50
                                               Simulated data   100    62.80       6279.50
                                               Total            124
Instructor evaluation on student performance   Survey data      24     65.40       1569.50
                                               Simulated data   100    61.81       6180.50
                                               Total            124

Table 4.4 shows that the mean rank of the student overall satisfaction variable from the survey data is lower than the one from the simulated data, while the mean rank of the student performance variable from the survey data is higher than the one from the simulated data. Assuming α = 0.01, the asymptotic significance values for the student overall satisfaction and student performance, as shown in Table 4.5, are 0.842 and 0.632, respectively.
Both of these significance values are greater than the specified α. Thus, there is no significant difference between the mean ranks of FTR's student overall satisfaction and student performance data and those of the simulated data. These results suggest that the algorithm used to generate the correlated ordinal data is valid and can be used for further analyses.

Table 4.5 Mean rank test statistics

                           Student overall satisfaction   Student performance
Mann-Whitney U             1170.500                       1130.500
Wilcoxon W                 1470.500                       6180.500
Z                          .199                           .479
Asymp. Sig. (2-tailed)     .842                           .632

4.4 Scenario 1

This scenario generated data with marginal probabilities similar to those of the data collected from Taylors' Dining. As mentioned in section 4.2, the correlation coefficient used in this scenario was assumed to be similar to that of the data collected from Fajar Teaching Restaurant. The study performed 1,000 runs of the simulation to generate 1,000 data sets with 100 observations in each data set. The 100 observations were then split into two sets: 50 observations were used as a training data set and the other 50 were used as a testing data set.

The marginal probabilities of student overall satisfaction and student performance, as shown in Figure 4.1, were negatively skewed, which meant that the data were likely to be concentrated in the higher response levels. Therefore, a cumulative log-log function was more appropriate for use as the OLR link function than the other available cumulative functions, such as the cumulative logit or probit (Agresti, 2010; Chen & Hughes, 2004).

The study used the multilayer perceptron (MLP) as the network architecture in the ANN model since this architecture is more appropriate for predictive classification problems (Turban, Sharda, & Delen, 2011). The automatic option available in IBM SPSS Modeler was chosen to set the hidden layer, since Nisbet et al. (2009) describe the automated neural networks in IBM SPSS as very powerful.
This option let the software determine the number of nodes in the hidden layer that made the model fit the data set best. The biggest benefit of using the automatic option was that the software automatically searched over the decision surface with different initial learning rates, different momentum values, and different numbers of hidden layers in order to find the best parameters for the model (Nisbet et al., 2009). The study allocated 30% of the data set as an overfit prevention set, which was used to track errors during the training process in order to prevent an overfitted model. The descriptive statistics of the misclassification rates for the OLR and ANN models for scenario 1 are shown in Table 4.6.

Table 4.6 Descriptive Statistics of Misclassification Rates from Scenario 1 (one input variable)

                              N      Range   Min   Max   Mean    Std. Deviation
OLR misclassification rate    1000   .44     .22   .66   .4536   .07539
ANN misclassification rate    1000   .42     .24   .66   .4556   .07420
Valid N (listwise)            1000

Table 4.6 indicates that the mean misclassification rates obtained from the OLR and ANN models differed only slightly and that the maximum values were identical. Additionally, there were only small differences between the ranges and standard deviations resulting from the two models.

4.5 Scenario 2

This scenario generated data with marginal probabilities and a correlation coefficient similar to those of the data collected from Fajar Teaching Restaurant. The study performed simulations similar to those explained in Scenario 1. The marginal probabilities of the student overall satisfaction and the student performance, as shown in Figure 4.2, were negatively skewed. This meant that the data were likely to be concentrated in the higher response levels. Thus, the cumulative log-log function was more appropriate for use as the OLR link function than the other available cumulative functions, such as the cumulative logit or probit (Agresti, 2010; Chen & Hughes, 2004).
The ANN models for scenario 2 were built using the same approach as in scenario 1. This scenario also applied the multilayer perceptron (MLP) network architecture and the automatic option for the hidden layer setting because, according to Nisbet et al. (2009), the automated neural networks provided by IBM SPSS Modeler are very powerful. To avoid obtaining an overfitted model, the study also allocated 30% of the data set as an overfit prevention set.

The descriptive statistics of the misclassification rates for the OLR and ANN models for scenario 2 are shown in Table 4.7. This table shows that the range, minimum, and maximum values of the misclassification rates obtained from the OLR and ANN models were exactly the same. The mean misclassification rate from the OLR models was slightly lower than the one from the ANN models. Additionally, only a small difference was found between the standard deviations of the misclassification rates that resulted from the two models.

Table 4.7 Descriptive Statistics of Misclassification Rates from Scenario 2 (one input variable)

                              N      Range   Min.   Max.   Mean    Std. Deviation
OLR misclassification rate    1000   .44     .20    .64    .4033   .07595
ANN misclassification rate    1000   .44     .20    .64    .4065   .07500
Valid N (listwise)            1000

4.6 Scenario 3

Scenario 3 generated correlated ordinal data based on random marginal probabilities and correlation coefficients using the uniform random generator available in IBM SPSS Statistics 19.0. The random number generator in IBM SPSS has a period of $2^{32}$, which means that the software can generate $2^{32}$ random numbers with a uniform distribution before it begins to repeat itself (McCullough, 1999). A previous study suggested that a random number generator with a period of $2^{31}$ is acceptable for generating 1,000 data points (L'Ecuyer & Hellekalek, 1998). Another study, conducted by Knuth (1997), suggested that a more modest period of $2^{31}$ could be used to generate one million random numbers.
Therefore, the random number generator provided by IBM SPSS Statistics 19.0 is acceptable for generating the random numbers needed in 1,000 runs of the simulation.

As explained in section 3.4.3, the lower limit of the correlation coefficient was set at 0.27 and the upper limit was set at 0.96. These limits were determined from the lower 95% bound and the highest value of the correlation coefficient between employee satisfaction and job performance reported in the previous research conducted by Judge et al. (2001). With these lower and upper limits, the correlation coefficient was generated following $r \sim U(0.27, 0.96)$. The distribution of the generated correlation coefficients used in this scenario is shown in Figure 4.3. This figure shows that the generated correlation coefficients are fairly evenly distributed among all intervals, with the highest concentrations in the first and the last intervals.

Figure 4.3 The distribution of the generated correlation coefficients. The chart plots frequency by correlation coefficient interval.

The rules shown in Table 4.8 were used to generate marginal probabilities for both the student overall satisfaction and the student performance variables and were developed following the discussion with the committee member to ensure of the pro





