A NEW MEASUREMENT SCALE FOR EMPLOYEE ENGAGEMENT: SCALE DEVELOPMENT, PILOT TEST, AND REPLICATION

CHRISTOPHER H. THOMAS
Department of Management
Northern Illinois University
DeKalb, IL 60115

INTRODUCTION

Kahn (1990, 1992) introduced the concept of personal engagement with work as “the harnessing of organizational members’ selves to their work roles; in engagement, people employ and express themselves physically, cognitively, and emotionally during role performances” (1990, p. 694). As a heightened personal and emotional investment in one’s job and work duties that goes beyond satisfaction or commitment, engagement has been linked to beneficial outcomes for both individuals and organizations (Gubman, 2004; Harter, Schmidt, & Hayes, 2002; Kahn, 1992; Robinson, Perryman, & Hayday, 2004; Salanova, Agut, & Peiró, 2005). However, development of the research stream has been hampered by conceptual inconsistencies and an overall lack of theoretical development. Moreover, it is unclear if engagement is a new, distinct construct, or if it merely represents an aggregate of other established constructs.

The purpose of this research was to clarify the underlying conceptual structure of engagement. The research was conducted in two phases. Phase 1 consisted of construct validation and scale development (cf. DeVellis, 2003; Hinkin, 1995, 1998), and Phase 2 consisted of a pilot test of the new scale, followed by data collection and analysis from a replication sample.

PHASE 1: CONSTRUCT DEFINITION AND VALIDATION

Although Kahn’s (1990) seminal work explored “discrete moments” of engagement (1992, p. 343), studying engagement as a psychological state remains consistent with his framework. Specifically, Kahn described psychological presence as an “experiential state enabling organization members to draw deeply on their personal selves in role performances” (1992, p. 321).
Extant research supports the contention that engagement, as a state, provides a complex perspective on individuals’ relationships with their work that accounts for consistent patterns of workplace behaviors (Maslach & Leiter, 1997; Schaufeli & Bakker, 2003). Previous researchers have attempted, with varying degrees of success, to empirically model engagement as a multidimensional state (May, Gilson, & Harter, 2004; Schaufeli & Bakker, 2003). The main departure of this project from previous research is that engagement was conceptualized as unidimensional. Specifically, although the consequences of engagement (i.e., behaviors) occur in three categories (physical, cognitive, and emotional), the state preceding these behaviors was modeled as unidimensional (Kahn, 1992; May et al., 2004).

Scale Development

Scale development was undertaken to create items to measure engagement as a unidimensional experiential state that is influenced by elements of the individual’s work context. This measure would account both for the affective-cognitive components of engagement as captured by existing instruments and for work-role behavioral intentions. Engagement is not an enduring personality trait that is generalizable across situations. Rather, it is a relatively stable psychological state. In terms of temporal stability (i.e., malleability), variance in individual levels of engagement can be caused by changes in the work environment or outside-life influences. Thus, employee engagement is a relatively stable psychological state influenced by interactions of individuals and their work environment. Engaged employees are characterized by a readiness and willingness to direct personal energies into physical, cognitive, and emotional expressions associated with fulfilling required and discretionary work roles.

Semi-Structured Interviews. The preliminary set of scale items was generated based on insights from existing literature and from information gained through semi-structured interviews.
Individuals occupying diverse occupations from various organizations were selected for these sessions. Males (n = 9) and females (n = 9) each represented 50% of the sample, and the interviewees ranged in age from 26 to 65 years with a mean age of 36.5 (SD = 9.9 years).

Engagement was strongly related to the job tasks. Respondents tended to connect engagement to times when they were performing tasks of personal significance to them. Also, when respondents had knowledge of tangible results and could quantify their progress toward providing a high-quality work product, they reported being more engaged. Respondents also linked engagement to times when they were particularly driven to be a noticeable asset to the firm. Individuals were curious and more creative; they approached their jobs in a thoughtful manner; they critically examined the elements and details of their tasks; they felt a sense of competence and were willing to take risks; they were more persistent and industrious than was normal for them; and they sought to provide high-quality products and services while being good representatives of their organization. Furthermore, among those who considered themselves to be generally dutiful, conscientious, or industrious, it seemed that these characteristics were magnified during times of engagement. The consensus was that when the right personal, task, and contextual elements came together, a state of engagement was the result.

Item Generation. Items were created to capture the state of mind described by interviewees. The newly created items were intentionally worded such that they could be used in diverse employment and research settings. An initial pool of eighty-one items was reduced to forty-four after a preliminary examination revealed problems such as overly complex wording, being too industry specific, or other violations of “best practices” for item structure (cf. Hinkin, 1998).

Expert Opinion.
The next step in establishing content validity followed the example of MacKenzie, Podsakoff, and Fetter (1991). Seven subject matter experts (SMEs) were asked to classify each item based on the degree to which it appeared to be an appropriate measure of the intended construct. Items that generated at least 70% agreement were retained. SMEs were also prompted to provide open-ended feedback. Seventeen items met the initial retention criteria. Two of these items were rated by at least one expert to be of “low relevance or not at all relevant,” and consequently were eliminated. Of two remaining items with a very similar focus, one was chosen at random for elimination. Improvements in item wording were made following a supplemental round of consultation with SMEs.

Content Adequacy. Content adequacy was assessed using the procedure described by Schriesheim et al. (1993). The Schriesheim et al. method is intended for multidimensional scales, and had to be amended for the proposed unidimensional scale. To approximate this method, the definitions and scale items from two distinct, but possibly related, constructs were included along with the definition and items for engagement. This allowed for analysis of whether engagement items were perceived as being adequate for measuring the intended construct, as well as whether the items were perceived as distinct from the other constructs. The sample consisted of fifty-nine MBA students at a large southeastern university. An extended matrix of respondent ratings was created for this analysis (cf. Hinkin & Tracey, 1999; Schriesheim et al., 1993). Four rows of data were generated for each respondent, representing their rating for each item on each of the three constructs being compared and a fourth category of “None of these / Other.” The sample of 59 respondents thereby created a dataset of 236 rows.
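The layout of such an extended matrix can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two comparison constructs are not named in the text, so "Construct B" and "Construct C" are placeholders; the three item names are hypothetical; and the ratings are synthetic values generated so that the example runs, not data from the study. Only the respondent count (59) and the four rating categories come from the text.

```python
import random
from statistics import mean

N_RESPONDENTS = 59  # sample size reported in the text

# Three constructs plus the fourth category; "Construct B" and
# "Construct C" are placeholder names (the source does not name them).
CATEGORIES = ["Engagement", "Construct B", "Construct C", "None of these / Other"]

# Hypothetical items, each mapped to the construct it is intended to measure.
ITEMS = {"eng_1": "Engagement", "b_1": "Construct B", "c_1": "Construct C"}


def build_extended_matrix(seed=0):
    """Return one row per respondent per category (59 x 4 rows)."""
    rng = random.Random(seed)
    rows = []
    for respondent in range(1, N_RESPONDENTS + 1):
        for category in CATEGORIES:
            row = {"respondent": respondent, "category": category}
            for item, intended in ITEMS.items():
                # Synthetic ratings: high on the intended construct, low elsewhere.
                if category == intended:
                    row[item] = rng.randint(4, 5)
                else:
                    row[item] = rng.randint(1, 3)
            rows.append(row)
    return rows


def mean_by_category(rows, item):
    """Mean rating of one item within each category."""
    return {
        category: mean(r[item] for r in rows if r["category"] == category)
        for category in CATEGORIES
    }


rows = build_extended_matrix()
print(len(rows))  # 59 respondents x 4 categories = 236 rows

# Check that each item has its highest mean on its intended construct.
highest = {}
for item, intended in ITEMS.items():
    means = mean_by_category(rows, item)
    highest[item] = max(means, key=means.get)
    print(item, "->", highest[item], "(intended:", intended + ")")
```

Stacking the ratings this way lets each item's mean be compared across categories before the stacked rows are submitted to factor analysis.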
Each item displayed the highest mean value on its intended construct, thus establishing preliminary evidence that the individual items were perceived as belonging to their designated construct. As no items had the highest mean value on the “Other” category, it was eliminated from further consideration. Next, the data were analyzed via exploratory factor analysis using principal axis factoring. Three distinct factors emerged from the data, with all but one item loading on its intended factor. The item that loaded incorrectly was a newly created engagement item. It was subsequently discarded.

PHASE 2: PILOT TEST, VALIDITY ASSESSMENT, REPLICATION

The next step was to administer the new scale, along with measures of other constructs, to a sample representative of the population of interest (Hinkin, 1995, 1998). This allowed assessment of factor structure as well as convergent and discriminant validity. An email was sent to 1,500 randomly selected StudyResponse.com members who met the following inclusion criteria: U.S. resident, employed full or part-time, and at least 18 years old. The initial sample included 527 responses. Sixty-one cases were removed due to incomplete data, resulting in a final sample of 466 responses (31% response rate). The mean age of this sample was 38.0 (SD = 10.8). Respondents indicated an average of 67.9 months (SD = 77.3) in their current job, and 70.8 months (SD = 77.4) with their current organization.

Factor Analysis

Both exploratory and confirmatory factor analyses were used to test and refine the scale. A split-sample method was utilized on the pilot-test data, such that 200 randomly selected observations were used for EFA, and the remaining 266 observations composed the CFA sample.

Exploratory Factor Analysis (EFA). EFA was conducted in SPSS using Principal Axis Factoring with Varimax rotation. The a priori assumption was that the items would generate a single-factor solution. The solution converged after nine iterations, with two extracted factors.
The eigenvalues for these factors were 9.94 and 1.46, respectively. Other than being reverse-scored, a distinct commonality did not exist among the three items composing the second factor. Researchers have documented cases where reverse-scored items introduce systematic error resulting in an artifactual response factor consisting of all negatively worded items (Harvey, Billings, & Nilan, 1985; Schmitt & Stults, 1985). Thus, these three items were removed, and the fifteen remaining items were reanalyzed. The 15-item EFA converged on the expected single-factor solution (α = .95). Three items generated relatively low item-total correlations, and also produced the lowest factor loadings. These items were eliminated and the twelve best-functioning items were retained for final scale development.

Confirmatory Factor Analysis (CFA). Using maximum-likelihood estimation via LISREL 8.52 (Jöreskog & Sörbom, 1996), a model was specified with the twelve items designated to load on a single factor. The fit of this model was marginal (χ² = 187.85, df = 54, p < .01; RMSEA = .10; TLI = .97; CFI = .98; GFI = .89). While TLI and CFI indicated good fit (Hu & Bentler, 1999; Lance & Vandenberg, 2002), GFI was lower than desired, and RMSEA fell outside the acceptable upper bound of .08. The residual covariance matrix was examined to determine the sources of poor fit (Byrne, 1998; Loehlin, 1998). The standardized residuals revealed three sets of problematic item pairs. In each case, the item pairs were capturing an overlapping portion of the engagement domain; thus, one item from each pair was eliminated. After these three items were eliminated, the remaining nine items were analyzed. The fit of this nine-item model improved significantly (χ² = 35.24, df = 27, ns; RMSEA = .03; TLI = .99; CFI = 1.00; GFI = .97). Scale alpha was .93.

Convergent and Discriminant Validity.
Gerbing and Anderson (1988) refer to this stage as verification of the external consistency of the measure, since the test ensures that items continue to be associated with their prescribed scale when examined among multiple measures. CFA has become the method of choice for this procedure (Bagozzi, Yi, & Phillips, 1991). The process is similar to the scale refinement process, but with additional measures included (Hinkin, 1998). The results of this expanded CFA supported the single-factor structure of work engagement, and established evidence of convergent and discriminant validity. All items had significant loadings and the fit of the model was good (χ² = 1857.79, df = 725, p < .01; RMSEA = .08; TLI = .96; CFI = .96; RFI = .93) (Hu et al., 1999; Lance et al., 2002; Millsap, 2002).

Additional constructs were chosen based on the assumption that they would occupy space within the nomological net of work engagement (e.g., organizational commitment, job satisfaction, turnover intentions, job meaningfulness, and work alienation). Meaningfulness was expected to positively influence the degree to which workers become engaged with their jobs. As expected, these two factors were positively correlated at a high level (r = .64, p < .01). Likewise, individuals displaying high levels of work alienation would not be expected to be engaged. This relationship was supported by a significant, negative association (r = -.46, p < .01). In terms of criterion-related validity, engaged workers would be expected to report higher levels of organizational commitment, greater job satisfaction, and lower turnover intentions. These assumptions were supported (r_oc = .51, p < .01; r_js = .52, p < .01; r_ti = -.40, p < .01). The associations indicated strong, predictable relationships, and also supported the distinctiveness of engagement from existing measures.

------------------------------------------
Insert Table 1 about here
------------------------------------------

Replication Study.
The final step was to gather data from an additional independent sample for replication purposes (Hinkin, 1998). Three units of a not-for-profit, community-owned health care system located in the southeastern United States were sampled. Unit managers were asked to distribute surveys at a weekly meeting. Respondents were informed that participation was voluntary, and that their responses were completely anonymous. Respondents were allowed to complete the surveys during working hours and returned them via a self-addressed stamped envelope. Ninety-eight surveys were distributed and 57 were returned during the next three weeks. The response rate was 58%.

The replication data further supported the findings from the pilot-test sample. To begin, the engagement scale generated an alpha value of .89. This value is slightly lower than the alpha value obtained from the pilot test, but was nonetheless indicative of internal consistency. Following the suggestions of Fan and Thompson (2001) regarding scale reliability, a 95% confidence interval for the alpha value was constructed to ensure that the entire range not only exceeded the minimally acceptable value of .70 suggested by Nunnally (1978), but also met his more stringent recommendation of .80 or greater for applied research (Lance, Butts, & Michels, 2006). The lower bound was .85 and the upper bound was .93.

Meaningfulness was strongly correlated with engagement in both samples (r_pilot = .64; r_replication = .65), as was organizational commitment (r_pilot = .51; r_replication = .58). Additional correlation data supported significant relationships between engagement and several behavioral outcomes: task performance behaviors (r = .49, p < .01), contextual performance behaviors (r = .77, p < .01), role innovation (r = .49, p < .01), and positive emotional displays (r = .55, p < .01). Various demographic characteristics were examined for significant associations.
Previous engagement research reports that engagement levels are higher for supervisors and managers than for employees at lower levels of the organization (Robinson et al., 2004; Schaufeli & Bakker, 2003). The current data supported these findings. Mean-level differences between staff employees, supervisors, and managers were significant (p < .05). Previous studies report that demographic differences in engagement are statistically non-significant, or that those differences that do exist are “significant, but weak,” or lacking “practical significance” (Robinson et al., 2004; Schaufeli & Bakker, 2003, p. 18). The current data followed these patterns in that males did report slightly higher engagement, and the relationship of engagement with age was in a positive direction; however, there were no statistically significant gender, age, or racial differences.

DISCUSSION

The number of engagement-related publications indicates a growing interest in engagement as a means of propelling individuals to make greater contributions to organizational success. This study bolsters existing engagement research, positions engagement as a distinct construct within organizational research, and provides new insights to propel future research. The newly developed scale supports the idea that engagement is a state of aroused, situation-specific motivation that is correlated with both attitudinal and behavioral outcomes. Individually, the scale items appear to measure separate phenomena like curiosity, diligence, industriousness, learning orientation, or even achievement motives. However, the amalgamation of these separate motivations being activated in concert with one another identifies the state of employee engagement.

REFERENCES AVAILABLE FROM THE AUTHOR

Table 1
Final 9-item Employee Engagement Scale

1. I am willing to really push myself to reach challenging work goals.
2. I am prepared to fully devote myself to performing my job duties.
3. I get excited thinking about new ways to do my job more effectively.
4. I am enthusiastic about providing a high quality product or service.
5. I am always willing to “go the extra mile” in order to do my job well.
6. Trying to constantly improve my job performance is very important to me.
7. My job is a source of personal pride.
8. I am determined to be complete and thorough in all my job duties.
9. I am ready to put my heart and soul into my work.

Note. Pilot-test sample (N = 266, α = .93); Replication sample (N = 57, α = .89).
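
The degrees of freedom and RMSEA values reported for the twelve-item and nine-item single-factor CFA models can be cross-checked by hand. The sketch below is not the LISREL procedure itself. It assumes a one-factor model with uncorrelated errors (so each item contributes one loading and one error variance, with the factor scale fixed), the common RMSEA point-estimate formula sqrt((χ² - df) / (df (N - 1))), and N = 266, the CFA subsample size given in the text. The function names are illustrative.

```python
from math import sqrt

N_CFA = 266  # CFA subsample size reported in the text


def one_factor_df(n_items):
    """Degrees of freedom for a one-factor model with uncorrelated errors.

    Assumes the factor's scale is fixed, so the free parameters are one
    loading and one error variance per item.
    """
    unique_moments = n_items * (n_items + 1) // 2  # variances and covariances
    free_parameters = 2 * n_items
    return unique_moments - free_parameters


def rmsea(chi_square, df, n):
    """RMSEA point estimate: sqrt(max(chi2 - df, 0) / (df * (n - 1)))."""
    return sqrt(max(chi_square - df, 0.0) / (df * (n - 1)))


# Chi-square values and item counts as reported for the two models.
models = {"12-item": (187.85, 12), "9-item": (35.24, 9)}

for label, (chi_square, n_items) in models.items():
    df = one_factor_df(n_items)
    print(f"{label}: df = {df}, RMSEA = {rmsea(chi_square, df, N_CFA):.2f}")
# prints:
# 12-item: df = 54, RMSEA = 0.10
# 9-item: df = 27, RMSEA = 0.03
```

Under these assumptions the computed values agree with the reported df of 54 and 27 and RMSEA of .10 and .03.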