The morphological implications of two polysemous formatives in Khmer
Stephen Self
GIAL

Abstract

While Khmer shows historical evidence of a rich prefixal and infixal morphology, the modern language is most often treated synchronically as strictly isolating (Gorgoniev 1966:46–7). Compounding is admitted as the one truly productive path to word formation (Gorgoniev 1966:50; Schiller 1989:280). Yet two scholars, Gorgoniev and Haiman, have also observed what they categorize as synchronically productive derivational processes involving what appears to be prefixation. Gorgoniev (1966:54) refers to the formatives that participate in these processes as “semi-prefixes.” Three of these ‘semi-prefixes’ are completely productive synchronically and derive from full lexical native Khmer words. Of these, two are used to derive abstract nouns from verbs: kaː ‘business, work’ and səckdəj ‘signification’ (Gorgoniev 1966:55; Haiman 2011:47–8). The third, neak ‘person’, is used to derive agentive-type nominals corresponding more or less to English nouns in -er (Gorgoniev 1966:55; Haiman 2011:43, 74).

In a sampling of 4,500 lexical entries from a Khmer dictionary, Gorgoniev (1966:54–5) found 222 words formed with kaː, 36 words formed with səckdəj and 121 words formed with neak. On the productivity of derivation via kaː, Haiman (2011:48) remarks: “I have never encountered a case when this nominalization is rejected as ungrammatical, for any verb.” On that of neak, he writes: “So regular is this process that one is hesitant to categorize it with derivational morphological phenomena at all” (Haiman 2011:74).

For both Gorgoniev and Haiman, the temptation to view these three lexemes more as affixes than as full lexical words stems from the words’ reduced semantics in the ‘derived’ forms. Gorgoniev (1966:54) defines his category of ‘semi-affixes’ as “elements which have not lost their lexical meaning altogether, but recurring in a large number of words have assumed the character of seminotional [sic]
affixes, that is to say, of affixes with traces of lexical meaning.” Yet, while Gorgoniev’s designation of ‘semi-affix’ leaves one unsure as to the formatives’ precise lexical status, Haiman uses the same semantic criterion to establish a specific threshold for deciding between an independent syntactic word and a grammatical affix. He writes: “…[A] morpheme is a derivational affix if its meaning is no longer exactly the same as the word that it sounds exactly like” (2011:43). Taking neak as an example, Haiman (2011:44) states that as long as the word maintains its principal independent lexical meaning, “it will be treated in the syntax as a head noun rather than in the morphology as an agentive prefix.” Haiman warns that his approach will seem insufficiently restrictive for skeptical readers. Indeed, Jacob (1993:54) had earlier expressed doubts about Gorgoniev’s similar analysis, noting: “The semiprefixes might equally be treated as nouncomponents used with high frequency in the first position in a compound….”

The purpose of this squib is to investigate Haiman’s and Gorgoniev’s claims for the (semi)prefixal status of the native Khmer formative neak. I argue that agentive neak constructions are not simply phrases formed in syntax: they can neither take tense/aspect/modality (TAM) marking nor accommodate modifiers or determiners inserted between the core constituents, and they are restricted in what argument structure they can display by lexical, not syntactic, requirements. I also claim that while neak constructions resemble verbal compounds in their internal structure and occasionally lexicalized semantics, they may yet be differentiated from such compounds by the high degree of semantic bleaching of neak, its use with nouns that are either inherently agentive or already possess their own agentive morphology, and its lexicalized use to distinguish the female practitioner of the given profession or activity named in the verb from her male counterpart.
The structure of the paper is as follows. In section 2, I consider the peculiar nature of morphology in Khmer and the strangeness of claiming derivational morphology for an isolating language. Section 3 tackles the arguments for and against the syntactic formation of neak constructions, as well as something of the history of syntactic treatments of morphological phenomena. In section 4, I turn to arguments for and against viewing neak constructions as verbal compounds. The conclusion considers implications of the study for another formative in Khmer that seems to have been largely grammaticalized to the status of a pluralizing prefix: puak. This particular word forms a nice counterpoint to neak since it functions to mark inflectional, rather than derivational, meaning.

2 Derivational morphology in an isolating language?

Aikhenvald (2007:3–5) structures much of her discussion of the typology of word formation around two parameters which she contends have been used in the linguistic literature since the nineteenth century to assess morphology from a typological perspective. They are the transparency of word-internal boundaries and the internal complexity of words. Languages which display little-to-no internal boundaries or complexity in their words are classified as isolating and analytic. Aikhenvald singles out Chinese and Vietnamese as prototypical examples. In both languages, the only synchronically productive morphological process is compounding, which can be prolific indeed. It has been estimated that as much as 80% of the Modern Mandarin Chinese vocabulary consists of disyllabic words (Yuzhi 2002:71). Given the widely recognized near-total isomorphism between syllable and morpheme in Mandarin (Chao 1968:138–9), disyllabic words must by definition consist of two lexical morphemes. As a result, Mandarin has been described as “a language of compound words” (Arcodia 2007:83).

Khmer is another language Aikhenvald (2007:43–4) singles out as being of the isolating type.
She characterizes what non-compounding derivational morphology is apparent in Khmer as “fossilized,” comprising “just some relics of non-category-changing morphology.” In his recent reference grammar of Khmer, Haiman (2011:164) observes that Khmer lacks inflectional morphology as well. Schiller (1989:280) concludes similarly:

    Essentially, Khmer words are all of the same morphological category, and cannot be inflected. Compounding is possible, however, and quite prolific. There are some items which appear to have affixes (prefixed and infixes), but these are simply vestiges of morphology from the era of Old and Middle Khmer, which had productive affixation.

Aronoff and Fudeman (2011:179) offer as a primary characteristic of isolating languages that they evidence “no derivational or inflectional processes of any kind.” The observations of Khmer by Aikhenvald, Haiman, and Schiller would seem to bear this generalization out.

Thus it comes as some surprise to find Haiman rather more sanguine about the possibility of non-compounding derivational morphology in Khmer. He writes: “Like most SE Asian languages, including Chinese, Khmer seems to rely heavily on positional syntactic criteria. However, more than most of these languages, Khmer also seems to have some derivational affixation” (2003:508). In introducing the third chapter of his grammar entitled “Derivational morphology and word formation,” Haiman (2011:43) observes: “A recurrent analytical problem when dealing with an isolating language like Khmer is the identification of derivational morphemes as such.” Here he implies that negative assessments of the existence of derivational morphology in Khmer likely stem from a misapprehension of the true nature of morphology in the language. “[O]ne man’s word,” he writes tellingly, “is another man’s affix” (Haiman 2011:43). Haiman uses the example given in (1) to illustrate the analytical problem posed by the possible affix-word neak ‘person’.

(1) neak aːn
    person read
    (Haiman 2011:43)
According to Haiman (2011:43), this string of just two words can be interpreted in three rather distinct ways: i) as a complete sentence with unmarked SVO word order (2); ii) as a reduced relative clause (3); or iii) as a single, agentive NP (4).

(2) ‘(The) person reads.’
(3) ‘(the) person (who) reads’
(4) ‘reader’

A few background facts about Khmer are needed to clarify what at first blush might seem a dubious or even outlandish claim. First, Khmer orthography does not use spaces between words within phrases; spaces occur as punctuation only between sense units, as between a subordinate clause and the matrix clause for example (Smyth 2008:14). Thus, words with derivational morphology inherited from Pali such as (5) and obvious native compounds such as (6) both appear as single orthographic words in Khmer.

(5) eːk-phiap
    one-state
    ‘unity, autonomy, agreement’ (Haiman 2011:51)

(6) koun proh
    child male
    ‘son’ (Smyth 2008:228, transcription regularized)

Second, native Khmer words all have either a monosyllabic or a so-called ‘sesquisyllabic’ pattern, the latter consisting of an initial unstressed half-syllable (the anacrusic syllable) followed by a main, stressed syllable (Haiman 2011:29; Huffman 1967:44–5; Huffman 1987:11; Gorgoniev 1966:30–5). While the sesquisyllabic syllable template has historically exerted intense pressure on words with inherited infixes and prefixes from earlier periods of the language, causing phonological reductions and deletions (Haiman 1998; Farmer 2009), compounds in modern Khmer consist of full disyllables: that is, each word in the compound for the most part retains its full phonological shape and stress (Jacob 1968:198–9). Thus, there are no consistent phonetic cues to disambiguate single derived words from compounds (Haiman 2011:43).[2]

Third, relative clauses in Khmer are standardly introduced by the relativizer/complementizer dael, which also serves to introduce various types of complement clauses (Comrie & Horie 1995:73). Subjects,
primary and secondary (i.e. indirect) objects, and possessors can all be relativized: subjects and secondary objects via the gap strategy or pronoun retention, objects by gap only, and possessors via pronoun retention only (Natchanan 2005:123). In addition, the relativizer dael can be dropped from most relative clauses without complication, giving rise to reduced relative clauses (Haiman 2011:313–4). While Haiman notes that “[t]here seems to be no clear rule regarding when the dael is required,” all of the examples he cites of situations where the clause is attested without dael though his consultants reported the relativizer could be included, and vice versa, involve the subject as relativized function and the gap strategy as relativization strategy. Thus, the reduced relative clauses Haiman cites as attested appear identical in structure to example (1) above, with overt SV word order (7). Though Haiman cites these examples with the relativizer included as optional within parentheses, I give the phrases as attested, without the optional dael, so as to highlight their actual similarity to (1).

(7) a. kreː pɛːt miən kɑŋ ruɲ
       bed hospital have wheel push
       ‘hospital gurney’
    b. miən plav ʔɑt mnuh daə
       have path NEG person walk
       ‘there is a path that is untraveled’
    c. roːm toh knoŋ tiː sŋat kɑmpɑŋ
       body.hair grow in place secret secret
       ‘body hair which grows in secret areas’
    (Haiman 2011:313, transcription regularized)

[1] For the notion of ‘orthographic word’ and the disconnect between it and other notions of wordhood, see Packard (2000:7–8) and Dixon (2009:5–7, 35).

[2] Aikhenvald (2007:25) notes that this same problem exists in both Boumaa Fijian and Portuguese, where the components of compounds retain their independent phonological status (cf. Dixon 1988:226).

Following Kroeger (2004:5), I take it to be axiomatic that the structural ambiguities behind the variant readings in (2)-(4) disclose the otherwise invisible structural relationships that exist between and among strings of
words. These different interpretations arise from differences in the actual linguistic structures that exist unseen in the understanding of users of the language; they are not simply semantic or pragmatic issues. Thus, to the interpretations in (2)-(4), we could assign the phrase structure representations in Figure 1-Figure 3, respectively.

Figure 1: Phrase structure for (2)
Figure 2: Phrase structure for (3)
Figure 3: Phrase structure for (4)

It is reading (4) which calls into question the lexical status of neak. That is, this interpretation prompts Haiman (2011:43, 164) to question whether neak might not evidence a pattern of grammaticalization in its use in the modern language that we might call ‘morphologization’ following Joseph and Janda (1988). Joseph and Janda (1988:195–6) define morphologization as a transition from a generalization that is non-morphological to one that is, involving movement of syntactic and phonological phenomena into the morphological domain.[3]

[3] NB: This definition seems largely indistinguishable from that of simple ‘grammaticalization’, as discussed by Hopper and Traugott (2003:19, 21): they follow Meillet in applying the term to “the development of grammatical morphemes out of earlier lexical formatives.”

Though isolating languages are sometimes claimed to lack morphology, morphologization in isolating languages is well documented; Mandarin Chinese provides a good example. At the conclusion of a 1971 study in which he offered an analysis of, among other morphologized syntactic phenomena, the formation of person-agreement affixes in Bantu from free pronouns, Givón paraphrases an analect variously attributed to both Confucius and Lao-tzu in which, on being informed that Chinese was an isolating language, the sage reportedly replied: “Weep not, my children, for today’s syntax is tomorrow’s morphology” (Givón 1971:413, n. 1). Both Li and Thompson (1989) and Packard (2000) include in their grammars of Mandarin significant discussions of
bound morphology, most of it traceable historically to independent lexemes. And much work continues to be done on grammaticalization and morphologization in the language, involving such diverse phenomena as adverbs incorporating into verbs, resultative verb constructions morphing into aspect markers, and full lexical verbs being reinterpreted as prepositions (e.g. Sun 1996; Chui 2000; Lord, Yap & Iwasaki 2002; Lim & Ansaldo 2002; Zhang 2011).

Matisoff (1991:392) cites an example of grammaticalization from the isolating Loloish language Lahu that is particularly apropos. An independent lexeme meaning ‘female proprietary spirit’ has a grammaticalized usage in Lahu on the basis of the semantic components of ownership and control as the feminine agentive nominalizer seen in (8). Other grammaticalized uses of this same word include as a feminine reflexive pronoun and a lexical noun meaning ‘female body’.

(8) gä ga pɔ pỵ šɛ̄-ma
    3SG must help birth give female.spirit/NMLZ
    ‘she who must help give birth, midwife’ (Matisoff’s transcription preserved)

Yao’an Lolo provides another example even closer to the Khmer neak construction once allowances are made for the differing word order parameters. In Yao’an Lolo, the independent lexeme su ‘person who’ is used to nominalize entire clauses in order to render an agentive reading (Merrifield 2010:26, 149–51). An example is given in (9).

(9) Niul var vur su ddei leil zzirbae gger nia
    1PL vegetables sell NMLZ CL to money give must
    ‘We must give the person who sells vegetables money’ (Merrifield 2010:151, practical orthography preserved)

Haiman suggests that neak is on its way to becoming less an independent lexical item and more a grammatical affix in much the same way as the Lahu šɛ̄-ma and Yao’an Lolo su. Were Haiman alone in his tentatively affixal analysis of neak, we might feel inclined to dismiss his musings as a sort of provocativeness for provocativeness’ sake, attempting to apply to a well-known isolating language categories and
concepts appropriate to more heavily inflected languages such as those of the Austronesian or Indo-European families. Yet before him, Gorgoniev (1966:54–5; cf. Jacob 1993:54) had already classified neak in the group he designates as “semi-prefixes,” itself part of a larger category of “semi-affixes” that “play a very important part in the derivation of new words.” Like Haiman, Gorgoniev was struck by the apparent productivity of the semi-affixal forms; he writes: “[Traditional affixes] are being replaced by words formed by new means, in particular by ‘semi-affixes’… The author of this book also had occasion to observe ordinary peasants in remote villages use words with new ‘semi-affixes’ in their speech rather than the words with the old infixes (e.g. kaːsaəc instead of sɑmnaəc ‘laughter’)” (1966:47). He likewise felt the semantics of affixes like neak attenuating in such coinages, in which the word attached closely to a following verb in the manner of a compound.

However, Gorgoniev’s initial observation in the subsection on ‘semi-affixes’ that they are “[c]losely linked with word composition by origin” (Gorgoniev 1966:54) brings us back to a central question in seeking to investigate the possibly valid intuition shared by both Gorgoniev and Haiman. Since modification and complementation both occur to the right in Khmer, and relative clauses may occur without relativizer and often employ the gap strategy for the relativized function of subject, how can we distinguish between the three possible interpretations of agentive neak forms given in Figure 1-Figure 3 above? Furthermore, assuming we manage that disambiguation, how can we additionally distinguish an affixal structure as in Figure 3 from a simple synthetic compound, especially given that Khmer makes frequent use of compounds?
In the next section, I examine the morphology-syntax interface and discuss diagnostic tests designed to tease the one apart from the other in Khmer.

3 Morphology and Syntax

Spencer (2005:74) traces the tendency to view syntax and word-formation as indistinguishable to the early days of American Structuralist linguistics, when scholars like Bloomfield, Harris, Hockett, and Gleason came to view syntax as a concatenation of morphemes. In essence, both word order in more configurational languages like English and the complex morphologies of Native American polysynthetic languages could be reduced to representation in position class charts. The legacy of this approach, both in the work of Chomsky and that of generative linguists working in the tradition known as Word Syntax, has been to represent derivational or inflectional formatives as terminal nodes in syntax (Spencer 2005:74–5). A critical symmetry is thus created between phrase structure and word structure: both phrases and words are considered to be endocentric or headed by an element contained within them which determines the properties of the whole through a process of feature percolation. So in the case of an English derived agentive nominal in -er, it is the suffix that determines the category of the nominal as a whole (Beard 2001:48, 51; Bauer 2003:177–82). The same conclusion would hold true as well for synthetic compounds, in which the right-hand member is a derived nominal like the agentive -er form and the left-hand member is interpreted as its complement or internal argument (Fabb 2001:68–9, 75).

A controversial proposal known as the Right-hand Head Rule (Williams 1981; Fabb 2001:70) has even maintained that the head of a compound or a derived word will always appear on the right. This claim has proven controversial both due to the large number of apparent exceptions in English (e.g. bewitch, dethrone, overpass) and because it clearly cannot apply across languages (Bauer 2003:182). Vietnamese provides an
example of a language where heads appear consistently on the left rather than on the right (Lieber 1980:99). In this respect, the language resembles its distant linguistic relative Khmer. Not surprisingly, Vietnamese also has an agentive-type construction that patterns similarly to the neak phrases in Khmer. In Vietnamese, the head word is người ‘person’, which is followed by a verb and its dependents, exactly as in Khmer. An example is provided in (10).

(10) người hiếu lợi
     person like profit
     ‘greedy person’ (lit. ‘liker of profit’) (Thompson 1984:112–3)

Thus, both Khmer and Vietnamese seem amenable to a strict Word Syntax type of analysis with left-hand heads that determine the categories and features of the resultant phrases (cf. Jacob 1993:54–5). This conception ends up looking strikingly similar to the ‘phrase’ structure representation given in Figure above. The problem is that the structure in (10) could also be represented as a full clause, as in Figure 1. Thompson (1984:126) writes: “In a language like Vietnamese, which is strongly syntactic or isolating (as opposed to synthetic languages like Latin or Russian or even English), it is not surprising that the distinction between the word and the phrase is not as clear as in languages where word boundaries are unambiguous.” Notice that, for Thompson, syntactic is apparently another term for isolating, as though such languages had only syntactic means at their disposal with which to form words, phrases, or any other structure.[4] This thinking likely underlies the frequent claim that compounding is the sole productive path to word-formation in languages like Khmer and Vietnamese. Thompson goes on to point out, though, that even in English, ambiguous boundaries can be found, as is the case with nominals (NPs?)
like jack-in-the-box and jack of all trades. Nevertheless, as far as so-called agentive nominals are concerned, the boundaries are seldom, if ever, fuzzy in English. In synthetic compounds, the complements line up on the left, in unaccustomed preverbal position, whereas derived agentive nominalizations require complements to appear on the right like normal complements, though in the genitive (11).

(11) a. profit-lover
     b. lover of profit

[4] Cf. Comrie and Horie (1995:74) and Song (2001:13–4): Since the relativizer dael is also a complementizer, there is no structural difference between relative clauses and complement clauses of a head noun; that is, relative clauses do not exist in Khmer as a separate construction. Song (2001:14) writes: “In other words, they [i.e. Comrie and Horie] suggest that languages which lack relative clauses, such as Japanese and Khmer, make use of a general syntactic construction for relating subordinate clauses to head nouns, which is in turn subject to a wide range of pragmatic, not semantic, interpretations including that of relative clauses.”

This observation gets at a fundamental difference between the English and Vietnamese/Khmer agentive constructions that has prompted an interesting generalization from some generative linguists. On the basis of the structures in (10) and (11), Brousseau (1989:37; Fabb 2001:71) has claimed that synthetic compounds are only possible in languages where the direction of modification differs from the direction of complementation, as in English. According to Brousseau, languages like French, in which both modification and complementation occur to the right of the head, do not have synthetic compounds (cf. Fabb 2001:68, 71). Thus Khmer and Vietnamese, which also feature rightward complementation and modification, should likewise lack synthetic compounds. If this conclusion is correct, it would seem to mean that the structures in (1) and (10) can only be construed as syntactic phrases.[5] Such appears to be the conclusion Haiman
(2011:44) himself reaches: “There is no doubt that neak is undergoing grammaticalization here, but until there are more than a handful of examples [where neak can no longer be glossed as ‘person’], it will be treated in the syntax as a head noun rather than in the morphology as an agentive prefix.”

This consideration, in turn, raises another interesting implication for the agentive constructions in isolating languages, one that makes testable predictions. In 1970, Chomsky rethought the strictly transformational-generative approach to word-formation in the context of English deverbal nouns (Bauer 2003:167; Beard 2001). He realized there was a disconnect between derived nominals like destruction and gerunds like destroying. In particular, while both constructions have the external distribution of NPs, derived nominals also have the internal structure of NPs, while gerunds have the internal structure of VPs (Chomsky 1970:189). Thus, derived nominals contain strictly NP elements like determiners, adjectives, potential for plural number, and complements marked by the preposition of, while gerunds have strictly VP elements like adverbs, bare NP objects, and TAM marking (Kroeger 2004:46). In addition, if overtly expressed, the subject of a derived nominal is required to be in a genitive form, while that of a gerund may appear in either the accusative or genitive (Kroeger 2004:47). Chomsky’s realization of these facts resulted in the formulation of the Lexicalist Hypothesis, according to which derived nominals are formed in the Lexicon and are visible to syntax only as whole words while gerunds are constructed in syntax according to generative principles.

If the agentive structures in Khmer and similar isolating languages are strictly syntactic constructions, then, we would expect them to display the same internal structure of a VP evident with English gerunds. Such is not necessarily the case, however. The evidence from a variety of isolating languages suggests that agentive
nominalizations do not have the full internal structure of VPs. Instead, they tend to prohibit or express a strong dispreference for TAM marking and adverbial elements and occasionally permit a reduced argument structure. Merrifield (2010:150) notes that Yao’an Lolo agentive nominalizations with su cannot take aspect particles or mode words. Moreover, while they can take aspect auxiliaries, they exhibit a clear preference not to include any TAM marking at all. She cites the example in (12) as an illustration.

(12) xie zi (*dae/ar/var/ddo/cexr/nia) su
     house build IPFV/PFV/can/able/capable/will NMLZ
     ‘person who builds house [sic]’ (Merrifield 2010:150)

On the other hand, there do not appear to be any restrictions as to the realization of arguments of the verb in agentive nominalizations in Yao’an Lolo nor on the complexity of the verb itself. Example (13) shows a serial verb construction and a three-place predicate inside an agentive nominalization.

(13) Ngo ngo leil xie vur gger su ssormaer halddei leil mia ar
     1SG 1SG to house sell give NMLZ woman that-CL to see PFV
     ‘I saw the woman, the one who sold me the house.’ (Merrifield 2010:150–1)

[5] NB: I take it for granted that what is at stake in the discussion of the directionality of synthetic compounds transcends the trivial observation that languages without prenominal modification would have a different word order from English. Rather, what is in question is whether languages like Khmer might have verbal compounds in which what is interpreted semantically as the verbal complement is not a syntactically visible complement.

We can contrast this behavior with that shown from Mandarin Chinese in example (14). Here, the three-place predicate mài ‘sell’ is unable to take an indirect object. As Li and Thompson (1989:579) explain, when a nominalization is used alone as a NP, it is always ungrammatical to include the secondary object.

(14) (*wǒ) mài qìchē de
     1SG sell car NMLZ
     Intended meaning: ‘car sellers to me’ (Li & Thompson
1989:579, transcription preserved)

Both the Yao’an Lolo and Mandarin examples demonstrate that agentive nominalizations do not have the full internal structure of normal VPs. By requiring a reduction in the number and realization of arguments and disallowing the full range of TAM marking, these structures display special internal structures intermediate between that of NPs and VPs. The neak constructions in Khmer exhibit a similar special internal structure.

Before reviewing the internal structure of Khmer neak constructions, it is necessary to establish through examples that their external distribution is that of a NP. This is shown in (15). Khmer uses the copula ciə ‘be’ only for equative clauses (Haiman 2011:212); for attributive and locative clauses, adjectives and the verb nɨw ‘be located at’ function as the respective primary predicates. If a construction can follow ciə in an equative clause, it must be considered a NP.

(15) a. cru:k ciə neak toh tiaj
        pig be person predict prognosticate
        ‘The pig is a prophet.’ (Haiman 2011:212)
     b. ʔəjləw nih koət ciə neak luək krɨəŋ-tok-tuː
        now DEM 3SG be person sell furniture
        ‘Now, he’s a furniture salesman.’ (Huffman, Promchan & Lambert 1970:103)

Another reliable indication of the NP status of the neak constructions is their ability to stand in apposition to other NPs. Example (16) shows an agentive nominalization with adjectival modifier standing in apposition to the personal pronoun jəːŋ ‘we’.

(16) neak cumliəh tmej jəːŋ
     person flee new 1PL
     ‘We new refugees’ (lit. ‘fleers’) (Haiman 2011:146)

However, not all NP complements of ciə headed by neak are available for interpretation as agentive constructions. As example (17) shows, the inclusion of an adverbial phrase of comparison and the embedded clause after ceh ‘know’ is incompatible with the agentive interpretation of the neak construction.

(17) kɲom mɨn ciə neak ceh sansɑːm-sɔmcaj doːc puak khmaɛ krahɑːm teː
     1SG NEG be person know economize like group Khmer red NEG
     ‘I was not someone, like
the Khmer Rouge, who knew how to economize.’ (Haiman 2011:228)

The inclusion of TAM marking likewise disturbs the agentive reading. Example (18) shows the same verb ceh ‘know’ without an embedded complement clause but with an explicit aspect marker, the postposed verb haəj ‘finish’. Again, no interpretation as an agentive nominalization is available for this sentence.

(18) kɯː thaː ciə siəwphəw sɑmrap cuaj neak ceh khmaɛ haəj
     FOC say be book for help person know Khmer PFV
     ‘That is to say: it’s a book to help people who already know Khmer.’ (Haiman 2011:249)

Example (19) shows the inclusion of the modal verb trəw ‘hit’, which is used in Khmer as a primary marker of deontic modality (de Haan 1997:52–4). It, too, is incompatible with the agentive reading.

(19) haet əʋəj baːn ciə neak cɑmbɑc trəw kraleːt məːl adejta kaːl
     cause which get be person necessary must glance look past time
     ‘Why (is it that) people must pay attention to the past?’ (Haiman 2011:235)

If markers of modality are to be included, a full relative clause construction is more common, as in (20) and (21).

(20) puəlləroət mneak-mneak trəw taɛ juəl thaː kluən ciə neak dael
     citizen one.person-one.person must only understand say self be person REL
     trəw twəː aoj prɑteːh daə tɨw liən
     must give country walk go fast
     ‘Each and every citizen should understand that he is someone who should make the country progress.’ (Fisher 1988:30)

(21) neak damnaə cumliəh tmej dael trəw tɯh vivaut tɨw kan kanlæŋ pseːŋ-pseːŋ tiət
     person journey evacuate new REL must task evolve go to place various other
     ‘“New people” [lit. ‘new refugees’] who had to accept new assignments in different places.’ (Haiman 2011:314)

Finally, example (22) shows the combination of an aspectual predicate, pdaəm ‘start’, an embedded clause, bɑŋkaət phiasaː laəŋ teəŋ muːl ‘create all of language’, and an adverbial modifier, taɛ muaj dɑːŋ ‘only one time’. No agentive reading is possible.

(22) mɨn mɛːn miən mnuh naː ciə neak pdaəm bɑŋkaət phiasaː laəŋ teəŋ
     NEG really have person any be person start create language up all
     muːl taɛ muaj dɑːŋ
     entire only one time
     ‘There is nobody who invented all of language all at once.’ (Haiman 2011:251)

[6] The phrase ‘new people’ designated urban dwellers forcibly removed to perform manual labor in the countryside during the Pol Pot years (Smyth 2008:118–20).

In a similar manner to the su nominalizations of Yao’an Lolo, however, Khmer agentive neak constructions can include serial verbs, as is shown in the phrase neak damnaə cumliəh ‘person journey evacuate’ from (21) above. Thus, Khmer agentive neak constructions possess some unambiguous VP traits while disallowing others.

A further indicator that the internal structure of Khmer agentive neak nominalizations is not identical to that of a normal VP stems from the fact that canonically transitive verbs can be used as modifiers for the head word neak without any explicit objects. In these cases, the verb indicates no particular instance of the denoted action but rather attributes to the head neak and its referent habitual or iterative engagement in the activity named by the verb. I give examples in (23).

(23) a. miən neak praə ciə craːən dael miən kaː cralɑm craboːk crabɑl klah
        have person use be many REL have NMLZ confuse confuse busy some
        ‘There are many users who experience confusion.’ (Haiman 2011:314)
     b. kawː nɨw taɛ pibaːk dəːŋ thaː haet əʋəj baːn ciə neak
        still be.at but difficult know say cause what get be person
        dək-noəm baːn klaːj ciə vana meː kɑmlaŋ
        lead become turn be caste master strength
        ‘It is still difficult to know what turned the leadership [lit. ‘leaders’] into a ruling class.’ (Haiman 2011:318)
     c. neak bɑmraə twəː rabiəp jaːŋ naː kawː koət mɨn
        person serve make method kind any so 3SG NEG
        peɲ cət daɛ
        full heart also
        ‘However the servant does it, he isn’t satisfied.’ (Jacob 1968:129; Haiman 2011:331)
     d. cam kɲom haw neak bɑmraə sən
        wait.until 1SG call person serve IMP
        ‘Wait until I (just let me) call the waiter.’ (Huffman,
Promchan & Lambert 1970:173)
     e. daəmbəj tatual baːn nɨw kaː cɨə tok cət rɔbɔh neak aːn
        so.that acquire get OBJ NMLZ be keep heart GEN person read
        ‘for acquiring the reader’s trust’ (Haiman 2011:247)

It is interesting to note that bɑmraə ‘serve’ in (23c) and (23d) is in fact the morphological causative of praə ‘use’ in (23a). While the word can be used as a noun, its primary function is as a transitive verb (Haiman 2011:56, 202, 261, 392).

Insofar as the semantics of the agentive forms in the sentences in (23) do not entail a specific instance of the events denoted by the verbs, this usage of the Khmer neak construction closely approximates a trait of English agentive synthetic compounds that has been discussed by Rappaport Hovav and Levin (1992) and Van Hout and Roeper (1998) (Spencer 2005:90). Both teams of scholars notice that agentive synthetic compounds like life-saver and lawn-mower do not entail an actual event; they can be applied equally to a person who has yet to undertake the action denoted in the verb (as, for example, a person newly hired to the profession) or to the instrument of such a profession (as, for example, the physical objects designated as life-savers and lawn-mowers). On the other hand, the corresponding agentive derived nominals saver of lives and mower of lawns entail a performance of the event and normally only refer to actual human agents of the actions involved (Rappaport Hovav & Levin 1992:133–4; Van Hout & Roeper 1998:175–6).

Rappaport Hovav and Levin explain these interesting differences as resulting from a lack of complement structure inheritance and an empty event position in argument structure for the synthetic compounds; for the derived nominals, the event position is filled and the full complement structure of the underlying verb is inherited by the derived expression. Van Hout and Roeper invoke a complex minimalist view of verbal syntax where both full verbs and derived nominals contain a VP as well as tense, aspect, and event-voice
phrase projections (i.e., TP, AspP, VoiceEventP). These projections lie above the level of VP and are missing in the synthetic compounds. Rather, synthetic compounds involve base-generated complements in object position that move via head-to-head movement to the preverbal position in an incorporation operation that blocks checking of event-related features like aspect and tense. Derived agentive nominalizations do not, however, have the complete internal structure of a VP, insofar as they cannot contain adverbs, voice-markers, aspect, or negation (Baker & Vinokurova 2009:517). For this reason, Baker and Vinokurova (2009:528) have suggested that derived agentive nominalizations are formed as Voice Heads, at the level of syntax directly above VP but below TP and AspP. Thus, these nominalizations combine directly with a bare VP and possess only so much VP structure as the ability to assign objects and event entailment.

Despite the fact that, as we saw above, languages like French and Spanish have been claimed to lack synthetic compounds, these two languages offer deverbal structures that exhibit these same semantic phenomena in a way that perhaps belies such claims. Both Spanish and French have a verbal compound involving the third person singular present active indicative of a relevant verb followed directly by an apparent complement which is generally either written together with the verb or conjoined with it via a hyphen. This ‘object’ must be entirely lacking in specificity and thus can take neither an indefinite nor a definite article; instead, it appears either as a bare singular or bare plural. Examples of these constructions are given in (24).

(24) a. le lave-vaisselle
        DET washes-dishes
        ‘dishwasher’ (French)
     b. el lavavajillas
        DET washes.dishes
        ‘dishwasher’ (Spanish)

Though the complements vaisselle and vajillas occur to the right of the verb as though they were normal objects, the fact that they cannot take articles indicates that, to some extent, they are not syntactically
visible objects but have been incorporated, or rather most likely pseudo-incorporated, into the VP (on pseudo-incorporation, see Dayal 2011 and Baker 2011). This fact can be demonstrated by the failure of both constructions to accept modification of the apparent objects, shown in (25).

(25) a. *le lave-vaisselle-salle
        det washes-dishes-dirty
        *‘dirty-dishwasher’
     b. *el lavavajillassucias
        det washes.dishes.dirty

Notice that, even in English translation, the collocation does not work: it is impossible to construe dirty as modifying dishes and not dishwasher. Notice, too, that in French and Spanish, as in English, the default interpretation of the correct forms in (24) is as referring to a machine that performs the job function, not to a human agent. Thus, to borrow the terminology of Rappaport Hovav and Levin (1992), not only do these forms lack inheritance of the complement structure of the base verbs, but they also do not entail an event interpretation.

Except for the pseudo-incorporation of the objects, these structures are similar to the ‘headless relative clauses’ discussed by Baker and Vinokurova (2009:537–8) as being liable to confusion with agentive nominalizations (e.g., the [one who] manages the company). Some languages have both structures, such that it becomes crucial to be able to tease apart the varying uses and structural properties. Indeed, French and Spanish both also have derived nominalizations with agentive suffixes. Examples of these constructions appear in (26) and (27).

(26) a. lavador de vajillas
        washer.M.SG GEN dishes
        ‘dishwasher’
     b. lavadora de vajillas
        washer.F.SG GEN dishes
        ‘dishwasher’
     c. ?el/la lavador(a) de las vajillas
        DET.M/F washer.M(F) GEN DET dishes
        ‘washer of dishes’ (Spanish)

(27) a. laveur vaisselle
        washer.M.SG dishes
        ‘dishwasher (man)’
     b. laveuse vaisselle
        washer.F.SG dishes
        ‘dishwasher (woman)’
     c. ?le/la laveur(-euse) de la vaisselle
        DET.M/F washer.M(F) GEN DET dishes
        ‘washer of dishes’ (French)

What is intriguing about these
forms is that the a and b versions still lack both complement structure inheritance and an event interpretation: again, notice the conspicuous absence of articles and even the absence of the genitive-marking preposition de in French. These forms still refer to job titles that may be applied even to one newly hired who has yet to discharge the job function; in fact, their most common occurrence is in job notices. Moreover, the feminine Spanish form in (26b) actually refers to the dishwashing machine again, its gender in agreement with the unexpressed noun máquina ‘machine’. The c forms would, then, contrast with those in a and b, having both true objects and event interpretations.

The problem, though, is that these forms are little attested in actual use. The one usage of the masculine form of the French version in (27c) that I can locate comes from a 1776 book of rules and order for the Frères Hermites of Mount Valérien near Paris. It occurs in a long list of titles of job positions which the brother in charge of directing singing during worship is to select individuals to fill. The list includes “le Servant de la Messe Conventuelle, le Lecteur du Réfectoire, le Servant de Table, le Laveur de la Vaisselle” ‘server of the monastic Mass, reader in the refectory, table server, washer of dishes’ (Anonymous 1776:304–5). The one indication in this list that an event interpretation is perhaps intended is the inclusion of specific time references: the brother is to make the selections to fill these posts ‘every Saturday’, and these are the jobs that ‘are done weekly.’ However, given that this text is so old, there exists the possibility that this kind of usage has passed out of currency in modern French and Spanish, which would explain the difficulty of finding attestations for it. Thus, somewhat contrary to expectation, what French and Spanish might lack are not synthetic compounds, but derived nominals of the type encoded in the English -er expressions with following
genitive complements.

Returning to Khmer, we note that the presence or absence of overt complements appears to be the mechanism that controls the semantics of event interpretation. Thus, we can offer the following pairs of examples, illustrating the difference between agentive nominalizations of the same verbs with and without express complements.

(28) a. neak miən kom aːl ɑ: neak krɑ: kom aːl phɨj
        person have NEG yet happy person poor NEG yet fear
        ‘Rich people [lit. ‘the haves’], don’t be in a hurry to be happy; poor people, don’t be in a hurry to fear.’ (Haiman 2011:268)⁸
     b. tək hoː coan tej tiap liap baːn tɨw neak miən səri
        water flow trample ground low luck get go person have fortune
        ‘As water flows downhill, so luck goes to those who (already) have good fortune.’ (Haiman 2011:375)

(29) a. sma:n thaː keː deɲ taːm neak luək
        think say 3PL chase follow person sell
        ‘He thought they were chasing after the salesman.’ (Haiman 2011:404)
     b. keː miən deɲ viej neak luək phaom ɛnaː
        3PL have chase beat person sell fart where
        ‘No way did they chase and beat (you for being) the person who had sold the farts.’ (Haiman 2011:231)

Apparently, Khmer never quite has the option of a structure analogous to the French and Spanish constructions in (24). Since Khmer lacks articles, there is no way of indicating the (pseudo)incorporation of an object: either the explicit object is present in the construction or not. Thus, as we might expect, the Khmer equivalent for ‘dishwasher’ with neak as head noun refers only to humans who perform the function. In order to indicate the instrument of washing the dishes, a different head noun is required, as indicated in (30).

(30) a. neak liəŋ caːn
        person wash dish
        ‘dishwasher’ (person)
     b. maːsiːn liəŋ caːn
        machine wash dish
        ‘dishwasher’ (machine)

Indeed, for a host of Khmer neak constructions, there is no available interpretation referring to anything other than a human agent of the action described. Moreover, many of these constructions necessarily
contain objects in order to distinguish them from otherwise identical structures with different meanings. That is, while one can be designated a ‘seller’ or salesman without reference to a specific object sold, such is not the case for furniture salesman, vegetable vendor, taxi driver, bus driver and so on (31).⁹

(31) a. neak twəː kaː
        person do work
        ‘worker’ (Huffman, Promchan & Lambert 1970:103)
     b. neak twəː srae
        person do rice.field
        ‘farmer’ (Huffman, Promchan & Lambert 1970:103)
     d. neak luək krɨəŋ tok tuː
        person sell accessory table cabinet
        ‘furniture salesman’ (Huffman, Promchan & Lambert 1970:102–3)
     e. neak luək plae chɤ
        person sell fruit tree
        ‘greengrocer’ (lit. ‘fruit tree seller’) (Jacob 1993:54)

⁸ As Haiman explains, this is part of a Khmer Rouge proverb saying, in effect, that once the revolution comes, the last shall be first and the first shall be last.

⁹ It does still bear investigating whether these constructions will accept modification of the apparent objects as in the test applied in (25). As is shown in (16), an adjective to the right of an agentive neak construction is naturally interpreted as having scope over the entire construction. Thus, we might predict a structural ambiguity in strings such as person sell fruit new between the interpretations ‘new fruit seller’ and ‘seller of new fruit’. This issue will have to await future research.

In addition, neak can form the same kind of apparent ‘agentive’ constructions even with unaccusative verbs like buah (32). As Haiman (2011:59, 65) notes, buah means ‘to undergo initiation as a monk’ and even has the transitive, ‘causal’ form bambuah ‘to initiate a monk’. The base form buah cannot, then, have an agent; its only argument is the patient/object of the action of initiation.¹⁰

(32) loːk tɑːp thaː neː æŋ kom thaː əɲcəŋ aɲ ciə neak buah trəw taɛ
     monk answer say hey 2SG neg say thus 1SG be person initiate must only
     ‘The monk answered, saying: “Hey, don’t talk like that, I’m an initiated person and should.”’ (Haiman 2011:398)

¹⁰ For more on unaccusativity, see Perlmutter (1978).

As has been widely noted, true agentive nominalizations cannot be formed from verbs that correspond to events which cannot have agents (Baker & Vinokurova 2009:538; Rappaport Hovav & Levin 1992:129). As a result of these particulars, the Khmer ‘agentive’ neak constructions begin to appear less and less like phrasal constructions and more and more like verbal compounds. In the next section, we will look at those aspects of the neak constructions that appear compound-like and attempt to distinguish an affixal character of neak from that of a mere component in compounds.

Compounding and affixation

In his Vietnamese grammar, Thompson (1984:121, 129–30) notes that Vietnamese constructions such as those in (33) pose similar difficulties of analysis to those encountered in the Khmer neak constructions. In particular, these structures seem to straddle the line between compounds and syntactically constructed descriptive phrases.

(33) a. người ở
        person be.located/reside
        ‘servant’
     b. nhà thương
        establishment be.wounded
        ‘hospital’
     c. làm việc
        do matter/affair
        ‘to work’
     d. làm ruộng
        do rice.field
        ‘to farm’ (Thompson 1984:129–30; Lieber 1980:99)

Thompson lists four criteria to which an analyst can appeal in order to differentiate such constructions from phrasal material. First, compounds have a heavier stress on the second constituent. Second, they contain only two constituents. Third, no modifying constituent may intervene between them. Fourth, their meanings are somewhat lexicalized (cf. Lieber 1980:99). As an illustration of what is meant by criterion three, that no modifying constituent may intervene, Thompson cites the example in (34).

(34) a. Nhà […] không có người ở
        house dem neg be person be.located/reside
        ‘There is no one living in this house.’ or ‘There is no servant in this house.’
     b. Nhà […] không có người […] ở
        house dem neg be person dem be.located/reside
        ‘There is no one at all living in this house.’
        *‘There is no servant
in this house.’ (Thompson 1984:121)

Here, the intervention of the demonstrative forces a differentiation between what would otherwise be a structurally ambiguous sequence. On its own, người ở can be interpreted either as a kind of reduced relative clause meaning ‘person (who) resides’ similar to that in (3) or as a lexicalized compound meaning ‘servant’ with internal structure close to that given for the Khmer neak construction in the figure above. As soon as the demonstrative intervenes, however, the reduced relative interpretation becomes obligatory.

As we saw above, Jacob’s (1993:54) complaint against Gorgoniev’s designation of neak and like words as ‘semi-prefixes’ was that a simpler analysis would involve treating them simply as initial components of compounds that occur with high frequency in first position. Two facts arising out of the discussion in the section above lend weight to this position: first, that Khmer ‘agentive’ neak constructions exhibit special syntactic features which clearly differentiate them from normal VPs; and second, that they can construe with unaccusative verbs that cannot take true agents. Borrowing from the criteria Thompson applies to his Vietnamese data, we can offer a few additional ways in which a compound analysis better suits the Khmer neak construction as well.

Just as with the Vietnamese construction in (34), insertion of modifying material into a Khmer neak construction alters the meaning and renders an agentive reading impossible. In (35a), the indefinite modifier naː occurs after the entire neak structure, taking scope over the whole of it to mean ‘any traveler.’ In (35b), by contrast, naː interrupts the neak construction and takes scope only over the head noun neak, forcing a reduced-relative-clause interpretation for the structure as a whole.

(35) a. prasən ciə neak damnaə naː mɔːk vɔːŋveːŋ plaw
        if be person journey indef come get.lost road
        ‘If any traveler should lose his way.’ (Haiman 2011:309)
     b. prasən ciə neak naː damnaə mɔːk vɔːŋveːŋ plaw
        if be person indef journey come get.lost road
        ‘If anyone (who is) traveling should lose his way.’

In example (36), the occurrence of aetiət ‘other’ between neak and the verb produces the same effect.

(36) neak aetiət nɨw phuːm Paunariaj
     person other be.at village Paunariaj
     ‘some more people (who are) from the village of Paunariaj’ (Haiman 2011:390)

In addition, just as the Vietnamese compound người ở from example (34) may have a special, lexicalized meaning ‘servant’ when no modifiers intervene to force the relative-clause interpretation, some neak constructions in Khmer have acquired a similar special, lexicalized meaning. A good example is neak leːŋ ‘person + play’, which has acquired the meaning of ‘gangster’ in the modern language (Haiman 2011:158).

Another strong indicator that neak constructions comprise compounds as opposed to phrasal constituents stems from the fact that the head noun neak can construe with nouns and adjectives as well as verbs to form constructions which appear identical to the agentive structures we have been analyzing thus far but which are clearly not agent-related at all. Examples are given in (37).

(37) a. neak cəmnuəɲ
        person commerce
        ‘merchant’
     b. neak riəcckaː
        person government.service
        ‘civil servant’ (Huffman, Promchan & Lambert 1970:103; Gorgoniev 1966:70)
     c. neak toːc
        person small
        ‘kiddy’ (Gorgoniev 1966:55)
     d. neak Pnom-Peɲ
        person Phnom Penh
        ‘person from Phnom Penh’ (Haiman 2011:165)
     e. neak taː
        person grandfather
        ‘ancestral spirits’ or ‘statue of ancestral spirits’ (Haiman 2011:366, 373)¹¹

Notice that (37e) involves clearly lexicalized meaning as well: seemingly meaningless on the surface, the combination ‘person + grandfather’ refers to spirits of the ancestors or iconic representations of them in the form of statues.

It would appear, then, that the case for the compound interpretation is fairly strong, and we might be inclined to grant Jacob her point. However, there is clear evidence of grammaticalization or
morphologization of neak in its compounding function that pushes the use of the head noun beyond the status of simply the first position in a purely compositional compound. First, as Haiman (2011:43–4) recognizes, there are a few cases where an agentive neak construction refers to non-humans. In (38), the phrase describes an octopus whose behavior while feeding was felt to be an accurate predictor of World Cup soccer matches.

(38) neak toh tiaj
     person predict prognosticate
     ‘prognosticator’ (Haiman 2011:43; cf. ex. (15) above)

¹¹ There are many more examples of this type of construction than can be cited here. For more, see Headley (1977:s.v. 97).

More tellingly still, (39) shows the application of a neak construction to an abstract inanimate concept. This is one of what Haiman (2011:74) calls “a handful” of situations in which an agentive neak construction refers to an inanimate. One wishes he had provided examples in his grammar of the other such situations.

(39) neak cuaj pdɔl nej
     person help impart meaning
     ‘(grammatical) modifier’ (Haiman 2011:74)

Beyond the examples Haiman specifically cites as cases in point, we can recognize still more signs of the morphologization of neak in the remainder of his data. For example, doubtless as a result of its consistent use in transparent compounds referring to the titles of occupations, neak is also found prefixed to nouns denoting occupations that either already have their own agentive derivations or inherently denote the person who performs a job or duty. I provide a list of examples in (40).

(40) a. neak səl-kɑː
        person skill-NMLZ
        ‘artist’ (Haiman 2011:263)
     b. neak kruː-pɛːt
        person teacher-hospital
        ‘doctor’ (Haiman 2011:212)
     c. neak kruː
        person teacher
        ‘teacher’ (Natchanan 2005:123)

In (40a), the internal nominalizer -kɑː is one of Gorgoniev’s ‘semi-suffixes’ (1966:57), a bound form that is no longer productive in the modern language and ultimately derives from the Pali participle kara ‘doing’ (Haiman 2011:44). In (40b) and (40c), the
word kruː is a Khmer rendering of the Pali guru ‘teacher’ (Haiman 2011:36). All three of these instances would seem to suggest that neak is becoming generalized as a prefixal marker associated with the titles of professions.

Finally, related to this last use with already agentive nouns, neak has acquired a further grammaticalized function: that of distinguishing between male and female practitioners of the named profession (Haiman 2011:158, 186). In a situation in which both male and female teachers are being addressed or discussed, the male may be referred to as loːk kruː ‘monk + teacher’ and the female as neak kruː ‘person + teacher’. The word loːk derives from the Pali loka ‘world’ and has come to be used in modern Khmer as a formal second person pronoun. The word neak may be used as its less formal counterpart. Thus, Haiman (2011:186) traces the use of neak in its gender-distinguishing role as stemming from this V/T or vous/tu breakdown between the uses of loːk and neak as independent personal pronouns, where men are referred to more formally than women.

However, in the examples in (40), there is no indication of gender-specification in the use of the neak-marked terms. Indeed, in the context in which neak səlkɑː is used, the form is a collective or plural, lessening the likelihood of its intended referent being solely a group of females. For the purposes of comparison, I give the whole sentence in example (41) below.

(41) niəŋ kmiən tumloːp nɯŋ twəː kaː rəh kaun dɔl neak səlkɑː
     woman not.have custom fut do nmlz slander criticize arrive person artist
     ‘She is not in the habit of slandering artists.’ (Haiman 2011:263)

We can also note that the forms in (40) do not involve direct address (i.e., vocatives) but are simple NPs: neak is not part of a title, but rather part of the nouns themselves.

Simpson (2008:272) observes that formatives used frequently in the creation of transparent compounds that refer to a “connected class of items” are particularly prone to
grammaticalization. Both he and Haiman (2011:43) cite the example of -man in such English words as doorman, policeman, fireman, and so forth, where the word-cum-suffix man undergoes both phonetic reduction to something like [mən] and semantic bleaching insofar as women may often be the referents of such expressions. Iwasaki and Ingkaphirom (2005:43) describe a class of nouns that appear with special frequency as the initial member and left-head of compounds in Thai that function similarly. Among them is the word /phûu/ ‘person’ that occurs in such compounds as ‘manager’ (