Processing of Recursive Sentence Structure: Testing Predictions from a Connectionist Model

Morten H. Christiansen and Maryellen C. MacDonald
Program in Neural, Informational and Behavioral Sciences
University of Southern California
Los Angeles, CA 90089-2520
morten@gizmo.usc.edu  mcm@gizmo.usc.edu

Abstract

We present results from three psycholinguistic experiments that tested predictions from a connectionist model of recursive sentence processing. The model, a Simple Recurrent Network (SRN), was originally developed to capture generalization using non-local information (Christiansen, 1994; Christiansen & Chater, 1994). From this model it was possible to derive novel empirical predictions concerning the processing of different kinds of recursive structure. We present behavioral results confirming network predictions about the acceptability of sentences involving multiple right-branching PPs (Experiment 1), multiple left-branching prenominal genitives (Experiment 2), and doubly center-embedded object relative clauses (Experiment 3). Importantly, these predictions derive from the intrinsic architectural constraints of the model (Christiansen & Chater, in submission), rather than from arbitrary, externally specified memory limitations. We conclude that the SRN is well-suited for the modeling of human performance on recursive sentence structure.

1 Introduction

One way to evaluate computational models of psycholinguistic phenomena is to assess how well they match behavioral data and whether they make predictions beyond existing data. Many models only match data at a fairly gross level of performance, and few make predictions that inspire new experiments. This is true of both connectionist and symbolic computational models of language, especially within the area of sentence processing. We introduce a performance measure, Grammatical Prediction Error (GPE), which allows for the modeling of grammaticality ratings. We use this measure to derive novel empirical predictions from an existing connectionist model of the processing of
recursive sentence structure (Christiansen, 1994; Christiansen & Chater, 1994). The predictions suggest that increasing depths of recursion decrease acceptability not only for center-embedded constructions, but also for the simpler left- and right-branching constructions. These predictions are at odds with many symbolic models of sentence processing. Results from three behavioral experiments are presented, confirming the model's predictions.

2 Connectionist Simulations

The predictions were derived from the Simple Recurrent Network (SRN; Elman, 1990) model of recursive sentence processing developed by Christiansen (1994; Christiansen & Chater, 1994).

[Figure 1 layers: Input (42 units), Context (150 units), Hidden (150 units), Output (42 units); copy-back from Hidden to Context]

Figure 1: The basic architecture of the SRN used in Christiansen (1994). Arrows with solid lines denote trainable weights, whereas the arrow with the dashed line denotes the copy-back connections.

The SRN, as illustrated in Figure 1, is essentially a standard feedforward network equipped with an extra layer of so-called context units. The hidden unit activations from the previous time step are copied back to these context units and paired with the current input. This means that the current state of the hidden units can influence the processing of subsequent inputs, providing the SRN with an ability to deal with integrated sequences of input presented successively.

The SRNs were trained via a word-by-word prediction task on 50,000 sentences (mean length: words; range: 3-15 words) generated by the context-free grammar in Figure 2 (using a 38-word vocabulary). This grammar involved left recursion in the form of prenominal genitives, right recursion in the form of subject relative clauses, sentential complements, prepositional modifications of NPs, and NP conjunctions, as well as complex recursion in the form of object relative clauses. The grammar also incorporated subject noun/verb agreement and three additional verb argument structures (transitive,
optionally transitive, and intransitive). The generation of sentences was further restricted by probabilistic constraints on the complexity and depth of recursion.

S   -> NP VP "."
NP  -> PropN | N | N rel | N PP | gen N | N and NP
VP  -> V(i) | V(t) NP | V(o) (NP) | V(c) that S
rel -> who NP V(t|o) | who VP
PP  -> prep locN (PP)
gen -> (gen) N + "s"

Figure 2: The small context-free grammar used to generate the training corpus.

2.1 Deriving Predictions from the SRN Model

When evaluating how well the SRN has learned the regularities of the grammar, it is important from a linguistic perspective to determine not only whether the words that were activated given prior context are grammatical, but also which items were not activated despite being sanctioned by the grammar. The GPE provides an activation-based measure of how well a network is obeying the training grammar in making its predictions, taking hits, false positives, correct rejections as well as false negatives into account. The GPE for predicting a particular word was calculated using:

\[ \mathrm{GPE} = 1 - \frac{\text{hits}}{\text{hits} + \text{false positives} + \text{false negatives}} \]

Hits and false positives consisted of the accumulated activations of all units that were grammatical (G) and of all activated units that were ungrammatical (U), respectively:

\[ \text{hits} = \sum_{i \in G} u_i, \qquad \text{false positives} = \sum_{i \in U} u_i, \qquad \text{false negatives} = \sum_{i \in G} fn_i \]

False negatives were calculated as a sum over the (positive) discrepancy fn_i between the desired activation t_i for a grammatical unit and the actual activation u_i of that unit:

\[ t_i = \frac{(\text{hits} + \text{false positives})\, f_i}{\sum_{j \in G} f_j}, \qquad fn_i = \begin{cases} 0 & \text{if } t_i - u_i \le 0 \\ t_i - u_i & \text{otherwise} \end{cases} \]

The desired activation, t_i, was computed as a proportion of the total activation, determined by the lexical frequency f_i of the word that u_i designates and normalized by the sum of the lexical frequencies f_j of all the grammatical units.

The GPE for an individual word reflects the difficulty that the SRN experienced for that word given the previous sentential context, and can be mapped qualitatively onto word reading times, with low GPE values reflecting a prediction for short reading times and high values indicating long predicted reading times (MacDonald & Christiansen, in submission). The average GPE across a whole sentence expresses the difficulty that the SRN experienced across the sentence as a whole, and has been found to map onto sentence grammaticality ratings (Christiansen & Chater, in submission), with low average GPE scores predicting low (more acceptable) ratings and high scores predicting high (less acceptable) ratings on a scale where 7 = bad. Using the GPE we were able to derive novel predictions concerning the acceptability of three types of sentences involving complex recursive constructions from the existing model by Christiansen (1994; Christiansen & Chater, 1994).

3 Testing SRN Predictions

A number of predictions regarding the processing of recursive sentence structure were derived from the model. Here we focus on the processing of sentences involving multiple instances of three different kinds of recursion: right-branching, left-branching, and center-embedding. These predictions from the SRN model were tested in on-line grammaticality judgment experiments using a self-paced reading task with word-by-word center presentation. Following the presentation of each sentence, subjects rated the sentence on a 7-point scale (7 = bad).

3.1 Experiment 1: Multiple PP Modifications of Nouns

SRN Prediction: Increasing the number of recursions in right-branching constructions involving an NP modified by several PPs should make the sentences less acceptable.

Subjects were presented with sentences containing 1 PP (1), 2 PPs (2), or 3 PPs (3):

(1) The nurse with the vase says that the flowers by the window resemble roses. (1 PP)
(2) The nurse says that the flowers in the vase by the window resemble roses. (2 PPs)
(3) The blooming flowers in the vase on the table by the window resemble roses. (3 PPs)

Figure 3: The mean ratings for sentences incorporating 1, 2, or 3 PPs into an NP (Experiment 1). [Axes: Mean Rating (7 = Bad) by Sentence Type]

The results in Figure 3 show that there was a significant effect of depth of recursion in the direction predicted by the SRN model (F1(2,70) = 10.87, p < .0001; F2(2,16) = 12.43, p < .001; N = 36).

3.2 Experiment 2: Multiple Prenominal Genitives

SRN Prediction: Having two levels of recursion in an NP involving left-branching prenominal genitives should be less acceptable in an object position than in a subject position.

In this experiment, subjects were presented with sentences containing multiple prenominal genitives either in the subject position (4) or in the object position (5):

(4) Jane's dad's colleague's parrot followed the baby all afternoon. (subject)
(5) The baby followed Jane's dad's colleague's parrot all afternoon. (object)

As predicted by the SRN model, the results in Figure 4 show that multiple prenominal genitives were less acceptable in object position than in subject position (F1(1,33) = 76 03; F2(1,9) = 48 1; N = 34).

Figure 4: The mean ratings for sentences incorporating multiple prenominal genitives in subject or object positions (Experiment 2). [Axes: Mean Rating (7 = Bad) by Sentence Type (Subject, Object)]

Figure 5: The mean ratings for the ungrammatical 2 VP constructions and the grammatical 3 VP sentences (Experiment 3). [Axes: Mean Rating (7 = Bad) by Sentence Type]

3.3 Experiment 3: Doubly Center-Embedded Constructions

Using an off-line task, Gibson & Thomas (1997) found that ungrammatical NP1 NP2 NP3 VP3 VP1 constructions, such as (7), were rated no better than their grammatical counterparts, NP1 NP2 NP3 VP3 VP2 VP1,
such as (6):

(6) The apartment that the maid who the service had sent over was cleaning every week was well decorated. (3 VPs)
(7) *The apartment that the maid who the service had sent over was well decorated. (2 VPs)

SRN Prediction: People will actually find the grammatical 3 VP sentence (6) worse than the ungrammatical 2 VP sentence (7) when tested on-line.

The results presented in Figure 5 confirmed the SRN prediction: the grammatical 3 VP sentences were rated significantly worse than their ungrammatical 2 VP counterparts (F1(1,35) = 15.55, p < .0001; F2(1,5) = 85 05; N = 36).

3.4 Comparing Human and SRN Data

Figure 6 shows that the model's average GPE scores correctly predict the behavioral data both within and across experiments.

Figure 6: Grammaticality ratings (left y-axes) and GPE averages (right y-axes) from Experiments 1 (left panel, PP Experiment), 2 (middle panel, Genitive Experiment), and 3 (right panel, Center-Embedding Experiment). [x-axes: Sentence Type]

4 Conclusion

We have presented results from three grammaticality judgment experiments confirming novel predictions derived from an existing connectionist model. Importantly, this model was not developed for the purpose of fitting these data, but was nevertheless able to predict the patterns of human grammaticality judgments across three different kinds of recursive structures. We have argued elsewhere that the SRN's ability to model human limitations on complex recursive constructions stems largely from intrinsic architectural constraints (Christiansen & Chater, in submission; MacDonald & Christiansen, in submission). In contrast, the present pattern of results provides a challenge for symbolic models of human performance relying on arbitrary, externally specified memory limitations. The close fit between SRN predictions and the human data within and across the three experiments suggests that the SRN is well-suited for the modeling of the processing of recursive sentence structure, and that GPE provides a useful way of mapping SRN performance onto behavioral data.

References

Christiansen, M.H. (1994). Infinite languages, finite minds: Connectionism, learning and linguistic structure. Unpublished PhD thesis, University of Edinburgh.
Christiansen, M.H. & Chater, N. (1994). Generalization and connectionist language learning. Mind and Language, 9, 273-287.
Christiansen, M.H. & Chater, N. (in submission). Toward a connectionist model of recursion in human linguistic performance.
Elman, J.L. (1990). Finding structure in time. Cognitive Science, 14, 179-211.
Gibson, E. & Thomas, J. (1997). Memory limitations and structural forgetting: The perception of complex ungrammatical sentences as grammatical. Manuscript, MIT, Cambridge, MA.
MacDonald, M.C. & Christiansen, M.H. (in submission). Individual differences without working memory: A reply to Just & Carpenter and Waters & Caplan.
Rumelhart, D.E., Hinton, G.E. & Williams, R.J. (1986). Learning internal representations by error propagation. In McClelland, J.L. & Rumelhart, D.E. (Eds.), Parallel distributed processing, Vol. (pp. 318-362). Cambridge, MA: MIT Press.
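
As an illustration of the copy-back mechanism described under Connectionist Simulations, the following is a minimal sketch of a single SRN forward step. It is not the authors' implementation. The layer sizes (42 input, 150 hidden, 42 output units) come from Figure 1; the class name `SimpleRecurrentNetwork`, the weight initialization range, the zero initial context, the logistic hidden units, and the softmax output layer are all assumptions made for this example. Training (for instance by backpropagation) is deliberately left out.

```python
import math
import random


class SimpleRecurrentNetwork:
    """Forward pass of an Elman-style SRN (illustrative sketch, untrained)."""

    def __init__(self, n_in=42, n_hidden=150, n_out=42, seed=0):
        rng = random.Random(seed)

        def matrix(rows, cols):
            # Small random weights; the range is an assumption for this sketch.
            return [[rng.uniform(-0.1, 0.1) for _ in range(cols)]
                    for _ in range(rows)]

        self.w_in = matrix(n_hidden, n_in)       # input   -> hidden
        self.w_ctx = matrix(n_hidden, n_hidden)  # context -> hidden
        self.w_out = matrix(n_out, n_hidden)     # hidden  -> output
        # Context units hold the previous hidden state (assumed to start at 0).
        self.context = [0.0] * n_hidden

    def step(self, x):
        """Process one input vector and return next-word output activations."""
        hidden = []
        for row_in, row_ctx in zip(self.w_in, self.w_ctx):
            net = sum(w * v for w, v in zip(row_in, x))
            net += sum(w * c for w, c in zip(row_ctx, self.context))
            hidden.append(1.0 / (1.0 + math.exp(-net)))  # logistic unit

        scores = [sum(w * h for w, h in zip(row, hidden))
                  for row in self.w_out]
        top = max(scores)
        exps = [math.exp(s - top) for s in scores]  # numerically stable softmax
        total = sum(exps)
        output = [e / total for e in exps]

        # Copy-back: the hidden state becomes the context for the next step.
        self.context = hidden
        return output


def one_hot(index, size):
    """Localist (one unit per word) input encoding, assumed for this sketch."""
    return [1.0 if i == index else 0.0 for i in range(size)]


if __name__ == "__main__":
    net = SimpleRecurrentNetwork()
    for word_index in (3, 7, 3):
        activations = net.step(one_hot(word_index, 42))
        print(len(activations), round(sum(activations), 6))
```

Because the context is overwritten on every step, presenting the same word twice generally yields different outputs. That is the sense in which earlier hidden states influence the processing of subsequent inputs.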

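
The GPE measure defined under Deriving Predictions from the SRN Model can be sketched in a few lines. This is a hedged illustration of the formulas as stated there, not the authors' code. The function names `grammatical_prediction_error` and `sentence_gpe`, the dictionary-based representation of activations and lexical frequencies, and the toy words in the demo are assumptions introduced for this example.

```python
def grammatical_prediction_error(activations, grammatical, frequencies):
    """GPE for one word position.

    activations: dict mapping each word to its output activation u_i
    grammatical: set of words the grammar allows next (the set G);
                 every other word in `activations` is treated as U
    frequencies: dict mapping each word to its lexical frequency f_i
    """
    hits = sum(u for w, u in activations.items() if w in grammatical)
    false_positives = sum(u for w, u in activations.items()
                          if w not in grammatical)

    total_frequency = sum(frequencies[w] for w in grammatical)
    false_negatives = 0.0
    for w in grammatical:
        # Desired activation t_i: the word's frequency-based share of the
        # total activation (hits + false positives).
        target = (hits + false_positives) * frequencies[w] / total_frequency
        # Only positive shortfalls count as false negatives.
        false_negatives += max(0.0, target - activations.get(w, 0.0))

    denominator = hits + false_positives + false_negatives
    if denominator == 0:
        raise ValueError("GPE is undefined when no unit is active")
    return 1.0 - hits / denominator


def sentence_gpe(word_gpes):
    """Average GPE across the words of a sentence."""
    return sum(word_gpes) / len(word_gpes)


if __name__ == "__main__":
    # Hypothetical three-word vocabulary with equal lexical frequencies.
    acts = {"cat": 0.6, "dog": 0.2, "runs": 0.2}
    freqs = {"cat": 1.0, "dog": 1.0, "runs": 1.0}
    print(grammatical_prediction_error(acts, {"cat", "dog"}, freqs))
```

In the demo, "runs" contributes a false positive and "dog" falls short of its frequency-based target, so the GPE is above zero. A network that spreads all activation over the grammatical words in proportion to their frequencies would score a GPE of 0.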