Perception of Contingency in Conditioning: Scalar Timing, Response Bias, and Erasure of Memory by Reinforcement

Journal of Experimental Psychology: Animal Behavior Processes, 1984, Vol. 10, No. 3, 333-345. Copyright 1984 by the American Psychological Association, Inc.

Peter R. Killeen and James Phillip Smith, Arizona State University

Pigeons' key pecks turned off a key light, which also went off independently of their pecks. The pigeons were able to correctly discriminate the cause of the stimulus change, although their attributions were strongly affected by the amount and the delay of reward given for correct responses. Their discrimination was based on the asynchrony between a response and the change in the key light. A simple detection model that combined detectability and motivational factors provided a good description of the data. It was shown that the discriminative criteria did not change with changes in distribution of noise events in an optimal fashion and that the pigeons therefore were not "ideal detectors."
In the final experiment, the pigeons were asked to discriminate the cause of a key light change, of a hopper illumination, and of a feeding. Performance decreased with each condition and with the duration of the last two events. It was noted that the memory trace for a stimulus change decays at the same rate as the primary reinforcement gradient but that it decays faster when the delay is filled with an event such as reinforcement. The possibility that the effects of reinforcement may be blocked by reinforcement is briefly discussed.

This research was supported in part by Grant BNS 7624534 from the National Science Foundation. Requests for reprints should be sent to Peter Killeen, Department of Psychology, Arizona State University, Tempe, Arizona 85287.

Discriminative stimuli are defined in terms of their impact on current behavior, but they also affect subsequent behavior. Analysis of memory in animals has been stimulated by theories of learning that posit an important role for it and other "cognitive" processes (e.g., Grant, Brewster, & Stierhoff, 1983; Shimp, 1976a; Wagner, Rudy, & Whitlow, 1973). The stimuli to be remembered may be not only lights and tones but also reinforcement schedules (Lattal, 1975; Rilling & McDiarmid, 1965), the animal's own behavior (Reynolds, 1966; Shimp, 1983), and even the animal's characterization of a particular behavior-reinforcement contingency (see, e.g., Commons & Nevin, 1981). The techniques used in these studies permit evaluation of sensitivity to events in a motivational context different from the ones in which they occurred and thus allow the measurement of detectability separate from motivational biases.

An instance of this approach that provides the prototype for the research to be reported here is provided by Killeen (1978, 1981b), who asked pigeons to discriminate whether they had brought about a change in the illumination of a key light, or whether that change was a random event initiated by the computer. The pigeons were good at
the task, but their performance was strongly affected by the amount of food they received for correct detections relative to the amount for correct rejections. Accounts of "superstitious" conditioning that emphasized discrimination failure were clearly incorrect: Elevated response rates in the presence of noncontingent rewards are better viewed as the effects of bias on temporal generalization gradients. The present experiments provide a stronger basis for that conclusion: The original experiment is replicated, the motivational variables are changed, better control over response-event asynchronies is provided, a model of the discrimination/motivation interaction is developed, and memory for response-light sequences is contrasted with memory for response-reinforcer sequences.

Experiment 1

Method

Subjects. Two White Carneaux pigeons (A and B) and one Silver King pigeon (C), all with previous histories of experimentation, served as subjects. They were maintained at 80%-85% of their free-feeding weights.

Table 1
The Order in Which the Pigeons Were Exposed to Each of the Delay Conditions

Order    Subject A    Subject B    Subject C
1        1.5/1.0      1.0/1.5      1.0/2.5
2        2.5/1.0      1.0/2.5      1.0/1.5
3        1.0/1.5      2.5/1.0      2.5/1.0
4        1.0/2.5      1.5/1.0      1.5/1.0
5        1.0/1.0      1.0/1.0      1.0/1.0

Note. The values represent the delay in seconds to the rewards available for correct responses to the left/right keys.

Apparatus. The experimental apparatus comprised a standard three-key Lehigh Valley operant chamber connected to a PDP11/05 computer. The center key was a Gerbrands pigeon key, activated by 0.14 N of force; the side keys were original equipment. A food hopper located beneath the center key provided access to mixed grain. A house-light atop the front panel provided diffuse illumination during the session.

Procedure. The pigeons were trained to peck on the center key for occasional rewards of access to grain. After several days they were shifted to a schedule in which pecks on the center key had a 5% chance of
causing its light to go out and the side-key lights to go on. Occasionally the center-key light would go off and the side-key lights would go on independently of the pigeons' behavior. A single peck on one of the side keys would provide either 2.5 s of access to food or 2.5 s during which the chamber was darkened. The subjects would receive the food if and only if they chose the correct side key. If their center-key peck was the event that had caused the transition from center- to side-key lights, then a choice of the right key would be rewarded. If the transition occurred independently of the pigeon's behavior, then a choice of the left key would be rewarded. After reward or time-out, a 3-s intertrial interval elapsed during which only the house-light was illuminated, followed by a reversion to the original condition with the center key lit. The colors of the keys were green (left), white (center), and red (right). The session terminated after 50 rewards.

The probability that the transition would occur without a key peck was determined by letting the computer generate "pseudopecks" at the same rate as the pigeon's pecks and generating a transition with the same probability (5%). The computer updated its estimate of the pigeons' interresponse time (IRT) every second using an exponentially weighted moving average, weighting the most recent latency 20% and the previous average 80%. When the time from the last pseudopeck exceeded the current estimate of the pigeon's IRT, the computer would make another "response."
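The pseudopeck scheme just described can be sketched in a few lines of Python. This is only an illustrative reconstruction under stated assumptions: the 20%/80% weights and the 5% transition probability come from the text, whereas the class name `PseudopeckGenerator`, its method names, the starting IRT, and the use of Python's `random` module are hypothetical and are not taken from the original PDP11/05 program.

```python
import random


def update_irt_estimate(previous_estimate, latest_irt, weight=0.2):
    """Exponentially weighted moving average of the interresponse time (s).

    The weights follow the text: 20% on the most recent latency and
    80% on the previous average.
    """
    return weight * latest_irt + (1.0 - weight) * previous_estimate


class PseudopeckGenerator:
    """Hypothetical reconstruction of the computer's pseudopeck clock."""

    def __init__(self, initial_irt, p_transition=0.05, rng=None):
        self.irt_estimate = initial_irt    # current estimate of the bird's IRT (s)
        self.p_transition = p_transition   # chance that a pseudopeck causes a transition
        self.last_pseudopeck = 0.0         # time of the most recent pseudopeck (s)
        self.rng = rng or random.Random()  # assumed source of pseudorandom numbers

    def observe(self, latest_irt):
        """Fold the bird's most recent IRT into the running estimate."""
        self.irt_estimate = update_irt_estimate(self.irt_estimate, latest_irt)

    def tick(self, now):
        """Return True if a computer-initiated transition occurs at time `now`."""
        if now - self.last_pseudopeck <= self.irt_estimate:
            return False                   # not yet time for another pseudopeck
        self.last_pseudopeck = now         # the computer makes a "response"
        return self.rng.random() < self.p_transition
```

Because the pseudopeck rate tracks the bird's own rate, a generator of this kind yields roughly as many computer-initiated as response-initiated transitions over a session.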
This technique has three implications: (a) The rates of response-dependent and response-independent transitions were approximately equal during a session. (b) The latter events were not truly independent of the pigeon's behavior, because their rates were correlated; during any one trial the correlations were close to zero, but over the course of a session they approached 1.0. (c) A computer-initiated transition could occur "simultaneously" with a pigeon's response and yet be categorized by the computer as computer initiated.

After 10 sessions, a 1-s delay was instituted between the offset of the side keys and reward (or time out). This condition was in force for 30 sessions, followed by exposure to the major experimental conditions, in which the value of the delay was varied over the range listed in Table 1. Each condition was in effect for approximately 40 sessions.

Results

The pigeons learned the discrimination; over the course of the experiment the average probability of a correct response was 65%. However, this statistic underestimates the ability of the pigeons, whose behavior was quite sensitive to the differential payoff for correct yes and no responses. Figure 1 plots the data from the various conditions in the traditional coordinates of signal detectability theory: the probability of saying yes given that the pigeons had indeed caused the transition ("hits") and the probability of saying yes given that the computer had caused the transition ("false alarms"). As the relative payoff for hits increased, the data points moved from the lower left corner to the upper right corner of the graph. Maximizing the probability of being correct did not maximize reward; for instance, when the relative payoff for hits was great, it was to the pigeon's advantage to presume that it caused all transitions about which there was any question, rather than be unbiased. A statistic that reflects accuracy independently of bias is A', the area under an average curve through the points (Grier,
1971; Pollack & Norman, 1964). This estimates the accuracy that would obtain if the subjects were unbiased; its value was 79 ± 4% under the conditions of this experiment.

Figure 1. The probability of a right-key response given that the transition to lit side keys was caused by a center-key peck ("hits") versus the probability of a right-key response given that the transition was caused by a pseudorandom device in the computer ("false alarms"). (The smooth line, a "perceiver operating characteristic," is derived from Equation 1, assuming a signal-to-noise ratio [S] of [400/60].) Abscissa: probability of a false alarm.

How did the pigeons do so well? Presumably they based their choices on a temporal discrimination: If a transition immediately followed a center-key response, they responded as though they had generated the transition, whereas if the stimulus change was delayed, they responded as though it was computer initiated. Figure 2 bears out this presumption. The open symbols are the probability of a right-key response (yes) as a function of the asynchrony between a center-key peck and the subsequent stimulus change for four of the bias conditions. (In the 1.0/2.5 condition, the pigeons were most strongly biased away from the right key by the 2.5-s delay of reward, and they almost never responded there.)
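The bias-free accuracy statistic A' cited above (Grier, 1971) can be computed from a single pair of hit and false-alarm rates. The sketch below uses the closed-form expression commonly attributed to Grier; the function name `a_prime` and the example rates in the usage note are illustrative assumptions, not data from this experiment.

```python
def a_prime(hit_rate, false_alarm_rate):
    """Nonparametric sensitivity index A' from one (hit, false-alarm) pair.

    Returns 0.5 at chance and 1.0 for perfect discrimination. The second
    branch is the mirror-image form for below-chance performance.
    """
    h, f = hit_rate, false_alarm_rate
    if h == f:
        return 0.5                      # on the positive diagonal: chance
    if h > f:
        return 0.5 + (h - f) * (1.0 + h - f) / (4.0 * h * (1.0 - f))
    return 0.5 - (f - h) * (1.0 + f - h) / (4.0 * f * (1.0 - h))
```

For example, hypothetical rates of 0.8 hits and 0.2 false alarms give `a_prime(0.8, 0.2)` of about 0.875, even though the proportion correct for that pair would be only 0.8.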
Figure 2. The probability of a right-key response as a function of the asynchrony between the previous center-key peck and the transition to lit side keys. (The parameters signify the delay of reinforcement for correct responses on the left and right keys. For clarity, each curve and its associated data points are elevated 10% above the ones below it. Negative asynchronies indicate that a center-key peck occurred after the transition had begun. Filled symbols represent hits, open symbols false alarms. The data are averages over subjects over the last seven sessions; the curves in the left panel are derived from the discrimination model.) Abscissa: asynchrony of event (ms).

The data represent the averages for asynchronies of 825 to 475 ms, 475 to 250 ms, 250 to 60 ms, 60 to -50 ms, and -50 to -150 ms, taken from the last seven sessions at each condition. It can be seen that as the asynchrony between a response and stimulus change decreases, the probability of a right-key response increases. The solid points represent hits, the probability of a right-key response when the stimulus change was caused by the pigeon. They are plotted 60 ms to the left of the origin because that was the measured latency of the key-light
change on these trials. Points to the right of the origin represent center-key pecks that occurred after the stimulus change; these responses were infrequent and probably were initiated before the change took place.

Discussion

Skinner (1948) proposed that superstitious behavior is caused by adventitious contiguities between a response and reward. Figure 2 shows, however, that pigeons can distinguish between events that they originate and those that they don't, even when the latter occur very soon after a response. But Figures 1 and 2 also show that these gradients are affected by motivational variables; pigeons may behave as though they caused an event whose latency they can clearly discriminate as greater than zero if it is to their advantage to do so. If the cost of such an assertion is low, as it often is in conditioning experiments where it may involve only a few responses that an animal is well prepared to make (Seligman, 1970), a few instances of approximate contiguity may be adequate to generate durable "superstitious" conditioning (Neuringer, 1970). Conversely, only moderate differential costs may be effective in shifting the attribution of causality; the parametric values of delay used in this study were established after months of pilot work during which we searched for differentials small enough to avoid driving the animals to exclusive choices.

The performance of the pigeons may be described by a simple detection model that combines discriminability of the signal with motivational variables. (More sophisticated models are available from Church & Gibbon, 1982; Davison & Tustin, 1978; McCarthy & Davison, 1980; and Nevin, Jenkins, Whittaker, & Yarenski, 1982.) It is based on the choice model of Bradley, Terry, and Luce (Luce, 1963) and predicts the probability of saying "Yes, a signal is present" to be

$P_y = S / (S + M')$, (1)

where $M'$ is a measure of the motivational bias to say no relative to that to say yes, and $S$ is a measure of the similarity between the event and the signal to be detected. Because the signal to be detected is a minimal asynchrony, similarity must be a decreasing function of the lag between a response and the transition. The simplest such function is

$S = a_y / a_x, \quad a > 0$, (2)

where $a_y$ is the asynchrony of the signal (here 60 ms), and $a_x$ is the asynchrony of the event in question. See the Appendix for further discussion of this model.

For our measure of motivational bias, we turn to incentive theory (Killeen, 1982a, 1982b). A basic assumption of incentive theory is that two factors control responding, a directive factor that comprises the effects of the delayed primary reinforcer (which decay exponentially with delay) and the effects of the conditioned reinforcers signaling the delay (whose strength approximately equals the reciprocal of the delay). The directive factor is multiplied by an arousal factor ($A$), which is proportional to the overall rate of reinforcement, and a response bias parameter ($b$):

$M = bA(e^{-qD} + 1/D)$. (3)

The value of $q$ usually lies close to 0.13, and that is the value assigned to it here. Equation 3 is evaluated for both no and yes alternatives, with the ratio of those values being $M'$. Although this treatment of motivational bias is relatively secure, many other concave functions would do about as well, because this factor merely sets the level of the curves in Figure 2; their curvature is determined by the measure of similarity as defined in Equation 2.

The single free parameter in this model is the ratio of response biases for the two outcomes, $b' = b_n / b_y$. Assigning $b'$ a value of 0.37 yields the curves in the left panel of Figure 2. We may also collapse these curves over their abscissas and ask what is the probability of saying yes given that a signal occurred versus saying yes given that the stimulus change was independent of the animal's behavior for any value of $M'$. We know that the value of $a_y$ for a signal was 60 ms; we can evaluate Equation 1 for a range of values of $M'$ to obtain the ordinates of the curve in Figure 1. We may then do the same for the average asynchrony of the computer-initiated transitions to generate coordinate probabilities of false alarms. Unfortunately, the value of that asynchrony varied with conditions and response rates of individual subjects and is not generally known. However, we may treat it as a free parameter and assign it the value of 400 ms and thus obtain the abscissas for the "operating characteristic" drawn in Figure 1. This analysis is different from traditional signal detectability theory in that it uses independent variables to predict both dependent variables, rather than merely using one dependent variable to predict the other. As with more familiar signal detection models, the distance of the curve from the positive diagonal reflects the animal's sensitivity to the signal; the operating position on that curve is determined by the animal's motivation ($M'$).

This experiment demonstrated an impressive ability of pigeons to discriminate their role in bringing about a subsequent event and a surprising sensitivity of the animals' bias to small differences in the delay of reward. The data supported a simple model of performance based on a temporal discrimination and were extended to the coordinate system of signal detectability
theory. But the indeterminacy of the asynchronies of computer-initiated events (their dependence on the pigeons' center-key response rate and their variability from one condition to another) somewhat undermines the strength of this last demonstration, requiring that a parameter be estimated post hoc. That shortcoming is remedied in the next experiment.

Experiment 2

The first experiment demonstrated strong control of pigeons' attributions by small shifts in the delay of rewards. The present experiment replicates it with several modifications. In this experiment motivation is controlled by differential amounts of reward, rather than differential delays of reward. In the first experiment the asynchrony for a transition caused by the animals was 60 ms; in this experiment it is reduced to 20 ms. In the first experiment the asynchrony between a response and a computer-initiated transition varied as a function of the animal's response rate on the center key; in this experiment those events are independent.

Method

Subjects. The subjects were naive Roller pigeons maintained at 80%-85% of their free-feeding weight.

Table 2
The Order in Which the Pigeons Were Exposed to Each of the Amount Conditions

Order    Subject D    Subject E    Subject F    Subject G
1        2.5/2.5      2.5/2.5      2.5/2.5      2.5/2.5
2        1.0/4.0      3.5/1.5      1.0/4.0      3.5/1.5
3        3.5/1.5      1.0/4.0      3.5/1.5      1.0/4.0

Note. The values represent the amount of reward (seconds access to grain) available for correct responses to the left/right keys. The duration of blackout was the same as the duration of reward on that key.

Apparatus and procedure. The apparatus was the same as in the first experiment. The procedure was similar to that of the first experiment. Pecks on the white center key would cause it to go off and the side keys to come on, with a probability of .05. The delay between a response and this transition was 20 ms, which includes the half-life for the decay in brightness of the center-key light (10 ms). If the pigeon then correctly responded to the right key, it would receive grain; if it incorrectly responded to the left key, it would receive a brief blackout. After these events, a 2.5-s intertrial interval ensued, followed by a reversion to the original condition, with the center key lit. Pecks on the center key would also be followed, after a delay, by offset of the center key and onset of the side keys with a probability of .05. On these trials responses to the left key would be rewarded, and responses to the right key would be punished with blackout. The delays were randomly chosen from the set 40, 120, 200, 280, and 360 ms. The measured asynchronies were 20 ms greater than these values, yielding an average asynchrony of 220 ms for the "noise" event. The pigeons received 60 rewards per session and approximately 35 sessions at each of the conditions listed in Table 2.

Insofar as the controlling variable in the first experiment was the asynchrony between the response and the transition, this experiment provides much better control over that variable. However, the problem the pigeons must solve has shifted from one of determining causality to one of making a temporal discrimination. Of course, it is our contention that the latter is often the basis for the former.

Results

This task appears more difficult than the first, with the difference between the "signal" and the average "noise" asynchronies being only 0.2 s and with some of the noise events occurring only 0.06 s after a response. Yet the pigeons performed about as accurately in this experiment as they did in the first, with values of A' being 82, 84, and 75 for Conditions 1/4, 2.5/2.5, and 3.5/1.5, respectively. Their performance, averaged over the last eight sessions at each condition, is shown in Figure 3, where it is evident that small shifts in the amount of reward brought about large shifts in response bias. The curves through the data in Figure 3 are parameter-free predictions based on the detection model (Equations 1 and 2). The average ratio of
asynchronies for hits to false alarms was always 0.137 ($S = \sum (a_y / a_n) / N$), and this completely determined the height of the curves above the positive diagonal. We see some systematic error, with detectability lower than predicted in the 3.5/1.5 condition and higher elsewhere, but overall the predictions are good.

Figure 4. The probability of a right-key response as a function of the asynchrony between a center-key peck and the transition to lit side keys in Experiment 2. (The parameters signify the amount of food available for correct responses on the left and right keys. For clarity, each curve is elevated 10% above the ones below it. Filled symbols represent hits, open symbols false alarms. The data are averages over subjects over the last eight sessions.) Abscissa: asynchrony of event (ms).

In Figure 4 we plot the average temporal gradients for the three conditions. The data from these closely spaced time intervals are irregular, with a strong secondary peak at 300 ms. The variability in Figure 4 is greater than can be explained by sampling error (binomial "error" for these data would be about 4%). The peak at 300 ms is noteworthy: This latency is of the same magnitude as the most common interresponse time on comparable schedules of reinforcement (250-400 ms; Blough, 1963; Gott & Weiss, 1972). It is quite likely, given the high probability of multiple responses
at this tempo, that a peck will strike the key just as it changes. Pigeons close their eyes during the dozen milliseconds of impact with the key and could easily take the change to be due to such a peck.

The foregoing interpretation reconciles the data in Figure 4 with those in Figure 2 (where the larger bias obscured the effect) and Figure 5 (where there is a similar, though less marked, effect). It implies, however, that the reported values for A' are not true maxima: by excluding from calculation trials in which a peck immediately followed a stimulus change (and trials in which a peck struck off-key or with subthreshold force), even larger values for A' would be obtained. But this experiment was not designed as an evaluation of temporal discriminative ability (for which alternatives of constant asynchrony would be preferred). It was designed to emulate more naturally occurring situations in which organisms must make future decisions (respond again, quit the situation, etc.) based on their detection of a causal relation between their behavior and a change in the environment. Animals may use other cues to establish that relation, being well prepared to recognize some types of causal links and contraprepared to recognize others.

Figure 3. Operating characteristics for the subjects in Experiment 2; the curves are parameter-free predictions derived from Equations 1 and 2. (Circles, triangles, and squares represent food ratios of 1/4, 2.5/2.5, and 3.5/1.5, respectively, for left/right choices.) Abscissa: probability of a false alarm.

Although the absolute values of A' are likely to vary with the nature of the response, the stimulus, and their compatibility, we have demonstrated that a second factor, bias, is also involved. Superstitions cannot be attributed simply to failures of discrimination (response-event asynchronies characteristic of adventitious rewards can easily be discriminated by these animals), but should be thought of in terms of a signal-detection task, in which both signal strength and motivational variables play a joint role in determining performance.

Whereas we have shown that the pigeons' bias is affected by payoff variables in a rational way, this is not to say that they are acting as optimal decision makers. Our model predicts that the animal's choice will depend only on the asynchrony of a response-stimulus sequence but not on the statistics of the distribution of asynchronies due to computer-initiated stimuli. But an optimal decision strategy would take both items into account and move the criterion for saying yes to longer asynchronies as the typical lag between a response and computer-initiated events increased. In the next experiment we stress that implication.

Experiment 3

Method

The subjects and apparatus were the same as in Experiment 2.

The procedure was basically the same as in Experiment 2. The amounts of grain available for hits and correct rejections were equal (2.5 s). The asynchrony of "caused" events was 20 ms; the asynchrony of "uncaused" events, depending on the experimental condition, was drawn from one of three sets with average asynchronies of 120, 220, and 420 ms. The compositions of these sets were 120: 40, 80, 120, 160, 200; 220: 60, 140, 220, 300, 380; 420: 100, 260, 420, 580, 740. All values are in ms and include the response time of the computer and light bulb. All subjects experienced these sets in the order 420 ms, 220 ms, and 120 ms, with approximately 40 sessions at each condition.

Results and Discussion

Figure 5 shows the probability of a response on the right key as a function of the asynchrony between a center-key peck and the transition to the side key lights. The data come from the last eight sessions at each condition. The curves are essentially congruent and confirm our earlier impressions of the acuteness of the pigeon's perception of short delays between its behavior and subsequent events: The curves reach their floors at delays of approximately 150 ms. Again, there is a slight rise in the curves between 250 and 400 ms. The average values of A' were 90, 85, and 80 for Conditions 420, 220, and 120. Differences in accuracy among conditions were not due to different elevations of the curves but to different extensions beyond 150 ms.

Figure 5. The probability of a right-key response as a function of the asynchrony between a center-key peck and the transition to lit side keys in Experiment 3. (The different symbols represent the performance in the different conditions of the experiment: circles, 120 ms; triangles, 220 ms; squares, 420 ms. Filled symbols represent hits, open symbols false alarms. The data are averages over subjects over the last eight sessions.) Abscissa: asynchrony of event (ms).

As the average asynchrony of delayed events decreased, the animals could optimize their performance by tightening their criterion for saying yes: In the conditions where computer-initiated events typically occurred sooner after a response (e.g., 120 ms), it was to the animal's benefit to say no to some of the more questionable events that it might otherwise have claimed. This did not happen; the circles do not lie below the other curves between 100 and 200 ms. However, the pigeons are operating close to the optimal criteria, and judging from optimized decision-theory models, this systematic error cost them only 4% of the rewards available in the session.

The accuracy of pigeons in making these discriminations may be due in part to the fact that the interval to be timed is initiated by a key peck. That accuracy may be portrayed with two different statistics, the just noticeable difference (jnd) and the Weber fraction. If we measure the jnd as a function of the semi-interquartile range (in Figure 5, the abscissa coordinate to 0.375 minus the abscissa coordinate to 0.625), the present experiments give an impressive value of approximately 50 ms for all curves. If we divide that by the point of subjective equality (the abscissa coordinate to 0.50) to obtain the Weber fraction, the result is a ratio of 0.65, not nearly so impressive (values of 0.1 being obtainable at longer intervals). Thus, although absolute sensitivity (the jnd) is excellent, relative sensitivity (the Weber fraction) is considerably below the best obtainable.

We conclude that, whereas pigeons are sensitive to the rewards contingent on their decisions and thus bias their responses in a rational fashion, they are not in this paradigm ideal decision makers, in that their bias does not change with changes in the signal-to-noise ratio in an optimal fashion. Their absolute sensitivity is excellent, their relative sensitivity less so.

Experiment 4

In the previous experiments we have demonstrated the acute sensitivity of pigeons to the relationship between their behavior and an ensuing change of cue lights, and we ask in this experiment whether pigeons will be yet more sensitive to the relationship between their behavior and a food reward.

Method

Subjects and apparatus. The subjects were three Silver King pigeons (H, I, and J) and one common pigeon (K). All had previous histories of reinforcement in other experiments. They were maintained at 80% of their ad libitum weight.

The apparatus was the same as in the previous experiments.

Procedure. There were three major conditions: food, "lite," and null. In the food condition, pecks on the center key would occasionally cause it to go off and the food hopper to be elevated. After a feeding, the side-key lights would come on. If the feeding was an immediate consequence of the pigeon's peck, a response to the right key would yield an additional reinforcement of 2.5 s; if not, a 2.5-s blackout. Conversely, if the feeding was asynchronous with the pigeon's peck, a response to the left key would yield a reinforcement of 2.5 s; if not, a 2.5-s blackout. The feedings were scheduled in the same manner as the transitions in the previous experiment, with the delay of "immediate" events being 20 ms and that of asynchronous events averaging 420 ms, with the same rectangular distribution as in Experiment 3.

The procedure for the lite condition was the same as that for the food condition, but when the center key went off, the light in the food hopper went on (as in the food condition), but the food hopper was not activated and no food was available. After the lite event, the side keys went on, and the pigeons were reinforced for correct discriminations. The purpose of the lite condition was to move the pigeons away from the center key or from their position elsewhere in the box, thus breaking up response chains that might mediate the discrimination, while providing a delay between the event and the subsequent discrimination that was equal to that experienced in the food condition.

The procedure for the null condition was identical to that in the 420-ms condition of Experiment 3: When the center key light went off, the side key lights came on immediately, and correct discriminative responses were reinforced.

The durations of the food and lite events were equal in each phase of the experiment and took values of 1, 2, and [?] s, in that order. The basic experimental cycle consisted of a 6-day week, with the lite condition on Monday and Thursday, the food condition on Tuesday and Friday, and the null condition on Wednesday and Saturday. The first (1-s) condition lasted for 12 cycles, and the subsequent conditions for [?] cycles. A session ended after 60 correct responses.

Results

It may be seen in Figure 6 that accuracy decreased as the duration of the food or light increased and that the decrease was much greater for the food condition than for the lite condition; sensitivities to the response-event relation were equal only at the 1-s condition, where the pigeons would have received little or no food in the food condition. Conversely, accuracy in the null condition was high (A' = 89%, the same as that found in Experiment 3 with different subjects under comparable conditions) and did not decrease when event duration was increased in the other conditions. This indicates that general motivational changes, such as satiation, were probably not responsible for the observed decrements in the other conditions.

Figure 6. Accuracy of the pigeons in reporting the temporal relation between a center-key peck and the ensuing event, either illumination of the hopper light (lite), delivery of food (food), or direct transition to lit side keys (null). (The data are plotted for individual subjects, and averages over subjects, as a function of the duration of the event. The solid lines are regressions; the dashed line is an exponential decay function originating at A' = 89 and having a rate constant of 0.13/s.) Abscissa: duration of light or food (s).

Discussion

This experiment shows a remarkable decrease in pigeons' ability to identify the relation between their behavior and a subsequent reward as the value of that reward increases. This relationship does not seem to make sense from a functional viewpoint, even though the mechanism (decay of the memory trace as it ages and greater decay when the interval is filled with distracting events) is well documented. However, the task we have set the animals, that of remembering the events in question, is not the same as the task of learning instrumental responses (Maki, 1979a). This distinction will be discussed later.

We may characterize the rate of decay of accuracy by its half-life, the time necessary for the performance to fall half of the distance from its maximum (at 0-s delay) to chance level. If we take the maximum to be A' = 89, the half-lives for the lite and food conditions were 4.5 s and 2.0 s, respectively. These values are in the range found by other investigators for memory of various events: pigeons' memory of whether they made one or two pecks (2.5 s; Kramer, 1982); pigeons' memory of the location of a red or green stimulus (2.5 s; Jitsumori & Sugimoto, 1982); and pigeons' memory for the location of a previous response (4-6 s; Shimp, 1976b). Jans and Catania (1980) studied pigeons' memory for the color of a stimulus over a "standard" delay (houselight only on) and over a delay filled with "activity" (the feeder was operated). Converting their results to A' and interpolating gives median half-lives of [?] s and [?] s for the standard and activity conditions, respectively. The value of the half-lives will be affected by other factors such as recency (Shimp, 1976b; Weisman, Wasserman, Dodd, & Larew, 1980), "surprisingness" (Maki, 1979b; Wagner, Rudy, & Whitlow, 1973; cf. Colwill & Dickinson, 1980), blocking effects (Cook, 1980; Grant & Roberts, 1976; Maki, Moe, & Bierley, 1977; Tranberg & Rilling, 1980; Wilkie, Summers, & Spetch, 1981; Williams, 1978), and, of course, mediating behavior (Kramer, 1982, Experiments 1 & 2; Smith, Niedorowski, & Attwood, 1982; Zentall, Hogan, Howard, & Moore, 1978). Where mediation is
minimized, the results speak to a very rapid decay of the memory trace.

We may compare these results with those issuing from the study of the delay of reinforcement gradient, an enterprise with a venerable tradition in experimental psychology (Benjamin & Perloff, 1982). Estimates of the half-life vary widely, depending in large part on the investigator's empirical success at minimizing mediation by conditioned reinforcement (Renner, 1964). A theoretical way of removing the effects of conditioned reinforcement is provided by Equation. In over a dozen different studies Killeen (1982b) found that the contribution of primary reinforcement to the maintenance of differential responding could be captured by an exponential decay function with a half-life of 5.5 s (q = 0.13); this function provides the dashed line through the lite data in Figure (assuming maximum A' of 89 and correcting for chance).¹ Insofar as that line parallels the data, it is evidence that the decay of discriminative (memorial) strength follows a time course similar to the decay in reinforcing strength.

The disruptiveness of interpolated reinforcement on the memory trace, suggested by Shimp (1976b) and Staddon (1974), is depicted by the food line in Figure and by the "activity" data of Jans and Catania (1980). Just as a reinforcer may intervene between a stimulus and an animal's subsequent report of it, thereby impairing memory of the stimulus, it may also intervene between a response and a subsequent reinforcer, thereby impairing the effectiveness of the delayed reinforcer. Arguments for the disruptiveness of interpolated reinforcement on the delay of reinforcement gradient are provided by Killeen (1981a, 1982a), who used that assumption to generate models for predicting response rates as a function of reinforcement rates. Killeen argued that, because the strengthening effects of reinforcement on earlier responses were disrupted more by an intervening period of
reinforcement than by a period of quiescence, increasing rates of reinforcement would have marginally decreasing effectiveness as it became ever more likely that the range of impact of a reinforcer would be truncated by interposition of other reinforcers. These considerations led to a model formally equivalent to those of Herrnstein (1970) and Catania (1973), although the interpretation of the parameters differed. The present paradigm provides a basis for independent measurement of a key parameter in that model, the disruptiveness of reinforcement as a function of its duration.

General Discussion

We have demonstrated that pigeons are easily able to discriminate between events that they cause and those that are independent of their responding (Experiment 1), even when the contiguity between a response and an adventitious event is quite close (Experiment 3). The discrimination appears to be based on a temporal discrimination in which the critical variable is the ratio of the asynchrony between a response and stimulus to the typical asynchrony between a response and stimulus that the pigeon has been reinforced for calling caused. Although the pigeons' attributions are quite sensitive to the relative payoffs (Experiments 1 & 2), they do not change with changes in the distribution of asynchronies of delayed events as they should if they were "ideal observers" maximizing payoff (Experiment 3).

In Experiment we demonstrated that accuracy in attributing the locus of control over the hopper light onset decayed along approximately the same time course as the theoretical decrease in the effectiveness of primary reinforcement. This suggests that when there is no significant blocking or differential conditioned reinforcement, the ability of a delayed reinforcer to strengthen a response is on the same order as the animal's ability to remember the response at the onset of reinforcement. But as reinforcement continues, memory deteriorates at an accelerated rate.
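The comparison of time courses drawn here can be made concrete with a minimal sketch. It assumes that accuracy decays exponentially from the reported maximum (A' = 89) toward a chance level of 50, which is the standard chance value for A' expressed in percent but is an assumption about how "correcting for chance" was done. The function names are illustrative and do not come from the article. The sketch converts between half-life and rate constant and evaluates the decay curve for the half-lives quoted in the text (4.5 s for lite, 2.0 s for food).

```python
import math

A_MAX = 89.0   # maximum accuracy (A', percent) reported for the null condition
CHANCE = 50.0  # assumed chance level for A' in percent (an assumption of this sketch)


def rate_from_half_life(half_life_s):
    """Rate constant (per second) of an exponential decay with the given half-life."""
    return math.log(2) / half_life_s


def half_life_from_rate(rate_per_s):
    """Half-life (seconds) of an exponential decay with the given rate constant."""
    return math.log(2) / rate_per_s


def accuracy(t_s, rate_per_s, a_max=A_MAX, chance=CHANCE):
    """Accuracy after t_s seconds, decaying from a_max toward chance."""
    return chance + (a_max - chance) * math.exp(-rate_per_s * t_s)


if __name__ == "__main__":
    # Half-lives quoted in the text for the two event types.
    for label, half_life in (("lite", 4.5), ("food", 2.0)):
        k = rate_from_half_life(half_life)
        curve = [round(accuracy(t, k), 1) for t in (0, 1, 2, 4)]
        print(label, "rate per s:", round(k, 3), "accuracy at 0, 1, 2, 4 s:", curve)

    # Half-life implied by the rate constant of the dashed line (0.13/s).
    print("half-life for 0.13/s:", round(half_life_from_rate(0.13), 2), "s")
```

A rate constant of 0.13/s corresponds to a half-life of ln 2 / 0.13, or about 5.3 s, which agrees with the 5.5 s quoted in the text to within the rounding of the rate constant.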
Whereas memory decays rapidly, perhaps exponentially, with the duration of a reinforcer, the ability of that reinforcer to strengthen the response should be correlated not with memory at some instant but with the integral of memory over the course of the reinforcer and should thus increase as a concave function of duration (Killeen, in press). This analysis may account for the marginally decreasing effectiveness of reinforcers as their duration is extended (earlier parts of the reinforcement event blocking the impact of later parts) just as it does when their frequency is increased (earlier reinforcers blocking the impact of later reinforcers).

¹ Equation was originally used to capture the differential impact on choice responses of delays between them and reward. The exponential decay of primary reinforcement strength was derived as a consequence of potential blocking of the association between two events as a function of the delay between them. The same rationale is appropriate here, even though the events differ: If there is a constant probability of forgetting an event (or having its association blocked), we expect an exponential function. Whether the similarity in values of the time constants is reliable remains to be seen.

References

Benjamin, L. T., & Perloff, R. (1982). A case of delayed recognition: Frederick Winslow Taylor and the immediacy of reinforcement. American Psychologist, 37, 340-342.
Blough, D. S. (1963). Interresponse time as a function of continuous variables: A new method and some data. Journal of the Experimental Analysis of Behavior, 6, 237-246.
Bush, R. R. (1963). Estimation and evaluation. In R. D. Luce, R. R. Bush, & E. Galanter (Eds.), Handbook of mathematical psychology (Vol. 1, pp. 429-469). New York: Wiley.
Catania, A. C. (1973). Self-inhibiting effects of reinforcement. Journal of the Experimental Analysis of Behavior, 19, 517-526.
Church, R. M., & Gibbon, J. (1982). Temporal generalization. Journal of Experimental Psychology: Animal Behavior Processes, 8, 165-186.
Colwill, R. M., & Dickinson, A. (1980). Short-term retention of "surprising" events by pigeons. Quarterly Journal of Experimental Psychology, 32, 539-556.
Commons, M. L., & Nevin, J. A. (Eds.). (1981). Quantitative analyses of behavior: Discriminative properties of reinforcement schedules. New York: Pergamon Press.
Cook, R. G. (1980). Retroactive interference in pigeon short-term memory by a reduction in ambient illumination. Journal of Experimental Psychology: Animal Behavior Processes, 6, 326-338.
Davison, M. C., & Tustin, R. D. (1978). The relation between the generalized matching law and signal-detection theory. Journal of the Experimental Analysis of Behavior, 29, 331-336.
Gibbon, J. (1977). Scalar expectancy and Weber's law in animal timing. Psychological Review, 84, 279-325.
Gibbon, J. (1981). Two kinds of ambiguity in the study of time. In M. L. Commons & J. A. Nevin (Eds.), Quantitative analyses of behavior: Discriminative properties of reinforcement schedules (pp. 157-189). New York: Pergamon Press.
Gott, C. T., & Weiss, B. (1972). The development of fixed-ratio performance under the influence of ribonucleic acid. Journal of the Experimental Analysis of Behavior, 18, 481-497.
Grant, D. S., & Roberts, W. A. (1976). Sources of retroactive inhibition in pigeon short-term memory. Journal of Experimental Psychology: Animal Behavior Processes, 2, 1-16.
Grant, D. S., Brewster, R. G., & Stierhoff, K. A. (1983). "Surprisingness" and short-term retention in pigeons. Journal of Experimental Psychology: Animal Behavior Processes, 9, 63-79.
Grier, J. B. (1971). Nonparametric indices for sensitivity and bias: Computing formulas. Psychological Bulletin, 75, 424-429.
Herrnstein, R. J. (1970). On the law of effect. Journal of the Experimental Analysis of Behavior, 13, 243-266.
Jans, J. E., & Catania, A. C. (1980). Short-term remembering of discriminative stimuli in pigeons. Journal of the Experimental Analysis of Behavior, 34, 177-183.
Jitsumori, M., & Sugimoto, S. (1982). Memory for two stimulus-response items in pigeons. Journal of the Experimental Analysis of Behavior, 38, 63-70.
Killeen, P. R. (1978). Superstition: A matter of bias, not detectability. Science, 199, 88-90.
Killeen, P. R. (1981a). Averaging theory. In C. M. Bradshaw, E. Szabadi, & C. F. Lowe (Eds.), Quantification of steady-state operant behavior (pp. 21-34). New York: Elsevier.
Killeen, P. R. (1981b). Learning as causal inference. In M. L. Commons & J. A. Nevin (Eds.), Quantitative analyses of behavior: Discriminative properties of reinforcement schedules (pp. 89-112). New York: Pergamon Press.
Killeen, P. R. (1982a). Incentive theory. In D. J. Bernstein (Ed.), Nebraska Symposium on Motivation, 1981: Response Structure and Organization (pp. 169-216). Lincoln: University of Nebraska Press.
Killeen, P. R. (1982b). Incentive theory II: Models for choice. Journal of the Experimental Analysis of Behavior, 38, 217-232.
Killeen, P. R. (in press). Incentive theory and reward magnitude. Journal of the Experimental Analysis of Behavior.
Kramer, S. P. (1982). Memory for recent behavior in the pigeon. Journal of the Experimental Analysis of Behavior, 38, 71-85.
Lattal, K. A. (1975). Reinforcement contingencies as discriminative stimuli. Journal of the Experimental Analysis of Behavior, 23, 241-246.
Luce, R. D. (1963). Detection and recognition. In R. D. Luce, R. R. Bush, & E. Galanter (Eds.), Handbook of mathematical psychology (Vol. 1, pp. 103-189). New York: Wiley.
Maki, W. S. (1979a). Discrimination learning without short-term memory: Dissociation of memory processes in pigeons. Science, 204, 83-85.
Maki, W. S. (1979b). Pigeons' short-term memories for surprising vs. expected reinforcement and nonreinforcement. Animal Learning & Behavior, 7, 31-37.
Maki, W. S., Moe, J. C., & Bierley, C. M. (1977). Short-term memory for stimuli, responses, and reinforcers. Journal of Experimental Psychology: Animal Behavior Processes, 3, 156-177.
McCarthy, D., & Davison, M. (1980). Independence of sensitivity to relative reinforcement rate and discriminability in signal detection. Journal of the Experimental Analysis of Behavior, 34, 273-284.
Neuringer, A. J. (1970). Superstitious key pecking after three peck-produced reinforcements. Journal of the Experimental Analysis of Behavior, 13, 127-134.
Nevin, J. A., Jenkins, P., Whittaker, S., & Yarenski, P. (1982). Reinforcement contingencies and signal detection. Journal of the Experimental Analysis of Behavior, 37, 65-79.
Pollack, I., & Norman, D. A. (1964). A nonparametric analysis of recognition experiments. Psychonomic Science, 1, 125-126.
Renner, K. E. (1964). Delay of reinforcement: A historical review. Psychological Bulletin, 61, 341-361.
Reynolds, G. S. (1966). Discrimination and emission of temporal intervals by pigeons. Journal of the Experimental Analysis of Behavior, 9, 65-68.
Rilling, M. E., & McDiarmid, C. G. (1965). Signal detection in fixed-ratio schedules. Science, 148, 526-527.
Seligman, M. E. P. (1970). On the generality of the laws of learning. Psychological Review, 77, 406-418.
Shimp, C. P. (1976a). Organization in memory and behavior. Journal of the Experimental Analysis of Behavior, 26, 113-130.
Shimp, C. P. (1976b). Short-term memory in the pigeon: The previously reinforced response. Journal of the Experimental Analysis of Behavior, 26, 487-493.
Shimp, C. P. (1983). The local organization of behavior: Dissociations between a pigeon's behavior and self-reports of that behavior. Journal of the Experimental Analysis of Behavior, 39, 61-68.
Skinner, B. F. (1948). Superstition in the pigeon. Journal of Experimental Psychology, 38, 168-172.
Smith, J. P., Niedorowski, L., & Attwood, J. C. (1982). Delayed choice responding by pigeons when the correct response is not predictable from the sample stimulus. Journal of the Experimental Analysis of Behavior, 37, 57-63.
Staddon, J. E. R. (1974). Temporal control, attention, and memory. Psychological Review, 81, 375-391.
Tranberg, D. K., & Rilling, M. (1980). Delay interval illumination changes interfere with pigeon short-term memory. Journal of the Experimental Analysis of Behavior, 33, 39-49.
Wagner, A. R., Rudy, J. W., & Whitlow, J. W. (1973). Rehearsal in animal conditioning. Journal of Experimental Psychology, 97, 407-426.
Weisman, R. G., Wasserman, E. A., Dodd, P. W. D., & Larew, M. B. (1980). Representation and retention of two-event sequences in pigeons. Journal of Experimental Psychology: Animal Behavior Processes, 6, 300-313.
Wilkie, D. M., Summers, R. J., & Spetch, M. L. (1981). Effect of delay-interval stimuli on delayed symbolic matching to sample in the pigeon. Journal of the Experimental Analysis of Behavior, 35, 153-160.
Williams, B. A. (1978). Information effects on the response-reinforcer association. Animal Learning & Behavior, 6, 371-379.
Zentall, T. R., Hogan, D. E., Howard, M. M., & Moore, B. L. S. (1978). Delayed matching in the pigeon: Effect on performance of sample-specific observing responses and differential delay behavior. Learning and Motivation, 9, 202-218.

Appendix

The Discrimination Model

Equation is an oversimplification. Pigeons were reinforced for saying yes to a band of asynchronies ranging from 10 to 20 ms (Experiment 1) or from 10 to 40 ms (subsequent experiments). Although we had originally presumed that the difference between these values and zero would be inconsequential, the pigeon's sensitivity makes that assumption unsafe. However, as a model of causal sequences in an animal's natural environments, these minima have
some verisimilitude; transfers of energy are not instantaneous but usually involve deformation of elastic materials so that lags of dozens of milliseconds are to be expected even for the most "direct" causal chains. These considerations do not, however, tell us the appropriate "subjective" value for $a_y$; fortunately, in Equation that value will be absorbed in the motivational parameter (M1) so that its numerical specification is moot.

A second oversimplification of the model is that it treats only half of the generalization gradient; this is all that is necessary, because asynchronies less than $a_y$ were rare. But consideration of how we might complete the model is useful. One possibility is

$$S = \frac{a_y}{|a_x - a_y|} \tag{A1}$$

But this is exactly the kernel around which Church and Gibbon (1982) construct their theory of temporal generalization. Because of the differencing operation in the denominator, which implies an interval scale of sensitivity to time, and because of the ratio comparison with the standard in the numerator, Equation A1 embodies the core assumptions of scalar timing (Gibbon, 1977). Although other rules for constructing scales of subjective time and decision rules for responding based on them are possible, theories based on extensions of Equation A1 are consistent with the largest range of data (see Gibbon, 1981, for a clear discussion of decision rules and subjective scales).

Equation A1 provides symmetric gradients around $a_y$, whose value, because it appears as a subtrahend, is no longer arbitrary. Calculation has shown that a value of ms for $a_y$ substantially improves the goodness of fit of the model for the data of Experiment (accounting for 94% of the data variance of Figure 5, excluding the triangular data point near 400 ms; without that parameter, accuracy falls to 90%). Inclusion of this estimate of the optimal latency for an event to be perceived as caused has little effect on the data of Experiment (presumably because the
values for $a_x$ were so much larger than this psychological minimum). Asynchronies of ms are so close to the measurement error of the experiment, however, that little consequence should be attached to their exact values.

It is possible to argue that the numerator of Equation A1 should be $a_x$, not $a_y$. The data disagree, for the goodness of their fit to that model decreases to around 80% for both experiments. Still other models are possible. If we measure strength as a negative exponential function of the reciprocal of the right side of Equation A1, when that measure is embedded in Equation we get a logistic function, a close approximation to the Gaussian distribution that lies at the heart of many models of signal detectability theory (Bush, 1963).

Received June 29, 1983
Revision received August 9, 1983
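The kernel proposed as Equation A1 in the Appendix can be illustrated with a minimal sketch. It assumes that a_y is the standard asynchrony the pigeon has been reinforced for calling "caused" and a_x is the asynchrony observed on a given trial; that reading follows the Appendix's wording but is an interpretation. The function name and the millisecond values are hypothetical. The sketch shows the two properties the Appendix attributes to the kernel: it is symmetric around a_y, and it depends only on ratios, so it is unchanged when both asynchronies are rescaled by the same factor.

```python
import math


def similarity(a_x, a_y):
    """Equation A1 kernel: S = a_y / |a_x - a_y|.

    a_x: observed response-event asynchrony (assumed meaning).
    a_y: standard asynchrony reinforced as "caused" (assumed meaning).
    Returns infinity when the observed asynchrony equals the standard.
    """
    difference = abs(a_x - a_y)
    if difference == 0:
        return math.inf
    return a_y / difference


if __name__ == "__main__":
    standard_ms = 20.0  # hypothetical standard, for illustration only

    # Symmetry: equal deviations above and below the standard give equal S.
    print(similarity(30.0, standard_ms), similarity(10.0, standard_ms))

    # Scalar property: multiplying both asynchronies by 10 leaves S unchanged.
    print(similarity(300.0, 200.0))
```

The alternative the Appendix considers and rejects, with a_x in the numerator, would break the symmetry around the standard, because equal deviations above and below a_y would then be weighted by different observed asynchronies.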
