Neural Networks: Algorithms, Applications, and Programming Techniques, Part 9

8.3 ART2

[Figure 8.6: The overall structure of the ART2 network is the same as that of ART1. The F1 layer has been divided into six sublayers, w, x, u, v, p, and q. Each node labeled G is a gain-control unit that sends a nonspecific inhibitory signal to each unit on the layer it feeds. All sublayers on F1, as well as the r layer of the orienting subsystem, have the same number of units. Individual sublayers on F1 are connected unit to unit; that is, the layers are not fully interconnected, with the exception of the bottom-up connections to F2 and the top-down connections from F2.]

... the appearance of the multiplicative factor in the first term on the right-hand side of Eq. (8.31). For the ART2 model presented here, we shall set B and C identically equal to zero. As with ART1, J_k^+ and J_k^- represent net excitatory and inhibitory factors, respectively. Likewise, we shall be interested in only the asymptotic solution, so

    x_k = J_k^+ / (A + D J_k^-)                                 (8.32)

The values of the individual quantities in Eq. (8.32) vary according to the sublayer being considered. For convenience, we have assembled Table 8.1, which shows all of the appropriate quantities for each F1 sublayer, as well as for the r layer of the orienting subsystem. Based on the table, the activities on each of the six sublayers of F1 can be summarized by the following equations:

    w_i = I_i + a u_i                                           (8.33)
    x_i = w_i / (e + ||w||)                                     (8.34)
    v_i = f(x_i) + b f(q_i)                                     (8.35)
    u_i = v_i / (e + ||v||)                                     (8.36)
    p_i = u_i + Σ_j g(y_j) z_ij                                 (8.37)
    q_i = p_i / (e + ||p||)                                     (8.38)

We shall discuss the orienting-subsystem r layer shortly. The parameter e is typically set to a positive number considerably less than 1. It has the effect of keeping the activations finite when no input is present in the system. We do not require the presence of e for this discussion, so we shall set e = 0 for the remainder of the chapter.

    Layer   A   D   J_i^+                    J_i^-
    w       1   1   I_i + a u_i              0
    x       e   1   w_i                      ||w||
    v       1   1   f(x_i) + b f(q_i)        0
    u       e   1   v_i                      ||v||
    p       1   1   u_i + Σ_j g(y_j) z_ij    0
    q       e   1   p_i                      ||p||
    r       e   1   u_i + c p_i              ||u|| + ||cp||

Table 8.1 Factors in Eq. (8.32) for each F1 sublayer and the r layer. I_i is the ith component of the input vector. The parameters a, b, c, and e are constants whose values will be discussed in the text. y_j is the activity of the jth unit on the F2 layer, and g(y) is the output function on F2. The function f(x) is described in the text.

The three gain-control units in F1 nonspecifically inhibit the x, u, and q sublayers. The inhibitory signal is equal to the magnitude of the input vector to each of those layers. The effect is that the activities of these three layers are normalized to unity by the gain-control signals. This method is an alternative to the on-center off-surround interaction scheme presented in Chapter 6 for normalizing activities.

The form of the function f(x) determines the nature of the contrast enhancement that takes place on F1 (see Chapter 6). A sigmoid might be the logical choice for this function, but we shall stay with Carpenter's choice of

    f(x) = 0   if 0 ≤ x < θ
    f(x) = x   if x ≥ θ                                         (8.39)

where θ is a positive constant less than one. We shall use θ = 0.2 in our subsequent examples.

It will be easier to see what happens on F1 during the processing of an input vector if we actually carry through a couple of examples, as we did with ART1. We shall set up a five-unit F1 layer. The constants are chosen as follows: a = 10; b = 10; c = 0.1. The first input vector is

    I1 = (0.2, 0.7, 0.1, 0.5, 0.4)^t

We propagate this vector through the sublayers in the order of the equations given.
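To make the sublayer bookkeeping concrete, here is a minimal Python sketch of the F1 pass, using Eqs. (8.33) through (8.38) with e = 0 and the threshold function of Eq. (8.39). The helper names (`f`, `norm`, `f1_update`) and the use of NumPy are our own illustrative choices, not code from the original text; the top-down sum of Eq. (8.37) is collapsed to a single precomputed term, anticipating Eq. (8.42).

```python
import numpy as np

def f(x, theta=0.2):
    """Piecewise-linear contrast-enhancement function of Eq. (8.39)."""
    return np.where(x >= theta, x, 0.0)

def norm(v, e=0.0):
    """Normalize a vector, as the gain-control units do (e = 0 in this chapter)."""
    return v / (e + np.linalg.norm(v))

def f1_update(I, u, q, top_down, a=10.0, b=10.0):
    """One pass through the F1 sublayers, Eqs. (8.33)-(8.38).

    top_down is the contribution of the active F2 node to p (zero if F2 is inactive).
    Returns the new (w, x, v, u, p, q) values.
    """
    w = I + a * u                       # Eq. (8.33)
    x = norm(w)                         # Eq. (8.34)
    v = f(x) + b * f(q)                 # Eq. (8.35)
    u = norm(v)                         # Eq. (8.36)
    p = u + top_down                    # Eq. (8.37), single active F2 node
    q = norm(p)                         # Eq. (8.38)
    return w, x, v, u, p, q

# Reproduce the first example: two passes with no top-down signal.
I1 = np.array([0.2, 0.7, 0.1, 0.5, 0.4])
u = q = np.zeros(5)
no_top_down = np.zeros(5)
for _ in range(2):                      # two iterations are generally adequate
    w, x, v, u, p, q = f1_update(I1, u, q, no_top_down)
print(np.round(u, 3))                   # approx. [0.206 0.722 0.    0.516 0.413]
```

Running the loop twice reproduces, to rounding, the hand calculation that follows.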
As there is currently no feedback from u, w becomes a copy of the input vector:

    w = (0.2, 0.7, 0.1, 0.5, 0.4)^t

x is a normalized version of the same vector:

    x = (0.205, 0.718, 0.103, 0.513, 0.410)^t

In the absence of feedback from q, v is equal to f(x):

    v = (0.205, 0.718, 0, 0.513, 0.410)^t

Note that the third component is now zero, since its value fell below the threshold, θ. Because F2 is currently inactive, there is no top-down signal to F1. In that case, all the remaining three sublayers on F1 become copies of v:

    u = (0.205, 0.718, 0, 0.513, 0.410)^t
    p = (0.205, 0.718, 0, 0.513, 0.410)^t
    q = (0.205, 0.718, 0, 0.513, 0.410)^t

We cannot stop here, however, as both u and q are now nonzero. Beginning again at w, we find

    w = (2.263, 7.920, 0.100, 5.657, 4.526)^t
    x = (0.206, 0.722, 0.009, 0.516, 0.413)^t
    v = (2.269, 7.942, 0.000, 5.673, 4.538)^t

where v now has contributions from the current x vector and the u vector from the previous time step. As before, the remaining three layers will be identical:

    u = (0.206, 0.723, 0.000, 0.516, 0.413)^t
    p = (0.206, 0.723, 0.000, 0.516, 0.413)^t
    q = (0.206, 0.723, 0.000, 0.516, 0.413)^t

Now we can stop, because further iterations through the sublayers will not change the results. Two iterations are generally adequate to stabilize the outputs of the units on the sublayers.

During the first iteration through F1, we assumed that there was no top-down signal from F2 that would contribute to the activation on the p sublayer of F1. This assumption may not hold for the second iteration. We shall see later, from our study of the orienting subsystem, that, by initializing the top-down weights to zero, z_ij(0) = 0, we prevent reset during the initial encoding by a new F2 unit. We shall assume that we are considering such a case in this example, so that the net input from any top-down connections sums to zero.

As a second example, we shall look at an input pattern that is a simple multiple of the first input pattern, namely,

    I2 = (0.8, 2.8, 0.4, 2.0, 1.6)^t

which is each element of I1 times four. Calculating through the F1 sublayers results in

    w = (0.800, 2.800, 0.400, 2.000, 1.600)^t
    x = (0.205, 0.718, 0.103, 0.513, 0.410)^t
    v = (0.205, 0.718, 0.000, 0.513, 0.410)^t
    u = (0.206, 0.722, 0.000, 0.516, 0.413)^t
    p = (0.206, 0.722, 0.000, 0.516, 0.413)^t
    q = (0.206, 0.722, 0.000, 0.516, 0.413)^t

The second time through gives

    w = (2.863, 10.020, 0.400, 7.160, 5.726)^t
    x = (0.206, 0.722, 0.0288, 0.515, 0.412)^t
    v = (2.269, 7.942, 0.000, 5.672, 4.538)^t
    u = (0.206, 0.722, 0.000, 0.516, 0.413)^t
    p = (0.206, 0.722, 0.000, 0.516, 0.413)^t
    q = (0.206, 0.722, 0.000, 0.516, 0.413)^t

Notice that, after the v layer, the results are identical to those of the first example. Thus, it appears that ART2 treats patterns that are simple multiples of each other as belonging to the same class. For analog patterns, this would appear to be a useful feature: patterns that differ only in amplitude probably should be classified together.

We can conclude from our analysis that F1 performs a straightforward normalization and contrast-enhancement function before pattern matching is attempted. To see what happens during the matching process itself, we must consider the details of the remainder of the system.

8.3.3 Processing on F2

Processing on F2 of ART2 is identical to that performed on F2 of ART1. Bottom-up inputs are calculated as in ART1:

    T_j = Σ_i p_i z_ji                                          (8.40)

Competition on F2 results in contrast enhancement, where a single winning node is chosen, again in keeping with ART1. The output function of F2 is given by

    g(y_j) = d   if T_j = max_k {T_k}
    g(y_j) = 0   otherwise                                      (8.41)
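As a sketch of Eqs. (8.40) and (8.41), the F2 stage reduces to a dot product followed by a winner-take-all choice. The fragment below is illustrative and not code from the text: `z_bu` is the N-by-M matrix of bottom-up weights z_ji, and the default d = 0.9 anticipates the example parameter introduced in Section 8.3.8.

```python
import numpy as np

def f2_compete(p, z_bu, d=0.9, ineligible=()):
    """Bottom-up inputs (Eq. 8.40) and winner-take-all output (Eq. 8.41)."""
    T = z_bu @ p                      # T_j = sum_i p_i z_ji
    T[list(ineligible)] = -np.inf     # nodes reset earlier in the trial cannot win
    J = int(np.argmax(T))             # single winning node
    g = np.zeros(len(T))
    g[J] = d                          # g(y_J) = d; all other outputs are zero
    return J, g
```

With all bottom-up weights initialized to the same value, every T_j is equal and `np.argmax` returns the first unit, which is exactly what happens in the numerical example of Section 8.3.8.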
This equation presumes that the set {T_k} includes only those nodes that have not been reset recently by the orienting subsystem. We can now rewrite the equation for processing on the p sublayer of F1 as (see Eq. 8.37)

    p_i = u_i             if F2 is inactive
    p_i = u_i + d z_iJ    if the Jth node on F2 is active       (8.42)

8.3.4 LTM Equations

The LTM equations on ART2 are significantly less complex than those on ART1. Both the bottom-up and the top-down equations have the same form:

    dz_ji/dt = g(y_j) (p_i - z_ji)                              (8.43)

for the bottom-up weights from v_i on F1 to v_j on F2, and

    dz_ij/dt = g(y_j) (p_i - z_ij)                              (8.44)

for the top-down weights from v_j on F2 to v_i on F1. If v_J is the winning F2 node, then we can use Eq. (8.42) in Eqs. (8.43) and (8.44) to show that

    dz_Ji/dt = d (u_i + d z_iJ - z_Ji)

and similarly

    dz_iJ/dt = d (u_i + d z_iJ - z_iJ)

with all other dz_ij/dt = dz_ji/dt = 0 for j ≠ J. We shall be interested in the fast-learning case, so we can solve for the equilibrium values of the weights:

    z_Ji = z_iJ = u_i / (1 - d)                                 (8.45)

where we assume that 0 < d < 1. We shall postpone the discussion of initial values for the weights until after the discussion of the orienting subsystem.

8.3.5 ART2 Orienting Subsystem

From Table 8.1 and Eq. (8.32), we can construct the equation for the activities of the nodes on the r layer of the orienting subsystem:

    r_i = (u_i + c p_i) / (||u|| + ||cp||)                      (8.46)

where we once again have assumed that e = 0. The condition for reset is

    ρ / (e + ||r||) > 1                                         (8.47)

where ρ is the vigilance parameter, as in ART1.

Notice that two F1 sublayers, p and u, participate in the matching process. As top-down weights change on the p layer during learning, the activity of the units on the p layer also changes. The u layer remains stable during this process, so including it in the matching process prevents reset from occurring while learning of a new pattern is taking place.

We can rewrite Eq. (8.46) in vector form as

    r = (u + cp) / (||u|| + ||cp||)

Then, from ||r|| = (r · r)^{1/2}, we can write

    ||r|| = [1 + 2 ||cp|| cos(u, p) + ||cp||^2]^{1/2} / (1 + ||cp||)        (8.48)

where cos(u, p) is the cosine of the angle between u and p.

First, note that, if u and p are parallel, then Eq. (8.48) reduces to ||r|| = 1, and there will be no reset. As long as there is no output from F2, Eq. (8.37) shows that u = p, and there will be no reset in this case.

Suppose now that F2 does have an output from some winning unit, and that the input pattern needs to be learned, or encoded, by that F2 unit. We also do not want a reset in this case. From Eq. (8.37), we see that p = u + d z_J, where the Jth unit on F2 is the winner and z_J = (z_1J, z_2J, ..., z_MJ)^t. If we initialize all the top-down weights, z_ij, to zero, then the initial output from F2 will have no effect on the value of p; that is, p will remain equal to u. During the learning process itself, z_J becomes parallel to u according to Eq. (8.45). Thus, p also becomes parallel to u, and again ||r|| = 1 and there is no reset.

As with ART1, a sufficient mismatch between the bottom-up input vector and the top-down template results in a reset. In ART2, the bottom-up pattern is taken at the u sublayer of F1 and the top-down template is taken at p. Before returning to our numerical example, we must finish the discussion of weight initialization. We have already seen that the top-down weights must be initialized to zero. Bottom-up weight initialization is the subject of the next section.
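The reset test itself is only a few lines. The sketch below is our own illustration, assuming e = 0 as in the rest of the chapter; the vigilance value ρ = 0.9 is an arbitrary default, since the text does not fix ρ for the running example.

```python
import numpy as np

def reset_needed(u, p, c=0.1, rho=0.9, e=0.0):
    """Vigilance test of Eqs. (8.46)-(8.47); True means send a reset signal to F2."""
    r = (u + c * p) / (e + np.linalg.norm(u) + np.linalg.norm(c * p))   # Eq. (8.46)
    return rho / (e + np.linalg.norm(r)) > 1.0                          # Eq. (8.47)
```

Note that whenever p = u (F2 inactive, or top-down weights still zero), ||r|| = 1 and no vigilance value ρ ≤ 1 can trigger a reset, in agreement with the argument above.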
8.3.6 Bottom-Up LTM Initialization

We have been discussing the modification of LTM traces, or weights, in the fast-learning case. Let's examine the dynamic behavior of the bottom-up weights during a learning trial. Assume that a particular F2 node has previously encoded an input vector, such that z_Ji = u_i/(1 - d) and, therefore, ||z_J|| = ||u||/(1 - d) = 1/(1 - d), where z_J is the vector of bottom-up weights on the Jth F2 node. Suppose the same node wins for a slightly different input pattern, one for which the degree of mismatch is not sufficient to cause a reset. Then the bottom-up weights will be recoded to match the new input vector. During this dynamic recoding process, ||z_J|| can decrease before returning to the value 1/(1 - d). During this decreasing period, ||r|| will also be decreasing. If other nodes have had their weight values initialized such that ||z_j(0)|| > 1/(1 - d), then the network might switch winners in the middle of the learning trial. We must, therefore, initialize the bottom-up weight vectors such that

    ||z_j(0)|| ≤ 1 / (1 - d)

We can accomplish such an initialization by setting the weights to small random numbers. Alternatively, we could use the initialization

    z_ji(0) ≤ 1 / ((1 - d) √M)                                  (8.49)

This latter scheme has the appeal of a uniform initialization. Moreover, if we use the equality, then the initial values are as large as possible. Making the initial values as large as possible biases the network toward uncommitted nodes. Even if the vigilance parameter is too low to cause a reset otherwise, the network will choose an uncommitted node over a badly mismatched node. This mechanism helps stabilize the network against constant recoding.

Similar arguments lead to a constraint on the parameters c and d, namely,

    c d / (1 - d) ≤ 1                                           (8.50)

As this ratio approaches 1, the network becomes more sensitive to mismatches, because the value of ||r|| decreases to a smaller value, all other things being equal.

8.3.7 ART2 Processing Summary

In this section, we assemble a summary of the processing equations and constraints for the ART2 network. Following this brief list, we shall return to the numerical example that we began two sections ago. As we did with ART1, we shall consider only the asymptotic solutions to the dynamic equations, and the fast-learning mode. Also, as with ART1, we let M be the number of units in each F1 sublayer, and N be the number of units on F2.

Parameters are chosen according to the following constraints:

    a, b > 0
    0 < d < 1
    c d / (1 - d) ≤ 1
    0 < θ < 1
    0 < ρ < 1
    e << 1

Top-down weights are all initialized to zero:

    z_ij(0) = 0

Bottom-up weights are initialized according to

    z_ji(0) ≤ 1 / ((1 - d) √M)

Now we are ready to process data.

1. Initialize all layer and sublayer outputs to zero vectors, and establish a cycle counter initialized to a value of one.

2. Apply an input pattern, I, to the w layer of F1. The output of this layer is

       w_i = I_i + a u_i

3. Propagate forward to the x sublayer:

       x_i = w_i / (e + ||w||)

4. Propagate forward to the v sublayer:

       v_i = f(x_i) + b f(q_i)

   Note that the second term is zero on the first pass through, as q is zero at that time.

5. Propagate to the u sublayer:

       u_i = v_i / (e + ||v||)

6. Propagate forward to the p sublayer:

       p_i = u_i + d z_iJ

   where the Jth node on F2 is the winner of the competition on that layer. If F2 is inactive, p_i = u_i. Similarly, if the network is still in its initial configuration, p_i = u_i because z_iJ(0) = 0.

7. Propagate to the q sublayer:

       q_i = p_i / (e + ||p||)

8. Repeat steps 2 through 7 as necessary to stabilize the values on F1.
9. Calculate the output of the r layer:

       r_i = (u_i + c p_i) / (e + ||u|| + ||cp||)

10. Determine whether a reset condition is indicated. If ρ/(e + ||r||) > 1, then send a reset signal to F2. Mark any active F2 node as ineligible for competition, reset the cycle counter to one, and return to step 2. If there is no reset, and the cycle counter is one, increment the cycle counter and continue with step 11. If there is no reset, and the cycle counter is greater than one, then skip to step 14, as resonance has been established.

11. Propagate the output of the p sublayer to the F2 layer. Calculate the net inputs to F2:

       T_j = Σ_{i=1}^{M} p_i z_ji

12. Only the winning F2 node has nonzero output:

       g(y_j) = d   if T_j = max_k {T_k}
       g(y_j) = 0   otherwise

    Any nodes marked as ineligible by previous reset signals do not participate in the competition.

13. Repeat steps 6 through 10.

14. Modify the bottom-up weights on the winning F2 unit:

       z_Ji = u_i / (1 - d)

15. Modify the top-down weights coming from the winning F2 unit:

       z_iJ = u_i / (1 - d)

16. Remove the input vector. Restore all inactive F2 units. Return to step 1 with a new input pattern.

8.3.8 ART2 Processing Example

We shall be using the same parameters and input vector for this example that we used in Section 8.3.2. For that reason, we shall begin with the propagation of the p vector up to F2. Before showing the results of that calculation, we shall summarize the network parameters and show the initialized weights.

We established the following parameters earlier: a = 10; b = 10; c = 0.1; θ = 0.2. To that list we add the additional parameter d = 0.9. We shall use N = 6 units on the F2 layer. The top-down weights are all initialized to zero, so z_ij(0) = 0, as discussed in Section 8.3.5. The bottom-up weights are initialized according to Eq. (8.49): z_ji(0) = 0.5/((1 - d)√M) = 2.236, since M = 5.

Using I = (0.2, 0.7, 0.1, 0.5, 0.4)^t as the input vector, before propagation to F2 we have

    p = (0.206, 0.722, 0, 0.516, 0.413)^t

Propagating this vector forward to F2 yields a vector of activities across the F2 units of

    T = (4.151, 4.151, 4.151, 4.151, 4.151, 4.151)^t

Because all of the activities are the same, the first unit becomes the winner, the activity vector becomes

    T = (4.151, 0, 0, 0, 0, 0)^t

and the output of the F2 layer is the vector (0.9, 0, 0, 0, 0, 0)^t.

We now propagate this output vector back to F1 and cycle through the layers again. Since the top-down weights are all initialized to zero, there is no change on the sublayers of F1. We showed earlier that this condition will not result in a reset from the orienting subsystem; in other words, we have reached a resonant state. The weight vectors will now update according to the appropriate equations given previously. We find that the bottom-up weight matrix is

    [ 2.063  7.220  0.000  5.157  4.126 ]
    [ 2.236  2.236  2.236  2.236  2.236 ]
    [ 2.236  2.236  2.236  2.236  2.236 ]
    [ 2.236  2.236  2.236  2.236  2.236 ]
    [ 2.236  2.236  2.236  2.236  2.236 ]
    [ 2.236  2.236  2.236  2.236  2.236 ]
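The steps above can be collected into a compact simulation. The class below is a sketch of the fast-learning procedure of Section 8.3.7, not the authors' program: the class and method names and the vigilance value ρ = 0.9 are our own choices, e is taken as zero, step 13 is approximated by a single additional F1 pass, and the defaults follow the example parameters (a = b = 10, c = 0.1, θ = 0.2, d = 0.9, M = 5, N = 6).

```python
import numpy as np

class ART2:
    """Minimal fast-learning ART2 sketch following the steps of Section 8.3.7."""

    def __init__(self, M=5, N=6, a=10.0, b=10.0, c=0.1, d=0.9, theta=0.2, rho=0.9):
        self.a, self.b, self.c, self.d, self.theta, self.rho = a, b, c, d, theta, rho
        self.z_bu = np.full((N, M), 0.5 / ((1.0 - d) * np.sqrt(M)))  # Eq. (8.49)
        self.z_td = np.zeros((M, N))                                 # z_ij(0) = 0

    def _f(self, x):
        return np.where(x >= self.theta, x, 0.0)                     # Eq. (8.39)

    @staticmethod
    def _norm(v):
        n = np.linalg.norm(v)
        return v / n if n > 0 else v

    def _f1_pass(self, I, u, q, J):
        top_down = self.d * self.z_td[:, J] if J is not None else 0.0
        w = I + self.a * u                                           # step 2
        x = self._norm(w)                                            # step 3
        v = self._f(x) + self.b * self._f(q)                         # step 4
        u = self._norm(v)                                            # step 5
        p = u + top_down                                             # step 6
        q = self._norm(p)                                            # step 7
        return u, p, q

    def present(self, I):
        """Run one input pattern to resonance; return the winning F2 node index."""
        ineligible = set()
        while True:
            u = q = np.zeros(len(I))
            J = None
            for _ in range(2):                                       # step 8: stabilize F1
                u, p, q = self._f1_pass(I, u, q, J)
            T = self.z_bu @ p                                        # step 11, Eq. (8.40)
            for k in ineligible:                                     # step 12: reset nodes
                T[k] = -np.inf                                       #   do not compete
            J = int(np.argmax(T))
            u, p, q = self._f1_pass(I, u, q, J)                      # step 13 (one F1 pass)
            r = (u + self.c * p) / (np.linalg.norm(u) + np.linalg.norm(self.c * p))
            if self.rho / np.linalg.norm(r) > 1.0:                   # steps 9-10: reset?
                ineligible.add(J)
                continue
            self.z_bu[J, :] = u / (1.0 - self.d)                     # step 14, Eq. (8.45)
            self.z_td[:, J] = u / (1.0 - self.d)                     # step 15, Eq. (8.45)
            return J

net = ART2()
print(net.present(np.array([0.2, 0.7, 0.1, 0.5, 0.4])))  # winner: node 0
print(np.round(net.z_bu[0], 3))       # approx. [2.063 7.22  0.    5.157 4.126]
```

In this sketch, presenting I1 = (0.2, 0.7, 0.1, 0.5, 0.4)^t selects the first F2 unit and writes approximately (2.063, 7.220, 0.000, 5.157, 4.126) into its row of the bottom-up weight matrix, in agreement with the hand calculation above.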
Suggested Readings

... Carpenter, and their colleagues. Starting with Grossberg's work in the 1970s, and continuing today, a steady stream of papers has evolved from Grossberg's early ideas. Many such papers have been collected into books. The two that we have found to be the most useful are Studies of Mind and Brain [10] and Neural Networks and Natural Intelligence [13]. Another collection is The Adaptive Brain, Volumes I and II [11, 12]. ...

Examples of these applications can be found in the papers by Carpenter et al. [7], Kolodzy [15], and Kolodzy and van Allen [14]. An alternate method for modeling the orienting subsystem can be found in the papers by Ryan and Winter [16] and by Ryan, Winter, and Turner [17].

Bibliography

[1] Gail A. Carpenter and Stephen Grossberg. Associative learning, adaptive pattern recognition and cooperative-competitive decision making by neural networks. In H. Szu, editor, Hybrid and Optical Computing. SPIE, 1986.

[2] Gail A. Carpenter and Stephen Grossberg. ART 2: Self-organization of stable category recognition codes for analog input patterns. Applied Optics, 26(23):4919-4930, December 1987.

[3] Gail A. Carpenter and Stephen Grossberg. ART2: Self-organization of stable category recognition codes for analog input patterns. In Maureen Caudill and Charles Butler, editors, Proceedings of the IEEE First International Conference on Neural Networks, San Diego, CA, pp. II-727-II-735, June 1987. IEEE.

[4] Gail A. Carpenter and Stephen Grossberg. Invariant pattern recognition and recall by an attentive self-organizing ART architecture in a nonstationary world. In Maureen Caudill and Charles Butler, editors, Proceedings of the IEEE First International Conference on Neural Networks, San Diego, CA, pp. II-737-II-745, June 1987. IEEE.

[5] Gail A. Carpenter and Stephen Grossberg. A massively parallel architecture for a self-organizing neural pattern recognition machine. Computer Vision, Graphics, and Image Processing, 37:54-115, 1987.

[6] Gail A. Carpenter and Stephen Grossberg. The ART of adaptive pattern recognition by a self-organizing neural network. Computer, March 1988.

[7] Gail A. Carpenter, Stephen Grossberg, and Courosh Mehanian. Invariant recognition of cluttered scenes by a self-organizing ART architecture: CORT-X boundary segmentation. Neural Networks, 2(3):169-181, 1989.

[8] Michael A. Cohen and Stephen Grossberg. Absolute stability of global pattern formation and parallel memory storage by competitive neural networks. IEEE Transactions on Systems, Man, and Cybernetics, September-October 1983.

[9] Stephen Grossberg. Adaptive pattern classification and universal recoding, I: Parallel development and coding of neural feature detectors. In Stephen Grossberg, editor, Studies of Mind and Brain. D. Reidel Publishing, Boston, pp. 448-497, 1982.

[10] Stephen Grossberg. Studies of Mind and Brain, volume 70 of Boston Studies in the Philosophy of Science. D. Reidel Publishing Company, Boston, 1982.

[11] Stephen Grossberg, editor. The Adaptive Brain, Vol. I: Cognition, Learning, Reinforcement, and Rhythm. North Holland, Amsterdam, 1987.

[12] Stephen Grossberg, editor. The Adaptive Brain, Vol. II: Vision, Speech, Language and Motor Control. North Holland, Amsterdam, 1987.

[13] Stephen Grossberg, editor. Neural Networks and Natural Intelligence. MIT Press, Cambridge, MA, 1988.

[14] P. Kolodzy and E. van Allen. ... network to illusions and infrared imagery. In Proceedings of the IEEE First International Conference on Neural Networks, San Diego, CA, pp. IV-193-IV-202, June 1987. IEEE.

[15] Paul J. Kolodzy. Multidimensional machine vision using neural networks. In Proceedings of the IEEE First International Conference on Neural Networks, San Diego, CA, pp. II-747-II-758, June 1987. IEEE.

[16] T. W. Ryan and C. L. Winter. Variations on adaptive resonance. In Maureen Caudill and Charles Butler, editors, Proceedings of the IEEE First International Conference on Neural Networks, San Diego, CA, pp. II-767-II-775, June 1987. IEEE.

[17] T. W. Ryan, C. L. Winter, and C. J. Turner. Dynamic control of an artificial neural system: the property inheritance network. Applied Optics, 21(23):4961-4971, December 1987.
