
Computing and Visualization in Science (2005)




DOCUMENT INFORMATION

Basic information

Format:
Number of pages: 46
File size: 1.57 MB

Contents

Comput Visual Sci (2004). Digital Object Identifier (DOI) 10.1007/s00791-004-0145-0
Computing and Visualization in Science – Regular article

On a modular architecture for finite element systems. I: Sequential codes

Krzysztof Banaś
Section of Applied Mathematics ICM, Cracow University of Technology, Warszawska 24, 31-155 Kraków, Poland (e-mail: Krzysztof.Banas@pk.edu.pl)

Received: 9 December 2002 / Accepted: 22 March 2003
Published online: 17 August 2004 – © Springer-Verlag 2004
Communicated by: G. Wittum

Abstract. The paper discusses basic principles for the modular design of sequential finite element codes. The emphasis is put on computational kernels, considering pre- and postprocessing as separate programs. Four fundamental modules (subsystems) of computational kernels are identified for the four main tasks: problem definition, approximation, mesh manipulation and linear system solution. Example interfaces between the four modules that can accommodate a broad range of application areas, approximation methods, mesh types and existing solvers are presented and discussed. The extensions for other modules and for interfaces with external environments are considered. The paper prepares ground for the next article considering the architecture of parallel finite element systems.

Introduction

Finite element codes are becoming more and more complex. Simulations concern multi-scale phenomena and multiphysics processes modeled by coupled systems of nonlinear partial differential equations posed in complicated 3D domains. Adaptive meshes require more elaborate data structures, efficient solvers use multi-level preconditioning, approximation methods operate in more sophisticated function spaces and use complex algorithms for error estimation. Additionally, for the full utilization of capabilities offered by today's parallel computers with memory hierarchies, the explicit model of programming with domain decomposition and message passing proves to be the most efficient [23]. All these call for a new programming paradigm for the finite element method.

The implementations described in the popular textbooks, like [15, 21, 26], are considered the old paradigm. The finite element program is defined as a single unit with a single data structure. The data structure contains information on the problem solved, the approximation used and the finite element mesh. The attempts to define a new paradigm have continued for at least ten years [13, 17, 27] and have recently intensified (see e.g. articles in [3] and [16]). There are basically two approaches that can be found in the literature.

The first of them are case studies of large, complex systems [5–8]. For such systems it is necessary to introduce some kind of modularization of the program. Modular design makes codes easier to maintain, modify and extend. The basic principles and advantages of modularization are commonly known and acknowledged; the main problem, however, is to find a suitable modular structure for the software in question. Although there are many proposed architectures for finite element programs, the problem has not been the subject of separate investigations and there is no solution with widespread acceptance. The existing designs do not separate a specification for modules from its implementation and are usually influenced by the latter. The aim of the present paper is
to examine the structure of finite element codes, propose a modular design and define it in the most general form possible To this end the specification for modules is done exclusively in terms of their interfaces with other modules The specification tries to accommodate many existing variations of finite element approximations (continuous, discontinuous, higher order) and related algorithms To limit the scope of investigations, the emphasis is put on computational kernels of finite element codes, the parts related to fundamental calculations in the finite element method Similar efforts for specifying interfaces between different computational modules, in a slightly different context, have been undertaken in [14] and within the broader project described in [1] The second approach for modernizing finite element codes consists in using features of modern programming languages and software development techniques It is usually related to object oriented methodology and the use of C++ K Bana´s (see e.g [4] and the articles in [12]) Classes (or class hierarchies) are created for low level objects such as vectors, elements or materials, as well as for top level constructs corresponding to fundamental finite element modules The drawback of this approach is that class hierarchies often not allow for the strict separation of modules and, therefore, limit the possible implementations of cooperating modules to a specified language The new paradigm for designing finite element codes will be, with no doubt, related to the progress in computer languages and software engineering Finite element programs, like other software, benefit from using such features as derived (constructed) data types, dynamic memory allocation or different implementations of inheritance The profits of using new features of modern languages are counterbalanced by the costs of porting the old software and training programmers Not all features prove to be equally advantageous for scientific computing [2] The languages and their compilers change constantly These make the choice of a programming language for the finite element method by no means obvious There are now at least three modern languages, Fortran90, C and C++, widely used in scientific programming, each with its own advantages and disadvantages, all standardized and equipped with additional libraries and programming tools The solution proposed in the present paper tries to define a top level architecture of finite element systems in a language independent fashion, using module interfaces that allow for interoperability of three mentioned above, standardized languages The restriction to standardized languages ensures the portability as a crucial requirement for scientific codes The choice of a particular language to implement a given module is not discussed Certain languages seem to be most appropriate for certain tasks, like e.g C++ for implementing the hierarchy of possible elements using inheritance On the other hand, the majority of legacy codes is written in Fortran77 The specification of module interfaces for Fortran90 would allow for reusing the most valuable parts of this heritage by a relatively small effort of writing small intermediate Fortran90 wrappers Finite element analysis, in its classical form, is performed in three consecutive steps: pre-processing, processing and post-processing Pre-processing includes geometrical modeling of a computational domain, mesh generation and detailed definition of the problem, comprising the specification of coefficients, 
boundary conditions and possibly an initial condition Post-processing involves visualization of approximation results and possibly their further transformations Processing contains all computations leading from the specification of the problem to the finite element approximation of the solution All three steps may form a part of a larger simulation or a design process and the programs may be embedded in a problem solving environment A recent trend is to consider a uniform user interface for the whole simulation as a mean for an interactive and possibly collaborative control and steering [9, 18–20, 22] The last issue is the subject of intensive research in recent years, related to the emergence of new computing environments – distributed, webbased, “grid” The architecture proposed in the present paper does not include such extensions Nevertheless, it has been designed as a computational kernel that can be easily parallelized and included into a distributed problem solving environment The computational kernel of a finite element code is defined here as a part that implements algorithms related to the processing phase Some of these algorithms (e.g time integration, nonlinear equations solution) are shared with other methods for approximating partial differential equations (finite difference, finite volume) Due to their generic character these algorithms are not considered in detail in the paper The distinct features of the finite element method are: the use of a weak, integral statement as a basis for approximation, the division of the computational domain into finite elements and the use of function spaces spanned by basis functions that are defined element-wise Therefore the most fundamental algorithms for the finite element analysis are those that transform the weak statement into the system of linear equations Implementation of these algorithms is the main focus of the paper From many modularizations proposed already in the literature, the one considered as fundamental is chosen as a basis for the architecture The splitting consists of three modules: a mesh manipulation module, an approximation module and a problem dependent module Since the system of linear equations produced by the finite element method has several particular characteristics, the interface between the computational kernel and linear equations solvers is also of crucial importance Therefore a linear solver module is also considered as the fourth fundamental module The rationale behind the proposed splitting is that each distinguished module can operate on its own independent data structure This may produce simplifications to the complex data structures of finite element codes, especially in their parallel and adaptive versions On the other hand, the splitting allows for making use of well known advantages of modular design: creation of prototype modules, that adhere to the part of specifications, easy modifications to the code by managing separate modules independently, increased flexibility of combined codes, easier testing and tuning of separate modules, etc If designed in a proper way, small modules may be no more difficult to use than a single large code, and usually offer broader functionality than one, all purpose program In the particular finite element context, modules can be created e.g for hexahedral meshes with anisotropic refinements (e.g for introducing needle elements), tetrahedral and prismatic meshes with slicing of prisms possible (e.g to model boundary layers in complicated domains), for different 
approximation methods like: discontinuous Galerkin, mixed, hp-adaptive and so on The question of implementation of particular modules is not discussed in the paper, besides few remarks given below Since modules are defined in terms of their interfaces, their implementation may be done using different languages and constructs Probably the most advantageous would be to further split modules into submodules related to different tasks This can be done in an object oriented fashion or in a procedural style Finally, standard libraries can be used for such tasks as e.g linear algebra or memory management The modular design is also recommended here A single module in a single file containing an interface between the finite element code and a library enables more flexibility in changing libraries or extending code’s functionality The paper is the first in a series of articles discussing modular design of finite element systems The second part will describe parallel codes and the following parts will present example implementations and numerical examples On a modular architecture for finite element systems I Sequential codes in application domains such as CFD, multi-phase flow and electro-magnetics There is only one main section in the paper with different subsections discussing different modules and their interfaces A small summarizing section concludes the paper Finite element computational kernel modules and their interfaces The system of modules and their interfaces described below is a proposal for a general, modular design of finite element computational kernels It is not comprehensive, the idea is to present the most important and useful abstractions and concepts, as well as to give illustrating examples There are four modules considered: a mesh manipulation module, an approximation module, a problem dependent module and a linear solver module Each module is build around its own data structure, not shared with other modules, and the interfaces between modules consist only of parameters (named constants) and procedure (subroutine) calls Constants are used to specify dimensions of arrays being arguments of interface routines or to provide possible choices of parameters employed by different conventions Interfaces, as usual, are contained in separate included files All conventions adopted to make interface information unambiguous and comprehensive have to be explained in comments inside included files Each implementation of a module should provide interfaces in all three languages, Fortran90, C and C++ The examples in the next subsections use C or Fortran90 interchangeably For C, pointers are used exclusively as function arguments, to make interoperability with Fortran easier Vector or matrix arguments are always presented as one dimensional arrays leaving storage details to the final specification Two basic types of variables are used: integer and double precision real (with the name double for both, C and Fortran) The question of user specified precision is not considered as belonging to a different level of specification (although the use of some convention in that matter is advisable) 2.1 Mesh manipulation module The mesh manipulation module is responsible for reading, storing and modifying data concerning finite element meshes It also provides all other modules with information on meshes Mesh modifications comprise different kinds of refinements and derefinements: element divisions (h-refinements), vertex movements (r-refinements), element clustering It is relatively easy to separate the 
mesh manipulation module from the rest of the finite element code. The module does not require any information from other fundamental modules. There is no information within the module on the approximation it is used for, nor on the kind of problem solved.

Using the example of the mesh module, the principles for defining a module and the ways in which it communicates with other modules are presented. The key idea is to equip each appearing entity of a specific kind with a unique identifier (ID). The entities comprise complex constructs like mesh or solver, but also simple constructs like vertex or element. In some languages entities may be directly implemented as objects, but the specification does not even require the use of constructed (user defined) data types. For portability reasons it is assumed that an identifier is an integer number (as a type; in practice natural numbers are used).

A mesh is the fundamental entity for a mesh manipulation module. It is assumed that the module can handle several meshes, for approximating different field components or to handle multi-domain approximations. It is the responsibility of the approximation and the problem dependent modules to ensure the consistency of approximations on different meshes. To create a new mesh an initialization routine is called. The call, for the case of mesh data specified in a file, may have the form:

mesh_id = mmr_init_mesh( control, filename )

where mesh_id is an identifier associated with the newly created mesh, control is an integer parameter with the meaning defined by a particular module (e.g. it may be used for parallel execution to specify whether the data in a file concern the whole mesh or a particular submesh only) and filename is the name of a file with mesh data. The identifier mesh_id should be used as an argument for all operations on the corresponding mesh. An example is given using a simple routine that returns a description of a mesh:

call mmr_mesh_introduce( mesh_id, mesh_description )

where mesh_description is a character array (string) describing the type of the mesh according to some convention (e.g. "ANISOTROPIC HEXAHEDRAL" or "TRIANGULAR + QUADRILATERAL").

The most important mesh entity, as one could expect for the finite element method, is an element. From the theoretical point of view [10] an element is a triplet: a space domain, a set of shape functions (a local function space) and a set of degrees of freedom (functionals for the space). The two latter ingredients are directly related to approximation and will be considered later. The space domain is related indirectly; it provides a domain for integration of terms from a weak formulation (these terms comprise approximating functions). Integrals from a weak formulation are computed separately for each element. The mesh manipulation module must provide means for performing a loop over all elements that take part in the integration. These elements are called active and the loop may look as follows:

elem_id = mmr_first_active_elem( mesh_id );
do {
   elem_id = mmr_next_active_elem( mesh_id, elem_id );
} while ( elem_id > 0 );

...

α(t) > 0 is the learning rate, which is a monotonically decreasing function; r_i and r_c are the vector locations of the ith node and the cth node, respectively; δ(t) is the width of the kernel function, which defines the size of the learning neighbourhood and decreases monotonically with time; and ||·|| represents the Euclidean distance. As a powerful clustering diagram, SOM has attracted intensive attention from researchers for image processing. For example,
the PicSOM [11], a framework for content-based image retrieval, uses a self-organizing map to organize similar images in nearby neurons, building up a representation of an image database with similar images located near each other With a number of tree structured SOMs trained on different image features, the system yields an efficient retrieval of images similar to a set of reference images when one browses the image database It has been suggested that SOMs provide an efficient tool for both low level image quantification and high level feature localization [5] As shown in Fig 3, a SOM usually consists of a 2D regular grid of nodes A model of some observation is associated with each node Before the training, as shown in the left part, all models are randomly located in the map The right part of Fig illustrates the self-organization effect, i.e., similar images are clustered together, being associated with neighbouring nodes in the map In this paper, we use a SOM to cluster images based on shape features The objective is to find a representation of a video sequence to illustrate the property of an action as a time series After training, the SOM will generate a label for each input image, converting a video sequence to a label series Then, a searching process for motor primitives is applied to construct a primitive vocabulary In order to use SOM, the images need to be transformed into some feature vectors In our approach, we use a shapebased Fourier feature [2] First, the image area is normalized such that the aspect ratio is maintained Prewitt edge images are computed for the normalized frames The edge image contains the most relevant shape information and the discrete Fourier transform can be used to describe it The Fourier transform is computed for the normalized image using the Fig An illustration of the map for image clustering Each node in the hexagonal grid holds an image of a three-section pie Left: the initial map before training After a random initialization of the connection weights, all images are randomly located in the map Right: the map after training Images are clustered according to similarity Notice that neighbouring models are mutually similar in this map discrete fast Fourier transform (FFT) algorithm Then, the magnitude image of the Fourier spectrum is first low-pass filtered and thereafter decimated Searching for primitives After symbolizing the video sequences, computation cost is the key point for the searching algorithm of primitives An exhaustive searching will result in an exponentially increasing complexity Fortunately, exhaustive searching is not necessary when we consider the nature of the actions Basically we can describe an action as a transfer from one pose to another A pose accords to a serial of images that not significantly change Therefore, the whole searching space can be divided into multiple spaces by detecting the poses Further, by using the minimum description length principle, the repetitive substructures, primitives, are identified The rationale behind the minimum description length principle, introduced by Rissanen in 1978 [13] is that the best model for a data set is a model that can minimize the description length of the data set Cook and Holder [4] applied the principle for identifying repetitive substructures in structural data as a basic knowledge discovery approach By replacing previously discovered substructures in the data, a hierarchical description is produced for the structural regularities in the data For a given video sequence, the 
trained relation maps all individual images onto a 2D network of neurons Consider the time order of all images in the video, the video sequence forms some tracks/paths on the SOM map These tracks represent some substructures in the video sequence, which appear repeatedly As described in the follows, we propose an algorithm to discover these substructures Consider an N × M SOM, with P = N × M neurons By feeding a video sequence into the map, the result is a directional map Denote the P neurons as a symbol set S = {S1 , S2 , , S P } The directional map can be identically X Yu, S.X Yang represented by a symbol series Due to the coordination learning of the SOM, neurons within the neighbourhood represent images that are similar to each other Meanwhile the global competitive learning among neurons drives different images to different neurons As a result, motion primitives in the video sequence are encoded as repetitive substructure in the symbol series In order to represent the direction for connections, we build up a transition matrix, TP×P , with its row and column corresponding to the start and end points of a connection, respectively Specifically, T(i, j ) means the number of times when a track from Si to Sj is observed The transition matrix TP×P is computed by scanning the entire symbol series Then, the maximal element of the matrix is located and two searching processes, as we called forward and backward, are designed to find a primitive Assume the maximal element as T(n, m), standing for a connection from Sn to Sm The forward search is to find a neuron, from which the connection to Sn is the strongest Similarly, the backward search is to find a neuron, to which the connection from Sm is the strongest A threshold is defined on the connection strength to terminate the search, i.e., the connection is cut when the connection is weak The above process will be repeated over the entire matrix As a result, motion primitives that are represented by repetitive substructures in the symbol series are discovered Figure illustrates the process described above by a × SOM A video sequence is fed into the map As shown in Fig 4, we have a directional map Denote the neurons as A, B, C, , I Then, the directional map can be identically represented by a time series of symbols A–I, i.e., ABCHIADADHIFABCADABCEHI The transition matrix for the above symbol series is shown in Fig In this simplified case, there are elements with the maximal value of First we set the threshold for the connection strength as Consider the first element T(A, B) with a value of The forward search is to find a neuron from which the connection to A is the strongest The result is none due to the threshold of This means A is the starting point for the current search of primitive The backward search is to find to which neuron the connection from B is the strongest and we get neuron C But after that the search is terminated because no neuron takes a strong connection from C (The maximal value for the C row is 1, which is below the threshold) Then, we get a primitive as (ABC) Following similar processes, we obtain (AD) and (HI) In summary, the proposed algorithm is described as follows (1) Convert the 2D N × M SOM map into an identical representation of a 1D series of symbols, {S1 , S2 , , S P } The series length, P, equals to the number of neurons in the SOM map (2) Create the transition matrix TP×P Compute T(i, j ) as the number of times when a track from Si to Sj is observed (3) Find the maximal element of T Denote it as Ti , j It 
represents a track from Si to Sj (4) Fetch the j th row of T Find the maximal element of this row Then, the corresponding symbol is the next symbol after Sj (5) Set the elements whose symbols have been tracked to zero Then repeat Steps (4)–(5) until the current maximal element is less than half of the first maximal element Fig The directional graph obtained by feeding a video segment to a × SOM The SOM has neurons on the map, denoted as A, B, , I, respectively An arrow between two neurons, e.g., A and B, shows that an image corresponding to neuron A is exactly followed by an image of B in the video sequence (6) After finding the global maximal element, the other process is to find the previous symbol by fetching the i th column of T and finding the maximal element This process is also repeated until the current maximal element is less than half of the first maximal element (7) Repeat Steps (3)–(6) until there is no element larger than half of the first maximum The obtained sub-series of symbols represent the so called primitives of motion A new symbol can be defined for each primitive Then, the whole video sequence can be represented by using these symbols, resulting in a concise representation of the video sequence The computational complexity of the proposed searching algorithm is proportional to the length of the video sequence and the size of the transition matrix Suppose the length of the video sequence is L and the size of the transition matrix is P To build up the transition matrix, the scanning of the symbol series involves L operations of comparison Then, the computation to find primitives is basically a global sorting for the P elements of the transition matrix Because the length of a primitive is normally much less than the video sequence, the computation for forward and backward search is ignorable, in comparison to the computation for the transition matrix Therefore, the total computation complexity is estimated as o(L + P ) In addition, the determination of the Fig The transition matrix for the time series of “ABCHIADADHIFABCADABCEHI” The transition graph is shown in Fig The number of “*” for an element of T(i, j ) represents the number of observations of the connection between Neuron i and Neuron j A study of motion recognition from video sequences map size P is made to have an even distribution of all images on the map, as we will discussed later For instance, we have P ∼ L/5 in our simulation On the contrary, the exhaustive searching for primitives is estimated as o(L!) 
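Read together, steps (1)–(7) translate into a short procedure. Below is a minimal C sketch of the transition-matrix construction and the greedy forward/backward extraction (C is used to match the code fragments elsewhere in this document, although this paper's own experiments use MATLAB). Several details are illustrative assumptions rather than the paper's: the numeric threshold values in the worked example are garbled in this preview, so the sketch takes the cut-off as an explicit parameter and follows only connections strictly stronger than it; the reading of step (5) as zeroing the traversed connections, the map size P and the length cap MAXPRIM are likewise assumed.

#include <stdio.h>

#define P        9    /* number of SOM neurons; the worked example uses a 3 x 3 map, neurons A..I */
#define MAXPRIM 32    /* assumed cap on primitive length, for illustration only */

/* Step (2): build the P x P transition matrix from a symbol series of length L,
   where series[t] is the index (0..P-1) of the best-matching neuron at frame t. */
static void build_transition(const int *series, int L, int T[P][P])
{
    for (int i = 0; i < P; i++)
        for (int j = 0; j < P; j++)
            T[i][j] = 0;
    for (int t = 0; t + 1 < L; t++)
        T[series[t]][series[t + 1]] += 1;
}

/* Steps (3)-(7): repeatedly take the strongest remaining connection T(n,m) and grow it
   backwards from Sn and forwards from Sm, following only connections whose count is
   strictly greater than thresh ("the connection is cut when it is weak").
   Followed connections are zeroed - our reading of step (5). */
static void find_primitives(const int *series, int L, int thresh)
{
    int T[P][P];
    build_transition(series, L, T);

    for (;;) {
        /* step (3): locate the current maximal element T(n,m) */
        int n = -1, m = -1, best = thresh;
        for (int i = 0; i < P; i++)
            for (int j = 0; j < P; j++)
                if (T[i][j] > best) { best = T[i][j]; n = i; m = j; }
        if (n < 0) break;               /* step (7): nothing above the threshold is left */
        T[n][m] = 0;

        int prim[MAXPRIM], len = 0;

        /* step (6): extend backwards - strongest predecessor of the current start */
        int cur = n;
        while (len < MAXPRIM / 2) {
            prim[len++] = cur;
            int prev = -1, v = thresh;
            for (int i = 0; i < P; i++)
                if (T[i][cur] > v) { v = T[i][cur]; prev = i; }
            if (prev < 0) break;
            T[prev][cur] = 0;
            cur = prev;
        }
        /* the backward part is collected end-to-start, so reverse it */
        for (int a = 0, b = len - 1; a < b; a++, b--) {
            int tmp = prim[a]; prim[a] = prim[b]; prim[b] = tmp;
        }

        /* steps (4)-(5): extend forwards - strongest successor of the current end */
        cur = m;
        while (len < MAXPRIM) {
            prim[len++] = cur;
            int next = -1, v = thresh;
            for (int j = 0; j < P; j++)
                if (T[cur][j] > v) { v = T[cur][j]; next = j; }
            if (next < 0) break;
            T[cur][next] = 0;
            cur = next;
        }

        printf("primitive:");
        for (int k = 0; k < len; k++) printf(" %c", 'A' + prim[k]);   /* neuron indices as letters */
        printf("\n");
    }
}

int main(void)
{
    /* the symbol series of the worked example; a threshold of 2 reproduces (ABC), (AD), (HI) */
    const char *s = "ABCHIADADHIFABCADABCEHI";
    int series[64], L = 0;
    while (s[L] != '\0') { series[L] = s[L] - 'A'; L++; }
    find_primitives(series, L, 2);
    return 0;
}

With the example series above and a threshold of 2, the sketch recovers the sub-series (ABC), (AD) and (HI) quoted in the text; the two passes, one over the series of length L and one over the matrix entries, reflect the complexity estimate discussed above.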
The proposed searching algorithm is much more efficient than an exhaustive searching Simulations A web camera is used to capture a video sequence of a hand clicking on a mouse With the resolution being 320 × 240 and the frequency being 15 frames/second, a 37 seconds sequence with 555 frames is used to test the proposed approach 5.1 Pre-processing of the video sequence After we convert the video sequence into individual image files, the MATLAB Image Toolbox is used to compute shapebased Fourier feature vectors to which SOM can be applied to the clustering Certainly, the feature extraction algorithm is important for the performance because the clustering of the SOM is based on an efficient representation of images by the feature vectors However, this is not the focus here We choose the following algorithm for feature extraction from the literature [2] First the images are normalized to 512 × 512 The Prewitt method is used to compute the edge image Then, an 8-point FFT is calculated The resulted Fourier spectrum is low-pass filtered and decimated by a factor of 32, resulting in a 128D vector for each image This algorithm is used because it is reported to be the most effective one in [2] However, these shape-based Fourier features are by no means recommended for a generalized application of the proposed motion recognition method, because they not well represent the local changes/movement, which are generally important for capturing the motion characteristics More discussions on the feature selection are presented in the final section Fig Illustration of the sample distribution on the trained SOM There are 555 samples in total The map is a 12 × 12 grid The bar height demonstrates the number of samples that take the current node as the best-matching unit This figure is used to help to determine the map size Basically, we expect an even distribution of all samples over the map 5.2 Clustering by SOM The feature vectors obtained in the above process are fed into a 12 × 12 SOM for clustering As there is no prior knowledge for the selection of the number of neurons, we apply a simple rule to help the selection, i.e., an even distribution of samples/features over the whole map Basically , a too large map will fail to discover any similarity among samples while an extra small map might mess everything together By monitoring the sample distribution, as shown in Fig 6, we choose a heuristic structure with 12 × 12 neurons The learning rate function α(t) is chosen as α(t) = a/(t + b), where a and b are chosen so that α(t = T ) = 0.01α(t = 0), with T is the time interval and α(t = 0) = 0.05 The kernel width δ(t) is set to be a linear function that changes from δini = 12 to δ final = In particular, δ(t) = (δ final − δini )t/T + δini Fig Sample distribution on the trained SOM The horizontal axis represents neurons, while the vertical axis is the number of samples that is clustered into the corresponding neuron Fig The unified distance matrix of the trained SOM The distance value is illustrated by the grey level, as shown by the bar on the right A high value means a large distance between neighbouring map units, and thus indicates the cluster boundary X Yu, S.X Yang view, clusters are typically uniform areas of low values Figure presents a better view of the distance matrix by a 3D surface plot, where the X and Y coordinates correspond to the position of the neurons and the Z coordinate represents the distance The high-value areas accord to some distinguish poses in an action, while the low-value areas mean a 
cluster of slowly changing poses In the experiment, a typical movement starts from a low area, climbs to a high area and returns back to a low area Figure 10 presents a view of the clustering by SOM in more details, by showing the kernel for each neuron in the map Each kernel is a represented by a 128D vector, as the visual feature for one individual image does Each waveform in Fig 10 is drawn for the 128D kernel In general, more similarity among the waveforms indicates that there is less difference by the motion among the images that are clustered to the corresponding neurons 5.3 Motion primitives discovery Fig A 3D view of the distance matrix The X and Y coordinates correspond to the position of the neurons and the Z coordinate represents the distance A better view may be available if colours are displayed, since we also use colours to differentiate the distance Intuitively, the high-value areas correspond to some distinguish poses in an action, while the low-value areas indicate slowly changing poses As shown in Fig 8, the so called U-matrix shows the unified distances between neighbouring units and thus visualizes the cluster structure of the map Note that the U-matrix visualization has much more hexagons than the component planes This is because that hexagons are inserted between map units to show the distance between neighbouring neurons, while a hexagon in the position of a map unit illustrates the average of surrounding values A high value on the U-matrix mean a large distance between neighbouring map units, and thus indicates the cluster boundary From the classification point of Fig 10 Illustration of the codebook for all neurons on the SOM after the training Each waveform is drawn as the 128D kernel for the corresponding neuron The concept of pose detection in video segmentation is very similar to that of the silence detection in speech processing Intuitively, it is reasonable to consider a pose as a piece of silence for human actions Silence detection is widely used for speech segmentation At least, sentences can always be picked up from a long speech by detecting the pause between them Speech silence is defined based on signal energy The pose we define here is based on frequency, while speech information is basically encoded by frequency components In cognitive studies, scientists often address human behaviours as body language Interestingly, this suggests us an effective way to handle video signals As a time sequence, video signals share some common sense with speech signals Speech signal is a time series of sound wave, in which language information is encoded Similarly, a particular action can be addressed as a series of poses Then, a primitive can be symbolized, just as we write down a word to represent a particular sound series When enough action primitives are collected, a vocabulary is formed Then, we can use this vocabulary to describe human behaviours Figure 11 shows some sample shots of action in the video sequence For example, the forefinger’s action is well recognized by searching the serial of {(3,11), (6,12), } By applying the substructure searching algorithm presented above, primitives are extracted as series of neurons, which are represented by a pair of numbers according to their positions on the Fig 11 Some sample shots in the video sequence The upper sequence shows a movement of the forefinger, while the lower sequence shows a movement of the middle finger Each motion serial is symbolized to a neuron serials in the 2D map Neurons are represented by a pair of 
numbers according to their position on the map For example, (3, 11) means the neuron in the 3rd row and the 11th column A study of motion recognition from video sequences map Then, the whole video sequence is split automatically by dividing and representing the corresponding symbol series with the resulted primitives Conclusion and discussion The video sequence processing approach proposed in this paper features two factors First, due to the unsupervised learning mechanism of the self-organizing map, it saves us some tedious manual computation that is necessary for conventional approaches such as hidden Markov based models Secondly, it gains support from cognitive studies of motion primitives, as well as provides a better understanding of the biological sensory motor systems The proposed searching algorithm for substructure discovery is efficient and effective Compared to an exhaustive searching, its computational complexity is very low, particularly when the sequence length is large In fact, for real world application, where the video sequence is very long, the exhaustive searching is almost useless However, as discussed in Sect 5, the computation requirement for the proposed algorithm is proportional to the sequence length, resulting in a good application The proposed approach is part of our effort on studies of biologically inspired robotics One application of the proposed method is to enhance the learning-by-imitation capability for humanoid robots There has been much work done for building up a humanoid robot system that can automatically gain knowledge on how to control its own movement by observing the behaviour of a model (either a human or another robot) and trying to copy it, which is inspired from studies on how a child learns his/her action by watching and practising [9, 18] The approach discussed in this paper focuses on the observation and understanding part of a learning-byimitation procedure Based on the hypothesis of primitives, it provides a method to interpret the motion of a model, e.g., a human, in the video input Following that, the humanoid robot will try to perform in a similar way Then with a reward given on its performance, the robot is capable of learning the way to control its body, as a child learns to walk, reach, take, etc Another potential application to robotics is video-based navigation As video is the dominant sensor input for robots, it is desirable for us to understand the video as much as we could Motion recognition from video sequence could arm a robot with a nice ability to avoid dynamic obstacles For example, a pedestrian could easily tell if it is possible to avoid a coming car and cross the street safely by taking some looks along the way (here let us set a situation without traffic rules) This is because that a human can gain the motion information of the moving cars by a look A robot that can extract motion information from video input, hopefully, will be able to navigate in the world more naturally and safely Still, the approach is sensitive to the calculation of feature vectors, as a data-driven method normally does This sensitivity may also be understood as that the global shape information is not robust to represent the difference that are resulted in the motion [8] Studies on biological motion perception systems suggest that the global shape information is not ne- cessary for recognizing motions The shape-based Fourier features well catch the global shape information in individual images However, it does not detect local changes/movement 
efficiently. For this reason, finger movements in the test video are exaggerated on purpose. For a generalized application of the proposed motion recognition approach, more investigations on the visual features should be conducted. As motions are generally involved in object representation in images, the medial axis transform [15] shall be investigated as a good candidate. This is part of our future work.

References

1. Bizzi, E., Giszter, S., Mussa-Ivaldi, F.A.: Computations Underlying the Execution of Movement: a Novel Biological Perspective. Science 253, 287–291 (1991)
2. Brandt, S., Laaksonen, J., Oja, E.: Statistical Shape Features in Content-Based Image Retrieval. Proceedings of the 15th International Conference on Pattern Recognition, Barcelona, Spain, Vol. 12, pp. 1062–1065, September (2000)
3. Bregler, C.: Learning and Recognizing Human Dynamics in Video Sequences. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, San Juan, Puerto Rico, pp. 568–574, June (1997)
4. Cook, D.J., Holder, L.B.: Substructure Discovery Using Minimum Description Length and Background Knowledge. Journal of Artificial Intelligence Research 1, 231–255 (1994)
5. Craw, I., Costen, N.P., Kato, T., Akamatsu, S.: How Should We Represent Faces for Automatic Recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence 21(8), 725–736 (1999)
6. Davis, J.W., Bobick, A.F.: The Representation and Recognition of Human Movement using Temporal Templates. Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, pp. 928–934, June (1997)
7. Guo, A., Yang, S.X.: Neural Network Approaches to Visual Motion Perception. Science in China, Series B 37(2), 177–189 (1994)
8. Guo, A., Sun, H., Yang, S.X.: A Multilayer Neural Network Model for Perception of Rotational Motion. Science in China, Series C 40(1), 90–100 (1997)
9. Jenkins, O.C., Mataric, M.J.: Deriving Action and Behaviour Primitives from Human Motion Data. Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, Vol. 3, pp. 2551–2556, September (2002)
10. Kohonen, T.: Self-Organizing Maps. New York: Springer (1997)
11. Laaksonen, J., Koskela, M., Laakso, S., Oja, E.: Self-Organizing Maps as a Relevance Feedback Technique in Content-Based Image Retrieval. Pattern Analysis and Applications 4(2), 140–152 (2001)
12. Lin, C.T., Wu, G.D., Hsiao, S.C.: New Techniques on Deformed Image Motion Estimation and Compensation. IEEE Transactions on Systems, Man, and Cybernetics, Part B 29(6), 846–859 (1999)
13. Rissanen, J.: Stochastic Complexity in Statistical Inquiry. Singapore: World Scientific (1989)
14. Shah, M., Jain, R.: Motion-Based Recognition. Boston: Kluwer Academic Publishers (1997)
15. Sherbrooke, E.C., Patrikalakis, N.M., Brisson, E.: An Algorithm for the Medial Axis Transform of 3D Polyhedral Solids. IEEE Transactions on Visualization and Computer Graphics 2(1), 44–61 (1996)
16. Sidenbladh, H., De la Torre, F., Black, M.J.: A Framework for Modeling the Appearance of 3D Articulated Figures. Proceedings of the Fourth IEEE Conference on Automatic Face and Gesture Recognition, pp. 368–375, March (2000)
17. Vesanto, J., Alhoniemi, E.: Clustering of the Self-Organizing Map. IEEE Transactions on Neural Networks 11(3), 586–600 (2000)
18. Weber, S., Jenkins, C., Mataric, M.J.: Imitation Using Perceptual and Motor Primitives. Proceedings of the International Conference on Autonomous Agents, Barcelona, Spain, pp. 136–137, June 3–7 (2000)
19. Wu, Y., Huang, T.S.: Hand Modeling, Analysis, and Recognition for Vision-based Human Computer Interaction. IEEE Signal Processing Magazine 18(3), 51–60 (2001)

...

! monitoring flag with options:
! LSC_PRINT_NOT     - do not print anything
! LSC_PRINT_ERRORS  - print error messages only
! LSC_PRINT_INFO    - print most important data
! LSC_PRINT_ALLINFO - print all ...

... grid
integer :: level_fine_id   !in: fine grid level number
integer :: l_nrdof_fine    !in: fine grid nrdofs for each block
integer :: l_posg_fine     !in: fine grid positions within the
                           !    global vector ...

...
integer :: mesh_id         !in: mesh ID
integer :: elem_id         !in: element ID
integer :: ref_type        !in: indicator of type of refinement
integer :: info            !out: success or error code
end subroutine mmr_elem_divide

Once again ...
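For illustration only, the Fortran interface fragment for mmr_elem_divide quoted above can be combined with the active-element loop from Sect. 2.1 roughly as follows. This is a hedged sketch, not code from the paper: the C prototypes and argument passing, the problem-module routine pdr_elem_error, the named constant MMC_REF_ISO and the error-handling convention are all hypothetical assumptions.

/* Assumed C prototypes for the mesh-module routines quoted in the text;
   the exact C binding (by value vs. by pointer) is an assumption here. */
int  mmr_first_active_elem(int mesh_id);
int  mmr_next_active_elem(int mesh_id, int elem_id);
void mmr_elem_divide(int mesh_id, int elem_id, int ref_type, int *info);

/* Hypothetical problem-dependent-module call returning an element error
   indicator; such a routine is not shown in this preview. */
double pdr_elem_error(int mesh_id, int elem_id);

#define MMC_REF_ISO 1   /* hypothetical named constant selecting isotropic h-refinement */

/* Divide every active element whose error indicator exceeds tol. */
void refine_above_tolerance(int mesh_id, double tol)
{
    int elem_id = mmr_first_active_elem(mesh_id);
    if (elem_id <= 0) return;               /* no active elements in this mesh */
    do {
        if (pdr_elem_error(mesh_id, elem_id) > tol) {
            int info = 0;
            mmr_elem_divide(mesh_id, elem_id, MMC_REF_ISO, &info);
            if (info < 0) {
                /* refinement refused or failed - react to the error code */
            }
        }
        elem_id = mmr_next_active_elem(mesh_id, elem_id);
    } while (elem_id > 0);
}

In practice a code organized this way would more likely collect the identifiers of the elements to be divided first and call mmr_elem_divide afterwards, since dividing elements changes the set of active elements that the loop traverses.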

Date posted: 07/09/2020, 09:01