Matrices and graphs theory and applications to economics

Matrices and Graphs
Theory and Applications to Economics

Proceedings of the Conferences on Matrices and Graphs: Theory and Applications to Economics
University of Brescia, Italy
June 1993, 22 June 1995

Sergio Camiz
Dipartimento di Matematica "Guido Castelnuovo", Università di Roma "La Sapienza", Italy

Silvana Stefani
Dipartimento Metodi Quantitativi, Università di Brescia, Italy

World Scientific
Singapore · New Jersey · London · Hong Kong

Published by World Scientific Publishing Co. Pte. Ltd.
P O Box 128, Farrer Road, Singapore 912805
USA office: Suite 1B, 1060 Main Street, River Edge, NJ 07661
UK office: 57 Shelton Street, Covent Garden, London WC2H 9HE

British Library Cataloguing-in-Publication Data
A catalogue record for this book is available from the British Library.

MATRICES AND GRAPHS
Theory and Applications to Economics
Copyright © 1996 by World Scientific Publishing Co. Pte. Ltd.
All rights reserved. This book, or parts thereof, may not be reproduced in any form or by any means, electronic or mechanical, including photocopying, recording or any information storage and retrieval system now known or to be invented, without written permission from the Publisher.
For photocopying of material in this volume, please pay a copying fee through the Copyright Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, USA. In this case permission to photocopy is not required from the publisher.
ISBN 981-02-3038-9
This book is printed on acid-free paper.
Printed in Singapore by Uto-Print

FOREWORD
The editors

The idea to publish this book was born during the conference «Matrices and Graphs: Computational Problems and Economic Applications», held back in June 1993 at Brescia University. The conference was such a success that the organizers, actually the present editors themselves, after a short talk with the lecturers, decided on the spot to apply to the Italian Consiglio Nazionale delle Ricerche (CNR) for a contribution to publish the conference proceedings. The second editor
did, and the contribution came after a while. In the meantime the editors organized another conference, «Matrices and Graphs: Theory and Economic Applications», held like the previous one in Brescia, in June 1995, partly with different invited lecturers. The conference was again a success, and therefore the first editor applied to the Italian National Research Council and got a second contribution, which came only recently. While the lecturers of the first conference who were not at the second one were a bit upset, having submitted their papers without seeing any proceedings published at that time, the lecturers common to both conferences suggested joining the contributions and publishing a single book for both conferences. This is what we did.

During all these years, both editors were very busy lecturing, researching, publishing, and raising more funds to make their research possible. Most papers arrived late and were carefully read by the editors; then the search for suitable referees was not easy, so the reviewing process also took a while, some papers being sent back to the authors for corrections and then submitted again to referees. A complete re-editing was necessary in order to obtain a uniform editorial style. Well, these are the reasons for such a delay, but eventually here we are.

The book reflects our scientific research background: for academic and scientific reasons both of us were drawn to different research subjects, both shifting from pure to applied mathematics and statistics, with particular attention to data analysis in many different fields for the first editor, and to operational research and mathematical finance for the second. So, at each step of this long way, we collected a bit of knowledge. The fact that in most of our investigations we dealt with matrices and graphs suggested that we investigate in how many different situations they may be used.

This was the reason that led to the conferences; as a result, this book looks like a patchwork,
as it is composed of different aspects: we submit it to the readers, hoping that they will appreciate it as we did. In fact, the numerous contributions come from pure and applied mathematics, operations research, statistics, and econometrics. Roughly speaking, we can divide the contributions by areas: Graphs and Matrices, from theoretical results to numerical analysis; Graphs and Econometrics; Graphs and Theoretical and Applied Statistics.

The Graphs and Matrices contributions begin with John Maybee: in his paper "New Insights on the Sign Stability Theorem", he finds a new characterization of a sign stable matrix, based on some properties of the eigenvectors associated with a sign semi-stable matrix.

Zsolt Tuza, in "Lower Bounds for a Class of Depth-Two Switching Circuits", obtains a lower bound for a certain class of (0,1) matrices. It is interesting to note that the problem can be formulated in terms of a semicomplete digraph D, if one wants to determine the smallest sum of the number of vertices in complete bipartite digraphs whose union is the digraph D itself.

Tiziana Calamoneri and Rossella Petreschi's "Cubic Graphs as Model of Real Systems" is a survey on cubic graphs, i.e. regular graphs of degree three, and at most cubic graphs, i.e. graphs with maximum degree three, and shows a few applications in probability, military problems, and financial networks.

Silvana Stefani and Anna Torriero, in "Spectral Properties of Matrices and Graphs", describe on the one hand how to deduce properties of graphs through the spectral structure of the associated matrices, and on the other how to get information on the spectral structure of a matrix through the associated graphs. New results are obtained towards the characterization of real spectrum matrices, based on the properties of the associated digraphs.

Guido Ceccarossi, in "Irreducible Matrices and Primitivity Index", obtains a new upper bound for the primitivity index of a matrix through graph theory and extends this concept to the class of periodic matrices.
Sergio Camiz and Vanda Tulli, in "Computing Eigenvalues and Eigenvectors of a Symmetric Matrix: a Comparison of Algorithms", compare Divide et Impera, a new numerical method for computing eigenvalues and eigenvectors of a symmetric matrix, to more classical procedures. Divide et Impera is used to integrate those procedures based on similarity transformations at the step in which the eigensystem of a tridiagonal matrix has to be computed.

Among the contributions on Graphs and Econometrics we find Sergio Camiz's paper "I/O Analysis: Old and New Analysis Techniques". In this paper, Camiz compares various techniques used in I/O analysis to reveal the complex structure of linkages among economic sectors: triangularization, linkages comparison, exploratory correspondence analysis, etc. Graph analysis, with such concepts as centrality, connectivity, and vulnerability, turns out to be a useful tool for identifying the main economic flows, since it is able to reveal the most important information contained in the I/O table.

Manfred Gilli, in "Graphs and Macroeconometric Modelling", deals with the search for a locally unique solution of a system of equations and with necessary and sufficient conditions for this solution to hold. He shows how the problem can be efficiently investigated through a graph theoretic approach, in particular when the Jacobian matrix is large and sparse, as is typical of most econometric models.

Manfred Gilli and Giorgio Pauletto, in "Qualitative Sensitivity Analysis in Multiequation Models", perform a sensitivity analysis of a given model when a linear approximation is used, the sign is given, and there are restrictions on the parameters. They show that a qualitative approach, based on graph theory, can be fruitful and lead to conclusions which are more general than the quantitative ones, as they are not limited to a neighborhood of the particular simulation path used.

Mario Faliva, in "Hadamard Matrix Product, Graph and System Theories: Motivations and Role in Econometrics", shows
how the analysis of a model's causal structure can be handled by using Hadamard product algebra, together with graph theory and system theoretical arguments. As a result, efficient mathematical tools are developed to reveal the causal and interdependent mechanisms associated with large econometric models.

Finally, "International Comparisons and Construction of Optimal Graphs", by Bianca Maria Zavanella, contains an application of graph theory to the analysis of the European Union countries based on prices, quantities and volumes. Graph theory turns out to be a most powerful tool to show which nations are more similar.

Graphs and Statistics papers are represented by three contributions. Giovanna Jona Lasinio and Paola Vicard, in "Graphical Gaussian Models and Regression", review the use of graphs in statistical modelling. The relative merits of the regression and graphical modelling approaches are described and compared, both from the theoretical point of view and with application to real data.

Francesco Lagona, in "Linear Structural Dependence of Degree One among Data: a Statistical Model", models the presence of some latent observations using a linear structural dependence among data, thus deriving a particular Markovian Gaussian field.

Bellacicco and Tulli, in "Cluster identification in a signed graph by eigenvalue analysis", establish a connection between clustering analysis and graphs, by including clustering in the wide class of graph transformations in terms of cuts and insertion of arcs to obtain a given topology.

After this review, it should be clear how important the role of matrices and graphs, and of their mutual relations, is in theoretical and applied disciplines. We hope that this book will contribute to this understanding.

We thank all the authors for their patience in revising their work. Special thanks go to Anna Torriero and Guido Ceccarossi for their constant help, but especially we would like to thank Vanda Tulli, who did the complete editing, trying to (and
succeeding in) making order among the many versions of the papers we got during the revision process. Last, but not least, thanks to Mrs Chionh of World Scientific Publishers in Singapore, whom we do not know personally, but whose efficiency we had the opportunity to know through e-mail.

October, 1996
Sergio Camiz and Silvana Stefani

The manuscripts by Sergio Camiz, Guido Ceccarossi, Manfred Gilli, Giovanna Jona Lasinio and Paola Vicard, Francesco Lagona, and Bianca Maria Zavanella, referring to the first Conference, were received at the end of 1993. The manuscripts of Antonio Bellacicco and Vanda Tulli, Tiziana Calamoneri and Rossella Petreschi, Sergio Camiz and Vanda Tulli, Mario Faliva, Manfred Gilli and Giorgio Pauletto, John Maybee, Silvana Stefani and Anna Torriero, and Zsolt Tuza, referring to the second Conference, were received at the end of 1995.

This work was supported by contributions from Consiglio Nazionale delle Ricerche n. A.I. 94.00967 (Silvana Stefani) and Consiglio Nazionale delle Ricerche n. A.I. 96.00685 (Sergio Camiz).

Sergio Camiz is professor of Mathematics at the Faculty of Architecture of Rome University "La Sapienza". In the past, he was professor of Mathematics at the Universities of Calabria, Sassari, and Molise, of Statistics at the Benevento Faculty of Economics of Salerno University, and of Computer Science at the American University of Rome. He spent periods as visiting professor at the Universities of Budapest (Hungary), Western Ontario (Canada), Lille (France) and at the Tampere Peace Research Institute of Tampere University (Finland); contributed to short courses on numerical ecology in the Universities of Rome, Rosario (Argentina), and Leon (Spain); held conferences on data analysis applications at various Italian universities, as well as the Universities of New Mexico (Las Cruces), Brussels (Belgium), Turku and Tampere (Finland), and at IADIZA in Mendoza (Argentina); and contributed with communications to various academic congresses in Italy,
Europe, and America. After a long activity in the frame of computational statistics and data analysis for numerical ecology, and in programming numerical computations in econometrics and in applied mechanics, his present research topics concern the analysis, development, and use of numerical mathematical methods for data analysis and applications in different frames, such as economic geography, archaeology, sociology, and political sciences. He was co-editor of two books, one concerning the analysis of urban supplies and the other on pollution problems, and is the author of several papers published in scientific journals.

Silvana Stefani is a Full Professor of Mathematics for Economics at the University of Brescia. She got her Laurea in Operations Research at the University of Milano. She has been a visiting scholar at various universities in Warsaw (Poland), Philadelphia (USA), Jerusalem (Israel), Rotterdam (the Netherlands), New York and Chicago (USA). She was Head of the Department of Quantitative Methods, University of Brescia, from November 1990 to October 1994 and is currently Coordinator of the Ph.D. Programme "Mathematics for the Analysis of Financial Markets". She was co-editor of two books, one concerning the analysis of urban supplies and the other on mathematical methods for economics and finance, and is the author of numerous articles published in international journals in Operations Research, Applied Mathematics, and Mathematical Finance.

Typeset by AMS-TeX
Edited by Vanda Tulli

[...] an intrinsic inference model (Baldessari, 1983b; Baldessari, 1985; Baldessari et al., 1983).

Furthermore, an important consequence of the Gaussian assumption stated in (5) should be noted. Let $G_h \in \mathcal{G}$ be a graph with $n$ vertices and $n-h$ maximal cliques, and let $Y_i$ be the observation corresponding to the vertex $i$ belonging to the intersection of the maximal cliques. Let us evaluate the following marginal joint distribution of the random vector $Y_{n-1} = (Y_1, \dots, Y_{i-1}, Y_{i+1}, \dots, Y_n)$:

(17)

where $\Sigma_{n-1}$ is the square
matrix obtained from $\Sigma$ after deleting the $i$-th row and column. Clearly, the conditional independence graph for $f(Y_{n-1})$ is $G_{h-1}$. Indeed, if we integrate $N(0, \Sigma)$ with respect to an observation $Y_j$, with the vertex $j$ not belonging to the intersection of the maximal cliques, we still obtain a $G_h$-equivalent graph with $n-1$ vertices. In general, if we have the joint density of $Y$, corresponding to some conditional independence graph $G_h \in \mathcal{G}$, we may say that under Gaussian distribution hypotheses, integrating the joint density of $Y$ with respect to $Y_i$ produces a marginal density corresponding to the subgraph obtained from $G_h$ when one deletes the $i$-th vertex and the edges having $i$ as an end point. But the algebraic structure of $G_h$ changes only if the vertex $i$ belongs to the intersection of the maximal cliques of $G_h$.

Estimation problems

As an application of the previous results, in this section a particular linear model with dependent errors is considered. More precisely, let us suppose we have observed a data vector $y$, which is assumed to be the realization of a random vector $Y$. Furthermore, let us suppose that $E(Y) = X\beta$, where $X$ is a known $n \times k$ matrix such that $X'X$ is not singular, and $\beta$ is a $k$-dimensional vector of unknown parameters. For the sake of simplicity, we may state a homoscedasticity hypothesis, namely $\sigma_i^2 = \sigma^2$, $i = 1, \dots, n$. In this case the likelihood (12) becomes

(18)

where $k + 2 < n$, $n - k - 2$ being the number of degrees of freedom. The main problem is to find the maximum likelihood estimates for the parameters $\beta$, $a$ and $\sigma^2$. Letting the log-likelihood function be

(19)

where $\eta$
$= (\beta, a, \sigma^2)$, the problem to be solved is the following:

(20)

The simplest procedure for solving (20) is the Gauss-Newton iterative method

(21)

where $p$ is the step number and $H^{(p)}$ is the information matrix, whose $(i, j)$-elements are

(22)

In the case studied, (21) becomes

(23)

where, recalling that $A(a) = aD$,

(24) [the block $H_{\sigma^2, a}$ of the information matrix; its entries involve $(y - X\beta)' D (y - X\beta)$ and $\operatorname{tr}[(I - A(a))^{-1} D]$, but the full expression is not legible in this copy]

The iterative method (21) provides a convergent sequence of estimates, but it needs initial conditions $(\sigma^2)^{(0)}$ and $a^{(0)}$. We may use an arbitrary value in the region

(25)

because it is possible to show that in the set (25) the variance/covariance matrix $\sigma^2 (I - A(a))^{-1}$ is positive definite. In fact, the var/cov matrix is positive definite if its minimum eigenvalue is positive or, in other words, if

(26)

But the Gershgorin disk (Magnus and Neudecker, 1988) of the maximum eigenvalue of $D$ is

(27)

hence

$\{\sigma^2 > 0\} \cap \{a < 1/\lambda_{\max}\} \supseteq \{\sigma^2 > 0\} \cap \{0 < a < \tfrac{1}{n-1}\}$ (28)

or, in other words, the region (25) is a subset of the admissibility region for the parameters $a$ and $\sigma^2$.

Finally, the maximum likelihood estimate $\hat{\eta}$ can be used for defining the concentration ellipsoid

(29)

where $H_{\eta}$
is the information matrix evaluated at $\eta = \hat{\eta}$ and $\chi^2_{k+2}$ is the chi-squared distribution with $k + 2$ degrees of freedom, corresponding to the chosen significance level.

Conclusion

In several concrete cases it is common to observe a series of dependent data. In this paper the dependence among data is supposed to be due to the presence of latent observations. This case was first investigated by Baldessari and Gallo (Baldessari and Gallo, 1982) via the concept of the LSD hypothesis, although their paper did not provide a useful statistical model. With the help of graph theory and random field theory, a new description of the LSD assumption is practicable, which seems very natural and intuitive. The main purpose of such a description is the modelling of the presence of latent data with a statistical approach (Section 2). Here variance/covariance matrices and conditional independence graphs played a central role: in particular, the relationship between var/cov matrices and graphs is brought into focus. Furthermore, we would like to estimate and test the presence of a latent observation in the data set: this is very easy if we use a linear statistical model with dependent errors, as the application in the section on estimation problems shows.

Acknowledgments

This paper was supported by the MURST Research Group 1993 "Analisi dei dati dipendenti".

References

Baldessari, B., 1983a. «Analisi dei dati dipendenti: robustezza completa nel modello lineare». Atti del Convegno sulla Robustezza, Orme, June 1983.
Baldessari, B., 1983b. «Intrinsic Dependence Foundations and Research Approaches» (technical report). Dept. of Statistics, University of Arizona.
Baldessari, B., 1985. «Some Aspects of Intrinsic Inference». Metron, 43 (1-2).
Baldessari, B. and F. Gallo, 1982. «Linear Structural Dependence». Metron, 40 (3-4).
Baldessari, B., G.B. Tranquilli and J. Weber, 1983. «Intrinsic Inference: Review of the Related Literature» (technical report). Dept. of Statistics, University of Arizona.
Baldessari, B. and J. Weber, 1986. «Statistical Models,
Intrinsic Dependence and Intrinsic Inference». Metron, 44.
Besag, J., 1974. «Spatial Interaction and Statistical Analysis of Lattice System» (with discussion). Journal of the Royal Statistical Society, 36.
Cressie, N., 1991. Statistics for Spatial Data. J. Wiley & Sons, New York.
Dobruschin, R. L., 1970. «Prescribing a System of Random Variables by Conditional Distributions» (tr. by A. R. Kraiman). Theory Prob. Appl., 15 (3).
Lagona, F., 1992. «Parametrizzazione di informazioni geografiche in processi gaussiani. Un'ipotesi di covarianza uniforme». Graduation thesis at the Rome University "La Sapienza", supervisor F. Gallo (in Italian).
Magnus, J. R. and H. Neudecker, 1988. Matrix Differential Calculus with Applications in Statistics and Econometrics. J. Wiley & Sons, New York.
Whittaker, J., 1990. Graphical Models in Applied Multivariate Statistics. J. Wiley & Sons, New York.

CLUSTER IDENTIFICATION IN A SIGNED GRAPH BY EIGENVALUE ANALYSIS

A. BELLACICCO
Dipartimento di Teoria dei Sistemi e delle Organizzazioni, Università di Teramo

V. TULLI
Dipartimento di Metodi Quantitativi, Facoltà di Economia, Università di Brescia

In recent years a number of clustering algorithms have been presented. Clustering algorithms can be characterized by a set of choices regarding the constraints and the objective function. Generally speaking, we can embed any clustering algorithm in the wide range of graph transformations, in terms of cuts and insertion of arcs or edges, in order to obtain a given topology of the subgraphs, like cliques (complete subgraphs), circuits, arborescences and so on. The shape of a cluster may be defined in terms of the corresponding topology and can therefore be characterized by the highest eigenvalue of the matrix associated with the graph. In this paper we consider the characterization of clustering algorithms on graphs in terms of eigenvalues. Real eigenvalues are related to balanced subgraphs, where the notion of balanced graph will be considered.

Introduction

Cluster analysis can be considered
as an optimization algorithm able to produce a partition (covering) of a graph $G$ representing the relations between couples of units, under given constraints. Generally speaking, the usual way of treating a set of units, described by a set of variables, is to consider a distance (dissimilarity) between couples of units and to set up a graph, that is, a set of relationships between couples of units weighted by the strength of the relationship, which can be a distance function. The requirements on the weights can be relaxed, and we can have non-negative symmetric functions, like dissimilarity and interaction functions. As is well known, a distance function should satisfy the previous requirements and the so-called triangular inequality. We can generalize the previous notion of weight by considering the set of real numbers, both positive and negative, including the case of zero weight. The common interpretation of these weights (when integers) is the flow of individuals from a given place to another in spatial systems.

A number of clustering algorithms have been presented in recent years, and a common definition considers them as unsupervised classifiers (Bellacicco and Labella, 1979; Bellacicco and Tulli, 1995). It is usual to characterize a clustering algorithm by a set of choices regarding the constraints and the objective function. Generally speaking, we can embed any clustering algorithm in the wide range of graph transformations, in terms of cuts and insertion of arcs or edges, in order to obtain a given topology of the subgraphs, like cliques (i.e. complete subgraphs), circuits, arborescences and so on. In Bellacicco and Coroo (1986) and in Baldassari and Bellacicco (1989) we introduced the notion of the shape of a cluster in terms of statistical constraints on each cluster.

In this paper we consider explicitly a graph $G(S, X)$, $X \subseteq S \times S$, where $S$ is a finite set of vertices and $X \subseteq S \times S$ is a finite set of arcs labelled with a spin value, namely (+1, 0,
-1), where 0 means the absence of the corresponding arc. As weights, spin values on the arcs are very common and useful in thermodynamics, ferromagnetism and sociograms. In the latter case, the value +1 means, e.g., that an individual in a classroom has a positive preference for the other individual joined by the arc, the value -1 meaning a repulsive preference and the value 0 indifference (Phillips, 1967; Roberts, 1976).

The eigenanalysis of matrices with weights (-1, 0, +1) is interesting in so far as the main eigenvalue can give us information about the graph topology, that is, the set of relationships among a set of individuals. The shape of a cluster may be defined in terms of the corresponding topology and can therefore be characterized by the highest eigenvalue of the matrix associated with the graph. In the coming paragraphs we consider the characterization of clustering algorithms on graphs in terms of eigenvalues. Real eigenvalues are related to balanced subgraphs, where the notion of balanced graph is considered in the subsequent paragraph.

Clusters and balanced graphs

Let us consider a graph $G(S, X)$, $X \subseteq S \times S$, where $S$ is a finite set of nodes representing units, and $X$ a finite set of arcs labelled with a spin value (+1, 0, -1), joining couples of nodes. We may consider both symmetric and antisymmetric graphs. In particular, a clique $K_{h,h}$ is a symmetric graph whose cardinality is $h$. We can consider the notion of balanced graph in terms of a distance based on the number of common ancestors of two nodes. We recall some concepts from graph theory for defining the notion of balanced graph.

Definition 1. The neighbour $C(s_i)$ of a node $s_i$ of $G$ is the set of all nodes of $G$ linked by an arc to the node $s_i$.

Definition 2. The dissimilarity $d(s_i, s_j) = d_{ij}$ between two nodes, $s_i$ and $s_j$, of $G(S, X)$ is

$d_{ij} = n_i + n_j - 2 n_{ij}$ (1)

where $n_i$ means the number of nodes connected by an arc to $s_i$ and $n_{ij}$ is the number of nodes connected simultaneously, by an arc, to $s_i$ and $s_j$.

The measure of
dissimilarity introduced by (1) can be easily understood if we compare two strings composed of -1, 0, and +1: by Definition 1, the neighbours $C(s_i)$ and $C(s_j)$ can each be represented by such a vector. The distance between two nodes of the graph is then the Euclidean distance, taken without the final square root, between two vectors whose elements are -1, 0 and +1. This observation supports the interpretation of (1) as the distance between two nodes, with the well-known distance properties. A more general interpretation can be obtained in terms of graph theory: the distance between two units is the number of units belonging to each neighbour minus the elements belonging to both simultaneously. This line of reasoning is quite common in set theory whenever the distance between two sets must be evaluated.

It is easy to see that $d_{ij} \ge 0$ and that $d_{ij} = d_{ji}$, $\forall i, j$. The first property is assured by simple set theoretical arguments, while the symmetry is given by the definition of the number $n_{ij} = n_{ji}$.

A balanced graph, labelled BG, can be defined as follows:

Definition 3. A graph $G(S, X)$ is a BG if and only if $d_{ij}$ is equal for every couple of nodes of $G$.

It is easy to see that a clique $K_{h,h}$ is a balanced graph. Moreover, there are balanced graphs which are not cliques, like, for instance, any cycle whose size is greater than […].

We can introduce the definition of cluster in terms of balanced graph.

Definition 4. A cluster $G'(S, X)$ is a subgraph of $G(S, X)$ which satisfies the properties of a balanced graph.

Balanced graphs are studied in psychology, and a simple example can be given by considering groups of three people, where each individual may either like or dislike each other. A general definition of a balanced subgraph is related to graph circuits (cycles). A small group is balanced if each circuit is positive. This definition implies that we can characterize a cluster of units, whether
general property about the strength of the relationships, both for signed graphs and for the structure of the matrix of the values associated with the arcs. A first generalization may be obtained if we consider cliques rather than circuits, and the characterization of a cluster may be generalized without considering the amount of homogeneity to be maximized, but limiting ourselves to the shape of the signed subgraph, in terms of properties to be satisfied.

We can now consider spin values on the arcs of a graph $G(S, X)$ and review the notions of cliques and balanced graphs. We denote a graph with spin values on the arcs by $G^*(S, X)$.

Definition 5. A clique of a spin graph $G^*(S, X)$ is a clique whose arcs are valued +1.

We denote a clique, as previously defined, as a spin clique $K^*_{h,h}$. We have the following statement:

Proposition 1. The dissimilarity $d_{ij}$ still applies to spin graphs $G^*(S, X)$.

The truth of the previous statement on spin cliques can be easily checked if we consider only the arcs with non-negative values. In order to generalize (1) we can consider the adjacency matrix whose entries are the values of the arcs connecting couples of nodes of $G^*(S, X)$. The adjacency matrix $M(G^*)$ can be associated with the graph by considering $n_i^+$, the number of arcs ingoing to or outgoing from $s_i$ valued +1, $n_i^-$, the number of arcs ingoing to or outgoing from $s_i$ valued -1, and $n_{ij}^0$, the number of arcs connecting the common neighbour of the nodes of the mentioned couple. The arcs valued 0 give a null contribution. We introduce the new terms $d_i$, $d_j$, $d_{ij}$:

(2) [formula not legible in this copy; it combines the terms $n_i^+$, $n_i^-$, $n_j^+$, $n_j^-$ and $n_{ij}$]

where

(3)

thus

(4)

Thus (2) is the distance formula. In the case of a spin clique $K^*_{h,h}$ it is obvious that $d_{ij}$ satisfies the same conditions defining the notion of dissimilarity index. We can also state that in a clique both (1) and (2) are distance functions, whereas in the other cases the triangular inequality does not hold.

We can generalize the definition of a balanced graph by the previous
generalization on the adjacency matrix $M(G^*)$.

Proposition 2. Balanced subgraphs with spin values on the arcs have a constant dissimilarity measure $d_{ij}$ for every couple of nodes.

The previous statement can be verified if we consider the matrix $M(G)$. We did not distinguish between ingoing arcs and outgoing arcs, and equation (2) implies that the degrees of the nodes should have the same algebraic summation of the arcs. Given any graph, a balanced subgraph may be obtained by introducing a threshold function for each node, supposed to be obtained for every node of $G^*(S, X)$. This notion of threshold function is quite new in graph theory, and the problem is to characterize each cluster in terms of balanced subgraphs and an index which depends strictly on the degree of balance of $G^*(S, X)$ (Roberts, 1976).

In the next paragraph we consider the eigenvalues associated with a graph $G^*(S, X)$, and we will see that balanced graphs BG have real eigenvalues while unbalanced graphs could have complex eigenvalues. The transition from balanced graphs to unbalanced graphs can be viewed as a sudden transition in terms of eigenvalues. In other words, the transition from real eigenvalues to complex eigenvalues depends on a small change of the values associated with the arcs.

Eigenvalue analysis of clusters in a graph

In the previous paragraph we took as a general definition of a cluster a balanced subgraph BG of a spin graph $G^*(S, X)$. With this approach, we consider as a cluster a subgraph having some topological properties, without any explicit reference to underlying optimization functions. In other words, the definition of cluster is based on the cluster shape in terms of its topological properties (Roberts, 1976). The optimization problem can be reduced to the search for a minimal set of cuts such that we can obtain either a partition or a covering by a set of cliques. Following the previous definitions, the matrix $M(G)$ is decomposed into a set of submatrices which satisfy the
condition:

(5)

In this case we can define the disequilibrium of a spin graph $G^*(S, X)$ as

$D(G^*) = w_1 + w_2 + \dots + w_k$ (6)

where $k$ is the number of cliques able to represent the graph partitioning. Since each clique can be decomposed into triangular cliques, all having $w = 2$, we have:

$D(G^*) = 2k$ (7)

The disequilibrium $D(G^*)$ can be easily evaluated for a partitioning of $G^*(S, X)$ into $k$ balanced graphs. We have indeed:

$D(G^*) = ck$ (8)

As long as $D(G^*)$ is larger than $ck$, we have fewer than $k$ clusters, and the problem is to isolate the clusters, if they exist.

We now introduce a concept which is quite new in the area of cluster analysis. Besides the notion of cluster in terms of a balanced subgraph of a spin graph $G^*(S, X)$, we consider the notion of inactive subgraph $G^0$, where

(9)

In this case the matrix $M(G^0)$ is asymmetric, because for each couple of nodes we have a couple of arcs with opposite values.

The three notions already introduced in this paper can be considered as three different types of graphs. Considering the associated adjacency matrix, a sharp partition can be made between subgraphs whose associated matrix has real eigenvalues and subgraphs having complex eigenvalues. Actually, a symmetric matrix $M(G')$ whose elements are non-negative has all real eigenvalues and, in particular, the largest one is positive. As a consequence:

Theorem 1. The first eigenvalue of a clique adjacency matrix is positive.

Proof. In fact, for every clique of a spin graph equation (5) holds and the symmetry of the matrix associated with the clique is guaranteed. It is easy to see that in a clique the neighbours of all nodes are equal, and therefore $n_i = n_j$, and from (1)

$d_{ij} = 2n_i - 2n_{ij} = 2n_j - 2n_{ij}$, with $n_{ij} = n_i = n_j$,

and as a consequence $d_{ij} = 0$. In case a clique is embedded in a graph, we have to add the other nodes of the graph which are the neighbours of each node. For a graph which is decomposable into a set of cliques, it is easy to see that $d_{ij}$ has a constant value.

To both unbalanced subgraphs and inactive
subgraphs of a spin graph C*(S,X) asymmetric matrices are associated, and the main eigenvalue can be either negative or complex. Some problems arise when there is just one arc connecting a pair of nodes. In this case we can have some ambiguous results.

In the next paragraph we will consider some experiments on triangles, which are the smallest cliques. Recalling that every clique larger than a triangle can be reduced through a covering by triangles, the problem is to consider the previous equations when triangles are superposed in a larger clique. We state the following theorem:

Theorem. Equation (5) is true when a clique $K_{h,h}$ is covered by a set of triangles $K_{3,3}$.

Proof. Let us consider a clique $K_{h,h}$ and the set of cliques $K_{3,3}$ which cover $K_{h,h}$. For each pair of nodes of each clique $K_{3,3}$ equation (2) holds. The numbers $d_i$ and $d_j$ are equal, respectively, to $n_i$ and $n_j$ and $n_i = n_j$ = = $g_i = g_j$, where $g_i$ is the degree of the node $S_i$ and $g_j$ is the degree of the node $S_j$. The degree of each vertex is equal to 2h - 2, where h is the size of the clique $K_{h,h}$ and the loop associated with the vertex is not considered. The number is equal to 4h - We get easily c = when we have two or more cliques with the same pair of vertices in common we have to sum up the new degrees and we have to consider the new number $n'_{ij}$. The balance of equation (5) is untouched by adding new cliques covering the clique $K_{h,h}$. The same arguments can be used both for balanced spin subgraphs and for inactive spin subgraphs.

Examples and conclusions

We consider here only three matrices, which are sufficient to illustrate the previous statements. The graph matrices and their largest eigenvalues are the following:

MATRIX                     TYPE          MAX EIGENVALUE
[entries not recoverable]  Clique
[entries not recoverable]  Balanced
[entries not recoverable]  Unbalanced    -0.324

It is possible to build up easily the set of all matrices from the set of all graphs composed of triangles and to evaluate their largest eigenvalue. We can see that the
shown examples are samples of this set. Balanced graphs can be built from, for instance, circuits and cycles.

The examples considered give a simple idea of the role of the maximum eigenvalue corresponding to a clique, a balanced graph and an unbalanced graph, respectively. For an unbalanced graph the maximum eigenvalue is negative, while for a clique and for a balanced graph the maximum eigenvalue is a positive number. We can observe that the clique and the balanced graph have the same maximum eigenvalue in spite of their different sizes. The common feature is the presence of a set of circuits whose arcs all carry the same non-negative weight. In the unbalanced case the presence of a single weight -1 is sufficient for a sudden change in the value of the eigenvalue.

The previous results are easy to interpret, and a clique may be identified with a balanced graph. Moreover, the value of the maximum eigenvalue can indicate whether the corresponding subgraph is a balanced graph. The identification of the clusters in a graph implies the search for the square submatrices having positive max/min eigenvalues, and therefore the search may be reduced to a sequence of reorderings of the rows and columns of the matrix associated with the graph and to a sequence of cuts of edges (arcs) in order to isolate the balanced graphs.

From the point of view of the interpretation of a balanced graph, we can see that the absence of negative values, together with the symmetry of the matrix, guarantees the homogeneity of the subgraph elements, which is a basic feature of a cluster. Cluster identification as balanced subgraphs may consider circuits, cliques and some other types of graphs, like balanced graphs, which may be interpreted as a homogeneous group of units, like individuals in a sociogram and elementary physical units in ferromagnetism. In a sociogram a balanced graph means that all the individuals show a preference relationship toward the other individuals and the
topology of the graph is unessential. The presence of matrix elements with a value equal to zero forbids balance, as we can see from the previous examples. Symmetry of the matrix associated with a graph is an essential feature, together with the absence of elements whose value is -1. These features can help in the search for submatrices, within a matrix associated with a graph, which satisfy the previous requirements.

In this paper we limited our observation to the simple fact that, in order to find suitable submatrices, a sequence of cuts of the edges (arcs) with a negative weight is necessary. Our analysis was limited to the evaluation of the maximum eigenvalue, which can be described as a global index of balance. This approach goes beyond the usual way of treating the clustering problem, mainly because we do not limit the study to non-negative weights on the arcs, a usual constraint in cluster analysis. The dissimilarity relationship is generalized in terms of spin (spatial interaction) between a pair of units in a graph.

Acknowledgments

Research supported by 40% "Analisi dei dati spaziali".

References

Baldassarre B. and A. Bellacicco, 1986. «Identification of Linear Regression Models by a Clustering Algorithm». COMPSTAT 1986, Physica-Verlag, Wien.
Bellacicco A. and P. Corb, 1982. «Exponential type clustering algorithm». Randex Metron, XL.
Bellacicco A. and A. Labella, 1979. Le strutture matematiche dei dati. Feltrinelli, Milano.
Bellacicco A. and V. Thlli, 1995. «Clustering dinamico su grafi orientati segno: l'algoritmo CLUSDIN». Submitted for publication.
Phillips J. L., 1967. «A model for cognitive balance». Psychological Review, 74.
Roberts F. S., 1976. Discrete mathematical models. Prentice-Hall, Englewood Cliffs, New Jersey.
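The search outlined in the conclusions (cut the arcs with a negative weight, then examine the remaining subgraphs through their maximum eigenvalue) can be illustrated with a minimal sketch. It assumes a symmetric signed weight matrix. All function names and the example matrix are hypothetical choices made for illustration; they are not taken from the CLUSDIN algorithm or from the experiments reported above. The eigenvalue is estimated by power iteration on M + I, a standard technique swapped in here because the paper does not say how its eigenvalues were computed.

```python
# Illustrative sketch only: hypothetical helper names, not the authors' code.

def cut_negative_arcs(m):
    """Return a copy of the weight matrix with every negative entry set to 0."""
    return [[w if w > 0 else 0 for w in row] for row in m]


def components(m):
    """Connected components of the graph whose arcs are the nonzero entries.

    An arc in either direction counts as a connection between two nodes.
    """
    n = len(m)
    seen, result = set(), []
    for start in range(n):
        if start in seen:
            continue
        seen.add(start)
        stack, comp = [start], []
        while stack:
            i = stack.pop()
            comp.append(i)
            for j in range(n):
                if j not in seen and (m[i][j] != 0 or m[j][i] != 0):
                    seen.add(j)
                    stack.append(j)
        result.append(sorted(comp))
    return result


def submatrix(m, nodes):
    """Square submatrix of m restricted to the given node indices."""
    return [[m[i][j] for j in nodes] for i in nodes]


def is_symmetric_nonnegative(m):
    """True when m equals its transpose and has no negative entry."""
    n = len(m)
    return all(m[i][j] == m[j][i] and m[i][j] >= 0
               for i in range(n) for j in range(n))


def largest_eigenvalue(m, iterations=200):
    """Estimate the largest eigenvalue of a symmetric non-negative matrix.

    Power iteration is run on m + I (the shift avoids oscillation on
    bipartite graphs); the Rayleigh quotient minus 1 is returned.
    """
    n = len(m)
    shifted = [[m[i][j] + (1 if i == j else 0) for j in range(n)]
               for i in range(n)]
    v = [1.0] * n
    for _ in range(iterations):
        w = [sum(shifted[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = max(abs(x) for x in w)
        v = [x / norm for x in w]
    w = [sum(shifted[i][j] * v[j] for j in range(n)) for i in range(n)]
    rayleigh = sum(w[i] * v[i] for i in range(n)) / sum(x * x for x in v)
    return rayleigh - 1.0


# Example: a triangle {0, 1, 2} and a pair {3, 4}, joined by one arc of weight -1.
example = [
    [0, 1, 1, 0, 0],
    [1, 0, 1, 0, 0],
    [1, 1, 0, -1, 0],
    [0, 0, -1, 0, 1],
    [0, 0, 0, 1, 0],
]

for nodes in components(cut_negative_arcs(example)):
    block = submatrix(example, nodes)
    print(nodes, is_symmetric_nonnegative(block),
          round(largest_eigenvalue(block), 6))
```

Each block printed by the loop is a candidate cluster in the sense used above: a symmetric submatrix without negative entries whose maximum eigenvalue is positive. Cutting only the negative arcs is the simplest possible cut sequence; the reordering of rows and columns mentioned in the text is left out of this sketch.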

Date posted: 06/01/2020, 08:41
