Computational network science seeks to unify the methods used to analyze networks across diverse fields. This book introduces the field of network science and lays the groundwork for a computational, algorithm-based approach to network and system analysis.
Computational Network Science
An Algorithmic Approach

Henry Hexmoor

AMSTERDAM • BOSTON • HEIDELBERG • LONDON • NEW YORK • OXFORD • PARIS • SAN DIEGO • SAN FRANCISCO • SINGAPORE • SYDNEY • TOKYO

Morgan Kaufmann is an imprint of Elsevier
225 Wyman Street, Waltham, MA 02451, USA

Copyright © 2015 Elsevier Inc. All rights reserved.

No part of this publication may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopying, recording, or any information storage and retrieval system, without permission in writing from the publisher. Details on how to seek permission, further information about the Publisher's permissions policies and our arrangements with organizations such as the Copyright Clearance Center and the Copyright Licensing Agency, can be found at our website: www.elsevier.com/permissions

This book and the individual contributions contained in it are protected under copyright by the Publisher (other than as may be noted herein).

Notices
Knowledge and best practice in this field are constantly changing. As new research and experience broaden our understanding, changes in research methods, professional practices, or medical treatment may become necessary. Practitioners and researchers must always rely on their own experience and knowledge in evaluating and using any information, methods, compounds, or experiments described herein. In using such information or methods they should be mindful of their own safety and the safety of others, including parties for whom they have a professional responsibility. To the fullest extent of the law, neither the Publisher nor the authors, contributors, or editors, assume any liability for any injury and/or damage to persons or property as a matter of products liability, negligence or otherwise, or from any use or operation of any methods, products, instructions, or ideas contained in the material herein.

British Library Cataloguing-in-Publication Data
A
catalogue record for this book is available from the British Library.

Library of Congress Cataloging-in-Publication Data
A catalog record for this book is available from the Library of Congress.

ISBN: 978-0-12-800891-1

For information on all MK publications visit our website at http://www.mkp.com

PREFACE

The days of the need for gurus and extensive libraries are behind us. The Internet provides ready and rapid access to knowledge for all. This book offers necessary and sufficient descriptions of salient knowledge that have been tested in traditional classrooms. The book weaves together foundations from disparate disciplines including mathematical sociology, economics, game theory, political science, and biological networks.

Network science is a new discipline that explores phenomena common to connected populations across the natural and man-made world. From animals to commodity trades, networks provide relationships among individuals and groups. Analyzing and leveraging connections provides insights and tools for persuasion. Studies in this area have largely focused on opinion attributes. The impetus for this book is a need to examine computational processes for automating tedious analyses and usage of network information for online migration. Once online, network awareness will contribute to improved public safety and superior services for all.

A collection of foundational notions for economic and social networks is available in Jackson (2008). A mathematical treatment of generic networks is presented in Easley and Kleinberg (2010). This book fills a complementary gap by taking an algorithmic approach. I provide a fast-paced introduction to the state of the art in network science. References are offered to seminal and contemporary developments. The book draws on mathematical cogency and contemporary computational insights. It also issues a call to arms for further research on open problems.

The reader will find a broad treatment of network science and a review of key recent phenomena. Senior undergraduates and
professional people in computational disciplines will find sufficient methodologies and processes for implementation and experimentation. This book can also be used as teaching material for courses on social media and network analysis, computational social networks, and network theory and applications. Our coverage of social network analysis is limited; details are available in Golbeck (2013) and Borgatti et al. (2013).

Whereas a teacher is a tour guide to the subject matter, this book is a reference manual. Chapters in each part are related and progress in maturity. Chapters are semi-independent, and a course instructor may choose any order that meets the course objectives. Exercises at the end of each chapter are hands-on student projects designed to cover learning activities during a semester. Some code is provided in appendices for prototyping and learning purposes only. We do not provide a how-to guide to mainstream social media or a codebook for application development; these are available elsewhere.

Henry Hexmoor
Carbondale, IL
2014

REFERENCES
Borgatti, S., Everett, M., Johnson, J., 2013. Analyzing Social Networks. SAGE Publications.
Easley, D., Kleinberg, J., 2010. Networks, Crowds, and Markets. Cambridge University Press.
Golbeck, J., 2013. Analyzing the Social Web. Morgan Kaufmann Publications.
Jackson, M., 2008. Social and Economic Networks. Princeton University Press.

CHAPTER 1
Ubiquity of Networks

1.1 INTRODUCTION

Broadly speaking, a network is a collection of individuals (i.e., nodes) with implicit or explicit relationships among the individuals in the group. The relationships may be strictly physical, as in some sort of physical formation (e.g., pixels of a digital image or cars on the road), or they may be conceptual, such as friendship or some similarity among pairs or within a pair. In an implicit network, individuals are unaware of their relationships, whereas in an explicit network, individuals are familiar with at least their local neighbors.
In certain implicit networks called affinity networks, there is a potential for explicit connections to arise from relationships that account for a projected connection, such as homophily (i.e., similarity) (McPherson et al., 2001).

Biological networks capture relationships among biological organisms. For instance, the neurons of the human brain form a large network called a connectome (Seung, 2012). An ant society is an example of a large biological network (Moffett, 2010). There are many examples of small-scale animal networks, including predators and their prey, plant diseases, and bird migration. Human crowds and network organizations (e.g., government or state agencies, honey grids in bee colonies) are other examples of natural networks. Modern anonymous human networks have capacities for crowd solving of problems (Nielsen, 2012), where a group of independently minded individuals possess a collective wisdom that is available to singletons (Reingold, 2000). Social and political networks model human relationships, where social and political relations are paramount. Economic networks are models of parties connected by economic relationships such as those among buyers (and consumers), sellers (and producers), and intermediaries (i.e., traders and brokers) (Jackson, 2003).

Beyond natural networks, there are myriad synthetic networks. The grid of a photograph is an example of a synthetic network. Nanonetworks are attempts to network nanomachines for emerging nanoscale applications (Jornet and Pierobon, 2011).

A large class of networks consists of complex engineered networks (CENs), which are man-made networks whose topology is neither completely regular nor completely random. A CEN supports evolving functionalities. Examples of CENs are the Internet, wireless networks, power grids with smart homes and cars, remote monitoring networks with satellites, global networks of telescopes, and networks of instruments and sensors from battlefields to hospitals. Time
requirements in CENs range from seconds in cyber-attacks to years in greenhouse gas emissions. Data and control flow in CENs must be managed over connections that could span thousands of miles.

A few synthetic network categories, including CENs, are created intentionally. Here, we list six types:
1. Social networks through networking sites and services
2. Political networks as in parliamentary cabinets and political committees
3. Computer networks that include computers as nodes and how they communicate over local, wide area, and wireless links (e.g., sensor networks)
4. Telecommunication networks as in switches for nodes and respective routing paths
5. Power grids
6. Cellular networks as in cellular base stations and transmission frequencies

There are also many synthetic but unintended network categories. For example, colocated brick-and-mortar businesses may share clientele without intending to. As such, those businesses form a location affinity network. Relationships in affinity networks are only implied and hold only within the affinity context (e.g., colocation). Consumers visiting popular e-commerce sites (e.g., amazon.com) form their own product preference affinity networks. Although pairs of individual consumers may never meet in person, the e-commerce services use affinity networks for data mining and marketing. Individuals sharing like votes (or retweets) are part of an affinity network (or a hashtag) in the context of what they liked (or tweeted).

Figure 1.1 depicts a taxonomy of network types. Exchange networks are those in which a quantifiable entity is exchanged among the nodes, whether that entity is tangible (e.g., natural gas) or intangible (e.g., trust). Relational networks are inert and merely reflect juxtaposition of nodes. All CENs are exchange networks.

Fig. 1.1. A network taxonomy.

Once a network emerges, we can explore interactions within the network. Strategic interactions involve reasoning and deciding over selection of strategies. They can be modeled with game theory, which will be our main focus in Chapter 3. Network theory is a set of algorithms that codifies relationships between network topology and outcomes that are meaningful to network inhabitants. There is a movement afoot that codifies network phenomena under the term network science. These phenomena and salient algorithms will be discussed throughout this book.

An online social networking service (OSNS) creates synthetic networks among people. The salient incentive for using an OSNS is to gain social authority (i.e., legitimacy), which is a form of social power and not generally a measure of vanity. Social authority in social networks is defined with respect to a group and with respect to specific topics. Therefore, social authority is a relative measure and not an absolute quantity. In Section 1.2, we review a few popular OSNSs from a rapidly growing list (Khare, 2012). Since they provide platforms to create, to share, and/or to exchange information and ideas in virtual communities, OSNSs are considered to be a medium for social media. There are quantification schemes over social media, such as Klout, which offers user scores (i.e., a number between 1 and 100). Klout calls this score influence, which is a measure of a user's ability to reach others through an OSNS. This measure is valuable for marketing products online.

In Section 1.3, we review a few popular online bibliographic services (OBS) that house published articles. We return to generic models of networks in Section 1.4. This is followed by a review of popular models of synthetic network generation in Section 1.5. A fully implemented NetLogo model (i.e., code and
accompanying descriptions for use) of network generation models and analysis is available in the Appendix.

1.2 ONLINE SOCIAL NETWORKING SERVICES

Facebook is an OSNS that connects people, organizations, friends, and others who work or live near one another. Nodes in a Facebook network can be individuals or organizations. Some of these may be entirely synthetic, without real-world humans behind them. The main Facebook tool for connections is friendship. Facebook is used largely for personal and recreational functions. As such, it has filled the social gaps created by physical and psychological dispersion among traditional families and friends. It also serves as a medium that creates relationships that would not otherwise exist.

One Facebook feature, known as sharing, allows adjustments to the spread of information (i.e., selecting an audience). Sharing is used to limit who can view posts and photos. It is a three-step process: (1) indicating who you are (i.e., tagging), (2) telling where you live (i.e., adding a location to a post), and (3) managing the privacy rights for where you post (i.e., the inline audience selector). Sharing gives users control over the diffusion of their information, which in turn can yield a measure of social authority. Another Facebook feature, the like, provides a directional relationship (i.e., a tie, connection, or link) that lends credibility to the liked item in proportion to the credibility (i.e., authority) of the endorser.

Twitter is an OSNS that facilitates broadcasts of messages (i.e., tweets). The main Twitter tool for connections is the explicit alignment of ideas among people (i.e., following). Twitter can be used by small or large groups for crowdsourcing. For example, in a small network, a family can stay organized about a travel itinerary even when there are disparate opinions. In a large network, a large social project, such as a protest, can be planned. Twitter can be used to work semi-anonymously with others. Twitter's hashtag (i.e., #) is a
feature for labeling a topic. Anyone may introduce or reuse a hashtag to attract attention. For example, #flight1549 added to a tweet labels the tweet as being about "flight1549." This hashtag labeling facilitates searches related to specific topics. Individuals who use specific hashtags form an implicit network in the context of their hashtags. This feature has been used for commercial marketing and for anonymous coordination of social actions. The range of potential uses for hashtags is enormous, and they have been adopted by other OSNSs such as Facebook.

On the one hand, Twitter can be used for the social organization of crime or dissent. On the other hand, it can be used by law enforcement to predict and mitigate violations of the law. Since Twitter provides democratization of opinion sharing and equal access for dissemination, it is seen as a social equalizer, and as such it might be feared by repressive systems (e.g., government regimes).

Twitter's social authority is composed of three components: (1) the retweet rate of a user's last few hundred tweets, (2) the recentness of those tweets, and (3) a retweet-based model trained on users' profile data. Tagging someone shows the Twitter id to more people, whereas direct messaging someone just puts spam in their inbox, which is generally undesirable. Websites such as Klout.com gauge the influence you have by monitoring things such as how active you are and how often you have been tagged on Twitter. Twitter's lists are a way to organize others into groups. When you click on a list, you retrieve a stream of tweets from all the users included in that group. As a rule of thumb, if you want to develop relationships on Twitter, you should read other tweets, retweet good content, tweet good content, and stay on top of the keywords and interests that you follow. The same advice applies if you want to get retweeted.

LinkedIn is an OSNS that provides an online forum for professional identity management. The main tool for LinkedIn's connections is to link
people who would like to support one another (i.e., connections). LinkedIn allows people to give a weak form of endorsement with regard to specific skills. This creates directional links from endorsers to endorsees. LinkedIn allows a stronger directional endorsement through recommendations. Endorsed individuals' profiles gain social authority via LinkedIn's endorsements and recommendations. Of course, the gained authority is proportional to the authority of those endorsing and recommending.

Pinterest is an OSNS that allows users to create and manage theme-based image collections. Repinning in Pinterest is the feature that creates social authority.

Started in 2011, Whisper.sh is a privately owned mobile OSNS that allows anonymous posts, including photographs. It allows others to like posts, which creates a network with posts as nodes and likes as directional links. Since users are anonymous, the resulting network is implicit.

1.3 ONLINE BIBLIOGRAPHIC SERVICES

DBLP is a computer science bibliography database website hosted at Universität Trier in Germany. It houses a large collection of published articles and offers capabilities for browsing and searching. The resulting database is a network of "author" nodes connected via coauthorship. Through citations, papers form a separate network in which papers are the nodes and citations are the links.

Google Scholar is another bibliography database website, released in 2004 by google.com. It creates networks of authors and papers similar to those of DBLP.

Microsoft Academic Research is an OBS (with a corresponding Windows app) supported by Microsoft.com that offers a service similar to DBLP.

ResearchGate is an independent, privately owned online site founded in 2008 for scientists and researchers to share papers, to ask questions, to answer questions, and to find collaborators. On the one hand, it is an OBS, even though it is far smaller than its rivals. On the other hand, it is an OSNS for professionals.
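
The two networks described above for a bibliographic service can be sketched in a few lines of Python. This is a minimal illustration only, not DBLP's actual data model or API: the paper identifiers, the author names, and the helper functions coauthorship_network and citation_network are hypothetical and introduced just for this example. Authors become nodes joined by weighted coauthorship links, and papers become nodes joined by directed citation links.

```python
from collections import defaultdict
from itertools import combinations


def coauthorship_network(papers):
    """Build weighted, undirected coauthorship links.

    papers: dict mapping a paper id to its list of author names.
    Returns a dict mapping a sorted (author, author) pair to the
    number of papers the two authors wrote together.
    """
    weights = defaultdict(int)
    for authors in papers.values():
        # Every unordered pair of distinct authors on a paper is one link.
        for a, b in combinations(sorted(set(authors)), 2):
            weights[(a, b)] += 1
    return dict(weights)


def citation_network(citations):
    """Build directed citation links.

    citations: iterable of (citing_paper, cited_paper) pairs.
    Returns a dict mapping each citing paper to a sorted list of
    the papers it cites.
    """
    out_links = defaultdict(set)
    for citing, cited in citations:
        out_links[citing].add(cited)
    return {paper: sorted(cited) for paper, cited in out_links.items()}


# Hypothetical sample data, invented for illustration.
papers = {
    "p1": ["Ada", "Bob"],
    "p2": ["Bob", "Ada", "Cy"],
    "p3": ["Cy"],
}
citations = [("p2", "p1"), ("p3", "p1"), ("p3", "p2")]

coauthors = coauthorship_network(papers)
cites = citation_network(citations)
print(coauthors)
print(cites)
```

The coauthorship links are undirected and weighted by the number of joint papers, while the citation links are directed from the citing paper to the cited paper. A single-author paper such as p3 contributes no coauthorship link.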

1.4 GENERIC NETWORK MODELS

In this section, we review four of the most popular generic network models. In contrast to the descriptive models in this section, Section 1.5 will offer algorithms for artificially generating networks.

Network Games

Fig. 3.6. Nash equilibrium algorithm (adapted from Han et al., 2012).

Fig. 3.7. A security game payoff bimatrix (Tambe, 2012).

symmetry and are not zero-sum games as shown in Figure 3.7. Before acting, the second player will observe the choices of the first player. Recently, security games between the police and the bad guy have been modeled. Bad guys could be of several types (i.e., bad guys have access to a diverse set of strategies) unknown to the police. This is modeled using the Bayesian Stackelberg model (Tambe, 2012). Tambe's group has demonstrated security games with airport security.

3.6 CONCLUSION

This chapter covered the basics of game theory (GT) and a few emerging applications in communication networks, including mobile, wireless, P2P, and ad hoc networks. Due to its solid mathematical underpinnings, the application of GT to social, economic, and communication networks is a promising domain, and we expect far more developments in the near future (Jackson, 2008; Antoniou and Pitsillides, 2013).

REFERENCES
Antoniou, J., Pitsillides, A., 2013. Game Theory in Communication Networks. CRC Press, USA.
Aumann, R., 1976. Agreeing to disagree. Ann. Stat. (6), 1236–1239.
Axelrod, R., 2006. The Evolution of Cooperation. Basic Books.
Barron, E., 2008. Game Theory: An Introduction. Wiley.
Broom, M., Rychtar, J., 2013. Game-Theoretical Models in Biology. Chapman and Hall/CRC Press.
Dawkins, R., 1976. The Selfish Gene. Oxford University Press.
Fudenberg, D., 1991. Game Theory. MIT Press.
Fudenberg, D., Levine, D., 1998. The Theory of Learning in Games. The MIT Press.
Gintis, H., 2009. Game Theory Evolving: A Problem-Centered Introduction to Modeling Strategic Interaction. Princeton University Press.
Han, Z., Niyato, D., Saad, W., Basar, T., Hjorungnes, A.,
2012. Game Theory in Wireless and Communication Networks: Theory, Models, and Applications. Cambridge University Press.
Hexmoor, H., 2011. On strategic coordination. Int. J. Comput. Intell. Theory Pract. (1), 41–49.
Jackson, M., 2008. Social and Economic Networks. Princeton University Press.
Kelly, F., Voice, T., 2005. Stability of end-to-end algorithms for joint routing and rate control. Comput. Comm. Rev. 35 (2), 5–12.
Mas-Colell, A., Whinston, M., Green, J., 1995. Microeconomic Theory. Oxford University Press.
Maynard-Smith, J., 1982. Evolution and the Theory of Games. Cambridge University Press.
Menache, I., Ozdaglar, A., 2011. Network Games. Morgan Claypool Publishers.
Myerson, R., 1991. Game Theory: Analysis of Conflict. Harvard University Press.
Myerson, R., 1997. Game Theory: Analysis of Conflict. Harvard University Press.
Nisan, N., Roughgarden, T., Tardos, E., Vazirani, V. (Eds.), 2007. Algorithmic Game Theory. Cambridge University Press.
Osborne, M., Rubinstein, A., 1994. A Course in Game Theory. MIT Press.
Rosenthal, R., 1973. A class of games possessing pure-strategy Nash equilibria. Int. J. Game Theory (1), 65–67.
Roughgarden, T., 2005. Selfish Routing and the Price of Anarchy. MIT Press.
Rubinstein, A., 1979. Equilibrium in supergames with the overtaking criterion. J. Econ. Theory 21, 1–9.
Saari, D., 2001. Decisions and Elections: Explaining the Unexpected. Cambridge University Press.
Tambe, M., 2012. Security and Game Theory. Cambridge University Press.
Vincent, T., 2005. Evolutionary Game Theory, Natural Selection, and Darwinian Dynamics. Cambridge University Press, pp. 72–87.
von Stackelberg, H., 2011. Market Structure and Equilibrium (Bazin, D., Hill, R., Urch, L., Translators). Springer.

EXERCISES
1. How is game theory being applied to the vision of "smart grid" power distribution?
2. Develop techniques for the application of cooperative game theory in cooperative neighborhoods such as units in war theaters and sensor and camera networks.
3. Are there limits to using game theory in communication networks?
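
As a possible starting point for these exercises, the leader-follower structure of the security games in this chapter, where the second player observes the first player's choice before acting, can be sketched as follows. This is a deliberately simplified illustration restricted to pure strategies and a single attacker type. It is not the Bayesian Stackelberg model of Tambe (2012), and the payoff values and the function name stackelberg_pure are hypothetical. Ties in the follower's best response are assumed to be broken in the leader's favor.

```python
def stackelberg_pure(leader_payoff, follower_payoff):
    """Pure-strategy leader commitment in a two-player bimatrix game.

    leader_payoff[i][j] and follower_payoff[i][j] are the payoffs when
    the leader plays row i and the follower plays column j.
    Returns (leader_row, follower_column, leader_value).
    Assumption: the follower breaks ties in favor of the leader.
    """
    best = None
    for i, (l_row, f_row) in enumerate(zip(leader_payoff, follower_payoff)):
        # The follower observes row i and picks a best-response column.
        top = max(f_row)
        candidates = [j for j, value in enumerate(f_row) if value == top]
        j = max(candidates, key=lambda col: l_row[col])
        # The leader keeps the row whose induced outcome is best for it.
        if best is None or l_row[j] > best[2]:
            best = (i, j, l_row[j])
    return best


# Hypothetical, non-zero-sum payoffs: rows are the defender's choices
# (guard target 0 or target 1), columns are the attacker's choices.
defender = [[2, -2],
            [-3, 1]]
attacker = [[-1, 3],
            [4, -2]]

result = stackelberg_pure(defender, attacker)
print(result)
```

With these invented payoffs, the defender commits to guarding target 0, the attacker responds by attacking target 1, and the defender's value is -2, which is still better for the defender than the -3 it would obtain by guarding target 1.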

CHAPTER 4
Balance Theory

Social harmony (i.e., freedom from contention) is an important concept in social networks. Contention arises from implicit relationships. Simmel suggested that among three nodes A, B, and C, if AB and AC are strong ties, then B and C would eventually form a tie, a process that has been termed triadic closure (Simmel, 1950; Easley and Kleinberg, 2010). Heider formulated these ideas in his social balance theory (Heider, 1958). Let there be two kinds of dyadic links: like, denoted by +, and dislike, denoted by −. The rule of an even number of enemy links summarizes balance theory in signed networks (van de Rijt, 2011). For example, with only three nodes A, B, and C, each of the following four balanced structures (S1–S4) has an even number of negative links, and the links are enduring, that is, there is no need for change:

S1. AB: +, AC: −, BC: −
S2. AB: −, AC: −, BC: +
S3. AB: −, AC: +, BC: −
S4. AB: +, AC: +, BC: +

In contrast, the structures S5–S8 have an odd number of negative links (exactly one, or all three) and are unbalanced, so some negative links may turn positive and vice versa:

S5. AB: +, AC: +, BC: −
S6. AB: +, AC: −, BC: +
S7. AB: −, AC: +, BC: +
S8. AB: −, AC: −, BC: −

The even-number-of-enemy-links rule is applicable to network cycles of any size: a cycle is balanced when it contains an even number of negative links (Cartwright and Harary, 1956). The ratio of balanced cycles to the total number of cycles in a network might be a useful clue about network balance. It has been argued that even a single unbalanced cycle might unbalance the entire network (Bonacich and Lu, 2012). Determining which relationship to alter is the idea of conflict resolution in networks (Bonacich and Lu, 2012).

4.1 CONCLUSION

Balance theory is an important sociological theory applied to graphs. It has been widely studied in public policy research (O'Toole, 1997; Piqueri and Hickman, 2006; Doreian and Mrvar, 2009; Ilany et al., 2013).
Applications and extensions of this theory remain an active research topic in graph theory as well as mathematical sociology (Doreian and Mrvar, 2009).

REFERENCES
Bonacich, P., Lu, P., 2012. Introduction to Mathematical Sociology. Princeton University Press, Princeton.
Cartwright, D., Harary, F., 1956. Structural balance: a generalization of Heider's theory. Psychol. Rev. 63, 277–293.
Chiang, K., Hsieh, C., Natarajan, N., Tewari, A., Dhillon, I., 2014. Prediction and clustering in signed networks: a local to global perspective. J. Machine Learn. Res. 15 (1), 1177–1213, MIT Press.
Doreian, P., Mrvar, A., 2009. Partitioning signed social networks. Soc. Netw. 31, 1–11.
Easley, D., Kleinberg, J., 2010. Networks, Crowds, and Markets: Reasoning About a Highly Connected World. Cambridge University Press.
Heider, F., 1958. The Psychology of Interpersonal Relations. Wiley Press.
Ilany, A., Barocas, A., Koren, L., Kam, M., Geffen, E., 2013. Structural balance in the social networks of a wild mammal. Anim. Behav. 85, 1397–1405.
O'Toole, L., 1997. Treating networks seriously: practical and research-based agendas in public administration. Public Adm. Rev. 57 (1), 45–52.
Piqueri, A., Hickman, M., 2006. An empirical test of Tittle's control balance theory. J. Criminol. 37 (2), 319–342, Wiley.
Simmel, G., 1950. The Sociology of Georg Simmel. Free Press (compiled and translated by Kurt Wolff).
van de Rijt, A., 2011. The micro–macro link for the theory of structural balance. J. Math. Sociol. 35 (1–3), 94–113.

EXERCISES
1. Apply social balance theory to a network of countries in the EU or the Middle East.
2. Which dyadic relation can be fixed to produce more balance in the network of Question 1? How?
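
As a possible starting point for these exercises, the rule of an even number of negative links can be sketched in Python. This is a minimal illustration under stated assumptions: it examines only three-node cycles (triads) rather than cycles of every length, the node names and link signs are hypothetical, and the function names is_balanced and triad_balance are introduced here for illustration.

```python
from itertools import combinations


def is_balanced(signs):
    """A cycle is balanced when it has an even number of negative links.

    signs: iterable of +1 (like) and -1 (dislike) values, one per link.
    """
    negatives = sum(1 for s in signs if s < 0)
    return negatives % 2 == 0


def triad_balance(nodes, sign):
    """Count balanced triads among all fully linked triads.

    sign: dict mapping frozenset({u, v}) to +1 or -1.
    Returns (balanced_triads, total_triads). Only 3-cycles are checked,
    which is a simplification of the general cycle condition.
    """
    balanced = total = 0
    for a, b, c in combinations(nodes, 3):
        links = [frozenset((a, b)), frozenset((a, c)), frozenset((b, c))]
        if all(link in sign for link in links):
            total += 1
            if is_balanced(sign[link] for link in links):
                balanced += 1
    return balanced, total


# Hypothetical signed network on four nodes.
nodes = ["A", "B", "C", "D"]
sign = {
    frozenset(("A", "B")): +1,
    frozenset(("A", "C")): -1,
    frozenset(("B", "C")): -1,
    frozenset(("A", "D")): +1,
    frozenset(("B", "D")): +1,
    frozenset(("C", "D")): +1,
}

counts = triad_balance(nodes, sign)
print(counts)
```

In this invented network, triads ABC and ABD are balanced, whereas ACD and BCD each contain exactly one negative link and are unbalanced. Flipping the sign of the CD link to −1 would balance all four triads, which is the kind of single-relation repair that Question 2 asks about.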

CHAPTER 5
Network Dynamics

In many real-world networks, properties of nodes and links change over time. These changes yield evolutionary (or natural) network dynamics (END) (Roth and Cointet, 2010). In many open networks (e.g., the Internet), nodes may enter or exit asynchronously. In many computer networks, nodes may crash or recover. These rapid changes result in volatile network dynamics (VND). In other networks, nodes or supervisors may take deliberate action to change the structure of the network, which we may call driven network dynamics (DND). In this chapter, we review some common methods for modeling these three types of dynamics.

5.1 EVOLUTIONARY AND VOLATILE NETWORK DYNAMICS

Depending on the behavior of nodes, there are possibilities of forming new relations. Time can be used to indicate relations. If a fraction of the nodes in the network is engaged in a game, there is a chance of influencing other nodes in the network over time (Snijders and Doreian, 2010).

Elder's life course theory was developed in the 1960s for analyzing people's lives in terms of structural, social, and cultural contexts. The theory examines individual histories and observes how early events influence future decisions such as marriage, divorce, and brand loyalty. A life course is defined as a sequence of socially defined events and roles that individuals enact over time. The method encompasses observations from history, sociology, demography, and other fields. In particular, the theory focuses on the connections among individuals as well as the historical and socioeconomic contexts in which these individuals live.

Detecting communities in static networks is an emerging research area (Newman, 2004, 2010). Current algorithms are limited to static networks. With faster algorithms, detecting communities in dynamic social networks (DSN) will rapidly capture changes in the network, which can be used for prediction and detection of plots as well as for its explanatory power used in social sciences (Hutchison,
2007).

Let t0, t1, t2, … be a sequence of time points. The interval between any two consecutive time points is about five years. At t0, we assume a man and a woman live by themselves. At each time point, the social network is represented as a graph (G). Assume that the couple gets married and forms a family. After n time points, we have n sets of consecutive social networks. The distance between any two vertices in such a network is called a distance metric (Tang et al., 2010). The distance metric can describe the social life of individuals in the DSN. A time-varying graph is a discrete sequence, that is, an ordered set G1, G2, …, GT of undirected or directed graphs, where T is the length of the sequence.

There are other ways that time may play a role in network formation and change. Not only do links come and go, but there are also changes within links. We can represent a social network in a time graph by means of temporal windows, where each window is a snapshot of the network state at a specific time interval. Alternatively, time is an attribute that can be incorporated in changes inside nodes. If the time values for two nodes overlap several times in certain scenarios, we can say that the two nodes have temporal affinity. Consider a real-time social networking site. Today there might be 100 people online in some movie communities. We may say that they like certain movies at that time. Tomorrow the number of people might change. If we observe the number of people within the same time period, we can draw the graph over a certain period and analyze nodes in the movie group. In the same manner, if we analyze node 1's affinities for a certain time period, we can identify node 1's contribution to the formation of the network, that is, the links it will form with other nodes with whom it might share homophily.

Unlike the physical data of natural science, social science data are constructed by meanings, motives, definitions, and classifications. This means
that the production of social science data involves a process of interpretation. During the interpretation process, social scientists construct distinct types of data. Each data type requires its own appropriate methods of analysis. The principal types of data are attribute data and relational data.

Attribute data relate to the attitudes, opinions, and behavior of agents, insofar as these are regarded as properties, qualities, or characteristics that belong to individuals or groups. The items collected through surveys and interviews, for example, are often regarded simply as the attributes of particular individuals, which can be quantified and analyzed through many available statistical procedures. The methods appropriate to attribute data are those of variable analysis, whereby attributes are measured as values of particular variables (e.g., income, occupation, education).

Relational data include contacts, ties, connections, group attachments, group meetings, etc., which relate one agent to another, so these data cannot be reduced to the properties of individual agents themselves. Relations are not the properties of individuals, but of systems of agents. These relations connect pairs of agents into larger relational systems. The methods appropriate to relational data are those of network analysis, whereby the relations are treated as expressing the linkages that run between agents. While it is possible to undertake quantitative and statistical counts of relations, network analysis consists of a body of qualitative measures of network structure.

Although there are distinct types of data, each with its own appropriate methods of analysis, there is nothing specific about the methods of data collection that can be used to produce them. There is no approach that distinguishes methods for the collection of attribute data from those for the collection of relational data. The two types of data are often collected together as integral aspects of the same
investigation. A study of political attitudes, for example, may seek to link these attributes to group memberships and community attachments, or an investigation of interlocking directorships may seek to link them to the size and the profitability of the companies involved.

Relational data are central to the principal concerns of the sociological tradition, with its emphasis on the investigation of the structure of social action. Structures are built from relations, and the structural concerns of sociology can be pursued through the collection and analysis of relational data. Paradoxically, most of the existing texts on research methods and methods of data collection give little attention to this type of data, concentrating instead on the use of variable analysis for the investigation of attribute data. The formal, mathematical techniques of social network analysis, the methods that are specifically geared to relational data, have been developed and discussed outside the mainstream of research methods. While the techniques have made a number of spectacular breakthroughs in structural analysis, they have been largely inaccessible to many of those who would most wish to use them.

Classical studies on social networks focus on static representations of nodes (i.e., humans) and edges (i.e., relationships), resulting in sociograms. By static, we mean that all the relationships are assumed to occur at the same time (see Figure 5.1a). Often networks change radically, as shown in the progression from Figure 5.1a to c over a 36-year time span. In a weather forecasting system, weather patterns consisting of time series of recently collected data are fed into a computer model to predict the weather patterns for the next 5–15 days. In the same way, the interactions among actors in a network are based on "communication acts" stored in their communication archive. Imagine your daily pattern of friends and strangers
whom you meet on the train, in the office, and on a night out. These contacts occur at different times with different frequencies and durations. To visually capture this temporal information, we use a temporal graph, which shows a snapshot of the connections across a time window. For example, consider a network that has six nodes, that is, A, B, C, D, E, and F, and three time intervals, shown in Figure 5.2. A static graph assumes that all the contacts occur simultaneously, as shown in Figure 5.3, in contrast to the temporal information in Figure 5.2.

How many friends does it take to spread a rumor? The static graph of Figure 5.3 shows that a rumor from node A would take at least four hops to reach node F through the path ADCEF. From the temporal graph, we see that the contact between D and C occurs in the wrong order to facilitate this static path.

In a social system, there is a strong chance that a friend of a friend is also a friend. The clustering coefficient captures these transitive relationships by measuring how many of the possible triangles around a node are actually present in the graph. For example, node A has friends B and D, and these friends are also friends. Therefore, the clustering coefficient of node A is 1.0. Static graphs assume these relationships are binary, that is, they exist at all times. In reality, friendships and meetings are not constant. Friends come and go, and you meet different friends at different times at varying frequencies.

Fig. 5.1. Three snapshots of a simple family network over a 36-year time span (pictures are images of a prototypical family). (a) A sociogram with a father, a mother, and a child; (b) the same sociogram after 30 years, at largest size, with the father, the mother, and the five children; (c) the same sociogram after 36 years, when several family members are no longer alive.

Fig. 5.2. Time graph of link formation for the six friends' example.

Fig. 5.3. Static network of six friends' example.

Temporal tracking of relations illuminates temporal issues and records changes in communities over time. How long does it take to spread gossip to your friends? The window reached is the real time of delivery; that is, if each window is 1 hour, node A can reach node C in 3 hours. To model temporal changes in links, we can average the actual friendship strength per time window. If each window is one day in the temporal graph, the friends of node A (B, D) meet on only one of the three possible days. In a temporal space, there are more disconnected node pairs than in a static graph. To handle this, we use the inverse of the temporal path length. In this case, the temporal path length between nodes A and F is infinite, and its inverse is 1/∞ = 0.

Nodes (i.e., individuals) are defined by their demographic attributes, for example, name, age, sex, and location. We can add a time attribute to the nodes. If the time values of two nodes overlap several times in a certain community, we can say that these two nodes have temporal affinity. For example, consider a real-time social networking site such as Facebook or Flickr. Let us assume that there are 100 people online in some movies community today. Then we can say that they like movies at that time. Tomorrow the number of people might change. If we observe the number of people for the same time attribute value and draw the graph over certain periods, we could analyze the nodes in the movies community and, most likely, extend the analysis to larger communities and eventually to the whole network.

We will next examine styles of modeling networks involving times, places, and individuals. The spatiotemporal–semantic model considers the visit activity of people to specific places at specific times (intervals). For example, let four persons A, B, C, and D visit three places P1, P2, and P3. A place means a semantic location, which is visited by people. For instance, a conference is an example of a semantic place where researchers gather and
exchange their work and thoughts. The name of a locality and a ZIP code are other examples of semantic locations. The places can be different geographic locations or can be the same location with different time intervals of visits. Each visit activity is represented by a visitor and a unique activity ID per person, for example, A.1.

In a people-to-place model, an individual and a place have a relationship if the person visits the place in a time interval. Bipartite graphs are often used to model people and place relations. In a bipartite graph, vertices are divided into two disjoint sets, that is, a set of persons and a set of places. An edge may be labeled by an activity event to link its persons and places. A bipartite network can be transformed into a one-mode social network that depends on the emphasis of a specific type of interaction, that is, people to people or place to place. In a people-to-people model, the people are linked in a social network based on their common visits to places. The number of common visits can be used as the weight of each link. For example, persons A and B are linked with a weight equal to the number of their common visits to places P1 and P2. In a place-to-place network, two places are connected if they share at least one visitor.

Human social networks are characterized by rich variation at the individual level. Some people have few friends, whereas others have many. Some people are embedded in tightly knit groups, where everyone knows each other, whereas others belong to many different groups, where there is little overlap among friends. To explain this variation, scholars have sought simple models of network formation that generate an empirically realistic distribution of network characteristics as an endogenous outcome of a self-organizing process. The best-known network formation models start with identical individuals who are subjected to social processes that create or exacerbate dissimilarity in a network. For example, in the scale-free
physics model, it is the process of growth and, in particular, preferential attachment that drives the self-organizing feature of the power-law distribution in the degree. In the economic connections model, individuals who are homogenous ex ante endogenously form a star network when actors obtain indirect network benefits and when they are driven by short-run economic incentives. In sociology, actors' preferences for structural balance and homophily tend to stimulate transitivity in social relationships and the formation of like-minded cliques.

Although the structural processes in these models generate empirically realistic variation in some network attributes, the effect of individual characteristics has been mainly ignored. There have been extensions to the canonical models that take into account individual heterogeneity, but these models are usually presented as robust versions of the original models, in which the focus is still on the endogenous process. In this book, we focus on the individual characteristics themselves and explore the possibility that humans are endowed with traits that affect their network attributes. Our most intrinsic characteristics can be found in our genes.

5.2 TIME GRAPHS

Time graphs extend the traditional notion of an evolving directed graph that captures link creation as a point phenomenon in time. We address this question through creation and analysis of a series of snapshots of the data. The development of tools and methods to analyze these snapshots is therefore a timely endeavor. Blogspace offers an additional technical advantage over such approaches: if data is recrawled with a certain frequency, there is no notion of the precise point in time when a page or link was created or updated. In contrast, blogspace offers a ready-made view of evolution in continuous time. As each blog adds an entry together with links, there is a time stamp associated with that event. By automatically
extracting these time stamps, we can piece together a view of blogspace evolving continuously from the beginning of blog archiving to the present. We should stress that time is absolute and not merely relative as in a sequence of crawls.

We focus on connectivity evolution and on temporally concentrated bursts (i.e., Kleinberg's conception) in the evolution of blogspace (Kleinberg, 2002). Within a community of interacting bloggers, a given topic may become the subject of intense debate for a period of time and then fade away. These bursts of activity are typified by heightened hyperlinking among the blogs involved within a time interval. Note that a subgraph indicating a community of interest in the traditional sense may exist among a set of blogs without ever achieving this temporal focus. Conversely, heavy linkage within a short period may appear less significant when viewed over a long time span, which suggests that the criterion for inferring that a pattern of links is a community is less stringent than for a static graph.

We extract dense subgraphs from the blog graph; these correspond to all potential communities, whether or not they are bursty (Kumar et al., 2003). Building on the work of Kleinberg on bursts in event streams, we perform a burst analysis of each subgraph in order to identify and rank bursts in these communities. If we can assume networks change due to events free from history, we can use Markov chains to model network change, which is covered in the next section.

5.3 MARKOV CHAINS

We can represent the topology in a network of n nodes as an n × n adjacency matrix $A$. If $A$ varies over distinct time points (say time point t), each snapshot of the network will be denoted by $A_t$. Furthermore, if we can make the Markovian assumption between pairs of successive topologies, that is, given the present state, the future and past states are independent (in other words, history does not affect state transitions), the temporal sequence of network topologies forms a
Markovian chain (MC) (Norris, 1998). An MC is a directed graph, where the edges are labeled by the probabilities of going from one state to the other states (i.e., network topologies are treated as random variables). Let $T$ be an n × n Markovian transition matrix with entry $t_{ij}$ denoting the probability of transition from state i to state j. Then, Equation 5.1 is the temporal update function among temporal network topologies. MCs have well-defined properties that can describe landmark states for a network.

$A_{t+1} = A_t \times T$ (5.1)

$T^k$ denotes the kth successive transition. For large k, $T^k$ converges to the equilibrium transition values shown in Equation 5.2:

$\lim_{k \to \infty} T^k$ (5.2)

5.4 STRATEGIC NETWORK PARTNERING USING MARKOV DECISION PROCESSES

Let there be a network of I nodes (i.e., network vertices) and L dyadic links (i.e., network edges) at time t. Let each agent maintain a set of directed links (e.g., "friend-" or "follower-") with others. We consider the status of agent i's links at a time t as a state $S_i^t$. An agent may perform one of two actions, adding or deleting a link, that deterministically transitions the agent from one state to another state. If the agent has σ existing links, it will have σ available "delete" actions and I − σ available "add" actions, totaling σ + (I − σ) = I actions. If the target agent reciprocates an add link, a dyadic link is established (i.e., actualized). Otherwise, the link is directional (i.e., aspirational). If a delete link is reciprocated by the target agent, thereby deleting the link, its dyadic link will break. Otherwise, a directional (i.e., aspirational) link will remain. Since the actions of target agents are unknown, we can model the result of link status probabilistically. With the addition or subtraction of each link, an agent's connectivity network topology changes, which creates a new state (i.e., status) for the agent. If two agents reciprocate directed links, a dyadic link between them is
created. Each agent is involved in I + 1 states, where I states are reached from I actions. The possible I + 1 states for agent i, denoted by $M_i$, form a Markovian state space since a state change is a deterministic function of link change determined by an agent. An agent will value states that give it better structural properties (e.g., a centrality measure). $m_i$ is a function that assigns a utility to each state for player i. Putting it together, $\langle M_i, A_i, m_i \rangle$ is a Markov decision process (MDP) for player i (Howard, 1960). Player i can solve its own Markov process for an optimal policy denoted by $\pi_i^*$. Since there are I agents and each engages with a corresponding MDP, collectively there are $I \times (I + 1)$ states.

Meanwhile, $\langle I, \mathring{A}, U \rangle$ is a game theoretic game among I agents, where $\mathring{A}$ is the union of all players' actions (i.e., their strategies). $\mathring{A}$ contains $I^2$ strategies. $U$ is the utility function that assigns payoffs to players against players' action profiles. This is a Bayesian game since agents do not know about other agents' patterns of behavior (i.e., types). We assume external actions that trigger probabilistic agents' action selection that transitions them among states. The collection of all transition probabilities can be cast into a stochastic transition matrix leading to an MC that is irreducible, aperiodic, recurrent, and ergodic (Strook, 2005).

5.5 CONCLUSION

Network properties change over time. Models that capture those changes are models of network dynamics. We outlined a variety of methods for modeling dynamics. Clearly, this is only the genesis of rich models to come.

REFERENCES

Corten, R., Buskens, V., 2010. Co-evolution of conventions and networks: an experimental study. Soc. Netw. 32 (1), 4–15.
Elder, G., 1998. The life course as developmental theory. In: Child Development, Vol. 69. Wiley, New Jersey, pp. 1–12.
Howard, R., 1960. Dynamic Programming and Markov Processes. MIT Press.
Hutchison, E., 2007. Dimensions of Human Behavior: Person and Environment Trade, 3rd edition.
Sage Publications, CA (paperback).
Kleinberg, J., 2002. Bursty and hierarchical structure in streams. In: Proceedings of the 8th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM Press, New York, NY.
Kumar, R., Novak, J., Raghavan, P., Tomkins, A., 2003. On the bursty evolution of blog space. In: The 12th Annual World Wide Web Conference, Budapest, Hungary. ACM Press, New York, NY.
Newman, M., 2004. Detecting community structure in networks. Eur. Phys. J. B 38, 321–330.
Newman, M., 2010. Networks: An Introduction. Oxford University Press.
Norris, J., 1998. Markov Chains. Cambridge University Press.
Roth, C., Cointet, J., 2010. Social and semantic coevolution in knowledge network. Soc. Netw. 32 (1), 16–29.
Snijders, T., Doreian, P., 2010. Introduction to the special issue on network dynamics. Soc. Netw. 32 (1), 1–3.
Strook, D., 2005. An Introduction to Markov Processes. Springer.
Tang, J., Musolesi, M., Mascolo, C., Latora, V., 2009. Temporal distance metrics for social network analysis. In: Proceedings of the 2nd ACM Workshop on Online Social Networks. ACM Press, New York, NY, pp. 31–36.
Westaby, J., 2011. Dynamic Network Theory: How Social Networks Influence Goal Pursuit. APA Press.

EXERCISES

How can volatile networks be used to model communication networks?
Sketch a time graph for a historically important social network, for example, presidential elections.
Design a 2 × 2 probability matrix for a coordination game (e.g., a game of chicken). Treat the matrix as a Markov chain and predict the long-term distribution of states.
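As a starting point for the last exercise, and as an illustration of the limit in Equation 5.2, the following is a minimal sketch of how the limit of $T^k$ might be approximated by repeated multiplication. The matrix entries, the function names `mat_mul` and `limit_of_powers`, and the tolerance are assumptions chosen for illustration; they do not come from the text.

```python
def mat_mul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def limit_of_powers(T, tol=1e-12, max_steps=10_000):
    """Approximate lim T^k by multiplying until the entries stop changing."""
    current = T
    for _ in range(max_steps):
        nxt = mat_mul(current, T)
        change = max(abs(x - y)
                     for row_n, row_c in zip(nxt, current)
                     for x, y in zip(row_n, row_c))
        if change < tol:
            return nxt
        current = nxt
    raise ValueError("T^k did not settle; the chain may be periodic")


# Hypothetical row-stochastic matrix: T[i][j] = P(state i -> state j).
T = [[0.9, 0.1],
     [0.5, 0.5]]

T_inf = limit_of_powers(T)
long_run = T_inf[0]  # every row of the limit holds the same distribution
print(long_run)
```

For an irreducible, aperiodic chain such as this one, every row of the limit is the same long-run distribution, so the starting state no longer matters. For a periodic chain the powers keep oscillating, and this sketch raises an error rather than returning a misleading answer.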
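The contrast drawn earlier in this chapter between static hops and temporal reachability in the six-friends example can also be sketched in code. The contact list below is invented for illustration and does not reproduce Figure 5.2; it is only chosen to agree with the statements in the text (four static hops from A to F, A reaching C in the third window, and no temporal path from A to F). The rule that information crosses at most one contact per window is likewise an assumption of this sketch.

```python
import math
from collections import deque


def earliest_arrival(windows, source):
    """Earliest window in which each node can be reached from source.

    Assumes information crosses at most one contact per time window.
    """
    arrival = {source: 0}
    for w, contacts in enumerate(windows, start=1):
        informed = set(arrival)  # nodes informed before this window opens
        for u, v in contacts:
            for a, b in ((u, v), (v, u)):
                if a in informed and b not in arrival:
                    arrival[b] = w
    return arrival


def static_hops(windows, source):
    """Breadth-first hop counts when all contacts are treated as simultaneous."""
    neighbours = {}
    for contacts in windows:
        for u, v in contacts:
            neighbours.setdefault(u, set()).add(v)
            neighbours.setdefault(v, set()).add(u)
    hops = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nxt in neighbours.get(node, ()):
            if nxt not in hops:
                hops[nxt] = hops[node] + 1
                queue.append(nxt)
    return hops


def inverse_temporal_length(arrival, node):
    """Inverse of the temporal path length; 0.0 when the node is unreachable."""
    return 1.0 / arrival.get(node, math.inf)


# Hypothetical contacts per window (window 1, 2, 3).
windows = [
    [("A", "B"), ("C", "D"), ("E", "F")],
    [("A", "D"), ("B", "D"), ("C", "E")],
    [("C", "D")],
]

arrival = earliest_arrival(windows, "A")
hops = static_hops(windows, "A")
print(hops["F"], arrival.get("F", math.inf), inverse_temporal_length(arrival, "F"))
```

Using the inverse of the path length keeps unreachable pairs in the calculation as zeros instead of infinities, which is why averages over all node pairs remain well defined in the temporal setting. The helper is meant for target nodes other than the source, whose arrival window is zero.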