Available online at www.sciencedirect.com
ScienceDirect
Procedia - Social and Behavioral Sciences 236 (2016) – 13

International Conference on Communication in Multicultural Society, CMSC 2015, 6-8 December 2015, Moscow, Russian Federation

Formalization of criteria for social bots detection systems

Yury Drevs*, Aleksei Svodtsev
National Research Nuclear University MEPhI (Moscow Engineering Physics Institute), 31 Kashirskoye shosse, Moscow 115409, Russian Federation

Abstract
With the development of social networks on the Internet, programs that automatically imitate users' actions have become widespread. The common use of these programs causes informational noise. This research considers the possibility of applying the mathematical apparatus of fuzzy logic to recognize the activity of such programs in social networks.

© 2016 The Authors. Published by Elsevier Ltd. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
Peer-review under responsibility of the National Research Nuclear University MEPhI (Moscow Engineering Physics Institute).

Keywords: Information security; social networks; artificial intelligence; social bots; formalization of criteria

Introduction
The rapid development of popular social networks ("Facebook", "LinkedIn", "Twitter", "VKontakte", etc.) continues at present (Drevs and Svodtsev, 2014). A feature common to all of them is that registered accounts do not always correspond to real persons and can be "fake". Because most Internet resources place no serious technical restrictions on the creation of new accounts, specialists in social media management (reputation management, advertising, spam distribution, etc.)
have an opportunity to prepare a huge number of "fake" user accounts to execute coordinated virtual activities and thereby distort the natural body of available information. This is further aided by special programs, called social bots, that imitate human behavior when operating in social networks.

* Corresponding author. Tel.: +7-903-177-5158. E-mail address: ydrevs@yandex.ru
1877-0428 © 2016 The Authors. Published by Elsevier Ltd.
doi:10.1016/j.sbspro.2016.12.003

Work statement
The task is the detection of social bots, which must be excluded from consideration when social media data are analyzed. The objective of the present work is to select an approach for a formal description of the social bot detection criteria. In most cases, a statistical abnormality in the activity of social network accounts makes it possible to recognize them.

Methodology
The mathematical apparatus of fuzzy logic can be used to solve the bot detection task. This mathematical tool was proposed in the 1960s by Lotfi Zadeh, a professor at the University of California (Drevs, 2005). One of the fundamental concepts of traditional logic is that of a set and of a subset as a part of it. A set consists of separate elements, and each element is either included in the set or not; this is called a binary relation of inclusion. Fuzzy logic is based on the concept of a fuzzy set, which is defined by a non-binary relation of inclusion. This means that an element is characterized not only by whether it is included in the set but also by its degree of membership, which varies from zero to one: $0 \le \mu(f) \le 1$, where f is a parameter measured in weighted
factors. Weighted factors are used because it cannot be said for certain whether all parameters f are on the same scale.

Virtual users registered in social networks are able to perform the following actions (their availability depends on the features of the specific social network):
• authorize on the main site of the social network and enter the personal page;
• adjust personal page settings through the graphical interface of the social network by specifying personal data (full name, virtual communities of interest, educational institutions, list of employers, vacation destinations, areas of interest, hobbies, musical and other preferences);
• edit (set, alter or delete) the current value of the "status" bar on the personal page (it usually contains the current activity, mood or geographic location specified by the user or by the device the user is using);
• publish text, photo or audio materials on the personal page;
• establish or break off friendly relations (one-way as well as two-way, that is, mutually confirmed by the users);
• send personal messages to a specific social network user that are invisible to other users;
• comment on text, photo or audio materials published by other users by posting text messages immediately below them;
• express approval or disapproval of other users' publications or comments by marking them as positive or negative through the graphical interface of the social network;
• end work in the social network as a registered user at any time by breaking the authorization session with the "Log out" button in the graphical interface of the social network.

Suppose that open information aggregated on random social network users is accessible, including the personal details available on their accounts, registration date and time, actions performed at particular points in time, the set of publicly accessible text messages they have published, and the set of open friendly
relations established by social network users and stored in a relational database. Since friendly relations between users can be represented as arcs of a directed graph connecting the corresponding nodes, these relations can also be described by an adjacency matrix. Formal indicators characterizing the presence of a subset of social bots in social networks can also be distinguished.

Results
A checklist of indicators in a user's activity that reveal the user to be a social bot is now compiled. A formal representation can be derived for the parameters that make it possible to suppose that the activity of a virtual user is managed by specialized software:

• Accounts have the same name structure. If a common pattern exists that fits the personal data of the i-th and j-th users, it is reasonable to suppose that the registration of the i-th and j-th virtual users was centrally managed by specialized software. A common pattern may fit the name structure of several accounts ("Samantha_1986", "Michael_1992", "Robert_1954", etc.),
as if they were created by a single algorithm. As a rule, using a common naming pattern helps the owner obtain statistics on their work. The listed parameters can be conditionally grouped in terms of similarity between users. Fuzzy logic can be applied to create a membership function for the socialbot subset that depends on weighted factors (see Fig. 1).

Fig. 1. Membership functions for the human subset, the uncertain subset, and the socialbot subset.

• Accounts execute the same actions within a restricted time interval. Different users executed the same actions within a restricted time interval. Fuzzy logic is applied to create a membership function for the socialbot subset that depends on weighted factors (see Fig. 2).

Fig. 2. Membership functions for the human subset, the uncertain subset, and the socialbot subset.

• Accounts publish the same messages. The set of text messages published by a user may consist entirely of messages taken from other users. In other words, for every message of the i-th user, the relational database contains an identical message published earlier by some j-th user. A membership function for the socialbot subset that depends on weighted factors can be built in a similar way.

• Accounts registered at the same time. The i-th and j-th users registered in the social network within a restricted time interval. In other words, it is reasonable to suppose that the registration of the i-th and j-th virtual users was centrally managed by specialized software. A membership function for the socialbot subset that depends on weighted factors can be built in a similar way.

• Accounts have similar personal details. The personal data of the i-th and j-th users match. In this case it is reasonable to suppose that the i-th and j-th virtual users are social bots. A membership function for the socialbot subset that depends on weighted factors can be built in a similar way.

The final generalized membership function for all the
indicators is derived using the fuzzy set intersection operation. According to (Leonenkov, 2003), the algebraic intersection of two fuzzy sets is obtained by ordinary arithmetic multiplication of the corresponding membership function values.

Fig. 3. Membership functions for the socialbot subset.

Conclusion
Within the present work, an approach to formalizing five social bot detection criteria with the mathematical apparatus of fuzzy logic was proposed. As an example of this approach, an experiment was carried out. Its result is presented in Table 1, which shows the validity level together with the values of the user's characteristics: repeated operations regularity, addition to friends, and continuous run time.

Table 1. Validity level in dependence on user's characteristics.

Repeated operations regularity   Addition to friends,   Continuous run time,   Validity level
(number of repeated operations   friends per day        days per month
in 10 minutes)
 0.0                              0.0                   0.0                    1.86
 1.0                              1.0                   1.0                    1.86
 7.0                              7.0                   2.0                    3.86
10.0                             10.0                   2.5                    5.00
13.0                             13.0                   3.7                    6.42
17.0                             17.0                   4.0                    8.52
25.0                             50.0                   5.0                    9.23

It is clear that the validity level increases with all three characteristics: the "worse" the user's characteristics, the higher the probability that the user is a social bot. There are many social bots in social networks, but they reveal themselves through similarities in attributes and behavior. Fuzzy logic is one of the approaches that can be used to detect them.

References
Drevs, Yu.G. (2005). Real-time systems: technical facilities and software: Manual. Moscow: NRNU MEPhI.
Drevs, Yu.G., and Svodtsev, A.K. (2014). The concept of the effective social bots detection systems' creation. International Conference "Regional informatics (RI-2014)", Saint-Petersburg, 2014.
Leonenkov, A.V. (2003). Fuzzy modeling in MATLAB and fuzzyTECH environment. SPb.: BHV-Petersburg.
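
As an illustration of the algebraic intersection described in the Results section, the following is a minimal sketch and not the authors' implementation. The piecewise-linear shape of the membership function, the thresholds `low` and `high`, and the sample indicator values are all hypothetical, since the paper gives the membership functions only as figures. Only the multiplication of membership values follows the definition cited from (Leonenkov, 2003).

```python
from math import prod


def ramp_membership(f, low, high):
    """Degree of membership in the socialbot subset for a weighted factor f.

    Assumed shape (not taken from the paper): 0 at or below `low`,
    1 at or above `high`, linear in between, so the result is in [0, 1].
    """
    if high <= low:
        raise ValueError("high must be greater than low")
    if f <= low:
        return 0.0
    if f >= high:
        return 1.0
    return (f - low) / (high - low)


def algebraic_intersection(memberships):
    """Algebraic intersection of fuzzy sets: product of membership values."""
    return prod(memberships)


# Hypothetical weighted factors for the five indicators of one account,
# each paired with hypothetical (low, high) thresholds.
indicators = [
    (0.9, 0.2, 0.8),  # same name structure
    (0.5, 0.0, 1.0),  # same actions within a restricted time interval
    (0.8, 0.2, 0.8),  # same messages
    (0.6, 0.2, 1.0),  # registration at the same time
    (1.0, 0.0, 1.0),  # similar personal details
]
degrees = [ramp_membership(f, low, high) for f, low, high in indicators]
socialbot_degree = algebraic_intersection(degrees)
```

Because every membership value lies between zero and one, the product never exceeds the smallest individual degree, so a single indicator with zero membership drives the combined degree to zero.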