Available online at www.sciencedirect.com

ScienceDirect
Procedia - Social and Behavioral Sciences 236 (2016) 29–33

International Conference on Communication in Multicultural Society, CMSC 2015, 6-8 December 2015, Moscow, Russian Federation

Factographic information retrieval for communication in multicultural society

Sergey Kulik*

National Research Nuclear University MEPhI (Moscow Engineering Physics Institute), Kashirskoe shosse 31, Moscow 115409, Russian Federation

Abstract

This paper considers in detail a two-stage factographic information retrieval with human involvement. In the first stage, the retrieval is implemented without direct human involvement. In the second stage, the retrieval involves a human operator. The retrieval includes a pattern recognition algorithm and is designed to retrieve a single document from among many similar documents. An analytical model of the retrieval block is developed. The model is expressed through an effectiveness indicator: the average length of the recommendatory list that the retrieval block provides so that the human operator can take the final decision.

© 2016 The Authors. Published by Elsevier Ltd. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
Peer-review under responsibility of the National Research Nuclear University MEPhI (Moscow Engineering Physics Institute).

Keywords: Information retrieval; communication; recommendatory list; pattern recognition; factographic information; effectiveness

Main text

In practice, factographic information retrieval (FIR) is used in communication technologies for a multicultural society. The Automated Factographic Information Retrieval System (AFIRS) is a special case of question-answer systems (QA systems) (Tomljanovic, Pavlic,
and Katic, 2014; Abdullah and Abdel-Kader Rehab, 2011; Sherkat and Farhoodi, 2014; Kulik, 2015) or of database fact retrieval systems (Salton, 1968). The structure of this AFIRS (Kulik, 2015) contains a recognition block. To develop this block we used neural networks (Galushkin, 2007).

* Corresponding author. Tel.: +7-495-788-5699; fax: +7-499-324-2111. E-mail address: sedmik@mail.ru
1877-0428 © 2016 The Authors. Published by Elsevier Ltd.
doi:10.1016/j.sbspro.2016.12.008

Informative retrieval (Kulik, 2015; Ampazis and Iakovaki, 2004) plays an important role in information retrieval systems. Effective FIR is very important, and sometimes critical, for an AFIRS in which factographic information retrieval is implemented. Some important related problems were considered in (Salton, 1968; Fukunaga, 1990; Galushkin, 2007; Feller, 1968). The paper (Kulik, 2015) and the monograph (Galushkin, 2007) are concerned with neural networks. Neural network technologies (Kulik, 2015; Galushkin, 2007) have been applied successfully to design a biometric system for the AFIRS. The paper (Kulik, 2015) deals with biometric identification systems for the AFIRS. The works (Kolmogorov, 1973; Feller, 1968) are concerned with probability theory. For instance, (Kolmogorov, 1973) deals with the axiomatic basis of modern probability theory.

Generally, the task of searching in the AFIRS is to find the required factographic information connected with the document being sought. For instance, the task of searching in a multicultural society is to find a Document which Is Identical (DII) to the enquiry among the archive system
documents, using their descriptions in the search factographic database (FDB), and, if this document is found, to provide the necessary factographic information from it.

The automated factographic information retrieval system (see Fig. 1) consists of different blocks.

Fig. 1. Automated factographic information retrieval system.

The automated factographic information retrieval system includes six blocks: the block of document indexing; the block of retrieval; the block of Recommendatory List (RL) processing; the searching FDB, which includes document descriptions in the form of Searching Documents Patterns (SDP); the block of recognition; and the archive of documents.

Methods

The research was based on methods of information retrieval (Salton, 1968), probability theory (Kolmogorov, 1973; Feller, 1968), neural networks (Kulik, 2015; Galushkin, 2007) and pattern recognition (Fukunaga, 1990).

Results and discussion

It is supposed that N documents are stored in the archive and that every enquiry has a Searching Enquiry Pattern (SEP). Each document of the archive is located in a place which is uniquely determined by its registration number. Only one document can be located in any one place. All SDPs are stored in the searching array of the factographic database in the form of a consecutive linear list. A search request (or SEP) includes the description of a document which may be stored in, or absent from, the documents array. It is supposed that, with probability R, the request is a description of a document which is identical to one of the archive documents. Information in the SDP and SEP can be distorted by various hindrances (noise) or by errors during document indexing. Comparison of the SEP and SDP by the
AFIRS is realised using a pattern recognition algorithm (Fukunaga, 1990; Galushkin, 2007) and is characterised by the probabilities P1 and P2, where (Kulik, 2015):

• P1 – probability of the correct comparison of two identical documents based on their descriptions (determines the probability of missing the target);
• P2 – probability of the correct comparison of two non-identical documents based on their descriptions (determines the false alarm probability).

Let Lx denote the average length of the RL. An analytical formula was obtained to evaluate the effectiveness of the factographic information retrieval by means of this indicator (see (1), (2) and (3)):

$$L_x = R \cdot S + (1 - R) \cdot W, \qquad (1)$$

where:

S – average length of the RL during information retrieval in the area which includes the DII, where

$$S = f_1(P_1, P_2); \qquad (2)$$

W – average length of the RL during information retrieval in the area which does not include the DII, where

$$W = f(P_2). \qquad (3)$$

In (1), Lx satisfies:

$$L_x \ge (1 - R) \cdot W. \qquad (4)$$

Moreover (see (1) and (4)), if R ≈ 0, the average length of the RL which the AFIRS gives to the human operator (the person who makes the final decision) is

$$L_x \approx W. \qquad (5)$$

Following Feller (1968), we will denote:

$$\binom{n}{r} = \frac{n(n-1)\cdots(n-r+1)}{r!} = \frac{n!}{r!\,(n-r)!}.$$
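The quantities introduced so far can be illustrated numerically. The Python sketch below is not the author's software. It assumes that (1) reads Lx = R·S + (1 − R)·W and, as a further assumption, that each of the N non-identical descriptions is falsely matched independently with probability 1 − P2, so that W is the expected number of false matches truncated at a maximum list length L. All function and parameter names are hypothetical. The binomial weights are evaluated in log space with `math.lgamma` rather than from the factorial form, to stay within floating-point range for large N.

```python
import math


def binomial_pmf(n: int, m: int, p: float) -> float:
    """P(M = m) for M ~ Binomial(n, p), evaluated in log space."""
    if p <= 0.0:
        return 1.0 if m == 0 else 0.0
    if p >= 1.0:
        return 1.0 if m == n else 0.0
    # log C(n, m) = lgamma(n+1) - lgamma(m+1) - lgamma(n-m+1)
    log_pmf = (
        math.lgamma(n + 1) - math.lgamma(m + 1) - math.lgamma(n - m + 1)
        + m * math.log(p)
        + (n - m) * math.log1p(-p)
    )
    return math.exp(log_pmf)


def expected_rl_length_without_dii(n_docs: int, p2: float, max_len: int) -> float:
    """Assumed model for W: E[min(M, L)] with M ~ Binomial(N, 1 - P2).

    M counts false matches among n_docs non-identical descriptions;
    the list shown to the operator is cut off at max_len entries.
    """
    q = 1.0 - p2  # assumed probability of a false match per description
    return sum(
        min(m, max_len) * binomial_pmf(n_docs, m, q)
        for m in range(n_docs + 1)
    )


def average_rl_length(r: float, s: float, w: float) -> float:
    """Assumed reading of (1): Lx = R*S + (1 - R)*W."""
    return r * s + (1.0 - r) * w
```

Under these assumptions, `expected_rl_length_without_dii(50000, 0.995, 250)` should give a value close to 244, and `average_rl_length(0.0, s, w)` returns `w` for any `s`, in line with (5).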
The following analytical expression for (3) was obtained to evaluate the effectiveness:

$$W = \sum_{m=0}^{L-1} \left[ \binom{N}{m} \psi_m \, m \right] + L \sum_{m=L}^{N} \left[ \binom{N}{m} \psi_m \right], \qquad \psi_m = P_2^{\,N-m} (1 - P_2)^{m}. \qquad (6)$$

Investigation of (1), (3) and (6) established that Lx in (5) changes as L, N and P2 change. A small part of these results for different L, N and P2 is shown in Tables 1 and 2.

Table 1. Example of effectiveness, if N=50000 and P2=0.995

L      10   70   170   240   242   245   247   250   800   10^3   …   10^4
Lx ≈   10   70   170   237   239   241   242   244   250   250    …   250

Table 2. Example of effectiveness, if N=80000 and P2=0.998

L      20   80   140   143   145   150   155   160   170   180   10^3   …   10^?
Lx ≈   20   80   140   142   144   148   152   155   158   160   160    …   160

According to Table 1, if P2=0.995, N=50000 (N – the number of SDPs) and L=1000 (L – maximum length of the RL), the length of the RL of the AFIRS is Lx ≈ 250. Analogously, according to Table 2, if P2=0.998 and L=140, the length of the RL of the AFIRS is Lx ≈ 140.

Let us briefly consider the likely applications of the results. Firstly, these results allow us to develop an effective Factographic Information Retrieval System. For example, we can create the AFIRS and share it with people from other countries. A wide variety of people representing different cultures could come to use the AFIRS in order to retrieve effectively the information which is important to them. These results allow us, for instance, to estimate the efficiency of the factographic information retrieval. Secondly, these results can be helpful in teaching university students. Knowledge of factographic information retrieval can be used as illustrative material in teaching modern information technology at a university. Lecturers and students can make presentations on various topics related to communication in a multicultural society. For example, a lecturer can talk about the Automated Factographic Information Retrieval System or
the effectiveness of factographic information retrieval or the average length of the Recommendatory List, etc.

Conclusion

Thus, as a result of the research, the analytical formula (6), which allows the average length of the RL to be evaluated, was developed. Properties of an important indicator of effectiveness, the average length of the RL for the human operator, were explored. This allows a well-founded analysis of the effectiveness of the factographic information retrieval implemented by the AFIRS. The software needed to evaluate the retrieval's effectiveness was created for the developer of the Automated Factographic Information Retrieval System. In future it is planned to obtain and explore the two remaining evaluations of effectiveness: the probability of the correct response to an enquiry of the factographic information retrieval, and the average number of comparison operations performed by the Automated Factographic Information Retrieval System.

References

Abdullah, M.M., and Abdel-Kader Rehab, F. (2011). QASYO: A question answering system for YAGO ontology. International Journal of Database Theory and Application, 4(2), 99–112.
Ampazis, N., and Iakovaki, H. (2004). Cross-language information retrieval using Latent Semantic Indexing and Self-Organizing Maps. International Joint Conference on Neural Networks (IJCNN'2004), (1). Budapest, Hungary, 751–755.
Feller, W. (1968). An introduction to probability theory and its applications (Vol. 1, 3rd ed.).
New York: John Wiley & Sons.
Fukunaga, K. (1990). Introduction to statistical pattern recognition. San Diego, San Francisco, New York, Boston, London, Sydney, Tokyo: Elsevier Academic Press.
Galushkin, A.I. (2007). Neural networks theory. Berlin, Heidelberg: Springer.
Kolmogorov, A.N. (1973). Grundbegriffe der Wahrscheinlichkeitsrechnung. Berlin: Springer. Reprint: Berlin, Heidelberg, New York: Springer Verlag.
Kulik, S.D. (2015). Neural network model of artificial intelligence for handwriting recognition. Journal of Theoretical and Applied Information Technology, 73(2), 202–211.
Salton, G. (1968). Automatic information organization and retrieval. New York: McGraw-Hill.
Sherkat, E., and Farhoodi, M. (2014). A hybrid approach for question classification in Persian automatic question answering systems. 4th International eConference on Computer and Knowledge Engineering (ICCKE), 29-30 Oct. 2014, IEEE, 279–284.
Tomljanovic, J., Pavlic, M., and Katic, M.A. (2014). Intelligent question-answering systems: review of research. 37th International Convention on Information and Communication Technology, Electronics and Microelectronics (MIPRO), 26-30 May 2014, IEEE, 1228–1233.