Proceedings
International Workshop on Global Collaboration of Information Schools, WIS2012
Preface
The International Workshop on Global Collaboration of Information Schools, WIS2012, was held on 15 November 2012 as part of the International Conference on Asia-Pacific Digital Libraries (ICADL2012), which took place at the GIS NTU Convention Centre in Taipei, Taiwan, from 12 to 15 November 2012 (www.icadl2012.org). The first half of the Workshop comprised research paper sessions organized jointly with the Graduate Students Consortium (GSC) of ICADL2012. The second half of the day consisted of a round table discussion on collaboration among information schools and research paper presentations by students and young researchers of the information schools community, e.g., CiSAP (Consortium of Information Schools in Asia Pacific) and the iSchools Community.
These proceedings collect the research papers presented at WIS2012. All papers were reviewed by the WIS2012 committee before being selected for presentation.
About WIS 2012
WIS 2012 consists of three parts:
Graduate Students Forum (ICADL 2012 main conference): Presentations by graduate students
Roundtable Discussion: Discussion by invited delegates
Research Forum: Presentations of research papers selected by the organizing committee on the basis of quality
Research Forum
The topics of the Research Forum include
Digital Libraries, Archives, Curation and Preservation
Information Access, Discovery and Retrieval
Metadata and Knowledge Organization
LIS Education, Professional Practices, and LIS and Society
Web, Social Networking and Information Management
Chairs
Gobinda Chowdhury (University of Technology Sydney, Australia) and Vilas Wuwongse (Thammasat University, Thailand)
Organizers
Chern Li Liew (Victoria University of Wellington, New Zealand), Gary Marchionini (University of North Carolina, USA), Ronald L. Larsen (University of Pittsburgh, USA), Edie Rasmussen (University of British Columbia, Canada), Emi Ishita (Kyushu University, Japan), Nisachol Chomnongsri (Suranaree University of Technology, Thailand), Shigeo Sugimoto (University of Tsukuba, Japan), Shalini Urs (University of Mysore, India), Hsueh-hua Chen (National Taiwan University, Taiwan), Lampang Manmart (Khon Kaen University, Thailand), Hao-Ren Ke (National Taiwan Normal University, Taiwan), Schubert Foo (Nanyang Technological University, Singapore), Chutima Sacchanand (Sukhothai Thammathirat Open University, Thailand), Hideo Joho (University of Tsukuba, Japan)
Background
Previous workshops
WIS 2010: https://sites.google.com/site/wischool2010/
WIS 2011: https://sites.google.com/site/wischool2011/
CiSAP (Consortium of iSchools - Asia Pacific, http://dis.sci.ntu.edu.sg/cisap/about.htm) was established in December 2008 as a not-for-profit organisation to promote collaboration among iSchools in the Asia Pacific region. Currently, 24 institutions from 11 different countries have joined CiSAP. Members of the Consortium are academic institutions of higher education interested and involved in education and research in the area of 'information'. Being part of CiSAP and taking part in the consortium's activities gives member institutions an important opportunity for international engagement and for raising the profile of their information school programmes and research. CiSAP is founded as a voluntary organization and does not require a membership fee.
Workshop: Global Collaboration of Information Schools (WIS 2012)
Paper Session, Session Chair: Dr Hao-Ren Ke
1. A Snapshot of Digital Library Research Trends (1990-2010)
Son Hoang Nguyen and Gobinda Chowdhury
2. Help-Seeking Interactions in Digital Libraries: Influence of Learning Styles
Chunsheng Huang
3. Evaluating User-System Interactions During the Search Process in Digital Libraries
Soohyung Joo
4. A Study on the Experts' and Users' Mental Model for Keyword Selection in Information Organization
Ya-Ning Chen and Hao-Ren Ke
Paper Session, Session Chair: Dr Muh-Chyun Tang
5. Influence Detection Within The Blogosphere
Luke Kien-Weng Tan
6. Knowledge Management in Indigo Dying Community Enterprises Adopting a Value Chain Approach
Sutisa Songleknok, Smarn Loipha and Chollabhat Vongprasert
7. An Ontology for Automated Cloud Archiving Systems
Jan Askhoj, Shigeo Sugimoto and Mitsuharu Nagamori
8. Development of GMS’s Intangible Cultural Knowledge Base Systems: Integrated Semantic Web Technology and Geographic Information System
Wirapong Chansanam, Kulthida Tuamsuk and Kanyarat Kwiecien
Paper Session, Session Chair: Dr Hideo Joho
9. Development of Thai Qualification Framework for the Information Profession
Nujarin Pathumpong and Chollabhat Vongprasert
10. Evaluating Core Measures of Text Denoising for Biomedical Relation Mining
Rushdi Shams and Robert E. Mercer
11. Information Behavior Model of Farmers using Grounded Theory Approach
Unchasa Seenuankaew
A Snapshot of Digital Library Research Trends (1990-2010)
Son Hoang Nguyen and Gobinda Chowdhury
Information & Knowledge Management, University of Technology Sydney, NSW, Australia
{Hoang.S.Nguyen@student.uts.edu.au}{Gobinda.Chowdhury@uts.edu.au}
1. Introduction: Analysis of research trends plays an important role in planning and evaluating research and development activities in any field of study. Bibliometric studies based on citation counts, and various statistical analyses derived from them, are commonly employed to analyse research trends in a field. However, a major problem of such studies is that the analysis is often based on term occurrence (or co-occurrence) rather than on the semantic connotations and relations among the concepts that constitute the field. A semantic knowledge map, which provides not only a semantically organized map of concepts but also the number of publications per year in a given field, could provide a better dataset for analysing research trends. This approach is used in the research reported here to analyse research trends in the field of digital libraries (DLs). Based on the knowledge map of DL research (1990-2010), which is categorized into 21 core topics and 1015 subtopics [1], a bibliometric analysis of the main research trends of the 21 core topics using R-Squared (R2) values was conducted, with the aim of analysing past trends and predicting future trends in the DL domain.
2. Methodology
Data Collection: First, the SCOPUS database was chosen because it is the largest abstract and citation database of peer-reviewed literature [10]. A search for DL publications (search term “Digital Librar*” in the Keywords field, date range 1990-2010) returned 7905 DL publication records. Second, the 1015 subtopics from the DL knowledge map (1990-2010) [1] were used as search terms in the Keywords field to search within the 7905 DL publication records. For each subtopic, the details (number of publications per year and the year in which the subtopic first appeared) were recorded and transferred to Microsoft Excel 2007 for later calculation and analysis.
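The per-subtopic tally described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record layout (a dict with `year` and `keywords`), the function names and the sample records are all hypothetical, and exact case-insensitive keyword matching stands in for whatever matching the SCOPUS keyword search actually performs.

```python
from collections import Counter, defaultdict

def tally_by_year(records, subtopics):
    """For each subtopic, count the records per year that list it as a keyword.

    Assumption: exact, case-insensitive keyword match (not SCOPUS's own logic).
    """
    counts = defaultdict(Counter)
    for rec in records:
        keywords = {k.lower() for k in rec["keywords"]}
        for topic in subtopics:
            if topic.lower() in keywords:
                counts[topic][rec["year"]] += 1
    return counts

def first_appearance(year_counts):
    """Earliest year in which a subtopic occurs, or None if it never occurs."""
    return min(year_counts) if year_counts else None

# Hypothetical records, invented purely for illustration.
records = [
    {"year": 1998, "keywords": ["Digital libraries", "Metadata"]},
    {"year": 2001, "keywords": ["Digital libraries", "Metadata", "Usability"]},
    {"year": 2001, "keywords": ["Digital library", "Metadata"]},
]
counts = tally_by_year(records, ["Metadata", "Usability"])
print(dict(counts["Metadata"]))               # publications per year
print(first_appearance(counts["Usability"]))  # first year of appearance
```

The resulting year-by-year counts are the kind of series that would then be passed to the R-Squared calculation.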
Calculating R-Squared values: The R-Squared value is a number ranging from 0 to 1 that reveals how closely the estimated values for a trend line (a straight-line relationship) correspond to a set of actual data. A trend line is most reliable when its R-Squared value is at or near 1 and least reliable when it is at or near 0. In linear regression, the trend line is a regression line drawn on a scatter graph and used to fit a predictive model to an observed data set of y (value on the y axis) and x (value on the x axis). After such a model has been developed, if an additional value of x is given without its accompanying value of y, the fitted model can be used to predict the value of y. In Excel, the R-Squared value is calculated from the equation for the Pearson product moment correlation coefficient. The formula for R is:
$$ R = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sqrt{\sum (x - \bar{x})^2 \, \sum (y - \bar{y})^2}} $$
where \bar{x} and \bar{y} are the sample means of x and y, and R-Squared returns R2, which is the square of this correlation coefficient. In our research, to measure the trends in DL research (1990-2010), the R-Squared values were calculated in Excel 2007 based on the degree of association between variables (variable “Publication” or “Subtopic Number” on the y axis; variable “Year” on the x axis). The trend lines showing the DL research trends were classified into three types: Increasing Trends (Positive Association), Decreasing Trends and Not Identified Trends (No Association).
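As a minimal sketch of the calculation just described, the following Python code computes the Pearson coefficient and its square for a year/count series. The function names and the sample series are illustrative assumptions, not the authors' Excel workbook. When either variable has zero variance the coefficient is undefined, which is the situation in which Excel reports a division error; the sketch returns None in that case.

```python
import math

def pearson_r(xs, ys):
    """Pearson product moment correlation coefficient of two equal-length series."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two series of equal length with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return None  # undefined: one variable never changes
    return sxy / math.sqrt(sxx * syy)

def r_squared(xs, ys):
    """Square of the Pearson coefficient, or None when it is undefined."""
    r = pearson_r(xs, ys)
    return None if r is None else r * r

# Hypothetical series: a perfectly linear increase in publications per year.
years = [2006, 2007, 2008, 2009, 2010]
publications = [10, 20, 30, 40, 50]
print(r_squared(years, publications))
```

Note that R2 by itself carries no sign; the direction of a trend has to be read from the slope of the trend line or from the sign of R.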
3. Findings: Chart 1 shows a significant increase in DL publication numbers, especially in the period 2000-2010; the future trend of DL research is strongly increasing, with an estimated R2 = 0.836, which is very reliable (very close to 1). In Chart 2, although the subtopic numbers of the 21 core topics increased up to a peak in 2001, the overall trend (1990-2010) is decreasing, and the future trend is also decreasing, with an estimated R2 = 0.0383 (not very reliable, being close to 0).
Chart 1: Trend in Total Publication Numbers of DL Research (1990-2010)
Chart 2: Trend in Total Subtopics Numbers of DL Research (1990-2010)
Table 1: Publication Numbers vs R-Square Numbers
Core Topics | No. of Publications | Core Topics (Trend) | R2
#8. Architecture - Infrastructure | 15339 | #7. User Studies (Increasing Trend) | 0.92
#19. DL Research & Development | 14210 | #11. Mobile Technology (Increasing Trend) | 0.92
#3. Information Organization | 6036 | #14. Virtual Technologies (Increasing Trend) | 0.87
#4. Information Retrieval | 5365 | #13. Semantic Web (Web 3.0) (Increasing Trend) | 0.84
#1. Digital Collections | 4593 | #2. Digital Preservation (Increasing Trend) | 0.84
#16. Digital Library Applications | 3987 | #18. Cultural, Social, Legal, Economic Aspects (Increasing Trend) | 0.83
#6. Human - Computer Interaction | 2582 | #16. Digital Library Applications (Increasing Trend) | 0.83
#10. Digital Library Services | 2571 | #10. Digital Library Services (Increasing Trend) | 0.82
#7. User Studies | 2485 | #9. Knowledge Management (Increasing Trend) | 0.82
#2. Digital Preservation | 2141 | #19. DL Research & Development (Increasing Trend) | 0.82
#15. Digital Library Management | 1705 | #6. Human - Computer Interaction (Increasing Trend) | 0.80
#9. Knowledge Management | 1533 | #3. Information Organization (Increasing Trend) | 0.80
#18. Cultural, Social, Legal, Economic Aspects | 1193 | #4. Information Retrieval (Increasing Trend) | 0.79
#14. Virtual Technologies | [missing in source] | [missing in source] | [missing in source]
#17. Intellectual Property, Privacy, Security | 764 | #12. Social Web (Web 2.0) (Increasing Trend) | 0.75
#13. Semantic Web (Web 3.0) | 590 | #5. Access (Increasing Trend) | 0.74
#5. Access | 544 | #8. Architecture - Infrastructure (Increasing Trend) | 0.69
#11. Mobile Technology | 359 | #1. Digital Collections (Increasing Trend) | 0.69
#12. Social Web (Web 2.0) | 298 | #20. Information Literacy (Increasing Trend) | 0.57
#20. Information Literacy | 267 | #17. Intellectual Property, Privacy, Security (Increasing Trend) | 0.54
#21. Digital Library Education | 180 | #21. Digital Library Education (Increasing Trend) | 0.13

Table 2: Subtopic Numbers vs R-Square Numbers
Core Topics | No. of Subtopics | Core Topics (Trend) | R2
#8. Architecture - Infrastructure | 144 | #8. Architecture - Infrastructure (Decreasing Trend) | 0.38
#3. Information Organization | 141 | #20. Information Literacy (Decreasing Trend) | 0.35
#4. Information Retrieval | 78 | #12. Social Web (Web 2.0) (Increasing Trend) | 0.24
#16. Digital Library Applications | 64 | #3. Information Organization (Decreasing Trend) | 0.23
#6. Human - Computer Interaction | 61 | #1. Digital Collections (Decreasing Trend) | 0.23
#7. User Studies | 59 | #13. Semantic Web (Web 3.0) (Increasing Trend) | 0.19
#9. Knowledge Management | 58 | #4. Information Retrieval (Decreasing Trend) | 0.18
#15. Digital Library Management | 53 | #9. Knowledge Management (Increasing Trend) | 0.18
#19. DL Research & Development | 48 | #19. DL Research & Development (Decreasing Trend) | 0.17
#1. Digital Collections | 48 | #11. Mobile Technology (Increasing Trend) | 0.12
#2. Digital Preservation | 46 | #5. Access (Decreasing Trend) | 0.09
#10. Digital Library Services | 30 | #17. Intellectual Property, Privacy, Security (Decreasing Trend) | 0.05
#13. Semantic Web (Web 3.0) | 30 | #10. Digital Library Services (Decreasing Trend) | 0.03
[row missing in source]
#18. Cultural, Social, Legal, Economic Aspects | 25 | #16. Digital Library Applications (Decreasing Trend) | 0.02
#11. Mobile Technology | 22 | #7. User Studies (Increasing Trend) | 0.01
#12. Social Web (Web 2.0) | 21 | #6. Human - Computer Interaction (Decreasing Trend) | 0.01
#14. Virtual Technologies | 20 | #15. Digital Library Management (Decreasing Trend) | 0.01
#20. Information Literacy | 20 | #18. Cultural, Social, Legal, Economic Aspects (Increasing Trend) | 0.01
#5. Access | 14 | #2. Digital Preservation (Increasing Trend) | 0.00
#21. Digital Library Education | 5 | #21. Digital Library Education (Not Identified Trend) | #DIV/0!
In Table 1, it can be noted that although Architecture - Infrastructure (15339), DL Research & Development (14210), Information Organization (6036), Information Retrieval (5365) and Digital Collections (4593) are the core topics with the highest publication numbers, they are not the most strongly trending core topics, with R2 values of 0.69, 0.82, 0.80, 0.79 and 0.69 respectively. Conversely, User Studies (2485), Mobile Technology (359), Virtual Technologies (1105), Semantic Web (Web 3.0) (590) and Digital Preservation (2141) have fewer publications than the top five, yet they have the highest R2 values: 0.92, 0.92, 0.87, 0.84 and 0.84 respectively. It should be noted that publication numbers by year only tell us how DL research trends developed in the past, while R2 values show how the trends will develop in future (future predictions of the trends). In other words, based on the actual data of the two variables “Year” and “Publication”, R2 values reveal how closely the estimated values for a trend line (a straight-line relationship) correspond to a set of actual data.
In Table 2, based on the actual data of the two variables “Year” and “Subtopic Number” for the 21 core topics, there are 7 core topics with increasing trends, 13 with decreasing trends and 1 with a not identified trend. Although Architecture - Infrastructure, Information Organization, Information Retrieval, Digital Library Applications and Human - Computer Interaction were the core topics with the highest subtopic numbers (144, 141, 78, 64 and 61 respectively), their future trends as shown by R2 values were decreasing: Architecture - Infrastructure (0.38), Information Organization (0.23), Information Retrieval (0.18), Digital Library Applications (0.02) and Human - Computer Interaction (0.01). The five top core topics with increasing trends in subtopic numbers were Social Web (Web 2.0) (0.24), Semantic Web (Web 3.0) (0.19), Knowledge Management (0.18), Mobile Technology (0.12) and User Studies (0.01).
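The three-way labelling used in the tables (Increasing, Decreasing, Not Identified) can be sketched as follows. The paper does not state how the direction of each trend was decided, so this sketch assumes it is the sign of the least-squares slope, and assumes that a series for which R2 cannot be computed is labelled Not Identified. The function name and the sample series are hypothetical.

```python
def classify_trend(years, values):
    """Return (label, r2) for a yearly series.

    Assumptions (not confirmed by the paper): the direction is the sign of
    the least-squares slope, and an undefined R2 means "Not Identified Trend".
    """
    n = len(years)
    if n < 2 or n != len(values):
        return ("Not Identified Trend", None)
    mean_x = sum(years) / n
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    syy = sum((y - mean_y) ** 2 for y in values)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, values))
    if sxx == 0 or syy == 0:
        return ("Not Identified Trend", None)  # R2 undefined (division by zero)
    r2 = (sxy * sxy) / (sxx * syy)
    if sxy > 0:
        return ("Increasing Trend", r2)
    if sxy < 0:
        return ("Decreasing Trend", r2)
    return ("Not Identified Trend", r2)

# Hypothetical subtopic counts per year, invented for illustration.
label, r2 = classify_trend([2000, 2001, 2002, 2003, 2004], [5, 4, 4, 2, 1])
print(label, round(r2, 2))
```

Keeping the label and the R2 value together makes it explicit that a small R2 weakens confidence in the label, whichever direction it indicates.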
4. Conclusion
Overall, there are strong increases in the 21 core topics of DL research (1990-2010), with a total future growth prediction of R2 = 0.836 (very reliable). Despite the decrease in most subtopic numbers, the future declining trends of the subtopics are not reliable, with R2 = 0.0383. Most remarkably, some topics show future growth in the R2 values of both publication numbers and subtopic numbers, viz. User Studies, Mobile Technology, Semantic Web (Web 3.0), Social Web (Web 2.0), Knowledge Management and Digital Preservation; these will be the major research interests for DL communities in the future. However, the core topic Digital Library Education, which has the fewest publications, the fewest subtopics and the lowest R2 value, deserves more attention so that it can enhance research, education and implementation activities within the DL domain.
References
1. Nguyen, H.S., & Chowdhury, G.: Digital Library Research (1990-2010): A Knowledge Map of Core Topics and Subtopics. ICADL 2011, vol. 7008, ed. F.C C Xing, and A. Rauber (Eds.), Springer-Verlag Berlin Heidelberg, Beijing, pp. 367-371 (2011)
Help-Seeking Interactions in Digital Libraries: Influence of Learning Styles
Chunsheng Huang1,2
1 School of Information Studies, University of Wisconsin-Milwaukee, Milwaukee, WI, USA
2 Library, National Chung Hsing University, Taichung, Taiwan
huang22@uwm.edu
Abstract. Learning style has been identified as influential in users’ preferences for information searching systems. However, little is known about how learning styles affect users’ help-seeking interactions. This proposal reports preliminary results of a dissertation study investigating the effects of learning styles on help-seeking behaviors in digital library environments. The Index of Learning Styles was employed to measure the different dimensions of users’ styles. Multiple data collection methods, including questionnaires, think-aloud protocols, transaction logs, and interviews, were employed to collect data from 37 participants. The findings show that participants took different approaches to help-seeking and that users’ learning styles influenced their interactions with the help features of digital libraries.
Keywords: Learning styles, help-seeking, digital libraries, interactions
1 Introduction
Users’ information needs have to be fulfilled by well-designed systems. However, end users usually encounter various problems when interacting with information retrieval (IR) systems, and novice users even more so. The most common problem reported in previous research is that novice users do not know how to get started, even though most IR systems contain help mechanisms. Since digital libraries were developed during the past decade, most users are unfamiliar with them. Novice users, who never or rarely use digital libraries, need to learn how to use new digital libraries by interacting with help features to fulfill their search needs. However, many studies have demonstrated that existing help systems cannot fully satisfy users’ needs. In addition to problems caused by the system itself, users’ characteristics, such as preferences in using help, also play a major role in the decision to use system help. When help-seeking is viewed as a learning activity, learning style is one of the influential factors that lead to different help-seeking behaviors. This study aims to explore the effects of learning style on help-seeking interactions in the information seeking and searching environment.
2 Related Work
Cognitive preferences unconsciously serve as an adaptive control mechanism between inner needs and the external interacting environment. In learning activities, individuals’ preferred ways of processing information are called learning styles. Previous studies have confirmed that learning styles deeply influence how users process information in their search process. Users with different styles apply their own search strategies and prefer different system features. While most learning style theories classify learners into a few groups, the Index of Learning Styles (ILS) describes learners along more detailed dimensions: Active/Reflective, Visual/Verbal, Sensory/Intuitive, and Sequential/Global [1].
Several researchers have studied the effects of learning style and the associated dimensions on users’ reactions to information organization and representation, search strategy, and search performance [2]. Help-seeking represents a mini information search process. The factors that influence information seeking and retrieval also affect users’ help-seeking. Help is defined as assistance or clarification from either an IR system or a human in the search process when people encounter problems [3].
Although previous research has addressed help-seeking and various cognitive factors, the two were investigated separately. There are two main limitations in the previous research: 1) most studies focus on how cognitive factors affect search behaviors, while fewer focus on their influence on help-seeking; 2) even fewer examine the issue in digital library environments. To fill this gap and provide a better interacting environment, it is necessary to gain a clearer picture of novice users’ help-seeking approaches when using digital libraries. The specific research questions are: (1) What are the help-seeking approaches of novice users while getting started with digital libraries? (2) In what ways do learning styles affect users’ help-seeking approaches in the information search process? (3) To what extent do novice users’ learning styles affect help-seeking behaviors?
3 Methodology
A user study was designed to address the proposed research questions and associated hypotheses. Two digital libraries were selected for this study: Library of Congress Digital Collections and University of Wisconsin Milwaukee Digital Collections. Both digital libraries provide diversified academic content on various topics and in various formats. Most importantly, both facilitate information seeking by novice users with complete and varied types of help features. The study is set in an academic context with real academic users and real academic problems. Sixty novice users, including undergraduate and graduate students, are expected to be recruited. Multiple methods were employed to collect data systematically, including pre-questionnaires, cognitive measures, think-aloud protocols, transaction logs, and post-interviews. The mixed methods design consists of two major components: qualitative illustration followed by quantitative testing. Results from the two components will be connected and interpreted to provide a better understanding of novice users’ help-seeking behaviors.
4 Preliminary Results
The preliminary findings focus on descriptive and qualitative analysis. Since novice users are more likely to encounter problems when searching digital libraries, they are the main subjects of this study. Thirty-seven student participants have completed the study. More than half of the participants are between the ages of 20 and 29, with approximately equal numbers of males (49%) and females (51%). To represent diverse educational backgrounds, participants from different disciplines were recruited, including arts and humanities (38%), social sciences (38%), and science and engineering (22%).
The findings address the research questions of what help-seeking approaches users take and how learning styles affect their interactions with help functions. The preliminary results showed that participants took different approaches to help-seeking when dealing with various types of help features, such as Interactive Help, Visual Help, Overview Help, Step-by-step Help, Channeling Help, and Viewing Help. The results also showed that learners with different learning styles exhibit different dimensions of help-seeking interaction when searching for information in digital libraries. In selecting and using help features, active and reflective learners showed their preferred approaches to engagement. Visual and verbal learners had their preferred presentation formats for help features. In perceiving help features, sensing and intuitive learners had different preferences regarding help content, structure, and design. Sequential and global learners applied their preferred strategies to make sense of digital libraries and their functions. These characteristics of interaction offer practical implications for designing digital libraries that support different learning styles. In particular, the results suggest that digital libraries need to support different learning styles by offering different types of help features, different formats of help, and different organization and presentation of help content. Further research will quantitatively test the exploratory results identified by the qualitative analysis.
References
1. Felder, R. M., & Silverman, L. K.: Learning and teaching styles in engineering education. Engineering Education 78(7), 674-681 (1988)
2. Ford, N., Wilson, T. D., Foster, A., Ellis, D., & Spink, A.: Information seeking and mediated searching. Part 4: Cognitive styles in information seeking. JASIST 53(9), 728-735 (2002)
3. Xie, I., & Cool, C.: Understanding help seeking within the context of searching digital libraries. JASIST 60(3), 477-494 (2009)
Evaluating User-System Interactions During the Search Process in Digital Libraries
Soohyung Joo1
1 School of Information Studies, University of Wisconsin-Milwaukee, P.O. Box 413, Milwaukee, WI 53201, USA
sjoo@uwm.edu
Abstract. This study aims to investigate in what ways and to what extent digital library (DL) systems support user-system interactions, focusing on the search process. Based on previous interactive information retrieval (IIR) models, a multi-tiered evaluation framework has been developed to assess system support for users’ application of search tactics in the physical, cognitive and affective dimensions. In addition, the study plans to investigate how system support influences IR outputs and outcomes. This proposal summarizes the conceptual IIR evaluation framework for DLs, the research design and methods, and some preliminary findings.
1 Introduction
In digital library evaluation, the major concerns in both research and operational DL practice have been usage, service quality, interface design and usability, and collections [1]. Less research has been done on IIR evaluation in DL settings. Interactive Information Retrieval (IIR) evaluation focuses on users’ behaviors and experiences at the physical, cognitive and affective levels, and on the interactions that occur between users and systems [2]. This study proposes a process-driven IIR evaluation framework that assesses system support for users’ application of search tactics throughout the search process, rather than being limited to the predominant evaluation of search results. The framework is intended to assess in what ways and to what extent a DL system supports the different types of search tactics applied in achieving an information search task. The research questions (RQs) for this study are: 1) What types of system support do users need in order to apply search tactics during the search process? In what ways does the system support users’ application of search tactics during the IR process? 2) To what extent do DLs support users’ application of search tactics at three hierarchical levels (physical, cognitive and affective) during the search process? 3) How does system support affect IR outputs and outcomes, such as overall satisfaction, usefulness of search results, knowledge change, and aspectual recall?
2 Conceptual Evaluation Framework
The conceptual evaluation framework of this study is based on three aspects of IIR: 1) user engagement and system support; 2) search tactics in the IR process; and 3) hierarchical levels of interaction.
First, the conceptual framework views an IR process as an interaction between user engagement and system support. Both play important roles in applying search tactics [3]. To apply search tactics in IR processes, users need to be intellectually engaged while the system assists them by providing different system features. The evaluation framework focuses on two aspects of user-system interaction: in what ways users engage, and to what extent the system supports user engagement in applying different types of search tactics.
Second, the evaluation framework assumes that an IR process consists of users’ application of multiple search tactics. A search tactic refers to a move or moves, including search choices and actions, that users apply to achieve a specific objective in IR processes [4]. For this study, thirteen types of search tactics were adopted based on Xie and Joo’s [5] identification of search tactics: Creating search statement [Creat], Modifying search statement [Mod], Evaluating search results [EvalR], Evaluating individual items [EvalI], Accessing forward and backward [AccF / AccB], Exploring [Xplor], and Obtaining [Obt], among others. By incorporating these thirteen types of search tactics into the evaluation practice, the framework covers a wide variety of user-system interactions.
Third, the evaluation framework poses three hierarchical levels of measurement. Based on Kuhlthau’s [6] ISP model, this study attempts to measure user-system interactions in DLs at three hierarchical levels: 1) at the physical level, users’ application of search tactics and the corresponding system support will be explored; 2) at the cognitive level, users’ perceptions of system support and engagement will be measured; and 3) at the affective level, user satisfaction with system support will be investigated. Specific evaluation criteria and feasible measures are suggested in this study.
3 Methodology
To answer the three research questions and to test the conceptual evaluation framework empirically, a user study was designed. The study applies the framework to an actual IIR evaluation of the US Library of Congress Digital Collections (LOCDL). LOCDL represents one of the national-level DLs in academia in the United States. Sixty student subjects will be recruited from the University of Wisconsin-Milwaukee. Multiple methods are used to collect the data: 1) pre-questionnaires; 2) search logs; 3) think-aloud protocols; 4) post-questionnaires. Three different types of search tasks are designed for the study: known-item search, specific information search, and subject-oriented search (aspectual recall-driven search).
For data analysis, both qualitative and quantitative methods are used. Qualitative analysis, in particular open coding and content analysis, will be applied to identify different types of user engagement and the associated system support. Quantitative analysis, including inferential statistical tests, will be employed to analyze search tactic patterns, to quantify the amount of system support at the different hierarchical levels, and to examine the effects of system support on IR outcomes and outputs.
4 Preliminary Results
As of June 20th 2012, thirty-eight subjects had participated in this study. This proposal summarizes some preliminary findings from the tentative analysis of those thirty-eight subjects. As to RQ1, initial types of user engagement and system support for each type of search tactic were identified using open coding. For example, for Creat tactics, types of user engagement include “Convert user need to a search statement,” “Determine search strategies,” and “Manipulate search statements,” among others, while the corresponding types of system support include “Provide interactive search mechanisms,” “Suggest different search strategies,” and “Offer different search fields/facets,” among others. For all thirteen types of search tactics, this study will identify types of user engagement and associated system support from observations of users’ search processes. Moreover, specific system features and interaction examples will be presented in the final results.
As to RQ2, user engagement was examined through users’ application of search tactics. For example, the preliminary findings revealed that, on average, users spent the most time on EvalI and EvalR tactics during their search sessions in LOCDL. System support was also measured at the physical, cognitive, and affective levels based on the analysis of search logs and post-search questionnaires.
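One way to arrive at the kind of "time spent per tactic" figure reported for RQ2 is sketched below. The log layout (each session as a list of tactic code and duration pairs), the function name and the durations are hypothetical and invented for illustration; the study's actual search-log format is not described in this proposal.

```python
from collections import defaultdict

def mean_time_per_tactic(sessions):
    """Average seconds per session spent on each tactic.

    Each session is a list of (tactic_code, seconds) pairs; a tactic that is
    absent from a session contributes zero seconds for that session.
    """
    if not sessions:
        return {}
    totals = defaultdict(float)
    for session in sessions:
        for tactic, seconds in session:
            totals[tactic] += seconds
    return {tactic: total / len(sessions) for tactic, total in totals.items()}

# Hypothetical sessions; the durations are invented, not study data.
sessions = [
    [("Creat", 20), ("EvalR", 60), ("EvalI", 90)],
    [("Creat", 10), ("Mod", 15), ("EvalR", 40), ("EvalI", 110)],
]
means = mean_time_per_tactic(sessions)
print(max(means, key=means.get))  # tactic with the largest mean time
```

Averaging over sessions rather than over tactic occurrences keeps a tactic that is used briefly but very often from being understated.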
As to RQ3, we found that system support was associated with IR outcomes and IR outputs. In particular, according to the tentative correlation analysis, system support for Creat, Mod, EvalR, EvalI, and Xplor tactics was significantly correlated with knowledge increase, aspectual recall, usefulness of search results, and satisfaction with search results.
This ongoing study will continue to collect data until at least sixty subjects have participated, in order to achieve adequate statistical power for multiple regression.
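As an illustration of the kind of correlation analysis mentioned for RQ3, the following minimal sketch computes a correlation coefficient between a system support score and an IR outcome score. It assumes Pearson's r, which the text does not specify, and the function name and the per-subject scores are hypothetical values invented for illustration. Significance testing is not included.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length samples."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two samples of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sums of squares around the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / sqrt(var_x * var_y)


# Hypothetical per-subject scores, invented for illustration only.
support_scores = [2.0, 3.5, 4.0, 4.5, 5.0]
satisfaction_scores = [3.0, 3.0, 4.0, 5.0, 4.5]
print(round(pearson_r(support_scores, satisfaction_scores), 3))
```

A value near 1 or -1 indicates a strong linear association; whether an association is significant additionally depends on the sample size.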
References
1. Joo, S.; Xie, I.: Evaluation Constructs and Criteria for Digital Libraries: A Document Analysis. In: Cool, C., Ng, K.B. (Eds.) Recent Developments in the Design, Construction and Evaluation of Digital Libraries. Hershey, PA: IGI Global (in press)
2. Kelly, D.: Methods for Evaluating Interactive Information Retrieval Systems with Users. Foundations and Trends in Information Retrieval 3(1-2), 1-224 (2009)
3. Xie, I.: Supporting Ease-of-use and User Control: Desired Features and Structure of Web-based Online IR Systems. Information Processing and Management 39, 899-922 (2003)
4. Xie, I.; Joo, S.: Factors Affecting the Selection of Search Tactics: Tasks, Knowledge, Process, and Systems. Information Processing and Management 48(2), 254-270 (2012)
5. Xie, I.; Joo, S.: Transitions in Search Tactics during the Web-based Search Process. Journal of American Society for Information Science and Technology 61(11), 2188-2205 (2010)
6. Kuhlthau, C.C.: Inside the Search Process: Information Seeking from the User’s Perspective. Journal of the American Society for Information Science 42, 361-371 (1991)
A Study on the Experts' and Users' Mental Model for Keyword Selection in Information Organization
Ya-Ning Chen and Hao-Ren Ke
Graduate Institute of Library and Information Studies, National Taiwan Normal University, Taipei, Taiwan
arthur@gate.sinica.edu.tw, clavenke@ntnu.edu.tw
Abstract. This study aims to explore the similarity of information organization behaviors between users and experts through the lens of mental models. Sixteen journals, 1,491 articles, 3,978 CiteULike tags, and 6,717 LISA descriptors were selected for analysis between 26 February and March 2011. Four research questions were investigated to examine the similarity of mental models for information organization: 1) correspondence between social tags and article titles, and between descriptors and article titles, in scholarly journals, 2) similarity between social tags and descriptors in scholarly journal articles, 3) the usage of keyword categories for social tags and descriptors and the similarity between them, and 4) implicit patterns and structures of keyword categories embedded in social tags and descriptors.
Keywords: social tags, descriptors, mental models, information organization
1 Introduction
With the widespread application of Web 2.0, social networking platforms offer users social tags and folksonomies to organize personal information resources. These platforms not only provide an opportunity to study how users organize resources for personal information management, but also seamlessly aggregate individual information organization into collective intelligence. Social tagging differs from information organization in library and information science (hereafter LIS) in that the latter is conducted in a top-down manner, whereas social tagging is bottom-up (grassroots). It has therefore become an emerging issue to study information organization behaviors in order to bridge the gap between users and experts.
In a study of students’ database query behaviors, users regarded keywords as concepts for retrieving documents from a database [1]. In addition to treating tags as keywords, taggers also use tags as concepts to organize personal information resources in the tagging process [2]. From the perspective of information organization, keywords have been regarded as both concepts and tools for information organization, as in subject headings, authority files, and thesauri.
Mental models are defined as “people’s mental representation of information objects, information systems, and other information related processes” [3]. In addition to their application in information retrieval and information systems, mental models are also adopted in information organization to explore users’ cognition of how information is organized [4]. Mental models have further been employed to examine whether users and information organization experts have a similar cognitive understanding of FRBR for metadata description [5-6]. Therefore, if users and information organization experts share similar mental models, the gap between them can be narrowed to facilitate more effective information organization and retrieval.
This study aims to explore the similarities and differences in information organization practices and behaviors between users and experts through the lens of mental models, in order to provide suggestions for information organization. The study proposes the following research questions:
• RQ1: What is the correspondence between social tags and article titles, and descriptors and article titles in scholarly journals?
• RQ2: What is the similarity between social tags and descriptors in scholarly journal articles?
• RQ3: What is the usage of keyword categories for social tags and descriptors and similarity between them?
• RQ4: What are the patterns and structures of used keyword categories embedded in social tags and descriptors, and similarity between them?
2 Methodology
This study selected LIS journal articles from CiteULike and LISA as its target. Journals were selected according to the following criteria: relative prominence in LIS as indexed by the Journal Citation Reports and LISA, overall coverage of both theory and practice, and fit with the authors’ domain knowledge. The study thus collected social tags of LIS journal articles from CiteULike and descriptors from LISA between 26 February and March 2011. Journals were sorted in alphabetical order, and each article was given a sequential number. Each article’s tags were placed in a column adjacent to the article title, and its descriptors in a further column next to the tags. In total, 16 journals, 1,491 articles, 3,978 tags, and 6,717 descriptors were selected for analysis.
Based on the concept of application profiles [7], the study adopted the Tag Category Model [8] and matching categories [9] as a framework to develop two classification schemes. One is used to classify tags and descriptors into categories, and the other to compare the similarity of terms between tags and descriptors. Three individuals with LIS backgrounds were divided into three groups to undertake the in-depth analysis for the three parts of this study (RQ1-RQ3), and the authors coded all three parts. Agreement reached the substantial level of Cohen’s kappa [10]. Where differences existed between the LIS individuals and the authors, a meeting was convened to reach agreement and ensure inter-rater reliability. Furthermore, social network analysis (SNA) and frequent pattern (FP) tree analysis were employed to investigate the implicit patterns and structures of keyword categories embedded in social tags and descriptors (RQ4).
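The agreement check described above can be sketched as follows. This is a minimal illustration of Cohen's kappa for two coders, not the authors' actual procedure; the function name and the category labels are hypothetical. On the Landis and Koch scale [10], values from 0.61 to 0.80 are read as substantial agreement.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters who labeled the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(rater_a)
    # Observed agreement: share of items given the same label by both raters.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of the raters' marginal label proportions.
    count_a = Counter(rater_a)
    count_b = Counter(rater_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1:
        return 1.0  # both raters used a single identical label throughout
    return (observed - expected) / (1 - expected)


# Hypothetical category codes assigned by an LIS coder and by an author.
coder = ["title-exact", "topic-general", "title-exact", "topic-general", "other"]
author = ["title-exact", "topic-general", "title-exact", "other", "other"]
print(round(cohens_kappa(coder, author), 3))
```

Unlike simple percent agreement, kappa corrects for the agreement two coders would reach by chance given how often each uses every category.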
3 Preliminary Results
In total, 31.95% of CiteULike tags and 18.37% of LISA descriptors were identical to corresponding keywords in article titles. An inverse J-shaped curve shows that both tags and descriptors follow a Zipfian power-law distribution, and the usage of tag and descriptor categories echoes a similar distribution. The top eight tag categories and the top eight descriptor categories are both in line with the 80/20 rule, accounting for the majority of usage, but the ranking orders of the categories differ. The Zipfian curve of descriptor categories is much steeper than that of tag categories, indicating that descriptor usage is concentrated in fewer categories than tag usage. The most popular tag category is category 01 (i.e., title-exact), with a usage of 31.95%, whereas category 09 (i.e., topic-general) is the most popular descriptor category, with a usage of 41.16%. When related categories are grouped together, the ranking order and usage of categories differ between tags and descriptors.
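The concentration of usage in the top-ranked categories can be checked with a simple cumulative share, sketched below. The function name and the sample labels are hypothetical and serve only to illustrate the calculation; they are not the study's data.

```python
from collections import Counter


def top_k_share(labels, k):
    """Share of all category assignments covered by the k most used categories."""
    if not labels:
        raise ValueError("labels must not be empty")
    counts = Counter(labels)
    total = sum(counts.values())
    top = sum(count for _, count in counts.most_common(k))
    return top / total


# Hypothetical category codes assigned to ten tags.
tag_categories = ["01"] * 6 + ["09"] * 3 + ["05"]
print(top_k_share(tag_categories, 2))
```

A steeper rank-frequency curve shows up here as a higher share for the same k, which is one way to compare how concentrated descriptor usage is relative to tag usage.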
In the term comparison between tags and descriptors, non-matches were the most common category, accounting for over half of the cases. Partial matches ranked second and were more common than exact matches. In terms of SNA, the centrality, clusters, co-used groups, and structurally equivalent roles of keyword categories differed between tags and descriptors. Based on the path-based rules from the FP-tree analysis, taggers’ collective mental model of keyword selection is much shallower than that of information organization experts. This means that experts are inclined to use more keywords and categories than taggers to represent the concepts of information objects.
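The comparison of a tag against an article's descriptors can be sketched as a three-way classification. The matching rules below are assumptions made for illustration (exact: identical after case and whitespace normalization; partial: at least one shared word); the study's own scheme, derived from the matching categories in [9], may define them differently.

```python
def normalize(term):
    """Lowercase a term and collapse runs of whitespace."""
    return " ".join(term.lower().split())


def match_category(tag, descriptors):
    """Classify a tag as 'exact', 'partial' or 'non-match' against descriptors."""
    tag_norm = normalize(tag)
    desc_norm = [normalize(d) for d in descriptors]
    if tag_norm in desc_norm:
        return "exact"
    tag_words = set(tag_norm.split())
    # Assumed rule: sharing any word with a descriptor counts as a partial match.
    if any(tag_words & set(d.split()) for d in desc_norm):
        return "partial"
    return "non-match"


# Hypothetical tags and descriptors for one article.
descriptors = ["Digital libraries", "Information retrieval"]
for tag in ["digital libraries", "retrieval", "folksonomy"]:
    print(tag, "->", match_category(tag, descriptors))
```

Counting the three outcomes over all tags gives the kind of distribution reported above, in which non-matches dominate.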
4 Conclusion
This study has succeeded in developing two classification schemes to examine the similarity of information organization behaviors between users and experts. However, a further in-depth study is needed to explore the similarity of mental models in keyword association for information organization.
References
1. Holman, L.: Millennial students’ mental models of search: Implications for academic librarians and database developers. Journal of Academic Librarianship, 37(1), 19-27 (2011)
2. Smith, G.: Tagging: People-powered metadata for the social web. Berkeley, CA: New Riders (2008)
3. Zhang, Y.: Undergraduate students’ mental models of the web as an information retrieval system. Journal of the American Society for Information Science and Technology, 59(13), 2087-2098 (2008)
4. Ahlstrom, V., Allendoerfer, K.: Information organization for a portal using a card-sorting technique (2004), http://hf.tc.faa.gov/technotes/dot-faa-ct-tn04-31.pdf
5. Pisanski, J., Žumer, M.: Mental models of the bibliographic universe. Part 1: mental models of descriptions. Journal of Documentation, 66(5), 643-667 (2010)
6. Pisanski, J., Žumer, M.: Mental models of the bibliographic universe. Part 2: Comparison task and conclusions. Journal of Documentation, 66(5), 668-680 (2010)
7. Heery, R., Patel, M.: Application profiles: mixing and matching metadata schemas. Ariadne, 25 (2000), http://www.ariadne.ac.uk/issue25/app-profiles
8. Heckner, M., Mühlbacher, S., Wolff, C.: Tagging tagging: Analysing user keywords in scientific bibliography management systems. Journal of Digital Information, 9(2) (2008)
9. Carlyle, A.: Matching LCSH and user vocabulary in the library catalog. Cataloging & Classification Quarterly, 10(1/2), 37-63 (1989)
10. Landis, J.R., Koch, G.G.: The measurement of observer agreement for categorical data. Biometrics, 33(1), 159-174 (1977)
Influence Detection Within The Blogosphere
Luke Kien-Weng Tan
Wee Kim Wee School of Communication and Information Nanyang Technological University, Singapore
w080078@ntu.edu.sg
Introduction
A weblog, or blog, is a specialized web site that allows individuals to express their thoughts, voice their opinions, and share their experiences and ideas. The easy access and availability of blog sites (e.g., www.blogspot.com) have encouraged web users to change from consumers to providers of information. Providers of such content exert a certain level of influence on the receivers, as is evident from the effect blog sites have on their readers’ purchase decisions (e.g., www.engadget.com), attitudes and approaches to life (e.g., www.lifehack.org), political viewpoints (e.g., www.huffingtonpost.com), and more. The Merriam-Webster dictionary (www.merriam-webster.com) defines influence as “the power or capacity of causing an effect in indirect or intangible ways”. Influence is a characteristic of an individual that defines the capacity to exert some effect on other individuals [3]. A blogger is influential if he or she has the capacity to affect the behavior of fellow bloggers. The ability to detect influence in the blogosphere could be used to identify influential bloggers and the chain of information flow. Through this, further stimulus could be added to aid the flow of positive information, or pre-emptive and preventive actions could be taken to minimize any negative impact.
Previous studies linked information propagation and influence to blog features, which are mainly graph-based, such as the number of in-links and out-links [1, 2]. However, the use of blog features alone to detect influence in the blogosphere may not yield highly accurate results, because influence is a subjective concept and often depends on the context of the posting. More recent studies have used sentiment analysis on links between blogs to detect influence [8, 9]. These studies focused on the single notion that influence exists in the links between blog posts, and did not study the influence types and styles of the blog sites in detail. Moreover, influence is a complex concept that cannot be described using a simple directional quantity derived from the presence or absence of links.
The aim of this study is to develop an influence detection model to automatically detect the influence flow within the blogosphere. An approach that combines relevant blog features and sentiment analysis is used to analyze the blog influence styles, and appropriate techniques are applied to determine the influence between linked blog posts. The research questions posed are:
• What are the blog features that could indicate the influence within the blogosphere?
• Will automatic influence style analysis help detect influence propagation within the blogosphere?
To answer the research questions, the following research objectives will be pursued:
• Determine the blog features that show influence within the blogosphere
• Establish a linguistic approach to improve sentiment analysis between linked blog posts
• Determine the blog sites and bloggers influence styles
• Develop an influence detection model to detect the influence in the blogosphere
Literature Review
In the study by Agarwal and Liu [2], an influential blogger is defined based on the number of in-links to the post and the post length. A high authority value, which refers to a larger number of in-links to the blog postings, may indicate higher readership. However, higher readership does not necessarily imply influence. For example, bloggers from an opposing community group would also tend to read and link to high-authority blogs from the other group without being influenced by the linked posts. Moreover, the blogs of the two communities may even express opposing viewpoints. Similarly, length of posts was equated with the eloquence of the blogger, and an eloquent blogger was deemed influential. Post length may be a subjective indicator of influence, as it depends on the writing style of the blogger. The quality of the blog content would instead provide clearer evidence of influence. Analyzing blog features alone may not be sufficient to yield highly accurate results in detecting influence within the blogosphere, and other studies have included content analysis in their approach. For instance, Adar and Adamic [1] considered text similarity in addition to blog features to determine influence. Most content analysis approaches for influence detection measured document similarity between linked blog posts using information retrieval techniques such as cosine similarity. These approaches mainly checked whether the linked blog posts were discussing the same topic. Instead, further contextual information, namely the sentiments expressed within the post, should be analyzed to improve influence detection performance.
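To make the document similarity measure mentioned above concrete, the following minimal sketch computes cosine similarity between two posts. It assumes raw term-frequency vectors over whitespace-separated, lowercased tokens; the cited approaches may use other weightings such as tf-idf, and the function name and sample posts are hypothetical.

```python
from collections import Counter
from math import sqrt


def cosine_similarity(text_a, text_b):
    """Cosine similarity of two texts using raw term-frequency vectors."""
    vec_a = Counter(text_a.lower().split())
    vec_b = Counter(text_b.lower().split())
    # Dot product over the terms the two texts share.
    dot = sum(vec_a[t] * vec_b[t] for t in vec_a.keys() & vec_b.keys())
    norm_a = sqrt(sum(c * c for c in vec_a.values()))
    norm_b = sqrt(sum(c * c for c in vec_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty text shares nothing with any other text
    return dot / (norm_a * norm_b)


# Hypothetical linked and linking posts that share a topic but not an opinion.
linked_post = "the new phone has a great camera"
linking_post = "the new phone has a terrible camera"
print(round(cosine_similarity(linked_post, linking_post), 3))
```

The two sample posts score as highly similar even though they express opposite opinions, which illustrates why topic similarity alone cannot tell agreement from disagreement.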
Sentiment analysis is a type of content analysis which aims to identify opinions, emotions, and evaluations expressed in natural language [13]. Earlier studies of sentiment analysis at the sentence or phrase level relied on the notion that an opinion word associated with an aspect or feature would appear in its vicinity [7]. However, opinionated text can be written in elaborate styles in which sentences have nested clauses, with the related opinion and subject words hidden in separate clauses. Methods using word distances [5] or part-of-speech patterns [14] cannot detect the relationship between the opinionated words in such cases, as they typically assume that the opinion and aspect words appear within a certain distance of one another. Moreover, the grammatical relationship between the words has largely been ignored. Recent sentiment analysis research has focused on the functional relations of words using typed dependency parsing, which provides a refined analysis of the grammar and semantics of textual data. Heuristics are typically used to determine typed dependency polarity patterns, which may not comprehensively identify all possible rules [11, 12]. In this study, the use of class sequential rules (CSR) [10] is explored to automatically learn the typed dependency patterns for sentiment prediction. More recent studies have used sentiment analysis on links between blogs to detect influence. Li et al. [9] considered the positive and negative edges of the nodes in their attempt to detect influence in the social network. Leskovec et al. [8] similarly adapted a framework of trust and distrust in an attempt to infer the attitude of one user toward another using the observed positive and negative relations. These studies focused on the single notion that influence exists in the links between blog posts, and did not study the influence styles of blog sites in detail. In this study, the influence styles of the blog sites and bloggers are determined to better describe the influence exerted.
Research Methodology
A blogosphere can contain numerous posts, resulting in a complex network of influence between the blogs. In our study, the focus is on detecting influence propagation between a linked blog post and the linking blog post. To understand the influence propagation characteristics of blog posts, the archives of product review blog sites will be used. The figure below gives an overview of the research methodology.
[Figure: Linked Blog (A) and Linking Blog (B); (1) Blog Features Analysis; (2) Sentiment Analysis: 1. Typed Dependency Rules, 2. Sentence-Level Sentiment Polarity Prediction Rules, 3. Semantic Analysis; (3) Influence Styles Detection; (4) Influence Propagation Detection]
Fig. Overview of research methodology
In step (1), the blog features that are useful in detecting influence between the linked blogs will be explored. In step (2), the blog content is analyzed to predict the sentiments expressed within it, so that the context of the blog posts is taken into account. A linguistic approach that leverages typed dependency rules and further considers the complex phrase relationships found in a sentence is proposed to improve sentiment analysis performance. Subsequently, in step (3), blog features analysis and sentiment predictions will be combined to profile the blog sites and bloggers using influence styles. The influence styles further describe influence through the engagement, persuasion, and persona of the blog sites and bloggers. Engagement style refers to the participation and involvement level of the bloggers towards the blogs. Cialdini and Goldstein [4] defined persuasion as a process of influence through appeals to reason or emotion, which we evaluate in the persuasion style analysis. Kelman [6] defined compliance as an influence process; here it refers to the agreement expressed between linked blog sites. In the persona analysis, a blog site’s persona is a measurement of this compliance. The influence styles, together with the relevant blog features, are then used in step (4) to detect influence propagation.
The contribution of this research is a novel approach to detecting influence propagation within the blogosphere by analyzing the sentiments expressed in the blog posts and the influence styles of blog sites and bloggers. Unlike previous studies, the proposed approach will automatically generate the influence styles of the blog sites and bloggers, using identified blog features and a linguistics-based sentiment analysis. In addition, the novel idea of using influence style as a parameter to detect influence propagation within the blogosphere will be explored. As influence is a subjective and complex concept, it is believed that describing influence in detail through influence styles will improve influence detection performance by providing an in-depth analysis of influence.
References
1. Adar, E., Adamic, L.A.: Tracking Information Epidemics in Blogspace. In: Conference on Web Intelligence, 207-214 (2005)
2. Agarwal, N., Liu, H.: Blogosphere: Research issues, tools, and applications. In: SIGKDD Explorations Newsletter, 10(1), 18-31 (2008)
3. Agarwal, N., Liu, H.: Modeling and Data Mining in Blogosphere. Morgan & Claypool, San Rafael, CA (2009)
4. Cialdini, R.B., Goldstein, N.J.: Social Influence: Compliance and Conformity. Annual Review of Psychology, 55, 591-621 (2004)
5. Hu, M., Liu, B.: Mining and summarizing customer reviews. In: International Conference on Knowledge Discovery and Data Mining, 168-177 (2004)
6. Kelman, H.C.: Compliance, identification, and internalization: Three Processes of attitude change. Journal of Conflict Resolution, 2(1), 51-60 (1958)
7. Kim, S.M., Hovy, E.: Determining the sentiment of opinions. In: International Conference on Computational Linguistics, ACL, 1367-1373 (2004)
8. Leskovec, J., Huttenlocher, D., Kleinberg, J.: Predicting Positive and Negative Links in Online Social Networks. In: World Wide Web, ACM, 641-650 (2010)
9. Li, H., Bhowmick, S.S., Sun, A.: CASINO: Towards Conformity-Aware Social Influence Analysis in Online Social Networks. In: Conference on Information and Knowledge Management, ACM, 1007-1012 (2011)
10. Liu, B.: Web Data Mining: Exploring Hyperlinks, Contents and Usage Data. Springer Berlin Heidelberg, New York, 37-54 (2006)
11. Shaikh, M.A.M., Prendinger, H., Ishizuka, M.: Sentiment Assessment of Text By Analyzing Linguistic Features And Contextual Valence Assignment. Applications of Artificial Intelligence, 22(6), 558-601 (2008)
12. Thet, T.T., Na, J.C., Khoo, C.S.G.: Aspect-Based Sentiment Analysis of Movie Reviews on Discussion Boards. Journal of Information Science, 36(6), 823-848 (2010)
13. Wiebe, J.: Tracking point of view in narrative. Computational Linguistics, 20(2), 233-287 (1994)
14. Yi, J., Nasukawa, T., Bunescu, R., Niblack, W.: Sentiment Analyzer: Extracting Sentiments about a Given Topic using Natural Language Processing Techniques. In: International Conference on Data Mining, 427-434 (2003)
Knowledge Management in Indigo Dyed Cloth Community Enterprises Adopting a Value Chain Approach
Sutisa Songleknok1, Smarn Loipha2, Chollabhat Vongprasert3
1Faculty of Humanities and Social Sciences, Khon Kaen University, Thailand
songleknok_sutisa@yahoo.com
2Associate Professor, Information and Communication Management Program, Faculty of Humanities and Social Sciences, Khon Kaen University, Thailand
smarn@kku.ac.th
3Assistant Professor, Information and Communication Management Program, Faculty of Humanities and Social Sciences, Khon Kaen University, Thailand
chat045@yahoo.com
Abstract. This research aims (1) to examine knowledge management of indigo dyed cloth community enterprises based on a value chain concept, (2) to examine factors supporting knowledge management and community enterprise management of indigo dyed cloth community enterprises, and (3) to develop a knowledge management model for indigo dyed cloth community enterprises based on a value chain concept. A qualitative research method was used. The research area comprised two indigo dyed cloth community enterprise groups in Sakon Nakhon province: Baan Tam Tao Housewife Farmers Group, Samak-kee Patana sub-district, Arkat-amnuay district, and Tee Ta natural indigo dyed group, Nahuabor sub-district, Pannanikom district.
The expected outcome is a knowledge management model based on a value chain concept that corresponds to the context of the indigo dyed cloth community enterprises in Sakon Nakhon province. The model describes the relationship between knowledge management processes, business management, and the enabling factors of knowledge management and community enterprises.
Keywords : Knowledge management, Community enterprises, Value chain, indigo dyed cloth
1 Background
In Thailand, community enterprises are founded by community organizations, which are started by people in the community. The businesses are run by the community, for the community, and with community funds (Petprasert, 1999; Pipatseritham, 2004) to increase members’ incomes and to improve the living conditions and quality of life of local people, who make up the majority of the country (Chommuang and Wasusophapon, 2003). Community enterprises have been supported by the public sector to encourage communities to set up their own businesses, such as Small and Medium Enterprises (SMEs) and One Tambol One Product (OTOP). The major emphasis of these businesses is to use local sources as raw materials and to draw on local wisdom as a basis for developing products and services, in order to create the identity of each community, add value to products, and preserve and extend local wisdom (Community Development Office in Sakon Nakorn, 2004; Chommuang and Wasusophapon, 2003; SCEB, 2003).
2 Statement of the problem
It has been found that some community enterprises are successful, producing quality products that have been granted the five-star OTOP standard, while some are less successful, some disappear from the market, and some are still attempting to improve the quality of their products to meet the standard set by the public authority. Reaching the standard takes a long time because communities often lack knowledge of product development. Besides, community enterprises encounter many problems. These include (1) limited business management skills, resulting in low competitiveness and low growth, with sales that just cover expenses and production costs (Allen, 1999; Petprasert, 1999); (2) product problems, which involve designing products that meet customers’ needs, as well as low quality and low standard products (Saenpot, 2003); (3) product advertising for enterprises that rely on or target outside markets (Thailand Productivity Institute, 2003); (4) labor, i.e., lack of skilled production staff, of in-depth knowledge training, and of specific knowledge for product expansion such as accounting, production development, law, and contracts (Thailand Productivity Institute, 2003; Pisaisawat, 1996); and (5) lack of access to necessary information such as marketing, raw materials, and financial sources (Thailand Productivity Institute, 2003). These factors clearly demonstrate problems in the knowledge management of community enterprises.
Knowledge management is vital for the development of community enterprises because knowledge is an important factor driving the economy and increasing competitiveness (Holsapple & Singh, 2001; Nonaka, 1998). Knowledge management reinforces continuous learning, which is crucial for the survival, maintenance, and sustainability of an organization’s excellence. With constant and systematic knowledge management, an organization can use its existing and stored knowledge for relearning and extend it to create innovations (Drucker, 1993; Wijan, 2003). Community enterprises should be viewed as a holistic scheme, from upstream through midstream to downstream production. To achieve this, a value chain concept should be adopted because it connects production activities from raw material sourcing and processing to delivery and customer service. This concept can help increase a business’s competitiveness and its product value in the customer’s eyes.
Based on a review of previous studies related to community enterprises in Thailand, it was found that previous studies mainly focused on six areas. The first area was how to solve problems related to community enterprise management by focusing on specific management issues (e.g., records of income and expenses) (Saenpot, 2003). Another area attempted to solve problems related to knowledge management in community enterprises through educational knowledge management, by creating a new curriculum or integrating knowledge management into the existing curriculum (Kemakorn, 2009; Mekwan, 2006). Previous studies also examined the knowledge management of successful community enterprises (Jonjoubsong, 2008; Phabu, 2002; Tinnaluck, 2004, 2005). The last research area analyzed community enterprises in order to understand their major and support business activities through a value chain concept (The Northeastern Strategic Institute of Khon Kaen University and Office of the Public Sector Development Commission, 2006). To be competitive in the market, community enterprises must improve their organization from upstream to downstream production. However, no studies have examined all activities of community enterprises.
For the purpose of this study, indigo dyed cloth community enterprises were selected because their identities reflect the characteristics of community enterprises. They have also adopted local wisdom, which is related to their ethnic group, as a basis for their production.
Natural indigo dyed cloth has gradually disappeared; currently only eight countries produce indigo dyed cloth. In Thailand, the major places producing indigo dyed cloth are in the northeast: Udon Thani, Sakon Nakhon, Mukdahan, and Chaiyaphum provinces. For this study, indigo dyed cloth sites in Sakon Nakhon province were selected because their identities reflect the characteristics of community enterprises. Sakon Nakhon province has recognized the importance of the wisdom of indigo dyed cloth. It has promoted and preserved its local wisdom and encouraged people to use this wisdom as part of their livelihood. Moreover, Sakon Nakhon province has promoted indigo dyed clothes as its provincial dress. The public sector has also encouraged Sakon Nakhon province to become the center of the indigo dyed cloth cluster. This policy is in line with the World Craft Council of UNESCO, which recognizes the importance of local arts and crafts and has set up a policy to preserve local arts and handicrafts for the restoration of indigo dyed cloth all over the world (Kenan Institute, 2006; Thongchern, 2006).
3 Objectives of the study
3.1 To examine knowledge management of indigo dyed cloth community enterprises based on a value chain concept
3.2 To examine factors supporting knowledge management and community enterprise management of indigo dyed cloth community enterprises
3.3 To develop a knowledge management model for indigo dyed cloth community enterprises based on a value chain concept
4 Scope of the study
The cases in this study will be purposively selected. The community enterprises to be selected are indigo dyed cloth community enterprises whose products are rated as five-star OTOP products. The community enterprises chosen are Baan Tam Tao Housewife Farmers Group, Samak-kee Patana sub-district, Arkat-amnuay district, and Tee Ta natural indigo dyed group, Nahuabor sub-district, Pannanikom district.
5 Conceptual framework
This study concerns the knowledge management of indigo dyed cloth community enterprises and deploys a value chain concept in order to understand indigo dyed cloth community enterprises in Sakon Nakhon. The major frameworks of this study are knowledge management and the value chain. Based on the value chain concept, an indigo dyed cloth community enterprise is divided into three processes: raw material supply, production of indigo dyed clothes, and marketing and sales promotion. This study will carefully examine knowledge management in each value chain stage, namely 1) the establishment of objectives for knowledge management and types of knowledge, 2) searching for and provision of knowledge both within and outside the community, 3) creation of new knowledge by examining phenomena and the knowledge building of each member, member groups, and community enterprises of other communities, 4) categorizing and systematically storing knowledge, 5) accessibility and retrieval of knowledge by members of the community enterprises, 6) transfer of knowledge among members, and 7) use and reuse of knowledge, including the methods by which knowledge is used and reused. To understand this phenomenon, environmental factors reinforcing success in this business will also be examined. These include cultural and local wisdom capital, social capital, and natural resource capital existing in the community. These production factors will be examined alongside the structure of community enterprise management, such as leaders, corporate culture, labor, local wisdom, marketing, rivals, community enterprise networks, budgets, and the cooperation of organizations in the public and private sectors, as shown in the conceptual framework figure.
Fig. The conceptual framework
To develop a knowledge management model for indigo dyed cloth community enterprises following a value chain concept, the study will use the results from researching social phenomena, the knowledge management procedures of each value chain stage, and the factors contributing to success in knowledge management and in running indigo dyed cloth community enterprises. It is expected that this model will increase the potential and competitiveness of the community enterprises.
6 Research methodology
This study will adopt a qualitative approach. The research methodology follows the objectives, as shown in the table below.
Table. Research methodology

Objectives | Sources | Methods | Results
1. To examine knowledge management of indigo dyed cloth community enterprises based on a value chain concept | Documents | Content analysis | Community profile
 | Key informants: government officers, local experts, leaders | Semi-structured interview |
 | Key informants | In-depth interview | KM process based on a value chain concept
2. To examine enabling factors supporting KM and management of indigo dyed cloth community enterprises | Key informants | In-depth interview | Enabling factors of KM and management of indigo dyed cloth community enterprises
3. To develop a knowledge management model for indigo dyed cloth community enterprises based on a value chain concept | Results of objectives 1 and 2 | Draw KM model based on a value chain concept | KM model for indigo dyed cloth community enterprises
 | Stakeholders | Focus group discussion | Confirmed KM model for indigo dyed cloth community enterprises
7 Expected outcomes
7.1 Obtain a model for knowledge management based on a value chain concept which corresponds to the context of the indigo dyed cloth community enterprises in Sakon Nakhon province.
7.2 Identify the enabling factors that are specific to knowledge management and community business management.
7.3 The model can be applied to other community enterprises in a similar context.
8 References
1. Allen, R.: Community enterprise: Civil society of the economic question. University of Birmingham, Birmingham (1999)
2. Chommuang, L., Wasusophapon, S.: Thai Lua weaving clothes: Community business for self sufficiency. Sangsan Publications, Bangkok (2003)
3. Community Development Office in Sakon Nakhon.: One Tambon One Product. Community Development Office in Sakon Nakhon, Sakon Nakhon (2004)
4. Drucker, P.F.: Post-capital society. Harper and Collins, New York (1993)
5. Holsapple, C.W., Sing, M.: The knowledge chain model: Activity for competitiveness. Expert System with Applications, vol. 20, pp. 77-98 (2001)
6. Jonjoubsong, L.: An integrated knowledge management model for community enterprises: A case study of a rural community enterprise in Thailand. Ph.D. Dissertation, School of Information Management, Faculty of Commerce and Administration, Victoria University of Wellington, New Zealand (2008)
7. Kemakorn, C.: Knowledge management organization model of Thai community business. Ph.D. Dissertation, Graduate School, Chiang Mai University (2009)
8. Keenan Institute.: Final report of cluster mapping to increase the competitiveness level of the production and service sector. Udomrat Printing and Design, Bangkok (2006)
9. Mekwon, C.: Knowledge management of community business groups in Roi-et. Master thesis in Social Studies, Graduate School, Khon Kaen University (2006)
10. Nindam, S.: A study of community business performances, shops at Highway Service Center of Kaophoe, Prachup Kerikhan. Master of Arts thesis in Rural Studies and Development, Graduate School, Mahidol University (2008)
11. Nonaka, I.: The knowledge-creating company. In: P.F. Drucker (Eds.), Harvard business review on knowledge management, pp. 21-45. Harvard Business Press, Boston (1998)
12. Petprasert, N.: Community business: Potential ways. Research Fund, Bangkok (1999)
13. Phabu, T.: Processes of knowledge transfer in silk weaving: A case study in Baan Tayuak, Tung Luang sub-district, Suwannaphum district, Roi-et. An Independent Study Report in Thai Studies, Mahasarakham University (2002)
14. Pipatseritham, K.: OTOP: Layman fighters and community marketing persons. AR Business Press, Bangkok (2004)
15. Pisaisawat, S.: Production and distribution of Mudmee-Kid Thai silk products: A case study in Udonthani. Journal of Research for Development, vol. 25(89), pp. 23-33 (1996)
16. Porter, M.E.: Competitive advantage: creating and sustaining superior performance: with a new introduction. The Free Press, New York (1985)
17. Saenpot, K.: Improvement of community business performances: A case study of handicraft products in Udon Thani. Ph.D. Dissertation, Graduate School, Khon Kaen University (2003)
18. SMCE.: Information system of community enterprises. Retrieved on November 11, 2011 from http://smce.doae.go.th/smce/1index.php?result=5 (2004)
19. Thailand Productivity Institute.: Educational report on the assessment of problems and needs of community enterprises. Niwatporn Printing, Bangkok (2003)
20. The Northeastern Strategic Institute of Khon Kaen University and Office of the Public Sector Development Commission.: Methods for value chain analysis: Jasmine rice for import. Khon Kaen (2006)
21. Tinnaluck, Y.: Knowledge creation and sustainable development: A collaborative process between Thai local wisdom and modern sciences. Ph.D. Dissertation, Department of Public Communication of Science and Technology, University of Poitiers, France (2005)
22. Tinnaluck, Y.: Modern science and native knowledge: Collaborative process that opens new perspective for PCST. QUARK, vol. 32, pp. 70-74 (2004)
23. Thongchern, P.: Reviving of indigo: An action research on the restoration of indigo dyed cloth of local Southern areas. Research Project and Area Development 5, Research Fund, Songkla (2006)
An Ontology for Automated Cloud Archiving Systems
Jan Askhoj, Shigeo Sugimoto, and Mitsuharu Nagamori
Graduate School of Library, Information and Media Studies, University of Tsukuba, 1-2 Kasuga, Tsukuba, Ibaraki 305-8550, Japan
Janaskhoej@gmail.com, {sugimoto,nagamori}@slis.tsukuba.ac.jp
This paper presents a domain ontology for cloud archives, based in part on the PREMIS Editorial Committee ontology for the PREMIS Data Dictionary. Our ontology's design is based on a layered model of cloud computing where lower layers provide shared services to higher layers, resulting in the creation of generic Submission Information Packages with PREMIS preservation metadata.
The ontology is designed to define a common vocabulary for cloud archives, and to define the roles and responsibilities for data creation and transfer, including the registration of cloud-based content creation systems. We define the classes, object properties, data properties and annotations necessary to describe the agents, objects, events and rights that comprise a cloud archive.
We evaluated the ontology with a prototype system, using real-world examples of cloud systems, digital objects and metadata. We found that the ontology was able to describe the chosen components successfully, and that it improved metadata interoperability between content creating applications and the services providing preservation metadata.
Keywords: Archives; Metadata; Preservation; Cloud Computing; PREMIS; OWL
1 Introduction
In recent years there has been a huge growth in the use of cloud computing for digital content (Leavitt, 2010). With this move to cloud computing, organisations are gaining a number of benefits, such as economies of scale, reduction of capital expenditure and on-demand scalability (Armbrust, 2010). However, the outsourcing of hardware, software and data storage to one or more third parties makes it more difficult to guarantee the long term preservation of archival content. Traditional models of archiving, such as the OAIS Model (CCSDS 2002), where data is sent in packages from a producer to an archive in complete control of the technological infrastructure, do not address the specific characteristics of cloud computing. We have previously identified four areas of cloud computing that differ from the OAIS approach (Askhoj, 2010; Askhoj, 2011).
To accommodate these differences, the entities of a cloud archive and their roles and dependencies must be formally defined, and the metadata necessary for preservation of Digital Objects must be identified, captured and stored.
2.1 A layered domain model of cloud computing
Our model is a layered model that builds on the way cloud computing can be divided into different Service Models, such as PaaS and SaaS. The layers are chosen to match current cloud service delivery models.
As a starting point, we assume that the cloud enables the sharing of computing services between Producer and Archive. These shared resources form the base layer of the model. Producer and Archive themselves are located in separate higher layers, with a Preservation Layer in between. The Preservation Layer is there to allocate storage for the Producer and direct the Archive towards that storage and the digital objects in it, to generate Submission Information Packages (SIP), and to ensure the successful transfer of Digital Objects and Metadata between Producer and Archive. Figure 1 shows an overview of our conceptual model.
We have used the National Institute of Standards and Technology (NIST) October 2011 definition of cloud computing (Mell, 2009). NIST not only defines the essential characteristics of the cloud, but also presents the three basic models of service delivery: Software as a Service (SaaS), Platform as a Service (PaaS) and Infrastructure as a Service (IaaS).
Figure 1. Conceptual Model of a Cloud Archive divided into layers
1. The PaaS layer (Layer 1) provides Cloud Storage, a trusted, long-term repository for simple bit-streams. These are primitive units that make no sense as information outside the context of a system that can read and represent them. The bit-streams can be parts of the Digital Objects that are the target for archiving, parts of metadata or system configuration data. The integrity of the data is guaranteed by the layer, preventing "bit decay".
2. The SaaS layer (Layer 2) holds the Creating Applications (i.e. applications for the creation of Digital Objects used by a Producer) that represent PaaS layer bit-streams to content users/creators as Digital Objects with associated metadata.
3. Digital Objects need accompanying metadata before they can be ingested into an archive (in other words, they need to be turned into Submission Information Packages). This metadata is needed to ensure long term preservation (for example information about provenance and structure), and is different from the metadata added in Layer 2, which is specific to the Creating Application, such as descriptive metadata (Dublin Core, MODS or similar). The Preservation Layer (Layer 3) creates SIPs that can be accessed by archive systems.
4. The Interaction Layer (Layer 4) is where agents (users or systems) access cloud systems to create, manage or archive Digital Objects using a browser or dedicated archive systems. Archive systems in the Interaction Layer ingest Submission Information Packages produced by the Preservation Layer.
The arrows on the right side of Figure 1 show the development of information as it moves between the layers. There is a progression of complexity in information, from the simple bit-streams in the PaaS layer to the complete Information Packages in the Interaction Layer. It should be noted that it is possible for a higher layer to interact with a layer one or more layers down (in effect "skipping a layer"). An example of this is the Preservation Service allocating storage in the PaaS layer.
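The layer numbering and the "skipping a layer" rule can be sketched in a few lines of Python. This is only an illustration: the names `Layer` and `may_call` are hypothetical, and the assumption that calls only ever go downwards is ours, not a statement from the model.

```python
from enum import IntEnum


class Layer(IntEnum):
    """Layers of the conceptual model, numbered as in the list above."""
    PAAS = 1          # Cloud Storage for bit-streams
    SAAS = 2          # Creating Applications
    PRESERVATION = 3  # creates Submission Information Packages
    INTERACTION = 4   # agents and archive systems


def may_call(caller: Layer, callee: Layer) -> bool:
    """Assumed rule: a layer may use any layer below it, adjacent or not."""
    return callee < caller


# The Preservation Service allocating storage skips the SaaS layer.
print(may_call(Layer.PRESERVATION, Layer.PAAS))  # True
print(may_call(Layer.PAAS, Layer.SAAS))          # False
```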
3 Objective
In a cloud environment, functionality in one or more of the layers may lie outside the control of the archiving organisation. It therefore becomes extremely important to describe the types of data produced and received by each layer. Without such information, it becomes impossible to abstract functionality, as there are no guarantees that the necessary data will be produced in the right format. We therefore defined a domain ontology for use in the design of a cloud archive system, as outlined in the conceptual model.
4.1 A Model Preservation System for Ontology Design
Figure 2 illustrates the information flow from Digital Object to Archive. In our model, the Archive System and Creating Application share a common storage platform. As storage needs to be reliable and long term, this is allocated by the Preservation Service. The Preservation Service serves as an abstraction layer between the Creating Application and the Archive. An Archive needs Digital Objects to be accessible long after the organisations that created them have disappeared.
In order to get access to storage and submit to the archive, Creating Applications need to register using a registration template originating from the Preservation Service. The registration is used to record information about the Digital Objects produced, any associated metadata schemas and the Creating Application itself. This information is needed for preservation purposes, and can be thought of as Static Preservation Metadata. We use the word static because the preservation related properties of the systems creating Digital Objects are expected to remain relatively consistent, changing only in case of major version upgrades or added functionality.
Once registration is complete, the Preservation Service allocates storage space for the Creating Application to save Digital Objects and for an Archive System to access these objects. This information is passed on to the Creating Application as a Registration Response containing the Storage URI, Path and Access Keys. The Creating Application can now submit Digital Objects to the Allocated Cloud Storage. Along with the Digital Objects, the following is saved: Original Metadata from the Creating Application and any Metadata about the Digital Objects required by the Preservation Service. This metadata is part of the Registration Response and, as it will be different for each Digital Object, we call it Dynamic Preservation Metadata.
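The registration exchange described above could be modelled roughly as follows. This is a minimal sketch, not the prototype's code: the class and field names, the `register` function, the placeholder storage endpoint and the placeholder key are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class RegistrationRequest:
    """Static Preservation Metadata supplied once by a Creating Application."""
    application_name: str
    object_formats: list[str]
    metadata_schemas: list[str]


@dataclass
class RegistrationResponse:
    """What the Preservation Service sends back after allocating storage."""
    storage_uri: str
    path: str
    access_key: str
    # Names of metadata the application must supply per Digital Object.
    dynamic_metadata_fields: list[str] = field(default_factory=list)


def register(request: RegistrationRequest) -> RegistrationResponse:
    """Hypothetical Preservation Service step: allocate storage, then reply."""
    safe_name = request.application_name.lower().replace(" ", "-")
    return RegistrationResponse(
        storage_uri="https://storage.example.org",  # placeholder endpoint
        path=f"/archive/{safe_name}/",
        access_key="EXAMPLE-KEY",                   # placeholder credential
        dynamic_metadata_fields=["originalName"],
    )


response = register(RegistrationRequest("Drupal Site", ["text/html"], ["Dublin Core"]))
print(response.path)  # /archive/drupal-site/
```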
The Static and Dynamic Metadata cover a large part of the information necessary for preservation purposes. Using the OAIS terminology, these two types combined deliver the Preservation Description Information and Representation Information necessary to create Submission Information Packages. These are types of information that cannot be generated automatically by the Preservation Service without input from the Producer (OCLC, 2002).
Once submission is complete, the Creating Application notifies the Preservation Service about the submission. The Preservation Service creates a Submission Information Package for the archive system containing a URI to the Digital Object and Preservation Metadata created from the following: the Static and Dynamic Preservation Metadata; event information automatically generated as part of the submission process; information resulting from analysis of the Digital Objects themselves; and any information extracted from the Original Metadata.
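As a rough illustration of how these sources come together, the sketch below assembles a package as a plain dictionary. The function name, the dictionary keys and all example values are hypothetical; letting dynamic values override static ones on a name clash is an assumption of the sketch, not something the model prescribes.

```python
def build_sip(object_uri, static_md, dynamic_md, events, analysis, extracted):
    """Assemble a Submission Information Package that references the object by URI."""
    return {
        "objectURI": object_uri,
        # Static (registered once) and dynamic (per object) preservation metadata;
        # dynamic values overriding static ones is an assumption of this sketch.
        "preservationMetadata": {**static_md, **dynamic_md},
        "events": list(events),
        "analysis": dict(analysis),
        "fromOriginalMetadata": dict(extracted),
    }


sip = build_sip(
    object_uri="https://storage.example.org/archive/drupal/report.pdf",
    static_md={"signatureMethod": "example-method"},
    dynamic_md={"originalName": "report.pdf"},
    events=["submission"],
    analysis={"sizeInBytes": 1024},
    extracted={"title": "Annual report"},
)
print(sorted(sip["preservationMetadata"]))  # ['originalName', 'signatureMethod']
```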
The functional entities Archive System, Preservation Service, Creating Application and Storage have been used as classes for our ontology, along with the information types produced, such as Information Package and Registration Request. We have used the information flow between the functional entities in the system to define the properties associated with the classes. These are shown in Figure 2 as the arrows between entities and as the contents of the different information types indicated by a line.
Figure 2. Model Preservation System
5 Ontology Design
5.1 Defining Class Aspects
The classes related to preservation metadata for Digital Objects have been taken directly from the PREMIS Editorial Committee OWL ontology draft. These are part of the PREMIS data dictionary, and need to be included (Gartner R, 2004).
We have used the entities from the PREMIS data model as super-classes (Agents, Events, Objects and Rights). These entities not only provide a convenient way to group classes, they can also be used to express class inference. For example, RightsGranted is a sub-class of Rights. The classes and sub-classes in the ontology are not intended to express property inheritance.
Using the PREMIS data model entities to group classes has the benefit of providing a second level of semantics, by incorporating relationship information from the PREMIS data model. For example, Agents are related to Objects, via either Events or Rights.
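The grouping of classes under the four PREMIS entities can be pictured with a small lookup structure. In this sketch only the RightsGranted/Rights pair comes from the text; the placement of the other classes is a hypothetical choice made for illustration, and `root_entity` is an invented helper name.

```python
# Sub-class -> super-class links; the four roots are the PREMIS data model entities.
SUBCLASS_OF = {
    "RightsGranted": "Rights",          # example given in the text
    "PreservationService": "Agents",    # hypothetical placement
    "CreatingApplication": "Agents",    # hypothetical placement
    "InformationPackage": "Objects",    # hypothetical placement
}
ROOTS = {"Agents", "Events", "Objects", "Rights"}


def root_entity(cls: str) -> str:
    """Follow sub-class links upward until a PREMIS entity is reached."""
    while cls not in ROOTS:
        cls = SUBCLASS_OF[cls]
    return cls


print(root_entity("RightsGranted"))  # Rights
```

A reasoner working on the OWL file would derive the same grouping from the sub-class axioms; the dictionary only mirrors that inference for a handful of classes.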
5.2 Class Extensions and Annotations
The PREMIS Editorial Committee ontology has been extended by a number of other metadata schemas, namely FOAF, SKOS Core, PRONOM, ORE and Dublin Core. We have decided not to use these for the purposes of this paper. This is partly because we have no current plans to use these schemas and partly to keep our own ontology as simple as possible.
The terms in the PREMIS data dictionary have annotations relating to their usage, such as Definition, Rationale, Creation/Maintenance Notes and Usage Notes (Woodyard-Robinson, 2007). We have added the annotation Layer to our classes. Layer is used to define where in the Model a class is located, and makes it possible to assign responsibility for the functionality in a class to an entity in a Layer.
5.3 Object and Data Property Aspects
Whereas classes are used to capture information about individuals and groups of individuals, Object Properties connect individuals, and Data Properties connect literals and individuals (W3C, 2009). Using these, we can show the information flow in our conceptual model.
For Object Properties (properties where the value is an individual) we have included the following annotations: the definition; the property domains and ranges; the domain/range relationship (functional or inverse functional); whether the property is mandatory; whether the property is repeatable; and other comments such as See Also and Usage Notes. For Data Properties (properties where the value is a literal), we have included the same information. In addition, as Data Properties are used for literals, we have included an annotation for Origin. Origin is used to define which entity in the Layered Model generates the Data Property literal. For example, contentLocationType is generated by the Preservation Service.
5.4 Using OWL as a Domain Description Language
We chose OWL (Web Ontology Language) to describe our domain. Compared to RDF, OWL offers better semantic expression and greater machine interpretability, and is therefore well suited to our purposes (McGuinness, 2004). Furthermore, an OWL ontology for the PREMIS Data Dictionary was announced on October 18, 2011. This ontology is not finalized at the time of writing, but the groundwork in defining the PREMIS semantics in OWL has been completed. The newly drafted standard is available for comment from the PREMIS Editorial Committee, and forms the basis of our ontology (LoC, 2011).
5.5 Extensibility
One of the main reasons for designing an ontology in OWL is cross-domain interoperability. By having a well-defined common vocabulary, individuals from different domains can be linked according to their semantics. OWL already has three constructs to do this: owl:sameAs, owl:differentFrom and owl:AllDifferent. We have come to the conclusion that these constructs are not enough to express the relationship between individuals in different PREMIS implementations. Good examples of this are PREMIS entities that are defined by locally controlled vocabularies. The entity may be the same but, due to differences in vocabulary use, using owl:sameAs may give rise to problems when exchanging data. We have therefore chosen to use the Simple Knowledge Organization System (SKOS) mapping properties to link individuals (Miles, 2005).
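To make the difference from owl:sameAs concrete, the sketch below writes a single SKOS mapping statement in N-Triples syntax. The text does not say which mapping property was used, so the choice between skos:closeMatch and skos:exactMatch here is an assumption, and the two vocabulary URIs are invented examples.

```python
SKOS_NS = "http://www.w3.org/2004/02/skos/core#"


def skos_mapping(subject_uri: str, object_uri: str, exact: bool = False) -> str:
    """Return one N-Triples statement linking two individuals with a SKOS mapping property."""
    prop = "exactMatch" if exact else "closeMatch"
    return f"<{subject_uri}> <{SKOS_NS}{prop}> <{object_uri}> ."


# Two locally controlled vocabularies naming what is arguably the same event type.
triple = skos_mapping(
    "http://archive-a.example.org/vocab/eventType/ingest",
    "http://archive-b.example.org/vocab/eventType/ingestion",
)
print(triple)
```

Unlike owl:sameAs, a statement of this kind does not assert that the two individuals are identical, so the properties of one are not inferred to hold for the other.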
6 Evaluation of the ontology using a case scenario
We have defined a case scenario, using existing cloud components to show how the ontology can be implemented. Our case scenario is very similar to the model preservation system from Figure 2. It contains the same main entities and information flow. Each entity is an individual from a class in the ontology, with the functionality of the individual explained in the class definition.
Individuals are linked to one or more layers, using the Layer annotation. The individuals themselves are linked via properties, with data properties expressed as strings.
6.1 Registration process
Based on the class description from our ontology, the Preservation Service is responsible for ensuring the validity and completeness of preservation metadata to create archive packages. As OWL does not specify any syntactic constraints, the Preservation Service provides an XML Schema registration template, to be populated by the owning organisation of the Creating Application (Drupal). Here, the class RegistrationResponse is used to define what data properties are related to the registration, and how the registration is related to other classes, such as Event outcome. The registered data can be automatically extracted using XPath and imported into the Preservation Service. Any errors or omissions in the populated template result in a negative registration.
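The extraction step might look like the following, using the limited XPath support in Python's standard library. The XML layout is entirely hypothetical (the actual registration template is not reproduced in this paper), as are the `timing` attribute and the example value of signatureMethod.

```python
import xml.etree.ElementTree as ET

# Hypothetical populated registration template; element and attribute names are assumptions.
REGISTRATION_XML = """
<registration application="Drupal">
  <property name="signatureMethod" timing="static">SHA-1</property>
  <property name="originalName" timing="dynamic"/>
</registration>
"""


def extract_registration(xml_text: str):
    """Split registered data properties into static values and dynamic names."""
    root = ET.fromstring(xml_text)
    static = {p.get("name"): p.text
              for p in root.findall("./property[@timing='static']")}
    dynamic = [p.get("name")
               for p in root.findall("./property[@timing='dynamic']")]
    return static, dynamic


static, dynamic = extract_registration(REGISTRATION_XML)
print(static)   # {'signatureMethod': 'SHA-1'}
print(dynamic)  # ['originalName']
```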
The registered data gives the Preservation Service the ability to validate the metadata provided by the Creating Application. This is done by ensuring that all Mandatory Data Properties with the origin Business System are either preregistered (static information such as signatureMethod) or designated as provided at time of creation (dynamic information such as originalName).
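The validation rule above can be written as a short check. The sketch below is an assumption-laden illustration: `DataProperty` and `missing_properties` are invented names, and the mandatory flags attached to the three example properties are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataProperty:
    name: str
    origin: str      # entity in the Layered Model that generates the literal
    mandatory: bool


# Property names come from the text; the mandatory flags are assumptions.
ONTOLOGY = [
    DataProperty("signatureMethod", "Business System", True),
    DataProperty("originalName", "Business System", True),
    DataProperty("contentLocationType", "Preservation Service", True),
]


def missing_properties(ontology, preregistered, provided_at_creation):
    """List mandatory Business System properties that are covered by neither source."""
    covered = set(preregistered) | set(provided_at_creation)
    return [p.name for p in ontology
            if p.mandatory and p.origin == "Business System" and p.name not in covered]


print(missing_properties(ONTOLOGY, {"signatureMethod"}, {"originalName"}))  # []
print(missing_properties(ONTOLOGY, {"signatureMethod"}, set()))  # ['originalName']
```

An empty result corresponds to a successful registration; any listed name would lead to a negative one.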
If the provided data meets the requirements, an XML response is sent back to the Creating Application from the Preservation Service, containing URI, path and access keys for the shared Cloud Storage (Amazon S3).
6.2 Conversion into Generic Submission Information Package
Once registration is complete, the Creating Application can save digital contents to the dedicated Cloud Storage. Digital contents consist of three parts: the Digital Objects in an agreed format; original metadata such as Dublin Core or MODS; and any preservation metadata not provided during registration (dynamic metadata). Once saved, the Preservation Service provides read access to these objects for the Creating Application and the archive system (DSpace). Since the storage platform is shared by the Creating Application and the Preservation Service, the Digital Objects themselves are not included in the SIP, but only referenced as links to Digital Objects in the Amazon S3 Storage. The ontology is used to validate the metadata provided by the Creating Application at the time of save.
Another benefit of the ontology lies in the linking of metadata from different creating applications to one authoritative schema. As long as the creating applications are registered with the correct metadata linking to the data properties in the ontology, complete Submission Information Packages can be created from applications with different metadata schemas.
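A minimal sketch of this linking, assuming each registered application declares how its own fields map onto ontology data properties. The application names, the source field names and the particular pairings are hypothetical; `objectIdentifierValue` and `formatName` are borrowed from PREMIS semantic unit names purely as example targets.

```python
# Hypothetical per-application mappings from original metadata fields to data properties.
FIELD_MAPPINGS = {
    "drupal": {"dc:identifier": "objectIdentifierValue", "dc:format": "formatName"},
    "wiki": {"page_id": "objectIdentifierValue", "mime": "formatName"},
}


def normalise(app: str, original_metadata: dict) -> dict:
    """Keep only mapped fields and rename them to the shared data property names."""
    mapping = FIELD_MAPPINGS[app]
    return {mapping[k]: v for k, v in original_metadata.items() if k in mapping}


a = normalise("drupal", {"dc:identifier": "node/42", "dc:format": "text/html", "dc:title": "Home"})
b = normalise("wiki", {"page_id": "Main_Page", "mime": "text/html"})
print(sorted(a) == sorted(b))  # True: both applications yield the same property names
```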
7 Discussion
Our major criterion for evaluation has been whether the ontology can be applied to real world data. We have evaluated the ontology by using values from existing cloud system components and data from a PREMIS version 2.1 Sample Record from LoC (http://www.loc.gov/standards/premis/louis-2-1.xml). The components used were Amazon S3 for Cloud Storage, two instances of Amazon EC2 with Ubuntu Linux 10.10 as SaaS Platform and Preservation Service platform, and Drupal as Creating Application.
We found that the ontology was descriptive enough to create a generic XML package with PREMIS metadata, including cloud specific entities such as platform descriptions. We were able to map instances to cloud layers, and to assign them to ontology classes. Using the OWL Object Properties we were also able to show relationships between entities: for example, which Agent is responsible for the creation/transfer of which Object. Finally, the SKOS properties allowed us to link a number of elements from the DC Metadata Element Set to PREMIS.
Based on the discussion above, we believe that the test system we built using our ontology meets the requirements presented in the introduction of this paper: it is possible for a Producer and an Archive to share a common platform, and to use data about this platform coupled with preregistered metadata to automatically create SIPs in a way that eases the burden of metadata provision for Producers. By assigning origin and layer information to each term in the ontology, it is possible to assign responsibility for the metadata to specific layers and entities.
Whereas the ontology is complete in its current version (subject to modification after further tests), the system we have built for testing purposes is still not mature and relies on a number of functions being carried out by hand. Once the system is complete, the next step will be to integrate the cloud components and perform a re-evaluation of the ontology using a larger set of test data.
8 Conclusion
In this paper, we have presented an OWL ontology for cloud archive systems built on the PREMIS Editorial Committee ontology combined with a layered model of cloud computing. We believe that the strength of the ontology lies in the fact that it describes a metadata model not only for Submission Information Packages, but also for the entities contributing to these packages. We believe this to be a benefit in a cloud system with multiple Creating Applications such as the one described in the paper. One reason for this is that a system with a large number of Creating Applications increases the chance that not all of these will be able to supply submission packages in the right format (this is also a problem for many non-cloud based systems). Another reason is that when different system entities share computing resources, such as storage, having a single model to describe these resources increases consistency. Furthermore, without a common vocabulary and information model, it is difficult to describe the different cloud entities that contribute to the creation of Information Packages in a manner consistent for preservation purposes.
We used the ontology to describe a number of cloud system components, such as platform, storage and creating application, together with a PREMIS version 2.1 Sample Record. In our model system, we found that the ontology was able to describe the chosen components successfully, and that it allowed some metadata interoperability between content creating applications and the preservation service. So far, our model system has provided a proof-of-concept by showing an example information flow between system entities. In future, we plan to create an integrated system that implements a storage controller to allow better abstraction of the Cloud Storage and a registration framework.
References
Armbrust M, et al (2010) A view of cloud computing. Communications of the ACM, Volume 53, Issue 4, April 2010, Pages 50-58. ACM, New York
Askhoej J, and Sugimoto S (2010) A Model for the Provision of Preservation Metadata as a Service. In Taipei, Taiwan: CiSAP. http://www.lis.ntu.edu.tw/cisap2010/Abstracts_files/ds05.pdf Accessed 15 Mar 2012
Askhoj J, Sugimoto S, Nagamori M (2011) Preserving Records in the Cloud. Records Management Journal 21 (3) 175–187. doi:10.1108/09565691111186858
CCSDS Secretariat (2002) Reference Model for an Open Archival Information System (OAIS). Blue Book Issue. The Consultative Committee for Space Data Systems
Gartner R (2004) PREMIS - Preservation Metadata Implementation Strategies Update 2: Core Elements for Metadata to Support Digital Preservation. RLG Diginews Article
Leavitt N (2010) Is Cloud Computing Really Ready for Prime Time? Computer 42 (1), January, 15-20
LoC (2011) PREMIS Data Dictionary for Preservation Metadata Version 2.1. PREMIS Editorial Committee. www.loc.gov/standards/premis/v2/premis-2-1.pdf Accessed 15 Mar 2012
McGuinness DL, and Van Harmelen F (2004) OWL Web Ontology Language Overview. W3C Recommendation 10. http://cies.hhu.edu.cn/pweb/~zhuoming/teachings/MOD/N4/Readings/5.3-B1.pdf Accessed 15 Mar 2012
Mell P, and Grance T (2009) The NIST Definition of Cloud Computing. National Institute of Standards and Technology 53
Miles A, Matthews B, Wilson M, Brickley D (2005) Skos Core: Simple Knowledge Organisation for the Web. In International Conference on Dublin Core and Metadata Applications, pp–3. dcpapers.dublincore.org/index.php/pubs/article/view/798 Accessed 15 Mar 2012
The Library of Congress (2011) PREMIS OWL Ontology Now Available. http://www.loc.gov/standards/premis/owlOntology-announcement.html Accessed 15 Mar
The OCLC/RLG Working Group on Preservation Metadata (2002) Preservation Metadata and the OAIS Information Model - A Metadata Framework to Support the Preservation of Digital Objects. Dublin, Ohio. http://www.oclc.org/research/activities/past/orprojects/pmwg/pm_framework.pdf Accessed 18 Aug 2012
W3C (2009) OWL Web Ontology Language Document Overview. W3C OWL Working Group. http://www.w3.org/TR/owl2-overview/ Accessed 15 Mar 2012
Wetteroth D (2001) OSI reference model for telecommunications. McGraw-Hill Professional
Woodyard-Robinson D (2007) Implementing the PREMIS Data Dictionary: a Survey of Approaches. Library of Congress. http://www.loc.gov/standards/premis/implementation-report-woodyard.pdf Accessed 15 Mar 2012
Development of GMS’s Intangible Cultural Knowledge Base Systems: Integrated Semantic Web Technology and Geographic Information System
Wirapong Chansanam1, Dr Kulthida Tuamsuk2, Dr Kanyarat Kwiecien3
1 PhD Candidate in Information Studies, Faculty of Humanities and Social Sciences, Khon Kaen University, Thailand (wirapongc@kkumail.com)
2 Associate Professor, Information and Communication Management Program, Faculty of Humanities and Social Sciences, Khon Kaen University, Thailand (kultua@kku.ac.th)
3 Lecturer, Information and Communication Management Program, Faculty of Humanities and Social Sciences, Khon Kaen University, Thailand (kandad@kku.ac.th)
Background and Rationale
The Greater Mekong Sub-region (GMS) is an economic region bonded by the Mekong River The land of this region covers 2.6 square kilometers with a population of 326 million people The GMS comprises Cambodia, People’s Republic of China (only Yunnan province and Guang–xi province), Lao People’s Democratic Republic, Myanmar, Thailand, and Viet Nam In 1992, with the assistance of ADB, the six countries entered into a program of sub-regional economic cooperation, designed to enhance economic relations among the countries The program and its projects are supported by contributions from the Asian Development Bank (ADB) and other sources The first priority projects in the region include transportation, energy, telecommunication, environment, human resource development, tourism, trading, and investment of both the private and agricultural sectors (Asian Development Bank, 2011)
During the past decades, cultural heritages have undergone several changes Part of it was caused by the use of management tools invented by UNESCO Currently, cultural heritages not limit to only the collection of objects for memory They also include tradition or practices related to living, passed down from ancestors to later generations Amid the world’s delicate situation and globalization, intangible cultural heritages are vital factors that help preserve cultural diversity An attempt to understanding intangible cultural heritages of people from different cultures helps increase communication across cultures, and, at the same time, helps encourage people to respect each other’s way of life The significance of the intangible cultural heritages are not simply that they are heritage culture, but because they are knowledge and skills continuously passed down from one generation to another generation Social and economic values of the transmission of knowledge reveal the relations of sub-social groups and major social groups in one culture This practice is remarkable important for developing countries because it leads to the development of human resources (UNESCO, 2011)
Currently, a number of intangible cultures are rapidly disappearing. This may be due to changes in society and culture, the development of large-scale industry, increasing tourism, the migration of rural people to big cities, and a changing environment. In this changing context, both the practice and the transmission of intangible cultures have been considerably affected. Announcing and registering intangible cultures is an important measure for raising people’s awareness of their unique value. It is also a means of honoring ancestors’ knowledge and wisdom and of promoting the cultural dignity and identity of all groups living throughout the country. Registration can also create understanding and acceptance of cultural diversity, leading to creative preservation and development that is both systematic and sustainable (Department of Cultural Promotion, 2012).
For most registered items, the knowledge domains and categories are drawn from the recognized categories of intangible culture. One starting point is the set of knowledge domains indicated in Section 2.2 of the convention concerning the protection of intangible cultures: oral traditions and expressions, including language as a medium of intangible cultural heritage; performing arts; social practices, ceremonies, and festivals; knowledge and practices about nature and the universe; and traditional crafts. These domains clearly do not cover all existing content. Any classification system serves only to better organize the list of registered items (UNESCO, 2011a).
However, collection and classification should not be restricted to a single scheme, such as the well-known Library of Congress Classification or the Dewey Decimal Classification, both of which are widely used on the Internet. For web technology, distribution frameworks such as the Warwick Framework (Lagoze, Lynch, and Daniel, 1996) have been proposed. This framework stresses the importance and necessity of metadata standards for interoperation. Therefore, the Resource Description Framework (RDF) should be considered when designing a working framework. RDF provides a metalanguage for building and using metadata on the web. RDF is one of the building blocks of the Semantic Web, a project of the World Wide Web Consortium (W3C). Under the concept of ontology, information and knowledge are held in a knowledge base (Fishwick and Miller, 2004), mainly in Extensible Markup Language (XML) format. Grigoris and Frank (2004) concluded that an ontology is a way to describe concepts within a domain of interest; in other words, it is the specification of a conceptualization. An ontology is a knowledge structure built for a specific domain whose users share concepts and understanding, and it can be used to categorize documents and data within that domain. Diversity and accuracy have been major points of criticism, in particular in relation to Geographic Information Systems (GIS). Ontology research is therefore conducted to meet the objectives of GIS from multiple perspectives (Schuurman, 2006; Schuurman and Leszczynski, 2006; Kitchen and Dodge, 2007).
Therefore, to develop an integrated semantic web technology and GIS for managing the GMS’s intangible cultural knowledge through shared information under an ontology-based approach, the structure of the knowledge data will be constructed. Definitions of similar knowledge sets from different institutions’ knowledge sources must be aligned and related, and they must carry the same meanings. This will form a knowledge base for data integration and for linking various data sources, on which the integrated semantic web technology and the GIS can then be built. The main aim of this study, therefore, is to find ways of managing knowledge sets for educators and others who are interested in making integrated semantic searches across regional or global institutes, organizations, and institutions. The study will adopt the ontology approach to give clear definitions of the data set structures shared among different data sources, so that they are mutually understandable and consistent in meaning. Through the use of the GIS, the sources of the data can be explained and the knowledge sets can be reused. So that computers can read and understand the words and concepts of intangible culture information, software agents will be permitted to access, analyze, and process the data, in order to cope with highly diverse data and to give access to the shared data anywhere and at any time.
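As a rough illustration of the kind of shared, machine-readable description the study has in mind, the sketch below stores heritage items as subject-predicate-object statements in the spirit of RDF and attaches coordinates that a GIS layer could plot. It is a minimal sketch in plain Python, not an RDF implementation: the `ex:` identifiers, the property names, the items, and the coordinates are all hypothetical placeholders, and only the domain labels are taken from the knowledge domains mentioned earlier.

```python
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)

# Hypothetical statements: identifiers, countries, and coordinates are
# placeholders chosen for illustration, not registered heritage items.
TRIPLES: List[Triple] = [
    ("ex:ItemA", "ex:domain", "performing arts"),
    ("ex:ItemA", "ex:country", "Thailand"),
    ("ex:ItemA", "ex:latitude", "0.0"),
    ("ex:ItemA", "ex:longitude", "0.0"),
    ("ex:ItemB", "ex:domain", "traditional crafts"),
    ("ex:ItemB", "ex:country", "Viet Nam"),
    ("ex:ItemC", "ex:domain", "performing arts"),
    ("ex:ItemC", "ex:country", "Cambodia"),
]


def subjects_with(triples: List[Triple], predicate: str, obj: str) -> List[str]:
    """Return the sorted subjects that have the given predicate/object pair."""
    return sorted({s for s, p, o in triples if p == predicate and o == obj})


def describe(triples: List[Triple], subject: str) -> Dict[str, str]:
    """Collect every property recorded for one subject."""
    return {p: o for s, p, o in triples if s == subject}


# Items from different (hypothetical) sources that share one domain label.
performing = subjects_with(TRIPLES, "ex:domain", "performing arts")
```

Because every source uses the same predicate and the same domain label, a single query retrieves items regardless of which institution contributed them, which is the point of agreeing on definitions first.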
Research Objectives
1. To examine the knowledge domains of the GMS’s intangible cultures
2. To construct an ontology of intangible cultures and an integrated semantic web by using the Geographic Information System
3. To construct the GMS’s intangible cultural knowledge base system by integrating semantic web technology and the geographic information system
Research Methodology
This study will adopt a Research and Development (R&D) approach. The details of the research methodology are shown in the following table.
| Objectives | Method | Tools/Procedures | Population/Sample | Outcomes |
|---|---|---|---|---|
| To examine the knowledge domains of GMS’s intangible cultures | Documentary research; qualitative research | Document analysis; in-depth interviews | Documents on GMS’s cultures; experts in GMS’s cultures | Knowledge domains |
| To construct an ontology of intangible cultures and an integrated semantic web by using the Geographic Information System | Qualitative research; research and development of ontologies | Structured interviews; ontology editor software | Experts in GMS’s cultures; ontology experts | An ontology of GMS’s intangible cultures |
| To construct the GMS’s intangible cultural knowledge base systems by integrated semantic web technology and geographic information system | Knowledge base system development: programming, testing, evaluation | Web technology; geographic information technology; an evaluation form for the system | Ontology experts; experts in GMS’s cultures | |
Conceptual framework
Figure 1. Research conceptual framework
Expected outcomes
1. Obtain the knowledge domains of intangible cultures in the GMS, procedures for managing the domains, and methods for applying appropriate modern technology to knowledge management. Academics in cultural anthropology can compare the obtained domains against other cultures and can refine and extend them. Information specialists can publicize the domains currently found correctly and completely.
2. Obtain methods for improving an information technology system by applying a cultural anthropology approach, through consultation with experts and the interdisciplinary integration of ontologies, semantic web technology, and geographic information systems. The obtained methods will serve as guidelines for interdisciplinary research that integrates knowledge from different disciplines.
3. Obtain the form of ontologies and semantic web for intangible cultures. These will serve as basic information and methods for modern knowledge management and will be presented in the geographic information system, which helps make semantic searching more accurate and truly represents the knowledge of the intangible culture domains for information science researchers.
References
Akerkar, R., & Saja, P. (2010). Knowledge Based Systems. Jones and Bartlet Publishers.
Asian Development Bank. (2011). Great Mekong Sub-region. Retrieved December 11, 2011, from http://beta.adb.org/countries/gms/overview
Berners-Lee, T., Hall, W., Hendler, J., Shadbolt, N., & Weitzner, D. J. (2006). Creating a Science of the Web. Science, 313(5788), 769-771.
Burrough, P. A., & McDonnell, R. A. (1992). Principles of Geographical Information Systems. New York: Oxford University Press.
Department of Cultural Promotion. (2011). Intangible Cultural Heritage. Retrieved December 13, 2011, from http://www.culture.go.th/ichthailand/tran.html
Dublin Core Metadata Initiative. (2011). Dublin Core Metadata Element Set, Version 1.1: Reference Description. Retrieved December 19, 2011, from http://purl.org/dc/documents/rec-dces-19990702.htm
Fishwick, P. A., & Miller, J. A. (2004, December 5-8). Ontologies for modeling and simulation: Issues and approaches. Paper presented at the Proceedings of the 2004 Winter Simulation Conference.
Kitchen, R., & Dodge, M. (2007). Rethinking maps. Progress in Human Geography, 31(3), 331-344.
Kules, B., & Shneiderman, B. (2008). Users can change their web search tactics: Design guidelines for categorized overviews. Inf. Process. Manage., 44(2), 463-484.
Lagoze, C., Lynch, C., & Daniel, R., Jr. (1996). The Warwick Framework: A container architecture for aggregating sets of metadata. Cornell Computer Science Technical Report TR96-1593. Retrieved December 29, 2011, from http://www.ecommons.cornell.edu/bitstream/1813/7248/1/96-1593.pdf
Lassila, O., & Swick, R. R. (2004). Resource Description Framework (RDF): RDF/XML Syntax Specification (Revised). W3C. Retrieved December 15, 2011, from http://www.w3.org/TR/REC-rdf-syntax/
Guarino, N. (1998). Formal Ontology in Information Systems. Proceedings of FOIS’98, Trento, Italy, 6- June 1998.
Schuurman, N. (2006). Formalization Matters: Critical GIS and Ontology Research. Annals of the Association of American Geographers, 96(4), 726-739. doi:10.1111/j.1467-8306.2006.00513.x
Schuurman, N., & Leszczynski, A. (2006). Ontology-Based Metadata. Transactions in GIS, 10(5), 709-726.
UNESCO. (2011a). Drawing up inventories. Retrieved December 19, 2011, from http://www.unesco.org/culture/ich/index.php?lg=en&pg=00313
UNESCO. (2011b). What is Intangible Cultural Heritage? Retrieved December 19, 2011, from http://www.unesco.org/culture/ich/index.php?lg=en&pg=00003
Uschold, M., & King, M. (1995). Towards a Methodology for Building Ontologies. Retrieved December 29, 2011, from http://www1.cs.unicam.it/insegnamenti/reti_2008/Readings/Uschold95.pdf
Uschold, M., & Jasper, R. (1999). A Framework for Understanding and Classifying Ontology Applications. Retrieved December 29, 2011, from http://www.cs.man.ac.uk/~horrocks/Teaching/cs646/Papers/uschold99.pdf
W3C. (2011). Technology & Society Domain Activity. Retrieved December 29, 2011, from http://www.w3.org/TandS/
Weber, R., & Kaplan, R. (2003). Knowledge-based knowledge management. In Innovations in Knowledge Engineering, International Series on Advanced Intelligence, Volume 4, July 2003, pp. 151-172. Adelaide: Advanced Knowledge International.
Development of Thai Qualification Framework for the Information Profession
Nujarin Pathumpong1 and Chollabhat Vongprasert2
1 PhD Candidate in Information Studies, Faculty of Humanities and Social Sciences, Khon Kaen University, Thailand. Email: ottonuch@hotmail.com
2 Assistant Professor, Information and Communication Management, Faculty of Humanities and Social Sciences, Khon Kaen University, Thailand. Email: chat045@yahoo.com
1 Background of the study
Currently, the world is becoming a knowledge-based society, or a knowledge-based economy. It is a time in which success and competitiveness are driven by knowledge. Organizations have to adapt to change (Marquard, 2002). Therefore, in order to become transformational leaders and gain a competitive advantage, organizations need knowledge, especially knowledge in the form of intellectual property, which is valuable and adds value (Kaplan & Norton, 2003). At present, the volume of information and knowledge has doubled, and this affects an organization’s ability to access information and gain knowledge. Hence, organizations that succeed and increase their competitiveness often prepare workers who have information management skills and knowledge. As a result, organizations need information professionals who know and understand effective information management within the organization, so that the organization can be successful and competitive.
Presently, the role and working style of the information profession differ from the past. These changes have arisen because there are now more threats in many aspects of our rapidly changing world, such as changes in communication, technology, management, and information.
Social networking and social computing have changed communication and play an important role in daily life. Communications equipment that can be accessed anytime and anywhere now also supports multimedia. These changes will affect personal interactions in the future (Ministry of Information and Communication Technology, 2008). Therefore, the information professional needs to adjust his or her interaction style with customers.
The information profession is facing changes in technology, in particular innovations in digital form such as electronic publishing, Web 2.0, Library 2.0, Really Simple Syndication (RSS), blogs, wikis, Short Message Service (SMS), podcasting, mashups, tagging, folksonomies, Open Source Software (OSS), and Open Access (OA). These innovations have changed the roles of information professionals (Nonthacumjane, 2011). Therefore, the information professional needs knowledge and skills related to current information technology.
There have also been changes in management. In the 21st century, administrations have downsized organizations, flattened organizational structures, empowered their staff, readjusted the role of the administration, and adopted proactive administration. Moreover, competition among government sectors has been created to promote equality, equity, and facilitation. Evaluation takes the form of an open system that focuses on the empowerment of consumers. Contracting out some activities has been promoted in order to reduce workloads, manpower, and budgets, including salaries. Working practices have also been transformed in order to improve effectiveness (Sompis Suksan, n.d.). In addition, the organization development policy of the Thai Government is to become a High Performance Organization (Office of the Public Sector Development Commission: OPDC, 2009). Therefore, the information profession needs to be flexible and able to adapt to a changing environment.
Regarding information itself, there have been many changes. Information explosion: the number of media and information formats has grown, shifting from paper to multimedia (Raina, 2000). Information organization: FRBR, the Semantic Web, RDF, SPARQL, and metadata schemas have come into use (Nonthacumjane, 2011). Information services: the emphasis is on quality of service, content provision, customer orientation, and proactive service. Information behavior of users: users access the Internet more than libraries. Therefore, the information profession needs to learn new knowledge and how to modify its work.
framework development project, which specifies that every curriculum needs to be developed at the program level according to its qualification framework in order to ensure the quality of the graduates it produces. Moreover, in order to meet the needs of the current labor market, it is necessary to gain knowledge and develop skills for the information profession, because similar programs pose competitive threats. Given our changing world and threats from other sources, it is important to develop a Thai qualification framework for the information profession. Professional autonomy could be developed, and the framework could become an assurance tool for graduates with Bachelor’s Degrees in fields related to the information profession. Stakeholders’ confidence in the quality of the information profession could be built. Apart from this, the research could be applied as a guideline for developing curricula related to the information profession in educational institutes in the ASEAN community.
2 Research Problems
Global changes in communication, technology, management, and information, together with other threats, affect instruction in information profession courses. Therefore, it is necessary to develop educational standards and improve the curriculum in fields related to the information profession.
3 Research Objectives
1) To synthesize the related literature and Bachelor’s Degree curricula relating to the information profession in Thailand and abroad
2) To study the need for a Qualification Framework for the Information Profession
3) To develop the Qualification Framework for the Information Profession
5 Research Methodology
| Objective | Method | Sources of Information | Outcome |
|---|---|---|---|
| 1. To synthesize Bachelor’s Degree curricula in fields related to the information profession in Thailand and abroad | Content analysis | Bachelor’s Degree curricula in fields related to the information profession in Thailand and abroad | Current conditions of Bachelor’s Degree curricula in fields related to the information profession |
| 2. To study and analyze the needs of stakeholders in the information profession at the Bachelor’s Degree level in Thailand | In-depth interview | Some of the stakeholder groups | Qualification framework needs of stakeholders in the information profession |
| | Questionnaire | Stakeholder groups: teachers in information professions; practitioners in information professions; employers; students in curricula related to information professions; alumni of curricula related to information professions | |
| 3. To develop the Thai Qualification Framework for the information profession | Drafting the Thai Qualification Framework for the information profession | Results of Objectives 1 and 2 | Thai Qualification Framework for the information profession |
6 Anticipated Outcomes
1) The Qualification Framework for the Information Profession will be obtained for use as a guideline in developing curricula related to the information profession.
2) The level of the information profession will be elevated: the Qualification Framework will serve as a framework for professional development and will build stakeholders’ confidence in the quality of information technology professionals.
3) Staff in the information profession will be better able to take part in the free flow of labor by being able to work in ASEAN countries.
4) The Thai Qualification Framework for the Information Profession can be applied by universities in ASEAN that offer curricula related to the information profession.
7 References
1. Abels, E., Jones, R., Latham, J., Magnoni, D., & Gard, J. (2003). Competencies for Information Professional of the 21st Century. Retrieved May 15, 2011, from http://www.sla.org/PDFs/Competencies2003_revised.pdf
2. ALA’s Presidential Task Force. (2008). ALA’s Core Competences of Librarianship. Retrieved Jan 5, 2011, from http://wikis.ala.org/professionaltips/images/ele7/ALA_core_Competences_June_6_2008.pdf
3. Aschoft, L. (2004). Developing competencies, critical analysis and personal transferable skills in future information professionals [Electronic version]. Library Review, 53(2), 82-88.
4. Canadian Library Association. (n.d.). Competency profile of information management specialists in archives, libraries and records management: A comprehensive cross-sectoral competency analysis. Retrieved Jan 5, 2011, from http://www.cla.ca/resources/competency.htm
5. European Council of Information Associations. (2004). EUROGUIDE LIS Volume 1: Competencies and aptitudes for European information professionals. Retrieved June 15, 2012, from http://www.certidoc.net/en/euref1-english.pdf
6. Gordon B. Davis et al. (1996). IS '97 Model curriculum and guidelines for undergraduate degree programs in information systems [Electronic version]. ACM SIGMIS Database, 28(1), 1-63.
7. International Federation of Library Association and Institutions [IFLA]. (2000). Guidelines for professional library/information educational program 2000. Retrieved January 5, 2011, from http://www.ifla.org/VII/s23/bulletin/guidelines.htm
8. J. Daniel Couger et al. (1994). IS'95: Guideline for Undergraduate IS Curriculum [Electronic version]. MIS Quarterly, 19(3), 341-359.
9. John T. Gorgone et al. (2002). IS 2002 Model Curriculum and Guidelines for Undergraduate Degree Programs in Information Systems. Retrieved November 11,
10. Kaplan, R. S., & Norton, D. P. (2003). Strategy Maps: Converting Intangible Assets into Tangible Outcomes. Massachusetts: Harvard Business School Press.
11. Malaysian Qualification Agency. (2002). Programme Standards for Library & Information Science. Retrieved Nov 11, 2010, from http://www.mqa.gov.my/en/garispanduan_sperpustakaanmaklumat.cfm
12. Marquard, M. J. (2002). Building the Learning Organization: Mastering the Elements for Corporate Learning. California: Davies-Black Publishing.
13. Ministry of Information Communication and Technology. (2010). ICT 2020 Conceptual Framework. Retrieved Jan 10, 2012, from http://www.ict2020.in.th/?q=system/files/u1/20100912_ict2020_NICT_v1_2.pdf
14. Nonthacumjane Pussadee. (2011). Key skills and competencies of a new generation of LIS professionals. International Federation of Library Associations and Institutions - IFLA, 37(4), 280-288.
15. Office Public Sector Development Commission. (2009). Development of preliminary model of the public sector: high performance organization. Retrieved Nov 5, 2012, from http://www.opdc.go.th/oldweb/thai/High_Performance_Organize/HighPerformanceOrganize.pdf
16. Raina Roshan Lal. (2000). Competency Development among Librarians and Information Professionals. Paper of XIX IASLIC Seminar. [India]: Bhopal.
17. Sooksan Sompid. (n.d.). Administration and Management: the 21st Century (in Thai). Retrieved June 4, 2012, from library.uru.ac.th/article/htmlfile/manage21.pdf
18. Zins Chaim. (2007). Knowledge map of Information science [Electronic ver-
Evaluating Core Measures of Text Denoising for Biomedical Relation Mining
Rushdi Shams and Robert E. Mercer
Department of Computer Science, University of Western Ontario, London, ON N6A 5B7, Canada
{rshams,mercer}@csd.uwo.ca
Abstract. Text Denoising is a tool that reduces texts to their content-rich parts. It has been reported as an effective tool that improves biomedical relation mining as well as supervised keyphrase indexers for digital libraries. The idea behind text denoising is that the complexity of a sentence plays an important role in its being a content-rich part of the text. Therefore, the core measure of text denoising is a well-known readability formula called the Fog Index (FI). However, the effect of using other readability formulas is yet to be explored. In this paper we plug four other readability formulas (FRES, SMOG, FORCAST, and FKRI) into text denoising and report their performance on mining relations from a corpus of 24 biomedical texts. Experimental results show that FI outperforms all other formulas in terms of meaningful relation extraction. The results also show that, besides FI, formulas like the SMOG index and FKRI can be used as core measures of text denoising for biomedical relation mining.
Keywords: Information Extraction, Information Retrieval, Text Denoising, Text
Readability, Relation Mining
1 Introduction
Fig. 1: Text denoising and related concept extraction method described by Shams and Mercer [16]. [Flowchart: Biomedical Text, Apply Fog Index, Sentences Ranked by Readability Score, Apply Threshold, Filtered Sentences, Apply Association Matrix, Ranked Connected Concepts, Apply PPV and Sensitivity, Re-ranked Connected Concepts.]
data, the indexers induced better classifiers and achieved better F-scores than their benchmarks. The authors concluded that low-readability sentences of a text are content-rich when extracting relations or keyphrases.
To assess readability, FI considers two core measures, namely sentence length and complex words. Currently, over 50 formulas have been proposed [1] as readability measures. They consider different features involved in readability, such as paragraph length, white space, use of headings, monosyllabic words, choice of sample size, and proper nouns [7]. Among these formulas, four are regarded as yardsticks and approach FI in popularity: the Flesch Reading Ease Score (FRES), the SMOG Index, the FORCAST Index, and the Flesch-Kincaid Readability Index (FKRI). Like FI, the two formulas provided by Flesch use sentence length, but they take word length as their second core measure. However, the Flesch formulas use different weighting factors, leading them to correlate almost inversely with FI. On the other hand, the SMOG index is similar to FI except that it operates on specific samples of the text. FORCAST, unlike most other formulas, uses only one vocabulary element, monosyllabic words, which makes it useful for texts without complete sentences. Despite the differences in their working principles, the formulas can estimate the difficulty of style; their intention is not to rate the content, organization, format, imagery or quality of readers [10].
Epilepsy-GABA [15]. Experimental results show that FI outperforms the other four readability formulas as the core measure by extracting more meaningful relations. The results also demonstrate that, among the four formulas, the SMOG index and FKRI are competitive with FI as the core measure of text denoising for biomedical relation mining.
The organization of the paper is as follows. In the next section, we describe text denoising as well as the readability formulas used in this research. Following that, Section 3 describes the methodology. The subsequent sections present the experimental findings and draw the conclusion of the paper.
2 Background
In this section, we briefly describe the text denoising technique that extracts the more content-rich sentences from full biomedical texts based on their FI scores. This is followed by an overview of the four readability formulas, FRES, SMOG, FORCAST, and FKRI. A detailed description of FI can be found in [16].
2.1 Text Denoising
One key aspect of biomedical papers is that they contain hidden or explicit relations, especially among drugs, chemicals, diseases, genes and proteins. Most of the proposed automated relation miners attempt to extract these relations from paper abstracts because abstracts are easier to access and are believed to contain biomedical content information. However, it is unlikely that abstracts contain all important relations because they are at best concise summaries of the texts. For this reason, a number of biomedical ontologies, such as OMIM (Online Mendelian Inheritance in Man) and GO (Gene Ontology), use human annotators to extract relations from full texts. This is a time-consuming as well as error-prone procedure. To overcome these shortcomings, Shams and Mercer [16] proposed a method, Text Denoising, that identifies those sentences in a text, called the denoised text, where content information such as biomedical relations is more likely to occur. The rest of the text is called the noise text. The authors suggested that describing biomedical relations lengthens sentences and increases the use of polysyllabic words. Some readability indexes, the Fog Index in particular, are based on these two factors. They proceeded to use the Fog Index to measure sentence readability and showed experimentally that the 30% of sentences with the lowest readability, the denoised part of a text, contain the relations of interest. Figure 1 shows the text denoising method.
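The two factors named above, sentence length and polysyllabic words, are the inputs of the Fog Index. The following is a minimal sketch of the commonly published Gunning formulation, which may differ in detail from the variant described in [16]; the function name and the pre-computed counts it takes as arguments are assumptions made for illustration.

```python
def fog_index(total_words: int, total_sentences: int, complex_words: int) -> float:
    """Gunning Fog Index: 0.4 * (average sentence length + percent complex words).

    The counts are assumed to be computed beforehand; how words, sentences,
    and complex (polysyllabic) words are detected is left to the caller.
    """
    if total_words <= 0 or total_sentences <= 0:
        raise ValueError("need at least one word and one sentence")
    average_sentence_length = total_words / total_sentences
    percent_complex = 100.0 * complex_words / total_words
    return 0.4 * (average_sentence_length + percent_complex)


# A 100-word passage with 5 sentences and 10 complex words.
example_score = fog_index(100, 5, 10)
```

Longer sentences and a higher share of polysyllabic words both raise the score, which is why relation-bearing sentences tend to rank as less readable under this measure.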
To evaluate Text Denoising, the method was applied to a dataset of 24 full texts that describe four related pairs of disease and chemical components. The method extracted pairs of biomedical concepts from the denoised part of the dataset, of which about 75% are reported as related by the Unified Medical Language System’s (UMLS) semantic relation network. It was also noted that the noise text did not contain any related biomedical entities of interest. These experimental findings supported the authors’ hypothesis that sentences that are difficult to read carry the content information of the full text.
2.2 Overview of Readability Formulas
A brief description of the readability formulas follows, in historical order.
Flesch Reading Ease Score (FRES). Considered one of the oldest and most accurate readability indexes, FRES was developed to advocate a return to phonics [5]. The formula uses two core measures: average sentence length and word length. It was originally developed to assess the grade level of a reader. Its use now extends to questionnaire formulation in the US Department of Defense and to medical form content assessment. Mathematically, FRES can be written as Eq. 1.
\[
\mathrm{FRES} = 206.835 - 1.015 \times \left(\frac{\text{Total Words}}{\text{Total Sentences}}\right) - 84.6 \times \left(\frac{\text{Total Syllables}}{\text{Total Words}}\right) \tag{1}
\]
The FRES score spans the range 0 to 100, where scores between 90 and 100 are considered easily understandable by an average 5th grader and scores between 0 and 30 are considered easily understandable by university graduates.
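As a worked illustration of Eq. 1, the sketch below computes FRES from pre-computed counts. The function name and the choice to pass counts rather than raw text are assumptions; syllable counting in particular is a separate problem that the sketch does not attempt.

```python
def flesch_reading_ease(total_words: int, total_sentences: int,
                        total_syllables: int) -> float:
    """FRES as in Eq. 1, from pre-computed word, sentence, and syllable counts."""
    if total_words <= 0 or total_sentences <= 0:
        raise ValueError("need at least one word and one sentence")
    average_sentence_length = total_words / total_sentences
    average_syllables_per_word = total_syllables / total_words
    return (206.835
            - 1.015 * average_sentence_length
            - 84.6 * average_syllables_per_word)


# 100 words, 5 sentences, 150 syllables:
# 206.835 - 1.015 * 20 - 84.6 * 1.5 = 59.635
fres_example = flesch_reading_ease(100, 5, 150)
```

Both terms are subtracted, so longer sentences and longer words lower the score; a lower FRES therefore means a harder text.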
SMOG Index. The SMOG index, when first published, was anticipated to be a proper substitute for FI due to its accuracy and ease of use [12]. A recent study claims that the SMOG index should be the preferred formula when evaluating medical materials [4]. The formula counts the complex words (i.e., words that are polysyllabic) in three 10-sentence samples from a document of n sentences, scales the count by 30/n, takes the square root, multiplies it by 1.043, and then adds 3.1219 (Eq. 2).
\[
\text{SMOG Index} = 1.043 \times \sqrt{\text{Complex Words} \times \frac{30}{n}} + 3.1219 \tag{2}
\]
The meaning of the SMOG index is similar to that of FI: the index indicates the years of education the reader needs in order to understand the text. For example, a passage with a SMOG index of 12 requires about 12 years of academic education to understand.
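A minimal sketch of Eq. 2 follows. The function and parameter names are assumptions. Note also that the additive constant is usually published as 3.1291; the sketch keeps the value 3.1219 given in the text as its default and exposes it as a parameter so that either can be used.

```python
import math


def smog_index(complex_words: int, n_sentences: int,
               constant: float = 3.1219) -> float:
    """SMOG index as in Eq. 2.

    complex_words: number of polysyllabic words counted in the sample.
    n_sentences:   number of sentences the count was taken from.
    constant:      additive term; defaults to the value given in the text.
    """
    if n_sentences <= 0:
        raise ValueError("need at least one sentence")
    return 1.043 * math.sqrt(complex_words * 30 / n_sentences) + constant


# 16 complex words in 30 sentences: 1.043 * sqrt(16) + 3.1219 = 7.2939
smog_example = smog_index(16, 30)
```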
FORCAST Index The FORCAST index was originally formulated to assess
yet significant vocabulary element, the count of simple words (i.e., monosyllabic words). Due to its relative ease of use, the index was applied by the U.S. Air Force to write understandable publications. Eq. 3 gives the FORCAST index of any document.
\[
\text{FORCAST Index} = 20 - \frac{N}{10} \tag{3}
\]
where N is the number of monosyllabic words in a 150-word sample of the text. The index, like many other formulas, indicates the grade level the reader needs in order to understand the content of the text.
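Eq. 3 can be sketched in a few lines. The function name and the range check on N are assumptions added for illustration; N is taken as already counted from a 150-word sample.

```python
def forcast_index(monosyllabic_words: int, sample_size: int = 150) -> float:
    """FORCAST index as in Eq. 3: 20 - N / 10, with N counted in a 150-word sample."""
    if not 0 <= monosyllabic_words <= sample_size:
        raise ValueError("N must lie between 0 and the sample size")
    return 20 - monosyllabic_words / 10


# 100 monosyllabic words in the 150-word sample gives grade level 10.
forcast_example = forcast_index(100)
```

Because only single words are counted, the sketch never needs sentence boundaries, which matches the remark that FORCAST suits texts without complete sentences.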
Flesch-Kincaid Readability Index (FKRI). A second instalment of a readability index proposed by Flesch, further investigated and modified by Kincaid [9], eventually took the form of Eq. 4.
\[
\mathrm{FKRI} = 0.39 \times \left(\frac{\text{Total Words}}{\text{Total Sentences}}\right) + 11.8 \times \left(\frac{\text{Total Syllables}}{\text{Total Words}}\right) - 15.59 \tag{4}
\]
The score, like the other indexes, corresponds to a grade level. However, it correlates inversely with FRES due to different weighting factors. For example, an FKRI score of 10.1 indicates that the text is expected to be understandable by a student in grade 10. Conversely, in the case of FRES, the same score would indicate a low-readability text. Another key difference between them is that FKRI has a theoretical lowest possible grade-level score, −3.40, which is reached only by a passage consisting of a single one-syllable word; very few real-life passages take that form.
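The sketch below evaluates Eq. 4 from pre-computed counts and reproduces the theoretical minimum of −3.40 mentioned above. The function name and the use of counts as inputs are assumptions made for illustration.

```python
def flesch_kincaid_index(total_words: int, total_sentences: int,
                         total_syllables: int) -> float:
    """FKRI as in Eq. 4, from pre-computed word, sentence, and syllable counts."""
    if total_words <= 0 or total_sentences <= 0:
        raise ValueError("need at least one word and one sentence")
    average_sentence_length = total_words / total_sentences
    average_syllables_per_word = total_syllables / total_words
    return (0.39 * average_sentence_length
            + 11.8 * average_syllables_per_word
            - 15.59)


# A passage made of a single one-syllable word: 0.39 + 11.8 - 15.59 = -3.40
fkri_minimum = flesch_kincaid_index(1, 1, 1)

# 100 words, 5 sentences, 150 syllables: 7.8 + 17.7 - 15.59 = 9.91
fkri_example = flesch_kincaid_index(100, 5, 150)
```

Here both terms are added, so the score rises with difficulty, the opposite direction to FRES for the same counts.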
3 Methodology
In this section, we describe the dataset of 24 full texts, the experimental procedures, and the performance evaluation measures that we used in our experiment.
3.1 The Dataset
FI was successfully established as a core measure for text denoising in [16]. The trial used 24 biomedical texts as a test dataset divided into four sets. Each set describes one pair of concepts related by an explicit disease-chemical component relation reported by Perez-Iratxeta et al. [15]. The pairs of concepts are Ischemia-Glutamate, Ataxia-Dehydrogenase, Hypogonadism-Gonadotropin, and Epilepsy-GABA.
Fig. 2: Experimental Procedure. [Flowchart: Texts on Disease-Chemical Component; Text Denoising with Four Readability Formulas; Concepts Extracted using FRES, SMOG, FORCAST, and FKRI.]
– The texts were randomly collected from the PubMed paper repository and can be described by the four concept pairs mentioned above.
– The texts have been preprocessed. For example, several sections of the texts, such as the title, affiliations, tables, figures, acknowledgments, and references, have been removed.
– Document size in terms of number of words varies.
Several annotation tasks for the dataset are still being carried out. However, for this experiment, we only needed the pre-processed full texts.
3.2 Procedure
The experimental procedure is shown in Figure 2. In our experiment, we followed the procedure described by Shams and Mercer [16], as shown in Figure 1, except that we used four different readability formulas in place of the FI. The four formulas were applied one at a time to every sentence of the texts to give each sentence a readability score. The sentences were then ranked by this score, and the 30% of sentences with the lowest readability were retained. Then, we used a co-occurrence frequency matrix (also known as an association matrix) to find the most frequently co-occurring concepts in the texts, of which the 20 most frequent pairs were selected.
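The ranking and co-occurrence steps described above can be sketched as follows. This is an illustrative sketch only, not the authors' implementation: it assumes each sentence has already been reduced to the set of concepts it mentions and already carries a readability score, and the function name and the `higher_is_harder` switch are hypothetical (a grade-level index rises with difficulty, whereas FRES falls).

```python
from collections import Counter
from itertools import combinations


def top_cooccurring_pairs(sentences, scores, fraction=0.30, top_k=20,
                          higher_is_harder=True):
    """Return the top_k concept pairs co-occurring in the least readable sentences.

    sentences: list of sets of concept names (one set per sentence)
    scores:    one readability score per sentence
    """
    if len(sentences) != len(scores):
        raise ValueError("need exactly one score per sentence")
    if not sentences:
        return []
    # Rank sentence indices from least readable to most readable.
    order = sorted(range(len(sentences)), key=lambda i: scores[i],
                   reverse=higher_is_harder)
    keep = order[:max(1, round(fraction * len(order)))]
    # Count how often each unordered concept pair shares a retained sentence.
    counts = Counter()
    for i in keep:
        for pair in combinations(sorted(sentences[i]), 2):
            counts[pair] += 1
    return counts.most_common(top_k)


# Toy data: ten sentences, scored 1 (easiest) to 10 (hardest).
sentences = [{"Rat", "Study"}] * 7 + [
    {"Glutamate", "Neurons"},
    {"Glutamate", "Ischemia", "Neurons"},
    {"Glutamate", "Ischemia"},
]
scores = list(range(1, 11))
pairs = dict(top_cooccurring_pairs(sentences, scores))
print(pairs)
```

With these toy inputs only the three hardest sentences are retained, so the easy "Rat"/"Study" sentences contribute no pairs.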
However, among these selected pairs of concepts, some lack representativeness (i.e., they do not hold any relation according to the UMLS semantic relation network). These are called noisy pairs of concepts and needed to be removed. Because we randomly collected texts, we observed that it is possible for the pairs to never co-occur in a sentence, which indicates that our dataset is imbalanced. So, we used the equally weighted harmonic mean of the PPV and sensitivity of the pairs of concepts provided by FI to evaluate their representativeness, as it is a well-suited evaluation metric for imbalanced datasets [6].
PPV is the proportion of correctly predicted relations, and sensitivity is the proportion of relevant relations that are identified by our method. To measure these values, we considered the number of sentences extracted by the formulas, which is the total number of results returned by the tool (R) and comprises the number of True Positives (TP) and False Positives (FP). Then, we took each pair in our co-occurrence frequency matrix and developed a second set of sentences that contain both concepts. The number of sentences in this set is the number of results that should have been returned by our system (S) and comprises the number of True Positives (TP) and False Negatives (FN). The number of sentences that are present in both of these sets is the number of TP. Afterwards, FP is obtained by subtracting TP from R, and FN is obtained by subtracting TP from S. So, the PPV of every pair of connected concepts is TP/(TP+FP) and the sensitivity of every pair of connected concepts is TP/(TP+FN). Eq. 5 is then used to determine the equally weighted harmonic mean for the given pair of concepts. In this way, we measured this mean for every pair of concepts in our co-occurrence matrix.
$$\text{Harmonic Mean of PPV and Sensitivity} = 2 \times \left(\frac{\text{PPV} \times \text{Sensitivity}}{\text{PPV} + \text{Sensitivity}}\right) \qquad (5)$$
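As a worked illustration of the quantities R, S, TP, FP, and FN and of Eq. 5, the following sketch computes PPV, sensitivity, and their harmonic mean for one concept pair from two sets of sentence identifiers. The function name and the toy identifiers are assumptions made for the example; they are not taken from the paper.

```python
def pair_scores(returned_ids, expected_ids):
    """PPV, sensitivity, and harmonic mean (Eq. 5) for one concept pair.

    returned_ids: sentences extracted by the readability formula (R)
    expected_ids: sentences that contain both concepts of the pair (S)
    """
    returned, expected = set(returned_ids), set(expected_ids)
    tp = len(returned & expected)   # sentences present in both sets
    fp = len(returned) - tp         # FP = R - TP
    fn = len(expected) - tp         # FN = S - TP
    ppv = tp / (tp + fp) if returned else 0.0
    sensitivity = tp / (tp + fn) if expected else 0.0
    if ppv + sensitivity == 0:
        return ppv, sensitivity, 0.0
    harmonic = 2 * (ppv * sensitivity) / (ppv + sensitivity)
    return ppv, sensitivity, harmonic


# Toy example: 4 sentences returned, 6 expected, 2 in common.
ppv, sensitivity, harmonic = pair_scores({1, 2, 3, 4}, {3, 4, 5, 6, 7, 8})
print(ppv, round(sensitivity, 4), round(harmonic, 4))  # 0.5 0.3333 0.4
```

The tables in the paper appear to report the harmonic mean on a 0 to 100 scale, so the value above would correspond to 40.00.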
These pairs are then re-ranked based on the harmonic mean of their PPV and sensitivity. From this re-ranked list, the top 10 pairs of concepts were considered as the related concepts of the texts. As we have these 10 pairs of concepts per set of texts for each of the formulas, we divided them into two groups: (i) the first group contained the pairs of concepts that were reported to be related by the UMLS semantic relation network; (ii) the second group was composed of pairs of concepts that do not have any semantic relation. The more pairs of concepts extracted using a readability formula fall in the first group, the better its performance is.
3.3 Evaluation Measures
Our evaluation of the readability formulas is twofold. First, we are interested in knowing the number of meaningful pairs of concepts extracted using each of the formulas. The concepts in Tables 1–4 are divided into two segments. The upper segment of the table contains the first group of concepts while the lower
Related Concepts    FI    SMOG    FKRI    FRES    FORCAST
(Rank and Harmonic Mean reported per formula)
Ischemia-Glutamate 51.85 51.85 51.85 39.13 48.15
Levels-Glutamate 41.66 41.66 41.66 37.50
Glutamate-Neurons 39.02 39.02 39.02
10Min-Ischemia 37.50 37.50 37.50 48.14 29.16
Glutamate-CA4 35.89
Increase-Glutamate 32.55 32.55 32.55 32.55
Ischemia-5Min 31.57 31.57 31.57
Ischemia-DG 39.02
Glutamate-Microdialysis 37.50
Neurons-Ischemia 22.22
CA1-Ischemia 15.78
Glutamate-Release 27.77
Levels-Ischemia 43.47 43.47 43.47 39.93
10Min-Glutamate 31.81 31.81 31.81 32.55 27.27
Glutamate-5Min 31.57 31.57 31.57
Ischemia-Release 32.55 10 15.00
Experiment-Ischemia 27.77 27.77
Glutamate-Experiment 27.77 27.77
10Min-Release 31.57
Increase-Ischemia 23.80
Table 1: Relations extracted using the readability formulas from the papers on Ischemia and Glutamate
Related Concepts    FI    SMOG    FKRI    FRES    FORCAST
(Rank and Harmonic Mean reported per formula)
Friedreich-Ataxia 59.25 66.66 59.25 59.25 51.85
PDHC-Ataxia 56.00 56.00 56.00 48.00 39.99
Activity-Friedreich 43.47 43.47 43.47 34.78 43.47
Patients-Ataxia 43.47 52.17 43.47 52.17 34.78
Activity-Ataxia 43.47 43.47 43.47 34.78 43.47
PDHC-Friedreich 43.47 43.47 43.47 34.78 34.78
Patients-Friedreich 36.36 45.45 45.45
Activity-PDHC 37.03 37.03
Preparations-Ataxia 40.00 40.00 40.00 40.00
Preparations-Friedreich 40.00 40.00 40.00 40.00
Pyruvate-Ataxia 38.09 47.61 38.09 47.61 28.57
Siblings-Ataxia 22.22
Disease-Pyruvate 21.05
Table 2: Relations extracted using the readability formulas from the papers on Ataxia and Dehydrogenase
segment of the table has concepts that belong to the second group (see Section 3.2).
Related Concepts    FI    SMOG    FKRI    FRES    FORCAST
(Rank and Harmonic Mean reported per formula)
AAS-Treatment 29.41 29.41 32.35 23.52 29.41
AAS-Testosterone 18.46 15.38 18.46 24.61
Gonadotropin-Treatment 18.18 18.18 18.18 21.21 12.12
Testosterone-Treatment 14.92 14.92 17.91 17.91 11.94
Levels-Testosterone 14.49 17.39 14.49 10 11.59
AAS-Conditions 12.90
Treatment-HCG 12.90 12.90 12.90 12.90
Treatment-Therapy 12.90 16.12 16.12 16.12 12.90
Gonadotropin-Testosterone 14.92 11.94 14.92
Clomiphene-Citrate 24.32
Tamoxifen-Citrate 20.28
Use-AAS 21.62 16.21 18.91 10.81 27.02
Replacement-Therapy 12.90
AAS-Conditions 12.90
Testosterone-Production 15.15
Function-Testosterone 12.30
Therapy-Gonadotropin 10.00
Function-Testosterone 10 9.67
Therapy-Testosterone 12.69
Table 3: Relations extracted using the readability formulas from the papers on Hypogonadism and Gonadotropin
against a gold standard. As FI has already been shown to be an effective measure for text denoising, we considered the performance of FI as our gold standard. We considered the related concepts extracted using FI as the positives, examined the true positives, false positives, and false negatives of a given formula, and calculated its precision and recall. In addition, we calculated both the micro and macro averages of precision and recall, and hence the F-Score, of every formula (Table 6). We calculated the micro average because the number of sentences differs considerably from one set to another, and the macro average to see how the formulas performed across all sets [11]. To calculate the micro average, the true positives, false positives, and false negatives were first summed across all sets, and the statistics were computed from these totals. The macro average, on the other hand, was calculated by first computing the precision and recall for each set and then averaging them over all sets.
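The difference between the two averaging schemes can be illustrated with a short sketch. The per-set counts below are hypothetical and are not the paper's data; the function names are likewise illustrative. The macro F-Score is computed here from the averaged precision and recall, which is an assumption about the paper's procedure that appears consistent with the values reported in Table 6.

```python
def precision_recall_f(tp, fp, fn):
    """Precision, recall, and F-Score from raw counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f_score = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_score


def micro_average(counts):
    """Sum TP, FP, and FN over all sets first, then compute the statistics."""
    tp = sum(c[0] for c in counts)
    fp = sum(c[1] for c in counts)
    fn = sum(c[2] for c in counts)
    return precision_recall_f(tp, fp, fn)


def macro_average(counts):
    """Compute precision and recall per set, then average them over the sets."""
    per_set = [precision_recall_f(*c) for c in counts]
    precision = sum(p for p, _, _ in per_set) / len(per_set)
    recall = sum(r for _, r, _ in per_set) / len(per_set)
    total = precision + recall
    f_score = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_score


# Hypothetical (TP, FP, FN) counts for two sets of papers.
counts = [(6, 1, 1), (5, 0, 2)]
micro = micro_average(counts)
macro = macro_average(counts)
print([round(x, 4) for x in micro])
print([round(x, 4) for x in macro])
```

Micro averaging weights every extracted relation equally, while macro averaging weights every set equally, which is why the two precision values differ in this toy case.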
4 Results and Discussion
Related Concepts    FI    SMOG    FKRI    FRES    FORCAST
(Rank and Harmonic Mean reported per formula)
Inhibition-GABA 26.08 28.16 34.78 34.78 23.91
GABA-Synapse 20.25 20.25 22.78 20.25
Neurons-Synapse 14.17 14.70 14.70 11.76
Inhibition-Hippocampus 12.30 10.66
Neurons-GABA 8.00 16.00 10.66
Properties-GABA 6.45 9.67 9.67 9.67
Cl-Gradient 3.33
Inhibition-Dentate Gyrus 9.37
Synapse-Change 9.37 9.37 9.37 12.50
GABA-Change 6.45 9.67 6.45
GABA-Number 6.34 12.69 9.52 9.52
Synapse-Number 6.55
Neuron-Input 6.45
Animal-Models 11.26 14.08
Neurons-Inhibition 9.67
GABA-Alteration 18.18
Study-Tissue 9.37
Study-Inhibition 10 8.95
Rat-Inhibition 20.51
Slices-Inhibition 12.50
Animal-Rat 12.30
Cortex-slices 9.09
Epilepsy-Rat 8.95
Inhibition-Kindling 8.95
Number-Tissue 6.45
Table 4: Relations extracted using the readability formulas from the papers on Epilepsy and GABA
SMOG performed marginally better than FRES and FORCAST as most of its meaningful relations had low harmonic mean. It is noteworthy that the ranks and harmonic means of the relations for the first three formulas were somewhat similar to each other, which means that they extracted almost the same sentences.
In Table 2, the relations extracted using the formulas from the papers on Ataxia and Dehydrogenase are displayed. It is surprising that all of the formulas extracted exactly seven meaningful relations. In this case, the performance of FI and FKRI was almost identical. On the other hand, both SMOG and FRES extracted the same related concepts as FI, but their harmonic means differed considerably. Careful observation suggests that FI and SMOG performed best in this case, followed by FKRI and FRES.
(a) Ischemia and Glutamate
Readability Formula    Precision    Recall    F-Score
SMOG                   100.00       71.43     83.33
FKRI                   85.71        85.71     85.71
FRES                   40.00        28.57     33.33
FORCAST                100.00       71.43     83.33

(b) Ataxia and Dehydrogenase
Readability Formula    Precision    Recall    F-Score
SMOG                   100.00       100.00    100.00
FKRI                   85.71        85.71     85.71
FRES                   100.00       100.00    100.00
FORCAST                85.71        85.71     85.71

(c) Hypogonadism and Gonadotropin
Readability Formula    Precision    Recall    F-Score
SMOG                   87.50        87.50     87.50
FKRI                   87.50        87.50     87.50
FRES                   83.33        62.50     71.43
FORCAST                75.00        75.00     75.00

(d) Epilepsy and GABA
Readability Formula    Precision    Recall    F-Score
SMOG                   100.00       71.43     83.33
FKRI                   100.00       71.43     83.33
FRES                   100.00       71.43     83.33
FORCAST                50.00        14.29     22.22

Table 5: Performance of the four readability formulas for the papers on (a) Ischemia and Glutamate, (b) Ataxia and Dehydrogenase, (c) Hypogonadism and Gonadotropin, and (d) Epilepsy and GABA

(a) Micro-average
Readability Formula    Precision    Recall    F-Score
SMOG                   95.83        82.14     88.46
FKRI                   88.89        82.76     85.71
FRES                   82.61        65.52     73.08
FORCAST                81.82        62.07     70.59

(b) Macro-average
Readability Formula    Precision    Recall    F-Score
SMOG                   96.88        82.60     89.16
FKRI                   89.73        82.60     86.01
FRES                   80.83        65.63     72.44
FORCAST                77.88        61.61     68.72

Table 6: Average precision, recall and F-Score of the formulas with (a) micro-average and (b) macro-average methods
they are semantically related. The performance of FRES is the poorest among the five as it extracted four pairs of concepts without semantic relations.
Table 4 shows the related concepts extracted using the formulas from the papers on Epilepsy and GABA. FI outperformed the others by extracting seven semantically related concepts. FKRI, SMOG, and FRES extracted five related concepts each, but the performance of FKRI is better than that of the other two. Between SMOG and FRES, more of the low-ranked pairs of concepts extracted using the former lack meaning than those extracted using the latter. On the other hand, FORCAST performed poorly in this case, extracting only two semantically related pairs.
Table 5 displays the performance of the four readability formulas (FKRI, SMOG, FRES, and FORCAST) against the gold standard. From the table, it can be seen that the performance of FKRI and SMOG was consistent throughout the four sets of papers. For every set of papers, one of these two formulas had the best F-Score. On the other hand, the F-Scores of FRES and FORCAST varied widely across the sets, and one of them had the lowest F-Score in each.
precision and recall, but their significantly poor recalls cost them lower F-Scores. Table 6(b) shows similar results, except that both the SMOG index and FKRI achieved the best recall. From this analysis, it can be said that the SMOG index and FKRI both perform similarly to FI and thus can be used to reduce textual noise and extract related biomedical concepts.
5 Conclusions
While FI has been used in text denoising to make it a meaningful relation extraction tool for biomedical texts, we reported the performance of four other readability formulas, namely FKRI, SMOG, FRES, and FORCAST, on this task. We applied the formulas to the sentences of 24 biomedical texts, ordered them according to their reading difficulty, and extracted frequently co-occurring concepts from the 30% of the low-readable sentences. These concepts were then re-ranked according to the harmonic mean of their PPV and sensitivity. A comparative result shows that FI outperformed the other formulas by extracting more meaningful relations according to the UMLS semantic relation network. We also analyzed the performance of the formulas considering the performance of FI as a gold standard. It shows that the SMOG index achieved the best F-Score, followed by FKRI, while FRES and FORCAST performed poorly. It can also be noted that the SMOG index, like FI, uses the core measure of complex words, and its performance is the best compared to the gold standard, which suggests that the measure of complex words fits best for text denoising and biomedical relation extraction.
Since, for relation mining, we found at least two competitive measures for text denoising other than FI, their performance on training data reduction for keyphrase indexers can be of great interest. This task is left as future work.
Acknowledgements
This work was partially funded through a Natural Sciences and Engineering Research Council of Canada (NSERC) Discovery Grant to Robert E. Mercer.
References
1. J. Bogert. In defense of the fog index. Business Communication Quarterly, 48:9–12, 1985.
2. J. S. Caylor, T. G. Sticht, L. C. Fox, and J. P. Ford. Methodologies for determining reading requirements of military occupational specialities. Technical Report 73-5, Human Resources Research Organization, Alexandria, VA, 1973.
3. T. M. Duffy and P. Kabance. Testing a readable writing approach to text revision. Journal of Educational Psychology, 74:733–48, 1982.
5. R. Flesch. A new readability yardstick. Journal of Applied Psychology, 32:221–33, 1948.
6. O. Frunza and D. Inkpen. Extraction of disease-treatment semantic relations from biomedical sentences. In Proceedings of the 2010 Workshop on Biomedical Natural Language Processing, BioNLP '10, pages 91–98, Stroudsburg, PA, USA, 2010. Association for Computational Linguistics.
7. E. Fry. A readability formula that saves time. Journal of Reading, 11:512–16, cont. 575–78, 1968.
8. O. S. Goh, C. C. Fung, A. Depickere, and K. W. Wong. Using Gunning-Fog index to assess instant messages readability from ECAs. In Proceedings of the Third International Conference on Natural Computation, volume of ICNC '07, pages 480–486, Washington, DC, USA, 2007.
9. J. P. Kincaid, R. P. Fishburne, R. L. Rogers, and B. S. Chissom. Derivation of new readability formulas (automated readability index, fog count, and Flesch reading ease formula) for navy enlisted personnel. Research Branch Report 8-75, Chief of Naval Technical Training: Naval Air Station Memphis, 1975.
10. K. Koenke. Another practical note on readability formulas. Journal of Reading, 15:205, 1971.
11. C. Manning and H. Schütze. Foundations of Statistical Natural Language Processing. Cambridge, MA: MIT Press, 1999.
12. G. H. McLaughlin. SMOG grading – a new readability formula. Journal of Reading, 12(8):639–46, 1969.
13. O. Medelyan. Human-competitive automatic topic indexing. PhD thesis, University of Waikato, New Zealand, 2009.
14. O. Medelyan and I. Witten. Domain-independent automatic keyphrase indexing with small training sets. Journal of the American Society for Information Science and Technology (JASIST), 59(7):1026–1040, 2008.
15. C. Perez-Iratxeta, P. Bork, and M. Andrade. Literature and genome data mining for prioritizing disease-associated genes. In F. Eisenhaber, editor, Discovering Biomolecular Mechanisms with Computational Biology, Molecular Biology Intelligence Unit, pages 74–81. Springer, 2006.
16. R. Shams and R. E. Mercer. Extracting connected concepts from biomedical texts using fog index. Procedia - Social and Behavioral Sciences, 27:70–76, 2011.
17. R. Shams and R. E. Mercer. Improving supervised keyphrase indexer classification of keyphrases with text denoising. In 14th International Conference on Asia-Pacific Digital Libraries (ICADL 2012), Taipei, Taiwan, 2012.
18. R. Shams and R. E. Mercer. Investigating keyphrase indexing with text denoising. In Proceedings of the 11th ACM/IEEE-CS Joint Conference on Digital Libraries (JCDL 2012), Washington DC, USA, 2012.
Information Behavior Model of Farmers using the Grounded Theory Approach
Unchasa Seenuankaew
Ph.D Candidate in Information Studies Program
Faculty of Humanities and Social Sciences, Khon Kaen University, Thailand
unchasa.s@hotmail.co.th
Chollabhat Vongprasert
Assistant Professor, Information and Communication Management Program, Faculty of Humanities and Social Sciences, Khon Kaen University, Thailand
chat045@yahoo.com
Abstract
This concept paper focuses on the development of an Information Behavior Model of Farmers. The study endeavors to understand the phenomenon of information behavior as it relates to Thai farmers. The research objectives are: (a) to study information behavior consisting of the information need, information seeking, and information use of Thai farmers; (b) to study the enabling factors that support the information behavior of Thai farmers; and (c) to develop an information behavior model of Thai farmers by using the Grounded Theory Approach. The researcher intends to use in-depth interviews to elicit data from key informants. This study will analyze three groups of farmers: subsistence farmers, semi-subsistence farmers, and purely commercial farmers. Farmers that either own or rent their farm land will be included in this study. The location of this study will be Tambon Chai Buri in the Muang District of Pattalung Province in Thailand.
1 Introduction
In the 21st century, the world is stepping closer to a knowledge-based society where information, knowledge, and information technology are used in development, as opposed to traditional manpower resources. Similar changes in Thailand cannot be avoided (Cheejang, 2008). As such, Thai people must develop skills to adapt to this changing world. They must now learn new skills and develop their knowledge continuously, which necessitates the continuous retrieval and use of information in order to solve problems and to survive in the new generation (Spink & Cole, 2006). Understanding information behavior is essential because it demonstrates the relationship between information needs, information seeking, and information use (Wilson, 2000). Moreover, information behavior is also important for development. For example, it helps people to access information and news to assist with basic living and related needs. Government is also responsible for understanding information behavior in order to facilitate public policy. Marcella & Baxter (1999) studied the information needs and information seeking behavior of the population of the United Kingdom. The results indicate that reading is the most popular means of information access in the United Kingdom, and that information access is believed to be very important for its population. Therefore, government and related authorities must understand the information behavior of their people in order to plan information services that meet people's needs. Information behavior is essential basic knowledge for living in a knowledge-based society.
Information culture is most often linked to information literacy and information behavior (Gendina, 2004). For the purpose of this study, information culture is defined as the attitudes, beliefs, and behavior towards information ownership, information seeking, and information use. To truly step into the information society, developing countries need to adopt holistic approaches that are designed to cultivate a modern information culture and to make incremental social institutional changes, in addition to technological innovations (Zheng, 2005).
More than half of all Thai people are agriculturalists. Indeed, agriculture is closely related to Thai culture and the lifestyle of Thai people (Ministry of Agriculture and Cooperatives, 2012). Agriculture continues to play an important role in strengthening Thailand's economy. Agriculturalists, in particular, often face obstacles, such as limitations in access to agricultural information, price changes, and non-standard production. These are some of the problems that cause and perpetuate poverty among Thai farmers. Thai farmers are the most important group of people because farming is the major occupation of Thai people. Apart from being the main food, rice is exported by Thailand to many countries each year and is worth millions of baht. For the first six months of 2010, rice exports accounted for 78,982 million baht (National Information Center, Office of Permanent Secretary, Ministry of Commerce: Online). Rice is the export good that brings the highest revenue to the country.
...approach. The research outcomes are intended to provide a theoretical conclusion regarding the information behavior of farmers in Thailand. Grounded theory is a theory that explains the understanding of a phenomenon, thought, and belief from the view of the people in the phenomenon. It then conceptualizes information from the phenomenon in order to find the connections among concepts and thereby reach a theoretical conclusion about the phenomenon that needs to be explained (Hawanon et al., 2003). In the same way, Glaser & Strauss (1976) explain that creating grounded theory is the creation of a theoretical explanation directly from information. The methodology for creating grounded theory is developed from the belief that, to understand human behavior and how people live together, one needs to understand the process by which people give meaning to the things around them, because human thought and action are fundamentally based on the meaning of those things. This methodology focuses on the study of social phenomena by understanding things and turning information into concepts. It also finds connections among concepts to reach a theoretical conclusion about a social phenomenon. Therefore, the researchers were interested in studying farmers' information behavior using grounded theory. The results of this study will be used to develop a theory to explain the information behavior of Thai rice farmers, how they access information, how they manage changes in information, and how the government manages the farmers' ability to access information.
2 Research Objectives
This research aims to study and understand the phenomenon of Thai farmers' information behavior. The research objectives are as follows:
2.1 To study information behavior consisting of the information need, information seeking, and information use of Thai farmers.
2.2 To study the enabling factors that support the information behavior of Thai farmers.
2.3 To develop an information behavior model of Thai farmers by using the grounded theory approach.
3 Research Methodology
The research method that was selected for this study is grounded theory. The aim of this research method is building theory, not testing theory. Rather than begin a study with a preconceived theory that needs to be proven, the researcher begins with a general area of study and allows the theory to emerge from the data.
3.1 The Study Area
two provincial agricultural research officers, this is the most productive area in the province.
3.2 Entering research site
Entering the research site is an important step. It is necessary to ensure that the data collection process is complete and that true information, well understood by both key informants and the researchers, is obtained. Interview questions have been prepared to use as a research tool to collect data, as indicated in the research objectives and conceptual framework. The questions cover information needs, conditions, and situations, all of which encourage the search for information regarding farming and the use of information in the decision making process. Information culture factor questions, which support information behavior, will also be asked.
3.3 Data Collection and Analysis
Data Collection
Data will be collected through the use of in-depth interviews. Guidelines were developed from the research objectives and the conceptual framework of the study. These guidelines help to verify whether the interview questions elicit the answers indicated in the objectives. After that, the results were used to improve the questions in order to obtain the accurate information stated in the objectives of the study. With regard to the selection of key informants, I will need to study the phenomenon first. However, I have an approximate idea of the key informants I will need. There are three groups of farmers: subsistence farming, semi-subsistence farming, and purely commercial farming, including farmers who either own or rent their land.
Data Analysis.
I summarize the steps in my data analysis as follows. First, after the interview of the first key informant, I will transcribe the recorded interview conversation word-for-word. I will use this information to ensure that the question guidelines are accurate. Second, the data collected from interviewing the key informants, which will be recorded on tapes, will be used for describing the behaviors of each case in detail. The description of each case must give a complete picture of behaviors that respond to different situations. Third, I will read and analyze the data many times in order to clearly understand the information relating to different behaviors before analyzing the process of behaviors under each situation. Fourth, I will analyze the information behaviors according to grounded theory with the use of the conceptual construction process based on the data. In this respect, coding techniques will be applied, namely open coding, axial coding, and selective coding. Finally, an information behavior model of farmers will be developed.
4 Anticipated outcomes of the research
The anticipated outcomes of this study are:
4.1 New knowledge in the form of a theoretical conclusion about farmers' information behavior in the context of Thai society. Moreover, the developed information behavior model can be used to explain Thai farmers' information behavior. Finally, the new knowledge will be useful for library and information science.
4.2 The results of the study can be used as a guideline to develop information services and reduce the gap in information access provided for Thai farmers.
4.3 The findings of this research will provide an alternative method of information management for the public sector, as well as other related sectors. Furthermore, this new methodology will more effectively and efficiently respond to the information behavior and needs of farmers, thereby helping to stimulate and enhance the economic and social development of Thailand.
5 Conceptual Framework
Fig. Conceptual Framework
6 References
1. Curry, A., & Moore, C.: Assessing Information Culture: An Exploratory Model. International Journal of Information Management, 23(2), 91-110 (2003)
2. Gendina, N.: Information Culture in the Information Society: the View from Russia. In: Proceeding the International Conference UNESCO between Two Phases of the World Summit on the Information Society. Retrieved November 2012, from http://unpan1.un.org/intradoc/groups/public/documents/un-dpadm/unpan047050.pdf#page=99 (2004)
3. Glaser, B. and Strauss, A.: The Discovery of Grounded Theory: Strategies of Qualitative Research. London: Weidenfeld & Nicolson (1976)
4. King, D.G. and Palmour, V.E.: "How Need are Generated; What We Have Found out them." In: The Nationwide Provision and Use of Information: ASLIB, IIS, LA Joint Conference, 15-19 Sept. 1980, Sheffield. Proceeding, 68-79. London: Library Association (1981)
5. Kuhlthau, C. C.: Inside the search process: Information seeking from the user's perspective. Journal of the American Society for Information Science, 42(5), 361-371 (1991)
6. Leckie, J., Pettigrew, E. & Sylvain, C.: Modeling the information seeking of professionals: a general model derived from research on engineers, health care professionals and lawyers. Library Quarterly, 66(2), 161-193 (1996)
7. Marcella, Rita and Baxter, Graeme.: The Information needs and the information seeking behavior of a national sample of the population in The United Kingdom
8. Ministry of Agriculture and Cooperatives.: Hand out for Agriculture Development Plan during the Eleventh National Economic and Social Development Plan (2012-2016) (in Thai). Retrieved June 11, 2012, from http://www.oae.go.th/download/document_plan/01.PDF
9. Hawanon, N. et al.: The Accordance between grounded theory and empirical indicators in building Community Empowerment Index (in Thai). Bangkok: PhD thesis, Program in Development Education, Graduate school, Srinakharinwirot University (2003)
10. National Information Center, Office of Permanent Secretary, Ministry of Commerce: Thailand's 15 essential export goods during 2006-2010 (Jan.-June) (in Thai). Retrieved July 2010, from http://www2.ops3.moc.go.th/export/export_topn_5y/report.asp
11. Riyaz, Aminath.: The Information culture of the Maldives: An exploratory Study of Information provision and Access in a Small Island Developing State. Retrieved August 3, 2012, from http://espace.library.curtin.edu.au/R/?func=dbin-jumpfull&object_id=133399 (2009)
12. Cheejang, S.: Higher Education and Knowledge-based Society (in Thai). Journal of Dusit Thani College, (2), 19-41 (2008)
13. Spink, A., & Cole, C.: Human Information Behavior: Integrating Diverse Approaches and Information Use. Journal of the American Society for Information Science and Technology, 57(1), 25–35 (2006)
14. Wilson, T.D.: Human Information Behaviour. Informing Science, 3(2), 49-55 (2000)
15. Zheng, Y.: Information Culture and Development: Chinese experience of e-health. In: Proceedings of the 38th Hawaii International Conference on System Sciences. Retrieved July 2012, from http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber