Transactions on computational science

DOCUMENT INFORMATION

Basic information

Format
Pages	190
File size	29.49 MB

Content

Journal Subline
LNCS 9590

Alexei Sourin, Guest Editor

Transactions on Computational Science XXVIII

Marina L. Gavrilova · C.J. Kenneth Tan, Editors-in-Chief

Special Issue on Cyberworlds and Cybersecurity

Lecture Notes in Computer Science 9590
Commenced Publication in 1973
Founding and Former Series Editors: Gerhard Goos, Juris Hartmanis, and Jan van Leeuwen

Editorial Board
David Hutchison, Lancaster University, Lancaster, UK
Takeo Kanade, Carnegie Mellon University, Pittsburgh, PA, USA
Josef Kittler, University of Surrey, Guildford, UK
Jon M. Kleinberg, Cornell University, Ithaca, NY, USA
Friedemann Mattern, ETH Zurich, Zürich, Switzerland
John C. Mitchell, Stanford University, Stanford, CA, USA
Moni Naor, Weizmann Institute of Science, Rehovot, Israel
C. Pandu Rangan, Indian Institute of Technology, Madras, India
Bernhard Steffen, TU Dortmund University, Dortmund, Germany
Demetri Terzopoulos, University of California, Los Angeles, CA, USA
Doug Tygar, University of California, Berkeley, CA, USA
Gerhard Weikum, Max Planck Institute for Informatics, Saarbrücken, Germany

More information about this series at http://www.springer.com/series/8183

Marina L. Gavrilova, C.J. Kenneth Tan, Alexei Sourin (Eds.)
Transactions on Computational Science XXVIII
Special Issue on Cyberworlds and Cybersecurity

Editors-in-Chief
Marina L. Gavrilova, University of Calgary, Calgary, AB, Canada
C.J. Kenneth Tan, Sardina Systems, Tallinn, Estonia

Guest Editor
Alexei Sourin, Nanyang Technological University, Singapore, Singapore

ISSN 0302-9743, ISSN 1611-3349 (electronic)
Lecture Notes in Computer Science
ISBN 978-3-662-53089-4, ISBN 978-3-662-53090-0 (eBook)
DOI 10.1007/978-3-662-53090-0
Library of Congress Control Number: 2015960432

© Springer-Verlag Berlin Heidelberg 2016

This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed.

The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

The publisher, the authors and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, express or implied, with respect to the material contained herein or for any errors or omissions that may have been made.

Printed on acid-free paper.

This Springer imprint is published by Springer Nature. The registered company is Springer-Verlag GmbH Berlin Heidelberg.

LNCS Transactions on Computational Science

Computational science, an emerging and increasingly vital field, is now widely recognized as an integral part of scientific and technical investigations,
affecting researchers and practitioners in areas ranging from aerospace and automotive research to biochemistry, electronics, geosciences, mathematics, and physics. Computer systems research and the exploitation of applied research naturally complement each other. The increased complexity of many challenges in computational science demands the use of supercomputing, parallel processing, sophisticated algorithms, and advanced system software and architecture. It is therefore invaluable to have input by systems research experts in applied computational science research.

Transactions on Computational Science focuses on original high-quality research in the realm of computational science in parallel and distributed environments, also encompassing the underlying theoretical foundations and the applications of large-scale computation. The journal offers practitioners and researchers the opportunity to share computational techniques and solutions in this area, to identify new issues, and to shape future directions for research, and it enables industrial users to apply leading-edge, large-scale, high-performance computational methods.

In addition to addressing various research and application issues, the journal aims to present material that is validated – crucial to the application and advancement of the research conducted in academic and industrial settings. In this spirit, the journal focuses on publications that present results and computational techniques that are verifiable.

Scope

The scope of the journal includes, but is not limited to, the following computational methods and applications:

– Aeronautics and Aerospace
– Astrophysics
– Big Data Analytics
– Bioinformatics
– Biometric Technologies
– Climate and Weather Modeling
– Communication and Data Networks
– Compilers and Operating Systems
– Computer Graphics
– Computational Biology
– Computational Chemistry
– Computational Finance and Econometrics
– Computational Fluid Dynamics
– Computational Geometry
– Computational Number Theory
– Data Representation and Storage
– Data Mining and Data Warehousing
– Information and Online Security
– Grid Computing
– Hardware/Software Co-design
– High-Performance Computing
– Image and Video Processing
– Information Systems
– Information Retrieval
– Modeling and Simulations
– Mobile Computing
– Numerical and Scientific Computing
– Parallel and Distributed Computing
– Robotics and Navigation
– Supercomputing
– System-on-Chip Design and Engineering
– Virtual Reality and Cyberworlds
– Visualization

Editorial

The Transactions on Computational Science journal is part of the Springer series Lecture Notes in Computer Science, and is devoted to a range of computational science issues, from theoretical aspects to application-dependent studies and the validation of emerging technologies.

The journal focuses on original high-quality research in the field of computational science in parallel and distributed environments, encompassing the theoretical foundations and the applications of large-scale computations and massive data processing. Practitioners and researchers share computational techniques and solutions in the area, identify new issues, shape future directions for research, and enable industrial users to apply the presented techniques.

The current issue is devoted to the state-of-the-art approaches in the domain of cybersecurity and cyberworlds. It comprises extended versions of the best papers from the International Conference on CyberWorlds, held in Gotland, Sweden, in October 2015. The first paper is a position paper, presenting open problems and identifying future directions in the vibrant domain of cyberworld research. It was co-written following the invited panel held during the conference. The other eight full papers are devoted to a range of topics, including virtual reality, games, haptic modeling, cybersecurity, brain wave analysis, shape parametrization, projections, and data mining.

We would
like to extend our sincere appreciation to the special issue guest editor, Associate Professor Alexei Sourin, NTU, Singapore, for his dedication to the special issue and the coordination of the position paper. We thank all the reviewers for their work on this special issue. We would also like to thank all of the authors for submitting their papers to the journal and the associate editors for their valuable work.

It is our hope that this collection of nine articles presented in this special issue will be a valuable resource for Transactions on Computational Science readers and will stimulate further research into the key area of cybersecurity and human–computer interaction.

May 2016
Marina L. Gavrilova
C.J. Kenneth Tan

Guest Editor Preface

Cyberworlds are information worlds or communities created in cyberspace by participants collaborating either intentionally or spontaneously. As information worlds, they accumulate information regardless of whether or not anyone is in. Cyberworlds can be based on sharing text, image, and video information, and they can also be immersive multi-user networked, shared virtual worlds. Cyberworlds have been created and applied in such areas as e-business, e-commerce, e-manufacturing, e-learning, and cultural heritage. They augment and sometimes replace real life and become a significant component of the real economy. Examples of such cyberworlds with millions of participants are communities created in different social networking services, virtual shared worlds, and multiplayer online games.

Problems of cyberworlds were discussed at the annual 2015 International Conference on CyberWorlds, which was held in Gotland, Sweden, during October 7–9, 2015. The eight full papers presented at the conference were selected to be published in extended form in this special issue of the Transactions on Computational Science.

The issue begins with a report on the position statements made by Alexei Sourin, Ray Earnshaw, Marina Gavrilova, and Olga Sourina during the
plenary panel on problems of human–computer interactions in cyberworlds.

In the next article, Kyota Aoki and Naoki Aoyagi propose a method for realizing augmented reality that works in head-worn equipment that recognizes objects in camera images.

Mikael Fridenfalk continues by presenting an application developed as a research platform for the real-time generation of 3D L-system structures. It enables the user to interact with the L-system geometries to render a mathematically defined world.

Next, Qian Fu, Zhongke Wu, Xiang Ying, Mengdi Wang, Xia Zheng, and Mingquan Zhou present a novel method for solving the problem of adequately revealing the esthetic value of Chinese calligraphy.

In the next contribution, Christopher J. Headleand, James Jackson, Ben Williams, Lee Priday, William J. Teahan, and Llyr Ap Cenydd explore how the perceived identity of a non-player character affects a player's behavior in computer games.

In the next article, Xiyuan Hou, Yisi Liu, Wei Lun Lim, Zirui Lan, Olga Sourina, Wolfgang Mueller-Wittig, and Lipo Wang describe a novel brain–computer interface integrated with proposed real-time emotion, mental workload, and stress recognition algorithms.

Andrés Iglesias, Akemi Gálvez, and Marta Collantes address the problem of fitting a given set of data points in the least-squares sense by using a polynomial Bézier curve.

Zahra Sayed, Hassan Ugail, Ian Palmer, Jon Purdy, and Carlton Reeve describe a novel approach to generating 3D Islamic geometric patterns using the shape grammar method.

Finally, Nurseda Yildirim and Bahri Uzunoğlu present data-mining and optimization methods for detecting power ramps, which are large swings in power generation within a short time window.

The organizers of the conference are very grateful to Prof. Marina Gavrilova, Editor-in-Chief of the Transactions on Computational Science, for her continuing support and assistance. We also wish to thank the authors for their high-quality contributions, as well
as the reviewers for their invaluable advice that helped to improve the papers.

April 2016
Alexei Sourin

Data Mining via Association Rules for Power Ramps Detected by Clustering or Optimization

Nurseda Yıldırım (1) and Bahri Uzunoğlu (1,2) (✉)

(1) Department of Engineering Sciences, Division of Electricity, Centre for Renewable Electric Energy Conversion, The Ångström Laboratory, Uppsala University, Box 534, 751 21 Uppsala, Sweden
nursedayildirim@iyte.edu.tr, bahri.uzunoglu@angstrom.uu.se
(2) Department of Mathematics, Florida State University, Tallahassee, FL 32310, USA
bahriuzunoglu@computationalrenewables.com

Abstract. Power ramp estimation has wide-ranging implications for wind power plants and power systems, which will be the focus of this paper. Power ramps are large swings in power generation within a short time window. This is an important problem in the power system, which needs to keep load and generation in balance at all times. Any imbalance in the power system leads to price volatility and grid security issues that can create power stability problems, which in turn lead to financial losses. In addition, power ramps decrease the lifetime of turbines and increase the operation and maintenance expenses. In this study, power ramps are detected by data mining and optimization. For detection and prediction of power ramps, a data mining K-means clustering approach and an optimization scoring function approach are implemented [1]. Finally, the association rules algorithm of data mining is employed to analyze temporal ramp occurrences between wind turbines for both the clustering and optimization approaches. The impact of each turbine on the other turbines is analyzed as different transactions at each time step. Operational rules based on these transactions are discovered by an Apriori association rule algorithm for operation room decision making. Discovery of association rules from an Apriori algorithm will serve the power system operator for decision making.

Keywords: Data mining · Big data · Power ramp
· Optimization · Association rules · Apriori algorithm · Clustering

© Springer-Verlag Berlin Heidelberg 2016
M.L. Gavrilova et al. (Eds.): Trans. on Comput. Sci. XXVIII, LNCS 9590, pp. 163–176, 2016.
DOI: 10.1007/978-3-662-53090-0

1 Introduction

An instant volatile change in the power level is defined as a ramp event. Power ramp estimation is an important problem for power system balancing. In wind power plants, due to the intermittency of wind speed, the power level can vary in a stochastic manner. Power ramp rate (PRR), an instant change in power level, can in simplest terms be described by the gradient of the power production, identified by anomalies in the first derivative of power production. The definition of a ramp can have different meanings in different contexts, but all share the general definition of large swings in power generation within a short time window. One of the reasons for not having a clear classification is the different time resolution of observations of the underlying physical processes, which are based on short-term atmospheric dynamics such as low level jets, low pressure systems, gusts, thunderstorms and others [1–4]. The analysis of power ramps will be the focus of this paper.

Power production is always positive or zero; however, because the direction of the change in power can vary, PRR can be negative or positive. A higher absolute value of PRR points to a faster power surge (or drop). Power ramp up and ramp down refer to positive and negative ramps respectively, with negative ramps impacting power system security and contingency as well as reserve electricity markets [5,6].

Control room operators face two main issues related to ramps. Positive ramps occur when wind power production increases unexpectedly due to a positive trend in wind speed over a short period, which can lead to financial losses and an imbalance in the grid. Operators must balance the load by decreasing the production of other power plants. For negative
ramp cases, operators should have enough backup power.

Power ramp rate related applications can employ available historical data, such as Supervisory Control and Data Acquisition (SCADA) databases and meteorological mast data, to create forecasts and prediction models, while some of this data is collected on a mandatory basis as a result of industry regulations. In these large databases, knowledge discovery issues can be addressed with data mining methods. Grid service infrastructure has also been one of the implementations of the methodology of data mining [7].

In the literature, a temporal physical parameter space study was undertaken by Kamath [4], which demonstrated that some atmospheric physical parameters taken from historical SCADA data are more important than others for ramp occurrences. At the end of the study, the author derived a helper key set for the control room operators. In another study, an Apriori algorithm was employed as a corrector of predicted wind speed values for the Hexi Corridor area of China [8]. In that study, meteorological variables that affect predicted wind speeds, such as temperature, pressure and humidity, were clustered according to rules between these parameters. The cluster means of each group (4 main sets, due to the number of meteorological towers) were found to decrease the prediction error [8]. There have also been other applications of Apriori rules in the wind power industry, in fields such as alarm data cleaning, operation and maintenance, and fault detection from alarm data [9,10], with the help of association rules.

In the analysis of power ramp events, data can be available with physical, spatial and temporal parameters. In particular, relatively high-frequency large data sets can be utilized with the help of the associative prediction rules of data mining that will be developed in this paper for decision making in the operational room. This will be the focus of this study.

To understand relationships between occurrences of ramps, the location of each turbine has been investigated previously by the authors [11–13] based on spatial and temporal effects. This will be further expanded in this work to associative prediction rules with physical and temporal parameters, which will in turn be expanded to a decision-making process algorithm for the power system operational room.

The article is organized as follows. Section 2 defines wind ramp rules. Section 3 introduces the clustering and optimization detection algorithms. Section 4 introduces association rules and the Apriori algorithm. The details of the input database and parameters will be given in Sect. 5. Findings and the list of rules are given in Sect. 6. Discussions will be presented in Sect. 7.

2 Definitions for Wind Ramp Rules

Power ramps can be defined by ramp start, ramp duration and ramp rate. Power ramp rate (PRR) values are calculated from SCADA power records in this study. The inputs from SCADA are a time series of time and power pairs

$SCADA = \{\{t_1, power_1\}, \ldots, \{t_N, power_N\}\}$

which are given in an interval $I = (i, l)$, where the indices of the time series are $i, l : 1 \le i < l \le N$.

The basic definition of the power ramp rate rule is the discrete derivative of power [14] between consecutive data points:

$PRR_1(i, i+1) = \frac{Power_{i+1} - Power_i}{t_{i+1} - t_i}$    (1)

herein $t_{i+1} - t_i = 10$ min. Another rule can be defined over non-consecutive time indices for ramp rate thresholding as

$PRR_2(i, l) = \frac{Power_l - Power_i}{t_l - t_i}$    (2)

and if one drops the denominator, the power swing thresholding is defined as

$PRR_3(i, l) = Power_l - Power_i$    (3)

and, as another rule, maximum and minimum power difference thresholding can be defined as

$PRR_4(i, l) = \max(Power_i, \ldots, Power_l) - \min(Power_i, \ldots, Power_l)$    (4)

Rules 1, 3 and 4 will be employed as input for the detection of ramps by the clustering and optimization methods that will be discussed in the next sections.

3 Detection of Wind Power Ramps

The detection of power ramps will be achieved by two approaches and
will be discussed in detail in Sects. 3.1 and 3.2.

3.1 Detection by Clustering Algorithm - K-Means

The power ramp rule $PRR_1(i, i+1)$ defined in the previous section,

$\mathbf{PRR}_1 = \{PRR_1(1, 2), \ldots, PRR_1(N-1, N)\}$    (5)

will be used as input to a clustering algorithm that optimizes an objective function

$F : Q_j(\mathbf{PRR}_1) \to \mathbb{R}$    (6)

Herein $Q_j(\mathbf{PRR}_1)$ is the set of all the partitions of the data $\mathbf{PRR}_1$ into K non-empty clusters as defined by $C_1, C_2, C_3, \ldots, C_j$, where the partition with cluster index j reorganizes the index of the $\mathbf{PRR}_1$ array as

$\mathbf{PRR}_1 = \{C_1, \ldots, C_j, \ldots, C_K\} = \{\{PRR^1_{1,1}, \ldots, PRR^1_{1,K_1}\}, \ldots, \{PRR^1_{K,1}, \ldots, PRR^1_{K,K_K}\}\}$    (7)

with an algorithm, which will be the K-means algorithm in this study. The K-means algorithm creates a solution to compute optimal values via the clustering criterion F, which depends on the sum of the distances between each element and its nearest cluster center (centroid) [15]. We can formulate it as follows, where K is the number of clusters, $K_j$ is the number of objects of cluster j, $PRR^1_{jk}$ is the kth object of the jth cluster and $\overline{PRR}^1_j$ is the centroid of each cluster. The objective function F can be expanded as [15]

$F(\{C_1, \ldots, C_K\}) = \sum_{j=1}^{K} \sum_{k=1}^{K_j} \| PRR^1_{jk} - \overline{PRR}^1_j \|$, with $\overline{PRR}^1_j = \frac{1}{K_j} \sum_{k=1}^{K_j} PRR^1_{jk}$, $j = 1, \ldots, K$    (8)

The conventional K-means algorithm pseudo code [16] can be summarized as below:

(1) The clustering methodology needs to decide on the optimal cluster size. The cluster size stands for a meaningful partition that neither loses information in large clusters nor creates too many small clusters. There have been various approaches to solve this issue. One of them is the entropy calculation, which is not employed in this study [17]. The second approach is the calculation and optimization of indexes, which are discussed in [17]. In this study, the NbClust package in the R program will be employed to perform this optimization task [17] based on several indexes discussed in [17]. The majority rule, which takes the most frequent of the 30 different cluster size results,
is implemented. The decision on the final cluster size suggestion will be based on the majority rule [17].
(2) The initial partition $(C_1, C_2, \ldots, C_K)$ of the database is executed via the Forgy algorithm [16].
(3) Compute the centroid of each cluster.
(4) Reassign each $PRR^1_{jk}$ to the closest cluster centroid.
(5) Recalculate the centroid $\overline{PRR}^1_j$ of each cluster.
(6) Reiterate until no further changes of cluster membership occur in a complete iteration, then stop.

In the next section, we will introduce the optimization detection algorithm that will preprocess the output of cluster number results of each turbine in generating association rules for operational decisions.

3.2 Detection by Optimization

Optimal ramp intervals were not explicitly addressed by rule $PRR_1(i, i+1)$ in Sect. 3.1, which had fixed 10 min intervals; however, as will be presented in the numerical results section, the clustering algorithm can address this implicitly. To address this explicitly, ramp rules 3 and 4 will be employed in this section for optimization over intervals of different duration, which means that the starting and ending points of the interval will not be fixed. For this, the optimization introduced by [1,3] will be used by employing the objective function defined in [18]

$J(i, l) = \max_{i \le m \le l} \left[ W(i, m) + J(m+1, l) \right]$    (9)

where $J(i, l)$ is the maximum score in the signal interval $I = (i, l)$ and is initialized as zero at the beginning of the iteration. $J(i, l)$ is computed as the maximum over $l - i$ subproblems. The scoring function $W(i, m)$ can be defined in different ways as long as it has the additivity property [1,3]. The weight function in this case is defined as

$W(i, l) = (l - i)^2 \, \mathbb{1}_{\{R(i,l) = 1\}}$

where $\mathbb{1}$ is the indicator operator, which equals one when the condition is met. Herein the ramp rules $R(i, l) = \{R_0(i, l), R_1(i, l)\}$ are a set of functions that define a ramp interval, and their scoring functions are defined as

$R_0(i, l) = \mathbb{1}_{PRR_3(i,l) \ge P_{val}}$    (10)

$R_1(i, l) = \mathbb{1}_{PRR_4(i,l) \ge \alpha}$    (11)

Here the parameter
$P_{val}$ is the given value for the power swing threshold test and $\alpha$ is the given value for the power difference threshold test. The proof of these concepts can be found in detail in [1,3]. This optimal ramp detection algorithm [1,3] is used to detect power ramp rate durations after a data cleaning and preparation phase, such as normalization to rated power [1,3]. The default settings of the available open access code [18] are used.

In the next section, we will introduce the Apriori algorithm to generate association rules for operational decisions. We will postprocess the output of cluster number results of each turbine.

4 Association Rules - Apriori Algorithm for Detected Power Ramps

To illustrate the three-layer process of the Apriori algorithm, let us define an array $P_{t,1 \ldots 15} = \{C_1^{t,1} \ldots C_3^{t,1}\}$ that defines the rows of the matrix below, with a single row for each time step. Each entry $C_j^{t,tn}$ belongs to a set of binary attributes called items, where t represents the time step, tn represents the turbine number and j represents the cluster label or, in the optimization case, the ramp up, down or neutral values that are discussed in Sect. 6.1. Since, in our study, there were five turbines, three clusters and three optimization scenarios, there are always 15 items in this example, where the item set of 15 is denoted by I. With rows t = 1, ..., N and columns j = 1, 2, 3:

$$P_{1 \ldots N, 1 \ldots 15} = \begin{pmatrix} C_1^{1,1} & C_2^{1,1} & C_3^{1,1} \\ \vdots & \vdots & \vdots \\ C_1^{N,1} & C_2^{N,1} & C_3^{N,1} \end{pmatrix}$$

Each row entry of $P_{t,1 \ldots 15}$ is a transaction, which means that there are as many transactions as time steps. Each row transaction contains the K-means cluster labels or the optimization ramp detections at each time step for each turbine in binary form. Each row entry that represents a new transaction will have subsets of items such as X and Y, where both item subsets have size less than 15, with $X, Y \subseteq I$ and $X \cap Y = \emptyset$. As a result, a directional association rule $X \to Y$ can be defined. Association rules are defined for each transaction based on each original row entry.
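The transaction encoding described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the item names of the form `PRR3=1` mirror the notation of the rule tables, while the helper names `to_transactions` and `to_binary_matrix` and the toy label series are hypothetical.

```python
from typing import Dict, List, Set

# Five turbines and three labels per turbine give the 15 items of the text.
TURBINES = ["PRR1", "PRR2", "PRR3", "PRR4", "PRR5"]
LABELS = [1, 2, 3]  # cluster labels j = 1, 2, 3
ITEMS = [f"{tn}={j}" for tn in TURBINES for j in LABELS]


def to_transactions(labels: Dict[str, List[int]]) -> List[Set[str]]:
    """Build one transaction (set of items) per time step."""
    n_steps = len(next(iter(labels.values())))
    if any(len(series) != n_steps for series in labels.values()):
        raise ValueError("all turbines need the same number of time steps")
    return [
        {f"{tn}={series[t]}" for tn, series in labels.items()}
        for t in range(n_steps)
    ]


def to_binary_matrix(transactions: List[Set[str]]) -> List[List[int]]:
    """One row per time step, one 0/1 column per item in ITEMS."""
    return [[int(item in tx) for item in ITEMS] for tx in transactions]


# Hypothetical labels for three time steps (invented for illustration).
toy_labels = {
    "PRR1": [2, 3, 3],
    "PRR2": [3, 3, 2],
    "PRR3": [1, 3, 3],
    "PRR4": [2, 3, 1],
    "PRR5": [3, 3, 3],
}
transactions = to_transactions(toy_labels)
matrix = to_binary_matrix(transactions)
```

Each row of `matrix` contains exactly five ones, one per turbine, because every turbine carries exactly one label at each time step.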
Each original row entry will define a new association rule. However, this will create a large number of association rules, so more filtering of the associations is required.

Support is a user-defined limit to filter out irrelevant occurrences of candidate association rules that will not be significant for decisions. This functionality will be defined by the function supp(), which basically counts the frequency of association rule occurrences. In the context of association rules, support is the occurrence count, or size, of the association rule over all rows. Mathematically, the support count $\sigma(X \to Y)$ for a rule can be stated as [19]

$\sigma(X \to Y) = |\{P_{t,1 \ldots 15} \mid X \to Y \subseteq P_{t,1 \ldots 15},\ P_{t,1 \ldots 15} \in P_{1 \ldots N, 1 \ldots 15}\}|$

where the symbol |.| denotes the number of elements in a set. If we define the total number of transactions, or time steps in our case, as N, the support of $X \to Y$ can be defined as

$supp(X \to Y) = \frac{\sigma(X \to Y)}{N}$    (12)

Confidence, defined by the function conf(), is the ratio of the support of a rule candidate, $supp(X \to Y)$, to the support of one subset of items, $supp(X)$:

$conf(X \to Y) = \frac{supp(X \to Y)}{supp(X)}$    (13)

The difference between confidence and support must be highlighted again: confidence is a tool to measure the strength of an association rule, and support symbolizes statistical significance. Filtering by a support threshold is an important tool to decide whether a rule is worth consideration or should be eliminated [20].

Lift is another criterion, denoted by the function lift(), defined as the ratio of the confidence to the support of the right-hand item subset. It is used to strengthen the filter when there are several association rules:

$lift(X \to Y) = \frac{supp(X \to Y)}{supp(X)\,supp(Y)}$    (14)

$= \frac{conf(X \to Y)}{supp(Y)}$    (15)

After the generation of item sets, the counting phase will identify the frequency of item groups in each item set class. For an ideal strong relationship between X and Y, support must be large and confidence must be high. Greater magnitudes of lift are the
proof of stronger associations between items (stronger rules). According to the occurrence of the counted candidates, their support value is calculated as the ratio of the candidate repetition number to the size of the observation set. In the final step, the support value of each candidate is compared with a pre-defined threshold support value such as 50 %. The candidates that meet or exceed this threshold are selected.

The data mining and rule discovery methodologies discussed here are by nature dependent on the input transaction data types. A transaction is the data form of an act that cannot be divided into smaller parts and that represents any change in the database (DB). Each transaction is a binary set of items. For the association rule process, the input data was transaction data with a time dimension, which is the count of transactions in this example. Different objects, such as physical variables or spatial or temporal dimensions, can also be employed. Based on the transaction data, the association rule process, specifically the Apriori algorithm, is employed here; it computes the frequent object groups of transactions in the database through numerous iterations [19].

5 Case Study - Ayyildiz

Ayyildiz is a wind farm that has five VESTAS V90-3.0 MW turbines at 80 m. The wind farm is located in the town of Bandirma, which is close to the Sea of Marmara, as illustrated in Fig. 1, with open sea in the West and East directions. The main wind direction in this wind farm is from North to North East, as demonstrated in Fig. 2. Ayyildiz wind farm power production values for the five turbines were recorded for the year 2013 at 10 min intervals. The 2013 power production values are used for the PRR analysis, while the data was scaled for unit conversion as necessary [16]. SCADA data was provided by TUBITAK (Scientific and Technological Research Council of Turkey).

6 Numerical Results

In this study, two types of power ramp rate detection algorithms are investigated, as discussed in the previous sections.
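The support, confidence and lift measures used to rank the mined rules can be sketched as follows. This is a minimal illustration rather than the authors' code: the function names `supp`, `conf` and `lift` follow the text, the support of a rule is taken as the share of transactions containing both sides, and the four toy transactions are invented for illustration.

```python
from typing import List, Set


def supp(itemset: Set[str], transactions: List[Set[str]]) -> float:
    """Share of transactions that contain every item of the item set."""
    hits = sum(1 for tx in transactions if itemset <= tx)
    return hits / len(transactions)


def conf(x: Set[str], y: Set[str], transactions: List[Set[str]]) -> float:
    """Confidence of the rule X -> Y: supp(X and Y together) / supp(X)."""
    support_x = supp(x, transactions)
    if support_x == 0:
        raise ValueError("left-hand side never occurs in the transactions")
    return supp(x | y, transactions) / support_x


def lift(x: Set[str], y: Set[str], transactions: List[Set[str]]) -> float:
    """Lift of the rule X -> Y: conf(X -> Y) / supp(Y)."""
    support_y = supp(y, transactions)
    if support_y == 0:
        raise ValueError("right-hand side never occurs in the transactions")
    return conf(x, y, transactions) / support_y


# Hypothetical transactions (one per time step), invented for illustration.
toy_transactions = [
    {"PRR3=1", "PRR5=3"},
    {"PRR3=1", "PRR5=2"},
    {"PRR3=3", "PRR5=3"},
    {"PRR3=2", "PRR5=3"},
]
lhs, rhs = {"PRR3=1"}, {"PRR5=3"}
rule_support = supp(lhs | rhs, toy_transactions)
rule_confidence = conf(lhs, rhs, toy_transactions)
rule_lift = lift(lhs, rhs, toy_transactions)
```

A lift below one, as in this toy case, indicates that the right-hand side occurs less often together with the left-hand side than it does overall.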
For the first approach, power differences over 10 min (fixed time step as ramp duration) [10] are calculated and clustered with the K-means algorithm. With the help of the cluster centroids, each observation is labeled as 2 (up), 3 (non ramp) or 1 (down). For the second approach, the optimization-based power ramp rate detection algorithm of [1] is used to calculate ramp durations and detect ramp occurrences, and each observation is labeled 1 (up), 0 (non ramp) or -1 (down). The outcomes of the two cases are studied separately with the Apriori rule mining algorithm.

6.1 Power Ramp Detection by Clustering and Optimization

In Fig. 3, the K-means algorithm detects down ramp and up ramp occurrences, and the optimal detection algorithm likewise detects down ramp and up ramp occurrences. This demonstrates that the optimization-based algorithm can give better results in comparison to the fixed interval based K-means study. This comes at the price of an increase of several orders of magnitude in computational time. The reader should keep in mind that the optimization-based algorithm uses complex rule sets to define a ramp. Ramp start index, ramp finish index, power swing and the ramp rate angle are some of the parameters that can be defined. However, ramp rate and duration alone are sufficient to define ramp detection optimization. In contrast, the K-means algorithm uses as input parameter just the difference between two registered power values within 10 min. The fixed duration assumption with the K-means algorithm is a simple, less sensitive approach, but it is still successful in terms of ramp detection accuracy. The analysis is conducted for 1000 observations and for the values of a single turbine to create simplified visual tools, while the final results are presented for the full data set.

The difference in detection accuracy causes a difference between the observations labeled as non ramp. In the fixed ramp rate K-means case, these differences are greater in comparison to the optimization-based detection algorithm. When the information is analyzed with the working principles of the Apriori rule finding algorithm, it
is expected that fixed interval based K-means will produce stronger rules for non ramp cases in comparison to optimization based power ramp rate detection that will produce more equally balanced rules for up, down and non ramp cases This will be discussed in next section Data Mining via Association Rules for Power Ramps (a) (b ) 171 (c) Fig Wind farm location a) Location of the wind farm site province in Turkey b) Ayyildiz wind farm location details c) Location of the wind farm site city (a) (b ) Fig Wind farm siting a) Wind farm layout with objects; triangles are turbines, circle is the met mast b) Wind rose of the wind farm (a ) (b) Fig Ramp detection a) Fixed interval PRR - K-means ramp detection for 1000 observations b) Optimization based ramp detection for 1000 observations 172 6.2 N Yildirim and B Uzuno˘ glu Association Rules for Power Ramps In Table and in Fig 5, eight significant power ramp rate rules for five wind turbines of fixed interval ramp rate K-means algorithm are presented Rules are sorted by their lift value from stronger rules to weaker rules Table Eight significant power ramp rate rules between five wind turbines for fixed interval ramp rate K-means algorithm LHS RHS non ramp Support Confidence Lift {PRR3=1} → {PRR5=3} 0.033 0.4647887 0.5135787 {PRR4=2} → {PRR2=3} 0.023 0.4339623 0.4768816 {PRR3=2,PRR4=2} → {PRR2=3} 0.022 0.4230769 0.4649197 {PRR3=1} → {PRR2=3} 0.030 0.4225352 0.4643244 {PRR3=2} → {PRR2=3} 0.024 0.4210526 0.4626952 {PRR4=1} → {PRR5=3} 0.027 0.4153846 0.4589885 {PRR1=2} → {PRR2=3} 0.022 0.4150943 0.4561476 {PRR1=2,PRR4=2} → {PRR2=3} 0.019 0.4130435 0.4538939 Fig K-means ramp detection Apriori rules for whole data set of Table If the results of Table and Fig are summarized in a list, the following conclusions can be drawn: Most powerful rule says that when the down ramp occurs in Turbine 3, non ramp case occurs in Turbine When the up ramp occurs in Turbine 4, non ramp case occurs in Turbine When the up ramp occurs in Turbine and 
Turbine 4, a non ramp case occurs in Turbine 2.
4. When a down ramp occurs in Turbine 3, a non ramp case occurs in Turbine 2.
5. When an up ramp occurs in Turbine 3, a non ramp case occurs in Turbine 2.
6. When a down ramp occurs in Turbine 4, a non ramp case occurs in Turbine 5.
7. When an up ramp occurs in Turbine 1, a non ramp case occurs in Turbine 2.
8. When an up ramp occurs in Turbine 1 and Turbine 4, a non ramp case occurs in Turbine 2.

In a previous study, the authors presented space clusters of the power ramp rate cases of five wind turbines [13]. Turbine 1 and Turbine 2, Turbine 3 and Turbine 4, and Turbine 5 represented three separate clusters. From this starting point, similar power ramp rate characteristics are expected from turbines that share the same cluster label. Let us re-examine the rules of the K-means table on this basis:

Turbine 3 vs Turbine 5 → different space clusters, lift = 0.51
Turbine 4 vs Turbine 2 → different space clusters
Turbine 3 and Turbine 4 vs Turbine 2 → different space clusters
Turbine 3 vs Turbine 2 → different space clusters
Turbine 3 vs Turbine 2 → different space clusters
Turbine 4 vs Turbine 5 → different space clusters
Turbine 1 vs Turbine 2 → same space cluster, lift = 0.456
Turbine 1 and Turbine 4 vs Turbine 2 → same space cluster, lift = 0.453

The table below and the corresponding figure present eight significant power ramp rate rules between the five wind turbines, obtained with the optimization algorithm. Rules are grouped by their right hand side (right hand side filtering effect) and, within each group, sorted by lift from stronger to weaker: rules 1–4 for non ramp, then rules 1–2 for up ramp and rules 1–2 for down ramp.

Table. Eight significant power ramp rate rules between wind turbines for the variable interval ramp rate optimization algorithm

LHS                          RHS (non ramp)    Support   Confidence   Lift
{PRR2=1}                   → {PRR4=0}          0.169     0.4747191    16.540.735
{PRR1=1,PRR3=-1}           → {PRR5=0}          0.007     0.8750000    12.886.598
{PRR1=1,PRR2=1,PRR3=-1}    → {PRR5=0}          0.007     0.8750000    12.886.598
{PRR1=1,PRR3=-1,PRR4=1}    → {PRR5=0}          0.006     0.8571429    12.623.606

LHS                          RHS (up ramp)     Support   Confidence   Lift
{PRR4=0,PRR5=0}            → {PRR2=1}          0.169     0.5971731    16.774.527
{PRR1=0,PRR4=0,PRR5=0}     → {PRR2=1}          0.169     0.5971731    16.774.527

LHS                          RHS (down ramp)   Support   Confidence   Lift
{PRR2=0,PRR5=0}            → {PRR4=-1}         0.120     0.5128205    0.9603380
{PRR2=0,PRR3=0,PRR5=0}     → {PRR4=-1}         0.120     0.5128205    0.9603380

Fig. Optimization based ramp detection: Apriori rules for the whole data set of the table above.

If the results of the table and figure above are summarized in a list, the following conclusions can be drawn:

1. When an up ramp occurs in Turbine 2, a non ramp case occurs in Turbine 4.
2. When an up ramp occurs in Turbine 1 and a down ramp occurs in Turbine 3, a non ramp case occurs in Turbine 5.
3. When an up ramp occurs in Turbine 1 and Turbine 2 and a down ramp occurs in Turbine 3, a non ramp case occurs in Turbine 5.
4. When an up ramp occurs in Turbine 1, a down ramp occurs in Turbine 3 and an up ramp occurs in Turbine 4, a non ramp case occurs in Turbine 5.
5. When a non ramp case occurs in Turbine 4 and Turbine 5, an up ramp occurs in Turbine 2 (most significant rule due to its lift value).
6. When a non ramp case occurs in Turbine 1, Turbine 4 and Turbine 5, an up ramp occurs in Turbine 2.
7. When a non ramp case occurs in Turbine 2 and Turbine 5, a down ramp occurs in Turbine 4.
8. When a non ramp case occurs in Turbine 2, Turbine 3 and Turbine 5, a down ramp occurs in Turbine 4.

As before, turbines that share a cluster label in the previous study [13] are expected to show similar power ramp rate characteristics. Let us re-examine the rules of the optimization table on this basis:

Turbine 2 vs Turbine 4 → different space clusters, lift = 16.5
Turbine 1 and Turbine 3 vs Turbine 5 → different space clusters, lift = 12.88
Turbine 1 and Turbine 2 and Turbine 3 vs Turbine 5 → different space clusters
Turbine 1 and Turbine 3 and Turbine 4 vs Turbine 5 → different space clusters
Turbine 4 and Turbine 5 vs Turbine 2
→ different space clusters, lift = 16.774
Turbine 1 and Turbine 4 and Turbine 5 vs Turbine 2 → different space clusters
Turbine 2 and Turbine 5 vs Turbine 4 → different space clusters
Turbine 2 and Turbine 3 and Turbine 5 vs Turbine 4 → same space cluster with Turbine 3, lift = 0.96

Different ramp occurrences, based on the different structure of the rules across turbines, are presented with lift values ranging from 0.51 to 0.453 in the K-means table and from 16.774 to 0.96 in the optimization table. In both cases, the stronger rules in time do not show characteristics contradictory to the space clusters of the turbines and the rules from the previous study [13]. Without exception, the stronger rules support the space relationship findings and the power ramp occurrences in wind turbines [13]. In the weaker rules, which have low lift values, these relations cannot be traced. In future research, models that incorporate ramp forecasting parameters in a Bayesian setting will be investigated to assess the impact of ramps on forecasting and on operation and maintenance.

Discussion and Conclusion

The proposed method is not site specific and can be applied to general cases. One example farm has been selected here for study. The 10 intervals recorded SCADA data can generate large databases for wind farms over several years. Two different ramp detection algorithms were tested. While K-means clustering was efficient in computational time for ramp detection, its accuracy in capturing ramps was lower than that of the more accurate but computationally intensive optimal power ramp detection strategy. This information can be employed in ramp forecast controls.

For larger wind farms with 100 or more wind turbines, taking into account only the total power production causes information loss, which has implications for the management of the wind farm. In contrast, working with each individual turbine uses resources inefficiently. Working with turbine clusters can help to gain time and more information. Clustering algorithms and machine learning applications could discover hidden rules between complex data parameters in the big data of these wind farms and support operational decision making. The analysis will further serve repowering strategies.

Future research will include detection of fundamental change in status patterns to identify frequent status patterns of turbine components for prediction of status patterns in wind turbines. Early prediction of status patterns related to the deterioration of components will benefit turbine operation and maintenance. This is being further developed in a Bayesian setting for operation and maintenance applications.

Acknowledgements

We would like to acknowledge the financial support given by Vindforsk and the Swedish Energy Agency grant “Bayesian methods for preventive maintenance”. The authors would like to acknowledge the financial support given by Computational Renewables LLC for the duration of this study. The second author, Bahri Uzunoğlu, would like to acknowledge the visiting scientist exchange granted at Florida State University, Department of Mathematics, with Prof. Yousuff Hussaini in the context of this research.

References

1. Sevlian, R., Rajagopal, R.: Detection and statistics of wind power ramps. IEEE Trans. Power Syst. 28(4), 3610–3620 (2013)
2. Ouyang, X.Z.T., Qin, L.: A survey of wind power ramp forecasting. Energy Power Eng. 5(4B), 368–372 (2013)
3. Sevlian, R., Rajagopal, R.: Wind power ramps: detection and statistics. In: Power and Energy Society General Meeting, pp. 1–8. IEEE, July 2012
4. Kamath, C.: Associating weather conditions with ramp events in wind power generation. In: Power Systems Conference and Exposition (PSCE), IEEE/PES, vol. 2011, pp. 1–8. IEEE (2011)
5. Uzunoglu, B., Bayazit, D.: A generic resampling particle filter joint parameter estimation for electricity prices with jump diffusion. In: 10th International Conference on the European Energy Market (EEM), pp. 17. IEEE
(2013)
6. Ülker, M.A.: Balancing of wind power: optimization of power systems which include wind power systems (2011)
7. Aflori, C., Craus, M.: Grid implementation of the Apriori algorithm. Adv. Eng. Softw. 38(5), 295–300 (2007)
8. Guo, Z., Chi, D., Wu, J., Zhang, W.: A new wind speed forecasting strategy based on the chaotic time series modelling technique and the Apriori algorithm. Energy Convers. Manag. 84, 140–151 (2014)
9. Tong, C., Guo, P.: Data mining with improved Apriori algorithm on wind generator alarm data. In: 25th Chinese Control and Decision Conference (CCDC), vol. 2013, pp. 1936–1941. IEEE (2013)
10. Kusiak, A., Verma, A.: Prediction of status patterns of wind turbines: a data-mining approach. J. Sol. Energy Eng. 133(1), 011008 (2011)
11. Yıldırım, N., Uzunoglu, B.: Association rules for clustering algorithms for data mining of temporal power ramp balance. In: Cyberworlds, Visby. IEEE (2015)
12. Uzunoglu, B., Albayrak, A.: Data mining of wind data generated by CFD solutions. In: CFD and Optimization, ECCOMAS, Antalya, Turkey (2011)
13. Yıldırım, N., Uzunoglu, B.: Spatial clustering for temporal power ramp balance and wind power estimation. In: Greentech. IEEE (2015)
14. Kusiak, A., Zheng, H.: Data mining for prediction of wind farm power ramp rates. In: IEEE International Conference on Sustainable Energy Technologies, ICSET 2008, vol. 2008, pp. 1099–1103. IEEE (2008)
15. Pena, J.M., Lozano, J.A., Larranaga, P.: An empirical comparison of four initialization methods for the k-means algorithm. Pattern Recogn. Lett. 20(10), 1027–1040 (1999)
16. Gan, G., Ma, C., Wu, J.: Data Clustering: Theory, Algorithms, and Applications, vol. 20. SIAM, Philadelphia (2007)
17. Charrad, M., Ghazzali, N., Boiteau, V., Niknafs, A.: NbClust: an R package for determining the relevant number of clusters in a data set. J. Stat. Softw. 61(6), 1–36 (2014)
18. Sevlian, R.: Wind ramp detection (2012). http://web.stanford.edu/rsevlian/WindRampDetect.html
19. Tan, P.-N., Kumar, V.: Chapter association analysis: basic concepts and algorithms. In: Introduction to Data Mining. Addison-Wesley (2005). ISBN 321321367
20. Agrawal, R., Imieliński, T., Swami, A.: Mining association rules between sets of items in large databases. In: ACM SIGMOD Record, vol. 22, no. 2, pp. 207–216. ACM (1993)

Author Index

Aoki, Kyota
Aoyagi, Naoki
Ap Cenydd, LLyr
Collantes, Marta
Earnshaw, Rae
Fridenfalk, Mikael
Fu, Qian
Gálvez, Akemi
Gavrilova, Marina
Headleand, Christopher J.
Hou, Xiyuan
Iglesias, Andrés
Jackson, James
Lan, Zirui
Lim, Wei Lun
Liu, Yisi
Mueller-Wittig, Wolfgang
Palmer, Ian
Priday, Lee
Purdy, Jon
Reeve, Carlton
Sayed, Zahra
Sourin, Alexei
Sourina, Olga
Teahan, William J.
Ugail, Hassan
Uzunoğlu, Bahri
Wang, Lipo
Wang, Mengdi
Williams, Ben
Wu, Zhongke
Yıldırım, Nurseda
Ying, Xiang
Zheng, Xia
Zhou, Mingquan

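Note on rule measures. As a minimal sketch of how the support, confidence and lift columns of the association rule tables in Sect. 6.2 are defined, the following Python snippet computes the three measures for a single rule from coded ramp labels. The function name rule_metrics, the toy observations, and the coding 1 = down ramp, 2 = up ramp, 3 = non ramp (as read off the K-means rules) are illustrative assumptions; the tooling actually used by the authors is not shown in the text.

```python
def rule_metrics(rows, lhs, rhs):
    """Return (support, confidence, lift) of the rule lhs -> rhs.

    rows: list of dicts mapping a turbine label (e.g. "PRR3") to a ramp code.
    lhs, rhs: dicts of label -> code that must all match in a row.
    """
    n = len(rows)
    if n == 0:
        return 0.0, 0.0, 0.0

    def matches(row, items):
        return all(row.get(key) == value for key, value in items.items())

    n_lhs = sum(1 for r in rows if matches(r, lhs))
    n_rhs = sum(1 for r in rows if matches(r, rhs))
    n_both = sum(1 for r in rows if matches(r, lhs) and matches(r, rhs))

    support = n_both / n                            # P(LHS and RHS)
    confidence = n_both / n_lhs if n_lhs else 0.0   # P(RHS | LHS)
    rhs_freq = n_rhs / n                            # P(RHS)
    lift = confidence / rhs_freq if rhs_freq else 0.0
    return support, confidence, lift


# Hypothetical observations (not taken from the wind farm data set).
rows = [
    {"PRR3": 1, "PRR5": 3},
    {"PRR3": 1, "PRR5": 3},
    {"PRR3": 1, "PRR5": 2},
    {"PRR3": 1, "PRR5": 1},
    {"PRR3": 2, "PRR5": 3},
    {"PRR3": 3, "PRR5": 3},
    {"PRR3": 3, "PRR5": 3},
    {"PRR3": 3, "PRR5": 3},
    {"PRR3": 2, "PRR5": 3},
    {"PRR3": 3, "PRR5": 2},
]

# Rule of the form {PRR3=1} -> {PRR5=3}: down ramp in Turbine 3, non ramp in Turbine 5.
support, confidence, lift = rule_metrics(rows, {"PRR3": 1}, {"PRR5": 3})
print(f"support={support:.3f} confidence={confidence:.3f} lift={lift:.3f}")
```

A lift above 1 indicates that the left hand side makes the right hand side more likely than its baseline frequency, and a lift below 1 indicates the opposite.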