An improved forecasting model combining recurrent fuzzy logical relationships and K-means clustering technique


Journal of Military Science and Technology Research (Tạp chí Nghiên cứu KH&CN quân sự), ACMEC Special Issue, 07-2017. Sections: Scientific and technological research (Nghiên cứu khoa học công nghệ); Control – Mechatronics – Communications (Điều khiển – Cơ điện tử - Truyền thông).

Nghiem Van Tinh¹*, Nguyen Cong Dieu²

Abstract: Most forecasting approaches based on fuzzy time series use intervals of static length. The disadvantage of static interval lengths is that the historical data are put into the intervals roughly, even when their variance is not high. In this paper, a new forecasting model is presented that combines fuzzy time series (FTS) with two concepts: recurrent fuzzy relationship groups (RFRGs) and the K-means clustering technique. First, the K-means clustering algorithm is used to divide the historical data into clusters, which are then adjusted into intervals of different lengths. Then, based on the new intervals, the proposed method fuzzifies all the historical data, identifies all fuzzy relationships, constructs the recurrent fuzzy logical relationship groups and calculates the forecasted output with an improved rule in the defuzzification phase. To evaluate the performance of the proposed model, two datasets are used: the Taiwan Futures Exchange (TAIFEX) data from 8/3/1998 to 9/30/1998 and the historical enrolment data of the University of Alabama. Compared with other methods in the literature, in particular first-order and high-order FTS models, the proposed method shows better accuracy in forecasting the number of students of the University of Alabama from 1971 to 1992 and in TAIFEX prediction.

Keywords: Fuzzy time series, Fuzzy forecasting, Recurrent fuzzy logic relationship, K-means clustering, Enrolments, TAIFEX.

1. INTRODUCTION

Predicting future values of time series has attracted interest since early times. Forecasting models have been used for various problems, such as enrolment forecasting [2], crop forecasting [7], [8], [11], temperature prediction [14], [20], [21] and stock markets [14]. Traditional forecasting methods, however, cannot deal with forecasting problems in which the historical data are represented by linguistic values. Song and Chissom [2], [3] proposed the time-invariant and the time-variant FTS models, which use max–min operations to forecast the enrolments of the University of Alabama. The main drawback of these methods is the heavy computation required when the fuzzy relationship matrix is large. Chen [4] then proposed the first-order FTS model, which uses fuzzy logical relationship groups (FLRGs) to reduce the computational complexity of the forecasting process. His model employs simple arithmetic calculations instead of max–min composition operations and achieves better forecasting accuracy.

Afterwards, FTS has been widely studied to improve forecasting accuracy in many applications. Huarng [6] presented a method for forecasting the enrolments of the University of Alabama and the TAIFEX based on [4], adding a heuristic function to obtain better forecasting results. Chen also extended his previous work and presented several forecasting models based on high-order fuzzy time series to deal with the enrolment forecasting problem [9], [12]. Yu proposed a refinement-relation model [5] and a weighting scheme [10] to improve forecasting accuracy; both the stock index and the enrolments are used as targets in the empirical analysis. Ref. [13] presented a new forecasting model based on trapezoidal fuzzy numbers. Huarng [19] showed that different interval lengths may affect the forecasting accuracy, and modified the previous method by using ratio-based lengths to obtain better accuracy. Recently, [17], [20], [22] presented hybrid forecasting models which combine particle swarm optimization with FTS to find a proper length for each interval and adjust the interval lengths. Some other
techniques determine the best intervals and interval lengths by clustering, such as the automatic clustering techniques in [18], K-means clustering in [15] and fuzzy c-means clustering in [23].

In this paper, a hybrid forecasting model combining recurrent fuzzy relationship groups with the K-means clustering algorithm is presented. The proposed model differs from the approaches in [4], [17] in the way the fuzzy relationships are created and in the defuzzification forecasting rules. In the case study, we apply the proposed method to forecast the enrolments of the University of Alabama and the TAIFEX. The experimental results show that the proposed method achieves a higher average forecasting accuracy than the existing methods. In addition, the empirical results show that the high-order FTS model outperforms the first-order FTS model, with a lower forecast error.

The rest of this paper is organized as follows. In Section 2, we provide a brief review of FTS and the K-means clustering algorithm. In Section 3, we present a hybrid method for handling forecasting problems based on K-means clustering and the RFRGs, illustrated by forecasting the enrolments of the University of Alabama. The experimental results are shown and analysed in Section 4. Finally, conclusions are presented in Section 5.

2. FUZZY TIME SERIES AND K-MEANS CLUSTERING ALGORITHM

2.1 Fuzzy time series

In 1993, Song and Chissom proposed the definitions of fuzzy time series [2], [3], where the values of a fuzzy time series are represented by fuzzy sets. Let $U = \{u_1, u_2, \dots, u_n\}$ be a universe of discourse. A fuzzy set $A$ of $U$ is defined as $A = f_A(u_1)/u_1 + \dots + f_A(u_n)/u_n$, where $f_A : U \to [0, 1]$ is the membership function of $A$, and $f_A(u_i) \in [0, 1]$ indicates the grade of membership of $u_i$ in the fuzzy set $A$, with $1 \le i \le n$. General definitions of fuzzy time series are given as follows.

Definition 1: Fuzzy time series [2], [3]. Let $Y(t)$ $(t = \dots, 0, 1, 2, \dots)$, a subset of $R$, be the universe of discourse on which fuzzy sets $f_i(t)$ $(i = 1, 2, \dots)$ are defined, and let $F(t)$ be a collection of $f_i(t)$ $(i = 1, 2, \dots)$. Then $F(t)$ is called a fuzzy time series on $Y(t)$ $(t = \dots, 0, 1, 2, \dots)$.

Definition 2: Fuzzy logical relationship (FLR) [2]-[4]. The relationship between $F(t)$ and $F(t-1)$ can be denoted by $F(t-1) \to F(t)$. Let $A_i = F(t-1)$ and $A_j = F(t)$; the relationship between $F(t-1)$ and $F(t)$ is then denoted by the fuzzy logical relationship $A_i \to A_j$, where $A_i$ is the current state (the left-hand side) and $A_j$ is the next state (the right-hand side) of the fuzzy relation.

Definition 3: $\lambda$-order fuzzy logical relations [9]. Let $F(t)$ be a fuzzy time series. If $F(t)$ is caused by $F(t-1), F(t-2), \dots, F(t-\lambda+1), F(t-\lambda)$, then this fuzzy relationship is represented by $F(t-\lambda), \dots, F(t-2), F(t-1) \to F(t)$ and is called a $\lambda$-order fuzzy time series.

Definition 4: Recurrent fuzzy relationship group (RFRG) [10]. Fuzzy logical relationships with the same fuzzy set on the left-hand side can be further grouped into a fuzzy relationship group. Suppose there are relationships such that $A_i \to A_{j1}$; $A_i \to A_{j2}$; …; $A_i \to A_{jn}$. Based on [10], these fuzzy logical relationships can be grouped into the same FLRG as $A_i \to A_{j1}, A_{j2}, \dots, A_{jn}$, where repeated right-hand sides are kept.

2.2 K-means clustering algorithm

K-means clustering, introduced in [1], is one of the simplest unsupervised learning algorithms for solving the well-known clustering problem. The algorithm groups the data by their closeness to each other according to the Euclidean distance. The result depends on the number of clusters. The algorithm is composed of the following steps.

Step 1: Choose $k$ centroids $\{c_1, \dots, c_k\}$.
Step 2: Assign each object $x$ to a cluster $C_i$: $x \in C_i$ if $\|x - c_i\| \le \|x - c_j\|$ for all $j \ne i$.
Step 3: Update $\{c_i\}$ to minimize $\sum_{x \in C_i} \|x - c_i\|^2$, $i = 1, \dots, k$.
Step 4: Reassign the objects using the new centroids.
Step 5: Repeat Steps 2, 3 and 4 until the centroids no longer move.

3. FORECASTING MODEL BASED ON K-MEANS CLUSTERING AND RFRG

In this section, a novel forecasting model based on
recurrent fuzzy relationship groups and the K-means clustering algorithm is presented for forecasting the enrolments of the University of Alabama. First, in subsection 3.1, the K-means clustering algorithm is applied to classify the collected enrolment data into clusters, and these clusters are adjusted into contiguous intervals. Then, based on the intervals obtained, we fuzzify all historical data and establish the recurrent fuzzy relationship groups. Finally, the forecasting output is obtained from the recurrent fuzzy relationship groups and the proposed forecasting rules, as shown in subsection 3.2. To verify the effectiveness of the proposed model, all historical enrolments [4] are used to illustrate the first-order FTS forecasting process.

3.1 The K-means clustering algorithm for creating intervals from the historical data set

The algorithm consists of four steps and is introduced step by step using the same dataset [4].

Step 1: Apply the K-means clustering algorithm to partition the historical time series data into $q$ clusters and sort the data in the clusters in ascending order. In this paper we set $q = 14$ clusters; the clustering results are as follows:

{13055}, {13563}, {13867}, {14696, 15145, 15163}, {15311}, {15433, 15460, 15497}, {15603}, {15861, 15984}, {16388}, {16807}, {16859}, {16919}, {18150}, {18876, 18970, 19328, 19337}

Step 2: Calculate the cluster centers. In this step, we use the automatic clustering techniques [18] to generate the cluster center $\mathrm{Center}_k$ of each cluster according to Eq. (1):

$$\mathrm{Center}_k = \frac{\sum_{i=1}^{n} d_i}{n} \tag{1}$$

where $d_i$ is a datum in $\mathrm{cluster}_k$, $n$ denotes the number of data in $\mathrm{cluster}_k$ and $1 \le i \le n$.

Step 3: Adjust the clusters into intervals according to the following rules. Assume that $\mathrm{Center}_k$ and $\mathrm{Center}_{k+1}$ are adjacent cluster centers; then the upper bound $\mathrm{Cluster\_UB}_k$ of $\mathrm{cluster}_k$ and the lower bound $\mathrm{Cluster\_LB}_{k+1}$ of $\mathrm{cluster}_{k+1}$ can be calculated as follows:

$$\mathrm{Cluster\_UB}_k = \frac{\mathrm{Center}_k + \mathrm{Center}_{k+1}}{2} \tag{2}$$

$$\mathrm{Cluster\_LB}_{k+1} = \mathrm{Cluster\_UB}_k \tag{3}$$

where $k = 1, \dots, q-1$. Because there is no cluster before the first cluster and no cluster after the last one, the lower bound $\mathrm{Cluster\_LB}_1$ of the first cluster and the upper bound $\mathrm{Cluster\_UB}_q$ of the last cluster are calculated as follows:

$$\mathrm{Cluster\_LB}_1 = \mathrm{Center}_1 - (\mathrm{Cluster\_UB}_1 - \mathrm{Center}_1) \tag{4}$$

$$\mathrm{Cluster\_UB}_q = \mathrm{Center}_q + (\mathrm{Center}_q - \mathrm{Cluster\_LB}_q) \tag{5}$$

For example, $\mathrm{Cluster\_UB}_1 = (13055 + 13563)/2 = 13309$ and $\mathrm{Cluster\_LB}_1 = 13055 - (13309 - 13055) = 12801$, which gives the first interval of Table 1.

Step 4: Let each cluster form an interval $u_k$, which means that the upper bound and the lower bound of the cluster are also the upper bound and the lower bound of the interval $u_k$, respectively. Calculate the middle value $\mathrm{Mid\_value}_k$ of the interval $u_k$ as follows:

$$\mathrm{Mid\_value}_k = \frac{\mathrm{interval\_LB}_k + \mathrm{interval\_UB}_k}{2} \tag{6}$$

where $\mathrm{interval\_LB}_k$ and $\mathrm{interval\_UB}_k$ are the lower bound and the upper bound of the interval $u_k$, respectively, with $k = 1, \dots, q$.

3.2 Forecasting model based on the first-order FTS

In this section, we present a hybrid method for forecasting enrolments based on the K-means clustering algorithm and recurrent fuzzy relationship groups. The proposed method is presented as follows.

Step 1: Partition the universe of discourse $U$ into intervals. After applying the K-means clustering procedure, we obtain the following 14 intervals; their middle values are listed in Table 1.

Table 1. The midpoint of each interval

| No | Interval | MidPoint |
|---|---|---|
| 1 | [12801, 13309] | 13055 |
| 2 | [13309, 13715] | 13512 |
| 3 | [13715, 14434] | 14074.5 |
| 4 | [14434, 15156] | 14795 |
| 5 | [15156, 15387] | 15271.5 |
| 6 | [15387, 15533] | 15460 |
| 7 | [15533, 15762.5] | 15647.75 |
| 8 | [15762.5, 16155] | 15958.75 |
| 9 | [16155, 16597.5] | 16376.25 |
| 10 | [16597.5, 16833] | 16715.25 |
| 11 | [16833, 16889] | 16861 |
| 12 | [16889, 17534.5] | 17211.75 |
| 13 | [17534.5, 18639] | 18086.75 |
| 14 | [18639, 19617] | 19128 |

Step 2: Define the fuzzy sets for the observations (historical data). Each interval in Step 1 represents a linguistic variable of enrolment. For 14 intervals, there are 14 linguistic variables. Each linguistic variable represents a fuzzy set $A_i$ $(1 \le i \le 14)$, defined in Eq. (7):

$$A_i = a_{i1}/u_1 + a_{i2}/u_2 + \dots + a_{i14}/u_{14}, \qquad a_{ij} = \begin{cases} 1 & j = i \\ 0.5 & j = i - 1 \text{ or } j = i + 1 \\ 0 & \text{otherwise} \end{cases} \tag{7}$$

Here, the symbol “+” indicates the operation of union and the symbol “/”
indicates the separator, rather than the commonly used summation and division in algebra, respectively. The values 0, 0.5 and 1 indicate the grade of membership of $u_j$ in the fuzzy set $A_i$.

Step 3: Fuzzify all historical enrolment data. In order to fuzzify all historical data, it is first necessary to assign a corresponding linguistic value to each interval. The simplest way is to assign to each interval the fuzzy set in which it has the highest membership degree. For example, the historical enrolment of year 1972 is 13563, and it belongs to interval $u_2$ because 13563 is within $u_2 = [13309, 13715]$. We therefore assign to it the linguistic value, or fuzzy set, $A_2$ corresponding to interval $u_2$. In the same way, we obtain the fuzzified enrolments listed in Table 2.

Table 2. Fuzzified historical enrolment data of the University of Alabama

| Year | Actual data | Fuzzy set | Year | Actual data | Fuzzy set |
|---|---|---|---|---|---|
| 1971 | 13055 | A1 | 1982 | 15433 | A6 |
| 1972 | 13563 | A2 | 1983 | 15497 | A6 |
| … | … | … | … | … | … |
| 1980 | 16919 | A12 | 1991 | 19337 | A14 |
| 1981 | 16388 | A9 | 1992 | 18876 | A14 |

Step 4: Create all $\lambda$-order fuzzy relationships ($\lambda \ge 1$). Based on Definitions 2 and 3, to establish a $\lambda$-order fuzzy relationship we find every relationship of the form $F(t-\lambda), \dots, F(t-2), F(t-1) \to F(t)$, where $F(t-\lambda), \dots, F(t-1)$ and $F(t)$ are called the current state and the next state, respectively. A $\lambda$-order fuzzy relationship in the training phase is then obtained by replacing them with the corresponding linguistic values. For example, for $\lambda = 1$, Table 2 gives the fuzzy relation $A_1 \to A_2$ from $F(1971) \to F(1972)$. In this way we obtain the first-order fuzzy relationships shown in Table 3, where there are 22 relations; the first 21 relations are called the trained patterns, and the last one is called the untrained pattern (in the testing phase). For the untrained pattern, relation 22 has the fuzzy relation A14 → #, as it is created by the relation $F(1992) \to F(1993)$: the linguistic value of $F(1993)$ is unknown within the historical data, and this unknown next state is denoted by the symbol ‘#’.

Table 3. The first-order fuzzy logical relationships

| No | Year_status | Fuzzy relation | No | Year_status | Fuzzy relation |
|---|---|---|---|---|---|
| 1 | 1971 → 1972 | A1 → A2 | 12 | 1982 → 1983 | A6 → A6 |
| 2 | 1972 → 1973 | A2 → A3 | 13 | 1983 → 1984 | A6 → A4 |
| … | … | … | … | … | … |
| 10 | 1980 → 1981 | A12 → A9 | 21 | 1991 → 1992 | A14 → A14 |
| 11 | 1981 → 1982 | A9 → A6 | 22 | 1992 → 1993 | A14 → # |

Step 5: Establish all fuzzy logical relationship groups.

Table 4. The complete first-order recurrent fuzzy relationship groups (FRGs)

| No group | FRG | At time |
|---|---|---|
| 1 | A1 → A2 | t = 1 |
| 2 | A2 → A3 | t = 2 |
| 3 | A3 → A4 | t = 3 |
| 4 | A4 → A6, A5 | t = 4, 14 |
| 5 | A6 → A5, A6, A4 | t = 5, 12, 13 |
| 6 | A5 → A7, A8 | t = 6, 15 |
| 7 | A7 → A8 | t = 7 |
| 8 | A8 → A10, A11 | t = 8, 16 |
| 9 | A10 → A12 | t = 9 |
| 10 | A12 → A9 | t = 10 |
| 11 | A9 → A6 | t = 11 |
| 12 | A11 → A13 | t = 17 |
| 13 | A13 → A14 | t = 18 |
| 14 | A14 → A14, A14, A14 | t = 19, 20, 21 |

In previous studies [4], [12], [16], [17], repeated FLRs were simply ignored when the fuzzy relationships were established. According to Definition 4, however, the recurrence of fuzzy relations can be used to indicate how an FLR may appear in the future. From this viewpoint, and based on Table 3, we establish all recurrent FRGs, as shown in Table 4.

Step 6: Defuzzify and calculate the forecasting values. To calculate the forecast output for all recurrent FRGs, we use [11] for the trained patterns in the training phase and [17] for the untrained patterns in the testing phase. For the training phase, we compute the forecast values of the recurrent fuzzy relationship groups from the fuzzy sets on the right-hand side (the next states) within the same group. For each group, we divide the interval corresponding to each next state into $p$ sub-regions of equal size ($p = 3$) and calculate a forecasted value for the group according to Eq. (8):

$$\mathrm{Forecasted\_value} = \frac{1}{2n} \sum_{j=1}^{n} \left( m_j + \mathrm{subm}_j \right) \tag{8}$$

where
 - $n$ is the total number of next states, i.e. the total number of fuzzy sets on the right-hand side within the same group;
 - $m_j$ is the midpoint of the interval $u_j$ corresponding to the $j$-th fuzzy set on the right-hand side, where the highest membership level of fuzzy set $A_j$ occurs;
 - $\mathrm{subm}_j$ is the midpoint of the one of the $p$ sub-regions of that interval which contains the corresponding historical value.

For example, for the group A4 → A6, A5 the next states are realised by the enrolments 15460 (in $u_6$) and 15163 (in $u_5$), so the forecast is $(15460 + 15460 + 15271.5 + 15194.5)/4 = 15346.5$.

For the testing phase, we calculate a forecasted value based on Eq. (9), where $w_h$ is the highest vote, predefined by the user, $\lambda$ is the order of the fuzzy relationship, and $m_{t1}$ and $m_{ti}$ denote the midpoints of the intervals corresponding to the latest past and the other past linguistic values in the current state:

$$\mathrm{Forecasted\_value} = \frac{m_{t1} \cdot w_h + m_{t2} + \dots + m_{t\lambda}}{w_h + (\lambda - 1)}, \qquad i = 2, \dots, \lambda \tag{9}$$

Based on the forecasting rules (8) and (9), the forecasted results for all first-order recurrent fuzzy relationship groups are listed in Table 5.

Table 5. The complete forecasted values for all first-order recurrent fuzzy relationship groups (FRGs)

| No group | FRG | Value |
|---|---|---|
| 1 | A1 → A2 | 13512 |
| 2 | A2 → A3 | 13954.66 |
| 3 | A3 → A4 | 14795 |
| 4 | A4 → A6, A5 | 15346.5 |
| 5 | A6 → A5, A6, A4 | 15236.56 |
| 6 | A5 → A7, A8 | 15784.12 |
| 7 | A7 → A8 | 15893.34 |
| 8 | A8 → A10, A11 | 16807.75 |
| 9 | A10 → A12 | 17104.16 |
| 10 | A12 → A9 | 16376.25 |
| 11 | A9 → A6 | 15435.66 |
| 12 | A11 → A13 | 18086.75 |
| 13 | A13 → A14 | 19128 |
| 14 | A14 → A14, A14, A14 | 19182.33 |
| 15 | A14 → # | 19128 |

Based on Table 5 and the data in Table 1, the forecasted enrolments from 1971 to 1992 for the first-order fuzzy time series model with 14 intervals are listed in Table 6.

Table 6. The complete forecasted outputs based on the first-order FTS with 14 intervals

| Year | Actual data | Fuzzy set | Forecasted value | Forecasted − actual |
|---|---|---|---|---|
| 1971 | 13055 | A1 | Not forecasted | |
| 1972 | 13563 | A2 | 13512 | −51 |
| 1973 | 13867 | A3 | 13955 | 88 |
| … | … | … | … | … |
| 1991 | 19337 | A14 | 19182 | −155 |
| 1992 | 18876 | A14 | 19182 | 306 |
| 1993 | | # | 19128 | |

To estimate the forecasting accuracy, the mean square error (MSE) is used:

$$\mathrm{MSE} = \frac{1}{n} \sum_{i} (F_i - R_i)^2 \tag{10}$$

where $R_i$ denotes the actual data at year $i$, $F_i$ is the forecasted value at year $i$, $n$ is the number of forecasted data over which the sum runs, and $\lambda$ is the order of the fuzzy relationships (the first $\lambda$ observations have no forecast).

4. EXPERIMENTAL RESULTS

In this paper, we apply the proposed method to forecast the
enrolments of the University of Alabama using the whole historical data [4] for the period from 1971 to 1992. We also apply the proposed method to another forecasting problem: the empirical data for the TAIFEX [14] from 8/3/1998 to 9/30/1998 are used for a comparative study in the training phase.

4.1 Experimental results for enrolment prediction

The actual enrolments of the University of Alabama [4] are used for a comparative study in the training and testing phases. In order to verify its forecasting effectiveness, the proposed model is compared with existing models for various orders and different numbers of intervals. The forecasting accuracy of the proposed method is estimated using the MSE value in Eq. (10).

4.1.1 Experimental results from the training phase

In order to verify the forecasting effectiveness of the proposed model for first-order FLRGs, five FTS models, namely the SCI model [2], the C96 model [4], the H01 model [5], the CC06a model [11] and the HPSO model [17], are examined and compared. A comparison of the forecasting results among these models is shown in Table 7.

Table 7. A comparison of the forecasted results for the first-order FLRGs with 14 intervals

| Year | Actual data | SCI | C96 | H01 | CC06a | HPSO | Our proposed |
|---|---|---|---|---|---|---|---|
| 1971 | 13055 | | | | | | |
| 1972 | 13563 | 14000 | 14000 | 14000 | 13714 | 13555 | 13512 |
| … | … | … | … | … | … | … | … |
| 1991 | 19337 | 19000 | 19000 | 19500 | 19149 | 19340 | 19182 |
| 1992 | 18876 | 19000 | 19000 | 19149 | 19014 | 19014 | 19182 |
| 1993 | N/A | | | | | | 19128 |
| MSE | | 423027 | 407507 | 226611 | 35324 | 22965 | 20333 |

The proposed model obtains the smallest MSE value, 20333, among all the compared models. The major difference between the CC06a and HPSO models and our model lies in the defuzzification forecasting rules and the optimization methods used. The two models CC06a [11] and HPSO [17] use the genetic algorithm and the particle swarm optimization algorithm to get the appropriate intervals, respectively,
while the proposed model uses the K-means algorithm to obtain the interval lengths. The forecasting results of the H01 model [5], the CC06a model [11], the HPSO model [17] and the proposed method, together with the actual enrolments, are plotted in Fig. 1.

Fig. 1. The curves of the actual data and of the H01, CC06a, HPSO models and our model for forecasting the enrolments of the University of Alabama (x-axis: years forecasted, 1972 to 1992; y-axis: number of students, 13,000 to 20,000).

From Fig. 1, it can be seen that the forecasted value is close to the actual enrolment of students in each year from 1972 to 1992.

To verify the forecasting effectiveness for high-order fuzzy time series, four existing forecasting models, C02 [9], CC06b [12], HPSO [17] and AFPSO [22], are compared with the proposed model. A comparison of the forecasted results is listed in Table 8, where the number of intervals is seven for all forecasting models. From Table 8, it is clear that the proposed model is more precise than the other four forecasting models, since both its best and its average MSE are the lowest among the five models. With the same number of intervals, the proposed method obtains the lowest MSE values, which are 13224, 12369, 12827, 12237, 10988, 8772 and 10658 for the 3rd-order, 4th-order, 5th-order, 6th-order, 7th-order, 8th-order and 9th-order fuzzy time series, respectively. The smallest of these, 8772, is obtained by the 8th-order FTS model. The average MSE value of the proposed model is 11582, which is the smallest among all forecasting models compared.

Table 8. A comparison of the MSE values under various high-order FTS models with seven intervals

| Order | C02 [9] | CC06b [12] | HPSO [17] | AFPSO [22] | Our model |
|---|---|---|---|---|---|
| 3 | 86694 | 31123 | 31644 | 31189 | 13224 |
| 4 | 89376 | 32009 | 23271 | 20155 | 12369 |
| 5 | 94539 | 24948 | 23534 | 20366 | 12827 |
| 6 | 98215 | 26980 | 23671 | 22276 | 12237 |
| 7 | 104056 | 26969 | 20651 | 18482 | 10988 |
| 8 | 102179 | 22387 | 17106 | 14778 | 8771 |
| 9 | 102789 | 18734 | 17971 | 15251 | 10658 |
| Average MSE | 95868 | 31373 | 28121 | 20261 | 11582 |

4.1.2 Experimental results in the testing phase

To verify the forecasting accuracy for future enrolments, the historical enrolments are separated into two parts for independent testing. The first part is used as the training data set and the second part as the testing data set. In this paper, the enrolments from 1971 to 1989 are used as the training data set and the enrolments from 1990 to 1992 as the testing data set. For example, to forecast the enrolment of 1990, the enrolments of 1971-1989 are used as the training data. Similarly, the enrolment of 1991 is forecasted from the enrolments of 1971-1990. After the model has been trained on the training data, future enrolments are forecasted and compared with the testing data. The experimental results of the forecasting models for the testing phase are listed in Table 9.

Table 9. A comparison of actual data and forecasted results for seven intervals in the testing phase

| Year | Actual enrolments | 1st-order | 2nd-order | 3rd-order | 4th-order | 5th-order |
|---|---|---|---|---|---|---|
| 1990 | 19328 | 18560 | 18560 | 18493 | 18563 | 18455 |
| 1991 | 19337 | 19142 | 19129 | 19149 | 19146 | 19178 |
| 1992 | 18876 | 18946 | 19212 | 18946 | 19150 | 19040 |

4.2 Experimental results for the TAIFEX forecasting

In this paper, we also apply the proposed method to forecast the TAIFEX index; the whole historical data [14], from 8/3/1998 to 9/30/1998, are used as the training data in the training phase. To verify the forecasting accuracy of the proposed model with high-order FLRGs and 16 intervals, five FTS models, C96 [4], H01b [6], L06 [20], L08 [21] and HPSO [17], are selected for
purposes of comparison. A comparison of the forecasted results is listed in Table 10, where all forecasting models use high-order fuzzy relationships.

Table 10. A comparison of the forecasted results of the proposed method with the existing models based on high-order fuzzy time series with 16 intervals

| Date | Actual data |
|---|---|
| 8/3/1998 | 7552 |
| 8/4/1998 | 7560 |
| … | … |
| 8/11/1998 | 7360 |
| 8/12/1998 | 7330 |
| … | … |
| 9/29/1998 | 6806 |
| 9/30/1998 | 6787 |
| 10/1/1998 | N/A |

| Model | Forecasted values (in date order, as reported) | MSE |
|---|---|---|
| C96 [4] | 7450, …, 7300, 7300, …, 6850, 6850 | 9668.94 |
| H01b [6] | 7450, 7300, 7300, 6850, 6750 | 5437.58 |
| L06 [20] | …, 7350, 7350, …, 6850, 6750 | 1364.56 |
| L08 [21] | 7329, …, 6796, 6796 | 105.02 |
| HPSO [17] | 7289.56, …, 6800.07, 7289.56 | 103.61 |
| Our model | 7332, 6795.16, 6783.25, 6834.44 | 62.5 |

To demonstrate the effectiveness of the proposed model, we compare it with two other models based on fuzzy time series, L08 [21] and HPSO [17]. The MSE values of the models for different orders are listed in Table 11.

Table 11. A comparison of the MSE values of the proposed model with the L08 and HPSO models for the training phase based on high-order FRGs

| Model | 3rd-order | 4th-order | 5th-order | 6th-order | 7th-order | 8th-order |
|---|---|---|---|---|---|---|
| L08 | 208.79 | 142.26 | 143.31 | 147.14 | 105.02 | 124.48 |
| HPSO | 152.47 | 148.14 | 112.24 | 122.68 | 103.61 | 108.37 |
| Our model | 101 | 80.9 | 70.4 | 69.6 | 62.5 | 66.3 |

From Table 11, the experimental results show that our proposed model obtains the smallest MSE for every order over ten test runs. The model outperforms the L08 model [21] and the HPSO model [17] at every order and obtains its smallest MSE, 62.5, for the 7th-order FRGs. From Fig. 2, it can be seen that the forecasted values of the proposed model are closer to the actual data than those of the compared models.

Fig. 2. A comparison of the MSE values for 16 intervals with different high-order FLRs.

5. CONCLUSION

In this
paper, we have proposed a hybrid forecasting model based on fuzzy time series with recurrent fuzzy relations and the K-means clustering algorithm. By adopting the K-means algorithm, our model obtains a more suitable partition of the universe of discourse, and by using the recurrence counts of fuzzy relations it improves the forecasting results significantly. The proposed method has been applied to two historical data sets, the enrolments of the University of Alabama and the TAIFEX, for a comparative study with existing methods. The details of the comparison are presented in Tables 7, 8, 9, 10 and 11. In all cases, the comparison shows that the proposed model outperforms the compared models based on first-order and high-order FTS with different interval lengths. Although the model was examined only on the enrolment forecasting and TAIFEX prediction problems, we believe that it can be applied to other forecasting problems, such as population, weather or road accident forecasting. This will be addressed in future research.

REFERENCES

[1] J. B. MacQueen, “Some methods for classification and analysis of multivariate observations,” in Proceedings of the Fifth Symposium on Mathematical Statistics and Probability, vol. 1, University of California Press, Berkeley, CA, pp. 281–297, 1967.
[2] Q. Song, B. S. Chissom, “Forecasting enrollments with fuzzy time series – Part I,” Fuzzy Sets and Systems, vol. 54, pp. 1–9, 1993b.
[3] Q. Song, B. S. Chissom, “Forecasting enrollments with fuzzy time series – Part II,” Fuzzy Sets and Systems, vol. 62, pp. 1–8, 1994.
[4] S. M. Chen, “Forecasting enrollments based on fuzzy time series,” Fuzzy Sets and Systems, vol. 81, pp. 311–319, 1996.
[5] H. K. Yu, “A refined fuzzy time-series model for forecasting,” Physica A: Statistical Mechanics and its Applications, vol. 346, pp. 657–681, 2005; http://dx.doi.org/10.1016/j.physa.2004.07.024
[6] K. Huarng, “Heuristic models of fuzzy time series for forecasting,” Fuzzy Sets and Systems, vol. 123, pp. 369–386, 2001b.
[7] S. R. Singh, “A simple method of forecasting based on fuzzy time series,” Applied Mathematics and Computation, vol. 186, pp. 330–339, 2007a.
[8] S. R. Singh, “A robust method of forecasting based on fuzzy time series,” Applied Mathematics and Computation, vol. 188, pp. 472–484, 2007b.
[9] S. M. Chen, “Forecasting enrollments based on high-order fuzzy time series,” Cybernetics and Systems: An International Journal, vol. 33, pp. 1–16, 2002.
[10] H. K. Yu, “Weighted fuzzy time series models for TAIEX forecasting,” Physica A, vol. 349, pp. 609–624, 2005.
[11] S.-M. Chen, N.-Y. Chung, “Forecasting enrollments of students by using fuzzy time series and genetic algorithms,” International Journal of Information and Management Sciences, vol. 17, pp. 1–17, 2006a.
[12] S. M. Chen, N. Y. Chung, “Forecasting enrollments using high-order fuzzy time series and genetic algorithms,” International Journal of Intelligent Systems, vol. 21, pp. 485–501, 2006b.
[13] H. T. Liu, “An improved fuzzy time series forecasting method using trapezoidal fuzzy numbers,” Fuzzy Optimization and Decision Making, vol. 6, pp. 63–80, 2007.
[14] L.-W. Lee, L.-H. Wang, S.-M. Chen, “Temperature prediction and TAIFEX forecasting based on fuzzy logical relationships and genetic algorithms,” Expert Systems with Applications, vol. 33, pp. 539–550, 2007.
[15] Zhiqiang Zhang, Qiong Zhu, “Fuzzy time series forecasting based on k-means clustering,” Open Journal of Applied Sciences, pp. 100–103, 2012.
[16] N.-Y. Wang, S.-M. Chen, “Temperature prediction and TAIFEX forecasting based on automatic clustering techniques and two-factors high-order fuzzy time series,” Expert Systems with Applications, vol. 36, pp. 2143–2154, 2009.
[17] I. H. Kuo, S.-J. Horng, T.-W. Kao, T.-L. Lin, C.-L. Lee, & Pan, “An improved method for forecasting enrollments based on fuzzy time series and particle swarm optimization,” Expert Systems with Applications, vol. 36, pp. 6108–6117, 2009a.
[18] S.-M. Chen, K. Tanuwijaya, “Fuzzy forecasting based on high-order fuzzy logical relationships and automatic clustering techniques,” Expert Systems with Applications, vol. 38, pp. 15425–15437, 2011.
[19] K. H. Huarng, T. H. K. Yu, “Ratio-based lengths of intervals to improve fuzzy time series forecasting,” IEEE Transactions on SMC – Part B: Cybernetics, vol. 36, pp. 328–340, 2006.
[20] L. W. Lee, L. H. Wang, S. M. Chen, Y. H. Leu, “Handling forecasting problems based on two-factors high-order fuzzy time series,” IEEE Transactions on Fuzzy Systems, vol. 14, pp. 468–477, 2006.
[21] L.-W. Lee, L.-H. Wang, S.-M. Chen, “Temperature prediction and TAIFEX forecasting based on high order fuzzy logical relationship and genetic simulated annealing techniques,” Expert Systems with Applications, vol. 34, pp. 328–336, 2008b.
[22] Y. L. Huang, S. J. Horng, M. He, P. Fan, T. W. Kao, M. K. Khan, et al., “A hybrid forecasting model for enrollments based on aggregated fuzzy time series and particle swarm optimization,” Expert Systems with Applications, vol. 38, pp. 8014–8023, 2011.
[23] E. Bulut, O. Duru, S. Yoshida, “A fuzzy time series forecasting model for multivariate forecasting analysis with fuzzy c-means clustering,” World Academy of Science, Engineering and Technology, vol. 63, pp. 765–771, 2012.

SUMMARY (translated from the Vietnamese "TÓM TẮT")

AN IMPROVED FORECASTING MODEL COMBINING RECURRENT FUZZY RELATIONSHIPS AND THE K-MEANS CLUSTERING TECHNIQUE

Summary: Most forecasting methods based on fuzzy time series use intervals of static (equal) length. The limitation of equal-length intervals is that the data are placed into them rigidly even when the series does not fluctuate strongly. In this paper, we present a forecasting model based on fuzzy time series and the K-means technique through two concepts: recurrent fuzzy relationship groups and the K-means clustering algorithm. First, the K-means algorithm is used to partition the historical data set into clusters and to adjust the clusters into intervals of different lengths. Based on these intervals, we fuzzify all the historical time series data, identify the fuzzy relationships, establish the recurrent fuzzy relationship groups and compute the forecast values with the rules proposed for the defuzzification stage. To evaluate the proposed model, we test it on two classical data sets: the Taiwan futures exchange (TAIFEX) from 8/3/1998 to 9/30/1998 and the enrolment data of the University of Alabama. Compared with previous models, specifically first-order and high-order fuzzy time series models, our model gives better forecasting accuracy for the enrolments of the University of Alabama from 1971 to 1992 and for the TAIFEX.

Keywords: Fuzzy time series, Fuzzy forecasting, Recurrent fuzzy relationship, K-means clustering, Enrolments, TAIFEX.

Received date, 02nd May, 2017
Revised manuscript, 10th June, 2017
Published, 20th July, 2017

Author affiliations: Thai Nguyen University of Technology, Thai Nguyen University; Institute of Information Technology, Vietnam Academy of Science and Technology.
* Corresponding Author: nghiemvantinh@tnut.edu.vn
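The interval-construction rules of subsection 3.1 (cluster centers, shared bounds between adjacent clusters, mirrored outer bounds, midpoints) can be illustrated with a minimal Python sketch. The function name `intervals_from_clusters` is hypothetical, and the sketch assumes the 14 clusters listed in Step 1 are already available; it does not rerun K-means. The paper rounds some bounds, so the values agree with Table 1 only up to rounding.

```python
from statistics import mean

def intervals_from_clusters(clusters):
    """Turn sorted clusters into contiguous intervals and their midpoints.

    Hypothetical helper: follows the rules of subsection 3.1 as read from
    the text, not the authors' own implementation.
    """
    centers = [mean(c) for c in clusters]                  # cluster centers
    # Shared bound between adjacent clusters: mean of the two centers.
    inner = [(a + b) / 2 for a, b in zip(centers, centers[1:])]
    # Outer bounds: mirror the nearest shared bound around the outer center.
    lower_first = centers[0] - (inner[0] - centers[0])
    upper_last = centers[-1] + (centers[-1] - inner[-1])
    bounds = [lower_first, *inner, upper_last]
    intervals = list(zip(bounds, bounds[1:]))
    midpoints = [(lo + hi) / 2 for lo, hi in intervals]
    return intervals, midpoints

# The q = 14 clusters reported in Step 1 of subsection 3.1.
clusters = [
    [13055], [13563], [13867], [14696, 15145, 15163], [15311],
    [15433, 15460, 15497], [15603], [15861, 15984], [16388], [16807],
    [16859], [16919], [18150], [18876, 18970, 19328, 19337],
]
intervals, midpoints = intervals_from_clusters(clusters)
print(intervals[0], midpoints[0])   # Table 1 lists [12801, 13309] and 13055
print(intervals[-1])                # Table 1 lists [18639, 19617] after rounding
```

Because every inner bound is shared, the intervals are contiguous by construction, which is what allows each observation to be assigned to exactly one fuzzy set later on.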

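Steps 3 to 5 of subsection 3.2 (fuzzification and grouping of first-order relations with repeats kept) can be sketched as follows. This is an illustration under stated assumptions: the interval bounds are copied from Table 1, the yearly series is the standard 1971-1992 University of Alabama enrolment data from [4] (the tables above show it only in part), and the names `fuzzify` and `recurrent_groups` are hypothetical.

```python
from bisect import bisect_right
from collections import defaultdict

# Interval bounds from Table 1; u_i = [BOUNDS[i-1], BOUNDS[i]].
BOUNDS = [12801, 13309, 13715, 14434, 15156, 15387, 15533, 15762.5,
          16155, 16597.5, 16833, 16889, 17534.5, 18639, 19617]

# Assumption: the standard Alabama enrolments for 1971..1992, in year order.
ENROLMENTS = [13055, 13563, 13867, 14696, 15460, 15311, 15603, 15861,
              16807, 16919, 16388, 15433, 15497, 15145, 15163, 15984,
              16859, 18150, 18970, 19328, 19337, 18876]

def fuzzify(value):
    """Return i (1-based) such that value falls in interval u_i, i.e. set A_i."""
    i = bisect_right(BOUNDS, value)
    return max(1, min(i, len(BOUNDS) - 1))   # clamp values on the outer bounds

def recurrent_groups(labels):
    """Group first-order relations A_i -> A_j by left-hand side, keeping repeats."""
    groups = defaultdict(list)
    for current, nxt in zip(labels, labels[1:]):
        groups[current].append(nxt)
    return dict(groups)

labels = [fuzzify(v) for v in ENROLMENTS]
groups = recurrent_groups(labels)
for lhs, rhs in groups.items():
    print(f"A{lhs} -> " + ", ".join(f"A{j}" for j in rhs))
```

Keeping the right-hand sides in a list rather than a set is the design choice that distinguishes the recurrent groups from the groups of [4]: a relation such as A14 → A14 that occurs three times contributes three times to the later defuzzification.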
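The training-phase defuzzification rule of Step 6 and the MSE measure can be sketched in the same spirit. The sketch assumes one reading of the rule: each next state contributes the midpoint of its interval plus the midpoint of the one of p = 3 equal sub-regions that contains the observed value, and the total is averaged over 2n terms. This reading reproduces the group values reported in Table 5, but the function names and signatures are hypothetical and not taken from the paper.

```python
# Interval bounds from Table 1; u_j = [BOUNDS[j-1], BOUNDS[j]].
BOUNDS = [12801, 13309, 13715, 14434, 15156, 15387, 15533, 15762.5,
          16155, 16597.5, 16833, 16889, 17534.5, 18639, 19617]

def sub_midpoint(lo, hi, value, p=3):
    """Midpoint of the one of p equal sub-regions of [lo, hi] containing value."""
    width = (hi - lo) / p
    k = min(int((value - lo) / width), p - 1)   # sub-region index 0..p-1
    return lo + (k + 0.5) * width

def group_forecast(next_states, p=3):
    """Forecast for one recurrent group.

    next_states: list of (j, value) pairs, where A_j is a right-hand-side
    fuzzy set and value is the observation that produced it.
    """
    total = 0.0
    for j, value in next_states:
        lo, hi = BOUNDS[j - 1], BOUNDS[j]
        total += (lo + hi) / 2 + sub_midpoint(lo, hi, value, p)
    return total / (2 * len(next_states))

def mse(actual, forecast):
    """Mean square error over paired actual and forecasted values."""
    return sum((f - a) ** 2 for a, f in zip(actual, forecast)) / len(actual)

# Group A4 -> A6, A5: next states observed as 15460 (1975) and 15163 (1985).
print(round(group_forecast([(6, 15460), (5, 15163)]), 2))   # Table 5 reports 15346.5
# Group A2 -> A3: next state observed as 13867 (1973).
print(round(group_forecast([(3, 13867)]), 2))               # Table 5 reports 13954.66
```

The sub-region term pulls the forecast toward the part of the interval where the data actually fell, which is why a wide interval such as u3 yields 13954.66 rather than its plain midpoint 14074.5.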