HANOI UNIVERSITY OF SCIENCE AND TECHNOLOGY

MASTER'S THESIS

A study of online learning models for the electricity load forecasting problem

NGUYỄN NHẬT ANH
Anh.NNCB190313@sis.hust.edu.vn

Major: Mathematics and Informatics
Supervisor: Dr. Trần Ngọc Thăng (supervisor's signature)
School: Applied Mathematics and Informatics

HANOI, 10/2021

Acknowledgments

The author sincerely thanks Dr. Trần Ngọc Thăng and Dr. Nguyễn Thị Ngọc Anh for their dedicated guidance and comments throughout the research. The author also gratefully thanks the School of Applied Mathematics and Informatics and the Graduate Management Office of the Training Department, Hanoi University of Science and Technology, for providing favorable conditions for completing this thesis.

Abstract

Electricity load forecasting plays an essential role in balancing supply and demand and in operating and dispatching the power load, helping to keep the system stable. With the development of real-time sensors in power systems, continuous processing of large volumes of data is needed to improve operating efficiency. Research on online, real-time learning and forecasting methods is therefore highly applicable to load forecasting. This thesis presents online learning for time series forecasting and, on that basis, proposes an online learning model based on the seasonal ARMA model that forecasts seasonal time series well. Forecasting results on real national load data show that the proposed model is well suited to seasonal time series.

Keywords: Online learning, Online convex optimization, Seasonal time series, Load forecasting, ARIMA model

Hanoi, 20 September 2021
Student
Nguyễn Nhật Anh

Contents

1 INTRODUCTION
1.1 General introduction to the electricity load forecasting problem
1.2 Introduction to online learning and its application to time series forecasting
1.3 Research objectives
1.4 Thesis outline
2 METHODOLOGY
2.1 Basic time series concepts
2.1.1 Time series and white noise
2.1.2 Backshift operator and difference operator
2.1.3 Stationary processes
2.2 The ARIMA time series model
2.2.1 ARMA processes
2.2.2 ARIMA processes
2.2.3 Seasonal ARIMA processes
2.3 The online learning framework
2.3.1 The online gradient descent (OGD) algorithm
2.4 The online ARMA model
2.4.1 The ARMA-OGD algorithm
2.5 Proposed model: online ARMA with seasonality
2.5.1 The proposed seasonal ARMA-OGD algorithm
3 EXPERIMENTS AND RESULTS
3.1 Experiments on synthetic data
3.2 Experiments on real electricity load data
3.2.1 Data description
3.2.2 Applying the proposed model to electricity load forecasting
4 CONCLUSION
References
Appendix
A Related scientific publications

List of figures

2.1 Diagram of the online learning framework
2.2 Illustration of the online gradient descent algorithm: $x_{t+1}$ is obtained by moving $x_t$ in the direction $-\nabla f_t$ and projecting back onto the decision set $K$. Image credit: (6)
3.1 Cumulative loss of the proposed seasonal ARMA-OGD algorithm and of the best fixed seasonal ARMA model over 5000 iterations
3.2 Regret of the proposed seasonal ARMA-OGD algorithm over 5000 iterations
3.3 Average loss of the proposed seasonal ARMA-OGD algorithm over 5000 iterations
3.4 Hourly national electricity load series from 0h 1/7/2021 to 23h 29/7/2021
3.5 Box plot comparing the MAPE of the two algorithms over 30 runs
3.6 Box plot comparing the RMSE of the two algorithms over 30 runs
3.7 Forecasting results for the most recent days, ARMA-OGD model
3.8 Forecasting results for the most recent days, seasonal ARMA-OGD model
A.1 Evidence of publication
A.2 Evidence of publication

List of tables

3.1 Hourly electricity load data (in MW) in tabular form
3.2 Comparison of the forecasting results of the two algorithms over 30 runs (averaged)

Notation and abbreviations

Abbreviation   Meaning
ARIMA          AutoRegressive Integrated Moving Average
ARMA           AutoRegressive Moving Average
ARMA-OGD       AutoRegressive Moving Average-Online Gradient Descent
MAPE           Mean Absolute Percentage Error
RMSE           Root Mean Square Error
SCADA          Supervisory Control And Data Acquisition
VARMA          Vector AutoRegressive Moving Average

CHAPTER 1. INTRODUCTION

1.1 General introduction to the electricity load forecasting problem

Electricity load forecasting contributes substantially to dispatching the national power system: it supports the planning of operation and generation so as to meet
electricity demand in full and keep the system stable. With the growing use of real-time sensors in power systems (SCADA systems), continuous processing of large volumes of data is needed to improve operating efficiency. A SCADA system can monitor and collect data remotely and push updates to a server continuously in real time. Developing online, real-time forecasting algorithms has therefore become an urgent need with direct practical application in electricity load forecasting.

1.2 Introduction to online learning and its application to time series forecasting

At present, most supervised and unsupervised machine learning methods train a model on data supplied all at once, in batch. This traditional approach has several drawbacks: the entire data set must be known in advance; it cannot adapt to data arriving as a data stream; and the model must be retrained on the whole data set whenever new data arrive, which is costly in both resources and training time. Online learning was designed to overcome these drawbacks. It is a machine learning approach in which data are fed in sequentially and the model improves as they arrive. Online learning models adapt well to changes over time in the incoming data. Online learning is therefore well suited, and widely applied, to problems with streaming data such as load forecasting, anomaly detection, and so on.

In recent years, with the growth of big data, more and more works have been devoted to online learning. Notably, the book Prediction, Learning, and Games by Cesa-Bianchi and Lugosi (4) gives an overview of online learning, focusing in particular on online prediction with expert advice. Introduction to Online Convex Optimization by Hazan (6) systematizes online learning and states the online learning problem within the online convex optimization framework. In this framework, online learning is divided into T rounds. In each round, the player must make a decision, and each decision incurs a loss. The loss is revealed to the player only after the decision has been made. The goal of online learning is for the cumulative loss after T rounds to be as small as possible,
which is captured by the notion of regret (presented in detail in a later section).

Among research papers on online learning, in 2003 Zinkevich (11) applied gradient descent to online convex optimization and proposed the online gradient descent algorithm. For convex loss functions, the author showed that the regret of the algorithm is $O(\sqrt{T})$. In 2007, Hazan et al. (7) proposed the online Newton step algorithm, based on the idea of the classical Newton-Raphson method. This algorithm attains regret $O(\log T)$, better than online gradient descent. However, it requires the loss function to be exp-concave, a stricter condition than the ordinary convexity assumed in online convex optimization. Moreover, the parameter update requires computing a Hessian-type matrix, rather than only the gradient as in online gradient descent.

In time series forecasting, the ARIMA model of Box and Jenkins (3) is a classical and widely used model. Online learning was first applied to ARIMA-based time series forecasting by Anava et al. in 2013 (1). In that paper, the authors use the ARMA model as the base forecasting model and apply Zinkevich's online gradient descent and Hazan's online Newton step to find the optimal coefficients of the online ARMA model. In 2016, Liu et al. (8) extended the model to online ARIMA, which handles non-stationary time series. Building on this work, in 2018 Yang et al. (10) proposed the online VARMA model, designed for multivariate time series.

However, to date few studies have addressed the use of online learning for seasonal time series. In electricity load forecasting in particular, periodicity is usually pronounced and depends directly on factors such as temperature, humidity, and people's daily routines and use of electrical appliances. This thesis therefore presents and proposes an online forecasting method for time series with a seasonal component, such as electricity load series.

1.3 Research objectives

The goal of this thesis is to study
online learning methods for time series forecasting and then apply them to the electricity load forecasting problem. Because seasonality is a characteristic feature of electricity load, the thesis proposes an online seasonal ARMA model that takes seasonality into account for forecasting.

1.4 Thesis outline

The thesis consists of four chapters:
• Chapter 1: General introduction to the electricity load forecasting problem, to online learning, and to the application of online learning to time series forecasting, followed by the research objectives and the structure of the thesis.
• Chapter 2: Basic concepts of time series and the ARIMA model; concepts and algorithms of online learning; the online ARMA model; and, building on these, the proposed online ARMA model with seasonality.
• Chapter 3: Application of the algorithm proposed in the previous chapter to forecasting on real electricity load data, with the forecasting results and their evaluation and discussion.
• Chapter 4: Summary of the results obtained and directions for further research and development.

Nguyen Nhat Anh, NH Quoc Anh, NX Tung, NT Ngoc Anh

criteria are defined as follows:

$$\mathrm{MAPE} = \frac{1}{n}\sum_{t=1}^{n}\frac{|y_{real}(t) - y_{forecast}(t)|}{y_{real}(t)}; \qquad \mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{t=1}^{n}\left(y_{real}(t) - y_{forecast}(t)\right)^{2}} \qquad (2)$$

Experimental Scenarios

Three experimental scenarios are investigated:
Scenario 1: LSTM with different parameters is applied for forecasting.
Scenario 2: A genetic algorithm (GA) selects the input features for the LSTM model.
Scenario 3: A genetic algorithm selects the input features for the LSTM model, and Bayesian optimization (BO) then fine-tunes the LSTM hyper-parameters.

Results: The hyper-parameters we tune in scenario 1 are the number of neurons (NN), n-steps (the window length over the multiple time series), and the learning rate (LR). For the LSTM model, we set the number of layers to 1 and fix the activation function. Moreover, we use all features to train our model in scenario 1. The results for several randomly chosen sets of hyper-parameters are shown in Table 1. In scenario 2, using the same sets of hyper-parameters as in scenario 1, we apply GA to select the most appropriate features for
each set of hyper-parameters. We use the Adam optimizer for the GA algorithm. The results of scenario 2 are shown in Table 1. The proposed model not only uses GA for feature selection but also uses Bayesian optimization to find the optimal hyper-parameters. For BO, we choose 20, 30, 40, 50, and 60 iterations (iter). The optimized hyper-parameter sets for each number of iterations are shown in Table 1. For the proposed LSTM-GA-BO model, the average MAPE and average RMSE are 4.31% and 1601.23 MW, respectively. Compared with the simple LSTM model, the proposed model gives far better results. However, the comparison between the proposed model and the LSTM-GA model raises two points that deserve closer discussion.

First, in general, the proposed model gives better results than the LSTM-GA model. Although the average MAPE of the proposed model is 4.31% higher than the average MAPE of the LSTM-GA model, the average RMSE of the proposed model is 9.15% better. This can be explained by the fact that the loss function being optimized is RMSE, not MAPE.

Second, among the parameter sets given for the LSTM-GA model, several give much better results than the model we propose; for example, the set NN = 300, n-steps = 25, LR = 0.001 yields a MAPE of 3.29% and an RMSE of 1383.12. This raises the question of whether the BO algorithm is really effective for the parameter optimization problem. However, Table 1 shows that as the number of BO iterations increases, the RMSE tends to decrease. This suggests that the BO algorithm can converge to the optimal point given enough iterations.

GA and BO for LSTM in short-term load forecasting

In summary, based on the results obtained, we propose the GA algorithm for feature selection in the short-term load forecasting problem. We also propose using Bayesian optimization to find appropriate hyper-parameters for this problem. Note that if we want
the Bayesian algorithm to be effective, we can increase the number of iterations. However, this lengthens the run time of the algorithm, which should be taken into account.

Table 1. Results of all scenarios

Model        NN    n-steps  LR       Iter  MAPE    RMSE (MW)
LSTM         128   24       0.001    -     7.14%   2634.31
LSTM         128   20       0.003    -     6.73%   2460.42
LSTM         200   12       0.005    -     6.71%   2492.68
LSTM         150   15       0.01     -     6.76%   2533.25
LSTM         300   25       0.001    -     7.25%   2599.50
LSTM         50    10       0.006    -     6.51%   2449.82
LSTM-GA      128   24       0.001    -     4.47%   1854.35
LSTM-GA      128   20       0.003    -     4.37%   1634.23
LSTM-GA      200   12       0.005    -     4.28%   1785.80
LSTM-GA      150   15       0.01     -     4.19%   1982.02
LSTM-GA      300   25       0.001    -     3.29%   1383.12
LSTM-GA      50    10       0.006    -     4.68%   1775.60
LSTM-GA-BO   391   37       0.00069  20    4.26%   1624.67
LSTM-GA-BO   378   48       0.0012   30    4.21%   1590.17
LSTM-GA-BO   200   24       0.001    40    4.39%   1627.27
LSTM-GA-BO   362   26       0.015    50    4.28%   1589.77
LSTM-GA-BO   361   40       0.00082  60    4.29%   1563.21

Conclusion and discussion

In this paper, a novel forecasting model has been proposed to obtain more accurate load predictions. With a genetic algorithm, feature selection is done automatically and is highly optimized, rather than by picking features manually. Additionally, the hyper-parameters of the LSTM network are optimized using Bayesian optimization. The test results show that the proposed model is suitable for electricity load forecasting in Vietnam at the nationwide level. Actual data were used to test the short-term load forecasting model for two days ahead. In conclusion, this paper achieved the following results:
– Feature selection using a genetic algorithm
– Use of LSTM on multiple time series for forecasting
– Fine-tuning of the LSTM hyper-parameters using Bayesian optimization
– A forecasting model combining GA, LSTM, and Bayesian optimization
– Application of the proposed model to short-term load forecasting in Vietnam

With the results achieved by the proposed model, we believe it will contribute to energy consumption assessment and electricity market management. In future work, further time series forecasting applications will be studied on the basis of the proposed model. The
stability of the model will be investigated.

Figure A.2: Evidence of publication

Hybrid online model based multi seasonal decompose for short-term electricity load forecasting using ARIMA and online RNN

Nguyen Quang Dat (1), Nguyen Thi Ngoc Anh (1), Nguyen Nhat Anh (2), and Vijender Kumar Solanki (3)
(1) School of Applied Mathematics and Informatics, HUST, Hanoi, Vietnam
(2) CMC corporation, No 11, Duy Tan street, Cau Giay district, Hanoi, Vietnam
(3) Department of Computer Science & Engineering, CMR Institute of Technology, Hyderabad, India

Abstract: Short-term electricity load forecasting (STLF) plays a key role in operating the power system of a nation. A challenging problem in STLF is dealing with real-time data. This paper aims to address the problem using a hybrid online model. Online learning methods are becoming essential in STLF because load data often show complex seasonality (daily, weekly, annual) and changing patterns. Online models such as Online AutoRegressive Integrated Moving Average (Online ARIMA) and Online Recurrent Neural Network (Online RNN) can modify their parameters on the fly to adapt to changes in real-time data.
However, Online RNN alone cannot handle seasonality directly, and ARIMA can handle only a single seasonal pattern (Seasonal ARIMA). In this study, we propose a hybrid online model that combines Online ARIMA, Online RNN, and multi-seasonal decomposition to forecast real-time time series with multiple seasonal patterns. First, we decompose the original time series into three components: trend, seasonality, and residual. The seasonal patterns are modeled using Fourier series. This approach is flexible, allowing us to incorporate multiple periods. For the trend and residual components, we employ Online ARIMA and Online RNN, respectively, to obtain the predictions. We use hourly load data of Vietnam and daily load data of Australia as case studies to verify the proposed model. The experimental results show that our model performs better than single online models. The proposed model is robust and can be applied in many other fields with real-time time series.

Keywords: Hybrid online, RNN online, multi time series, multi seasonal decompose, Electricity forecasting

I. INTRODUCTION

Electrical load prediction is an important task, fundamental to operating the power system of a nation [1]. Three categories of electrical load prediction are commonly considered: short-term, medium-term, and long-term [2], [3], [4]. Short-term prediction is the most important because of its role in the day-to-day operation, planning, and scheduling of power systems [5]. Short-term time series prediction has attracted many researchers [6]–[21], and the three main methodologies are the statistical approach, the machine learning approach, and the hybrid approach. The first is widely used, with models such as ARMA, ARIMA, SARIMA, and ARIMAX [6], [7], [8], [9]. Second, artificial intelligence models such as ANN, RNN, and LSTM have been applied to this problem [16], [17], [18], [19], [20], [21]. Third, hybrid approaches combine the two approaches mentioned
above, as in [10], [11], [12], [13], [14], [15]. Time series differ from other types of data in that the data depend on time. Stationarity, trend, and seasonality are therefore considered. The techniques used to handle time series are stationarity tests and decomposition into trend, seasonal, and residual components.

A. Structure of this paper

The remainder of this paper is structured as follows: Section III presents the methodology of time series prediction with ARIMA online, RNN online, and our proposed hybrid online model; Section IV presents the experiments with the hybrid online models and algorithms for electricity load prediction, describes the evaluation criteria, and reports the experimental results; Section V discusses the results, the conclusion, and future work.

II. RELATED WORKS

A. Statistical models for time series forecasting

TABLE I. SEASONAL ARIMA FOR TIME SERIES

Research objective | Study field | Forecasting method | Forecasting scope | Pubs
Forecasting heat demand for district heating system | Energy | Multiple linear regression model; SARIMA | Short-term | [6]
Modelling and forecasting of rainfall | Climate | SARIMA | Short-term | [8]
Railway passenger forecasting | Transport | SARIMA | Long-term | [7]

In [6], Fang et al. first proposed a simple regression model using the hourly outdoor temperature and hourly wind speed. The weekly rhythm of heat demand was added to increase accuracy. The SARIMA model used weather variables and the historical heat data as the variables on which the forecast depends. The model can produce long-term forecasts (although the results are far worse than short-term forecasts). However, the model works only for seasonal data sets and is difficult to apply to nonlinear data. Similarly, in [7], railway passenger flows are analyzed and a SARIMA model is proposed. The historical data (monthly passenger counts on Serbian railways) are found to have strong seasonal autocorrelation. In [8], SARIMA models were constructed for monthly,
weekly, and daily monsoon rainfall time series. In the problems above, the ARIMA-type models predict more accurately when the input data are simple, do not fluctuate continuously, and do not change much (the data used are even close to linear). Such data are less common in practice, or are encountered mostly in low-value data sets. For large data sets with continuously varying values and large amplitudes (nonlinear data), prediction is harder and the results do not match those of other models.

TABLE II. NEURAL NETWORK METHODS AND RNN METHODS FOR TIME SERIES
Columns: research objective, study field, forecasting method, forecasting scope, publication.
Research objectives listed: prediction method for renewable energy sources; multi-layer algorithm for neural network; prediction of lane changing behavior; predicting the missing values in time series; comparing the models; two different problem decomposition methods for training Elman RNN on problems in chaotic time series; methods of selecting individual prediction models for a hybrid model; Elman RNN for time series prediction; comparing RNN, SVM, LSTM and hybrid LSTM differential evolution; time series prediction.
Study fields listed: Energy, Transport, Chaotic time series.
Forecasting methods listed: wavelet neural network; multilayer neural network with multi-valued neurons; deep neural network; RNN; Elman RNN; multi-layer perceptron; RNN with hybrid multi-objective evolutionary algorithms; LSTM, DE-LSTM, BPNN, RNN, SVM; RNN, DGRU.
Publications covered: [4], [18], [19], [20], [21], [22], [23], [24], [25], [26].

The aim of [21] is to build a method for predicting future values of renewable energy sources in order to control a micro-grid system (intelligent management). Using wavelet decomposition and an ANN, the model is based on multi-resolution analysis of the time series. This novel model saves about
29% of the resources needed to implement the algorithm. In [18], Igor Aizenberg et al. proposed a complex-valued neural network that addresses long-term time series prediction using multi-valued neurons in a multilayer neural network. The paper proposes using a complex-valued network to predict nonlinear time series (series whose fluctuations follow no fixed rule). In [19], a novel group-wise convolutional neural network was proposed: the Multivariate Time Series Group-wise Convolutional Neural Network model, used for multivariate time series pattern classification. Jun et al. present a novel learning algorithm for the training phase. The results show that, compared with other state-of-the-art models, the proposed neural network achieves better forecasting accuracy on nonlinear data.

In [22], [23], [24], [25], [26], [4], the authors made many comparisons between RNN and other machine learning methods. Most recently, in 2019, Alaa Sagheer and Mostafa Kotb [4] proposed an RNN and deep LSTM model for forecasting oil production. The authors developed a new deep neural network model, and their results are good compared with models such as RNN, deep GRU (GRU is a variant of the traditional LSTM network), the nonlinear extension of the linear Arps model, and the higher-order neural network.

Taking a different approach to machine learning, Lu Peng et al. pointed out that many parameters of RNN and LSTM networks can affect the efficiency and accuracy of the models, such as input length, number of epochs, hidden units, and batch size [20]. The authors used RNN and LSTM together with the differential evolution (DE) algorithm for electricity price forecasting. The DE algorithm is used to find suitable parameters for the RNN and LSTM networks. The proposed model gives better results than BPNN, DE-BPNN, Support Vector Machine, RNN, ARIMA,
Zhang’s hybrid model [10], Babu’s hybrid model [11], Khashei’s hybrid model on different nonlinear data sets The authors provided evaluation and estimation models for a static data set This dataset is used to evaluate both longterm and short-term forecasts When long-term prediction (estimated value is far from the final value of the time series), the models used have a high accuracy of results This is explained by the following estimated values, which must be used to run the model, which are also estimates and inaccuracies Peter Zhang [10] was the first author to propose a new hybrid model that combines ARIMA and the Artificial Neural Network (ANN) models His hybrid model takes advantage of the strength of traditional linear models and nonlinear model He assumed that the time series are decomposed into components that are a linear component and a nonlinear component Then, the linear component of dataset using auto-regressive moving average model (ARMA) (or ARIMA model) and the nonlinear component of dataset using the ANN model The limit of this hybrid model is his assumption because original time series (original dataset) are not always the sum of a linear and a non-linear components C N Babu et al [11] detected the limitation of Zhang’s hybrid model that the given data directly separated in two components linear and nonlinear Research objective Combine ARIMA and ANN for better accuracy Use movingaverage (MA) filter for hybrid model (ARIMA–ANN) Use wavelet transform for hybrid ARIMA–ANN model Use wavelet for denoising in hybrid model Use wavelet for denoising in hybrid model Study field Finance Astronomy Natural Energy Climate Forecasting method Forecasting scope Pubs ARIMA and ANN Short-term [10] MA filter ARIMA ANN Short-term [11] then and Wavelet-ARIMAANN Short-term [27] Research objective Use online machine learning for time series Predict the power of energy system Study field Industry Forecasting method ARIMA online Forecasting scope Long-term Pubs Energy - 
ARIMA online - Markov chain Short-term [30] [29] TABLE IV ARIMA ONLINE FOR TIME SERIES the ARIMA model into a online model of full information optimization As a consequence, Liu et al can estimate the Climate Wavelet, TS, Short-term [28] parameters in an efficient Beside that, the authors analyze the proposed algorithms, prove that the ARIMA online is provably ANN as good as the best ARIMA model in hindsight The results TABLE III shown the effectiveness of this method In the paper [30], H YBRID MODEL FOR TIME SERIES FORECASTING J Leithon et al proposed an ARIMA model and a Markov chain model for intra-day forecasting, and a linear program to optimize the decision variables on a real-time data With Authors proposed the new hybrid model that used an Moving the result, the authors consider random energy rates and show Average filter (MA filter) to decompose the original time series robustness of their proposed model into sub-time series Then, the ARIMA model was applied to linear time series and the ANN model was applied to the Research objective Study Forecasting Forecasting Pubs left of dataset component The forecasts were compared with field method scope the forecasts from the two single ARIMA, ANN models Babu Chaotic time series Method RNN online Short-term [31] analytic and preet al shown their proposed model has better accuracy in Mean dict Average Error (M AE) and Mean Square Error (M SE) than Learning Data anaRNN online Short-term [32] Zhang’s hybrid model Attentions for lytic Online Advertising For denoising the signal data, some article used the wavelet Online services Network RNN online Short-term [33] method Wavelet will remove the component of the original prediction data data that can’t predict In paper [15], Aasim et al shown that, Control the Robot Robot RNN online Short-term [34] the Wavelet model can increase the accuracy of the model TABLE V by removing the noise in the original time series The MSE RNN ONLINE FOR TIME SERIES of the hybrid model with 
wavelet was 76% betterthan single model In the work [28], the author Anh et al have shown that pre-processing data by wavelets will give better results The 2) Online machine learning - RNN: In 2007, Gao et al reason given is that the noise in the original data is harmful used RNN online for analytic and predict the Lorenz’s time values, making the forecast accuracy less So filtering noise series [31] RNN model in order was overcomed several during pre-processing data is necessary The authors pointed drawbacks of traditional models and to combine advantageous out that when using the wavelet filter to filter noise, the results features In 2016, Tian Guo et al proved that the RNN will be better than 3% online enables to upgrade a RNN models by down-weighted By using a mixture of individual models, the strengths of gradients for output The result of the RNN online model in each model are shown and weaknesses are removed However, the extensive experiments on all testing dataset and actual hybrid models can be built in a variety of ways, from many datasets In other work, the authors Ahmadreza et al [34] different single models So the choice of a single model is used RNN for AI to program and control the robot, making entirely up to the data we need to forecast it possible to act in a situation that has never been seen in the past This helps the fish to operate without interruption B Problem objective and the working process does not require human intervention This research attempts the problem of electricity load pre- Predictions are continually revalued when new external data diction The new hybrid online model combining ARIMA can be observed.They have proven that RNN works better than online and RNN online, that enhance the advantage and reduce traditional models Energy Wavelet ARIMA - Short-term [15] the disadvantage of single model In additon, multi seasonal is used intend to get better performance for seasonal dataset 1) Online machine learning - ARIMA: In 2016, Liu 
et al. [29] proposed online learning algorithms for estimating ARIMA models with higher computational efficiency. The authors' idea was to reformulate the ARIMA model. The goal of the ARIMA online model is to minimize the sum of losses incurred over T rounds; the regret of the online model after T rounds is defined in equation (4).

The online model allows the data to be updated continuously with new values, so the model can be run again and again. Long-term forecasts therefore become short-term forecasts, because the value used for forecasting is no longer an estimate from the previous run. The accuracy thus also increases [31], [34].

III. METHODOLOGY

A. Formulation of the electricity load prediction problem

Given the original data (several time series)

\[ x_t = \{x_{1t};\, x_{2t};\, \ldots;\, x_{dt}\} \tag{1} \]

where $d$ is the number of dimensions and $t$ is the moment at which the value was collected ($t = 0, 1, \ldots, \infty$), we need to estimate the values in the near future $y_{t+1}, y_{t+2}, \ldots$ given the past values $y_1$ to $y_t$.

B. ARIMA

Box and Jenkins proposed the ARIMA model in their 1970 book [35]. They showed that in an ARIMA model the future value $y_t$ of the time series is predicted from a linear function of several past values $y_{t-1}, y_{t-2}, \ldots, y_{t-p}$ and several past errors $\varepsilon_{t-1}, \varepsilon_{t-2}, \ldots, \varepsilon_{t-q}$:

\[ y_t = \theta_0 + \phi_1 y_{t-1} + \cdots + \phi_p y_{t-p} + \varepsilon_t - \theta_1 \varepsilon_{t-1} - \cdots - \theta_q \varepsilon_{t-q} \tag{2} \]

In this equation, $y_{t-i}$ ($i = 1, \ldots, p$) are the $p$ values of the time series preceding $y_t$, $\phi_i$ ($i = 1, \ldots, p$) and $\theta_j$ ($j = 1, \ldots, q$) are coefficients, and $\varepsilon_{t-j}$ ($j = 1, \ldots, q$) are random errors. The integer $p$ is called the order of the Auto-Regressive part and the integer $q$ the order of the Moving Average part.

Note that the Auto-Regressive Moving Average model ARMA(p, q) is a special case of the ARIMA model with differencing order d = 0. When q = 0 and d = 0, the ARIMA model reduces to the AR model (of order p only), and when p = 0 and d = 0, the
ARIMA model reduces to the MA model (of order q only).

1) ARIMA online: In online learning with ARIMA models, the model sequentially commits to a decision and produces a forecast; it then suffers a loss that is unknown to the decision maker ahead of time. In online ARIMA, the coefficient vectors $(\phi, \theta)$ are assumed to be fixed. At time $t$, the machine chooses the noise $\varepsilon_t$ and generates the observation $\hat{y}_t$ from the real values $y_{t-p}, \ldots, y_{t-1}$ according to equation (2). The coefficients $(\phi, \theta)$ and the error $\varepsilon_t$ are important in this function but are unknown to us at all times. At time $t$, the model makes a prediction $\hat{y}_t$ without knowing the real value $y_t$, so the learner incurs a loss $\ell_t(y_t, \hat{y}_t)$. At time $t + 1$ the real value $y_t$ is disclosed to the learner [29]. The loss function is defined as

\[
\begin{aligned}
\ell_t(\phi, \theta) &= \ell_t\big(y_t, \hat{y}_t(\phi, \theta)\big) \\
&= \ell_t\Big(y_t,\; \nabla^d \hat{y}_t + \sum_{i=0}^{d-1} \nabla^i y_{t-1}\Big) \\
&= \ell_t\Big(y_t,\; \sum_{i=1}^{p} \phi_i \nabla^d y_{t-i} + \sum_{i=1}^{q} \theta_i \varepsilon_{t-i} + \sum_{i=0}^{d-1} \nabla^i y_{t-1}\Big)
\end{aligned}
\tag{3}
\]

where the $d$-th order difference of $y_t$ is computed as $\nabla^d y_t = \nabla^{d-1} y_t - \nabla^{d-1} y_{t-1}$, with $\nabla^0 y_t = y_t$. In equation (3) the moving-average terms carry a plus sign; compared with equation (2), this only changes the sign convention of the coefficients $\theta_i$. The regret after $T$ rounds is

\[ R_T = \sum_{t=1}^{T} \ell_t(y_t, \hat{y}_t) - \min_{\phi, \theta} \sum_{t=1}^{T} \ell_t\big(y_t, \hat{y}_t(\phi, \theta)\big) \tag{4} \]

In [29], Liu et al. proposed an efficient model whose regret grows sub-linearly as a function of the number of rounds $T$, so that the per-round regret of the online model vanishes as $T$ increases. With the loss and regret defined above, the task is to estimate the parameter vectors $(\phi, \theta)$ of the ARIMA online model. This cannot be done directly, because the errors $\varepsilon_t$ are unknown to the learner at every moment of the ARIMA online process, and $(\phi, \theta)$ cannot be estimated with unknown error terms. Liu et al. used the idea from Anava et al. [36] to tackle this challenge: the prediction comes from a modified ARIMA model without explicit error terms rather than directly from the original ARIMA model. An approximating model was proposed, leading to an ARIMA online learning method based on a technique from
Hazan et al. [37], the Online Newton Step.

2) ARIMA online: Online Newton Step: We use a constant $m \in \mathbb{N}$ so that a coefficient vector $\phi \in \mathbb{R}^{p+m}$ ((p+m)-dimensional) approximates the original prediction. The loss function is defined as

\[ \ell_t^m(\phi^t) = \ell_t\big(y_t, \hat{y}_t(\phi^t)\big) = \ell_t\Big(y_t,\; \sum_{i=1}^{p+m} \phi_i^t \nabla^d y_{t-i} + \sum_{i=0}^{d-1} \nabla^i y_{t-1}\Big) \]

The model needs to minimize the difference between the total loss incurred and that of the best fixed ARIMA model.

The parameters $\phi = (\phi_1, \ldots, \phi_{p+m})^T$ are updated iteratively using the Online Newton Step method [38]. $\Pi_K^{A_t}$ is the projection onto $K$ in the norm induced by $A_t$. Note that at each iteration, after we predict $\tilde{Y}_t$, the prediction $\tilde{X}_t$ of the original time series is obtained by expanding and rearranging the equation $Y_t = (1 - B)^d (1 - B^s)^D X_t$. The proposed algorithm is as follows.

Algorithm: ARIMA Online Newton Step
Input: order $p$; constant $m$; order $d$; seasonal period $s$; learning rate $\eta$; an initial $(p + m) \times (p + m)$ Hessian matrix $A_0$
1: Choose $m = \log_{\lambda_{\max}}\big((T L M_{\max} q)^{-1}\big)$ [29]
2: for $t = 1$ to $T - 1$ do
3:   Predict $\hat{y}_t = \sum_{i=1}^{p+m} \phi_i^t \nabla^d y_{t-i} + \sum_{i=0}^{d-1} \nabla^i y_{t-1}$
4:   Observe $y_t$ and compute the loss $f_t(\phi^t)$
5:   Set $\nabla_t = \nabla f_t(\phi^t)$
6:   Update $A_t \leftarrow A_{t-1} + \nabla_t \nabla_t^T$
7:   Set $\phi^{t+1} \leftarrow \Pi_K^{A_t}\big(\phi^t - \eta A_t^{-1} \nabla_t\big)$
8: end for

Fig. The diagram of the ARIMA online model.

C. RNN

In feed-forward neural networks, the current result (or step) is represented by the $N - 1$ past values (steps). Unlike feed-forward neural networks, the RNN is a neural network with additional connections between adjacent time steps. History is represented through the recurrent connections, so its length is unbounded. The RNN model can compress the whole history into a low-dimensional space, while feed-forward neural networks compress only the $N - 1$ past values. An RNN allows any node to connect to itself over time and shares its weights across the different time steps [20]. The RNN is therefore an effective model for time series data.

At time $t$ (also called "step $t$"), the model receives the current input $x_t$ and the previous hidden state $h_{t-1}$ and computes the current hidden state $h_t$. The output value $y_t$ is computed from the hidden state $h_t$. The output $y_t$ therefore depends not only on the current input $x_t$ but also on the value $x_{t-1}$ from the previous time step.

Fig. RNN model.

Let $x = (x_0, x_1, \ldots, x_T)$ be the input vector, $h = (h_0, h_1, \ldots, h_T)$ the hidden states of the recurrent layer and $y = (y_0, y_1, \ldots, y_T)$ the output vector (output layer). In the RNN model, the relations among the states are represented as a tightly coupled system given by the following equations:

\[
\begin{aligned}
x_t &= W_t \cdot h_{t-1} \\
h_t &= f(U \cdot x_t) \\
y_t &= g(V \cdot h_t)
\end{aligned}
\tag{5}
\]

where $W_t$ denotes the current parameter vector, and $U$ and $V$ are the learned weights (the connection weights from the input layer $x$ to the hidden layer $h$, and from the hidden layer to the output). The function $f$ is the sigmoid function and $g$ is the softmax function:

\[ f(z) = \frac{1}{1 + e^{-z}}, \qquad g(z_m) = \frac{e^{z_m}}{\sum_k e^{z_k}} \]

RNNs provide a short-term memory, so they can deal with position invariance better than feed-forward networks and can capture short-term dependencies in a data series. The weakness of the RNN is its limited ability to handle long-term dependencies [16], [39].

1) RNN online: The online learning model has to maintain and update the parameters (and hyperparameters) of its environment so that it can forecast future values [40], [41]. A typical online model takes the following form:

\[
\begin{aligned}
W_t &= W_{t-1} + \mu_t \\
h_t &= f(h_{t-1}, W_t, x_t) \\
y_t &= g(h_t, W_t) + \nu_t
\end{aligned}
\tag{6}
\]

where $\mu_t$ is a random fluctuation obeying Gaussian statistics, $f_{\mu_t} = \mathcal{N}(\mu_t \mid 0, Q_t)$ ($Q_t$ is a covariance matrix), and $\nu_t$ is noise satisfying $f_{\nu_t} = \mathcal{N}(\nu_t \mid 0, R_t)$ ($R_t$ is a covariance matrix). The resulting update scheme for the parameters of the model is given in equations (7) and (8).

• Step 2: Add the external regressors (in the form of the Fourier formula) to the ARIMA model / ARIMA online model to explain the seasonal components in the time series.
• Step 3: Use Akaike's Information Criterion (AIC) to find the best-fit model. In order to find the best parameters of the Fourier formula corresponding to
each of the periods, the AIC values of the ARIMA model are calculated. The parameter that gives the minimum AIC value is the one we need.

For the RNN online model, the parameter update scheme is

\[
\begin{aligned}
\bar{h}_t &= f(\bar{h}_{t-1}, \bar{W}_t, \bar{x}_t) \\
O_t &= g(\bar{h}_t, \bar{W}_t) \\
P_t &= P_t^{*} - P_t^{*} H_t^{T} \big(H_t P_t^{*} H_t^{T} + R_t\big)^{-1} H_t P_t^{*} \\
\bar{W}_t &= \bar{W}_{t-1} + P_t H_t^{T} R_t^{-1} (y_t - O_t)
\end{aligned}
\tag{7}
\]

where $P_t$ is the covariance matrix of the parameter $W_t$ and $P_t^{*}$ is the a priori covariance matrix of $W_t$, with $P_t^{*} = P_{t-1} + Q_t$. $H_t$ is the gradient of the output with respect to $\bar{W}_{t-1}$:

\[
\begin{aligned}
H_t = \frac{dO_t}{d\bar{W}_{t-1}} &= \frac{\partial O_t}{\partial \bar{h}_t} \cdot \frac{d\bar{h}_t}{d\bar{W}_{t-1}} + \frac{\partial O_t}{\partial \bar{W}_{t-1}} \\
\frac{d\bar{h}_t}{d\bar{W}_{t-1}} &= \frac{\partial \bar{h}_t}{\partial \bar{W}_{t-1}} + \frac{\partial \bar{h}_t}{\partial \bar{h}_{t-1}} \frac{d\bar{h}_{t-1}}{d\bar{W}_{t-1}}
\end{aligned}
\tag{8}
\]

Using $y_t$ as input corresponds to training the model. The latter expression (the multiplication term) in equation (8) is the gradient calculated for online (real-time) learning during model training.

E. Proposed model - Hybrid Multi-seasonal - ARIMA online - RNN online

We combine the single models into a new hybrid model that carries the advantages of each single model while reducing their defects.

D. Multiple seasonal periods

Electricity load data often exhibit multiple seasonal patterns (daily, weekly, monthly, quarterly and annual). SARIMA models can represent only one seasonal pattern and do not perform well when the period is long (for example an annual pattern, or even a weekly pattern for hourly data). For the RNN, the input window is fixed and not long enough to capture long-term periodic effects. A workaround is to use dummy variables to represent calendar effects, but this is infeasible for long seasonal periods because of the large input dimension. Recently, many time series forecasting applications have required more complicated seasonal components [42], [43]. A natural solution is to use Fourier series to handle multiple seasonal patterns. Each seasonal pattern is represented by a different period $p_i$. In our case, the seasonal periods can be hourly, daily, weekly, monthly or annual, so the Fourier series takes the following form:

\[ y_t = a + \sum_{i=1}^{M} \sum_{k=1}^{K_i} \left[ \alpha_{i,k} \sin\left(\frac{2\pi k t}{p_i}\right) + \beta_{i,k} \cos\left(\frac{2\pi k t}{p_i}\right) \right] + N_t \tag{9} \]
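As an illustration of equation (9), the following minimal Python sketch builds the sine and cosine regressors for several periods and estimates the intercept and the coefficients by ordinary least squares. It is only a sketch under stated assumptions: the helper names `fourier_terms` and `fit_fourier`, the synthetic series, the periods 7 and 365 (a daily series with weekly and annual cycles) and the fixed orders $K_i = 2$ are all chosen here for illustration and are not taken from the paper, which selects $K_i$ by AIC.

```python
import numpy as np

def fourier_terms(t, periods, orders):
    """Build the sin/cos regressors of equation (9).

    t       : 1-D array of time indices
    periods : seasonal periods p_i (assumed here: [7, 365] for daily data)
    orders  : number of harmonics K_i used for each period
    Returns an array of shape (len(t), 2 * sum(orders)).
    """
    t = np.asarray(t, dtype=float)
    columns = []
    for p, K in zip(periods, orders):
        for k in range(1, K + 1):
            angle = 2.0 * np.pi * k * t / p
            columns.append(np.sin(angle))  # multiplies the alpha coefficient
            columns.append(np.cos(angle))  # multiplies the beta coefficient
    return np.column_stack(columns)

def fit_fourier(y, periods, orders):
    """Least-squares estimate of the intercept a and the sin/cos coefficients."""
    t = np.arange(len(y))
    X = np.column_stack([np.ones(len(y)), fourier_terms(t, periods, orders)])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef, X @ coef

# Synthetic daily series with a weekly and an annual cycle (illustration only).
rng = np.random.default_rng(0)
t = np.arange(3 * 365)
y = (100
     + 10 * np.sin(2 * np.pi * t / 7)
     + 25 * np.cos(2 * np.pi * t / 365)
     + rng.normal(0, 1, t.size))

coef, fitted = fit_fourier(y, periods=[7, 365], orders=[2, 2])
residual = y - fitted  # plays the role of N_t, left for the ARIMA / RNN part
```

In this sketch the residual series is what would be passed on to the non-seasonal models; choosing the orders by comparing AIC values across candidate fits is left out for brevity.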
This approach is flexible, allowing us to incorporate multiple periods. In this study, for daily data we used $p_1 = 7$ to represent the weekly effect and $p_2 = 365$ to represent the annual effect. For each period $p_i$, the value of $K_i$ is chosen to minimize the Akaike information criterion (AIC). We then compare the fitted model with the other candidate models using this information-theoretic criterion and choose the best one. The multi-seasonal component can be used in ARIMA / ARIMA online as follows:
• Step 1: Decompose the multiple seasons in the original time series using the Fourier formula with different periods.

Fig. Proposed model.

The proposed hybrid Multi-seasonal - ARIMA online - LSTM online model runs in the following steps:
• 1st step: split the original time series into a training set and a test set.
• 2nd step: use an MA filter to decompose the training set into two components: the trend component and the first residual component (the residual of the first decomposition). The trend component is a linear signal. The first residual component, which is the difference between the original data and the trend component, is a non-linear time series.
• 3rd step: use the multi-seasonal decomposition method to decompose the first residual component into two sets: the multi-seasonal components (1st set) and the second residual component (2nd set). All of the multi-seasonal components are linear signals. The second residual component is a non-linear time series.
• 4th step: calibrate the ARIMA model on the trend component and obtain the predicted values of the first component.
• 5th step: calibrate the Fourier formula model on all of the multi-seasonal components and obtain the predicted values of the second set of components.
• 6th step: calibrate the Recurrent Neural Network (LSTM) model on the second residual component and obtain the predicted values of the third component.
• 7th step: reconstruct the result for the training set from these results.
• 8th step: forecast on the test set with the trained model.
• 9th step: verification with
MSE or MAPE criteria.

IV. EXPERIMENT AND RESULTS

In this section, we first describe the datasets used in the experiments and then present the results of the proposed model.

A. Study area and datasets

1) Dataset in Australia: We used the Australian Electricity Load dataset for this comparison. The dataset is published at https://www.aemo.com.au/. It contains electricity load measurements for all days of the week, dated from May 21st 2019 to now. The data were collected at 1-day intervals (at 1:01 AM). The dataset is updated daily.

Fig. The public data for daily Australian electricity load. Data were collected on June 19th 2020.

2) Dataset in Vietnam: Northern Vietnam Electricity Load dataset: This dataset contains electricity load measurements on working days (Monday to Friday) dated from January 2015 to 30 August 2019. The data were collected at 1-hour intervals and consist of 29208 instances. The dataset is represented as a table: each row corresponds to a date and the columns indicate the specific hour. We treat these 24 columns as 24 separate time series and evaluate the results individually.

3) Criteria for comparison: We use the Mean Squared Error (MSE) (not the RMSE, Root Mean Squared Error) and the Mean Absolute Percentage Error (MAPE):

\[ \mathrm{MSE} = \frac{1}{n} \sum_{i=1}^{n} (\hat{y}_i - y_i)^2, \qquad \mathrm{MAPE} = \frac{1}{n} \sum_{i=1}^{n} \left| \frac{\hat{y}_i - y_i}{y_i} \right| \]

B. Results and comparison

(p, d, q)        AIC
ARIMA(2,0,2)     7356.900
ARIMA(2,1,2)     7331.565
ARIMA(2,2,2)     7321.969
ARIMA(2,0,3)     7358.254
ARIMA(2,1,3)     7329.418
ARIMA(2,2,3)     7358.855
ARIMA(2,0,4)     7357.669
ARIMA(2,1,4)     7310.707
ARIMA(3,0,2)     7358.180
ARIMA(3,1,2)     7334.104
ARIMA(3,0,3)     7356.138
ARIMA(3,1,3)     7296.385
ARIMA(3,0,4)     7333.827
ARIMA(3,1,4)     7281.776
ARIMA(4,0,2)     7360.146
ARIMA(4,1,2)     7331.116
ARIMA(4,0,3)     7326.382
ARIMA(4,1,3)     7274.138
ARIMA(4,0,4)     7309.774
ARIMA(4,1,4)     7272.979

TABLE VI. COMPARISON OF THE AIC CRITERION FOR THE BEST PARAMETERS (p, d, q) (AUSTRALIAN ELECTRICITY LOAD). THE LOOP WAS P = 2, ..., 10; D = 0, ..., 5; Q = 2, ..., 10. THE BEST PARAMETERS ARE (p, d, q) = (4,1,4).

Layer   Epochs   Loss (epoch 1 to 10)                                                     Note
...     10       0.1059 0.0185 0.0149 0.0116 0.0091 0.0073 0.0062 0.0055 0.0052 0.0053
4       10       0.0109 0.0057 0.0054 0.0052 0.0051 0.0051 0.0050 0.0051 0.0050 0.0051   Best
...     10       0.0439 0.0109 0.0079 0.0064 0.0056 0.0052 0.0053 0.0052 0.0051 0.0053
etc.
128     10       0.0098 0.0055 0.0055 0.0057 0.0056 0.0058 0.0056 0.0055 0.0054 0.0056

TABLE VII. COMPARISON OF THE EPOCH AND NUMBER OF LAYERS FOR THE BEST RNN MODEL (AUSTRALIAN ELECTRICITY LOAD). THE LOOP OF Epoch WAS 1 TO 10. THE LOOP OF Layer WAS ... TO 128. THE BEST RESULT WAS Layer=4 AND Epoch=7 OR 9.

1) Results in testing dataset (Australian dataset):

Method                    MSE            MAPE
ARIMA(4,1,4)              52054161.01    0.744251
RNN                       57667888.03    0.770737
ARIMA online (a)          38539636.48    0.579787
RNN online (b)            40149103.00    0.672100
Proposed hybrid online    37653594.10    0.569104
(a) Liu's model (rebuilt) [29]
(b) Gao's model (rebuilt) [31]

TABLE VIII. FORECASTING COMPARISON (AUSTRALIAN ELECTRICITY LOAD).

In Table VIII, we can see that the result of the single RNN model is the worst. Moreover, the simple RNN and the simple ARIMA can forecast only once for the next 7 days (7-steps-ahead). In contrast, every day, when new data are collected, ARIMA online and RNN online forecast again, so they only forecast the next step (1-step-ahead). After 7 days, ARIMA online and RNN online will also have produced a forecast for the 7th day. In theory, ARIMA online and RNN online should therefore produce better results. Indeed, from the table, under the MAPE criterion ARIMA online is 22.1% better than ARIMA, and RNN online is 14.6% better than RNN.

Finally, the proposed model gave the best results compared with the single models. Under the MSE criterion, ARIMA online is the best of the single models: it is 38.2% better than ARIMA, 6.6% better than RNN online and 53.0% better than RNN (the worst of all the single models). However, the proposed model is still 2.3% better than ARIMA online. Under the MAPE criterion, the proposed model is also the best: it is 37.8% better than the RNN (the worst single model) and 3.6% better than ARIMA online (the best single model).

Fig. Result of the ARIMA model. Green line is the original data. Blue line is the predicted values.
Fig. Result of the ARIMA online model. Green line is the original data. Blue line is the predicted values.
Fig. Result of the RNN model. Green line is the original data. Blue line is the predicted values.
Fig. Result of the RNN online model. Green line is the original data. Blue line is the predicted values.
Fig. Result of the proposed model. Green line is the original data. Blue line is the predicted values.

2) Results in Vietnamese dataset:

Method                    MSE           MAPE
ARIMA                     1398659.72    12.532267
RNN                       1063311.72    6.357248
ARIMA online (a)          1315265.94    7.093431
RNN online (b)            998594.00     3.190338
Proposed hybrid online    990192.91     3.023862
(a) Liu's model (rebuilt) [29]
(b) Gao's model (rebuilt) [31]

TABLE IX. FORECASTING COMPARISON (VIETNAMESE ELECTRICITY LOAD).

In Table IX, we can see that the result of the proposed model is the best. The simple models can forecast only once for the next 7 days (7-steps-ahead). ARIMA shows the worst result, 3% worse than RNN. ARIMA online gave slightly better results than ARIMA (5.9% better in MSE), but was still inferior to RNN (19.1% worse in MSE). Among the individual models, RNN online gives the best results. Under the MAPE criterion, RNN online is 292.8% better than ARIMA and 112.3% better than ARIMA online, and its error is only about 50% of that of the single RNN. Finally, the results of the proposed model are the best. Under the MAPE criterion, its error is 5.5% lower than that of the online RNN model, the best of the single models. Under the MSE criterion, the proposed model also gave the best results, 29.2% better than the worst result, that of the single ARIMA model.

Fig. 10. Result of the ARIMA model. Green line is the original data. Blue line is the predicted values.
Fig. 11. Result of the RNN model. Green line is the original data. Blue line is the predicted values.
Fig. 12. Result of the ARIMA online model. Green line is the original data. Blue line is the predicted values.
Fig. 13. Result of the RNN model. Green line is the original data. Blue line is the predicted values.
Fig. 14. Result of the proposed model. Green line is the original data. Blue line is the predicted values.

V. CONCLUSION AND DISCUSSION

In this paper we proposed a number of directions for developing hybrid models. In the first direction, pre-processing can be used to remove unpredictable noise from the input time series. One widely used noise-filtering method in pre-processing is the wavelet transform (CWT or DWT). With the noise removed, the accuracy of the model is expected to increase. The second direction is to use stronger machine learning models: LSTM and GRU can be used instead of the RNN, and the model is then likely to be more accurate. However, for each input dataset, all of these models should be tested to find the one that gives the best results. Another very practical direction is the use of multiple input time series. The model then has more information for estimating the parameters, and the resulting models should achieve better results.

We suggest a few ways in which follow-up work can approach the problem:
• Replace the RNN online model with an LSTM online model or a GRU online model.
• Pre-process (denoise) the data with a wavelet model.
• Use two or more time series as input data.

In future work, the optimization of hyper-parameters for the online model will be studied. Better forecasting accuracy always remains a subject for research. Last but not least, this hybrid online model applies not only to electricity load forecasting but can potentially be applied in many other fields that use time series data.

ACKNOWLEDGMENT

This work was funded by the
project number T2020SAHEP-005 of Hanoi University of Technology (HUST). We sincerely thank our colleagues for helping us through the completion of this project.

REFERENCES

[1] J. Wu, Z. Cui, Y. Chen, D. Kong, Y.-G. Wang, A new hybrid model to predict the electrical load in five states of Australia, Energy 166 (2019) 598–609.
[2] S. Muzaffar, A. Afshari, Short-term load forecasts using LSTM networks, Energy Procedia 158 (2019) 2922–2927, Innovative Solutions for Energy Transitions.
[3] W. El-Baz, P. Tzscheutschler, Short-term smart learning electrical load prediction algorithm for home energy management systems, Applied Energy 147 (2015) 10–19.
[4] A. Sagheer, M. Kotb, Time series forecasting of petroleum production using deep LSTM recurrent networks, Neurocomputing 323 (2019) 203–213.
[5] H. Quan, D. Srinivasan, A. Khosravi, Uncertainty handling using neural network-based prediction intervals for electrical load forecasting, Energy 73 (2014) 916–925.
[6] T. Fang, R. Lahdelma, Evaluation of a multiple linear regression model and SARIMA model in forecasting heat demand for district heating system, Applied Energy 179 (2016) 544–552.
[7] M. Milenković, L. Švadlenka, V. Melichar, N. Bojović, Z. Avramović, Modelling and forecasting of rainfall time series using SARIMA, Transport 33 (2018) 399–419.
[8] D. P. P., M. M. Z., Modelling and forecasting of rainfall time series using SARIMA, Environ. Process. (2017) 399–419.
[9] S. Xu, H. K. Chan, T. Zhang, Forecasting the demand of the aviation industry using hybrid time series SARIMA-SVR approach, Transportation Research Part E: Logistics and Transportation Review 122 (2019) 169–180.
[10] G. Zhang, Time series forecasting using a hybrid ARIMA and neural network model, Neurocomputing 50 (2003) 159–175.
[11] C. N. Babu, B. E. Reddy, A moving-average filter based hybrid ARIMA–ANN model for forecasting time series data, Applied Soft Computing 23 (2014) 27–38.
[12] M. Ray, A. Rai, R. V., K. Singh, ARIMA-WNN hybrid model for forecasting wheat yield time-series data, Journal of the Indian Society of Agricultural Statistics 70 (2016) 63–70.
[13] G. Aradhye, A. C. S. Rao, M. D. M. Mohammed, A novel hybrid approach for time series data forecasting using moving average filter and ARIMA-SVM, Emerging Technologies in Data Mining and Information Security, Advances in Intelligent Systems and Computing 183 (2018) 369–381.
[14] A. Galicia, R. Talavera-Llames, A. Troncoso, I. Koprinska, F. Martínez-Álvarez, Multi-step forecasting for big data time series based on ensemble learning, Knowledge-Based Systems 163 (2019) 830–841.
[15] Aasim, S. Singh, A. Mohapatra, Repeated wavelet transform based ARIMA model for very short-term wind speed forecasting, Renewable Energy 136 (2019) 758–768.
[16] S. Hochreiter, J. Schmidhuber, Long short-term memory, Neural Computation 9 (8) (1997) 1735–1780.
[17] F. A. Gers, J. Schmidhuber, F. Cummins, Learning to forget: Continual prediction with LSTM, Neural Computation 12 (10) (2000) 2451–2471.
[18] I. Aizenberg, L. Sheremetov, L. Villa-Vargas, J. Martinez-Muñoz, Multilayer neural network with multi-valued neurons in time series forecasting of oil production, Neurocomputing 175 (2016) 980–989.
[19] J. Gao, Y. L. Murphey, H. Zhu, Multivariate time series prediction of lane changing behavior using deep neural network, Applied Intelligence 48 (2018) 3523–3537.
[20] L. Peng, S. Liu, R. Liu, L. Wang, Effective long short-term memory with differential evolution algorithm for electricity price prediction, Energy 162 (2018) 1301–1314.
[21] B. Doucoure, K. Agbossou, A. Cardenas, Time series prediction using artificial wavelet neural network and multi-resolution analysis: Application to wind speed data, Renewable Energy 92 (2016) 202–211.
[22] X. Cai, N. Zhang, G. K. Venayagamoorthy, D. C. Wunsch, Time series prediction with recurrent neural networks trained by a hybrid PSO–EA algorithm, Neurocomputing 70 (13) (2007) 2342–2353, selected papers from the 3rd International Conference on Development and Learning (ICDL 2004), Time series prediction competition: the CATS benchmark.
[23] A. Cherif, H. Cardot, R. Boné, SOM time series clustering and prediction with recurrent neural networks, Neurocomputing 74 (11) (2011) 1936–1944, Adaptive Incremental Learning in Neural Networks, Learning Algorithm and Mathematic Modelling, selected papers from the International Conference on Neural Information Processing 2009 (ICONIP 2009).
[24] R. Chandra, M. Zhang, Cooperative coevolution of Elman recurrent neural networks for chaotic time series prediction, Neurocomputing 86 (2012) 116–123.
[25] C. Smith, Y. Jin, Evolutionary multi-objective generation of recurrent neural network ensembles for time series prediction, Neurocomputing 143 (2014) 302–311.
[26] R. Chandra, Competition and collaboration in cooperative coevolution of Elman recurrent neural networks for time-series prediction, IEEE Transactions on Neural Networks and Learning Systems 26 (12) (2015) 3123–3136.
[27] N. T. N. Anh, N. Q. Dat, N. T. Van, N. N. Doanh, N. L. An, Wavelet-artificial neural network model for water level forecasting, 2018 International Conference on Research in Intelligent and Computing in Engineering (RICE) (2018).
[28] N. T. N. Anh, N. Q. Dat, V. K. Solanki, N. L. An, Prediction of water level using time series, wavelet and neural network approaches, International Journal of Information Retrieval Research, IGI-Global 10 (03) (2020).
[29] C. Liu, S. C. Hoi, P. Zhao, J. Sun, Online ARIMA algorithms for time series prediction, Thirtieth AAAI Conference on Artificial Intelligence (2016).
[30] J. Leithon, T. J. Lim, S. Sun, Renewable energy management in cellular networks: An online strategy based on ARIMA forecasting and a Markov chain model, 2016 IEEE Wireless Communications and Networking Conference (2016) 1–6.
[31] H. Gao, R. Sollacher, H.-P. Kriegel, Spiral recurrent neural network for online learning, ESANN'2007 proceedings (2007).
[32] S. Zhai, K.-h. Chang, R. Zhang, Z. M. Zhang, DeepIntent: Learning attentions for online advertising with recurrent neural networks, KDD '16: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (2016) 1295–1304.
[33] T. Guo, Z. Xu, X. Yao, H. Chen, K. Aberer, K. Funaya, Robust online time series prediction with recurrent neural networks, 2016 IEEE International Conference on Data Science and Advanced Analytics (2016) 816–825.
[34] A. Ahmadi, J. Tani, A novel predictive-coding-inspired variational RNN model for online prediction and recognition, Neural Computation 31 (11) (2019) 2025–2074.
[35] G. Box, G. Jenkins, Time series analysis, forecasting and control, Holden-Day, San Francisco, CA (1970).
[36] O. Anava, E. Hazan, S. Mannor, O. Shamir, Online learning for time series prediction, JMLR: Workshop and Conference Proceedings (2013).
[37] E. Hazan, A. Agarwal, S. Kale, Logarithmic regret algorithms for online convex optimization, Machine Learning 69 (2007) 169–192.
[38] E. Hazan, A. Kalai, S. Kale, A. Agarwal, Logarithmic regret algorithms for online convex optimization, 2006, pp. 499–513.
[39] R. Pascanu, T. Mikolov, Y. Bengio, On the difficulty of training recurrent neural networks (2013) 1310–1318.
[40] S. Sivakumar, W. Robertson, W. Phillips, Online stabilization of block-diagonal recurrent neural networks, IEEE Transactions on Neural Networks 10 (01) (1999) 167–175.
[41] X. Zhang, G. Xie, C. Liu, Y. Bengio, End-to-end online writer identification with recurrent neural network, IEEE Transactions on Human-Machine Systems 47 (2) (2017) 285–292.
[42] N. Ohana-Levi, S. Munitz, A. Ben-Gal, A. Schwartz, A. Peeters, Y. Netzer, Multiseasonal grapevine water consumption – drivers and forecasting, Agricultural and Forest Meteorology 280 (2020) 107796.
[43] J. Li, W. Pedrycz, I. Jamal, Multivariate time series anomaly detection: A framework of hidden Markov models, Applied Soft Computing 60 (2017) 229–240.

...
an online forecasting method applied to time series with a seasonal component, such as electricity load series.

1.3 Research objectives

The objective of the thesis is to study online learning methods for time series forecasting...

Chapter 1: A general introduction to the electricity load forecasting problem, and an introduction to online learning and its application to time series forecasting. Finally, the research objectives and the structure of the thesis are stated.
• Chapter... applied to the electricity load forecasting problem. Specifically, because seasonality is a characteristic feature of the electricity load forecasting problem, the thesis proposes an online seasonal ARMA model that takes seasonality into account for forecasting.

1.4 Thesis outline