Followers’ and Winning Percentages

Part of the document Collaborate computing networking, applications and worksharing (pages 239 - 257)

The winning percentage can be seen as the probability of winning against the opponent. Intuitively, the probability of winning increases as the difference in the followers’ percentages increases. We only point out the main observation: even if the difference between two strategies in terms of number of followers is small, the difference between winning probabilities can be large. This is easy to understand, since having just one more follower than the opponent is enough to win.

4 Knowledge Advantage

In this section we introduce the smart peer, which has prior knowledge of the connections established by the opposing peer. The smart peer uses this knowledge to formulate a mixed integer linear program (MILP) in order to acquire the majority of followers with the least possible budget.

The strategy adopted by the other forceful peer is supposed to be known.

Let $\lambda_i$ be the number of connections from the (non-smart) forceful peer to the $i$th node, $\forall i \in N$. Let $d_i$ be the total weight from the $i$th node to normal nodes only. Both $d_i$ and $\lambda_i$ are known. Let us now introduce the problem variables. Let $x_i$ be the number of connections from the smart peer to the $i$th node and let $y_i$ be the final opinion of the $i$th node. A binary variable $p_i$ is equal to 1 if the $i$th node follows the smart peer, and 0 otherwise, in line with its use in (4f) and (4g). We also use binary variables $z_i^j$ to represent $x_i$ (see (4b)). Some intermediate variables $t_i^j$ are used to denote the products $y_i z_i^j$. The problem can then be formulated using the following MILP.

228 A. Sobehy et al.

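\begin{align}
\min \quad & \sum_{i \in N} x_i \tag{4a} \\
& x_i = \sum_{j=0}^{b} z_i^j \, 2^j \tag{4b} \\
& y_i (\lambda_i + d_i) + \sum_{j=0}^{b} t_i^j \, 2^j = (1-\alpha) \Big( x_i - \lambda_i + \sum_{k \in \mathrm{neigh}(i)} y_k \Big) \tag{4c} \\
& -z_i^j \le t_i^j \le z_i^j \tag{4d} \\
& y_i - 1 + z_i^j \le t_i^j \le 1 + y_i - z_i^j \tag{4e} \\
& y_i \ge -1 + p_i (1 + \mathit{neut\_thresh}) \tag{4f} \\
& \sum_{i \in N} p_i \ge \frac{N+1}{2} \tag{4g} \\
& z_i^j, \; p_i \in \{0, 1\} \tag{4h}
\end{align}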

The objective function (4a) aims to minimize the total budget used by the smart peer to win. The first constraint (4b) is a binary representation of the edge weights of the smart peer, which is used to preserve the linearity of the problem.

The upper limit $b$ of the summation is the number of binary digits representing a weight; it limits the maximum edge weight the smart peer can assign to a normal peer. The second constraint (4c) is derived from the opinion update equation of Algorithm 1, with the variables $t_i^j$ introduced to avoid the product of variables $y_i x_i$. $t_i^j$ represents the product $y_i z_i^j$, as ensured by (4d) and (4e). In constraint (4c), we used the fact that the initial opinion $O_N$ is 0 and that the weights of links between normal peers are equal to 1. $p_i$ is defined in (4f) and used in (4g) to guarantee that more than half of the normal nodes follow the smart peer. The Gurobi solver [21] is used. For all matches we set a time limit of one hour for the optimization, after which the solver returns the best solution reached.
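The linearization described by (4b), (4d) and (4e) can be checked numerically. The following is a minimal sketch, not the authors' Gurobi model: it assumes that opinions lie in [-1, 1], that a weight uses the bits j = 0..b, and all function names are hypothetical. For a binary z, the two pairs of bounds leave a single feasible value for t, namely y·z, so the weighted sum of the t values recovers the product y·x.

```python
def product_bounds(y, z):
    """Interval allowed for t by (4d) and (4e), for opinion y in [-1, 1] and bit z."""
    # (4d): -z <= t <= z        (4e): y - (1 - z) <= t <= y + (1 - z)
    lower = max(-z, y - (1 - z))
    upper = min(z, y + (1 - z))
    return lower, upper


def binary_expansion(x, b):
    """Bits z^0..z^b such that x = sum_j z^j * 2^j, as in (4b)."""
    if not 0 <= x < 2 ** (b + 1):
        raise ValueError("x does not fit in b + 1 binary digits")
    return [(x >> j) & 1 for j in range(b + 1)]


def linearized_product(y, x, b):
    """Recover y * x from the t variables, as used on the left side of (4c)."""
    total = 0.0
    for j, z in enumerate(binary_expansion(x, b)):
        lower, upper = product_bounds(y, z)
        # The bounds coincide, so t is forced to y * z.
        assert lower == upper == y * z
        total += lower * 2 ** j
    return total


if __name__ == "__main__":
    # Illustrative values only: opinion 0.25, weight 5, four binary digits.
    print(linearized_product(0.25, 5, 3))
```

With z = 0 both bounds collapse to 0, and with z = 1 they collapse to y, which is why no product of variables remains in the model.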

Smart Peer Vs Other Strategies

For each graph/budget configuration, we run a simulation match between the smart peer and the most dominant of the four strategies mentioned above. Figure 3 shows the budget needed by the smart peer to win against the most dominant strategy in each graph configuration.

To quantify the budget needed by the smart peer to beat an existing forceful peer, we draw a best-fit line. The best-fit line, shown as a solid line in Fig. 3, has the equation $f(x) = 0.5x + 4.85$, where $x$ is the budget of the dominant (non-smart) strategy and $f(x)$ is the budget needed by the smart peer to win.

The main conclusion is that, with prior knowledge of the opponent’s connections, it is possible to win with nearly half of the opponent’s budget.
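As a quick arithmetic illustration of the best-fit line, the sketch below evaluates f(x) = 0.5x + 4.85 for a few opponent budgets. The sample budgets are arbitrary and are not taken from the experiments, and the fit is only meaningful within the budget range covered by Fig. 3.

```python
def smart_budget(opponent_budget):
    """Budget the smart peer needs according to the fit f(x) = 0.5x + 4.85."""
    return 0.5 * opponent_budget + 4.85


if __name__ == "__main__":
    # Arbitrary sample budgets, chosen only to show the trend.
    for x in (10, 50, 100, 500):
        need = smart_budget(x)
        print(f"opponent budget {x:>3}: smart peer needs about {need:.2f} "
              f"({need / x:.0%} of the opponent's budget)")
```

Because of the constant term 4.85, the ratio f(x)/x approaches one half only as the opponent's budget grows.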

How to Win Elections 229

Fig. 3. Smart Peer against dominant strategies

5 Conclusion

We have shown that the best strategy to strongly influence the members of a social network depends on the underlying graph structure and the budget. Thus, for each budget/graph pair configuration, the proposed strategies can be ranked from worst to best. We also observed that knowing the opponent’s strategy is a decisive advantage: the budget can be reduced by nearly 50%.

This work can be extended in several ways. Strategies have been compared given the same budget. It might be interesting to study the robustness of the ranking of strategies by assuming that one strategy has slightly more budget than another. Other opinion propagation models can also be studied. One can for example consider the majority model where each node has a binary opinion following the opinion of the majority of his neighbors. Another worth-studying problem is the competition between more than two candidates. Finally, one can also consider antagonistic interactions inside the network represented by negative link weight.

References

1. Hendrikx, F., Bubendorfer, K., Chard, R.: Reputation systems: a survey and taxonomy. J. Parallel Distrib. Comput. 75, 184–197 (2015)

2. Block Chain Principle. http://blockchain.info

3. Altman, E., Kumar, P., Venkatramanan, S., Kumar, A.: Competition over timeline in social networks. In: 2013 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM), pp. 1352–1357. IEEE (2013)

4. Neglia, G., Ye, X., Gabielkov, M., Legout, A.: How to network in online social networks. In: 2014 IEEE Conference on Computer Communications Workshops (INFOCOM WKSHPS), pp. 819–824. IEEE (2014)

5. Borodin, A., Filmus, Y., Oren, J.: Threshold models for competitive influence in social networks. In: Saberi, A. (ed.) WINE 2010. LNCS, vol. 6484, pp. 539–550. Springer, Heidelberg (2010). doi:10.1007/978-3-642-17572-5_48

6. Chen, L., Leneutre, J.: Fight jamming with jamming - a game theoretic analysis of jamming attack in wireless networks and defense strategy. Comput. Netw. 55(9), 2259–2270 (2011)

7. Stackoverflow. http://stackoverflow.com/

8. Kamvar, S.D., Schlosser, M.T., Garcia-Molina, H.: The eigentrust algorithm for reputation management in p2p networks. In: Proceedings of the 12th International Conference on World Wide Web, pp. 640–651. ACM (2003)

9. Bradai, A., Ben-Ameur, W., Afifi, H.: Byzantine resistant reputation-based trust management. In: 2013 9th International Conference on Collaborative Computing: Networking, Applications and Worksharing (Collaboratecom), pp. 269–278. IEEE (2013)

10. Twitter. http://twitter.com

11. Facebook. http://www.facebook.com

12. Chen, K., Liu, G., Shen, H., Qi, F.: Sociallink: utilizing social network and transaction links for effective trust management in P2P file sharing systems. In: 2015 IEEE International Conference on Peer-to-Peer Computing (P2P), pp. 1–10. IEEE (2015)

13. DeGroot, M.H.: Reaching a consensus. J. Am. Stat. Assoc. 69(345), 118–121 (1974)

14. Krause, U.: A discrete nonlinear and non-autonomous model of consensus formation. In: Communications in Difference Equations, pp. 227–236 (2000)

15. Acemoglu, D., Ozdaglar, A.: Opinion dynamics and learning in social networks. Dyn. Games Appl. 1(1), 3–49 (2011)

16. Ben-Ameur, W., Bianchi, P., Jakubowicz, J.: Robust distributed consensus using total variation. IEEE Trans. Autom. Control 61(6), 1550–1564 (2016)

17. Ben-Ameur, W., Bianchi, P., Jakubowicz, J.: Robust average consensus using total variation gossip algorithm. In: 2012 6th International Conference on Performance Evaluation Methodologies and Tools (VALUETOOLS), pp. 99–106. IEEE (2012)

18. Erdos, P., Renyi, A.: On random graphs I. Publ. Math. Debrecen 6, 290–297 (1959)

19. Penrose, M.: Random Geometric Graphs (No. 5). Oxford University Press (2003)

20. Barabási, A.L., Albert, R.: Emergence of scaling in random networks. Science 286(5439), 509–512 (1999)

21. Gurobi optimizer reference manual. http://www.gurobi.com

Research on Short-Term Prediction of Power Grid Status Data Based on SVM

Jianjun Su1, Yi Yang1, Danfeng Yan2, Ye Tang2 (✉), and Zongqi Mu2

1 State Grid ShanDong Electric Power Research Institute, Jinan 250001, China
13953187960@163.com, yangyi814@gmail.com

2 Beijing University of Posts and Telecommunications, Beijing 100876, China
{yandf,2015213126}@bupt.edu.cn, tangye_bupt@foxmail.com

Abstract. An EMS (energy management system) is a collection of computer hardware and software that collects, monitors, controls and optimizes data provided by the power control system, and provides trading schemes, security services and service analysis for the power market. The prediction of status data is a basic function module of advanced application software systems. It is therefore meaningful to research new methods and technologies for predicting power grid status data. In this paper, a support vector machine (SVM) is used for regression prediction of the active power recorded by an EMS. In the training process, the training set and the kernel function of the SVM are selected, the parameters are optimized, and the performance of the SVM is evaluated. Experiments show that SVM can achieve high accuracy in short-term active power prediction even though the data set is small. This paper provides a new idea for related research in the electric power industry.

Keywords: EMS · SVM · Regression prediction · Parameter optimization · Machine learning

1 Introduction

The electric power industry is one of the most important basic national industries; it is the lifeblood of the national economy and an engine of economic development. It plays a crucial role in national security, economic development, social stability and quality of life. Modern society is inseparable from the power supply.

With the rapid development of the power industry, research on high-precision prediction technology and application systems for power state data is becoming more important and brings direct and significant economic and social benefits.

Active power, usually denoted by the letter P, is one of the most important quantities recorded in the power grid. Active power is the electric power that is converted into other forms of energy (mechanical, light or thermal energy), so it reflects the usage of the whole power grid. Through active power we can better understand the energy consumption of the power grid and better monitor its operational state. It is therefore necessary to find a way to predict the active power value in the EMS of the power grid.

Funded by the national high technology research and development program (863 Program) (No. 2015AA050204)

©ICST Institute for Computer Sciences, Social Informatics and Telecommunications Engineering 2017 S. Wang and A. Zhou (Eds.): CollaborateCom 2016, LNICST 201, pp. 231–241, 2017.

DOI: 10.1007/978-3-319-59288-6_21

The main aim of this paper is to predict the active power in power condition monitoring data. We use the existing EMS (energy management system) monitoring data, taking both meteorological factors and historical data into account, to design and construct the input training set, and we use the support vector machine (SVM) algorithm for regression prediction of the active power.

2 Research Status of Related Works

The prediction of power grid state monitoring data is an important basic task. The improvement of the power market mechanism will accelerate research on new methods and technologies for predicting such data.

At present, there is plenty of domestic and foreign research on the prediction of power grid state data. Elke Lorenz offers a way to predict regional PV power output based on weather forecast information up to three days ahead [1]. To predict the risk of failures of components and systems, Cynthia Rudin gives a general process that transforms historical electrical grid data into models by machine learning [2]. P. Louka predicts wind speed using Kalman filtering in order to predict wind power [3]. M. Carolin Mabel and E. Fernandez discuss a neural network model to predict the energy output of wind farms [4].

As for regression prediction with support vector machines, Vladimir Cherkassky and Yunqian Ma studied the selection of parameters for support vector regression; their experiments indicate that, under sparse sample settings, SVM regression has excellent generalization performance for different types of additive noise [5].

Existing theory, methods, recent developments and the scope of research on SVR are discussed in the paper by Debasish Basak, Srimanta Pal and Dipak Chandra Patranabis [6].

Chih-Chung Chang and Chih-Jen Lin offer a library for support vector machines, called LIBSVM, that has been developed since 2000 and is now widely used in machine learning and many other areas [7]; it is also used in this paper.

3 The Introduction of SVM

Frequently used prediction technologies include Artificial Neural Networks [8], Time Series Analysis [9], Kalman Filter Analysis [10], Grey Models [11] and Multi-output Support Vector Regression [12]. These methods have different characteristics and are already used in power systems. However, each has shortcomings that limit its effectiveness in practice. In this paper, the Support Vector Machine is applied to electric load prediction in order to improve active power prediction in the power system.


The support vector machine (SVM) is a machine learning method based on the statistical learning theory proposed by Vapnik et al. SVM is a supervised learning model used for pattern recognition, classification and regression analysis. SVM handles linearly inseparable problems by means of a soft margin, and introduces kernel functions to extend the solution surface from linear to nonlinear [13].

To deal with linearly inseparable problems, SVM uses kernel functions. A kernel function is, in essence, a mapping: it maps a nonlinear problem in a low-dimensional space to a linear problem in a high-dimensional space, where it is then solved.

The basic role of a kernel function is therefore to accept two vectors from the lower-dimensional space and to compute their inner product in the high-dimensional space after the transformation. In the nonlinear case, for a mapping function $\phi(x_i)$, the kernel function $K$ satisfies the Mercer condition and is given by:

$$\phi(x_i) \cdot \phi(x_j) = K(x_i, x_j) \tag{1}$$

Any function can be used as a kernel function as long as it satisfies the Mercer condition. Common kernel functions include:

1. Linear Kernel Function

$$K(x_i, x_j) = x_i \cdot x_j \tag{2}$$

2. Polynomial Kernel Function

$$K(x_i, x_j) = [\gamma (x_i \cdot x_j) + c]^d \tag{3}$$

3. Radial Basis Kernel Function, also called the Gauss Kernel Function:

$$K(x_i, x_j) = \exp\left[-\frac{\lVert x_i - x_j \rVert^2}{\sigma^2}\right] \tag{4}$$

where $\sigma$ is the radial basis radius. Substituting $g = 1/\sigma^2$ into formula (4) gives another common expression of the Gauss kernel function:

$$K(x_i, x_j) = \exp(-g \lVert x_i - x_j \rVert^2) \tag{5}$$

4. Sigmoid Kernel Function

$$K(x_i, x_j) = \tanh(\gamma (x_i \cdot x_j) + c) \tag{6}$$
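The four kernels listed above can be written directly as small functions. The following is a minimal sketch in plain Python, assuming the forms $x_i \cdot x_j$, $(\gamma\, x_i \cdot x_j + c)^d$, $\exp(-g \lVert x_i - x_j \rVert^2)$ and $\tanh(\gamma\, x_i \cdot x_j + c)$ for formulas (2), (3), (5) and (6). The function names and default parameter values are illustrative assumptions, not the settings used in the paper or the LIBSVM API.

```python
import math


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def linear_kernel(u, v):
    """Formula (2): K = u . v"""
    return dot(u, v)


def polynomial_kernel(u, v, gamma=1.0, c=1.0, d=2):
    """Formula (3): K = (gamma * u . v + c) ** d (defaults are illustrative)."""
    return (gamma * dot(u, v) + c) ** d


def rbf_kernel(u, v, g=1.0):
    """Formula (5): K = exp(-g * ||u - v||^2), with g = 1 / sigma^2."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-g * squared_distance)


def sigmoid_kernel(u, v, gamma=1.0, c=0.0):
    """Formula (6): K = tanh(gamma * u . v + c)"""
    return math.tanh(gamma * dot(u, v) + c)


if __name__ == "__main__":
    a, b = [1.0, 2.0], [3.0, 4.0]
    print(linear_kernel(a, b), polynomial_kernel(a, b))
    print(rbf_kernel(a, b, g=1.0 / 2.0 ** 2), sigmoid_kernel(a, b, gamma=0.1))
```

The radial basis kernel depends only on the distance between the two vectors, so identical inputs always give a value of 1 regardless of g.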

4 Scenario Description and Definition of the Research Problem

4.1 Data Preprocessing and Scenario Description

The main source of data for this research is the historical EMS state information of an electricity substation in the Shandong province power grid in June 2015, including meteorological temperature information. Table 1 shows examples of the contents of the meteorological table of the EMS records.

Table 2 shows records of transformer equipment state data in the EMS. Each row records, in order, the record ID, the equipment ID, the site code, the record time and the active power value.

Data pre-processing is divided into two steps: data cleaning, and data association to integrate the records into the SVM training data set. After removing redundant, abnormal and erroneous information, there are 4018 records of temperature data and 4018 records of EMS active power data.

Simple statistics give the temperature over three days shown in Fig. 1.

The active power over the corresponding time range is shown in Fig. 2.

Table 1. Examples of meteorological data in the EMS

ID  Device    Time                 Temperature  Humidity
1   StationA  2015-07-15 00:00:00  23.2         099
2   StationA  2015-07-15 00:10:00  23.5         098

Table 2. Examples of equipment state data in the EMS

ID  dev_id      dev_name   Time                 P
1   1800002459  Station C  2015-07-15 00:00:00  147.7919
2   1800002459  Station C  2015-07-15 00:05:00  150.3291
3   1800002459  Station C  2015-07-15 00:10:00  145.8889

Fig. 1. Range of temperature within 3 days

Fig. 2. Range of active power within 3 days

It can be clearly seen from the two figures that the temperature and the active power follow a relatively fixed pattern of variation. This regularity can be used as a reference for the design and construction of the training data set in the following sections.

In this scenario, we hope to develop a good regression prediction model that takes grid equipment meteorological data, environment data and state information as input and predicts the value of active power (referred to as P) at a future time. The prediction can serve as a reference for assessing the operating state of the whole grid, avoiding major accidents, making grid operation decisions, and so on. Short-term prediction is efficient and agile and can give accurate results from few samples, so it is well suited to support vector regression, which has modest demands on data size.

4.2 Problem Definition

The $n$ records of future active power (P) corresponding to the input samples are treated as $n$ dependent variables, denoted by the vector:

$$Y_n = (y_1, y_2, \ldots, y_n) \tag{7}$$

In formula (7), $n$ is the number of samples, and $y_i$ is the active power (P) at the prediction time of the $i$th input sample. The independent variable is:

$$X_n = (x_1, x_2, \ldots, x_n) \tag{8}$$

$X_n$ is an $n \times N$ matrix. Its $i$th row $x_i$ is an $N$-dimensional vector corresponding to $y_i$, where the $N$ dimensions represent $N$ kinds of grid environment factors (temperature, historical data, etc.); $N$ can cover one or more of these factors. As before, $n$ is the number of samples, so there are $n$ input samples in total. The specific selection and design of the training data set $X_n$ is discussed in the following sections.

Regression training is performed with a model $M$, which takes $X_n$ as input and $Y_n$ as the SVM labels.

For regression prediction, the trained model $M$ is applied to the input data $X_s$:

$$X_s = (x_1, x_2, \ldots, x_s) \tag{9}$$

$X_s$ is an $s \times N$ matrix with the same structure as the input $X_n$. The output is $Y_s$:

$$Y_s = (y_1, y_2, \ldots, y_s) \tag{10}$$

Each value $y_i$ in $Y_s$ is the predicted active power value corresponding to the $i$th row $x_i$ of $X_s$.


Moreover, to measure the error and analyze the results, the mean relative error $e_{MRE}$ and the root mean square error $e_{RMSE}$ are used as the basis for judging the effect of the various prediction methods. They are calculated as follows:

$$e_{MRE} = \frac{1}{N} \sum_{i=1}^{N} \left| \frac{L(i) - L'(i)}{L(i)} \right| \times 100\% \tag{11}$$

$$e_{RMSE} = \sqrt{\frac{1}{N} \sum_{i=1}^{N} \left( \frac{L(i) - L'(i)}{L(i)} \right)^2} \times 100\% \tag{12}$$

In the formulas, $L(i)$ is the actual active power value at a given time, $L'(i)$ is the predicted active power value, and $N$ here is the number of predicted values.

For each moment, we also define $e_{single}$ as the percentage error of the single prediction at that moment:

$$e_{single} = \left| \frac{L(i) - L'(i)}{L(i)} \right| \times 100\% \tag{13}$$

In this paper, we take 5% as the judging criterion: if a result has $e_{single} > 5\%$, we consider that prediction to have failed. We define the qualified rate $r$ of the active power prediction results as:

$$r = \frac{p_0}{p} \times 100\% \tag{14}$$

where $p_0$ is the number of results that satisfy $e_{single} \le 5\%$, that is, the number of qualified prediction results, and $p$ is the total number of prediction results.

In summary, we use the qualified rate $r$ of the prediction results, the mean relative error $e_{MRE}$ and the root mean square error $e_{RMSE}$ as the measures of prediction accuracy.
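The three measures can be computed together from a list of actual values and a list of predictions. The following is a minimal sketch, assuming the relative-error forms of formulas (11) to (14); the function name and the sample numbers are made up for illustration and are not results from the paper.

```python
import math


def evaluate(actual, predicted, threshold=5.0):
    """Return (e_MRE, e_RMSE, r) in percent for paired actual/predicted values."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("need two non-empty lists of equal length")
    # Relative error of every single prediction, |L(i) - L'(i)| / |L(i)|.
    rel = [abs((a - p) / a) for a, p in zip(actual, predicted)]
    n = len(rel)
    e_mre = sum(rel) / n * 100                              # formula (11)
    e_rmse = math.sqrt(sum(e * e for e in rel) / n) * 100   # formula (12)
    # A prediction is qualified when e_single (formula 13) is at most 5%.
    qualified = sum(1 for e in rel if e * 100 <= threshold)
    r = qualified / n * 100                                 # formula (14)
    return e_mre, e_rmse, r


if __name__ == "__main__":
    # Made-up values: one prediction is 2% off, the other 10% off.
    e_mre, e_rmse, r = evaluate([100.0, 200.0], [98.0, 220.0])
    print(f"e_MRE = {e_mre:.2f}%, e_RMSE = {e_rmse:.2f}%, r = {r:.0f}%")
```

Because the errors are relative to $L(i)$, an actual value of zero would cause a division by zero; the sketch does not handle that case.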

5 Experiment Process and Result of the Algorithm

5.1 Brief Introduction of the Algorithm

The algorithm of this paper is divided into four parts. The first part constructs the training set in different ways and compares the results to select the best design. The second part is the selection of the kernel function. The third part is the adjustment of the penalty factor and the kernel function parameters for the selected kernel function. The fourth part is the analysis of the experimental results.


5.2 The Selection of Training Set

The power measurement index is highly random and is influenced by many factors, so short-term active power prediction is a multi-variable regression prediction problem.

As described in the previous section, the active power $y_i$ at the predicted time point is the output value of the function, and the factors that affect $y_i$, such as historical data, temperature and meteorological information, form the input vector $x_i$ of the function. We therefore try several designs of the input vector $x_i$ and check their effect on SVM regression prediction. The specific designs are as follows.

Scheme 1 is designed as follows:

$$x_i = \{b_1, b_2, b_3, b_4\} \tag{15}$$

In the formula, $b_1, b_2, b_3, b_4$ represent the active power values of the 4 records before the target prediction time.

Scheme 2 is designed as follows:

$$x_i = \{t_1, b_1, b_2, b_3, b_4\} \tag{16}$$

In the formula, $b_1, b_2, b_3, b_4$ represent the active power values of the 4 records before the target prediction time, and $t_1$ represents the most recent temperature reading before the prediction time.

Scheme 3 is designed as follows:

$$x_i = \{h_1, h_2, b_1, b_2, b_3, b_4\} \tag{17}$$

In the formula, $b_1, b_2, b_3, b_4$ represent the active power values of the 4 records before the target prediction time. $h_1$ represents the active power value at the same time of day yesterday, and $h_2$ represents the active power value at the same time of day one week earlier.

Scheme 4 is designed as follows:

$$x_i = \{t_1, h_1, h_2, b_1, b_2, b_3, b_4\} \tag{18}$$

In the formula, $b_1, b_2, b_3, b_4$ represent the active power values of the 4 records before the target prediction time, and $t_1$ represents the most recent temperature reading before the prediction time. $h_1$ represents the active power value at the same time of day yesterday, and $h_2$ represents the active power value at the same time of day one week earlier.

For the four design schemes of the input vector, we select 5 consecutive days that have more than 1440 records to build the training data set. The default Gauss kernel function and the unmodified standard parameters are used as the configuration of the algorithm model. The model then predicts the active power value at the 288 time points of the following day.
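To make the input design concrete, the sketch below builds the Scheme 4 vector for one prediction time. It is an illustration under stated assumptions, not the authors' preprocessing code: it assumes one record every 5 minutes (288 per day, as in Table 2), a temperature series aligned index by index with the power series, and that $b_1$ is the most recent of the four previous records. The function and constant names are hypothetical.

```python
STEPS_PER_DAY = 288                 # assumes one record every 5 minutes
STEPS_PER_WEEK = 7 * STEPS_PER_DAY


def scheme4_vector(power, temperature, k):
    """Input vector [t1, h1, h2, b1, b2, b3, b4] for predicting power[k]."""
    if k < STEPS_PER_WEEK:
        raise ValueError("need at least one week of history before index k")
    t1 = temperature[k - 1]               # most recent temperature reading
    h1 = power[k - STEPS_PER_DAY]         # same time of day, yesterday
    h2 = power[k - STEPS_PER_WEEK]        # same time of day, one week earlier
    # b1..b4: the four records before the prediction time (b1 assumed most recent).
    b = [power[k - j] for j in range(1, 5)]
    return [t1, h1, h2] + b


if __name__ == "__main__":
    # Synthetic series used only to show which indices are picked.
    power = [float(i) for i in range(3000)]
    temperature = [20.0] * 3000
    print(scheme4_vector(power, temperature, 2500))
```

Schemes 1 to 3 follow by dropping entries: Scheme 1 keeps only the $b$ values, Scheme 2 adds $t_1$, and Scheme 3 uses $h_1$, $h_2$ and the $b$ values.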
