Solution for load balancing in Local network with Squid
VIETNAM NATIONAL UNIVERSITY, HANOI
UNIVERSITY OF ENGINEERING AND TECHNOLOGY

ENTRY FOR THE "STUDENT SCIENTIFIC RESEARCH" AWARD, 2012

Project title: Solution for load balancing in Local network with Squid
Student: Lê Trí Thái, class K53CA-KHMT
Faculty: Information Technology
Supervisor: MSc. Đoàn Minh Phương

HANOI, 2012

The introduction

As efforts to upgrade server hardware reach their limits in processor speed, memory capacity, and connectivity, people start looking for a solution that is cheap, easy to use, easy to upgrade, manageable, and easy to maintain. A further problem is how to divide the load so that the existing system works better before it is upgraded.

Contents

The introduction
Chapter 1: Some load balancing algorithms
  1.1 Round robin
  1.2 Least connection
  1.3 Source IP
  1.4 URI
Chapter 2: Web server load balancing model
  2.1 Web servers and the necessity of load balancing
  2.2 Web server load balancing system
Chapter 3: Load balancing system for Squid proxies
  3.1 Introduction to Squid proxy – Web cache proxy
  3.2 Solution for Squid proxy load balancing using HAProxy
  3.3 The deployment model
Chapter 4: Experimental results and error analysis
  4.1 Introduction to Apache Bench (ab)
  4.2 Results of using "ab" for HAProxy and standalone Squid
  4.3 Experiment using the web with proxy configuration
Chapter 5: Deployability and extensibility
References

Chapter 1: Some load balancing algorithms

1.1 Round robin

Round robin is the simplest load balancing solution. Requests are handed to each server in turn. Each server receives a share of the requests according to its configured capacity: a server with more memory and a faster CPU is assigned a higher load than a server with less memory and a slower CPU. If the servers have the same configuration, the load is divided equally among them.
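The rotation with capacity-based shares described in section 1.1 can be sketched in a few lines of Python. This is only an illustrative sketch, not how HAProxy implements round robin internally: the server names `big` and `small` and their weights are hypothetical, and the scheduler simply repeats each server once per unit of weight.

```python
from itertools import cycle


def weighted_round_robin(weights):
    """Return an endless iterator over server names, in rotation.

    `weights` maps a (hypothetical) server name to an integer weight;
    a server with weight 2 appears twice per rotation.
    """
    # Expand each server into `weight` slots, then rotate through the slots forever.
    slots = [name for name, weight in weights.items() for _ in range(weight)]
    return cycle(slots)


# Assumed example: "big" has twice the capacity of "small".
scheduler = weighted_round_robin({"big": 2, "small": 1})
first_six = [next(scheduler) for _ in range(6)]
print(first_six)  # ['big', 'big', 'small', 'big', 'big', 'small']
```

With equal weights the same function degenerates to plain round robin, which matches the case of identically configured servers.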
1.2 Least connection

The server with the fewest active connections receives the next connection. This method suits protocols with long-lived connections such as LDAP and SQL, but it is not well suited to protocols with short connections such as HTTP. This type of load balancing also gives priority to more powerful servers with better processing capability and more memory.

1.3 Source IP

The user's IP address is hashed and divided by the total weight of the servers, and the result designates the server that will handle connections from that address. This ensures that a user with a given IP address is always handled by the same server. However, this method is not suitable when addresses are assigned automatically by DHCP, because the same user may receive a different address in a different session.

1.4 URI

Each web request contains a URI, made up of the host name and the path behind it. The URI algorithm hashes the URI and divides the hash by the total weight of the servers. As a result, a given request is always handled by the same server. This algorithm is most useful when sharing load among web cache proxies.

Chapter 2: Web server load balancing model

2.1 Web servers and the necessity of load balancing

In the early days, as the Internet was just expanding and gaining popularity, web services were simple and required little server processing power. As people discovered the immense possibilities of the Internet and used it more, servers had to be upgraded step by step to carry the growing load. The common ways to do so are replacing the processor with a faster one, adding RAM, and adding hard drives for more storage. However, a server can only be upgraded so far: the number of RAM slots, the capacity of each RAM module, the number of hard drive bays, and the capacity of each hard drive are all limited. People therefore began to add more servers to carry the load. Each server receives the requests assigned to it by the specified load balancing algorithm.
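To illustrate how a load balancer might assign each request to a server, the least-connection rule of section 1.2 and the source-IP rule of section 1.3 can be sketched in Python. This is a minimal sketch under stated assumptions: the server names, connection counts, and client address are illustrative values, all servers are assumed to have equal weight, and SHA-256 is used as a stand-in hash (it is not the hash function HAProxy uses).

```python
import hashlib


def pick_least_connections(active):
    """Return the server that currently has the fewest open connections."""
    return min(active, key=active.get)


def pick_by_source_ip(client_ip, servers):
    """Hash the client address and reduce it modulo the number of servers.

    Equal weights are assumed, so the "total weight" is just len(servers).
    """
    digest = hashlib.sha256(client_ip.encode("ascii")).digest()
    return servers[int.from_bytes(digest[:8], "big") % len(servers)]


# Illustrative values only.
servers = ["slave1", "slave2"]
active = {"slave1": 12, "slave2": 7}

print(pick_least_connections(active))              # slave2
print(pick_by_source_ip("10.10.10.102", servers))  # same server on every call
```

The second function shows why source-IP balancing breaks down under DHCP: the choice depends only on the address string, so a user who returns with a different address may land on a different server.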
2.2 Web server load balancing system

A simple web load balancing system consists of a machine that receives requests and shares out the load, and a set of servers behind it that take requests from the load balancer and return results to the user. The system becomes more complex when firewalls and NAT for the backend servers are required. Results can be returned directly to the user or passed back through the load balancer.

The load balancer can be a dedicated machine, specially designed to perform load balancing at the hardware level. Such a machine shares load well but has limited ability to process strings such as packet contents or request contents. Alternatively, the load balancer can be an ordinary machine running load balancing software. Such a machine handles requests more flexibly, is programmable, can have other applications installed, and makes distributing load easier than a hardware device does.

Chapter 3: Load balancing system for Squid proxies

3.1 Introduction to Squid proxy – Web cache proxy

Squid is a caching proxy for the Web supporting HTTP, HTTPS, FTP, and more. It reduces bandwidth and improves response times by caching and reusing frequently-requested web pages. Squid has extensive access controls and makes a great server accelerator. It runs on most available operating systems, including Windows, and is licensed under the GNU GPL.

"[The Squid systems] are currently running at a hit-rate of approximately 75%, effectively quadrupling the capacity of the Apache servers behind them. This is particularly noticeable when a large surge of traffic arrives directed to a particular page via a web link from another site, as the caching efficiency for that page will be nearly 100%." - Wikimedia Deployment Information.
3.2 Solution for Squid proxy load balancing using HAProxy

Combining the load balancing ability of HAProxy with the caching ability of Squid, the proposed solution uses the URI load balancing algorithm on HAProxy and Squid as the web cache.

Many load balancing systems address the web server problem. The problem here is different: a huge number of requests go from one organization out to the Internet, and the bandwidth used and the large number of simultaneous connections mean that the Internet link cannot meet user needs.

By using Squid as a web cache, we can store a certain amount of web content. As the storage capacity increases, the effective hit rate increases, but Squid cannot cache all data on the Internet, so frequently requested content is given priority in the cache. When the number of simultaneous connections exceeds what one Squid can serve, we add another Squid server; this increases the total cache size and, at the same time, the number of simultaneous connections the Squid web cache system can handle.

In addition, to increase the hit rate of the Squid system, we use HAProxy with the URI load balancing algorithm described above. Requests are divided evenly among the servers, and every request for a given URI always goes to the same server. This reduces the chance that several servers keep copies of the same object. A web page does not consist of a single URI; its URIs are spread evenly across the Squid proxies. Compared with sending all of a page's requests to one Squid, sending each URI to the Squid that holds it shares connections evenly among the servers, and the response time improves accordingly.

Using HAProxy also makes Internet access more convenient. Instead of the administrator having to tell each group of users the IP address of its Squid proxy, users only need to know the address of HAProxy. HAProxy can direct each group of client IP addresses to a particular server cluster.
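The effect of URI-based balancing on a pair of caches can be illustrated with a short Python sketch. It is a sketch under stated assumptions, not HAProxy's implementation: the two addresses are the Squid machines from the deployment model in section 3.3, SHA-256 is a stand-in for HAProxy's own hash, the servers are assumed to have equal weight, and the `example.com` URIs are made up for the demonstration.

```python
import hashlib
from collections import Counter

# Squid machines from section 3.3 (Slave1 and Slave2).
SQUIDS = ["172.16.1.201", "172.16.1.202"]


def squid_for(uri, servers=SQUIDS):
    """Map a URI to one Squid: hash the URI, then take it modulo the server count."""
    digest = hashlib.sha256(uri.encode("utf-8")).digest()
    return servers[int.from_bytes(digest[:8], "big") % len(servers)]


# Hypothetical URIs standing in for the objects of many web pages.
uris = [f"http://example.com/page/{i}" for i in range(1000)]
spread = Counter(squid_for(u) for u in uris)

# Each URI always maps to the same Squid, so only that Squid has to cache it,
# while the URIs as a whole are spread over both machines.
print(spread)
```

Because the mapping depends only on the URI, an object is cached on one Squid rather than on both, which is the mechanism by which this scheme is expected to raise the combined hit rate.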
Regular users (in this case, students) are directed to the cluster for ordinary users, and other user groups are directed to another cluster. This saves users and administrators much effort in both setup and daily use.

3.3 The deployment model

The deployment model was tested on virtual machines: a master machine running the HAProxy service, two slave machines (Slave1 and Slave2) running the Squid service, and client machines pointing to the master machine.

Packages needed:
  Haproxy-1.4.20
  Rsyslog
  Squid-3.1
  Vmware-7.0
  CentOS 6.0
  Firefox-11

Master, running the HAProxy service:
  Eth0: 10.10.10.2/24 gw 172.16.1.151
  Eth1: 172.16.1.151/24

Slave1, running the Squid service:
  Eth0: 172.16.1.201/24 gw 192.168.1.102
  Eth1: 192.168.1.102/24 gw 192.168.1.254

Slave2, running the Squid service:
  Eth0: 172.16.1.202/24 gw 192.168.1.103
  Eth1: 192.168.1.103/24 gw 192.168.1.254

Client, running Firefox with the proxy set to 10.10.10.2:8080:
  Eth0: 10.10.10.102/24 gw 10.10.10.2

Chapter 4: Experimental results and error analysis

4.1 Introduction to Apache Bench (ab)

Apache HTTP server benchmarking tool (ab) is a tool for benchmarking your Apache Hypertext Transfer Protocol (HTTP) server. It is designed to give you an impression of how your current Apache installation performs. This especially shows you how many requests per second your Apache installation is capable of serving.

4.2 Results of using "ab" for HAProxy and standalone Squid

Below are the results of running Apache Bench against HAProxy and against Squid, using the test sites http://us.24h.com.vn/ and http://dantri.com.vn/.

HAProxy running URI load balancing, test page: http://us.24h.com.vn/
HAProxy running round-robin load balancing, test page: http://us.24h.com.vn/
[...]
HAProxy running round-robin load balancing, test page: http://dantri.com.vn/
Squid1 receiving requests directly from the client, test page: http://dantri.com.vn/
Squid2 receiving requests directly from the client, test page: http://dantri.com.vn/

Checking the log files

With the page http://us.24h.com.vn/, for the following services: HAProxy
With http://dantri.com.vn/: Squid1 log file, Squid2 log file

4.3 Experiment using the web with proxy configuration

HAProxy configuration

In this configuration, the options of interest are clitimeout = 3000ms and srvtimeout = 4000ms. When ab is run against each Squid server separately, 100% of the requests return correctly. In this case, however, when a request takes longer than 4000ms to return, which is the time allowed in HAProxy, HAProxy considers the request [...] lower than when Squid is used alone. However, waiting 4000ms for a result from Squid is already an uncomfortable time for many users. This also explains why the tests on the two Squid machines with http://dantri.com.vn/ take longer than the test on HAProxy.

In particular, when testing with http://us.24h.com.vn/, we obtain a high data transfer rate of 1329.01 Kbytes/sec, while the speed of the Internet connection is up [...] sec ~ 500 Kbytes/sec. This result arises because http://us.24h.com.vn/ has been cached by Squid. Compared with the round-robin case, when the power of the two servers is combined, the execution time is half that of the previous experiment. However, some requests fail because Squid takes longer than the allowed 4000ms to respond. Looking at the log files of Squid and HAProxy, we [...] also found one more thing: when HAProxy does not receive the result from Squid within the allowed time (here 4000ms), Squid still receives a correct packet, but HAProxy treats it as an invalid packet.

4.4 A number of other test results

As well as using ab to test the response capability of the system, we measured page load time using the YSlow Firebug add-on in the Firefox browser [...] tests showed that Squid has a longer response time than expected, while HAProxy operates very stably. The times for successive reloads of the page http://us.24h.com.vn/ were 6.168s, 5.45s, 5.354s, and 5.099s. The reduction in page load time is partly because Squid keeps some elements in its memory and partly because of the web browser's own cache; in addition, the division of requests between the two Squid servers should [...]

[...] virtual machine, the capability of the machine running the service is limited by the capability of the real machine, and the maximum number of connections is limited compared with a modern network card, so the capacity in a real deployment can be much larger. This model can satisfy the needs of the organization. It is currently being tested at the VNUnet center; after these tests, it can be put to use for the VNUnet server system, and it is capable of replacing modern and very expensive load balancing systems such as Citrix or Barracuda.

References

[1] Chandra Kopparapu, Load Balancing Servers, Firewalls, and Caches
[2] Matthew Syme, Optimizing Network Performance with Content Switching
[3] Kulbir Saini, Squid Proxy Server 3.1 Beginner's Guide
[4] Tony Bourke, Server Load Balancing
[5] RFC 2616, HTTP/1.1, http://www.w3.org/Protocols/rfc2616/rfc2616.html
[6] HAProxy configuration 1.4, http://haproxy.1wt.eu/download/1.4/doc/configuration.txt
[7] Load balancing (computing), Wikipedia, the free encyclopedia
[8] Squid, http://www.squid-cache.org/