Where everything begins
When rendered in a web browser, these HTML tags are interpreted and the document objects they describe are displayed with the defined formats. Figure 2 shows the above HTML code as displayed in a web browser.

Figure 2: HTML document rendered in Internet Explorer

Two of the most important HTML tags are the <form> and <input> tags. These tags allow the user to submit input data to the web application through the web browser, for example through a login form with the fields "username:" and "password:". When the user submits the data to the web server, the browser makes a request like:

    POST /secure/login.php?app=quotations HTTP/1.1
    Host: wahh-app.com
    Content-Type: application/x-www-form-urlencoded
    Content-Length: 62
    Cookie: SESS=GTnrpx2ss2tSWSnhXJGyG0LJ47MXRsjcFM6Bd

    username=daf&password=foo&redir=/secure/home.php&submit=log+in

In this request, we notice two points:

- The method defined in the form tag is POST, so the HTTP request uses the POST method.
- Besides visible input data such as the username and password, the browser also submits a hidden item: the redir argument.

JavaScript

JavaScript is a scripting language used to enable programmatic access to objects within other applications. It is primarily used in the form of client-side JavaScript for the development of dynamic websites. It is commonly used to perform actions such as:

- Validating user input data directly on the client side
- Dynamically modifying the user interface in response to user actions, for example drop-down menus
- Querying and updating the document object model (DOM) within the browser to control the web browser's behavior

HTTP (Hypertext Transfer Protocol)

Web applications use HTTP as the primary medium of communication between client and server. HTTP is a TCP-based protocol that uses port 80 by default. The two primary HTTP operations are the request and the response. HTTP defines the mechanism by which a client requests a resource from a server; in its response, the web server returns the specified resource. These resources are identified by URIs (Uniform Resource Identifiers), and they may be of different
and varied types, from text data to multimedia data. Web clients use two primary methods to send data to the web server: POST and GET. After receiving the data, the web server processes it and returns a status code and any attached data in a response message to the web client. The protocol messages themselves are transmitted as ASCII text. In HTTP/1.0, every connection from client to server is closed immediately after each request and response pair. In HTTP/1.1, connections are not closed immediately and can be reused for later requests.

HTTP is a stateless protocol. That is, if you request a resource and receive a valid response, then request another, the server regards this as a wholly separate and unique request. The web server does not maintain anything like a session, does not otherwise attempt to maintain the integrity of a link with the client, and does not store any information about the user between requests. Developers must therefore use alternative approaches, such as cookies or session variables, to store and manage user state.

HTTP is also an ASCII text-based protocol. This works in conjunction with its simplicity to make it approachable to anyone who can read. There is no need to understand complex binary encoding schemes or use translators: everything one needs to know is available within each request and response, in cleartext.

Cookies

Cookies are the most widely used approach for adding state on top of HTTP. A cookie is a piece of text data stored on the user's computer by the web browser. Cookies store user information and configuration used by web applications on the client. The main part of a cookie is its token. It is exchanged as part of the HTTP request and response dialogue to make the client and application behave as if they were connected via a virtual circuit.

URIs

A Uniform Resource Identifier (URI) is a unique identifier for a web resource, via which that resource can be retrieved. The format of most URIs is as follows:

    protocol://hostname[:port]/[path/]file[?param=value]

Several
components in this scheme are optional, and the port number is normally included only if it diverges from the default used by the relevant protocol. For example:

    http://www.example.comm/books/search.asp?q=wahh

The above example is the absolute form of a URI. URIs may also be specified relative to a particular host, or relative to a particular path on that host, for example:

    /books/search.asp?q=wahh
    search.asp?q=wahh

These relative forms are often used in web pages to describe navigation within the web site or application itself.

    // Connect with PDO; the DSN takes the host and database name without quotes.
    $db = new PDO("mysql:host=localhost;dbname=webapp", 'root', '');
    // Named placeholders keep user input out of the SQL structure.
    $stmt = $db->prepare("SELECT username, password FROM users WHERE username = :username AND password = :password");
    $stmt->bindParam(':username', $_POST['user']);
    $stmt->bindParam(':password', $_POST['pwd']);
    // Run the query with the bound values.
    $stmt->execute();

There are some important notes about using parameterized queries in webapp development:

- They should be used for every database query. Some developers use parameterized queries only in cases where user-supplied input is clearly being used, and do not bother otherwise. This approach has been the cause of many SQL injection flaws. First, by focusing only on input that has been immediately received from the user, it is easy to overlook second-order attacks, because data that has already been processed is assumed to be trusted. Second, it is easy to make mistakes about the specific cases in which the data being handled is user-controllable. In a large application, different items of data will be held within the session or received from the client. Assumptions made by one developer may not be communicated to others. The handling of specific data items may change in the future, introducing a SQL injection flaw into previously safe queries. It is much safer to mandate the use of parameterized queries throughout the application.
- Every item of data inserted into the query should be properly parameterized. In some situations, most of the query's parameters are handled safely; however, one or two items are
concatenated directly into the string used to specify the query structure. The use of parameterized queries will not prevent SQL injection if some parameters are handled in this way.
- Parameter placeholders cannot be used to specify the table and column names used in the query. In some very rare cases, applications need to specify these items within an SQL query on the basis of user-supplied data. In this situation, the best approach is to use a white list of known good values (i.e., the list of tables and columns actually used within the database) and reject any input that does not match an item on this list. In addition, strict validation should be enforced on the user input, for example allowing only alphanumeric characters, excluding whitespace, and enforcing a suitable length limit.

3.1.2 Secure programming in XSS prevention

Despite the variety of XSS exploits, preventing the vulnerability itself is conceptually straightforward. The real obstacle to XSS prevention is identifying every instance in which input data is handled in a potentially dangerous way. Because of the complexity and user-friendliness of a modern webapp, developers must manipulate and display dozens of items of user data in each page. As a result, XSS is among the most common flaws, even in the most security-critical applications. Reflected XSS and stored XSS arise in server-side code, whereas DOM-based XSS arises in client-side code. Accordingly, there are two groups of countermeasures for preventing XSS in a webapp.

3.1.2.1 Preventing reflected XSS and stored XSS

The root cause of both reflected and stored XSS is that submitted data is copied into application responses without adequate validation and sanitization. Because the data is being inserted into the raw source code of an HTML page, malicious data can interfere with that page, modifying not only its content but also its structure: breaking out of quoted strings, opening and closing tags, injecting scripts, and so on. To eliminate reflected and
stored XSS vulnerabilities, the first step is to identify every instance in the application where user-controllable data is copied into responses. This includes data that is copied from the immediate request and also any stored data that originated from any user at any prior time, including via out-of-band channels. The best way to ensure that every instance is identified is a close review of all webapp source code.

To identify and defend all the operations that are potentially at risk of XSS, three approaches must be applied:

- Validating input
- Validating output
- Eliminating dangerous insertion points in HTML code

a. Validating input

At the point where the application receives user-supplied data that may be copied into one of its responses at any future point, the application should perform context-dependent validation of this data, in as strict a manner as possible. Potential features to validate include the following:

- That the data length is within the accepted range
- That the data contains only a certain permitted set of characters
- That the data matches a particular regular expression

Different validation rules should be applied as restrictively as possible to names, email addresses, account numbers, and so on, according to the type of data that the application expects to receive in each field.

b. Validating output

At the point where the application copies into its responses any item of data that originated from some user or third party, this data should be HTML-encoded to sanitize potentially malicious characters. HTML-encoding involves replacing literal characters with their corresponding HTML entities. This ensures that browsers will handle potentially malicious characters in a safe way, treating them as part of the content of the HTML document, not part of its structure. The HTML-encodings of the primary problematic characters are as follows:

    Character    HTML entity
    "            &quot;
    '            &#39;
    &            &amp;
    <            &lt;
    >            &gt;
    %            &#37;
    *            &#42;

The following is demo code written
in Java that implements this idea:

    public static String HTMLEncode(String s) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            // Encode non-ASCII characters and the main HTML metacharacters as numeric entities.
            if (c > 0x7f || c == '"' || c == '&' || c == '<' || c == '>') {
                out.append("&#" + (int) c + ";");
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

A common mistake made by developers is to HTML-encode only the characters that immediately appear to be of use to an attacker in the specific context. For example, if an item is being inserted into a double-quoted string, the application might encode only the " character; if the item is being inserted unquoted into a tag, it might encode only the > character. This approach considerably increases the risk of bypasses being found. An attacker can often exploit browsers' tolerance of invalid HTML and JavaScript to change context or inject code in unexpected ways. Further, it is often possible to span an attack across multiple controllable fields, exploiting the different filtering being employed in each one. A far more robust approach is to always HTML-encode every character that may be of potential use to an attacker, regardless of the context where it is being inserted. To provide the highest possible level of assurance, developers may elect to HTML-encode every non-alphanumeric character, including whitespace. This approach normally imposes no measurable overhead on the application, and presents a severe obstacle to any kind of filter bypass attack.

The reason for combining input validation and output sanitization is that this involves two layers of defense, either one of which will provide some protection if the other one fails. As you have seen, many filters which perform input and output validation are subject to bypasses. By employing both techniques, the application gains some additional assurance that an attacker will be defeated even if one of its two filters is found to be defective. Of the two defenses, output validation is the most important
and is absolutely mandatory. Performing strict input validation should be viewed as a secondary failover.

Of course, when devising the input and output validation logic itself, great care should be taken to avoid any vulnerabilities that lead to bypasses. In particular, filtering and encoding should be carried out after any relevant canonicalization, and the data should not be further canonicalized afterwards. The application should also ensure that the presence of any null bytes does not interfere with its validation.

c. Eliminating dangerous insertion points

There are some locations within the application page where it is inherently too dangerous to insert user-supplied input, and developers should look for an alternative means of implementing the desired functionality.

Inserting user-controllable data directly into existing JavaScript should be avoided wherever possible. When applications attempt to do this safely, it is frequently possible to bypass their defensive filters. Once an attacker has taken control of the context of the data he controls, he typically needs to perform minimal work to inject arbitrary script commands and so perform malicious actions.

A second location where user input should not be inserted is any other context in which JavaScript commands may appear directly. In these situations, an attacker can proceed directly to injecting JavaScript commands within the quoted string. Further, the defense of HTML-encoding the user data may not be effective, because some browsers will HTML-decode the contents of the quoted string before it is processed.

A further situation to avoid is one where an attacker can manipulate the encoding type of the application response, either by injecting into a relevant directive or because the application uses a request parameter to specify the preferred encoding type. In this situation, input and output filters that are well designed in other respects may fail because the attacker's
input is encoded in an unusual form that the filters do not recognize as potentially malicious. Wherever possible, the application should explicitly specify an encoding type in its response headers, disallow any means of modifying this, and ensure that its XSS filters are compatible with it. For example:

    Content-Type: text/html; charset=ISO-8859-1

3.1.2.2 Preventing DOM-based XSS

DOM-based XSS is related to client-side script, so the techniques above do not apply directly to DOM-based XSS prevention. Wherever possible, applications should avoid using client-side scripts to process DOM data and insert it into the page. Because the data being processed is outside of the server's direct control, and in some cases even outside of its visibility, this behavior is risky. If using client-side scripts in this way is unavoidable, DOM-based XSS flaws can be prevented through two types of defense, corresponding to the input and output validation described for reflected XSS.

a. Validating input

This is similar to the methods described in the previous section. In addition to the client-side controls, strict rules are deployed on the server side to detect malicious URLs that exploit DOM-based XSS.

b. Validating output

As with reflected XSS flaws, applications can perform HTML-encoding of user-controllable DOM data before it is inserted into the document. This enables all kinds of potentially dangerous characters and expressions to be displayed within the page in a safe way. HTML-encoding can be implemented in client-side JavaScript with a function like the following:

    function sanitize(str) {
        // Let the browser encode the text by placing it in a text node.
        var d = document.createElement('div');
        d.appendChild(document.createTextNode(str));
        return d.innerHTML;
    }

3.1.3 Secure programming in malicious file execution prevention

In contrast to the previous vulnerabilities, whose prevention requires a careful mix of methods, malicious file execution prevention is simple and direct. The following steps must be
implemented in order to address malicious file execution flaws:

- Before including any file, developers must check it against a whitelist of accepted files and deny any file that is not on this list. This step prevents users from including system files such as password files and log files.
- Use read-only functions, such as file_get_contents() in PHP or similar functions, when working with files that only need to be read (log files, content files), because these functions read the content of a file without executing any commands in it.
- Use absolute file paths instead of relative paths when using inclusion functions.
- Always initialize the value of every variable. Because of their flexibility, scripting languages generally allow a variable to be used anywhere in the code without being defined first. This makes it easy to leave an existing variable without a set value, so the webapp cannot completely control all of its input data.
- Handle errors strictly.

For example, this is good code for preventing malicious file execution:

3.2 HARDENING WEB APPLICATION SYSTEM

Although applying secure programming in the development phase of each webapp is the simplest and most effective defense against SQL injection, malicious file execution and XSS, in the insecure environment of today's internet it is not enough to ensure that an application is solid. An application that is judged safe against the previous flaws can still be compromised if it is operated in a shared environment with other insecure applications. Hackers can attack the safer application through the less secure ones. Moreover, exploits of these flaws are becoming more varied and complex, so developers and webapp owners are always in a passive position when trying to prevent them. To minimize the effect of possible risks to webapps, hardening the application system is essential. Hardening consists of three tasks:

- Securing the database server
- Securing the execution environment
- Securing the web server

3.2.1 Securing database server

Because it contains important
and necessary information for each web application, the database is always the most sensitive component. To make deployment and installation easy, almost all web applications store the account name and password for database connections in plain text, without encryption. This is a weakness that can lead to data loss or the collapse of the entire system. To secure the database, three tactics must be applied:

- Separate the web server and the database server completely at the physical layer, instead of using one physical server for both. This noticeably limits local privilege escalation attacks.
- Give each web application its own separate account for accessing the database server, with the minimum privileges that are enough for the webapp to operate normally. In general, the application does not need DBA-level permissions; it normally only needs to read and write its own data. In security-critical situations, the application may employ a different database account for performing different actions. For example, if 90% of its database queries only require read access, then these can be performed using an account which does not have write privileges. If a particular query only needs to read a subset of data (for example, the orders table, but not the user accounts table), then an account with the corresponding level of access can be used. If this approach is enforced throughout the application, then any residual SQL injection flaws that may exist are likely to have their impact significantly reduced. For example, in MySQL, to grant a user specific privileges on a specified database from a single IP address, we can use the following command:

      GRANT SELECT, INSERT, UPDATE, DELETE ON test.* TO 'duongpt'@'127.0.0.1' IDENTIFIED BY 'c0m3On';

- Many enterprise databases include a huge amount of default functionality that can be leveraged by an attacker who gains the ability to execute arbitrary SQL statements. Wherever possible, unnecessary functions should be removed or disabled. Even though
there are cases where a skilled and determined attacker may be able to recreate some required functions through other means, this task is not usually straightforward, and the database hardening will still place significant obstacles in the way of the attacker.

3.2.2 Securing execution environment

The execution environment is the particular interpreter, compiler or platform on which a webapp runs, such as PHP, Zend Framework or Java. It is the middle layer between the webapp and the operating system. So, if a hacker gains full control of a web application, he may be able to compromise the system underneath it.

- For general-purpose use, almost all platforms and compilers provide system functions that interact with the low-level system: the file system, services, processes, program execution, opening sockets to outbound networks, or working with remote files. Depending on the requirements of each application, the system admin needs to turn off the functions that are not needed in the platform configuration.
- In some cases, an attacker is only able to attack a webapp after gathering useful information from errors raised while the application runs. So the error display feature needs to be disabled on production servers.

The example below shows PHP configuration settings that should be adjusted in order to enhance the security of both the server and the existing applications:

    ; Compare the ID of the user who owns a resource on the machine with the ID of the process that is about to interact with that resource
    safe_mode=On
    ; Disable unnecessary system functions
    disable_functions=exec,system,passthru,shell_exec,escapeshellarg,escapeshellcmd,proc_close,proc_open,ini_alter,dl,popen,popen,pcntl_exec,socket_accept,socket_bind,socket_clear_error,socket_close,socket_connect,socket_create_listen,socket_create_pair,socket_create,socket_get_option,socket_getpeername,socket_getsockname,socket_last_error,socket_listen,socket_read,socket_recv,socket_recvfrom,socket_select,socket_send,socket_sendto,socket_set_block,socket_set_nonblock,socket_set_option,socket_shutdown,socket_strerror,socket_write,stream_socket_client,stream_socket_server,pfsockopen,stream_set_timeout,disk_total_space,disk_free_space,chown,diskfreespace,getrusage,get_current_user,set_time_limit,getmyuid,getmypid,dl,leak,listen,chgrp,link,symlink,dlopen,proc_nice,proc_get_stats,proc_terminate,shell_exec,sh2_exec,posix_getpwuid,posix_getgrgid,posix_kill,ini_restore,mkfifo,dbmopen,dbase_open,filepro,filepro_rowcount,posix_mkfifo,putenv,sleep,phpinfo
    ; Do not allow including remote files
    allow_url_include=Off
    ; Allow opening remote files
    allow_url_fopen=On
    ; Disable the global variables feature
    register_globals=Off
    ; Disable error display
    display_errors=Off
    ; Disable automatic insertion of a slash before dangerous characters
    magic_quotes_gpc=Off

3.2.3 Securing Web server

- The web server mediates and manages HTTP requests and responses, and should do so stably and securely. Because it is also the first level of the application layer to receive HTTP requests, using filters at this level to block malicious requests can significantly improve the performance of web applications, since the applications no longer have to handle every incoming request. In addition, for known attack patterns based on common security holes such as SQL injection, XSS or malicious file execution, filtering at the web server level is very effective at protecting insecure web applications against "script-kiddie" attackers. Today, there are software packages that implement this feature, such as mod_security for Apache HTTPD or WebKnight for the Microsoft IIS web server.
- Place each web application directory in a separate area. On Unix servers, the system admin can run the interpreter for each application under its own account. For example, on a Linux server running PHP and hosting many web applications, the system admin can specify a unique account that owns an individual web application, and the PHP processes for that application will run under this account ID.
- Protect important files and folders by using the authentication feature of the web server. For
example, on a web server such as Apache HTTPD, an .htaccess file with the following content requires the user to log in before accessing the file admin.php:

      AuthUserFile /home/example/public_html/.htpasswd
      AuthGroupFile /dev/null
      AuthName "Authorization Required"
      AuthType Basic
      require user admin

- Do not use unnecessary modules. For the sake of extensibility, each web server supports many modules that provide extra features. But some extra modules are vulnerable and easy to exploit, and through them attackers can compromise the entire system. Depending on the particular requirements of a web application system, disable the modules that are not needed. In systems that are short of power and hardware resources, this tactic also improves the performance of the entire system, because the system does not waste resources on unnecessary work.

3.3 EXTRA METHODS

Besides the methods described above, the following work should be done in order to improve the security and stability of any web application system.

a. For the most security-critical applications, security testing must be applied during the testing phase, alongside other testing methods such as system testing and integration testing. This step helps ensure that the application has no security problems related to known vulnerabilities.

b. Audit the entire application system periodically (source code auditing, database assessment, and so on) to detect and repair unknown errors early. The errors may be 0-day vulnerabilities or human mistakes. Even a small mistake can cause a disaster. In addition, before deploying any application into an existing system, it must be ensured that this application is not vulnerable to known security flaws. This should be part of the security policy of any organization because, as described previously, when the intended target is impossible to attack directly, attackers often switch to other web applications that are less secure or more vulnerable. Once the second target is defeated, attackers are able to escalate the attack to compromise their original target. For this reason, this step must be implemented in any system. Many enterprises in which security is part of the business have a security team whose job is to audit existing systems and find security holes in them.

c. Focus on deploying and maintaining audit logs for every component in the web application system. These audit logs are very useful both for detecting attack patterns against the system early and for finding solutions to repair it. Based on these logs, the system administrator can detect strange requests from users: requests with strange patterns, many identical requests focused on a specific resource on the server, or authenticated access from an unknown location. From this evidence, they can identify the weak points in their system and fix them quickly. Audit logs can also be very useful for investigation when an incident occurs. Following such an incident, effective audit logs should enable the application's owners to understand exactly what has taken place, which vulnerabilities (if any) were exploited, whether the attacker gained unauthorized access to data or performed any unauthorized actions, and, as far as possible, provide evidence as to the intruder's identity. However, log files are usually very large, which is an obstacle to monitoring and analysis. Depending on the characteristics of the system, owners should find a suitable solution to manage them.

d. Monitor and plan regular updates of the important patches for every component. Every day, many new security flaws are discovered in existing system software. Software vendors release hotfixes or patches to fix these holes. But vendors often bundle new features into these patches, and these can interfere with the operation of the existing system. So, to keep the system safe against known flaws, the system admin should plan updates carefully, predict the changes after updating, assess the possible risks, and decide whether or not to update.

e. Deploy other security software such as firewalls and intrusion detection/prevention systems (IDS/IPS). These solutions are often deployed on the perimeter network to control and handle all access to the system. They are additional measures that enhance the security of any network and system.

f. Plan for business continuity in the face of future incidents or disasters. This includes backup strategies and disaster response procedures. It helps reduce the downtime of any system.

CONCLUSION

In conclusion, the main goals of my thesis are as follows.

The first goal is to research documents in order to summarize the following topics systematically:

- Web applications and their architecture as currently used in real life
- Some common security vulnerabilities in web applications: SQL injection, Cross-Site Scripting and malicious file execution. For each flaw, I presented the description, the cause, the classification and some demonstration exploits.

From this research, and based on some information security principles, I propose some solutions to prevent these vulnerabilities. They are spread through the depth of the system: secure programming strategies during the development phase, and hardening of the operating environment of any web application system. I also propose some extra ideas to improve the security and stability of web applications.
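As a compact illustration of three of the secure programming ideas covered in this document (strict input validation, HTML-encoding of output, and whitelist checks for file names), the following JavaScript is a minimal sketch, not a definitive implementation. The function names htmlEncode, isValidUsername and pickAllowed, the 3 to 20 character limit, and the file names in the usage note are assumptions made for this example and do not come from the thesis.

```javascript
// Hypothetical helpers for illustration only; the names are not from the thesis.

// Characters that can change the structure of an HTML page, with their entities.
const HTML_ENTITIES = {
  "&": "&amp;",
  "<": "&lt;",
  ">": "&gt;",
  '"': "&quot;",
  "'": "&#39;",
};

// Output validation: HTML-encode untrusted data before copying it into a page.
function htmlEncode(value) {
  return String(value).replace(/[&<>"']/g, (ch) => HTML_ENTITIES[ch]);
}

// Input validation: accept only alphanumeric names within a length limit.
// The 3 to 20 character range is an assumed policy for this sketch.
function isValidUsername(value) {
  return typeof value === "string" && /^[A-Za-z0-9]{3,20}$/.test(value);
}

// Whitelist check: return the requested item only if it is on the list of
// known good values, otherwise fall back to a safe default.
function pickAllowed(requested, allowedValues, fallback) {
  return allowedValues.includes(requested) ? requested : fallback;
}
```

For instance, pickAllowed(userSuppliedPage, ["home.php", "news.php"], "home.php") would never return a path such as /etc/passwd, and htmlEncode would turn a submitted script tag into inert text. The encoder here covers only five characters; a stricter policy, such as encoding every non-alphanumeric character, would extend the same pattern.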