
Hack Proofing Your Web Applications (hackapps book), part 5



Chapter 6 • Code Auditing and Reverse Engineering

automatic variable expansion or garbage collection exists to make your life easier.

NOTE
Technically, various C++ classes do handle automatic variable expansion (making the variable larger when there’s too much data to put in it) and garbage collection. But such classes are not really standard and vary widely in features. C does not use such classes.

C/C++ can prove mighty challenging for you to thoroughly audit, due to the extensive control an application has and the number of things that could potentially go wrong. My best advice is to take a deep breath and plow forth, tackling as much as you can in the process.

Reviewing ColdFusion

ColdFusion is an inline HTML embedded scripting language by Allaire. Similar to JSP, ColdFusion scripting looks much like HTML tags; therefore, you need to be careful you don’t overlook anything nestled away inside what appears to be benign HTML markup.

ColdFusion is a highly database-centric language: its core functionality consists mostly of database access, formatted record output, and light string manipulation and calculation. But ColdFusion is extensible via various means (Java beans, external programs, objects, and so on), so you must always keep tabs on what external functionality ColdFusion scripts may be using. You can find more information on ColdFusion in Chapter 10.

Looking for Vulnerabilities

What follows is a collection of problem areas and the specific ways you can look for them. The majority of the problem areas are based on a single principle: use of a function that interacts with user-supplied data.

Realistically, you will want to look at every such function, but doing so may require too much time. So we have compiled a list of the “higher risk” functions that remote attackers have been known to use to take advantage of Web applications.
Because the attacker will masquerade as a user, we only need to look at areas in the code that are influenced by the user. However, you also have to consider other untrusted sources of input into your program that influence program execution: external databases, third-party input, stored session data, and so on. You must consider that another poorly coded application may insert tainted SQL data into a database, which your application would be unfortunate enough to read and potentially be vulnerable to.

Getting the Data from the User

Before we start tracing problems in reverse, the first (and most important, in my opinion) step is to zoom directly to the section of code that accepts the user’s data. Ideally, all data collection from the user is centralized into one spot; in practice, however, bits and pieces may be received from the user as the application progresses (typical of interactive applications). Centralizing all user data input into one section (or a single routine) serves two important functions: it allows you to see exactly what pieces of data are accepted from a user and what variables the program puts them in, and it allows you to centrally filter incoming user data for illegal values.

For any language, first check to see if any of the incoming user data is put through any type of filtering or sanity checks. Ideally, all data input is done at a central location, with the filtering/checking done immediately thereafter. The more fragmented an application’s approach to filtering becomes, the greater the chance that a variable containing user data will be left out of the filtering mechanism(s). Also, knowing ahead of time which variables contain user-supplied data simplifies following the flow of user data through a program.
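As a rough illustration of centralized filtering, the sketch below checks an incoming value against an allow-list in a single routine. It is only a sketch under assumptions: the function name input_is_clean(), the max_len parameter, and the choice of letters, digits, and underscore as the legal character set are all hypothetical and are not taken from any particular application.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical central filter (illustration only): accepts values of
 * 1..max_len characters drawn from [A-Za-z0-9_].
 * Returns 1 if the value passes, 0 otherwise. */
int input_is_clean(const char *value, size_t max_len)
{
    size_t i;

    if (value == NULL || value[0] == '\0')
        return 0;                           /* missing or empty input */

    for (i = 0; value[i] != '\0'; i++) {
        if (i >= max_len)
            return 0;                       /* longer than the caller allows */
        if (!isalnum((unsigned char)value[i]) && value[i] != '_')
            return 0;                       /* character not on the allow-list */
    }
    return 1;
}
```

A routine along these lines would be called once for each field as soon as it is received, so that the rest of the program only ever sees values that passed the check.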
NOTE
Perl refers to any variable (and thus any command using that variable) containing user data as “tainted.” Thus, a variable is tainted until it is run through a proper filter/validity check. We will use the term tainted throughout the chapter. Perl actually has an official “taint” mode, activated by the -T command line switch. When activated, the Perl interpreter will abort the program when a tainted variable is used in a potentially unsafe operation. Perl programmers should consider using this handy security feature.

Looking for Buffer Overflows

Buffer overflows are one of the top flaws for exploitation on the Internet today. A buffer overflow occurs when a particular operation/function writes more data into a variable (which is actually just a place in memory) than the variable was designed to hold. The result is that the data starts overwriting other memory locations without the computer knowing those locations have been tampered with. To make matters worse, some hardware architectures (such as Intel and Sparc) use the stack (a place in memory for variable storage) to store function return addresses. Thus, the problem is that a buffer overflow will overwrite these return addresses, and the computer, not knowing any better, will still attempt to use them. If the attacker is skilled enough to precisely control what values the return pointers are overwritten with, they can control the computer’s next operation(s).

The two flavors of buffer overflows referred to today are “stack” and “heap.” Local variable storage (non-static variables defined within a function) is referred to as “stack” because such variables are actually stored on the stack in memory.
Heap data is the memory that is dynamically allocated at runtime, such as by C’s malloc() function. This data is not actually stored on the stack, but somewhere amidst a giant “heap” of temporary, disposable memory used specifically for this purpose. Actually exploiting a heap buffer overflow is a lot more involved, because there are no convenient frame pointers (as there are on the stack) to overwrite.

Luckily, however, buffer overflows are only a problem with languages that must predeclare their variable storage sizes (such as C and C++). ASP, Perl, and Python all have dynamic variable allocation: the language interpreter itself handles the variable sizes. This is rather handy, because it makes buffer overflows a moot issue (the language will increase the size of the variable if there’s too much data). But C and C++ are still widely used languages (especially in the Unix world), and therefore buffer overflows are not likely to disappear anytime soon.

NOTE
More information on regular buffer overflows can be found in an article by Aleph1 entitled Smashing the Stack for Fun and Profit. A copy is available online at www.insecure.org/stf/smashstack.txt. Information on heap buffer overflows can be found in the “Heap Buffer Overflow Tutorial” by Shok, available at www.w00w00.org/files/articles/heaptut.txt.

The str* Family of Functions

The str* family of functions (strcpy(), strcat(), and so on) are the most notorious: they all will copy data into a variable with no regard to the variable’s length. Typically these functions take a source (the original data) and copy it to a destination (the variable).

In C/C++, you have to check all uses of the following functions: strcpy(), strcat(), strcadd(), strccpy(), streadd(), strecpy(), and strtrns(). Determine if any of the source data incorporates user-submitted data, which could be used to cause a buffer overflow.
If the source data does include user-submitted data, you must ensure that the maximum length/size of the source (data) is smaller than the destination (variable) size.

If it appears that the source data is larger than the destination variable, you should then trace the exact origin of the source data to determine if the user could potentially use this to his advantage (by giving arbitrary data used to cause a buffer overflow).

The strn* Family of Functions

A safer alternative to the str* family of functions is the strn* family (strncpy(), strncat(), and so on). These are essentially the same as the str* family except they allow you to specify a maximum length (or a number, hence the n in the function name). Properly used, these functions specify the source (data), destination (variable), and maximum number of bytes, which must be no more than the size of the destination variable! Therein lies the danger: Many people believe these functions to be foolproof against buffer overflows; however, buffer overflows are still possible if the maximum number specified is still larger than the destination variable.

In C/C++, look for the use of strncpy() and strncat(). You need to check that the specified maximum value is equal to or less than the destination variable size; otherwise, the function is prone to potential overflow just like the str* family of functions discussed in the preceding section.

NOTE
Technically, any function that allows for a maximum limit to be specified should be checked to ensure that the maximum limit isn’t set higher than it should be (in effect, larger than the destination variable has allocated).
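The check described above, a limit tied to the destination’s size, can be sketched roughly as follows. The helper name bounded_copy() and its return convention are hypothetical, invented here for illustration. The sketch also handles one detail the section does not spell out: strncpy() does not write a terminating '\0' when the source is as long as or longer than the limit, so the terminator is added by hand.

```c
#include <string.h>

/* Hypothetical helper (illustration only): copy src into dst, which is
 * dst_size bytes long, without ever writing past the end of dst.
 * Returns 0 if src fit completely, -1 if it was truncated or the
 * arguments were unusable. */
int bounded_copy(char *dst, size_t dst_size, const char *src)
{
    if (dst == NULL || src == NULL || dst_size == 0)
        return -1;

    /* The limit comes from the destination size, with one byte
     * reserved for the terminator. */
    strncpy(dst, src, dst_size - 1);
    dst[dst_size - 1] = '\0';   /* strncpy does not terminate on truncation */

    return (strlen(src) < dst_size) ? 0 : -1;
}
```

When auditing, the thing to look for is where the limit comes from: a call such as bounded_copy(buf, sizeof buf, input) ties it to the real destination, whereas a hard-coded number or the length of the source gives no such guarantee.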
The *scanf Family of Functions

The *scanf family of functions “scan” an input source, looking to extract various variables as defined by the given format string. This leads to potential problems if the program is looking to extract a string from a piece of data, and it attempts to put the extracted string into a variable that isn’t large enough to accommodate it.

First, you should check to see if your C/C++ program uses any of the following functions: scanf(), sscanf(), fscanf(), vscanf(), vsscanf(), or vfscanf().

If it does, then you should look at the use of each function to see if the supplied format string contains any character-based conversions (indicated by the s, c, and [ tokens). If the format specified includes character-based conversions, you need to verify that the destination variables specified are large enough to accommodate the resulting scanned data.

NOTE
The *scanf family of functions allows for an optional maximum limit to be specified. This is given as a number between the conversion token % and the format flag. This limit functions similarly to the limit found in the strn* family of functions.

Other Functions Vulnerable to Buffer Overflows

Buffer overflows can also be caused in other ways, many of which are very hard to detect. The following list includes some other functions that populate a variable/memory address with data, making them susceptible to vulnerability.

Some miscellaneous functions to look for in C/C++ include the following:

■ memcpy(), bcopy(), memccpy(), and memmove() are similar to the strn* family of functions (they copy/move source data to destination memory/variable, limited by a maximum value). Like the strn* family, you should evaluate each use to determine if the maximum value specified is larger than the destination variable/memory has allocated.
■ sprintf(), snprintf(), vsprintf(), vsnprintf(), swprintf(), and vswprintf() allow you to compose multiple variables into a final text string. You should determine that the sum of the variable sizes (as specified by the given format) does not exceed the maximum size of the destination variable. For snprintf() and vsnprintf(), the maximum value should not be larger than the destination variable’s size.

■ gets() and fgets() read in a string of data from various file descriptors. Both can possibly read in more data than the destination variable was allocated to hold. The gets() function accepts no limit at all, so every use of it should be treated as suspect. The fgets() function requires a maximum limit to be specified; therefore, you must check that the fgets() limit is not larger than the destination variable size.

■ getc(), fgetc(), getchar(), and read() functions used in a loop can read in too much data if the loop does not properly stop reading after the maximum destination variable size is reached. You will need to analyze the logic used in controlling the total loop count to determine how many times the code loops using these functions.

Checking the Output Given to the User

Most applications will, at one point or another, display some sort of data to the user. You would think that the printing of data is a fundamentally secure operation; but alas, it is not. Particular vulnerabilities exist that have to do with how the data is printed, as well as what data is printed.

Format String Vulnerabilities

Format string vulnerabilities are a recent phenomenon that has emerged in the last year. This class of vulnerability arises from the *printf family of functions (printf(), fprintf(), and so on). This class of functions allows you to specify a “format” in which the provided variables are converted into string format.
NOTE
Technically, the attacks described in this section are a form of buffer overflow attack, but we are classifying them under this category due to the popular misuse of the printf() and vprintf() functions normally used for output.

The vulnerability arises when an attacker is able to specify the value of the format string. Sometimes this is due to programmer laziness. The proper way of printing a dynamic string value would be:

    printf("%s",user_string_data);

However, a lazy programmer may take a shortcut approach:

    printf(user_string_data);

Although this does indeed work, a fundamental problem is involved: The function is going to look for formatting commands within the supplied string. The user may supply data which the function believes to be formatting/conversion commands, and via this mechanism she could cause a buffer overflow due to how those formatting/conversion commands are interpreted (actual exploitation to cause a buffer overflow is a little involved and beyond the scope of this chapter; suffice it to say that it definitely can be done and is currently being done on the Internet as we speak).

NOTE
You can find more information on format string vulnerabilities in an analysis written by Tim Newsham, available online at www.net-security.org/text/articles/string.shtml.

Format string bugs are, again, seemingly limited to C/C++. While other languages have *printf functionality, their handling of these issues may exclude them from exploitation. For example, Perl is not vulnerable (which stems from how Perl actually handles variable storage).
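To make the difference between the two printf() calls above concrete for C, here is a small sketch of the safe form. The wrapper name render_user_text() is hypothetical; it uses snprintf() rather than printf() only so that the result lands in a buffer where it can be inspected. The point it illustrates is that user data passed as an argument to a fixed "%s" format is copied as-is, even when it is full of conversion tokens.

```c
#include <stdio.h>

/* Hypothetical wrapper (illustration only): render user data into out
 * using a fixed, hard-coded format string.
 * Returns what snprintf returns: the length the full result needs. */
int render_user_text(char *out, size_t out_size, const char *user_data)
{
    /* The format is a literal; user_data is only ever an argument, so any
     * '%' sequences inside it are copied rather than interpreted. */
    return snprintf(out, out_size, "%s", user_data);
}
```

Passing the same user data as the format argument itself would hand those tokens to the formatting engine, which is exactly the shortcut the text warns against.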
So, to find potentially vulnerable areas in your C/C++ code, you need to look for the following functions: printf(), fprintf(), sprintf(), snprintf(), vprintf(), vfprintf(), vsprintf(), vsnprintf(), wsprintf(), and wprintf(). Determine if any of the listed functions have a format string containing user-supplied data. Ideally, the format string should be static (a predefined, hard-coded string); however, as long as the format string is generated and controlled internal to the program (with no user intervention), it should be safe.

Home-grown logging routines (syslog, debug, error, and so on) tend to be culprits in this area. They sometimes hide the actual avenue of vulnerability, requiring you to backtrack through function calls. Imagine the following logging routine (in C):

    void log_error (char *error){
        char message[1024];
        snprintf(message,1024,"Error: %s",error);
        fprintf(LOG_FILE,message);   /* message is used as the format string */
    }

Here we have fprintf() taking the message variable as the format string. This variable is composed of the static string “Error:” and the error message passed to the function. (Notice the proper use of snprintf to limit the amount of data put into the message variable; even if it’s an internal function, it’s still good practice to safeguard against potential problems.)

So is this a problem? Well, that depends on every use of the above log_error() function. So now you should go back and look at every occurrence of log_error(), evaluating the data being supplied as the parameter.

Cross-Site Scripting

Cross-site scripting (CSS) is a particular concern due to its potential to trick a user. CSS is basically due to Web applications taking user data and printing it back out to the user without filtering it.
It’s possible for an attacker to send a URL with embedded client-side scripting commands; if the user clicks on this Trojaned URL, the data will be given to the Web application. If the Web application is vulnerable, it will give the data back to the client, thus exposing the client to the malicious scripting code. The problem is compounded by the fact that the Web application may be in the user’s trusted security zone, so the malicious scripting code is not limited by the security restrictions normally imposed during normal Web surfing.

To avoid this, an application must explicitly filter or otherwise re-encode user-supplied data before it inserts it into output destined for the user’s Web browser. Therefore, what follows is a list of typical output functions; your job is to determine if any of the functions print out tainted data that has not been passed through some sort of HTML-escaping function. An HTML escape routine will either remove any found HTML elements or encode the various HTML metacharacters (particularly replacing the “<” and “>” characters with “&lt;” and “&gt;” respectively) so that the result will not be interpreted as valid HTML.

Looking for CSS vulnerabilities is tough; the best place to start is with the common output functions used by your language:

■ C/C++ Calls to printf(), fprintf(), output streams, and so on.

■ ASP Calls to Response.Write and Response.BinaryWrite that contain user variables, as well as direct variable output using <%=variable%> syntax.

■ Perl Calls to print, printf, syswrite, and write that contain variables holding user-supplied data.

■ PHP Calls to print, printf, and echo that contain variables that may hold user-supplied data.

■ TCL Calls to puts that contain variables that may hold user-supplied data.

In all languages, you need to trace back to the origin of the user data and determine if the data goes through any filtering of HTML and/or scripting characters.
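The HTML escape routine described above might look something like the following in C. This is a minimal sketch, not a complete filter: the name html_escape() and its return convention are hypothetical, and escaping “&” and the double quote in addition to “<” and “>” is an assumption added here, since the text only names the two angle brackets.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical escape routine (illustration only): writes an HTML-escaped
 * copy of src into dst, which is dst_size bytes long.
 * Returns 0 on success, -1 if dst is too small or arguments are unusable. */
int html_escape(char *dst, size_t dst_size, const char *src)
{
    size_t used = 0;

    if (dst == NULL || src == NULL || dst_size == 0)
        return -1;

    for (; *src != '\0'; src++) {
        const char *rep;
        char one[2];
        size_t len;

        switch (*src) {
        case '<':  rep = "&lt;";   break;
        case '>':  rep = "&gt;";   break;
        case '&':  rep = "&amp;";  break;   /* assumption: not named in the text */
        case '"':  rep = "&quot;"; break;   /* assumption: not named in the text */
        default:
            one[0] = *src;
            one[1] = '\0';
            rep = one;                      /* ordinary character, copied as-is */
            break;
        }

        len = strlen(rep);
        if (used + len >= dst_size) {       /* always keep room for '\0' */
            dst[used] = '\0';
            return -1;
        }
        memcpy(dst + used, rep, len);
        used += len;
    }
    dst[used] = '\0';
    return 0;
}
```

When tracing an output call back to its data, a routine of this kind sitting between the user input and the printf() (or equivalent) is what you are hoping to find.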
If the data is not filtered, then an attacker could use your Web application for a CSS attack against another user (taking advantage of your user/customer due to your application’s insecurity).

Information Disclosure

Information disclosure is not a technical problem per se. It’s quite possible that your application may provide an attacker...

... (luckily) to the local system that houses the Web application. Only attackers able to log into that system would be able to potentially exploit those vulnerabilities. We are not going to focus on this realm of problems here, because best practice dictates using dedicated Web application servers (which don’t allow normal user access).

... is handled by the ADODB.* objects. This means that if your script doesn’t create an ADODB.Connection or ADODB.Recordset object via the Server.CreateObject function, you don’t have to worry about your script containing ADO vulnerabilities. If your script does create ADODB objects, then you need to...

Frequently Asked Questions

The following Frequently Asked Questions, answered by the authors of this book, are designed to both measure your understanding of the concepts presented in this chapter and to assist you with real-life implementation of these concepts. To have your questions about...

Chapter 7 • Securing Your Java Code

Solutions in this chapter:

■ Overview of the Java Security Architecture
■ How Java Handles Security
■ Potential Weaknesses in Java
■ Coding Functional but Secure Java Applets
■ Summary
■ Solutions Fast Track
■ Frequently Asked Questions

... we will look at potential weaknesses in Java from a developer point of view. This section describes how others can exploit weaknesses to wreak havoc with your Internet application. Finally, we get into the nuts and bolts of coding functional but secure Java applets by looking at how to implement various security...

... language on a PC and after they begin executing, they have access to all resources on your system. Security for ActiveX seems to be implemented as a reaction to security breaches rather than designed into the architecture right from the start.

There are basically five goals for any complete security architecture,...

... trail of transactions, but these are not JVM level. Using digital signatures and keeping an internal record of transactions, an application can keep a fairly reliable auditing trail. However, anything developed at the application level means holes can be introduced with the implementation of the auditing...

... essentially an interpreter that translates the Java code and allows it to run on your PC, sort of like a middleman between your Java code and your operating system. A JVM also exists in your browser. As soon as a user surfs to your Web page with a browser, your Java applet will begin executing on the browser virtual machine.

Authentication with certificates allows us to ensure that a class received by someone over the Internet is the same class that was originally sent. It is technically possible for someone to modify a class maliciously by decompiling the original work and recompiling it. If an applet requests additional access on your computer,...

Date posted: 14/08/2014, 04:21