Introduction to the Porting Problem
Why Port Code?
In the last ten years, the computer industry has experienced significant changes in the variety and quantity of machines sold. Previously, the most popular computers were used primarily in industrial and academic settings for high-speed, efficient computation. Although many such machines still exist, they typically run operating systems like UNIX, which prioritize resource allocation over user-friendliness.
As we move into the twenty-first century, the diversity of operating systems in use is diminishing. Personal workstations, predominantly running Microsoft’s Windows operating system, have become increasingly prevalent. This shift is evident in the expanding computer workstation market, as illustrated in Figure 1.
Workstations are experiencing rapid growth, while UNIX systems are relatively constant (Deloitte 4).
Figure 1. Workstation Market Trend. The change in the number of UNIX and Windows workstations sold (Deloitte 4).
The shift in operating system preference is largely due to the contrasting characteristics of UNIX and Windows. UNIX, a robust and fast operating system that supports multiple users and tasks simultaneously, has been a stable choice for over 30 years, focusing primarily on functionality and performance rather than aesthetics. In contrast, Windows, which has been available for less than half the time of UNIX, offers a more user-friendly experience with appealing interfaces that are accessible even to children. Its growing stability, functionality, and compatibility with affordable hardware have contributed significantly to its market expansion.
As the Windows market rapidly expands, it is essential for companies to adapt their existing UNIX-based software solutions to reach new consumers. Whether these are software companies selling products for profit or organizations developing software for internal use, the most effective strategy for entering this market is to port their existing code to the new environment. Porting involves transferring the original source code and compiling it for the new machine, often requiring some modifications to ensure compatibility. However, since the same programming language is typically used, a significant portion of the code remains unchanged, enabling companies to efficiently introduce their legacy products to the growing Windows audience.
Other Options than Porting
Companies can offer their software products on the Windows architecture through various methods beyond code porting. Two prevalent alternatives are developing the software from the ground up and using third-party software solutions.
Starting from scratch in software development means completely rewriting the source code for a new platform while reproducing the original functionality. However, true "starting from scratch" is rare in the software industry due to the prevalent practice of code reuse, where developers use existing, tested code instead of creating new, untested code. Even so, rebuilding a system can be time-consuming, especially if the original system took years to develop. By the time the new product is ready for sale, market dynamics may have shifted. Additionally, creating a separate Windows version of a UNIX product means managing and upgrading two separate versions, which adds unnecessary complexity and overhead to future updates and functionality enhancements.
Another method for transferring software between platforms is to use third-party software solutions. Products such as Nutcracker and Cygwin provide a UNIX-like environment on Windows, enabling users to run UNIX programs with little or no modification. However, the effectiveness of these solutions is still debated, particularly regarding their speed and efficiency. Running code written for UNIX through such a layer on PC hardware can waste time, and organizations must also consider the varying costs of these third-party tools, which range from free to quite expensive.
While free open-source solutions may be appealing, they typically offer limited support, leading many companies to invest in costly products to ensure compatibility with third-party systems. This potential for increased expense and the risk of reduced efficiency make such options less attractive for businesses.
The Perils of Porting
While manually porting code is often preferable to starting anew or purchasing third-party solutions, it presents significant challenges. Various issues can emerge during the manual porting process, potentially resulting in the new software not operating as effectively as the original.
Porting software can be challenging due to dependencies on operating system calls, particularly in languages like C. For instance, obtaining the system time in UNIX involves the gettimeofday() function, which retrieves the time from the UNIX operating system. This function is not available on Windows, where the GetSystemTime() function must be used instead. The discrepancy is compounded by the fact that the UNIX function reports time with microsecond resolution, while the Windows function reports it with millisecond resolution, which illustrates the complexities of cross-platform software development.
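The time example above can be sketched as a small wrapper that hides the platform difference. This is only a minimal illustration: the helper name port_ms_since_midnight_utc() and the choice of "milliseconds since midnight UTC" as the common unit are assumptions made for this sketch, not part of the thesis.

```c
#include <assert.h>
#include <stddef.h>

#ifdef _WIN32
#include <windows.h>
#else
#include <sys/time.h>
#endif

/* Hypothetical helper: milliseconds elapsed since midnight UTC.
 * Both branches are reduced to the coarser (millisecond) resolution
 * so that callers see the same unit on either platform. */
long port_ms_since_midnight_utc(void)
{
#ifdef _WIN32
    SYSTEMTIME st;
    GetSystemTime(&st);  /* UTC calendar fields, millisecond resolution */
    return ((st.wHour * 60L + st.wMinute) * 60L + st.wSecond) * 1000L
           + st.wMilliseconds;
#else
    struct timeval tv;
    int rc = gettimeofday(&tv, NULL);  /* seconds + microseconds since epoch */
    assert(rc == 0);
    (void)rc;
    return (long)(tv.tv_sec % 86400) * 1000L + (long)(tv.tv_usec / 1000);
#endif
}
```

Keeping the conditional compilation in one helper means the rest of the ported program never names either system call directly.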
A significant issue in code porting arises when the code compiles successfully but fails to function as intended on a different machine. For instance, while a Windows compiler might detect an issue with UNIX system calls related to time functions, there are instances where the compiler overlooks potential problems, leading to unexpected behavior in the program.
Porting applications can be challenging not only due to system calls but also because the original development team may no longer be available. In a dynamic job market, many developers may have left the company by the time porting is required. This can leave new developers uncertain about the purpose of specific functions. In their efforts to adapt the application to the new system, these developers might unintentionally alter the original functionality, resulting in the creation of bugs.
Project Statement
The significance of software porting is evident, as it provides companies with a cost-effective and efficient method for transferring their software between platforms. Despite its advantages, it is crucial to recognize that porting can introduce bugs into the code, which may lead to severe consequences, as software failures such as those of the Therac-25 have shown.
Therefore, this thesis focused on eliminating bugs in C code ported from UNIX to a Microsoft Windows environment.
C source code was selected due to its widespread availability on UNIX platforms and its continued relevance in software development. This project specifically addresses ports from UNIX to WIN32 systems, which include operating systems such as Windows 95, 98, NT, and 2000 that support 32-bit memory addressing. Most companies engaged in porting applications from UNIX to Windows primarily use Windows NT or Windows 2000. Both of these operating systems provide more advanced technologies and support more protocols than Windows 95 and 98.
The bug elimination process was divided into three phases. The first phase identifies key issues to consider when transitioning C source code from a UNIX environment to Microsoft WIN32. The second phase involves modifying the static checker LCLint to detect the specific porting bugs identified in the first phase. Finally, the third phase tests the enhanced static checker on ported C source code to see whether it uncovers real porting bugs.
Debugging Methods
The Limits of Other Debugging Methods
The software design process is divided into multiple stages, collectively known as the software lifecycle. Key stages include requirements analysis, system design, implementation, testing, and maintenance. To enhance software quality, various techniques are employed throughout these stages to identify and eliminate bugs before the final product is delivered.
The requirements analysis and specification phase marks the initial step in the software development lifecycle, focusing on creating a comprehensive document that clearly outlines the client's software needs. This critical document serves as a reference throughout all subsequent stages of the project, ensuring alignment between client expectations and development efforts.
The specification document serves as the foundation for system design and testing, ensuring the software meets initial requirements while minimizing bugs during development. The inherent ambiguity of English often complicates the accurate specification of complex systems, leading to the creation of specialized languages like Z, which promote formal specifications and reduce inconsistencies. Although these formal languages aim to decrease bugs in the final product, they do not significantly aid in software porting, which occurs during the maintenance phase after the product's release. Porting typically involves transferring source code with minimal rework, often neglecting the original specifications.
Testing and verification are crucial phases in the software lifecycle aimed at identifying and eliminating bugs. This process involves executing a program with specific test cases and comparing the actual outputs to the expected results. While testing can effectively identify many bugs, it cannot guarantee the removal of all issues, as its success heavily depends on the selection of relevant test cases, which are often based on risk areas identified in the requirements specification. After a software port, new risks may emerge that were not covered in the original implementation, potentially leaving newly introduced bugs undetected if only the old test cases are used.
Testing can be flawed due to reliance on outdated or incomplete specifications. If the original specification contains errors, the resulting test cases may also be designed incorrectly, leading to incorrect outputs from the system. Consequently, these errors may go undetected until the end user encounters them, as the software operates according to the flawed specifications.
When a specification is designed correctly, it enables the creation of effective test cases. However, it is important to note that it is not feasible to create a comprehensive set of test cases that will identify every possible error.
Static Checking
Static checking is an effective method for identifying bugs in programs without executing the source code. Using specialized software tools, static checkers analyze the code and generate reports highlighting errors or potential issues. These tools focus on various aspects, such as unreachable code, unused variables, and unreferenced labels, ensuring a thorough examination of the code in its text format.
One potential static checker explored in this thesis is Metal, a system and language designed for static code analysis. Metal enables users to specify the exact issues they wish to identify within their code, making it particularly useful for tasks like porting C code from UNIX to WIN32 systems. By using Metal's language, users can effectively "program" the system to search for specific bugs relevant to their projects.
Stanford University is currently developing Metal, which poses several challenges for my thesis. As the project is still in the development stages and primarily intended for in-house testing, it lacks sufficient documentation and support. Consequently, if I encountered bugs or implementation issues while using Metal, there would be no assurance that these problems could be resolved in time for my thesis completion.
LCLint, currently under development at the University of Virginia, is a static checking system similar to Metal that enables code analysis and allows programmers to customize searches for specific issues. It features user-defined annotations, empowering developers to create specific constraints that are checked during the static analysis process.
LCLint, much like Metal, is being developed in a university environment, but it is already used by thousands of people across various applications. While its user-defined annotations are still in development and lack extensive documentation, relying on LCLint for support seems less risky, especially since its development is overseen by my technical advisor at the University of Virginia.
Static checking emerged as the most suitable debugging method for my thesis due to its reliability and effectiveness in identifying bugs during the maintenance phase. One of the key objectives of this research is to save time and money, and static checking is recognized as a cost-effective solution, as highlighted by Jalote, who states that "static analysis is a very cost-effective way of discovering errors" (370). Additionally, it has been demonstrated that static checking significantly reduces the time and effort required during the testing process.
Finding Porting Issues (Phase 1)
Research Methodology
The initial phase of the project involved identifying common issues encountered in previous C code ports by reviewing various documented sources such as computer journals, websites, and newsgroups. These resources provided valuable insights from programmers sharing their experiences, which helped highlight frequent porting challenges. Additionally, researching the specific characteristics of UNIX and WIN32 operating systems revealed potential complications that could arise during the porting process due to their inherent differences. To further enhance understanding, personal interviews were conducted, including one with a University of Virginia student who was transitioning C source code from UNIX to Microsoft Windows NT. Along with his knowledge, I added my own personal knowledge of porting C code. As a summer intern for STR Software in Richmond, Virginia, part of my responsibility was to begin porting their main software product to Windows NT.
To ensure accuracy in identifying potential bugs, a comprehensive list of issues was compiled based on credible sources. A bug was included only if it was reported in a reputable journal, reported independently by multiple less reliable sources such as websites or newsgroups, or personally verified. This approach minimizes false reports and maintains the integrity of the issue list.
The Issues
The completed list of issues can be broken up into several general categories. These categories include memory management, file input and output, process management, and miscellaneous.
Memory management presents challenges during the porting process due to the differing approaches of UNIX and WIN32. UNIX uses distinct APIs for various memory types, while WIN32 consolidates these into a limited set of APIs, complicating the transition. A notable example is the UNIX function shmctl(), which manages shared memory but lacks a direct equivalent in WIN32. Consequently, effectively managing shared memory is a critical consideration when porting applications between these two systems.
When transitioning code from UNIX to WIN32 systems, file input and output present significant challenges due to the differing ways files are managed. WIN32 files are handled differently from UNIX files, and WIN32 provides its own set of APIs as alternatives to the standard C and UNIX file APIs. Although WIN32 systems offer APIs for most standard calls, the inability to use these standard functions in conjunction with WIN32-specific functions creates a substantial hurdle for developers tasked with porting applications.
In a standard UNIX C program, a call to fopen() opens a file, which is then passed to fread() to retrieve data. When this code is ported to a WIN32 system, it may compile and function correctly; however, if a programmer introduces a WIN32-specific function like ReadFile() on the same opened file, there is no guarantee from Microsoft that this will operate as expected. Such scenarios are frequent when handling files or sockets in C applications transitioned to WIN32 environments.
Process management is a critical aspect of operating systems, alongside file input and output. A key system call frequently used by UNIX developers is the fork() function, which creates a duplicate process. However, the behavior of this function can vary significantly from that of its WIN32 equivalent, depending on the subsequent statements in the program.
Process management concerns extend beyond this one issue. In UNIX, the lowest number indicates the highest process priority, whereas in WIN32, priority is defined within a set range where higher numbers represent higher priority. This discrepancy can lead to bugs after porting code, as process priorities may inadvertently be reversed, significantly affecting the behavior of the processes involved (Digital 9).
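The priority reversal described above can be made concrete with a small mapping function. This is a sketch under stated assumptions: the enum port_priority, the function port_priority_from_nice(), and the threshold values are all hypothetical and chosen only for illustration; the nice range of -20 to 19 is the conventional UNIX range.

```c
#include <assert.h>

/* Hypothetical priority levels, ordered so that a HIGHER number means a
 * HIGHER priority, as the text describes for WIN32. */
enum port_priority {
    PORT_PRIO_IDLE = 0,
    PORT_PRIO_BELOW_NORMAL,
    PORT_PRIO_NORMAL,
    PORT_PRIO_ABOVE_NORMAL,
    PORT_PRIO_HIGH
};

/* Convert a UNIX nice value (-20 = highest priority, 19 = lowest) into
 * the ascending scale above. The cut-off points are arbitrary. */
enum port_priority port_priority_from_nice(int nice_value)
{
    assert(nice_value >= -20 && nice_value <= 19);
    if (nice_value <= -15) return PORT_PRIO_HIGH;
    if (nice_value <= -5)  return PORT_PRIO_ABOVE_NORMAL;
    if (nice_value <= 4)   return PORT_PRIO_NORMAL;
    if (nice_value <= 14)  return PORT_PRIO_BELOW_NORMAL;
    return PORT_PRIO_IDLE;
}
```

Passing a raw UNIX value straight through to a WIN32-style call would invert the ordering; routing every priority through one conversion function keeps the inversion in a single, reviewable place.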
The miscellaneous category encompasses issues that do not fit into more specific classifications, such as bit size and bit order on UNIX systems. Typically, an integer declaration in C is 32 bits, but the arrangement of these bits can vary between machines. The most significant bit represents 2^31 and the least significant bit represents 2^0, and the order in which they are stored differs across UNIX machines. WIN32 systems introduce their own standards for integer size and orientation. While code that does not depend on bit order or size remains unaffected during a port, logical shifts or specific value placements within an integer can pose significant challenges when transitioning to a WIN32 environment.
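One common way to keep code independent of the host's storage order is to assemble integers from individual bytes with shifts instead of reinterpreting memory. The sketch below assumes the data is stored most-significant byte first; the function names are hypothetical and not taken from the thesis.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read a 32-bit value stored most-significant byte first. The result is
 * the same on any host because only arithmetic shifts are used. */
uint32_t port_load_be32(const unsigned char *p)
{
    assert(p != NULL);
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Write a 32-bit value most-significant byte first. */
void port_store_be32(unsigned char *p, uint32_t v)
{
    assert(p != NULL);
    p[0] = (unsigned char)(v >> 24);
    p[1] = (unsigned char)(v >> 16);
    p[2] = (unsigned char)(v >> 8);
    p[3] = (unsigned char)v;
}

/* Report whether this host stores the least significant byte first. */
int port_host_is_little_endian(void)
{
    const uint32_t probe = 1;
    return *(const unsigned char *)&probe == 1;
}
```

Using uint32_t from <stdint.h> also fixes the integer width explicitly, which addresses the size half of the problem.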
Design and Preparation of the System (Phase 2)
Initial Design
In deciding on a static checker for my system, I evaluated three options: Metal, LCLint, and the possibility of creating a custom solution. However, building my own checker was quickly dismissed due to its time-consuming nature and the project's scope. Both Metal and LCLint possess the required functionalities for this project, making it unnecessary to develop a new solution when viable alternatives already exist.
LCLint was chosen as the static checker for my system due to its advantages over Metal, including better documentation and direct access to one of its developers at the University of Virginia. This connection facilitated the use of user-defined annotations and allowed for enhancements tailored to my needs. Additionally, LCLint's widespread use among thousands of users compared to Metal's limited adoption further supported my decision, ensuring greater exposure for my research and design. Ultimately, LCLint's capabilities and available support made it the most reasonable choice for my project.
After selecting LCLint, the next step involved retooling the issues identified in phase one for integration into LCLint's user-defined annotations. These annotations can be used to assign a state to a variable upon its creation and to track how that state changes as the variable is passed through various functions. If a programmer attempts to pass a variable to a function while it is in an incorrect state, LCLint will issue a warning. For instance, a key issue identified in phase one was the difference in socket connection handling between UNIX and WIN32 systems; in UNIX, socket connections are closed using close(int), whereas in WIN32, close(int) only closes files, necessitating the use of closesocket(int) to close sockets.
To ensure LCLint detects when a programmer mistakenly uses the file close function to close a socket, I would assign a specific state to any variable representing a socket. If a variable in this state is passed to the close() function, it would be flagged as an illegal state, prompting LCLint to notify the programmer of the error.
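For comparison with the static check, the same close() versus closesocket() difference can be handled at the source level with a thin wrapper. This is a minimal sketch; the names port_socket_t and port_close_socket() are assumptions introduced here and do not appear in the thesis.

```c
#include <assert.h>

#ifdef _WIN32
#include <winsock2.h>
typedef SOCKET port_socket_t;
#else
#include <unistd.h>
typedef int port_socket_t;
#endif

/* Hypothetical wrapper: close a socket with the call that is correct
 * for the platform. Returns 0 on success and nonzero on failure. */
int port_close_socket(port_socket_t s)
{
#ifdef _WIN32
    return closesocket(s);  /* close() would only be valid for files here */
#else
    assert(s >= 0);         /* a negative descriptor is never valid */
    return close(s);        /* on UNIX, close() handles sockets as well */
#endif
}
```

A wrapper like this removes the bug from new code, but it does not find existing calls to close() on sockets in code that has already been ported, which is the gap the variable state check is meant to fill.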
Incorporating issues from phase one into LCLint presented varying levels of difficulty; while some were easily integrated, others proved challenging or impossible to define, resulting in their exclusion from the project. Notably, several of these challenges related to buffer passing to specific functions and the verification of buffer sizes. For instance, the CreateProcess() function on WIN32 systems accepts a maximum of 1024 characters, highlighting the complexities involved in accurately managing buffer limits.
UNIX systems do not impose limitations on string lengths, which can lead to issues when porting code, such as the truncation of long strings in CreateProcess(). At the time of this thesis, LCLint lacked the ability to check for buffer length or overflow, making it impossible to address this critical concern when transitioning C code from UNIX to WIN32 systems. However, buffer checking is currently being developed for LCLint, suggesting that this and other similar issues could be incorporated into the system in the future.
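Until a static checker can verify buffer lengths, a run-time guard is one way to catch over-long command lines before they are truncated. The sketch below is illustrative only: the constant PORT_MAX_CMDLINE simply restates the 1024-character figure given in the text (assumed here to include the terminating NUL), and port_cmdline_fits() is a hypothetical helper.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Limit as stated in the text; treated here as including the
 * terminating NUL. This value is an assumption of the sketch. */
#define PORT_MAX_CMDLINE 1024

/* Return 1 if the command line fits within the limit, 0 otherwise. */
int port_cmdline_fits(const char *cmdline)
{
    assert(cmdline != NULL);
    return strlen(cmdline) < (size_t)PORT_MAX_CMDLINE;
}
```

A caller would test port_cmdline_fits() before starting the child process and report an error instead of passing a string that might be silently cut short.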
LCLint Preparation
After selecting LCLint as the static checker for the project, the issues identified in phase one were restructured for compatibility with LCLint. The subsequent step involved preparing LCLint for effective implementation.
For LCLint to function effectively, it requires a comprehensive library of function headers for all system-provided functions, including standard C functions like printf(), read(), and malloc(), as well as WIN32 system calls such as CreateProcess(), GetSystemTime(), and CreateThread(). While LCLint includes libraries for UNIX, ANSI, and POSIX, it currently lacks a WIN32 library. Therefore, the initial step in preparing LCLint for implementation involves creating a WIN32 library that encompasses the essential WIN32 system calls and standard C function calls.
In addressing the modified issues, I initiated a search for the Windows include files provided with Microsoft’s Visual C/C++ distribution, which contain essential function declarations for WIN32 system calls and standard C functions. This search focused on identifying functions relevant to the previously outlined issues from phase two. Any file containing necessary function declarations was integrated into the WIN32 library. For instance, the closesocket(int) function, crucial for resolving an issue, was found in the winsock2.h file, prompting its inclusion during the compilation of the WIN32 library. Although developing a comprehensive WIN32 library is a primary objective for LCLint developers, its complete creation exceeds the scope of this thesis.
The decision to include only essential files in the creation of the WIN32 library was driven by concerns over speed and efficiency. Reading the entire library each time LCLint was executed would be time-consuming, especially since only a few files were necessary for the system. Additionally, LCLint faced challenges when processing WIN32 code, requiring problematic code to be modified, renamed, or commented out to ensure proper functionality. Including unnecessary files would have complicated the process and extended the workload, ultimately falling outside the scope of the thesis.
Modifications to the WIN32 header files were implemented to enhance their utility for LCLint, including the addition of annotations to defined types for improved checking of user-defined types. Additionally, all function headers were updated to address a naming issue; Microsoft’s practice of naming all variables in function headers hindered the use of macros with the same names. To resolve this, all function parameters were prefixed with "p_" to prevent naming conflicts.
After successfully developing a functional WIN32 library that encompassed all necessary files and function declarations for my thesis, LCLint was prepared for the implementation of my design. Chapter 5 continues phase two with a detailed exploration of the system's implementation.
System Implementation (Phase 2 continued)
Warn On Use
The first category of implementation focused on warnings related to usage, addressing critical issues for porters that static checkers could not identify as errors. This was achieved by adding annotations to the WIN32 library for essential functions, formatted as /*@warn unixtowin "message"@*/. The C and C++ comment syntax ensures that the compiler ignores this line while processing the code. The "at" symbol indicates to LCLint that this is an annotation, prompting it to interpret the warn command, the flag (unixtowin), and the accompanying message in quotes.
After LCLint processes the library, it alerts the programmer with the specified message whenever a function with the relevant annotation is used. The programmer can suppress this warning by supplying LCLint with the appropriate flag value at the command line once the issue has been resolved.
One significant issue addressed in this way involves the SetProcessPriority() function and the difference in how process priority is defined between WIN32 and UNIX systems. In WIN32, a higher numerical value indicates a higher priority, while in UNIX, the opposite is true, with lower numbers representing higher priority. To address this potential confusion, a "warn on use" annotation was added to the WIN32 library for the SetProcessPriority() function. This annotation serves as a precautionary measure, alerting programmers to the common porting issue and allowing them to investigate and resolve any potential misconfigurations before suppressing the warning.
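A declaration carrying a "warn on use" annotation might look like the sketch below. Several things here are assumptions: the function port_set_priority() is a hypothetical stand-in rather than a real WIN32 call, the message text is invented for illustration, and placing the annotation between the parameter list and the semicolon follows the convention seen in LCLint-style library headers rather than anything stated in the text. Because the annotation is an ordinary C comment, the file compiles unchanged with any C compiler.

```c
#include <assert.h>

/* Hypothetical library declaration. Only a checker that understands the
 * annotation sees the warning; a compiler treats it as a comment. */
extern int port_set_priority(int p_level)
    /*@warn unixtowin "Priority order differs between UNIX and WIN32: higher numbers mean higher priority on WIN32"@*/ ;

/* Stub definition so the sketch links; it just echoes the level. */
int port_set_priority(int p_level)
{
    assert(p_level >= 0);
    return p_level;
}
```

Note the "p_" prefix on the parameter name, matching the renaming applied to the WIN32 library headers.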
Variable States
After incorporating the usage constraints, the next phase involved implementing variable state constraints, as briefly outlined in Chapter 4. By creating xh files, state values are assigned to variables during function calls, while mts files establish rules for state transitions and trigger errors for incorrect state inputs. When LCLint processes the xh and mts files, it effectively manages the variable states.
In UNIX, the close(int) function closes both open files and sockets, while in WIN32 systems, it only closes files, necessitating the use of closesocket(int) for sockets. This discrepancy may lead to porting problems when attempting to close sockets using the close(int) function on WIN32 systems. To address this issue, a variable state is employed, as illustrated in Figure 2, which shows an excerpt from socketstate.xh, a file that defines annotations for the WIN32 library to manage this functionality.
WSAAPI socket( int p_af, int p_type, int p_protocol
SOCKET p_s, struct sockaddr FAR * p_addr, int FAR * p_addrlen
_CRTIMP int __cdecl _close(/*@socketna@*/ int);
_CRTIMP int __cdecl close(/*@socketna@*/ int);
Figure 2. Socketstate.xh. The LCLint file that defines the state of variables in WIN32 functions.
Annotations inside of functions define the state that the variable should be in when the function is called. Annotations outside of functions define the state of the return variable.
Anytime a socket connection is opened (this can be accomplished by a call to several different functions), the variable assigned the socket value is given the state “socketopen”. The function closesocket(int) was annotated so that a variable in the state “socketopen” must be passed to it, and the state of the variable after the function call is “socketclosed”.
The function close(int) is designed to only accept variables with the state "socketna." It uses context references for socket states, which include closed, open, na, and dc. Specific annotations ensure that an open socket cannot be passed as closed, resulting in an error, and vice versa. Additionally, sockets in the open state cannot be passed as na, nor can closed sockets be treated as na. The function also prevents the illegal passing of sockets in the dc state as either open or closed, leading to errors. Lastly, if there is a loss of reference to an open socket, an error is triggered, ensuring strict adherence to socket state management.
Figure 3. Socketstate.mts. The LCLint file that defines the rules of state transitions for the “socketstate” variable state definition.
The socketstate.mts file outlines the rules for state transitions in socket management. After processing this file, LCLint can verify that sockets are properly closed. When a socket is opened, it is marked with the state "socketopen." If the close(int) function is invoked with a "socketopen" state, an error is triggered, stating, "Sockets not allowed to be passed to this func." Additionally, if a programmer neglects to close the socket, the lost reference rule in the socketstate.mts file will generate an error, ensuring the program does not terminate with an open socket.
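The transition rules described above can be modelled as a few small C functions. This is only a run-time analogue written for illustration: it is not LCLint's mts syntax, the type and function names are hypothetical, and the "dc" state is left out to keep the sketch short.

```c
#include <assert.h>
#include <stddef.h>

/* States tracked for a variable, mirroring the names in the text. */
typedef enum { SOCK_NA, SOCK_OPEN, SOCK_CLOSED } sock_state;

typedef enum {
    CHECK_OK,
    ERR_SOCKET_PASSED_TO_CLOSE,  /* close(int) given a socket */
    ERR_SOCKET_NOT_OPEN,         /* closesocket(int) given a non-open socket */
    ERR_OPEN_SOCKET_LOST         /* reference to an open socket lost */
} check_result;

/* close(int) accepts only variables in the "socketna" state. */
check_result check_close(sock_state s)
{
    return (s == SOCK_NA) ? CHECK_OK : ERR_SOCKET_PASSED_TO_CLOSE;
}

/* closesocket(int) requires "socketopen" and leaves "socketclosed". */
check_result check_closesocket(sock_state *s)
{
    assert(s != NULL);
    if (*s != SOCK_OPEN)
        return ERR_SOCKET_NOT_OPEN;
    *s = SOCK_CLOSED;
    return CHECK_OK;
}

/* Losing the last reference to an open socket is an error. */
check_result check_lost_reference(sock_state s)
{
    return (s == SOCK_OPEN) ? ERR_OPEN_SOCKET_LOST : CHECK_OK;
}
```

The static checker applies the same rules to the program text, so the errors are reported without the program ever being run.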
This implementation provides two key benefits for programmers. Firstly, it alerts them if they attempt to close a socket incorrectly. Secondly, it prevents the use of closed sockets in the closesocket(int) function and ensures that only open sockets can be used in read, write, and other operations. Consequently, this approach not only addresses the porting issue but also introduces additional checks that enhance overall reliability.
Global State
The final implementation method involved assigning a global state, which functions similarly to variable state but applies to the entire system rather than just local variables. This approach ensures that the appropriate initialization procedures are executed before any specific functions or operations are performed.
On UNIX systems, C programs can open sockets directly using the socket() function, while WIN32 systems require an additional step: calling WSAStartup() to initialize the sockets before using the socket() function. Failing to call WSAStartup() can lead to unpredictable socket behavior, as the compiler will not flag this oversight, and Microsoft does not provide guarantees for socket functionality in such cases. Therefore, when porting C programs from UNIX to WIN32, it is crucial to address this initialization step to avoid introducing bugs into the program.
To ensure proper operation of socket functions, we establish a global state that prevents their invocation prior to initializing the system. This state, referred to as "sockets_uninitialized," remains active until the initialization function is executed. If any socket function is attempted during this uninitialized state, an error will be triggered, alerting the programmer of the incorrect state. Once the initialization function is successfully called, socket functions will function as expected.
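The initialization requirement can also be illustrated at the source level. In the sketch below, the wrapper names and the flag g_sockets_initialized are hypothetical, and requesting Winsock version 2.2 is an assumption of this example; WSAStartup() itself is only compiled on WIN32, so the same file builds on UNIX.

```c
#include <assert.h>

#ifdef _WIN32
#include <winsock2.h>
#endif

/* Run-time counterpart of the "sockets_uninitialized" global state. */
static int g_sockets_initialized = 0;

/* Hypothetical wrapper: perform whatever start-up the platform needs.
 * Returns 0 on success and -1 on failure. */
int port_sockets_init(void)
{
#ifdef _WIN32
    WSADATA wsa;
    if (WSAStartup(MAKEWORD(2, 2), &wsa) != 0)
        return -1;
#endif
    g_sockets_initialized = 1;  /* nothing to do on UNIX */
    return 0;
}

/* Report whether initialization has been performed. */
int port_sockets_ready(void)
{
    return g_sockets_initialized;
}

/* Guard to call before any socket function in debug builds. */
void port_require_sockets(void)
{
    assert(g_sockets_initialized && "call port_sockets_init() first");
}
```

The global state check in LCLint reports the missing call statically, whereas a guard like port_require_sockets() only fails when the offending path is actually executed.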
Self-tests and System Limitations
Throughout the implementation stages, each design underwent continuous testing and redesign to ensure proper functionality. Initial tests focused on evaluating the effectiveness of "warn on use" checks, specifically using LCLint on a small program that featured four different references to the function call select().
//calls select() function to test for warn on use
//issues.
return select(read_fd, write_fd, NULL, NULL);
Figure 4. Test1.c. A function from a test program written to test the warn on use implementation of the function select.
LCLint effectively identifies actual function calls, as demonstrated by its warning on only one instance of select() in the program, despite grep detecting all four occurrences. The other references were merely comments or irrelevant text, highlighting LCLint's capability to focus on meaningful code. This underscores the tool's utility in enforcing warnings on function usage, particularly in complex programming environments.
The individual and collective testing of variable state implementations was conducted to ensure their effectiveness in identifying specific errors. The results are illustrated in Figure 5, which shows a test program used with the "socketstate" implementation depicted in Figures 2 and 3. The comments show when errors should be raised. These self-tests showed two important areas of concern with the implementation.
The provided code snippet demonstrates various socket operations in a Winsock environment. It initializes multiple integer variables for socket management but encounters several errors during the execution of socket functions. Notably, the closesocket(a) call fails because socket 'a' is not open, while the accept call successfully opens a new socket 'b'. The select function is executed without issues, but the subsequent send call raises a warning due to a null pointer. Closing socket 'b' results in an error, indicating it is still open, and attempting to close socket 'd' also fails for the same reason. A new socket 'c' is created successfully and closed properly afterward. The program concludes with a return statement, but it loses reference to the open socket 'b', highlighting the importance of proper socket management in network programming.
Figure 5. Sock.c. The C test file used to test the “socketstate” implementation discussed earlier. Notice the comments report when errors should occur.
One major concern is LCLint's treatment of parameter passing. Because LCLint performs static checks, it does not follow the program's actual flow of execution. It analyzes each function and procedure in isolation, without regard to the sequence of function calls. Consequently, when a variable is given a state in one procedure and then passed to another function, LCLint does not know the variable's state inside the called function, which undermines the variable state implementation.
This problem can produce unwanted errors or cause real errors to be missed. It is easily resolved by adding annotations to the functions within the program, so that each variable keeps its correct state as it is passed between functions. These annotations are similar to the ones used in the WIN32 library.
Post Port Testing (Phase 3)
Application Selection
The application chosen for testing was a ping program written in C and ported to Windows NT, selected for its simplicity and accessibility. With only a few hundred lines of code spread across two files, its size made errors easier to identify. In addition, the program's extensive use of sockets made it an ideal candidate, because it was more likely to contain socket-related issues than an application that does not use sockets.
Testing the ping application produced nearly fifty errors reported by LCLint, of which only seven pertained to porting. Most errors came from LCLint's standard checks, such as lost return values, incorrect assignments, and improper NULL handling in functions. Of the seven porting errors, only two were relevant to the ping application, while the other five highlighted issues within the checking system. Although five errors may appear significant, they reflect the methodology employed to address the problem rather than being merely unwanted errors.
Issues Discovered
The first problem identified in the ping application involved improper use of the abstract type SOCKET. While preparing the WIN32 library for LCLint, various Windows types were annotated to restrict their use as their base type.
SOCKET types are essentially integer types designated for socket descriptors, which makes their intended use clear. Although a SOCKET can take part in arithmetic operations like any integer, it should be used only in socket-related functions. The WIN32 library annotates this distinction, allowing LCLint to verify proper usage in socket operations.
The SOCKET type itself is declared correctly. The flaw in the ping application is that a SOCKET is treated as an integer by being assigned an arbitrary integer value. This practice is problematic because socket functions are designed to operate only on valid socket descriptors, not on arbitrary integers.
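The idea of restricting a type to its intended use can also be approximated in plain C, without a static checker. The following is a minimal sketch, not the annotation used in the thesis: it wraps the descriptor in a struct so that the compiler rejects a bare integer where a socket is expected. The names sock_t, sock_from_fd, sock_is_valid, and sock_raw are hypothetical.

```c
#include <stdbool.h>

/* Hypothetical wrapper type: because the descriptor lives inside a struct,
   assigning or passing a plain int where a sock_t is expected is a
   compile-time error. */
typedef struct {
    int fd;
} sock_t;

#define SOCK_INVALID_FD (-1)

/* The only sanctioned way to turn a raw descriptor into a sock_t. */
static sock_t sock_from_fd(int fd)
{
    sock_t s = { fd };
    return s;
}

/* True when the wrapper holds something other than the invalid marker. */
static bool sock_is_valid(sock_t s)
{
    return s.fd != SOCK_INVALID_FD;
}

/* Unwraps the descriptor at the point where a socket function needs it. */
static int sock_raw(sock_t s)
{
    return s.fd;
}
```

With this wrapper, a statement such as `sock_t s = 5;` no longer compiles, which is the class of mistake described above. The cost is an explicit conversion at each call into the socket library.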
LCLint also identified a critical issue in the application: a lost reference to an open socket. A socket was opened, but the program could exit without closing it. Although the ping application did include a call to closesocket(), closer investigation revealed that the application could encounter an error and terminate before reaching that call, leaking the open socket.
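One common way to avoid this kind of leak is to route every exit from a function through a single cleanup point. The sketch below illustrates the pattern under stated assumptions: fake_open and fake_close are stand-ins for socket() and closesocket(), and the open_handles counter exists only so the behavior can be observed. It is not the code of the ping application.

```c
/* Stand-ins for socket() and closesocket(); the counter tracks how many
   handles are currently open. All names here are hypothetical. */
static int open_handles = 0;

static int fake_open(void)
{
    open_handles++;
    return 42;
}

static void fake_close(int handle)
{
    (void)handle;
    open_handles--;
}

/* Returns 0 on success and -1 on failure. The handle is closed on both
   paths because the error branch jumps to the cleanup label instead of
   returning directly. */
static int run_once(int fail_midway)
{
    int status = -1;
    int handle = fake_open();

    if (fail_midway) {
        goto cleanup;   /* a plain return here would leak the handle */
    }
    status = 0;

cleanup:
    fake_close(handle);
    return status;
}
```

After either `run_once(0)` or `run_once(1)`, the counter is back at zero, so no handle outlives the function.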
Remaining Issues
The remaining errors that LCLint reported for the ping application exposed flaws in the checking approach rather than bugs in the application. LCLint reported that the program exited without closing an open socket, and it flagged illegal transfers between states, indicating attempts to use variables as open sockets without proper initialization. Further inspection revealed that these reports stemmed from parameter passing: LCLint's static analysis could not determine the state of variables on entry to a function, so it assigned them the default state, which produced the reported illegal transitions. Specifically, a socket was opened early in the application and passed to a function for data transmission, but inside that function LCLint treated the variable as being in the default state rather than open. The same misclassification led to the additional reports concerning the open socket when the function exited.
The parameter passing problem produced four unwanted errors in the ping application, but it was easily resolved by adding annotations to the function declarations. Specifically, an annotation was added to the send function stating that only sockets in the open state may be passed to it. This allowed LCLint to recognize that the socket is open, preventing the illegal transfer errors. It also means that LCLint will now raise an error if the function is called with a socket that is not open, ensuring proper usage throughout the program. With these annotations, which are similar to those added to the WIN32 library, the unwanted errors in the ping application were eliminated.
LCLint also raised a warning about the use of the select function in the ping application, pointing to possible timing and socket error issues. These concerns did not apply to this particular application, so I suppressed the warning and continued with my checks.
The ping application proved to be an invaluable tool for verifying LCLint with porting annotations. It demonstrated that LCLint can identify issues within a real application, and it showed the importance of annotating source code so that LCLint can accurately interpret the programmer's intent.
Summary and Significance
System Usefulness
The three phases demonstrate how the modified LCLint serves as an effective solution for addressing porting issues. Phase one identified the problems encountered when moving C code from UNIX to WIN32 systems. In phase two, a system was designed and implemented to identify these bugs efficiently. Finally, phase three confirmed that the enhanced LCLint detects software bugs in a real application, showing its utility in the porting process.
The LCLint system offers benefits at two points: before and after porting. Before porting, LCLint can analyze the code to identify potential issues, enabling programmers to devise solutions in advance. If buying third-party software would cost less than the time required to redesign the code, LCLint's findings can also inform the decision to choose a third-party solution instead of pursuing the port. After the code has been ported, LCLint aids system testing by identifying bugs that testing might overlook. The software development community recognizes static checking as a cost-effective method of error detection because it reveals the underlying errors rather than just their presence. Consequently, using LCLint can streamline system testing, reducing costs and allowing companies to enter new markets more quickly with reliable products.
System Effects
LCLint, with its modifications, not only identifies porting issues but also affects several parts of society. Its impact is felt locally at the University of Virginia, within the business sector, and across the broader community.
The project marks the first application of Professor Evans’s LCLint with its newly developed user-defined annotations, which allow LCLint to check for specific issues in ported C code. The ease of adding these annotations, despite minimal documentation, gives Professor Evans evidence for evaluating the success of this feature. If he judges it beneficial, he may release updated documentation and announce the new feature. Given LCLint's widespread use, the results could reach thousands of users.
My thesis could significantly affect software companies by streamlining the porting process from UNIX to WIN32 environments. Typically, extensive testing is required after porting to ensure functionality, which is time-consuming and labor-intensive. With a static checker designed to identify specific porting issues, the time spent on testing can be greatly reduced. This efficiency lowers labor costs and shortens product turnaround, ultimately leading to increased revenue for companies.
LCLint with porting annotations could also influence the corporate landscape, particularly for third-party providers. Companies may find it more cost-effective to use LCLint than to rely on manual ports or third-party solutions. If many organizations opt for LCLint, third-party providers could see decreased revenue, potentially resulting in layoffs or restructuring within those companies. Conversely, if LCLint enhances confidence in system functionality and reduces porting costs, it may deter companies from using third-party vendors altogether. Ultimately, the impact will vary with the specific source code of each company.
This project also bears on human life. By focusing on static checking and software porting, it contributes to the development of fault-tolerant software. Static checking is a vital tool for programmers to verify and test software functionality. By improving the reliability of software systems, this project has the potential to save human lives.
Computers are increasingly integral to our daily lives, controlling everything from aircraft to medical devices. As our reliance on these systems grows, so does the risk that software bugs or failures will cause serious harm. A notable case is the Therac-25, a radiation therapy device whose software errors led to massive radiation overdoses, resulting in patient deaths and serious injuries.
The first phase of the project showed that bugs often arise during the porting process. The later phases demonstrated that LCLint, used with annotations for porting, identifies these bugs. Applying this system to software that affects human lives could uncover critical bugs before they lead to casualties. Ultimately, this approach aims to improve the reliability of software in life-critical devices, so that tragedies like the Therac-25 accidents are avoided in the future.
Recommendations
This project holds significant potential as a foundation for future research. Two primary areas warrant further investigation.
First, future research should add further constraints to extend LCLint’s capabilities. Identifying more porting bugs, or implementing checks for those discovered in the first phase of the project, would improve LCLint's effectiveness. Although the system should be judged by the quality of the bugs it detects rather than their quantity, a greater ability to identify issues will make LCLint more valuable to companies and to society overall.
Second, future research should extend static checking for porting issues to additional programming languages. C was selected because of its extensive documentation on UNIX systems, but other languages could also benefit from a static checker designed to identify porting issues. This work would require identifying new constraints specific to each language and using a different static checker, since LCLint is limited to C code.
Chou, Andy, and Dawson Engler. Metal: a language and system for building lightweight, system-specific software checkers, analyzers, and optimizers. Available upon request: acc@cs.stanford.edu, 2000.
Deloitte & Touche Consultant Group. Deploying Windows NT in Technical Workstation Environments. Online. 1997. Available: http://www.microsoft.com/ntworkstation/technical/WhitePapers/MigrateUnix.asp.
Digital Equipment Corporation. Digital UNIX and Windows NT Interoperability Guide. USA, 1996. Digital Equipment Corporation. Available: http://wint.decsy.ru/du/Digital/Unixnt/index.htm
Evans, David. “Annotation-Assisted Lightweight Static Checking.” Position Paper for The First International Workshop on Automated Program Analysis, Testing, and Verification. Available: http://lclint.cs.virginia.edu/icse-position.html
Evans, David. LCLint User’s Guide. Online. May 2000. Available: http://lclint.cs.virginia.edu/guide/.
Giguere, Eric. “Porting C Programs.” Computer Language February 1988: 75-78.
Glass, David. “Porting UNIX Applications to DOS.” Dr Dobb’s November 1991: 68-
Jalote, Pankaj. An Integrated Approach to Software Engineering. New York: Springer-
Johnston, Stuart. “UNIX to NT Hassle-free.” Information Week Online February 22, 1999. Available: http://www.informationweek.com/722/unixnt.htm
Leveson, Nancy, and Clark Turner. “An Investigation of the Therac-25 Accidents.”
Niezgoda, Steve. “Charting the Uncharted.” Byte October 1994: 203-204.
Schubert, Brenden, Third Year attending University of Virginia, Charlottesville, VA
Silberschatz, Avi, and Peter Galvin. Operating System Concepts. New York: John
Wagner, Bill. The Complete Idiot’s Guide to UNIX. Indianapolis: Que Corporation,
Phase 1 of this thesis revealed several issues, each accompanied by a detailed description of the problem, the relevant files, and, when applicable, the proposed solutions to address these challenges.
In UNIX systems, sockets are closed with the close() function. In WIN32 systems, the closesocket() function is required to close a socket; the close() function is also available but serves a different purpose.
To address the issue, a variable state named socketstate was established. Whenever an integer or socket type was declared, it was assigned a state of dcstate. This allowed the checker to track the state of the variable once it received a value from a function that returns an open socket, such as socket() or accept().
When the WSASocket() function is called, the state transitions to socketopen. If a variable in the socketopen state is passed to the close() function, an error is raised. An error is also raised if the program exits without invoking closesocket() while a variable remains in the socketopen state.
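The transitions described for socketstate can be pictured as a small state machine. The sketch below is a runtime model written for illustration only; it is not LCLint's implementation, which performs these checks statically. The state names dcstate and socketopen come from the text, while the event names, the step function, and the choice to return to dcstate after closesocket() are assumptions.

```c
typedef enum { DCSTATE, SOCKETOPEN } socketstate;

typedef enum {
    EV_OPEN,         /* socket(), accept(), or WSASocket() returned a socket */
    EV_CLOSE,        /* variable passed to close() */
    EV_CLOSESOCKET,  /* variable passed to closesocket() */
    EV_EXIT          /* program leaves the scope of the variable */
} sock_event;

/* Applies one event to the current state and returns the new state.
   *error is set to 1 when the model would report a problem, else 0. */
static socketstate step(socketstate s, sock_event e, int *error)
{
    *error = 0;
    switch (e) {
    case EV_OPEN:
        return SOCKETOPEN;
    case EV_CLOSE:
        /* close() on an open socket is the porting bug being checked */
        if (s == SOCKETOPEN) *error = 1;
        return s;
    case EV_CLOSESOCKET:
        /* closing something that was never opened is also reported */
        if (s != SOCKETOPEN) *error = 1;
        return DCSTATE;
    case EV_EXIT:
        /* exiting while still open loses the reference to the socket */
        if (s == SOCKETOPEN) *error = 1;
        return s;
    }
    return s;
}
```

Feeding the model the sequence open, close() yields an error on the second event, and open, closesocket(), exit yields none, which matches the two error conditions listed above.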
Issue 2: Standard C APIs vs WIN32 APIs for Sockets
WIN32 systems offer their own functions for creating and using sockets, alongside the standard C functions found in UNIX systems. It is possible to create a socket with one set of functions and operate on it with functions from the other, but Microsoft does not guarantee correct behavior in such cases. Consistent use of one set of functions is therefore necessary for reliable operation.
To address the issue, a variable state named apistate was introduced. Whenever an integer, handle, or socket type was declared, it was assigned a state of dcstate. When any of the socket functions was invoked, the state switched to unixapi or winapi depending on the kind of function used. If the variable was subsequently passed to a function of the opposing kind, an error was raised.
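The apistate rule can be modeled in the same spirit. This is a small illustrative sketch, not the thesis's annotation files: the first call fixes the API family for a variable, and any later call from the other family is flagged. The enumerator and function names are hypothetical, and the model assumes the second argument is always API_UNIX or API_WIN.

```c
typedef enum { API_DC, API_UNIX, API_WIN } apistate;

/* Records that a function from the family `called` (API_UNIX or API_WIN)
   was applied to a variable whose current state is `cur`. Returns the new
   state and sets *error to 1 when the two families are mixed. */
static apistate api_use(apistate cur, apistate called, int *error)
{
    *error = (cur != API_DC && cur != called);
    return (cur == API_DC) ? called : cur;
}
```

A variable first used with a WIN32 function moves from API_DC to API_WIN; passing it to a standard C function afterwards sets the error flag while leaving the recorded state unchanged.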
Issue 3: Standard C APIs vs WIN32 APIs for Files
WIN32 systems offer their own file handling functions alongside the standard C functions found in UNIX systems. A file can be created or opened with one set of functions and then read or written with the other, but Microsoft does not guarantee reliable behavior when the two are mixed. Consistent use of one set of file functions is therefore necessary.
Files Involved: winsock2.h, io.h, stdio.h, winbase.h
Solution: Solved by creating a variable state called apistate, in conjunction with issue 2.
When an integer, handle, or socket type is declared, it is assigned the state dcstate. This state changes to either unixapi or winapi depending on the file functions invoked. If the variable is subsequently passed to a function of the opposite kind, an error is raised.
In UNIX, both files and sockets can be monitored to determine if they are ready for reading or writing, while on WIN32 systems, polling files is not possible and yields unpredictable results.
Files Involved: winsock2.h, io.h, stdio.h, stat.h, types.h
Solution: Solved by creating a variable state called
Issue 5: Error Checking of Sockets
In UNIX systems, socket errors or hang-ups are reported by placing the socket file descriptor in the error set following a call to the poll() or select() function. On WIN32 systems, by contrast, such a condition is indicated by signaling that the socket is ready to read, after which the read returns zero bytes. Consequently, relying on the error set of select to detect hung-up sockets is not advisable on WIN32 systems.
A warning annotation has been implemented for the select function to alert programmers to this difference. The warning lets developers make the necessary adjustments before they choose to suppress it.
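The adjustment that this warning asks for usually amounts to checking the result of the read that follows select. The sketch below shows one way to interpret that result; it assumes a recv()-style call that returns the number of bytes read, zero when the peer has closed the connection, and a negative value on error. The type and function names are hypothetical.

```c
typedef enum { PEER_DATA, PEER_CLOSED, PEER_ERROR } peer_status;

/* Classifies the return value of a recv()-style call made after select()
   reported the socket as readable. */
static peer_status classify_recv(long n)
{
    if (n > 0) {
        return PEER_DATA;    /* n bytes were read */
    }
    if (n == 0) {
        return PEER_CLOSED;  /* readable but empty: the peer hung up */
    }
    return PEER_ERROR;       /* negative: consult the platform error code */
}
```

Code that previously looked only at the error set can call a helper like this after each read and treat PEER_CLOSED as the hang-up signal.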
Description: In UNIX, sockets can be created without any initialization, but on WIN32 systems the socket library must first be initialized by a call to the function WSAStartup().
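A common way to handle this difference in ported code is to hide the start-up call behind a small helper that does nothing on UNIX. The following is a minimal sketch under that assumption; net_init and net_shutdown are hypothetical names, and the requested Winsock version 2.2 is a choice made for the example.

```c
#ifdef _WIN32
#include <winsock2.h>
#endif

/* Hypothetical helper: prepares the socket layer. Returns 0 on success. */
static int net_init(void)
{
#ifdef _WIN32
    WSADATA data;
    /* Request Winsock 2.2; WSAStartup returns 0 on success. */
    return WSAStartup(MAKEWORD(2, 2), &data);
#else
    return 0;   /* UNIX sockets need no start-up call */
#endif
}

/* Hypothetical helper: releases the socket layer at shutdown. */
static void net_shutdown(void)
{
#ifdef _WIN32
    WSACleanup();
#endif
}
```

The program calls net_init() once before creating its first socket and net_shutdown() once before exiting. On WIN32 the program must also be linked against the Winsock library.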
Description: In UNIX, the poll function times out after the supplied time in milliseconds. On WIN32 systems the select function times out after the supplied time in seconds and microseconds. It is easy to confuse the units and make the timeout too long.
A warning annotation has been implemented for the select function to alert programmers to this difference whenever it is used. The warning lets developers make the necessary adjustments before suppressing it.
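The unit conversion involved is short but easy to get wrong. The sketch below converts a poll-style millisecond timeout into the seconds and microseconds pair that select expects in its struct timeval argument. To keep the example free of platform headers it uses a local struct, tv_parts, that mirrors those two fields; that struct and the function name are hypothetical, and the input is assumed to be non-negative.

```c
/* Mirrors the two fields of struct timeval; defined locally so the sketch
   compiles without platform headers. */
struct tv_parts {
    long sec;    /* whole seconds */
    long usec;   /* remaining microseconds, 0 to 999999 */
};

/* Converts a non-negative timeout in milliseconds into seconds plus
   microseconds. */
static struct tv_parts ms_to_tv(long timeout_ms)
{
    struct tv_parts tv;
    tv.sec  = timeout_ms / 1000;
    tv.usec = (timeout_ms % 1000) * 1000;
    return tv;
}
```

For example, a poll timeout of 2500 milliseconds becomes 2 seconds and 500000 microseconds. Copying the 2500 directly into the seconds field would instead wait more than forty minutes.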
Description: Only 1024 characters may be passed to the function CreateProcess() on WIN32 systems. Its counterpart on UNIX, fork(), has no such restriction.