involved in the cycle of writing, assembling, and testing an assembly-language program. The cycle itself sounds more complex than it is. I've drawn you a map to help you keep your bearings during the discussions in this chapter. Figure 3.5 shows the assembly-language development process in a "view from a height." At first glance it may look like a map of the L.A. freeway system, but in reality the flow is fairly straightforward. Follow along on a quick tour.

Assembling the Source-Code File

You use the text editor first to create a new text file and then to edit that same text file as you perfect your assembly language program. As a convention, most assembly language source code files are given a file extension of .ASM. In other words, for the program named FOO, the assembly language source code file would be named FOO.ASM.

It is possible to use file extensions other than .ASM, but I feel that using the .ASM extension can eliminate some confusion by allowing you to tell at a glance what a file is for, just by looking at its name. All told, about nine different kinds of files can be involved during assembly language development. (We're only going to speak of four or five in this book.) Each type of file will have its own standard file extension. Anything that will help you keep all that complexity in line will be worth the (admittedly) rigid confines of a standard naming convention.

As you can see from the flow in Figure 3.5, the editor produces a source code text file, which we show as having the .ASM extension. This file is then passed to the assembler program itself, for translation to a relocatable object module file with an extension of .OBJ.

Invoking the assembler is very simple. For small standalone assembly-language programs in Turbo Assembler, it's nothing more than the name of the assembler followed by the name of the program to be assembled (for example, C:\ASM>TASM FOO).
For Microsoft's MASM, you need to put a semicolon on the end of the command. This tells MASM that no further prompts are necessary (for example, C:\ASM>MASM FOO;). If you omit the semicolon, nothing bad will happen, but MASM will ask you for the names of several other files, and you will have to press Enter several times to select the defaults.

DOS will load the assembler from disk and run it. The assembler will open the source code file you named on the command line and begin processing it. Almost immediately afterward, it will create an object file with the same name as the source file, but with the .OBJ extension.

As the assembler reads lines from the source code file, it will examine them, construct the binary machine instructions the source code lines represent, and then write those machine instructions to the object code file. When the assembler detects the EOF marker signaling the end of the source code file, it will close both the source code file and the object code file and return control to DOS.

Assembler Errors

The previous three paragraphs describe what happens if the .ASM file is correct. By correct, I mean the file is completely comprehensible to the assembler, and can be translated into machine instructions without the assembler getting confused. If the assembler encounters something it doesn't understand when it reads a line from the source code file, we call the misunderstood text an error, and the assembler displays an error message.
For example, the following line of assembly language will confuse the assembler and summon an error message:

    MOV AX,VX

The reason is simple: there's no such thing as a "VX." What came out as "VX" was actually intended to be "BX," which is the name of a register. (The V key is right next to the B key and can be struck by mistake without your fingers necessarily knowing that they done wrong.)

Typos are by far the easiest kind of error to spot. Others that take some study to find usually involve transgressions of the assembler's rules. Take for example the line:

    MOV ES,0FF00H

This looks like it should be correct, since ES is a real register and 0FF00H is a real 16-bit quantity that will fit into ES. However, among the multitude of rules in the fine print of the 86-family of assemblers is that you cannot move an immediate value (any number like 0FF00H) directly into a segment register like ES, DS, SS, or CS. It simply isn't part of the CPU's machinery to do that. Instead, you must first move the immediate value into a register like AX, and then move AX into ES.

You don't have to remember the details here; we'll go into the rules later on. For now, simply understand that some things that look reasonable are simply "against the rules" and are considered an error.

There are much, much more difficult errors that involve inconsistencies between two otherwise legitimate lines of source code. I won't offer any examples here, but I wanted to point out that errors can be truly ugly, hidden things that can take a lot of study and torn hair to find. Toto, we are definitely not in BASIC anymore.

The error messages vary from assembler to assembler, but they may not always be as helpful as you might hope.
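
The two-step load described above can be sketched in a few lines. This is only an illustrative fragment, not a complete program: it assumes AX is free to be overwritten at that point, and it reuses the value 0FF00H from the example.

```asm
; Illustrative fragment only: assumes AX may be overwritten here.

        MOV AX,BX        ; what the "VX" line was meant to say: BX is a real register

;       MOV ES,0FF00H    ; rejected by the assembler: immediate into a segment register

        MOV AX,0FF00H    ; step 1: put the immediate value in a general-purpose register
        MOV ES,AX        ; step 2: copy that register into the segment register
```

The same two-step pattern is the usual way to load DS as well; the choice of AX as the go-between is a convention, and another general-purpose register would do.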
The error TASM displays upon encountering the VX typo follows:

    Turbo Assembler Version 1.0 Copyright (c) 1988 by Borland International
    Assembling file: FOO.ASM
    **Error** FOO.ASM(74) Undefined symbol: VX
    Error messages: 1
    Warning messages: None
    Remaining memory: 395k

This is pretty plain, assuming you know what a "symbol" is. The error message TASM will present when you try to load an immediate value into ES is less helpful:

    Turbo Assembler Version 1.0 Copyright (c) 1988 by Borland International
    Assembling file: IBYTECPY.ASM
    **Error** IBYTECPY.ASM(74) Illegal use of segment register
    Error messages: 1
    Warning messages: None
    Remaining memory: 395k

It'll let you know you're guilty of performing illegal acts with a segment register, but that's it. You have to know what's legal and what's illegal to really understand what you did wrong. As in running a stop sign, ignorance of the law is no excuse.

Assembler error messages do not absolve you from understanding the CPU's or the assembler's rules. I hope I don't frighten you too terribly by warning you that for more complex errors, the error messages may be almost no help at all.

You may make (or will make; let's get real) more than one error in writing your source code files. The assembler will display more than one error message in such cases, but it may not necessarily display an error for every error present in the source code file. At some point, multiple errors confuse the assembler so thoroughly that it cannot necessarily tell right from wrong anymore. While it's true that the assembler reads and translates source code files line by line, there is a cumulative picture of the final assembly language program that is built up over the course of the whole assembly process. If this picture is shot too full of errors, in time the whole picture collapses.

The assembler will stop and return to DOS, having printed numerous error messages. Start at the first one and keep going.
If the following errors don't make sense, fix the first one or two and assemble again.

Back to the Editor

The way to fix errors is to load the .ASM file back into your text editor and start hunting up the error. This "loopback" is shown in Figure 3.5.

The error message will almost always contain a line number. Move the cursor to that line number and start looking for the false and the fanciful. If you find the error immediately, fix it and start looking for the next.

Here's a little logistical snag: how do you get a list of the error messages on paper so that you don't have to memorize them or scribble them down with a pencil? You may or may not be aware that you can redirect the assembler's error message displays to a DOS text file on disk.

It works like this: you invoke the assembler just as you normally would, but add the redirection operator > and the name of the text file to which you want the error messages sent. If you were assembling FOO.ASM with TASM and wanted your error messages written out to a disk file named ERRORS.TXT, you would invoke TASM by entering C:\ASM>TASM FOO > ERRORS.TXT.

Here, error messages will be sent to ERRORS.TXT in the current DOS directory C:\ASM. When you use redirection, the output does not display on the screen. The stream of text from TASM that you would ordinarily see is quite literally steered in its entirety to another place, the file ERRORS.TXT. Once the assembly process is done, the DOS prompt will appear again. You can then print the ERRORS.TXT file on your printer and have a handy summary of all that the assembler discovered was wrong with your source code file.

Assembler Warnings

As taciturn a creature as an assembler may appear to be, it genuinely tries to help you any way it can.
One way it tries to help is by displaying warning messages during the assembly process. These warning messages are a monumental puzzle to beginning assembly language programmers: are they errors or aren't they? Can I ignore them or should I fool with the source code until they go away?

There is no clean answer. Sorry about that.

Warnings are the assembler acting as experienced consultant, and hinting that something in your source code is a little dicey. Now, in the nature of assembly language, you may fully intend that the source code be dicey. In an 86-family CPU, dicey code may be the only way to do something fast enough or just to do it at all. The critical factor is that you had better know what you're doing.

The most common generator of warning messages is doing something that goes against the assembler's default conditions and assumptions. If you're a beginner doing ordinary, 100%-by-the-book sorts of things, you should crack your assembler reference manual and figure out why the assembler is tut-tutting you. Ignoring a warning may cause peculiar bugs to occur later on during program testing. Or, ignoring a warning message may have no undesirable consequences at all. I feel, however, that it's always better to know what's going on. Follow this rule:

Ignore a warning message only if you know exactly what it means.

In other words, until you understand why you're getting a warning message, treat it as though it were an error message. Only when you fully understand why it's there and what it means should you try to make the decision whether or not to ignore the message.

In summary, the first part of the assembly language development process (as shown in Figure 3.5) is a loop.
You must edit your source code file, assemble it, and return to the editor to fix errors until the assembler spots no further errors. You cannot continue until the assembler gives your source code file a clean bill of health.

When no further errors are found, the assembler will write an .OBJ file to disk, and you will be ready to go on to the next step.

Linking

Theoretically, an assembler could generate an .EXE (executable) program file directly from your source code .ASM file. Some obscure assemblers have been able to do this, but it's not a common assembler feature.

What actually happens is that the assembler writes an intermediate object code file with an .OBJ extension to disk. You can't run this .OBJ file, even though it contains all the machine instructions that your assembly language source code file specified. The .OBJ file needs to be processed by another translator program, the linker.

The linker performs a number of operations on the .OBJ file, most of which would be meaningless to you at this point. The most obvious task the linker does is to weave several .OBJ files into a single .EXE program file. Creating an assembly language program from multiple .ASM files is called modular assembly.

Why create multiple .OBJ files when writing a single executable program? One of two major reasons is size. A middling assembly-language application might be 50,000 lines long. Cutting that single monolithic .ASM file up into multiple 8,000-line .ASM files would make the individual .ASM files smaller and much easier to understand. The other reason is to avoid assembling completed portions of the program every time any part of the program is assembled.
One thing you'll be doing is writing assembly language procedures, small detours from the main run of steps and tests that can be taken from anywhere within the assembly language program. Once you write and perfect a procedure, you can tuck it away in an .ASM file with other completed procedures, assemble it, and then simply link the resulting .OBJ file into the "working" .ASM file. The alternative is to waste time by reassembling perfected source code over and over again every time you assemble the main portion of the program.

Notice that in the upper-right corner of Figure 3.5 is a row of .OBJ files. These .OBJ files were assembled earlier from correct .ASM files, yielding binary disk files containing ready-to-go machine instructions. When the linker links the .OBJ file produced from your in-progress .ASM file, it adds in the previously assembled .OBJ files, which are called modules. The single .EXE file that the linker writes to disk contains the machine instructions from all of the .OBJ files handed to the linker when the linker is invoked.

Once the in-progress .ASM file is completed and made correct, its .OBJ module can be put up on the rack with the others, and added to the next in-progress .ASM source code file. Little by little you construct your application program out of the modules you build one at a time.

A very important bonus is that some of the procedures in an .OBJ module may be used in a future assembly language program that hasn't even been begun yet. Creating such libraries of toolkit procedures can be an extraordinarily effective way to save time by reusing code over and over again, without even passing it through the assembler again!

Something to keep in mind is that the linker must be used even when you have only one .OBJ file. Connecting multiple modules is only one of many essential things the linker does.
To produce an .EXE file, you must invoke the linker, even if your program is a little thing contained in only one .ASM and hence one .OBJ file.

Invoking the linker is again done from the DOS command line. Each assembler typically has its own linker. MASM's linker is called LINK, and TASM's is called TLINK. Like the assembler, the linker understands a suite of commands and directives that I can't describe exhaustively here. Read your assembler manuals carefully.

For single-module programs, however, there's nothing complex to be done. Linking our hypothetical FOO.OBJ object file into an .EXE file using TLINK is done by entering C:\ASM>TLINK FOO at the DOS prompt.

If you're using MASM, using LINK is done much the same way. Again, as with MASM, you need to place a semicolon at the end of the command to avoid a series of questions about various linker defaults (for example, C:\ASM>LINK FOO;).

Linking multiple files involves naming each file on the command line. With TLINK, you simply name each .OBJ file on the command line after the word TLINK, with a space between each filename. You do not have to include the .OBJ extension; TLINK assumes that all modules to be linked end in .OBJ:

    C:\ASM>TLINK FOO BAR BAS

With MASM's LINK, you do the same thing, except that you place a plus sign (+) between each of the .OBJ filenames:

    C:\ASM>LINK FOO+BAR+BAS

In both cases, the name of the .EXE file produced will be the name of the first .OBJ file named, with the .EXE extension replacing the .OBJ extension.

Linker Errors

As with the assembler, the linker may discover problems as it weaves multiple .OBJ files together into a single .EXE file. Linker errors are subtler than assembler errors and are usually harder to find. Fortunately, they are rarer and not as easy to make.
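
To make modular assembly a little more concrete, here is a minimal two-module sketch. It borrows the names FOO and BAR from the command lines above, but everything else is an assumption made for illustration: the procedure name ShowMsg, the message text, and the use of the simplified segment directives (.MODEL, .DATA, .CODE), which your assembler version may or may not support. It is not the book's own listing.

```asm
; BAR.ASM -- hypothetical "toolkit" module holding one finished procedure
        .MODEL small
        .CODE
        PUBLIC ShowMsg          ; make the name visible to the linker
ShowMsg PROC
        MOV AH,09H              ; DOS service 09H: display '$'-terminated string at DS:DX
        INT 21H
        RET
ShowMsg ENDP
        END
```

```asm
; FOO.ASM -- hypothetical in-progress main module that calls into BAR
        .MODEL small
        .STACK 100H
        .DATA
Msg     DB 'Hello from FOO',13,10,'$'
        .CODE
        EXTRN ShowMsg:NEAR      ; defined in another module; the linker fills in the address
Start:  MOV AX,@data            ; point DS at this program's data
        MOV DS,AX
        MOV DX,OFFSET Msg       ; DS:DX -> message text
        CALL ShowMsg
        MOV AX,4C00H            ; DOS service 4CH: terminate with return code 0
        INT 21H
        END Start
```

Each file is assembled on its own, and the two .OBJ files are then named together on the linker command line (TLINK FOO BAR, or LINK FOO+BAR;). If BAR is left off that command line, FOO.OBJ still assembles cleanly, but the linker has no definition for ShowMsg to connect the call to and should report an error. That is one ordinary way a linker error arises from source code the assembler accepted without complaint.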
As with assembler errors, when you are presented with a linker error you have to return to the editor and figure out what the problem is. Once you've identified the problem (or think you have) and changed something in the source code file to fix the problem, you must reassemble and relink the program to see if the linker error went away. Until it does, you have to loop back to the editor, try something else, and assemble/link once more.

If possible, avoid doing this by trial and error. Read your assembler and linker manuals. Understand what you're doing. The more you understand about what's going on within the assembler and the linker, the easier it will be to determine who or what is giving the linker fits.

Testing the .EXE File

If you receive no linker errors, the linker will create and fill a single .EXE file with the machine instructions present in all of the .OBJ files named on the linker command line. The .EXE file is your executable program. You can run it by simply naming it on the DOS command line and pressing Enter:

    C:\ASM>FOO

When you invoke your program in this way, one of two things will happen: the program will work as you intended it to, or you'll be confronted with the effects of one or more program bugs. A bug is anything in a program that doesn't work the way you want it to. This makes a bug somewhat more subjective than an error. One person might think red characters displayed on a blue background is a bug, while another might consider it a clever New Age feature and be quite pleased. Settling bug vs. feature conflicts like this is up to you. Consensus is called for here, with fistfights only as a last resort.

There are bugs and there are bugs.
When working in assembly language, it's quite common for a bug to completely "blow the machine away," which is less violent than some think. A system crash is what you call it when the machine sits there mutely, and will not respond to the keyboard. You may have to press Ctrl+Alt+Delete to reboot the system, or (worse) have to press the reset button, or even power down and then power up again. Be ready for this: it will happen to you, sooner and oftener than you will care for.

Figure 3.5 announces the exit of the assembly language development process as happening when your program works perfectly. A very serious question is this: How do you know when it works perfectly? Simple programs assembled while learning the language may be easy enough to test in a minute or two. But any program that accomplishes anything useful will take hours of testing at minimum. A serious and ambitious application could take weeks, or months, to test thoroughly. A program that takes various kinds of input values and produces various kinds of output should be tested with as many different combinations of input values as possible, and you should examine every possible output every time.

Even so, finding every last bug is considered by some to be an impossible ideal. Perhaps, but you should strive to come as close as possible, in as efficient a fashion as you can manage. I'll have a lot more to say about bugs and debugging throughout the rest of this book.

Errors Versus Bugs

In the interest of keeping the Babel-effect at bay, I think it's important to carefully draw the distinction between errors and bugs. An error is something wrong with your source code file that either the assembler or the linker kicks out as unacceptable.
An error prevents the assembly or link process from going to completion, and will thus prevent a final .EXE file from being produced.

A bug, by contrast, is a problem discovered during execution of a program under DOS. Bugs are not detected by either the assembler or the linker. A bug can be benign, such as a misspelled word in a screen message or a line positioned on the wrong screen row; or a bug can make your DOS session run off into the bushes and not come back.

Both errors and bugs require that you go back to the text editor and change something in your source code file. The difference here is that most errors are reported with a line number telling you where to go in your source code file to fix the problem. Bugs, on the other hand, are left as an exercise for the student. You have to hunt them down, and neither the assembler nor the linker will give you much in the line of clues.

Debuggers and Debugging

The final, and almost certainly the most painful, part of the assembly language development process is debugging. Debugging is simply the systematic process by which bugs are located and corrected. A debugger is a utility program designed specifically to help you locate and identify bugs.

Debugger programs are among the most mysterious and difficult to understand of all programs. Debuggers are part X-ray machine and part magnifying glass. A debugger loads into memory with your program and remains in memory, side by side with your program. The debugger then puts tendrils down into both DOS and your program, and enables some truly peculiar things to be done.

One of the problems with debugging computer programs is that they operate so quickly. Thousands of machine instructions can be executed in a single second, and if one of those instructions isn't quite right, it's long gone before you can identify which one it is by staring at the screen.
A debugger allows you to execute the machine instructions in a program one at a time, allowing you to pause indefinitely between each one to examine the effects of the last instruction on the screen. The debugger also lets you look at the contents of any location in memory, and the values stored in any register, during that pause between instructions.

As mentioned previously, both MASM and TASM are packaged with their own advanced debuggers. MASM's CodeView and TASM's Turbo Debugger are brutally powerful (and hellishly complicated) creatures that require manuals considerably thicker [...]

... CR, the LF, and the EOF. Use the E command to enter them, and then display a dump of the file again:

    -e 0112
    1980:0112  0D.2e  0A.0d  1A.0a  0D.1a
    -d 0100
    38E3:0100  53 61 6D 0D 0A 77 61 73-0D 0A 61 0D 0A 6D 6F 6F   Sam..was..a..moo
    38E3:0110  73 65 2E 0D 0A 1A 04 26-F7 24 5D C2 04 00 55 88   se.....&.$]...U.
    38E3:0120  EC 83 EC 12 FF 76 06 FF-76 04 9A 66 17 7D 30 89   .....v..v..f.}0.
    38E3:0130  46 FE 83 7E 10 00 75 0F-C4 76 08 26 8B 34 F7 DE   F..~..u..v.&.4..
    38E3:0140  C4 5E 0C 03 DE EB 03 C4-5E 0C 89 5E F6 8C 46 F8   .^......^..^..F.
    38E3:0150  C4 76 08 26 8B 1C C4 7E-F6 26 8D 01 8C C2 89 46   .v.&...~.&.....F
    38E3:0160  F2 ...

... starting with 0100H. To see SAM.TXT again, you need to specify the starting address of the dump, which was 0100H:

    -d 0100

Enter that command, and you'll see the following dump with the altered memory image of SAM.TXT:

    38E3:0100  53 61 6D 00 0A 77 61 73-0D 0A 61 0D 0A 6D 6F 6F
    38E3:0110  73 65 0A 1A C4 76 04 26-F7 24 5D C2 04 00 55 8B
    38E3:0120  EC 83 EC ...

... as many bytes to disk as are specified in BX and CX. This could be 20,000 bytes more than the file contains, or it could be 0 bytes, leaving you with an empty file. You can destroy a file this way. Either leave BX and CX alone while you're examining and "patching" a file with DEBUG, or write the initial values in BX and CX down, and enter them back into BX and CX just before issuing the W command. The ...

... different ways, and take plenty of explanation and considerable practice to master. In this chapter I'll introduce you to DEBUG, a program that will allow you to single-step your assembly language programs and examine their and the machine's innards between each and every machine instruction. This section is only an introduction; DEBUG is learned best by doing, and you'll be both using and learning DEBUG's ...

Learning and Using JED
A Programming Environment for Assembly Language

4.1 A Place to Stand with Access to Tools
4.2 JED's Place to Stand
4.3 Using JED's Tools
4.4 JED's Editor in Detail

4.1 A Place to Stand with Access ...

... MASM and TASM may happen in time, but you must understand that both Microsoft and Borland are catering to their most important audience, the established assembly-language programmer. That doesn't do much good for you. One glance back at Figure 3.5 can give you the screaming willies. Assembly-language development is not a simple process, and grabbing all the tools from the DOS prompt is complicated and error ...

... and TLINK, and provides a very good text editor to boot, but you can work very well from the DOS prompt using some other text editor. I'll be referencing JED as I discuss the assembly language process in this book; there are a multitude of ways to work with assembly language and I have to settle on something. But the information on assembly language itself is independent of the text editor and programming ...

... alone is pretty pointless unless you intend to run the executable program file produced by the link step. JED combines the link step with the step of actually running your new assembly language program to see what it does. Furthermore, it performs the assemble step again, ...

... JED will quite literally say "uh-uh." Try it and see! Changing the link command line is done the same way. TASM's link command line is this:

    TLINK ~

MASM's link command line, on the other hand, requires a semicolon, and for the same reasons mentioned ...