Assembly Language Program that Performs Arithmetic Operations Project
ASCII CONTROL CHARACTERS
The following list shows the ASCII codes generated when a control key combination is pressed.
The mnemonics and descriptions refer to ASCII functions used for screen and printer formatting
and data communications.
ASCII Code* Ctrl-   Mnemonic Description             ASCII Code* Ctrl-   Mnemonic Description
00                  NUL      Null character          10          Ctrl-P  DLE      Data link escape
01          Ctrl-A  SOH      Start of header         11          Ctrl-Q  DC1      Device control 1
02          Ctrl-B  STX      Start of text           12          Ctrl-R  DC2      Device control 2
03          Ctrl-C  ETX      End of text             13          Ctrl-S  DC3      Device control 3
04          Ctrl-D  EOT      End of transmission     14          Ctrl-T  DC4      Device control 4
05          Ctrl-E  ENQ      Enquiry                 15          Ctrl-U  NAK      Negative acknowledge
06          Ctrl-F  ACK      Acknowledge             16          Ctrl-V  SYN      Synchronous idle
07          Ctrl-G  BEL      Bell                    17          Ctrl-W  ETB      End transmission block
08          Ctrl-H  BS       Backspace               18          Ctrl-X  CAN      Cancel
09          Ctrl-I  HT       Horizontal tab          19          Ctrl-Y  EM       End of medium
0A          Ctrl-J  LF       Line feed               1A          Ctrl-Z  SUB      Substitute
0B          Ctrl-K  VT       Vertical tab            1B          Ctrl-[  ESC      Escape
0C          Ctrl-L  FF       Form feed               1C          Ctrl-\  FS       File separator
0D          Ctrl-M  CR       Carriage return         1D          Ctrl-]  GS       Group separator
0E          Ctrl-N  SO       Shift out               1E          Ctrl-^  RS       Record separator
0F          Ctrl-O  SI       Shift in                1F          Ctrl-†  US       Unit separator
* ASCII codes are in hexadecimal.
† ASCII code 1Fh is Ctrl-Hyphen (-).
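The pattern in the table can be sketched in a few lines of Python. This is only an illustration, assuming the conventional rule that holding Ctrl keeps the low five bits of the key's ASCII code; the function name ctrl_code is hypothetical and does not come from the book. Under that rule, code 1Fh corresponds to the underscore character, which on a typical US keyboard shares a key with the hyphen, and that is presumably why the footnote lists it as Ctrl-Hyphen.

```python
def ctrl_code(key: str) -> int:
    """Return the ASCII code produced by Ctrl plus the given key.

    Assumption: Ctrl keeps only the low five bits of the key's
    ASCII code (the conventional mapping, not stated in the table).
    """
    ch = key.upper()
    # Only '@' (40h) through '_' (5Fh) map onto the control range 00h-1Fh.
    if len(ch) != 1 or not 0x40 <= ord(ch) <= 0x5F:
        raise ValueError(f"no control code for {key!r}")
    return ord(ch) & 0x1F


# Print a few entries so they can be compared with the table.
for key in "AMZ[":
    print(f"Ctrl-{key} -> {ctrl_code(key):02X}h")
```

The range check rejects keys such as digits, which have no entry in the table.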
ALT-KEY COMBINATIONS
The following hexadecimal scan codes are produced by holding down
the ALT key and pressing each character:
Key  Scan Code    Key  Scan Code    Key  Scan Code
1    78           A    1E           N    31
2    79           B    30           O    18
3    7A           C    2E           P    19
4    7B           D    20           Q    10
5    7C           E    12           R    13
6    7D           F    21           S    1F
7    7E           G    22           T    14
8    7F           H    23           U    16
9    80           I    17           V    2F
0    81           J    24           W    11
-    82           K    25           X    2D
=    83           L    26           Y    15
                  M    32           Z    2C
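The Alt-key scan codes follow the physical keyboard rows rather than alphabetical order, which a short Python sketch can make visible. The row strings and starting values below are read off the table; the names ROWS and alt_scan_code are hypothetical, and the sketch covers only the digit and letter keys.

```python
# Each keyboard row paired with the Alt scan code of its first key
# (values in hexadecimal, taken from the table).
ROWS = [
    ("1234567890", 0x78),
    ("QWERTYUIOP", 0x10),
    ("ASDFGHJKL", 0x1E),
    ("ZXCVBNM", 0x2C),
]


def alt_scan_code(key: str) -> int:
    """Return the scan code for Alt plus a digit or letter key."""
    ch = key.upper()
    if len(ch) == 1:
        for row, first in ROWS:
            pos = row.find(ch)
            if pos >= 0:
                # Codes increase by one for each key along the row.
                return first + pos
    raise ValueError(f"no Alt scan code listed for {key!r}")


# Print a few entries so they can be compared with the table.
for key in "QAZ1":
    print(f"Alt-{key} -> {alt_scan_code(key):02X}h")
```

Storing one starting value per row keeps the lookup small, at the cost of assuming a QWERTY layout.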
KEYBOARD SCAN CODES
The following keyboard scan codes may be retrieved either by calling INT 16h or by calling
INT 21h for keyboard input a second time (the first keyboard read returns 0). All codes are in
hexadecimal:
FUNCTION KEYS
Key   Normal  With Shift  With Ctrl  With Alt
F1    3B      54          5E         68
F2    3C      55          5F         69
F3    3D      56          60         6A
F4    3E      57          61         6B
F5    3F      58          62         6C
F6    40      59          63         6D
F7    41      5A          64         6E
F8    42      5B          65         6F
F9    43      5C          66         70
F10   44      5D          67         71
F11   85      87          89         8B
F12   86      88          8A         8C
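The function-key codes are regular enough to compute rather than look up. The following Python sketch assumes nothing beyond the table values: F1 through F10 form a consecutive run for each modifier, while F11 and F12 use a separate block. The names F1_BASE, F11_BASE, and function_key_scan_code are hypothetical.

```python
# Scan code of F1 for each modifier state (hexadecimal, from the table).
F1_BASE = {"normal": 0x3B, "shift": 0x54, "ctrl": 0x5E, "alt": 0x68}
# F11 and F12 fall outside that run; these are the F11 codes,
# and F12 is one higher in every column.
F11_BASE = {"normal": 0x85, "shift": 0x87, "ctrl": 0x89, "alt": 0x8B}


def function_key_scan_code(n: int, modifier: str = "normal") -> int:
    """Return the scan code for function key Fn with the given modifier."""
    if modifier not in F1_BASE:
        raise ValueError(f"unknown modifier {modifier!r}")
    if 1 <= n <= 10:
        return F1_BASE[modifier] + (n - 1)
    if n in (11, 12):
        return F11_BASE[modifier] + (n - 11)
    raise ValueError(f"no such function key: F{n}")


# Print one row so it can be compared with the table.
for mod in ("normal", "shift", "ctrl", "alt"):
    print(f"F5 {mod}: {function_key_scan_code(5, mod):02X}h")
```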
Key          Alone  With Ctrl Key
Home         47     77
End          4F     75
PgUp         49     84
PgDn         51     76
PrtSc        37     72
Left arrow   4B     73
Rt arrow     4D     74
Up arrow     48     8D
Dn arrow     50     91
Ins          52     92
Del          53     93
Back tab     0F     94
Gray +       4E     90
Gray −       4A     8E
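The two-step read described at the start of this section (a first keyboard read that returns 0, followed by a second read that returns the scan code) can be sketched as follows. This is a simulation only: read_byte is a hypothetical stand-in for whatever keyboard-input call the program uses, and make_reader simply replays a fixed list of values.

```python
from typing import Callable, Iterable, Tuple


def read_key(read_byte: Callable[[], int]) -> Tuple[str, int]:
    """Read one keystroke and classify it as an ASCII code or a scan code."""
    first = read_byte()
    if first != 0:
        # Ordinary key: the first read already holds the ASCII code.
        return ("ascii", first)
    # Extended key: the first read was 0, so read again for the scan code.
    return ("scan", read_byte())


def make_reader(data: Iterable[int]) -> Callable[[], int]:
    """Build a fake input source that replays the given byte values."""
    it = iter(data)
    return lambda: next(it)


# Simulated input: the letter 'A' (41h), then F1 (0 followed by 3Bh).
reader = make_reader([0x41, 0x00, 0x3B])
print(read_key(reader))
print(read_key(reader))
```

Passing the input source as a parameter keeps the decoding logic separate from the hardware or operating-system call, so the same function can be exercised with canned data.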
Assembly Language for
x86 Processors
Seventh Edition
KIP R. IRVINE
Florida International University
School of Computing and Information Sciences
Boston Columbus Indianapolis New York San Francisco Upper Saddle River
Amsterdam Cape Town Dubai London Madrid Milan Munich Paris Montreal Toronto
Delhi Mexico City São Paulo Sydney Hong Kong Seoul Singapore Taipei Tokyo
Vice President and Editorial Director, ECS: Marcia Horton
Executive Editor: Tracy Johnson
Executive Marketing Manager: Tim Galligan
Marketing Assistant: Jon Bryant
Program Management Team Lead: Scott Disanno
Program Manager: Clare Romeo
Project Manager: Greg Dulles
Senior Operations Specialist: Nick Sklitsis
Operations Specialist: Linda Sager
Permissions Project Manager: Karen Sanatar
Full-Service Project Management: Pavithra Jayapaul, Jouve
Printer/Binder: Courier/Westford
Typeface: Times
IA-32, Pentium, i486, Intel64, Celeron, and Intel 386 are trademarks of Intel Corporation. Athlon, Phenom, and Opteron
are trademarks of Advanced Micro Devices. TASM and Turbo Debugger are trademarks of Borland International.
Microsoft Assembler (MASM), Windows Vista, Windows 7, Windows NT, Windows Me, Windows 95, Windows 98,
Windows 2000, Windows XP, MS-Windows, PowerPoint, Win32, DEBUG, WinDbg, MS-DOS, Visual Studio, Visual
C++, and CodeView are registered trademarks of Microsoft Corporation. Autocad is a trademark of Autodesk. Java is a
trademark of Sun Microsystems. PartitionMagic is a trademark of Symantec. All other trademarks or product names are
the property of their respective owners.
Copyright © 2015, 2011, 2007, 2003 by Pearson Education, Inc., Upper Saddle River, New Jersey 07458. All rights
reserved. Manufactured in the United States of America. This publication is protected by Copyright and permissions
should be obtained from the publisher prior to any prohibited reproduction, storage in a retrieval system, or transmission in
any form or by any means, electronic, mechanical, photocopying, recording, or likewise. To obtain permission(s) to use
materials from this work, please submit a written request to Pearson Higher Education, Permissions Department, 1 Lake
Street, Upper Saddle River, NJ 07458.
Previously published as Assembly Language for Intel-Based Computers.
The author and publisher of this book have used their best efforts in preparing this book. These efforts include the
development, research, and testing of the theories and programs to determine their effectiveness. The author and publisher make no warranty of any kind, expressed or implied, with regard to these programs or the documentation
contained in this book. The author and publisher shall not be liable in any event for incidental or consequential damages in
connection with, or arising out of, the furnishing, performance, or use of these programs.
Library of Congress Cataloging-in-Publication Data
Irvine, Kip R., 1951-
Assembly language for x86 processors / Kip R. Irvine, Florida International University,
School of Computing and Information Sciences. — Seventh Edition.
pages cm
ISBN-13: 978-0-13-376940-1
ISBN-10: 0-13-376940-2
1. IBM microcomputers–Programming. 2. X86 assembly language (Computer program
language) I. Title.
QA76.8.I77 2014
005.265—dc23
2013046432
10 9 8 7 6 5 4 3 2 1
To Jack and Candy Irvine
Contents

Preface

1 Basic Concepts
  1.1 Welcome to Assembly Language
    1.1.1 Questions You Might Ask
    1.1.2 Assembly Language Applications
    1.1.3 Section Review
  1.2 Virtual Machine Concept
    1.2.1 Section Review
  1.3 Data Representation
    1.3.1 Binary Integers
    1.3.2 Binary Addition
    1.3.3 Integer Storage Sizes
    1.3.4 Hexadecimal Integers
    1.3.5 Hexadecimal Addition
    1.3.6 Signed Binary Integers
    1.3.7 Binary Subtraction
    1.3.8 Character Storage
    1.3.9 Section Review
  1.4 Boolean Expressions
    1.4.1 Truth Tables for Boolean Functions
    1.4.2 Section Review
  1.5 Chapter Summary
  1.6 Key Terms
  1.7 Review Questions and Exercises
    1.7.1 Short Answer
    1.7.2 Algorithm Workbench

2 x86 Processor Architecture
  2.1 General Concepts
    2.1.1 Basic Microcomputer Design
    2.1.2 Instruction Execution Cycle
    2.1.3 Reading from Memory
    2.1.4 Loading and Executing a Program
    2.1.5 Section Review
  2.2 32-Bit x86 Processors
    2.2.1 Modes of Operation
    2.2.2 Basic Execution Environment
    2.2.3 x86 Memory Management
    2.2.4 Section Review
  2.3 64-Bit x86-64 Processors
    2.3.1 64-Bit Operation Modes
    2.3.2 Basic 64-Bit Execution Environment
  2.4 Components of a Typical x86 Computer
    2.4.1 Motherboard
    2.4.2 Memory
    2.4.3 Section Review
  2.5 Input–Output System
    2.5.1 Levels of I/O Access
    2.5.2 Section Review
  2.6 Chapter Summary
  2.7 Key Terms
  2.8 Review Questions

3 Assembly Language Fundamentals
  3.1 Basic Language Elements
    3.1.1 First Assembly Language Program
    3.1.2 Integer Literals
    3.1.3 Constant Integer Expressions
    3.1.4 Real Number Literals
    3.1.5 Character Literals
    3.1.6 String Literals
    3.1.7 Reserved Words
    3.1.8 Identifiers
    3.1.9 Directives
    3.1.10 Instructions
    3.1.11 Section Review
  3.2 Example: Adding and Subtracting Integers
    3.2.1 The AddTwo Program
    3.2.2 Running and Debugging the AddTwo Program
    3.2.3 Program Template
    3.2.4 Section Review
  3.3 Assembling, Linking, and Running Programs
    3.3.1 The Assemble-Link-Execute Cycle
    3.3.2 Listing File
    3.3.3 Section Review
  3.4 Defining Data
    3.4.1 Intrinsic Data Types
    3.4.2 Data Definition Statement
    3.4.3 Adding a Variable to the AddTwo Program
    3.4.4 Defining BYTE and SBYTE Data
    3.4.5 Defining WORD and SWORD Data
    3.4.6 Defining DWORD and SDWORD Data
    3.4.7 Defining QWORD Data
    3.4.8 Defining Packed BCD (TBYTE) Data
    3.4.9 Defining Floating-Point Types
    3.4.10 A Program That Adds Variables
    3.4.11 Little-Endian Order
    3.4.12 Declaring Uninitialized Data
    3.4.13 Section Review
  3.5 Symbolic Constants
    3.5.1 Equal-Sign Directive
    3.5.2 Calculating the Sizes of Arrays and Strings
    3.5.3 EQU Directive
    3.5.4 TEXTEQU Directive
    3.5.5 Section Review
  3.6 64-Bit Programming
  3.7 Chapter Summary
  3.8 Key Terms
    3.8.1 Terms
    3.8.2 Instructions, Operators, and Directives
  3.9 Review Questions and Exercises
    3.9.1 Short Answer
    3.9.2 Algorithm Workbench
  3.10 Programming Exercises

4 Data Transfers, Addressing, and Arithmetic
  4.1 Data Transfer Instructions
    4.1.1 Introduction
    4.1.2 Operand Types
    4.1.3 Direct Memory Operands
    4.1.4 MOV Instruction
    4.1.5 Zero/Sign Extension of Integers
    4.1.6 LAHF and SAHF Instructions
    4.1.7 XCHG Instruction
    4.1.8 Direct-Offset Operands
    4.1.9 Example Program (Moves)
    4.1.10 Section Review
  4.2 Addition and Subtraction
    4.2.1 INC and DEC Instructions
    4.2.2 ADD Instruction
    4.2.3 SUB Instruction
    4.2.4 NEG Instruction
    4.2.5 Implementing Arithmetic Expressions
    4.2.6 Flags Affected by Addition and Subtraction
    4.2.7 Example Program (AddSubTest)
    4.2.8 Section Review
  4.3 Data-Related Operators and Directives
    4.3.1 OFFSET Operator
    4.3.2 ALIGN Directive
    4.3.3 PTR Operator
    4.3.4 TYPE Operator
    4.3.5 LENGTHOF Operator
    4.3.6 SIZEOF Operator
    4.3.7 LABEL Directive
    4.3.8 Section Review
  4.4 Indirect Addressing
    4.4.1 Indirect Operands
    4.4.2 Arrays
    4.4.3 Indexed Operands
    4.4.4 Pointers
    4.4.5 Section Review
  4.5 JMP and LOOP Instructions
    4.5.1 JMP Instruction
    4.5.2 LOOP Instruction
    4.5.3 Displaying an Array in the Visual Studio Debugger
    4.5.4 Summing an Integer Array
    4.5.5 Copying a String
    4.5.6 Section Review
  4.6 64-Bit Programming
    4.6.1 MOV Instruction
    4.6.2 64-Bit Version of SumArray
    4.6.3 Addition and Subtraction
    4.6.4 Section Review
  4.7 Chapter Summary
  4.8 Key Terms
    4.8.1 Terms
    4.8.2 Instructions, Operators, and Directives
  4.9 Review Questions and Exercises
    4.9.1 Short Answer
    4.9.2 Algorithm Workbench
  4.10 Programming Exercises

5 Procedures
  5.1 Stack Operations
    5.1.1 Runtime Stack (32-bit mode)
    5.1.2 PUSH and POP Instructions
    5.1.3 Section Review
  5.2 Defining and Using Procedures
    5.2.1 PROC Directive
    5.2.2 CALL and RET Instructions
    5.2.3 Nested Procedure Calls
    5.2.4 Passing Register Arguments to Procedures
    5.2.5 Example: Summing an Integer Array
    5.2.6 Saving and Restoring Registers
    5.2.7 Section Review
  5.3 Linking to an External Library
    5.3.1 Background Information
    5.3.2 Section Review
  5.4 The Irvine32 Library
    5.4.1 Motivation for Creating the Library
    5.4.2 Overview
    5.4.3 Individual Procedure Descriptions
    5.4.4 Library Test Programs
    5.4.5 Section Review
  5.5 64-Bit Assembly Programming
    5.5.1 The Irvine64 Library
    5.5.2 Calling 64-Bit Subroutines
    5.5.3 The x64 Calling Convention
    5.5.4 Sample Program that Calls a Procedure
  5.6 Chapter Summary
  5.7 Key Terms
    5.7.1 Terms
    5.7.2 Instructions, Operators, and Directives
  5.8 Review Questions and Exercises
    5.8.1 Short Answer
    5.8.2 Algorithm Workbench
  5.9 Programming Exercises

6 Conditional Processing
  6.1 Conditional Branching
  6.2 Boolean and Comparison Instructions
    6.2.1 The CPU Status Flags
    6.2.2 AND Instruction
    6.2.3 OR Instruction
    6.2.4 Bit-Mapped Sets
    6.2.5 XOR Instruction
    6.2.6 NOT Instruction
    6.2.7 TEST Instruction
    6.2.8 CMP Instruction
    6.2.9 Setting and Clearing Individual CPU Flags
    6.2.10 Boolean Instructions in 64-Bit Mode
    6.2.11 Section Review
  6.3 Conditional Jumps
    6.3.1 Conditional Structures
    6.3.2 Jcond Instruction
    6.3.3 Types of Conditional Jump Instructions
    6.3.4 Conditional Jump Applications
    6.3.5 Section Review
  6.4 Conditional Loop Instructions
    6.4.1 LOOPZ and LOOPE Instructions
    6.4.2 LOOPNZ and LOOPNE Instructions
    6.4.3 Section Review
  6.5 Conditional Structures
    6.5.1 Block-Structured IF Statements
    6.5.2 Compound Expressions
    6.5.3 WHILE Loops
    6.5.4 Table-Driven Selection
    6.5.5 Section Review
  6.6 Application: Finite-State Machines
    6.6.1 Validating an Input String
    6.6.2 Validating a Signed Integer
    6.6.3 Section Review
  6.7 Conditional Control Flow Directives
    6.7.1 Creating IF Statements
    6.7.2 Signed and Unsigned Comparisons
    6.7.3 Compound Expressions
    6.7.4 Creating Loops with .REPEAT and .WHILE
  6.8 Chapter Summary
  6.9 Key Terms
    6.9.1 Terms
    6.9.2 Instructions, Operators, and Directives
  6.10 Review Questions and Exercises
    6.10.1 Short Answer
    6.10.2 Algorithm Workbench
  6.11 Programming Exercises
    6.11.1 Suggestions for Testing Your Code
    6.11.2 Exercise Descriptions

7 Integer Arithmetic
  7.1 Shift and Rotate Instructions
    7.1.1 Logical Shifts and Arithmetic Shifts
    7.1.2 SHL Instruction
    7.1.3 SHR Instruction
    7.1.4 SAL and SAR Instructions
    7.1.5 ROL Instruction
    7.1.6 ROR Instruction
    7.1.7 RCL and RCR Instructions
    7.1.8 Signed Overflow
    7.1.9 SHLD/SHRD Instructions
    7.1.10 Section Review
  7.2 Shift and Rotate Applications
    7.2.1 Shifting Multiple Doublewords
    7.2.2 Binary Multiplication
    7.2.3 Displaying Binary Bits
    7.2.4 Extracting File Date Fields
    7.2.5 Section Review
  7.3 Multiplication and Division Instructions
    7.3.1 MUL Instruction
    7.3.2 IMUL Instruction
    7.3.3 Measuring Program Execution Times
    7.3.4 DIV Instruction
    7.3.5 Signed Integer Division
    7.3.6 Implementing Arithmetic Expressions
    7.3.7 Section Review
  7.4 Extended Addition and Subtraction
    7.4.1 ADC Instruction
    7.4.2 Extended Addition Example
    7.4.3 SBB Instruction
    7.4.4 Section Review
  7.5 ASCII and Unpacked Decimal Arithmetic
    7.5.1 AAA Instruction
    7.5.2 AAS Instruction
    7.5.3 AAM Instruction
    7.5.4 AAD Instruction
    7.5.5 Section Review
  7.6 Packed Decimal Arithmetic
    7.6.1 DAA Instruction
    7.6.2 DAS Instruction
    7.6.3 Section Review
  7.7 Chapter Summary
  7.8 Key Terms
    7.8.1 Terms
    7.8.2 Instructions, Operators, and Directives
  7.9 Review Questions and Exercises
    7.9.1 Short Answer
    7.9.2 Algorithm Workbench
  7.10 Programming Exercises

8 Advanced Procedures
  8.1 Introduction
  8.2 Stack Frames
    8.2.1 Stack Parameters
    8.2.2 Disadvantages of Register Parameters
    8.2.3 Accessing Stack Parameters
    8.2.4 32-Bit Calling Conventions
    8.2.5 Local Variables
    8.2.6 Reference Parameters
    8.2.7 LEA Instruction
    8.2.8 ENTER and LEAVE Instructions
    8.2.9 LOCAL Directive
    8.2.10 The Microsoft x64 Calling Convention
    8.2.11 Section Review
  8.3 Recursion
    8.3.1 Recursively Calculating a Sum
    8.3.2 Calculating a Factorial
    8.3.3 Section Review
  8.4 INVOKE, ADDR, PROC, and PROTO
    8.4.1 INVOKE Directive
    8.4.2 ADDR Operator
    8.4.3 PROC Directive
    8.4.4 PROTO Directive
    8.4.5 Parameter Classifications
    8.4.6 Example: Exchanging Two Integers
    8.4.7 Debugging Tips
    8.4.8 WriteStackFrame Procedure
    8.4.9 Section Review
  8.5 Creating Multimodule Programs
    8.5.1 Hiding and Exporting Procedure Names
    8.5.2 Calling External Procedures
    8.5.3 Using Variables and Symbols across Module Boundaries
    8.5.4 Example: ArraySum Program
    8.5.5 Creating the Modules Using Extern
    8.5.6 Creating the Modules Using INVOKE and PROTO
    8.5.7 Section Review
  8.6 Advanced Use of Parameters (Optional Topic)
    8.6.1 Stack Affected by the USES Operator
    8.6.2 Passing 8-Bit and 16-Bit Arguments on the Stack
    8.6.3 Passing 64-Bit Arguments
    8.6.4 Non-Doubleword Local Variables
  8.7 Java Bytecodes (Optional Topic)
    8.7.1 Java Virtual Machine
    8.7.2 Instruction Set
    8.7.3 Java Disassembly Examples
    8.7.4 Example: Conditional Branch
  8.8 Chapter Summary
  8.9 Key Terms
    8.9.1 Terms
    8.9.2 Instructions, Operators, and Directives
  8.10 Review Questions and Exercises
    8.10.1 Short Answer
    8.10.2 Algorithm Workbench
  8.11 Programming Exercises

9 Strings and Arrays
  9.1 Introduction
  9.2 String Primitive Instructions
    9.2.1 MOVSB, MOVSW, and MOVSD
    9.2.2 CMPSB, CMPSW, and CMPSD
    9.2.3 SCASB, SCASW, and SCASD
    9.2.4 STOSB, STOSW, and STOSD
    9.2.5 LODSB, LODSW, and LODSD
    9.2.6 Section Review
  9.3 Selected String Procedures
    9.3.1 Str_compare Procedure
    9.3.2 Str_length Procedure
    9.3.3 Str_copy Procedure
    9.3.4 Str_trim Procedure
    9.3.5 Str_ucase Procedure
    9.3.6 String Library Demo Program
    9.3.7 String Procedures in the Irvine64 Library
    9.3.8 Section Review
  9.4 Two-Dimensional Arrays
    9.4.1 Ordering of Rows and Columns
    9.4.2 Base-Index Operands
    9.4.3 Base-Index-Displacement Operands
    9.4.4 Base-Index Operands in 64-Bit Mode
    9.4.5 Section Review
  9.5 Searching and Sorting Integer Arrays
    9.5.1 Bubble Sort
    9.5.2 Binary Search
    9.5.3 Section Review
  9.6 Java Bytecodes: String Processing (Optional Topic)
  9.7 Chapter Summary
  9.8 Key Terms and Instructions
  9.9 Review Questions and Exercises
    9.9.1 Short Answer
    9.9.2 Algorithm Workbench
  9.10 Programming Exercises

10 Structures and Macros
  10.1 Structures
    10.1.1 Defining Structures
    10.1.2 Declaring Structure Variables
    10.1.3 Referencing Structure Variables
    10.1.4 Example: Displaying the System Time
    10.1.5 Structures Containing Structures
    10.1.6 Example: Drunkard’s Walk
    10.1.7 Declaring and Using Unions
    10.1.8 Section Review
  10.2 Macros
    10.2.1 Overview
    10.2.2 Defining Macros
    10.2.3 Invoking Macros
    10.2.4 Additional Macro Features
    10.2.5 Using the Book’s Macro Library (32-bit mode only)
    10.2.6 Example Program: Wrappers
    10.2.7 Section Review
  10.3 Conditional-Assembly Directives
    10.3.1 Checking for Missing Arguments
    10.3.2 Default Argument Initializers
    10.3.3 Boolean Expressions
    10.3.4 IF, ELSE, and ENDIF Directives
    10.3.5 The IFIDN and IFIDNI Directives
    10.3.6 Example: Summing a Matrix Row
    10.3.7 Special Operators
    10.3.8 Macro Functions
    10.3.9 Section Review
  10.4 Defining Repeat Blocks
    10.4.1 WHILE Directive
    10.4.2 REPEAT Directive
    10.4.3 FOR Directive
    10.4.4 FORC Directive
    10.4.5 Example: Linked List
    10.4.6 Section Review
  10.5 Chapter Summary
  10.6 Key Terms
    10.6.1 Terms
    10.6.2 Operators and Directives
  10.7 Review Questions and Exercises
    10.7.1 Short Answer
    10.7.2 Algorithm Workbench
  10.8 Programming Exercises

11 MS-Windows Programming
  11.1 Win32 Console Programming
    11.1.1 Background Information
    11.1.2 Win32 Console Functions
    11.1.3 Displaying a Message Box
    11.1.4 Console Input
    11.1.5 Console Output
    11.1.6 Reading and Writing Files
    11.1.7 File I/O in the Irvine32 Library
    11.1.8 Testing the File I/O Procedures
    11.1.9 Console Window Manipulation
    11.1.10 Controlling the Cursor
    11.1.11 Controlling the Text Color
    11.1.12 Time and Date Functions
    11.1.13 Using the 64-Bit Windows API
    11.1.14 Section Review
  11.2 Writing a Graphical Windows Application
    11.2.1 Necessary Structures
    11.2.2 The MessageBox Function
    11.2.3 The WinMain Procedure
    11.2.4 The WinProc Procedure
    11.2.5 The ErrorHandler Procedure
    11.2.6 Program Listing
    11.2.7 Section Review
  11.3 Dynamic Memory Allocation
    11.3.1 HeapTest Programs
    11.3.2 Section Review
  11.4 x86 Memory Management
    11.4.1 Linear Addresses
    11.4.2 Page Translation
    11.4.3 Section Review
  11.5 Chapter Summary
  11.6 Key Terms
  11.7 Review Questions and Exercises
    11.7.1 Short Answer
    11.7.2 Algorithm Workbench
  11.8 Programming Exercises

12 Floating-Point Processing and Instruction Encoding
  12.1 Floating-Point Binary Representation
    12.1.1 IEEE Binary Floating-Point Representation
    12.1.2 The Exponent
    12.1.3 Normalized Binary Floating-Point Numbers
    12.1.4 Creating the IEEE Representation
    12.1.5 Converting Decimal Fractions to Binary Reals
    12.1.6 Section Review
  12.2 Floating-Point Unit
    12.2.1 FPU Register Stack
    12.2.2 Rounding
    12.2.3 Floating-Point Exceptions
    12.2.4 Floating-Point Instruction Set
    12.2.5 Arithmetic Instructions
    12.2.6 Comparing Floating-Point Values
    12.2.7 Reading and Writing Floating-Point Values
    12.2.8 Exception Synchronization
    12.2.9 Code Examples
    12.2.10 Mixed-Mode Arithmetic
    12.2.11 Masking and Unmasking Exceptions
    12.2.12 Section Review
  12.3 x86 Instruction Encoding
    12.3.1 Instruction Format
    12.3.2 Single-Byte Instructions
    12.3.3 Move Immediate to Register
    12.3.4 Register-Mode Instructions
    12.3.5 Processor Operand-Size Prefix
    12.3.6 Memory-Mode Instructions
    12.3.7 Section Review
  12.4 Chapter Summary
  12.5 Key Terms
  12.6 Review Questions and Exercises
    12.6.1 Short Answer
    12.6.2 Algorithm Workbench
  12.7 Programming Exercises

13 High-Level Language Interface
  13.1 Introduction
    13.1.1 General Conventions
    13.1.2 .MODEL Directive
    13.1.3 Examining Compiler-Generated Code
    13.1.4 Section Review
  13.2 Inline Assembly Code
    13.2.1 __asm Directive in Visual C++
    13.2.2 File Encryption Example
    13.2.3 Section Review
  13.3 Linking 32-Bit Assembly Language Code to C/C++
    13.3.1 IndexOf Example
    13.3.2 Calling C and C++ Functions
    13.3.3 Multiplication Table Example
    13.3.4 Calling C Library Functions
    13.3.5 Directory Listing Program
    13.3.6 Section Review
  13.4 Chapter Summary
  13.5 Key Terms
  13.6 Review Questions
  13.7 Programming Exercises

Chapters are available from the Companion Web site

14 16-Bit MS-DOS Programming
  14.1 MS-DOS and the IBM-PC
    14.1.1 Memory Organization
    14.1.2 Redirecting Input-Output
    14.1.3 Software Interrupts
    14.1.4 INT Instruction
    14.1.5 Coding for 16-Bit Programs
    14.1.6 Section Review
  14.2 MS-DOS Function Calls (INT 21h)
    14.2.1 Selected Output Functions
    14.2.2 Hello World Program Example
    14.2.3 Selected Input Functions
    14.2.4 Date/Time Functions
    14.2.5 Section Review
  14.3 Standard MS-DOS File I/O Services
    14.3.1 Create or Open File (716Ch)
    14.3.2 Close File Handle (3Eh)
    14.3.3 Move File Pointer (42h)
    14.3.4 Get File Creation Date and Time
    14.3.5 Selected Library Procedures
    14.3.6 Example: Read and Copy a Text File
    14.3.7 Reading the MS-DOS Command Tail
    14.3.8 Example: Creating a Binary File
    14.3.9 Section Review
  14.4 Chapter Summary
  14.5 Programming Exercises

15 Disk Fundamentals
  15.1 Disk Storage Systems
    15.1.1 Tracks, Cylinders, and Sectors
    15.1.2 Disk Partitions (Volumes)
    15.1.3 Section Review
  15.2 File Systems
    15.2.1 FAT12
    15.2.2 FAT16
    15.2.3 FAT32
    15.2.4 NTFS
    15.2.5 Primary Disk Areas
    15.2.6 Section Review
  15.3 Disk Directory
    15.3.1 MS-DOS Directory Structure
    15.3.2 Long Filenames in MS-Windows
    15.3.3 File Allocation Table (FAT)
    15.3.4 Section Review
  15.4 Reading and Writing Disk Sectors
    15.4.1 Sector Display Program
    15.4.2 Section Review
  15.5 System-Level File Functions
    15.5.1 Get Disk Free Space (7303h)
    15.5.2 Create Subdirectory (39h)
    15.5.3 Remove Subdirectory (3Ah)
    15.5.4 Set Current Directory (3Bh)
    15.5.5 Get Current Directory (47h)
    15.5.6 Get and Set File Attributes (7143h)
    15.5.7 Section Review
  15.6 Chapter Summary
  15.7 Programming Exercises

16 BIOS-Level Programming
  16.1 Introduction
    16.1.1 BIOS Data Area
  16.2 Keyboard Input with INT 16h
    16.2.1 How the Keyboard Works
    16.2.2 INT 16h Functions
    16.2.3 Section Review
  16.3 VIDEO Programming with INT 10h
    16.3.1 Basic Background
    16.3.2 Controlling the Color
    16.3.3 INT 10h Video Functions
    16.3.4 Library Procedure Examples
    16.3.5 Section Review
  16.4 Drawing Graphics Using INT 10h
    16.4.1 INT 10h Pixel-Related Functions
    16.4.2 DrawLine Program
    16.4.3 Cartesian Coordinates Program
    16.4.4 Converting Cartesian Coordinates to Screen Coordinates
    16.4.5 Section Review
  16.5 Memory-Mapped Graphics
    16.5.1 Mode 13h: 320 X 200, 256 Colors
    16.5.2 Memory-Mapped Graphics Program
    16.5.3 Section Review
  16.6 Mouse Programming
    16.6.1 Mouse INT 33h Functions
    16.6.2 Mouse Tracking Program
    16.6.3 Section Review
  16.7 Chapter Summary
  16.8 Programming Exercises

17 Expert MS-DOS Programming
  17.1 Introduction
  17.2 Defining Segments
    17.2.1 Simplified Segment Directives
    17.2.2 Explicit Segment Definitions
    17.2.3 Segment Overrides
    17.2.4 Combining Segments
    17.2.5 Section Review
  17.3 Runtime Program Structure
    17.3.1 Program Segment Prefix
    17.3.2 COM Programs
    17.3.3 EXE Programs
    17.3.4 Section Review
  17.4 Interrupt Handling
    17.4.1 Hardware Interrupts
    17.4.2 Interrupt Control Instructions
    17.4.3 Writing a Custom Interrupt Handler
    17.4.4 Terminate and Stay Resident Programs
    17.4.5 Application: The No_Reset Program
    17.4.6 Section Review
  17.5 Hardware Control Using I/O Ports
    17.5.1 Input–Output Ports
    17.5.2 PC Sound Program
  17.6 Chapter Summary

Appendix A MASM Reference
Appendix B The x86 Instruction Set
Appendix C Answers to Section Review Questions
Index

Appendices are available from the Companion Web site

Appendix D BIOS and MS-DOS Interrupts
Appendix E Answers to Review Questions (Chapters 14–17)
Preface
Assembly Language for x86 Processors, Seventh Edition, teaches assembly language programming and architecture for x86 and Intel64 processors. It is an appropriate text for the following
types of college courses:
• Assembly Language Programming
• Fundamentals of Computer Systems
• Fundamentals of Computer Architecture
Students use Intel or AMD processors and program with Microsoft Macro Assembler (MASM),
running on recent versions of Microsoft Windows. Although this book was originally designed as
a programming textbook for college students, it serves as an effective supplement to computer
architecture courses. As a testament to its popularity, previous editions have been translated into
numerous languages.
Emphasis of Topics. This edition includes topics that lead naturally into subsequent courses
in computer architecture, operating systems, and compiler writing:
• Virtual machine concept
• Instruction set architecture
• Elementary Boolean operations
• Instruction execution cycle
• Memory access and handshaking
• Interrupts and polling
• Hardware-based I/O
• Floating-point binary representation
Other topics relate specifically to x86 and Intel64 architecture:
• Protected memory and paging
• Memory segmentation in real-address mode
• 16-Bit interrupt handling
• MS-DOS and BIOS system calls (interrupts)
• Floating-point unit architecture and programming
• Instruction encoding
Certain examples presented in the book lend themselves to courses that occur later in a computer
science curriculum:
• Searching and sorting algorithms
• High-level language structures
• Finite-state machines
• Code optimization examples
What’s New in the Seventh Edition
In this revision, we increased the discussions of program examples early in the book, added more supplemental review questions and key terms, introduced 64-bit programming, and reduced our dependence on the book’s subroutine library. To be more specific, here are the details:
• Early chapters now include short sections that feature 64-bit CPU architecture and programming, and we have created a 64-bit version of the book’s subroutine library named Irvine64.
• Many of the review questions and exercises have been modified or replaced, moved from
the middle of each chapter to the end, and divided into two sections: (1) short-answer
questions and (2) algorithm workbench exercises. The latter require the
student to write a short amount of code to accomplish a goal.
• Each chapter now has a Key Terms section, listing new terms and concepts, as well as new
MASM directives and Intel instructions.
• New programming exercises have been added, others removed, and a few existing exercises
modified.
• There is far less dependency on the author’s subroutine libraries in this edition. Students are
encouraged to call system functions themselves and use the Visual Studio debugger to step
through the programs. The Irvine32 and Irvine64 libraries are available to help students handle input/output, but their use is not required.
• New tutorial videos covering essential content topics have been created by the author and
added to the Pearson website.
This book is still focused on its primary goal, to teach students how to write and debug programs at
the machine level. It will never replace a complete book on computer architecture, but it does give
students the first-hand experience of writing software in an environment that teaches them how a
computer works. Our premise is that students retain knowledge better when theory is combined with
experience. In an engineering course, students construct prototypes; in a computer architecture
course, students should write machine-level programs. In both cases, they have a memorable experience that gives them the confidence to work in any OS/machine-oriented environment.
Protected-mode programming is the sole focus of the printed chapters (1 through 13). As such,
students will create 32-bit and 64-bit programs that run under the most recent versions of Microsoft
Windows. The remaining four chapters cover 16-bit programming, and are supplied in electronic
form. These chapters cover BIOS programming, MS-DOS services, keyboard and mouse input,
video programming, and graphics. One chapter covers disk storage fundamentals. Another chapter
covers advanced DOS programming techniques.
Subroutine Libraries. We supply three versions of the subroutine library that students use for
basic input/output, simulations, timing, and other useful tasks. The Irvine32 and Irvine64 libraries run
in protected mode. The 16-bit version (Irvine16.lib) runs in real-address mode and is used only by
Chapters 14 through 17. Full source code for the libraries is supplied on the companion website. The
link libraries are available only for convenience, not to prevent students from learning how to program input–output themselves. Students are encouraged to create their own libraries.
Included Software and Examples. All the example programs were tested with Microsoft
Macro Assembler Version 11.0, running in Microsoft Visual Studio 2012. In addition, batch files
are supplied that permit students to assemble and run applications from the Windows command
prompt. The 32-bit C++ applications in Chapter 13 were tested with Microsoft Visual C++ .NET.
Information Updates and corrections to this book may be found at the Companion Web site, including additional programming projects for instructors to assign at the ends of chapters.
Overall Goals
The following goals of this book are designed to broaden the student’s interest and knowledge in
topics related to assembly language:
• Intel and AMD processor architecture and programming
• Real-address mode and protected mode programming
• Assembly language directives, macros, operators, and program structure
• Programming methodology, showing how to use assembly language to create system-level
software tools and application programs
• Computer hardware manipulation
• Interaction between assembly language programs, the operating system, and other application programs
One of our goals is to help students approach programming problems with a machine-level mindset.
It is important to think of the CPU as an interactive tool, and to learn to monitor its operation
as directly as possible. A debugger is a programmer’s best friend, not only for catching errors,
but as an educational tool that teaches about the CPU and operating system. We encourage students to look beneath the surface of high-level languages and to realize that most programming
languages are designed to be portable and, therefore, independent of their host machines. In
addition to the short examples, this book contains hundreds of ready-to-run programs that demonstrate instructions or ideas as they are presented in the text. Reference materials, such as
guides to MS-DOS interrupts and instruction mnemonics, are available at the end of the book.
Required Background The reader should already be able to program confidently in at least
one high-level programming language such as Python, Java, C, or C++. One chapter covers C++
interfacing, so it is very helpful to have a compiler on hand. I have used this book in the classroom with majors in both computer science and management information systems, and it has
been used elsewhere in engineering courses.
Features
Complete Program Listings The Companion Web site contains supplemental learning materials, study guides, and all the source code from the book’s examples. An extensive link library
is supplied with the book, containing more than 30 procedures that simplify user input–output,
numeric processing, disk and file handling, and string handling. In the beginning stages of the
course, students can use this library to enhance their programs. Later, they can create their
own procedures and add them to the library.
Programming Logic Two chapters emphasize Boolean logic and bit-level manipulation. A
conscious attempt is made to relate high-level programming logic to the low-level details of the
machine. This approach helps students to create more efficient implementations and to better
understand how compilers generate object code.
Hardware and Operating System Concepts The first two chapters introduce basic hardware and data representation concepts, including binary numbers, CPU architecture, status flags,
and memory mapping. A survey of the computer’s hardware and a historical perspective of the
Intel processor family help students to better understand their target computer system.
Structured Programming Approach Beginning with Chapter 5, procedures and functional
decomposition are emphasized. Students are given more complex programming exercises,
requiring them to focus on design before starting to write code.
Java Bytecodes and the Java Virtual Machine In Chapters 8 and 9, the author explains the
basic operation of Java bytecodes with short illustrative examples. Numerous short examples are
shown in disassembled bytecode format, followed by detailed step-by-step explanations.
Disk Storage Concepts Students learn the fundamental principles behind the disk storage
system on MS-Windows–based systems from hardware and software points of view.
Creating Link Libraries Students are free to add their own procedures to the book’s link
library and create new libraries. They learn to use a toolbox approach to programming and to
write code that is useful in more than one program.
Macros and Structures A chapter is devoted to creating structures, unions, and macros,
which are essential in assembly language and systems programming. Conditional macros with
advanced operators serve to make the macros more professional.
Interfacing to High-Level Languages A chapter is devoted to interfacing assembly language to C and C++. This is an important job skill for students who are likely to find jobs programming in high-level languages. They can learn to optimize their code and see examples of
how C++ compilers optimize code.
Instructional Aids All the program listings are available on the Web. Instructors are provided
a test bank, answers to review questions, solutions to programming exercises, and a Microsoft
PowerPoint slide presentation for each chapter.
VideoNotes VideoNotes are Pearson’s new visual tool designed to teach students key programming concepts and techniques. These short step-by-step videos demonstrate basic assembly
language concepts. VideoNotes allow for self-paced instruction with easy navigation including
the ability to select, play, rewind, fast-forward, and stop within each VideoNote exercise.
VideoNotes are free with the purchase of a new textbook. To purchase access to VideoNotes,
go to www.pearsonhighered.com/irvine and click on the VideoNotes under Student Resources.
Chapter Descriptions
Chapters 1 to 8 contain core concepts of assembly language and should be covered in sequence.
After that, you have a fair amount of freedom. The following chapter dependency graph shows
how later chapters depend on knowledge gained from other chapters.
[Figure: chapter dependency graph with nodes 1 through 9, 10, 11, 12, 13, 14, 15, 16, and 17]
1. Basic Concepts: Applications of assembly language, basic concepts, machine language, and data
representation.
2. x86 Processor Architecture: Basic microcomputer design, instruction execution cycle, x86
processor architecture, Intel64 architecture, x86 memory management, components of a
microcomputer, and the input–output system.
3. Assembly Language Fundamentals: Introduction to assembly language, linking and
debugging, and defining constants and variables.
4. Data Transfers, Addressing, and Arithmetic: Simple data transfer and arithmetic instructions,
assemble-link-execute cycle, operators, directives, expressions, JMP and LOOP instructions, and
indirect addressing.
5. Procedures: Linking to an external library, description of the book’s link library, stack operations, defining and using procedures, flowcharts, and top-down structured design.
6. Conditional Processing: Boolean and comparison instructions, conditional jumps and
loops, high-level logic structures, and finite-state machines.
7. Integer Arithmetic: Shift and rotate instructions with useful applications, multiplication
and division, extended addition and subtraction, and ASCII and packed decimal arithmetic.
8. Advanced Procedures: Stack parameters, local variables, advanced PROC and INVOKE
directives, and recursion.
9. Strings and Arrays: String primitives, manipulating arrays of characters and integers, two-dimensional arrays, sorting, and searching.
10. Structures and Macros: Structures, macros, conditional assembly directives, and defining
repeat blocks.
11. MS-Windows Programming: Protected mode memory management concepts, using the
Microsoft-Windows API to display text and colors, and dynamic memory allocation.
12. Floating-Point Processing and Instruction Encoding: Floating-point binary representation and floating-point arithmetic. Learning to program the IA-32 floating-point unit. Understanding the encoding of IA-32 machine instructions.
13. High-Level Language Interface: Parameter passing conventions, inline assembly code, and
linking assembly language modules to C and C++ programs.
• Appendix A: MASM Reference
• Appendix B: The x86 Instruction Set
• Appendix C: Answers to Review Questions
The following chapters and appendices are supplied online at the Companion Web site:
14. 16-Bit MS-DOS Programming: Memory organization, interrupts, function calls, and standard MS-DOS file I/O services.
15. Disk Fundamentals: Disk storage systems, sectors, clusters, directories, file allocation
tables, handling MS-DOS error codes, and drive and directory manipulation.
16. BIOS-Level Programming: Keyboard input, video text, graphics, and mouse programming.
17. Expert MS-DOS Programming: Custom-designed segments, runtime program structure,
interrupt handling, and hardware control using I/O ports.
• Appendix D: BIOS and MS-DOS Interrupts
• Appendix E: Answers to Review Questions (Chapters 14–17)
Instructor and Student Resources
Instructor Resource Materials
The following protected instructor material is available on the Companion Web site:
www.pearsonhighered.com/irvine
For username and password information, please contact your Pearson Representative.
• Lecture PowerPoint Slides
• Instructor Solutions Manual
Student Resource Materials
The student resource materials can be accessed through the publisher’s Web site located at
www.pearsonhighered.com/irvine. These resources include:
• VideoNotes
• Online Chapters and Appendices
• Chapter 14: 16-Bit MS-DOS Programming
• Chapter 15: Disk Fundamentals
• Chapter 16: BIOS-Level Programming
• Chapter 17: Expert MS-DOS Programming
• Appendix D: BIOS and MS-DOS Interrupts
• Appendix E: Answers to Review Questions (Chapters 14–17)
Students must use the access card located in the front of the book to register and access the online chapters and VideoNotes. If there is no access card in the front of this textbook, students can purchase access
by going to www.pearsonhighered.com/irvine and selecting “Video Notes and Web Chapters.” Instructors must also register on the site to access this material. Students will also find a link to the author’s Web
site. An access card is not required for the following materials, located at www.asmirvine.com:
• Getting Started, a comprehensive step-by-step tutorial that helps students customize Visual
Studio for assembly language programming.
• Supplementary articles on assembly language programming topics.
• Complete source code for all example programs in the book, as well as the source code for
the author’s supplementary library.
• Assembly Language Workbook, an interactive workbook covering number conversions, addressing modes, register usage, debug programming, and floating-point binary numbers. Content
pages are HTML documents to allow for customization. Help File in Windows Help Format.
• Debugging Tools: Tutorials on using the Microsoft Visual Studio debugger.
Acknowledgments
Many thanks are due to Tracy Johnson, Executive Editor for Computer Science at Pearson Education, who has provided friendly, helpful guidance over the past few years. Pavithra Jayapaul of
Jouve did an excellent job on the book production, along with Greg Dulles as the production
editor at Pearson.
Previous Editions
I offer my special thanks to the following individuals who were most helpful during the development of earlier editions of this book:
• William Barrett, San Jose State University
• Scott Blackledge
• James Brink, Pacific Lutheran University
• Gerald Cahill, Antelope Valley College
• John Taylor
About the Author
Kip Irvine has written five computer programming textbooks, for Intel Assembly Language,
C++, Visual Basic (beginning and advanced), and COBOL. His book Assembly Language for
Intel-Based Computers has been translated into six languages. His first college degrees (B.M.,
M.M., and doctorate) were in Music Composition, at University of Hawaii and University
of Miami. He began programming computers for music synthesis around 1982 and taught programming at Miami-Dade Community College for 17 years. Kip earned an M.S. degree in Computer Science from the University of Miami, and he has been a full-time member of the faculty
in the School of Computing and Information Sciences at Florida International University since
2000.
1 Basic Concepts
1.1 Welcome to Assembly Language
    1.1.1 Questions You Might Ask
    1.1.2 Assembly Language Applications
    1.1.3 Section Review
1.2 Virtual Machine Concept
    1.2.1 Section Review
1.3 Data Representation
    1.3.1 Binary Integers
    1.3.2 Binary Addition
    1.3.3 Integer Storage Sizes
    1.3.4 Hexadecimal Integers
    1.3.5 Hexadecimal Addition
    1.3.6 Signed Binary Integers
    1.3.7 Binary Subtraction
    1.3.8 Character Storage
    1.3.9 Section Review
1.4 Boolean Expressions
    1.4.1 Truth Tables for Boolean Functions
    1.4.2 Section Review
1.5 Chapter Summary
1.6 Key Terms
1.7 Review Questions and Exercises
    1.7.1 Short Answer
    1.7.2 Algorithm Workbench
This chapter establishes some core concepts relating to assembly language programming. For
example, it shows how assembly language fits into the wide spectrum of languages and applications. We introduce the virtual machine concept, which is so important in understanding the relationship between software and hardware layers. A large part of the chapter is devoted to the
binary and hexadecimal numbering systems, showing how to perform conversions and do basic
arithmetic. Finally, this chapter introduces fundamental boolean operations (AND, OR, NOT,
XOR), which will prove to be essential in later chapters.
1.1 Welcome to Assembly Language
Assembly Language for x86 Processors focuses on programming microprocessors compatible
with Intel and AMD processors running under 32-bit and 64-bit versions of Microsoft Windows.
The latest version of Microsoft Macro Assembler (known as MASM) should be used with this
book. MASM is included with most versions of Microsoft Visual Studio (Pro, Ultimate,
Express, ...). Please check our web site (asmirvine.com) for the latest details about support for
MASM in Visual Studio. We also include lots of helpful information about how to set up your
software and get started.
Some other well-known assemblers for x86 systems running under Microsoft Windows
include TASM (Turbo Assembler), NASM (Netwide Assembler), and MASM32 (a variant of
MASM). Two popular Linux-based assemblers are GAS (GNU assembler) and NASM. Of
these, NASM’s syntax is most similar to that of MASM.
Assembly language is the oldest programming language, and of all languages, bears the
closest resemblance to native machine language. It provides direct access to computer hardware, requiring you to understand much about your computer’s architecture and operating
system.
Educational Value Why read this book? Perhaps you’re taking a college course whose title is
similar to one of the following courses that often use our book:
• Microcomputer Assembly Language
• Assembly Language Programming
• Introduction to Computer Architecture
• Fundamentals of Computer Systems
• Embedded Systems Programming
This book will help you learn basic principles about computer architecture, machine language, and low-level programming. You will learn enough assembly language to test your
knowledge on today’s most widely used microprocessor family. You won’t be learning to program a “toy” computer using a simulated assembler; MASM is an industrial-strength assembler,
used by practicing professionals. You will learn the architecture of the Intel processor family
from a programmer’s point of view.
If you are planning to be a C or C++ developer, you need to develop an understanding of how
memory, addresses, and instructions work at a low level. Many programming errors are not easily recognized at the high-level language level. You will often find it necessary to “drill down”
into your program’s internals to find out why it isn’t working.
If you doubt the value of low-level programming and studying details of computer software
and hardware, take note of the following quote from a leading computer scientist, Donald Knuth,
in discussing his famous book series, The Art of Computer Programming:
Some people [say] that having machine language, at all, was the great mistake that I made.
I really don’t think you can write a book for serious computer programmers unless you are
able to discuss low-level detail.1
Visit this book’s web site to get lots of supplemental information, tutorials, and exercises at
www.asmirvine.com
1.1.1 Questions You Might Ask
What Background Should I Have? Before reading this book, you should have programmed
in at least one structured high-level language, such as Java, C, Python, or C++. You should know
how to use IF statements, arrays, and functions to solve programming problems.
What Are Assemblers and Linkers? An assembler is a utility program that converts source
code programs from assembly language into machine language. A linker is a utility program that combines individual files created by an assembler into a single executable program. A related utility, called a
debugger, lets you step through a program while it’s running and examine registers and memory.
What Hardware and Software Do I Need? You need a computer that runs a 32-bit or 64-bit
version of Microsoft Windows, along with one of the recent versions of Microsoft Visual Studio.
What Types of Programs Can Be Created Using MASM?
• 32-Bit Protected Mode: 32-bit protected mode programs run under all 32-bit versions of
Microsoft Windows. They are usually easier to write and understand than real-mode programs. From now on, we will simply call this 32-bit mode.
• 64-Bit Mode: 64-bit programs run under all 64-bit versions of Microsoft Windows.
• 16-Bit Real-Address Mode: 16-bit programs run under 32-bit versions of Windows and on
embedded systems. Because they are not supported by 64-bit Windows, we will restrict discussions of this mode to Chapters 14 through 17. These chapters are in electronic form, available from the publisher’s web site.
What Supplements Are Supplied with This Book? The book’s web site (www.asmirvine.com)
has the following:
• Assembly Language Workbook, a collection of tutorials
• Irvine32, Irvine64, and Irvine16 subroutine libraries for 32-bit, 64-bit, and 16-bit programming, respectively, with complete source code
• Example programs with all source code from the book
• Corrections to the book
• Getting Started, a detailed tutorial designed to help you set up Visual Studio to use the
Microsoft assembler
• Articles on advanced topics not included in the printed book for lack of space
• A link to an online discussion forum, where you can get help from other experts who use the book
What Will I Learn? This book should make you better informed about data representation,
debugging, programming, and hardware manipulation. Here’s what you will learn:
• Basic principles of computer architecture as applied to x86 processors
• Basic boolean logic and how it applies to programming and computer hardware
• How x86 processors manage memory, using protected mode and virtual mode
• How high-level language compilers (such as C++) translate statements from their language
into assembly language and native machine code
• How high-level languages implement arithmetic expressions, loops, and logical structures at
the machine level
• Data representation, including signed and unsigned integers, real numbers, and character data
• How to debug programs at the machine level. The need for this skill is vital when you work in
languages such as C and C++, which generate native machine code
• How application programs communicate with the computer’s operating system via interrupt
handlers and system calls
• How to interface assembly language code to C++ programs
• How to create assembly language application programs
How Does Assembly Language Relate to Machine Language? Machine language is a
numeric language specifically understood by a computer’s processor (the CPU). All x86 processors
understand a common machine language. Assembly language consists of statements written with
short mnemonics such as ADD, MOV, SUB, and CALL. Assembly language has a one-to-one relationship with machine language: Each assembly language instruction corresponds to a
single machine-language instruction.
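To make the idea of a "numeric language" concrete, the following C++ sketch builds the bytes of one machine instruction by hand. It assumes the standard 32-bit encoding of `mov eax, imm32`: the opcode byte B8h followed by the 32-bit constant stored low byte first. The function name `encode_mov_eax_imm32` is hypothetical and is not part of the book's libraries.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical helper, for illustration only: returns the five bytes an
// assembler would emit for the single statement "mov eax, <value>".
std::vector<std::uint8_t> encode_mov_eax_imm32(std::uint32_t value) {
    std::vector<std::uint8_t> bytes;
    bytes.push_back(0xB8);  // opcode: MOV EAX, imm32
    for (int i = 0; i < 4; ++i) {
        // The immediate operand is stored in little-endian order (low byte first).
        bytes.push_back(static_cast<std::uint8_t>((value >> (8 * i)) & 0xFF));
    }
    return bytes;
}
```

Under these assumptions, calling `encode_mov_eax_imm32(3)` yields the bytes B8 03 00 00 00: one assembly language statement, one machine instruction.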
How Do C++ and Java Relate to Assembly Language? High-level languages such as
Python, C++, and Java have a one-to-many relationship with assembly language and machine
language. A single statement in C++, for example, expands into multiple assembly language or
machine instructions. Most people cannot read raw machine code, so in this book, we examine
its closest relative, assembly language. For example, the following C++ code carries out two
arithmetic operations and assigns the result to a variable. Assume X and Y are integers:
    int Y;
    int X = (Y + 4) * 3;
Following is the equivalent translation to assembly language. The translation requires multiple
statements because each assembly language statement corresponds to a single machine instruction:
    mov  eax,Y        ; move Y to the EAX register
    add  eax,4        ; add 4 to the EAX register
    mov  ebx,3        ; move 3 to the EBX register
    imul ebx          ; multiply EAX by EBX
    mov  X,eax        ; move EAX to X
(Registers are named storage locations in the CPU that hold intermediate results of operations.)
The point of this example is not to claim that C++ is superior to assembly language or vice
versa, but to show their relationship.
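The one-to-many relationship can also be checked from the C++ side. The sketch below replays the five assembly statements one at a time, using ordinary local variables named after the registers. Both function names are hypothetical, and the sketch assumes Y is small enough that no step overflows a 32-bit integer.

```cpp
#include <cstdint>

// Hypothetical helper: each line stands in for one assembly statement
// from the listing above, with locals playing the role of registers.
std::int32_t translate_step_by_step(std::int32_t Y) {
    std::int32_t eax = Y;          // mov  eax,Y
    eax = eax + 4;                 // add  eax,4
    std::int32_t ebx = 3;          // mov  ebx,3
    // imul ebx: the one-operand form produces a 64-bit product in EDX:EAX;
    // only the low 32 bits (EAX) are kept here.
    std::int64_t product = static_cast<std::int64_t>(eax) * ebx;
    eax = static_cast<std::int32_t>(product);
    return eax;                    // mov  X,eax
}

// The original single C++ statement, for comparison.
std::int32_t translate_direct(std::int32_t Y) {
    return (Y + 4) * 3;
}
```

For small inputs the two functions agree: with Y equal to 5, both produce 27.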
Is Assembly Language Portable? A language whose source programs can be compiled and
run on a wide variety of computer systems is said to be portable. A C++ program, for example,
will compile and run on just about any computer, unless it makes specific references to library
functions that exist under a single operating system. A major feature of the Java language is that
compiled programs run on nearly any computer system.
Assembly language is not portable, because it is designed for a specific processor family. There
are a number of different assembly languages widely used today, each based on a processor family.
Some well-known processor families are Motorola 68x00, x86, SUN Sparc, Vax, and IBM-370.
The instructions in assembly language may directly match the computer’s architecture or they may
be translated during execution by a program inside the processor known as a microcode interpreter.
Why Learn Assembly Language? If you’re still not convinced that you should learn assembly
language, consider the following points:
• If you study computer engineering, you will likely be asked to write embedded programs.
They are short programs stored in a small amount of memory in single-purpose devices such
as telephones, automobile fuel and ignition systems, air-conditioning control systems, security systems, data acquisition instruments, video cards, sound cards, hard drives, modems,
and printers. Assembly language is an ideal tool for writing embedded programs because of
its economical use of memory.
• Real-time applications dealing with simulation and hardware monitoring require precise
timing and responses. High-level languages do not give programmers exact control over
machine code generated by compilers. Assembly language permits you to precisely specify a
program’s executable code.
• Computer game consoles require their software to be highly optimized for small code size and fast
execution. Game programmers are experts at writing code that takes full advantage of hardware
features in a target system. They often use assembly language as their tool of choice because it
permits direct access to computer hardware, and code can be hand optimized for speed.
• Assembly language helps you to gain an overall understanding of the interaction between
computer hardware, operating systems, and application programs. Using assembly language,
you can apply and test theoretical information you are given in computer architecture and
operating systems courses.
• Some high-level languages abstract their data representation to the point that it becomes awkward to perform low-level tasks such as bit manipulation. In such an environment, programmers will often call subroutines written in assembly language to accomplish their goal.
• Hardware manufacturers create device drivers for the equipment they sell. Device drivers
are programs that translate general operating system commands into specific references to
hardware details. Printer manufacturers, for example, create a different MS-Windows device
driver for each model they sell. Often these device drivers contain significant amounts of
assembly language code.
Are There Rules in Assembly Language? Most rules in assembly language are based on
physical limitations of the target processor and its machine language. The CPU, for example,
requires two instruction operands to be the same size. Assembly language has fewer rules than
C++ or Java because the latter use syntax rules to reduce unintended logic errors at the expense
of low-level data access. Assembly language programmers can easily bypass restrictions characteristic of high-level languages. Java, for example, does not permit access to specific memory
addresses. One can work around the restriction by calling a C function using JNI (Java Native
Interface) classes, but the resulting program can be awkward to maintain. Assembly language,
on the other hand, can access any memory address. The price for such freedom is high: Assembly language programmers spend a lot of time debugging!
1.1.2 Assembly Language Applications
In the early days of programming, most applications were written partially or entirely in assembly language. They had to fit in a small area of memory and run as efficiently as possible on slow
processors. As memory became more plentiful and processors dramatically increased in speed,
programs became more complex. Programmers switched to high-level languages such as C,
FORTRAN, and COBOL that contained a certain amount of structuring capability. More
recently, object-oriented languages such as Python, C++, C#, and Java have made it possible to
write complex programs containing millions of lines of code.
It is rare to see large application programs coded completely in assembly language because
they would take too much time to write and maintain. Instead, assembly language is used to optimize certain sections of application programs for speed and to access computer hardware.
Table 1-1 compares the adaptability of assembly language to high-level languages in relation to
various types of applications.
Table 1-1 Comparison of Assembly Language to High-Level Languages.

Type of Application | High-Level Languages | Assembly Language
Commercial or scientific application, written for single platform, medium to large size. | Formal structures make it easy to organize and maintain large sections of code. | Minimal formal structure, so one must be imposed by programmers who have varying levels of experience. This leads to difficulties maintaining existing code.
Hardware device driver. | The language may not provide for direct hardware access. Even if it does, awkward coding techniques may be required, resulting in maintenance difficulties. | Hardware access is straightforward and simple. Easy to maintain when programs are short and well documented.
Commercial or scientific application written for multiple platforms (different operating systems). | Usually portable. The source code can be recompiled on each target operating system with minimal changes. | Must be recoded separately for each platform, using an assembler with a different syntax. Difficult to maintain.
Embedded systems and computer games requiring direct hardware access. | May produce large executable files that exceed the memory capacity of the device. | Ideal, because the executable code is small and runs quickly.
The C and C++ languages have the unique quality of offering a compromise between high-level structure and low-level details. Direct hardware access is possible but completely nonportable. Most C and C++ compilers allow you to embed assembly language statements in their
code, providing access to hardware details.
1.1.3 Section Review
1. How do assemblers and linkers work together?
2. How will studying assembly language enhance your understanding of operating systems?
3. What is meant by a one-to-many relationship when comparing a high-level language to
machine language?
4. Explain the concept of portability as it applies to programming languages.
5. Is the assembly language for x86 processors the same as those for computer systems such as
the Vax or Motorola 68x00?
6. Give an example of an embedded systems application.
7. What is a device driver?
8. Do you suppose type checking on pointer variables is stronger (stricter) in assembly language, or in C and C++?
9. Name two types of applications that would be better suited to assembly language than a
high-level language.
10. Why would a high-level language not be an ideal tool for writing a program that directly
accesses a printer port?
11. Why is assembly language not usually used when writing large application programs?
12. Challenge: Translate the following C++ expression to assembly language, using the example
presented earlier in this chapter as a guide: X = (Y * 4) + 3.
1.2 Virtual Machine Concept
An effective way to explain how a computer’s hardware and software are related is called the
virtual machine concept. A well-known explanation of this model can be found in Andrew
Tanenbaum’s book, Structured Computer Organization. To explain this concept, let us begin
with the most basic function of a computer, executing programs.
A computer can usually execute programs written in its native machine language. Each
instruction in this language is simple enough to be executed using a relatively small number of
electronic circuits. For simplicity, we will call this language L0.
Programmers would have a difficult time writing programs in L0 because it is enormously
detailed and consists purely of numbers. If a new language, L1, could be constructed that was
easier to use, programs could be written in L1. There are two ways to achieve this:
• Interpretation: As the L1 program is running, each of its instructions could be decoded and
executed by a program written in language L0. The L1 program begins running immediately,
but each instruction has to be decoded before it can execute.
• Translation: The entire L1 program could be converted into an L0 program by an L0 program
specifically designed for this purpose. Then the resulting L0 program could be executed
directly on the computer hardware.
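The difference between the two approaches can be sketched in C++. Everything here is an assumption made for illustration: the two-instruction L1 language (`Add`, `Mul`), the use of C++ callables as stand-ins for L0 instructions, and the function names `interpret`, `translate`, and `run`. In a real system both the interpreter and the translator would themselves be L0 programs.

```cpp
#include <functional>
#include <vector>

// Hypothetical L1 instruction set, invented for this sketch.
enum class Op { Add, Mul };

struct L1Instruction {
    Op  op;
    int operand;
};

// Interpretation: each L1 instruction is decoded and executed while the
// program runs, so the decoding cost is paid on every execution.
int interpret(const std::vector<L1Instruction>& program, int acc) {
    for (const L1Instruction& ins : program) {
        switch (ins.op) {  // decode step
            case Op::Add: acc += ins.operand; break;
            case Op::Mul: acc *= ins.operand; break;
        }
    }
    return acc;
}

// Stand-in for an L0 instruction: an operation that needs no further decoding.
using L0Instruction = std::function<int(int)>;

// Translation: the entire L1 program is converted to an L0 program up front.
std::vector<L0Instruction> translate(const std::vector<L1Instruction>& program) {
    std::vector<L0Instruction> out;
    for (const L1Instruction& ins : program) {
        const int k = ins.operand;
        if (ins.op == Op::Add) {
            out.push_back([k](int a) { return a + k; });
        } else {
            out.push_back([k](int a) { return a * k; });
        }
    }
    return out;
}

// Executes an already translated L0 program directly.
int run(const std::vector<L0Instruction>& program, int acc) {
    for (const L0Instruction& step : program) {
        acc = step(acc);
    }
    return acc;
}
```

Given the L1 program "add 4, then multiply by 3" and a starting value of 5, `interpret(program, 5)` and `run(translate(program), 5)` both return 27. The interpreted version starts immediately but decodes on every pass, whereas the translated version pays the conversion cost once.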
Virtual Machines
Rather than using only languages, it is easier to think in terms of a hypothetical computer, or virtual machine, at each level. Informally, we can define a virtual machine as a software program
that emulates the functions of some other physical or virtual computer. The virtual machine
VM1, as we will call it, can execute commands written in language L1. The virtual machine
VM0 can execute commands written in language L0:
[Figure: Virtual Machine VM1 shown above Virtual Machine VM0]
Each virtual machine can be constructed of either hardware or software. People can write programs for virtual machine VM1, and if it is practical to implement VM1 as an actual computer,
programs can be executed directly on the hardware. Or programs written in VM1 can be interpreted/translated and executed on machine VM0.
Machine VM1 cannot be radically different from VM0 because the translation or interpretation would be too time-consuming. What if the language VM1 supports is still not programmer-friendly enough to be used for useful applications? Then another virtual machine, VM2, can be
designed that is more easily understood. This process can be repeated until a virtual machine
VMn can be designed to support a powerful, easy-to-use language.
The Java programming language is based on the virtual machine concept. A program written
in the Java language is translated by a Java compiler into Java byte code. The latter is a low-level
language quickly executed at runtime by a program known as a Java virtual machine (JVM). The
JVM has been implemented on many different computer systems, making Java programs relatively system independent.
Specific Machines
Let us relate this to actual computers and languages, using names such as Level 2 for VM2 and Level 1
for VM1, shown in Figure 1-1. A computer’s digital logic hardware represents machine Level 1. Above
this is Level 2, called the instruction set architecture (ISA). This is the first level at which users can typically write programs, although the programs consist of binary values called machine language.
Instruction Set Architecture (Level 2) Computer chip manufacturers design into the processor an instruction set to carry out basic operations, such as move, add, or multiply. This set of
instructions is also referred to as machine language. Each machine-language instruction is executed either directly by the computer’s hardware or by a program embedded in the microprocessor
chip called a microprogram. A discussion of microprograms is beyond the scope of this book, but
you can refer to Tanenbaum for more details.
Assembly Language (Level 3) Above the ISA level, programming languages provide translation layers to make large-scale software development practical. Assembly language, which
appears at Level 3, uses short mnemonics such as ADD, SUB, and MOV, which are easily translated to the ISA level. Assembly language programs are translated (assembled) in their entirety
into machine language before they begin to execute.
Figure 1–1 Virtual machine levels.
Level 4: High-level language
Level 3: Assembly language
Level 2: Instruction set architecture (ISA)
Level 1: Digital logic
High-Level Languages (Level 4) At Level 4 are high-level programming languages such as
C, C++, and Java. Programs in these languages contain powerful statements that translate into
multiple assembly language instructions. You can see such a translation, for example, by examining the listing file output created by a C++ compiler. The assembly language code is automatically assembled by the compiler into machine language.
1.2.1 Section Review
1. In your own words, describe the virtual machine concept.
2. Why do you suppose translated programs often execute more quickly than interpreted ones?
3. (True/False): When an interpreted program written in language L1 runs, each of its instructions is decoded and executed by a program written in language L0.
4. Explain the importance of translation when dealing with languages at different virtual
machine levels.
5. At which level does assembly language appear in the virtual machine example shown in this
section?
6. What software utility permits compiled Java programs to run on almost any computer?
7. Name the four virtual machine levels named in this section, from lowest to highest.
8. Why don’t programmers write applications in machine language?
9. Machine language is used at which level of the virtual machine shown in Figure 1-1?
10. Statements at the assembly language level of a virtual machine are translated into statements at which other level?
1.3 Data Representation
Assembly language programmers deal with data at the physical level, so they must be adept at
examining memory and registers. Often, binary numbers are used to describe the contents of
computer memory; at other times, decimal and hexadecimal numbers are used. You must develop
a certain fluency with number formats, so you can quickly translate numbers from one format to
another.
Each numbering format, or system, has a base, or maximum number of symbols that can be
assigned to a single digit. Table 1-2 shows the possible digits for the numbering systems used
most commonly in hardware and software manuals. In the last row of the table, hexadecimal
numbers use the digits 0 through 9 and continue with the letters A through F to represent decimal values 10 through 15. It is quite common to use hexadecimal numbers when showing the
contents of computer memory and machine-level instructions.
1.3.1 Binary Integers
A computer stores instructions and data in memory as collections of electronic charges. Representing
these entities with numbers requires a system geared to the concepts of on and off or true and false.
Binary numbers are base 2 numbers, in which each binary digit (called a bit) is either 0 or 1. Bits are
numbered sequentially starting at zero on the right side and increasing toward the left. The bit on the
left is called the most significant bit (MSB), and the bit on the right is the least significant bit (LSB).
The MSB and LSB bit numbers of a 16-bit binary number are shown in the following figure:
MSB                          LSB
 1 0 1 1 0 0 1 0 1 0 0 1 1 1 0 0
15                             0    (bit number)
Table 1-2 Binary, Octal, Decimal, and Hexadecimal Digits.
System        Base   Possible Digits
Binary        2      0 1
Octal         8      0 1 2 3 4 5 6 7
Decimal       10     0 1 2 3 4 5 6 7 8 9
Hexadecimal   16     0 1 2 3 4 5 6 7 8 9 A B C D E F
Binary integers can be signed or unsigned. A signed integer is positive or negative. An
unsigned integer is by default positive. Zero is considered positive. When writing down large
binary numbers, many people like to insert a dot every 4 bits or 8 bits to make the numbers easier to read. Examples are 1101.1110.0011.1000.0000 and 11001010.10101100.
Unsigned Binary Integers
Starting with the LSB, each bit in an unsigned binary integer represents an increasing power of
2. The following figure contains an 8-bit binary number, showing how powers of two increase
from right to left:
  1     1     1     1     1     1     1     1
 2^7   2^6   2^5   2^4   2^3   2^2   2^1   2^0
Table 1-3 lists the decimal values of 2^0 through 2^15.
Table 1-3 Binary Bit Position Values.
2^n    Decimal Value    2^n     Decimal Value
2^0    1                2^8     256
2^1    2                2^9     512
2^2    4                2^10    1024
2^3    8                2^11    2048
2^4    16               2^12    4096
2^5    32               2^13    8192
2^6    64               2^14    16384
2^7    128              2^15    32768
Translating Unsigned Binary Integers to Decimal
Weighted positional notation represents a convenient way to calculate the decimal value of an
unsigned binary integer having n digits:
$dec = (D_{n-1} \times 2^{n-1}) + (D_{n-2} \times 2^{n-2}) + \cdots + (D_1 \times 2^1) + (D_0 \times 2^0)$
D indicates a binary digit. For example, binary 00001001 is equal to 9. We calculate this value
by leaving out terms equal to zero:
$(1 \times 2^3) + (1 \times 2^0) = 9$
The same calculation is shown by the following figure:
0 0 0 0 1 0 0 1    (8 + 1 = 9)
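The weighted positional calculation can be sketched in a few lines of Python. This is only an illustration; the function name binary_to_decimal is an assumption made for this example and does not come from the text.

```python
def binary_to_decimal(bits: str) -> int:
    """Sum D_i * 2**i for every bit, counting i from the right (LSB is bit 0)."""
    total = 0
    for i, digit in enumerate(reversed(bits)):
        total += int(digit) * 2 ** i  # terms with digit 0 contribute nothing
    return total


print(binary_to_decimal("00001001"))  # prints 9
```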
Translating Unsigned Decimal Integers to Binary
To translate an unsigned decimal integer into binary, repeatedly divide the integer by 2, saving each
remainder as a binary digit. The following table shows the steps required to translate decimal 37 to
binary. The remainder digits, starting from the top row, are the binary digits D0, D1, D2, D3, D4, and D5:
Division   Quotient   Remainder
37 / 2     18         1
18 / 2     9          0
9 / 2      4          1
4 / 2      2          0
2 / 2      1          0
1 / 2      0          1
We can concatenate the binary bits from the remainder column of the table in reverse order
(D5, D4, . . .) to produce binary 100101. Because computer storage always consists of binary
numbers whose lengths are multiples of 8, we fill the remaining two digit positions on the left
with zeros, producing 00100101.
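The repeated-division steps can be sketched as follows in Python. The function name decimal_to_binary and the width parameter (padding to a multiple of 8 bits) are assumptions made for this illustration.

```python
def decimal_to_binary(value: int, width: int = 8) -> str:
    """Divide by 2 repeatedly, collecting remainders as bits D0, D1, D2, ..."""
    remainders = []
    while value > 0:
        value, remainder = divmod(value, 2)
        remainders.append(str(remainder))
    # The remainders were produced lowest bit first, so reverse them.
    digits = "".join(reversed(remainders)) or "0"
    # Pad on the left with zeros up to the next multiple of `width` bits.
    padded_length = -(-len(digits) // width) * width
    return digits.zfill(padded_length)


print(decimal_to_binary(37))  # prints 00100101
```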
Tip: How many bits? There’s a simple formula to find b, the number of binary bits you need to
represent the unsigned decimal value n. It is b = ceiling(log2(n + 1)). If n = 17, for example, log2 18 =
4.169925, which when rounded up to the next integer, equals 5. (The shorter form ceiling(log2 n) gives the same answer unless n is an exact power of 2, in which case it is one bit too small.) Most calculators don’t
have a log base 2 operation, but you can find web pages that will calculate it for you.
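As a rough check of the bit-count formula, the Python sketch below uses the form ceiling(log2(n + 1)), which also handles exact powers of 2 such as 16. The function name bits_needed is an assumption for this example; Python's built-in int.bit_length() is used only as a cross-check.

```python
import math


def bits_needed(n: int) -> int:
    """Number of binary bits required to hold the unsigned value n (n >= 0)."""
    return max(1, math.ceil(math.log2(n + 1)))


print(bits_needed(17))  # prints 5
print(bits_needed(16))  # prints 5
print((17).bit_length())  # prints 5, the built-in equivalent for n > 0
```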
1.3.2 Binary Addition
When adding two binary integers, proceed bit by bit, starting with the low-order pair of bits (on
the right) and add each subsequent pair of bits. There are four ways to add two binary digits, as
shown here:
0 + 0 = 0
0 + 1 = 1
1 + 0 = 1
1 + 1 = 10
When adding 1 to 1, the result is 10 binary (think of it as the decimal value 2). The extra
digit generates a carry to the next-highest bit position. In the following figure, we add binary
00000100 to 00000111:
Carry:                  1
                0 0 0 0 0 1 0 0    (4)
              + 0 0 0 0 0 1 1 1    (7)
                ---------------
                0 0 0 0 1 0 1 1    (11)
Bit position:   7 6 5 4 3 2 1 0
Beginning with the lowest bit in each number (bit position 0), we add 0 + 1, producing a 1 in
the bottom row. The same happens in the next highest bit (position 1). In bit position 2, we add
1 + 1, generating a sum of zero and a carry of 1. In bit position 3, we add the carry bit to 0 + 0,
producing 1. The rest of the bits are zeros. You can verify the addition by adding the decimal
equivalents shown on the right side of the figure (4 + 7 = 11).
Sometimes a carry is generated out of the highest bit position. When that happens, the size
of the storage area set aside becomes important. If we add 11111111 to 00000001, for example, a 1 carries out of the highest bit position, and the lowest 8 bits of the sum equal all zeros.
If the storage location for the sum is at least 9 bits long, we can represent the sum as
100000000. But if the storage location can hold only 8 bits, the stored sum will equal 00000000, the lowest 8 bits of
the calculated value.
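The bit-by-bit procedure, including the carry out of the highest position, can be sketched in Python. The function name add_binary is an assumption for this illustration, and both operands are assumed to be bit strings of equal length.

```python
def add_binary(a: str, b: str):
    """Add two equal-length bit strings; return (stored sum, carry out)."""
    carry = 0
    result = []
    # Walk from the low-order pair of bits (rightmost) to the high-order pair.
    for bit_a, bit_b in zip(reversed(a), reversed(b)):
        total = int(bit_a) + int(bit_b) + carry
        result.append(str(total % 2))  # the bit stored in this position
        carry = total // 2             # 1 when the pair produced binary 10 or 11
    return "".join(reversed(result)), carry


print(add_binary("00000100", "00000111"))  # prints ('00001011', 0)
print(add_binary("11111111", "00000001"))  # prints ('00000000', 1)
```

In the second call the carry of 1 is the ninth bit that an 8-bit storage location cannot hold.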
1.3.3 Integer Storage Sizes
The basic storage unit for all data in an x86 computer is a byte, containing 8 bits. Other
storage sizes are word (2 bytes), doubleword (4 bytes), and quadword (8 bytes). In the following
figure, the number of bits is shown for each size:
Size              Bits
Byte              8
Word              16
Doubleword        32
Quadword          64
Double quadword   128
Table 1-4 shows the range of possible values for each type of unsigned integer.
Large Measurements A number of large measurements are used when referring to both
memory and disk space:
• One kilobyte is equal to 2^10, or 1024 bytes.
• One megabyte (1 MByte) is equal to 2^20, or 1,048,576 bytes.
• One gigabyte (1 GByte) is equal to 2^30, or 1024^3, or 1,073,741,824 bytes.
• One terabyte (1 TByte) is equal to 2^40, or 1024^4, or 1,099,511,627,776 bytes.
• One petabyte is equal to 2^50, or 1,125,899,906,842,624 bytes.
• One exabyte is equal to 2^60, or 1,152,921,504,606,846,976 bytes.
• One zettabyte is equal to 2^70 bytes.
• One yottabyte is equal to 2^80 bytes.
Table 1-4 Ranges and Sizes of Unsigned Integer Types.
Type                       Range             Storage Size in Bits
Unsigned byte              0 to 2^8 − 1      8
Unsigned word              0 to 2^16 − 1     16
Unsigned doubleword        0 to 2^32 − 1     32
Unsigned quadword          0 to 2^64 − 1     64
Unsigned double quadword   0 to 2^128 − 1    128
1.3.4 Hexadecimal Integers
Large binary numbers are cumbersome to read, so hexadecimal digits offer a convenient way to
represent binary data. Each digit in a hexadecimal integer represents four binary bits, and two
hexadecimal digits together represent a byte. A single hexadecimal digit represents decimal 0 to
15, so letters A to F represent decimal values in the range 10 through 15. Table 1-5 shows how
each sequence of four binary bits translates into a decimal or hexadecimal value.
Table 1-5 Binary, Decimal, and Hexadecimal Equivalents.
Binary   Decimal   Hexadecimal   Binary   Decimal   Hexadecimal
0000     0         0             1000     8         8
0001     1         1             1001     9         9
0010     2         2             1010     10        A
0011     3         3             1011     11        B
0100     4         4             1100     12        C
0101     5         5             1101     13        D
0110     6         6             1110     14        E
0111     7         7             1111     15        F
The following example shows how binary 0001 0110 1010 0111 1001 0100 is equivalent to
hexadecimal 16A794:
   1    6    A    7    9    4
0001 0110 1010 0111 1001 0100
Converting Unsigned Hexadecimal to Decimal
In hexadecimal, each digit position represents a power of 16. This is helpful when calculating the
decimal value of a hexadecimal integer. Suppose we number the digits in a four-digit hexadecimal
integer with subscripts as D_3 D_2 D_1 D_0. The following formula calculates the integer’s decimal value:
$dec = (D_3 \times 16^3) + (D_2 \times 16^2) + (D_1 \times 16^1) + (D_0 \times 16^0)$
The formula can be generalized for any n-digit hexadecimal integer:
$dec = (D_{n-1} \times 16^{n-1}) + (D_{n-2} \times 16^{n-2}) + \cdots + (D_1 \times 16^1) + (D_0 \times 16^0)$
In general, you can convert an n-digit integer in any base B to decimal using the following
formula: $dec = (D_{n-1} \times B^{n-1}) + (D_{n-2} \times B^{n-2}) + \cdots + (D_1 \times B^1) + (D_0 \times B^0)$.
For example, hexadecimal 1234 is equal to $(1 \times 16^3) + (2 \times 16^2) + (3 \times 16^1) + (4 \times 16^0)$, or
decimal 4660. Similarly, hexadecimal 3BA4 is equal to $(3 \times 16^3) + (11 \times 16^2) + (10 \times 16^1) + (4 \times 16^0)$,
or decimal 15,268. The following figure shows this last calculation:
Digit 3:    3 × 16^3 =   12,288
Digit B:   11 × 16^2 =    2,816
Digit A:   10 × 16^1 =      160
Digit 4:    4 × 16^0 = +      4
Total:                   15,268
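The general base-B formula can be sketched in Python as shown here. The names DIGITS and to_decimal are assumptions made for this illustration, and bases above 16 are not handled.

```python
DIGITS = "0123456789ABCDEF"


def to_decimal(digits: str, base: int) -> int:
    """Sum D_i * base**i over every digit, counting i from the right."""
    total = 0
    for position, symbol in enumerate(reversed(digits.upper())):
        total += DIGITS.index(symbol) * base ** position
    return total


print(to_decimal("1234", 16))  # prints 4660
print(to_decimal("3BA4", 16))  # prints 15268
```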
Table 1-6 lists the powers of 16 from 16^0 to 16^7.
Table 1-6 Powers of 16 in Decimal.
16^n    Decimal Value    16^n    Decimal Value
16^0    1                16^4    65,536
16^1    16               16^5    1,048,576
16^2    256              16^6    16,777,216
16^3    4096             16^7    268,435,456
Converting Unsigned Decimal to Hexadecimal
To convert an unsigned decimal integer to hexadecimal, repeatedly divide the decimal value by
16 and retain each remainder as a hexadecimal digit. For example, the following table lists the
steps when converting decimal 422 to hexadecimal:
Division   Quotient   Remainder
422 / 16   26         6
26 / 16    1          A
1 / 16     0          1
The resulting hexadecimal number is assembled from the digits in the remainder column, starting from the last row and working upward to the top row. In this example, the hexadecimal representation is 1A6. The same algorithm was used for binary integers in Section 1.3.1. To convert
from decimal into a number base other than hexadecimal, replace the divisor (16) in
each calculation with the desired number base.
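The same repeated-division algorithm, with the divisor as a parameter, might look like this in Python. The names DIGITS and from_decimal are assumptions made for this illustration.

```python
DIGITS = "0123456789ABCDEF"


def from_decimal(value: int, base: int = 16) -> str:
    """Divide by the base repeatedly; each remainder becomes one digit."""
    if value == 0:
        return "0"
    remainders = []
    while value > 0:
        value, remainder = divmod(value, base)
        remainders.append(DIGITS[remainder])
    # Read the remainder column from the last row up to the first.
    return "".join(reversed(remainders))


print(from_decimal(422))    # prints 1A6
print(from_decimal(37, 2))  # prints 100101
```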
1.3.5 Hexadecimal Addition
Debugging utility programs (known as debuggers) usually display memory addresses in hexadecimal. It is often necessary to add two addresses in order to locate a new address. Fortunately, hexadecimal addition works the same way as decimal addition, if you just change the
number base.
Suppose we want to add two numbers X and Y, using numbering base b. We will number
their digits from the lowest position (x_0) to the highest. If we add digits x_i and y_i in X and
Y, we produce the value s_i. If s_i ≥ b, we recalculate s_i = (s_i MOD b) and generate a carry
value of 1. When we move to the next pair of digits x_{i+1} and y_{i+1}, we add the carry value to
their sum.
For example, let’s add the hexadecimal values 6A2 and 49A. In the lowest digit position,
2 + A = decimal 12, so there is no carry and we use C to indicate the hexadecimal sum
digit. In the next position, A + 9 = decimal 19, so there is a carry because 19 ≥ 16, the number base. We calculate 19 MOD 16 = 3, and carry a 1 into the third digit position. Finally,
we add 1 + 6 + 4 = decimal 11, which is shown as the letter B in the third position of the
sum. The hexadecimal sum is B3C.
Carry:   1
X:       6 A 2
Y:       4 9 A
S:       B 3 C
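The digit-by-digit rule (add, compare with the base, take MOD, carry 1) can be sketched in Python. The names DIGITS and add_in_base are assumptions made for this illustration.

```python
DIGITS = "0123456789ABCDEF"


def add_in_base(x: str, y: str, base: int = 16) -> str:
    """Add two digit strings in the given base, lowest digit first."""
    width = max(len(x), len(y))
    x, y = x.upper().zfill(width), y.upper().zfill(width)
    carry = 0
    result = []
    for digit_x, digit_y in zip(reversed(x), reversed(y)):
        s = DIGITS.index(digit_x) + DIGITS.index(digit_y) + carry
        carry = 1 if s >= base else 0   # carry into the next digit position
        result.append(DIGITS[s % base])
    if carry:
        result.append("1")              # carry out of the highest position
    return "".join(reversed(result))


print(add_in_base("6A2", "49A"))  # prints B3C
```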
1.3.6 Signed Binary Integers
Signed binary integers are positive or negative. For x86 processors, the MSB indicates the
sign: 0 is positive and 1 is negative. The following figure shows examples of 8-bit negative
and positive integers:
Sign bit
|
1 1 1 1 0 1 1 0    Negative
0 0 0 0 1 0 1 0    Positive
Two’s-Complement Representation
Negative integers use two’s-complement representation, using the mathematical principle that
the two’s complement of an integer is its additive inverse. (If you add a number to its additive
inverse, the sum is zero.)
Two’s-complement representation is useful to processor designers because it removes the need
for separate digital circuits to handle both addition and subtraction. For example, if presented with
the expression A − B, the processor can simply convert it to an addition expression: A + (−B).
The two’s complement of a binary integer is formed by inverting (complementing) its bits
and adding 1. Using the 8-bit binary value 00000001, for example, its two’s complement turns
out to be 11111111, as can be seen as follows:
Starting value                             00000001
Step 1: Reverse the bits                   11111110
Step 2: Add 1 to the value from Step 1     11111110
                                          +00000001
Sum: Two’s-complement representation       11111111
11111111 is the two’s-complement representation of −1. The two’s-complement operation is
reversible, so the two’s complement of 11111111 is 00000001.
Hexadecimal Two’s Complement To create the two’s complement of a hexadecimal integer,
reverse all bits and add 1. An easy way to reverse the bits of a hexadecimal digit is to subtract the
digit from 15. Here are examples of hexadecimal integers converted to their two’s complements:
6A3D –> 95C2 + 1 –> 95C3
95C3 –> 6A3C + 1 –> 6A3D
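The invert-and-add-1 rule can be sketched in Python with a bit mask that stands in for the fixed storage size. The function name twos_complement and the bits parameter are assumptions made for this illustration.

```python
def twos_complement(value: int, bits: int = 8) -> int:
    """Return the two's complement of `value` within a `bits`-wide field."""
    mask = (1 << bits) - 1
    inverted = value ^ mask        # Step 1: reverse every bit
    return (inverted + 1) & mask   # Step 2: add 1 and discard any carry out


print(format(twos_complement(0b00000001), "08b"))  # prints 11111111
print(format(twos_complement(0x6A3D, 16), "04X"))  # prints 95C3
print(format(twos_complement(0x95C3, 16), "04X"))  # prints 6A3D
```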
Converting Signed Binary to Decimal Use the following algorithm to calculate the decimal
equivalent of a signed binary integer:
• If the highest bit is a 1, the number is stored in two’s-complement notation. Create its two’s
complement a second time to get its positive equivalent. Then convert this new number to
decimal as if it were an unsigned binary integer.
• If the highest bit is a 0, you can convert it to decimal as if it were an unsigned binary integer.
For example, signed binary 11110000 has a 1 in the highest bit, indicating that it is a negative
integer. First we create its two’s complement, and then convert the result to decimal. Here are the
steps in the process:
Starting value                             11110000
Step 1: Reverse the bits                   00001111
Step 2: Add 1 to the value from Step 1     00001111
                                          +       1
Step 3: Create the two’s complement        00010000
Step 4: Convert to decimal                 16
Because the original integer (11110000) was negative, we know that its decimal value is −16.
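The two-case algorithm for signed binary can be sketched in Python. The function name signed_binary_to_decimal is an assumption for this illustration, and the width of the integer is taken from the length of the bit string.

```python
def signed_binary_to_decimal(bits: str) -> int:
    """Interpret a bit string as a two's-complement signed integer."""
    mask = (1 << len(bits)) - 1
    value = int(bits, 2)
    if bits[0] == "1":
        # Negative: take the two's complement to get the magnitude.
        magnitude = ((value ^ mask) + 1) & mask
        return -magnitude
    return value  # Highest bit is 0: same as the unsigned value


print(signed_binary_to_decimal("11110000"))  # prints -16
print(signed_binary_to_decimal("00001010"))  # prints 10
```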
Converting Signed Decimal to Binary To create the binary representation of a signed decimal integer, do the following:
1. Convert the absolute value of the decimal integer to binary.
2. If the original decimal integer was negative, create the two’s complement of the binary number from the previous step.
For example, −43 decimal is translated to binary as follows:
1. The binary representation of unsigned 43 is 00101011.
2. Because the original value was negative, we create the two’s complement of 00101011,
which is 11010101. This is the representation of −43 decimal.
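The two steps for a signed decimal value can be sketched in Python. The function name signed_decimal_to_binary and the default width of 8 bits are assumptions for this illustration; the value is assumed to fit in the chosen width.

```python
def signed_decimal_to_binary(value: int, bits: int = 8) -> str:
    """Return the two's-complement bit string for a signed decimal integer."""
    mask = (1 << bits) - 1
    pattern = abs(value)                     # Step 1: binary of the absolute value
    if value < 0:
        pattern = ((pattern ^ mask) + 1) & mask  # Step 2: two's complement
    return format(pattern, f"0{bits}b")


print(signed_decimal_to_binary(43))   # prints 00101011
print(signed_decimal_to_binary(-43))  # prints 11010101
```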
Converting Signed Decimal to Hexadecimal To convert a signed decimal integer to hexadecimal, do the following:
1. Convert the absolute value of the decimal integer to hexadecimal.
2. If the decimal integer was negative, create the two’s complement of the hexadecimal number
from the previous step.
Converting Signed Hexadecimal to Decimal To convert a signed hexadecimal integer to
decimal, do the following:
1. If the hexadecimal integer is negative, create its two’s complement; otherwise, retain the
integer as is.
2. Using the integer from the previous step, convert it to decimal. If the original value was negative, attach a minus sign to the beginning of the decimal integer.
You can tell whether a hexadecimal integer is positive or negative by inspecting its most significant (highest) digit. If the digit is ≥ 8, the number is negative; if the digit is ≤ 7, the number is positive. For example, hexadecimal 8A20 is negative and 7FD9 is positive.
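The sign test on the highest hexadecimal digit, followed by the conversion, can be sketched in Python. The function name signed_hex_to_decimal is an assumption for this illustration, and the integer's width is assumed to be four bits per digit written.

```python
def signed_hex_to_decimal(hex_digits: str) -> int:
    """Interpret a hexadecimal string as a two's-complement signed integer."""
    mask = (1 << (4 * len(hex_digits))) - 1
    value = int(hex_digits, 16)
    if int(hex_digits[0], 16) >= 8:
        # Highest digit is 8 or more: negative, so negate its two's complement.
        return -(((value ^ mask) + 1) & mask)
    return value


print(signed_hex_to_decimal("8A20"))  # prints -30176
print(signed_hex_to_decimal("7FD9"))  # prints 32729
```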
Maximum and Minimum Values
A signed integer of n bits uses only n − 1 bits to represent the number’s magnitude. Table 1-7
shows the minimum and maximum values for signed bytes, words, doublewords, and quadwords.
Table 1-7 Ranges and Sizes of Signed Integer Types.
Type                     Range                   Storage Size in Bits
Signed byte              −2^7 to +2^7 − 1        8
Signed word              −2^15 to +2^15 − 1      16
Signed doubleword        −2^31 to +2^31 − 1      32
Signed quadword          −2^63 to +2^63 − 1      64
Signed double quadword   −2^127 to +2^127 − 1    128
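The range formulas for signed types (and for the unsigned types in Table 1-4) can be evaluated directly. In this Python sketch the function name integer_range is an assumption made for the illustration.

```python
def integer_range(bits: int, signed: bool):
    """Return (minimum, maximum) for an integer type of the given size."""
    if signed:
        return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1
    return 0, 2 ** bits - 1


print(integer_range(8, signed=True))    # prints (-128, 127)
print(integer_range(16, signed=False))  # prints (0, 65535)
```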
1.3.7 Binary Subtraction
Subtracting a smaller unsigned binary number from a larger one is easy if you go about it in the
same way you handle decimal subtraction. Here’s an example:
  0 1 1 0 1    (decimal 13)
- 0 0 1 1 1    (decimal 7)
-----------
Subtracting the bits in position 0 is straightforward:
  0 1 1 0 1
- 0 0 1 1 1
-----------
          0
In the next position (0 − 1), we are forced to borrow a 1 from the next position to the left. Here’s
the result of subtracting 1 from 2:
  0 1 0 0 1
- 0 0 1 1 1
-----------
        1 0
In the next bit position, we again have to borrow a bit from the column just to the left and subtract 1 from 2:
  0 0 0 1 1
- 0 0 1 1 1
-----------
      1 1 0
Finally, the two high-order bits are zero minus zero:
  0 0 0 1 1
- 0 0 1 1 1
-----------
  0 0 1 1 0    (decimal 6)
A simpler way to approach binary subtraction is to reverse the sign of the value being subtracted,
and then add the two values. This method requires you to have an extra empty bit to hold the
number’s sign. Let’s try it with the same problem we just calculated: (01101 minus 00111).
First, we negate 00111 by inverting its bits (11000) and adding 1, producing 11001. Next, we
add the binary values and ignore the carry out of the highest bit:
  0 1 1 0 1    (+13)
+ 1 1 0 0 1    (−7)
-----------
  0 0 1 1 0    (+6)
The result, +6, is exactly what we expected.
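Subtraction by adding the negated value can be sketched in Python. The function name subtract_by_adding and the 5-bit default width (matching the example) are assumptions made for this illustration.

```python
def subtract_by_adding(a: int, b: int, bits: int = 5) -> int:
    """Compute a - b as a + (two's complement of b), ignoring the carry out."""
    mask = (1 << bits) - 1
    negated_b = ((b ^ mask) + 1) & mask   # invert the bits of b, then add 1
    return (a + negated_b) & mask         # masking discards the carry out


result = subtract_by_adding(0b01101, 0b00111)
print(format(result, "05b"))  # prints 00110
```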
1.3.8 Character Storage
If computers only store binary data, how do they represent characters? They use a character set,
which is a mapping of characters to integers. In earlier times, character sets used only 8 bits. Even
now, when running in character mode (such as MS-DOS), IBM-compatible microcomputers use
the ASCII (pronounced “askey”) character set. ASCII is an acronym for American Standard Code
for Information Interchange. In ASCII, a unique 7-bit integer is assigned to each character.
Because ASCII codes use only the lower 7 bits of every byte, the extra bit is used on various computers to create a proprietary character set. On IBM-compatible microcomputers, for example,
values 128 through 255 represent graphic symbols and Greek characters.
ANSI Character Set The American National Standards Institute (ANSI) defines an 8-bit
character set that represents up to 256 characters. The first 128 characters correspond to the
letters and symbols on a standard U.S. keyboard. The second 128 characters represent special characters such as letters in international alphabets, accents, currency symbols, and
fractions. Early versions of Microsoft Windows used the ANSI character set.
Unicode Standard Today, computers must be able to represent a wide variety of international
languages in computer software. As a result, the Unicode standard was created as a universal
way of defining characters and symbols. It defines numeric codes (called code points) for characters, symbols, and punctuation used in all major languages, as well as European alphabetic
scripts, Middle Eastern right-to-left scripts, and many scripts of Asia. Three transformation formats are used to transform code points into displayable characters:
• UTF-8 is used in HTML, and encodes the 128 ASCII characters with the same byte values as ASCII.
• UTF-16 is used in environments that balance efficient access to characters with economical
use of storage. Recent versions of Microsoft Windows, for example, use UTF-16 encoding.
Most characters are encoded in 16 bits; characters outside the Basic Multilingual Plane use a pair of 16-bit code units.
• UTF-32 is used in environments where space is no concern and fixed-width characters are
required. Each character is encoded in 32 bits.
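To get a feel for the three transformation formats, the Python sketch below reports how many bytes each one uses for a single character. The function name encoded_sizes is an assumption for this illustration; the little-endian codec names are used so that no byte-order mark is added to the count.

```python
def encoded_sizes(character: str) -> dict:
    """Return the number of bytes each encoding uses for one character."""
    encodings = ("utf-8", "utf-16-le", "utf-32-le")
    return {name: len(character.encode(name)) for name in encodings}


print(encoded_sizes("A"))  # prints {'utf-8': 1, 'utf-16-le': 2, 'utf-32-le': 4}
print("A".encode("utf-8") == "A".encode("ascii"))  # prints True
```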
ASCII Strings A sequence of one or more characters is called a string. More specifically, an ASCII
string is stored in memory as a succession of bytes containing ASCII codes. For example, the numeric
codes for the string “ABC123” are 41h, 42h, 43h, 31h, 32h, and 33h. A null-terminated string is a string
of characters followed by a single byte containing zero. The C and C++ languages use null-terminated
strings, and many Windows operating system functions require strings to be in this format.
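The byte values of an ASCII string, with an optional terminating zero byte, can be listed with a short Python sketch. The function name ascii_codes and the null_terminated flag are assumptions made for this illustration.

```python
def ascii_codes(text: str, null_terminated: bool = False) -> list:
    """Return the bytes of an ASCII string as hexadecimal codes."""
    data = text.encode("ascii")
    if null_terminated:
        data += b"\x00"  # a single byte containing zero marks the end
    return [f"{byte:02X}h" for byte in data]


print(ascii_codes("ABC123"))
# prints ['41h', '42h', '43h', '31h', '32h', '33h']
print(ascii_codes("ABC", null_terminated=True))
# prints ['41h', '42h', '43h', '00h']
```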
Using the ASCII Table A table on the inside back cover of this book lists ASCII codes used
when running in Windows Console mode. To find the hexadecimal ASCII code of a character, look
along the top row of the table and find the column containing the character you want to translate.
The most significant digit of the hexadecimal value is in the second row at the top of the table; the
least significant digit is in the second column from the left. For example, to find the ASCII code of
the letter a, find the column containing the a and look in the second row: The first hexadecimal digit
is 6. Next, look to the left along the row containing a and note that the second column contains the
digit 1. Therefore, the ASCII code of a is 61 hexadecimal. This is shown as follows in simplified
form:
        6
  1     a
ASCII Control Characters Character codes in the range 0 through 31 are called ASCII
control characters. If a program writes these codes to standard output (as in C++), the control characters will carry out predefined actions. Table 1-8 lists the most commonly used
characters in this range, and a complete list may be found in the inside front cover of this
book.
Table 1-8 ASCII Control Characters.
ASCII Code (Decimal)   Description
8                      Backspace (moves one column to the left)
9                      Horizontal tab (skips forward n columns)
10                     Line feed (moves to next output line)
12                     Form feed (moves to next printer page)
13                     Carriage return (moves to leftmost output column)
27                     Escape character
Terminology for Numeric Data Representation It is important to use precise terminology
when describing the way numbers and characters are represented in memory and on the display
screen. Decimal 65, for example, is stored in memory as the single binary byte 01000001.
A debugging program would probably display the byte as “41,” which is the number’s hexadecimal representation. If the byte were copied to video memory, the letter “A” would appear on the
screen because 01000001 is the ASCII code for the letter A. Because a number’s interpretation
can depend on the context in which it appears, we assign a specific name to each type of data
representation to clarify future discussions:
• A binary integer is an integer stored in memory in its raw format, ready to be used in a calculation. Binary integers are stored in multiples of 8 bits (such as 8, 16, 32, or 64).
• A digit string is a string of ASCII characters, such as “123” or “65.” This is simply a representation of the number and can be in any of the formats shown for the decimal number 65 in
Table 1-9:
Table 1-9 Types of Digit Strings.
Format                     Value
Binary digit string        “01000001”
Decimal digit string       “65”
Hexadecimal digit string   “41”
Octal digit string         “101”
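The distinction between a binary integer and its digit strings can be seen in a short Python sketch. The function name digit_strings is an assumption made for this illustration; it uses only Python's built-in format, str, and chr.

```python
def digit_strings(value: int) -> dict:
    """Return the digit-string formats of one binary integer."""
    return {
        "binary": format(value, "08b"),
        "decimal": str(value),
        "hexadecimal": format(value, "X"),
        "octal": format(value, "o"),
    }


print(digit_strings(65))
# prints {'binary': '01000001', 'decimal': '65', 'hexadecimal': '41', 'octal': '101'}
print(chr(65))  # prints A, the character whose ASCII code is 65
```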
1.3.9 Section Review
1. Explain the term least significant bit (LSB).
2. What is the decimal representation of each of the following unsigned binary integers?
a. 11111000
b. 11001010
c. 11110000
3. What is the sum of each pair of binary integers?
a. 00001111 + 00000010
b. 11010101 + 01101011
c. 00001111 + 00001111
4. How many bytes are contained in each of the following data types?
a. word
b. doubleword
c. quadword
d. double quadword
5. What is the minimum number of binary bits needed to represent each of the following
unsigned decimal integers?
a. 65
b. 409
c. 16385
6. What is the hexadecimal representation of each of the following binary numbers?
a. 0011 0101 1101 1010
b. 1100 1110 1010 0011
c. 1111 1110 1101 1011
7. What is the binary representation of the following hexadecimal numbers?
a. A4693FBC
b. B697C7A1
c. 2B3D9461
1.4 Boolean Expressions
Boolean algebra defines a set of operations on the values true and false. It was invented by George
Boole, a mid-nineteenth-century mathematician. When early digital computers were invented, it
was found that Boole’s algebra could be used to describe the design of digital circuits. At the same
time, boolean expressions are used in computer programs to express logical operations.
A boolean expression involves a boolean operator and one or more operands. Each boolean
expression implies a value of true or false. The set of operators includes the following:
• NOT: notated as ¬ or ~ or ’
• AND: notated as ∧ or •
• OR: notated as ∨ or +
The NOT operator is unary, and the other operators are binary. The operands of a boolean
expression can also be boolean expressions. The following are examples:
Expression   Description
¬X           NOT X
X ∧ Y        X AND Y
X ∨ Y        X OR Y
¬X ∨ Y       (NOT X) OR Y
¬(X ∧ Y)     NOT (X AND Y)
X ∧ ¬Y       X AND (NOT Y)
NOT The NOT operation reverses a boolean value. It can be written in mathematical notation
as ¬X, where X is a variable (or expression) holding a value of true (T) or false (F). The following truth table shows all the possible outcomes of NOT using a variable X. Inputs are on the left
side and outputs (shaded) are on the right side:
X   ¬X
F   T
T   F
A truth table can use 0 for false and 1 for true.
AND The Boolean AND operation requires two operands, and can be expressed using the notation
X ∧ Y. The following truth table shows all the possible outcomes (shaded) for the values of X and Y:
X   Y   X ∧ Y
F   F   F
F   T   F
T   F   F
T   T   T
The output is true only when both inputs are true. This corresponds to the logical AND used
in compound boolean expressions in C++ and Java.
The AND operation is often carried out at the bit level in assembly language. In the following
example, each bit in X is ANDed with its corresponding bit in Y:
X:       11111111
Y:       00011100
X ∧ Y:   00011100
As Figure 1-2 shows, each bit of the resulting value, 00011100, represents the result of ANDing
the corresponding bits in X and Y.
Figure 1–2 ANDing the bits of two binary integers.
X:        1   1   1   1   1   1   1   1
         AND AND AND AND AND AND AND AND
Y:        0   0   0   1   1   1   0   0
X ∧ Y:    0   0   0   1   1   1   0   0
OR The Boolean OR operation requires two operands, and is often expressed using the notation X ∨ Y. The following truth table shows all the possible outcomes (shaded) for the values of
X and Y:
X   Y   X ∨ Y
F   F   F
F   T   T
T   F   T
T   T   T
The output is false only when both inputs are false. This truth table corresponds to the
logical OR used in compound boolean expressions in C++ and Java.
The OR operation is often carried out at the bit level. In the following example, each bit in X
is ORed with its corresponding bit in Y, producing 11111100:
X:       11101100
Y:       00011100
X ∨ Y:   11111100
As shown in Figure 1-3, the bits are ORed individually, producing a corresponding bit in the
result.
Figure 1–3 ORing the bits in two binary integers.
X:        1   1   1   0   1   1   0   0
          OR  OR  OR  OR  OR  OR  OR  OR
Y:        0   0   0   1   1   1   0   0
X ∨ Y:    1   1   1   1   1   1   0   0
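The bit-level AND and OR examples can be reproduced with Python's bitwise operators & and |. This is only a sketch; the helper names and_bits and or_bits are assumptions made for this illustration.

```python
def and_bits(x: str, y: str) -> str:
    """AND each bit of x with the corresponding bit of y."""
    return format(int(x, 2) & int(y, 2), f"0{len(x)}b")


def or_bits(x: str, y: str) -> str:
    """OR each bit of x with the corresponding bit of y."""
    return format(int(x, 2) | int(y, 2), f"0{len(x)}b")


print(and_bits("11111111", "00011100"))  # prints 00011100
print(or_bits("11101100", "00011100"))   # prints 11111100
```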
Operator Precedence Operator precedence rules are used to indicate which operators execute first in expressions involving multiple operators. In a boolean expression involving more
than one operator, precedence is important. As shown in the following table, the NOT operator
has the highest precedence, followed by AND and OR. You can use parentheses to force the initial evaluation of an expression:
Expression    Order of Operations
¬X ∨ Y        NOT, then OR
¬(X ∨ Y)      OR, then NOT
X ∨ (Y ∧ Z)   AND, then OR
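Python's boolean operators not, and, and or happen to follow the same precedence order, so the rows of the table can be checked with a short sketch. The function names below are assumptions made for this illustration.

```python
def not_then_or(x: bool, y: bool) -> bool:
    return not x or y        # parsed as (not x) or y


def or_then_not(x: bool, y: bool) -> bool:
    return not (x or y)      # parentheses force the OR to be evaluated first


def and_then_or(x: bool, y: bool, z: bool) -> bool:
    return x or y and z      # parsed as x or (y and z)


print(not_then_or(True, True))         # prints True
print(or_then_not(True, True))         # prints False
print(and_then_or(True, False, False)) # prints True
```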
1.4.1 Truth Tables for Boolean Functions
A boolean function receives boolean inputs and produces a boolean output. A truth table can be
constructed for any boolean function, showing all possible inputs and outputs. The following are
truth tables representing boolean functions having two inputs named X and Y. The shaded column on the right is the function’s output:
Example 1: ¬X ∨ Y
X   ¬X   Y   ¬X ∨ Y
F   T    F   T
F   T    T   T
T   F    F   F
T   F    T   T
Example 2: X ∧ ¬Y
X   Y   ¬Y   X ∧ ¬Y
F   F   T    F
F   T   F    F
T   F   T    T
T   T   F    F
Example 3: (Y ∧ S) ∨ (X ∧ ¬S)
X   Y   S   Y ∧ S   ¬S   X ∧ ¬S   (Y ∧ S) ∨ (X ∧ ¬S)
F   F   F   F       T    F        F
F   T   F   F       T    F        F
T   F   F   F       T    T        T
T   T   F   F       T    T        T
F   F   T   F       F    F        F
F   T   T   T       F    F        T
T   F   T   F       F    F        F
T   T   T   T       F    F        T
The boolean function in Example 3 describes a multiplexer, a digital component that uses
a selector bit (S) to select one of two inputs (X or Y). If S = false, the function output (Z) is
the same as X. If S = true, the function output is the same as Y. Here is a block diagram of a
multiplexer:
[Block diagram: inputs X and Y and selector S feed a box labeled mux, which produces output Z.]
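The multiplexer function can be checked against its description by enumerating every input combination. In this Python sketch the function name mux is an assumption made for the illustration.

```python
from itertools import product


def mux(x: bool, y: bool, s: bool) -> bool:
    """Boolean function (Y AND S) OR (X AND NOT S)."""
    return (y and s) or (x and not s)


# Print the truth table: the output follows X when S is false, Y when S is true.
for s, y, x in product([False, True], repeat=3):
    print(x, y, s, mux(x, y, s))
```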
1.4.2 Section Review
1. Describe the following boolean expression: ¬X ∨ Y.
2. Describe the following boolean expression: (X ∧ Y).
3. What is the value of the boolean expression (T ∧ F) ∨ T ?
4. What is the value of the boolean expression ¬(F ∨ T) ?
5. What is the value of the boolean expression ¬F ∨ ¬T ?
1.5 Chapter Summary
This book focuses on programming x86 processors, using the MS-Windows platform. We cover
basic principles about computer architecture, machine language, and low-level programming.
You will learn enough assembly language to test your knowledge on today’s most widely used
microprocessor family.
Before reading this book, you should have completed a single college course or equivalent in
computer programming.
An assembler is a program that converts source-code programs from assembly language into
machine language. A companion program, called a linker, combines individual files created by
an assembler into a single executable program. A third program, called a debugger, provides a
way for a programmer to trace the execution of a program and examine the contents of memory.
You will create 32-bit and 64-bit programs for the most part, and 16-bit programs if you focus
on the last four chapters.
You will learn the following concepts from this book: basic computer architecture applied to
x86 (and Intel 64) processors; elementary boolean logic; how x86 processors manage memory;
how high-level language compilers translate statements from their language into assembly language and native machine code; how high-level languages implement arithmetic expressions,
loops, and logical structures at the machine level; and the data representation of signed and
unsigned integers, real numbers, and character data.
Assembly language has a one-to-one relationship with machine language, in which a single
assembly language instruction corresponds to one machine language instruction. Assembly language is not portable because it is tied to a specific processor family.
Programming languages are tools that you can use to create individual applications or parts of
applications. Some applications, such as device drivers and hardware interface routines, are
more suited to assembly …