Modern X86 Assembly Language Programming: Covers X86 64-bit, AVX, AVX2, and AVX-512

by Daniel Kusswurm

Paperback (3rd ed.)

$69.99 

Overview

This book is an instructional text that will teach you how to code x86-64 assembly language functions. It also explains how you can exploit the SIMD capabilities of an x86-64 processor using x86-64 assembly language and the AVX, AVX2, and AVX-512 instruction sets.

This updated edition’s content and organization are designed to help you quickly understand x86-64 assembly language programming and the unique computational capabilities of x86 processors. The source code is structured to accelerate learning and comprehension of essential x86-64 assembly language programming constructs and data structures. Modern X86 Assembly Language Programming, Third Edition includes source code for both Windows and Linux. The source code elucidates current x86-64 assembly language programming practices, run-time calling conventions, and the latest generation of software development tools.

What You Will Learn

• Understand important details of the x86-64 processor platform, including its core architecture, data types, registers, memory addressing modes, and the basic instruction set
• Use the x86-64 instruction set to create assembly language functions that are callable from C++
• Create assembly language code for both Windows and Linux using modern software development tools including MASM (Windows) and NASM (Linux)
• Employ x86-64 assembly language to efficiently manipulate common data types and programming constructs including integers, text strings, arrays, matrices, and user-defined structures
• Explore indispensable elements of x86 SIMD architectures, register sets, and data types
• Master x86 SIMD arithmetic and data operations using both integer and floating-point operands
• Harness the AVX, AVX2, and AVX-512 instruction sets to accelerate the performance of computationally intense calculations in machine learning, image processing, signal processing, computer graphics, statistics, and matrix arithmetic applications
• Apply leading-edge coding strategies to exploit the AVX, AVX2, and AVX-512 instruction sets for maximum performance

Who This Book Is For
Software developers who are creating programs for x86 platforms and want to learn how to code performance-enhanced algorithms using the core x86-64 instruction set; developers who need to learn how to write SIMD functions or accelerate the performance of existing code using the AVX, AVX2, and AVX-512 instruction sets; and computer science/engineering students or hobbyists who want to learn or better understand x86-64 assembly language programming and the AVX, AVX2, and AVX-512 instruction sets.


Product Details

ISBN-13: 9781484296028
Publisher: Apress
Publication date: 09/09/2023
Edition description: 3rd ed.
Pages: 680
Product dimensions: 7.01(w) x 10.00(h) x (d)

About the Author

Daniel Kusswurm has 40+ years of professional experience as a software developer, computer scientist, and author. During his career, he has developed innovative software for medical devices, scientific instruments, and image processing applications. On many of these projects, he successfully employed x86 assembly language and the AVX, AVX2, and AVX-512 instruction sets to significantly improve the performance of computationally intense algorithms and solve unique programming challenges. His educational background includes a BS in electrical engineering technology from Northern Illinois University along with an MS and PhD in computer science from DePaul University.

Daniel is also the author of multiple computer programming books, including Modern Arm Assembly Language Programming (ISBN: 9781484262665) and Modern Parallel Programming with C++ and Assembly Language (ISBN: 9781484279175), both published by Apress.

Table of Contents

Chapter 1 – X86-64 Core Architecture
Chapter Goal: Explains the core architecture of an x86-64 processor. Topics discussed include fundamental data types, registers, status flags, memory addressing modes, and other important architectural subjects. Understanding of this material is necessary for the reader to successfully comprehend the book’s subsequent chapters.
Historical overview
Data types
Fundamental data types
Numerical data types
SIMD data types
Miscellaneous data types
Strings
Bit fields and bit strings
X86-64 processor internal architecture
Overview
General-purpose registers
Instruction pointer
RFLAGS
Floating-point and SIMD registers
MXCSR Register
Instruction operands
Memory addressing
Condition codes
Differences between x86-32 and x86-64

Chapter 2 – X86-64 Core Programming (Part 1)
Chapter Goal: Introduces the fundamentals of x86-64 assembly language programming. The programming examples illustrate essential x86-64 assembly language programming concepts including integer arithmetic, bitwise logical operations, and shift instructions. This chapter also explains basic assembler usage and x86-64 assembly language syntax.
Assembler basics
Instruction syntax
Assembler directives
MASM vs. NASM
Source code overview
File and function naming conventions
Integer arithmetic
Integer (32-bit) addition and subtraction
Bitwise logical operations
Shift operations
Integer (64-bit) addition and subtraction
Integer multiplication and division

Chapter 3 – X86-64 Core Programming (Part 2)
Chapter Goal: Explores additional core x86-64 assembly language programming concepts. Topics discussed include advanced integer arithmetic, memory addressing modes, and condition codes. This chapter also covers important x86-64 assembly language programming concepts including proper stack use and for-loops.
Simple stack arguments
Mixed-type integer arithmetic
Memory addressing
Condition codes
Assembly language for-loops

Chapter 4 – X86-64 Core Programming (Part 3)
Chapter 4 explains how to exercise core x86-64 assembly language programming data constructs including arrays and structures. It also describes how to use common x86-64 string processing instructions.
Arrays
1D integer array arithmetic calculations
1D integer array arithmetic calculations using multiple arrays
2D integer arrays
Strings
Overview of x86 string instructions
Counting characters
String/array compare
String/array copy
String/array reversal
Assembly language structures

Chapter 5 – Scalar Floating-Point
Chapter 5 teaches the reader how to perform scalar floating-point arithmetic and other operations using assembly language. It also outlines the calling convention requirements for scalar floating-point arguments and return values.
Floating-point programming concepts
Single-precision floating-point arithmetic
Temperature conversions
Cone volume/surface area calculation
Double-precision floating-point arithmetic
Sphere volume/surface area calculation
Floating-point compares and conversions
Floating-point compares using VUCOMIS[S|D]
Floating-point compares using VCMPS[S|D]
Floating-point conversions
Floating-point arrays
Array mean/standard deviation calculation

Chapter 6 – Assembly Language Calling Conventions
Chapter 6 formally defines the run-time calling conventions for x86-64 assembly language functions. The first section explains the requirements for Windows and Visual C++ while the second section covers Linux and GNU C++.
Calling convention requirements for Windows and Visual C++
Stack frames
Using non-volatile general-purpose registers
Using non-volatile SIMD registers
Calling external functions
Calling convention requirements for Linux and GNU C++
Stack arguments
Using non-volatile general-purpose registers
Calling external functions

Chapter 7 – Advanced Vector Extensions
Chapter 7 introduces Advanced Vector Extensions (AVX). It begins with a discussion of AVX architecture and related topics. Chapter 7 also explains elementary SIMD programming concepts. Understanding of this material is necessary for the reader to comprehend the AVX, AVX2, and AVX-512 programming examples in subsequent chapters.
X86-AVX architecture overview
AVX
AVX2
AVX-512
Merge masking and zero masking
Embedded broadcasts
Instruction level rounding
SIMD programming concepts
Basic arithmetic
Wraparound vs. saturated arithmetic
Packed floating-point
Packed integer
Programming differences between x86-SSE and x86-AVX

Chapter 8 – AVX Programming – Packed Integers
Chapter 8 spotlights packed integer arithmetic and other operations using AVX. It also describes how to code packed integer calculating functions using arrays and the AVX instruction set.
Integer arithmetic
Addition and subtraction
Multiplication
Bitwise logical operations
Arithmetic and logical shifts
Integer array algorithms
Pixel minimum and maximum
Pixel mean

Chapter 9 – AVX Programming – Packed Floating Point
Chapter 9 demonstrates packed floating-point arithmetic and other operations using AVX. This chapter also explains how to use AVX instructions to perform calculations with floating-point arrays and matrices.
Floating-point arithmetic
Basic arithmetic operations
Compares
Conversions
Floating-point arrays
Array mean and standard deviation
Array square roots and compares
Floating-point matrices
Matrix column means

Chapter 10 – AVX2 Programming – Packed Integers
Chapter 10 describes AVX2 integer programming using x86-64 assembly language. This chapter also elucidates the coding of common image processing algorithms using the AVX2 instruction set.
Integer arithmetic
Basic operations
Size promotions
Image processing
Pixel clipping
RGB to grayscale
Pixel conversions
Image histogram

Chapter 11 – AVX2 Programming – Packed Floating Point (Part 1)
Chapter 11 teaches the reader how to enhance the performance of universal floating-point calculations using x86-64 assembly language and the AVX2 instruction set. The reader will also learn how to accelerate these types of calculations using fused-multiply-add (FMA) instructions.
Floating-Point Arrays
Least squares with FMA
Floating-Point Matrices
Matrix multiplication F32
Matrix multiplication F64
Matrix (4x4) multiplication F32
Matrix (4x4) multiplication F64
Matrix (4x4) vector multiplication F32
Matrix (4x4) vector multiplication F64
Covariance matrix F64

Chapter 12 – AVX2 Programming – Packed Floating Point (Part 2)
Chapter 12 is a continuation of the previous chapter. It explicates the coding of advanced algorithms including matrix inversion and convolutions using AVX2 and FMA instructions.
Advanced Matrix Operations
Matrix inverse F32
Matrix inverse F64
Signal Processing
1D convolution F32 variable-size kernel
1D convolution F64 variable-size kernel
1D convolution F32 fixed-size kernel
1D convolution F64 fixed-size kernel

Chapter 13 – AVX-512 Programming – Packed Integers
Chapter 13 highlights packed integer arithmetic and other operations using x86-64 assembly language and AVX-512. It also discusses how to code frequently used image processing algorithms using the AVX-512 instruction set.
Integer Arithmetic
Addition and subtraction
Masked addition and subtraction
Image Processing
Pixel clipping
Image statistics
Image histogram

Chapter 14 – AVX-512 Programming – Packed Floating Point (Part 1)
Chapter 14 explains basic operations using packed floating-point operands and the AVX-512 instruction set. It also teaches the reader how to code common floating-point algorithms using x86-64 assembly language and AVX-512.
Floating-point arithmetic
Floating-point arithmetic
Floating-point compares
Floating-point arithmetic and mask registers
Floating-point matrices
Covariance matrix
Matrix multiplication F32
Matrix multiplication F64
Matrix (4x4) vector multiplication F32
Matrix (4x4) vector multiplication F64

Chapter 15 – AVX-512 Programming – Packed Floating Point (Part 2)
Chapter 15 is a continuation of the previous chapter. It illustrates the coding of advanced algorithms using AVX-512 and FMA instructions.
Signal Processing
1D convolution F32 variable-size kernel
1D convolution F64 variable-size kernel
1D convolution F32 fixed-size kernel
1D convolution F64 fixed-size kernel

Chapter 16 – Advanced Instructions and Optimization Guidelines
Chapter 16 demonstrates the use of advanced x86-64 assembly language instructions. It also discusses guidelines that the reader can exploit to improve the performance of their assembly language code.
Advanced instructions
CPUID instruction – processor information
CPUID instruction – AVX, AVX2, FMA, and AVX-512 detection
Integer non-temporal memory loads and stores
Floating-point non-temporal memory stores
SIMD text processing
Processor microarchitecture overview
X86-64 assembly language optimization guidelines

Appendix A – Source Code and Development Tools
Appendix A describes how to download, install, and execute the source code. It also includes some brief usage notes about the software development tools used to create the source code examples.
Source code
Download instructions
Setup and configuration
Executing a source code example
Software development tools for Windows
Microsoft Visual Studio
MASM
Software development tools for Linux
GNU make
GNU C++ compiler
NASM
Benchmarking notes

Appendix B – References and Additional Resources
Appendix B contains a list of references that were consulted during the writing of this book. It also lists supplemental resources that the reader can consult for additional x86-64 assembly language programming information.
X86-64 assembly language programming references
Algorithm references
C++ references
X86 processor software utilities and libraries
Additional resources