
From the course: Learning Assembly Language

Introduction to assembly language


- [Instructor] Computers of all shapes and sizes achieve their computational tasks by manipulating binary data, zeros and ones, using a set of instructions built into the hardware of their chipsets. These instructions themselves are represented as zeros and ones in what's known as machine code. For us humans, it's a bit mind-numbing looking at just zeros and ones, so we group them into sets of four bits and write each group as a single hexadecimal digit, represented by the numbers zero to nine and the letters A to F, 16 possible values in total.

Chip manufacturers will have their own set of instructions for their chips. The most popular instruction set is that used with the Intel x86 chip. Two other commonly seen chipsets are the ARM chip, which is the most popular mobile phone chip, and the Atmel chip, which appears in many small IoT-type devices. For the purposes of this course, we'll be focusing on the x86 chipset.

Instructions are executed in what's known as the central processing unit of the chip using a set of registers, special areas of the chip which are able to manipulate bits. We'll look at what these registers are shortly, but as an example, the instruction to add 28 to a register called ESP will be coded in machine language as 83 C4 1C. Even this is a bit tricky for us, although many low-level programmers can write machine code directly. It's more common, however, to use mnemonics to represent the various parts of the instruction, and this is what's known as assembly language. So we'd code this instruction in assembly language as ADD ESP, 1Ch, where 1Ch is 28 written in hexadecimal. And then we'd use a program called an assembler to convert this mnemonic form back to machine code, ready for the computer to execute it.

The general model that we can keep in the back of our minds as we program in assembler starts with the processing unit. This works most effectively with its set of high-performance registers. However, we often need more than just what can be stored in registers.
So we need to have a memory area which can be used to store our data and our program instructions. We want to interact with the program, so we need input and output devices, usually a keyboard and a screen. Of course, we also have other devices such as trackpads and the mouse, and in embedded products we may display to an LED. However, these are just variations on the general theme.

While the internal memory of a computer gives us what we need to operate, we need more. Internal memory doesn't need to be as fast as registers, but it needs to be fast enough. This means it's volatile, so its contents disappear when power is removed. Consequently, we also need access to backing storage of some sort, typically a solid-state drive or a magnetic hard disk.

The size of the instruction does matter. For x86, much of the assembler code we see is written in 32-bit code. However, with the advent of 64-bit processors, more extensive instructions have been included in the 64-bit versions of the assembler tools. We'll cover both 32-bit and 64-bit instructions.

Microsoft includes both 32-bit and 64-bit assemblers, called ml.exe and ml64.exe, in their software development kits. These tools can be accessed via Visual Studio by including them in a C++ project, or run directly from the command line. The MASM32 SDK is a project developed around the Microsoft MASM assembler, intended as a simpler and easier introduction to the Microsoft MASM 32-bit programming environment, and it comes with a simple IDE. GoAsm is a very easy-to-use set of tools, providing assembly and linking of both 32-bit and 64-bit assembler programs. We'll be focusing on GoAsm as our assembler of choice in this course.

We've described a basic architecture with which we can process instructions that work on data. Let's look at the general classes of instructions. The first category of instructions is the load and store instructions, which move data between registers and memory locations.
These include basic move and store instructions, as well as some more esoteric instructions such as sign extensions and data exchange instructions.

The second category is the set of instructions used to add, subtract, multiply, and divide, which provide the computational capability of the chip. These include signed and unsigned operations, packed decimal format operations, floating-point operations, and increment and decrement operations.

The third major category is bitwise or logical operations. These are special forms of manipulation at the bit level, such as shifting, ANDing, ORing, and so on. They are used for a variety of purposes, including extracting parts of a memory location, a process known as masking.

The next category is that involving instructions which change the flow of execution of the program. The most basic form of coding is to perform an instruction, move to the next sequential instruction location, perform that instruction, and so on. However, we also need to make decisions on whether to move to the next instruction or to take an alternative path. We do this using a set of instructions which change the control flow. This includes conditional jumps (the assembly equivalent of if statements), looping instructions, and subprogram calls.

There are also a number of advanced capabilities of the chip, including AES hardware capability, packed decimal 128-bit instructions, and 256-bit and larger vector instructions. We'll cover some of this advanced material in this course, but after the course, you might want to check the full range of advanced instructions yourself. With that as an introduction, let's get into learning how to write assembly code.
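
As a rough illustration of the shifting and masking described above, here is a minimal sketch that packs and unpacks the three bytes of the earlier 83 C4 1C example. It is written in Python rather than assembler simply so it can be run anywhere. The function names and the register table are made up for this illustration, and the sketch assumes 32-bit mode and handles only one form of the instruction: ADD on a 32-bit register with an immediate value that fits in a single signed byte.

```python
# Register numbers as they appear in the low three bits of an x86 ModRM byte
# (32-bit general-purpose registers only; table name is hypothetical).
REG32 = {"EAX": 0, "ECX": 1, "EDX": 2, "EBX": 3,
         "ESP": 4, "EBP": 5, "ESI": 6, "EDI": 7}
REG32_NAMES = {number: name for name, number in REG32.items()}


def encode_add_reg_imm8(register: str, value: int) -> bytes:
    """Encode ADD <register>, <value> in the form: opcode 83, ModRM, imm8."""
    if not -128 <= value <= 127:
        raise ValueError("this sketch only handles one-byte signed immediates")
    mod = 0b11    # top two bits: the operand is a register, not memory
    ext = 0b000   # middle three bits: selects ADD within the 0x83 opcode group
    rm = REG32[register.upper()]  # low three bits: which register
    modrm = (mod << 6) | (ext << 3) | rm  # shift each field into place, then OR
    return bytes([0x83, modrm, value & 0xFF])


def decode_add_reg_imm8(code: bytes) -> str:
    """Reverse of the encoder: mask each field back out of the ModRM byte."""
    opcode, modrm, imm = code
    mod = (modrm >> 6) & 0b11
    ext = (modrm >> 3) & 0b111
    rm = modrm & 0b111
    if opcode != 0x83 or mod != 0b11 or ext != 0b000:
        raise ValueError("not an ADD register, imm8 instruction")
    if imm >= 0x80:   # the immediate byte is signed, so undo the wrap-around
        imm -= 0x100
    return f"ADD {REG32_NAMES[rm]}, {imm}"


machine_code = encode_add_reg_imm8("ESP", 28)
print(machine_code.hex(" ").upper())      # 83 C4 1C
print(decode_add_reg_imm8(machine_code))  # ADD ESP, 28
```

The middle byte, C4, is where the masking happens: in binary it is 11 000 100, which reads as "register operand", "ADD", and register number 4 (ESP). A real assembler such as GoAsm does this kind of packing for every instruction form, not just this one.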
