In the last post, we looked at what computer memory is, how it differs from storage, and why it is so significant to the functioning of any computer or digital device. In this post, we will explore how memory is actually used by computer programs. In so doing, we will also start to consider the concept of memory addressing (explored in more detail here) and why it is fundamental to the workings of any computer program. As in previous posts, we will use 8-bit architecture to illustrate certain points; while modern computers have a much more advanced 64-bit architecture with CPUs vastly superior to the humble 6502 processor, they broadly operate in much the same way.
The first thing to understand about any program is that it must be loaded into memory in order for it to do anything at all. Until the program code is placed into computer memory, it is merely a collection of bytes stored on disk, inert and powerless. If a program has been written in a high-level language like Java or Python, the programmer has little control over where in memory that program goes. This is handled for them by the compiler or interpreter, which also manages the memory addresses of the various variables in the program itself. If, however, the program is written in assembly code, it is both possible and indeed necessary to tell the computer precisely where in memory the program should go. While this offers a great deal of flexibility, it also requires the programmer to keep track of how much memory the program code needs, as otherwise they may find it does not ‘fit’ into the area of memory they specify. Many are the tales of hobbyist programmers in the 1980s who wrote ambitious games, only to find that the code exceeded the memory available to them at runtime.
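To make the idea of ‘loading’ concrete, here is a minimal sketch in Python. The memory is modelled as a flat array of 65,536 single-byte locations; the program bytes and the load address are invented purely for illustration.

```python
# A toy model of loading a program into an 8-bit machine's memory.
MEMORY_SIZE = 256 * 256          # 65,536 addressable locations
memory = bytearray(MEMORY_SIZE)  # every location starts at zero

# Five arbitrary bytes standing in for machine-code instructions.
PROGRAM = bytes([0xA9, 0x41, 0x8D, 0x00, 0x70])
LOAD_ADDRESS = 0x2000            # where we ask for the code to go (made up)

# Check the program actually 'fits' before copying it in.
if LOAD_ADDRESS + len(PROGRAM) > MEMORY_SIZE:
    raise MemoryError("program does not fit at the chosen address")

# 'Loading' is nothing more than copying the bytes into memory.
memory[LOAD_ADDRESS:LOAD_ADDRESS + len(PROGRAM)] = PROGRAM
```

Once the copy has happened, the bytes sitting at `0x2000` onwards are indistinguishable from any other data in memory; it is only the CPU treating them as instructions that makes them a ‘program’.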
What do we mean when we talk about specifying an area of memory, though? Memory itself is made up of a series of addresses, with each address being capable of storing a fixed amount of data. In 8-bit architecture, a single memory address typically refers to one byte (i.e. 8 bits); 64-bit architectures are also byte-addressable, but the CPU can transfer as many as eight bytes to or from memory in a single operation. A memory address itself is a unique identifier that provides the computer with enough information to know where in memory to look (more on memory addressing here). In many 8-bit systems like the BBC Micro, memory is divided into pages, and each page has a set number of locations within it. The memory address, then, is a label that provides both a page number and a specific location on that page. The particular architecture of the computer dictates how many memory addresses there are; an 8-bit BBC Micro, for example, has 256 pages of memory with 256 locations per page, giving a total of 65,536 (256*256) memory addresses. Memory is a shared resource, meaning that certain memory addresses may be reserved for particular operations, while others remain free for data, instructions or entire programs to be loaded into by an operating system, a compiler or even the programmer themselves. When we tell the computer to store something at a particular memory address, it will dutifully put whatever we give it into memory at that address. Care must be taken, however, to only pass data that fits the size of a single memory location. If we write more than a single byte to an address on an 8-bit machine, for example, the excess will – intentionally or otherwise – ‘spill over’ into the next consecutive memory address, which can cause unexpected results if we had plans to use that address for something else!
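The page-and-location scheme above is just arithmetic, and a short sketch makes it explicit. The two helper functions below are our own invention, not anything built into the BBC Micro, but they mirror how its 16-bit addresses decompose into a page byte and an offset byte.

```python
# Page/offset addressing on an 8-bit machine with 256 pages of 256 locations.

def to_address(page: int, offset: int) -> int:
    """Combine a page (0-255) and a location on that page (0-255)
    into a single 16-bit memory address."""
    assert 0 <= page <= 255 and 0 <= offset <= 255
    return page * 256 + offset

def to_page_offset(address: int) -> tuple[int, int]:
    """Split a 16-bit address back into its page number and location."""
    return address // 256, address % 256

print(to_address(255, 255))    # 65535 – the last of the 65,536 addresses
print(to_page_offset(0x2000))  # (32, 0) – page 32, first location on the page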
It isn’t actually necessary to load the entire program into memory in order to run it; larger, more complex programs may only need a portion of their code loaded at any one time, and at certain intervals throughout the program’s runtime some of that code may be cleared out of memory and replaced with other program code. Anyone who remembers those old games that came on a set of floppy disks and would periodically pause to ask the user to ‘insert disk 2 now’ will be familiar with this idea. Of course, with the huge amounts of storage – and memory – available to modern computers, memory management issues are not what they used to be. That said, any loading screen you encounter in a modern game is an example of code and assets in memory being swapped out for others held on disk.
What does it actually mean to say that a program is loaded into memory? Essentially what happens is the code – which is nothing more than a series of bytes on disk – gets copied into an area of computer memory, starting from a designated memory address. The first instruction of the program will sit at the first memory address of that area, whether that address was chosen by the compiler or by the programmer. With the program (or a sufficient amount of it) loaded into memory, the computer is ready to run it. The CPU will send out a request on something called the address bus to retrieve the instruction held at the first memory address; signals on the control bus indicate, among other things, whether memory is being read from or written to. The data representing that instruction is then passed back to the CPU on the data bus, and the CPU will decode and carry out that instruction, which in turn may require it to retrieve data from, or write data to, another memory address. An internal program counter keeps track of where the CPU is in the program: each time an instruction is fetched, the counter moves on so that the next request goes out to the address of the instruction that follows. The act of the CPU sending out a request on the address bus, getting data back on the data bus and then processing it is known as a ‘cycle’. The CPU will cycle as many times as it takes to reach the program’s final instruction, at which point the program is considered by the computer to have completed. Technically, the code of the program continues to reside in memory (assuming the computer is not switched off after the program terminates), but it will likely be overwritten by the next program the user decides to run.
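The fetch–decode–execute cycle described above can be sketched as a loop. Everything here is invented for illustration: the three opcodes are not real 6502 instructions, and a real CPU does this in hardware rather than in a Python `while` loop. Note that the program counter steps past each instruction’s operand as well as its opcode, so it may advance by more than one per cycle.

```python
# A toy fetch-decode-execute loop for an invented three-instruction machine.
LOAD, ADD, HALT = 0x01, 0x02, 0xFF   # made-up opcodes, not real 6502 ones

memory = bytearray(256)
# A tiny program loaded at address 0: load 5, add 3, halt.
memory[0:5] = bytes([LOAD, 5, ADD, 3, HALT])

pc = 0           # program counter: address of the next instruction
accumulator = 0  # a single working register

while True:
    opcode = memory[pc]              # fetch: read the byte at address `pc`
    if opcode == LOAD:               # decode and execute...
        accumulator = memory[pc + 1]
        pc += 2                      # step past the opcode and its operand
    elif opcode == ADD:
        accumulator += memory[pc + 1]
        pc += 2
    elif opcode == HALT:
        break                        # final instruction reached: program done

print(accumulator)  # 8
```

Each pass through the loop is one ‘cycle’ in the sense used above: an address goes out, a byte comes back, and the CPU acts on it.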
All of this may make it sound as if a computer program containing 99 instructions simply starts from instruction one, proceeds in a linear fashion until it reaches instruction ninety-nine and then stops. Nothing could be further from the truth! Although very simple programs may only require the CPU to run through some instructions in series, most programs will have the CPU jumping around from one instruction to another, looping back, repeating certain instructions, skipping others based on logical conditions, carrying out instructions nested inside other instructions, and only after many, many cycles coming to a stop. This is done using something called ‘branching’, in which the program counter (which the CPU uses to know which memory address to go to next) is essentially taken over by the program itself and set to different values. This has the effect of making the CPU visit memory addresses entirely out of sequence, thereby allowing certain instructions to be repeated and others to be skipped. Badly written programs may even put the CPU into an infinite loop, forever repeating the same set of instructions over and over again. It is this capacity for a program to take charge of which instructions are executed in which sequence that ultimately makes a computer superior to a simple calculator.
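Extending the earlier toy machine with one made-up conditional-jump opcode shows branching in action: instead of the program counter ticking forwards, the instruction itself overwrites it, sending the CPU back to an earlier address until a condition is met.

```python
# Branching sketch: a made-up conditional jump overwrites the program
# counter, so the CPU fetches its next instruction out of sequence.
INC, JUMP_IF_LT, HALT = 0x01, 0x02, 0xFF  # invented opcodes

memory = bytearray(256)
# Program: increment a counter, then jump back to address 0 while it is
# still below 3; once it reaches 3, fall through to HALT.
memory[0:5] = bytes([INC, JUMP_IF_LT, 3, 0, HALT])

pc = 0
counter = 0

while True:
    opcode = memory[pc]
    if opcode == INC:
        counter += 1
        pc += 1
    elif opcode == JUMP_IF_LT:
        limit, target = memory[pc + 1], memory[pc + 2]
        # The program, not the hardware, decides where execution goes next.
        pc = target if counter < limit else pc + 3
    elif opcode == HALT:
        break

print(counter)  # 3
```

Change the condition so it can never fail and you have the infinite loop described above: the program counter keeps being thrown back to the same address, forever.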
Precisely how a program interacts with computer memory to control which of its instructions are carried out, and in what sequence, will be the subject of a future post…