Logical address, virtual address, linear address and physical address

In this article, I want to illustrate the relationship between the logical address, virtual address, linear address and physical address. These four concepts are very easy to confuse. Many people learn them from operating system books, but those books rarely explain them at a very low level. The Linux kernel and the Intel Developer’s Manual also use these terms, each in its own way, so it’s hard to find out exactly what the words mean. Here I will explain them in detail.

Assume a 32-bit x86 system, where these concepts can be clearly distinguished from one another.

Let’s start with the logical address. The Linux kernel doesn’t use this concept, but the Intel manual defines it as selector:offset. In other words, a logical address consists of a segment selector and an offset. cs:eip and ds:offset are good examples.
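To make the selector half of selector:offset more concrete, here is a minimal sketch of how a 16-bit segment selector breaks down into a descriptor table index, a table indicator bit and a requested privilege level. The function name decode_selector and the sample value 0x73 are only illustrative choices, not something taken from the kernel or the manual.

```python
def decode_selector(selector):
    """Split a 16-bit x86 segment selector into its three fields."""
    if not 0 <= selector <= 0xFFFF:
        raise ValueError("a segment selector is 16 bits wide")
    return {
        "index": selector >> 3,     # bits 15..3: index into the descriptor table
        "ti": (selector >> 2) & 1,  # bit 2: table indicator, 0 = GDT, 1 = LDT
        "rpl": selector & 0b11,     # bits 1..0: requested privilege level
    }


# Example value only: decode the selector 0x73.
fields = decode_selector(0x73)
print(fields)
```

The index field is what the CPU uses to pick a descriptor, which is the first step of the translation described next.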

Next, let’s look at the linear address. We need a concrete example to illustrate this concept. In AT&T assembly language, the instruction movl (%ecx), %eax moves a value stored in memory into the eax register. (%ecx) is called indirect addressing. It implies that the segment register is ds and that the offset is stored in the ecx register. So (%ecx) actually means the memory unit whose logical address is ds:ecx. When retrieving the value stored in memory, the CPU uses the selector in ds as an index to get a descriptor from the GDT (Global Descriptor Table), or from the LDT if the selector’s table indicator bit is set, and then extracts a base address from that descriptor. The offset part of the logical address (ecx in our example) is then added to the base address to generate a new address, and this new address is called the linear address. More precisely, the base address should be called the base linear address. So we can say that the logical address consists of two parts: a selector (used to get the base linear address) and an offset (added to the base linear address). This is the process of translating a logical address to a linear address, and it is handled by the segmentation mechanism.
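The translation from a logical address to a linear address can be sketched in a few lines. This is a simplified model, not how the hardware is implemented: TOY_GDT and its base/limit values are made up for illustration, only the GDT is modelled, and the limit check stands in for the fault a real CPU would raise.

```python
# Hypothetical descriptor table: index -> (base linear address, limit).
# The entries are invented for this example.
TOY_GDT = {
    1: (0x00000000, 0xFFFFFFFF),  # flat segment covering the whole 4 GiB
    2: (0x10000000, 0x0000FFFF),  # small segment with a nonzero base
}


def logical_to_linear(selector, offset, gdt=TOY_GDT):
    """Translate selector:offset into a 32-bit linear address."""
    index = selector >> 3          # drop the TI and RPL bits to get the index
    base, limit = gdt[index]       # fetch the descriptor, extract its base
    if offset > limit:
        raise ValueError("offset exceeds the segment limit")
    return (base + offset) & 0xFFFFFFFF  # base + offset, kept to 32 bits


# ds = 0x10 selects entry 2; ecx = 0x1234 is the offset.
print(hex(logical_to_linear(0x10, 0x1234)))
```

With the flat entry (selector 0x08 here), the base is zero, so the linear address is numerically equal to the offset.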

Finally, let’s look at the physical address. This is a very simple concept. You can imagine RAM as a long array, with the physical address as the array index. More precisely, each byte in RAM is uniquely identified by a physical address. The process that translates a linear address to a physical address is the paging mechanism. In essence, paging splits the linear address into sections and uses each section as an index to locate an entry in the page tables. I won’t go further into the paging mechanism here.
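As a rough illustration of "splitting the linear address", the sketch below assumes classic 32-bit paging with 4 KiB pages and two levels (no PAE): 10 bits of page directory index, 10 bits of page table index and 12 bits of offset. The dictionaries standing in for the page directory and page table, and the frame number in them, are hypothetical.

```python
def split_linear(linear):
    """Split a 32-bit linear address into (directory index, table index, offset)."""
    dir_index = (linear >> 22) & 0x3FF    # top 10 bits
    table_index = (linear >> 12) & 0x3FF  # middle 10 bits
    offset = linear & 0xFFF               # low 12 bits: offset inside the page
    return dir_index, table_index, offset


def linear_to_physical(linear, page_directory):
    """Walk a toy two-level page table and return the physical address."""
    dir_index, table_index, offset = split_linear(linear)
    page_table = page_directory[dir_index]  # first lookup: page directory entry
    frame = page_table[table_index]         # second lookup: page frame number
    return (frame << 12) | offset           # frame base plus offset in the page


# Made-up mapping: directory entry 0x300 -> table whose entry 0x101 -> frame 0x101.
TOY_PAGE_DIRECTORY = {0x300: {0x101: 0x00101}}

print([hex(part) for part in split_linear(0xC0101234)])
print(hex(linear_to_physical(0xC0101234, TOY_PAGE_DIRECTORY)))
```

Real page table entries also carry present, permission and other flag bits, which this model leaves out.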

Wait a minute! Where is the virtual address? This concept appears in many OS books and in Linux kernel materials. Did I forget it? No. The term virtual address is actually just an alias for linear address. In other words, linear address is the Intel term while virtual address is the kernel term. Many people are confused by these two names for the same thing.

I hope this article helps make these terms clear.