Why doesn't 64-bit mode (long mode) use segment registers?

Question

I'm a beginner-level student :) I'm studying Intel architecture, in particular memory management topics such as segmentation and paging. I'm reading Intel's manuals, and they make the architecture pretty easy to understand.

However, I'm still curious about something fundamental. Why, in 64-bit long mode, are all segment registers going to bit 0? Why doesn't the system use segment registers any longer?

Is it because the system's 64-bit registers (such as the GP registers) are wide enough to hold a whole logical address at once? Does protection still work properly in 64-bit mode?

I tried to find information on 64-bit addressing but couldn't find any on Google. Perhaps my searching skills are terrible, or I may need some specific background knowledge to search effectively.

Hence I'd like to know why the 16-bit segment registers are not used in 64-bit mode, and how protection can still work properly in 64-bit mode.

Thank you!

Segment registers were an implementation detail of 16-bit real mode. That stopped being relevant 20 years ago. 32-bit and 64-bit mode use a flat unsegmented virtual memory address space. — Hans Passant, Jan 16 '14 at 15:48
@HansPassant: If segment registers had grown up to be 32 bits along with everything else, they could be very relevant and useful in an object-oriented framework [having every object start at offset zero of some segment would allow a framework to access many gigs of memory using offset registers that are half the size of those in x64]. The real reason they're not useful is that segment identifiers remained 16 bits while everything else got bigger. — supercat, Jan 16 '14 at 16:14
possible duplicate of [How to interpret segment register accesses on x86-64?](http://stackoverflow.com/questions/7844963/how-to-interpret-segment-register-accesses-on-x86-64) — Ciro Santilli OurBigBook.com, Sep 20 '15 at 15:36
`going to bit 0`. I'm not sure what you mean by this, but they neither are necessarily zero, nor do _all_ the segment registers have zero base. `FS` and `GS` are still used with complete 64-bit base and are quite useful for accessing thread-local storage. — Ruslan, Oct 07 '15 at 16:46

score 32 · Accepted Answer · edited May 04 '16 at 11:51

In a manner of speaking, when you perform array ("indexed") type addressing with general registers, you are doing essentially the same thing as the segment registers. In the bad old days of 8-bit and 16-bit programming, many applications required much more data (and occasionally more code) than a 16-bit address could reach.

So many CPUs solved this by having a larger addressable memory space than the 16-bit addresses could reach, and made those regions of memory accessible by means of "segment registers" or similar. A program would set a "segment register" to an address above the (65536-byte) 16-bit address space. Then, when certain instructions were executed, the CPU would add the instruction-specified address to the appropriate (or specified) "segment register" to read data (or code) beyond the range of 16-bit addresses or 16-bit offsets.
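As one concrete instance of the scheme described above, here is a minimal sketch of how the 8086 in real mode combines a segment register with a 16-bit offset. The function name `real_mode_physical` is made up for illustration. Note that the 8086 does not add the raw segment value: it shifts the segment left by 4 bits (multiplies by 16) first, and the result wraps at 20 bits because the chip has only 20 address lines.

```python
def real_mode_physical(segment: int, offset: int) -> int:
    """Sketch of 8086 real-mode address formation (hypothetical helper).

    physical = (segment * 16 + offset), truncated to 20 bits.
    """
    if not (0 <= segment <= 0xFFFF and 0 <= offset <= 0xFFFF):
        raise ValueError("segment and offset must each fit in 16 bits")
    # Shift the segment left by 4 bits, add the offset, keep 20 address bits.
    return ((segment << 4) + offset) & 0xFFFFF


if __name__ == "__main__":
    # A 16-bit offset alone reaches 64 KiB; with a segment it reaches 1 MiB.
    print(hex(real_mode_physical(0x1234, 0x5678)))  # 0x179b8
    print(hex(real_mode_physical(0xB800, 0x0000)))  # 0xb8000
```

Many different segment:offset pairs map to the same physical address in this scheme, which is one reason it is usually described as a window onto memory rather than as a wider pointer.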

However, the situation today is opposite!

How so? Today, a 64-bit CPU can address more than (not less than) all the memory a machine can actually hold. Most 64-bit CPUs today can address something like 40 to 48 bits of physical memory. True, there is nothing to stop CPU designers from supporting a full 64-bit memory space, but they know nobody (but the NSA) can afford that much RAM, and besides, hanging that much RAM on the CPU bus would load it down with capacitance and slow down ALL memory accesses outside the CPU chip.

Therefore, the current generation of mainstream CPUs can address 40 to 48 bits of memory space, which is more than 99.999% of the market would ever imagine reaching. Note that 32 bits covers 4 gigabytes (which some people do exceed today by a factor of 2, 4, 8, or 16), but even 40 bits can address 256 * 4GB == 1024GB == 1TB. While 64GB of RAM is reasonable today, and perhaps even 256GB in extreme cases, 1024GB just isn't necessary except for perhaps 0.001% of applications, and is unaffordable to boot.

And if you are in that 0.001% category, just buy one of the CPUs that address 48 bits of physical memory, and you're talking 256TB... which is currently impractical because it would load down the memory bus with vastly too much capacitance (maybe even to the point where the memory bus would stop working completely).
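The sizes quoted above can be checked with a few lines of arithmetic. This is only a sketch: the helper names `addressable_bytes` and `human` are invented here, and the output uses binary units (GiB, TiB), which is what the "GB" and "TB" figures in this answer mean.

```python
UNITS = ["B", "KiB", "MiB", "GiB", "TiB", "PiB", "EiB"]


def addressable_bytes(address_bits: int) -> int:
    """Number of distinct byte addresses an address of this width can form."""
    return 1 << address_bits


def human(n: int) -> str:
    """Format an exact power-of-two byte count using binary units."""
    unit = 0
    while n >= 1024 and n % 1024 == 0 and unit < len(UNITS) - 1:
        n //= 1024
        unit += 1
    return f"{n} {UNITS[unit]}"


if __name__ == "__main__":
    for bits in (16, 32, 40, 48, 64):
        print(f"{bits:2d} address bits -> {human(addressable_bytes(bits))}")
```

The jump from 40 to 48 bits multiplies the reachable space by 256, which is where the 1TB and 256TB figures come from.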

The point is this. When your normal addressing modes with normal 64-bit registers can already address vastly more memory than your computer can contain, the conventional reason to add segment registers vanishes.

This doesn't mean people could not find useful purposes for segment registers in 64-bit CPUs. They could. Several possibilities are evident. However, with 64-bit general registers and a 64-bit address space, there is nothing segment registers can do that general registers cannot. And general-purpose registers have a great many purposes, which segment registers do not. Therefore, if anyone was planning to add more registers to a modern 64-bit CPU, they would add general-purpose registers (which can do "anything") rather than very limited-purpose "segment registers".

And indeed they have. As you may have noticed, AMD and Intel keep adding more [sorta] general-purpose registers to the SIMD register-file, and AMD doubled the number of [truly] general purpose registers when they designed their 64-bit x86_64 CPUs (which Intel copied).

The 8086 segment registers were a good hardware design, actually, save only for the fact that there weren't enough of them and there were a few irksome omissions in the instruction set [e.g. `mov segReg,immed`]. Language support, however, was lacking. If allocations were rounded to 16-byte multiples, then pointers to allocated objects should have only taken two bytes rather than four, but common languages had no concept of paragraph-aligned pointers. Many object-oriented frameworks, however, could work very nicely with something similar to real-mode segmentation (but with 32-bit regs). — supercat, Jan 23 '14 at 20:50
You've missed that `FS` and `GS` are still useful and are in fact actively used to access TLS both on Windows and Linux. — Ruslan, Oct 07 '15 at 16:47
I think you completely missed how segment registers are used in 32-bit protected mode. They're not even usable to access more memory; they're used as selectors into the global descriptor table to set an offset and some memory protection. This feature has been almost completely disabled in long mode. I think that was the actual question (that's what brought me here). You also missed the Distributed Shared Memory model, which allows access to a huge amount of memory physically distributed on a network. — Celelibi, Nov 29 '15 at 04:55
@Celelibi: I think you *could* make an OS that used a non-flat memory space in 32bit protected mode, with multiple 4GiB segments. Nobody did/does, and the CPUs are optimized for that case. (non-zero segment offsets slow down all memory accesses, IIRC). — Peter Cordes, May 04 '16 at 17:59
To expand on @Celelibi's point about distributed shared memory, even a multi-socket single system has memory attached to the separate memory controllers in every socket. Multi-channel memory controllers in each socket also mean less bus load from lots of memory sticks. (Desktop CPUs have dual-channel controllers, but server CPUs can have more.) 64bit *virtual* address space also allows memory-mapping gigantic files. It's only physical address space that reflects how much actual RAM could potentially be hooked up to in a single cache-coherent SMP system. — Peter Cordes, May 04 '16 at 18:04
@PeterCordes Protected-mode segments could be used to add poor man's memory protection. Those GDT segments can have an offset (base address) that is either a multiple of 1 byte or of 4 Kbytes. I don't know what carried me to talk about distributed shared memory. There are NUMA machines, which you kinda mention but which are not really DSM. There's also real DSM involving several nodes and memory mapping through the network (probably using RDMA). But it's not quite related to segments. — Celelibi, May 06 '16 at 03:54
Correction to my earlier comment: with a non-zero segment base in 32-bit mode, segbase + offset gives you a *32-bit* linear address (virtual if paging is enabled; seg:off -> linear happens before virt->phys). Segmentation can't expand the amount of address space one process can see at once, even if paging is disabled. Only PAE or compat-mode paging (32-bit virtual -> wider physical) can let different processes with different page tables use more than 4GiB of physical RAM at the same time, with each process limited to 4GiB of linear address space. — Peter Cordes, Jun 09 '20 at 08:33

score 5 · Answer 2 · edited Apr 16 '14 at 15:22

Most answers to questions on the irrelevance of segment registers in a 32/64-bit world center on memory addressing. We all agree that the primary purpose of segment registers was to get around the address space limitation in a 16-bit DOS world. However, from a security capability perspective, segment registers provide 4 rings of address space isolation, which is not available if we use 64-bit long mode, say for a 64-bit OS. This is not a problem for current popular OSes such as Windows and Linux, which use only ring 0 and ring 3 for two levels of isolation. Rings 1 and 2 are sometimes part of the kernel and sometimes part of user space, depending on how the code is written. With the advent of hardware virtualization (as opposed to OS virtualization), hypervisors did not quite fit, from an isolation perspective, in either ring 0 or rings 1/2/3. Intel and AMD added additional instructions (e.g., Intel VMX) for root and non-root operation of VMs.

So what is the point being made? If one is designing a new secure OS with 4 rings of isolation, then we run into problems if segmentation is disabled. As an example, we could use one ring each for hardware mux code, hypervisor code (containers/VMs), the OS kernel, and user space. So we can make a case for leveraging the additional security afforded by segmentation, based on the requirements stated above. However, Intel/AMD still allow the FS and GS segment registers to have a non-zero base (i.e., segmentation is not entirely disabled). To the best of my knowledge, no OS exploits this ray of hope to write a more secure OS/hypervisor for hardware virtualization.

x86-64 still uses the same 2-bit (4 level) privilege level mechanism as 32-bit protected mode. The low 2 bits of the CS selector are the CPL (current privilege level), and the privilege-level checking mechanisms based on what GDT entries allow is still the same. The segment base / limit are fixed at 0 / -1 for CS/DS/ES/SS, but CS still needs to index a valid code segment in the GDT. This is also how an x86-64 CPU knows whether to decode machine code as 16, 32, or 64-bit mode, as well as whether privileged instructions like `invlpg` are allowed (ring 0). — Peter Cordes, Jun 09 '20 at 08:39
Memory protection is provided by paging, but other privilege-level stuff still exists. I'm not sure if privilege levels can affect data segment loads/stores to e.g. have memory that ring 1 can read but ring 3 can't. So it's possible that it might not be fully usable that way even though all 4 privilege levels do still exist. — Peter Cordes, Jun 09 '20 at 08:41
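To make the selector layout mentioned in the comments above concrete, here is a small sketch that splits a 16-bit segment selector into its architectural fields: bits 0-1 are the requested privilege level (for the value currently in CS these are the CPL), bit 2 is the table indicator (0 = GDT, 1 = LDT), and bits 3-15 are the descriptor index. The function name `decode_selector` and the sample values `0x33` and `0x10` are illustrative choices only, not taken from the thread.

```python
from typing import NamedTuple


class Selector(NamedTuple):
    index: int  # descriptor table index (bits 3-15)
    ti: int     # table indicator: 0 = GDT, 1 = LDT (bit 2)
    rpl: int    # privilege level field (bits 0-1); the CPL when read from CS


def decode_selector(value: int) -> Selector:
    """Split a 16-bit segment selector into its fields (hypothetical helper)."""
    if not 0 <= value <= 0xFFFF:
        raise ValueError("a selector is a 16-bit value")
    return Selector(index=value >> 3, ti=(value >> 2) & 1, rpl=value & 0b11)


if __name__ == "__main__":
    # Illustrative values: one selector with privilege level 3, one with level 0.
    print(decode_selector(0x33))
    print(decode_selector(0x10))
```

The 2-bit width of the privilege field is why there are exactly four rings, in long mode just as in 32-bit protected mode.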
