hn-classics/_stories/2010/13127282.md

| field | value |
|---|---|
| created_at | 2016-12-07T23:47:54.000Z |
| title | Why is there no CR1, and why are control registers such a mess anyway? (2010) |
| url | http://www.pagetable.com/?p=364 |
| author | JoshTriplett |
| points | 99 |
| story_text | |
| comment_text | |
| num_comments | 15 |
| story_id | |
| story_title | |
| story_url | |
| parent_id | |
| created_at_i | 1481154474 |
| _tags | story, author_JoshTriplett, story_13127282 |
| objectID | 13127282 |
| year | 2010 |

Source

Why is there no CR1 and why are control registers such a mess anyway?

If you want to enable protected mode or paging on the i386/x86_64 architecture, you use CR0, which is short for control register 0. Makes sense. These are important system settings. But if you want to switch the pagetable format, you have to change a bit in CR4 (CR1 does not exist, and CR2 and CR3 don't hold control bits); if you want to switch to 64 bit mode, you have to change a bit in an MSR; oh, and if you want to turn on single stepping, that's actually in your FLAGS. Also, have I mentioned that CR5 through CR15 don't exist, except for CR8, of course?

Like many (but unfortunately not all) quirks of the i386/x86_64 architecture, this mess can be explained with history.

8086 FLAGS

x86 history typically starts with the 16 bit 8086. Although it was not binary compatible with its predecessor, it was nevertheless a rather straightforward assembly-level compatible 16 bit extension of the 8 bit Intel 8080, with some ideas from the Zilog Z80. The 8086 is still a classic “home computer class” CPU, which was not meant for modern operating systems: It had no MMU of any kind, and no concept of privileged and unprivileged modes. Therefore, control bits that we see as system state today were encoded into the 16 bit FLAGS register: The interrupt enable bit and the trap flag (which will cause a software interrupt after the next instruction and thus lets you single-step) are encoded into FLAGS right next to the ALU's flags like Zero and Carry.
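To make the mix of ALU status and system control in FLAGS concrete, here is a minimal Python sketch that decodes four of the bits. The bit positions are the standard x86 ones; the helper names `describe_flags` and `enable_single_step` are made up for this illustration.

```python
# Bit positions in the 16 bit FLAGS register (standard x86 layout).
CF = 1 << 0   # carry: ALU status
ZF = 1 << 6   # zero: ALU status
TF = 1 << 8   # trap flag: single-stepping (system control)
IF = 1 << 9   # interrupt enable (system control)

_NAMES = {"CF": CF, "ZF": ZF, "TF": TF, "IF": IF}


def describe_flags(flags):
    """Return the names of the bits above that are set in `flags`."""
    return [name for name, mask in _NAMES.items() if flags & mask]


def enable_single_step(flags):
    """Set TF, the way a debugger would before resuming a program."""
    return (flags | TF) & 0xFFFF


if __name__ == "__main__":
    flags = 0x0202  # interrupts enabled, reserved bit 1 set
    print(describe_flags(flags))
    print(hex(enable_single_step(flags)))
```

Status bits and control bits live in the same word and are only told apart by convention, which is exactly the point of the paragraph above.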

80286 Machine Status Word

The 80286 then came with a simple form of memory management that allowed more sophisticated (but not yet “modern”) operating systems, like the original versions of OS/2, to run. The 16 bit “Machine Status Word” was created to host the big switch between legacy mode (real mode) and the new memory-managed mode (protected mode), and a program could access it using the new instructions “lmsw” and “smsw”. The 80286 had more system state than just this bit: The GDT, the IDT and the TSS each had their own register and dedicated instructions to access it (“lgdt”/“sgdt”, “lidt”/“sidt”, “ltr”/“str”).

i386 Control Registers

The i386 finally had a real MMU that allowed paging and thus modern operating systems. The MMU required two more registers in the system state, one for the base address of the pagetables, and one to read a fault address from. Intel decided against adding more special purpose registers with dedicated accessor instructions, and instead introduced eight indexed 32 bit wide “control registers”, CR0 to CR7. The new accessors “mov cr_n_, r32”/“mov r32, cr_n_” allowed copying between general-purpose registers and control registers and had the 3 bit CR index encoded in the instruction (in the reg field of the ModRM byte).
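As an illustration of how the 3 bit index selects the control register, the following Python sketch assembles the bytes for the two accessors. It assumes the usual encodings (0F 20 /r for reading a control register, 0F 22 /r for writing one, register-direct ModRM); the function name `encode_mov_cr` is invented for this example.

```python
# 32 bit general-purpose registers in their hardware encoding order.
GPR32 = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]


def encode_mov_cr(cr, gpr, to_cr):
    """Encode "mov crN, r32" (to_cr=True) or "mov r32, crN" (to_cr=False)."""
    if not 0 <= cr <= 7:
        raise ValueError("a 3 bit field can only select CR0 to CR7")
    opcode = 0x22 if to_cr else 0x20
    # ModRM: mod=11 (register direct), reg=CR index, rm=general-purpose register.
    modrm = 0xC0 | (cr << 3) | GPR32.index(gpr)
    return bytes([0x0F, opcode, modrm])


if __name__ == "__main__":
    print(encode_mov_cr(0, "eax", to_cr=False).hex())  # mov eax, cr0
    print(encode_mov_cr(3, "eax", to_cr=True).hex())   # mov cr3, eax
```

Because the index is just a 3 bit field, exactly eight control registers fit; whether a given index is actually implemented is up to the CPU.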

The old MSW was wired into the lower 16 bits of CR0, and CR0 was also extended with new bits like the switch to turn on paging. CR1 was kept reserved, presumably as a second control register for miscellaneous control bits, and CR2 and CR3 were used for the aforementioned fault address and pagetable base pointer. The opcodes to access reserved control registers generated an “invalid opcode” fault, making it possible for Intel to reuse the opcodes later if they never ended up being used for control registers.
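The relationship between the MSW and CR0, and the behavior of the reserved registers, can be modeled in a few lines of Python. This is only a toy model: `InvalidOpcode` and `access_cr` are hypothetical names standing in for the CPU's fault and the register access, and only the PE and PG bits are shown.

```python
CR0_PE = 1 << 0    # protection enable, inherited from the 80286 MSW
CR0_PG = 1 << 31   # paging enable, new on the i386

# Control registers that the i386 actually implements.
I386_DEFINED_CRS = {0, 2, 3}


class InvalidOpcode(Exception):
    """Hypothetical stand-in for the CPU's invalid opcode fault."""


def smsw(cr0):
    """The old Machine Status Word is simply the low 16 bits of CR0."""
    return cr0 & 0xFFFF


def access_cr(index):
    """Model an access to CRn: reserved indexes fault, defined ones pass."""
    if index not in I386_DEFINED_CRS:
        raise InvalidOpcode(f"CR{index} is reserved")
    return index


if __name__ == "__main__":
    cr0 = CR0_PG | CR0_PE
    print(hex(smsw(cr0)))  # PG is invisible through the 16 bit MSW view
```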

i486 CR4

The i486 added a few more control bits, and some of them went into CR0. But instead of overflowing the new bits into CR1, Intel, for unknown reasons, decided to skip it and open up CR4 instead.

Pentium MSRs

On the Pentium, Intel for the first time added control bits that were a property of the implementation as opposed to the architecture, i.e. bits that are microarchitecture-specific, such as caching details and debug settings, and that will therefore only work on certain CPUs and not necessarily be supported on later ones. In order not to waste the valuable CR space with throw-away control bits, Intel introduced the Model-specific Registers (MSRs). The MSR address space is 32 bits, and every MSR is 64 bits wide. The two new instructions “rdmsr” and “wrmsr” copy between an ECX-indexed MSR and the EDX:EAX register pair.
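The way a 64 bit MSR is split across EDX:EAX can be shown with a small Python model. `MsrFile` is a hypothetical class for this sketch; note that it returns 0 for an MSR that was never written, whereas real hardware would raise a general protection fault for an unimplemented MSR.

```python
MASK32 = 0xFFFFFFFF


class MsrFile:
    """Toy model of the MSR space: 32 bit index, 64 bit values."""

    def __init__(self):
        self._regs = {}

    def wrmsr(self, ecx, edx, eax):
        """Store EDX:EAX (high:low halves) into the MSR selected by ECX."""
        self._regs[ecx & MASK32] = ((edx & MASK32) << 32) | (eax & MASK32)

    def rdmsr(self, ecx):
        """Return (EDX, EAX) for the MSR selected by ECX."""
        value = self._regs.get(ecx & MASK32, 0)  # simplification, see above
        return (value >> 32) & MASK32, value & MASK32


if __name__ == "__main__":
    msrs = MsrFile()
    msrs.wrmsr(0x10, edx=0x1, eax=0x2)
    print(msrs.rdmsr(0x10))
```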

Pentium II SYSENTER MSR

The SYSENTER instruction, which was introduced on the Pentium II, is a fast way to switch between unprivileged and privileged mode. Instead of looking up the destination segment, instruction pointer and stack pointer in memory, the CPU holds this information in three special-purpose system registers. CR space is valuable, so Intel decided against filling up CR5, CR6 and CR7 and put these registers into the MSR address space instead, at 0x174 through 0x176. This was practically an abuse of the MSR concept.
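A rough Python sketch of what SYSENTER takes from those three MSRs: the code segment selector, stack pointer and instruction pointer come straight from the registers, and the stack segment is derived as the selector following CS. The MSR numbers are the ones named above; `sysenter_target` and the dictionary used as MSR storage are assumptions of this example, and all privilege and fault checks are left out.

```python
IA32_SYSENTER_CS = 0x174
IA32_SYSENTER_ESP = 0x175
IA32_SYSENTER_EIP = 0x176


def sysenter_target(msrs):
    """Compute the kernel entry state from a dict that maps MSR index to value."""
    cs = msrs[IA32_SYSENTER_CS] & 0xFFFC  # requested privilege level forced to 0
    return {
        "cs": cs,
        "ss": cs + 8,  # the stack segment is the next descriptor after CS
        "esp": msrs[IA32_SYSENTER_ESP] & 0xFFFFFFFF,
        "eip": msrs[IA32_SYSENTER_EIP] & 0xFFFFFFFF,
    }


if __name__ == "__main__":
    msrs = {0x174: 0x10, 0x175: 0x0009F000, 0x176: 0x00100000}
    print(sysenter_target(msrs))
```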

AMD K6 EFER MSR

Who can blame AMD for doing similar things then? With the K6, which was introduced at the same time as the Pentium II, AMD diverged from just copying Intel for the first time and actually added features of their own: They added the SYSCALL instruction, and with it, a control bit that turns it on and off, and an extra control register with the target location. Being afraid to collide with Intel extensions they didn't know about, they put the extra system registers into the MSR space: the control register “EFER” (Extended Feature Enable Register) at 0xC000_0080 and the Syscall Target Register (STAR) at 0xC000_0081. Intel had been nicely lining up MSRs counting up from 0, so AMD decided to start counting at 0xC000_0080. Understandable as this is, it is basically the same abuse of the MSR concept as Intel's with SYSENTER.

A very similar thing happened in the CPUID space, by the way: While Intel encoded all its feature bits in leaf 0x0000_0001, AMD defined leaf 0x8000_0001 for its features.

x86_64 Chaos!

So far everything looked like it was getting a little more controlled. Both Intel and AMD were only adding new control registers in the MSR space, and since this is a big address space and AMD and Intel were extending it at rather opposite ends, it all looked nicer. But then came x86_64: For the first time, Intel was copying a feature that AMD had introduced, and it needed to be compatible in all its details. AMD had encoded the availability of x86_64 in its own CPUID leaf 0x8000_0001, so Intel had to support this leaf as well. And since Long Mode was turned on in the EFER MSR, Intel had to support an MSR in the AMD space at 0xC000_0000. Long mode also required supporting SYSCALL, so Intel also supported the STAR MSR.
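The pieces Intel had to adopt can be summarized as constants plus two bit tests. In this Python sketch the MSR numbers and the CPUID leaf come from the text; the bit positions (long mode in EDX bit 29 of the extended leaf, SCE/LME/LMA in EFER bits 0/8/10) are the documented ones, while the helper names are invented for illustration.

```python
CPUID_EXT_FEATURES = 0x80000001   # AMD-defined extended feature leaf
CPUID_EXT_EDX_LM = 1 << 29        # long mode available

MSR_EFER = 0xC0000080
MSR_STAR = 0xC0000081

EFER_SCE = 1 << 0    # SYSCALL enable
EFER_LME = 1 << 8    # long mode enable (set by software)
EFER_LMA = 1 << 10   # long mode active (set by the CPU)


def supports_long_mode(ext_edx):
    """Test the long mode bit in EDX as returned by CPUID leaf 0x8000_0001."""
    return bool(ext_edx & CPUID_EXT_EDX_LM)


def request_long_mode(efer):
    """Return the EFER value with long mode enable set."""
    return efer | EFER_LME


if __name__ == "__main__":
    print(supports_long_mode(1 << 29))
    print(hex(request_long_mode(EFER_SCE)))
```

Setting LME alone does not switch modes: the CPU reports long mode as active in LMA only once paging is turned on in CR0, so the full sequence touches an MSR and two control registers.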

Since x86_64 introduced the REX prefix to double the number of available general-purpose registers, AMD decided to allow this prefix for “mov cr” as well, doubling the number of control registers and thereby introducing CR8 through CR15, and also doubling their width to 64 bits. And since AMD introduced them, they owned them, and decided to use CR8 for the “Task Priority Register” feature.
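How the REX prefix extends the 3 bit CR index to 4 bits can be sketched in Python as follows. The encoding assumed here is the usual 64 bit one: REX.R supplies the fourth bit of the control register index and REX.B the fourth bit of the general-purpose register index. `encode_mov_cr64` is a made-up name, and the general-purpose register is passed as a number (0 for rax, 9 for r9) to keep the sketch short.

```python
REX_BASE = 0x40
REX_R = 0x04   # extends the ModRM reg field (here: the CR index)
REX_B = 0x01   # extends the ModRM rm field (here: the GPR index)


def encode_mov_cr64(cr, gpr, to_cr):
    """Encode "mov crN, r64" (to_cr=True) or "mov r64, crN" in 64 bit mode."""
    if not (0 <= cr <= 15 and 0 <= gpr <= 15):
        raise ValueError("register index out of range")
    rex = 0
    if cr >= 8:
        rex |= REX_R
    if gpr >= 8:
        rex |= REX_B
    prefix = [REX_BASE | rex] if rex else []  # no prefix needed for CR0 to CR7
    opcode = 0x22 if to_cr else 0x20
    modrm = 0xC0 | ((cr & 7) << 3) | (gpr & 7)
    return bytes(prefix + [0x0F, opcode, modrm])


if __name__ == "__main__":
    print(encode_mov_cr64(8, 0, to_cr=False).hex())  # mov rax, cr8
```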

VMX and SVM

The architecture is messy, sure, but does it matter? Maybe not… as long as CPUs didn't have virtualization extensions! Both Intel VMX and AMD SVM are designed so that they can automatically switch the complete privileged machine state including control registers and certain MSRs. Intel, for example, special-cases CR0, CR3, CR4 and CR8 and leaves CR2 to the user. AMD, on the other hand, has 16 fields for all CRs in its switcher. And because of the two different starting points of the MSR space, Intel VMX required a whitelist bitmap for 8192 MSRs starting at 0x0000_0000 and for another 8192 MSRs starting at 0xC000_0000, and of course SYSENTER_CS, EFER, STAR and friends are special-cased. If you want to have a lot of fun, read the VMCS layout reference in Intel's manual 3B!
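The two-range bitmap can be illustrated with a short Python lookup. The sketch assumes the layout of the VMX MSR bitmap page as I understand it from Intel's manual: four 1 KiB bitmaps in one 4 KiB page, in the order read-low, read-high, write-low, write-high, where a set bit means the access is intercepted. `msr_bitmap_position` is a hypothetical helper name.

```python
LOW_BASE, HIGH_BASE = 0x00000000, 0xC0000000
MSRS_PER_RANGE = 8192          # 8192 bits = 1 KiB per bitmap
WRITE_OFFSET = 2048            # the write bitmaps follow the two read bitmaps
HIGH_OFFSET = 1024             # the high range follows the low range


def msr_bitmap_position(msr, write):
    """Return (byte offset, bit number) in the 4 KiB bitmap page, or None.

    None means the MSR lies outside both ranges and cannot be whitelisted.
    """
    if LOW_BASE <= msr < LOW_BASE + MSRS_PER_RANGE:
        base, index = 0, msr - LOW_BASE
    elif HIGH_BASE <= msr < HIGH_BASE + MSRS_PER_RANGE:
        base, index = HIGH_OFFSET, msr - HIGH_BASE
    else:
        return None
    if write:
        base += WRITE_OFFSET
    return base + index // 8, index % 8


if __name__ == "__main__":
    print(msr_bitmap_position(0x174, write=False))       # SYSENTER_CS, read
    print(msr_bitmap_position(0xC0000080, write=True))   # EFER, write
```

The split into a low and a high range is a direct consequence of Intel counting up from 0 and AMD counting up from 0xC000_0000.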

Future?

  • CR1 and CR5 to CR7 are still “owned” by Intel. AMD has shown that they don't want to use them, and even Intel has not added a control register since 1989.
  • CR9 through CR15 are technically owned by AMD, since they introduced them with x86_64 and decided to use CR8. Intel adopted the reserved ones when adopting x86_64, but it is unlikely that Intel will ever adopt smaller changes to the architecture from AMD, and AMD is unlikely to use them if they won't be part of the architecture, so these will probably never be used either. On the other hand, AMD added these to the auto-switcher list of their SVM Virtual Machine Control Block (VMCB), showing that they haven't given up on them yet.
  • The MSR space is properly de-facto partitioned. Intel continues adding MSRs at 0 and AMD at 0xC000_0000, but MSRs already lost their model-specificness in 1997. MSRs are the new CRs.

Dear Intel, dear AMD: I like the control registers, and I hate to see them wasted. Why don't you finally define CR1 and give it a few control bits in the future? If you're scared about collisions, I will be happy to be the arbiter. Ah, whatever: Intel, you get to define all even bits in CR1, and AMD, you get to define all odd bits. Okay? Cool.

This entry was posted in archeology, trivia, whines on July 2, 2010 by Michael Steil.

12 thoughts on “Why is there no CR1 and why are control registers such a mess anyway?”

  1. Ed July 3, 2010 at 01:08

As Ollie never said: “Another fine mess” nice article! (Two typos in the MSW paragraph: “decicated”)

  1. Ron July 3, 2010 at 03:49

This architecture needs a coffin. Badly.

  1. Michael Steil July 3, 2010 at 05:37

@Ed: fixed, thanks!

  1. Nate July 4, 2010 at 10:13

Hi Michael. I think the VMM save state structure is derived from the ICE structure, which first showed up on the 386SL. This started with the LOADALL instruction and then moved to RSM (resume from SMM).

http://www.rcollins.org/ddj/Jan97/Jan97.html

It would be interesting to compare all three.

  1. **Yuhong Bao** July 25, 2010 at 22:48

Actually CR4 was not introduced with the original 486. It again originated in the Pentium and was added to the 486 when they backported some features from the Pentium at the same time.

  1. **Yuhong Bao** July 25, 2010 at 22:56

“MSRs are the new CRs.”
Yep, Intel still use CR4 to enable their new features, AMD use EFER for their new features.

  1. **Myria** September 7, 2010 at 01:11

The “mov cr0” instruction had an advantage over “lmsw”: mov cr0 can change the PE (“protection enable”) bit from 1 back to 0 to exit protected mode. The 286 couldn't leave protected mode once entered. lmsw still doesn't allow clearing the PE bit for backward compatibility.

The 386 also added the DRx debug registers.

The 486 added another set of registers, TRx, for testing the cache RAM that was added to the 486 for the first time in the x86 line. These registers disappeared after the 486; Intel may even have reassigned the opcode space.

MSRs were meant to be model-specific, hence the name, but enough important functionality existed there that Intel designated some of the MSRs as “architectural”, meaning they won't change meaning. AMD does the same for the important x86-64 registers like EFER.

AMD's CR8 register is useful enough that CR8-CR15 can be accessed from 32-bit mode. In 64-bit mode, accessing CR8 through CR15 is done through a REX prefix. REX prefixes are the “INC reg32” and “DEC reg32” instructions in 32-bit mode, so that wouldn't work. Instead, providing a “LOCK” prefix to the instruction can be used to select the high 8 registers.

  1. Peter Lund September 12, 2010 at 04:18

Didn't Intel break the model-specificity right from the start by putting the tick counter in an MSR?

  1. **Yuhong Bao** October 3, 2010 at 17:42

Myria: Correct, except that TRx was also in the 386.

  1. Czerno November 2, 2010 at 05:03

Typo (I presume a typo…) : MSR are indexed by ECX, not EDX !

Please do yourself a favor, re-read whatever you post carefully, as errors such as this one, the 386 TR registers as remarked by someone above and a few other “bits” easily remove credibility from a blog.

  1. Michael Steil November 2, 2010 at 15:50

@Czerno: Thanks, fixed; criticism noted.

  1. Czerno November 9, 2010 at 04:16

Thank /you/, Michael!

Regarding the “mess” with various control and status registers, we should note that part of the status (most importantly EFLAGS) pertains to the /task/ while another part is machine-wide, so there is some logical basis for it to be stored in a different manner. Of course, you'll point out that CR3 also is per-task hence copied to the TSS… Right, it /is/ a mess ;=)
