NT Architecture Basics


NT executes in two modes, user mode and kernel mode. Kernel mode is a highly privileged processor mode, with direct access to all hardware and memory; user mode is a less privileged mode, with no direct access to hardware and restricted access to memory.

User mode is the mode in which applications and operating system environment subsystems execute. The operating system environments that NT supplies include POSIX, OS/2, Win16, DOS, and Win32. Applications are clients of exactly one environment subsystem and use only the APIs that subsystem exports. Thus, Win32 programs are clients of the Win32 subsystem and use only the Win32 API.

The subsystems use basic NT services that the NT Executive and the Microkernel provide. These services run in kernel mode. The Executive includes core operating system components such as the Process Manager, the Virtual Memory Manager, and the Configuration Manager.

The Executive is generally portable across processor architectures (e.g., Alpha, x86), and it relies on the Microkernel for processor-specific functions such as context-switching (scheduling) and synchronization primitives.

Beneath the Microkernel resides the Hardware Abstraction Layer (HAL), through which the Executive subsystems and the Microkernel interface with the processor. The HAL holds the platform-specific portion of the kernel-mode code, including routines written in the platform's assembly language. Microsoft ships different HALs for different processors and processor boards.

Device drivers are modules that interface NT and applications to specific hardware devices. A large number of device drivers for disk drives, video cards, modems, network cards, and input devices ship with NT. However, hardware vendors can include custom device drivers with their hardware, and NT dynamically adds the drivers to its kernel-mode environment.
 

User Mode vs. Kernel Mode

What differentiates user mode from kernel mode is the privilege level. A program executing in user mode runs in a sandbox (not unlike a Java virtual machine's sandbox) that the NT Executive and the program's operating system environment create for the program. The sandbox enforces restrictions as to what the program can do. One type of restriction relates to what parts of the computer's memory the program can reference and in what ways.
 
The figure shows the virtual memory map that NT creates for applications. Addressable memory totals 4GB, and NT divides the space evenly between the memory assigned to a program and the memory that the kernel-mode portion of NT uses.

The lower 2GB mapping changes, depending on which program is currently running. For example, if Microsoft Word is running, NT places Word's address mapping in the lower 2GB; if Netscape Navigator runs next, its mapping replaces Word's mapping.

The upper 2GB mapping always remains that of the Executive, Microkernel, device drivers, and HAL. Thus, the split between user mode and kernel mode also shows up in NT's address space mapping. (In NT Server 4.0, Enterprise Edition, you can adjust the address split between user mode and kernel mode so that applications have 3GB of memory, with 1GB left for NT's Executive, drivers, and HAL. You will see this split only when NT is running on systems with several gigabytes of physical memory.)
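The address split can be illustrated with a small model. The following is a minimal sketch, not NT code: the function name `classify_address` and the `user_gb` parameter are hypothetical, and the model ignores any guard regions a real system may reserve near the boundary.

```python
GB = 1 << 30
ADDRESS_SPACE = 4 * GB  # 32-bit virtual address space, as described above


def classify_address(addr: int, user_gb: int = 2) -> str:
    """Return 'user' or 'kernel' for a 32-bit virtual address.

    user_gb is the size of the user-mode portion in gigabytes:
    2 for the default split, 3 for the adjusted 3GB/1GB split.
    (Hypothetical helper for illustration only.)
    """
    if not 0 <= addr < ADDRESS_SPACE:
        raise ValueError("address outside the 4GB address space")
    if user_gb not in (2, 3):
        raise ValueError("only the 2GB and 3GB user portions are modeled")
    boundary = user_gb * GB  # first address that belongs to kernel mode
    return "user" if addr < boundary else "kernel"
```

With the default split, address 0x80000000 is the first kernel-mode address; with the 3GB split, the same address falls in the application's portion and the kernel-mode portion starts at 0xC0000000.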

The primary memory restriction placed on user-mode programs is that they cannot access any of the kernel-mode memory. User-mode programs also cannot access invalid portions of their mapping (i.e., portions not filled with data or code from the program). This arrangement contrasts with the kernel-mode portions of NT, which have free rein over the entire address map.

The user-mode sandbox enforces another restriction that limits a program's ability to directly access hardware devices such as disks, the video screen, and the printer. Programs must typically go through their operating system environment (e.g., Win32) to read data from or write data to a peripheral. The operating system environment then usually calls on the services of the Executive in kernel mode, effectively forwarding the request. The Executive finally completes the request, sometimes with the aid of a device driver, but almost always with the use of functions in the HAL that interface with the computer's hardware. NT implements the transition between user mode and kernel mode as a system call gateway, through which the passage of data is precisely controlled.

Although a user-mode program can try to directly communicate with a hardware device, NT prevents it from doing so. Any kernel-mode component, however, can touch any part of the hardware. For example, a device driver implemented to interface with a disk drive can access video hardware without NT stopping it.

What do I mean when I say that NT stops user-mode programs from reaching outside their sandboxes to touch memory that isn't theirs or access hardware devices directly? You probably have seen the result of such an attempt.
 

The infamous Dr. Watson dialog box signals that NT caught a program doing something illegal and is terminating the program. The detection of such transgressions takes place in a kernel-mode subsystem such as the Process Manager or the Virtual Memory Manager. Some legal user-mode operations (e.g., referencing memory whose contents currently reside in the paging file) generate processor exceptions, but a program can also trigger exceptions when it steps outside its sandbox. A kernel-mode component must determine whether an exception is the result of a legal or an illegal operation; when it catches an exception caused by an illegal operation, it notifies the Dr. Watson user-mode application. With the help of hardware support in the processor, the kernel-mode portions of NT keep user-mode applications constrained to acceptable activity and prevent user-mode applications from corrupting other applications or crossing the boundary between user mode and kernel mode other than through the gateway.
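The legal-versus-illegal decision can be sketched as a toy model. Everything here is an assumption made for illustration: the names `PageState` and `handle_user_access`, the 4KB page size, the default 2GB boundary, and the string results stand in for what the kernel-mode subsystems actually do.

```python
from enum import Enum

KERNEL_BASE = 0x80000000  # assumes the default 2GB/2GB split
PAGE_SIZE = 4096          # assumed page size for this model


class PageState(Enum):
    RESIDENT = "resident"    # data is in physical memory
    PAGED_OUT = "paged_out"  # data currently lives in the paging file


def handle_user_access(addr, pages):
    """Classify a user-mode memory reference.

    pages maps page-aligned addresses to PageState for the valid
    parts of the program's mapping. Returns 'ok', 'page_in' (a legal
    exception that is resolved), or 'terminate' (an illegal access).
    """
    if addr >= KERNEL_BASE:
        return "terminate"  # user mode may not touch kernel-mode memory
    page = addr & ~(PAGE_SIZE - 1)
    state = pages.get(page)
    if state is None:
        return "terminate"  # invalid portion of the program's mapping
    if state is PageState.PAGED_OUT:
        pages[page] = PageState.RESIDENT  # bring the data back in
        return "page_in"
    return "ok"
```

In this model only the 'terminate' outcome corresponds to the case in which Dr. Watson would be notified; the 'page_in' outcome is invisible to the program.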

Kernel-Mode Rules

Portions of the memory map are undefined, and consequently, those portions are invalid regardless of what tries to access them. For example, if the space between 3GB and 4GB in the address map is not defined, a device driver accessing that portion of the map will cause a processor exception. In this example, the Virtual Memory Manager will recognize that a kernel-mode device driver has tried to touch invalid memory. For exceptions that originate in user mode, a kernel-mode subsystem handles the exception.

Kernel-mode components also have rigid rules about what they can do when the processor is in different states.

Each processor in an NT system has an associated Interrupt Request Level (IRQL) that changes as the processor's interrupt controller fields various software and hardware interrupts. Although IRQLs have almost nothing to do with scheduling priorities, you can think of IRQLs as priorities in the sense that the interrupt controller blocks out interrupt requests with lower IRQLs while the processor is handling interrupts with higher IRQLs. In its design, NT attempts to keep the IRQL at Passive Level, where no interrupts are blocked out, as much as possible. The NT scheduler executes at Dispatch Level, and NT services hardware interrupts at even higher IRQLs.
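The masking behavior can be modeled in a few lines. This is a simplified sketch, not the interrupt controller's real logic: the `Processor` class and its methods are hypothetical, the numeric levels are illustrative, and the model treats an interrupt as masked when its level is at or below the current IRQL.

```python
PASSIVE_LEVEL = 0   # illustrative values; no interrupts are blocked here
DISPATCH_LEVEL = 2
DEVICE_LEVEL = 5    # hypothetical hardware interrupt level


class Processor:
    def __init__(self):
        self.irql = PASSIVE_LEVEL
        self.pending = []   # levels of interrupts that are held back
        self.serviced = []  # levels in the order they were handled

    def request_interrupt(self, level):
        """Handle the interrupt now if it is not masked, else hold it."""
        if level > self.irql:
            self.serviced.append(level)
        else:
            self.pending.append(level)

    def set_irql(self, level):
        """Change the IRQL and deliver pending interrupts it unmasks."""
        self.irql = level
        unmasked = sorted((p for p in self.pending if p > level), reverse=True)
        self.pending = [p for p in self.pending if p <= level]
        self.serviced.extend(unmasked)  # highest level first
```

For example, with the IRQL raised to `DEVICE_LEVEL`, a request at `DISPATCH_LEVEL` stays pending until the IRQL drops back toward `PASSIVE_LEVEL`, while a request at a higher level is handled immediately.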

Only when the IRQL is below Dispatch Level can kernel-mode components access pageable memory or cause scheduling operations. Pageable memory includes all user-mode application memory and portions of kernel-mode memory. Pageable memory gets its name from the fact that its data can be temporarily moved from the processor's physical memory to a paging file on disk and brought back when needed. When a kernel-mode component (such as a device driver) accesses part of the memory map referring to pageable memory that has data in a paging file, it triggers a processor exception (the same one that's triggered when a component references invalid memory), and the Virtual Memory Manager must retrieve the data. However, if the IRQL is Dispatch Level or higher, the Virtual Memory Manager cannot be invoked.

The scheduler's IRQL is Dispatch Level, so a device driver cannot yield control of a processor to another program or kernel-mode component if the IRQL is at Dispatch Level or higher. To do so would force the invocation of the scheduler, which would detect that it had been called at an illegal processor state.
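The two IRQL rules can be expressed as guard checks. The sketch below is a model under stated assumptions: `BugCheck`, `access_pageable_memory`, and `yield_processor` are hypothetical names, the level values are illustrative, and the model only fails a pageable access when the data is actually in the paging file (an access to a resident page happens to succeed even though it breaks the rule).

```python
PASSIVE_LEVEL = 0   # illustrative values
DISPATCH_LEVEL = 2


class BugCheck(Exception):
    """Stands in for the system halting on an illegal kernel-mode operation."""


def access_pageable_memory(irql, resident):
    """Model a kernel-mode access to pageable memory at the given IRQL."""
    if resident:
        return "ok"  # no exception is raised, so nothing is detected
    if irql >= DISPATCH_LEVEL:
        # The Virtual Memory Manager cannot be invoked at this level.
        raise BugCheck("page fault at Dispatch Level or higher")
    return "page_in"  # the Virtual Memory Manager retrieves the data


def yield_processor(irql):
    """Model a component giving up the processor at the given IRQL."""
    if irql >= DISPATCH_LEVEL:
        # The scheduler would detect an illegal processor state.
        raise BugCheck("scheduling attempted at Dispatch Level or higher")
    return "rescheduled"
```

The resident-page case shows why such bugs can go unnoticed for a long time: the driver only fails on the occasions when the data it touches has been moved to the paging file.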

When a kernel-mode device driver or subsystem causes an illegal exception, NT faces a difficult dilemma. It has detected that a part of the operating system with the ability to access any hardware device and any valid memory has done something it wasn't supposed to do.

NT could just ignore the exception and let the device driver or subsystem continue as if nothing had happened. The possibility exists that the error was isolated and that the component will somehow recover, letting NT limp along. What's more likely is that the detected exception resulted from deeper problems--for example, from a general corruption of memory or from a hardware device that's not functioning properly. Permitting the system to continue operating will probably result in more exceptions, and data stored on disk or other peripherals can become corrupt--a risk that's too high to take.

A device driver or subsystem also might realize that something is not quite right. For example, a subsystem might call a function in a device driver when the processor IRQL is Passive Level. If the function returns and the IRQL has changed, the device driver has somehow modified the IRQL without restoring it, which reveals a bug in the driver. As device drivers and subsystems execute, they require certain operations to succeed or return results within a valid range. For instance, if the Configuration Manager tries to read a Registry file from the disk and encounters an error, the Configuration Manager might not be able to continue processing without risking damage to the Registry.

To stop a system in the face of kernel-mode exceptions and to provide a systems administrator or developer with information about what has happened, NT exports the KeBugCheck function, along with its extended form KeBugCheckEx, for use by kernel-mode device drivers, subsystems, and the Microkernel. KeBugCheckEx takes a Stop code and four more parameters that are interpreted on a per-Stop code basis; KeBugCheck takes only the Stop code. After the bug-check routine masks out all interrupts on all processors of the system, it switches the display into blue screen mode (80 columns by 50 lines text mode), paints a blue background, and begins to print information about the system's state. The result is the Blue Screen of Death (BSOD).
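As an illustration of how a Stop code and its four parameters are presented, the sketch below formats a line modeled on the first line of a blue screen. The function name `format_stop_line` is hypothetical, and the parameter values in the usage note are invented for the example.

```python
def format_stop_line(stop_code, params):
    """Format a Stop code and its four parameters as one line.

    The layout is modeled on the STOP line of a blue screen; this is
    an illustrative helper, not part of NT.
    """
    if len(params) != 4:
        raise ValueError("a Stop code is accompanied by exactly four parameters")
    args = ",".join(f"0x{p:08X}" for p in params)
    return f"*** STOP: 0x{stop_code:08X} ({args})"
```

Calling `format_stop_line(0x0A, (0x0, 0x2, 0x0, 0x80123456))` yields `*** STOP: 0x0000000A (0x00000000,0x00000002,0x00000000,0x80123456)`. Stop code 0x0000000A is IRQL_NOT_LESS_OR_EQUAL, the code associated with touching invalid or paged-out memory at too high an IRQL.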
 
