SINGULARITY: A Research OS at Microsoft, Redmond

Abstract

Singularity is a research project at Microsoft Research that started with the question: what would a software platform look like if it were designed from scratch with the primary goal of dependability? The project is working to answer this question by building on advances in programming languages and tools to develop a new system architecture and operating system (code-named Singularity), with the aim of producing a more robust and dependable software platform. Singularity demonstrates the practicality of new technologies and architectural decisions, which should lead to the construction of more robust and dependable systems.

Chapter 1: Introduction

Today’s software runs on a platform that has evolved over the past 40 years and is increasingly showing its age. This platform is the vast collection of code (operating systems, programming languages, compilers, libraries, run-time systems, middleware, and so on) and hardware that enables a program to execute. On one hand, this platform is an enormous success in both financial and practical terms. It forms the foundation of the $179 billion packaged software industry and has enabled revolutionary innovations such as the Internet. On the other hand, the platform and the software running on it are less robust, reliable, and secure than most users (and developers!) would wish. Our current platform has not evolved far beyond the computer architectures, operating systems, and programming languages of the 1960s and 1970s, and the computing environment of that period was very different from today’s.

Singularity is a new operating system that started with the question: what would a software platform look like if it were designed from scratch with the primary goal of dependability, instead of the more common goal of performance? It is being developed as a basis for more dependable system and application software. Singularity exploits advances in programming languages and tools to create an environment in which software is more likely to be built correctly, program behavior is easier to verify, and run-time failures can be contained. To a lesser degree, OS research has also failed to exploit the exponential evolution of hardware exemplified by Moore’s Law. OS research is now ready for a revolution.

Chapter 2: Key Features of Singularity

2.1 Software-Isolated Processes (SIPs)

A key aspect of Singularity is an extension model based on Software-Isolated Processes (SIPs), which encapsulate pieces of an application or a system and provide information hiding, failure isolation, and strong interfaces. SIPs are used throughout the operating system and application software. It is believed that building a system on this abstraction will lead to more dependable software. SIPs are the OS processes on Singularity. All code outside the kernel executes in a SIP.

SIPs differ from conventional operating system processes in a number of ways:

1. SIPs are closed object spaces, not address spaces. Two Singularity processes cannot simultaneously access an object. Communication between processes transfers exclusive ownership of data.

2. SIPs are closed code spaces. A process cannot dynamically load or generate code.

3. SIPs do not rely on memory management hardware for isolation. Multiple SIPs can reside in a physical or virtual address space.

4. Communication between SIPs is through bidirectional, strongly typed, higher-order channels. A channel specifies its communications protocol as well as the values transferred, and both aspects are verified.

5. SIPs are inexpensive to create and communication between SIPs incurs low overhead. Low cost makes it practical to use SIPs as a fine-grain isolation and extension mechanism.

6. SIPs are created and terminated by the operating system, so that on termination, a SIP’s resources can be efficiently reclaimed.

7. SIPs execute independently, even to the extent of having different data layouts, run-time systems, and garbage collectors.

· Advantages of SIPs

SIPs are not just used to encapsulate application extensions. Singularity uses a single mechanism for both protection and extensibility, instead of the conventional dual mechanisms of processes and dynamic code loading. As a consequence, Singularity needs only one error recovery model, one communication mechanism, one security policy, and one programming model, rather than the layers of partially redundant mechanisms and policies in current systems. A key experiment in Singularity is to construct an entire operating system using SIPs and demonstrate that the resulting system is more dependable than a conventional system. The Singularity kernel consists almost entirely of safe code and the rest of the system, which executes in SIPs, consists of only verifiably safe code, including all device drivers, system processes, and applications. Language safety protects this trusted base from untrusted code. The integrity of the SIPs depends on language safety and on a system-wide invariant that a process does not hold a reference into another process’s object space.

2.2 Ensuring Code Safety

Ensuring code safety is obviously essential. In the short term, Singularity relies on compiler verification of source and intermediate code. In the future, typed assembly language (TAL) will allow Singularity to verify the safety of compiled code. TAL requires that a program executable supply a proof of its type safety (which can be produced automatically by a compiler for a safe language). Verifying that a proof is correct and applicable to the instructions in an executable is a straightforward task for a simple verifier of a few thousand lines of code. This end-to-end verification strategy eliminates a compiler—a large, complex program—from Singularity’s trusted base.

2.3 Extensions

Microsoft provides extensions to Singularity that use the application abstraction and driver resource declarations to provide guarantees about the I/O and IPC resources used by a device driver. These extensions allow Singularity to detect resource conflicts before drivers execute, infer a valid total boot order from strictly declarative syntax, and automatically generate significant driver initialization code. These capabilities increase the reliability and maintainability of the system with no significant cost in run-time performance.
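
The two checks described above, conflict detection and boot ordering, can be illustrated with a small sketch. It does not reproduce Singularity’s manifest format or tooling: the DriverManifest type, its fields, and the driver names are hypothetical. It assumes each driver declares the I/O port ranges it claims and the drivers it depends on, reports overlapping claims before anything runs, and derives a start order with a topological sort.

```python
from dataclasses import dataclass
from graphlib import TopologicalSorter
from itertools import combinations


@dataclass(frozen=True)
class DriverManifest:
    """Hypothetical declarative description of one driver."""
    name: str
    io_ports: tuple = ()      # inclusive (start, end) port ranges
    depends_on: tuple = ()    # names of drivers that must start first


def ranges_overlap(a, b):
    """True if two inclusive (start, end) ranges share at least one port."""
    return a[0] <= b[1] and b[0] <= a[1]


def find_conflicts(manifests):
    """Return pairs of driver names whose declared port ranges overlap."""
    return [
        (a.name, b.name)
        for a, b in combinations(manifests, 2)
        if any(ranges_overlap(x, y) for x in a.io_ports for y in b.io_ports)
    ]


def boot_order(manifests):
    """Derive a start order from the declared dependencies alone.

    Raises graphlib.CycleError if the declarations are circular.
    """
    graph = {m.name: set(m.depends_on) for m in manifests}
    return list(TopologicalSorter(graph).static_order())


manifests = [
    DriverManifest("pci_bus", io_ports=((0xCF8, 0xCFF),)),
    DriverManifest("disk", io_ports=((0x1F0, 0x1F7),), depends_on=("pci_bus",)),
    DriverManifest("net", io_ports=((0x1F4, 0x1FB),), depends_on=("pci_bus",)),
]
conflicts = find_conflicts(manifests)   # disk and net both claim 0x1F4-0x1F7
order = boot_order(manifests)           # pci_bus comes before disk and net
```

Because both functions read only the declarations, they can run at installation or boot time, before any driver code executes.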

2.4 Four Centers of Gravity

Four design points combine to produce an OS that is agile to future research and innovates in system dependability. These design points are: a type safe abstract instruction set as the system binary interface, a unified extension mechanism for applications and the OS, a strong process isolation architecture, and a ubiquitous metadata infrastructure describing code and data.

A type safe abstract instruction set, based on type safe MSIL, provides an ideal system binary interface. It eliminates a whole class of programmer errors due to bad pointer arithmetic, enables changing boundaries between privileged and unprivileged code, opens new opportunities for dynamically adjusting trade-offs between security and performance, and allows ubiquitous analysis and instrumentation for both dependability and research.

Providing a unified extension mechanism simplifies design and implementation of the OS, applications, and new research proposals. Supporting one extension mechanism, instead of many, helps focus design efforts on getting that one mechanism correct; it also simplifies the management of extensions from the perspectives of both dependability and security. Choosing an extension mechanism that isolates extension code simplifies the development of dependable systems and opens otherwise unavailable opportunities for static analysis and optimization.

Strong process isolation overcomes two major challenges in contemporary systems: unintended collision between friendly applications and undesired attack from hostile code. The need for strong process isolation is witnessed by the rise of virtual machines like Virtual PC and OS alternative security models such as the CLR’s Code Access Security. Strong process isolation is a property of both time and space. Once a Singularity process has been created, its code cannot be mutated by either accident or malicious software such as worms. Strong process isolation will improve system dependability and provide solid boundaries which can be exploited for further OS research.

Ubiquitous metadata provides a unifying design thread and enables the removal of unreliability as early as possible in the system life cycle. Metadata enables type safety and flexibility in the abstract instruction set, verification of interfaces in the extension model, and explicit relationship labeling for deep process isolation. Introduction of new classes of metadata via an application abstraction moves many checks for unreliability from run time to installation or even compile time. Singularity’s metadata system creates new abstractions for not just applications, but also application components and overall system configuration. Singularity enables OS research across the application boundary by providing a ubiquitous, OS-protected infrastructure for storing and manipulating metadata.

Initial research and development resources will be devoted to making progress on these four centers of gravity as early as possible in the design and implementation of Singularity.

Chapter 3: Contributions of SINGULARITY

The key contributions of Singularity are:

1. Construction of a system and application model called software-isolated processes, which uses verified safe code to implement a strong boundary between processes without hardware mechanisms. Since SIPs cost less to create and schedule, the system and applications can support more and finer isolation boundaries and a stronger isolation model.

2. A consistent extension model for the system and applications that simplifies the security model, improves dependability and failure recovery, increases code optimization, and makes programming and testing tools more effective.

3. A fast, verifiable communication mechanism between the processes on a system, which preserves process independence and isolation, yet enables processes to communicate correctly and at low cost.

4. Language and compiler support to build an entire system in safe code and to verify interprocess communications with explicit resource management.

5. Elimination of the distinction between an operating system and a safe language run-time system, such as the Java JVM or Microsoft CLR.

6. Pervasive use of specifications throughout a system to describe, configure, and verify components.

The design and implementation of a system based on SIPs is a major contribution of this work. A software-isolated process is a collection of memory pages and a language safety mechanism that ensures that code in the process cannot access another process’s pages. A SIP replaces hardware memory protection with static verification of program safety. Singularity uses language safety and a fast communication mechanism built on channels to enforce a system-wide invariant that neither the kernel nor any other process contains a reference into a given process’s object space. Singularity segregates objects on a per-process basis to facilitate reclaiming resources on process termination.

This architecture provides fine-granularity isolation and high performance. If a process fails, no other process’s data is left in an inconsistent state, and failure notification is cleanly propagated through communication channels. Moreover, the system can easily reclaim the failed process’s resources, including memory. And, without hardware isolation, system calls and inter-process communication run significantly faster (30–500%).

Singularity also provides extensions that use the application abstraction and driver resource declarations to provide guarantees about the I/O and IPC resources used by a device driver. These extensions allow it to detect resource conflicts before drivers execute, infer a valid total boot order from strictly declarative syntax, and automatically generate significant driver initialization code. These capabilities increase the reliability and maintainability of the system with no significant cost in run-time performance.

An additional contribution of this work is an initial exploration of Singularity’s ability to use hardware isolation selectively, rather than at every process boundary. For example, system processes and device drivers (each of which runs in its own process) can, but need not, reside in the same address space as the kernel. Using a single address space permits fast communication, but still provides memory and failure isolation.

Chapter 4: Singularity Architecture

The figure below depicts the architecture of the Singularity OS, which is built around three key abstractions: a kernel, software-isolated processes, and channels. The kernel provides the core functionality of the system, including memory management, process creation and termination, channel operations, scheduling, and I/O. As in other microkernels, most of the system’s functionality and extensibility exist in processes outside of the kernel.

4.1 The Trusted Base

Code in Singularity is either verified or trusted. The type and memory safety of verified code is checked by a compiler. Unverifiable code must be trusted by the system and is limited to the hardware abstraction layer (HAL), kernel, and parts of the run-time system. Most of the kernel is verifiably safe, but portions are written in assembler, C++, and unsafe C#. All other code is written in a safe language, translated to safe Microsoft Intermediate Language (MSIL), and then compiled to x86 by the Bartok compiler. Currently, Microsoft trusts that Bartok correctly verifies and generates safe code. This is unsatisfactory in the long run, and the team plans to use typed assembly language to verify the output of the compiler and reduce this part of the trusted computing base to a small verifier. The dividing line between the two types of code is blurred by the run-time system. This trusted, but unverifiable, code is effectively isolated from a computation, whose verified safety prevents it from interacting with the run-time system and its data structures, except through safe interfaces. Singularity’s compiler is able to in-line some of these routines, thereby safely moving operations that would traditionally run in a kernel into a user process.

4.2 Kernel

The Singularity kernel is a privileged system component that controls access to hardware resources, allocates and reclaims memory, creates and schedules threads, provides intraprocess thread synchronization, and manages I/O. It is written in a mixture of safe and unsafe C# code and runs in its own garbage collected object space. In addition to the usual mechanism of message-passing channels, processes communicate with the kernel through a strongly versioned application binary interface (ABI) that invokes static methods in kernel code. This interface follows the design of the rest of the system and isolates the kernel and process object spaces. All parameters to this ABI are values, not pointers, so the kernel and process’s garbage collectors need not coordinate. The only exception is the location of the ABI methods. Singularity’s garbage collectors currently do not relocate code, but if they did, they would need to maintain the invariant that these methods remain at known addresses. The ABI maintains the system-wide state isolation invariant: a process cannot alter the state of another process using the ABI. With only two exceptions, an ABI call affects only the state of its calling process. The two exceptions alter the state of a child process before or after it executes, but not during execution. The first is a call to create a child process, which specifies the code loaded for the child before it begins execution. The second is a call to stop a child process, which reclaims its resources after all threads cease execution. State isolation ensures that a Singularity process has sole control over its state.

4.3 Scheduler

The Singularity scheduler is optimized for a large number of threads that communicate frequently. The scheduler maintains two lists of runnable threads. The first, called the unblocked list, contains threads that have recently become runnable. The second, called the preempted list, contains runnable threads that have been preempted. When choosing the next thread to run, the scheduler removes threads from the unblocked list in FIFO order. When the unblocked list is empty, the scheduler removes the next thread from the preempted list. Whenever a scheduling timer interrupt occurs, all threads in the unblocked list are moved to the end of the preempted list, followed by the thread that was running when the timer fired. The first thread from the preempted list is then scheduled and the scheduling timer is reset. The net effect of the two-list scheduling policy is to favor threads that are awoken by a message, do a small amount of work, send one or more messages to other processes, and then block waiting for a message. This is a common behavior for threads running message handling loops.
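
The two-list policy can be modeled in a few lines. The following is a toy sketch, not Singularity’s scheduler: the class and method names are invented for illustration, threads are represented by plain strings, and resetting the timer is not modeled.

```python
from collections import deque


class TwoListScheduler:
    """Toy model of the two-list scheduling policy described above."""

    def __init__(self):
        self.unblocked = deque()   # threads that recently became runnable
        self.preempted = deque()   # runnable threads that were preempted
        self.running = None

    def wake(self, thread):
        """A blocked thread became runnable, e.g. a message arrived for it."""
        self.unblocked.append(thread)

    def block(self):
        """The running thread blocks waiting for a message."""
        self.running = None

    def pick_next(self):
        """Take from the unblocked list first (FIFO), then the preempted list."""
        if self.unblocked:
            self.running = self.unblocked.popleft()
        elif self.preempted:
            self.running = self.preempted.popleft()
        else:
            self.running = None
        return self.running

    def timer_interrupt(self):
        """Move every unblocked thread, then the running thread, to the end
        of the preempted list, and run the first preempted thread."""
        self.preempted.extend(self.unblocked)
        self.unblocked.clear()
        if self.running is not None:
            self.preempted.append(self.running)
        self.running = self.preempted.popleft() if self.preempted else None
        return self.running
```

In this model a thread that wakes up, handles one message, and blocks again is always served ahead of the preempted, compute-bound threads, which is the behavior the policy is meant to favor.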

4.4 Exchange Heap

The Exchange Heap, which underlies efficient communication in Singularity, holds data passed between processes. The Exchange Heap is not garbage collected, but instead uses reference counts to track usage of blocks of memory (Figure above) called regions. A process accesses a region through a structure called an allocation. Allocations within the Exchange Heap are owned by at most one process at a time, with ownership enforced by static verification. Allocations may be split; for example, protocol processing code in a network stack can strip protocol headers off a packet and hand the payload to an application without copying the packet.
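
The packet example above can be sketched as follows. This is an illustration under stated assumptions, not Singularity’s implementation: the Allocation class and the process names are hypothetical, region reference counts are not modeled, and ownership is checked at run time here, whereas Singularity enforces it by static verification. Python’s memoryview stands in for a window onto a region, so that splitting creates no copy.

```python
class Allocation:
    """Hypothetical handle onto part of a shared region."""

    def __init__(self, view, owner):
        self._view = view      # memoryview: a window onto the region, no copy
        self.owner = owner

    def _check(self, process):
        if self._view is None:
            raise ValueError("allocation is no longer valid")
        if process != self.owner:
            raise PermissionError(f"{process} does not own this allocation")

    def read(self, process):
        """Return the bytes currently visible through this allocation."""
        self._check(process)
        return bytes(self._view)

    def transfer(self, process, new_owner):
        """Hand exclusive ownership to another process."""
        self._check(process)
        self.owner = new_owner

    def split(self, process, offset):
        """Split into two allocations over the same memory, without copying."""
        self._check(process)
        head = Allocation(self._view[:offset], self.owner)
        tail = Allocation(self._view[offset:], self.owner)
        self._view = None      # the original handle is consumed by the split
        return head, tail


region = bytearray(b"HDR!payload")              # stands in for a heap region
packet = Allocation(memoryview(region), owner="netstack")
header, payload = packet.split("netstack", 4)   # strip a 4-byte header
payload.transfer("netstack", "app")             # the app now owns the payload
```

After the transfer, only "app" can read the payload; the network stack keeps the header, and the original packet handle can no longer be used.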

4.5 Processes

A Singularity system lives in a single virtual address space. Virtual memory hardware is used to protect pages, for example by mapping out the first 16K of address space to trap null pointer references. Within a Singularity system, the address space is logically partitioned into: a kernel object space, an object space for each process, and the Exchange Heap for channel data. A pervasive design decision is the memory independence invariant: cross-object space pointers only point into the Exchange Heap. In particular, the kernel does not have pointers into a process’s object space, nor does one process have a pointer to another process’ objects. This invariant ensures that each process can be garbage collected and terminated without the cooperation of other processes. The kernel creates a process by allocating memory sufficient to load an executable image from a file stored in Microsoft’s portable executable (PE) format. Singularity then performs relocations and fixups, including linking kernel ABI functions. The kernel starts the new process by creating a thread running at the image’s entry point, which is trusted thread startup code that calls the stack and page manager to initialize the process. A process obtains additional address space by calling the kernel’s page manager, which returns new, unshared pages. These pages need not be adjacent to the process’s existing address space, since the garbage collectors do not require the address space be contiguous, though they may need contiguous regions for large objects or arrays. In addition to memory, which holds the process’s code and heap data, a process has a stack per thread and can access the Exchange Heap.

4.6 Garbage Collection

Garbage collection is an essential component of most safe languages, as it prevents memory deallocation errors that can subvert safety guarantees. In Singularity, kernel and process object spaces are garbage collected. The large number of garbage collection algorithms and experience strongly suggest that no one garbage collector is appropriate for all system or application code. Singularity’s architecture decouples the algorithm, data structures, and execution of each process’s garbage collector, so it can be selected to accommodate the behavior of code in the process and to run without global coordination. The four aspects of Singularity that make this possible are:

1. each process is a closed environment with its own run-time support;
2. pointers do not cross process or kernel boundaries, so collectors need not consider cross-space pointers;
3. messages on channels are not objects, so agreement on memory layout is only necessary for messages and other data in the Exchange Heap;
4. the kernel controls memory page allocation, which provides a nexus for coordinating resource allocation.

Singularity’s run-time systems currently support five types of collectors: generational semi-space, generational sliding compacting, an adaptive combination of the previous two collectors, mark-sweep, and concurrent mark-sweep. During a collection, the concurrent mark-sweep collector stops each thread to scan its stack, which introduces a pause time of less than 100 microseconds for typical stacks. The overhead of this collector is higher than that of non-concurrent collectors, so Singularity uses a simpler non-concurrent mark-sweep collector in applications. Each SIP has its own collector that is solely responsible for collection of objects in its object space. From the garbage collector’s perspective, when a thread of control enters or leaves an application (or the kernel), it is treated similarly to a call to or a call-back from native code in conventional garbage collected environments. Garbage collection for different object spaces can therefore be scheduled and run completely independently.
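
To make the independence of collectors concrete, here is a minimal sketch of a non-concurrent mark-sweep pass over a single object space. The ObjectSpace class and its fields are hypothetical and objects are reduced to identifiers; the point is only that, because no pointer leaves the space, the collector never needs to look at any other process.

```python
class ObjectSpace:
    """Toy per-process heap: object id -> ids of the objects it references."""

    def __init__(self, name):
        self.name = name
        self.objects = {}    # object id -> list of referenced object ids
        self.roots = set()   # ids reachable from thread stacks and globals

    def allocate(self, obj_id, refs=()):
        self.objects[obj_id] = list(refs)

    def collect(self):
        """Non-concurrent mark-sweep confined to this object space."""
        marked = set()
        stack = list(self.roots)
        while stack:                           # mark phase
            obj = stack.pop()
            if obj in marked:
                continue
            marked.add(obj)
            stack.extend(self.objects[obj])
        garbage = set(self.objects) - marked   # sweep phase
        for obj in garbage:
            del self.objects[obj]
        return garbage


app = ObjectSpace("app")
app.allocate("request", refs=["buffer"])
app.allocate("buffer")
app.allocate("stale", refs=["request"])   # nothing refers to "stale"
app.roots.add("request")

driver = ObjectSpace("driver")
driver.allocate("queue")

freed = app.collect()   # runs without touching the driver's space
```

Each space could equally use a different algorithm behind the same collect() method, which mirrors the per-process choice of collector described above.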

4.7 Channels

Singularity processes communicate exclusively by sending messages over channels. Channel communication is governed by statically verified channel contracts that describe messages, message argument types, and valid message interaction sequences as finite state machines. Messages are tagged collections of values or message blocks in the Exchange Heap that are transferred from a sending to a receiving process. These primitives enforce much stronger semantics than the low-level IPC mechanisms of a typical microkernel. Channel endpoints can be sent in messages over channels. Thus, the communication network can evolve dynamically while conforming to the explicit communication invariant. Sending and receiving on a channel requires no memory allocation. Sends are non-blocking and non-failing; receives block synchronously until a message arrives or the send endpoint is closed. A process creates a channel by invoking a contract’s static NewChannel method, which returns the channel’s two endpoints. The process can pass either or both endpoints to other processes over existing channels. When data or endpoints are sent over a channel, ownership passes from the sending process, which may not retain a reference, to the receiving process. This ownership invariant maintains the state isolation invariant and is enforced by the language using linear types and by the run-time systems.
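
A channel contract of the kind described above can be pictured as a table of states and permitted messages. The sketch below is illustrative only: the contract, its state and message names, and the Channel class are hypothetical, both directions share one queue for brevity, and conformance is checked at run time, whereas Singularity verifies contracts statically.

```python
from collections import deque

# Hypothetical contract for a lookup service, written as a finite state
# machine: state -> {message tag: next state}.
LOOKUP_CONTRACT = {
    "Ready":   {"Lookup": "Pending"},
    "Pending": {"Found": "Ready", "NotFound": "Ready"},
}


class ContractViolation(Exception):
    """Raised when a message is not allowed in the current contract state."""


class Channel:
    """Toy channel that checks every message against its contract."""

    def __init__(self, contract, start):
        self.contract = contract
        self.state = start
        self.queue = deque()

    def send(self, tag, payload=None):
        """Non-blocking send: enqueue the message if the contract allows it."""
        allowed = self.contract[self.state]
        if tag not in allowed:
            raise ContractViolation(
                f"{tag!r} is not allowed in state {self.state!r}")
        self.state = allowed[tag]
        self.queue.append((tag, payload))

    def receive(self):
        """Return the oldest message, or None (a real receive would block)."""
        return self.queue.popleft() if self.queue else None


channel = Channel(LOOKUP_CONTRACT, start="Ready")
channel.send("Lookup", "disk0")      # Ready -> Pending
try:
    channel.send("Lookup", "disk1")  # a second request before the reply
    rejected = False
except ContractViolation:
    rejected = True
channel.send("Found", 42)            # Pending -> Ready
```

The second Lookup is rejected because the contract requires a reply first; in Singularity such a sequence would be caught by verification before the program runs.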

Chapter 5: Performance

If Singularity’s goal is more dependable systems, why does this report include performance measurements? The answer is simple: these numbers demonstrate that the proposed architecture not only avoids a performance penalty, but is often as fast as or faster than more conventional architectures. In other words, it is a practical basis on which to build a system.

Cost of basic operations

This section contains microbenchmarks comparing the performance of Singularity channel operations against other systems. All systems ran on an AMD Athlon 64 3000+ (1.8 GHz) with an NVIDIA nForce4 Ultra chipset and 1 GB of RAM. The research team used Red Hat Fedora Core 4 (kernel version 2.6.11-1.1369 FC4) and Windows XP (SP2). The table below reports the cost of simple operations. On the Linux system, the process kernel call was getpid(); on Windows, it was GetProcessId(); and on Singularity, it was ProcessService.GetCyclesPerSecond(). All these calls operate on a readily available data structure in the respective kernels. The Linux thread test ran on user-space scheduled pthreads; kernel-scheduled threads performed significantly worse. The “wait-set ping pong” test measured the cost of switching between two threads in the same process through a synchronization object. The “2 message ping pong” test measured the cost of sending a 1-byte message from one process to another and then back to the original process. On Linux, the test used sockets; on Windows, a named pipe; and on Singularity, a channel.

Table: Cost of basic operations (CPU cycles)

Operation	Singularity	Linux	Windows
Process kernel call	78	324	445
Thread yield	401	900	763
2 thread wait-set ping pong	2,156	2,390	3,554
2 thread message ping pong	2,462	10,758	12,806

As the numbers show, Singularity is quite competitive with systems that have been tuned for years. Particularly encouraging are the message ping pong numbers on Singularity, which show that using a channel between distinct processes is only about 15% slower than the wait-set ping pong between two threads of the same process. Channels on Singularity outperform the mechanisms available on the other systems by factors of 4 to 5. These improvements do not even factor in the ability to pass pointers rather than copies of memory blocks.
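
The ratios quoted in the text follow directly from the cycle counts in the table. The short calculation below reproduces them; the dictionary keys and function names are chosen here for illustration, while the numbers are copied from the table.

```python
# Cycle counts copied from the table of basic operation costs.
COST = {
    "process kernel call": {"Singularity": 78,   "Linux": 324,   "Windows": 445},
    "thread yield":        {"Singularity": 401,  "Linux": 900,   "Windows": 763},
    "wait-set ping pong":  {"Singularity": 2156, "Linux": 2390,  "Windows": 3554},
    "message ping pong":   {"Singularity": 2462, "Linux": 10758, "Windows": 12806},
}


def speedup(operation, other_system):
    """How many times more cycles the other system needs than Singularity."""
    row = COST[operation]
    return row[other_system] / row["Singularity"]


def channel_overhead():
    """Extra cost of a cross-process channel round trip on Singularity,
    relative to a same-process wait-set round trip."""
    message = COST["message ping pong"]["Singularity"]
    wait_set = COST["wait-set ping pong"]["Singularity"]
    return message / wait_set - 1


linux_factor = round(speedup("message ping pong", "Linux"), 2)      # 4.37
windows_factor = round(speedup("message ping pong", "Windows"), 2)  # 5.2
overhead_percent = round(channel_overhead() * 100, 1)               # 14.2
```

The message ping pong factors of 4.37 and 5.2 are the “factors of 4 to 5”, and the 14.2% overhead is what the text rounds to about 15%.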

Disk I/O Benchmarks

To quantify the effect of Singularity’s architecture on I/O, the research team measured the cost of random and sequential disk reads and writes on the various operating systems. The sequential tests read or wrote 512MB of data from the same portion of the hard disk. The random read and write tests performed 1000 operations on the same sequences of blocks on the disk. The tests were single threaded and performed synchronous raw I/O. Each test was run seven times and the results were averaged. All benchmarks ran on the same hardware. On Singularity, the benchmark communicated with the disk driver process over a channel, whereas FreeBSD, Linux, and Windows XP use system calls to communicate with their drivers. The following graphs* show the throughput of the systems in I/O operations per second. For random read operations, Singularity’s performance was within 10% of the UNIX variants and marginally better than Windows. For random write operations, Singularity had the highest performance for a majority of block sizes. It is interesting to note that all systems had higher throughput in the random write measurements than in the random read measurements. For the sequential write operations, each of the systems was the best performer for at least one of the block sizes below 8KB. (FreeBSD failed to complete the test with a block size of 512 bytes: performance dropped to 50 operations per second and the test did not finish within a reasonable period of time.) At block sizes above 8KB, FreeBSD again achieved the highest performance, with a margin of 6% between the best and worst performers for each block size.

(*Please refer to the graphs.)

The process of tree shaking

Sealed processes offer improved opportunities for static analysis because all of the code that will run in a process is known before the process begins execution. Static analysis is available to any process architecture, but sound static analysis of a complete process is possible only when the entire process code is fixed.

One example of the type of static analysis enabled by sealed processes is whole-process tree shaking. The Bartok compiler creates a tree of all of the code available within a process. It then safely eliminates (that is, shakes out) fields, methods, and classes that are unused in all possible executions of the process. As shown in the table above, tree shaking can reduce program code size by as much as 75%.
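
The idea behind tree shaking can be sketched as a reachability analysis over a call graph. This is not Bartok’s algorithm, which also handles fields and classes; the function, the method names, and the graph below are hypothetical. The sketch is sound only under the sealed-process assumption that the graph contains all code the process can ever run.

```python
def shake(call_graph, entry_points):
    """Split methods into those reachable from the entry points and the rest.

    call_graph maps each method name to the methods it may call.
    Returns (kept, removed) as two sets of method names.
    """
    kept = set()
    worklist = list(entry_points)
    while worklist:
        method = worklist.pop()
        if method in kept:
            continue
        kept.add(method)
        worklist.extend(call_graph.get(method, ()))
    removed = set(call_graph) - kept
    return kept, removed


# Hypothetical process image: two methods are never reachable from Main.
call_graph = {
    "Main": ["Parse", "Run"],
    "Parse": ["ReadByte"],
    "Run": [],
    "ReadByte": [],
    "DebugDump": ["ReadByte"],
    "LegacyImport": ["Parse"],
}
kept, removed = shake(call_graph, ["Main"])
```

If the process could load code dynamically, a later module might call DebugDump, and removing it would no longer be safe; this is why the analysis depends on the process being sealed.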

Conclusion

Singularity is above all a laboratory for exploring interactions among domains like system architecture, programming languages, compilers, specification, and verification. Singularity is small and well structured, so it is possible to make changes that span the arbitrary boundaries between these domains. At the same time, it is large and realistic enough to demonstrate the practical advantages of new techniques.
