Refactored Code (Reliability) - Linux Kernel Superiority

Linux Kernel Superiority

1. Refactored Code (Reliability)

Here is a diagram of the Linux kernel:

Figure: Layers of the Linux kernel “onion”. The Linux kernel is 50% device drivers and 25% CPU-specific code. The two inner layers are very generic. (Diagram labels: Device Drivers; Arch (CPU-specific code); Network & file systems; Init & Memory Manager; Crypto; Security.)

Notice that it is built as an onion and is composed of many discrete components. The outermost layer of the diagram is device drivers, which make up 50% of the code, and more than 75% of the kernel's code is hardware-specific. The Microsoft Windows NT kernel diagram, shown several pages back, puts all the device drivers into a little box in the lower left-hand corner, illustrating the difference between theory and reality. In fact, if Microsoft had drawn the kernel mode drivers box as 50% of the Windows NT diagram, they might have understood how a kernel is mostly hardware-specific code, and reconsidered whether it was a business they wanted to get into.

Refactoring (smoothing, refining, simplifying, polishing) is done continuously in Linux. If many drivers have similar tasks, duplicate logic can be pulled out and put into a new subsystem that can then be used by all drivers. In many cases, it isn't clear until a lot of code is written that this new subsystem is even worthwhile. There are a number of components in the Linux kernel that evolved out of duplicate logic in multiple places. This flexible but practical approach to writing software has led Linus Torvalds to describe Linux as “Evolution, not Intelligent Design.”
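As a rough illustration of the refactoring described above, the sketch below shows two toy drivers that would each have carried their own copy of a retry loop, and the shared helper that replaces both copies. It is written in Python rather than kernel C for brevity, and every name in it (retry_io, FlakyDevice, SerialDriver, DiskDriver) is hypothetical, not taken from the Linux source.

```python
from typing import Callable


def retry_io(operation: Callable[[], bytes], attempts: int = 3) -> bytes:
    """Shared helper (hypothetical): retry a flaky I/O operation.

    Before the refactoring, each driver below would have carried its own
    copy of this loop.
    """
    last_error: Exception = IOError("no attempts made")
    for _ in range(attempts):
        try:
            return operation()
        except IOError as exc:
            last_error = exc
    raise last_error


class FlakyDevice:
    """Toy device that fails a fixed number of times before succeeding."""

    def __init__(self, failures: int, payload: bytes) -> None:
        self.failures = failures
        self.payload = payload

    def read(self) -> bytes:
        if self.failures > 0:
            self.failures -= 1
            raise IOError("device busy")
        return self.payload


class SerialDriver:
    """Toy driver that relies on the shared helper with default settings."""

    def __init__(self, device: FlakyDevice) -> None:
        self.device = device

    def read(self) -> bytes:
        return retry_io(self.device.read)


class DiskDriver:
    """Toy driver that reuses the same helper with a larger retry budget."""

    def __init__(self, device: FlakyDevice) -> None:
        self.device = device

    def read_block(self) -> bytes:
        return retry_io(self.device.read, attempts=5)
```

Once the loop lives in one place, a fix to the retry logic reaches every driver that uses it, which is the benefit the paragraph above attributes to shared subsystems.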

One could argue that evolution is a sign of bad design, but evolution of Linux only happens when there is a need unmet by the current software. Linux initially supported only the Intel 80386 processor because that was what Linus owned. Linux evolved, via the work of many programmers, to support additional processors: more than Windows, and more than any other operating system ever has.

There is also a virtuous cycle here: the more code gets refactored, the less likely it is that a code change will cause a regression; the fewer regressions code changes cause, the more code can be refactored. You can think about this virtuous cycle in two different ways: clean code leads to even cleaner code, and the cleaner the code, the easier it is for the system to evolve while remaining stable.

Andrew Morton has said that the Linux codebase is steadily improving in quality, even as it has tripled in size.

Greg Kroah-Hartman, maintainer of the USB subsystem in Linux, has told me that as USB hardware design has evolved from version 1.0 to 1.1 to 2.0 over the last decade, the device drivers and internal kernel architecture have also dramatically changed. Because all of the drivers live within the kernel, when the architecture is altered to support the new hardware requirements, the drivers can be adjusted at the same time.

Microsoft doesn't have a single tree with all the device drivers. Because many hardware companies have their own drivers floating around, Microsoft is obligated to keep the old architecture around so that old code will still run. This increases the size and complexity of the Windows kernel, slows down its development, and in some cases reveals bugs or design flaws that can't even be fixed. These backward compatibility constraints are one of the biggest reasons Windows takes years to ship. The problem exists not just at the driver layer, but up the entire software stack. When code isn't freely available and in one place, it makes it hard to evolve.

While the internal logic of Linux has evolved a lot in the last ten years, the external programmer interfaces have remained constant. The key to a stable interface is incorporating the right abstractions. One of the best abstractions that Linux adopted from Unix is the file abstraction. In order to perform almost any function on a Linux computer, from reading a web page on a remote website to downloading a picture from a camera, it is necessary to simply use the standard file commands: open and close, read and write.
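To make the file abstraction concrete, here is a minimal sketch in Python, whose os module wraps the underlying open, read, write and close system calls. The same helper drains a regular file and a pipe without knowing which one it was given. The names read_all and demo, and the temporary file, are assumptions made for illustration.

```python
import os
import tempfile


def read_all(fd: int, chunk_size: int = 4096) -> bytes:
    """Read from any file descriptor until end-of-file."""
    chunks = []
    while True:
        chunk = os.read(fd, chunk_size)
        if not chunk:  # an empty read means end-of-file
            break
        chunks.append(chunk)
    return b"".join(chunks)


def demo() -> tuple:
    # 1. A regular file on disk.
    tmp_fd, path = tempfile.mkstemp()
    os.write(tmp_fd, b"hello from a file\n")
    os.close(tmp_fd)
    file_fd = os.open(path, os.O_RDONLY)
    from_file = read_all(file_fd)
    os.close(file_fd)
    os.remove(path)

    # 2. A pipe, which is not stored on disk at all.
    read_end, write_end = os.pipe()
    os.write(write_end, b"hello from a pipe\n")
    os.close(write_end)  # closing the writer lets the reader see end-of-file
    from_pipe = read_all(read_end)
    os.close(read_end)

    return from_file, from_pipe
```

The function read_all never checks what kind of object sits behind the descriptor; that indifference is what lets code outside the kernel stay the same while the code behind the interface changes.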

On my computer, in order to read the temperature of the CPU, I just need to open the (virtual) text file “/proc/acpi/thermal_zone/THM0/temperature” and the data I request is inside:²

temperature: 49 C
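Reading such a value from a program takes only a few lines. The sketch below, in Python, assumes the file holds a single line in the format shown above. The function names are hypothetical, and the path is the one from the author's machine, so it may well not exist on other systems.

```python
import re
from typing import Optional

# Path quoted in the text; it is specific to the author's machine.
THERMAL_PATH = "/proc/acpi/thermal_zone/THM0/temperature"


def parse_temperature(text: str) -> Optional[int]:
    """Extract degrees Celsius from a line such as 'temperature: 49 C'."""
    match = re.search(r"temperature:\s*(-?\d+)\s*C", text)
    return int(match.group(1)) if match else None


def read_cpu_temperature(path: str = THERMAL_PATH) -> Optional[int]:
    """Open the virtual file like any other text file and parse its contents.

    Returns None if the file is missing or not in the expected format.
    """
    try:
        with open(path, "r") as handle:
            return parse_temperature(handle.read())
    except OSError:
        return None
```

No device-specific library is involved: the program uses the same open and read calls it would use for any text file.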

In essence, the Linux kernel is a bundle of device drivers that communicate with hardware and reveal themselves as a file system.

As new features, security issues, hardware requirements and scenarios confront the Linux kernel, the internal design evolves and improves, but the file system abstraction allows code outside the kernel to remain unchanged over longer periods of time.

Here is a random sample of the change log of the Linux kernel from 2.6.14. As you can see, it is filled with all kinds of cleanup and bugfix work:

spinlock consolidation
fix numa caused compile warnings
ntfs build fix
i8042 - use kzalloc instead of kcalloc
clean up whitespace and formatting in drivers/char/keyboard.c
s3c2410_wdt.c-state_warning.patch
[SCSI] Fix SCSI module removal/device add race
[SCSI] qla2xxx: use wwn_to_u64() transport helper
[SPARC64]: Fix mask formation in tomatillo_wsync_handler()
[ARCNET]: Fix return value from arcnet_send_packet().

Many of the Linux kernel's code changes are polish and cleanup. Clean code is more reliable and maintainable, and reflects the pride of the free software community.

2 This should arguably be expressed as XML, but because there is common code that reads these values and provides them to applications, and because each file contains only one value, this problem isn't very significant; the kernel's configuration information will never be a part of a web mashup.

If you look at the code changes required to make a bugfix, in the vast majority of cases all that is needed is a revision of a few lines of code in a small number of files. A general guideline Linux has for bugfixes is this: if you can't look at the code change and prove to yourself that it fixes the problem, then perhaps the underlying code is confused, and this fix shouldn't be added near the end of a release cycle.

According to Stanford University researchers, the Linux kernel has 0.17 bugs per 1,000 lines of code, roughly 150 times fewer than average commercial code, which contains 20-30 bugs per 1,000 lines.³ Microsoft's Windows bug databases aren't available on the Internet so it is impossible to make comparisons, but even if Linux isn't more reliable already, it is set up to become so because the code is simple, well-factored, and all in one place.
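The arithmetic behind these figures can be checked directly. The short Python sketch below uses the 1,400 open bugs and 8.2 million lines of code cited in footnote 3, together with the 20-30 bugs per 1,000 lines quoted above. The function and variable names are assumptions for illustration.

```python
def bugs_per_kloc(bug_count: int, lines_of_code: int) -> float:
    """Return the number of bugs per 1,000 lines of code."""
    return bug_count / (lines_of_code / 1000)


# Figures from footnote 3: 1,400 bugs in 8.2 million lines of code.
linux_density = bugs_per_kloc(1_400, 8_200_000)

# How many times denser the quoted commercial range (20-30) is.
low_ratio = 20 / linux_density
high_ratio = 30 / linux_density

print(f"Linux: {linux_density:.2f} bugs per 1,000 lines")
print(f"Commercial code: {low_ratio:.0f}x to {high_ratio:.0f}x denser")
```

The density rounds to 0.17, and the ratio works out to roughly 117 to 176, so the “150 times” figure sits near the middle of that range.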

Within the free software community, different teams are disparate entities, and so the idea of arbitrarily moving code from one part of the system to another can't easily happen. Inside Microsoft there are no boundaries, and so code is moved around for short-term performance gains at the cost of extra complexity.

3 These studies have limited value because their tools usually analyze just a few types of coding errors. Then, they make the IT news, and get fixed quickly because of the publicity, which then makes the study meaningless. However, these tools do allow for comparisons between codebases. I believe the best analysis of the number of Linux bugs is the 1,400 bugs in its bug database, which for 8.2 million lines of code is 0.17 bugs per 1,000 lines of code. This is a tiny number, though it could easily be another 100 times smaller. Here is a link to the Linux kernel's active bugs: http://tinyurl.com/LinuxBugs.

Here are graphs of all the function calls into the OS required to return a simple web request. These pictures demonstrate a visual difference in complexity that often exists between free and proprietary software:

System call graph in Microsoft's proprietary web server, IIS.

System call graph to return a picture in the free web server Apache.

Diagrams provided by SanaSecurity.com

2. Uniform Codebase (Reliability,

In document After The Software Wars (Page 30-35)