The primary goal of this research is to improve the scalability and
robustness of the Linux operating system to support greater network server
workloads more reliably.
We are specifically interested in single-system scalability, performance,
and reliability of network server infrastructure products running on Linux,
such as LDAP directory servers, IMAP electronic mail servers, and web
servers, among others.
Summary
We're becoming more familiar with the Linux development
process, and Linux performance and scalability issues.
We're continuing to reach out to potential sponsors.
Work continues on long-term projects.
Milestones
-
The Intel-donated SC450NX has replaced the Dell PowerEdge
6300/450 in our test harness.
Four 10,000 RPM 9.1 GB disk drives have been added to allow us
to begin testing Linux software RAID support, and to
include the Netscape Messaging Server in our benchmark
efforts.
Software configuration of the Messaging Server and
SPECWeb99 is underway.
We intend to add Gigabit Ethernet networking to our
test harness soon.
-
Andy continues to develop an NFSv4 implementation,
now based on the Linux 2.2.10 NFSv3 implementation.
In addition to the multi-RPC work he's already completed,
Andy has added support for NFSv4 mount and the readdir RPC.
In October, he participated in a Sun-sponsored NFSv4 bake-off,
where his client implementation successfully mounted servers
running on several operating systems.
-
Niels and Chuck ported the get_unused_fd() enhancements
to 2.3.17, hoping to help ease lock contention in the
2.3 kernel by making fd allocation faster.
Initial results show a measurable performance gain,
but not as much as we'd hoped.
We are still planning some heavily loaded web server benchmarks
to finish this study.
-
Chuck continues to work on an madvise() system call and
read-ahead for mmap-ed files.
He's also researching a new inode allocation model to help make
socket allocation more efficient.
-
Peter wrote a paper describing the Linux Scalability Project
for the upcoming NLUUG conference in Amsterdam, entitled
"Linux en Open Source."
He is spending a week there in November presenting the paper
and discussing funding with potential sponsors, as well as
discussing the project with principal Linux developers.
-
A new directory server, based on the Netscape Directory
Server product, has been set up at CITI for use as a research
vehicle for projects like this one, and other security-related
projects ongoing at CITI.
-
GSRA Olga Kornievskaia has joined the project.
She plans to focus on incorporating the Netscape Directory
Server into our security environment by implementing an
access control system using LDAP.
-
Steve continues to explore generic solutions to thundering herd
issues in the Linux TCP stack.
Jonathan is developing a guide to help server application
developers effectively use the POSIX RT signal API.
This API can allow greater performance and efficiency when
an application must manage thousands of open sockets.
-
Sun is very pleased with progress on the Linux
NFSv4 implementation, and has indicated they
want to continue sponsoring the LSP.
Red Hat continues to respond positively
to our overtures for project funding.
IBM has approached us regarding funding possibilities.
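To illustrate the madvise() item above, here is a minimal sketch, in Python rather than kernel C, of how an application might hint sequential access to an mmap-ed file so that the kernel can read ahead. It assumes Python 3.8 or later on a system that exposes MADV_SEQUENTIAL. The helper names are hypothetical, and this is not the implementation under development in this project.

```python
import mmap
import os
import tempfile


def sum_bytes_sequentially(path: str) -> int:
    """Map a file read-only, hint sequential access, and sum its bytes."""
    with open(path, "rb") as f:
        size = os.fstat(f.fileno()).st_size
        if size == 0:
            return 0  # mmap cannot map an empty file
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
            # The hint is advisory only: the kernel may read ahead more
            # aggressively, but the data returned is the same either way.
            if hasattr(m, "madvise") and hasattr(mmap, "MADV_SEQUENTIAL"):
                m.madvise(mmap.MADV_SEQUENTIAL)
            total = 0
            page = mmap.PAGESIZE
            # Walk the mapping one page at a time, front to back.
            for offset in range(0, size, page):
                total += sum(m[offset:offset + page])
            return total


def write_sample_file(num_bytes: int) -> str:
    """Create a temporary file filled with a repeating 0..255 pattern."""
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as f:
        f.write(bytes(i % 256 for i in range(num_bytes)))
    return path
```

Because the hint never changes results, an application can apply it unconditionally wherever it is available and fall back silently where it is not.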
Challenges
Linux Development Road Map
The actively maintained Linux kernel releases always arrive in pairs:
an even-numbered "stable" kernel release which is meant for
production systems and reliable desktop applications; and an
odd-numbered "development" kernel release, which has new features and
major overhauls underway.
The development kernel is supposed to be more advanced than its
counterpart stable kernel, and will one day become the next stable
release.
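The even/odd numbering convention can be expressed as a small check. This is a sketch, assuming version strings of the form MAJOR.MINOR.PATCH as used by the 2.x kernels; the function names are illustrative only.

```python
def kernel_series(version: str) -> str:
    """Classify a kernel version as 'stable' or 'development'.

    Uses the even/odd minor-number convention: an even minor number
    marks a stable series, an odd one a development series.
    """
    parts = version.split(".")
    if len(parts) < 2:
        raise ValueError(f"expected MAJOR.MINOR[.PATCH], got {version!r}")
    minor = int(parts[1])
    return "stable" if minor % 2 == 0 else "development"


def next_stable_series(version: str) -> str:
    """Return the stable series that a given kernel belongs to or leads to."""
    major, minor = (int(p) for p in version.split(".")[:2])
    if minor % 2 == 0:
        return f"{major}.{minor}"      # already a stable series
    return f"{major}.{minor + 1}"      # development feeds the next even series
```

For example, kernel_series("2.3.17") returns "development", and next_stable_series("2.3.17") returns "2.4".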
Today, a Linux development kernel becomes a stable
release in an almost arbitrary way.
Linus decides how long the release development cycle will take,
and then adds new features until he's ready
to freeze the code base for the stable release.
Then, after a stable release arrives, we all look over our shoulders
and ask ourselves what we accomplished.
In fact, the "stable" release isn't really stable at all, since
features and new drivers continue to be added.
One of the 2.2 kernel's biggest problems is that many of the features
which were added in the 2.1 kernel were not finished when 2.2.0 arrived.
Specific examples of this phenomenon include:
-
ext3 -
Journalling, ACLs, and extent-based allocation have
been promised since 2.1, but have yet to materialize even in
a development kernel.
-
large fdsets -
For a while, it was a big question whether large fdset support
was in or out of 2.2.
Support that had been available during the 2.1 development cycle
was eventually added to 2.2, but doing so broke the rules about
what belongs in a stable release rather than a development kernel.
-
NFS -
Support for NFS in the stock 2.2 kernel is not production ready;
a patch is required simply to use it, in many instances.
-
RAID -
Many of us remember a long (months?) argument on the linux-kernel
mailing list about whether to add a RAID patch that broke user-space
RAID management tools in a stable release.
Why was a dysfunctional RAID system released in a stable kernel
in the first place?
Commercial enterprises want to know when they can expect
certain features to appear and become stable in any kernel.
Kernel developers, however, have been arbitrarily adding such features
as large memory support and application-level access to raw block I/O.
It would be better to provide some codified mechanism for determining
whether and when features will get into the kernel, in a way that preserves
both kernel developer sanity and kernel release integrity.
One thing that I would like to see integrated into
the Linux kernel development process,
either socially or technologically,
is a somewhat formal long-term feature development plan.
A development plan, and especially a process for reaching consensus
about it, would help the community decide what features should be added
in a development cycle, and what qualifies each feature as ready for a stable
release (that is, what tests the kernel has to pass before an odd-numbered
release becomes an even-numbered one).
Especially because Linux kernel development is a distributed process,
it would be useful to have shared information about:
-
Who is working on what new feature?
Knowing who is working on what will prevent redundant
effort, and identify what still needs to be done.
Redundant effort is an oft-experienced problem in the
Linux community, and can be demoralizing for those who
are attempting to get involved.
-
When are the results of a project expected to appear?
This will allow those who are overseeing the whole development
effort to delegate work, and to plan what will appear in each
release.
It also offers some guide to beta testers, Linux end users,
and corporations who are interested in particular features.
-
What criteria will be used to judge the readiness of a particular
feature?
Having such a list of criteria can drive beta testing, and can identify
critical behaviors and performance areas that need to be in place before
a development kernel can be renamed as a stable release.
-
What is expected to appear in each major kernel release?
This makes it easier to dispel FUD, and provides
some guidelines about what needs to be tested during
quality assurance.
It also opens the possibility of integrating sets of features
in each new release, instead of piling in a jumble of
features that may not be useful because other, dependent, features
aren't yet present.
The idea of developing a list of release requirements by
building a consensus, then publicly announcing those requirements,
may not appeal to some members of the development community.
After all, developing the Linux kernel has always been a hobby,
not a professional obligation;
having a system by which to set goals and determine progress
feels like Big Brother.
For many, comparing progress to a list of goals is a mixed bag
of "shoulds," "could haves," and "we made its."
Volunteer developers may feel that this should be the job of
the commercial Linux distributors, as quality assurance and
performance testing are.
I believe that, like QA and performance work, release integrity and
feature plans must be part of the process from its first steps.
The "cathedral"-style approach described above may conflict with
the Linux community's devotion to "bazaar" development processes.
In order to make a development planning process work well, these
anxieties must be met with positive and creative solutions,
not glib party-line answers.
There is no good reason that such a planning process has to add
to the stress that already plagues kernel developers.
Ideally, a good planning process can ease developers' Linux-related
workload by adding a useful channel for communication and decision-making.
If you have comments or suggestions, email
linux-scalability @ citi.umich.edu