
An abstract filesystem namespace

Alternative Title: An implementation critique of the Filesystem Hierarchy Standard


Background

An ever-present frustration I experience while using Linux is what I refer to as PATH hell. Linux packages typically make many assumptions about the layout of a filesystem, and in turn are pretty choosy about how they are installed. On the other hand, filesystems are implemented in a fairly concrete fashion: what you see in your directory hierarchy is the way it is actually stored. On its face, this makes a lot of sense. Packages frequently need to interface with other packages and typically arrange to have files placed in well-known locations. Further, system administrators and users alike benefit from having an intuitive organization of package files. This is what the Filesystem Hierarchy Standard (hereafter “FHS”) aims to standardize.

However, there are many cross-cutting concerns when it comes to deciding where files should go. We ask a lot from our filesystem. It is supposed to:

Suffice it to say, there are many real-world situations that fail to fit in the neat box that the FHS describes. What’s more, the FHS freely admits that it simply describes the prevailing convention. Many packages ignore the FHS entirely, and instead recommend that users extend their PATH.

If you ask me, this whole situation is a big mess, and it would provide a lot of benefit to treat system-oriented special file locations and user-oriented filesystem management on their own terms. This is especially true on systems where the system administrator and user are the same person, as is often the case with a personal computer.

The FHS under a microscope

The FHS exists primarily to describe where files should go based on where commonly installed programs expect to find them. For example, shell scripts and many other programs depend on a shell existing at /bin/sh (or /bin/bash). A handful of paths (such as /dev/null and /tmp) are also prescribed by POSIX, SUS, and other standards, although POSIX itself deliberately leaves the location of sh unspecified.

One fortunate thing about the FHS is that, at its core, there are approximately three root directories of all well-known paths: /usr, /etc, and /var.

Popular distributions of Linux have many more roots than just these three. I am not considering them for the following reasons:

Decoupling the FHS from the physical filesystem

It may not look like one, but something like /bin/bash (which so often appears in a shebang) is a URI just the same as https://gitlab.com/ is. The same goes for standardized search paths (especially the infamous PATH). This strongly suggests to me, an extremely online software engineer, that the same trick that web servers have been using for decades could also be applied to the filesystem.

Note that historically, the path in a URL corresponded to a real path on the server’s filesystem. Today, dynamic content may not correspond to a physical storage location at all, and the path is left to describe a hierarchical resource identifier.

We can do this on the local filesystem as well.
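To make the web-routing analogy concrete, here is a minimal Python sketch of a resolver that treats a well-known path as an identifier and looks up where the file physically lives. The routing table, the /pkgs/... directories, and the resolve function are all hypothetical names chosen for illustration; they are not part of the FHS or of any existing tool.

```python
import posixpath

# Hypothetical routing table: each virtual directory is backed by an
# ordered list of physical directories (first match wins).
ROUTES = {
    "/usr/bin": ["/pkgs/bash-5.2/bin", "/pkgs/coreutils-9.4/bin"],
    "/usr/share/man": ["/pkgs/bash-5.2/share/man"],
}


def resolve(virtual_path, routes, exists):
    """Map a virtual path to a physical one, or return None.

    `exists` is a callable used to probe physical paths, so the sketch
    can be exercised without touching the real filesystem.
    """
    virtual_path = posixpath.normpath(virtual_path)
    # Try the longest matching prefix first, much like a web router.
    for prefix in sorted(routes, key=len, reverse=True):
        if virtual_path == prefix or virtual_path.startswith(prefix + "/"):
            rest = virtual_path[len(prefix):].lstrip("/")
            for root in routes[prefix]:
                candidate = posixpath.join(root, rest) if rest else root
                if exists(candidate):
                    return candidate
    return None


# Pretend these are the only files on disk.
fake_disk = {"/pkgs/bash-5.2/bin/bash", "/pkgs/coreutils-9.4/bin/ls"}
print(resolve("/usr/bin/ls", ROUTES, fake_disk.__contains__))
# prints /pkgs/coreutils-9.4/bin/ls
```

The caller only ever sees /usr/bin/ls; which package directory actually answers the request is a detail of the routing table, just as it is for a URL served by a web application.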

/usr is ripe for abstraction because of its static nature. The only application that ever writes into it is the package manager. A big downside of the /usr hierarchy is that it commingles files from various packages, which makes it incredibly painful to install or remove packages that your package manager doesn’t support, or to do away with the package manager entirely.

/etc is the primary place for system-level configuration. Contrary to the description in the FHS, which treats /etc as static and unshareable, there are numerous examples of files that are shareable (like /etc/services), and what’s worse, many examples of variable files (like /etc/resolv.conf and /etc/ld.so.cache) that should really be stored in their rightful physical locations and symlinked in.

/var is a bit of a sticky wicket because applications expect to be able to write to it arbitrarily. Leaving it as-is is mostly okay, because the vast majority of applications do well to segregate their data in eponymous directories. It can, however, introduce problems if there is a desire to run multiple versions of the same package concurrently.

Are traditional package managers still useful?

A project I have the itch to make is quite similar in plumbing to Homebrew. Homebrew extracts packages into a location called the “Cellar”, which is organized such that each package and each version has its own directory; the requisite symlinks are then added to /usr/local/bin and other directories to virtualize the installed configuration. This greatly simplifies maintenance in a number of ways: it lets brew doctor effectively diagnose installation problems, it allows multiple versions of a package to be installed in parallel, and it makes it obvious from a simple ls -l which package owns which file.
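The Cellar arrangement can be sketched in a few lines of Python. This is not Homebrew’s actual implementation; the link_package and owner_of helpers, the package name hello, and the temporary directories are assumptions made for the sake of a runnable example.

```python
import os
import tempfile


def link_package(cellar, name, version, prefix):
    """Symlink every file in <cellar>/<name>/<version>/bin into <prefix>/bin."""
    src_bin = os.path.join(cellar, name, version, "bin")
    dst_bin = os.path.join(prefix, "bin")
    os.makedirs(dst_bin, exist_ok=True)
    linked = []
    for entry in sorted(os.listdir(src_bin)):
        link = os.path.join(dst_bin, entry)
        if os.path.islink(link):
            os.remove(link)  # replace a link left by another version
        os.symlink(os.path.join(src_bin, entry), link)
        linked.append(link)
    return linked


def owner_of(link, cellar):
    """Recover (package, version) from a link target, as a reader of `ls -l` would."""
    target = os.readlink(link)
    package, version = os.path.relpath(target, cellar).split(os.sep)[:2]
    return package, version


# Demo in a throwaway directory; nothing outside it is touched.
with tempfile.TemporaryDirectory() as tmp:
    cellar = os.path.join(tmp, "Cellar")
    prefix = os.path.join(tmp, "prefix")
    for version in ("1.0", "2.0"):  # two versions installed side by side
        bin_dir = os.path.join(cellar, "hello", version, "bin")
        os.makedirs(bin_dir)
        open(os.path.join(bin_dir, "hello"), "w").close()
    (link,) = link_package(cellar, "hello", "2.0", prefix)
    print(owner_of(link, cellar))  # prints ('hello', '2.0')
```

Because both versions stay on disk under their own directories, switching versions is only a matter of repointing the link, and ownership can be read straight off the link target.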

I want to take this idea a step further. I want the user to be able to place a package into any number of special folders (which could be on any physical device), and have the system react (via filesystem events) by updating the /usr and /etc trees to match. The ideal way to achieve this is to provide a new usrfs filesystem driver for /usr (à la procfs for /proc) and a corresponding etcfs for /etc.
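Reacting to a filesystem event could be as simple as recomputing the merged tree and diffing it against the previous one. A rough Python sketch of that bookkeeping follows. It is not a filesystem driver, and merged_view, diff_views, and the rule that earlier folders win on conflict are assumptions for illustration rather than a description of how usrfs would actually work.

```python
import os
import tempfile


def merged_view(package_dirs):
    """Return {relative path: physical path} for the union of all packages.

    Earlier directories win on conflict; that policy is an assumption.
    """
    view = {}
    for pkg in package_dirs:
        for dirpath, _dirnames, filenames in os.walk(pkg):
            for filename in filenames:
                physical = os.path.join(dirpath, filename)
                relative = os.path.relpath(physical, pkg)
                view.setdefault(relative, physical)
    return view


def diff_views(old, new):
    """Compute what a usrfs-like layer would have to add, drop, or repoint."""
    added = {k: new[k] for k in new.keys() - old.keys()}
    removed = sorted(old.keys() - new.keys())
    changed = {k: new[k] for k in new.keys() & old.keys() if new[k] != old[k]}
    return added, removed, changed


with tempfile.TemporaryDirectory() as tmp:
    bash = os.path.join(tmp, "bash-5.2")
    os.makedirs(os.path.join(bash, "bin"))
    open(os.path.join(bash, "bin", "bash"), "w").close()

    before = merged_view([])     # nothing dropped into the folder yet
    after = merged_view([bash])  # the user drops a package in
    added, removed, changed = diff_views(before, after)
    print(sorted(added))         # relative paths that should now appear under /usr
```

A real driver would answer lookups directly from a structure like this instead of materializing anything on disk, which is what makes a separate installation step unnecessary.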

What is wrong with just symlinking?

POSIX supports symlinks and hardlinks, which can be used to provide an indirected hierarchy, but they are of limited use because they merely move the problem to managing the links. For example, symlinks don’t provide any facility to merge directory trees; you must instead symlink on a per-file basis, and somehow keep track of changes made to the physical tree. Homebrew manages this by making it part of the install step, but I’d like to eliminate “installation” as a step entirely.
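To illustrate the bookkeeping burden, the following Python sketch builds a per-file symlink farm and then shows how it drifts once the physical tree changes. link_tree and find_dangling are hypothetical helpers written for this example, not functions from Homebrew or any other tool.

```python
import os
import tempfile


def link_tree(physical, virtual):
    """Mirror `physical` under `virtual`, one symlink per file."""
    for dirpath, _dirnames, filenames in os.walk(physical):
        relative = os.path.relpath(dirpath, physical)
        target_dir = os.path.normpath(os.path.join(virtual, relative))
        os.makedirs(target_dir, exist_ok=True)
        for filename in filenames:
            os.symlink(os.path.join(dirpath, filename),
                       os.path.join(target_dir, filename))


def find_dangling(virtual):
    """List symlinks under `virtual` whose targets no longer exist."""
    dangling = []
    for dirpath, _dirnames, filenames in os.walk(virtual):
        for filename in filenames:
            link = os.path.join(dirpath, filename)
            if os.path.islink(link) and not os.path.exists(link):
                dangling.append(link)
    return sorted(dangling)


with tempfile.TemporaryDirectory() as tmp:
    physical = os.path.join(tmp, "pkg")
    virtual = os.path.join(tmp, "usr")
    os.makedirs(os.path.join(physical, "bin"))
    tool = os.path.join(physical, "bin", "tool")
    open(tool, "w").close()

    link_tree(physical, virtual)

    # The physical tree changes behind the links' back.
    os.remove(tool)
    open(os.path.join(physical, "bin", "tool2"), "w").close()

    print(find_dangling(virtual))  # the link to the removed file is now stale
    print(os.path.lexists(os.path.join(virtual, "bin", "tool2")))  # prints False
```

Neither problem fixes itself: something has to rerun the linking step and prune the stale entries every time a package changes, which is exactly the installation step I would like to get rid of.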

Addendum #1 (Sep 2022)

This article was originally published as a README for a project that has since morphed into Aspen. The text has been edited so that it stands on its own as an article. If it still sounds like a project pitch, that is why.


If you found this article interesting or had any questions/critiques, please reach out to blog@a.lexg.dev and include a link to the article. In particular, please let me know if you reference this article elsewhere, so I can add a link to you.