After buying and setting up a shiny new 4TB HDD, I began slowly filling it with files from various old hard drives I have. I found a backup of a website I had worked on in high school, and decided, why not? With so much space, keeping track of all my computer history is going to be easy! The drive was already about 1TB full, so I figured I’d rather have the files here than on an old dusty spindle.
Copying the files over led to a strange “No space left on device” error. Hmm, why would this colossally large disk balk and squirm at a tiny 18GB website directory? As it turns out, I had hit a limitation in the ext4 filesystem that can cause writes to a directory to fail once it holds more than about 32,000 files. The technical reasons for this are boring and I really don’t care why; I just want to trust that my filesystem will do the right thing.
The lesson here is that even in 2016, filesystems are still finicky with large numbers of files in a single directory. Arguably the developers of ext4 should have picked a safer default, but it’s too late for that now. My time has already been wasted.
Too many existing tools and code scale linearly with the number of files (or subdirectories) in a directory. In this case, the chance of failure scales quadratically with the number of files, due to the technical reasons linked above. Trying to get every setting right the first time, or updating every tool to cope with large directories, isn’t a tenable solution.
Instead, plan to have your files stored in a hierarchy from the get-go. Don’t have large flat folders that file explorers will choke on. It will save you the pain of having to fix it later while your files are being held hostage.
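If you’re dumping a large pile of files somewhere, one common way to build that hierarchy automatically is to shard files into subdirectories keyed by a hash of the filename, so no single directory grows without bound. Here’s a minimal sketch in Python (the `shard_path` and `store` helpers are my own illustrative names, not from any particular tool):

```python
import hashlib
import shutil
from pathlib import Path

def shard_path(root: Path, filename: str, depth: int = 2) -> Path:
    """Map a filename to a nested directory based on its hash prefix.

    Each level uses one hex byte of the hash, giving 256 buckets per
    level, so files spread evenly instead of piling into one directory.
    """
    digest = hashlib.sha1(filename.encode("utf-8")).hexdigest()
    parts = [digest[i * 2 : i * 2 + 2] for i in range(depth)]
    return root.joinpath(*parts, filename)

def store(root: Path, src: Path) -> Path:
    """Copy src into its sharded location under root, creating dirs as needed."""
    dest = shard_path(root, src.name)
    dest.parent.mkdir(parents=True, exist_ok=True)
    shutil.copy2(src, dest)
    return dest
```

With two levels of sharding, even a million files averages only about 15 per leaf directory, well clear of any filesystem’s comfort zone, and the mapping is deterministic, so you can always recompute where a file lives from its name alone.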