Friday, August 5, 2011

Shell Trick: removing all empty directories

Given a directory structure (a subtree) that contains files and directories, I sometimes want to quickly delete all empty directories. There is a simple one-line command that will do this.

There are also many incorrect solutions that are suggested in online discussions. Before presenting my script that does this, let us examine some of the other suggested ways to do it. They are not completely wrong, they just work only for some special cases.

Consider the example at left. The command will need to walk down the directory tree and search for directories. Any directories that are empty need to be removed.

We want to start examining at d1. Notice that we will find 3 empty subdirectories — d3, d6, and d7. If we remove those, we will create one newly empty directory — d5. Removing that will cause d4 to become empty. Then d4 can be removed and there will be no empty directories remaining.

I have found some suggested solutions but they have drawbacks:
  find d1 -type d -empty -exec rmdir {} \;
This cannot work because it traverses the tree top-down. It will find the first pass of empty directories, but not the ones that become empty when their subdirectories are removed.
  find d1 -type d -empty -exec rmdir -p {} \;
Actually this almost works. The difference with the previous flavor is that it knows that it is operating only on leaf directories, but it also then tries to remove empty parent directories. But it has the problem of not knowing where to stop. There is no bound on the root, so it could remove a directory at a level higher than what we specify.

The correct solution is
  find d1 -depth -type d -empty -exec rmdir {} \;
The important aspect is that we need to do a depth-first traversal of the sub-tree. This will allow us to handle the cases where previously non-empty directories become empty.

So I came up with the following script and named it "rmEmpty.sh". Because the action can be done with a single-line command, it really is not strictly necessary to encapsulate it in a script. But notice how many moving parts are working in the argument list to 'find'. Putting it in a script saves quite some typing. Besides the convenience of this, it also adds a factor of safety.
#!/bin/bash
#-----------------------------------------------------------------------------
# Remove all empty directories under and including the given root(s).
#-----------------------------------------------------------------------------

# If no args given, then root at $PWD.
[ $# == "0" ] && set "$PWD"

# Process each arg as a root to examine.
for root in "$@"; do
if [ -d "${root}" ]; then
find "${root}" -depth -type d -empty -exec rmdir -v {} \;
else
printf "Not a directory: [%s]\n" "${root}"
fi
done
So notice that the action can now be invoked simply as
rmEmpty.sh d1
A couple points:
  • Notice that the quoting of variables as shown is strictly necessary to handle directory names which may contain spaces.
  • The script is verbose about each removal so you can explicitly see all the actions. The "-v" can be removed from the call to 'rmdir' to reduce verbosity.
  • One curiosity of this script is that if you run it with no args, it starts processing from your current working directory. It could end up deleting your current directory. This is not an error condition! You can change to another existing directory and continue working.