Dot Slash Star: bash


Friday, October 14, 2011

Shell Trick: Dynamic bashrc Setup

Anyone who works on Unix has run into the issue of setting up the environment consistently among multiple hosts. If you are a power user, you likely have developed a robust set of aliases, functions, variable settings, etc. that streamline your command-line effectiveness. However, when you create a new account on a different host, it becomes a bigger task to set it up like your others.

Besides just the volume of customizations, each different environment may require specific settings. For example, the path to your favorite editor may be different.

Overview

I have come up with a solution for configuring a bash (Bourne Again Shell) environment which allows common settings and definitions, and also customizations at the level of operating system and host. The modularity is achieved by hooking the framework minimally into .bashrc and then using separate files for the actual settings so the proper ones can be selected on different environments. A graphical representation of this framework is shown in figure 1.

Figure 1.

Hooking the Framework into the Shell

The novice approach for customization is to put all the definitions in the .bashrc file. This becomes unmanageable quickly. Another downside of this is that the base image of this file can be different on different systems. Instead, this framework requires that only two lines are added at the end of the user .bashrc file.

export ENVSETUP="${HOME}/envsetup"
[[ -f ${ENVSETUP}/bashrc-common && -n "${PS1}" ]] && . "${ENVSETUP}/bashrc-common"

The first line declares where the framework is located. In this example, it is the envsetup directory in the user's home. The framework can be installed anywhere, as long as ENVSETUP is defined to be that location. The second line calls the framework after performing two checks. First it checks that the expected main file exists. Then it checks that this is an interactive shell. A non-interactive shell (like one created for scp) does not need the customizations, and could actually fail because of output generated by the framework.
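Testing PS1 is one common way to detect an interactive shell. As a minimal sketch of an alternative, bash also records the letter "i" in the special parameter $- when the shell is interactive. The helper name has_interactive_flag below is hypothetical and not part of the framework.

```shell
# Hypothetical helper: succeeds when the given option string (normally "$-")
# contains the letter "i", which bash sets for interactive shells.
has_interactive_flag () {
    case "$1" in
        *i*) return 0 ;;
        *)   return 1 ;;
    esac
}

export ENVSETUP="${HOME}/envsetup"
if has_interactive_flag "$-" && [ -f "${ENVSETUP}/bashrc-common" ]; then
    . "${ENVSETUP}/bashrc-common"
fi
```

Either check works for the purpose described here; the PS1 form has the advantage of fitting on one line.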

Driving the Framework

The entry point to the framework is the driver file bashrc-common. This is meant to be just structural logic, with the actual settings in separate files. All customizations should go in those other files. This file needs to be described to fully understand the framework.

# A utility to do more env setup using the given file.
# Expect ENVSETUP to be properly defined.
SOURCE_LOG=""
sourceFrom () {
   base_file="$1"; shift
   source_file="${ENVSETUP}/${base_file}";
   if [ -f "${source_file}" ]; then
       # If an optional message was given, print it first.
       # (The message is passed as data, not as the printf format string.)
       [[ $# -gt 0 ]] && { printf '%s\n' "$1"; shift; }

       . "${source_file}"
       export SOURCE_LOG="${SOURCE_LOG}:${base_file}"
       return 0
   else
       return 1
   fi

   # Note that if the file does not exist, no action is taken (silently).
   # Only the return value indicates that action was done.
}

# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

# Common aliases are defined externally.
sourceFrom bashrc-common-aliases

# Common Functions are defined externally.
sourceFrom bashrc-common-functions

# Platform-specific setup
printf "Bourne Again Shell "
os=$(uname -s)
case ${os} in
   Linux)
       sourceFrom bashrc-os-linux "on Linux.";;
   SunOS)
       sourceFrom bashrc-os-sunos "on SunOS.";;
   CYGWIN_NT-5.1)
       sourceFrom bashrc-os-win "on Windows XP.";;
   CYGWIN_NT-5.0)
       sourceFrom bashrc-os-win "on Windows 2000.";;
   CYGWIN_NT-4.0)
       sourceFrom bashrc-os-win "on Windows NT.";;
   *)
       printf "Unknown OS: %s\n" "${os}"
esac

# host-specific setup
sourceFrom "bashrc-host-${HOSTNAME}" "Customization for host: ${HOSTNAME}"

# Final processing after OS-specific and host-specific setup.
sourceFrom bashrc-common-final

The first element is the definition of a shell function which will assist in calling the sub-files. It is not necessary to detail it here. The main logic is after the dividing line.

Common Aliases and Functions

First it reads the definitions of the aliases and the functions from separate files. There is no strict reason for these two to be in separate files other than it makes for cleaner organization.

Operating System Specific Settings

Next comes the critical logic of dynamically selecting configuration files depending on the operating system and the specific host. The magic of determining the OS is done with the uname -s command. Notice that the output of this will specify the OS, but the actual file that gets called as a result is defined by the user. I have mapped various Windows operating systems to the single file bashrc-os-win (although a different message gets printed for each).

Note that this list of operating systems is incomplete. It reflects only those systems that I have tested this framework on. Other users would extend it to the other systems they are using. If it encounters an undefined system, the output will display the string to add to the case statement.
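As a sketch of how the list might be extended, the mapping can be pulled into a small function. The function name osSetupFile, the Darwin entry with its file name bashrc-os-darwin, and the CYGWIN_NT-* wildcard are all assumptions added for illustration; they are not part of the driver shown above.

```shell
# Hypothetical helper: map a "uname -s" string to a setup file name.
# Prints the file name, or returns 1 for an unrecognized system.
osSetupFile () {
    case "$1" in
        Linux)       echo "bashrc-os-linux" ;;
        SunOS)       echo "bashrc-os-sunos" ;;
        Darwin)      echo "bashrc-os-darwin" ;;   # assumed file name for macOS
        CYGWIN_NT-*) echo "bashrc-os-win" ;;      # any Cygwin version
        *)           return 1 ;;
    esac
}

if os_file=$(osSetupFile "$(uname -s)"); then
    echo "Would source: ${os_file}"
else
    echo "Unknown OS: $(uname -s)"
fi
```

The wildcard trades the per-version message for not having to list every Windows release separately.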

Host Specific Settings

Although the OS requires a specific collection of settings, it may be necessary to further customize the environment to have settings for an individual host. This is done by the next step. It only requires that the settings be given in the file named bashrc-host-HOSTNAME (where HOSTNAME is replaced by the value which that environment variable will have on that system).
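On some systems HOSTNAME holds a fully qualified name rather than a short one. A minimal sketch of handling both is to build a list of candidate file names, most specific first, and try each in turn. The helper name hostSetupCandidates and the example host king.example.com are hypothetical.

```shell
# Hypothetical helper: list candidate host files, most specific first.
hostSetupCandidates () {
    local host="$1"
    local short="${host%%.*}"     # strip everything from the first dot onward
    echo "bashrc-host-${host}"
    if [ "${short}" != "${host}" ]; then
        echo "bashrc-host-${short}"
    fi
}

# Show the candidates for this machine.
hostSetupCandidates "${HOSTNAME:-$(uname -n)}"
```

In the driver, each candidate could be passed to sourceFrom until one of them returns success.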

Final Common Settings

The final step is to call a file containing common settings that depend on earlier settings from the OS-dependent or host-dependent files. To understand this, consider this example.

export JDK_HOME=${JAVA_HOME}

This will ensure that JDK_HOME has the same value as JAVA_HOME, but the latter will likely be defined in a host-specific file. Since we want these two variables to have the same value on all systems, we want this to be in a common section.
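Not every host will necessarily define JAVA_HOME. As a sketch, the entry in bashrc-common-final could be guarded so that JDK_HOME is exported only when there is a value to copy. The function name setJdkHome is hypothetical.

```shell
# Sketch of a guarded bashrc-common-final entry: mirror JAVA_HOME into
# JDK_HOME only when an earlier OS or host file actually defined JAVA_HOME.
setJdkHome () {
    if [ -n "${JAVA_HOME:-}" ]; then
        export JDK_HOME="${JAVA_HOME}"
    fi
}

setJdkHome
```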

Output

You may have noticed several printf statements in the listing. Their purpose is to provide some minimal information about the environment and to allow some basic debugging. Consider the output that I see on my main system:

Bourne Again Shell on Linux.
Customization for host: king

The first line is always printed starting with "Bourne Again Shell" as an indication that the framework was triggered. It ends naming the OS customization it is using (or that it could not match the OS). The second line states which host-specific file will be used.

The usefulness of this output comes from the fact that the OS and host are the variables that change from system to system. You can understand how those variables were resolved from these two lines.

Why Use this Framework

The power of this framework is realized when a fresh new environment needs to be set up on a new host. That is done simply by copying all files from one existing system to the new one, and then creating a new host specific file from another existing one.

This is easy to do because of the following advantages of this framework:

All files are in a separate subdirectory. They are not intermixed with other, unrelated files in the user's home directory.
None of the files are hidden. Searching through and debugging a set of hidden files can be inconvenient.
Exactly the same set of files (with the same content) is used on every system.

How to Use this Framework

Implementing this yourself can be done in the following steps.

Create a directory for all the setup files. (This will be referenced in the scripts as ENVSETUP.)
Create the file bashrc-common and copy the contents from above.
Add the two lines to the end of .bashrc as described above. (At this point, the bare framework is working.)
Create the setup files for your operating system and host. You may need to customize bashrc-common if it does not recognize your operating system.
Move all your customizations to the appropriate setup files.
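The first steps above can be sketched as a small helper. This is only an illustration under stated assumptions: the function name bootstrapEnvsetup is hypothetical, the files are created empty (bashrc-common still has to be filled in with the driver listing), and the OS-specific file is left to be created by hand because its name depends on the case statement.

```shell
# Hypothetical bootstrap helper (not part of the original framework).
# Creates the setup directory and empty setup files under the given home
# directory, then appends the two hook lines to .bashrc if they are missing.
bootstrapEnvsetup () {
    local home_dir="$1"
    local setup_dir="${home_dir}/envsetup"
    local host="${HOSTNAME:-$(uname -n)}"
    local name

    mkdir -p "${setup_dir}" || return 1
    for name in bashrc-common bashrc-common-aliases bashrc-common-functions \
                "bashrc-host-${host}" bashrc-common-final; do
        touch "${setup_dir}/${name}"
    done

    # Only add the hook once, so the helper can be run repeatedly.
    if ! grep -q 'export ENVSETUP=' "${home_dir}/.bashrc" 2>/dev/null; then
        cat >> "${home_dir}/.bashrc" <<'EOF'
export ENVSETUP="${HOME}/envsetup"
[[ -f ${ENVSETUP}/bashrc-common && -n "${PS1}" ]] && . "${ENVSETUP}/bashrc-common"
EOF
    fi
}

# Example: bootstrapEnvsetup "${HOME}"
```

The call is left commented out so that nothing is modified until you decide to run it against your own home directory.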

More Debugging Details

If it is unclear which files are being read in which order, then the following command can be executed to find out for sure. (Note that the tr command is used to split the single-line output into multiple lines.)

echo $SOURCE_LOG | tr : \\n

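On my main system, this produces the following, which shows the exact sequence in which the files were read. (Because SOURCE_LOG starts with a colon, the real output also begins with one blank line, which is omitted here.)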

bashrc-common-aliases
bashrc-common-functions
bashrc-os-linux
bashrc-host-king
bashrc-common-final
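If this check is run often, it could be wrapped in a function. The sketch below uses a hypothetical name, showSourceLog, and strips the leading colon that SOURCE_LOG always carries so that no blank first line is printed. It accepts an optional argument purely to make it easy to try with sample data.

```shell
# Hypothetical helper: print the files recorded in SOURCE_LOG, one per line.
# With an argument, that string is used instead of SOURCE_LOG.
showSourceLog () {
    local log="${1:-${SOURCE_LOG:-}}"
    log="${log#:}"                      # drop the leading colon, if any
    if [ -n "${log}" ]; then
        printf '%s\n' "${log}" | tr ':' '\n'
    fi
}

showSourceLog ":bashrc-common-aliases:bashrc-os-linux"
```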

Friday, August 5, 2011

Shell Trick: removing all empty directories

Given a directory structure (a subtree) that contains files and directories, I sometimes want to quickly delete all empty directories. There is a simple one-line command that will do this.

There are also many incorrect solutions suggested in online discussions. Before presenting my script, let us examine some of those other suggestions. They are not completely wrong; they just work only in some special cases.

Consider the example at left. The command will need to walk down the directory tree and search for directories. Any directories that are empty need to be removed.

We want to start examining at d1. Notice that we will find 3 empty subdirectories — d3, d6, and d7. If we remove those, we will create one newly empty directory — d5. Removing that will cause d4 to become empty. Then d4 can be removed and there will be no empty directories remaining.

I have found some suggested solutions but they have drawbacks:

  find d1 -type d -empty -exec rmdir {} \;

This cannot work because it traverses the tree top-down. It will find the first pass of empty directories, but not the ones that become empty when their subdirectories are removed.

  find d1 -type d -empty -exec rmdir -p {} \;

Actually this almost works. The difference from the previous version is that rmdir -p, after removing a leaf directory, also tries to remove each parent directory named in the path. The problem is that it does not know where to stop. There is no bound at the root: if the root is given as a path with several components, such as a/b/d1, it can go on to remove a/b and a when they become empty, which is higher than the level we specified.
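The overshoot can be reproduced in a scratch directory. This is a small demonstration under assumed names (outer, d1, d3); the root passed to find is outer/d1, yet outer itself ends up removed.

```shell
scratch=$(mktemp -d)
mkdir -p "${scratch}/outer/d1/d3"

# Run from the scratch directory so the root is the relative path outer/d1.
(
    cd "${scratch}" || exit 1
    # find may complain once the directory it is visiting disappears,
    # so errors are discarded for this demonstration.
    find outer/d1 -type d -empty -exec rmdir -p {} \; 2>/dev/null || true
)

if [ -e "${scratch}/outer" ]; then
    echo "outer survived"
else
    echo "outer was removed along with d1 and d3"
fi
```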

The correct solution is

  find d1 -depth -type d -empty -exec rmdir {} \;

The important aspect is that we need to do a depth-first traversal of the sub-tree. This will allow us to handle the cases where previously non-empty directories become empty.
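To see the cascade in action, the example tree can be rebuilt in a scratch directory. The exact layout is an assumption based on the description above: d2 holds a file (the name file.txt is made up), d3 is empty, and d4 contains d5, which contains the empty d6 and d7.

```shell
# Build an assumed version of the example tree in a scratch directory.
scratch=$(mktemp -d)
mkdir -p "${scratch}/d1/d2" "${scratch}/d1/d3" \
         "${scratch}/d1/d4/d5/d6" "${scratch}/d1/d4/d5/d7"
touch "${scratch}/d1/d2/file.txt"    # keeps d2 (and so d1) non-empty

# Depth-first: children are visited, and removed, before their parents.
find "${scratch}/d1" -depth -type d -empty -exec rmdir {} \;

# List what survives.
remaining=$(cd "${scratch}" && find d1 | sort)
printf '%s\n' "${remaining}"
```

Only d1, d1/d2 and the file inside it remain; d3 and the whole d4 branch are gone after a single pass.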

So I came up with the following script and named it "rmEmpty.sh". Because the action can be done with a single-line command, it is not strictly necessary to encapsulate it in a script. But notice how many moving parts are working in the argument list to 'find'. Putting it in a script saves quite a bit of typing. Besides the convenience, it also adds a factor of safety.

#!/bin/bash
#-----------------------------------------------------------------------------
# Remove all empty directories under and including the given root(s).
#-----------------------------------------------------------------------------

# If no args given, then root at $PWD.
[ $# -eq 0 ] && set -- "$PWD"

# Process each arg as a root to examine.
for root in "$@"; do
   if [ -d "${root}" ]; then
       find "${root}" -depth -type d -empty -exec rmdir -v {} \;
   else
       printf "Not a directory: [%s]\n" "${root}"
   fi
done

So notice that the action can now be invoked simply as

rmEmpty.sh d1

A couple of points:

Notice that the quoting of variables as shown is strictly necessary to handle directory names which may contain spaces.
The script is verbose about each removal so you can explicitly see all the actions. The "-v" can be removed from the call to 'rmdir' to reduce verbosity.
One curiosity of this script is that if you run it with no args, it starts processing from your current working directory. It could end up deleting your current directory. This is not an error condition! You can change to another existing directory and continue working.

Saturday, May 28, 2011

Shell Trick: Create a directory and change to it in one command

Often when you create a directory, you are also intending to change to that directory to continue with other operations. This is usually done as two consecutive commands

> mkdir newdir
> cd newdir

But wouldn't this be more easily done in one single command? After all, the same directory name is used in each command.

I created a bash shell function^details-1 that does exactly this. I called it "mkcd", since it is basically a combination of "mkdir" and "cd". It is defined as

mkcd () {
  if [ $# == 1 ]; then
      dir="$1"
      printf "mkcd %s\n" "${dir}"
      if [ -d "${dir}" ]; then
          cd "${dir}"
      else
          mkdir -p "${dir}"
          if [ $? == 0 ]; then
              cd "${dir}"
          fi
      fi
  else
      printf "Usage: mkcd <dir>\n"
  fi
}

This would be defined in a .bashrc environment initialization file.

This will skip the directory creation if the directory already exists. It actually uses "mkdir -p" so that it also creates any intermediate directory levels needed. It also only performs the "cd" if directory creation had no errors.

The operation can now be done as

> mkcd newdir
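For comparison, here is a more compact variant. It is a sketch, not the function I actually use: it drops the echo of the command, relies on "mkdir -p" succeeding silently when the directory already exists, and adds "--" so that a name starting with a dash is not mistaken for an option.

```shell
# Compact variant (a sketch, not the original function): create the
# directory if needed, then change into it only if that succeeded.
mkcd () {
    if [ $# -ne 1 ]; then
        printf 'Usage: mkcd <dir>\n' >&2
        return 1
    fi
    mkdir -p -- "$1" && cd -- "$1"
}
```

The return status now reflects whether the change of directory actually happened, which is convenient when the function is used inside other functions.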

Notes

^details-1: Note that this is a function and not a shell script. Doing it as a script will not work because that runs as a child process, so the directory change happens in a different process and you will remain in the same directory in which you started.
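The difference is easy to observe. In this small demonstration, the helper name goRoot is made up, and "bash -c" stands in for running a separate script.

```shell
start_dir="${PWD}"

# A child process changes its own directory; the parent is unaffected.
bash -c 'cd /'
after_child="${PWD}"

# A function runs in the current shell, so the change persists.
goRoot () { cd /; }
goRoot
after_function="${PWD}"

printf 'after child:    %s\nafter function: %s\n' "${after_child}" "${after_function}"
```

The first value is still the starting directory, while the second is "/".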