Recently, a coworker has been sharing some higher-order shell functions he has been writing, inspired by the first of these articles:
Both discuss small scripts and shell functions that make it easier to compose simple unix tools to complete complex tasks.
My own ~/bin contains similar higher-order functions and other simple tools. Like most unix tools, they do a single task and are composable with other tools on the command line. However, I am often surprised by the number of heavy command-line users I see who don’t regularly encapsulate repetitive tasks or constructions into shell functions or aliases.
In my experience, the majority of such tools
- take less than 5 minutes to write,
- require almost no maintenance, and
- prove repeatedly useful after their initial creation.
This post shares two tools in my ~/bin that I used this weekend. I created both a while ago for entirely different purposes, but was still able to use them together without modification.
do.times N COMMAND
do.times executes COMMAND N times.
The tool itself is little more than a wrapper around a shell for loop, but the typing it saves and the semantic value it provides when constructing a command line have proven valuable on numerous occasions.
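Since do.times is described as little more than a wrapper around a shell for loop, a minimal sketch of the idea might look like the following. This is not the author's script: the function name do_times (with an underscore, so it stays a valid name in plain POSIX sh) is hypothetical, and the sketch assumes the seq utility is available and that N is at least 1.

```shell
# do_times N COMMAND [ARGS...]: run COMMAND N times.
# Hypothetical reimplementation; assumes seq is installed and N >= 1.
do_times() {
    n=$1
    shift                      # drop N so "$@" is just the command
    for i in $(seq "$n"); do
        "$@"                   # run the command with its arguments intact
    done
}

# Usage: prints "hello" three times.
do_times 3 echo hello
```

Passing the command as "$@" rather than as a single string keeps arguments containing spaces intact, which is what makes a wrapper like this safe to compose with other tools.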
summarize [OPTIONS] [FILE]
summarize reads columns of data from either a file given as an argument or its standard input and provides basic summary statistics. I originally wrote this to do quick spot checks of data files I was working on before sending them off to collaborators.
Its output doesn’t look like much, but these basic statistics are often all one needs to confirm that the correct data is going to a teammate or to quickly answer a basic question.
summarize uses Rscript, an executable shipped with R that allows one to write scripts in R. Recently, I added a dependency on the CRAN package optparse to make handling options a bit more straightforward. With the exception of the option parsing, the R code itself should be easy to follow for anyone who has used R.
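To give a feel for what a tool in this spirit does, here is a deliberately simplified stand-in written in shell with awk. It is not the Rscript and optparse implementation described above: the name summarize_col is hypothetical, it takes no options, and it only handles the first column, reporting count, minimum, mean, and maximum.

```shell
# summarize_col [FILE]: print count, min, mean, and max of the first column.
# An awk-based stand-in, NOT the Rscript/optparse tool described in the post.
summarize_col() {
    awk '
        NF == 0 { next }                 # skip blank lines
        {
            x = $1 + 0                   # force numeric interpretation
            if (n == 0 || x < min) min = x
            if (n == 0 || x > max) max = x
            sum += x
            n++
        }
        END {
            if (n == 0) { print "n=0"; exit }
            printf "n=%d min=%g mean=%g max=%g\n", n, min, sum / n, max
        }
    ' "$@"
}

# Usage: reads standard input when no FILE is given.
printf '%s\n' 3 1 2 | summarize_col    # prints: n=3 min=1 mean=2 max=3
```

Forwarding "$@" to awk is what lets the same function read either a named file or standard input, mirroring the [FILE] argument in the synopsis.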
Smoke Testing Speed Improvements
This weekend I have been working on a small set of improvements to make some not-so-simple tools I use on a regular basis a bit faster. Combining do.times and summarize allowed me to quickly generate a smoke test for whether the speed improvements I was implementing were working. To protect the innocent, I’ll use git --version as a stand-in for the command I wanted to test.
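A smoke test along these lines could be sketched as follows. Every helper here is a hypothetical stand-in rather than the author's tooling: do_times mirrors do.times, time_ms is an invented helper that assumes GNU date (whose %N format yields nanoseconds), and summarize_ms is an awk substitute for summarize.

```shell
# All helpers below are hypothetical stand-ins, not the author's scripts.

# Run a command N times (same idea as do.times; assumes seq and N >= 1).
do_times() {
    n=$1
    shift
    for i in $(seq "$n"); do
        "$@"
    done
}

# Print the wall-clock milliseconds that one run of a command takes.
# Assumes GNU date, whose %N format gives nanoseconds.
time_ms() {
    start=$(date +%s%N)
    "$@" >/dev/null 2>&1 || :    # only the duration matters, not the status
    end=$(date +%s%N)
    echo $(( (end - start) / 1000000 ))
}

# Reduce one number per line to count, min, mean, and max
# (an awk stand-in for summarize).
summarize_ms() {
    awk '
        { x = $1 + 0 }
        NR == 1 || x < min { min = x }
        NR == 1 || x > max { max = x }
        { sum += x }
        END { if (NR) printf "n=%d min=%g mean=%g max=%g\n", NR, min, sum / NR, max }
    '
}

# Time 20 runs of the command under test and summarize the timings.
do_times 20 time_ms git --version | summarize_ms
```

Running the same pipeline before and after a change gives two one-line summaries to compare, which is about as much ceremony as a smoke test deserves.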
While this isn’t scientific testing by any means, it was enough to keep me moving in the right direction, and simple enough that the cost of creating and running the test was practically zero.