(Ab)using shell

If you've used a Unix-like operating system for any length of time, you've probably been frustrated by the shell. You know, ol' /bin/sh that you have to type commands into to do anything useful. The connective tissue between the organs of the computer that do the real work of computing. Shell gets a bad rap.

Shell gets a bad rap because it's "arcane," it's "old," "crufty," "doesn't have real datatypes," etc. These things are all generally true! But shell can also be fun to (ab)use. I've written a number of things in shell, and I've discovered some tips and also tricks to make it do what I want. That's what the rest of this page holds.

I'm going to stick to POSIX shell here because standards! I've extensively referred to the following references as I've got gud at shell:


Tips & Tricks

This section serves almost as a reminder for me, because I re-implement this stuff all the time. But it might be useful for you, too!

split text into paragraphs
I find myself splitting text into paragraphs all the time for further, paragraph-level processing. There are two ways to do this:
awk -vRS= ...
When awk's RS variable is set to the empty string, each paragraph (a contiguous run of non-empty lines followed by a contiguous run of empty lines or end-of-file) becomes a record, and a newline always separates fields in addition to whatever FS is set to. If you want each line to be exactly one field, set FS to a newline as well. In an awk script, you can put it in a BEGIN block: BEGIN{RS=""}.
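Here's a minimal sketch of paragraph mode in action. The helper name para_summary and the sample input are made up for illustration; the only real ingredients are RS set to the empty string and FS set to a newline.

```shell
# Hypothetical helper: summarise each paragraph read from stdin.
para_summary() {
    # RS= turns on paragraph mode; FS='\n' makes each line exactly one field.
    awk -v RS= -v FS='\n' '{ print NR ": " NF " line(s), first: " $1 }'
}

# Two paragraphs separated by two blank lines.
printf 'first line\nsecond line\n\n\nthird line\n' | para_summary
# 1: 2 line(s), first: first line
# 2: 1 line(s), first: third line
```

Extra blank lines between paragraphs don't create empty records, which is most of why I reach for this one first.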
sed '/./{H;1h;$!d;};x;[...]'
For ultimate nerd cred, use sed! This snippet (minus the [...], that's where you put your program) will, for each non-empty line: append it to the hold space; if it's the first line, replace the hold space instead; and if it's not the last line, delete the pattern space (so it doesn't print). At a blank line or the end of input, it exchanges the hold space with the pattern space so that s can work on the whole paragraph. If you use this method, be sure to remove the leading newline: H puts one in front of every paragraph after the first. In some seds, you'll have to have the newline in its own command for some reason. This is pretty hacky.
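As a sketch of what could go where the [...] is, this joins each paragraph onto a single line. The helper name join_paragraphs is hypothetical, and the three commands after x are my own filler, not part of the original snippet.

```shell
# Hypothetical helper: join the lines of each paragraph into one line.
join_paragraphs() {
    # After x, the pattern space holds a whole paragraph:
    #   s/^\n//   strip the leading newline that H leaves behind
    #   s/\n/ /g  turn the remaining newlines into spaces
    #   /^$/d     drop the empty result produced by extra blank lines
    sed '/./{H;1h;$!d;};x;s/^\n//;s/\n/ /g;/^$/d'
}

printf 'a\nb\n\n\nc\nd\n' | join_paragraphs
# a b
# c d
```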
sed is good
While awk is my bae, sed can do a few things better that can make it the right tool for the job. To wit:
  • sed can work with regex capture groups. GNU awk can as well, but that's (famously) not POSIX!
  • sed only consumes what it needs. I actually don't remember if awk is an issue here or some other utility, but if you have a long, complex pipeline with lots of processors, sed (the stream editor) is good at not clobbering stdin. Use case: sed q reads and prints the first line of input, then quits, so if you want to store that line in a variable and then continue processing the rest of the file, it works great.
  • sed is terse. This comes in really handy with code golfing (below).
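For the capture-group point above, here's a small sketch in plain POSIX basic regex syntax. The helper name swap_pairs and the key=value input are invented for the example.

```shell
# Hypothetical helper: rewrite "key=value" lines as "value: key".
swap_pairs() {
    # \(...\) captures; \1 and \2 refer back to the captures in the replacement.
    sed 's/^\([^=]*\)=\(.*\)$/\2: \1/'
}

printf 'shell=ash\neditor=sed\n' | swap_pairs
# ash: shell
# sed: editor
```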

Golf

These should probably only be used to play shell golf, because they are so short as to be arcane (and they probably break in subtle ways!):

<FILE
You can check for the existence of a file by trying to read from it using redirection. In the shells I tested (ash), it didn't error out and it didn't print a message.
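A sketch of the un-golfed version, for reference. The function name can_read is hypothetical, and the braces with 2>/dev/null are my addition, on the assumption that some shells do complain when the open fails. Strictly speaking this tests whether the file can be opened for reading, which is a bit more than existence.

```shell
# Hypothetical helper: succeed if the file named by $1 can be opened for reading.
can_read() {
    # A bare redirection with no command: the exit status reports whether
    # the open worked. The braces let us silence any error message.
    { <"$1"; } 2>/dev/null
}

if can_read /definitely/not/here; then echo found; else echo missing; fi
# missing
```

The golfed form is just the redirection plus whatever you want to run, like <f&&echo y.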
alias
Shell aliases can help save space, if you use a certain command multiple times. Of course, you should use a one-letter alias, and it'll only work if ... the math is too hard for me right now but like, type it out with your command to make sure. And remember that if you alias more than one command, you can use the same alias call to save bytes.
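Here's a rough stab at that math, as a sketch under some assumptions: a one-letter alias, a command name that needs no quoting, and the alias defined on its own line (aliases are expanded when a line is read, so the definition has to come on an earlier line than the first use). The function name alias_savings is made up.

```shell
# Hypothetical helper: net bytes saved by a one-letter alias.
#   $1 = length of the command name, $2 = number of times it is used
alias_savings() {
    # The definition costs "alias x=" (8 bytes) + the name + a newline.
    # Every use then saves (length - 1) bytes.
    echo $(( $2 * ($1 - 1) - ($1 + 9) ))
}

alias_savings 6 3   # a 6-byte command like printf, used 3 times: prints 0
alias_savings 6 4   # the same command used 4 times: prints 5
```

A result of zero or less means the alias isn't paying for itself yet.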
`...`
Using the backtick is generally seen as bad form nowadays in shell (it doesn't nest without escaping, it's easy to lose, etc.), but when you're golfing, it can work great! It's one byte shorter than $() (have to count the backtick on both ends), which can add up.
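To see the byte count for yourself, here's a small sketch that keeps both spellings as strings and measures them. The variable names are arbitrary, and uname is just a stand-in command.

```shell
# The same command substitution written both ways, kept as strings so we can count bytes.
modern='echo $(uname)'
golfed='echo `uname`'
echo "${#modern} vs ${#golfed}"   # prints "13 vs 12"

# Both forms run the same command and capture the same output.
a=$(echo hi)
b=`echo hi`
[ "$a" = "$b" ] && echo same      # prints "same"

# Nesting works only with backslashes, which costs more than it saves.
c=`echo \`echo hi\``
echo "$c"                         # prints "hi"
```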