If you've used a Unix-like operating system for any length of time, you've probably been frustrated by the shell. You know, ol' /bin/sh that you have to type commands in to do anything useful. The connective tissue between the organs of the computer that do the real work of computing, you know. Shell gets a bad rap.
Shell gets a bad rap because it's "arcane," it's "old," "crufty," "doesn't have real datatypes," etc. These things are all generally true! But shell can also be fun to (ab)use. I've written a number of things in shell, and I've discovered some tips and also tricks to make it do what I want. That's what the rest of this page holds.
I'm going to stick to POSIX shell here because standards! I've extensively referred to the following references as I've got gud at shell:
- sh(1p)
  Yes, manual sections can have non-numerical characters in them. You can open this with man 1p sh if you have the right manpages installed. Yes, you can install POSIX manpages. They're very useful!
- Rich's sh (POSIX shell) tricks
- Pure sh bible
Tips & Tricks
This section serves almost as a reminder for me, because I re-implement this stuff all the time. But it might be useful for you, too!
- split text into paragraphs
- I find myself splitting text into paragraphs all the time for further,
paragraph-level processing. There are two ways to do this:
awk -vRS= ...
- When awk's RS variable is set to the empty string, each paragraph (a contiguous run of non-empty lines, followed by a contiguous run of empty lines or end-of-file) becomes a record, and a newline always separates fields, in addition to whatever FS is set to. If you want exactly one field per line, also set FS to "\n". In an awk script, you can put it in a BEGIN block: BEGIN{RS=""}.
sed '/./{H;1h;$!d;};x;[...]'
- For ultimate nerd cred, use sed! This snippet (minus the ellipsis, that's where you put your program) will, for each non-empty line: append it to the hold space; if it's the first line, replace the hold space instead; if it's not the last line, delete the pattern space (so it doesn't print). On an empty line or the last line, it then exchanges the hold space with the pattern space so s can work on it. If you use this method, be sure to remove the leading newlines. In some seds, you'll have to have the newline in its own command for some reason. This is pretty hacky.
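Here's a rough sketch of both approaches side by side. The sample text and the variable names (text, awk_out, sed_out) are made up for the demo, and the sed "program" (strip the leading newline, join lines with spaces) is just one possible stand-in for the [...] part:

```shell
# Sample input (hypothetical): two paragraphs separated by a blank line.
text='one two
three

four
five six'

# awk: RS= turns on paragraph mode; FS='\n' makes each line one field.
awk_out=$(printf '%s\n' "$text" |
  awk -v RS= -v FS='\n' '{ print NR ": " NF " lines, starts with " $1 }')

# sed: the hold-space idiom from above. The "program" after x strips
# the leading newline, then joins the paragraph's lines with spaces.
sed_out=$(printf '%s\n' "$text" |
  sed '/./{H;1h;$!d;};x;s/^\n//;s/\n/ /g')

printf '%s\n' "$awk_out" "$sed_out"
```

Note that with several blank lines in a row, the sed version emits an empty record for each extra blank line, while awk's paragraph mode swallows them.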
sed is good
- While awk is my bae, sed can do a few things better that can make it the right tool for the job. To wit:
  - sed can work with regex capture groups. GNU awk can as well, but that's (famously) not POSIX!
  - sed only consumes what it needs. I actually don't remember if awk is an issue here or some other utility, but if you have a long, complex pipeline with lots of processors, sed (the stream editor) is good at not clobbering stdin. Use case: sed q reads and prints the first line of input, then quits, so if you want to store that in a variable then continue processing the rest of the file, it works great.
  - sed is terse. This comes in really handy with code golfing (below).
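A minimal sketch of the first two points, with made-up input and variable names (swapped, first). It only shows capture groups and sed q grabbing the first line. It doesn't try to demonstrate what's left on stdin afterwards, which I'd expect to depend on the sed implementation and on whether stdin is a seekable file or a pipe:

```shell
# Capture groups: \( \) remember text, \1 and \2 put it back.
swapped=$(printf 'key=value\n' | sed 's/\(.*\)=\(.*\)/\2=\1/')

# sed q: print the first line of input, then quit.
first=$(printf 'header\nrow1\nrow2\n' | sed q)

printf '%s\n' "$swapped" "$first"
```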
Golf
These should probably only be used to play shell golf, because they are so short as to be arcane (and they probably break in subtle ways!):
<FILE
- You can check for existence of a file by trying to read from it using redirection. In the shells I tested (ash), it didn't error out and it didn't print a message.
alias
- Shell aliases can help save space, if you use a certain command multiple times. Of course, you should use a one-letter alias, and it'll only work if ... the math is too hard for me right now but like, type it out with your command to make sure. And remember that if you alias more than one command, you can use the same alias call to save bytes.
`...`
- Using the backtick is generally seen as bad form nowadays in shell (it doesn't nest, it's easy to lose, etc.), but when you're golfing, it can work great! It's one byte shorter than $() (have to count the backtick on both ends), which can add up.
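To make a couple of these concrete, here's a hedged sketch of the bare-redirection check and the backtick byte count. The temp file path and the variable names are invented for the demo, and the { ...; } 2>/dev/null wrapper is my own addition to hide whatever message a given shell prints when the file is missing (so it's not golfed at all):

```shell
# Hypothetical temp file, created only for this demonstration.
tmp=${TMPDIR:-/tmp}/golf-demo.$$
: > "$tmp"

# A bare redirection succeeds only if the file can be opened for reading.
if { <"$tmp"; } 2>/dev/null; then exists=yes; else exists=no; fi
if { <"$tmp.missing"; } 2>/dev/null; then missing=yes; else missing=no; fi
rm -f "$tmp"

# Both spellings run the command and capture its output.
a=`echo hi`
b=$(echo hi)

# Count the bytes of each spelling (single quotes keep them literal).
short=$(printf %s '`echo hi`' | wc -c)
long=$(printf %s '$(echo hi)' | wc -c)
saved=$(($long - $short))

printf '%s %s %s %s %s\n' "$exists" "$missing" "$a" "$b" "$saved"
```

Keep in mind the redirection test really checks "can I open this for reading", so an existing but unreadable file fails it too.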