Today I’m working a lot on this website, so I thought I’d write a
little blog post detailing what I’m trying to do and how it sort of
works. I build this site using pandoc with a lot of Lua filters and an
index.py to stitch it all together. You can check out the actual code at
the mirror I’ve set up at GitLab.
Organization
This site is organized simply: everything I write is dumped in /text,
and the extension determines what folder it goes in at the root. For
example, I’m doing this thing where I’m writing a poem every day, so
those go under poem. This blog post and others will go under blog. When
I build the site, I just run make and it uses pandoc to build everything
(doing stuff for me like slugifying headers, converting LineBlocks to
verse blocks, or automatically generating links across categories to
stuff written on the same day). Then, index.py is called to make little
indexes and finally make the index for the site. It’s not 100% perfect
(especially the site thing), but it’s good enough for now, and I can
always fix it later.
You can read more about this stuff (eventually, when I get around to writing it) at the Colophon.
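The extension-to-folder rule can be sketched in Python roughly like this. It is not the site's actual build code (that lives in the Makefile, the Lua filters, and index.py). The helper name output_path, the .html output suffix, and the example filenames are all assumptions made up for illustration.

```python
from pathlib import PurePosixPath


def output_path(source: str) -> PurePosixPath:
    """Map a source file in text/ to a folder named after its extension.

    Hypothetical helper: the .html suffix is an assumption, not
    something the post states.
    """
    src = PurePosixPath(source)
    category = src.suffix.lstrip(".")  # "poem", "blog", ...
    if not category:
        raise ValueError(f"{source} has no extension to file it under")
    return PurePosixPath(category) / (src.stem + ".html")


print(output_path("text/snow.poem"))     # poem/snow.html
print(output_path("text/shuffle.blog"))  # blog/shuffle.html
```

The point is only that the extension carries the category, so adding a new kind of writing means picking a new extension rather than editing a config file.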
Shuffling
The biggest thing I’m doing today is working on my little shuffle-script to take texts and shuffle them around. I’ve already done that once, with “The Snubs”, though that was with a thrown-together version of the script I’m including below. Today, I used an article I read yesterday in the Oxford American about the phenomenon of blues music in Tokyo and shuffled it around to generate something resembling a poem, which I then massaged into something that, honestly, isn’t great. But it’s something!
Here’s the script (I’m copying it here since it’s not currently under
source control), shuflr:
import argparse
import os.path
import random
import re


class Shuflr():
    def __init__(self, textfile, dedup=False):
        self.words = None
        self.dedup = dedup
        with open(textfile) as f:
            self.all = f.read()

    def remove_punc(self):
        # Convert dashes to spaces
        nopunc = re.sub(r'--+|\s+-+\s+|[–—]', ' ', self.all)
        # Convert curly apostrophes to normal
        nopunc = re.sub(r'[‘’]', "'", nopunc)
        # Remove everything else
        nopunc = re.sub(r'[^ \t\n\r\f\v\w\'-]', '', nopunc)
        self.all = nopunc

    def splitwords(self):
        self.words = self.all.split()
        if self.dedup:
            self.words = list(set(self.words))

    def shuffle(self, cleanup=True, normalize_case=False):
        shuf = []

        if cleanup:
            self.remove_punc()
        if normalize_case:
            self.all = self.all.lower()
        if not self.words:
            self.splitwords()
        while len(self.words) > 0:
            i = random.randint(0, len(self.words) - 1)
            w = self.words.pop(i)

            shuf.append(w)
        self.words = shuf

    def versify(self, minLength, maxLength, chance):
        lines = []
        count = 0
        if maxLength <= minLength:
            ml = maxLength
            maxLength = minLength
            minLength = ml

        if not self.words:
            self.splitwords()
        for w in self.words:
            lines.append(w)
            if count < minLength:
                count += 1
            elif count > maxLength:
                lines.append('\n')
                count = 0
            elif random.randint(0, 100) >= chance:
                lines.append('\n')
                count = 0
            else:
                count += 1

        self.words = lines


def write_to_template(words, outfile):
    fi = 1
    fixname = outfile
    while os.path.isfile(fixname):
        fi += 1
        fixname = outfile + str(fi)
    with open(fixname, 'w') as f:
        f.write(' '.join(words) + '\n')


if __name__ == '__main__':
    parser = argparse.ArgumentParser()
    parser.add_argument("infile", type=str,
                        help="File to shuffle")
    parser.add_argument("-p", "--keeppunc", action="store_true",
                        help="Keep punctuation?")
    parser.add_argument("-l", "--lowercase", action="store_true",
                        help="Lowercase everything?")
    parser.add_argument("-d", "--dedup", action="store_true",
                        help="Remove duplicate words")
    parser.add_argument("-o", "--output", type=str, default="",
                        help="""Filename to write to.
                        If < file > exists, append a number to the name
                        til it doesn't.""")
    args = parser.parse_args()

    random.seed()
    s = Shuflr(args.infile, args.dedup)
    s.shuffle(not args.keeppunc, args.lowercase)
    s.versify(4, 16, 50)
    if args.output:
        write_to_template(s.words, args.output)
    else:
        print(' '.join(s.words))
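To see what the punctuation cleanup does before any shuffling happens, here is a small standalone sketch that reuses the three regular expressions from remove_punc. The function name strip_punctuation and the sample sentence are made up for illustration; only the patterns come from the script.

```python
import re


def strip_punctuation(text: str) -> str:
    """Standalone restatement of the three substitutions in remove_punc."""
    # Dashes (double hyphens, spaced hyphens, en/em dashes) become spaces.
    text = re.sub(r'--+|\s+-+\s+|[–—]', ' ', text)
    # Curly single quotes become straight apostrophes.
    text = re.sub(r'[‘’]', "'", text)
    # Drop anything that is not whitespace, a word character, ', or -.
    return re.sub(r'[^ \t\n\r\f\v\w\'-]', '', text)


# Hypothetical sample text, not taken from the article mentioned above.
sample = "Blues—real, well-known blues -- isn’t “easy,” she said."
cleaned = strip_punctuation(sample)
print(cleaned)          # Blues real well-known blues isn't easy she said
print(cleaned.split())  # the word list that shuffle() would then scramble
```

Hyphens inside words and apostrophes in contractions survive, so "well-known" and "isn't" each stay one word when the text is split.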