Over the past few weeks I had some simple text transformations that had me reaching for Perl, but that I ended up doing directly in Vim. These were very small exercises in regular expression (RE) matching and replacing, and what struck me was how small differences in dialect can sometimes prevent you from making progress.
Here's the first example:
01/17 Led Zeppelin's first album is released, 1969
01/19 Janis Joplin is born in Port Arthur, Texas, 1943
01/22 Sam Cooke is born in Chicago, 1935
These lines are in calendar format. On a lark, I wanted to switch to pal, another calendar program, which required these strings:
00000117 Led Zeppelin's first album is released, 1969
00000119 Janis Joplin is born in Port Arthur, Texas, 1943
00000122 Sam Cooke is born in Chicago, 1935
If it were just these three lines, then I'd be in the editor, hand-updating the strings manually (VIM: f/x
% perl -p -e 's/(\d\d)\/(\d\d)(.*)/0000\1\2\3/' < sample
00000117 Led Zeppelin's first album is released, 1969
00000119 Janis Joplin is born in Port Arthur, Texas, 1943
00000122 Sam Cooke is born in Chicago, 1935
But I was looking at the file in VIM, and I knew VIM had the same capability to do search and replace with regular expressions. Unfortunately, entering the above RE in VIM's command-line mode produces a big fat "pattern not found" error.
Here's where habits can sometimes prevent you from learning something new. It would have been easy to "just do it the old way." But learning new things is the essence of keeping your skills sharp. And for me, it's not a new skill either: I know regular expressions. It was a matter of using them inside of a new tool.
So I stared at VIM's pattern.txt help file. Eventually, I found my issue. In VIM (at least with its default "magic" setting), grouping is done with "\(\)", not with unadorned parentheses "()". Backslash! Entering the following into VIM produced the desired results:
%s/^\(\d\d\)\/\(\d\d\)\(.*\)/0000\1\2\3/
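For comparison, here is a rough sketch of the same substitution in a third dialect, Python's re module, which sides with Perl on grouping: parentheses are unadorned. The function name to_pal and the inline sample list are my own, used only for illustration.

```python
import re

# Python's re dialect, like Perl's, groups with plain parentheses;
# Vim's default dialect needs \( and \) for the same thing.
DATE_LINE = re.compile(r"^(\d\d)/(\d\d)(.*)")


def to_pal(line: str) -> str:
    """Rewrite 'MM/DD text' as '0000MMDD text'; other lines pass through unchanged."""
    return DATE_LINE.sub(r"0000\1\2\3", line)


sample = [
    "01/17 Led Zeppelin's first album is released, 1969",
    "01/19 Janis Joplin is born in Port Arthur, Texas, 1943",
    "01/22 Sam Cooke is born in Chicago, 1935",
]

for line in sample:
    print(to_pal(line))
```

The replacement string uses \1, \2 and \3 just as the Perl and VIM versions do; only the grouping syntax in the pattern differs.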
It may seem like a small thing, but learning this is helpful because VIM has pattern highlighting. I can enter the first part of the RE as a search expression in the command line (\d\d\/\d\d) and, with the hlsearch option on, have every match highlighted in the buffer.
This makes VIM a simple RE tester. (Yes, I know there are RE testers on the web, but I've shied away from them.) Moreover, Perl can't give me highlighting, unless I decide to settle for just printing the match (with \1).
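Outside the editor, a crude stand-in for highlighting is to wrap each match in brackets and print the line. The sketch below does that in Python; mark_matches is a hypothetical helper name, and the pattern is written in Python's dialect, so the slash needs no backslash.

```python
import re


def mark_matches(pattern: str, text: str) -> str:
    """Return text with every match of pattern wrapped in square brackets."""
    return re.sub(pattern, lambda m: "[" + m.group(0) + "]", text)


print(mark_matches(r"\d\d/\d\d", "01/17 Led Zeppelin's first album is released, 1969"))
```

It is no substitute for seeing the matches light up as you type, but it does show at a glance which part of each line the pattern grabs.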
My second example is slightly more involved. My file contains lines like this:
1.1.8 Crawl audio/video
1.1.9 Federated search (combining inventory results)
1.1.10 Local Search Results.
I needed these lines to look like this:
Crawl audio/video 1.1.8
Federated search (combining inventory results) 1.1.9
Local Search Results. 1.1.10
Using my newfound knowledge of VIM, I can test my matching RE (1\.1\.\d\d\?) the same way, as a search expression.
Then I group these and do a swap of the grouped expressions:
%s/\(\d\.\d\.\d\d\?\) \(.*$\)/\2 \1/
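The same swap can be sketched in Python to show one more dialect difference: VIM's \d\d\? (a digit followed by an optional second digit) is written \d\d? there, with no backslash before the question mark. The function name move_number_to_end is made up for this example, and I have anchored the pattern at the start of the line, which the VIM command above does not do.

```python
import re

# Group 1: the section number (e.g. 1.1.8 or 1.1.10). Group 2: the rest of the line.
SECTION_LINE = re.compile(r"^(\d\.\d\.\d\d?) (.*)$")


def move_number_to_end(line: str) -> str:
    """Move a leading 'N.N.N' or 'N.N.NN' section number to the end of the line."""
    return SECTION_LINE.sub(r"\2 \1", line)


for line in [
    "1.1.8 Crawl audio/video",
    "1.1.9 Federated search (combining inventory results)",
    "1.1.10 Local Search Results.",
]:
    print(move_number_to_end(line))
```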
So what have we learned? Regular expressions have dialects, and sometimes those dialects can prevent us (at least me) from trying something new. But if you do persist and learn those dialects, you can add to your skills. That's a good thing.
Mastering Regular Expressions by Jeffrey Friedl gets into RE dialects, as does the website regular-expressions.info.