Tuesday, August 05, 2008

Editing a dump file from the airport

Keith Murphy asks if you have ever edited your dump files before feeding them to the database. And he recommends using sed. This reminds me of a war story, when I was consulting in Milan. The customer needed, among other things, to import a huge text file (created by a third party application) into a MySQL database. I did everything, including a cron job to load the data every day. By the time I left for the airport, the third party application hadn't yet provided its first sample, but since we knew the fields list, it was a simple case of creating a LOAD DATA INFILE command.
I was boarding my plane, when the customer called, saying that the application was giving dates in the wrong format. Instead of 'YYYY-MM-DD', it was using the Italian human readable format 'DD/MM/YYYY'. No Internet was available, and with just a few minutes before being confined to the airplane, I told the customer "I will send you a text message. Just pipe this command before feeding the file to the database."
And I sent this message:
perl -pe 's{(\d\d)/{\d\d}/{\d\d\d\d}}{$3-$2-$1}g'
I took a picture of my phone before sending the message.

When I landed, I got a text message from the customer. Everything worked fine.
So, that was when I edited 2 GB data file from the airport, using a mobile phone.


JDD said...

That's a fantastic story! (And I would have needed a Perl regular expressions book in front of me to get it right)

qu1j0t3 said...

This can also be done in the LOAD DATA INFILE statement. Here's an example that converts Apache log format dates:

load data infile 'x.log'
into table logs
fields terminated by ' ' optionally enclosed by '\"'
(@ip, ident, usr, @reqtime, tz, req, status, size, rfrr, agent)
set ip=inet_aton(@ip), reqtime=str_to_date(@reqtime, '[%d/%b/%Y:%H:%i:%S')

Giuseppe Maxia said...

well said.
But, it would have been too much typing by phone ! :)