Archive for the ‘Testing’ Category
Postfix Logs: Bash
My company uses Postfix mail server to send mail.
Recently some users complained that they have received the same e-mail multiple times. I have inspected our logs and confirmed the problem. In short, the problem was that all messages should have unique message-id. We had a bug. While sending the same message to multiple people (for example in a mailing list), all messages had the same message-id and that caused some mails to be sent multiple times.
The developers fixed the problem and I wanted to make sure it is really fixed. After checking in a virtual machine we use for testing we were pretty sure the problem is fixed. All messages had unique message-id, even if the same mail was sent to multiple people. We have updated the code in production. After a few days I wanted to check Postfix logs to make sure the problem is really fixed.
I have found good (and short) article on Postfix logs, Troubleshooting with Postfix Logs that should provide enough information to get you started if you are new to Postfix. In short, Postfix has great logs. You will find everything you need there.
To further improve logs, we use variable envelope return path (VERP) to track all mail. In short, VERP makes it possible to know exactly which message bounced. Our VERP looks like this: vyx1psnf=qf3kay0g=user=example.net@example.com. We have sent vyx1psnf=qf3kay0g message (our tracking ID) to user@example.net. We own example.com domain and bounced mail will be returned there. (Please notice that example.net and example.com are not the same domain.) When we get mail sent to vyx1psnf=qf3kay0g=user=example.net@example.com we know it is a bounce and we know exactly which mail has bounced.
Let’s take a look at the logs. I have just checked logs for the last two months and files for each day are from 64 to 358 MB in size. I have picked one file, Mac OS says it is 109.1 MB on disk. We will call the file log.
How does the file look like? (Some portions are anonymized.)
$ cat log | head Dec 19 00:06:46 cdn postfix/qmgr[11196]: 051452B98BB: from=<19=zrnapvcy=user=example.net@example.com>, size=290769, nrcpt=1 (queue active) Dec 19 00:06:46 cdn postfix/qmgr[11196]: 0E5B82B8EE4: from=<cqpygzdq=5wtz5664=user=example.net@example.com>, size=30125, nrcpt=1 (queue active) ...
(A big thank you to Davor Banović for helping me with Bash.)
Every line in log file has four parts. Let’s inspect the first line in log in detail.
| date and time | Dec 19 00:06:46 |
| host name | cdn |
| Postfix component ID | postfix/qmgr[11196]: |
| message | 051452B98BB: from=<19=zrnapvcy=user=example.net@example.com>, size=290769, nrcpt=1 (queue active) |
We are only interested in message, and it is freeform.
Let's get some statistics about the log file before we dive into one mail.
$ ls -l log -rw-r--r-- 1 zeljko staff 109127473 Dec 20 11:55 19
Google says 109 127 473 bytes = 104.072068 megabytes.
All lines start with Dec so we can count number of lines.
$ grep -c Dec log 791311
So, 791k lines. 154k lines report status (about 19%).
$ grep -c status= log 153573
There are four statuses that are possible: deferred, sent, expired and bounced. Let's see how many of each there is.
$ grep -c status=deferred log 56609 $ grep -c status=sent log 92695 $ grep -c status=expired log 978 $ grep -c status=bounced log 3291
So, 56k deferred (37% of all statuses), 93k sent (60%), 1k expired (1%), 3k bounced (2%).
How many lines are there that contain VERP in from header and how many lines contain message-id?
$ grep -c ": from=<.*=.*=.*@.*>, " log 149133 $ grep -c message-id log 142528
(A big thank you to Rubular for helping me with regular expressions.)
Similar numbers. 149k lines contain VERP in from header (19% of all lines), 143k lines contain message-id (18%).
4k lines contain VERP in to header (1%). Those are bounces, remember?
$ grep -c ": to=<.*=.*=.*@.*>, " log 4279
There are 153k VERPs in either from or to headers, and VERPs appear only in mentioned headers and nowhere else.
$ grep -c ": \w*=<.*=.*=.*@.*>, " log 153412 $ grep -c "<.*=.*=.*@.*>, " log 153412
Enough of statistics, let's inspect one mail. A few lines from the log look like this.
$ cat log ... Dec 19 12:25:07 cdn postfix/qmgr[11196]: 309332B8F9A: from=<vyx1psnf=qf3kay0g=user=example.net@example.com>, size=13803, nrcpt=1 (queue active) Dec 19 12:25:07 cdn postfix/cleanup[31905]: 321512B8F5A: message-id=<801daf8a02064f0789a28289dd0fde67@example.com> Dec 19 12:25:07 cdn postfix/smtpd[31881]: 63CBD2B8EF4: client=example.com[127.0.0.1] Dec 19 12:25:07 cdn postfix/cleanup[31807]: 3233C2B8FA0: message-id=<4bad537e1cb14b9b86f693668a2a56ed@example.com> Dec 19 12:25:07 cdn postfix/smtp[31781]: 419122B8FC8: to=<user@example.net>, relay=127.0.0.1[127.0.0.1]:10027, conn_use=3, delay=0.54, delays=0.41/0/0/0.13, dsn=2.0.0, status=sent (250 2.0.0 Ok: queued as 8D2632B8F9A) ...
As you can see, it is a mess because Postfix logs stuff as it happens, and a lot of things happen almost at the same time. Since we want to know if any mail is sent multiple times, for now we are only interested in lines that have text status=sent, so we will start there.
$ grep status=sent log ... Dec 19 12:25:07 cdn postfix/smtp[31781]: 419122B8FC8: to=<user@example.net>, relay=127.0.0.1[127.0.0.1]:10027, conn_use=3, delay=0.54, delays=0.41/0/0/0.13, dsn=2.0.0, status=sent (250 2.0.0 Ok: queued as 8D2632B8F9A) ...
We need ID (419122B8FC8) from that line.
$ echo "Dec 19 12:25:07 cdn postfix/smtp[31781]: 419122B8FC8: to=<user@example.net>, relay=127.0.0.1[127.0.0.1]:10027, conn_use=3, delay=0.54, delays=0.41/0/0/0.13, dsn=2.0.0, status=sent (250 2.0.0 Ok: queued as 8D2632B8F9A)" | cut -d " " -f6 | cut -d ":" -f1 419122B8FC8
So, mail with Postfix ID 419122B8FC8 is sent.
$ grep 419122B8FC8 log Dec 19 12:25:07 cdn postfix/smtpd[31793]: 419122B8FC8: client=example.com[127.0.0.1] Dec 19 12:25:07 cdn postfix/cleanup[31809]: 419122B8FC8: message-id=<a64ea1f30f964980a4c27593daaa9a8e@example.com> Dec 19 12:25:07 cdn postfix/qmgr[11196]: 419122B8FC8: from=<vyx1psnf=qf3kay0g=user=example.net@example.com>, size=13803, nrcpt=1 (queue active) Dec 19 12:25:07 cdn postfix/smtp[31781]: 419122B8FC8: to=<user@example.net>, relay=127.0.0.1[127.0.0.1]:10027, conn_use=3, delay=0.54, delays=0.41/0/0/0.13, dsn=2.0.0, status=sent (250 2.0.0 Ok: queued as 8D2632B8F9A) Dec 19 12:25:07 cdn postfix/qmgr[11196]: 419122B8FC8: removed
We also need message-id (a64ea1f30f964980a4c27593daaa9a8e@example.com), VERP (from=<vyx1psnf=qf3kay0g=user=example.net@example.com>) and relay (127.0.0.1).
$ grep 419122B8FC8 log | grep message-id | cut -d "<" -f2 | cut -d ">" -f1 a64ea1f30f964980a4c27593daaa9a8e@example.com $ grep 419122B8FC8 log | grep ": from=<.*=.*=.*@.*>, " | cut -d "<" -f2 | cut -d ">" -f1 vyx1psnf=qf3kay0g=user=example.net@example.com $ grep 419122B8FC8 log | grep relay= | cut -d "=" -f3 | cut -d "[" -f1 127.0.0.1
Now we have all data we need.
| VERP | vyx1psnf=qf3kay0g=user=example.net@example.com |
| message-id | a64ea1f30f964980a4c27593daaa9a8e@example.com |
| Postfix ID | 419122B8FC8 |
| relay | 127.0.0.1 |
How many times do message-id and VERP appear in the file?
$ grep vyx1psnf=qf3kay0g=user=example.net@example.com Dec 19 12:25:07 cdn postfix/qmgr[11196]: 419122B8FC8: from=<vyx1psnf=qf3kay0g=user=example.net@example.com>, size=13803, nrcpt=1 (queue active) Dec 19 12:25:07 cdn postfix/qmgr[11196]: 8D2632B8F9A: from=<vyx1psnf=qf3kay0g=user=example.net@example.com>, size=14415, nrcpt=1 (queue active) $ grep a64ea1f30f964980a4c27593daaa9a8e@dgroups.org log Dec 19 12:25:07 cdn postfix/cleanup[31809]: 419122B8FC8: message-id=<a64ea1f30f964980a4c27593daaa9a8e@example.com> Dec 19 12:25:07 cdn dkimproxy.out[32050]: DKIM signing - signed; message-id=<a64ea1f30f964980a4c27593daaa9a8e@example.com>, signer=<@example.com>, from=<user@example.net> Dec 19 12:25:07 cdn postfix/cleanup[31930]: 8D2632B8F9A: message-id=<a64ea1f30f964980a4c27593daaa9a8e@example.com>
Looks like 419122B8FC8 and 8D2632B8F9A are connected, since they always appear together.
$ grep 419122B8FC8 log Dec 19 12:25:07 cdn postfix/smtpd[31793]: 419122B8FC8: client=example.com[127.0.0.1] Dec 19 12:25:07 cdn postfix/cleanup[31809]: 419122B8FC8: message-id=<a64ea1f30f964980a4c27593daaa9a8e@example.com> Dec 19 12:25:07 cdn postfix/qmgr[11196]: 419122B8FC8: from=<vyx1psnf=qf3kay0g=user=example.net@example.com>, size=13803, nrcpt=1 (queue active) Dec 19 12:25:07 cdn postfix/smtp[31781]: 419122B8FC8: to=<user@example.net>, relay=127.0.0.1[127.0.0.1]:10027, conn_use=3, delay=0.54, delays=0.41/0/0/0.13, dsn=2.0.0, status=sent (250 2.0.0 Ok: queued as 8D2632B8F9A) Dec 19 12:25:07 cdn postfix/qmgr[11196]: 419122B8FC8: removed $ grep 8D2632B8F9A log Dec 19 12:25:07 cdn postfix/smtpd[31842]: 8D2632B8F9A: client=example.com[127.0.0.1] Dec 19 12:25:07 cdn postfix/cleanup[31930]: 8D2632B8F9A: message-id=<a64ea1f30f964980a4c27593daaa9a8e@example.com> Dec 19 12:25:07 cdn postfix/qmgr[11196]: 8D2632B8F9A: from=<vyx1psnf=qf3kay0g=user=example.net@example.com>, size=14415, nrcpt=1 (queue active) Dec 19 12:25:07 cdn postfix/smtp[31781]: 419122B8FC8: to=<user@example.net>, relay=127.0.0.1[127.0.0.1]:10027, conn_use=3, delay=0.54, delays=0.41/0/0/0.13, dsn=2.0.0, status=sent (250 2.0.0 Ok: queued as 8D2632B8F9A) Dec 19 12:25:09 cdn postfix/smtp[31964]: 8D2632B8F9A: to=<user@example.net>, relay=example.net[127.0.0.1]:25, delay=1.9, delays=0.13/0/0.43/1.3, dsn=2.0.0, status=sent (250 OK id=1PUIFI-0002Yf-TL) Dec 19 12:25:09 cdn postfix/qmgr[11196]: 8D2632B8F9A: removed
So, 419122B8FC8 used relay 127.0.0.1 (localhost) and 8D2632B8F9A used relay example.net. That is normal. Mail is first sent to 127.0.0.1 (localhost) and then to destination mail server (example.net). It is just the way things are set up. Maybe I get into more detail in another blog post.
We have more data now, and everything looks fine so far.
| VERP | vyx1psnf=qf3kay0g=user=example.net@example.com |
| message-id | a64ea1f30f964980a4c27593daaa9a8e@example.com |
| Postfix ID 1 | 419122B8FC8 |
| relay 1 | 127.0.0.1 |
| Postfix ID 2 | 8D2632B8F9A |
| relay 2 | example.net |
Now, let's finally inspect the whole file and find out if any mail is sent multiple times. There are more than one way to do it. A couple of them are:
- Get all lines with status
sent. From each line grab Postfix ID. Get all lines with the ID. Get message-id, relay and VERP from lines with the ID. Ignore if VERP is intoheader. (Those are bounces. We are interested only in VERPs infromheader.) - Get all lines with a VERP. From each line grab Postfix ID. Get all lines with the ID. Get message-id, relay and status from lines with the ID. Ignore if status is not
sent.
Since I have no idea how would I do that in Bash, I will use Ruby for that. This post got pretty long, that will be covered in another post.
Test Like You Do Not Need the Money
A few days ago it was exactly 5 years since I started working as a software tester. I planned to publish this on the very date, but life happened.
What happened in the last five years? Am I smarter? Or just older? Let’s find out.
Personal
In the last five years I have upgraded LifeOS™ from 26.0 to 31.0, upgraded Girlfriend 5.0 application to Wife 4.0 and installed Son 1.5. I do not use DistanceRunning application so much any more, but I do my best. Drums 1.0 are lost somewhere and Guitar 1.0 is mainly collecting digital dust. No so sure about smarter, but definitely older.
Smarter vs. Older – 0:1
Professional

Five years ago I had a job I did not like. I thought software (programming to be precise) is so cool, and I wished I could develop it for a living. No luck back then.

I still think software is lots of fun, and I live in code as much as I can. But, now I think software testing is even cooler, and I know I was born to be a tester. Not only do I have a job I would do even if I did not have to work, but I get to work from home. A dream I had since I was just a little kid.
Smarter vs. Older – 1:1
Programming Languages

My BASIC, FORTRAN and PHP skills declined a bit in the last five years, but I am sure I would be up and running in minutes if I ever needed them. (Looks like I had a thing for UPPERCASE languages back then.)
I have picked up some Ruby and Watir in the last few years, and I even made it to the Watir core team! Not bad for a simple tester from Croatia, right?!
Smarter vs. Older – 2:1
Web
Five years ago, I had a crappy site with a crappy URL. (In Croatian only. Some stuff is missing because it is archive.org snapshot, but it did not look much better anyway.)
Today, I am all over the place. (I even have a place where I try to aggregate my web presence.) I have a great site with great URL. Five years ago podcasts just started to appear, now I have one.
Smarter vs. Older – 3:1
Watir

I already mentioned Watir, a tool that I use almost every day, and it makes my life much easier. Big portion of this site is dedicated to it. I like it so much I talk about it to anybody that would listen. I even flew across the ocean to meet with people that like Watir.
Conclusion
It really takes a lot of time to write a blog post. No wonder I do not write so much.
Let’s take a look at the final score:
Smarter vs. Older – 3:1
Looks like Smarter won. Maybe another jury would vote differently.
If you remember only one thing from this post, I hope it will be the title:
Test like you do not need the money.
How Testers Look Like?
I read a few blogs. I monitor a few mailing lists. When I read a post, I like to know how that person looks like. I guess that I am one of those visual types. (My image is at every page of this blog.) And sometimes it is not an easy task to find it out. So, I found images of a few testers (not in any specific order):
- Michael J. Hunter
- Jonathan Kohl
- Elisabeth Hendrickson
- Cem Kaner
- Brian Marick
- Bret Pettichord
- James Bach
- Michael Bolton (not so sure about him)
I tagged them as tester at my delicious account (del.icio.us/zeljko.filipin/tester). I will be adding more, as I find more images. If you are a tester, or you have found images of other testers, leave a link to image in comments.



