Tuesday, January 6, 2009

Merge log files and sort the records in Linux

Recently I needed to merge several log files and sort the records by timestamp. Here is what I did: I bulk-renamed the log files so they all had a .txt extension, then ran the following shell script.

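#!/bin/sh

MFILE=merge_file.txt
SFILE=complete_file.txt

# Start with an empty merge file so a rerun does not append to stale data
: > "$MFILE"

for file in *.txt
do
    # Skip the script's own files, which also match *.txt
    case "$file" in "$MFILE"|"$SFILE") continue ;; esac
    cat "$file" >> "$MFILE"
done

# Sort on the 4th whitespace-separated field (the timestamp)
sort -k4,4 "$MFILE" > "$SFILE"
rm "$MFILE"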
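
The bulk rename mentioned at the top can be done with a small loop. This is only a sketch: it assumes the original logs end in .log, and rename_logs is a made-up helper name, not something from my actual setup.

```shell
#!/bin/sh
# Hypothetical helper: rename every *.log file in directory $1 to *.txt.
# Assumption: the original log files use the .log extension.
rename_logs() {
    for f in "$1"/*.log
    do
        # If nothing matches, the pattern stays unexpanded; skip it
        [ -e "$f" ] || continue
        # ${f%.log} strips the trailing .log before .txt is appended
        mv -- "$f" "${f%.log}.txt"
    done
}
```

Calling rename_logs . would rename the logs in the current directory.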


Note:
All the log files are expected to have the same format; otherwise the timestamp will not be in the same field and the sort will not work.

-k, --key=POS1[,POS2] tells sort to use a key that starts at field POS1 and ends at field POS2. With -k4,4 the key is the fourth whitespace-separated field only, which is where the timestamp sits in my log files.
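
To see what -k4,4 does, here is a small sketch with made-up log lines. The line layout (host, app, level, time, message) is an assumption for illustration, not my real log format. Note that the comparison is plain text comparison, so it gives time order only when the timestamp is fixed-width and zero-padded.

```shell
#!/bin/sh
# Hypothetical log lines; the 4th whitespace-separated field is the timestamp.
sample='web1 app INFO 10:15:02 request served
web2 app WARN 09:58:41 slow query
web1 app INFO 10:01:17 cache miss'

# -k4,4 limits the key to field 4; plain -k4 would run from field 4 to end of line.
# LC_ALL=C makes the comparison byte-wise, independent of the locale.
sorted=$(printf '%s\n' "$sample" | LC_ALL=C sort -k4,4)
printf '%s\n' "$sorted"
```

The lines come out in the order 09:58:41, 10:01:17, 10:15:02.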

2 comments:

Anonymous said...

find/xargs is your friend. You could do away with your ls/for loop like:

find . -name '*.txt' | xargs sort -m -k4,4 > $SFILE

Then you wouldn't need to be forever deleting $MFILE, and you'll potentially do a lot less work.

-M-ric said...

What about timestamp wrapping? If your timestamp reaches its max value and wraps to 0, how do you handle the sorting after that?