syslog (-ng) to postgreSQL database howto


May 15th, 2002
2002-05-15
by Ben Russo

COPYRIGHT: You may copy, modify redistribute, quote or use this document however you want, I would be happy if you mentioned my name as a source.


THANKS

Thanks to all the programmers and people who have contributed their time, skills and passion to open source projects.

AUDIENCE

This document is intended to be read by a UNIX System Administrator with at least a few years of Enterprise level experience. I only say this because I will try not to explain everything to the finest detail, I expect the reader to have a good grasp on the basics of syslog and IP networking and RedHat Linux system administration and database theory.

GOAL

The goal of this document is to provide a procedure for creating a centralized SYSLOG "sink" that imports syslog messages into a postgreSQL database and provides ODBC access to that database. If you don't realize why this is valuable then you probably don't need to realize it and therefore shouldn't worry about it. On the other hand if you have ever been an overworked sysadmin with a large number of servers and networked equipment to manage, then a centralized syslog server can be a help. Combining that with an SQL database and user tools like pgadmin or MS Access and it is a major relief!

I got this to work nicely with PostgreSQL, and chose to use postgres because of many features it has that mySQL does not (and for licensing reasons). However, the procedure I have outlined here could easily be adapted for any database system. Such as Oracle, MySQL, Sybase, DB2, Access (etc).


HOWTO

Set up a PC with the latest version of RedHat Linux and all the patches. You can set it up however you like best but make sure that you at least install the following:

After installing your box, make sure that it is on the network Set up bind to be started at boot time as a cacheing nameserver (this will help to speed local applications like syslog-ng that do lots of name lookups, and it will remove some load on your DNS servers). I understand that syslog-ng has some name cacheing built in, but I don't know that much about it, and having a cacheing DNS server locally can't hurt. :)

Make sure that the systems defualt timezone is set for what you want syslog-ng to timestamp it's messages with. This is important for this box because all our syslogs will be coming from different zones. When we insert the records into the database we will use an ISO time value that includes GMT offset, so that won't matter too much. But for debugging purposes when looking at text it helps not to get confused.

Get the latest tarball for libol and syslog-ng from the balabit website (where syslog-ng is from). use "rpm -tb " on the tarballs (you will need a few of the development tools and compilers and devel libs and rpm-build installed, as well as the kernel headers). You could compile the source code and install them with the install scripts if you wanted to, but I like RPM's.

The rpm -tb command will create i386.rpm binary install packages for libol and syslog-ng and put them in /usr/src/redhat/RPMS/i386 use "rpm -Uvh" on the newly created rpms to install them. Then "cd /usr/etc" and "mkdir /etc/syslog-ng; ln -s ../../etc/syslog-ng syslog-ng" in the /etc/syslog-ng directory place the following into syslog-ng.conf. This is a sample, you can certainly modify it to meet your needs.

*******************************BEGIN FILE***************************
# syslog-ng configuration file for RedHat 7.2
#
# See syslog-ng(8) and syslog-ng.conf(5) for more information.
#
# Originall: 20000925 "gb at sysfive.com"
# Modified: 20020424 "ben at umialumni.com"
#
options { sync (0);
   time_reopen (10);
   log_fifo_size (100);
   long_hostnames (on);
   use_dns (yes);
   use_fqdn (yes);
   create_dirs (yes);
   keep_hostname (yes);
   time_reap(12);
   gc_busy_threshold(768);
   };
###############################################################################
source s_sys { unix-stream ("/dev/log"); internal(); };
   source s_udp { udp(); };
   source s_tcp { tcp(); };
   source s_local { internal(); };
###############################################################################
# destination d_files { file("/var/log/syslog-ng/$HOST/$FACILITY.$LEVEL"
# template("$ISODATE $HOST $FACILITY $PRIORITY $PROCESS $MSG\n"));
# };
destination d_local { file("/var/log/syslog-ng.$MONTH"); };
destination d_msgs { file("/var/log/messages"
   template("$FULLDATE $HOST $FACILITY.$PRIORITY $MSG\n"));
   };
destination d_mail { file("/var/log/maillog"
   template("$FULLDATE $HOST mail.$PRIORITY $MSG\n"));
   };
destination d_secure { file("/var/log/secure"
   template("$FULLDATE $HOST authpriv.$PRIORITY $MSG\n"));
   };
destination d_cron { file("/var/log/cron"
   template("$FULLDATE $HOST cron.$PRIORITY $MSG\n"));
   };
destination d_bootlog { file("/var/log/boot.log"
   template("$FULLDATE $HOST local7.$PRIORITY $MSG\n"));
   };
#destination d_tty12 { pipe("/dev/tty12"
# template("$FULLDATE $FACILITY.$PRIORITY $MSG\n"));
# };
destination d_postgres { file("/spooldir/syslog2pgsql/fulllog.$YEAR.$MONTH.$DAY.$HOUR.$MIN.$SEC"
   template("INSERT INTO msg_table VALUES \( '$R_ISODATE', '$S_ISODATE', '$HOST',    '$FACILITY', '$PRIORITY', '$MSG'\)\;\n")
   template_escape(yes)
   owner(postgres));
   };
################################################################################
################################################################################
filter f_msgs { host("syslogdb")
   and level(warning,err,crit,alert,emerg)
   and not facility(mail,authpriv,cron);
   };
filter f_mail { host("syslogdb")
   and facility(mail);
   };
filter f_cron { host("syslogdb")
   and facility(cron);
   };
filter f_secure { host("syslogdb")
   and facility(authpriv);
   };
filter f_bootlog { host("syslogdb")
   and facility(local7);
   };
# filter f_tty12 { host("syslogdb")
# 			   and level(crit,alert,emerg);
#                 };
## Following is a pretty complex filter that you could modify to fit your tastes.
##  I just use it to filter out syslog messages that happen all the time that
##  I don't want to get inserted into my database.


filter f_postgres { not(
(host("syslogdb") and facility(cron) and level(info))
or (facility(user) and level(notice)
and ( match(" gethostbyaddr: ")
or match("last message repeated ")
)
)
or ( facility(local3) and level(notice)
and match(" SYSMON NORMAL "))
or ( facility(mail) and level(warning)
and match(" writable directory")
)
or ( ( host("dbserv1.somecompany.com")
or host("dbserv2.somecompany.com")
)
and facility(auth) and level(info)
and match("su oracle") and match(" succeeded for root on /dev/")
)
); }; ############################################################################## ##############################################################################
# log { source(s_sys); source(s_udp); destination(d_files); };
log { source(s_sys); source(s_udp); filter(f_msgs); destination(d_msgs); };
log { source(s_sys); source(s_udp); filter(f_mail); destination(d_mail); };
log { source(s_sys); source(s_udp); filter(f_secure); destination(d_secure);    };
log { source(s_sys); source(s_udp); filter(f_cron); destination(d_cron); };
log { source(s_sys); source(s_udp); filter(f_bootlog); destination(d_bootlog);    };
# log { source(s_sys); source(s_udp); filter(f_tty12); destination(d_tty12);    };
# This allows you to hit "ALT-F12" on a Linux console and get
# all the important log messages scrolling up your screen.
log { source(s_sys); source(s_udp); source(s_tcp); filter(f_postgres); destination(d_postgres);    };
log { source(s_local); destination(d_local); };

**************************END OF FILE**************************************
then set up your syslog-ng init script in /etc/rc.d/init.d
like so:
**************************BEGIN FILE****************************************

#!/bin/bash
#
# syslog-ng       Starts syslog-ng/klogd.
#
#
#!/bin/bash
#
# syslog-ng       Starts syslog-ng/klogd.
#
#
# chkconfig: 2345 12 88
# description: Syslog-ng is the facility by which many daemons use to log \
# messages to various system log files.  It is a good idea to always \
# run syslog-ng.

# Source function library.
. /etc/init.d/functions
:

################################################################################
# configuration
#
INIT_PROG="/sbin/syslog-ng"    # Full path to daemon
INIT_OPTS=""                      # options passed to daemon

PATH=/bin:/sbin:/usr/bin:/usr/sbin

INIT_NAME=`basename "$INIT_PROG"`

# Uncomment this if you are on Redhat and think this is useful
#
# . /etc/sysconfig/network
#
# if [ ${NETWORKING} = "no" ]
# then
#       exit 0
# fi

RETVAL=0

umask 077
ulimit -c 0

# See how we were called.
case "$1" in
  start)
        export TZ=UCT
        echo -n "Starting $INIT_NAME: "
        daemon --check $INIT_PROG "$INIT_PROG $INIT_OPTS"
        RETVAL=$?
        echo -n "Starting Kernel Logger: "
        [ -x "/sbin/klogd" ] && daemon klogd
        echo
        [ $RETVAL -eq 0 ] && touch "/var/lock/subsys/${INIT_NAME}"
        ;;
  stop)
        echo -n "Stopping $INIT_NAME: "
        killproc $INIT_PROG
        RETVAL=$?
        echo -n "Stopping Kernel Logger: "
        [ -x "/sbin/klogd" ] && killproc klogd
        echo
        [ $RETVAL -eq 0 ] && rm -f "/var/lock/subsys/${INIT_NAME}"
        ;;
  status)
        status $INIT_PROG
        RETVAL=$?
        ;;
  restart|reload)
        $0 stop
        $0 start
        RETVAL=$?
        ;;
  *)
        echo "Usage: $0 {start|stop|status|restart|reload}"
        exit 1
esac

exit $RETVAL


********************************END OF FILE*****************************

Disable the default syslog daemon installed by RedHat. use chkconfig to get rid of it's sysVinit links start up syslog-ng! use chkconfig to make sure that the proper SysVinit links are created and that syslog-ng is configured to be started when the box is. Also use chkconfig to make sure that postgresql is configured to start when the system is.

Copy /var/lib/pgsql/data/pg_hba.conf to /var/lib/pgsql/data/pg_hba.conf.orig
Edit /var/lib/pgsql/data/pg_hba.conf so that it contains the following lines:

**************BEGIN**************
local syslog-ng trust admins
local all md5
host all 172.16.0.0 255.240.0.0 md5
host all 127.0.0.1 255.255.255.255 md5
*************END******************
echo "postgres" > /var/lib/pgsql/data/admins

Make sure that the following lines are uncommented in /var/lib/pgsql/data/postgresql.conf

port = 5432
tcpip_socket = true
fsync = false
shared_buffers=256

If you are not going to monitor the box to make sure that the log files under /var/log and /var/lib/pgsql/syslog don't grow too big then you should write some scripts to clean them up automatically. I plan on doing this eventually, but at this stage in the development of the system I want to watch it manually until I figure out what is going on.Now lets reboot the box and make sure of a few things.

So far when the box boots only

should be listening on the network NTP should sync up with the time servers. Bind should be forward only and the local system should resolv from itself.
SSH & SCP should be accessible from the network. cron should be running. postmaster should be accepting connections from other psql boxes.

Now, run "su - postgres" and run the command: "createdb syslog-ng"
   Then run "psql -U postgres -d syslog-ng"
   You should now be at an SQL terminal prompt,
   and you need to run the following commands:
 
CREATE sequence msg_seq_num;

CREATE TABLE msg_table (
   msg_rcv_time timestamp,
   msg_sent_time timestamp,
   hostname varchar(256),
   msg_facility varchar(8),
   msg_priority varchar(8),
   msg_text varchar(2048),
   msg_id integer default nextval('msg_seq_num')
   );
  
ALTER USER postgres with encrypted    password 'enter password here'
   \q

 

Now create the file called /var/lib/pgsql/syslog/run-syslog2pgsql-insert.sh
with this content. I am not the worlds best programmer, so I'm sure there could be improvements made here, but this works for me. Note that you can cause this script to perform certain actions by touching $WORKDIR/.insert-(action name) files. I find this very helpful for integrating with other scripts.

**************************************BEGIN FILE *****************************


#!/bin/bash
#
#       run-syslog2pgsql-insert.sh
#       23-April-2002   by Ben Russo
#
#       This script makes sure that the syslogs get
#       into the database.
#       It is designed to be started by CRON periodically.
#       I would run it every minute, or at least every
#       every few minutes.
#
DATADIR="/spooldir/syslog2pgsql"
WORKDIR="/var/lib/pgsql/syslog"
LOGFILE="$WORKDIR/syslog2pgsql-insert.log"
TZ=UCT
export DATADIR WORKDIR TZ
#
#       Here we are going to make sure there isn't
#       already an instance of run-syslog2pgsql-insert.sh
#       that is running.
#
if [ -f $WORKDIR/.syslog2pgsql-insert.pid ]
    then
      OLDPID=`cat $WORKDIR/.syslog2pgsql-insert.pid`
      NUMPROCS=`ps -e | grep $OLDPID | grep run-syslog2 | wc -l`
      if [ $NUMPROCS -gt 0 ]
       then
        exit 0
      fi
fi
#
#       If this script has run this far then there should
#       not be another instance of run-syslog2pgsql-insert.sh
#       running, therefore let's make a PID file and do it.
#
echo $$ > $WORKDIR/.syslog2pgsql-insert.pid
#
#       Now start an endless loop that looks for control files.
#
while true
 do
   DATE=`date`
   if [ -f $WORKDIR/.insert-die ]
    then
      echo "=die=====  $DATE $WORKDIR/.insert-die file found, exiting." >> $LOGFILE
      exit 0
   fi
   if [ -f $WORKDIR/.insert-restart ]
    then
      echo "=restart=  $DATE $WORKDIR/.insert-restart file found, exiting." >> $LOGFILE
      echo "=restart=  $DATE Deleteing .insert-restart file." >> $LOGFILE
      echo "=restart=  $DATE Expecting cron to restart this script." >> $LOGFILE
   fi
   if [ -f $WORKDIR/.insert-pause ]
    then
      echo "=pause===  $DATE $WORKDIR/.insert pause file found." >> $LOGFILE
      echo "=pause===  $DATE sleeping 3 secs." >> $LOGFILE
      sleep 3
    else
       FILELIST=`find $DATADIR -name "fulllog.2[0-9][0-9][0-9].[0-1][0-9].[0-3][0-9].[0-2][0-9].[0-5][0-9].[0-6][0-9]"`
       usleep 999997

       for i in $FILELIST
          do
            cat $i | psql -U postgres -d syslog-ng >> $LOGFILE 2>&1
            DATE=`date`
            echo "========= $DATE finished $i" >> $LOGFILE
            rm -f $i
          done

     fi
done
#
#       END OF THE SCRIPT
#

***************************************END FILE*********************************
Make sure that everything under /var/lib/pgsql/ is owned by postgres. Add the following line to the postgres users crontab:

* * * * * /var/lib/pgsql/syslog/run-syslog2pgsql-insert.sh >> /var/lib/pgsql/syslog/syslog2pgsql-insert.log    2>&1
 

Next, install the PsqlODBC driver on your windows desktop PC and install the "pgadmin" utility.

Open your control panel
Open the ADMINISTRATIVE TOOLS sub-panel
Double click on the "DATA SOURCES (ODBC)" Icon
Select the System DSN tab from the window that appears.
Click on the "ADD" button.
Select PostgreSQL
Enter "syslog-ng" (or whatever the name of the database was that you created)
Enter the host name of your database server
The port should be 5432
Enter a user name and password that you created for the database.
Click OK, click OK again, close your control panel windows.

Now you can run PGADMIN from the windows start menu ICON. In this utility you can add users and groups and set
permissions for the tables and columns. You can use the query wizard to generate and save queries and direct their output to Excell, HTML, TXT or Screen. Create the "guest" "guest" user after logging in as the postgres user. Also create the guest account. Right click on your msg_table and select properties, and on the Security tab give the guest group. Give the guest group the ability to "SELECT" only.

Now open up MS Access, open a blank database.
From the FILE menu select GET EXTERNAL DATA, LINK TABLES
In the filebrowser window that pops up you need to scroll down in the "FILES OF TYPE" window to ODBC Databases()
Then in the "Select Data Sources" window that appears go to the "Machine Data Source" window and scroll down to PostgreSQL
Then select the msg_table and click "OK"
Then select the "oid" field (or the "msg_id" field) as your Unique record Identifier and click OK.
Now you can use MS-Access to do queries and searches and reports.

 

All done....