3.8. Parsing messages

The syslog-ng application can separate parts of log messages (i.e., the contents of the $MSG macro) to named fields (columns). These fields act as user-defined macros that can be referenced in message templates, file- and tablenames, etc.

Parsers are similar to filters: they must be defined in the syslog-ng configuration file and used in the log statement.

[Note] Note

The order of filters, rewriting rules, and parsers in the log statement is important, as they are processed sequentially.

To create a parser, define the columns of the message, the delimiter or separator characters, and optionally the characters that are used to escape the delimiter characters (quote-pairs). For the list of parser parameters, see Section 8.6, “Message parsers”.

Declaration:
            parser parser_name {
            csv-parser(column1, column2, ...)
            delimiters()
            quote-pairs()
            };

Column names work like macros. Always use a prefix to identify the columns of the parsers, e.g., MYPARSER1.COLUMN1, MYPARSER2.COLUMN2, etc. Column names starting with a dot (e.g., .HOST) are reserved for use by syslog-ng.

[Example] Example 3.36. Segmenting hostnames separated with a dash

The following example separates hostnames like example-1 and example-2 into two parts.

parser p_hostname_segmentation {
    csv-parser(columns("HOSTNAME.NAME", "HOSTNAME.ID")
    delimiters("-")
    flags(escape-none)
    template("${HOST}"));
};
destination d_file { file("/var/log/messages-${HOSTNAME.NAME:-examplehost}"); };
log { source(s_local); parser(p_hostname_segmentation); destination(d_file);};
[Example] Example 3.37. Parsing Apache log files

The following parser processes the log of Apache web servers and separates them into different fields. Apache log messages can be formatted like:

"%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\" %T %v"

Here is a sample message:

192.168.1.1 - - [31/Dec/2007:00:17:10 +0100] "GET /cgi-bin/example.cgi HTTP/1.1" 200 2708 "-" "curl/7.15.5 (i4 86-pc-linux-gnu) libcurl/7.15.5 OpenSSL/0.9.8c zlib/1.2.3 libidn/0.6.5" 2 example.balabit

To parse such logs, the delimiter character is set to a single whitespace (delimiters(" ")). Whitespaces between quotes and brackets are ignored (quote-pairs('""[]')).

parser p_apache {
    csv-parser(columns("APACHE.CLIENT_IP", "APACHE.IDENT_NAME", "APACHE.USER_NAME",
        "APACHE.TIMESTAMP", "APACHE.REQUEST_URL", "APACHE.REQUEST_STATUS",
        "APACHE.CONTENT_LENGTH", "APACHE.REFERER", "APACHE.USER_AGENT",
        "APACHE.PROCESS_TIME", "APACHE.SERVER_NAME")
         flags(escape-double-char,strip-whitespace)
         delimiters(" ")
         quote-pairs('""[]')
         );
};

The results can be used for example to separate log messages into different files based on the APACHE.USER_NAME field. If the field is empty, the nouser name is assigned.

log { source(s_local);
    parser(p_apache); destination(d_file);};
};
destination d_file { file("/var/log/messages-${APACHE.USER_NAME:-nouser}"); };

Multiple parsers can be used to split a part of an already parsed message into further segments.

[Example] Example 3.38. Segmenting a part of a message

The following example splits the timestamp of a parsed Apache log message into separate fields.

parser p_apache_timestamp {
    csv-parser(columns("APACHE.TIMESTAMP.DAY", "APACHE.TIMESTAMP.MONTH", "APACHE.TIMESTAMP.YEAR", "APACHE.TIMESTAMP.HOUR", "APACHE.TIMESTAMP.MIN", "APACHE.TIMESTAMP.MIN", "APACHE.TIMESTAMP.ZONE")
    delimiters("/: ")
    flags(escape-none)
    template("${APACHE.TIMESTAMP}"));
    };
log { source(s_local);
    log { parser(p_apache); parser(p_apache_timestamp); destination(d_file);};
};

© 2007-2010 BalaBit IT Security
Please send your comments or documentation bugs to: documentation@balabit.com