8.6. Message parsers

The syslog-ng application can split parts of log messages (that is, the contents of the $MSG macro) into named fields (columns). These fields act as user-defined macros that can be referenced in message templates, file names, table names, and so on.

To create a parser, define the columns of the message, the delimiter or separator characters, and optionally the characters that are used to escape the delimiter characters (quote-pairs).

Declaration:
parser parser_name {
    csv-parser(columns(column1, column2, ...)
    delimiters()
    quote-pairs()
    );
};

Column names work like macros. Always use a prefix to identify the columns of the parsers, e.g., MYPARSER1.COLUMN1, MYPARSER2.COLUMN2, etc. Column names starting with a dot (e.g., .HOST) are reserved for use by syslog-ng.
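
As a minimal sketch of this naming convention, the two parsers below each use their own prefix, so their columns cannot collide when both appear in the same log path. The parser names p_first and p_second, the delimiters, and the column names are hypothetical and chosen only for illustration; s_local and d_file are assumed to be defined elsewhere in the configuration.

```
# Hypothetical parsers: each one uses its own prefix for its columns.
parser p_first {
    csv-parser(columns("MYPARSER1.COLUMN1", "MYPARSER1.COLUMN2")
    delimiters(",")
    );
};

# The second parser processes a column created by the first one.
parser p_second {
    csv-parser(columns("MYPARSER2.COLUMN1", "MYPARSER2.COLUMN2")
    delimiters(";")
    template("${MYPARSER1.COLUMN2}")
    );
};

log { source(s_local); parser(p_first); parser(p_second); destination(d_file); };
```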

Name | Synopsis | Description
csv-parser() | csv-parser(columns("PARSER.COLUMN1", "PARSER.COLUMN2", ...)) | Specifies the type of parser to use and the names of the columns that the message is split into. Currently only csv-parser is implemented, which separates columns based on delimiter characters and strings.
delimiters() | delimiters("<delimiter_characters>") | The character or characters that separate the columns in the message.
flags() | flags(escape-none, escape-backslash, escape-double-char, strip-whitespace) | Escaping rules used by the parser. The strip-whitespace flag removes leading and trailing whitespace from the columns.
quote-pairs() | quote-pairs('<quote_pairs>') | List the quote-pairs between single quotes. Delimiter characters enclosed between quote characters are ignored. Note that the opening and closing quote characters do not have to be identical, e.g., [} can also be a quote-pair.
template() | template("${<macroname>}") | The macro that contains the part of the message that the parser will process. It can also be a macro created by a previous parser of the log path. By default, this is empty and the parser processes the entire message.

Table 8.20. Parser parameters
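
The following sketch shows how these parameters might be combined. It assumes a hypothetical comma-separated message such as alice,login,"failed, bad password", where the last field is quoted because it contains the delimiter. The parser name and the MYAPP prefix are illustrative assumptions, not part of any shipped configuration.

```
# Hypothetical parser for messages like: alice,login,"failed, bad password"
parser p_myapp {
    csv-parser(columns("MYAPP.USER", "MYAPP.ACTION", "MYAPP.DETAILS")
    delimiters(",")       # columns are separated by commas
    quote-pairs('""')     # commas between double quotes are not delimiters
    flags(escape-none)    # no escaping rules are applied
    );
};
```

With these settings, the comma inside the quoted part should stay within the third column instead of starting a fourth one.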


Example 8.26. Segmenting hostnames separated with a dash

The following example separates hostnames like example-1 and example-2 into two parts.

parser p_hostname_segmentation {
    csv-parser(columns("HOSTNAME.NAME", "HOSTNAME.ID")
    delimiters("-")
    flags(escape-none)
    template("${HOST}"));
};
destination d_file { file("/var/log/messages-${HOSTNAME.NAME:-examplehost}"); };
log { source(s_local); parser(p_hostname_segmentation); destination(d_file);};

Example 8.27. Parsing Apache log files

The following parser processes Apache web server logs and separates each message into fields. Apache log messages can be formatted like this:

"%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\" %T %v"

Here is a sample message:

192.168.1.1 - - [31/Dec/2007:00:17:10 +0100] "GET /cgi-bin/example.cgi HTTP/1.1" 200 2708 "-" "curl/7.15.5 (i486-pc-linux-gnu) libcurl/7.15.5 OpenSSL/0.9.8c zlib/1.2.3 libidn/0.6.5" 2 example.balabit

To parse such logs, the delimiter character is set to a single space (delimiters(" ")). Spaces between double quotes and between square brackets are not treated as delimiters (quote-pairs('""[]')).

parser p_apache {
    csv-parser(columns("APACHE.CLIENT_IP", "APACHE.IDENT_NAME", "APACHE.USER_NAME",
        "APACHE.TIMESTAMP", "APACHE.REQUEST_URL", "APACHE.REQUEST_STATUS",
        "APACHE.CONTENT_LENGTH", "APACHE.REFERER", "APACHE.USER_AGENT",
        "APACHE.PROCESS_TIME", "APACHE.SERVER_NAME")
         flags(escape-double-char,strip-whitespace)
         delimiters(" ")
         quote-pairs('""[]')
         );
};

The results can be used, for example, to separate log messages into different files based on the APACHE.USER_NAME field. If the field is empty, the name nouser is used instead.

log { source(s_local); parser(p_apache); destination(d_file); };
destination d_file { file("/var/log/messages-${APACHE.USER_NAME:-nouser}"); };

Example 8.28. Segmenting a part of a message

The following example splits the timestamp of a parsed Apache log message into separate fields.

parser p_apache_timestamp {
    csv-parser(columns("APACHE.TIMESTAMP.DAY", "APACHE.TIMESTAMP.MONTH", "APACHE.TIMESTAMP.YEAR",
        "APACHE.TIMESTAMP.HOUR", "APACHE.TIMESTAMP.MIN", "APACHE.TIMESTAMP.SEC", "APACHE.TIMESTAMP.ZONE")
    delimiters("/: ")
    flags(escape-none)
    template("${APACHE.TIMESTAMP}"));
};
log { source(s_local);
    log { parser(p_apache); parser(p_apache_timestamp); destination(d_file);};
};
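
Once the timestamp is segmented, the resulting fields can be referenced like any other macro. As a sketch, the destination below writes messages into one file per year and month of the Apache timestamp. The destination name d_apache_by_month and the file path are hypothetical, and the log path assumes the p_apache and p_apache_timestamp parsers shown in the examples above.

```
# Hypothetical destination: one file per year and month, e.g. 2007-Dec.log
destination d_apache_by_month {
    file("/var/log/apache/${APACHE.TIMESTAMP.YEAR}-${APACHE.TIMESTAMP.MONTH}.log");
};

log { source(s_local); parser(p_apache); parser(p_apache_timestamp); destination(d_apache_by_month); };
```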

© 2007-2008 BalaBit IT Security
Please send your comments or documentation bugs to: documentation@balabit.com