Chapter 15. Indexing audit trails

PSM can index the contents of audit trails using its own indexer service or external indexers. Indexing extracts the text from the audit trails and segments it into tokens. A token is a segment of the text that does not contain whitespace: for example, words, dates (2009-03-14), MAC or IP addresses, and so on. The indexer returns the extracted tokens to PSM, which builds a comprehensive index from the tokens of the processed audit trails.
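The tokenization described above can be illustrated with a minimal Python sketch. It is not PSM's indexer code: the function names (tokenize, build_index), the trail IDs, and the sample texts are hypothetical, and the only behavior taken from this chapter is that a token is a run of text containing no whitespace.

```python
from collections import defaultdict


def tokenize(text):
    """Split text into tokens: maximal runs of non-whitespace characters."""
    # str.split() with no arguments splits on any run of whitespace.
    return text.split()


def build_index(trails):
    """Map each token to the set of (hypothetical) trail IDs containing it."""
    index = defaultdict(set)
    for trail_id, text in trails.items():
        for token in tokenize(text):
            index[token].add(trail_id)
    return index


# Illustrative sample data, not real audit trail content.
trails = {
    "trail-1": "login 2009-03-14 from 192.168.1.10",
    "trail-2": "logout 2009-03-14",
}
index = build_index(trails)
print(sorted(index["2009-03-14"]))  # ['trail-1', 'trail-2']
```

Because dates and IP addresses contain no whitespace, each of them stays a single token, so searching for 2009-03-14 in this toy index finds both sample trails.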

Once indexed, the contents of the audit trails can be searched from the web interface. PSM can extract the commands typed and the texts seen by the user in terminal sessions, and text from graphical protocols like RDP, Citrix ICA, and VNC. Window titles are also detected.

PSM has an internal indexer, which runs on the PSM appliance. In addition to the internal indexer, external indexers can run on Linux hosts.

Processing and indexing audit trails requires significant computing resources. If you have to audit lots of connections, or have a large number of custom reports configured, consider using an external indexer to decrease the load on PSM. For sizing recommendations, ask your Balabit partner or contact the Balabit Support Team.

  • The internal indexer service runs on the PSM appliance. It supports languages based on the Latin, Greek, and Cyrillic alphabets, as well as the Chinese, Japanese, and Korean languages, allowing it to recognize text from graphical audit trails in 100+ languages. It can also generate screenshots for content search results.

    Recognizing and OCR-ing CJK (Chinese, Japanese and Korean) languages must be licensed separately.

  • The external indexer runs on Linux hosts and instances. It uses the same engine as the indexer service of PSM, and has the same capabilities and limitations.

    PSM can work with multiple external indexers to process audit trails.

If you have indexed trails, the index itself is also archived:

When using the Indexer service: The index is archived every 30 days, unless Backup & Archive/Cleanup > Archive/Cleanup policies > Retention time in days is set to more than 30 days, in which case the index is archived at that longer interval. For example, if the Retention time in days is 60 days, the index is archived every 60 days. The archived index contains the content that was available X days before the archival date, where X is the number in the Retention time in days field.
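The archiving rule above amounts to simple arithmetic, sketched below in Python. This is only an illustration of the rule as stated in this section, not PSM code: the function names and the sample date are hypothetical, and the sketch assumes that any Retention time in days value above 30 simply replaces the 30-day default.

```python
from datetime import date, timedelta

# Default archiving interval for the index, as stated in this section.
DEFAULT_ARCHIVE_INTERVAL_DAYS = 30


def index_archive_interval(retention_days):
    """Days between index archivals for a given Retention time in days."""
    # A retention time above 30 days makes archiving less frequent.
    return max(DEFAULT_ARCHIVE_INTERVAL_DAYS, retention_days)


def archived_content_cutoff(archival_date, retention_days):
    """Date whose index content ends up in the archive (X days earlier)."""
    return archival_date - timedelta(days=retention_days)


print(index_archive_interval(10))  # 30
print(index_archive_interval(60))  # 60
print(archived_content_cutoff(date(2024, 3, 31), 60))  # 2024-01-31
```

With a retention time of 60 days, the index is archived every 60 days, and an archive created on 2024-03-31 holds the index content as it was on 2024-01-31.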

Warning

Hazard of data loss!

Make sure you also back up your data in addition to archiving it (for details, see Section 4.7, Data and configuration backups). If a system crash occurs, you can lose up to 30 days of index, since the index is archived only every 30 days.