Tech Blog: How to configure JSON logging in nginx?

By Ivan Borko on July 21, 2022

In previous posts from this series, we discussed how we format uWSGI and Python logs as JSON. One important production component remains: the Nginx server. This blog post describes how the Nginx logging module works and shows a simple configuration that makes Nginx write its access logs as JSON.

This blog post is a part of the Velebit AI Tech Blog series where we discuss good practices for a scalable and robust production deployment.

Formatting logs:

Collecting logs:

Collecting logs in docker clusters

Visualizing logs:

Connecting Elasticsearch logs to Grafana and Kibana

Logging architecture

Our goal is to have all services produce JSON logs so that we can feed them directly to Elasticsearch, without additional processing services (like Logstash) that require maintenance, consume a lot of CPU, and thus incur extra costs. Instead of running Filebeat, Logstash, and Elasticsearch, we can then run just Fluent Bit and Elasticsearch.

All our services are deployed as Docker containers, but this solution works with or without Docker. You can read more about our logging system components and architecture here.

Nginx logging

We use Nginx as a load balancer, for SSL termination (HTTPS), as a cache, and as a reverse proxy. Incoming requests for all our services go through Nginx. Because every external client request passes through it, the access logs give you the best picture of the actual usage and performance of your system from the client’s perspective.

Nginx by default has two log files: the access log and the error log. The access log records every request to the Nginx server, while the error log records problems with the Nginx service itself, but not errors raised by our services or bad responses (these go to the access log).

Nginx contains a logging module, ngx_http_log_module, that writes request logs in a specified format. The default format is called combined and writes values separated by spaces.

JSON format

The log format is specified as a string of literal characters and variables, so you can write your own JSON template. The main issue with this approach is that variable values can contain characters that break the JSON (for example, a " or \ character in a header value). To fix this, Nginx version 1.11.8 added support for JSON escaping through the escape parameter of the log_format directive.

There are 3 options for escaping: default, json, and none. JSON escaping will escape characters not allowed in JSON strings: characters “"” and “\” are escaped as “\"” and “\\”, characters with values less than 32 are escaped as “\n”, “\r”, “\t”, “\b”, “\f”, or “\u00XX”.
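To make the escaping rules above concrete, here is a minimal Python sketch that imitates them and compares a naive substitution with an escaped one. It is an illustration only, not Nginx's implementation: the function names (json_escape, render, is_valid_json) and the sample user agent string are made up for this example.

```python
import json

# Short escapes for control characters, mirroring the list in the text above.
_SHORT_ESCAPES = {"\n": "\\n", "\r": "\\r", "\t": "\\t", "\b": "\\b", "\f": "\\f"}


def json_escape(value: str) -> str:
    """Escape a string following the rules described above (illustrative only)."""
    out = []
    for ch in value:
        if ch == '"':
            out.append('\\"')
        elif ch == "\\":
            out.append("\\\\")
        elif ch in _SHORT_ESCAPES:
            out.append(_SHORT_ESCAPES[ch])
        elif ord(ch) < 32:
            # Any other control character becomes \u00XX.
            out.append(f"\\u{ord(ch):04X}")
        else:
            out.append(ch)
    return "".join(out)


def render(user_agent: str, escape: bool) -> str:
    """Substitute a value into a tiny JSON template, as a log format string would."""
    value = json_escape(user_agent) if escape else user_agent
    return '{"user_agent": "' + value + '"}'


def is_valid_json(line: str) -> bool:
    """Return True if the line parses as JSON."""
    try:
        json.loads(line)
        return True
    except json.JSONDecodeError:
        return False


# Hypothetical header value containing quotes, a tab, and a backslash.
tricky = 'curl/7.0 "beta"\tbuild\\1'

naive_ok = is_valid_json(render(tricky, escape=False))
escaped_ok = is_valid_json(render(tricky, escape=True))
round_trip = json.loads(render(tricky, escape=True))["user_agent"]

print(naive_ok, escaped_ok, round_trip == tricky)
```

Without escaping, the embedded quotes end the JSON string early and the line no longer parses. With escaping, the line parses and the original value is recovered unchanged.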

A list of available variables can be found here: http://nginx.org/en/docs/http/ngx_http_log_module.html#log_format

Every log format has a name (identifier) that can be referenced multiple times. Log formats are defined in the http context, outside of any server block, and can then be used in multiple servers.

In our Nginx configuration, the format definition looks like this:

log_format logger-json escape=json '{"source": "nginx", "time": $msec, "resp_body_size": $body_bytes_sent, "host": "$http_host", "address": "$remote_addr", "request_length": $request_length, "method": "$request_method", "uri": "$request_uri", "status": $status, "user_agent": "$http_user_agent", "resp_time": $request_time, "upstream_addr": "$upstream_addr"}';

The important thing to note here is that variables holding numbers are written without double quotes, so that Elasticsearch indexes them as numeric types (long for integers such as $status, float for fractional values such as $msec and $request_time), while variables holding strings are wrapped in double quotes.

In the server block, we reference the log format by name (logger-json) in the access_log directive.
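The effect of the quotes can be checked with any JSON parser. The short Python sketch below parses the same two fields once unquoted and once quoted and compares the resulting types. Elasticsearch's default dynamic mapping makes an analogous distinction between JSON numbers and JSON strings, though the exact field types depend on your index mapping, so treat this as an analogy rather than a description of your cluster.

```python
import json

# The same values, written without and with double quotes.
unquoted = json.loads('{"status": 200, "resp_time": 0.042}')
quoted = json.loads('{"status": "200", "resp_time": "0.042"}')

unquoted_types = {key: type(value).__name__ for key, value in unquoted.items()}
quoted_types = {key: type(value).__name__ for key, value in quoted.items()}

print(unquoted_types)  # {'status': 'int', 'resp_time': 'float'}
print(quoted_types)    # {'status': 'str', 'resp_time': 'str'}
```

Numeric fields can be used directly in range queries and aggregations such as averages and percentiles, which is why the numeric variables in the format are left unquoted.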

server {
    listen 443 ssl;
    server_name api.company.com;
    ...

    access_log /var/log/nginx/access.log logger-json;
    ...
}

What the result looks like (sample)

In the access.log file (located in /var/log/nginx), every line contains one log record. Below you can see an example of a log record (formatted for readability).

{
    "source": "nginx",
    "time": 1658188800.011,
    "resp_body_size": 23615,
    "host": "api.company.com",
    "address": "192.20.0.1",
    "request_length": 482,
    "method": "POST",
    "uri": "/service/route?variant=default",
    "status": 200,
    "user_agent": "Apache-HttpClient/4.5.1 (Java/11.0.15)",
    "resp_time": 0.042,
    "upstream_addr": "10.0.0.20:80"
}
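Because each line is a complete JSON document, a record can be consumed with a few lines of code. The following Python sketch parses one line and derives two convenience fields. The sample line is illustrative, and the helper name parse_record and the derived fields timestamp and resp_time_ms are assumptions made for this example. It relies on $msec being a Unix time in seconds with millisecond resolution, which is how the Nginx documentation describes it.

```python
import json
from datetime import datetime, timezone

# One record on a single line, as it would appear in access.log (illustrative values).
SAMPLE_LINE = (
    '{"source": "nginx", "time": 1658188800.011, "resp_body_size": 23615, '
    '"host": "api.company.com", "address": "192.20.0.1", "request_length": 482, '
    '"method": "POST", "uri": "/service/route?variant=default", "status": 200, '
    '"user_agent": "Apache-HttpClient/4.5.1 (Java/11.0.15)", "resp_time": 0.042, '
    '"upstream_addr": "10.0.0.20:80"}'
)


def parse_record(line: str) -> dict:
    """Parse one access log line and add two derived fields."""
    record = json.loads(line)
    # "time" is seconds since the Unix epoch, with a fractional millisecond part.
    record["timestamp"] = datetime.fromtimestamp(record["time"], tz=timezone.utc)
    # "resp_time" is in seconds; convert it to whole milliseconds.
    record["resp_time_ms"] = round(record["resp_time"] * 1000)
    return record


parsed = parse_record(SAMPLE_LINE)
print(parsed["timestamp"].date(), parsed["status"], parsed["resp_time_ms"])
```

The same loop works over a whole file by calling parse_record on every non-empty line, which is handy for quick checks before the logs reach Elasticsearch.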

This record is easily processed by Fluent Bit and inserted into Elasticsearch (more on Fluent Bit in this article). In most setups, Elasticsearch is then connected to Grafana, Kibana, or a similar tool used for visualizing logs, creating graphs, and alerting.

We use both Kibana and Grafana. Grafana is better suited for alerting, while Kibana is better when you have a problem and need to dig deep into the logs (discovery). You can read more about connecting Elasticsearch to Grafana and Kibana in our previous blog post.

If you want to read more about similar topics, you can read our other articles or subscribe to Velebit AI on LinkedIn.
