Tech Blog: How to configure JSON logging in nginx?

In previous posts from this series, we discussed how we formatted uWSGI and Python logs as JSON. We still have one important production component left: the Nginx server. This blog post describes how the Nginx logging module works and showcases a simple configuration in which Nginx outputs JSON logs.

Logging architecture

Our goal is to have all services produce JSON logs so that we can feed them directly to Elasticsearch, without intermediate processing services (like Logstash) that require extra maintenance, consume a lot of CPU, and thus incur extra costs. This lets us use just Fluent Bit and Elasticsearch instead of Filebeat, Logstash, and Elasticsearch.

All our services are deployed as Docker containers, but this solution works with or without Docker. You can read more about our logging system components and architecture here.

Nginx logging

We use Nginx as a load balancer, for SSL termination (HTTPS), as a cache, and as a reverse proxy. Incoming requests for all our services go through Nginx. Since all external client requests pass through it, the access logs are where you get the best picture of the actual usage and performance of your system from the client's perspective.

By default, Nginx writes two log files: the access log and the error log. The access log records every request made to the Nginx server, while the error log records problems within the Nginx service itself; errors and bad responses from the proxied services still end up in the access log.
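
For reference, a default-style setup looks roughly like this (a minimal sketch; log file paths vary by distribution):

# Default-style logging directives; paths are distribution-dependent
access_log /var/log/nginx/access.log combined;
# Without an explicit severity level, error_log defaults to "error"
error_log  /var/log/nginx/error.log;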

Nginx ships with a logging module, ngx_http_log_module, that writes request logs in a specified format. The default format is called combined and logs values separated by spaces.
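
The predefined combined format is equivalent to the following log_format definition:

log_format combined '$remote_addr - $remote_user [$time_local] '
                    '"$request" $status $body_bytes_sent '
                    '"$http_referer" "$http_user_agent"';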

JSON format

The log format is specified using plain characters and variables, so you can compose your own JSON format. The main issue with this approach is that variable values can contain characters that break the JSON format (a " or \ character inside a variable, for example). To fix this, Nginx added support for JSON escaping in version 1.11.8.

There are three options for escaping: default, json, and none. JSON escaping escapes characters not allowed in JSON strings: the characters " and \ are escaped as \" and \\, and characters with values less than 32 are escaped as \n, \r, \t, \b, \f, or \u00XX.
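
For example, assuming a (hypothetical) client sends a user agent value containing a double quote, escape=json keeps the logged record valid JSON:

# raw header value:        Mozilla/5.0 "beta"
# logged with escape=json: "user_agent": "Mozilla/5.0 \"beta\""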

A list of available variables can be found here: http://nginx.org/en/docs/http/ngx_http_log_module.html#log_format

Every log format has a name (identifier) that can be referenced multiple times. Log formats can be defined outside the server directive, at the http level, and then reused across multiple servers.
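
As a sketch (the server names here are hypothetical), a format defined once at the http level can be shared by several virtual servers:

http {
    log_format logger-json escape=json '{"source": "nginx", ...}';  # defined once

    server {
        server_name api.example.com;
        access_log /var/log/nginx/api-access.log logger-json;
    }

    server {
        server_name www.example.com;
        access_log /var/log/nginx/www-access.log logger-json;
    }
}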

In our Nginx configuration, the format definition looks as follows:

log_format logger-json escape=json
    '{"source": "nginx", "time": $msec, "resp_body_size": $body_bytes_sent, '
    '"host": "$http_host", "address": "$remote_addr", "request_length": $request_length, '
    '"method": "$request_method", "uri": "$request_uri", "status": $status, '
    '"user_agent": "$http_user_agent", "resp_time": $request_time, '
    '"upstream_addr": "$upstream_addr"}';

The important thing to note here is that variables holding numeric values are left without double quotes, so that Elasticsearch indexes them as numeric (e.g., long) types, while variables holding strings are wrapped in double quotes.

Inside the server directive, we reference the log format we want to use (logger-json) when specifying the access_log parameters.

server {
    listen 443 ssl;
    server_name api.velebit.ai;
    ...

    access_log /var/log/nginx/access.log logger-json;
    ...
}
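
After changing the configuration, it can be validated and applied with the standard Nginx commands:

nginx -t          # validate the configuration
nginx -s reload   # apply it without dropping active connections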

What the result looks like (sample)

In the access.log file (located in /var/log/nginx), each line contains one log record. Below is an example of a log record (formatted for readability).

{
    "source": "nginx",
    "time": 1658188800011,
    "resp_body_size": 23615,
    "host": "api.velebit.ai",
    "address": "193.22.105.4",
    "request_length": 482,
    "method": "POST",
    "uri": "/image_search/willhaben/api/v4/search_by_id?variant=default",
    "status": 200,
    "user_agent": "Apache-HttpClient/4.5.1 (Java/11.0.15)",
    "resp_time": 0.042,
    "upstream_addr": "10.0.3.126:80"
}

This record is easily processed by Fluent Bit and inserted into Elasticsearch (more on Fluent Bit in this article). In most setups, Elasticsearch is then connected to Grafana, Kibana, or a similar tool for visualizing logs, building graphs, and alerting.

We use both Kibana and Grafana. Grafana is better suited for alerting, while Kibana shines when you have a problem and need to dig deep into the logs (discovery). You can read more about connecting Elasticsearch to Grafana and Kibana in our previous blog post.

If you want to read more about similar topics, check out our other articles or subscribe to Velebit AI on LinkedIn.


