Tech Blog: How to configure JSON logging in nginx?
By Ivan Borko on July 21, 2022
In previous posts from this series, we discussed how we formatted UWSGI and Python logs in JSON format. We still have one important production component left: the Nginx server. This blog post will describe how the Nginx logging module works, and showcase a simple logging configuration where Nginx logger is configured to output JSON logs.
Our goal is to have all services produce JSON logs so that we can directly feed them to Elasticsearch, without additional processing services (like Logstash) that require additional maintenance, consume a lot of CPU, and thus incur extra costs. This enables us to, instead of having Filebeat, Logstash, and Elasticsearch, use just Fluent Bit and Elasticsearch.
All our services are deployed as docker containers but this solution can work with or without docker. You can read more about our logging system components and architecture here.
We use Nginx as a load balancer, for SSL termination (HTTPS), as a cache, and as a reverse proxy. Incoming requests for all our services go through the Nginx. As all external client requests go through the Nginx, access logs are the place where you get the best picture of the actual usage and performance of your system from the client’s perspective.
Nginx by default has two log files: access log and error log. Access log records every request to the Nginx server, while the error log records all issues the Nginx service has, but not errors of our services and bad responses (these go to the access log).
Nginx server contains a logging module ngx_http_log module that writes request logs in a specified format. The default format is called combined and logs values with a space delimiter.
The log format is specified using characters and variables, so you can create your own JSON format.
The main issue with this approach is that variables can contain strings that will break JSON format
(if a variable contains “ or ‘ characters, for example). To fix this, Nginx version 1.11.8 added support for JSON escaping.
There are 3 options for escaping: default, json, and none.
JSON escaping will escape characters not allowed in JSON strings: characters “"” and “\” are escaped as “\"” and “\\”,
characters with values less than 32 are escaped as “\n”, “\r”, “\t”, “\b”, “\f”, or “\u00XX”.
The important thing to note here is that we want to index variables that contain numbers as long type in Elasticsearch and they don’t have double quotes around them, while variables that have strings have double quotes.
In the server directive, we reference the log format (logger-json) we want to use when specifying access log parameters.
This record is easily processed by FluentBit and inserted into Elasticsearch (more on FluentBit in this article).
In most setups, Elasticsearch is then connected to Grafana, Kibana, or a similar tool used for log visualizations, creating graphs, and alerting.
We are using both Kibana and Grafana. Grafana is better suited for alerting, while Kibana is better when you have a problem and you need to deep dive into logs (discovery).
You can read more about connecting Elasticsearch to Grafana and Kibana in our previous blog post.
What is the point in collecting logs and metrics if you don’t use them? In this blog post, we will build upon our previous blog post and connect Fluent Bit log collectors to Elasticsearch along with a basic setup and comparison of Kibana and Grafana, tools often used for visualizing logs
There is a great deal of overlap between the fields of statistics and data science, to the point where many definitions of one discipline could just as easily describe the other. While this is true, there are also many differences between them. Why is statistics important and what is its connection to data science? What is data science? What are their similarities and differences? Let’s try to understand it better at least on a basic level without too much going into subtle details.
In previous blogs from this series we discussed how we formatted uwsgi and Python logs using JSON. Properly formatted logs are, however, useless if you don't have a way of accessing them. Reading raw log files on servers directly is great when working on a small number of servers, but it quickly becomes cumbersome.
We build AI for your needs.
Meet our highly experienced team who just loves to build AI and design its surrounding to incorporate it in your business. Find out for your self how much you can benefit from our fair and open approach.