Tech Blog: JSON logging in uWSGI

Tech Blog: JSON logging in uWSGI


Velebit AI was recently preparing a new production environment from scratch, and one of the major tasks we had to solve was logging. Our team already had experience with ELK stack (ElasticSearch, Logstash, Kibana) to parse, store, and analyze logs. Starting without legacy gave us the opportunity to choose a new stack.

ELK Stack and Grafana

We searched for possible alternatives to ELK stack because the open-source version lacked some features we wanted. For example, both user management and alerting are available only as paid X-Pack plugins. We decided to use Grafana, a good alternative to Kibana with built-in support for user management and alerting.

The second issue we wanted to fix was the fact that Logstash log parsing consumes a lot of CPU. Our solution was: don’t parse logs! :) Configure all apps and services to output their logs as JSON. This way you can simply collect logs from docker containers using fluentbit and send them to ElasticSearch cluster.

JSON logging everywhere

Most of our services are Python APIs, so the first task was to switch all Python logs to JSON. We built a custom module that reconfigures Python logging to convert logging messages to JSON. Also, we use Nginx as a load balancer and it is the main source of quality logs used to:

  1. calculate different metrics such as average response times, throughput, etc
  2. enable detection of faulty services (status 500)

Nginx logging subsystem has support for JSON-escaping of variables and we used it to switch logging to our custom JSON format. This blog post is the first in a series of blog posts explaining how exactly we configured Python logging and Nginx logging to use JSON encoding.

uWSGI logging

The last thing to switch to JSON logging was uWSGI. We use it as a WSGI server that serves our Flask and Falcon REST APIs. As we already had a few years of experience working with uWSGI, we were aware of the vast amount of settings and options uWSGI has, and we were confident that JSON logging shouldn’t be a big deal since uWSGI has built-in support for JSON log encoder.

uWSGI has 3 sources of log messages:

  1. Application logs (python logs in our use case)
  2. uWSGI server logs (information about server startup, errors, warnings)
  3. Request logs

1. Application logs

Application logs are already JSON formatted, so uWSGI just has to pass it through. We define custom logger (applogger) and filter logs starting with the left curly bracket.

logger = applogger stdio
log-route = applogger {
log-encoder = format:applogger ${msg}

2. uWSGI server logs

uWSGI server logs are simple text messages by default. We use route regex to filter all logs that don’t start with curly braces (to differentiate them from application logs) They can be simply wrapped in JSON using built-it JSON log encoder.

logger = default stdio
log-route = default ^((?!\{).)*$
log-encoder = json:default {"time":"${micros}", "source":"uwsgi", "message":"${msg}"}
log-encoder = nl

3. Request Logs

Built-in JSON log encoder supports only message (msg, msgnl) and time variables (unix, micros, strftime). Since we wanted additional information related to the request encoded in separate JSON fields we had to come up with a different solution for log formatting.

Default uWSGI request log format looks like this:

log-format = [pid: %(pid)|app: -|req: -/-] %(addr) (%(user)) \{\%(vars) vars in %(pktsize) bytes} [%(ctime)] %(method) %(uri) => generated %(rsize) bytes in %(msecs) msecs (%(proto) %(status)) %(headers) headers in %(hsize) bytes (%(switches) switches on core %(core))

uWSGI offers many variables with useful information that we would like to log. As mentioned before, JSON encoder allows only a small set of log variables. Here is an example of JSON logging encoder configuration:

log-encoder = json {"unix":${unix}, "msg":"${msg}", "date":"${strftime:%%d/%%m/%%Y %%H:%%M:%%S}"}

To overcome this limitation, we define log-format as a JSON-like text containing request related variables:

log-format = "address":"%(addr)", "host":"%(host)", "method":"%(method)", "uri":"%(uri)"

and wrap that text to a JSON using

log-req-encoder = format {"time":"${micros}", "source":"uwsgi-req", ${msg}}

The problem is that host and uri variables can contain characters that would mess-up JSON format.

We’ve found this issue referencing the same problem. Solution offered by the commenter was to write a uWSGI plugin that defined new JSON-escaped variables that can be used to define text of a request, but in a form a JSON.

uWSGI logging plugin

There is a really short example of a uWSGI logging plugin here . When the documentation for writing a plugin is only 3 blocks of text, one is unfortunately left with just a source code as the source of meaningful information.

Nevertheless, we decided to write uWSGI plugin that defines custom JSON-escaped variables json_uri and json_host. These variables can be used to configure JSON logging of requests.

Now you can use these new variables in log-format:

log-format = "address":"%(addr)", "host":"%(json_host)", "method":"%(method)", "uri":"%(json_uri)"

The resulting plugin is available in our repo.

Uwsgi plugin can be built using this command:

uwsgi --build-plugin <filename.c>

And you can use it with uwsgi server using command:

uwsgi --plugin <filename_plugin.so> …

Resulting uWSGI config file looks like this:

[uwsgi]
plugin = escape_json_plugin.so
; this will encode uwsgi messages into JSON, encode requests to JSON and leave application output unchanged
logger = default stdio
logger = applogger stdio
log-route = applogger {
log-route = default ^((?!\{).)*$
log-encoder = json:default {"time":"${micros}", "source":"uwsgi", "message":"${msg}"}
log-encoder = format:applogger ${msg}
log-encoder = nl
logger-req = stdio
; json_uri and json_host are json-escaped fields defined in `escape_json_plugin.so`
log-format = "address":"%(addr)", "host":"%(json_host)", "method":"%(method)", "uri":"%(json_uri)", "protocol":"%(proto)", "resp_size":%(size), "req_body_size":%(cl), "resp_status":%(status), "resp_time":%(secs)"
log-req-encoder = format {"time":"${micros}", "source":"uwsgi-req", ${msg}}
log-req-encoder = nl

This way we managed to format all 3 types of logs coming out of uWSGI as JSON.

Follow Velebit AI on LinkedIn.


Recent blog posts

Tech Blog: JSON logging in uWSGI

Faster User Posting: Hierarchical Category Prediction

How Machine Learning can improve customer satisfaction and lower the time needed to post an item? Learn more about image and text based category prediction services.
Read more

Tech Blog: JSON logging in uWSGI

Velebit AI joins CroAI Association

We are proud to announce that Velebit AI has, among the first, joined a recently established CroAI association. At Velebit AI, we believe artificial intelligence is becoming the driving force of the world's economy.
Read more

Tech Blog: JSON logging in uWSGI

Tech Blog: JSON logging in Python

Both uWSGI and especially nginx logs are indeed more important than application logs for figuring out how the microservice-oriented infrastructure breathes in general. However, when problems occur, collecting informative application logs is a must. In this part of the JSON logging series, we discuss our solution to JSON logging in Python.
Read more

We build AI for your needs.

Meet our highly experienced team who just loves to build AI and design its surrounding to incorporate it in your business. Find out for your self how much you can benefit from our fair and open approach.

Contact Us

Members of