How we use logs in production
Note: This blog post is aimed at developers. You're free to read it, but it does get quite technical.
Most developers would agree that having access to logs on your production server is to great benefit, but it seems like a guide on how a good logging system should work doesn't exist out in the wild. Today I want to take a quick look at how I went from scrolling endlessly on a console in screen, to using Node.JS libraries to provide excellent support as well as the option to expand out in the future.
When I first launched my application, and much like with most projects, I wasn't sure if the idea would take off or if I'd even be working on it in the next few weeks. MCJukebox was running on a shared VPS (from the OVH SSD range) and logs were basic. I tried to avoid them, as it was easier to recreate the issue locally, but if I did need to take a look at the logs I simply scrolled up on the screen until I found the request in question. It worked, but there were some serious downsides. It wouldn't scale, as I was working rather than the server. I also had to assume that something would end up in the console, and if I hadn't logged enough information then my chances of fixing the bug would rapidly decrease. I needed something better and after some research, settled on Bunyan.
To describe it simply, Bunyan allows for logs to be sent with different priorities and you can define different streams which listen out for only logs above a certain level. For example, the setup we use sends everything above the 'info' level, including error and fatal straight to stdout. Here's a look at starting up the application, with Redis intentionally blocked. This makes finding out what has gone wrong in production really easy - just open up the screen and all you'll see is what matters. We've also configured two additional streams. The debug.log file stores everything which has happened in the last three days, in a very verbose form so that if a customer contacts us with an issue we can take a look at exactly what happened. I'm also personally a fan of Pushover, an app which sends me push notifications using an API, and so another stream will ping my phone if anything enters the 'error' level.
We've got just one more trick up our sleeve, and this is what has really helped us out in diagnosing issues once a customer contacts us. Bunyan has the concept of a child logger, which uses the same stream destinations but has additional properties added automatically each time you send a log. On all of our servers, we use a small Express middleware function to add a random request ID to a new logger assigned to each incoming request.
app.use(function(req, res, next){By doing this, we're able to identify all of the logs associated with a particular request just by searching via. an ID. Bunyan also saves all of it's logs in JSON format, so searching is really easy in pretty much any software. Getting the 'req_id' field back is as simple as doing 'req.log.fields.req_id' - we then show this in the GUI if something has gone wrong, so that the customer can do the work for us and tell us where to look. We log as much as we can about each request: the cookies present, the request URL. What's great is that really important information, like timestamps, are added by Bunyan automatically.
req.log = log.child({req_id: uuid.v4()}, true);
req.log.debug({req: req});
next();
});
Hopefully from this you understand a bit more about what great logging is - your logs need to be accessible (so you can view them whenever you wish), friendly (so you can actually use them to your benefit), and verbose (not always, but you never know when that bit of additional information may be the key to squashing that nasty bug.
If you have any questions about the setup, or would like to make a suggestion, feel free to tweet me @oliverdunk_ - I'll try and reply as soon as possible.