Decide What Gets Logged - Secure Your Node.js Web Application

We’ve now set up a secure connection between the application and the users, but we’re still missing some important parts of our application’s support mechanisms.

Imagine that your company has an office in another country, and things there are very active, with people coming and going at all hours. You know this because you have cameras capturing information about what’s happening in that office. Logging is to your application what cameras are to that office.

Without logging, you’d have no idea what’s happening with your application.

Other web servers do a basic amount of logging by default, but as I discussed before, Node.js does no hand-holding. You have to do this by yourself.

It’s a common misconception that logging is useful only when crashes occur (completely false, as we both know). Another misconception is that logging is not related to security. In fact, logging is important for security, because it provides input for both prevention and forensics.

Let’s first look at prevention. Logging helps us debug code, detect anomalies in the program workflow, and detect attacks. Inserting proper log lines will allow us to learn when the program isn’t working as expected. These unexpected behaviors are exactly what attackers exploit to attack the system. So by logging, hopefully we’ll be able to find and fix bugs and logic errors in our code before any attacker finds them. Learning about these anomalies by examining log lines is much cheaper than combing the application code after a breach to learn how it could have happened.

Logging also helps you stop an attack during its occurrence. You might be wondering how. Let me demonstrate.

Say our usual log line looks like this:

GET / 200 11 - 4 ms

Now we get a group of logs like these:

GET /’`([{^~' 404 - - 1 ms
GET /aND 8=8' 404 - - 0 ms
GET /' aND '8'='8' 404 - - 1 ms
GET //**/aND/**/8=8' 404 - - 1 ms
GET /%' aND '8%'='8' 404 - - 0 ms

We can deduce from these requests that the user is looking for SQL injection points and can act appropriately—by blocking the IP or collecting more information about the attacker for further analysis.
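As a rough illustration of acting on such log lines, here is a minimal sketch that flags paths resembling the probes above and counts them per client. The helper names, the patterns, and the `{ ip, path }` entry shape are assumptions made for this example. The sample log lines above don't include an IP, so your log format would need to record one. The patterns are deliberately crude and will produce false positives, such as a legitimate path containing an apostrophe.

```javascript
// Hypothetical helpers: the patterns below are illustrative, not a complete rule set.
const SUSPICIOUS_PATTERNS = [
  /'/,                                   // stray single quote
  /\/\*.*?\*\//,                         // inline SQL comment such as /**/
  /\band\b\s*'?\d+%?'?\s*=\s*'?\d+/i     // tautology such as aND 8=8
];

// Returns true when a request path resembles a SQL injection probe.
function looksLikeSqlProbe(path) {
  let decoded;
  try {
    decoded = decodeURIComponent(path);
  } catch (e) {
    decoded = path; // malformed escapes (for example "%'") are checked as-is
  }
  return SUSPICIOUS_PATTERNS.some(function (pattern) {
    return pattern.test(decoded);
  });
}

// Returns the IPs that sent at least `threshold` suspicious requests.
function findProbingIps(entries, threshold) {
  const counts = new Map();
  entries.forEach(function (entry) {
    if (looksLikeSqlProbe(entry.path)) {
      counts.set(entry.ip, (counts.get(entry.ip) || 0) + 1);
    }
  });
  return Array.from(counts)
    .filter(function (pair) { return pair[1] >= threshold; })
    .map(function (pair) { return pair[0]; });
}
```

What you do with the returned addresses (block them, rate-limit them, or just collect more data) is a policy decision the sketch leaves open.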

Logging also provides input for forensics. With logging, we can determine the extent of the breach and track down information about the perpetrators. It is likely that at some point attackers will get past our defenses. When they do, we’ll need logs to understand how they succeeded, what they did, and where they originated from. Without logging, we’d be effectively blind and have nothing to go on.

Logging is something we simply must do when writing a secure web application. Let’s look at how we can do it easily.

The express v3 framework exposed a simple logger middleware from connect, the lower-level Node module that express is built upon. You could enable it with just one line:

chp-3-networking/morgan-simple.js

app.use(express.logger());

As of express v4, this has been moved to a separate module called morgan and can now be used like so:

// Require the morgan logger
var morgan = require('morgan');

app.use(morgan('combined'));


This will provide logs in the following format:

':remote-addr - :remote-user [:date[clf]] ":method :url HTTP/:http-version" :status :res[content-length] ":referrer" ":user-agent"'

// example
127.0.0.1 - - [23/Nov/2014:14:34:21 +0000] "GET / HTTP/1.1" 200 13 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/39.0.2171.62 Safari/537.36"

You may need a custom solution because you need to log data that isn’t supported, or because you’re not using connect or express. In that case, at the bare minimum, log the time, the user’s remote IP address, the requested path, the request method (such as GET or POST), and the response code, so you can see how the request was handled. It also helps to log important internal information: database alerts (such as database errors and fatal errors), detailed error messages, and important application procedures (such as withdrawal transactions in a financial application).
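If you do end up writing your own logger, the minimum fields listed above can be captured with nothing but Node’s built-in http objects. The following is a sketch under assumptions: `formatLogLine` and `withLogging` are hypothetical names, the space-separated line layout is arbitrary, and the `write` callback stands in for whatever log destination you use.

```javascript
// Builds one log line from the minimum recommended fields.
function formatLogLine(entry) {
  return [
    entry.time.toISOString(),
    entry.ip,
    entry.method,
    entry.path,
    entry.status
  ].join(' ');
}

// Wraps a plain http request handler so every finished response is logged.
function withLogging(handler, write) {
  return function (req, res) {
    // 'finish' fires once the response has been handed off to the OS,
    // so the status code is final at this point.
    res.on('finish', function () {
      write(formatLogLine({
        time: new Date(),
        ip: req.socket.remoteAddress,
        method: req.method,
        path: req.url,
        status: res.statusCode
      }));
    });
    handler(req, res);
  };
}

// Usage sketch (myHandler is a placeholder for your own request handler):
// require('http').createServer(withLogging(myHandler, console.log)).listen(3000);
```

Because the wrapper sits around the whole handler, every request passes through it, which is the same property you want from a logging middleware.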

Put your logger high up the stack so that all requests pass through it. For express, that means registering the morgan middleware before other middleware. I recommend looking up and becoming familiar with the morgan logger middleware⁷ and its various configuration options.

You might be thinking now that logging is simple—“I’ll just log everything.”

I’ll just say don’t. Logging everything isn’t as good in practice as it might sound in your head. In this section I’ll provide a few guidelines for what not to do.

First of all, don’t log too much. Different environments and applications can and usually should require different logging levels. In development you want to see as much of the request’s movement as possible in order to trace various problems and/or better understand the internal flow of the application.

However, this information is cumbersome in production because production logs must be persistent. Assuming that the application has proper logging set up, then with development settings the log files would probably grow rapidly.

This would result in information overload—although logs can be searched and consolidated using various tools, you’d still have issues with storage and management if the application usage is high enough.
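One common way to keep production logs lean is a level threshold that depends on the environment. The sketch below is an illustration rather than a recommendation of specific levels: the level names, the `createLogger` helper, and the use of the NODE_ENV environment variable to detect production are all assumptions.

```javascript
// Hypothetical level scale: higher numbers are more severe.
const LEVELS = { debug: 0, info: 1, warn: 2, error: 3 };

// Returns a logger that drops messages below `minLevel`.
function createLogger(minLevel, write) {
  const logger = {};
  Object.keys(LEVELS).forEach(function (level) {
    logger[level] = function (message) {
      if (LEVELS[level] >= LEVELS[minLevel]) {
        write(level.toUpperCase() + ' ' + message);
      }
    };
  });
  return logger;
}

// Assumption: NODE_ENV is set to 'production' on production machines.
const minLevel = process.env.NODE_ENV === 'production' ? 'info' : 'debug';
const log = createLogger(minLevel, console.log);
```

With this in place, `log.debug(...)` calls can stay in the code and simply produce no output in production.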

7. http://www.senchalabs.org/connect/logger.html


Second, store your logs securely. I recommend that production logs be consolidated on separate machines and, if the application type demands it, timestamped. By timestamping I mean they should be signed cryptographically so that their time and validity can be checked afterward. This is to prevent log tampering.
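To make the idea of tamper-evident logs concrete, here is one possible sketch. It uses an HMAC chain, which is a simplification I am substituting for a full timestamping scheme: each entry’s MAC covers its time, its message, and the previous entry’s MAC, so editing, removing, or reordering an earlier entry invalidates the chain. The function names are hypothetical, and the sketch assumes the key is kept away from the application server. Without that, an attacker who controls the server could re-sign forged entries.

```javascript
const crypto = require('crypto');

// Signs one entry; the MAC also covers the previous entry's MAC.
function signEntry(key, prevMac, time, message) {
  const mac = crypto.createHmac('sha256', key)
    // JSON encoding keeps the three fields unambiguous inside the MAC input.
    .update(JSON.stringify([prevMac, time, message]))
    .digest('hex');
  return { time: time, message: message, mac: mac };
}

// Recomputes every MAC in order; returns false at the first mismatch.
function verifyChain(key, entries) {
  let prevMac = '';
  return entries.every(function (entry) {
    const expected = signEntry(key, prevMac, entry.time, entry.message).mac;
    prevMac = entry.mac;
    return expected === entry.mac;
  });
}
```

Note that the time here is whatever the logging machine claims it is. Proving the time to a third party would need an external timestamping service, which this sketch does not cover.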

And finally, don’t log sensitive information. You need to be aware of what you’re logging. Avoid logging sensitive information like passwords and credit card/Social Security numbers and so on. If your logs are compromised, they will provide a wealth of information to the attacker.
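One way to guard against accidentally logging such values is to pass objects through a redaction step before they reach the logger. This is a minimal sketch: the key names in `SENSITIVE_KEYS` are hypothetical and must be adapted to your own data, and matching on key names alone will not catch sensitive values embedded in free-form strings.

```javascript
// Hypothetical field names; adjust to match your application's data.
const SENSITIVE_KEYS = ['password', 'creditCard', 'ssn'];

// Returns a deep copy of `value` with sensitive fields masked.
function redact(value) {
  if (Array.isArray(value)) {
    return value.map(redact);
  }
  if (value && typeof value === 'object') {
    const copy = {};
    Object.keys(value).forEach(function (key) {
      copy[key] = SENSITIVE_KEYS.indexOf(key) !== -1
        ? '[REDACTED]'
        : redact(value[key]);
    });
    return copy;
  }
  return value;
}
```

The function returns a copy rather than mutating its input, so the request data the application goes on to use is left untouched.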

Also, if you plan to add session-based grouping to logs by logging a session-specific token with every request, you must not log the sessionID itself as the token.

Instead, generate a random value every time a session is created, or hash the sessionID and use that value (the latter should be saved to the session because it’s computationally expensive to hash on every request). The last thing you want, if your application gets compromised, is for the logs to be a ready source of sessionIDs and other sensitive information for the attackers.
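Both options can be sketched with Node’s built-in crypto module. The helper names are hypothetical, and `session` stands for whatever per-session object your session middleware provides. The sketch assumes only that it is a plain object you can attach a property to.

```javascript
const crypto = require('crypto');

// Option 1: a random token unrelated to the sessionID.
function newLogToken() {
  return crypto.randomBytes(16).toString('hex');
}

// Option 2: a one-way hash of the sessionID.
function hashSessionId(sessionId) {
  return crypto.createHash('sha256').update(sessionId).digest('hex');
}

// Creates the token once and stores it on the session,
// so nothing needs to be recomputed on later requests.
function logTokenFor(session) {
  if (!session.logToken) {
    session.logToken = newLogToken();
  }
  return session.logToken;
}
```

Either value lets you group log lines by session, while a leaked log file does not hand out usable sessionIDs.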

In document Secure Your Node.js Web Application (Page 40-43)