Pushing Complexity

You want to write readable code. You want people to be able to understand your library and start contributing shortly after first inspecting it, without messing everything up.

What makes libraries hard to contribute to and use is complexity: if your code is not immediately understandable and easy to get an overview of, you’ve already lost a lot of potential contributors or users.

Fortunately, complexity can be removed. Well, not really removed, but abstracted, composed away, and hidden behind indirection. Abstracting away some form of complexity creates new complexity at a higher level of abstraction (this should become clearer as you read along). That’s why I call it Pushing Complexity: it is pushed out to the next level.

Image: "Pushing out Complexity"

Push complexity out of functions

Complexity has many forms (and Robert C. Martin covers many of them in Clean Code), but what I am going to focus on here is size. Big things are harder to get an overview of, and for that reason I would encourage following Martin’s recommendation of 3-4 line functions.

Shorter functions make it a lot easier to understand just what is going on. For example:

// bad

function addUser(user) {
  var allUsers = database.getUsers();
  for(var i = 0; i < allUsers.length; i++) {
    var thisUser = allUsers[i];
    if(thisUser.getEmail() === user.getEmail()) {
      throw new Error("Email already in use.");
    }
  }
  database.saveUser(user);
}

// good

function addUser(user) {
  var email = user.getEmail();
  if(emailAlreadyInUse(email)) {
    throw new Error("Email already in use.");
  }
  database.saveUser(user);
}

function emailAlreadyInUse(email) {
  var allUsers = database.getUsers();
  return anyMatches(allUsers, function(user) {
    return user.getEmail() === email;
  });
}

function anyMatches(list, matcherFunction) {
  for(var i = 0; i < list.length; i++) {
    if(matcherFunction(list[i])) {
      return true;
    }
  }
  return false;
}

The shorter functions are easier to read, because it’s easier to get an overview of exactly one unit of computation. Notice how each function works on a single level of abstraction, which makes for a coherent reading experience. We immediately understand the first layer (in the good addUser), and can investigate further if we want to understand how the underlying layers work.
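As a side note, the generic anyMatches helper above does the same job as the built-in Array.prototype.some, so the lowest layer can be pushed out to the language itself. The sketch below shows the good addUser running end to end. The in-memory database object and the makeUser helper are hypothetical stand-ins invented for this illustration; they are not part of the original example.

```javascript
// Hypothetical in-memory stand-in for the `database` object used above.
var database = {
  users: [],
  getUsers: function () { return this.users; },
  saveUser: function (user) { this.users.push(user); }
};

// Hypothetical helper that builds a user object exposing getEmail().
function makeUser(email) {
  return { getEmail: function () { return email; } };
}

// Same role as emailAlreadyInUse above, with the built-in
// Array.prototype.some taking the place of anyMatches.
function emailAlreadyInUse(email) {
  return database.getUsers().some(function (user) {
    return user.getEmail() === email;
  });
}

function addUser(user) {
  if (emailAlreadyInUse(user.getEmail())) {
    throw new Error("Email already in use.");
  }
  database.saveUser(user);
}
```

Each function still reads at a single level of abstraction; the only change is that the loop now lives in the standard library instead of in our own code.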

Push complexity out of classes

Now that you have made some nice abstractions of your functions, you have ended up with a whole bunch of them. That’s complexity, too! A class with many methods is just as hard to get an overview of as a long function is.

The principle we will use here is exactly the same as with functions: push the complexity out, create small, composable abstractions.

Instead of having a behemoth of a class, which is capable of doing a hundred different things in several different scenarios, it is much better to have smaller, composable classes.

As a rule of thumb, I look for classes that are longer than 200 lines (this is in Node.js). If a class is that long, there needs to be a really good reason for it. I have yet to encounter a class of that size that could not be refactored to something more readable. Often, much smaller classes can benefit from being split up, too.

Why is shorter better? Because reading things at one level of abstraction helps build overview. Quickly getting an overview of the code is invaluable. It also adds to that sacred principle of software development, high cohesion—in fact, pushing complexity at any level of abstraction does.

Once again, an example will serve us well:

// bad

var Sender = function(url, apiKey) {
  //...
};

Sender.prototype.send = function(recipients, message) {
  //...
};

Sender.prototype.sendWithRetries = function(recipients, message) {
  //...
  this.send(recipients, message);
  //...
};

// good

var Sender = function(url, apiKey) {
  //...
};

Sender.prototype.send = function(recipients, message) {
  //...
};

var RetryingSender = function(sender) {
  this.sender = sender;
};

RetryingSender.prototype.send = function(recipients, message) {
  //...
  this.sender.send(recipients, message);
  //...
};

Here, we go from one class/file to two classes/files that each have a very specific set of functionality: the Sender sends off requests; the RetryingSender is in charge of the logic of retries, exponential backoff, and what have you, using the Sender internally to do the actual sending.

It is now easier to get an overview of the Sender, and also, actually, of the RetryingSender, because all its functionality is now at a single level of abstraction (one over that of sending requests).
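To make the split more concrete, here is a minimal sketch of what the retry logic inside a RetryingSender could look like. It is not the original implementation, and it rests on several assumptions: the wrapped sender's send is assumed to take a Node-style callback(err, result) as a third argument, and the options object with maxAttempts, baseDelayMs and schedule is a set of hypothetical names introduced only for this illustration.

```javascript
// Sketch only. Assumes the wrapped sender exposes
// send(recipients, message, callback) with a Node-style callback(err, result).
// `options`, `maxAttempts`, `baseDelayMs` and `schedule` are hypothetical names.
var RetryingSender = function (sender, options) {
  options = options || {};
  this.sender = sender;
  this.maxAttempts = options.maxAttempts || 3;
  this.baseDelayMs = options.baseDelayMs || 100;
  // Injectable so that tests can run without real timers.
  this.schedule = options.schedule || function (fn, delayMs) {
    setTimeout(fn, delayMs);
  };
};

RetryingSender.prototype.send = function (recipients, message, callback) {
  var self = this;

  function attempt(attemptNumber) {
    self.sender.send(recipients, message, function (err, result) {
      if (!err) {
        return callback(null, result);
      }
      if (attemptNumber >= self.maxAttempts) {
        // Out of attempts: report the last error to the caller.
        return callback(err);
      }
      // Exponential backoff: baseDelayMs, then 2x, then 4x, and so on.
      var delayMs = self.baseDelayMs * Math.pow(2, attemptNumber - 1);
      self.schedule(function () {
        attempt(attemptNumber + 1);
      }, delayMs);
    });
  }

  attempt(1);
};
```

Because RetryingSender exposes the same send method as the object it wraps, callers do not need to know whether they hold a plain Sender or a retrying one, and the Sender itself stays free of any retry bookkeeping.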

Push complexity out of libraries

Now that we have made our classes much smaller, we have ended up with a whole bunch of classes. This is complexity, too. Big libraries are bad.

This follows logically and for exactly the same reasons as for classes and functions: it quickly becomes hard to get an overview and understand the full structure of what is being surveyed when there are too many components.

As an example, let’s look at jQuery. jQuery is absolutely huge. It does a whole lot of different things, and not many people quite know (not to mention use) everything it does. Most people use more than one of the features, but few use them all.

A lot of people use jQuery for the selector syntax, instead of just using Sizzle (which is what jQuery uses on the inside).

A lot of people use jQuery for the functional (well, pseudo-functional) helpers, instead of using a dedicated library like underscore.js or Lo-Dash.

Have you ever considered proposing a change to jQuery? I have. So I looked at the project, but it was not easy to get an overview of. It does too many things.

Look at the file tree for just a section of jQuery (I have ignored meta-files and dependencies; ... indicates that there is more that I haven’t shown):

+ jquery
|
+-+ src
| |
| +-+ ajax
| | +-+ var
| | | +-- location.js
| | | +-- nonce.js
| | | +-- rquery.js
| | +-- jsonp.js
| | +-- load.js
| | +-- parseJSON.js
| | +-- parseXML.js
| | +-- script.js
| | +-- xhr.js
| |
| +-+ attributes
| | +-- attr.js
| | +-- classes.js
| | +-- prop.js
| | +-- support.js
| | +-- val.js
| +-- ...
|
+-+ test
  +-- ...

There are many examples of big libraries like this. A great example of how to do this better is Ampersand. Positioning itself as the anti-Angular framework, Ampersand has split all of its functionality into small, independent modules that are easy to get a handle on. ampersand-model, for example, currently takes up a total of 132 lines of code, and is easy to get an overview of.

To compare with jQuery, here’s the file tree (once again without meta-files):

+ ampersand-model
|
+-+ test
| +-- backbone.js
| +-- index.js
| +-- model.js
|
+-- ampersand-model.js

Yowza, that’s what I’m talking about: beautifully simple tree, easy to get an overview of.

Small, composable packages are made possible by (recursive) dependency managers. With Node.js and npm, it is almost ridiculously easy to compose modules. Because modularization is so easy with npm, it comes naturally to create packages with a small, cohesive bunch of classes (or maybe just a single one!).

So how do you do this for your code? Find the small cohesive structures in your code and spin them off as individual modules. Take all your utility classes and spin them off, too. Each of your projects should be written on a single level of abstraction—that makes it easier to get an overview.

Aside

There are some edge cases that I have not covered in this blog post, aiming to make it a quick and coherent overview of the general principle.

While the general rule applies to all areas of software development, there are some levels of abstraction where it is harder to push complexity, applications (executable libraries) being one of them—here I am thinking especially of projects with big user interfaces. It is definitely possible, but requires some serious consideration.

