Levelling Up as a Developer

Hi there. So, it looks like you're a programmer now?

You've got comfortable with your stack, you know and use most of your language's features, and you're ready to build something cool. There's a slight problem though; when you start building something a little more ambitious, it doesn't turn out quite right. Your code starts to get messy. You fix bugs in one place only to find your changes break an entirely different part of your app. You find yourself wanting to change things, but can't because of an invisible web of dependencies and constraints. It's frustrating - programming gives you real power to solve problems, but you seem to hit the same kind of issues again and again.

It's hard to find information about how to improve after this point. There's lots of high-quality, free information out there that teaches you how to code in a particular language. But to be a successful developer you need to know so much more; and most of what you need to know isn't anything about a particular language, but how to put the concepts you've learned together in a meaningful way. You can (and perhaps will) spend a lifetime learning how to best structure your code; learning how to make it clear, easy to extend, concise and readable all at the same time.

Once you get past the actual mechanics of programming, the most important thing to understand is that software development is largely about managing complexity. Anything beyond a trivially sized application will start to get complicated quickly. Keeping that complexity manageable stops your program from collapsing under its own weight.

A lot of knowing how to do this comes from working side by side with experienced developers; everything from general principles of design to specific rules of thumb to keep code clean. That takes time, practice and patience. While there is no substitute for getting your hands dirty, this article shows you some shortcuts to being a better developer. The ideas expressed here are to a large extent language independent - although specific phrasing might mention the classes and methods of a statically typed OOP language, the concepts are just as applicable to JavaScript objects and functions or Python's dynamic classes and modules.

With any luck, you'll finish reading equipped with a fresh perspective and some mental tools that you didn't have before - and that's always well worth the effort.

Cowardly Disclaimer: The following rules, guidelines and suggestions have worked well for me as a starting point. I make absolutely no claims that the following represents the best approach to development; only that I think it represents a good place to begin. Others will argue otherwise and have different (and even contradictory) advice.

You should listen to them too.

Overall Architecture

Picture the scene - you're at the beginning of a brand new project. It's going to be awesome, it's something you're excited about. Perhaps your software is going to change the whole world, perhaps it's going to revolutionize the way New York hot dog stand owners do business. Either is great. Regardless, it is easy to get overexcited and start thinking about your project's architecture as a goal, rather than a means to an end. While you do need to consider the major areas of your system and how they plug into one another, too much detail is obstructive to begin with. Defining specific classes and methods at this point is counter-productive; they're too granular. You might have thousands by the time you're finished - there's no need to single out any for special treatment right at the start.

Instead, think of your project as a series of interacting sub-systems; the only important thing at this stage is how information flows from one sub-system to another. Don't worry about implementation details. Develop your design based solely on the behaviour you need and the flow of information; what each part needs to know about, and where that data comes from. Once you've worked out those requirements, a lot of the rest will start to fall naturally into place.

I'm talking here about big chunks of high level stuff. Things like "this is the video processing component of our app; it pulls URLs from this message queue and streams out normalized and encoded video" or "this is our data aggregation package; it takes in a stream of raw log data and emits chunks of aggregated stats". Scribbling on a piece of paper or whiteboard and bouncing ideas off people helps. As you come to understand your evolving design and your problem domain better, you'll be able to narrow in on a solution that works well and then you can start to refine your architecture.

The reason you're doing this is that you end up with a map of how your project fits together and a clear place to start asking better questions of your design. Instead of attempting to solve one complex question ("how do I write doomsday-machine control software for my evil genius client?") you naturally end up with a set of more focused, manageable ones ("I know the Underling Control Panel needs to present a UI to the minions and spit out a set of targets for management approval - what's the best way to achieve this?"). In this way, you'll better understand the major components of your design and - crucially - the boundaries and interfaces between them.

From there, you can begin to think about implementation specifics; but don't try to design the implementation of your project before you understand the problems involved.

On a Need to Know Basis

As you start to implement the major components of your application, you'll need to make sure that each has a separate set of well defined responsibilities. As a general rule of thumb, each part should work as independently as possible - it should only require the minimum amount of information to function. Sometimes unintentional coupling can prevent this.

Coupling occurs when one component relies on another to function and would break without its dependency being present. Often this is required and forms an intentional part of your design. Sometimes, though, it can seem tempting to make an object more aware of its context for convenience's sake alone. This can - and does - cause trouble.

Making Contact

Consider the following as an example. You're building a swanky interface that allows a user to manipulate lists of contacts. The (imaginary) UI framework you're using is pretty standard - there's a base component class that can contain other components, and every widget available inherits from that base class. Each component has a list of its children and knows which component its parent is. You've built yourself a custom component called ContactInfo to display a contact's details; it's looking sharp, and you're pretty pleased with it. It looks like this:

For each contact, a new instance of this component gets instantiated and added as a child of a standard list component that comes with the UI framework; imaginatively enough, this component is called List. After you run and test, it looks like the following:

Pretty neat! The built-in list component handles a lot of stuff like scrolling and mobile touch control for you. Now we want to make the red button remove a contact from the list. Assuming we've set up an event handler that gets fired when the button is clicked, we might be tempted to write something in ContactInfo like:

private function onRemoveButtonClick() : Void {
	cast(parent, List).remove(this);
}

This obtains the reference to the parent, casts it to List and calls List's remove() method, passing itself along as the item to be removed. It compiles and runs fine and when the remove button is clicked we get exactly the behaviour we want. So what's the problem?

The issue with this approach is that ContactInfo is now coupled to List. This means that you can't use it anywhere else in your app; if you were to try, it would throw an exception at runtime, because its parent is no longer an instance of List. So that component isn't self-contained any more - its usage is totally dependent on it being embedded inside a List component. It makes your code more difficult to refactor, and more difficult to reason about. Now when you consider the ContactInfo component you'll always have to consider List too, and whether a change in one will break the other.

This is a fairly trivial example, but as soon as it becomes a common pattern in your codebase you're in trouble. Dozens or hundreds of such statements will severely hamper your ability to extend or change the behaviour of your app.

So what should you do in this situation? Firstly, think about the responsibilities of the objects involved. Should ContactInfo really know how to remove itself? Probably not. This is more of a list management function, and there is no reason why ContactInfo needs to be aware that it is currently in a list. Really, it is the job of whatever contains the list to remove items from it. ContactInfo should dispatch an event (or your language's equivalent) notifying anything listening that the user has clicked the remove button. That way the other objects can choose how to react, rather than the delete operation being baked directly into ContactInfo.
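As a rough illustration, here is one way that could look in Python. This is only a sketch: the imaginary UI framework is left out entirely, and on_remove_requested, click_remove_button and ContactListScreen are names made up for the example rather than anything from a real library.

```python
from typing import Callable, List


class ContactInfo:
    """Displays one contact. It has no idea what contains it."""

    def __init__(self, name: str) -> None:
        self.name = name
        # Callbacks to run when the user asks for this contact to be removed.
        self._remove_listeners: List[Callable[["ContactInfo"], None]] = []

    def on_remove_requested(self, listener: Callable[["ContactInfo"], None]) -> None:
        self._remove_listeners.append(listener)

    def click_remove_button(self) -> None:
        # Stand-in for the framework's click handler: announce, don't act.
        for listener in list(self._remove_listeners):
            listener(self)


class ContactListScreen:
    """Owns the list, so it is the one that decides how removal works."""

    def __init__(self) -> None:
        self.items: List[ContactInfo] = []

    def add(self, contact: ContactInfo) -> None:
        contact.on_remove_requested(self._handle_remove)
        self.items.append(contact)

    def _handle_remove(self, contact: ContactInfo) -> None:
        self.items.remove(contact)


screen = ContactListScreen()
alice, bob = ContactInfo("Alice"), ContactInfo("Bob")
screen.add(alice)
screen.add(bob)
alice.click_remove_button()
print([contact.name for contact in screen.items])  # prints ['Bob']
```

Because ContactInfo only announces the click, the same component can sit in a grid, a dialog or nowhere at all, and clicking its button outside a list simply does nothing.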

Sensible Decoupling Leads to Happiness

This leads naturally onto a more general principle for keeping both yourself and your application sane. If there is just one thing you take away from this article, this should be it: Any given part of your system should only need to know the absolute minimum about its own environment for it to do its own job and no more. Bearing this in mind will help you avoid writing a mess of spaghetti code where changing one feature causes breakages in other places that should have been unrelated.

To understand how decoupled the components of your application are, think about how reusable they are. If you were to remove a component from your current project and put it somewhere else, how much effort would it take to get it working? How much set up and configuration code is required to get that part running as needed? How many references to other objects does it rely on? As a good rule of thumb, the less pain it takes to instantiate and set up an object before you can begin to use it, the better. If an object (or class, or module) is a pain to set up and use on its own, the odds are you might have a design problem you want to take a look at.

Again, this is a complexity management thing. If you have a well tested piece of code that is decoupled, you don't need to constantly maintain a mental model of how it works. Until you need to change it, you can happily forget about how its insides function. This is crucial to making a larger codebase scale; no matter how clever you are, and how much you can think about at once, eventually you are going to hit your limit and complexity will overwhelm you. Don't let yourself get anywhere near that point.

Good code is easy to think about. It is easy to think about because you don't need to think about it all at once.

A Model with a View

Somewhat related to the above, this is worth stating explicitly: do not couple your model objects to your views. Model objects (or business objects) are those that encapsulate the logic and underlying behaviour of your application. They operate on a different layer of abstraction to whatever is displayed to the user and as such they do not in any sense need to be aware of your presentation layer. Your model classes are standalone entities that should be able to be used independently of however you're choosing to present your application.

I know the thought process - "but if I just give this one model a reference to my main form, I can call that update() method on the form when the model's data changes - that seems easiest!" Once you've done that, you can no longer instantiate or use that model fully without having an instance of the form too; this is coupling in the worst sense. It starts off so innocently and with the best of intentions, but all too quickly you end up piling on hack after hack to work around your initial mistake.

Instead, have your models dispatch events that your views can react to or implement a listener interface on your view class. Don't give the model a direct reference to your view implementation. That little bit of extra effort will be worth it later.
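A minimal sketch of that arrangement in Python might look like the following. TodoList and TodoCountLabel are hypothetical names invented for the example, and the "view" is just an object holding a string so the sketch runs without any UI toolkit.

```python
from typing import Callable, List


class TodoList:
    """Model: holds the data and announces changes. Knows nothing about views."""

    def __init__(self) -> None:
        self.items: List[str] = []
        self._change_listeners: List[Callable[["TodoList"], None]] = []

    def on_change(self, listener: Callable[["TodoList"], None]) -> None:
        self._change_listeners.append(listener)

    def add(self, item: str) -> None:
        self.items.append(item)
        # Tell whoever is listening; the model does not care who that is.
        for listener in self._change_listeners:
            listener(self)


class TodoCountLabel:
    """View: subscribes to the model and refreshes itself when told."""

    def __init__(self, model: TodoList) -> None:
        self.text = ""
        model.on_change(self.refresh)
        self.refresh(model)

    def refresh(self, model: TodoList) -> None:
        self.text = f"{len(model.items)} item(s) to do"


todo = TodoList()
label = TodoCountLabel(todo)
todo.add("Buy hot dogs")
print(label.text)  # prints 1 item(s) to do
```

The dependency points one way only: the label needs a TodoList, but a TodoList can be created, tested and reused with no view in sight.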

Model-view coupling. Not even once.

Code Shows Intent

Code is designed to be read by two audiences - compilers and people. Your primary audience is people. Compilers are comparatively easy to satisfy, but a lot of a developer's time will be spent on understanding pre-existing code so it needs to be easy to understand. In practice, this means tidying up after yourself and making sure your code is doing what it says it will do.

You can keep a few things in short-term memory. Sure, for now you can remember that the background2.png image is the current asset and background.png is actually unused. And perhaps you can remember that your AbstractWidget class has actually turned into a concrete implementation, or that your object's delete() method is obsolete because you refactored that bit.

The problem is the software industry is more unpredictable than we'd like, and that implementation that's good enough "for now" ends up becoming semi-permanent. When you come back to it a year later or someone else needs to do some work on it, the stubs of implementation or temporary files you left behind become misleading. It's difficult to work out what's required to make the code run in the way that it claims it will, because all the assumptions the code has led you to believe suddenly turn out to be wrong.

Although it is far from ideal, sometimes the code itself is the only documentation available. Make that documentation as good as you possibly can - don't leave it littered with inaccuracies and diversions. If you have code that requires your own personal short-term memory to understand clearly, clean it up before it gets committed. That way your colleagues (or your future self) will never have to wonder what you were thinking when you wrote it. Tidy up after yourself and you - and the rest of your team - will be happier later.

What's in a Name?

An awful lot, actually. Naming is one of the three hardest things in computer science. Method, object, package and variable names all need to be meaningful entities in their own right that help to communicate what they are and what they're used for.

Don't be afraid of having verbose, descriptive names when required, but don't have overly long variable names for the sake of it. Method, class and variable names should be both as concise as possible and as descriptive as required. When naming classes, don't feel the need to include a taxonomy of your whole inheritance tree; similarly, don't include unnecessary nouns related to parameters you pass to methods. Which of the following takes the least effort to read and understand?

decimal amount = 50.25m;
AccountFinanceModelRecord accountFinanceModelRecord
	= new AccountFinanceModelRecord();
accountFinanceModelRecord.depositAmount(amount);

decimal amount = 50.25m;
Account account = new Account();
account.deposit(amount);

The latter is much easier to understand not only because there is less to read, but because it includes enough to be clear without unnecessary information. The ultimate goal is to maximize the signal to noise ratio in your naming.

Consistency is important too; your language should have strong best practice conventions for naming. Use them - it will make your code accessible to everyone familiar with that platform. Equally, if you establish your own conventions, stick to them. Be as strict with yourself as you can be, right down to things like the number of line breaks between methods and coherent verb usage in event handlers. Consistency across a codebase lowers reading friction and lowers barriers to understanding.

Test Driven Development and a Different Perspective

Test Driven Development (TDD) is a method of developing software by creating automated tests before you write any code. These tests define a set of criteria that working code would satisfy; your job is then to write the code to make those tests pass. Repeat iteratively until you have an application.
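To make that cycle concrete, here is a small sketch using Python's built-in unittest module. The slugify function is a made-up example, not something from this article's project; the point is the order in which things were written.

```python
import unittest


# Step 1: written first. These tests describe what working code must do.
class SlugifyTest(unittest.TestCase):
    def test_lowercases_and_joins_words_with_hyphens(self):
        self.assertEqual(slugify("Hello World"), "hello-world")

    def test_ignores_surrounding_whitespace(self):
        self.assertEqual(slugify("  Hello World  "), "hello-world")


# Step 2: written second, and only as much as the tests demand.
def slugify(title: str) -> str:
    return "-".join(title.lower().split())
```

Saved in a file, this can be run with python -m unittest. Before step 2 exists both tests fail; once they pass, you pick the next behaviour you need and write a test for that.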

In practice TDD tends to be a little controversial. Discussions abound about how suitable a given project is for the technique, whether it results in the best design for a given problem and exactly what to test. Automated testing in general (unit, functional and acceptance testing) is widely viewed as a good thing to have, although whether you'll end up writing tests in practice depends on a number of external factors.

If it's not something you've had exposure to before, grab a testing framework for your favourite language and get stuck in. Testing is an important part of the industry as a whole so it's well worth investigating. Whether or not you decide to make it a regular part of your workflow, it is extremely valuable experience. In fact, the benefits of TDD stay with you even if you're not actively using the practice. Why? Because it teaches you to code from a better point of view.

Whenever you start work on a new class or library, it's always tempting to think about the internals first; how it's going to be structured, what properties it will need, etc. If you do this, the resulting interface to the rest of your code (its methods, events etc.) will be determined by that structure. This is entirely the wrong way around. When your component is used, the most important thing is the interface it presents to the rest of the code. This shouldn't be determined as a side effect of its internal structure. It should be designed from the ground up to be easy to use from calling code.

As such, the best way to start writing something new is to pretend it already exists and write it as you would use it in the ideal case. Perhaps you're writing a library to pull all the images from a web page. Before you do anything else, you might sit down and write:

string url = "http://example.com/page_full_of_pictures/";
ImageScraper scraper = new ImageScraper(url);
scraper.loaded += loadHandler;
scraper.load();

The class ImageScraper doesn't exist yet, but by writing a use case first we've defined our requirements for it in the most succinct way possible. By deciding what information to give it in advance and what events we need available to react to, ImageScraper will have the cleanest possible interface. Your job as a developer is now to go away and make the code you've just written work.
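For the curious, here is one possible shape that work could take, sketched in Python rather than the language of the snippet above. Several things here are assumptions made for the sketch: the loaded event is modelled as a plain list of callbacks, handlers receive a list of image URLs, and the optional fetch parameter exists only so the example can run without touching the network.

```python
import urllib.request
from html.parser import HTMLParser
from typing import Callable, List, Optional


class _ImgSrcCollector(HTMLParser):
    """Collects the src attribute of every <img> tag it is fed."""

    def __init__(self) -> None:
        super().__init__()
        self.sources: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            for name, value in attrs:
                if name == "src" and value:
                    self.sources.append(value)


def _fetch_over_http(url: str) -> str:
    with urllib.request.urlopen(url) as response:
        return response.read().decode("utf-8", errors="replace")


class ImageScraper:
    def __init__(self, url: str, fetch: Optional[Callable[[str], str]] = None) -> None:
        self.url = url
        # Assumption: fetch can be swapped out; by default the page is
        # downloaded over HTTP.
        self._fetch = fetch or _fetch_over_http
        # Handlers appended here are called with the list of image URLs.
        self.loaded: List[Callable[[List[str]], None]] = []

    def load(self) -> None:
        collector = _ImgSrcCollector()
        collector.feed(self._fetch(self.url))
        for handler in self.loaded:
            handler(collector.sources)


def fake_fetch(url: str) -> str:
    return '<p><img src="a.png"><img src="b.jpg" alt=""></p>'


found: List[str] = []
scraper = ImageScraper("http://example.com/page_full_of_pictures/", fake_fetch)
scraper.loaded.append(found.extend)
scraper.load()
print(found)  # prints ['a.png', 'b.jpg']
```

Notice that the calling code at the bottom still reads almost exactly like the use case written up front; the internals were shaped to fit it, not the other way round.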

As a bonus, this approach lets you use refactoring tools to their fullest. Most language-aware IDEs will allow you to automatically implement missing methods or create classes if you've referred to them and they do not yet exist. They'll even make intelligent guesses about parameter names and any typing you require, which can speed up development significantly.

Always think from the calling code's perspective, and start writing it that way too. Your implementation should be defined solely by the behaviour you require, rather than letting your implementation dictate the behaviour available.

DRY, YAGNI, KISS and Other Trendy Acronyms

There are plenty of acronyms and pithy phrases floating around for various development practices and rules of thumb. While these are all worth investigating and evaluating on their own merits, there are three that I have found exceedingly useful in practice.

1. Don't Repeat Yourself (DRY)

This is as simple as it is useful. If at any point you see repetition in code you're writing, immediately stop typing and think about ways to refactor that repetition out into a separate function or variable. Repetition like this is usually fairly easy to spot, because the actual shape of the text on screen tends to be similar for repeating sections of code. If you have similar functionality across different classes, pulling that out into a separate utility function is better than leaving it duplicated in two places. Additionally, pulling out repetition in this way makes it much easier to identify and work with higher level abstractions.

Not having chunks of repeated code is a basic tenet of programming anyway, but establishing it as a principle with a funky name perhaps allows you to focus on it a little more. People talk about keeping their code DRY; now you can too. The benefits should be obvious; you shouldn't ever have to make the same change in multiple places.
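A tiny before-and-after sketch in Python shows the idea. The function names are invented for the example, and amounts are assumed to be non-negative whole cents.

```python
# Before: the same formatting rule is written out twice.
def invoice_line_repetitive(name: str, cents: int) -> str:
    return f"{name}: ${cents // 100}.{cents % 100:02d}"


def receipt_total_repetitive(cents: int) -> str:
    return f"Total: ${cents // 100}.{cents % 100:02d}"


# After: the rule lives in exactly one place.
def format_money(cents: int) -> str:
    return f"${cents // 100}.{cents % 100:02d}"


def invoice_line(name: str, cents: int) -> str:
    return f"{name}: {format_money(cents)}"


def receipt_total(cents: int) -> str:
    return f"Total: {format_money(cents)}"


print(invoice_line("Coffee", 350))  # prints Coffee: $3.50
print(receipt_total(1205))          # prints Total: $12.05
```

If the currency symbol or the rounding rule ever changes, the second version needs one edit instead of a hunt through the codebase.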

2. You Ain't Gonna Need It (YAGNI)

YAGNI is a principle to live by. It keeps your code simple, focused on the task at hand and very amenable to refactoring later. The basic idea is that you should never implement something until it becomes absolutely clear that you'll need it. When you do, refactor as required and implement it properly.

Developers often feel a desire to generalise; that is, engineer the solution to a problem so that it solves both the original problem and becomes capable of solving a whole set of related problems too. In theory this is great - more power to cope with future change, right? In practice it tends not to work so well. That extra power comes at the cost of increased complexity. Your generalised solution needs configuring to work in each specific case - including the problem you were first addressing. This typically means another layer of abstraction, some extra configuration code and finally a more fiddly implementation of the problem solving code.

If that isn't bad enough, the real cost comes later when the project's requirements change, or you find out that the approach you thought would work to start with needs reworking. Because there is an infinite variety of possible changes, the probability that you will have correctly anticipated any actual requirements changes in advance is extremely low. This is why trying to code around "What if?" questions as you develop is a losing game; the odds are you won't even be asking the right questions in the first place.

The ultimate pathological manifestation of this is the Inner Platform Effect, where a project ends up attempting to solve such general problems it begins to resemble a poorly implemented version of the platform you're building on in the first place.

To avoid this, don't implement abstractions now that you don't need in response to vague concerns about what problems might appear in the future. Don't have abstract classes with only one concrete implementation. Don't have a one-class implementation of an interface because you "might need to change it later". Don't push methods further up the inheritance tree just in case other classes might want to use them sometime in the future. Prove the need to yourself first. If you need to change things later that's fine. Code is malleable, and can be changed and rewritten as you like. This is easier the simpler and more focused the code is. It is far better to extract an interface or superclass at the point you need it and not before. It also takes a lot less effort if your refactoring efforts are not hampered by unnecessary layers of indirection.
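Here is a deliberately small Python sketch of the difference. The settings-parsing scenario and every name in it are hypothetical; the only requirement assumed is "read settings from a JSON string".

```python
import json
from abc import ABC, abstractmethod


# Speculative: an abstract base "in case we need another format later",
# with exactly one concrete implementation.
class SettingsParser(ABC):
    @abstractmethod
    def parse(self, text: str) -> dict:
        ...


class JsonSettingsParser(SettingsParser):
    def parse(self, text: str) -> dict:
        return json.loads(text)


# Focused: the only behaviour anyone has actually asked for so far.
def parse_settings(text: str) -> dict:
    return json.loads(text)


print(parse_settings('{"theme": "dark"}'))  # prints {'theme': 'dark'}
```

Both versions do the same job today. If a second format ever turns up for real, extracting a base class from parse_settings at that point is a small refactor, and by then you'll know what the abstraction actually needs to cover.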

3. Keep it Simple, Stupid (KISS)

I don't like the wording of this principle. Whatever the original intention of the phrase, it now just comes across as arrogant and alienating (as does any piece of advice that comes bundled with an apparent insult). Despite this, the saying itself is widespread and still represents very sound advice. In fact, this entire article is built around the idea of making your code simpler by managing complexity.

Perhaps it should be commonly phrased a different way. A long time ago I read an excellent book written by Charlie Calvert for a now sadly obscure programming language. It was a very well written piece of work, but what struck me most was a George Sand quotation in the preface: "Simplicity is the most difficult thing to secure in this world; it is the last limit of experience and the last effort of genius."

Simplicity in your work is not something only the stupid fail to do; it is something to continually strive for and it is the ultimate expression of craftsmanship and ability.

Always Be Refactoring

As you work, your code will start becoming messy. It is a fairly inevitable consequence of progress; approaches that you've tried won't quite work out, you'll perform little hacky experiments here and there to see just what that undocumented framework is doing and you'll want to alter the purpose of objects and functions as you come to know more about them. This is as normal as it is unavoidable.

To clean up, you'll need to refactor. While you can code for a week and then switch to refactoring mode, you'll see dramatically better results if you refactor as frequently as you can. A clean up of your code results in more of its structure becoming apparent. That extra knowledge feeds back and informs subsequent development, so keeping that feedback cycle as short as possible lets you work more efficiently. As those cycles get shorter and shorter, you end up refactoring more or less continually as you write. This is not a waste of time; the time you invest now will be repaid later.

Putting off refactoring code is rarely the right decision. It is a form of technical debt, and as such it only becomes a more difficult and complicated task with time. You can put off doing it right for a while, but that debt will always need to be paid back one way or another. Eventually you can reach a point where it becomes economically impossible to refactor a large codebase back to a clean state. While it's always technically possible to do so, the amount of time required can become prohibitive in a commercial setting. From that point onward, future growth and the ability to adapt become severely curtailed. The project is trapped in a descending spiral of complexity and will eventually reach a limit where nothing can be changed without breaking something else.

It's your responsibility to make sure that doesn't happen. Refactor wherever and whenever you can; not only will you become a better developer, it will save your sanity in the long run.

Conclusion

Developing software is hard, and keeping complexity in check as you grow your application can be very difficult indeed. Some of the guidelines in this article probably won't suit the way you work - others might permanently change the way you code. There's always more to learn, so keep an open mind and work constantly at becoming a better developer. Books for your chosen language or framework are often a great shortcut to being able to find best practice code and novel design patterns. Reading lots of code is likewise great: develop a habit of nosing around the source for any open source libraries you use. Following and contributing to discussion with others in the business is a great way to find out more and decide what approaches work for you. Co-workers are great for this, as are online communities such as Proggit and Hacker News.

The software business is full of ideas; combine that with a low barrier to entry and you get one of the most dynamic and fast paced industries in the world. Keep a critical eye out for fads, but don't let this stop you from trying things out; there's always something to investigate, a new framework or paradigm to try. Keep pushing yourself. Every bit of effort you put in will be rewarded with knowledge that makes you a better developer than you were before.

There's a long road and a bright future ahead.

Good luck.