In my previous post I discussed a variety of desirable attributes in an application architecture. I made the point that although you should make some basic concessions to performance (such as avoiding chatty communications), optimising for performance without being able to measure the results is generally sub-optimal. I also argued that in most cases "good enough" performance is all that's required, beyond which additional gains have no benefit. But what about cases where this doesn't hold? How do we address problems where we need to be able to add performance to an application over time?

What we need to consider here is what the actual requirement is. In line-of-business applications the requirement is generally that we be able to support increased usage of our application. This may be because we add functionality that increases the load on the system or, more likely, because we need to support additional users. For instance, if you run a web site that is experiencing increased demand, you will in the short term be more concerned with ensuring that your system can handle the increased traffic than with additional load due to new features (although this may also be a problem). At the very low end you can simply throw hardware at the issue by increasing the speed of the processor or the available memory. However, there is a definite limit to the amount of load a single system can handle. To handle loads beyond this we need to look at architecture.

What this leads to is that we want an architecture that demonstrates scalability. Scalability refers to how well an application can take advantage of additional resources in order to perform additional work. At the basic single-system end it may refer to the ability of an application to use multiple cores in a system (through being multi-threaded). More often it will refer to the ability of the application to run on multiple systems simultaneously in a coherent fashion. Running on multiple systems allows the overall performance of the application to be increased. With an appropriate architecture it may also improve the robustness of the application by making it tolerant of the failure of a system on which it runs.
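As a rough illustration of the single-system end of this, here is a minimal Python sketch of work that can take advantage of additional workers. The names `handle_request` and `serve` are hypothetical and the "request" is a stand-in calculation; the point is only that each unit of work depends on nothing but its own input, so the same code gives the same answer whether it runs on one worker or several.

```python
from concurrent.futures import ThreadPoolExecutor


def handle_request(request_id: int) -> int:
    """Hypothetical unit of work: it reads no shared state, only its input."""
    return request_id * request_id


def serve(request_ids, workers: int) -> list:
    """Spread independent requests over `workers` threads.

    pool.map returns results in the same order as the inputs, so the
    caller cannot tell how many workers were used.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(handle_request, request_ids))


if __name__ == "__main__":
    # Same answers with one worker or four; only the capacity differs.
    print(serve(range(5), workers=1))
    print(serve(range(5), workers=4))
```

Threads are used here only to keep the sketch short. In CPython, CPU-bound work generally needs separate processes (or separate machines) before extra cores translate into extra throughput, but the shape of the design, independent units of work handed to a pool, stays the same.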

Designing an application to be scalable means addressing a number of challenges. Scalable applications need to consider issues of concurrency to a much greater extent than applications where scalability is not a concern. Scaling an application to multiple systems can affect robustness both positively and negatively depending on how the architecture allows for failure of individual systems.

Providing for scalability represents an overhead both at runtime and in development. Runtime overheads come from the additional complexity and synchronisation required to ensure the application behaves consistently across multiple systems. The compensation for this overhead is that overall performance may be significantly higher as the application can utilise far more than the capacity of a single system. Development overhead is incurred due to the additional effort required to build a system in a scalable fashion.

In building an application to be scalable it is worth considering what level of scalability is warranted. At the far extreme, Google have produced a platform using custom infrastructure that scales into the hundreds of thousands of machines. You are highly unlikely to be building an application that warrants this level of scalability. Even if you see your application needing this at some point (and you're so very wrong about that), such an extreme is not a valid target for an application that doesn't already have enormous load. Designing for such an environment implies a large number of restrictions that will not apply to you. Also, you're unlikely to have the budget to produce your own customised version of Linux and management software to run everything on.

You should instead consider your current utilisation and realistic projections for the load your application is likely to have to handle. If your application becomes wildly popular, this gives you room in which to scale without having to pay the costs of developing an enormously scalable platform. In the unlikely event you become the next big thing you're probably going to have to redevelop your system anyway, at which point you can make more informed decisions as to your scalability requirements.
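To make "realistic projections" a little more concrete, here is a back-of-the-envelope sketch in Python. Everything in it is an assumption for illustration: the function names, the compound-growth model, the headroom factor, and the figures in the usage example are invented, and you would substitute measurements from your own system.

```python
import math


def projected_load(current_rps: float, monthly_growth: float, months: int) -> float:
    """Project peak requests/second assuming steady compound monthly growth."""
    return current_rps * (1 + monthly_growth) ** months


def servers_needed(peak_rps: float, per_server_rps: float, headroom: float) -> int:
    """Servers required to carry peak_rps with a fractional safety margin.

    per_server_rps should be a measured figure for one server, not a guess.
    """
    return math.ceil(peak_rps * (1 + headroom) / per_server_rps)


if __name__ == "__main__":
    # Illustrative numbers only: 40 req/s today, 10% growth a month,
    # a two-year horizon, 50 req/s measured per server, 50% headroom.
    future = projected_load(40, 0.10, 24)
    print(f"projected peak: {future:.0f} req/s")
    print(f"servers needed: {servers_needed(future, 50, 0.5)}")
```

With inputs in this range the answer is a handful of machines rather than thousands, which is the scale the design should be aimed at.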

In conclusion, prefer scalability over performance optimisation, as this will give you the ability to handle higher overall load. When determining the scalability requirements of your application, make your design relevant to the likely load rather than over-engineering to handle load levels that are implausible for your circumstances.