Category Archives: Opinions

How (and why) to move performance testing earlier in the development cycle [WOPR22]

I recently attended the WOPR22 conference in Malmo. This focussed on discussions around how to move performance testing earlier in the process. This is a big subject and there is clearly no magic bullet solution, but I thought I’d share some of the key takeaways from the discussions.

Performance testing needs to be thought of as more than just load testing

A traditional approach of “finish development, go through a load testing process and approve/reject for go-live” really doesn’t work in a modern development environment. The feedback loop is just too slow.

We have to be providing feedback earlier. Load testing though is not ideal for this, there are a lot of drawbacks – scripts have to be maintained, environments available, datasets managed, load models understood. These are surmountable but quicker feedback can be provided by simpler processes that can be added to a standard CI solution or executed manually but regularly.

Examples are things such as

  • waterfall chart analysis of pages being returned
  • unit test timings
  • parallelisation of unit tests to get some idea of concurrency impacts
  • using perf monitors/APM on CI environments

These will not find true problems that only occur under load but they will give you some early indications that problems may be there.

Performance Testers need tighter integration with the development team

The performance testing team cannot be too distant from the development team – for practical and political reasons. There was a lot of discussion about whether a performance team works best as a distinct entity or as individuals integrated across the teams. There are pros and cons for each argument.

What is clear though is that when issues are identified there must be co-operation between performance testers and developers to share their knowledge to resolve the problem. Performance testers should not be people who just identify problems – they must be people who are part of the team that solves the problem.

Mais Tawfik has the policy of physically pairing the performance tester who has identified the problem with the developer who is working on the fix until a resolution is found.

Performance Testers still need space for analysis

One of the downsides to pushing performance testing earlier was that it often results in an additional demand for testing without provision of appropriate space for analysis. Performance testing is an area where analysis of the data is important – it is not based on black and white results.

It is often overlooked that data is not information. Human intelligence is required to convert data into information. An important role of the performance tester in improving any process is to ensure that there is not an acceptance of data over information because data can be provided more regularly. We must ensure that there is sufficient quality, not just quantity of performance testing during the development process.

Environmental advances can make this process easier

Cloud and other virtualised environments as well as automation tools for creating environments (e.g. Chef, Puppet, CloudFormation) have been game changers for earlier and more regular performance teasing. Environments can be reliably created on demand. To move testing earlier we must take advantage of these technologies.

Use automation to simplify the process

Automate the capture of metrics during the test to speed up the entire process. Using APM tooling helps in this respect. Automating this reduces the overhead associated with the process of running a test and analysing results.

Attendees of WOPR22 in Malmö, Sweden.

Attendees of WOPR22 in Malmö, Sweden.

Based on discussion with all WOPT22 attendees:

Fredrik Fristedt, Andy Hohenner, Paul Holland, Martin Hynie, Emil Johansson, Maria Kedemo, John Meza, Eric Proegler, Bob Sklar, Paul Stapleton, Neil Taitt, and Mais Tawfik Ashkar.

Leave a comment

Filed under Opinions

It’s not just about being the fastest…

First published on Performance Calendar on 21st December 2013

I have been doing a lot of work this year about creating a performance culture within a company. This is an essential step on the route to creating good performance in your products and only when you start treating performance as a first class citizen will you start to get buy in to the time and effort needed to create performant systems, both from the developers and from the business as a whole.

This is a costly process and requires an investment of time and effort from a company to fully implement and the business will expect to see benefits back in return for this investment.

I would like to address two common problems that cause the value of this investment to be undermined.

1) Solving the technical challenge not the business problem

When I have talked to some developers who are getting into performance and trying to get buy in from their business and are struggling I often hear the same complaint – “why don’t they realise that they want as fast a website as possible?”.

To this I always answer – “because they don’t!”. The business in question does not want a fast website. The business wants to make as much money as possible. Only if having a fast website is a vehicle for them doing that do they want a fast website.

The key point I am making here is that it is easy as a techie to get excited by the challenge of setting arbitrary targets and putting time and effort into continually bettering them when more business benefit could be gained from solving other performance problems.

To address this we need to take a step back and address exactly the performance problems that are being seen and how they are impacting the business.

These may be slow page load when not under load. Equally likely will be that they suffer slowdowns under adverse conditions, they suffer intermittent slowdowns under normal load, they use excessive resources on the server necessitating an excessively large platform or many other potential problems.

All of these examples can be boiled down to a direct financial impact on the business.

As an example, one company we worked with determined that their intermittent slowdowns cost them 350 sales on average which would work out to £3.36m per year. This gives you instant business buy in to solve the problem, a direct problem for developers to work on and a trackable KPI to track achievement and know when you are done after which you can move on to the next performance problem.

Another company I worked with had a system that performed perfectly adequately but was very memory hungry. Their business objective was to release some of the memory being used by the servers to be used on alternative projects (i.e. reduce the hosting cost for the application). Again a direct business case, a problem developers can get their teeth into and a trackable KPI.

To sum up – start your performance optimisation with a business impact, put it into financial terms and provide the means to justify the value of the development efforts to the business.

2) Over optimisation results in technical debt

The second issue I would like to address is the idea that we should always build the most ultra performant system.

No – we should always build an APPROPRIATELY performant system.

Over optimising a system can be just as negative as under-optimising. Building a ultra performant, scalable web application takes many factors such as…

Time

Building highly performant systems just takes longer

Complexity

Highly optimised system tend to have a lot more moving parts. Elements such as caching layers, NoSQL databases, sharded databases, cloned data stores, message queues, remote components, multiple languages, technologies and platforms are things that may be introduced to ensure that your system can scale and remain performant. All these things take management, testing, development expertise and hardware.

Expertise

Building ultra performant website is hard, it takes clever people to devise intelligent solutions, often operating at the limits of the technologies being used. These kind of solutions can lead to areas of your system being unmaintainable by the rest of the team. Some of the worst maintenance situations I have seen have been caused by the introduction of some unnecessarily complicated piece of coding designed to solve a potential performance problem that never materialised.

Investment

These system require financial support in terms of hardware, software and development/testing time and effort to build and support.

Compromises

Solving performance issues often is done at the expenses of good practice or functionality elsewhere. This may be as simple as compromising on the timeliness of data by introducing caching but often maybe accepting architectural compromises or even coding good practice compromises to achieve performance.

Summary

The warning I want to give here is to understand your performance landscape, set your KPIs, define your performance non functional requirements, set performance acceptance targets or whatever method you use to determine how your application is expected to perform.

This action is essential to allow developers to be able to make reasonable assessments of the levels of optimisation that are appropriate to perform on the system they are developing.

Leave a comment

Filed under Opinions

Making Slow Websites Faster, Quickly

Intechnica recently hosted an event called Faster Websites – aimed at discussing with retailers some of the means and methods that can be adopted in improve performance of their online presence.

As part of the preparation for this event we evaluated the websites of the potential attendees as well as the top 50 leading retail sites in the UK.

As would be expected there was a wide range of results from the very fast to the quite slow.

I had a look into the performance of some of the slower sites to see if there were any quick wins that I could propose to improve their speed. I did a very limited investigation using WebPageTest under normal traffic conditions (as far as I know) and came up with the following observations.

Most follow general good practice

With very few exceptions most were doing the obvious things (minifying javascript, compressing content, using a CDN etc). This illustrated that there was unlikely to be a simple, config based solution to the slowness.

Slowness was caused by client side, not server side, issues

None of the sites spent more than 0.5 seconds waiting for a server response, indicating that the server is not struggling to return content. This is as would be expected for a site homepage that is not under load.

Very large page weights – especially javascript

A large amount of the slowness was being caused by simple page weight issues. All of these sites were requesting well over 100 elements with some requesting over 200 items.

The largest chunk of this was images, as was to be expected. As these are retail sites, there is an argument to be made that high quality imagery is to be expected and is essential for business. However one site was requesting close to 70 images, taking 3.5mb of data. It would certainly be worth investigating whether these images could be compressed, loaded asynchronously or even just removed.

Of more concern to me across all these slow loading sites was the general size and number of javascript files that were being requested. Sites were requesting over 40 distinct javascript files and file sizes totalling 300kb+ were common, with one site topping 600kb of javascript content. In most cases this javascript had already been minified and compressed. In all these cases the use of javascript should be fully investigated and rationalised.

CSS and even HTML files were similarly large (50kb+) and could equally be rationalised.

Complex DOMs

Most of the slower sites had more complex DOMs, often topping 2,000+ elements. This does not necessarily cause an issue but when combined with complex javascript and content manipulation it can easily lead to slowdown.

In the examples I tested this was illustrated in how long the startup event was taking for some pages. In one example this took over 1.5 seconds. This illustrates a page that is far too complex and needs rationalising.

3rd parties causing slowdowns

There were a couple of sites that were slowed down in their load time by waiting for responses from 3rd parties for content (e.g. from Facebook). In one case this was causing a 12 second delay.

As a site owner you really can’t let your performance be in the hands of 3rd party content. You must aim to make all these calls asynchronous after page load if possible.

Part of any performance assessment should include poor performance of these elements.

Overall the impression that I got was that effort on these systems was still mainly focussed on providing server side performance, and the state of the client side was being generally ignored beyond following standard good practice. A more considered approach could easily (days not months of effort) speed these pages up dramatically.

Leave a comment

Filed under Opinions

Treat Performance as a First Class Citizen

Steve Souders wrote a very interesting blog post recently (http://www.stevesouders.com/blog/2013/08/27/web-performance-for-the-future/) about treating “performance as a discipline”.

The premise of this article was that performance is such a fundamental issue that a separate team should be created to focus purely on performance.

Seeing this view put in writing by one of the leaders in the performance arena was very refreshing to me. At Intechnica we have been pushing this message for a number of years and it is nice to finally see it gaining some traction within the industry. We formed the company to deliver this capability into other companies.

For me the battle that we face day to day is the battle to get people to treat performance as a First Class Citizen within the development industry. There does often still seem to be a sense that good performance is just something that developers should be able to achieve with more time or more kit.

The reality of course is that that is true up to a point. If you are developing an average complexity website with moderate usage and moderate data levels then you should be able to develop something that performs to an acceptable level. As soon as these factors start to ramp up then performance will suffer and will require expertise to solve the problems. This does not reflect on the competency of the developer, it is just a reflection that a specialised skill is required.

The analogy I would make to this would be to look at the security of a website. For a standard brochureware or low usage site then a competent developer should be able to deliver a site with sufficient security in place. However when the site ramps up to a banking site you would no longer expect the developer to implement the security, there would be an expectation that security specialists would be involved and would be looking beyond the code to the system as a whole. This is no negative reflection on the developer, just that the nature of security is so important to the system and so complex that only a specialist can fully understand the solution that is required. This is acceptable because security is regarded as a First Class Citizen in the development world.

Performance issues often require such a breadth of knowledge beyond simply looking at the code (APM tooling, load generation tools, network setup, system interaction, concurrency effects, threading, database optimisation etc.) that specialists are required to be able to solve them.

These are not better than developers, they just have different skills.

At Intechnica we have run projects where we have a performance scrum team. Leaving developers to deliver functionality with usual performance best practice but no specific defined KPIs. Then applying the acceptance KPIs afterwards and passing failures onto the the performance scrum team backlog.

We have also developed projects with performance engineers within each scrum team and applied KPIs as part of the feature we were developing and to the system as a whole.

Both are valid approaches. There are other valid approaches. As long as performance is treated as a First Class Citizen then you will be on the right track to performance success.

Leave a comment

Filed under Opinions

The Joy of Performance Improvements

I think that most developers get in to development for one of two reasons – they like solving problems, or they like building things. For me when I got into it is was the latter. I loved the fact that I would build things that people would then use (admittedly, I have often also built things that people didn’t use).

However as my career has progressed I have realised that by far a more enjoyable and satisfying element of development is working on performance improvements.

Since I moved into management I don’t get to work on development as much as I would like (which would be all the time), but recently I have built a system from scratch to delivery while also working on some performance improvements to an area of a site that had been declared “not fit for purpose”.

Comparing the two experiences, there was just no comparison. I’m talking about the experience of taking a failing system, investigating, manipulating, testing, making amendments, re-testing, assessing the impact of tiny changes, assessing the likely impact of large changes, finding out why elements that perform independently do not perform in combination, re-testing again, making another small improvement, re-testing, re-investigating and repeating it until performance is acceptable – then doing a few more rounds until performance is awesome. This was soooo much more enjoyable and rewarding than just building a system from scratch.

I’d recommend any developers to try to get involved in this side of development. That’s why I set up a performance management and improvement company!

By the way, if anyone is interested, the outcome of the performance improvements was taking a process that was taking 45 minutes to run down to running in 3 minutes. The target was 10 minutes.

1 Comment

Filed under Opinions

Technical Debt Management 101

First of all, make sure you read my previous technical debt management posts (how technical debt can be both good and bad) – this post is designed to address the issues described therein.

What can be done to prevent negative technical debt and ensure you are in control of positive technical debt? How do you prevent it becoming a problem?

The simple answer is that you can’t. All systems will have some planned and unplanned technical debt. Software development is a complex, fast moving business and developers are only human. Decisions and technology choices made in the best of faith will turn out to be wrong.

However there are some methods that can be used to minimize the risk and impact of technical debt. Here are some necessary elements of technical debt management:

  • Good system architecture
  • Abstract systems, so that switching out technologies or areas of the system can be done with minimal impact
  • Robust system management
  • Enough testing (automated or manual) and documentation to enable low risk re-engineering if needed
  • Shared responsibilities so that the system is not too dependent on individuals
  • Technical debt management
  • Awareness of areas of technical debt within the system and of the longer term impact of the compromises or decisions you are making now

Paying off debt

Dr._Strangelove_-_The_War_RoomSo, having followed all the good practices specified and having understood the nature of my technical debt, how should I pay it off when time comes?

Well, there are three ways and I will do my best to stretch the financial metaphor one last time to illustrate them…

Ongoing small payments

Make a commitment that with every feature introduced an area of technical debt will also be addressed. Make technical debt everyone’s ongoing problem.

Lump Sum Payment

Take everyone out of feature production for a set period and resolve the main areas of technical debt. Make technical debt everyone’s problem for a short time.

Dedicated payment plan

Create a dedicated technical debt management team who do nothing but resolve technical debt issues and run this alongside the development program. Create some people who have technical debt management as their only problem.

MVP vs Future Proofing

One of the buzz words at the moment is MVP (Minimum Viable Product) – the concept of only building the bare minimum that is needed to deliver the current requirement – like the XP concept of YAGNI (You Aren’t Gonna Need It).

Back in the mists of time when I first started developing, it was all about future proofing – building the framework into your software to allow for future changes to be introduced in a simpler way.

So which approach is right?

I often come across developers advocating elements of future proofing. The reasons advocating such are:

  • Future proofing builds a system based on solid foundations
  • Cutting costs by not doing this piece of work now will lead to increased costs later
  • The major impact of certain future events can be mitigated with a small amount of effort now

However, for me these advantages are outweighed by the advantages of the MVP approach:

  • The business will see benefit from the work much sooner
  • Future proofing is spending money now in case you need to spend it in future – the vast majority of future proofing efforts end up going unused
  • Future proofing can be made obsolete by non-technical issues
  • Changing business requirements
  • New technologies – how often have we spent days coding round an issue that then was included in the next release of the language?
  • Failure of third parties your system depends on

That said, I would still advocate good development practices as a way of minimizing the impact of change later on. Good practices like separation of concerns, loosely coupled objects, modular design, reasonable levels of abstraction, dependency injection etc. are all the building blocks that will enable you to create a system that is well placed for change.

Reduced functionality is no excuse for bad code.

Leave a comment

Filed under Opinions, Technical Debt

Negative Technical Debt part 2: Over-funding

This is a fairly common sort of technical debt, and again one that can be laid at the door of developers.

In financial terms this can be thought of as borrowing too much money and then paying high interest payments. In technical terms this is seen in the over-engineering of systems to build a level of complexity that is just not needed for the system being built.

This results in longer development time, larger amounts of bugs and much more complex ongoing maintenance. The analogy for this is building a hatchback and putting a Formula 1 engine in it (just in case one day it needs to enter a Grand Prix), and then giving it a lifetime warranty.

It was this sort of technical debt that the agile concept of YAGNI (You Aren’t Gonna Need It) was introduced to address. If it is not something that you need right now, then don’t build it. As we discussed in Positive Technical Debt part 2: Structured Finance, YAGNI can also lead to the development of technical debt, but that technical debt is positive and can be controlled.

This type of technical debt is the most negative form as it is very difficult to control. By its very nature it is something that often only becomes apparent when it is already is existence (if it was spotted in advance, it just wouldn’t be done).

Following agile practices, focused development, regular architectural reviews and experienced developers are all ways of dealing with this problem.

In the final post of this series I’ll talk about how to manage Technical Debt (subscribe). In the meantime, why not take a look at the whole Technical Debt series, where we discuss many forms of positive and negative technical debt?

Leave a comment

Filed under Opinions, Technical Debt