Category Archives: Agile

Clumsy is the new [FR]agile

I recently gave a quick talk at Agile on the Bench in Norwich entitled ‘Clumsy is the new [FR]agile’. The following is a rough transcript of that talk:

Hi, I’m Dom Davis, CTO at Rainbird, and lapsed Agile Evangelist.

I’m going to be honest here. When Paul asked me about coming to Agile On The Bench I fully expected to be sitting in the audience watching someone else talking. You see, I’ve rather fallen out with agile. I don’t recognise what people call Agile these days. I couldn’t even tell you what Agile is anymore.

In my last role I was confidently informed that I didn’t do agile because I didn’t do scrum – which was an interesting statement. Agile isn’t scrum. And Scrum spelled with a capital “Waterfall” isn’t agile either. It may sound harsh, but I can guarantee you that’s what they were practicing.

Part of this person’s philosophy was that stand-ups were important, but hard to organise every day. So instead of five short daily stand-ups there was one hour-long weekly stand-up. That’s not a stand-up. That’s an uncomfortable team meeting.

And it’s not just that one person. I’d go so far as to say that most people using Scrum aren’t really being Agile. At best they’re being Clumsy. At worst they’re deluding themselves. The same holds for Kanban, Lean and pretty much any flavour of Agile you care to mention.

The problem is that you cannot prescribe a one size fits all approach to software development. People are different. Teams are different. Companies are different. Trying to force people to work in a specific way simply because it has been prescribed as “Agile” is agile with a capital FR.

What works at company X may not work at company Y. And while it may look all fine and dandy at company X I’m willing to bet they also have issues – they’ve just got an evangelist who’s willing to stand up and talk about the good bits, while glossing over the problems.

This carbon copy approach shows a fundamental misunderstanding of what it means to be agile. Scrum, Kanban, Lean; these are all frameworks, philosophies you can weave into the very fabric of your company, not rigid processes to be mandated and enforced. Constraint and agility are diametrically opposed. One cannot allow the other. A rigid process, by definition, cannot be Agile.

And yet the virtues of Agile have been sung from the highest parapets. So much so that we all now know, at a visceral level, that Agile Is The Way. So if it doesn’t work for some reason we’ll just try another flavour of it. We’ll employ certified scrum masters and hope that if we believe hard enough, and follow the plan rigidly enough, it will all be OK.

It won’t, because you’re addressing the wrong problem. This is not the Agile way. It’s the Clumsy way. It’s only when you need to react fast, to be truly Agile, that you find out you’re actually Clumsy. And you find out the hard way, tripping over your processes and getting tied in knots.

Agile is not a process, it’s a state of being. It means you can act with agility. That you can react to the needs of the business, and to the pitfalls of software development in a timely manner. And that’s it. Everything else is process.

Most truly agile teams will review and change their process regularly: keep what is working at this time, shed what is not. What works today may not work in 6 months because the problem space is dynamic, constantly shifting. They understand this and embrace it. There is no perfect solution, just something that works well enough for now.

Strip away the buzzwords and we’re just talking about project management. Except project management is a dirty word. It’s enterprise-y, and we’re all ninja-rockstar-full-stack developers deploying fleets of containerised micro-services to the cloud. We don’t want to be constrained by Process.

But we do need something. It doesn’t need to be Process with a capital P, but simply building a Kanban board, running daily stand-ups, and declaring ourselves to be “agile” isn’t going to work. There are some fundamentals we need to get right. Without those fundamentals you’re setting yourself up to fail.

There are no quick wins. But there is a basic starting point for getting to the solution that works for you, in your team, in your company. In the end it all boils down to communication. How do we communicate the requirements from our stakeholders and users to the development team? How do we communicate progress back to the stakeholders and users?

This could be anything from Post-It notes on a whiteboard and informal meetings when required, all the way up to full-blown project management systems. Informal doesn’t scale well, and the more formalised the system the less agility it has, so there is a trade-off.

But don’t start with the tools, or the process. Start with what you want to communicate. How do new issues enter the pipeline? How do we make sure that what is developed is what is required? How do we feed back progress?

Get that sorted and you’re well on the way to winning, regardless of how it’s done, or what you call it.

If you’re interested, I use a process loosely based on Xanpan at Rainbird. I call this process ‘Fred’, purely because people keep asking me what I use and I needed to give it a name. Fred, as practiced now, has little resemblance to Fred when I first called it that. It’s not without its issues.

If you want to hear more I’ll be ranting about Agile and Agile Processes in a longer session called ‘Agile Smagile’ at NorDevCon.

K.I.S.S.

One of the talks I do explains how ‘simple’ in the K.I.S.S.1 principle is not the same as ‘easy’. Sometimes the easy option introduces complexity into the system that becomes detrimental later. The simplest example is testing. Not testing is the easy option, but it introduces uncertainty and complexity into your system that will be difficult to pin down at a later date.

That said, easy and simple are not mutually exclusive, which is an important thing to remember, especially when you’re caught in the throes of trying to fix a supposedly simple problem.

I spent far too many hours yesterday trying to write an install script for one of our components. The underlying issue is simple: “As someone installing the component, I want the configuration files to reflect the idiosyncrasies of my setup so that I can compile the code”. Or, if you prefer BDD format: “Given I am installing the component, when I perform the compile step, then the compilation should work”.

It all boils down to a bindings file which specifies the location of a couple of libraries and includes. The defaults specified in that file don’t work on everyone’s system, although they do work on live2.

My solution was to write some code that iterated through the files, checked whether any were missing and, if so, prompted the user for the correct locations. Great, except the format of bindings.gyp is such that I needed to take a complex structure, flatten it, inject extra details so the user prompts made sense, and then reconstruct the complex structure from the flat one. Not wanting to hard-code the format, I then disappeared up my own backside with specially crafted config files and mappings from those to the format used by bindings.gyp.

Nearly 200 lines of code in, deep in callback hell, with grave concerns about whether my script would even pass code review, I discovered some pretty nasty bugs which meant that minor configuration changes to a live server would cause an automated deploy to suddenly require user input, which we didn’t want. Adding logic to provide an override would make things even more complex, and my nice, simple solution was disappearing out of reach.

It was then that it hit me that I was providing the wrong solution. This is something that really needs to be set once per environment and then left. It’s not going to be used a huge amount, and it doesn’t need to be gold plated. With that in mind I wrote a simple shell script that checked for the presence of an environment variable. If the variable existed, use the bindings file it points to; if it didn’t, use the default. Simples.

All told, with error handling and everything, the script is 21 lines long. Not only that, but it provides a nice way to handle upgrades to the environment in live without having to redeploy.
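For illustration, a minimal sketch of that kind of script (the variable and file names here are my own invention, not the ones from the real 21-line script):

```shell
#!/bin/sh
# Hypothetical sketch: pick the bindings file to compile against.
# If BINDINGS_OVERRIDE is set, use the bindings file it points to;
# otherwise fall back to the default that works on live.
if [ -n "$BINDINGS_OVERRIDE" ]; then
    bindings="$BINDINGS_OVERRIDE"
else
    bindings="bindings.gyp.default"
fi
echo "compiling against $bindings"
```

Automated deploys simply leave the variable unset and get the live defaults; a developer with an unusual setup exports the variable once per environment and forgets about it.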


1 Keep It Simple, Stupid – in a nutshell, don’t overcomplicate things.

2 Something we wanted to maintain.

Refactoring as a Displacement Activity

Refactoring is good, right? We constantly read about how we should be writing simple, easy code that does the minimum necessary, refactoring as we go to improve it and expand what it can do. Unfortunately, this can lead to some really bad habits, especially if it’s used as a displacement activity.

What do I mean by displacement activity? Well, let’s consider a hideous project that is under-specced, under-resourced and overdue. Our poor old developer is demotivated and drowning under a sea of requirements that they don’t know how they’re going to implement, but what they have done is write the code that handles the command line arguments. This code works to spec, is even tested, and could be considered complete. But it’s taking user input. It’s not as neat as it could be. In fact it’s quite messy.

So what does our developer do? Do they tackle the next item in the long list of requirements and try to get closer to completion, or do they refactor the working code to make it nicer? While we all know they should be doing the former, all too often it’s the latter that happens. Our developer goes off and refactors the working code to make it nicer.

Having seen this happen on a number of occasions, it got me thinking about what was going on here. Why mess about with the stuff that works when there is a pile of stuff that doesn’t work to be getting on with? The answer is brutally simple: it’s a psychological trick.

In order to feel that they have had a productive day our developer needs to make progress. By tackling the difficult, as yet unwritten features, the developer is taking a risk. They may not be able to complete the task in the time they’ve allotted. They may run into problems. They may get stuck. They may fail. By refactoring the working code, on the other hand, they’re tackling a problem that they know they can solve; after all, they’ve solved it already. What they’re doing now is a refinement.

Of course, once the refinement is completed our developer is still left with the mountain of requirements that they didn’t want to tackle, and they now have less time to complete those requirements, but dealing with that problem has been shifted to something we can deal with tomorrow. In the meantime “progress”, for a given definition of progress, has been made because the existing code is now cleaner. A classic displacement activity.

Taken to its extreme, this kind of thinking can lead to making nothing, absolutely perfectly. You tend towards, but never actually achieve, the perfect solution to your problem. This is compounded by the fact that as soon as you start dealing with the edges of software development (input and output) things start getting messy. Perfection can’t actually be achieved and you’ll end up swapping one compromise for another ad infinitum. Annoyingly, we know all about this pitfall. The KISS principle and MVPs are all about avoiding this type of displacement and producing something that is important: working code. The irony is that this “perfect code” we’ve wasted time crafting will no doubt get hacked about later as the system grows and the missing requirements are added. One day we may learn.

PDD, The Other SDLC

As I’ve said before, I’m no stranger to public speaking. That said, NorDevCon was my first conference and, by all accounts, I did OK.

My talk, “PDD1, The Other SDLC2”, focused on how external stimuli, such as live production errors and impending deadlines, can cause development practices to break down. Ultimately this breakdown boils down to communication, or lack thereof. This talk was spawned from my talk on “Agile In The Real World” last year, and both of these talks focus on the personal experiences I have had when real life gets in the way of theory.

Personally I like this type of talk as it’s something that tends to resonate with the audience. With Impostor Syndrome and fears about “Am I doing it right?” being quite common among developers it can be quite useful to stand up and highlight where it commonly goes wrong so people know they are not alone.

Of course the irony is my own Impostor Syndrome and fears about “Am I doing it right?” have me worrying the audience will respond with “No, we don’t relate to that, it’s just you doing it wrong”. Having the original author of some of the work I was building on attend the talk also added to the pressure a tad.

Having proved to myself that not only can I do it, but that I thoroughly enjoy it, I’ll be looking to see if I can do talks at other conferences3. I’m also hoping to line up some local talks during the year (including my PDD talk if you missed it at NorDevCon). I suspect this year is going to see a lot of me using Keynote 🙂


1 PDD: Panic Driven Development

2 SDLC: Software [or Systems] Development Lifecycle

3 Ones where I don’t personally know the organiser, and run the group backing the conference 🙂

Costing and Commitment

One of the hardest aspects of Scrum seems to be the accurate costing of stories. We all know the theory: you break your work into chunks, cost those chunks, and commit to a certain amount of work each week based on costing and velocity. Pure Scrum™ then states that if you do not complete your committed work then Your Sprint Has Failed. All well and good, apart from one small problem: it’s all bollocks.

I’ve long had an issue with the traditional costing/commitment/sprint cycle insofar as it doesn’t translate from theory into the real world. In her recent talk at NorDev, Liz Keogh pointed out that Scrum practitioners are distancing themselves from the word “commitment” and from Scrum Failures as they are causing inherent problems in the process. Allan Kelly recently blogged about how he considers commitment to be harmful. Clearly I’m not alone in my concerns.

As always, when it comes to things like this, I’ve simply borrowed ideas from other people and customised the process to suit me and my team. If this means I’m doing it “wrong” then I make absolutely no apologies for it; it works for us and it works well.

Costing

Developers suck at costing. I mean really suck. It’s like some mental block, and I have yet to find a quick and effective way to clear it. Story points are supposed to make developers better at costing because you’re removing the time element and trying to group issues by complexity. That’s lovely on paper, but I’ve found it just confuses developers – oh, we all get the “a Chihuahua is 1 point so a Labrador must be 5 or 8 points” thing, but that’s dogs; visualising the relative size and complexity of code is not quite as simple.

What’s worse is that story points only relate to time once you have a velocity. You can’t have velocity without points and developers generally find it really hard to guesstimate1 without times. Chicken and egg. In the end I just loosely correlated story points and time, making everyone happy. I’ve also made story points really short because once you go past a few days estimates start becoming pure guesses. What we end up with is:

Points Meaning
0 Quicker to do it than cost it.
0.5 One line change.
1 Easily done in an hour.
2 Will take a couple of hours.
3 A morning’s/afternoon’s work.
5 Will take most of the day.
8 It’s a day’s work.
13 Not confident I can do it in a day.
20 Couple of days’ work.
40 Going to take most of the week.
100 Going to take a couple of weeks.

Notice this is a very loose correlation to time, and it gets looser the larger the story point count. Given these vagaries I will only allow 40 and 100 point costings to be given to bugs. Stories should be broken up into chunks of two days or less so you’ve got a good understanding of what you’re doing and how long it’s going to take2.

With that in mind 40 points really becomes “this is going to be a bitch to fix” and 100 points is saved for when the entire team looks at you blankly when diagnosing the problem: “Err… let me go look at it for a few days and I’ll get back to you“.

Stopping inflation

Story point inflation is a big problem with Scrum. Developers naturally want to buy some contingency time and are tempted to pad estimates. Story point deflation is even worse, with developers being hopelessly optimistic and then failing to deliver. Throw in The Business trying to game the system and it quickly becomes a mess. I combat this in a few ways.

Firstly, points are loosely correlated to time. In ideal conditions a developer wants to be completing about 8 points a day. This is probably less once you take meetings, walk-ups and other distractions into account. While an 8 point story should be costed such that the developer can complete it in a normal day with distractions accounted for, the same doesn’t hold true for a series of 1 point stories. If they’re all about an hour long and there’s an hour’s worth of distractions in the day then the developer is only getting 7 points done in that day.

Minor fluctuations in average per developer throughput are fine, but when your velocity starts going further out of whack it’s time to speak to everyone and get them to think about how they’re estimating things.

Secondly, points are loosely correlated to time. A developer can track how long it takes them to complete an issue and if they’re consistently under or over estimating it becomes very apparent as the story points bear no correlation to the actual expended effort. A 5 pointer taking 7 hours isn’t a problem, but any more than that and it probably wanted to be an 8 pointer. Make a note, adjust future estimates accordingly. I encourage all my developers to track how long an issue really takes them and to see how that relates to the initial estimate.

Thirdly, costing is done as a group exercise (we play planning poker) and we work on the premise of an “average developer”. Obviously if we take someone who is unfamiliar with the code it’s going to take them longer. You’ll generally find there’s some outlying estimates with someone very familiar with that part of the code giving low estimates and people unfamiliar with it padding the value. We usually come to a consensus fairly quickly and, if we can’t I just take an average.

I am aware that this goes against what Traditional Scrum™ teaches us, but then I’m not practicing that, I’m practicing some mongrel Scrumban process that I made up as I went along.

Velocity and commitment

I use the average velocity of the past 7 sprints3, adjusted to take holidays into account, when planning a sprint. We then pile a load of work into the sprint based on this figure and get to work. Traditionally we’ve said that we’ve committed to that number of story points and issues, but only because that’s the terminology I learned with Scrum. Like everything else, it’s a guesstimate. It’s what we hope to do, a line in the sand. There are no sprint failures. What there is is a discussion of why things were not completed and why actual velocity didn’t match expected velocity. Most of the time the reasons are benign and we can move on. If it turns out there are problems or impediments then we can address them. It’s a public discussion and people don’t get to hide.
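As a sketch of the arithmetic (all figures invented for illustration), the sprint target is just the mean of the recent velocities, scaled by how much of the team is actually around:

```shell
# Hypothetical numbers: points completed in each of the last 7 sprints,
# and the team-days available next sprint versus a full complement.
velocities="38 42 35 40 37 44 41"
available_days=8
full_days=10

# Mean velocity over the window.
avg=$(echo "$velocities" | tr ' ' '\n' | awk '{s+=$1; n++} END {printf "%.1f", s/n}')

# Scale down for holidays: only 8 of 10 team-days are available.
target=$(awk -v a="$avg" -v d="$available_days" -v f="$full_days" \
    'BEGIN {printf "%.0f", a * d / f}')
echo "sprint target: $target points"
```

With these made-up figures the mean is 39.6 points, scaled to a target of 32 for the holiday-shortened sprint; in practice the number is a line in the sand, not a commitment.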

Epics and Epic Points

The problem with having story points covering such a small time period is that larger pieces of work start costing huge numbers of points. A month is something like 200 points and a year becomes 2500 points. With only 2000 hours in a year we start getting a big disconnect between points and time which The Business will be all over. They’ll start arguing that if a 1 year project is 2500 points then why can’t we have 2500 1 point issues in that time?

To get round this issue we use epic points, which are used to roughly cost epics before they’re broken down into stories and properly costed. While story points are educated guesstimates, epic points are real finger-in-the-air jobs. They follow the same sequence as story points, but they go up to 1000 (1, 2, 3, 5, 8, 13, 20, 40, 100, 250, 500, 1000). We provide a handy table that lets The Business know that if you have an epic with x many points and you lob y developers at the problem then it will take roughly z days/weeks/months. The figures are deliberately woolly and are used for prioritisation of epics, not delivery dates. We’re also very clear on the fact that if 1 developer can do it in 4 weeks, 2 developers can’t do it in 2. That’s more likely to be 3 weeks.

Epic points are malleable and get revisited regularly during the life of an epic. They can go up, down or remain unchanged based on how the team feel about the work that’s left. It’s only as the project nears completion that the epic points and remaining story points start bearing a relationship to each other. Prior to that epic points allow The Business to know if it’s a long way off, or getting closer.


1 What, you think this is based on some scientific method or something? Lovely idea, we’re making educated guesses.

2 I’ve had developers tell me they can’t cost an issue because they didn’t know what’s involved. If you don’t know what’s involved then how can you work on it? Calling it 20 points and hoping for the best isn’t going to work. Instead you need to create a costing task and spend some time working out what’s involved. Then, when you understand the issue, you can cost it properly.

3 A figure based purely on the fact that JIRA shows me the past 7 sprints in the velocity view.

PDD

Development Strategies

The Development Strategy triangle.


Most [all?] discussions on Agile (or Lean, or XP, or whatever the strategy du jour is) seem to use a sliding scale of “Agileness”, with a pure Waterfall process on the left and a pure Agile process on the right. You then place teams somewhere along this axis, with very few teams being truly pure Waterfall or pure Agile. I don’t buy this. I think it’s a triangle with Waterfall at one point, Agile at the second, and Panic Driven Development at the third. Teams live somewhere within this triangle.

So what is Panic Driven Development? Panic Driven Development, or PDD, is the knee-jerk reaction from the business to various external stimuli. There’s no planning process and detailed spec as per Waterfall, there are no discrete chunks and costing as per Agile, there is just “Do It Now!” because “The sky is falling!”; or “All our competitors are doing it!”; or “It’ll make the company £1,000,000”1; or purely “Because I’m the boss and I said so”. Teams high up the PDD axis will often lurch from disaster to disaster, never really finishing anything as the Next Big Thing trumps everything else, but even the most Agile team will have some PDD in their lives; it happens every time there is a major production outage.

If you’re familiar with the Cynefin framework you’ll recognise PDD as living firmly in the chaotic space. As such, a certain amount of PDD in any organisation is absolutely fine – you could even argue it’s vital to handle unexpected emergencies – but beyond this PDD is very harmful to productivity, morale and code quality. Over a certain point it doesn’t matter if you’re Agile or Waterfall, the high levels of PDD mean you are probably going to fail.

Sadly, systemic PDD is often something that comes from The Business, and it can be hard for the development team to push back and gain some order. If you find yourself in this situation you need to track all the unplanned incoming work and its effect on the work you should be doing, and feed this data back to the business. Only when they see the harm that this sort of indecision is causing, and the effect on the bottom line, will they be able to change.


1 I have worked on quite a few “million pound” projects or deals. The common denominator is that all of them failed to produce the promised million, often by many orders of magnitude.


“PDD” originally appeared as part of Agile In The Real World and appears here in a revised and expanded form.


Three Bin Scrum

Allan Kelly blogged recently about using three backlogs with Scrum rather than the more traditional two. Given this is a format we currently use at Virgin Wines he asked if I would do a writeup of how it’s used so he could know more. I’ve already covered our setup in passing, but thought I would write it up in a little more detail and in the context of Allan’s blog.

Our agile adoption has gone from pure PDD, to Scrum-ish, to Kanban, to something vaguely akin to Scrumban taking the bits we liked from Scrum and Kanban. It works for us and our business, although we do regularly tweak it and improve it.

With Kanban we had a “Three-bin System“. The bin on the factory floor was the stuff the team was actively looking at, or about to look at; the bin in the factory store was a WIP limited set of issues to look at in the near future; and the bin at the supplier was everything else.

When we moved to our hybrid system we really didn’t want to replace our three bins, or backlogs with just a sprint backlog and product backlog because the product backlog would just be unworkable (as in 1072 issues sitting in it unworkable!). So we kept our three backlogs.

The Product Backlog

The Product Backlog (what Allan calls the Opportunity Backlog, which is a much better name) is a dumping ground. Every minor bug, every business whim, every request is recorded and, unless it meets certain criteria, dumped in the Product Backlog. There are 925 issues in the product backlog at the moment; a terrifyingly large number of those are bugs!

I can already hear people telling me that those aren’t really bugs or feature requests, how can they be, they’re not prioritised, therefore they’re not important. They’re bugs alright. Mostly to do with our internal call centre application or internal processes where there are workarounds. I would dearly love to get those bugs fixed, but this is the Real World and I have finite resources and a demanding business.

I am open and honest about the Product Backlog. An issue goes in there, it’s not coming out again without a business sponsor to champion it. It’s not on any “long term road map”. It’s buried. I am no longer thinking about it.

Our QA team act as the business sponsor for the bugs. Occasionally they’ll do a sweep and close any that have been fixed by other work, and if they get people complaining about a bug in the Product Backlog they’ll prioritise it.

The Product Backlog is too big to view in its entirety. We use other techniques, such as labels and heat maps, to give an overview of what’s in this backlog at a glance.

The Sprint backlog

Bad name, I know, but this equates to Allan’s Validated Backlog. This is the list of issues that could conceivably be picked up and put into the next sprint. The WIP limit for this backlog is roughly 4 × velocity which, with our week-long sprints, puts it at about a month’s work deep.

To make it into the Sprint Backlog an issue must be costed, prioritised and have a business sponsor. Being in this backlog doesn’t guarantee that the work will get done, and certainly doesn’t guarantee it’ll get done within a month. It simply means it’s on our radar and has a reasonable chance of being completed. The more active the product sponsor, the higher that chance.

The Current Sprint

With a WIP limit of last week’s velocity, adjusted for things like holidays, this forms the List Of Things We Hope To Do This Week. We don’t have “Sprint Failures”, so if an issue doesn’t make it all the way to the Completed column it simply gets dumped back into the Sprint Backlog at sprint completion. The majority of uncompleted issues will get picked up in the next sprint, but it’s entirely possible for something to make it all the way to the current sprint, not get worked on, then get demoted all the way back to the Product Backlog, possibly never to be heard from again.

Because issues that span sprints get put back in exactly the same place they were when the last sprint ended we end up with something that’s akin to punctuated kanban. It’s not quite the hard stop and reset that pure Scrum advocates, but it’s also not continuous flow.

The current sprint is not set in stone. While I discourage The Business from messing about with it once it’s started (something they’ve taken on board) we are able to react to events. Things can be dropped from the sprint, added to the sprint or re-costed. Developers who run out of work can help their colleagues, or go to the Sprint Backlog to pull some more work into the sprint. Even the sprint end date and the timing of the planning meeting and retrospective is movable if need be.

The Expedited Queue

There is a fourth backlog, the Expedited Queue. This is a pure Kanban setup with WIP limits of 0 on every column, and it should be empty at all times. QA failures and bugs requiring a patch get put in this queue and should be fixed ASAP by the first available developer. Story points are used to record how much work was spent in the Expedited Queue, but they’re not attributed to the sprint’s velocity. The logic here is that it’s taking velocity away from the sprint, as this is work caused by items that weren’t quite as “done” as we had hoped.

Continuous Integration

I’m having lunch with Paul Grenyer today to discuss Continuous Integration, or CI. In a nutshell CI is an automated process that performs a build on a regular basis, be that every hour, overnight, or on every commit to a major branch. Ideally your build will also run your unit tests and any other tests or analysis you run meaning that at any given moment in time you can be confident that your build is sound.
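In its simplest form, the job a CI server runs on each trigger is just a fail-fast script along these lines (a sketch only; the step names are illustrative and the `true` commands stand in for the real checkout/build/test commands so it runs anywhere):

```shell
#!/bin/sh
# Minimal sketch of a CI job: run each step in order and stop, loudly,
# at the first failure so the offending commit is flagged immediately.
set -e

run_step() {
    name=$1; shift
    echo "==> $name"
    "$@" || { echo "BUILD BROKEN: $name failed" >&2; exit 1; }
}

run_step "checkout"   true   # e.g. pull the latest commit on the watched branch
run_step "compile"    true   # e.g. the project's build target
run_step "unit tests" true   # often part of the build target itself

build_status=OK
echo "build $build_status"
```

Everything a real CI server adds on top of this – scheduling, polling or commit hooks, history, and the dreaded “Build Broken” email – is plumbing around that basic loop.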

CI has been a part of my coding life for so long that I can’t even remember when I was first introduced to it. What I do remember is that, initially at least, it was setup and handled by others. I simply had to check in my code and hope I didn’t get the dreaded “Build Broken” email from Cruise Control.

CI was so ingrained in me that it was a bit of a shock when I moved to a company that didn’t use it. We couldn’t even use the joke phrase “It compiles, let’s ship it!” because we didn’t know if it actually did compile from commit to commit. A quick Google, much swearing and half a day later, and we had Cruise up and running. Now, not only was there the “Build Broken” email to fear, there was editing the Cruise config file after each release to point at the new release branches. Cruise is (or at least was) not the easiest thing in the world to configure.

My tenure as a Cruise admin was mercifully short-lived as I discovered Hudson, which is much, much easier to configure. My fun with release branches continued until we moved to Git. By this time Hudson had forked and we had gone the Jenkins route. Jenkins now runs CI builds, overnight builds and release builds, and has been pressed into service as a handy way to kick off a few scripts, either periodically or on request.

Our Builds

Much as I’d love to use Maven our legacy code makes that difficult. Instead we have a build project that handles all our builds using a set of quite complex ant scripts. Locally the developers have the option of:

  • clean: Delete all build artefacts. Not sure this is ever used, but it’s there, just in case.
  • compile: For our legacy code this does a local build and puts the build output in the directories required to run everything locally. Thanks to the magic of our system running things locally is different to running it in any other environment. For the newer code base this just compiles the code locally allowing you to run it. Given Eclipse does the same anyway it’s a target that is rarely used in the newer projects.
  • deploy: Perform a full build of the project, including Checkstyle checks, JUnit tests, Cobertura code coverage and packaging the code into its final zip, jar, war or ear (depending on the project). If this completes for all projects and dependents you have altered, you can be reasonably sure that Jenkins will not fail your build. In the rare case that it does, you are exempt from shame and punishment as it’s invariably something you couldn’t have known about.
  • sonar: Perform a deploy, then run Sonar over the code which performs an enhanced set of checks configured in Sonar. Keeping Sonar green keeps me happy, but unlike build failures, chasing a clean Sonar result should not be done at the expense of actually getting work done. Sometimes good enough is fine.
  • verify: The newer code base is split over a number of projects. Verify runs deploy for each project, checking that you’ve not broken anything in another project that may depend on your code.
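The layering of these local targets can be sketched as a tiny dependency graph. This is a hypothetical illustration in Python (the real builds are complex ant scripts); the target names come from the list above, but the exact dependency edges are assumptions:

```python
# Hypothetical sketch of the local target layering. Target names are from
# the post; the dependency graph is an assumption for illustration only.

# Each target maps to the targets that must run before it.
TARGETS = {
    "clean": [],
    "compile": [],
    "deploy": ["compile"],  # deploy also runs Checkstyle, JUnit, Cobertura, packaging
    "sonar": ["deploy"],    # sonar performs a deploy, then the enhanced Sonar checks
}

def run(target, done=None):
    """Run a target, running its prerequisites first (depth-first)."""
    done = [] if done is None else done
    for dep in TARGETS[target]:
        run(dep, done)
    if target not in done:
        done.append(target)
    return done

# 'verify' is the odd one out: it runs deploy across every project in the
# workspace rather than being a step in one project's chain.
def verify(projects):
    return {project: run("deploy") for project in projects}
```

So `run("sonar")` executes `compile`, then `deploy`, then `sonar`, matching the layering described above.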

Sat on top of this is the set of CI build targets run by Jenkins:

  • ci.build: Run on master and the release branches after each commit (currently Jenkins polls every 60 seconds; I’d like to change this to a commit hook one day), it calls deploy on each project. Unlike verify, which is a single ant build that calls deploy on each project, Jenkins runs a new ant build for each project. This has caused issues where verify builds clean and Jenkins fails, and vice-versa.

  • push.build: This is a manually run parameterised build that takes the given version number and creates a production release with a unique build number. This calls deploy but overrides a number of parameters so the version details are configured correctly. It also pushes the resultant zip, jar, ear or war into a staging area.

  • promote.build: Another manually run parameterised build that takes the build number generated by push and promotes it to the specified environment (development, one of the QA environments or our pre-production environment). This simply copies the staged files from the previous push, guaranteeing that the same release is tested in each environment.

  • release.build: Identical to promote.build except there is a checkbox that must be ticked agreeing to the warning that this is going to production. The destination becomes the production staging area.

  • overnight.build: Run overnight by Jenkins, this calls sonar and provides a nightly snapshot of the overall quality of our builds.
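The push/promote/release flow can be sketched as follows. This is a hedged illustration, not the real Jenkins jobs (paths, names and the checksum check are assumptions), but it shows why promoting a staged file, rather than rebuilding, guarantees the same release is tested in every environment:

```python
# Illustrative sketch of push.build / promote.build. The real pipeline is
# Jenkins + ant; directory names and the checksum check are assumptions.
import hashlib
import shutil
from pathlib import Path

def sha256(path):
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()

def push(artifact, staging_dir, build_number):
    """push.build: place the built artifact in the staging area under a
    unique build number."""
    staging_dir = Path(staging_dir)
    staging_dir.mkdir(parents=True, exist_ok=True)
    staged = staging_dir / f"{build_number}-{Path(artifact).name}"
    shutil.copy2(artifact, staged)
    return staged

def promote(staged, env_dir):
    """promote.build: copy the staged file, never rebuild, so development,
    QA and pre-production all test exactly the same bytes."""
    env_dir = Path(env_dir)
    env_dir.mkdir(parents=True, exist_ok=True)
    target = env_dir / staged.name
    shutil.copy2(staged, target)
    assert sha256(staged) == sha256(target)  # same bytes, same release
    return target
```

release.build is then just a promote with the production staging area as the destination, gated by a confirmation.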

New projects just need a simple ant file pointing at our build project with a few variables set to gain all of these targets. It’s then just a question of cloning the Jenkins jobs from another project, making them specific to the new project, and you’re away. Maybe not the most elegant of systems, but it’s reliable and adaptable.

Agile In The Real World

“No plan of operations extends with any certainty beyond the first contact with the main hostile force.” – Helmuth Karl Bernhard Graf von Moltke

I’ve been doing eXtreme Programming (XP) and Agile in one guise or another since the early 2000s. During that time I’ve been in big teams, small teams, bureaucratic organisations, lean organisations and chaotic organisations. I have never worked in a top-down Agile organisation and probably never will. Also, no two teams I have worked in have done Agile the same way. I suspect this is partly to do with the organisations the teams were part of, and partly to do with the teams themselves. This is not a bad thing.

Agile is a toolkit, not a rigid set of structures. As with all toolkits, some tools fit better for certain circumstances than others. A good team will adopt an Agile process that fits them, fits the business they work in and then adapt that process as and when things change (and they will). If you’re looking for a post about “How to do Agile” then this is the wrong place. I can’t tell you, I don’t know your team, or your organisation. Instead this explains how I’ve implemented Agile for our team and our organisation in order to get the maximum benefit.

PDD

Most (all?) discussions on Agile seem to use a sliding scale of Agileness with a pure Waterfall process on the left, a pure Agile process on the right, and then place teams somewhere along this axis, with very few teams being truly pure Waterfall or pure Agile. I don’t buy this. I think it’s a triangle with Waterfall at one point, Agile at the second, and Panic Driven Development at the third. Teams live somewhere within this triangle.

So what is Panic Driven Development? Panic Driven Development, or PDD, is the set of knee-jerk reactions from the business to various external stimuli. There’s no planning process and detailed spec as per Waterfall, there are no discrete chunks and costings as per Agile, there is just “Do It Now!” because “The sky is falling!“, or “All our competitors are doing it!“, or “It’ll make the company £1,000,000”1, or purely “Because I’m the boss and I said so“. Teams high up the PDD axis will often lurch from disaster to disaster, never really finishing anything as the Next Big Thing trumps everything else, but even the most Agile team will have some PDD in their lives; it happens every time there is a major production outage.

When I first joined my current company it was almost pure PDD. Worse still, timescales were being determined by people who didn’t have the first clue about how long things would really take. Projects were late (often by many months) and issue tracking was managed by simply ditching any issues over a certain age. In short, it was chaos. Chuck in a legacy codebase with some interesting “patterns”, a whole bunch of anti-patterns and a serious amount of WTFs and you have the perfect storm: low output and poor quality.

Working on the edge of chaos

One thing I realised very early on was that I was not going to be able to change The Business. The onslaught of new things based on half-formed ideas was never going to change, and the rapid changes of direction were part of the company’s DNA. Rather than fight this, we embraced it, with some caveats.

Things change for us, fast. Ideas get discarded, updated and changed in days and the development team needs to keep up. To achieve this we use Scrum… except where we don’t, and use Kanban instead. Don’t worry though, it’s not that complex. 🙂

Scheduled work is done using Scrum. Sprints are a week long and start on a Wednesday2. Short, rapid sprints mean we can change direction fast without knocking the sprint planning for six. If the business want to change direction they only have to wait a few days. Releases generally (but not always) consist of two sprints of work. A release undergoes 2 weeks of QA after leaving development so will generally be in production 4 weeks after the sprint started. If need be we can do a one-sprint release with as little as one week of QA and have a change out within 3 weeks of it being requested.
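That release arithmetic can be sketched with a quick date calculation. This is a hypothetical illustration (the real cadence is driven by people, not code): week-long sprints starting on a Wednesday, a release normally spanning two sprints, then two weeks of QA:

```python
# Illustrative sketch of the release cadence: sprints start on a Wednesday,
# a release is normally two sprints of work plus two weeks of QA.
from datetime import date, timedelta

def next_wednesday(d):
    """The first Wednesday on or after the given date (Wednesday == weekday 2)."""
    return d + timedelta(days=(2 - d.weekday()) % 7)

def production_date(requested, sprints=2, qa_weeks=2):
    """When a change requested on `requested` would reach production."""
    start = next_wednesday(requested)
    return start + timedelta(weeks=sprints + qa_weeks)

# Default: two sprints + two weeks QA, in production four weeks after the
# sprint starts. A one-sprint release with one week of QA lands within
# three weeks of the request.
```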

Sat on top of that we have a Kanban queue which should remain empty at all times. It is populated with QA failures and critical issues that are either blocking the release, or require patching. Every column on the Kanban board has a constraint of 0 items. Put something in it and it goes red, making it pretty obvious that someone needs to fix something sharpish.
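The zero-constraint board can be modelled like this. It is an illustrative sketch, not our actual tooling: with every column limited to 0 items, any item at all turns its column red:

```python
# Illustrative model of a Kanban board where every column has a WIP limit
# of 0, so any non-empty column is flagged. Column names are assumptions.

class KanbanBoard:
    def __init__(self, columns):
        self.columns = {name: [] for name in columns}

    def add(self, column, item):
        self.columns[column].append(item)

    def red_columns(self):
        """With a limit of 0 everywhere, any non-empty column is red."""
        return [name for name, items in self.columns.items() if items]

board = KanbanBoard(["To Do", "In Progress", "Done"])
assert board.red_columns() == []  # empty board: all green, all is well
board.add("To Do", "QA failure #1234")
# "To Do" is now red: someone needs to fix something sharpish.
```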

The sprint planning meeting, retrospective and costing are all handled in the same Wednesday morning meeting, which lasts an hour. First up we look at the state of the outgoing sprint. We look at what got added to the sprint after it started, and why; what was removed from the sprint, and why; and what wasn’t completed within the timeframe of the sprint, and why. We run a system whereby it’s OK for things to span sprints. Things overrun, things get stalled, and sometimes it’s simply that you had half an hour left in the sprint, added a new issue to work on, but never had enough time to finish it. Any concerns are raised and handled, then the sprint is closed. The next sprint is then planned using a moving average of velocity as guidance for how much work to add. Any time remaining in the meeting is used for costing and curating the backlog. Sadly the business rarely attend these meetings, meaning we need to be creative when it comes to business sponsors.
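Using a moving average of velocity as planning guidance might look like the sketch below. The window size is an assumption for illustration; the point is that recent sprints, not a single outlier, set the expectation for the next one:

```python
# Sketch: guide next sprint's capacity from a moving average of recent
# sprint velocities. The window size of 4 is an assumption.

def moving_average_velocity(velocities, window=4):
    """Average of the most recent `window` sprint velocities (story points)."""
    recent = velocities[-window:]
    return sum(recent) / len(recent)

# Recent sprint velocities, in story points:
velocities = [42, 55, 48, 51, 44]
capacity = moving_average_velocity(velocities)  # guidance for the next sprint
```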

Finding Business Sponsors

Unlike traditional Scrum we have two backlogs. With over a decade of technical debt and more new development than we can possibly hope to achieve, we have hundreds of issues. Clearly this is unworkable. The majority of these live in the un-prioritised backlog. We know about them, we’ve documented them, but they’re not getting done, and may not even get costed unless someone champions them and gets them pushed into the scrum backlog. The scrum backlog is the realistic backlog. We aim to keep no more than 4 x average velocity worth of work in this backlog, which means at any given time it provides a roadmap for the next month. We also make sure everything in the scrum backlog is properly costed, meaning sprint planning is incredibly easy; just put the top 25% of the backlog into the sprint, adjusted for holidays and various other factors.
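The two-backlog rule can be sketched in code. The numbers and issue names below are illustrative (the real backlogs live in an issue tracker), but the mechanics are as described: cap the scrum backlog at 4 x average velocity, then fill a sprint from the top of it, adjusted for holidays:

```python
# Illustrative sketch of the two-backlog rule: the scrum backlog holds at
# most 4 x average velocity of costed work (~a month for week-long sprints),
# and a sprint takes roughly the top quarter of it.

def trim_scrum_backlog(backlog, avg_velocity):
    """Keep costed issues from the top until ~4 sprints of work is reached;
    everything below the line stays in the un-prioritised backlog."""
    limit = 4 * avg_velocity
    kept, total = [], 0
    for issue, cost in backlog:  # backlog is already in priority order
        if total + cost > limit:
            break
        kept.append((issue, cost))
        total += cost
    return kept

def plan_sprint(scrum_backlog, avg_velocity, adjustment=1.0):
    """Take one sprint's worth off the top (about 25% of the backlog),
    scaled down for holidays and other factors via `adjustment`."""
    capacity = avg_velocity * adjustment
    sprint, total = [], 0
    for issue, cost in scrum_backlog:
        if total + cost > capacity:
            break
        sprint.append(issue)
        total += cost
    return sprint
```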

Using this method you very quickly find sponsors coming out of the woodwork. When work is not done people start asking where it is, you can then explain to them that it’s not been prioritised, or it’s being trumped by other work. If they care about the issue then they need to champion it, become the business sponsor and take responsibility for it. They can argue the case for it being moved up the backlog with the business. If they don’t want to do that then clearly the work is not important, so it goes into the un-prioritised backlog to eventually die through lack of interest. Stuff that is already in the un-prioritised backlog can be fished out when a sponsor is found and costing can start.

Bugs generally follow a slightly different process insofar as they will always have a sponsor, even if it’s the testing team. Bugs are never closed unless they are fixed, or cease to be an issue due to other changes. The QA team will regularly revisit all open bugs and re-prioritise or close them as necessary.

Costing

New features are costed using planning poker and we use very small stories. Valid costings are 1 (a one-line change), 2, 3, 5, 8, 13 and 20. Our target velocity is between 8 and 13 points per developer per day. Any slower and we’re being too optimistic with our costing, any faster and we’re being too pessimistic. Bearing that in mind, a developer should easily handle two 20-point stories in a single sprint with room to spare. Anything larger than 20 points needs to be carved up into multiple stories, or turned into an Epic. We do this because estimates get rapidly poorer once you go past a couple of days’ work.
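The costing scale and velocity band can be sketched as below. Note the hedging: real costings come out of planning poker discussion, not arithmetic, and the snapping rule here is purely an assumption for illustration:

```python
# Illustrative sketch of the costing scale and velocity band. The snapping
# of a raw estimate to the nearest valid point is an assumption; in practice
# the team converges on a valid costing through planning poker.

VALID_POINTS = (1, 2, 3, 5, 8, 13, 20)

def to_valid_points(estimate):
    """Snap a raw estimate to the nearest valid costing. Anything past 20
    must be split into multiple stories or become an epic."""
    if estimate > 20:
        raise ValueError("split the story or raise an epic")
    return min(VALID_POINTS, key=lambda p: abs(p - estimate))

def velocity_ok(points_done, developers, days):
    """Is measured velocity within the 8-13 points/developer/day target band?"""
    per_dev_day = points_done / (developers * days)
    return 8 <= per_dev_day <= 13
```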

Stories are only costed if the team fully understand the issue. If there are questions the issue is noted and the questions taken to the Business Sponsor. Yes, it would be great if they were in the costing meeting and could answer the questions there and then, but it can be a little like herding cats sometimes. The cost to the business sponsor is that the issue isn’t costed and can’t go into a sprint until it is, and it’s a cost they’re incurring by not attending, not that we’re imposing on them.

Stories that exceed 20 points are either quickly split into a couple of stories, or converted to an epic and a costing task raised. This allows time in a sprint for one or more members of the team to find the full set of requirements from the business sponsor and generate the full set of required stories for later costing.

Scope creep can either be added to a story, or a new story created for the creep. If it’s added to an existing story its old costing is discarded, and if that story is in the current sprint it’s ejected from the sprint until it’s been re-costed and space made for it. The costing may happen there and then with the team having a quick huddle, or it may need to wait for the next planning meeting.

It’s not a silver bullet

Nothing is written in stone except the maximum velocity of the team. Sprints can start late, end late or end early. Releases can be held back, or brought forward. Issues can be removed from the sprint and replaced with others. We can react to the business, but it’s not a silver bullet. The more the business change their minds, the slower throughput gets due to the inertia of changing direction. However, they are now better informed and can see and measure the effects of this, which has resulted in a lot less chopping and changing.

Projects are now being delivered on time; however, the timescales are also now realistic, and easily tracked. Projects are becoming better defined as the true cost of them is realised by those proposing them. The output is similar to what it used to be, but is now more focused. Rather than over-promise, under-deliver and spend months cleaning up the mess, certain projects just aren’t even attempted.

The process is continually evolving. We’ve done pure Scrum and pure Kanban before. The model we use took the most useful aspects of both of these systems. As we try new things we’ll take the best bits and adapt them to suit us. No doubt there are Agile Evangelists out there who will balk at one or more aspects of what we do as being wrong. Maybe they are, all I can say is they work for us and the team is happy with how we work. If they’re not, we change it.


1 I have worked on quite a few “million pound” projects or deals. The common denominator is that all of them failed to produce the promised million, often by many orders of magnitude.

2 Why Wednesday? People are more likely to be in. There isn’t that last minute panic on Friday to get everything finished and the sprint doesn’t start on a day where people are catching up after the weekend.