This content originally appeared on DEV Community and was authored by Slobi
I have been in the IT industry long enough to know that projects rarely go according to schedule.
At first I was in teams of young people with little to no experience, so we did our jobs as best we could, but we missed our deadlines from time to time. We did all the engineering up front to give the best possible guesstimate with some buffer, but we were still wrong almost as often as when we were just eyeballing it.
Then I moved on to more advanced projects, organisations and teams, and I gained more experience, but the pattern continued. I was unhappy about it; I thought I was a terrible engineer, and managers were unhappy too from time to time.
My first line of defense was the thought that the software base we were building on top of was bad, so we were struggling to fix the mess beneath our app layer, and that was the reason we frequently missed our deadlines.
My second phase was believing that people are simply not suited to programming but by some miracle we manage to do it anyway, so I felt pretty good whenever we made something work.
Years passed, I changed teams and talked to many people, but the pattern emerged everywhere. So I started to suspect that it is not that simple: we can't just make up a story, call it a day, and feel good about our failure to meet deadlines. I owed it to myself to think of something better, so I did.
In words it looks like this:
When doing one task, it is quite probable that it will be finished somewhere within a standard deviation of the estimate, and such a task goes pretty well (I view it as a probability density function for finishing the task at any given moment; I am not a mathematician, so excuse my crudeness). But when a task goes wrong, when we get out of that nice and safe zone, we have a low chance of finishing it at any of those moments. Imagine a Gaussian distribution, but with a very thin tail.
There is a low chance of landing in that tail, because there is much more area under the curve before it, so our minds ignore it; managers do too.
My defense against that case was to give overestimates, and most people do the same, so our guess is always on the right side of the median; yet we regularly finished our tasks on time, meaning early by our guesstimate. Management's reaction was to pressure us to keep it real. But then, from time to time, there was a task, or maybe a few, that made everyone unhappy: we felt hopeless, missed deadlines by as much as four times the estimate, and sometimes had to reorganize and redo the whole thing.
Still, when we look at one task at a time, it is not all that grim: there is only a slight chance that we make a mistake, and with more work it will get fixed.
We are humans and we can do one thing at a time, so let's make the assumption that, ideally, we finish one thing and then move on to another. It turns out that we get into that unpleasant situation more frequently than we intuitively feel we should. It makes us doubt our path in software engineering. I can't feel bummed forever about such feelings, so I came up with an excuse: when we combine multiple Gaussian distributions that each have that tiny tail, the tail grows. But I did not have time to sit down and do the homework, so I made my best predictions and moved on with my life.
Recently I have been in a release situation that is still ongoing, and the feeling came back. So I decided to do the work and actually calculate the thing, with my limited and rusty math knowledge.
Implementation
The standard software procedure is: if you can't define a thing well, you make a discrete approximation and grind the numbers through a few for loops until you get an integration, a function product, or something you can't quite define but that seems logical.
In my mind it is the same as a simulation.
So the distribution approximation I came up with is:
// trailProb sets how much weight goes into the long tail; the article varies
// this knob, so the value here (a few percent) is an assumption for illustration
const trailProb = 0.02

function trail(x) {
  return Math.tanh(x * 6 - 4) + Math.tanh(-x * 3 + 4.3) * (1 - trailProb) + Math.tanh(-x + 4.3) * trailProb
}
It is not pretty, but it produces the top figure. The idea of using the hyperbolic tangent came from an AI course on perceptrons, where it is used as a quick substitute for the sigmoid function.
Then I made it discrete like this:
let oneTask = {}
for (let i = 0; i < 200; i++) {
  oneTask[i] = trail(i / 10)
}
This gives a relative probability of finishing the task at every 10% of the time estimate. It looks like this:
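One thing worth noting: the values coming out of trail are only relative weights, not true probabilities. The original code works with them directly, but if you want slices of the curve to read as percentages (as I do below), you can normalize them so they sum to 1. This step is my addition, not part of the original code:

let total = 0
for (let i in oneTask) total += oneTask[i]
for (let i in oneTask) oneTask[i] /= total
// after this, the sum over all moments is 1, so any slice of the curve
// can be read directly as a probability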
Then I had to make up a way to combine tasks.
Good thing it was discrete, so I could reason about it and calculate it:
// DEVIDEER is a normalization constant; the article does not show its value,
// so treat the number here as an assumption that only affects the plot scale
const DEVIDEER = 100

function combineTwoTasks(task1, task2) {
  let twoTasks = {}
  // every way of finishing task1 at moment i and task2 at moment j
  // contributes to finishing the pair at moment i + j
  for (let i in task1) {
    for (let j in task2) {
      let y1 = task1[i]
      let y2 = task2[j]
      let finalx = Number(i) + Number(j) // object keys are strings, so convert
      let finaly = y1 * y2 / DEVIDEER
      twoTasks[finalx] = twoTasks[finalx] ? twoTasks[finalx] + finaly : finaly
    }
  }
  return twoTasks
}
So the x position is the finish moment of the first task plus the finish moment of the second task, and the probability is the product of the two probabilities, summed over every way of splitting the total time (a discrete convolution). Basically it boils down to counting the ways to roll a sum of 7 with two consecutive dice.
In the end you get, for every moment, all the ways to finish at that moment. So two tasks combined give another probability curve; run it once more and you get 3 tasks, and so on.
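The dice analogy can be checked with the very same double loop. This snippet is my illustration, not part of the original code:

// each die face 1..6 has probability 1/6
let die = {}
for (let f = 1; f <= 6; f++) die[f] = 1 / 6

let sums = {}
for (let i in die) {
  for (let j in die) {
    let s = Number(i) + Number(j)
    sums[s] = (sums[s] || 0) + die[i] * die[j]
  }
}
console.log(sums[7]) // 0.1666... = 6/36, the textbook answer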
So the whole code looks like this:
let oneTask = {}
for (let i = 0; i < 200; i++) {
  oneTask[i] = trail(i / 10)
}
let twoTasks = combineTwoTasks(oneTask, oneTask)
let threeTasks = combineTwoTasks(oneTask, twoTasks)
let nTasks = oneTask
for (let i = 0; i < 6; i++) { // six more combinations on top of the first: 7 tasks in total
  nTasks = combineTwoTasks(oneTask, nTasks)
}
// addPoint(x, y, color) is the chart helper from my component;
// the y values are negated so the curves draw below the x axis
for (let i = 0; i < 200; i++) {
  addPoint(i / 10, -oneTask[i] || 0)
  addPoint(i / 10, -twoTasks[i] || 0, '#00ff00')
  addPoint(i / 10, -threeTasks[i] || 0, '#0000ff')
  addPoint(i / 10, -nTasks[i] || 0, '#00ffff')
}
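If you do not have a chart component around, a quick console check (my addition, not the article's code) already shows the drift: the most likely finishing moment moves right as tasks are chained.

// return the most likely finishing moment, in multiples of the estimate
function peak(task) {
  let bestX = 0
  for (let x in task) {
    if (task[x] > task[bestX]) bestX = x
  }
  return bestX / 10
}
console.log(peak(oneTask), peak(twoTasks), peak(threeTasks), peak(nTasks))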
For the first few tasks, the time and probability of finishing look like this:
And it looks fine; it would be a shame to do all this work only to get a result that throws a wet towel in our faces and says: you are all bad developers. But this is the case when all the variables are known, when there is no unknown that can create that tedious tail. When we raise the chance of getting into that awkward situation just a little bit, the results are different:
We see that a small tail, barely perceptible on a single task, can hit us destructively by the fourth task. Something like barely 1-2% of the area under the curve becomes 20-30%, and that is for only 4 tasks with an almost sure outcome. We have almost no chance (<10%) to finish on time, a good chance to overrun the original guesstimate by 1.5x, and 2x is not far-fetched either.
I wanted to play a little bit and look at 7 sequential tasks:
The story continues: we should be done by 7 estimates, yet we are probably finishing around 9.5, and we have the same chance of finishing after 13 estimates as of finishing on time.
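Numbers like these can be read straight off the discrete curve. Here is a small helper, my addition rather than the article's code, that computes the share of probability mass lying beyond a given deadline:

// fraction of the probability mass beyond a deadline given in estimate multiples
function lateShare(task, deadline) {
  let total = 0
  let late = 0
  for (let x in task) {
    total += task[x]
    if (x / 10 > deadline) late += task[x]
  }
  return late / total
}
console.log(lateShare(nTasks, 7)) // chance the 7-task chain overruns its combined estimate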
Every task is different, and many tasks have a much bigger tail section for various reasons. Only tasks where all the variables are known, where the distribution stays in pure Gaussian form, can be estimated and delivered repeatedly and reliably.
I hope this helps fellow developers not feel bad when things go wrong, because it is the nature of the universe, and helps organisations and managers adapt to that truth.
You can check out my few hours of coding mess here:
https://github.com/slobacartoonac/pythonium/tree/master/late