Stop Estimating, Start Measuring: A Bayesian Approach to Software Deadlines
These articles are AI-generated summaries. Please check the original sources for full details.
"¿Cuándo está terminado?" (When Is It Done?)
José Gutiérrez proposes a data-driven system to eliminate the perpetual ‘two more weeks’ estimation cycle in software development. The method runs 10,000 Monte Carlo simulations derived from real developer throughput and reports delivery dates at a 95% confidence level.
Why This Matters
Traditional estimation is an illusion because developers cannot accurately predict time in changing contexts with variable capacity. By shifting from hope-based estimates to Bayesian updates, engineering teams synchronize their models with reality, converting subjective guesses into a predictable distribution of outcomes that exposes technical debt and management friction.
Key Insights
- Task Atomization: Features must be divided into tasks of ≤ 1 ideal day so that units of work are comparable and the execution plan stays detailed.
- Convergence Metric: Developer throughput distributions typically stabilize after an 8-week observation period, reflecting true capacity including non-ideal work days.
- Bayesian Updating: The system adjusts the ‘prior’ belief of velocity with weekly observations to generate a ‘posterior’ distribution, improving prediction accuracy automatically.
- Monte Carlo Simulation: Running 10,000 scenarios against a remaining backlog allows teams to communicate delivery ranges (e.g., 50th vs 95th percentile) instead of fixed, optimistic dates.
- Refinement Enforcement: The requirement to break features into small tasks forces developers to plan execution paths early, reducing architectural surprises during the cycle.
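The Bayesian updating step above can be sketched as follows. The summary does not say which model the original article uses, so this is a minimal sketch assuming a Normal prior on mean weekly velocity and a known week-to-week variance (the Normal-Normal conjugate case). The function name `update_velocity` and all numbers are hypothetical.

```python
import math


def update_velocity(prior_mean, prior_var, observations, obs_var):
    """Normal-Normal conjugate update of the mean weekly velocity.

    Assumption: obs_var, the week-to-week variance of throughput,
    is treated as known. Returns (posterior_mean, posterior_var).
    """
    n = len(observations)
    if n == 0:
        return prior_mean, prior_var
    sample_mean = sum(observations) / n
    # Precisions (inverse variances) add up in the conjugate update.
    post_var = 1.0 / (1.0 / prior_var + n / obs_var)
    post_mean = post_var * (prior_mean / prior_var + n * sample_mean / obs_var)
    return post_mean, post_var


# Hypothetical starting belief: about 15 tasks/week, with high uncertainty.
mean, var = 15.0, 25.0
for week, done in enumerate([20, 22, 19, 21], start=1):
    # Each week's posterior becomes the next week's prior.
    mean, var = update_velocity(mean, var, [done], obs_var=4.0)
    print(f"week {week}: posterior mean {mean:.1f}, sd {math.sqrt(var):.2f}")
```

Each observed week pulls the belief toward the measured throughput and shrinks its variance, which is the sense in which the prediction improves without manual re-estimation.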
Working Examples
A Monte Carlo simulation using NumPy to predict delivery dates based on historical developer velocity and task backlog.
import numpy as np

# Historical data (weeks 1-8)
dev_a_velocidad = [20, 22, 19, 21, 20, 23, 21, 22]  # tasks/week
dev_b_velocidad = [15, 14, 16, 15, 17, 15, 16, 15]

# Distribution parameters, assuming normality
velocidad_a = np.mean(dev_a_velocidad)  # ~21
std_a = np.std(dev_a_velocidad)  # ~1.2
velocidad_b = np.mean(dev_b_velocidad)  # ~15.4
std_b = np.std(dev_b_velocidad)  # ~0.9
backlog_tareas = 200  # remaining tasks

# Monte Carlo simulation
simulaciones = 10000
dias_para_terminar = []
for _ in range(simulaciones):
    # One velocity draw per developer, held constant for the whole project
    tareas_semana_a = np.random.normal(velocidad_a, std_a)
    tareas_semana_b = np.random.normal(velocidad_b, std_b)
    tareas_por_semana = tareas_semana_a + tareas_semana_b
    semanas_necesarias = backlog_tareas / tareas_por_semana
    dias = semanas_necesarias * 7  # calendar days
    dias_para_terminar.append(dias)

# Result
percentil_95 = np.percentile(dias_para_terminar, 95)
percentil_50 = np.percentile(dias_para_terminar, 50)
print(f"Median: {percentil_50:.0f} days")
print(f"95% confidence (worst case): {percentil_95:.0f} days")
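The simulation above draws a single weekly velocity per scenario and applies it to the whole backlog. A variant that draws a fresh velocity for every week may be closer to how throughput actually fluctuates. The sketch below is an illustrative alternative, not the original author's method. The function `simulate_weeks_to_finish` and the `max_weeks` cap are assumptions introduced here, and it reuses the same historical numbers.

```python
import numpy as np


def simulate_weeks_to_finish(histories, backlog, n_sims=10_000,
                             max_weeks=104, seed=0):
    """Weeks needed to clear the backlog, one value per simulated scenario.

    Assumptions: each developer's weekly throughput is an independent
    normal draw, and scenarios still unfinished at max_weeks are capped.
    """
    rng = np.random.default_rng(seed)
    team_weekly = np.zeros((n_sims, max_weeks))
    for history in histories:
        # A fresh draw for every (scenario, week) pair.
        team_weekly += rng.normal(np.mean(history), np.std(history),
                                  size=(n_sims, max_weeks))
    # Completed work cannot be negative.
    team_weekly = np.clip(team_weekly, 0, None)
    cumulative = np.cumsum(team_weekly, axis=1)
    done = cumulative >= backlog
    # argmax returns the first week in which the backlog is cleared.
    first_week = done.argmax(axis=1) + 1
    return np.where(done.any(axis=1), first_week, max_weeks)


dev_a = [20, 22, 19, 21, 20, 23, 21, 22]  # tasks/week, weeks 1-8
dev_b = [15, 14, 16, 15, 17, 15, 16, 15]
weeks = simulate_weeks_to_finish([dev_a, dev_b], backlog=200)
print(f"Median: {np.percentile(weeks, 50):.0f} weeks")
print(f"95th percentile: {np.percentile(weeks, 95):.0f} weeks")
```

Because good and bad weeks partly cancel out over the life of a project, this variant tends to produce a narrower range than holding one draw fixed. Results are in whole weeks, since a backlog is only counted as cleared at the end of a week.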
Practical Applications
- Stakeholder Communication: Shift from promising ‘3 weeks’ to providing a confidence range based on 10,000 simulations. Pitfall: Relying on optimism to avoid political pressure leads to total loss of credibility when dates slip.
- Daily Workflow: Use tasks as binary progress indicators (Done/Not Done) to make blockers immediately visible. Pitfall: Reporting progress as ‘working on the feature’ without a granular plan hides delays until it is too late to react.
- Team Stability: Use sliding windows on historical data to adjust to team changes or scope creep in real time. Pitfall: Swapping team members frequently breaks the distribution model, requiring a reset of the Bayesian prior.
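The sliding-window idea in the last item could look like the sketch below. The class `SlidingVelocity` is hypothetical. The default window of 8 weeks mirrors the observation period mentioned under Key Insights, and `reset` stands in for discarding history after a team change.

```python
from collections import deque
from statistics import mean, pstdev


class SlidingVelocity:
    """Keeps only the most recent weeks of throughput for one developer."""

    def __init__(self, window=8):
        # deque with maxlen drops the oldest week automatically.
        self.weeks = deque(maxlen=window)

    def record(self, tasks_done):
        self.weeks.append(tasks_done)

    def estimate(self):
        """Return (mean, std) of the window, or None with under 2 weeks."""
        if len(self.weeks) < 2:
            return None
        return mean(self.weeks), pstdev(self.weeks)

    def reset(self):
        """Discard history, e.g. after a team member is swapped."""
        self.weeks.clear()


tracker = SlidingVelocity(window=8)
# Hypothetical data: the last two weeks reflect a change in capacity.
for done in [20, 22, 19, 21, 20, 23, 21, 22, 30, 30]:
    tracker.record(done)
window_mean, window_std = tracker.estimate()
print(f"velocity over last 8 weeks: {window_mean:.2f} ± {window_std:.2f}")
```

The resulting mean and standard deviation can feed the simulation in place of the full-history values, so older weeks stop influencing the forecast once they fall out of the window.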