---
name: cto-advisor
description: Technical leadership guidance for engineering teams, architecture decisions, and technology strategy. Use when assessing technical debt, scaling engineering teams, evaluating technologies, making architecture decisions, establishing engineering metrics, or when user mentions CTO, tech debt, technical debt, team scaling, architecture decisions, technology evaluation, engineering metrics, DORA metrics, or technology strategy.
license: MIT
metadata:
  version: 2.0.0
  author: Alireza Rezvani
  category: c-level
  domain: cto-leadership
  updated: 2026-03-05
  python-tools: tech_debt_analyzer.py, team_scaling_calculator.py
  frameworks: architecture-decisions, engineering-metrics, technology-evaluation
---
# CTO Advisor

Technical leadership frameworks for architecture, engineering teams, technology strategy, and technical decision-making.

## Keywords

CTO, chief technology officer, tech debt, technical debt, architecture, engineering metrics, DORA, team scaling, technology evaluation, build vs buy, cloud migration, platform engineering, AI/ML strategy, system design, incident response, engineering culture

## Quick Start

```bash
python scripts/tech_debt_analyzer.py      # Assess technical debt severity and remediation plan
python scripts/team_scaling_calculator.py # Model engineering team growth and cost
```
## Core Responsibilities

### 1. Technology Strategy

Align technology investments with business priorities.

Strategy components:

- Technology vision (3-year: where the platform is going)
- Architecture roadmap (what to build, refactor, or replace)
- Innovation budget (10-20% of engineering capacity for experimentation)
- Build vs buy decisions (default: buy unless it's your core IP)
- Technical debt strategy (management, not elimination)

See references/technology_evaluation_framework.md for the full evaluation framework.
### 2. Engineering Team Leadership

Scale the engineering org's productivity, not individual output.

Scaling engineering:

- Hire for the next stage, not the current one
- Every 3x growth in team size requires a reorg
- Manager:IC ratio: 5-8 direct reports is optimal
- Senior:junior ratio: at least 1:2 (with fewer seniors than that, they drown in mentoring)

Culture:

- Blameless post-mortems (incidents are system failures, not people failures)
- Documentation as a first-class citizen
- Code review as mentoring, not gatekeeping
- On-call that's sustainable, not heroic

See references/engineering_metrics.md for DORA metrics and the engineering health dashboard.
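The ratios above can be turned into a quick sizing sketch. This is an illustrative helper, not the bundled scripts/team_scaling_calculator.py; the 6-reports-per-manager default and the one-senior-per-two-juniors floor are assumptions taken directly from the guidance above.

```python
import math

def org_shape(engineers: int, reports_per_manager: int = 6) -> dict:
    """Rough org shape for a target IC headcount.

    Assumes 5-8 reports per manager (default 6) and a floor of one
    senior for every two juniors, per the ratios above.
    """
    managers = math.ceil(engineers / reports_per_manager)
    seniors = math.ceil(engineers / 3)   # 1:2 senior:junior => seniors are 1/3 of ICs
    juniors = engineers - seniors
    return {"managers": managers, "seniors": seniors, "juniors": juniors}

print(org_shape(45))  # -> {'managers': 8, 'seniors': 15, 'juniors': 30}
```

For 45 engineers this yields 8 managers and a 15:30 senior:junior split, right at the 1:2 floor.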
### 3. Architecture Governance

Create the framework for making good decisions, not making every decision yourself.

Architecture Decision Records (ADRs):

- Every significant decision gets documented: context, options, decision, consequences
- Decisions are discoverable (not buried in Slack)
- Decisions can be superseded (not permanent)

See references/architecture_decision_records.md for ADR templates and the decision review process.

### 4. Vendor Management

Every vendor is a dependency. Every dependency is a risk.

Evaluation criteria: Does it solve a real problem? Can we migrate away? Is the vendor stable? What's the total cost (license + integration + maintenance)?

### 5. Crisis Management

Incident response, security breaches, major outages, data loss.

Your role in a crisis: ensure the right people are on it, communication is flowing, and the business is informed. Post-crisis: run a blameless retrospective within 48 hours.
## Workflows

### Tech Debt Assessment Workflow

**Step 1: Run the analyzer**

```bash
python scripts/tech_debt_analyzer.py --output report.json
```

**Step 2: Interpret results**

The analyzer produces a severity-scored inventory. Review each item against:

- Severity (P0-P3): how much is it blocking velocity or creating risk?
- Cost-to-fix: estimated engineering days to remediate
- Blast radius: how many systems and teams are affected?

**Step 3: Build a prioritized remediation plan**

Sort by (Severity × Blast Radius) / Cost-to-fix; the highest score gets fixed first.
Group items into: (a) immediate sprint, (b) next quarter, (c) tracked backlog.

**Step 4: Validate before presenting to stakeholders**

Example output (Tech Debt Inventory):

| Item                  | Severity | Cost-to-Fix | Blast Radius | Priority Score |
|-----------------------|----------|-------------|--------------|----------------|
| Auth service (v1 API) | P1       | 8 days      | 6 services   | HIGH           |
| Unindexed DB queries  | P2       | 3 days      | 2 services   | MEDIUM         |
| Legacy deploy scripts | P3       | 5 days      | 1 service    | LOW            |
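The Step 3 formula can be applied directly to the inventory. A minimal sketch; the numeric P0-P3 weights are assumptions for illustration, not part of the analyzer's output format:

```python
# Severity weights are an assumption (P0 most severe).
SEVERITY_WEIGHT = {"P0": 4, "P1": 3, "P2": 2, "P3": 1}

def priority_score(severity: str, blast_radius: int, cost_days: float) -> float:
    # (Severity x Blast Radius) / Cost-to-fix, as in Step 3
    return SEVERITY_WEIGHT[severity] * blast_radius / cost_days

items = [
    ("Auth service (v1 API)", "P1", 6, 8),
    ("Unindexed DB queries", "P2", 2, 3),
    ("Legacy deploy scripts", "P3", 1, 5),
]
ranked = sorted(items, key=lambda i: priority_score(*i[1:]), reverse=True)
for name, sev, radius, days in ranked:
    print(f"{name}: {priority_score(sev, radius, days):.2f}")
```

The ranking reproduces the HIGH/MEDIUM/LOW ordering in the example table.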
### ADR Creation Workflow

**Step 1: Identify the decision**

Trigger an ADR when the decision affects more than one team, is hard to reverse, or has cost/risk implications greater than one sprint of effort.

**Step 2: Draft the ADR**

Use the template from references/architecture_decision_records.md:

```text
Title: [Short noun phrase]
Status: Proposed | Accepted | Superseded
Context: What is the problem? What constraints exist?
Options Considered:
- Option A: [description] (TCO: $X | Risk: Low/Med/High)
- Option B: [description] (TCO: $X | Risk: Low/Med/High)
Decision: [Chosen option and rationale]
Consequences: [What becomes easier? What becomes harder?]
```

**Step 3: Validation checkpoint (before finalizing)**

**Step 4: Communicate and close**

Share the accepted ADR in the engineering all-hands or architecture sync, and link it from the relevant service's README.
### Build vs Buy Analysis Workflow

**Step 1: Define requirements (functional + non-functional)**

**Step 2: Identify candidate vendors or internal build scope**

**Step 3: Score each option**

| Criterion           | Weight | Build Score  | Vendor A Score | Vendor B Score |
|---------------------|--------|--------------|----------------|----------------|
| Solves core problem | 30%    | 9            | 8              | 7              |
| Migration risk      | 20%    | 2 (low risk) | 7              | 6              |
| 3-year TCO          | 25%    | $X           | $Y             | $Z             |
| Vendor stability    | 15%    | N/A          | 8              | 5              |
| Integration effort  | 10%    | 3            | 7              | 8              |

**Step 4: Apply the default rule.** Buy unless it is core IP or no vendor meets ≥ 70% of requirements.

**Step 5: Document the decision as an ADR** (see the ADR Creation Workflow above).
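The scoring table above collapses into one weighted score per option. A sketch assuming every criterion, including TCO, has first been normalized to a 0-10 score (raw dollar figures can't be summed into a weighted total); the criterion keys and Vendor A's scores are illustrative:

```python
def weighted_score(scores: dict, weights: dict) -> float:
    """Weighted sum over criteria; weights must total 100%."""
    assert abs(sum(weights.values()) - 1.0) < 1e-9, "weights must sum to 1.0"
    return sum(weights[c] * scores[c] for c in weights)

weights = {"core": 0.30, "migration": 0.20, "tco": 0.25, "vendor": 0.15, "integration": 0.10}
vendor_a = {"core": 8, "migration": 7, "tco": 6, "vendor": 8, "integration": 7}
print(round(weighted_score(vendor_a, weights), 2))  # -> 7.2
```

Comparing these totals across build, Vendor A, and Vendor B makes the Step 4 default rule mechanical.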
## Key Questions a CTO Asks

- "What's our biggest technical risk right now? Not the most annoying, the most dangerous."
- "If we 10x our traffic tomorrow, what breaks first?"
- "How much of our engineering time goes to maintenance vs new features?"
- "What would a new engineer say about our codebase after their first week?"
- "Which technical decision from 2 years ago is hurting us most today?"
- "Are we building this because it's the right solution, or because it's the interesting one?"
- "What's our bus factor on critical systems?"
## CTO Metrics Dashboard

| Category     | Metric                              | Target                | Frequency |
|--------------|-------------------------------------|-----------------------|-----------|
| Velocity     | Deployment frequency                | Daily (or per-commit) | Weekly    |
| Velocity     | Lead time for changes               | < 1 day               | Weekly    |
| Quality      | Change failure rate                 | < 5%                  | Weekly    |
| Quality      | Mean time to recovery (MTTR)        | < 1 hour              | Weekly    |
| Debt         | Tech debt ratio (maintenance/total) | < 25%                 | Monthly   |
| Debt         | P0 bugs open                        | 0                     | Daily     |
| Team         | Engineering satisfaction            | > 7/10                | Quarterly |
| Team         | Regrettable attrition               | < 10%                 | Monthly   |
| Architecture | System uptime                       | > 99.9%               | Monthly   |
| Architecture | API response time (p95)             | < 200ms               | Weekly    |
| Cost         | Cloud spend / revenue ratio         | Declining trend       | Monthly   |
## Red Flags

- Tech debt ratio > 30% and growing faster than it's being paid down
- Deployment frequency declining over 4+ weeks
- No ADRs for the last 3 major decisions
- The CTO is the only person who can deploy to production
- Build times exceed 10 minutes
- Single points of failure on critical systems with no mitigation plan
- The team dreads on-call rotation
## Integration with C-Suite Roles

| When…                | CTO works with…  | To…                                            |
|----------------------|------------------|------------------------------------------------|
| Roadmap planning     | CPO              | Align technical and product roadmaps           |
| Hiring engineers     | CHRO             | Define roles, comp bands, hiring criteria      |
| Budget planning      | CFO              | Cloud costs, tooling, headcount budget         |
| Security posture     | CISO             | Architecture review, compliance requirements   |
| Scaling operations   | COO              | Infrastructure capacity vs growth plans        |
| Revenue commitments  | CRO              | Technical feasibility of enterprise deals      |
| Technical marketing  | CMO              | Developer relations, technical content         |
| Strategic decisions  | CEO              | Technology as competitive advantage            |
| Hard calls           | Executive Mentor | "Should we rewrite?" "Should we switch stacks?" |
## Proactive Triggers

Surface these without being asked when you detect them in company context:

- Deployment frequency dropping → early signal of team health issues
- Tech debt ratio > 30% → recommend a tech debt sprint
- No ADRs filed in 30+ days → architecture decisions going undocumented
- Single point of failure on a critical system → flag bus factor risk
- Cloud costs growing faster than revenue → cost optimization review
- Security audit overdue (> 12 months) → escalate to CISO
## Output Artifacts

| Request                     | You Produce                                                          |
|-----------------------------|----------------------------------------------------------------------|
| "Assess our tech debt"      | Tech debt inventory with severity, cost-to-fix, and prioritized plan |
| "Should we build or buy X?" | Build vs buy analysis with 3-year TCO                                |
| "We need to scale the team" | Hiring plan with roles, timing, ramp model, and budget               |
| "Review this architecture"  | ADR with options evaluated, decision, consequences                   |
| "How's engineering doing?"  | Engineering health dashboard (DORA + debt + team)                    |
## Reasoning Technique: ReAct (Reason, then Act)

Research the technical landscape first. Analyze options against constraints (time, team skill, cost, risk). Then recommend action. Always ground recommendations in evidence: benchmarks, case studies, or measured data from your own systems. "I think" is not enough; show the data.

## Communication

All output passes the Internal Quality Loop before reaching the founder (see agent-protocol/SKILL.md).

- Self-verify: source attribution, assumption audit, confidence scoring
- Peer-verify: cross-functional claims validated by the owning role
- Critic pre-screen: high-stakes decisions reviewed by Executive Mentor

Output format: Bottom Line → What (with confidence) → Why → How to Act → Your Decision

Results only. Every finding tagged: 🟢 verified, 🟡 medium, 🔴 assumed.
## Context Integration

- Always read company-context.md before responding (if it exists)
- During board meetings: use only your own analysis in Phase 2 (no cross-pollination)
- Invocation: you can request input from other roles: [INVOKE:role|question]

## Resources

- references/technology_evaluation_framework.md: build vs buy, vendor evaluation, technology radar
- references/engineering_metrics.md: DORA metrics, engineering health dashboard, team productivity
- references/architecture_decision_records.md: ADR templates, decision governance, review process
# Architecture Decision Records (ADR) Framework

## What is an ADR?

Architecture Decision Records capture important architectural decisions along with their context and consequences. They preserve institutional knowledge and explain why systems are built the way they are.

## ADR Template

### ADR-[NUMBER]: [TITLE]

**Date**: YYYY-MM-DD
**Status**: [Proposed | Accepted | Deprecated | Superseded]
**Deciders**: [List of people involved in decision]
**Technical Story**: [Ticket/Issue reference]
#### Context and Problem Statement

[Describe the context and problem that needs to be solved. What are we trying to achieve?]

#### Decision Drivers

- [Driver 1: e.g., Performance requirements]
- [Driver 2: e.g., Time to market]
- [Driver 3: e.g., Team expertise]
- [Driver 4: e.g., Cost constraints]

#### Considered Options

1. Option 1: [Name]
2. Option 2: [Name]
3. Option 3: [Name]

#### Decision Outcome

Chosen option: "[Option Name]", because [justification].

##### Positive Consequences

- [Consequence 1]
- [Consequence 2]

##### Negative Consequences

- [Risk 1 and mitigation]
- [Risk 2 and mitigation]

#### Pros and Cons of Options

##### Option 1: [Name]

Pros:

- [Advantage 1]
- [Advantage 2]

Cons:

- [Disadvantage 1]
- [Disadvantage 2]

##### Option 2: [Name]

[Repeat structure]

#### Links

- [Related ADRs]
- [Documentation]
- [Research/PoCs]
## Example ADRs

### ADR-001: Microservices Architecture

**Date**: 2024-01-15
**Status**: Accepted
**Deciders**: CTO, VP Engineering, Tech Leads
**Technical Story**: ARCH-001

#### Context and Problem Statement

Our monolithic application is becoming difficult to scale and deploy. Different teams are stepping on each other's toes, and deployment cycles are getting longer. We need to decide on our architectural approach for the next 3-5 years.

#### Decision Drivers

- Need for independent team deployment
- Requirement to scale different components independently
- Different components have different performance characteristics
- Team size growing from 25 to 75+ engineers
- Need to support multiple technology stacks

#### Considered Options

1. Keep Monolith: continue with current architecture
2. Modular Monolith: break into modules but keep a single deployment
3. Microservices: full service-oriented architecture
4. Serverless: Function-as-a-Service approach

#### Decision Outcome

Chosen option: "Microservices", because it best supports our team autonomy needs and scaling requirements, despite added complexity.

##### Positive Consequences

- Teams can deploy independently
- Services can scale based on individual needs
- Technology diversity is possible
- Improved fault isolation

##### Negative Consequences

- Increased operational complexity: mitigated by investing in DevOps
- Network latency between services: mitigated by careful service boundaries
- Data consistency challenges: mitigated by event sourcing patterns
### ADR-002: Container Orchestration Platform

**Date**: 2024-02-01
**Status**: Accepted
**Deciders**: CTO, DevOps Lead, Platform Team
**Technical Story**: INFRA-045

#### Context and Problem Statement

With the move to microservices (ADR-001), we need a container orchestration platform to manage deployment, scaling, and operations of application containers.

#### Decision Drivers

- Need for automated deployment and scaling
- High availability requirements (99.9% SLA)
- Multi-cloud strategy (avoid vendor lock-in)
- Team familiarity and ecosystem maturity
- Cost considerations

#### Considered Options

1. Kubernetes: industry standard, self-managed
2. Amazon ECS: AWS-native solution
3. Docker Swarm: simpler alternative
4. Nomad: HashiCorp solution

#### Decision Outcome

Chosen option: "Kubernetes", because of its maturity, ecosystem, and multi-cloud support.

##### Positive Consequences

- Industry standard with a huge ecosystem
- Multi-cloud compatible
- Strong community support
- Extensive tooling available

##### Negative Consequences

- Steep learning curve: mitigated by training and hiring
- Operational complexity: mitigated by managed Kubernetes (EKS/GKE)
### ADR-003: API Gateway Strategy

**Date**: 2024-03-15
**Status**: Accepted
**Deciders**: CTO, Security Lead, API Team
**Technical Story**: API-101

#### Context and Problem Statement

With multiple microservices, we need a unified entry point for external clients that handles cross-cutting concerns like authentication, rate limiting, and monitoring.

#### Decision Drivers

- Security requirements (OAuth2, API keys)
- Need for rate limiting and throttling
- Monitoring and analytics requirements
- Developer experience for API consumers
- Performance (sub-100ms overhead)

#### Considered Options

1. Kong: open-source, plugin ecosystem
2. AWS API Gateway: managed service
3. Istio/Envoy: service mesh approach
4. Build Custom: in-house solution

#### Decision Outcome

Chosen option: "Kong", because of its flexibility and plugin ecosystem while avoiding vendor lock-in.
## Common Architecture Decisions

### 1. Frontend Architecture

- Single Page Application (SPA) vs Server-Side Rendering (SSR) vs Static Site Generation (SSG)
- React vs Vue vs Angular vs Svelte
- Monorepo vs Polyrepo
- Micro-frontends vs Monolithic frontend

### 2. Backend Architecture

- Monolith vs Microservices vs Serverless
- REST vs GraphQL vs gRPC
- Synchronous vs Asynchronous communication
- Event-driven vs Request-response

### 3. Data Architecture

- SQL vs NoSQL vs NewSQL
- Single database vs Database per service
- CQRS vs Traditional CRUD
- Event Sourcing vs State-based storage

### 4. Infrastructure Decisions

- Cloud provider: AWS vs Azure vs GCP vs Multi-cloud
- Containers vs VMs vs Serverless
- Kubernetes vs ECS vs Cloud Run
- Self-hosted vs Managed services

### 5. Development Practices

- Continuous Deployment vs Continuous Delivery
- Feature flags vs Branch-based deployment
- Blue-green vs Canary vs Rolling deployment
- GitFlow vs GitHub Flow vs GitLab Flow
## ADR Best Practices

### Writing Good ADRs

- Keep them short: 1-2 pages maximum
- Be specific: include concrete examples
- Document why, not what: focus on reasoning
- Include all options: even obviously bad ones
- Be honest about drawbacks: every decision has trade-offs

### When to Write ADRs

Write an ADR when:

- The decision has significant impact
- Multiple options were seriously considered
- The decision is hard to reverse
- You find yourself explaining the same decision repeatedly
- There's disagreement about the approach

### ADR Lifecycle

1. Proposed: under discussion
2. Accepted: decision made and being implemented
3. Deprecated: no longer relevant but kept for history
4. Superseded: replaced by another ADR

### Storage and Discovery

- Store ADRs in your main repository under docs/architecture/decisions/
- Use consistent numbering (ADR-001, ADR-002, etc.)
- Create an index file linking all ADRs
- Reference ADRs in code comments where relevant
- Review ADRs regularly (quarterly) for relevance
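A small script can enforce the numbering and storage conventions above. The directory layout and filename scheme follow this guide; the helper itself is an illustrative sketch, not a shipped tool:

```python
from datetime import date
from pathlib import Path

def new_adr(title: str, root: str = "docs/architecture/decisions") -> Path:
    """Create the next sequentially numbered ADR skeleton."""
    adr_dir = Path(root)
    adr_dir.mkdir(parents=True, exist_ok=True)
    next_num = len(list(adr_dir.glob("ADR-*.md"))) + 1
    slug = title.lower().replace(" ", "-")
    path = adr_dir / f"ADR-{next_num:03d}-{slug}.md"
    path.write_text(
        f"# ADR-{next_num:03d}: {title}\n\n"
        f"Date: {date.today().isoformat()}\n"
        "Status: Proposed\n\n"
        "## Context and Problem Statement\n\n"
        "## Considered Options\n\n"
        "## Decision Outcome\n"
    )
    return path
```

For example, `new_adr("API Gateway Strategy")` creates `ADR-001-api-gateway-strategy.md` (or the next free number) with the template headings pre-filled.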
## Decision Evaluation Framework

### Technical Factors (40%)

- Performance impact
- Scalability potential
- Security implications
- Maintainability
- Technical debt

### Business Factors (30%)

- Time to market
- Cost (initial and ongoing)
- Revenue impact
- Competitive advantage
- Regulatory compliance

### Team Factors (30%)

- Current expertise
- Learning curve
- Hiring availability
- Team preference
- Training requirements
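The 40/30/30 split above can be computed as a weighted composite. A sketch assuming each factor is scored 1-5 and each category simply averages its factors; the sample scores are placeholders:

```python
CATEGORY_WEIGHTS = {"technical": 0.40, "business": 0.30, "team": 0.30}

def decision_score(factors: dict) -> float:
    """Average each category's 1-5 factor scores, then apply 40/30/30 weights."""
    total = 0.0
    for category, weight in CATEGORY_WEIGHTS.items():
        scores = list(factors[category].values())
        total += weight * (sum(scores) / len(scores))
    return total

option = {
    "technical": {"performance": 4, "scalability": 5, "security": 3,
                  "maintainability": 4, "tech_debt": 3},
    "business": {"time_to_market": 5, "cost": 3, "revenue": 4,
                 "advantage": 4, "compliance": 5},
    "team": {"expertise": 2, "learning_curve": 3, "hiring": 4,
             "preference": 4, "training": 3},
}
print(round(decision_score(option), 2))  # -> 3.74
```

Comparing composites across options makes the trade-off explicit rather than intuitive.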
## Anti-patterns to Avoid

- Decision by Committee: too many stakeholders leading to compromise solutions
- Analysis Paralysis: over-analyzing instead of deciding
- Resume-Driven Development: choosing tech for personal goals
- Hype-Driven Development: choosing the newest/coolest tech
- Not-Invented-Here: rejecting external solutions by default
- Vendor Lock-in: over-dependence on proprietary solutions
- Premature Optimization: solving problems you don't have yet
- Under-documentation: not capturing the "why" behind decisions

## Review Checklist

Before finalizing an ADR, ensure:
# Engineering Metrics & KPIs Guide

## Metrics Framework

### DORA Metrics (DevOps Research and Assessment)

#### 1. Deployment Frequency

- Definition: how often code is deployed to production
- Targets:
  - Elite: multiple deploys per day
  - High: weekly to monthly
  - Medium: monthly to bi-annually
  - Low: less than bi-annually
- Measurement: deployments per day/week/month
- Improvement: smaller batch sizes, feature flags, CI/CD

#### 2. Lead Time for Changes

- Definition: time from code commit to production
- Targets:
  - Elite: less than 1 hour
  - High: 1 day to 1 week
  - Medium: 1 week to 1 month
  - Low: more than 1 month
- Measurement: median time from commit to deploy
- Improvement: automation, parallel testing, smaller changes

#### 3. Mean Time to Recovery (MTTR)

- Definition: time to restore service after an incident
- Targets:
  - Elite: less than 1 hour
  - High: less than 1 day
  - Medium: 1 day to 1 week
  - Low: more than 1 week
- Measurement: average incident resolution time
- Improvement: monitoring, rollback capability, runbooks

#### 4. Change Failure Rate

- Definition: percentage of changes causing failures
- Targets:
  - Elite: 0-15%
  - High: 16-30%
  - Medium/Low: >30%
- Measurement: failed deploys / total deploys
- Improvement: testing, code review, gradual rollouts
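A helper can map raw deploy counts onto the deployment-frequency bands above. The numeric cutoffs (daily, monthly, roughly every six months) are approximations of the band boundaries, not official DORA thresholds:

```python
def dora_deploy_band(deploys: int, window_days: int) -> str:
    """Map an observed deploy rate onto the bands above.

    Approximate cutoffs: daily or better = Elite, at least
    monthly = High, at least every ~6 months = Medium, else Low.
    """
    per_day = deploys / window_days
    if per_day >= 1:
        return "Elite"
    if per_day >= 1 / 30:
        return "High"
    if per_day >= 1 / 183:
        return "Medium"
    return "Low"

print(dora_deploy_band(96, 30))  # 3.2 deploys/day -> Elite
```

Feeding it a trailing 30- or 90-day window keeps the classification responsive to recent trends.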
## Engineering Productivity Metrics

### Code Quality

| Metric                | Formula                     | Target | Action if Off-Target  |
|-----------------------|-----------------------------|--------|-----------------------|
| Test Coverage         | Covered Lines / Total Lines | >80%   | Add unit tests        |
| Code Review Coverage  | Reviewed PRs / Total PRs    | 100%   | Enforce review policy |
| Technical Debt Ratio  | Debt / Development Time     | <10%   | Dedicate debt sprints |
| Cyclomatic Complexity | Per function/method         | <10    | Refactor complex code |
| Code Duplication      | Duplicate Lines / Total     | <5%    | Extract common code   |
### Development Velocity

| Metric              | Formula               | Target      | Action if Off-Target |
|---------------------|-----------------------|-------------|----------------------|
| Sprint Velocity     | Story Points / Sprint | Stable ±10% | Review estimation    |
| Cycle Time          | Start to Done Time    | <5 days     | Reduce WIP           |
| PR Merge Time       | Open to Merge         | <24 hours   | Smaller PRs          |
| Build Time          | Code to Artifact      | <10 minutes | Optimize pipeline    |
| Test Execution Time | Full Test Suite       | <30 minutes | Parallelize tests    |
### Team Health

| Metric            | Formula               | Target  | Action if Off-Target |
|-------------------|-----------------------|---------|----------------------|
| On-call Incidents | Incidents / Week      | <5      | Improve monitoring   |
| Bug Escape Rate   | Prod Bugs / Release   | <5%     | Improve testing      |
| Unplanned Work    | Unplanned / Total     | <20%    | Better planning      |
| Meeting Time      | Meetings / Total Time | <20%    | Reduce meetings      |
| Focus Time        | Uninterrupted Hours   | >4h/day | Block calendars      |
## Business Impact Metrics

### System Performance

| Metric            | Description         | Target          | Business Impact       |
|-------------------|---------------------|-----------------|-----------------------|
| Uptime            | System availability | 99.9%+          | Revenue protection    |
| Page Load Time    | Time to interactive | <3s             | User retention        |
| API Response Time | P95 latency         | <200ms          | User experience       |
| Error Rate        | Errors / Requests   | <0.1%           | Customer satisfaction |
| Throughput        | Requests / Second   | Per requirement | Scalability           |
### Product Delivery

| Metric                | Description           | Target      | Business Impact        |
|-----------------------|-----------------------|-------------|------------------------|
| Feature Delivery Rate | Features / Quarter    | Per roadmap | Market competitiveness |
| Time to Market        | Idea to Production    | <3 months   | First mover advantage  |
| Customer Defect Rate  | Customer Bugs / Month | <10         | Customer satisfaction  |
| Feature Adoption      | Users / Feature       | >50%        | ROI validation         |
| NPS from Engineering  | Customer Score        | >50         | Product quality        |
## Metrics Dashboards

### Executive Dashboard (Weekly)

```text
┌─────────────────────────────────────┐
│ EXECUTIVE METRICS                   │
├─────────────────────────────────────┤
│ Uptime: 99.97% ✓                    │
│ Sprint Velocity: 142 pts ✓          │
│ Deployment Frequency: 3.2/day ✓     │
│ Lead Time: 4.2 hrs ✓                │
│ MTTR: 47 min ✓                      │
│ Change Failure Rate: 8.3% ✓         │
│                                     │
│ Team Health: 8.2/10                 │
│ Tech Debt Ratio: 12% ⚠              │
│ Feature Delivery: 85% ✓             │
└─────────────────────────────────────┘
```

### Team Dashboard (Daily)

```text
┌─────────────────────────────────────┐
│ TEAM METRICS                        │
├─────────────────────────────────────┤
│ Current Sprint:                     │
│   Completed: 65/100 pts (65%)       │
│   In Progress: 20 pts               │
│   Days Left: 3                      │
│                                     │
│ PR Queue: 8 pending                 │
│ Build Status: ✓ Passing             │
│ Test Coverage: 82.3%                │
│ Open Incidents: 2 (P2, P3)          │
│                                     │
│ On-call Load: 3 pages this week     │
└─────────────────────────────────────┘
```

### Individual Dashboard (Daily)

```text
┌─────────────────────────────────────┐
│ DEVELOPER METRICS                   │
├─────────────────────────────────────┤
│ This Week:                          │
│   PRs Merged: 8                     │
│   Code Reviews: 12                  │
│   Commits: 23                       │
│   Focus Time: 22.5 hrs              │
│                                     │
│ Quality:                            │
│   Test Coverage: 87%                │
│   Code Review Feedback: 95% ✓       │
│   Bug Introduction Rate: 0%         │
└─────────────────────────────────────┘
```

## Implementation Guide
### Phase 1: Foundation (Month 1)

Basic metrics:

- Deployment frequency
- Build success rate
- Uptime/availability
- Team velocity

Tools setup:

- CI/CD instrumentation
- Basic monitoring
- Time tracking

### Phase 2: Quality (Month 2)

Quality metrics:

- Test coverage
- Code review metrics
- Bug rates
- Technical debt

Tool integration:

- Static analysis
- Test reporting
- Code quality gates

### Phase 3: Performance (Month 3)

Performance metrics:

- DORA metrics complete
- System performance
- API metrics
- Database metrics

Advanced monitoring:

- APM tools
- Distributed tracing
- Custom dashboards

### Phase 4: Optimization (Ongoing)

Advanced analytics:

- Predictive metrics
- Trend analysis
- Anomaly detection
- Correlation analysis
## Metric Anti-patterns

### What NOT to Measure

- ❌ Lines of Code: encourages bloat
- ❌ Hours Worked: promotes presenteeism
- ❌ Individual Velocity: creates competition
- ❌ Bug Count Without Context: discourages risk-taking
- ❌ Commit Count: encourages tiny commits

### Goodhart's Law

> "When a measure becomes a target, it ceases to be a good measure."

Examples:

- Optimizing test coverage → writing meaningless tests
- Reducing bug count → not reporting bugs
- Increasing velocity → inflating estimates
- Reducing meeting time → skipping important discussions

### How to Avoid Gaming

1. Use multiple metrics: no single metric tells the whole story
2. Focus on trends, not absolute numbers
3. Combine leading and lagging indicators: balance predictive and historical
4. Review regularly: adjust metrics that are being gamed
5. Team ownership: let teams choose their metrics
## OKR Framework for Engineering

### Company-Level OKRs

Objective: deliver exceptional product quality

Key Results:

- KR1: Achieve 99.95% uptime (from 99.9%)
- KR2: Reduce customer-reported bugs by 50%
- KR3: Improve deployment frequency to 10x/day

### Engineering OKRs

Objective: build scalable, reliable infrastructure

Key Results:

- KR1: Migrate 80% of services to Kubernetes
- KR2: Reduce MTTR to <30 minutes
- KR3: Achieve 85% test coverage

### Team OKRs

Objective: improve developer productivity

Key Results:

- KR1: Reduce build time to <5 minutes
- KR2: Automate 90% of the deployment process
- KR3: Reduce PR review time to <4 hours
## Reporting Templates

### Monthly Engineering Report

```markdown
# Engineering Report - [Month Year]

## Executive Summary
- Key Achievement: [Highlight]
- Main Challenge: [Issue and resolution]
- Next Month Focus: [Priority]

## DORA Metrics
| Metric | This Month | Last Month | Target | Status |
|--------|------------|------------|--------|--------|
| Deploy Frequency | X/day | Y/day | Z/day | ✓/⚠/✗ |
| Lead Time | X hrs | Y hrs | <Z hrs | ✓/⚠/✗ |
| MTTR | X min | Y min | <Z min | ✓/⚠/✗ |
| Change Failure | X% | Y% | <Z% | ✓/⚠/✗ |

## Team Performance
- Velocity: X story points (Y% of plan)
- Sprint Completion: X%
- Unplanned Work: X%

## Quality Metrics
- Test Coverage: X% (Δ Y%)
- Customer Bugs: X (Δ Y)
- Code Review Coverage: X%

## Highlights
1. [Major feature or improvement]
2. [Technical achievement]
3. [Process improvement]

## Challenges & Solutions
1. Challenge: [Issue]
   Solution: [Action taken]

## Next Month Priorities
1. [Priority 1]
2. [Priority 2]
3. [Priority 3]
```

### Quarterly Business Review
```markdown
# Engineering QBR - Q[X] [Year]

## Strategic Alignment
- Business Goal: [Goal]
- Engineering Contribution: [How engineering supported]
- Impact: [Measurable outcome]

## Quarterly Metrics

### Delivery
- Features Shipped: X of Y planned (Z%)
- Major Releases: [List]
- Technical Debt Reduced: X%

### Reliability
- Uptime: X%
- Incidents: X (Y critical, Z major)
- Customer Impact: [Description]

### Efficiency
- Cost per Transaction: $X (Δ Y%)
- Infrastructure Cost: $X (Δ Y%)
- Engineering Cost per Feature: $X

## Team Growth
- Headcount: Start: X → End: Y
- Attrition: X%
- Key Hires: [Roles]

## Innovation
- Patents Filed: X
- Open Source Contributions: X
- Hackathon Projects: X

## Lessons Learned
1. [What worked well]
2. [What didn't work]
3. [What we're changing]

## Next Quarter Focus
1. [Strategic Initiative 1]
2. [Strategic Initiative 2]
3. [Strategic Initiative 3]
```

## Tool Recommendations
### Metrics Collection

- DataDog: comprehensive monitoring
- New Relic: application performance
- Grafana + Prometheus: open source stack
- CloudWatch: AWS native

### Engineering Analytics

- LinearB: developer productivity
- Velocity: engineering metrics
- Sleuth: DORA metrics
- Swarmia: engineering insights

### Project Tracking

- Jira: issue tracking
- Linear: modern issue tracking
- Azure DevOps: Microsoft ecosystem
- GitHub Projects: integrated with code

### Incident Management

- PagerDuty: on-call management
- Opsgenie: incident response
- StatusPage: status communication
- FireHydrant: incident command
## Success Indicators

### Healthy Engineering Organization

- ✓ DORA metrics improving quarter-over-quarter
- ✓ Team satisfaction >8/10
- ✓ Attrition <10% annually
- ✓ On-time delivery >80%
- ✓ Technical debt <15% of capacity
- ✓ Innovation time >20%

### Warning Signs

- ⚠️ Increasing MTTR trend
- ⚠️ Declining velocity
- ⚠️ Rising bug escape rate
- ⚠️ Increasing unplanned work
- ⚠️ Growing PR queue
- ⚠️ Decreasing test coverage

### Crisis Indicators

- 🚨 Multiple production incidents per week
- 🚨 Team satisfaction <6/10
- 🚨 Attrition >20%
- 🚨 Technical debt >30%
- 🚨 No deployments for >1 week
- 🚨 Customer escalations increasing
# Technology Evaluation Framework

## Evaluation Process

### Phase 1: Requirements Gathering (Week 1)

Functional requirements:

- Core features needed
- Integration requirements
- Performance requirements
- Scalability needs
- Security requirements

Non-functional requirements:

- Usability / developer experience
- Documentation quality
- Community support
- Vendor stability
- Compliance needs

Constraints:

- Budget limitations
- Timeline constraints
- Team expertise
- Existing technology stack
- Regulatory requirements

### Phase 2: Market Research (Weeks 1-2)

Identify candidates:

- Industry leaders (Gartner Magic Quadrant)
- Open-source alternatives
- Emerging solutions
- Build vs buy analysis

Initial filtering:

- Eliminate options not meeting hard requirements
- Remove options outside budget
- Focus on 3-5 top candidates

### Phase 3: Deep Evaluation (Weeks 2-4)

Technical evaluation:

- Proof of Concept (PoC)
- Performance benchmarks
- Security assessment
- Integration testing
- Scalability testing

Business evaluation:

- Total Cost of Ownership (TCO)
- Return on Investment (ROI)
- Vendor assessment
- Risk analysis
- Exit strategy

### Phase 4: Decision (Week 4)
## Evaluation Criteria Matrix

### Technical Criteria (40%)

| Criterion   | Weight | Description                   | Scoring Guide                                                 |
|-------------|--------|-------------------------------|---------------------------------------------------------------|
| Performance | 10%    | Speed, throughput, latency    | 5 = exceeds requirements; 3 = meets; 1 = below                |
| Scalability | 10%    | Ability to grow with needs    | 5 = linear scalability; 3 = some limitations; 1 = hard limits |
| Reliability | 8%     | Uptime, fault tolerance       | 5 = 99.99% SLA; 3 = 99.9% SLA; 1 = <99% SLA                   |
| Security    | 8%     | Security features, compliance | 5 = exceeds standards; 3 = meets standards; 1 = concerns exist |
| Integration | 4%     | API quality, compatibility    | 5 = native integration; 3 = good APIs; 1 = limited            |
### Business Criteria (30%)

| Criterion        | Weight | Description                       | Scoring Guide                                                      |
|------------------|--------|-----------------------------------|--------------------------------------------------------------------|
| Cost             | 10%    | TCO including licenses, operation | 5 = under budget by >20%; 3 = within budget; 1 = over budget       |
| ROI              | 8%     | Value generation potential        | 5 = <6 month payback; 3 = <12 month payback; 1 = >24 month payback |
| Vendor Stability | 6%     | Financial health, market position | 5 = market leader; 3 = established player; 1 = startup/uncertain   |
| Support Quality  | 6%     | Support availability, SLAs        | 5 = 24/7 premium support; 3 = business hours; 1 = community only   |
### Operational Criteria (30%)

| Criterion     | Weight | Description               | Scoring Guide                                                 |
|---------------|--------|---------------------------|---------------------------------------------------------------|
| Ease of Use   | 8%     | Learning curve, UX        | 5 = intuitive; 3 = moderate learning; 1 = steep curve         |
| Documentation | 7%     | Quality, completeness     | 5 = excellent docs; 3 = adequate docs; 1 = poor docs          |
| Community     | 7%     | Size, activity, resources | 5 = large, active; 3 = moderate; 1 = small/inactive           |
| Maintenance   | 8%     | Operational overhead      | 5 = fully managed; 3 = some maintenance; 1 = high maintenance |
## Vendor Evaluation Template

### Vendor Profile

- Company Name:
- Founded:
- Headquarters:
- Employees:
- Revenue:
- Funding (if applicable):
- Key Customers:

### Product Assessment

- Strengths
- Weaknesses
- Opportunities
- Threats
### Financial Analysis

Cost breakdown:

| Component      | Year 1 | Year 2 | Year 3 | Total |
|----------------|--------|--------|--------|-------|
| Licensing      | $      | $      | $      | $     |
| Implementation | $      | $      | $      | $     |
| Training       | $      | $      | $      | $     |
| Support        | $      | $      | $      | $     |
| Infrastructure | $      | $      | $      | $     |
| **Total**      | $      | $      | $      | $     |
ROI calculation:

- Cost savings:
  - Reduced manual work: $/year
  - Efficiency gains: $/year
  - Error reduction: $/year
- Revenue impact:
  - New capabilities: $/year
  - Faster time to market: $/year
- Payback period: X months
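Payback period follows directly from the cost and benefit lines above. A sketch with placeholder figures:

```python
def payback_months(upfront_cost: float, annual_benefit: float) -> float:
    """Months until cumulative annual benefit covers the upfront cost."""
    return upfront_cost / (annual_benefit / 12)

# $240k year-one cost; $40k/yr manual-work savings + $80k/yr revenue impact
print(payback_months(240_000, 40_000 + 80_000))  # -> 24.0
```

A result over 24 months would score 1 on the ROI criterion in the matrix above.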
### Risk Assessment

| Risk                        | Probability  | Impact       | Mitigation |
|-----------------------------|--------------|--------------|------------|
| Vendor goes out of business | Low/Med/High | Low/Med/High | [Strategy] |
| Technology becomes obsolete |              |              |            |
| Integration difficulties    |              |              |            |
| Team adoption challenges    |              |              |            |
| Budget overrun              |              |              |            |
| Performance issues          |              |              |            |
## Build vs Buy Decision Framework

### When to Build

Advantages:

- Full control over features
- No vendor lock-in
- Potential competitive advantage
- Perfect fit for requirements
- No licensing costs

Build when:

- It's a core business differentiator
- Requirements are unique
- It's a long-term investment
- You have the expertise in-house
- No suitable solutions exist

Hidden costs:

- Development time
- Maintenance burden
- Security responsibility
- Documentation needs
- Training requirements

### When to Buy

Advantages:

- Faster time to market
- Proven solution
- Vendor support
- Regular updates
- Shared development costs

Buy when:

- The functionality is a commodity
- Requirements are standard
- Internal resources are limited
- You need a quick solution
- Good options are available

Hidden costs:

- Customization limits
- Vendor lock-in
- Integration effort
- Training needs
- Scaling costs

### When to Adopt Open Source

Advantages:

- No licensing costs
- Community support
- Transparency
- Customizable
- No vendor lock-in

Adopt when:

- A strong community exists
- A standard solution is needed
- You have the technical expertise
- You can contribute back
- Long-term stability is needed

Hidden costs:

- Support costs
- Security responsibility
- Upgrade management
- Integration effort
- Potential consulting needs
## Proof of Concept Guidelines

### PoC Scope

- Duration: 2-4 weeks
- Team: 2-3 engineers
- Environment: isolated/sandbox
- Data: representative sample

### Success Criteria

### PoC Checklist
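When filling in the performance section of a PoC report, percentile latencies should come from measured samples rather than averages. A minimal nearest-rank p95 sketch; the sample latencies are illustrative:

```python
import math

def percentile(samples: list[float], pct: float) -> float:
    """Nearest-rank percentile; no interpolation, fine for small PoC runs."""
    ordered = sorted(samples)
    k = max(0, math.ceil(pct / 100 * len(ordered)) - 1)
    return ordered[k]

latencies_ms = [12, 18, 25, 31, 47, 52, 60, 75, 88, 140]
print(percentile(latencies_ms, 95))  # -> 140
```

Note how a single slow outlier dominates the p95 even when the mean looks healthy.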
PoC Report Template
# PoC Report: [Technology Name]
## Executive Summary
- **Recommendation**: [Proceed/Stop/Investigate Further]
- **Confidence Level**: [High/Medium/Low]
- **Key Finding**: [One sentence summary]
## Test Results
### Functional Tests
| Test Case | Result | Notes |
|-----------|--------|-------|
| | Pass/Fail | |
### Performance Tests
| Metric | Target | Actual | Status |
|--------|--------|--------|---------|
| Response Time | <100ms | Xms | ✓/✗ |
| Throughput | >1000 req/s | X req/s | ✓/✗ |
| CPU Usage | <70% | X% | ✓/✗ |
| Memory Usage | <4GB | XGB | ✓/✗ |
### Integration Tests
| System | Status | Effort |
|--------|--------|--------|
| Database | ✓/✗ | Low/Med/High |
| API Gateway | ✓/✗ | Low/Med/High |
| Authentication | ✓/✗ | Low/Med/High |
## Team Feedback
- **Ease of Use**: [1-5 rating]
- **Documentation**: [1-5 rating]
- **Would Recommend**: [Yes/No]
## Risks Identified
1. [Risk and mitigation]
2. [Risk and mitigation]
## Next Steps
1. [Action item]
2. [Action item]
Technology Categories
Development Platforms
Languages: TypeScript, Python, Go, Rust, Java
Frameworks: React, Node.js, Spring, Django, FastAPI
Mobile: React Native, Flutter, Swift, Kotlin
Evaluation Focus: Developer productivity, ecosystem, performance
Databases
SQL: PostgreSQL, MySQL, SQL Server
NoSQL: MongoDB, Cassandra, DynamoDB
NewSQL: CockroachDB, Vitess, TiDB
Evaluation Focus: Performance, scalability, consistency, operations
Infrastructure
Cloud: AWS, GCP, Azure
Containers: Docker, Kubernetes, Nomad
Serverless: Lambda, Cloud Functions, Vercel
Evaluation Focus: Cost, scalability, vendor lock-in, operations
Monitoring & Observability
APM: DataDog, New Relic, AppDynamics
Logging: ELK Stack, Splunk, CloudWatch
Metrics: Prometheus, Grafana, CloudWatch
Evaluation Focus: Coverage, cost, integration, insights
Security
SAST: SonarQube, Checkmarx, Veracode
DAST: OWASP ZAP, Burp Suite
Secrets: Vault, AWS Secrets Manager
Evaluation Focus: Coverage, false positives, integration
DevOps Tools
CI/CD: Jenkins, GitLab CI, GitHub Actions
IaC: Terraform, CloudFormation, Pulumi
Configuration: Ansible, Chef, Puppet
Evaluation Focus: Flexibility, integration, learning curve
Continuous Evaluation
Quarterly Reviews
Technology landscape changes
Performance against expectations
Cost optimization opportunities
Team satisfaction
Market alternatives
Annual Assessment
Full technology stack review
Vendor relationship evaluation
Strategic alignment check
Technical debt assessment
Roadmap planning
Deprecation Planning
Migration strategy
Timeline definition
Risk assessment
Communication plan
Success metrics
Decision Documentation
Always document:
Why the technology was chosen
Who was involved in the decision
When the decision was made
What alternatives were considered
How success will be measured
Use Architecture Decision Records (ADRs) for significant technology choices.
```python
#!/usr/bin/env python3
"""
Engineering Team Scaling Calculator - Optimize team growth and structure
"""
import json
import math
from typing import Dict, List


class TeamScalingCalculator:
    def __init__(self):
        self.conway_factor = 1.5   # Conway's Law impact factor
        self.brooks_factor = 0.75  # Brooks' Law diminishing returns

        # Optimal team structures based on size
        self.team_structures = {
            'startup': {'min': 1, 'max': 10, 'structure': 'flat'},
            'growth': {'min': 11, 'max': 50, 'structure': 'team_leads'},
            'scale': {'min': 51, 'max': 150, 'structure': 'departments'},
            'enterprise': {'min': 151, 'max': 9999, 'structure': 'divisions'}
        }

        # Role ratios for balanced teams
        self.role_ratios = {
            'engineering_manager': 0.125,  # 1:8 ratio
            'tech_lead': 0.167,            # 1:6 ratio
            'senior_engineer': 0.3,
            'mid_engineer': 0.4,
            'junior_engineer': 0.2,
            'devops': 0.1,
            'qa': 0.15,
            'product_manager': 0.1,
            'designer': 0.08,
            'data_engineer': 0.05
        }

    def calculate_scaling_plan(self, current_state: Dict, growth_targets: Dict) -> Dict:
        """Calculate optimal scaling plan"""
        results = {
            'current_analysis': self._analyze_current_state(current_state),
            'growth_timeline': self._create_growth_timeline(current_state, growth_targets),
            'hiring_plan': {},
            'team_structure': {},
            'budget_projection': {},
            'risk_factors': [],
            'recommendations': []
        }

        # Generate hiring plan
        results['hiring_plan'] = self._generate_hiring_plan(current_state, growth_targets)

        # Design team structure
        results['team_structure'] = self._design_team_structure(
            growth_targets['target_headcount']
        )

        # Calculate budget
        results['budget_projection'] = self._calculate_budget(
            results['hiring_plan'],
            current_state.get('location', 'US')
        )

        # Assess risks
        results['risk_factors'] = self._assess_scaling_risks(current_state, growth_targets)

        # Generate recommendations
        results['recommendations'] = self._generate_recommendations(results)

        return results

    def _analyze_current_state(self, current_state: Dict) -> Dict:
        """Analyze current team state"""
        total_engineers = current_state.get('headcount', 0)
        analysis = {
            'total_headcount': total_engineers,
            'team_stage': self._get_team_stage(total_engineers),
            'productivity_index': 0,
            'balance_score': 0,
            'issues': []
        }

        # Calculate productivity index
        if total_engineers > 0:
            velocity = current_state.get('velocity', 100)
            expected_velocity = total_engineers * 20  # baseline 20 points per engineer
            analysis['productivity_index'] = (velocity / expected_velocity) * 100

        # Check team balance
        roles = current_state.get('roles', {})
        analysis['balance_score'] = self._calculate_balance_score(roles, total_engineers)

        # Identify issues
        if analysis['productivity_index'] < 70:
            analysis['issues'].append('Low productivity - possible process or tooling issues')
        if analysis['balance_score'] < 60:
            analysis['issues'].append('Team imbalance - review role distribution')

        # Use the same role key as role_ratios ('engineering_manager')
        manager_ratio = roles.get('engineering_manager', 0) / max(total_engineers, 1)
        if manager_ratio > 0.2:
            analysis['issues'].append('Over-managed - too many managers')
        elif manager_ratio < 0.08 and total_engineers > 20:
            analysis['issues'].append('Under-managed - need more engineering managers')

        return analysis

    def _get_team_stage(self, headcount: int) -> str:
        """Determine team stage based on size"""
        for stage, config in self.team_structures.items():
            if config['min'] <= headcount <= config['max']:
                return stage
        return 'startup'

    def _calculate_balance_score(self, roles: Dict, total: int) -> float:
        """Calculate team balance score"""
        if total == 0:
            return 0
        score = 100
        for role, ideal_ratio in self.role_ratios.items():
            actual_count = roles.get(role, 0)
            actual_ratio = actual_count / total
            # Penalize deviation from the ideal ratio
            deviation = abs(actual_ratio - ideal_ratio)
            penalty = deviation * 100
            score -= min(penalty, 20)  # Max 20-point penalty per role
        return max(0, score)

    def _create_growth_timeline(self, current: Dict, targets: Dict) -> List[Dict]:
        """Create quarterly growth timeline"""
        current_headcount = current.get('headcount', 0)
        target_headcount = targets.get('target_headcount', current_headcount)
        timeline_quarters = targets.get('timeline_quarters', 4)
        growth_needed = target_headcount - current_headcount

        timeline = []
        for quarter in range(1, timeline_quarters + 1):
            # Apply Brooks' Law - diminishing returns with rapid growth
            if quarter == 1:
                quarterly_growth = math.ceil(growth_needed * 0.4)  # Front-load hiring
            else:
                remaining_growth = target_headcount - current_headcount
                quarters_left = timeline_quarters - quarter + 1
                quarterly_growth = math.ceil(remaining_growth / quarters_left)

            # Adjust for onboarding capacity (25% growth per quarter max)
            max_onboarding = math.ceil(current_headcount * 0.25)
            quarterly_growth = min(quarterly_growth, max_onboarding)
            current_headcount += quarterly_growth

            timeline.append({
                'quarter': f'Q{quarter}',
                'headcount': current_headcount,
                'new_hires': quarterly_growth,
                'onboarding_capacity': max_onboarding,
                'productivity_factor': 1.0 - (0.2 * (quarterly_growth / max(current_headcount, 1)))
            })
        return timeline

    def _generate_hiring_plan(self, current: Dict, targets: Dict) -> Dict:
        """Generate detailed hiring plan"""
        current_roles = current.get('roles', {})
        target_headcount = targets.get('target_headcount', 0)

        hiring_plan = {
            'total_hires_needed': target_headcount - current.get('headcount', 0),
            'by_role': {},
            'by_quarter': {},
            'interview_capacity_needed': 0,
            'recruiting_resources': 0
        }

        # Calculate ideal role distribution
        for role, ideal_ratio in self.role_ratios.items():
            ideal_count = math.ceil(target_headcount * ideal_ratio)
            current_count = current_roles.get(role, 0)
            hires_needed = max(0, ideal_count - current_count)
            if hires_needed > 0:
                hiring_plan['by_role'][role] = {
                    'current': current_count,
                    'target': ideal_count,
                    'hires_needed': hires_needed,
                    'priority': self._get_role_priority(role, current_roles, target_headcount)
                }

        # Distribute hires across quarters
        timeline = self._create_growth_timeline(current, targets)
        for quarter_data in timeline:
            quarter = quarter_data['quarter']
            hires = quarter_data['new_hires']
            hiring_plan['by_quarter'][quarter] = {
                'total_hires': hires,
                'breakdown': self._distribute_quarterly_hires(hires, hiring_plan['by_role'])
            }

        # Calculate interview capacity (5 interviews per hire average)
        hiring_plan['interview_capacity_needed'] = hiring_plan['total_hires_needed'] * 5

        # Calculate recruiting resources (1 recruiter per 50 hires/year)
        annual_hires = hiring_plan['total_hires_needed'] * (
            4 / max(targets.get('timeline_quarters', 4), 1)
        )
        hiring_plan['recruiting_resources'] = math.ceil(annual_hires / 50)

        return hiring_plan

    def _get_role_priority(self, role: str, current_roles: Dict, target_size: int) -> int:
        """Determine hiring priority for a role"""
        # Priority based on criticality and current gaps
        priorities = {
            'engineering_manager': 10 if target_size > 20 else 5,
            'tech_lead': 9,
            'senior_engineer': 8,
            'devops': 7 if current_roles.get('devops', 0) == 0 else 5,
            'qa': 6,
            'mid_engineer': 5,
            'product_manager': 6,
            'designer': 5,
            'data_engineer': 4,
            'junior_engineer': 3
        }
        return priorities.get(role, 5)

    def _distribute_quarterly_hires(self, total_hires: int, role_needs: Dict) -> Dict:
        """Distribute quarterly hires across roles"""
        distribution = {}
        # Sort roles by priority
        sorted_roles = sorted(
            role_needs.items(),
            key=lambda x: x[1]['priority'],
            reverse=True
        )
        remaining_hires = total_hires
        for role, needs in sorted_roles:
            if remaining_hires <= 0:
                break
            hires = min(needs['hires_needed'], max(1, remaining_hires // 3))
            distribution[role] = hires
            remaining_hires -= hires
        return distribution

    def _design_team_structure(self, target_headcount: int) -> Dict:
        """Design optimal team structure"""
        stage = self._get_team_stage(target_headcount)
        structure = {
            'organizational_model': self.team_structures[stage]['structure'],
            'teams': [],
            'reporting_structure': {},
            'communication_paths': 0
        }

        if stage == 'startup':
            structure['teams'] = [{
                'name': 'Core Team',
                'size': target_headcount,
                'focus': 'Full-stack'
            }]
        elif stage == 'growth':
            # Create 2-4 teams
            team_size = 6
            num_teams = math.ceil(target_headcount / team_size)
            structure['teams'] = [
                {
                    'name': f'Team {i + 1}',
                    'size': team_size,
                    'focus': ['Platform', 'Product', 'Infrastructure', 'Growth'][i % 4]
                }
                for i in range(num_teams)
            ]
        elif stage == 'scale':
            # Create departments with multiple teams
            structure['departments'] = [
                {'name': 'Platform', 'teams': 3, 'headcount': target_headcount * 0.3},
                {'name': 'Product', 'teams': 4, 'headcount': target_headcount * 0.4},
                {'name': 'Infrastructure', 'teams': 2, 'headcount': target_headcount * 0.2},
                {'name': 'Data', 'teams': 1, 'headcount': target_headcount * 0.1}
            ]

        # Calculate communication paths (n * (n - 1) / 2)
        structure['communication_paths'] = (target_headcount * (target_headcount - 1)) // 2

        # Add management layers
        structure['management_layers'] = math.ceil(math.log(target_headcount, 7))

        return structure

    def _calculate_budget(self, hiring_plan: Dict, location: str) -> Dict:
        """Calculate budget projection"""
        # Average salaries by role and location (in USD)
        salary_bands = {
            'US': {
                'engineering_manager': 200000,
                'tech_lead': 180000,
                'senior_engineer': 160000,
                'mid_engineer': 120000,
                'junior_engineer': 85000,
                'devops': 150000,
                'qa': 100000,
                'product_manager': 150000,
                'designer': 120000,
                'data_engineer': 140000
            },
            'EU': {
                'engineering_manager': 160000,
                'tech_lead': 144000,
                'senior_engineer': 128000,
                'mid_engineer': 96000,
                'junior_engineer': 68000,
                'devops': 120000,
                'qa': 80000,
                'product_manager': 120000,
                'designer': 96000,
                'data_engineer': 112000
            },
            'APAC': {
                'engineering_manager': 120000,
                'tech_lead': 108000,
                'senior_engineer': 96000,
                'mid_engineer': 72000,
                'junior_engineer': 51000,
                'devops': 90000,
                'qa': 60000,
                'product_manager': 90000,
                'designer': 72000,
                'data_engineer': 84000
            }
        }
        location_salaries = salary_bands.get(location, salary_bands['US'])

        budget = {
            'annual_salary_cost': 0,
            'benefits_cost': 0,    # 30% of salary
            'equipment_cost': 0,   # $5k per hire
            'recruiting_cost': 0,  # 20% of first-year salary
            'onboarding_cost': 0,  # $10k per hire
            'total_cost': 0,
            'cost_per_hire': 0
        }

        for role, details in hiring_plan['by_role'].items():
            hires = details['hires_needed']
            salary = location_salaries.get(role, 100000)
            budget['annual_salary_cost'] += hires * salary
            budget['recruiting_cost'] += hires * salary * 0.2

        budget['benefits_cost'] = budget['annual_salary_cost'] * 0.3
        budget['equipment_cost'] = hiring_plan['total_hires_needed'] * 5000
        budget['onboarding_cost'] = hiring_plan['total_hires_needed'] * 10000
        budget['total_cost'] = sum([
            budget['annual_salary_cost'],
            budget['benefits_cost'],
            budget['equipment_cost'],
            budget['recruiting_cost'],
            budget['onboarding_cost']
        ])
        if hiring_plan['total_hires_needed'] > 0:
            budget['cost_per_hire'] = budget['total_cost'] / hiring_plan['total_hires_needed']
        return budget

    def _assess_scaling_risks(self, current: Dict, targets: Dict) -> List[Dict]:
        """Assess risks in scaling plan"""
        risks = []
        growth_rate = (targets['target_headcount'] - current['headcount']) / max(current['headcount'], 1)

        if growth_rate > 1.0:  # More than 100% growth
            risks.append({
                'risk': 'Rapid growth dilution',
                'impact': 'High',
                'mitigation': 'Implement strong onboarding and mentorship programs'
            })
        if current.get('attrition_rate', 0) > 15:
            risks.append({
                'risk': 'High attrition during scaling',
                'impact': 'High',
                'mitigation': 'Address retention issues before aggressive hiring'
            })
        if targets.get('timeline_quarters', 4) < 4:
            risks.append({
                'risk': 'Compressed timeline',
                'impact': 'Medium',
                'mitigation': 'Consider extending timeline or increasing recruiting resources'
            })
        return risks

    def _generate_recommendations(self, results: Dict) -> List[str]:
        """Generate scaling recommendations"""
        recommendations = []

        # Based on growth rate
        total_hires = results['hiring_plan']['total_hires_needed']
        current_size = results['current_analysis']['total_headcount']
        if current_size > 0:
            growth_rate = total_hires / current_size
            if growth_rate > 0.5:
                recommendations.append('Consider hiring a dedicated recruiting team')
                recommendations.append('Implement scalable onboarding processes')
                recommendations.append('Establish clear team charters and boundaries')
            if growth_rate > 1.0:
                recommendations.append('⚠️ High growth risk - consider slowing timeline')
                recommendations.append('Focus on senior hires first to establish culture')
                recommendations.append('Implement continuous integration practices early')

        # Based on structure
        if results['team_structure']['communication_paths'] > 1000:
            recommendations.append('Implement clear communication channels and tools')
            recommendations.append('Consider platform teams to reduce dependencies')

        # Based on balance
        if results['current_analysis']['balance_score'] < 70:
            recommendations.append('Prioritize hiring for underrepresented roles')
            recommendations.append('Consider role rotation for skill development')

        return recommendations


def calculate_team_scaling(current_state: Dict, growth_targets: Dict) -> str:
    """Main function to calculate team scaling"""
    calculator = TeamScalingCalculator()
    results = calculator.calculate_scaling_plan(current_state, growth_targets)

    # Format output
    output = [
        "=== Engineering Team Scaling Plan ===",
        "",
        "Current State Analysis:",
        f"  Current Headcount: {results['current_analysis']['total_headcount']}",
        f"  Team Stage: {results['current_analysis']['team_stage']}",
        f"  Productivity Index: {results['current_analysis']['productivity_index']:.1f}%",
        f"  Team Balance Score: {results['current_analysis']['balance_score']:.1f}/100",
        "",
        "Growth Plan:",
        f"  Target Headcount: {growth_targets['target_headcount']}",
        f"  Total Hires Needed: {results['hiring_plan']['total_hires_needed']}",
        f"  Timeline: {growth_targets['timeline_quarters']} quarters",
        "",
        "Quarterly Timeline:"
    ]
    for quarter in results['growth_timeline']:
        output.append(
            f"  {quarter['quarter']}: {quarter['headcount']} total "
            f"(+{quarter['new_hires']} hires, "
            f"{quarter['productivity_factor']:.0%} productivity)"
        )
    output.extend(["", "Hiring Priorities:"])
    sorted_roles = sorted(
        results['hiring_plan']['by_role'].items(),
        key=lambda x: x[1]['priority'],
        reverse=True
    )
    for role, details in sorted_roles[:5]:
        output.append(
            f"  {role}: {details['hires_needed']} hires "
            f"(Priority: {details['priority']}/10)"
        )
    output.extend([
        "",
        "Budget Projection:",
        f"  Annual Salary Cost: ${results['budget_projection']['annual_salary_cost']:,.0f}",
        f"  Total Investment: ${results['budget_projection']['total_cost']:,.0f}",
        f"  Cost per Hire: ${results['budget_projection']['cost_per_hire']:,.0f}",
        "",
        "Team Structure:",
        f"  Model: {results['team_structure']['organizational_model']}",
        f"  Management Layers: {results['team_structure']['management_layers']}",
        f"  Communication Paths: {results['team_structure']['communication_paths']:,}",
        "",
        "Key Recommendations:"
    ])
    for rec in results['recommendations']:
        output.append(f"  • {rec}")
    return '\n'.join(output)


if __name__ == "__main__":
    import argparse

    parser = argparse.ArgumentParser(
        description="Engineering Team Scaling Calculator - Optimize team growth and structure"
    )
    parser.add_argument(
        "input_file", nargs="?", default=None,
        help="JSON file with current_state and growth_targets (default: run with sample data)"
    )
    parser.add_argument(
        "--json", action="store_true",
        help="Output raw JSON instead of formatted report"
    )
    args = parser.parse_args()

    if args.input_file:
        with open(args.input_file) as f:
            data = json.load(f)
        current_state = data["current_state"]
        growth_targets = data["growth_targets"]
    else:
        current_state = {
            'headcount': 25,
            'velocity': 450,
            'roles': {
                'engineering_manager': 2,
                'tech_lead': 3,
                'senior_engineer': 8,
                'mid_engineer': 10,
                'junior_engineer': 2
            },
            'attrition_rate': 12,
            'location': 'US'
        }
        growth_targets = {
            'target_headcount': 75,
            'timeline_quarters': 4
        }

    if args.json:
        calculator = TeamScalingCalculator()
        results = calculator.calculate_scaling_plan(current_state, growth_targets)
        print(json.dumps(results, indent=2))
    else:
        print(calculate_team_scaling(current_state, growth_targets))
```
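The communication-path formula the calculator uses, n × (n − 1) / 2, is the core argument for splitting large teams: paths grow quadratically with headcount. A minimal standalone sketch of the same formula:

```python
# Communication paths grow quadratically with headcount: n * (n - 1) / 2.
# This mirrors the formula used in _design_team_structure above.
def communication_paths(n: int) -> int:
    return n * (n - 1) // 2

for n in (5, 10, 25, 75):
    print(f"{n} people -> {communication_paths(n)} paths")
# A 5-person team has 10 paths; at 75 people there are 2,775.
```

This is why the calculator flags structures above ~1,000 paths and recommends platform teams: past that point, explicit interfaces between teams are cheaper than ad hoc coordination.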
```python
#!/usr/bin/env python3
"""
Technical Debt Analyzer - Assess and prioritize technical debt across systems
"""
import math
from datetime import datetime
from typing import Dict, List


class TechDebtAnalyzer:
    def __init__(self):
        self.debt_categories = {
            'architecture': {
                'weight': 0.25,
                'indicators': [
                    'monolithic_design', 'tight_coupling', 'no_microservices',
                    'legacy_patterns', 'no_api_gateway', 'synchronous_only'
                ]
            },
            'code_quality': {
                'weight': 0.20,
                'indicators': [
                    'low_test_coverage', 'high_complexity', 'code_duplication',
                    'no_documentation', 'inconsistent_standards', 'legacy_language'
                ]
            },
            'infrastructure': {
                'weight': 0.20,
                'indicators': [
                    'manual_deployments', 'no_ci_cd', 'single_points_failure',
                    'no_monitoring', 'no_auto_scaling', 'outdated_servers'
                ]
            },
            'security': {
                'weight': 0.20,
                'indicators': [
                    'outdated_dependencies', 'no_security_scans', 'plain_text_secrets',
                    'no_encryption', 'missing_auth', 'no_audit_logs'
                ]
            },
            'performance': {
                'weight': 0.15,
                'indicators': [
                    'slow_response_times', 'no_caching', 'inefficient_queries',
                    'memory_leaks', 'no_optimization', 'blocking_operations'
                ]
            }
        }
        self.impact_matrix = {
            'user_impact': {'weight': 0.30, 'score': 0},
            'developer_velocity': {'weight': 0.25, 'score': 0},
            'system_reliability': {'weight': 0.20, 'score': 0},
            'scalability': {'weight': 0.15, 'score': 0},
            'maintenance_cost': {'weight': 0.10, 'score': 0}
        }

    def analyze_system(self, system_data: Dict) -> Dict:
        """Analyze a system for technical debt"""
        results = {
            'timestamp': datetime.now().isoformat(),
            'system_name': system_data.get('name', 'Unknown'),
            'debt_score': 0,
            'debt_level': '',
            'category_scores': {},
            'prioritized_actions': [],
            'estimated_effort': {},
            'risk_assessment': {},
            'recommendations': []
        }

        # Calculate debt scores by category
        total_debt_score = 0
        for category, config in self.debt_categories.items():
            category_score = self._calculate_category_score(
                system_data.get(category, {}),
                config['indicators']
            )
            weighted_score = category_score * config['weight']
            results['category_scores'][category] = {
                'raw_score': category_score,
                'weighted_score': weighted_score,
                'level': self._get_level(category_score)
            }
            total_debt_score += weighted_score

        results['debt_score'] = round(total_debt_score, 2)
        results['debt_level'] = self._get_level(total_debt_score)

        # Calculate impact and prioritize
        results['prioritized_actions'] = self._prioritize_actions(
            results['category_scores'],
            system_data.get('business_context', {})
        )

        # Estimate effort
        results['estimated_effort'] = self._estimate_effort(
            results['prioritized_actions'],
            system_data.get('team_size', 5)
        )

        # Risk assessment
        results['risk_assessment'] = self._assess_risks(
            results['debt_score'],
            system_data.get('system_criticality', 'medium')
        )

        # Generate recommendations
        results['recommendations'] = self._generate_recommendations(results)

        return results

    def _calculate_category_score(self, category_data: Dict, indicators: List) -> float:
        """Calculate score for a specific category"""
        if not category_data:
            return 50.0  # Default middle score if no data
        total_score = 0
        count = 0
        for indicator in indicators:
            if indicator in category_data:
                # Score from 0 (no debt) to 100 (high debt)
                total_score += category_data[indicator]
                count += 1
        return (total_score / count) if count > 0 else 50.0

    def _get_level(self, score: float) -> str:
        """Convert numerical score to level"""
        if score < 20:
            return 'Low'
        elif score < 40:
            return 'Medium-Low'
        elif score < 60:
            return 'Medium'
        elif score < 80:
            return 'Medium-High'
        else:
            return 'Critical'

    def _prioritize_actions(self, category_scores: Dict, business_context: Dict) -> List:
        """Prioritize technical debt reduction actions"""
        actions = []
        for category, scores in category_scores.items():
            if scores['raw_score'] > 60:  # Focus on high-debt areas
                priority = self._calculate_priority(
                    scores['raw_score'],
                    category,
                    business_context
                )
                actions.append({
                    'category': category,
                    'priority': priority,
                    'score': scores['raw_score'],
                    'action_items': self._get_action_items(category, scores['level'])
                })
        # Sort by priority
        actions.sort(key=lambda x: x['priority'], reverse=True)
        return actions[:5]  # Top 5 priorities

    def _calculate_priority(self, score: float, category: str, context: Dict) -> float:
        """Calculate priority based on score and business context"""
        base_priority = score
        # Adjust based on business context
        if context.get('growth_phase') == 'rapid' and category in ['scalability', 'performance']:
            base_priority *= 1.5
        if context.get('compliance_required') and category == 'security':
            base_priority *= 2.0
        if context.get('cost_pressure') and category == 'infrastructure':
            base_priority *= 1.3
        return min(100, base_priority)

    def _get_action_items(self, category: str, level: str) -> List[str]:
        """Get specific action items based on category and level"""
        actions = {
            'architecture': {
                'Critical': [
                    'Immediate: Create architecture migration roadmap',
                    'Week 1: Identify service boundaries for decomposition',
                    'Month 1: Begin extracting first microservice',
                    'Month 2: Implement API gateway',
                    'Quarter: Complete critical service separation'
                ],
                'Medium-High': [
                    'Month 1: Document current architecture',
                    'Month 2: Design target architecture',
                    'Quarter: Begin gradual migration',
                    'Monitor: Track coupling metrics'
                ]
            },
            'code_quality': {
                'Critical': [
                    'Immediate: Implement code quality gates',
                    'Week 1: Set up automated testing pipeline',
                    'Month 1: Achieve 40% test coverage',
                    'Month 2: Refactor critical modules',
                    'Quarter: Reach 70% test coverage'
                ],
                'Medium-High': [
                    'Month 1: Establish coding standards',
                    'Month 2: Implement code review process',
                    'Quarter: Gradual refactoring plan'
                ]
            },
            'infrastructure': {
                'Critical': [
                    'Immediate: Implement basic CI/CD',
                    'Week 1: Set up monitoring and alerts',
                    'Month 1: Automate critical deployments',
                    'Month 2: Implement disaster recovery',
                    'Quarter: Full infrastructure as code'
                ],
                'Medium-High': [
                    'Month 1: Document infrastructure',
                    'Month 2: Begin automation',
                    'Quarter: Modernize critical components'
                ]
            },
            'security': {
                'Critical': [
                    'Immediate: Security audit and patching',
                    'Week 1: Implement secrets management',
                    'Month 1: Set up vulnerability scanning',
                    'Month 2: Implement security training',
                    'Quarter: Achieve compliance standards'
                ],
                'Medium-High': [
                    'Month 1: Security assessment',
                    'Month 2: Implement security tools',
                    'Quarter: Regular security reviews'
                ]
            },
            'performance': {
                'Critical': [
                    'Immediate: Performance profiling',
                    'Week 1: Implement caching strategy',
                    'Month 1: Optimize database queries',
                    'Month 2: Implement CDN',
                    'Quarter: Re-architect bottlenecks'
                ],
                'Medium-High': [
                    'Month 1: Performance baseline',
                    'Month 2: Optimization plan',
                    'Quarter: Incremental improvements'
                ]
            }
        }
        return actions.get(category, {}).get(level, ['Create action plan'])

    def _estimate_effort(self, actions: List, team_size: int) -> Dict:
        """Estimate effort required for debt reduction"""
        total_story_points = 0
        effort_breakdown = {}
        for action in actions:
            # Estimate based on category and score
            base_points = action['score'] * 2  # Higher debt = more effort
            if action['category'] == 'architecture':
                points = base_points * 1.5  # Architecture changes are complex
            elif action['category'] == 'security':
                points = base_points * 1.2  # Security requires careful work
            else:
                points = base_points
            effort_breakdown[action['category']] = {
                'story_points': round(points),
                'sprints': math.ceil(points / (team_size * 20)),  # 20 points per dev per sprint
                'developers_needed': math.ceil(points / 100)
            }
            total_story_points += points
        return {
            'total_story_points': round(total_story_points),
            'estimated_sprints': math.ceil(total_story_points / (team_size * 20)),
            'recommended_team_size': max(team_size, math.ceil(total_story_points / 200)),
            'breakdown': effort_breakdown
        }

    def _assess_risks(self, debt_score: float, criticality: str) -> Dict:
        """Assess risks associated with technical debt"""
        risk_level = 'Low'
        if debt_score > 70 and criticality == 'high':
            risk_level = 'Critical'
        elif debt_score > 60 or criticality == 'high':
            risk_level = 'High'
        elif debt_score > 40:
            risk_level = 'Medium'

        risks = {
            'overall_risk': risk_level,
            'specific_risks': []
        }
        if debt_score > 60:
            risks['specific_risks'].extend([
                'System failure risk increasing',
                'Developer productivity declining',
                'Innovation velocity blocked',
                'Maintenance costs escalating'
            ])
        if debt_score > 80:
            risks['specific_risks'].extend([
                'Competitive disadvantage emerging',
                'Talent retention risk',
                'Customer satisfaction impact',
                'Potential data breach vulnerability'
            ])
        return risks

    def _generate_recommendations(self, results: Dict) -> List[str]:
        """Generate strategic recommendations"""
        recommendations = []

        # Overall strategy based on debt level
        if results['debt_level'] == 'Critical':
            recommendations.append('🚨 URGENT: Dedicate 40% of engineering capacity to debt reduction')
            recommendations.append('Create dedicated debt reduction team')
            recommendations.append('Implement weekly debt reduction reviews')
            recommendations.append('Consider temporary feature freeze')
        elif results['debt_level'] in ['Medium-High', 'High']:
            recommendations.append('Allocate 25-30% of sprints to debt reduction')
            recommendations.append('Establish technical debt budget')
            recommendations.append('Implement debt prevention practices')
        else:
            recommendations.append('Maintain 15-20% ongoing debt reduction allocation')
            recommendations.append('Focus on prevention over correction')

        # Category-specific recommendations
        for category, scores in results['category_scores'].items():
            if scores['raw_score'] > 70:
                if category == 'architecture':
                    recommendations.append('Consider hiring architecture specialist')
                elif category == 'security':
                    recommendations.append('Engage security audit firm')
                elif category == 'performance':
                    recommendations.append('Implement performance SLA monitoring')

        # Team recommendations
        effort = results.get('estimated_effort', {})
        if effort.get('recommended_team_size', 0) > effort.get('total_story_points', 0) / 200:
            recommendations.append(f"Scale team to {effort['recommended_team_size']} engineers")

        return recommendations


def analyze_technical_debt(system_config: Dict) -> str:
    """Main function to analyze technical debt"""
    analyzer = TechDebtAnalyzer()
    results = analyzer.analyze_system(system_config)

    # Format output
    output = [
        "=== Technical Debt Analysis Report ===",
        f"System: {results['system_name']}",
        f"Analysis Date: {results['timestamp'][:10]}",
        "",
        f"OVERALL DEBT SCORE: {results['debt_score']}/100 ({results['debt_level']})",
        "",
        "Category Breakdown:"
    ]
    for category, scores in results['category_scores'].items():
        output.append(f"  {category.title()}: {scores['raw_score']:.1f} ({scores['level']})")
    output.extend([
        "",
        "Risk Assessment:",
        f"  Overall Risk: {results['risk_assessment']['overall_risk']}"
    ])
    for risk in results['risk_assessment']['specific_risks']:
        output.append(f"  • {risk}")
    output.extend([
        "",
        "Effort Estimation:",
        f"  Total Story Points: {results['estimated_effort']['total_story_points']}",
        f"  Estimated Sprints: {results['estimated_effort']['estimated_sprints']}",
        f"  Recommended Team Size: {results['estimated_effort']['recommended_team_size']}",
        "",
        "Top Priority Actions:"
    ])
    for i, action in enumerate(results['prioritized_actions'][:3], 1):
        output.append(f"\n{i}. {action['category'].title()} (Priority: {action['priority']:.0f})")
        for item in action['action_items'][:3]:
            output.append(f"   - {item}")
    output.extend([
        "",
        "Strategic Recommendations:"
    ])
    for rec in results['recommendations']:
        output.append(f"  • {rec}")
    return '\n'.join(output)


if __name__ == "__main__":
    # Example usage
    example_system = {
        'name': 'Legacy E-commerce Platform',
        'architecture': {
            'monolithic_design': 80,
            'tight_coupling': 70,
            'no_microservices': 90,
            'legacy_patterns': 60
        },
        'code_quality': {
            'low_test_coverage': 75,
            'high_complexity': 65,
            'code_duplication': 55
        },
        'infrastructure': {
            'manual_deployments': 70,
            'no_ci_cd': 60,
            'no_monitoring': 40
        },
        'security': {
            'outdated_dependencies': 85,
            'no_security_scans': 70
        },
        'performance': {
            'slow_response_times': 60,
            'no_caching': 50
        },
        'team_size': 8,
        'system_criticality': 'high',
        'business_context': {
            'growth_phase': 'rapid',
            'compliance_required': True,
            'cost_pressure': False
        }
    }
    print(analyze_technical_debt(example_system))
```
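To see the analyzer's scoring model in isolation: the overall debt score is just a weighted sum of per-category raw scores. The weights below mirror TechDebtAnalyzer's category weights; the raw scores are hypothetical examples:

```python
# Weighted debt score, as computed by TechDebtAnalyzer.analyze_system.
# Weights mirror the analyzer's debt_categories; raw scores are hypothetical.
weights = {'architecture': 0.25, 'code_quality': 0.20, 'infrastructure': 0.20,
           'security': 0.20, 'performance': 0.15}
raw_scores = {'architecture': 75, 'code_quality': 65, 'infrastructure': 55,
              'security': 80, 'performance': 50}

debt_score = sum(weights[c] * raw_scores[c] for c in weights)
print(round(debt_score, 2))  # 66.25 -> "Medium-High" on the analyzer's scale
```

Because the weights sum to 1.0, the overall score stays on the same 0-100 scale as the category scores, so the same level thresholds (Low < 20, ..., Critical >= 80) apply to both.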