---
name: cto-advisor
description: Technical leadership guidance for engineering teams, architecture decisions, and technology strategy. Use when assessing technical debt, scaling engineering teams, evaluating technologies, making architecture decisions, establishing engineering metrics, or when user mentions CTO, tech debt, technical debt, team scaling, architecture decisions, technology evaluation, engineering metrics, DORA metrics, or technology strategy.
license: MIT
metadata:
  version: 2.0.0
  author: Alireza Rezvani
  category: c-level
  domain: cto-leadership
  updated: 2026-03-05
  python-tools: tech_debt_analyzer.py, team_scaling_calculator.py
  frameworks: architecture-decisions, engineering-metrics, technology-evaluation
---
# CTO Advisor

Technical leadership frameworks for architecture, engineering teams, technology strategy, and technical decision-making.

## Keywords

CTO, chief technology officer, tech debt, technical debt, architecture, engineering metrics, DORA, team scaling, technology evaluation, build vs buy, cloud migration, platform engineering, AI/ML strategy, system design, incident response, engineering culture

## Quick Start

```bash
python scripts/tech_debt_analyzer.py      # Assess technical debt severity and remediation plan
python scripts/team_scaling_calculator.py # Model engineering team growth and cost
```
## Core Responsibilities

### 1. Technology Strategy

Align technology investments with business priorities.

Strategy components:

- Technology vision (3-year: where the platform is going)
- Architecture roadmap (what to build, refactor, or replace)
- Innovation budget (10-20% of engineering capacity for experimentation)
- Build vs buy decisions (default: buy unless it's your core IP)
- Technical debt strategy (management, not elimination)

See references/technology_evaluation_framework.md for the full evaluation framework.
### 2. Engineering Team Leadership

Scale the engineering org's productivity, not individual output.

Scaling engineering:

- Hire for the next stage, not the current one
- Every 3x growth in team size requires a reorg
- Manager:IC ratio: 5-8 direct reports is optimal
- Senior:junior ratio: at least 1:2 (with fewer seniors than that, they drown in mentoring)

Culture:

- Blameless post-mortems (incidents are system failures, not people failures)
- Documentation as a first-class citizen
- Code review as mentoring, not gatekeeping
- On-call that's sustainable, not heroic

See references/engineering_metrics.md for DORA metrics and the engineering health dashboard.
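The ratios above can be turned into a quick sizing sketch. This is an illustrative helper, not the bundled scripts/team_scaling_calculator.py; the 6-reports-per-manager default and the one-senior-per-two-juniors floor are assumptions taken directly from the guidance above.

```python
import math

def org_shape(engineers: int, reports_per_manager: int = 6) -> dict:
    """Rough org shape for a target IC headcount.

    Assumes 5-8 reports per manager (default 6) and a floor of one
    senior for every two juniors, per the ratios above.
    """
    managers = math.ceil(engineers / reports_per_manager)
    seniors = math.ceil(engineers / 3)   # 1:2 senior:junior => seniors are 1/3 of ICs
    juniors = engineers - seniors
    return {"managers": managers, "seniors": seniors, "juniors": juniors}

print(org_shape(45))  # -> {'managers': 8, 'seniors': 15, 'juniors': 30}
```

For 45 engineers this yields 8 managers and a 15:30 senior:junior split, right at the 1:2 floor.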
### 3. Architecture Governance

Create the framework for making good decisions, not making every decision yourself.

Architecture Decision Records (ADRs):

- Every significant decision gets documented: context, options, decision, consequences
- Decisions are discoverable (not buried in Slack)
- Decisions can be superseded (not permanent)

See references/architecture_decision_records.md for ADR templates and the decision review process.

### 4. Vendor Management

Every vendor is a dependency. Every dependency is a risk.

Evaluation criteria: Does it solve a real problem? Can we migrate away? Is the vendor stable? What's the total cost (license + integration + maintenance)?

### 5. Crisis Management

Incident response, security breaches, major outages, data loss.

Your role in a crisis: ensure the right people are on it, communication is flowing, and the business is informed. Post-crisis: run a blameless retrospective within 48 hours.
## Workflows

### Tech Debt Assessment Workflow

**Step 1: Run the analyzer**

```bash
python scripts/tech_debt_analyzer.py --output report.json
```

**Step 2: Interpret results**

The analyzer produces a severity-scored inventory. Review each item against:

- Severity (P0-P3): how much is it blocking velocity or creating risk?
- Cost-to-fix: estimated engineering days to remediate
- Blast radius: how many systems and teams are affected?

**Step 3: Build a prioritized remediation plan**

Sort by (Severity × Blast Radius) / Cost-to-fix; the highest score gets fixed first.
Group items into: (a) immediate sprint, (b) next quarter, (c) tracked backlog.

**Step 4: Validate before presenting to stakeholders**

Example output (Tech Debt Inventory):

| Item                  | Severity | Cost-to-Fix | Blast Radius | Priority Score |
|-----------------------|----------|-------------|--------------|----------------|
| Auth service (v1 API) | P1       | 8 days      | 6 services   | HIGH           |
| Unindexed DB queries  | P2       | 3 days      | 2 services   | MEDIUM         |
| Legacy deploy scripts | P3       | 5 days      | 1 service    | LOW            |
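The Step 3 formula can be applied directly to the inventory. A minimal sketch; the numeric P0-P3 weights are assumptions for illustration, not part of the analyzer's output format:

```python
# Severity weights are an assumption (P0 most severe).
SEVERITY_WEIGHT = {"P0": 4, "P1": 3, "P2": 2, "P3": 1}

def priority_score(severity: str, blast_radius: int, cost_days: float) -> float:
    # (Severity x Blast Radius) / Cost-to-fix, as in Step 3
    return SEVERITY_WEIGHT[severity] * blast_radius / cost_days

items = [
    ("Auth service (v1 API)", "P1", 6, 8),
    ("Unindexed DB queries", "P2", 2, 3),
    ("Legacy deploy scripts", "P3", 1, 5),
]
ranked = sorted(items, key=lambda i: priority_score(*i[1:]), reverse=True)
for name, sev, radius, days in ranked:
    print(f"{name}: {priority_score(sev, radius, days):.2f}")
```

The ranking reproduces the HIGH/MEDIUM/LOW ordering in the example table.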
### ADR Creation Workflow

**Step 1: Identify the decision**

Trigger an ADR when the decision affects more than one team, is hard to reverse, or has cost/risk implications greater than one sprint of effort.

**Step 2: Draft the ADR**

Use the template from references/architecture_decision_records.md:

```text
Title: [Short noun phrase]
Status: Proposed | Accepted | Superseded
Context: What is the problem? What constraints exist?
Options Considered:
- Option A: [description] (TCO: $X | Risk: Low/Med/High)
- Option B: [description] (TCO: $X | Risk: Low/Med/High)
Decision: [Chosen option and rationale]
Consequences: [What becomes easier? What becomes harder?]
```

**Step 3: Validation checkpoint (before finalizing)**

**Step 4: Communicate and close**

Share the accepted ADR in the engineering all-hands or architecture sync, and link it from the relevant service's README.
### Build vs Buy Analysis Workflow

**Step 1: Define requirements (functional + non-functional)**

**Step 2: Identify candidate vendors or internal build scope**

**Step 3: Score each option**

| Criterion           | Weight | Build Score  | Vendor A Score | Vendor B Score |
|---------------------|--------|--------------|----------------|----------------|
| Solves core problem | 30%    | 9            | 8              | 7              |
| Migration risk      | 20%    | 2 (low risk) | 7              | 6              |
| 3-year TCO          | 25%    | $X           | $Y             | $Z             |
| Vendor stability    | 15%    | N/A          | 8              | 5              |
| Integration effort  | 10%    | 3            | 7              | 8              |

**Step 4: Apply the default rule.** Buy unless it is core IP or no vendor meets ≥ 70% of requirements.

**Step 5: Document the decision as an ADR** (see the ADR Creation Workflow above).
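The scoring table above collapses into one weighted score per option. A sketch assuming every criterion, including TCO, has first been normalized to a 0-10 score (raw dollar figures can't be summed into a weighted total); the criterion keys and Vendor A's scores are illustrative:

```python
def weighted_score(scores: dict, weights: dict) -> float:
    """Weighted sum over criteria; weights must total 100%."""
    assert abs(sum(weights.values()) - 1.0) < 1e-9, "weights must sum to 1.0"
    return sum(weights[c] * scores[c] for c in weights)

weights = {"core": 0.30, "migration": 0.20, "tco": 0.25, "vendor": 0.15, "integration": 0.10}
vendor_a = {"core": 8, "migration": 7, "tco": 6, "vendor": 8, "integration": 7}
print(round(weighted_score(vendor_a, weights), 2))  # -> 7.2
```

Comparing these totals across build, Vendor A, and Vendor B makes the Step 4 default rule mechanical.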
## Key Questions a CTO Asks

- "What's our biggest technical risk right now? Not the most annoying, the most dangerous."
- "If we 10x our traffic tomorrow, what breaks first?"
- "How much of our engineering time goes to maintenance vs new features?"
- "What would a new engineer say about our codebase after their first week?"
- "Which technical decision from 2 years ago is hurting us most today?"
- "Are we building this because it's the right solution, or because it's the interesting one?"
- "What's our bus factor on critical systems?"
## CTO Metrics Dashboard

| Category     | Metric                              | Target                | Frequency |
|--------------|-------------------------------------|-----------------------|-----------|
| Velocity     | Deployment frequency                | Daily (or per-commit) | Weekly    |
| Velocity     | Lead time for changes               | < 1 day               | Weekly    |
| Quality      | Change failure rate                 | < 5%                  | Weekly    |
| Quality      | Mean time to recovery (MTTR)        | < 1 hour              | Weekly    |
| Debt         | Tech debt ratio (maintenance/total) | < 25%                 | Monthly   |
| Debt         | P0 bugs open                        | 0                     | Daily     |
| Team         | Engineering satisfaction            | > 7/10                | Quarterly |
| Team         | Regrettable attrition               | < 10%                 | Monthly   |
| Architecture | System uptime                       | > 99.9%               | Monthly   |
| Architecture | API response time (p95)             | < 200ms               | Weekly    |
| Cost         | Cloud spend / revenue ratio         | Declining trend       | Monthly   |
## Red Flags

- Tech debt ratio > 30% and growing faster than it's being paid down
- Deployment frequency declining over 4+ weeks
- No ADRs for the last 3 major decisions
- The CTO is the only person who can deploy to production
- Build times exceed 10 minutes
- Single points of failure on critical systems with no mitigation plan
- The team dreads on-call rotation
## Integration with C-Suite Roles

| When…                | CTO works with…  | To…                                            |
|----------------------|------------------|------------------------------------------------|
| Roadmap planning     | CPO              | Align technical and product roadmaps           |
| Hiring engineers     | CHRO             | Define roles, comp bands, hiring criteria      |
| Budget planning      | CFO              | Cloud costs, tooling, headcount budget         |
| Security posture     | CISO             | Architecture review, compliance requirements   |
| Scaling operations   | COO              | Infrastructure capacity vs growth plans        |
| Revenue commitments  | CRO              | Technical feasibility of enterprise deals      |
| Technical marketing  | CMO              | Developer relations, technical content         |
| Strategic decisions  | CEO              | Technology as competitive advantage            |
| Hard calls           | Executive Mentor | "Should we rewrite?" "Should we switch stacks?" |
## Proactive Triggers

Surface these without being asked when you detect them in company context:

- Deployment frequency dropping → early signal of team health issues
- Tech debt ratio > 30% → recommend a tech debt sprint
- No ADRs filed in 30+ days → architecture decisions going undocumented
- Single point of failure on a critical system → flag bus factor risk
- Cloud costs growing faster than revenue → cost optimization review
- Security audit overdue (> 12 months) → escalate to CISO
## Output Artifacts

| Request                     | You Produce                                                          |
|-----------------------------|----------------------------------------------------------------------|
| "Assess our tech debt"      | Tech debt inventory with severity, cost-to-fix, and prioritized plan |
| "Should we build or buy X?" | Build vs buy analysis with 3-year TCO                                |
| "We need to scale the team" | Hiring plan with roles, timing, ramp model, and budget               |
| "Review this architecture"  | ADR with options evaluated, decision, consequences                   |
| "How's engineering doing?"  | Engineering health dashboard (DORA + debt + team)                    |
## Reasoning Technique: ReAct (Reason, then Act)

Research the technical landscape first. Analyze options against constraints (time, team skill, cost, risk). Then recommend action. Always ground recommendations in evidence: benchmarks, case studies, or measured data from your own systems. "I think" is not enough; show the data.

## Communication

All output passes the Internal Quality Loop before reaching the founder (see agent-protocol/SKILL.md).

- Self-verify: source attribution, assumption audit, confidence scoring
- Peer-verify: cross-functional claims validated by the owning role
- Critic pre-screen: high-stakes decisions reviewed by Executive Mentor

Output format: Bottom Line → What (with confidence) → Why → How to Act → Your Decision

Results only. Every finding tagged: 🟢 verified, 🟡 medium, 🔴 assumed.
## Context Integration

- Always read company-context.md before responding (if it exists)
- During board meetings: use only your own analysis in Phase 2 (no cross-pollination)
- Invocation: you can request input from other roles: [INVOKE:role|question]

## Resources

- references/technology_evaluation_framework.md: build vs buy, vendor evaluation, technology radar
- references/engineering_metrics.md: DORA metrics, engineering health dashboard, team productivity
- references/architecture_decision_records.md: ADR templates, decision governance, review process
# Architecture Decision Records (ADR) Framework

## What is an ADR?

Architecture Decision Records capture important architectural decisions along with their context and consequences. They preserve institutional knowledge and explain why systems are built the way they are.

## ADR Template

### ADR-[NUMBER]: [TITLE]

**Date**: YYYY-MM-DD
**Status**: [Proposed | Accepted | Deprecated | Superseded]
**Deciders**: [List of people involved in decision]
**Technical Story**: [Ticket/Issue reference]
#### Context and Problem Statement

[Describe the context and problem that needs to be solved. What are we trying to achieve?]

#### Decision Drivers

- [Driver 1: e.g., Performance requirements]
- [Driver 2: e.g., Time to market]
- [Driver 3: e.g., Team expertise]
- [Driver 4: e.g., Cost constraints]

#### Considered Options

1. Option 1: [Name]
2. Option 2: [Name]
3. Option 3: [Name]

#### Decision Outcome

Chosen option: "[Option Name]", because [justification].

##### Positive Consequences

- [Consequence 1]
- [Consequence 2]

##### Negative Consequences

- [Risk 1 and mitigation]
- [Risk 2 and mitigation]

#### Pros and Cons of Options

##### Option 1: [Name]

Pros:

- [Advantage 1]
- [Advantage 2]

Cons:

- [Disadvantage 1]
- [Disadvantage 2]

##### Option 2: [Name]

[Repeat structure]

#### Links

- [Related ADRs]
- [Documentation]
- [Research/PoCs]
## Example ADRs

### ADR-001: Microservices Architecture

**Date**: 2024-01-15
**Status**: Accepted
**Deciders**: CTO, VP Engineering, Tech Leads
**Technical Story**: ARCH-001

#### Context and Problem Statement

Our monolithic application is becoming difficult to scale and deploy. Different teams are stepping on each other's toes, and deployment cycles are getting longer. We need to decide on our architectural approach for the next 3-5 years.

#### Decision Drivers

- Need for independent team deployment
- Requirement to scale different components independently
- Different components have different performance characteristics
- Team size growing from 25 to 75+ engineers
- Need to support multiple technology stacks

#### Considered Options

1. Keep Monolith: continue with current architecture
2. Modular Monolith: break into modules but keep a single deployment
3. Microservices: full service-oriented architecture
4. Serverless: Function-as-a-Service approach

#### Decision Outcome

Chosen option: "Microservices", because it best supports our team autonomy needs and scaling requirements, despite added complexity.

##### Positive Consequences

- Teams can deploy independently
- Services can scale based on individual needs
- Technology diversity is possible
- Improved fault isolation

##### Negative Consequences

- Increased operational complexity: mitigated by investing in DevOps
- Network latency between services: mitigated by careful service boundaries
- Data consistency challenges: mitigated by event sourcing patterns
### ADR-002: Container Orchestration Platform

**Date**: 2024-02-01
**Status**: Accepted
**Deciders**: CTO, DevOps Lead, Platform Team
**Technical Story**: INFRA-045

#### Context and Problem Statement

With the move to microservices (ADR-001), we need a container orchestration platform to manage deployment, scaling, and operations of application containers.

#### Decision Drivers

- Need for automated deployment and scaling
- High availability requirements (99.9% SLA)
- Multi-cloud strategy (avoid vendor lock-in)
- Team familiarity and ecosystem maturity
- Cost considerations

#### Considered Options

1. Kubernetes: industry standard, self-managed
2. Amazon ECS: AWS-native solution
3. Docker Swarm: simpler alternative
4. Nomad: HashiCorp solution

#### Decision Outcome

Chosen option: "Kubernetes", because of its maturity, ecosystem, and multi-cloud support.

##### Positive Consequences

- Industry standard with a huge ecosystem
- Multi-cloud compatible
- Strong community support
- Extensive tooling available

##### Negative Consequences

- Steep learning curve: mitigated by training and hiring
- Operational complexity: mitigated by managed Kubernetes (EKS/GKE)
### ADR-003: API Gateway Strategy

**Date**: 2024-03-15
**Status**: Accepted
**Deciders**: CTO, Security Lead, API Team
**Technical Story**: API-101

#### Context and Problem Statement

With multiple microservices, we need a unified entry point for external clients that handles cross-cutting concerns like authentication, rate limiting, and monitoring.

#### Decision Drivers

- Security requirements (OAuth2, API keys)
- Need for rate limiting and throttling
- Monitoring and analytics requirements
- Developer experience for API consumers
- Performance (sub-100ms overhead)

#### Considered Options

1. Kong: open-source, plugin ecosystem
2. AWS API Gateway: managed service
3. Istio/Envoy: service mesh approach
4. Build Custom: in-house solution

#### Decision Outcome

Chosen option: "Kong", because of its flexibility and plugin ecosystem while avoiding vendor lock-in.
## Common Architecture Decisions

### 1. Frontend Architecture

- Single Page Application (SPA) vs Server-Side Rendering (SSR) vs Static Site Generation (SSG)
- React vs Vue vs Angular vs Svelte
- Monorepo vs Polyrepo
- Micro-frontends vs Monolithic frontend

### 2. Backend Architecture

- Monolith vs Microservices vs Serverless
- REST vs GraphQL vs gRPC
- Synchronous vs Asynchronous communication
- Event-driven vs Request-response

### 3. Data Architecture

- SQL vs NoSQL vs NewSQL
- Single database vs Database per service
- CQRS vs Traditional CRUD
- Event Sourcing vs State-based storage

### 4. Infrastructure Decisions

- Cloud provider: AWS vs Azure vs GCP vs Multi-cloud
- Containers vs VMs vs Serverless
- Kubernetes vs ECS vs Cloud Run
- Self-hosted vs Managed services

### 5. Development Practices

- Continuous Deployment vs Continuous Delivery
- Feature flags vs Branch-based deployment
- Blue-green vs Canary vs Rolling deployment
- GitFlow vs GitHub Flow vs GitLab Flow
## ADR Best Practices

### Writing Good ADRs

- Keep them short: 1-2 pages maximum
- Be specific: include concrete examples
- Document why, not what: focus on reasoning
- Include all options: even obviously bad ones
- Be honest about drawbacks: every decision has trade-offs

### When to Write ADRs

Write an ADR when:

- The decision has significant impact
- Multiple options were seriously considered
- The decision is hard to reverse
- You find yourself explaining the same decision repeatedly
- There's disagreement about the approach

### ADR Lifecycle

1. Proposed: under discussion
2. Accepted: decision made and being implemented
3. Deprecated: no longer relevant but kept for history
4. Superseded: replaced by another ADR

### Storage and Discovery

- Store ADRs in your main repository under docs/architecture/decisions/
- Use consistent numbering (ADR-001, ADR-002, etc.)
- Create an index file linking all ADRs
- Reference ADRs in code comments where relevant
- Review ADRs regularly (quarterly) for relevance
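A small script can enforce the numbering and storage conventions above. The directory layout and filename scheme follow this guide; the helper itself is an illustrative sketch, not a shipped tool:

```python
from datetime import date
from pathlib import Path

def new_adr(title: str, root: str = "docs/architecture/decisions") -> Path:
    """Create the next sequentially numbered ADR skeleton."""
    adr_dir = Path(root)
    adr_dir.mkdir(parents=True, exist_ok=True)
    next_num = len(list(adr_dir.glob("ADR-*.md"))) + 1
    slug = title.lower().replace(" ", "-")
    path = adr_dir / f"ADR-{next_num:03d}-{slug}.md"
    path.write_text(
        f"# ADR-{next_num:03d}: {title}\n\n"
        f"Date: {date.today().isoformat()}\n"
        "Status: Proposed\n\n"
        "## Context and Problem Statement\n\n"
        "## Considered Options\n\n"
        "## Decision Outcome\n"
    )
    return path
```

For example, `new_adr("API Gateway Strategy")` creates `ADR-001-api-gateway-strategy.md` (or the next free number) with the template headings pre-filled.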
## Decision Evaluation Framework

### Technical Factors (40%)

- Performance impact
- Scalability potential
- Security implications
- Maintainability
- Technical debt

### Business Factors (30%)

- Time to market
- Cost (initial and ongoing)
- Revenue impact
- Competitive advantage
- Regulatory compliance

### Team Factors (30%)

- Current expertise
- Learning curve
- Hiring availability
- Team preference
- Training requirements
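The 40/30/30 split above can be computed as a weighted composite. A sketch assuming each factor is scored 1-5 and each category simply averages its factors; the sample scores are placeholders:

```python
CATEGORY_WEIGHTS = {"technical": 0.40, "business": 0.30, "team": 0.30}

def decision_score(factors: dict) -> float:
    """Average each category's 1-5 factor scores, then apply 40/30/30 weights."""
    total = 0.0
    for category, weight in CATEGORY_WEIGHTS.items():
        scores = list(factors[category].values())
        total += weight * (sum(scores) / len(scores))
    return total

option = {
    "technical": {"performance": 4, "scalability": 5, "security": 3,
                  "maintainability": 4, "tech_debt": 3},
    "business": {"time_to_market": 5, "cost": 3, "revenue": 4,
                 "advantage": 4, "compliance": 5},
    "team": {"expertise": 2, "learning_curve": 3, "hiring": 4,
             "preference": 4, "training": 3},
}
print(round(decision_score(option), 2))  # -> 3.74
```

Comparing composites across options makes the trade-off explicit rather than intuitive.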
## Anti-patterns to Avoid

- Decision by Committee: too many stakeholders leading to compromise solutions
- Analysis Paralysis: over-analyzing instead of deciding
- Resume-Driven Development: choosing tech for personal goals
- Hype-Driven Development: choosing the newest/coolest tech
- Not-Invented-Here: rejecting external solutions by default
- Vendor Lock-in: over-dependence on proprietary solutions
- Premature Optimization: solving problems you don't have yet
- Under-documentation: not capturing the "why" behind decisions

## Review Checklist

Before finalizing an ADR, ensure:
# Engineering Metrics & KPIs Guide

## Metrics Framework

### DORA Metrics (DevOps Research and Assessment)

#### 1. Deployment Frequency

- Definition: how often code is deployed to production
- Targets:
  - Elite: multiple deploys per day
  - High: weekly to monthly
  - Medium: monthly to bi-annually
  - Low: less than bi-annually
- Measurement: deployments per day/week/month
- Improvement: smaller batch sizes, feature flags, CI/CD

#### 2. Lead Time for Changes

- Definition: time from code commit to production
- Targets:
  - Elite: less than 1 hour
  - High: 1 day to 1 week
  - Medium: 1 week to 1 month
  - Low: more than 1 month
- Measurement: median time from commit to deploy
- Improvement: automation, parallel testing, smaller changes

#### 3. Mean Time to Recovery (MTTR)

- Definition: time to restore service after an incident
- Targets:
  - Elite: less than 1 hour
  - High: less than 1 day
  - Medium: 1 day to 1 week
  - Low: more than 1 week
- Measurement: average incident resolution time
- Improvement: monitoring, rollback capability, runbooks

#### 4. Change Failure Rate

- Definition: percentage of changes causing failures
- Targets:
  - Elite: 0-15%
  - High: 16-30%
  - Medium/Low: >30%
- Measurement: failed deploys / total deploys
- Improvement: testing, code review, gradual rollouts
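A helper can map raw deploy counts onto the deployment-frequency bands above. The numeric cutoffs (daily, monthly, roughly every six months) are approximations of the band boundaries, not official DORA thresholds:

```python
def dora_deploy_band(deploys: int, window_days: int) -> str:
    """Map an observed deploy rate onto the bands above.

    Approximate cutoffs: daily or better = Elite, at least
    monthly = High, at least every ~6 months = Medium, else Low.
    """
    per_day = deploys / window_days
    if per_day >= 1:
        return "Elite"
    if per_day >= 1 / 30:
        return "High"
    if per_day >= 1 / 183:
        return "Medium"
    return "Low"

print(dora_deploy_band(96, 30))  # 3.2 deploys/day -> Elite
```

Feeding it a trailing 30- or 90-day window keeps the classification responsive to recent trends.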
## Engineering Productivity Metrics

### Code Quality

| Metric                | Formula                     | Target | Action if Off-Target  |
|-----------------------|-----------------------------|--------|-----------------------|
| Test Coverage         | Covered Lines / Total Lines | >80%   | Add unit tests        |
| Code Review Coverage  | Reviewed PRs / Total PRs    | 100%   | Enforce review policy |
| Technical Debt Ratio  | Debt / Development Time     | <10%   | Dedicate debt sprints |
| Cyclomatic Complexity | Per function/method         | <10    | Refactor complex code |
| Code Duplication      | Duplicate Lines / Total     | <5%    | Extract common code   |
### Development Velocity

| Metric              | Formula               | Target      | Action if Off-Target |
|---------------------|-----------------------|-------------|----------------------|
| Sprint Velocity     | Story Points / Sprint | Stable ±10% | Review estimation    |
| Cycle Time          | Start to Done Time    | <5 days     | Reduce WIP           |
| PR Merge Time       | Open to Merge         | <24 hours   | Smaller PRs          |
| Build Time          | Code to Artifact      | <10 minutes | Optimize pipeline    |
| Test Execution Time | Full Test Suite       | <30 minutes | Parallelize tests    |
### Team Health

| Metric            | Formula               | Target  | Action if Off-Target |
|-------------------|-----------------------|---------|----------------------|
| On-call Incidents | Incidents / Week      | <5      | Improve monitoring   |
| Bug Escape Rate   | Prod Bugs / Release   | <5%     | Improve testing      |
| Unplanned Work    | Unplanned / Total     | <20%    | Better planning      |
| Meeting Time      | Meetings / Total Time | <20%    | Reduce meetings      |
| Focus Time        | Uninterrupted Hours   | >4h/day | Block calendars      |
## Business Impact Metrics

### System Performance

| Metric            | Description         | Target          | Business Impact       |
|-------------------|---------------------|-----------------|-----------------------|
| Uptime            | System availability | 99.9%+          | Revenue protection    |
| Page Load Time    | Time to interactive | <3s             | User retention        |
| API Response Time | P95 latency         | <200ms          | User experience       |
| Error Rate        | Errors / Requests   | <0.1%           | Customer satisfaction |
| Throughput        | Requests / Second   | Per requirement | Scalability           |
### Product Delivery

| Metric                | Description           | Target      | Business Impact        |
|-----------------------|-----------------------|-------------|------------------------|
| Feature Delivery Rate | Features / Quarter    | Per roadmap | Market competitiveness |
| Time to Market        | Idea to Production    | <3 months   | First mover advantage  |
| Customer Defect Rate  | Customer Bugs / Month | <10         | Customer satisfaction  |
| Feature Adoption      | Users / Feature       | >50%        | ROI validation         |
| NPS from Engineering  | Customer Score        | >50         | Product quality        |
## Metrics Dashboards

### Executive Dashboard (Weekly)

```text
┌─────────────────────────────────────┐
│ EXECUTIVE METRICS                   │
├─────────────────────────────────────┤
│ Uptime: 99.97% ✓                    │
│ Sprint Velocity: 142 pts ✓          │
│ Deployment Frequency: 3.2/day ✓     │
│ Lead Time: 4.2 hrs ✓                │
│ MTTR: 47 min ✓                      │
│ Change Failure Rate: 8.3% ✓         │
│                                     │
│ Team Health: 8.2/10                 │
│ Tech Debt Ratio: 12% ⚠              │
│ Feature Delivery: 85% ✓             │
└─────────────────────────────────────┘
```

### Team Dashboard (Daily)

```text
┌─────────────────────────────────────┐
│ TEAM METRICS                        │
├─────────────────────────────────────┤
│ Current Sprint:                     │
│   Completed: 65/100 pts (65%)       │
│   In Progress: 20 pts               │
│   Days Left: 3                      │
│                                     │
│ PR Queue: 8 pending                 │
│ Build Status: ✓ Passing             │
│ Test Coverage: 82.3%                │
│ Open Incidents: 2 (P2, P3)          │
│                                     │
│ On-call Load: 3 pages this week     │
└─────────────────────────────────────┘
```

### Individual Dashboard (Daily)

```text
┌─────────────────────────────────────┐
│ DEVELOPER METRICS                   │
├─────────────────────────────────────┤
│ This Week:                          │
│   PRs Merged: 8                     │
│   Code Reviews: 12                  │
│   Commits: 23                       │
│   Focus Time: 22.5 hrs              │
│                                     │
│ Quality:                            │
│   Test Coverage: 87%                │
│   Code Review Feedback: 95% ✓       │
│   Bug Introduction Rate: 0%         │
└─────────────────────────────────────┘
```

## Implementation Guide
### Phase 1: Foundation (Month 1)

Basic metrics:

- Deployment frequency
- Build success rate
- Uptime/availability
- Team velocity

Tools setup:

- CI/CD instrumentation
- Basic monitoring
- Time tracking

### Phase 2: Quality (Month 2)

Quality metrics:

- Test coverage
- Code review metrics
- Bug rates
- Technical debt

Tool integration:

- Static analysis
- Test reporting
- Code quality gates

### Phase 3: Performance (Month 3)

Performance metrics:

- DORA metrics complete
- System performance
- API metrics
- Database metrics

Advanced monitoring:

- APM tools
- Distributed tracing
- Custom dashboards

### Phase 4: Optimization (Ongoing)

Advanced analytics:

- Predictive metrics
- Trend analysis
- Anomaly detection
- Correlation analysis
## Metric Anti-patterns

### What NOT to Measure

- ❌ Lines of Code: encourages bloat
- ❌ Hours Worked: promotes presenteeism
- ❌ Individual Velocity: creates competition
- ❌ Bug Count Without Context: discourages risk-taking
- ❌ Commit Count: encourages tiny commits

### Goodhart's Law

> "When a measure becomes a target, it ceases to be a good measure."

Examples:

- Optimizing test coverage → writing meaningless tests
- Reducing bug count → not reporting bugs
- Increasing velocity → inflating estimates
- Reducing meeting time → skipping important discussions

### How to Avoid Gaming

1. Use multiple metrics: no single metric tells the whole story
2. Focus on trends, not absolute numbers
3. Combine leading and lagging indicators: balance predictive and historical
4. Review regularly: adjust metrics that are being gamed
5. Team ownership: let teams choose their metrics
## OKR Framework for Engineering

### Company-Level OKRs

Objective: deliver exceptional product quality

Key Results:

- KR1: Achieve 99.95% uptime (from 99.9%)
- KR2: Reduce customer-reported bugs by 50%
- KR3: Improve deployment frequency to 10x/day

### Engineering OKRs

Objective: build scalable, reliable infrastructure

Key Results:

- KR1: Migrate 80% of services to Kubernetes
- KR2: Reduce MTTR to <30 minutes
- KR3: Achieve 85% test coverage

### Team OKRs

Objective: improve developer productivity

Key Results:

- KR1: Reduce build time to <5 minutes
- KR2: Automate 90% of the deployment process
- KR3: Reduce PR review time to <4 hours
## Reporting Templates

### Monthly Engineering Report

```markdown
# Engineering Report - [Month Year]

## Executive Summary
- Key Achievement: [Highlight]
- Main Challenge: [Issue and resolution]
- Next Month Focus: [Priority]

## DORA Metrics
| Metric | This Month | Last Month | Target | Status |
|--------|------------|------------|--------|--------|
| Deploy Frequency | X/day | Y/day | Z/day | ✓/⚠/✗ |
| Lead Time | X hrs | Y hrs | <Z hrs | ✓/⚠/✗ |
| MTTR | X min | Y min | <Z min | ✓/⚠/✗ |
| Change Failure | X% | Y% | <Z% | ✓/⚠/✗ |

## Team Performance
- Velocity: X story points (Y% of plan)
- Sprint Completion: X%
- Unplanned Work: X%

## Quality Metrics
- Test Coverage: X% (Δ Y%)
- Customer Bugs: X (Δ Y)
- Code Review Coverage: X%

## Highlights
1. [Major feature or improvement]
2. [Technical achievement]
3. [Process improvement]

## Challenges & Solutions
1. Challenge: [Issue]
   Solution: [Action taken]

## Next Month Priorities
1. [Priority 1]
2. [Priority 2]
3. [Priority 3]
```

### Quarterly Business Review
```markdown
# Engineering QBR - Q[X] [Year]

## Strategic Alignment
- Business Goal: [Goal]
- Engineering Contribution: [How engineering supported]
- Impact: [Measurable outcome]

## Quarterly Metrics

### Delivery
- Features Shipped: X of Y planned (Z%)
- Major Releases: [List]
- Technical Debt Reduced: X%

### Reliability
- Uptime: X%
- Incidents: X (Y critical, Z major)
- Customer Impact: [Description]

### Efficiency
- Cost per Transaction: $X (Δ Y%)
- Infrastructure Cost: $X (Δ Y%)
- Engineering Cost per Feature: $X

## Team Growth
- Headcount: Start: X → End: Y
- Attrition: X%
- Key Hires: [Roles]

## Innovation
- Patents Filed: X
- Open Source Contributions: X
- Hackathon Projects: X

## Lessons Learned
1. [What worked well]
2. [What didn't work]
3. [What we're changing]

## Next Quarter Focus
1. [Strategic Initiative 1]
2. [Strategic Initiative 2]
3. [Strategic Initiative 3]
```

## Tool Recommendations
### Metrics Collection

- DataDog: comprehensive monitoring
- New Relic: application performance
- Grafana + Prometheus: open source stack
- CloudWatch: AWS native

### Engineering Analytics

- LinearB: developer productivity
- Velocity: engineering metrics
- Sleuth: DORA metrics
- Swarmia: engineering insights

### Project Tracking

- Jira: issue tracking
- Linear: modern issue tracking
- Azure DevOps: Microsoft ecosystem
- GitHub Projects: integrated with code

### Incident Management

- PagerDuty: on-call management
- Opsgenie: incident response
- StatusPage: status communication
- FireHydrant: incident command
## Success Indicators

### Healthy Engineering Organization

- ✓ DORA metrics improving quarter-over-quarter
- ✓ Team satisfaction >8/10
- ✓ Attrition <10% annually
- ✓ On-time delivery >80%
- ✓ Technical debt <15% of capacity
- ✓ Innovation time >20%

### Warning Signs

- ⚠️ Increasing MTTR trend
- ⚠️ Declining velocity
- ⚠️ Rising bug escape rate
- ⚠️ Increasing unplanned work
- ⚠️ Growing PR queue
- ⚠️ Decreasing test coverage

### Crisis Indicators

- 🚨 Multiple production incidents per week
- 🚨 Team satisfaction <6/10
- 🚨 Attrition >20%
- 🚨 Technical debt >30%
- 🚨 No deployments for >1 week
- 🚨 Customer escalations increasing
# Technology Evaluation Framework

## Evaluation Process

### Phase 1: Requirements Gathering (Week 1)

Functional requirements:

- Core features needed
- Integration requirements
- Performance requirements
- Scalability needs
- Security requirements

Non-functional requirements:

- Usability / developer experience
- Documentation quality
- Community support
- Vendor stability
- Compliance needs

Constraints:

- Budget limitations
- Timeline constraints
- Team expertise
- Existing technology stack
- Regulatory requirements

### Phase 2: Market Research (Weeks 1-2)

Identify candidates:

- Industry leaders (Gartner Magic Quadrant)
- Open-source alternatives
- Emerging solutions
- Build vs buy analysis

Initial filtering:

- Eliminate options not meeting hard requirements
- Remove options outside budget
- Focus on 3-5 top candidates

### Phase 3: Deep Evaluation (Weeks 2-4)

Technical evaluation:

- Proof of Concept (PoC)
- Performance benchmarks
- Security assessment
- Integration testing
- Scalability testing

Business evaluation:

- Total Cost of Ownership (TCO)
- Return on Investment (ROI)
- Vendor assessment
- Risk analysis
- Exit strategy

### Phase 4: Decision (Week 4)
## Evaluation Criteria Matrix

### Technical Criteria (40%)

| Criterion   | Weight | Description                   | Scoring Guide                                                 |
|-------------|--------|-------------------------------|---------------------------------------------------------------|
| Performance | 10%    | Speed, throughput, latency    | 5 = exceeds requirements; 3 = meets; 1 = below                |
| Scalability | 10%    | Ability to grow with needs    | 5 = linear scalability; 3 = some limitations; 1 = hard limits |
| Reliability | 8%     | Uptime, fault tolerance       | 5 = 99.99% SLA; 3 = 99.9% SLA; 1 = <99% SLA                   |
| Security    | 8%     | Security features, compliance | 5 = exceeds standards; 3 = meets standards; 1 = concerns exist |
| Integration | 4%     | API quality, compatibility    | 5 = native integration; 3 = good APIs; 1 = limited            |
### Business Criteria (30%)

| Criterion        | Weight | Description                       | Scoring Guide                                                      |
|------------------|--------|-----------------------------------|--------------------------------------------------------------------|
| Cost             | 10%    | TCO including licenses, operation | 5 = under budget by >20%; 3 = within budget; 1 = over budget       |
| ROI              | 8%     | Value generation potential        | 5 = <6 month payback; 3 = <12 month payback; 1 = >24 month payback |
| Vendor Stability | 6%     | Financial health, market position | 5 = market leader; 3 = established player; 1 = startup/uncertain   |
| Support Quality  | 6%     | Support availability, SLAs        | 5 = 24/7 premium support; 3 = business hours; 1 = community only   |
### Operational Criteria (30%)

| Criterion     | Weight | Description               | Scoring Guide                                                 |
|---------------|--------|---------------------------|---------------------------------------------------------------|
| Ease of Use   | 8%     | Learning curve, UX        | 5 = intuitive; 3 = moderate learning; 1 = steep curve         |
| Documentation | 7%     | Quality, completeness     | 5 = excellent docs; 3 = adequate docs; 1 = poor docs          |
| Community     | 7%     | Size, activity, resources | 5 = large, active; 3 = moderate; 1 = small/inactive           |
| Maintenance   | 8%     | Operational overhead      | 5 = fully managed; 3 = some maintenance; 1 = high maintenance |
## Vendor Evaluation Template

### Vendor Profile

- Company Name:
- Founded:
- Headquarters:
- Employees:
- Revenue:
- Funding (if applicable):
- Key Customers:

### Product Assessment

- Strengths
- Weaknesses
- Opportunities
- Threats
### Financial Analysis

Cost breakdown:

| Component      | Year 1 | Year 2 | Year 3 | Total |
|----------------|--------|--------|--------|-------|
| Licensing      | $      | $      | $      | $     |
| Implementation | $      | $      | $      | $     |
| Training       | $      | $      | $      | $     |
| Support        | $      | $      | $      | $     |
| Infrastructure | $      | $      | $      | $     |
| **Total**      | $      | $      | $      | $     |
ROI calculation:

- Cost savings:
  - Reduced manual work: $/year
  - Efficiency gains: $/year
  - Error reduction: $/year
- Revenue impact:
  - New capabilities: $/year
  - Faster time to market: $/year
- Payback period: X months
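Payback period follows directly from the cost and benefit lines above. A sketch with placeholder figures:

```python
def payback_months(upfront_cost: float, annual_benefit: float) -> float:
    """Months until cumulative annual benefit covers the upfront cost."""
    return upfront_cost / (annual_benefit / 12)

# $240k year-one cost; $40k/yr manual-work savings + $80k/yr revenue impact
print(payback_months(240_000, 40_000 + 80_000))  # -> 24.0
```

A result over 24 months would score 1 on the ROI criterion in the matrix above.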
### Risk Assessment

| Risk                        | Probability  | Impact       | Mitigation |
|-----------------------------|--------------|--------------|------------|
| Vendor goes out of business | Low/Med/High | Low/Med/High | [Strategy] |
| Technology becomes obsolete |              |              |            |
| Integration difficulties    |              |              |            |
| Team adoption challenges    |              |              |            |
| Budget overrun              |              |              |            |
| Performance issues          |              |              |            |
## Build vs Buy Decision Framework

### When to Build

Advantages:

- Full control over features
- No vendor lock-in
- Potential competitive advantage
- Perfect fit for requirements
- No licensing costs

Build when:

- It's a core business differentiator
- Requirements are unique
- It's a long-term investment
- You have the expertise in-house
- No suitable solutions exist

Hidden costs:

- Development time
- Maintenance burden
- Security responsibility
- Documentation needs
- Training requirements

### When to Buy

Advantages:

- Faster time to market
- Proven solution
- Vendor support
- Regular updates
- Shared development costs

Buy when:

- The functionality is a commodity
- Requirements are standard
- Internal resources are limited
- You need a quick solution
- Good options are available

Hidden costs:

- Customization limits
- Vendor lock-in
- Integration effort
- Training needs
- Scaling costs

### When to Adopt Open Source

Advantages:

- No licensing costs
- Community support
- Transparency
- Customizable
- No vendor lock-in

Adopt when:

- A strong community exists
- A standard solution is needed
- You have the technical expertise
- You can contribute back
- Long-term stability is needed

Hidden costs:

- Support costs
- Security responsibility
- Upgrade management
- Integration effort
- Potential consulting needs
## Proof of Concept Guidelines

### PoC Scope

- Duration: 2-4 weeks
- Team: 2-3 engineers
- Environment: isolated/sandbox
- Data: representative sample

### Success Criteria

### PoC Checklist
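When filling in the performance section of a PoC report, percentile latencies should come from measured samples rather than averages. A minimal nearest-rank p95 sketch; the sample latencies are illustrative:

```python
import math

def percentile(samples: list[float], pct: float) -> float:
    """Nearest-rank percentile; no interpolation, fine for small PoC runs."""
    ordered = sorted(samples)
    k = max(0, math.ceil(pct / 100 * len(ordered)) - 1)
    return ordered[k]

latencies_ms = [12, 18, 25, 31, 47, 52, 60, 75, 88, 140]
print(percentile(latencies_ms, 95))  # -> 140
```

Note how a single slow outlier dominates the p95 even when the mean looks healthy.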
PoC Report Template
# PoC Report: [Technology Name]
## Executive Summary
- **Recommendation**: [Proceed/Stop/Investigate Further]
- **Confidence Level**: [High/Medium/Low]
- **Key Finding**: [One sentence summary]
## Test Results
### Functional Tests
| Test Case | Result | Notes |
|-----------|--------|-------|
| | Pass/Fail | |
### Performance Tests
| Metric | Target | Actual | Status |
|--------|--------|--------|---------|
| Response Time | <100ms | Xms | ✓/✗ |
| Throughput | >1000 req/s | X req/s | ✓/✗ |
| CPU Usage | <70% | X% | ✓/✗ |
| Memory Usage | <4GB | XGB | ✓/✗ |
### Integration Tests
| System | Status | Effort |
|--------|--------|--------|
| Database | ✓/✗ | Low/Med/High |
| API Gateway | ✓/✗ | Low/Med/High |
| Authentication | ✓/✗ | Low/Med/High |
## Team Feedback
- **Ease of Use**: [1-5 rating]
- **Documentation**: [1-5 rating]
- **Would Recommend**: [Yes/No]
## Risks Identified
1. [Risk and mitigation]
2. [Risk and mitigation]
## Next Steps
1. [Action item]
2. [Action item]
Technology Categories
Development Platforms
Languages: TypeScript, Python, Go, Rust, Java
Frameworks: React, Node.js, Spring, Django, FastAPI
Mobile: React Native, Flutter, Swift, Kotlin
Evaluation Focus: Developer productivity, ecosystem, performance
Databases
SQL: PostgreSQL, MySQL, SQL Server
NoSQL: MongoDB, Cassandra, DynamoDB
NewSQL: CockroachDB, Vitess, TiDB
Evaluation Focus: Performance, scalability, consistency, operations
Infrastructure
Cloud: AWS, GCP, Azure
Containers: Docker, Kubernetes, Nomad
Serverless: Lambda, Cloud Functions, Vercel
Evaluation Focus: Cost, scalability, vendor lock-in, operations
Monitoring & Observability
APM: DataDog, New Relic, AppDynamics
Logging: ELK Stack, Splunk, CloudWatch
Metrics: Prometheus, Grafana, CloudWatch
Evaluation Focus: Coverage, cost, integration, insights
Security
SAST: SonarQube, Checkmarx, Veracode
DAST: OWASP ZAP, Burp Suite
Secrets: Vault, AWS Secrets Manager
Evaluation Focus: Coverage, false positives, integration
DevOps Tools
CI/CD: Jenkins, GitLab CI, GitHub Actions
IaC: Terraform, CloudFormation, Pulumi
Configuration: Ansible, Chef, Puppet
Evaluation Focus: Flexibility, integration, learning curve
Continuous Evaluation
Quarterly Reviews
Technology landscape changes
Performance against expectations
Cost optimization opportunities
Team satisfaction
Market alternatives
Annual Assessment
Full technology stack review
Vendor relationship evaluation
Strategic alignment check
Technical debt assessment
Roadmap planning
Deprecation Planning
Migration strategy
Timeline definition
Risk assessment
Communication plan
Success metrics
Decision Documentation
Always document:
Why the technology was chosen
Who was involved in the decision
When the decision was made
What alternatives were considered
How success will be measured
Use Architecture Decision Records (ADRs) for significant technology choices.
```python
#!/usr/bin/env python3
"""
Engineering Team Scaling Calculator - Optimize team growth and structure
"""
import json
import math
from typing import Dict, List


class TeamScalingCalculator:
    def __init__(self):
        self.conway_factor = 1.5   # Conway's Law impact factor
        self.brooks_factor = 0.75  # Brooks' Law diminishing returns

        # Optimal team structures based on size
        self.team_structures = {
            'startup': {'min': 1, 'max': 10, 'structure': 'flat'},
            'growth': {'min': 11, 'max': 50, 'structure': 'team_leads'},
            'scale': {'min': 51, 'max': 150, 'structure': 'departments'},
            'enterprise': {'min': 151, 'max': 9999, 'structure': 'divisions'}
        }

        # Role ratios for balanced teams
        self.role_ratios = {
            'engineering_manager': 0.125,  # 1:8 ratio
            'tech_lead': 0.167,            # 1:6 ratio
            'senior_engineer': 0.3,
            'mid_engineer': 0.4,
            'junior_engineer': 0.2,
            'devops': 0.1,
            'qa': 0.15,
            'product_manager': 0.1,
            'designer': 0.08,
            'data_engineer': 0.05
        }

    def calculate_scaling_plan(self, current_state: Dict, growth_targets: Dict) -> Dict:
        """Calculate optimal scaling plan"""
        results = {
            'current_analysis': self._analyze_current_state(current_state),
            'growth_timeline': self._create_growth_timeline(current_state, growth_targets),
            'hiring_plan': {},
            'team_structure': {},
            'budget_projection': {},
            'risk_factors': [],
            'recommendations': []
        }

        # Generate hiring plan
        results['hiring_plan'] = self._generate_hiring_plan(current_state, growth_targets)

        # Design team structure
        results['team_structure'] = self._design_team_structure(
            growth_targets['target_headcount']
        )

        # Calculate budget
        results['budget_projection'] = self._calculate_budget(
            results['hiring_plan'],
            current_state.get('location', 'US')
        )

        # Assess risks
        results['risk_factors'] = self._assess_scaling_risks(current_state, growth_targets)

        # Generate recommendations
        results['recommendations'] = self._generate_recommendations(results)

        return results

    def _analyze_current_state(self, current_state: Dict) -> Dict:
        """Analyze current team state"""
        total_engineers = current_state.get('headcount', 0)
        analysis = {
            'total_headcount': total_engineers,
            'team_stage': self._get_team_stage(total_engineers),
            'productivity_index': 0,
            'balance_score': 0,
            'issues': []
        }

        # Calculate productivity index
        if total_engineers > 0:
            velocity = current_state.get('velocity', 100)
            expected_velocity = total_engineers * 20  # baseline 20 points per engineer
            analysis['productivity_index'] = (velocity / expected_velocity) * 100

        # Check team balance
        roles = current_state.get('roles', {})
        analysis['balance_score'] = self._calculate_balance_score(roles, total_engineers)

        # Identify issues
        if analysis['productivity_index'] < 70:
            analysis['issues'].append('Low productivity - possible process or tooling issues')
        if analysis['balance_score'] < 60:
            analysis['issues'].append('Team imbalance - review role distribution')

        # Use the same role key as role_ratios ('engineering_manager')
        manager_ratio = roles.get('engineering_manager', 0) / max(total_engineers, 1)
        if manager_ratio > 0.2:
            analysis['issues'].append('Over-managed - too many managers')
        elif manager_ratio < 0.08 and total_engineers > 20:
            analysis['issues'].append('Under-managed - need more engineering managers')

        return analysis

    def _get_team_stage(self, headcount: int) -> str:
        """Determine team stage based on size"""
        for stage, config in self.team_structures.items():
            if config['min'] <= headcount <= config['max']:
                return stage
        return 'startup'

    def _calculate_balance_score(self, roles: Dict, total: int) -> float:
        """Calculate team balance score"""
        if total == 0:
            return 0
        score = 100
        for role, ideal_ratio in self.role_ratios.items():
            actual_count = roles.get(role, 0)
            actual_ratio = actual_count / total
            # Penalize deviation from the ideal ratio
            deviation = abs(actual_ratio - ideal_ratio)
            penalty = deviation * 100
            score -= min(penalty, 20)  # Max 20-point penalty per role
        return max(0, score)

    def _create_growth_timeline(self, current: Dict, targets: Dict) -> List[Dict]:
        """Create quarterly growth timeline"""
        current_headcount = current.get('headcount', 0)
        target_headcount = targets.get('target_headcount', current_headcount)
        timeline_quarters = targets.get('timeline_quarters', 4)
        growth_needed = target_headcount - current_headcount

        timeline = []
        for quarter in range(1, timeline_quarters + 1):
            # Apply Brooks' Law - diminishing returns with rapid growth
            if quarter == 1:
                quarterly_growth = math.ceil(growth_needed * 0.4)  # Front-load hiring
            else:
                remaining_growth = target_headcount - current_headcount
                quarters_left = timeline_quarters - quarter + 1
                quarterly_growth = math.ceil(remaining_growth / quarters_left)

            # Adjust for onboarding capacity (25% growth per quarter max)
            max_onboarding = math.ceil(current_headcount * 0.25)
            quarterly_growth = min(quarterly_growth, max_onboarding)
            current_headcount += quarterly_growth

            timeline.append({
                'quarter': f'Q{quarter}',
                'headcount': current_headcount,
                'new_hires': quarterly_growth,
                'onboarding_capacity': max_onboarding,
                'productivity_factor': 1.0 - (0.2 * (quarterly_growth / max(current_headcount, 1)))
            })
        return timeline

    def _generate_hiring_plan(self, current: Dict, targets: Dict) -> Dict:
        """Generate detailed hiring plan"""
        current_roles = current.get('roles', {})
        target_headcount = targets.get('target_headcount', 0)

        hiring_plan = {
            'total_hires_needed': target_headcount - current.get('headcount', 0),
            'by_role': {},
            'by_quarter': {},
            'interview_capacity_needed': 0,
            'recruiting_resources': 0
        }

        # Calculate ideal role distribution
        for role, ideal_ratio in self.role_ratios.items():
            ideal_count = math.ceil(target_headcount * ideal_ratio)
            current_count = current_roles.get(role, 0)
            hires_needed = max(0, ideal_count - current_count)
            if hires_needed > 0:
                hiring_plan['by_role'][role] = {
                    'current': current_count,
                    'target': ideal_count,
                    'hires_needed': hires_needed,
                    'priority': self._get_role_priority(role, current_roles, target_headcount)
                }

        # Distribute hires across quarters
        timeline = self._create_growth_timeline(current, targets)
        for quarter_data in timeline:
            quarter = quarter_data['quarter']
            hires = quarter_data['new_hires']
            hiring_plan['by_quarter'][quarter] = {
                'total_hires': hires,
                'breakdown': self._distribute_quarterly_hires(hires, hiring_plan['by_role'])
            }

        # Calculate interview capacity (5 interviews per hire average)
        hiring_plan['interview_capacity_needed'] = hiring_plan['total_hires_needed'] * 5

        # Calculate recruiting resources (1 recruiter per 50 hires/year)
        annual_hires = hiring_plan['total_hires_needed'] * (
            4 / max(targets.get('timeline_quarters', 4), 1)
        )
        hiring_plan['recruiting_resources'] = math.ceil(annual_hires / 50)

        return hiring_plan

    def _get_role_priority(self, role: str, current_roles: Dict, target_size: int) -> int:
        """Determine hiring priority for a role"""
        # Priority based on criticality and current gaps
        priorities = {
            'engineering_manager': 10 if target_size > 20 else 5,
            'tech_lead': 9,
            'senior_engineer': 8,
            'devops': 7 if current_roles.get('devops', 0) == 0 else 5,
            'qa': 6,
            'mid_engineer': 5,
            'product_manager': 6,
            'designer': 5,
            'data_engineer': 4,
            'junior_engineer': 3
        }
        return priorities.get(role, 5)

    def _distribute_quarterly_hires(self, total_hires: int, role_needs: Dict) -> Dict:
        """Distribute quarterly hires across roles"""
        distribution = {}
        # Sort roles by priority
        sorted_roles = sorted(
            role_needs.items(),
            key=lambda x: x[1]['priority'],
            reverse=True
        )
        remaining_hires = total_hires
        for role, needs in sorted_roles:
            if remaining_hires <= 0:
                break
            hires = min(needs['hires_needed'], max(1, remaining_hires // 3))
            distribution[role] = hires
            remaining_hires -= hires
        return distribution

    def _design_team_structure(self, target_headcount: int) -> Dict:
        """Design optimal team structure"""
        stage = self._get_team_stage(target_headcount)
        structure = {
            'organizational_model': self.team_structures[stage]['structure'],
            'teams': [],
            'reporting_structure': {},
            'communication_paths': 0
        }

        if stage == 'startup':
            structure['teams'] = [{
                'name': 'Core Team',
                'size': target_headcount,
                'focus': 'Full-stack'
            }]
        elif stage == 'growth':
            # Create 2-4 teams
            team_size = 6
            num_teams = math.ceil(target_headcount / team_size)
            structure['teams'] = [
                {
                    'name': f'Team {i + 1}',
                    'size': team_size,
                    'focus': ['Platform', 'Product', 'Infrastructure', 'Growth'][i % 4]
                }
                for i in range(num_teams)
            ]
        elif stage == 'scale':
            # Create departments with multiple teams
            structure['departments'] = [
                {'name': 'Platform', 'teams': 3, 'headcount': target_headcount * 0.3},
                {'name': 'Product', 'teams': 4, 'headcount': target_headcount * 0.4},
                {'name': 'Infrastructure', 'teams': 2, 'headcount': target_headcount * 0.2},
                {'name': 'Data', 'teams': 1, 'headcount': target_headcount * 0.1}
            ]

        # Calculate communication paths (n * (n - 1) / 2)
        structure['communication_paths'] = (target_headcount * (target_headcount - 1)) // 2

        # Add management layers
        structure['management_layers'] = math.ceil(math.log(target_headcount, 7))

        return structure

    def _calculate_budget(self, hiring_plan: Dict, location: str) -> Dict:
        """Calculate budget projection"""
        # Average salaries by role and location (in USD)
        salary_bands = {
            'US': {
                'engineering_manager': 200000,
                'tech_lead': 180000,
                'senior_engineer': 160000,
                'mid_engineer': 120000,
                'junior_engineer': 85000,
                'devops': 150000,
                'qa': 100000,
                'product_manager': 150000,
                'designer': 120000,
                'data_engineer': 140000
            },
            'EU': {
                'engineering_manager': 160000,
                'tech_lead': 144000,
                'senior_engineer': 128000,
                'mid_engineer': 96000,
                'junior_engineer': 68000,
                'devops': 120000,
                'qa': 80000,
                'product_manager': 120000,
                'designer': 96000,
                'data_engineer': 112000
            },
            'APAC': {
                'engineering_manager': 120000,
                'tech_lead': 108000,
                'senior_engineer': 96000,
                'mid_engineer': 72000,
                'junior_engineer': 51000,
                'devops': 90000,
                'qa': 60000,
                'product_manager': 90000,
                'designer': 72000,
                'data_engineer': 84000
            }
        }
        location_salaries = salary_bands.get(location, salary_bands['US'])

        budget = {
            'annual_salary_cost': 0,
            'benefits_cost': 0,    # 30% of salary
            'equipment_cost': 0,   # $5k per hire
            'recruiting_cost': 0,  # 20% of first-year salary
            'onboarding_cost': 0,  # $10k per hire
            'total_cost': 0,
            'cost_per_hire': 0
        }

        for role, details in hiring_plan['by_role'].items():
            hires = details['hires_needed']
            salary = location_salaries.get(role, 100000)
            budget['annual_salary_cost'] += hires * salary
            budget['recruiting_cost'] += hires * salary * 0.2

        budget['benefits_cost'] = budget['annual_salary_cost'] * 0.3
        budget['equipment_cost'] = hiring_plan['total_hires_needed'] * 5000
        budget['onboarding_cost'] = hiring_plan['total_hires_needed'] * 10000
        budget['total_cost'] = sum([
            budget['annual_salary_cost'],
            budget['benefits_cost'],
            budget['equipment_cost'],
            budget['recruiting_cost'],
            budget['onboarding_cost']
        ])
        if hiring_plan['total_hires_needed'] > 0:
            budget['cost_per_hire'] = budget['total_cost'] / hiring_plan['total_hires_needed']
        return budget

    def _assess_scaling_risks(self, current: Dict, targets: Dict) -> List[Dict]:
        """Assess risks in scaling plan"""
        risks = []
        growth_rate = (targets['target_headcount'] - current['headcount']) / max(current['headcount'], 1)

        if growth_rate > 1.0:  # More than 100% growth
            risks.append({
                'risk': 'Rapid growth dilution',
                'impact': 'High',
                'mitigation': 'Implement strong onboarding and mentorship programs'
            })
        if current.get('attrition_rate', 0) > 15:
            risks.append({
                'risk': 'High attrition during scaling',
                'impact': 'High',
                'mitigation': 'Address retention issues before aggressive hiring'
            })
        if targets.get('timeline_quarters', 4) < 4:
            risks.append({
                'risk': 'Compressed timeline',
                'impact': 'Medium',
                'mitigation': 'Consider extending timeline or increasing recruiting resources'
            })
        return risks

    def _generate_recommendations(self, results: Dict) -> List[str]:
        """Generate scaling recommendations"""
        recommendations = []

        # Based on growth rate
        total_hires = results['hiring_plan']['total_hires_needed']
        current_size = results['current_analysis']['total_headcount']
        if current_size > 0:
            growth_rate = total_hires / current_size
            if growth_rate > 0.5:
                recommendations.append('Consider hiring a dedicated recruiting team')
                recommendations.append('Implement scalable onboarding processes')
                recommendations.append('Establish clear team charters and boundaries')
            if growth_rate > 1.0:
                recommendations.append('⚠️ High growth risk - consider slowing timeline')
                recommendations.append('Focus on senior hires first to establish culture')
                recommendations.append('Implement continuous integration practices early')

        # Based on structure
        if results['team_structure']['communication_paths'] > 1000:
            recommendations.append('Implement clear communication channels and tools')
            recommendations.append('Consider platform teams to reduce dependencies')

        # Based on balance
        if results['current_analysis']['balance_score'] < 70:
            recommendations.append('Prioritize hiring for underrepresented roles')
            recommendations.append('Consider role rotation for skill development')

        return recommendations


def calculate_team_scaling(current_state: Dict, growth_targets: Dict) -> str:
    """Main function to calculate team scaling"""
    calculator = TeamScalingCalculator()
    results = calculator.calculate_scaling_plan(current_state, growth_targets)

    # Format output
    output = [
        "=== Engineering Team Scaling Plan ===",
        "",
        "Current State Analysis:",
        f"  Current Headcount: {results['current_analysis']['total_headcount']}",
        f"  Team Stage: {results['current_analysis']['team_stage']}",
        f"  Productivity Index: {results['current_analysis']['productivity_index']:.1f}%",
        f"  Team Balance Score: {results['current_analysis']['balance_score']:.1f}/100",
        "",
        "Growth Plan:",
        f"  Target Headcount: {growth_targets['target_headcount']}",
        f"  Total Hires Needed: {results['hiring_plan']['total_hires_needed']}",
        f"  Timeline: {growth_targets['timeline_quarters']} quarters",
        "",
        "Quarterly Timeline:"
    ]
    for quarter in results['growth_timeline']:
        output.append(
            f"  {quarter['quarter']}: {quarter['headcount']} total "
            f"(+{quarter['new_hires']} hires, "
            f"{quarter['productivity_factor']:.0%} productivity)"
        )
    output.extend(["", "Hiring Priorities:"])
    sorted_roles = sorted(
        results['hiring_plan']['by_role'].items(),
        key=lambda x: x[1]['priority'],
        reverse=True
    )
    for role, details in sorted_roles[:5]:
        output.append(
            f"  {role}: {details['hires_needed']} hires "
            f"(Priority: {details['priority']}/10)"
        )
    output.extend([
        "",
        "Budget Projection:",
        f"  Annual Salary Cost: ${results['budget_projection']['annual_salary_cost']:,.0f}",
        f"  Total Investment: ${results['budget_projection']['total_cost']:,.0f}",
        f"  Cost per Hire: ${results['budget_projection']['cost_per_hire']:,.0f}",
        "",
        "Team Structure:",
        f"  Model: {results['team_structure']['organizational_model']}",
        f"  Management Layers: {results['team_structure']['management_layers']}",
        f"  Communication Paths: {results['team_structure']['communication_paths']:,}",
        "",
        "Key Recommendations:"
    ])
    for rec in results['recommendations']:
        output.append(f"  • {rec}")
    return '\n'.join(output)


if __name__ == "__main__":
    import argparse

    parser = argparse.ArgumentParser(
        description="Engineering Team Scaling Calculator - Optimize team growth and structure"
    )
    parser.add_argument(
        "input_file", nargs="?", default=None,
        help="JSON file with current_state and growth_targets (default: run with sample data)"
    )
    parser.add_argument(
        "--json", action="store_true",
        help="Output raw JSON instead of formatted report"
    )
    args = parser.parse_args()

    if args.input_file:
        with open(args.input_file) as f:
            data = json.load(f)
        current_state = data["current_state"]
        growth_targets = data["growth_targets"]
    else:
        current_state = {
            'headcount': 25,
            'velocity': 450,
            'roles': {
                'engineering_manager': 2,
                'tech_lead': 3,
                'senior_engineer': 8,
                'mid_engineer': 10,
                'junior_engineer': 2
            },
            'attrition_rate': 12,
            'location': 'US'
        }
        growth_targets = {
            'target_headcount': 75,
            'timeline_quarters': 4
        }

    if args.json:
        calculator = TeamScalingCalculator()
        results = calculator.calculate_scaling_plan(current_state, growth_targets)
        print(json.dumps(results, indent=2))
    else:
        print(calculate_team_scaling(current_state, growth_targets))
```
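The communication-path formula the calculator uses, n × (n − 1) / 2, is the core argument for splitting large teams: paths grow quadratically with headcount. A minimal standalone sketch of the same formula:

```python
# Communication paths grow quadratically with headcount: n * (n - 1) / 2.
# This mirrors the formula used in _design_team_structure above.
def communication_paths(n: int) -> int:
    return n * (n - 1) // 2

for n in (5, 10, 25, 75):
    print(f"{n} people -> {communication_paths(n)} paths")
# A 5-person team has 10 paths; at 75 people there are 2,775.
```

This is why the calculator flags structures above ~1,000 paths and recommends platform teams: past that point, explicit interfaces between teams are cheaper than ad hoc coordination.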
```python
#!/usr/bin/env python3
"""
Technical Debt Analyzer - Assess and prioritize technical debt across systems
"""
import math
from datetime import datetime
from typing import Dict, List


class TechDebtAnalyzer:
    def __init__(self):
        self.debt_categories = {
            'architecture': {
                'weight': 0.25,
                'indicators': [
                    'monolithic_design', 'tight_coupling', 'no_microservices',
                    'legacy_patterns', 'no_api_gateway', 'synchronous_only'
                ]
            },
            'code_quality': {
                'weight': 0.20,
                'indicators': [
                    'low_test_coverage', 'high_complexity', 'code_duplication',
                    'no_documentation', 'inconsistent_standards', 'legacy_language'
                ]
            },
            'infrastructure': {
                'weight': 0.20,
                'indicators': [
                    'manual_deployments', 'no_ci_cd', 'single_points_failure',
                    'no_monitoring', 'no_auto_scaling', 'outdated_servers'
                ]
            },
            'security': {
                'weight': 0.20,
                'indicators': [
                    'outdated_dependencies', 'no_security_scans', 'plain_text_secrets',
                    'no_encryption', 'missing_auth', 'no_audit_logs'
                ]
            },
            'performance': {
                'weight': 0.15,
                'indicators': [
                    'slow_response_times', 'no_caching', 'inefficient_queries',
                    'memory_leaks', 'no_optimization', 'blocking_operations'
                ]
            }
        }
        self.impact_matrix = {
            'user_impact': {'weight': 0.30, 'score': 0},
            'developer_velocity': {'weight': 0.25, 'score': 0},
            'system_reliability': {'weight': 0.20, 'score': 0},
            'scalability': {'weight': 0.15, 'score': 0},
            'maintenance_cost': {'weight': 0.10, 'score': 0}
        }

    def analyze_system(self, system_data: Dict) -> Dict:
        """Analyze a system for technical debt"""
        results = {
            'timestamp': datetime.now().isoformat(),
            'system_name': system_data.get('name', 'Unknown'),
            'debt_score': 0,
            'debt_level': '',
            'category_scores': {},
            'prioritized_actions': [],
            'estimated_effort': {},
            'risk_assessment': {},
            'recommendations': []
        }

        # Calculate debt scores by category
        total_debt_score = 0
        for category, config in self.debt_categories.items():
            category_score = self._calculate_category_score(
                system_data.get(category, {}),
                config['indicators']
            )
            weighted_score = category_score * config['weight']
            results['category_scores'][category] = {
                'raw_score': category_score,
                'weighted_score': weighted_score,
                'level': self._get_level(category_score)
            }
            total_debt_score += weighted_score

        results['debt_score'] = round(total_debt_score, 2)
        results['debt_level'] = self._get_level(total_debt_score)

        # Calculate impact and prioritize
        results['prioritized_actions'] = self._prioritize_actions(
            results['category_scores'],
            system_data.get('business_context', {})
        )

        # Estimate effort
        results['estimated_effort'] = self._estimate_effort(
            results['prioritized_actions'],
            system_data.get('team_size', 5)
        )

        # Risk assessment
        results['risk_assessment'] = self._assess_risks(
            results['debt_score'],
            system_data.get('system_criticality', 'medium')
        )

        # Generate recommendations
        results['recommendations'] = self._generate_recommendations(results)

        return results

    def _calculate_category_score(self, category_data: Dict, indicators: List) -> float:
        """Calculate score for a specific category"""
        if not category_data:
            return 50.0  # Default middle score if no data
        total_score = 0
        count = 0
        for indicator in indicators:
            if indicator in category_data:
                # Score from 0 (no debt) to 100 (high debt)
                total_score += category_data[indicator]
                count += 1
        return (total_score / count) if count > 0 else 50.0

    def _get_level(self, score: float) -> str:
        """Convert numerical score to level"""
        if score < 20:
            return 'Low'
        elif score < 40:
            return 'Medium-Low'
        elif score < 60:
            return 'Medium'
        elif score < 80:
            return 'Medium-High'
        else:
            return 'Critical'

    def _prioritize_actions(self, category_scores: Dict, business_context: Dict) -> List:
        """Prioritize technical debt reduction actions"""
        actions = []
        for category, scores in category_scores.items():
            if scores['raw_score'] > 60:  # Focus on high-debt areas
                priority = self._calculate_priority(
                    scores['raw_score'],
                    category,
                    business_context
                )
                actions.append({
                    'category': category,
                    'priority': priority,
                    'score': scores['raw_score'],
                    'action_items': self._get_action_items(category, scores['level'])
                })
        # Sort by priority
        actions.sort(key=lambda x: x['priority'], reverse=True)
        return actions[:5]  # Top 5 priorities

    def _calculate_priority(self, score: float, category: str, context: Dict) -> float:
        """Calculate priority based on score and business context"""
        base_priority = score
        # Adjust based on business context
        if context.get('growth_phase') == 'rapid' and category in ['scalability', 'performance']:
            base_priority *= 1.5
        if context.get('compliance_required') and category == 'security':
            base_priority *= 2.0
        if context.get('cost_pressure') and category == 'infrastructure':
            base_priority *= 1.3
        return min(100, base_priority)

    def _get_action_items(self, category: str, level: str) -> List[str]:
        """Get specific action items based on category and level"""
        actions = {
            'architecture': {
                'Critical': [
                    'Immediate: Create architecture migration roadmap',
                    'Week 1: Identify service boundaries for decomposition',
                    'Month 1: Begin extracting first microservice',
                    'Month 2: Implement API gateway',
                    'Quarter: Complete critical service separation'
                ],
                'Medium-High': [
                    'Month 1: Document current architecture',
                    'Month 2: Design target architecture',
                    'Quarter: Begin gradual migration',
                    'Monitor: Track coupling metrics'
                ]
            },
            'code_quality': {
                'Critical': [
                    'Immediate: Implement code quality gates',
                    'Week 1: Set up automated testing pipeline',
                    'Month 1: Achieve 40% test coverage',
                    'Month 2: Refactor critical modules',
                    'Quarter: Reach 70% test coverage'
                ],
                'Medium-High': [
                    'Month 1: Establish coding standards',
                    'Month 2: Implement code review process',
                    'Quarter: Gradual refactoring plan'
                ]
            },
            'infrastructure': {
                'Critical': [
                    'Immediate: Implement basic CI/CD',
                    'Week 1: Set up monitoring and alerts',
                    'Month 1: Automate critical deployments',
                    'Month 2: Implement disaster recovery',
                    'Quarter: Full infrastructure as code'
                ],
                'Medium-High': [
                    'Month 1: Document infrastructure',
                    'Month 2: Begin automation',
                    'Quarter: Modernize critical components'
                ]
            },
            'security': {
                'Critical': [
                    'Immediate: Security audit and patching',
                    'Week 1: Implement secrets management',
                    'Month 1: Set up vulnerability scanning',
                    'Month 2: Implement security training',
                    'Quarter: Achieve compliance standards'
                ],
                'Medium-High': [
                    'Month 1: Security assessment',
                    'Month 2: Implement security tools',
                    'Quarter: Regular security reviews'
                ]
            },
            'performance': {
                'Critical': [
                    'Immediate: Performance profiling',
                    'Week 1: Implement caching strategy',
                    'Month 1: Optimize database queries',
                    'Month 2: Implement CDN',
                    'Quarter: Re-architect bottlenecks'
                ],
                'Medium-High': [
                    'Month 1: Performance baseline',
                    'Month 2: Optimization plan',
                    'Quarter: Incremental improvements'
                ]
            }
        }
        return actions.get(category, {}).get(level, ['Create action plan'])

    def _estimate_effort(self, actions: List, team_size: int) -> Dict:
        """Estimate effort required for debt reduction"""
        total_story_points = 0
        effort_breakdown = {}
        for action in actions:
            # Estimate based on category and score
            base_points = action['score'] * 2  # Higher debt = more effort
            if action['category'] == 'architecture':
                points = base_points * 1.5  # Architecture changes are complex
            elif action['category'] == 'security':
                points = base_points * 1.2  # Security requires careful work
            else:
                points = base_points
            effort_breakdown[action['category']] = {
                'story_points': round(points),
                'sprints': math.ceil(points / (team_size * 20)),  # 20 points per dev per sprint
                'developers_needed': math.ceil(points / 100)
            }
            total_story_points += points
        return {
            'total_story_points': round(total_story_points),
            'estimated_sprints': math.ceil(total_story_points / (team_size * 20)),
            'recommended_team_size': max(team_size, math.ceil(total_story_points / 200)),
            'breakdown': effort_breakdown
        }

    def _assess_risks(self, debt_score: float, criticality: str) -> Dict:
        """Assess risks associated with technical debt"""
        risk_level = 'Low'
        if debt_score > 70 and criticality == 'high':
            risk_level = 'Critical'
        elif debt_score > 60 or criticality == 'high':
            risk_level = 'High'
        elif debt_score > 40:
            risk_level = 'Medium'

        risks = {
            'overall_risk': risk_level,
            'specific_risks': []
        }
        if debt_score > 60:
            risks['specific_risks'].extend([
                'System failure risk increasing',
                'Developer productivity declining',
                'Innovation velocity blocked',
                'Maintenance costs escalating'
            ])
        if debt_score > 80:
            risks['specific_risks'].extend([
                'Competitive disadvantage emerging',
                'Talent retention risk',
                'Customer satisfaction impact',
                'Potential data breach vulnerability'
            ])
        return risks

    def _generate_recommendations(self, results: Dict) -> List[str]:
        """Generate strategic recommendations"""
        recommendations = []

        # Overall strategy based on debt level
        if results['debt_level'] == 'Critical':
            recommendations.append('🚨 URGENT: Dedicate 40% of engineering capacity to debt reduction')
            recommendations.append('Create dedicated debt reduction team')
            recommendations.append('Implement weekly debt reduction reviews')
            recommendations.append('Consider temporary feature freeze')
        elif results['debt_level'] in ['Medium-High', 'High']:
            recommendations.append('Allocate 25-30% of sprints to debt reduction')
            recommendations.append('Establish technical debt budget')
            recommendations.append('Implement debt prevention practices')
        else:
            recommendations.append('Maintain 15-20% ongoing debt reduction allocation')
            recommendations.append('Focus on prevention over correction')

        # Category-specific recommendations
        for category, scores in results['category_scores'].items():
            if scores['raw_score'] > 70:
                if category == 'architecture':
                    recommendations.append('Consider hiring architecture specialist')
                elif category == 'security':
                    recommendations.append('Engage security audit firm')
                elif category == 'performance':
                    recommendations.append('Implement performance SLA monitoring')

        # Team recommendations
        effort = results.get('estimated_effort', {})
        if effort.get('recommended_team_size', 0) > effort.get('total_story_points', 0) / 200:
            recommendations.append(f"Scale team to {effort['recommended_team_size']} engineers")

        return recommendations


def analyze_technical_debt(system_config: Dict) -> str:
    """Main function to analyze technical debt"""
    analyzer = TechDebtAnalyzer()
    results = analyzer.analyze_system(system_config)

    # Format output
    output = [
        "=== Technical Debt Analysis Report ===",
        f"System: {results['system_name']}",
        f"Analysis Date: {results['timestamp'][:10]}",
        "",
        f"OVERALL DEBT SCORE: {results['debt_score']}/100 ({results['debt_level']})",
        "",
        "Category Breakdown:"
    ]
    for category, scores in results['category_scores'].items():
        output.append(f"  {category.title()}: {scores['raw_score']:.1f} ({scores['level']})")
    output.extend([
        "",
        "Risk Assessment:",
        f"  Overall Risk: {results['risk_assessment']['overall_risk']}"
    ])
    for risk in results['risk_assessment']['specific_risks']:
        output.append(f"  • {risk}")
    output.extend([
        "",
        "Effort Estimation:",
        f"  Total Story Points: {results['estimated_effort']['total_story_points']}",
        f"  Estimated Sprints: {results['estimated_effort']['estimated_sprints']}",
        f"  Recommended Team Size: {results['estimated_effort']['recommended_team_size']}",
        "",
        "Top Priority Actions:"
    ])
    for i, action in enumerate(results['prioritized_actions'][:3], 1):
        output.append(f"\n{i}. {action['category'].title()} (Priority: {action['priority']:.0f})")
        for item in action['action_items'][:3]:
            output.append(f"   - {item}")
    output.extend([
        "",
        "Strategic Recommendations:"
    ])
    for rec in results['recommendations']:
        output.append(f"  • {rec}")
    return '\n'.join(output)


if __name__ == "__main__":
    # Example usage
    example_system = {
        'name': 'Legacy E-commerce Platform',
        'architecture': {
            'monolithic_design': 80,
            'tight_coupling': 70,
            'no_microservices': 90,
            'legacy_patterns': 60
        },
        'code_quality': {
            'low_test_coverage': 75,
            'high_complexity': 65,
            'code_duplication': 55
        },
        'infrastructure': {
            'manual_deployments': 70,
            'no_ci_cd': 60,
            'no_monitoring': 40
        },
        'security': {
            'outdated_dependencies': 85,
            'no_security_scans': 70
        },
        'performance': {
            'slow_response_times': 60,
            'no_caching': 50
        },
        'team_size': 8,
        'system_criticality': 'high',
        'business_context': {
            'growth_phase': 'rapid',
            'compliance_required': True,
            'cost_pressure': False
        }
    }
    print(analyze_technical_debt(example_system))
```
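To see the analyzer's scoring model in isolation: the overall debt score is just a weighted sum of per-category raw scores. The weights below mirror TechDebtAnalyzer's category weights; the raw scores are hypothetical examples:

```python
# Weighted debt score, as computed by TechDebtAnalyzer.analyze_system.
# Weights mirror the analyzer's debt_categories; raw scores are hypothetical.
weights = {'architecture': 0.25, 'code_quality': 0.20, 'infrastructure': 0.20,
           'security': 0.20, 'performance': 0.15}
raw_scores = {'architecture': 75, 'code_quality': 65, 'infrastructure': 55,
              'security': 80, 'performance': 50}

debt_score = sum(weights[c] * raw_scores[c] for c in weights)
print(round(debt_score, 2))  # 66.25 -> "Medium-High" on the analyzer's scale
```

Because the weights sum to 1.0, the overall score stays on the same 0-100 scale as the category scores, so the same level thresholds (Low < 20, ..., Critical >= 80) apply to both.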