Senior DevOps Engineer
CI/CD pipeline design, Infrastructure as Code, containerization with Docker and Kubernetes, and deployment automation from a senior DevOps perspective.
What this skill does
Generate secure release workflows and manage cloud infrastructure without manually configuring complex tools. This ensures every update is tested and released safely while reducing setup time. Reach for this when launching a new product or streamlining how updates reach your customers.
name: “senior-devops” description: Comprehensive DevOps skill for CI/CD, infrastructure automation, containerization, and cloud platforms (AWS, GCP, Azure). Includes pipeline setup, infrastructure as code, deployment automation, and monitoring. Use when setting up pipelines, deploying applications, managing infrastructure, implementing monitoring, or optimizing deployment processes.
Senior Devops
Complete toolkit for senior devops with modern tools and best practices.
Quick Start
Main Capabilities
This skill provides three core capabilities through automated scripts:
# Script 1: Pipeline Generator — scaffolds CI/CD pipelines for GitHub Actions or CircleCI
python scripts/pipeline_generator.py ./app --platform=github --stages=build,test,deploy
# Script 2: Terraform Scaffolder — generates and validates IaC modules for AWS/GCP/Azure
python scripts/terraform_scaffolder.py ./infra --provider=aws --module=ecs-service --verbose
# Script 3: Deployment Manager — orchestrates container deployments with rollback support
python scripts/deployment_manager.py deploy --env=production --image=app:1.2.3 --strategy=blue-green
Core Capabilities
1. Pipeline Generator
Scaffolds CI/CD pipeline configurations for GitHub Actions or CircleCI, with stages for build, test, security scan, and deploy.
Example — GitHub Actions workflow:
# .github/workflows/ci.yml
name: CI/CD Pipeline
on:
push:
branches: [main, develop]
pull_request:
branches: [main]
jobs:
build-and-test:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- name: Set up Node.js
uses: actions/setup-node@v4
with:
node-version: '20'
cache: 'npm'
- run: npm ci
- run: npm run lint
- run: npm test -- --coverage
- name: Upload coverage
uses: codecov/codecov-action@v4
build-docker:
needs: build-and-test
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- name: Build and push image
uses: docker/build-push-action@v5
with:
push: ${{ github.ref == 'refs/heads/main' }}
tags: ghcr.io/${{ github.repository }}:${{ github.sha }}
deploy:
needs: build-docker
if: github.ref == 'refs/heads/main'
runs-on: ubuntu-latest
steps:
- name: Deploy to ECS
run: |
aws ecs update-service \
--cluster production \
--service app-service \
--force-new-deployment
Usage:
python scripts/pipeline_generator.py <project-path> --platform=github|circleci --stages=build,test,deploy
2. Terraform Scaffolder
Generates, validates, and plans Terraform modules. Enforces consistent module structure and runs terraform validate + terraform plan before any apply.
Example — AWS ECS service module:
# modules/ecs-service/main.tf
resource "aws_ecs_task_definition" "app" {
family = var.service_name
requires_compatibilities = ["FARGATE"]
network_mode = "awsvpc"
cpu = var.cpu
memory = var.memory
container_definitions = jsonencode([{
name = var.service_name
image = var.container_image
essential = true
portMappings = [{
containerPort = var.container_port
protocol = "tcp"
}]
environment = [for k, v in var.env_vars : { name = k, value = v }]
logConfiguration = {
logDriver = "awslogs"
options = {
awslogs-group = "/ecs/${var.service_name}"
awslogs-region = var.aws_region
awslogs-stream-prefix = "ecs"
}
}
}])
}
resource "aws_ecs_service" "app" {
name = var.service_name
cluster = var.cluster_id
task_definition = aws_ecs_task_definition.app.arn
desired_count = var.desired_count
launch_type = "FARGATE"
network_configuration {
subnets = var.private_subnet_ids
security_groups = [aws_security_group.app.id]
assign_public_ip = false
}
load_balancer {
target_group_arn = aws_lb_target_group.app.arn
container_name = var.service_name
container_port = var.container_port
}
}
Usage:
python scripts/terraform_scaffolder.py <target-path> --provider=aws|gcp|azure --module=ecs-service|gke-deployment|aks-service [--verbose]
3. Deployment Manager
Orchestrates deployments with blue/green or rolling strategies, health-check gates, and automatic rollback on failure.
Example — Kubernetes blue/green deployment (blue-slot specific elements):
# k8s/deployment-blue.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
name: app-blue
labels:
app: myapp
slot: blue # slot label distinguishes blue from green
spec:
replicas: 3
selector:
matchLabels:
app: myapp
slot: blue
template:
metadata:
labels:
app: myapp
slot: blue
spec:
containers:
- name: app
image: ghcr.io/org/app:1.2.3
readinessProbe: # gate: pod must pass before traffic switches
httpGet:
path: /healthz
port: 8080
initialDelaySeconds: 10
periodSeconds: 5
resources:
requests:
cpu: "250m"
memory: "256Mi"
limits:
cpu: "500m"
memory: "512Mi"
Usage:
python scripts/deployment_manager.py deploy \
--env=staging|production \
--image=app:1.2.3 \
--strategy=blue-green|rolling \
--health-check-url=https://app.example.com/healthz
python scripts/deployment_manager.py rollback --env=production --to-version=1.2.2
python scripts/deployment_manager.py --analyze --env=production # audit current state
Resources
- Pattern Reference:
references/cicd_pipeline_guide.md— detailed CI/CD patterns, best practices, anti-patterns - Workflow Guide:
references/infrastructure_as_code.md— IaC step-by-step processes, optimization, troubleshooting - Technical Guide:
references/deployment_strategies.md— deployment strategy configs, security considerations, scalability - Tool Scripts:
scripts/directory
Development Workflow
1. Infrastructure Changes (Terraform)
# Scaffold or update module
python scripts/terraform_scaffolder.py ./infra --provider=aws --module=ecs-service --verbose
# Validate and plan — review diff before applying
terraform -chdir=infra init
terraform -chdir=infra validate
terraform -chdir=infra plan -out=tfplan
# Apply only after plan review
terraform -chdir=infra apply tfplan
# Verify resources are healthy
aws ecs describe-services --cluster production --services app-service \
--query 'services[0].{Status:status,Running:runningCount,Desired:desiredCount}'
2. Application Deployment
# Generate or update pipeline config
python scripts/pipeline_generator.py . --platform=github --stages=build,test,security,deploy
# Build and tag image
docker build -t ghcr.io/org/app:$(git rev-parse --short HEAD) .
docker push ghcr.io/org/app:$(git rev-parse --short HEAD)
# Deploy with health-check gate
python scripts/deployment_manager.py deploy \
--env=production \
--image=app:$(git rev-parse --short HEAD) \
--strategy=blue-green \
--health-check-url=https://app.example.com/healthz
# Verify pods are running
kubectl get pods -n production -l app=myapp
kubectl rollout status deployment/app-blue -n production
# Switch traffic after verification
kubectl patch service app-svc -n production \
-p '{"spec":{"selector":{"slot":"blue"}}}'
3. Rollback Procedure
# Immediate rollback via deployment manager
python scripts/deployment_manager.py rollback --env=production --to-version=1.2.2
# Or via kubectl
kubectl rollout undo deployment/app -n production
kubectl rollout status deployment/app -n production
# Verify rollback succeeded
kubectl get pods -n production -l app=myapp
curl -sf https://app.example.com/healthz || echo "ROLLBACK FAILED — escalate"
Troubleshooting
Check the comprehensive troubleshooting section in references/deployment_strategies.md.
Install this Skill
Skills give your AI agent a consistent, structured approach to this task — better output than a one-off prompt.
npx skills add alirezarezvani/claude-skills --skill engineering-team/senior-devops Community skill by @alirezarezvani. Need a walkthrough? See the install guide →
Works with
Prefer no terminal? Download the ZIP and place it manually.
Details
- Category
- Development
- License
- MIT
- Author
- @alirezarezvani
- Source
- GitHub →
- Source file
-
show path
engineering-team/senior-devops/SKILL.md
People who install this also use
Senior Software Architect
Design system architecture with C4 and sequence diagrams, write Architecture Decision Records, evaluate tech stacks, and guide architectural trade-offs.
@alirezarezvani
Senior Backend Engineer
REST and GraphQL API development, database schema optimization, authentication patterns, and backend architecture decisions from a senior engineer.
@alirezarezvani
Senior Security Engineer
Threat modeling, penetration testing guidance, zero-trust architecture design, and security code review from a senior security engineering perspective.
@alirezarezvani