Hailemichael Atrsaw Tibebu

DevOps Engineer | SRE | Platform Engineer

Hi there 👋

I'm a DevOps Engineer at Exponent, where I architect and operate production infrastructure across 6 cloud providers (AWS, Azure, GCP, DigitalOcean, Infomaniak, Exoscale). With 4+ years of production experience, I currently manage 12+ Kubernetes clusters, 30+ microservices, and 12+ production environments, maintaining 99.9% uptime for systems serving 20k+ daily active users.

I specialize in building automation that scales: I've reduced deployment times from 45 minutes to 8, cut service onboarding time by 80%, and decreased infrastructure operations by 70% through custom tooling. My work includes engineering a Multi-Cloud K8s Provisioner managing 50+ nodes, architecting a Cloud-Native Database Backup Orchestrator protecting 30+ PostgreSQL instances with a 99.99% success rate, and creating reusable infrastructure libraries adopted by 10+ engineering teams. I've also optimized cloud spending, achieving a 35% cost reduction through automated lifecycle management.

Beyond infrastructure automation, I design enterprise observability stacks (Prometheus, Grafana, Loki, Mimir), implement zero-trust security patterns, lead on-call rotations, and build CI/CD pipelines that enforce security best practices. Previously, I developed high-performance APIs as a Backend Developer, giving me a deep understanding of both the infrastructure and application layers. I believe the best DevOps engineers are force multipliers who enable teams to ship faster, fail safer, and sleep better.

Featured Projects

Multi-Cloud K8s Provisioner

Custom automation framework for provisioning self-hosted RKE2 clusters using custom CRDs and Terraform. Manages 8 production clusters and 50+ nodes, with a 95% reduction in manual deployment effort.

Python, RKE2, Terraform, Kubernetes

Cloud-Native Database Backup Orchestrator

K8s-native backup automation tool that auto-discovers and backs up 30+ database instances with a 99.99% success rate. Supports multi-cloud retention policies and offsite synchronization.

Python, Kubernetes API, S3, PostgreSQL

Versatile Helm Chart

General-purpose Helm chart library adopted by 60+ microservices, reducing new service onboarding time by 80%. Includes patterns for Ingress NGINX, Traefik, cert-manager, and autoscaling.

Helm, Kubernetes, YAML

Infrastructure Libraries

Versioned library of 20+ reusable IaC modules and CI templates adopted across 10 engineering teams. Standardizes VPC peering, EKS hardening, and multi-stage builds.

Terraform, Terragrunt, GitLab CI, GitHub Actions

Core Expertise

Cloud Platforms

AWS, Azure, GCP, DigitalOcean, Infomaniak, Exoscale, Hetzner

Orchestration

Kubernetes (EKS, AKS, GKE, RKE2), Docker, Helm

Infrastructure as Code

Terraform, Terragrunt, Ansible, Packer

Observability

Prometheus, Grafana, Loki, Mimir, CloudWatch

CI/CD

GitLab CI, GitHub Actions, Azure DevOps

Programming

Python (FastAPI, Django), Bash

Resume

Download CV (PDF)