Location
Paris or remote (time zones from GMT-5 to GMT+2, to ensure sufficient overlap with the rest of the team).
How we work
We move fast on hard problems in a nascent market with no set playbook: navigating uncertainty is part of the job. You’ll be challenged: anyone can question work, and decisions must be justified. We keep a high bar and match it with high support: we help each other unblock and share context openly, with low ego. More about our values: morpho.org/jobs.
Role
Enable Morpho to ship reliable, high‑scale products faster by building and operating a resilient, secure, and cost‑efficient infrastructure platform that empowers engineers and keeps tier‑1 services within agreed SLOs.
Responsibilities
- Architect multichain production infrastructure with Kubernetes and AWS to achieve ≥99.95% uptime while lowering unit costs via autoscaling and right‑sizing.
- Build self‑service platform capabilities with CI/CD and GitOps to cut p50 deploy time from hours to minutes and enable safe, autonomous rollbacks.
- Design and operate high‑throughput data systems with event‑driven pipelines to process billions of on‑chain events with sub‑minute latency.
- Implement end‑to‑end observability with Prometheus, Grafana, and OpenTelemetry to enable proactive alerting and rapid incident triage.
- Establish zero‑trust IAM and secrets management with fine‑grained access controls and automated credential rotation to raise the security baseline.
- Lead post‑mortems and drive RFC‑based architecture decisions to improve reliability, developer experience, and time to recovery across teams.
- Mentor engineers across product and protocol teams to spread best practices in distributed systems, performance, and reliability engineering.
What Success Looks Like
- In your first 30 days
  - Understand Morpho’s services, deployment patterns, on‑chain integrations, and incident history. Set clear SLOs and a roadmap for platform improvements.
  - Ship quick wins in CI/CD or observability that remove friction for engineers and increase visibility on tier‑1 services.
- By Month 4–6
  - Deliver a self‑service deployment pipeline with progressive delivery and automated rollbacks. MTTR trending down. Tier‑1 services fully instrumented with actionable alerts.
  - Implement zero‑trust IAM foundations with auditable access and safe secrets handling. Document and socialize runbooks for high‑impact incidents.
- By Month 12
  - Demonstrate multi‑region readiness for a tier‑1 surface meeting RTO/RPO targets. Platform unit cost reduced by 20–30% while maintaining ≥99.9% SLO compliance.
  - Trusted partner to hiring managers and tech leads, influencing roadmaps and unblocking hard infrastructure and data challenges.