DevOps Engineer
Twenty
Location
Arlington, VA
Employment Type
Full time
Location Type
On-site
Department
Engineering
About the Role
We are seeking a DevOps Engineer (Software Engineer, Infrastructure) focused on reliability to help design, build, and operate our application platform. This position sits at the intersection of software engineering, operations, and security. You will own core AWS infrastructure, drive observability and reliability improvements, and collaborate closely with product and backend teams. Common backgrounds for this role include DevOps Engineer, SRE, Platform Engineer, and Backend Engineer (Infrastructure).
About the Company
At Twenty, we're taking on one of the most critical challenges of our time: ensuring democracies prevail in the digital age. We develop revolutionary technologies that operate at the intersection of cyber and electromagnetic domains, where the speed and complexity of operations exceed human cognition. Our team doesn't just solve problems – we deliver game-changing outcomes that directly improve national security. We're pragmatic optimists who know that while our mission of defending America and its allies is challenging, we can succeed.
What You’ll Do
Design, build, and operate AWS-based infrastructure.
Implement and maintain Infrastructure-as-Code for single-tenant and multi-tenant environments using Terraform.
Build and maintain deployment and environment automation (Ansible or similar).
Own and evolve CI/CD pipelines.
Design, implement, and refine observability: metrics, logs, traces, dashboards, and alerting.
Partner with application teams on architecture decisions, performance tuning, and operational readiness.
Contribute to security and governance: IAM policies, network security, secrets management, and security scanning.
Document systems, patterns, and runbooks so others can operate and extend the platform reliably.
Must Haves
Experience administering infrastructure and operating applications deployed on AWS.
Experience using Terraform to manage single-tenant and multi-tenant systems.
Strong instincts and practical experience with:
IP networking (VPCs, routing, subnets, proxies, DNS).
Network security (security groups, NACLs, firewalls).
PKI management (TLS certificates, CAs, mTLS, certificate lifecycle).
Should Have
Experience with Ansible or another deployment automation/configuration management framework.
Experience with GitHub Actions or another CI/CD platform (GitLab CI/CD, CircleCI, etc.).
Experience working on reliability projects, such as:
Setting up alert management tools and on-call practices.
Debugging failures in distributed systems.
Experience setting up observability and operations programs, including:
Collecting and representing telemetry in Grafana (dashboards, panels, alerts).
Instrumenting applications using OpenTelemetry or other log/metric/trace aggregation frameworks.
Experience managing PostgreSQL databases in production (backups, migrations, performance, monitoring).
Experience managing a pub/sub or queue technology, such as NATS, RabbitMQ, Kafka, AWS SQS, or Google Pub/Sub.
Familiarity with secrets management (AWS SSM/Secrets Manager, Vault, or similar).
Nice to Have
Experience or strong interest in cybersecurity (threat modeling, hardening, secure defaults).
Proficiency in Python or another scripting language for tooling and automation.
Experience operating a security scanning tool, such as Trivy (or similar vulnerability/container scanners).
Experience or interest working with large datasets (performance, storage trade-offs, retention policies).
Experience managing graph databases, such as Neo4j, AWS Neptune, or similar.
Experience designing or contributing to runbooks and internal platform documentation for non-infrastructure teams.