Principal Engineer (Cloud Platforms)
ThoughtSpot
Bengaluru, Karnataka, India
About the Role:
We are looking for a Principal Engineer to join our Cloud Platform team and take ownership of the architecture and evolution of our multi-tenant SaaS platform. You will work at the intersection of backend engineering and cloud infrastructure, designing and building the systems that power our control plane and data plane at scale.
This is a high-impact, high-ownership role. You will be a technical anchor for the team: driving architecture decisions, mentoring engineers, and partnering with product and infrastructure leaders to shape the future of our cloud platform.
What You Will Do:
Architecture & Design
- Own end-to-end architecture for the cloud platform, spanning control plane and data plane
- Design multi-tenant systems with strong isolation, security, and resource governance
- Define platform abstractions that work across hybrid environments: AWS, GCP, Azure, and on-premises
- Drive architectural reviews and RFCs, and ensure decisions are well-reasoned, documented, and scalable
Control Plane
- Architect and build systems for tenant provisioning, lifecycle management, and configuration
- Design cluster orchestration and management systems that operate reliably at scale
- Build APIs and automation that enable self-service for operators and tenants
- Ensure the control plane is highly available, auditable, and observable
Data Plane
- Design data path components for high-throughput, low-latency workloads
- Build and enforce isolation boundaries between tenants at the data layer
- Optimize for performance, reliability, and cost efficiency at scale
Engineering Excellence
- Set technical direction and coding standards for the platform team
- Identify and address systemic risks: reliability, scalability, security, and operability
- Partner with SRE and DevOps on observability, incident response, and capacity planning
Mentorship & Leadership
- Mentor and grow senior and mid-level engineers
- Contribute to hiring: define the bar, conduct interviews, and help build the India platform team
- Collaborate closely with engineering and product teams across geographies
Must Haves:
SaaS Platform & Multi-Tenancy
- Hands-on experience building or operating SaaS platforms at mid-to-large scale
- Deep understanding of multi-tenant architecture patterns: silo, pool, and bridge models; noisy neighbor mitigation; tenant-level resource quotas and throttling
- Experience with tenant lifecycle management: onboarding, provisioning, configuration drift detection, and offboarding automation
- Knowledge of data isolation strategies: schema-per-tenant, database-per-tenant, row-level security, and encryption-at-rest per tenant
Control Plane & Data Plane
- Proven experience with control plane / data plane separation in databases, networking platforms, or cloud-native SaaS systems
- Experience building cluster management and orchestration systems: Kubernetes operators, custom controllers, or equivalent systems managing fleets of workloads
- Hands-on experience with Kubernetes internals: CRDs, admission webhooks, controllers, RBAC, and namespace isolation
- Understanding of configuration management at scale: GitOps workflows, feature flag systems, and dynamic config propagation across distributed environments
Cloud & Hybrid Infrastructure
- Deep expertise in at least two of AWS, GCP, and Azure: VPC design, IAM, managed services (EKS/GKE/AKS, RDS, S3 or equivalent), and cost optimization
- Experience designing systems that span hybrid cloud environments: consistent networking, identity federation, and workload portability across cloud and on-premises
- Strong understanding of infrastructure as code: Terraform, Pulumi, or CDK for managing cloud resources programmatically
- Familiarity with service mesh technologies: Istio, Linkerd, or Envoy for traffic management, mTLS, and observability in microservices environments
Distributed Systems
- Strong understanding of distributed systems fundamentals: consensus protocols, leader election, distributed locking, and failure detection
- Experience designing for high availability and fault tolerance: active-active, active-passive, circuit breakers, bulkheads, and graceful degradation
- Hands-on experience with observability tooling: structured logging, distributed tracing (OpenTelemetry, Jaeger), and metrics-based alerting (Prometheus, Grafana)
- Experience with chaos engineering or resilience testing practices to validate system behavior under failure
Security & Compliance
- Understanding of zero-trust security principles in multi-tenant SaaS: identity-aware proxies, workload identity, and least-privilege access
- Experience with secrets management (Vault, AWS Secrets Manager, or equivalent) integrated into automated provisioning pipelines
- Familiarity with compliance frameworks relevant to SaaS platforms: SOC 2, ISO 27001, or GDPR data residency requirements at the infrastructure level
Good to Haves:
- Exposure to FinOps practices: cost attribution per tenant, showback/chargeback models, and cloud cost anomaly detection
- Familiarity with eBPF or low-level networking for custom observability or performance optimization
- Contributions to open-source infrastructure or platform projects