Enterprise Architect
Nurix
IT
Egypt
Key Responsibilities
- Design and evolve the end-to-end infrastructure supporting ASR/TTS, LLM orchestration, Agentic RAG, and self-learning workflows.
- Architect low-latency pipelines for real-time conversational AI, ensuring sub-second response times across voice and chat.
- Build multi-cloud, distributed systems (AWS, GCP, Azure) with elastic scaling to handle spiky workloads.
- Define and enforce SLAs around latency, uptime, and throughput for AI services.
- Drive observability, monitoring, and resilience strategies to handle failures gracefully.
- Optimize GPU/TPU utilization for cost-effective training and inference.
- Partner with InfoSec to embed security-by-design across all AI/ML workloads.
- Implement controls to protect sensitive enterprise data while meeting global compliance standards (SOC 2, ISO 27001, GDPR, DPDP).
- Work closely with the Head of AI to translate cutting-edge research into production-grade platforms.
- Provide technical mentorship to engineering teams, ensuring best practices in distributed systems and infra design.
- Evaluate and adopt emerging technologies (e.g., SSMs and inference optimizers such as Triton, Riva, and vLLM) to stay ahead of the curve.
Required Qualifications & Skills
- 10-15 years of experience in large-scale systems architecture, with at least 5 years in principal/architect-level roles.
- Proven expertise in distributed systems, cloud-native architectures, and real-time pipelines.
- Hands-on experience with containerization, orchestration (Kubernetes), and microservices.
- Strong background in scalable ML infrastructure, including model serving, GPU/accelerator utilization, and CI/CD for ML.
- Demonstrated ability to architect systems with low latency (<300 ms), high throughput, and enterprise reliability.
- Experience in conversational AI, speech systems, or real-time inference workloads.
- Deep knowledge of MLOps platforms (Kubeflow, MLflow, Vertex AI, SageMaker).
- Familiarity with state-of-the-art inference optimization frameworks (e.g., Triton, NVIDIA Riva, vLLM, SGLang).
- Open-source contributions or patents in distributed systems, infra, or ML tooling.