AI Research Scientist/Engineer
SonarSource
Software Engineering, Data Science
Singapore
Position description
What you will do
- Spearhead Research & Innovation: Stay on the cutting edge of ML, Deep Learning, and LLMs, specifically their application to the Software Development Lifecycle (SDLC), and identify novel opportunities to enhance our products.
- Develop Advanced AI Models: Design, prototype, and validate novel ML models that identify and resolve complex bugs, vulnerabilities, and code smells, going beyond the capabilities of traditional static analysis.
- Build LLM-Powered Features: Develop and implement advanced LLM-based solutions, including Retrieval-Augmented Generation (RAG) for contextual code analysis, fine-tuning models on proprietary codebases, and exploring agentic systems for automated code remediation.
- Engineer Data Pipelines: Build and manage robust data pipelines to gather, process, and version massive code-centric datasets required for training and evaluating specialized models at scale.
- Translate Prototypes to Products: Collaborate closely with engineering and product teams to integrate successful ML prototypes into Sonar's cutting-edge products, ensuring they meet the needs of our global user base.
- Communicate and Evangelize: Clearly articulate and document complex technical concepts and research findings to both technical and non-technical stakeholders.
Experience and qualifications
- An advanced academic background (Master’s or PhD) in Computer Science, Machine Learning, or a related quantitative field.
- Strong industry experience in machine learning, with a solid understanding of modern software engineering practices and tools.
- Solid programming skills in Python and hands-on experience with core ML/DL frameworks (e.g., PyTorch, TensorFlow, Hugging Face). Familiarity with Java is a plus.
- Proven experience in applied Machine Learning, with a strong focus on Natural Language Processing (NLP) or, ideally, Programming Language Processing (PLP).
- Hands-on experience with modern LLM architectures and techniques, such as fine-tuning strategies (e.g., LoRA, QLoRA), advanced prompt engineering, building and optimizing Retrieval-Augmented Generation (RAG) pipelines, and working with vector databases and semantic search.
- Experience with large-scale data processing frameworks and cloud infrastructure (e.g., AWS).
- Experience driving research projects from initial ideation to a demonstrable prototype with a high degree of autonomy.
- Excellent communication skills in English and a talent for explaining complex scientific topics clearly and concisely.