
Applied Scientist

Sanas

Bengaluru, Karnataka, India
Posted on Tuesday, June 4, 2024
Sanas is revolutionizing the way we communicate with the world’s first real-time algorithm designed to modulate accents, eliminate background noise, and improve speech clarity. Pioneered by seasoned startup founders with a proven track record of creating and steering multiple unicorn companies, our groundbreaking GDP-shifting technology sets a gold standard. Our initial deployment is laser-focused on elevating the standards of customer experience centers. Testimonials from our partners reveal staggering double-digit improvements in mission-critical KPIs, coupled with boosts in CSAT and NPS. More than just a tool, our technology champions a bias-free workspace. This not only fosters a positive work environment but has also been instrumental in reducing employee attrition and curbing training expenditures.

Sanas is a 70-strong team, established in 2020. In this short span, we’ve successfully secured over $50 million in funding. Our innovations have been supported by the industry’s leading investors, including Insight Partners, Google Ventures, General Catalyst, and Quiet Capital. Our reputation is further solidified by collaborations with numerous Fortune 100 companies. With Sanas, you’re not just adopting a product; you’re investing in the future of communication.

Sanas is seeking a detail-oriented and self-motivated individual to join the ML Analytics Org as an Applied Scientist. As a seasoned Applied Science professional, you will be responsible for developing proofs of concept (PoCs) and building solutions for ambiguous and hard problems in speech evaluation. You will work closely with the Linguists, Scientists, and Engineers on the team to provide high-quality solutions that enable accurate speech evaluations at scale.

Key Responsibilities

  • Create working solutions and PoCs for ambiguous and hard problems in speech evaluation
  • Create solutions to analyze speech data at scale with heuristics and ML-based automation
  • Create data visualizations to support PoCs and communicate a narrative
  • Write technical reports with actionable recommendations for improving model evaluation and performance, enabling data-driven decision making for leaders
  • Create PoCs based on different speech and linguistic features to automate various effort-intensive parts of existing evaluation processes
  • Conduct experiments with data sampling techniques to optimize evaluation test-sets for representativeness and improve evaluation time and cost
  • Create and maintain augmented datasets focused on different speech properties, including prosody, quality, legibility, etc.




Basic Qualifications

  • A master’s degree in Computer Science and Engineering with good hands-on exposure to NLU/NLP/NLG
  • Native or near-native fluency in English with exposure to accent nuances
  • Excellent listening, comprehension, writing and presentation skills
  • Excellent organizational and analytical skills and attention to detail
  • Excellent Python or Java coding and Shell scripting skills
  • 7+ years’ track record in R&D, with publications in top-tier (preferably Q1) journals, and in delivering efficient NLU/NLP/NLG solutions for large-scale enterprises
  • 4+ YoE of data processing, visualization, and analysis across different data modalities
  • 4+ YoE of software development with different development methodologies, including Agile
  • 4+ YoE of version control and iterative software development
  • 4+ YoE of working with diverse, cross-functional, global teams




Preferred Qualifications

  • A PhD in Computer Science or Computational Linguistics
  • Exposure to IPA, SAMPA and other speech annotation frameworks
  • Knowledge of Databases and ETL pipelines
  • Experience in coaching junior team members on functional skills and best practices
  • Passion for problem solving, taking ownership, and delivering results




ML Analytics Organization at SANAS AI

SANAS AI's ML Analytics Organization is a dynamic hub of innovation, dedicated to reshaping analytics in speech technology. Comprising diverse professionals including seasoned Applied Scientists, Speech Scientists, Computational Linguists, Data Linguists, and Software Developers, our team pioneers advanced evaluation metrics and frameworks for speech models. We empower large-scale speech evaluations with a self-serve platform. Additionally, we manage a robust data annotation platform catering to diverse data modalities internally. At the forefront of science initiatives, we drive accuracy, automation, and efficiency, ensuring objective and exhaustive evaluations of speech quality. Our commitment to advancing the field positions us as leaders in revolutionizing the landscape of speech technology.

A day in the life of Surjeet, an Applied Scientist in the ML Analytics Org

As an Applied Scientist in the ML Analytics team, Surjeet uses their in-depth expertise in signal processing and machine learning to solve hard research problems. Their work often delves into problem spaces with no prior academic or industrial research, where the challenge is to outperform the existing SOTA bars.

Currently, Surjeet is working on accent identification and profiling Mother Tongue Influence (MTI) in Indian English speakers to improve the quality of speech synthesis. Surjeet starts their day with exploratory analysis to select the speech features relevant for accent and MTI identification, then runs some modelling experiments with the selected features. Later on, Surjeet reviews the results of a proof-of-concept (PoC) with Surjeet, a Speech Engineer in the ML Analytics Org. Surjeet shares the feasibility and trade-offs of implementing the proposed PoC in the production pipeline with the head of the Org and, after a go decision, partners with a Data Engineer to implement it. Towards the end of the day, Surjeet does a yearly goal review with stakeholders from the Science and ML-Ops teams.