Job Description
Role Number: 200628858-1242
Summary
The Speech Team within the Siri organization drives major speech recognition, synthesis, and speech-to-speech model changes for various features deeply embedded throughout Apple’s ecosystem. Our mission is to build cutting-edge infrastructure, datasets, and models that empower Siri conversational AI, Dictation, and various speech-enabled Apple Intelligence features with powerful capabilities across natural language understanding, dialog generation, speech recognition, and multimodal interaction. We apply these technologies to create engaging, intelligent, and personalized conversational experiences for millions of Apple users.
We believe that the most impactful breakthroughs in deep learning emerge when we address real-world problems at scale. We develop speech-to-speech experiences and the underlying multimodal foundation model technology for current and future speech-enabled features across Apple’s software, hardware, and services ecosystem. This allows for cutting-edge applied research anchored in Apple-specific production needs, while improving speech interaction experiences for Apple’s customers around the world.
Description
You will work alongside a fast-growing team of world-class engineers and scientists to tackle core problems in dialog systems and foundation models, ranging from natural language understanding and multi-turn context tracking to the integration of speech, text, and other modalities. You will develop and deploy novel deep learning technologies that make Siri more intelligent, natural, and useful. You’ll help us advance the state of the art in natural language processing, speech and audio modeling, and multimodal learning, with a strong focus on bringing your innovations into production. Your ideas will directly impact the daily lives of billions of users through Siri.
Minimum Qualifications
Demonstrated expertise in deep learning, with a publication record at relevant conferences (e.g., NeurIPS, ICML, ICLR, CVPR, ICCV, ECCV, KDD, ACL, ICASSP, Interspeech) or a track record of applying deep learning techniques to products
Proficiency in Python and at least one deep learning toolkit such as PyTorch, JAX, or TensorFlow
Master's degree or PhD in Computer Science or another technical field, or equivalent industry experience
Preferred Qualifications
Experience with conversational AI or multimodal LLMs
Experience with large-scale machine learning training and evaluation
A data-centric perspective on foundation models
Apple is an equal opportunity employer that is committed to inclusion and diversity. We seek to promote equal opportunity for all applicants without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, disability, Veteran status, or other legally protected characteristics. Learn more about your EEO rights as an applicant (https://www.eeoc.gov/sites/default/files/2023-06/22-088_EEOC_KnowYourRights6.12ScreenRdr.pdf).