Job Details

Job Information

Applied Research Engineer — Multimodal AI
AWM-1525-Applied Research Engineer — Multimodal AI
3/13/2026
3/18/2026
Negotiable
Permanent

Other Information

www.apple.com
Sunnyvale, CA, 94086, USA

Job Description


Role Number: 200649931-3956

Summary

The Video Computer Vision (VCV) organization is a centralized applied research and engineering team developing real-time, on-device Computer Vision and Machine Perception technologies across Apple products. Within VCV, the Multimodal Intelligence team builds next-generation multimodal AI systems that combine large language models, multimodal LLMs, and foundation models to create intelligent systems capable of understanding, reasoning, and acting across language, vision, audio, and tools. We develop multimodal agentic systems deeply integrated into the Apple ecosystem, partnering with hardware, software, and ML teams to deliver advanced AI through real-time, scalable, and privacy-preserving experiences that reach millions of users.

Description

We are seeking an Applied Research Engineer with demonstrated experience in applied machine learning and generative AI. In this role, you'll build multimodal agent systems, translating research ideas into production-ready solutions for Apple products. Your work will span building and shipping agentic systems as well as developing evaluation tools and frameworks: understanding limitations, systematically analyzing failure modes, and creating methodologies to improve robustness, safety, and real-world generalization.
This role offers the unique opportunity to impact millions of users while working at the intersection of research and production. You'll collaborate with world-class researchers and engineers, access cutting-edge hardware platforms, and solve novel problems in on-device AI that few organizations can address. Your work will directly shape how people interact with multimodal AI in Apple products.

This role spans multiple dimensions of agent development, including context optimization, multi-turn orchestration, tool use, planning, evaluation, robustness analysis, and system optimization. Depending on experience and strengths, candidates may focus on different aspects of the agent lifecycle while contributing to the shared mission of building next-generation multimodal agents.

Minimum Qualifications

  • MS in Computer Science, Machine Learning, AI, Robotics, or a related field (or equivalent practical experience)

  • Strong foundation in machine learning and deep learning, including Agentic AI, reasoning, and large-scale models

  • Demonstrated experience building production systems with LLMs and/or multimodal LLMs

  • Proficiency in Python and in modern deep learning frameworks (PyTorch preferred)

Preferred Qualifications

  • PhD with research experience in agentic systems, reasoning, planning, or reinforcement learning

  • Hands-on experience building agents, including planning, task decomposition, tool use, and multi-step reasoning

  • Experience with language or multimodal foundation models (training, fine-tuning, context optimization)

  • Expertise in designing evaluation methodologies for agentic systems: scenario generation, failure analysis, robustness testing, and safety validation

  • Publication record in top-tier venues (NeurIPS, ICML, ICLR, CVPR, etc.) is a plus

Apple is an equal opportunity employer that is committed to inclusion and diversity. We seek to promote equal opportunity for all applicants without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, disability, Veteran status, or other legally protected characteristics. Learn more about your EEO rights as an applicant (https://www.eeoc.gov/sites/default/files/2023-06/22-088_EEOC_KnowYourRights6.12ScreenRdr.pdf) .

