Job Details

Job Information

Machine Learning Inference Performance Engineer
AWM-5975-Machine Learning Inference Performance Engineer
3/21/2026
3/26/2026
Negotiable
Permanent

Other Information

www.apple.com
Sunnyvale, CA, 94086, USA

Job Description

Role Number: 200652437-3956

Summary

At Apple, we're on the cutting edge of delivering transformative experiences through Artificial Intelligence. If you're passionate about advancing AI and hardware optimization, we want you to join our team. As a Senior Machine Learning Performance Engineer, you will help push the boundaries of on-device generative AI performance and efficiency, designing and implementing novel techniques to optimize large-scale machine learning workloads on the Apple Neural Engine (ANE). You will work at the intersection of machine learning, systems, and hardware architecture, shaping how next-generation AI models run across millions of Apple devices.

This is a unique opportunity to contribute to technologies that directly impact the daily experience of Apple customers worldwide.

Description

In this role, you will play a critical part in enabling state-of-the-art machine learning workloads on Apple silicon. You will collaborate closely with model developers, machine learning researchers, compiler engineers, and hardware architects to deliver highly optimized inference performance.

You will:

  • Develop novel ML inference optimization strategies to improve performance and power efficiency on the ANE.
  • Analyze and identify performance bottlenecks across the full stack, including model architecture, compiler, runtime, and hardware.
  • Partner with Apple AI/ML, software, and silicon teams to co-design next-generation ML models and inference techniques optimized for the ANE.
  • Build and maintain performance profiling tools, benchmarking frameworks, and analysis infrastructure.
  • Drive performance characterization, modeling, and optimization for large-scale ML workloads such as LLMs, diffusion models, and computer vision models.

Minimum Qualifications

  • BS in Computer Science, Computer Engineering, or a related field

  • Minimum 3 years of experience in system performance analysis, machine learning systems, or hardware/software optimization

  • Strong debugging, performance analysis, and problem-solving skills

Preferred Qualifications

  • MS or PhD in Computer Science, Machine Learning, Computer Architecture, or a related field

  • Experience with ML system optimization, performance modeling, or architecture evaluation

  • Experience developing profiling tools, benchmarking frameworks, or performance analysis tools

  • Familiarity with hardware accelerators or ML compiler stacks

  • Experience with large-scale ML workloads such as transformers, diffusion models, or mixture-of-experts architectures

  • Strong analytical skills with the ability to analyze large datasets and communicate insights clearly

  • Excellent written and verbal communication skills

Apple is an equal opportunity employer that is committed to inclusion and diversity. We seek to promote equal opportunity for all applicants without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, disability, Veteran status, or other legally protected characteristics. Learn more about your EEO rights as an applicant (https://www.eeoc.gov/sites/default/files/2023-06/22-088_EEOC_KnowYourRights6.12ScreenRdr.pdf).
