Software Engineer - Inferencing


  • Design, implement and deliver service infrastructure to support service expansion in regions and clouds; strategize and codify capacity management to meet customer demand.
  • Deliver world-class monitoring systems and telemetry pipelines to enhance service and job observability for both end-users and operators.
  • Design and implement release and deployment infrastructure to scale service deployments to thousands of clusters while continuing to increase our release cadence and agility.
  • Design and build change management systems that orchestrate and automatically ensure the safety and correctness of any change made to the production system.
  • Codify security and compliance requirements by building and strengthening system defenses against malicious attacks and exploits.
  • Use data-driven and machine learning approaches to build quality and operational insights; leverage those insights to drive quality and operational excellence across pre- and post-production pipelines.
  • Design and implement performance and scalability infrastructure that focuses on methodically calibrating data at scale to ensure meaningful characterizations and comparisons.
  • Leverage performance and profiling tools to identify hot spots and bottlenecks across hardware and software boundaries, from CPU, GPU, microcode, OS, and networking to product code, and drive end-to-end job performance.


Required Qualifications:

  • 3+ years of coding experience in at least one of C, C++, C#, or Java.
  • Experience with improving service operations or engineering fundamentals.
  • Excellent collaboration skills.
  • A Master’s degree (or a Bachelor’s degree with 3+ years of equivalent work experience) in computer science or a related field.
  • At least 2 years of experience building and shipping production software or services.


Preferred Qualifications:

  • Proven ability to create componentized and well-architected software
  • Prior experience in building large scale cloud services, distributed systems, or operating systems
  • Understanding of TensorFlow and PyTorch runtimes - a plus
  • Experience programming GPUs (graphics processing units), CUDA/cuDNN/NCCL - a plus
  • Experience programming FPGAs (field-programmable gate arrays) - a plus


The ability to meet Microsoft, customer, and/or government security screening requirements is required for this role. These requirements include, but are not limited to, the following specialized security screening: Microsoft Cloud Background Check. This position will be required to pass the Microsoft Cloud Background Check upon hire/transfer and every two years thereafter.



Microsoft is an equal opportunity employer. All qualified applicants will receive consideration for employment without regard to age, ancestry, color, family or medical care leave, gender identity or expression, genetic information, marital status, medical condition, national origin, physical or mental disability, political affiliation, protected veteran status, race, religion, sex (including pregnancy), sexual orientation, or any other characteristic protected by applicable laws, regulations and ordinances.  We also consider qualified applicants regardless of criminal histories, consistent with legal requirements.