Software Engineer - Singularity, Distributed Scheduler

Who We Are

We are the engineers on the Singularity team. We believe that building a planet-scale AI supercomputer from the ground up, one that addresses the fundamental pain points of data scientists and AI practitioners and takes AI to unprecedented scale, is the opportunity of a lifetime. If you share this dream, come join us!

What is Singularity?

Ultimately, democratization of AI is all about enabling data scientists to productively build, scale, experiment, and iterate their models on top of a robust, performant, scalable and cost-effective distributed infrastructure built for AI.

In Singularity, we are constantly seeking to apply the best ideas from AI, machine learning, distributed systems, databases, information retrieval, networking, and security.

Who You Are

As a software engineer, you will shape the future of AI compute technology for deep learning training and inferencing. In this role, you will have the chance to work on the distributed scheduler, which is responsible for scheduling the compute resources required for training or inferencing of a given model in a topology-aware manner. The scheduler makes sure that all the resources (e.g., accelerator devices) are available at the same time and in close proximity (honoring the locality constraint), bringing distributed deep learning training and inferencing to life. You will lead development from the front and establish the architecture, coding guidelines, and quality bar.
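
To give a flavor of the problem, here is a minimal sketch of all-or-nothing, locality-aware placement. It is an illustration only, not how the Singularity scheduler is implemented: the `Node` type, the use of a rack as the locality domain, and the `gang_schedule` function are all hypothetical names chosen for this example, and the function only computes a placement without reserving anything.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    rack: str          # hypothetical locality domain
    free_devices: int  # accelerators currently idle on this node


def gang_schedule(nodes, devices_needed):
    """Return {node_name: device_count} placing the whole job in one rack, or None.

    All-or-nothing: either every requested device fits in a single rack,
    or no placement is returned and the job keeps waiting.
    """
    # Group nodes by rack so locality can be checked per domain.
    racks = {}
    for node in nodes:
        racks.setdefault(node.rack, []).append(node)

    # Keep only racks that can hold the whole job, and prefer the tightest
    # fit so that larger racks stay free for larger jobs.
    candidates = [
        (sum(n.free_devices for n in members), rack)
        for rack, members in racks.items()
    ]
    feasible = sorted(c for c in candidates if c[0] >= devices_needed)
    if not feasible:
        return None

    _, chosen = feasible[0]
    placement = {}
    remaining = devices_needed
    # Use the nodes with the most free devices first, so the job spans
    # as few nodes as possible.
    for node in sorted(racks[chosen], key=lambda n: -n.free_devices):
        if remaining == 0:
            break
        take = min(node.free_devices, remaining)
        if take > 0:
            placement[node.name] = take
            remaining -= take
    return placement
```

In this sketch, a request larger than any single rack is rejected even when the cluster as a whole has enough idle devices, which is one simple way to honor a locality constraint. A real scheduler would also have to weigh multi-level topology, preemption, fairness, and failures.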

You should have:

  • Experience in engineering leadership and scalable/distributed systems
  • Experience with building schedulers is a plus
  • Experience with programming in C++ and Python
  • Strong grounding in machine learning fundamentals and AI algorithms
  • Great analytical skills and learning agility
  • Rigor to drive change and pursue results
  • Ability to navigate ambiguity and deliver results in a dynamic environment
  • Capacity to drill deep through software layers

Requirements:

  • BS or higher in Computer Science or related discipline (or equivalent experience)
  • 5+ years of industry experience designing, developing and shipping high quality scalable software and services
  • Strong design, implementation and testing skills
  • Managed and native code development experience
  • Experience in using / extending PyTorch/TensorFlow is a plus
  • Experience in programming hardware accelerators such as GPUs is a plus
  • Experience with parallel programming (pthreads, MPI, OpenMP, etc.) is a plus
  • Experience diagnosing and debugging system performance issues, using appropriate tools and techniques

Great if you have any of the following under your belt:

  • Large scale stateful and stateless services
  • Native Windows or Linux development experience
  • Performance profiling
  • Strong written and oral communication skills

We are committed to an inclusive and diverse culture.

What You'll Do  

  • Build a new platform service from the ground up that will become a major driver for cutting-edge AI
  • Grow into a senior technical or organizational leader
  • Be part of the Azure platform at a time of growth as we surpass the competition


Join our mission and help us shape the future of planet-scale AI and solve the pain points of data scientists developing bleeding-edge AI!


How to Apply

Send your resume to JoinSingularity@microsoft.com with pointers to the code you are most proud of.

***

Microsoft is an equal opportunity employer. All qualified applicants will receive consideration for employment without regard to age, ancestry, color, family or medical care leave, gender identity or expression, genetic information, marital status, medical condition, national origin, physical or mental disability, political affiliation, protected veteran status, race, religion, sex (including pregnancy), sexual orientation, or any other characteristic protected by applicable laws, regulations and ordinances. We also consider qualified applicants regardless of criminal histories, consistent with legal requirements. If you need assistance and/or a reasonable accommodation due to a disability during the application or the recruiting process, please send a request via the Accommodation request form.