HPC Engineer, Machine Learning Infrastructure - US Remote
Hugging Face
Here at Hugging Face, we’re on a journey to advance good Machine Learning and make it more accessible. Along the way, we contribute to the development of technology for the better.
We have built the fastest-growing open-source library of pre-trained models in the world. With more than 1 million models and 320K+ stars on GitHub, over 15,000 companies are using HF technology in production, including leading AI organizations such as Google, Elastic, Salesforce, Grammarly and NASA.
About the role:
We are looking for an HPC Engineer responsible for developing and scaling our large distributed cluster. The ideal candidate will have experience provisioning large compute clusters for AI workflows and strong experience helping teams establish best practices for reliability and scalability.
Responsibilities
- Design, develop, deploy, and maintain reliable and scalable infrastructure that enables efficient...