Senior Data Research Engineer - AI for Science
Microsoft
Microsoft Research AI for Science is seeking a talented research engineer to join our mission of accelerating scientific discovery through AI. In the materials team, we are building next-generation foundational AI capabilities to accelerate the design of novel materials with industrial impact. You can learn more about our AI emulator MatterSim and generator MatterGen in our blog.
This role is an exceptional opportunity to lead our ambitious data generation efforts. You will design, build, and maintain scalable infrastructure to support large-scale materials data generation workflows. You will work with a highly collaborative, interdisciplinary, and diverse team of researchers, engineers and scientists to define and create the next frontier datasets for materials science.
Microsoft’s mission is to empower every person and every organization on the planet to achieve more, and we’re dedicated to this mission across every aspect of our company. Our culture is centered on embracing a growth mindset and encouraging teams and leaders to bring their best each day. Join us and help shape the future of materials design.
This post will be open until the position is filled.
Responsibilities
- Design, build, and maintain scalable infrastructure to support large-scale materials data generation workflows.
- Improve and automate data validation, monitoring and reporting.
- Integrate data from various data sources and modalities.
- Work with domain experts to define new agentic materials science workflows.
- Generate datasets for training deep learning models for materials design.
Qualifications
Required qualifications:
- PhD in computer science, machine learning, computational materials science, computational chemistry, condensed matter physics, or related area, or comparable industry experience.
- Practical experience working with databases (preferably CosmosDB or MongoDB) and developing high-throughput pipelines for data generation or processing.
- Practical experience using or managing high-performance compute (e.g., SLURM, cloud-based clusters).
- Proficiency in collaborative code development in Python on shared codebases, writing performant code and testing.
- Ability to work in an interdisciplinary collaborative environment, through effective communication of technical concepts to non-experts from different technical backgrounds.
Preferred qualifications:
- Experience in designing and producing scientific datasets and collaborating with domain experts.
- Experience with automation (e.g., via GitHub Actions or similar tools), security best practices (e.g., image hardening, secure system design), and monitoring (e.g., Grafana, Log Analytics).
- Experience in MLOps, ensuring reproducibility, scalability, and integration of machine learning models.
- Experience with building agentic workflows.
- Understanding of density functional theory and its application in simulating solid-state materials.
This position will be open for a minimum of 5 days, with applications accepted on an ongoing basis until the position is filled.
Microsoft is an equal opportunity employer. All qualified applicants will receive consideration for employment without regard to age, ancestry, citizenship, color, family or medical care leave, gender identity or expression, genetic information, immigration status, marital status, medical condition, national origin, physical or mental disability, political affiliation, protected veteran or military status, race, ethnicity, religion, sex (including pregnancy), sexual orientation, or any other characteristic protected by applicable local laws, regulations and ordinances. If you need assistance with religious accommodations and/or a reasonable accommodation due to a disability during the application process, read more about requesting accommodations.