Deep Learning Engineer
Palo Alto, Paris, Abu Dhabi / Engineering / Full Time / On-site
Key Responsibilities:
+ Design, develop, optimize, and maintain software systems for the entire foundation model development and deployment lifecycle (i.e., data pipeline, pre-training, fine-tuning, serving).
+ Build and maintain scalable, efficient, and reusable codebases for large-scale foundation model training, adaptation, evaluation, and inference.
+ Collaborate closely with data engineers and research scientists to integrate models into production environments.
+ Implement and ensure best practices in software engineering, including code quality, testing, and documentation.
+ Build and optimize robust back-end systems, APIs, and databases to support complex workflows.
+ Ensure code quality, scalability, and performance through rigorous testing and code reviews.
Qualifications:
+ Bachelor’s or Master’s degree in Computer Science, Engineering, or a related field. Experience in life sciences or healthcare is a plus.
+ Strong programming skills in JavaScript and Python, experience with modern web development frameworks, and familiarity with GPU-accelerated tools (e.g., CUDA, cuDNN, Triton).
+ Proficiency with major deep learning frameworks such as PyTorch, HuggingFace Transformers & Accelerate, or Megatron-LM/DeepSpeed.
+ Familiarity with resource management and scheduling systems (e.g., SLURM, Kubernetes).
+ Proficiency in back-end frameworks like Django, Flask, or Node.js, and database technologies (e.g., PostgreSQL, MongoDB).
+ Expertise in distributed systems, cloud computing (AWS, GCP), and containerization tools (Docker, Kubernetes).
Preferred Qualifications:
+ Ph.D. degree in Computer Science, Engineering, or a related field. Experience in life sciences or healthcare is a plus.
+ Prior experience pre-training or serving large language models or large-scale foundation models.
+ Experience with deep learning workflows.
+ Knowledge of biological data types and challenges and experience with bioinformatics tools.
+ Familiarity with version control systems like Git and CI/CD pipelines.
+ Strong understanding of RESTful APIs, authentication, and deployment pipelines.
+ Familiarity with machine learning workflows and biological datasets.