Senior MLOps & AI Infrastructure Engineer

Altera
July 02, 2026
Full-time
On-site
San Jose, California, United States
$149,100 - $215,925 USD yearly

Job Title

Senior MLOps & AI Infrastructure Engineer

Role Summary

Architect, build, and operate production-grade ML infrastructure and pipelines to support large-scale model training, evaluation, and deployment across cloud and on-prem HPC environments. Partner with software, data, and research teams to productionize models for EDA, HPC, and cloud use cases.

Experience Level

Senior: requires extensive industry experience (10+ years overall; specific years noted in Requirements).

Responsibilities

Deliver end-to-end MLOps solutions, optimize the model lifecycle, and maintain robust infrastructure for scalable ML workloads.

  • Design, build, and maintain scalable training/evaluation/deployment pipelines across cloud and on-prem HPC.
  • Implement and operate experiment tracking, model registry, feature stores, and automated retraining workflows.
  • Develop CI/CD/CT pipelines for models (e.g., Kubeflow, MLflow, Airflow) and containerized deployments on Kubernetes with GPU node pools.
  • Fine-tune and deploy large models (LLMs, GNNs, RL agents) and apply efficiency and alignment techniques (quantization, pruning, distillation, RLHF).
  • Build data pipelines, feature engineering systems, and data versioning/lineage for terabyte-scale datasets.
  • Manage cloud ML resources (AWS SageMaker, Azure ML, GCP Vertex AI) and optimize cost/performance.
  • Automate infrastructure provisioning (Terraform/CloudFormation) and integrate with HPC schedulers (Slurm, LSF) for distributed training.
  • Implement monitoring, alerting, and observability for model performance, data quality, and system health.
  • Mentor engineers, collaborate with research scientists, and drive adoption of ML engineering best practices.

Requirements

Must-have technical skills and hands-on experience.

  • 10+ years of experience in ML engineering, data science, and MLOps, including production deployment of models at scale.
  • Proven expertise with ML frameworks: PyTorch, TensorFlow, JAX, Hugging Face, scikit-learn, XGBoost.
  • Experience with parallelism strategies and large-model training (FSDP, DeepSpeed, data/model parallelism).
  • Strong Python proficiency (10+ years) and experience with Bash and SQL; Go is a plus.
  • 8+ years working with cloud ML platforms, Docker, Kubernetes, and CI/CD pipelines.
  • 5+ years using experiment tracking and reproducibility tools (MLflow, Weights & Biases, Neptune) and data versioning tools (DVC, Delta Lake).
  • Experience optimizing inference on GPU/TPU clusters and benchmarking model performance.
  • Familiarity with monitoring/observability stacks (Prometheus, Grafana, ELK, Evidently, Arize) and security/DevSecOps practices for ML systems.

Education Requirements

Bachelor's or Master's degree in Computer Science, Machine Learning, Statistics, or a related technical field is required; a PhD in a related field is preferred.


About the Company

Company: Altera

Headquarters: San Jose, California, United States

Altera provides leading programmable solutions for applications ranging from cloud to edge, with a focus on enabling AI workloads. Its product portfolio includes FPGAs, CPLDs, intellectual property, development tools, and System on Modules aimed at accelerating innovation in various fields.

Date Posted: 2026-07-02