Virtual Tech Gurus
Description
Job Summary
We are seeking an AI Optimization Engineer to support enterprise AI/ML initiatives with a focus on performance optimization, scalable infrastructure, and production deployment. The ideal candidate will have hands-on experience with large language models (LLMs), GPU-accelerated environments, and high-performance computing (HPC) platforms.
Key Responsibilities
- Design, optimize, and deploy machine learning and deep learning models into production
- Build and manage scalable infrastructure for AI/ML workloads, including LLMs
- Configure and maintain GPU-accelerated clusters for large-scale processing
- Implement model optimization techniques (pruning, quantization, distillation)
- Deploy models using containerized and microservices-based architectures
- Develop secure REST APIs using Flask for inference and orchestration
- Configure and optimize Triton Inference Server for model serving
- Manage job scheduling and automation using SLURM
- Monitor system and model performance using Prometheus and Grafana
- Perform exploratory data analysis (EDA) and visualization
- Collaborate with cross-functional AI teams (NLP, Computer Vision, GenAI)
Required Skills & Qualifications
- Strong Python programming (NumPy, scikit-learn)
- Experience with ML/DL frameworks: TensorFlow, PyTorch, or Keras
- Hands-on experience with HPC and GPU environments
- Strong knowledge of ML algorithms (supervised & unsupervised learning)
- Experience deploying ML models into production
- Understanding of neural networks, transformers, and ensemble methods
- Experience with hyperparameter tuning and transfer learning
- Linux administration (RHEL/CentOS)
- API development experience (Flask, REST)
- Strong troubleshooting and performance optimization skills
Tools & Technologies
- Containers & Orchestration: Docker, Kubernetes, Podman, Enroot, Pyxis
- ML & AI Tools: MLflow, Jupyter, Hugging Face
- Inference & Optimization: Triton Inference Server, TensorRT-LLM (TRT-LLM)
- Monitoring: Prometheus, Grafana
- CI/CD & Infra: GitHub, Jenkins, Terraform
- Databases: Oracle, Microsoft SQL Server, MySQL, MongoDB, Redis
- Visualization: Matplotlib, Seaborn, Plotly
- Scheduling: SLURM
Preferred Qualifications
- Experience with AWS (SageMaker, EC2, Lambda)
- Knowledge of vector embeddings and generative AI
- Experience with data preprocessing (cleaning, scaling, normalization)
- Frontend exposure (Angular, HTML, CSS, JavaScript)
- SQL and PL/SQL scripting
JOBID: 12335
