Benefits & Working Mode
Two financial bonuses: 13th-month salary + luxury ESOP.
Premium health insurance for ALL family members, with the best package on the market.
2 additional days off each quarter, taken adjacent to a weekend.
Up to 20 days of annual leave per year, which can be accumulated for up to 3 years at your discretion.
Premium annual medical check-ups.
Access to top-tier AI learning programs and training academies in partnership with leading global universities.
Budget for overseas team-building events and company gatherings, plus many other top global benefits.
Onsite in HCMC. (Hybrid is possible on request; onsite is not strictly required.)
Key Notes
You will be among the founding members of the first and largest R&D Center in the SEA region, playing a key role in building, optimizing, and leading engineering practices and projects. Your work will create impact not only in Vietnam but across the broader Southeast Asia region.
Company Description
The world leader in GPU computing. The markets we are passionate about include gaming, automotive, professional vision, HPC, data centres, and networking, in addition to our traditional OEM business.
We are also well positioned as the ‘AI Computing Company’: our GPUs are the brains powering modern Deep Learning software frameworks, accelerated analytics, modern data centres, and autonomous vehicles. We have some of the most experienced and dedicated people in the world working for us. If you are dedicated and forward-thinking, and if working with hardworking technical people across countries sounds exciting, this job is for you.
Domain - AI Giant Tech Company
Job Description
We are now looking for a Senior/Mid/Junior DL Algorithms Engineer! We’re looking for engineers who are mindful of performance analysis and optimization to help us squeeze every last clock cycle out of Deep Learning training, one of the world's most essential workloads today.
If you are unafraid to work across all layers of the hardware/software stack, from GPU architecture to Deep Learning Framework, to achieve peak performance, we want to hear from you! This role offers an opportunity to directly impact the hardware and software roadmap in a fast-growing technology company that leads the AI revolution while helping deep learning users around the globe enjoy ever-higher training speeds.
What you’ll be doing:
Understand, analyze, profile, and optimize deep learning training and inference workloads on state-of-the-art hardware and software platforms.
Collaborate with researchers and engineers across our company, providing guidance on improving the performance of workloads.
Implement production-quality software across our company's deep learning platform stack.
Build tools to automate workload analysis, workload optimization, and other critical workflows.
Application Process
Send your CV and have a short discussion with Eric - 037 385 0367 (Zalo/WhatsApp)
CV screening, followed by 3+ rounds of deep technical interviews
Requirements
Bachelor’s or Master’s degree in Computer Science, Electrical Engineering, Computer Engineering, or related field (or equivalent experience).
2–4 years of experience, or strong academic/project experience, in deep learning, performance engineering, systems, or high-performance computing.
Good understanding of deep learning fundamentals and modern AI model architectures, especially transformers.
Exposure to profiling and performance analysis tools.
Familiarity with GPU architecture and parallel computing concepts such as CUDA, kernels, memory hierarchy, and streams.
Programming skills in Python.
Experience with at least one major ML framework such as PyTorch, TensorFlow, or JAX.
Strong fundamentals in algorithms.
Experience with production deployment of Deep Learning models.
Nice-to-have
Internship, research, or project experience optimizing AI/ML workloads on GPUs.
Hands-on experience with TensorRT, TensorRT-LLM, vLLM, SGLang, or similar inference/runtime frameworks.
Familiarity with quantization, sparsity, or mixed-precision techniques.
Experience with distributed training or inference concepts.
Contributions to open-source ML systems, performance tools, or infrastructure projects.
Proficiency in C++, strong debugging skills, and an interest in low-level performance optimization.


