[Jobs] [AI Semiconductor Unicorn: Rebellions] Hiring Entry-Level Engineers
Framework Software Engineer
Responsibilities and Opportunities
- Design, develop, and optimize high-performance inference frameworks for large-scale distributed serving, including LLM workloads, using vLLM, SGLang, llm-d, and PyTorch to support a wide range of serving patterns
- Improve end-to-end serving performance (TTFT, ITL, throughput, tail latency) by developing techniques such as continuous batching, KV-cache management, prefix caching, speculative decoding, and pipeline/tensor parallelism
- Design scalable multi-node serving architectures including prefill/decode disaggregation, distributed KV-cache, cache eviction strategies, and scheduler design for fairness and efficiency
- Analyze and optimize memory usage, device utilization, communication overhead (PCIe, RDMA), and runtime behavior in real-world serving environments
- Build and maintain benchmarks and simulators for AI serving workloads, and drive data-informed architectural decisions
- Work closely with infrastructure, compiler, and hardware teams to co-design end-to-end AI serving systems
- Actively contribute to and collaborate with open-source communities (e.g., vLLM, PyTorch, Triton, SGLang), including upstream contributions, bug fixes, and design discussions
Key Qualifications
- Master's or higher degree (or equivalent experience) in Computer Science, Electrical Engineering, or a related field
- Strong experience with Python, C++, and PyTorch, including model execution and runtime internals
- Hands-on experience with inference serving or high-performance ML systems
- Familiarity with Linux systems, profiling tools, and debugging performance bottlenecks
- Strong problem-solving skills and the ability to reason about system-level trade-offs
- Clear communication skills and ability to collaborate in a fast-paced engineering environment
Ideal Qualifications
- Experience with vLLM, SGLang, TensorRT-LLM, or similar LLM serving frameworks
- Deep understanding of KV-cache management, attention mechanisms, and memory-efficient inference
- Experience with multi-node inference, including tensor/pipeline parallelism
- Experience supporting GPU/NPU-based inference platforms
- Proven track record of open-source contributions to ML, systems, or infrastructure projects
- Background in building or operating Agentic AI services
NPU Compiler Engineer
Responsibilities and Opportunities
- Designing Rebellions' proprietary compiler stack to accelerate deep learning models on RBLN NPU products
- Architecting and developing production-quality frontend/backend compilers with a focus on generality, spanning a wide range of functional coverage and optimizations
- Evaluating and verifying core-level and system-level functional features, working closely with hardware and system software architects/engineers
Key Qualifications
- Master's or higher degree in Computer Science, Electrical Engineering, or a related field
- Knowledge of compiler architecture, including various transformation passes, high/mid/low-level optimization and scheduling techniques, scratchpad/buffer memory allocation, backend code generation, etc.
- Excellent troubleshooting, problem-solving, and debugging skills
- Proficiency in programming languages: Python, C++
Ideal Qualifications
- Proven track record of building high-quality, production-level software
- Experience in deep learning inference on ASICs, including NPUs, GPUs, mobile APs, etc.
Hiring Process
- Document screening > Online interview > On-site interview > Culture-fit interview > Compensation negotiation > Final offer
- The process may differ by role and is subject to change depending on scheduling and circumstances.
- Schedules and results will be communicated individually via the email address provided in your application.
Notes
- This posting may close early once the positions are filled.
- An offer may be rescinded if any false information is found in the application.
- Employment may be restricted if the candidate does not hold qualifications legally required for hiring or for performing the role.
- Veteran (national merit) status and disability status will not cause any disadvantage in the hiring process.
- The scope of responsibilities may be adjusted in light of the candidate's overall career, experience, and other circumstances. If such a change is needed, it will be communicated with the candidate at an appropriate time before the final offer is issued.
- For hiring-related inquiries, please contact recruit@rebellions.ai
