microTECH Global Ltd
- Location: (5656) Netherlands
- Salary: Market rate
- Type: Contract
- Start Date: ASAP
- Contract Period: 6 months
- Main Industry: Search Engineering Jobs
- Advertiser: microTECH Global Ltd
- Job ID: 132696092
- Posted On: 12 February 2026
Your mission will be to translate cutting-edge research into production-ready solutions, focusing on model compression, system optimizations, and agentic capabilities such as function calling and tool orchestration. Experience with designing secure and reliable agentic workflows, including guardrails and safe tool invocation, is considered a strong plus.
What You’ll Do
Optimize LLMs and multimodal models for on-device deployment
Investigate, develop, and apply advanced quantization (8-bit, 4-bit, mixed precision), pruning, and distillation techniques to derive optimized models for NXP NPU targets.
Accelerate inference performance
Investigate, develop and implement system optimizations such as speculative decoding and other efficient decoding algorithms tailored for edge environments.
Engineer agentic AI capabilities towards tiny agents
Investigate methods for improving the performance of small language models so that they can power tiny agents at the edge, while ensuring these agents follow safety principles.
Work with inference engines and deployment frameworks
Deploy optimized models using Ollama, llama.cpp, ONNX Runtime, and TFLite for efficient NPU inference.
Benchmark LLMs and agentic systems
Design benchmarking pipelines for assessing the performance of Generative and Agentic AI systems on-device.
Develop demonstrators and proofs of concept
Move key technologies from research into product solutions
Your Profile
• MSc, PhD, or EngD in a technical field such as Computer Science or an equally relevant discipline.
• 5+ years of experience in software/AI engineering with deep exposure to LLMs, VLMs, and systems performance.
• Experience with LLM quantization techniques (e.g., SmoothQuant, SpinQuant, QuaRot), pruning (Wanda, SparseGPT, etc.), and other system optimizations such as speculative decoding.
• Proven track record of working with AI frameworks (PyTorch, TensorFlow, etc.) required.
• Experience with agentic AI technologies and familiarity with existing frameworks (e.g., LangChain, Google ADK, SmolAgents).
• Understanding of safety and security considerations for agentic systems (e.g., guardrails, policy enforcement, secure function calling) is a plus.
• Understanding of AI toolchains, deployment, portability and inference engines (CUDA, TensorRT, TFLite, ONNX, Ollama, etc.) preferred.
• Affinity for and experience with embedded systems and NPU accelerators required.
• Experience with embedded software architecture, build systems, and version control systems required.
• Broad experience with operating systems (GNU/Linux), embedded systems, development boards, and processors, as well as general software competencies, required.
• Familiarity with setting up and maintaining related MLOps development environments (MLflow, ClearML, etc.) required.
• Knowledge of build systems (Yocto, OpenEmbedded, etc.) and experience working with cross-compilation toolchains beneficial.
