Neural Network Performance Engineer – London
Humanoid is the first AI and robotics company in the UK creating the world’s most advanced, reliable, commercially scalable, and safe humanoid robots. Our first humanoid robot, HMND 01, is a next-gen labour automation unit providing highly efficient services across a range of use cases, starting with industrial applications.
Our Mission
At Humanoid we strive to create the world’s leading, commercially scalable, safe, and advanced humanoid robots that seamlessly integrate into daily life and amplify human capacity.
Vision
In a world where artificial intelligence opens up new horizons, our faith in its potential reveals a future in which humans and machines together build a world of knowledge, inspiration, and incredible discoveries. The development of a functional humanoid robot underpins an era of abundance and well-being in which poverty disappears and people are free to choose what they want to do. We believe that providing a universal basic income will ultimately be a true evolution of our civilization.
Solution
As the demands on our built environment rise, labour shortages loom. With the world’s workforce increasingly moving away from undesirable tasks, the manufacturing, construction, and logistics industries critical to our daily lives are left exposed. By deploying our general-purpose humanoid robots in environments deemed hazardous or monotonous, we envision a future where human well-being is safeguarded while closing the gaps in critical global labour needs.
About the Role
In this role, you will work on all aspects of running capable neural-network-based control policies at a high rate with minimal latency, both on cloud hardware and onboard the robot. Your work will be critical to delivering smooth robot motions while reacting to environment changes as quickly as possible.
What You’ll Do
- Analyze performance bottlenecks of a given model architecture and propose concrete improvements.
- Make models run efficiently on new hardware (e.g. NVIDIA Thor).
- Implement custom kernels to reduce memory throughput requirements where it matters.
- Quantize models with minimal loss of quality.
- Suggest and implement model architecture changes that enable better performance characteristics without sacrificing model capabilities.
We’re Looking For
- 3+ years building deep‑learning systems (industry or research) with shipped models or published artifacts to show for it.
- 1+ years experience working on performance of neural network inference (analyzing bottlenecks, writing custom kernels, quantizing models, fighting deep learning compilers).
- Excellent understanding of GPU architecture and why some models run faster than others.
- Strong Python + PyTorch/JAX; you can profile, debug numerics, and write maintainable research code.
- You document experiments clearly and communicate trade‑offs crisply.
Nice to have
- Robotics or autonomous driving experience.
- Open source code showcasing your ability to improve inference performance.
- Publications at ICLR/ICML/NeurIPS or equivalent open‑source contributions.
- Familiarity with vision-language (VLM) or vision-language-action (VLA) models.
What We Offer
- Competitive salary plus participation in our Stock Option Plan
- Paid vacation, adjusted to your location to comply with local labour laws
- Travel opportunities to our Vancouver and Boston offices
- Office perks: free breakfasts, lunches, snacks, and regular team events
- Freedom to influence the product and own key initiatives
- Collaboration with top‑tier engineers, researchers, and product experts in AI and robotics
- Startup culture prioritising speed, transparency, and minimal bureaucracy
How to Apply
Does this role sound like the perfect fit for you?
Fill in the form and include links or files that showcase the best of what you’ve built and achieved.
Apply now