Alan Zhu
Lingyue Zhu
Product Leader where AI software meets hardware.

I am a product leader whose work spans from silicon to shipped software products. I co-founded Nexa AI (acquired by Qualcomm in 2026) and served as its Chief Product Officer, where I built a GenAI inference developer tool (from 0 to 8,000+ GitHub stars), shipped a viral local AI agent application, drove ecosystem partnerships (e.g., Microsoft, NVIDIA, IBM, AMD, Qualcomm), and launched small language models (top 3 on Hugging Face). I now lead a generative AI developer platform at Qualcomm AI Hub.
I am interested in the product and human side of small language models, inference, AI hardware, robotics, and physical AI.
Selected work
NexaSDK
Runtime · The world's first GenAI inference runtime that runs any model on NPU. 8,000+ GitHub stars and #1 on GitHub Trending.
Hyperlink
Application · The world's first local AI agent that lives inside your computer. It searches across all your files, fully private. 2.1M views and 30K users in two months, featured by NVIDIA, Microsoft, Qualcomm, and AMD.
Octopus
Model · The world's first AI agent model built to run directly on hardware, with function calling across PCs, phones, wearables, automotive, and robots. Part of a model line that trended top-3 on Hugging Face and was featured at Google I/O and on the Google DeepMind blog.
Recognition
Silicon
- Qualcomm
Shipped our work across Snapdragon for Android, the Hexagon NPU, Granite 4.0 at the edge, and IoT & robotics — and featured us seven times in “This Week in AI.”
- NVIDIA
Co-launched our local AI agent on RTX, featured us at their CES 2026 booth as a highlighted startup, and published a 1.3M-view feature on YouTube.
- AMD
Made us an official ISV partner and CES 2026 partner, and published our work on image generation on its NPU and DeepSeek R1 speedups.
- Intel
Featured us on its developer channel.
Models
- Google DeepMind
Featured OmniAudio in the Gemmaverse and our work at Google I/O 2024.
- IBM
Named NexaML a supported framework for Granite 4.0, alongside vLLM, llama.cpp, and MLX.
- Alibaba / Qwen
Highlighted our day-0 Qwen3-VL support and Qwen3 on Qualcomm NPU.
- Liquid AI
Partnered with us on the LFM2.5 launch and LFM2.5 Thinking.
- Hugging Face
Thomas Wolf ranked our models among the most-downloaded, and CEO Clément Delangue put out a public collaboration call.
- OpenAI
Featured our work via Developer Relations.
Platforms
- Microsoft
Put us on stage at the Ignite 2025 keynote as an official partner.
Publications
- AutoNeural: Co-Designing Vision-Language Models for NPU Inference
The world's first software–hardware co-designed model for NPU.
Co-author · arXiv:2512.02924
- Innovation Proficiency and Barriers to Its Development by Product Managers and Their Teams
Co-author · Businesses (MDPI), 2026
Talks & appearances
- Microsoft Ignite 2025
- CES 2026 (NVIDIA & AMD booths)
- Qualcomm Snapdragon Summit
- PyTorch Conference
- Web Summit
- IBM TechXchange
- Open Data Science Conference
Experience
- Senior Product Manager, AI Hub · Qualcomm · 2026–Present
- Co-founder & Chief Product Officer · Nexa AI · 2024–2026
Acquired by Qualcomm in 2026.
- Product Manager, Enterprise Software · Tesla · 2022–2023
- B.A., Computer Science & Data Science · UC Berkeley · 2022
Beyond work
I was born in Xi'an, China, and went to high school in Texas. I graduated from UC Berkeley, where I competed on the university's official esports team. Off the clock: tennis, watching sports, and a steady rotation of jazzy, housey music.
Connect
Reach me for product collaborations, speaking, or to say hello:
alanzhuly@gmail.com