The MediaPipe LLM Inference API lets you run LLMs on Android and iOS

The new API allows you to use Gemma, Falcon, Phi 2, and Stable LM

Google’s experimental MediaPipe LLM Inference API lets you bring large language models to your Android and iOS devices, and it can run LLMs on the web as well. Out of the box, the API provides initial support for Gemma, Falcon, Phi 2, and Stable LM.

Keep in mind that the API is still under active development. On Android, production applications that need LLMs can instead use the Gemini API or, through Android AICore, Gemini Nano on-device.

How do I run LLMs on Android/iOS?

To run LLMs on Android and iOS, use the MediaPipe LLM Inference API. Third-party alternatives such as MLC LLM exist, and Android AICore facilitates Gemini-powered solutions and can take advantage of hardware-specific neural accelerators. To get started quickly, you can try the official MediaPipe LLM Inference sample, and if you have access to Partner Dash, you can also try the Web Demo.
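For a sense of what the Android flow looks like, here is a minimal Kotlin sketch based on Google’s published examples. The library version and model path are placeholders, and it assumes you have already converted a model and pushed the file to the device:

```kotlin
// Module build.gradle: add the MediaPipe GenAI tasks library
// (the version below is a placeholder; check Google's docs for the latest):
// implementation("com.google.mediapipe:tasks-genai:0.10.11")

import android.content.Context
import com.google.mediapipe.tasks.genai.llminference.LlmInference

fun runPrompt(context: Context): String {
    // Point the task at a model bundle already on the device,
    // e.g. pushed via `adb push model.bin /data/local/tmp/llm/`.
    val options = LlmInference.LlmInferenceOptions.builder()
        .setModelPath("/data/local/tmp/llm/model.bin") // placeholder path
        .setMaxTokens(512) // cap on combined input + output tokens
        .build()

    // Create the task and run a single text-to-text generation.
    val llmInference = LlmInference.createFromOptions(context, options)
    return llmInference.generateResponse("Explain MediaPipe in one sentence.")
}
```

The iOS and web versions of the task follow a similar options-then-generate flow through their respective Swift and JavaScript packages.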

The MediaPipe LLM Inference API runs large language models entirely on-device across platforms, and it takes only a few steps to set up, so you can use LLMs even on devices with slightly lower specs. However, you shouldn’t expect them to run at full speed unless your device is high-end. Hopefully, future optimizations will let lower-spec devices, like phones, run LLMs more smoothly through the API.
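On modest hardware, a full response can take a while, so blocking the UI thread while generating is a bad idea. The API also offers an asynchronous, streaming call; here is a hedged Kotlin sketch of that pattern, with the model path again a placeholder:

```kotlin
import android.content.Context
import com.google.mediapipe.tasks.genai.llminference.LlmInference

fun streamPrompt(context: Context) {
    // A result listener receives partial results as tokens are produced,
    // so the app can render text incrementally instead of blocking.
    val options = LlmInference.LlmInferenceOptions.builder()
        .setModelPath("/data/local/tmp/llm/model.bin") // placeholder path
        .setResultListener { partialResult, done ->
            print(partialResult) // append to your UI as text streams in
            if (done) println("\n[generation finished]")
        }
        .build()

    val llmInference = LlmInference.createFromOptions(context, options)
    // Returns immediately; output arrives via the listener above.
    llmInference.generateResponseAsync("Write a haiku about on-device AI.")
}
```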

In short, the MediaPipe LLM Inference API makes it easy to run large language models on a wide range of devices, and some Redditors consider it a great opportunity. Since the API is experimental, expect more updates and features over time; for now, it handles text-to-text generation and lets you pick from multiple models to meet your specific needs.
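Switching between the supported models mostly comes down to which converted model file you load, and the same options builder exposes the usual sampling knobs. A sketch under the same assumptions as above (the file names are hypothetical):

```kotlin
import android.content.Context
import com.google.mediapipe.tasks.genai.llminference.LlmInference

// Hypothetical on-device locations for two converted models.
const val GEMMA_PATH = "/data/local/tmp/llm/gemma-2b-it.bin"
const val PHI2_PATH = "/data/local/tmp/llm/phi-2.bin"

fun buildLlm(context: Context, modelPath: String): LlmInference {
    val options = LlmInference.LlmInferenceOptions.builder()
        .setModelPath(modelPath)   // swap files to swap models
        .setMaxTokens(1024)        // total input + output token budget
        .setTopK(40)               // sample from the 40 most likely tokens
        .setTemperature(0.8f)      // higher values give more varied output
        .setRandomSeed(101)        // fixed seed for reproducible sampling
        .build()
    return LlmInference.createFromOptions(context, options)
}
```

Lowering the temperature makes answers more deterministic, while raising topK allows more varied phrasing, which is handy when comparing how the different models behave.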

By the way, if you encounter any compatibility issues, check out the LLM Conversion guide.

What do you think? Are you going to use the MediaPipe LLM Inference API? Let us know in the comments.
