I tried Google’s secret open source app and glimpsed the power of offline AI



Andy Walker / Android Authority

Google has pumped out so many AI products in recent years that I’d need my fingers, toes, and the digits of several other people to keep count. Its current public-facing headliner is Gemini, which also doubles as its virtual assistant on its myriad products. But, if you’re willing to lift its development rock to peek at the creepy crawlies beneath, you’ll find AI Edge Gallery.

Hidden away on GitHub — where few Google-made products have ever resided — AI Edge Gallery gives early adopters a taste of fully downloaded AI models that can be run entirely on the phone. To uncover why Google banished this app beyond the Play Store and what it can actually do, I took up my Pixel and downloaded it. Here’s what I discovered.

What is Google AI Edge Gallery?


First, let me provide details about what AI Edge Gallery is. The app allows users to download and run large language models (LLMs) on Android phones for offline use. Once downloaded, the LLMs don’t require an internet connection to crunch queries, which makes AI Edge Gallery, in theory, very handy in isolated situations. At present, the app offers four LLMs ranging in size and skill. It also splits these up into three suggestions for using them: Ask Image, Prompt Lab, and AI Chat.

AI Edge Gallery allows users to download LLMs that can be used for offline prompt processing right on your phone.

These categories are largely self-descriptive, but they do explain what to expect from AI Edge Gallery. You can use these models to ask questions about images, engage in simple chats as you would with Gemini or ChatGPT, and use prompts for “single-turn use cases.”

Installation and setup are a pain, but the app is slick and smooth

There’s a good reason why AI Edge Gallery isn’t on the Play Store. The setup is an absolute pain, even if the app is buzzy and feels like a Google-made product.

Once you grab the app off GitHub and install it, you’ll need to install the individual models you wish to try. However, before this, you’ll need to create a Hugging Face account — the site that hosts the models — and acknowledge several user agreements. This includes one on the AI Edge Gallery app itself, another on Hugging Face, and finally, Google’s own Gemma Access Request form.

Finally, after all of this, you’ll need to tap back several times to head back into the AI Edge Gallery app, where the model download will begin.

There were several times I issued a loud sigh during this process, and I wouldn’t blame you if you’d rather clean all your shoes instead. Nevertheless, I persisted.

The setup process from downloading the app to using the model of your choice is padded by several user acknowledgements.

To whet my appetite, I hopped onto the Gemma-3n-E4B-it-int4 train (I’ll refer to it simply as “Gemma” from here on). At 4.4GB, it’s the largest model in the gallery and is available across all three categories. In theory, the largest model should offer all I need to accomplish any offline chatbot goal I could have. For the most part, its offline capabilities were impressive.

An offline travel planner, science teacher, and sous chef


To test this model’s capabilities, and therefore, the usefulness of AI Edge Gallery, I wanted to use several prompts that I’d normally run by ChatGPT and Gemini — products that have access to the internet.

For my first trick, I asked Gemma about a theoretical trip to Spain. I used the prompt: “I’m traveling to Spain in a few weeks. What are some items I should consider packing, and which sights should I see?” I wanted to test its capabilities as an offline travel companion. After several seconds of pondering an answer, Gemma leapt into action and completed the answer three minutes later. That’s pretty tardy, but considering it ran entirely offline and rendered my Pixel 8 pretty warm, I was impressed.

Processing times are long, but considering the LLM is running entirely offline on my Pixel 8, it’s admirable.

I was even more impressed when scrolling through the answer. Considering that I didn’t specify how long I’d be spending in Spain, where I’d be heading, or when I’d be leaving, Gemma offered plenty of sights to see, exact quantities of garments I should pack, and additional travel tips.

To test if it can connect to the internet if required, I asked it, “What are the biggest news stories of the day?” It gave me an answer from October 26, 2023, presumably the limit of its global knowledge. This isn’t a problem, but remember that this model is better suited to timeless queries.

OK, back to general questions. I wanted to see how proficient the model is at explaining established theories. I asked it to “Explain the theory of relativity and provide an ELI5 example.” Again, it took a day and an age, but eventually, it produced a deep review of Einstein’s theory.

Don’t expect the models to replace services like Perplexity that can readily access information on the internet.

It also offered a detailed explainer about the source of rattles coming from a car’s engine bay, recipes for making vanilla ice cream, facts about the tallest mountains in the world, and an explanation of soccer’s offside rule. All answers were accurate.

How good is the app at creating things?

Within the Prompt Lab section, you can use a model to “rewrite tone, summarize text, and code snippets.” The latter use case is pretty cool! As a complete coding noob, I asked Gemma to “Create code that responds with ‘hello’ when I input ‘Good day.’” It promptly offered a line of JavaScript that did just that. There are seven languages to pick from, too. Notably, the response includes guidance on integrating the code into various scenarios, like a website, making it an excellent educational or verification tool.

The app can also summarize blocks of text, and it’s not too shabby at that, either. I crammed the introduction of Wikipedia’s Theory of Relativity article into the prompt box, and Gemma confidently broke the content down into five bullet points. The response was swift enough that I’d consider using AI Edge Gallery instead of ChatGPT to break down longer PDFs and studies, especially documents I don’t want to share. There are various answer options, including bullet points, briefer paragraphs, and more.

What about tone rewriting? I’m unsure when I’d use this feature in my life. I’d rather opt for chat apps and Gmail’s built-in tone tweaker. Nevertheless, I gave Gemma the same snippet used above, selecting the Enthusiastic tone option. You can see the results in the screenshots above.


It’s important to remember that the model you use will dictate AI Edge Gallery’s answers, capabilities, and processing speed. The app offers plenty of flexibility in this regard. You can download all four and use them interchangeably, or you can use the largest model (as I have) and call it a day. You can even snag the smallest model and enjoy quicker operation, albeit more limited smarts. The choice is yours.

Identifying tomatoes but misplacing monuments

What about image queries? The app makes it super easy to select an image from my albums or capture a new photo, and ask a question about it.

For my test, I picked a shot of some tomatoes we grew over the spring. I asked Gemma, “How do I grow these?” Impressively, the model accurately identified them as grape tomatoes, offered a complete breakdown of their preferred habitat and conditions, details on how to start them from seed, including specifics like thinning and soil mix, and suggestions for planting outdoors. This response took over four minutes, but it was a brilliant, detailed answer!

I queried its knowledge of local landmarks to see how it handled more nuanced images. I picked an image of Franschhoek’s NG Kerk, the oldest church in one of the prettiest towns in South Africa. I didn’t expect it to know it, and, well, it didn’t. It answered with: “This is St. Mary’s Church in Stellenbosch.” It picked a nearby town, but that’s a red cross. Perhaps it would know the more distinct Huguenot Monument in Franschhoek? Nope. That’s in Rome, the model decided.

Clearly, Gemma struggles with recognizing buildings but has little issue with tomatoes. It seems you’ll get mixed success here based on the prevalence and familiarity of objects within an image. This still makes it pretty useful in some cases. I’ll have to test this a little more in a future feature.

I’ve activated your flashlight (just kidding!)


Finally, I want to discuss where the models of AI Edge Gallery and an actual virtual assistant like Gemini differ. The latter has near complete control of my Pixel 8 and lets me play specific playlists on Spotify, open YouTube channels, search the internet, or trigger my flashlight with a simple prompt. However, this isn’t possible with AI Edge Gallery.

Although asking Gemma to “Switch on my flashlight” is recognized and accepted as a prompt, and the model gleefully replies “Okay! I’ve activated your flashlight,” it adds that it cannot actually do this because it’s a “text-based AI.” It understands what I want accomplished, but its net doesn’t reach that far.

AI Edge Gallery cannot replace Gemini, at least not as a virtual assistant.

To be fair, I didn’t expect this app to have that level of control over my device, but I had to test this regardless. If you were hoping to replace Assistant or Gemini with an offline product like AI Edge Gallery, you’ll be sorely disappointed. It’s also worth noting that AI Edge Gallery and its models cannot generate images from prompts or address queries about files other than images. Hopefully, these features will come to the app’s future iterations.

There’s a reason Gemini is Google’s consumer-facing AI product

So, is AI Edge Gallery worth a try? Without a doubt, yes. As someone who loves the idea of fully offline LLMs that only connect to the internet when available or required, the models here genuinely excite me, and the app makes it possible to test them without too much trouble. I’m sure that query crunching would be far quicker and more efficient on a faster smartphone, too. I feel my Pixel 8 was the bottleneck here.

The app itself looks great and functions adequately for the most part, but it still requires some polish here and there. Leave it open in the background, and you’ll regularly get non-responding boxes popping up and multiple crashes when it returns to focus. It also has several annoying UX issues. Swiping across the screen left or right will clear your last prompt, and you’ll have to start all over again. It’s remarkably easy to do this by accident.

AI Edge Gallery makes private offline processing possible, but there’s a reason it’s not on the Play Store.

Nevertheless, I’m still left impressed by the app’s image identification smarts. As someone who regularly uses Circle to Search to identify plants, animals, and landmarks, AI Edge Gallery could be handy if I’m stuck in the wilderness with an unidentified bird and no connection. You may not consider an offline AI tool necessary, but processing data on your phone does have privacy and security benefits.

If you have a flagship Android phone, I’d recommend picking up AI Edge Gallery, perhaps not as a replacement for Gemini, but as a glimpse into the distant future where much of Gemini’s smarts could be available locally.


