Gemini API File Search is now multimodal

Haven't touched gemini api since they did not support having a $ limit per api key. Is it possible now?

This might be great and all but I am still miffed at how simple search on AI Studio is. You can only search the titles of your conversations and nothing inside them. On top of that they messed with the scrolling so Ctrl+F doesn't work reliably.

It’s a striking irony that the world's leader in search is receiving so much heat for poor search functionality and UX within its own flagship AI products

Good to have a choice between clouds and local use.

How much would you pay to have this yours forever, running locally, GDPR and HIPaa compliant, without the headache of privacy or subscriptions.

That´s what we offer with HugstonOne and we did it before Google. Multimodal, Lighting fast RAG, terabytes not kilobytes only :)

All you need is a 32gb ram laptop and HugstonOne, not a rocket science.

Haven't touched gemini api since they did not support having a $ limit per api key. Is it possible now?

It's incredible how far behind Gemini has gotten, both the product and the model. Even the ChatGPT plugin for Google Sheets blows away the native Gemini integration.

Everyone thought Google was pulling ahead with Gemini 3. For a minute there they had the best language model, image model, AND video model in the world. But it's like they decided to pull over for a nap while OpenAI and Anthropic flew by.

I've come across a few weird search issues like this with Google lately. Entire company built on the best search engine ever created; can't do search properly in their apps.

The search in Gemini app in the browser is so embarrassingly bad that I get an impression that nobody of importance in Google must be using it otherwise they would have fixed long ago.

Yeah, it’s surprising, Claude Desktop has had project files since decades which are chunked/indexed and automatically injected into your context based on the topic.

You’d think this would be fairly obvious for Google to do, but it’s probably an organizational problem rather than a technical one.

I am more miffed that you cannot delete conversations.

Too bad they can't just easily vibe code new features.

Good to have a choice between clouds and local use.

How much would you pay to have this yours forever, running locally, GDPR and HIPaa compliant, without the headache of privacy or subscriptions.

That´s what we offer with HugstonOne and we did it before Google. Multimodal, Lighting fast RAG, terabytes not kilobytes only :)

All you need is a 32gb ram laptop and HugstonOne, not a rocket science.

It’s a striking irony that the world's leader in search is receiving so much heat for poor search functionality and UX within its own flagship AI products

One of Googles core problems is internal silos of talent. The search team has likely never interacted with the Gemini app team or perhaps even the Gemini app.

For all intents and purposes Google Gemini is a totally separate company from Google search.

I've come across a few weird search issues like this with Google lately. Entire company built on the best search engine ever created; can't do search properly in their apps.

Yeah, it’s surprising, Claude Desktop has had project files since decades which are chunked/indexed and automatically injected into your context based on the topic.

You’d think this would be fairly obvious for Google to do, but it’s probably an organizational problem rather than a technical one.

The search in Gemini app in the browser is so embarrassingly bad that I get an impression that nobody of importance in Google must be using it otherwise they would have fixed long ago.

It's incredible how far behind Gemini has gotten, both the product and the model. Even the ChatGPT plugin for Google Sheets blows away the native Gemini integration.

I have the opposite experience where Gemini (even the flash models) has the only useful model for my reverse engineering related use case. My hunch is Google utilizes its free access to entire Google search indices to train itself from niche non-English speaking community websites, much frequently and in a "relevant" manner, which in the end gives these models the most up to date info for this particular kind of work. Every other model is just either 10 years outdated with their answers or simply hallucinates like waaaay crazy.

3.1-pro is still very capable, and API is at competitive price vs e.g. Anthropic, they just can't seem to figure out RLHF and harness. It needs a lot of guiding, it tends to be lazy and poorly sticking to instructions by default.

It just feels like many google products really, they are capable of really amazing things, it's just that nobody there seem to care. I would guess they are likely optimizing more for internal use than their vast userbase.

I just cancelled my Gemini subscription yesterday. I have a big private fork of OpenCode, and I did it the wrong way to start with, so I couldn't pull from upstream.

So I put together a plan for refactoring it, step by step, with tests, etc. After literally 8 solid days of fighting with Gemini 3 Pro, I still couldn't pull it off.

I gave GPT 5.5 a chance with the same prompt, plans, and repo. I'm not sure how long it took, but when I checked in on it a few hours later it was done. All tests passed, everything exactly how I'd asked, and better (it made some improvements).

I never felt Gemini was ever better than the OpenAI or Anthropic. I think it’s more on par with open source models than the top 2

Too bad they can't just easily vibe code new features.

Yeah, what happened to no more SWE

One of Googles core problems is internal silos of talent. The search team has likely never interacted with the Gemini app team or perhaps even the Gemini app.

For all intents and purposes Google Gemini is a totally separate company from Google search.

I just cancelled my Gemini subscription yesterday. I have a big private fork of OpenCode, and I did it the wrong way to start with, so I couldn't pull from upstream.

So I put together a plan for refactoring it, step by step, with tests, etc. After literally 8 solid days of fighting with Gemini 3 Pro, I still couldn't pull it off.

They optimize for making their SRE's lives easier, over quantizing models regardless of how negative an effect that has on the user.

I never felt Gemini was ever better than the OpenAI or Anthropic. I think it’s more on par with open source models than the top 2

Yeah, what happened to no more SWE

I am more miffed that you cannot delete conversations.

They optimize for making their SRE's lives easier, over quantizing models regardless of how negative an effect that has on the user.

Your browser does not support the audio element.

Listen to article

This content is generated by Google AI. Generative AI is experimental

[[duration]] minutes

Today, we are expanding the Gemini API’s File Search tool. You can now build retrieval-augmented generation (RAG) systems with multimodal data and custom metadata. We’re also introducing page citations to improve grounding and transparency.

Whether you are prototyping a weekend project or scaling a production application for thousands of users, your RAG systems can now natively process and better organize your text and visual data.

Give your apps a photographic memory

File Search now processes images and text together. Powered by the Gemini Embedding 2 model, the tool understands native image data, providing your agents contextual awareness.

Think of a creative agency trying to dig up a specific visual asset. Instead of relying on keywords or filenames, your app can search an entire archive for an image matching a specific emotional tone or visual style described in a natural language brief.

See how developers are already using it:

“At Code Fundi, we provide the context layer for autonomous engineering. We solve the ‘Context Window Bottleneck’ by distilling massive, noisy repositories into logic-dense, LLM-ready markdown. Using the gemini-embedding-2 model to index a massive public pool of architectural diagrams, ERDs, and sequence diagrams from top open-source projects, we provide agents with a

Filter the noise with custom metadata

Dumping files into a database is easy. Finding the right one at scale is the real challenge. Custom metadata allows you to attach key-value labels to your unstructured data — things like department: Legal or status: Final.

By applying metadata filters at query time, your application can scope requests to the data slice required. This significantly reduces noise from irrelevant documents, increasing both the speed and accuracy of your RAG workflows.

Show your work with page citations

When your application pulls an answer from a massive PDF, users need to verify exactly where that answer came from.

File Search now ties the model’s response directly to the original source. It captures the page number for every piece of indexed information. This level of granularity allows you to point users directly to the right spot, which helps build trust and makes your tool immediately useful for rigorous fact-checking.

Get started with File Search

We want to make it as easy as possible to store and retrieve the data that makes your ideas work. The File Search tool handles the heavy infrastructure so you can focus on building the product.

Uploading files and searching across them is simple:

Explore more code snippets in our developer guide and Gemini API documentation to learn how to build with File Search.

Hacker Times