Here are some of my captions that tend to trip up even state-of-the-art models.
https://mordenstar.com/other/nb-pro-2-tests
So far it does feel more iterative than an entirely new leap in terms of capabilities, but I haven't run it through the more multimodal aspects such as editing existing images.
That being said, it actually managed the King Louie jump rope test which surprised me.
Two of what I would consider "interesting prompts" for image gen testing. It did pretty well.
"A macro close-up photograph of an old watchmaker's hands carefully replacing a tiny gear inside a vintage pocket watch. The watch mechanism is partially submerged in a shallow dish of clear water, causing visible refraction and light caustics across the brass gears. A single drop of water is falling from a pair of steel tweezers, captured mid splash on the water's surface. Reflect the watchmaker's face, slightly distorted, in the curved glass of the watch face. Sharp focus throughout, natural window lighting from the left, shot on 100mm macro lens." - Only major problem i could find at a glance is the clasps don't make sense probably, and the drop of water inside the watch on the cog doesn't make sense/cog mangled into tweezers.
"A candid photograph taken from behind an elderly woman sitting alone on a park bench in late autumn. She is gently resting one hand on the empty seat beside her, where a man's weathered flat cap and a folded newspaper sit untouched. Fallen golden leaves cover the path ahead. The low afternoon sun casts her long shadow alongside a second, fainter shadow that almost seems to be there, the suggestion of someone sitting next to her, visible only in the light on the ground. Muted, warm color palette, shallow depth of field on the background trees, photojournalistic style." - I don't know why but it internal errored twice on this one but then got there.
My main use case is editing user uploads to enhance their clothing images. A large part of it is preserving logo, graphics and other technical details. I noticed over time it felt like Nano Banana has gotten worse at this.
I have a test set of graphic t-shirts on which I noticed the model seemingly getting worse. This, combined with the price and the terrible experience of their cloud console, got me to migrate off.
I guess even Google is running out of GPUs.
Pretty close to Gemini 3 Pro Image (aka Nano Banana Pro) in most benchmarks, even without thinking+search, and even exceeding it in 2 most important ones of 'Overall Preference' and 'Visual Quality'. I'm excited about the big jump in Infographics/Factuality (even without thinking+search; I'm surprised that text+image search grounding doesn't make an even bigger dent).
EDIT: after significant prompting, it actually solved it. I think it's the first one to do so in my testing.
> I'm sorry, but I cannot fulfill your request as it contains conflicting instructions. You asked me to include the self-carved markings on the character's right wrist and to show him clutching his electromancy focus, but you also explicitly stated, "Do NOT include any props, weapons, or objects in the character's hands - hands should be empty." This contradiction prevents me from generating the image as requested.
My prompts are automated (e.g. I'm not writing them) and definitely have contained conflicting instructions in the past.
A quick Google search on that error doesn't reveal anything either.
Previous nano banana frequently made speech attribution errors, the new one seems a lot more consistent.
The "cubism" example seems like it would be a closer fit to something like stained glass or something. I don't think the thing really understands what cubism was all about. Cubist painters were trying to free themselves from the confines of a single integral plane of perspective by allowing themselves to show various parts of the image from different viewpoints, different times, different styles, etc.
The division of the image into geometric shapes is just a by-product of that quest, whereas the examples here have made it the sum total of the whole piece.
This feels to me like an example of how LLMs still don't "understand" what the art means, and are just aping its facade.
I think part of the issue with architects and designers today is that they use CAD too much. It's easy to design boxes and basic roof lines in CAD. It's harder to put in curves and more craftsman features. Nano Banana's renders have more organic design features IMO.
Our house is looking great and we're very happy how it's going so far with a lot of the thanks to Nano Banana.
1. The narrative/life of the artist becomes a lot more important. The most successful artists are ones that craft a story around their life and art, and don't just create stuff and stop. This will become even more important.
2. Originality matters more than ever. By design, these tools can only copy and mix things that already exist. But they aren't alive, they don't live in the world and have experiences, and they can't create something truly new.
3. Those that bother to learn the actual art skills, and not merely prompting, will increasingly be miles ahead of everyone else. People are lazy, and bothering to put in the time to actually learn stuff will stand out more and more. (Ditto for writing essays and other writing people are doing with AI.)
4. Taste continues to be the single most important thing. The vast, vast majority of AI art out there is...not very good. It's not going to get better, because the lack of taste isn't a technical problem.
5. Art with physical materials will become increasingly popular. That is, stuff that can't be digitized very well: sculpture, installation art, etc. Above all, AI art is uncool, which means it has no real future as a leading art form. This uncoolness will push people away from the screen and towards things that are more material.
Why can't Google, for example just call:
Gemini Image = Nano Banana
Gemini Video = Veo
...I use all those fancy image models' editing capabilities for my fast fashion web shop. I must say: product photography for clothing and accessories is dead. Those models are amazing at style transfer and garment transfer.
We will see how good the full version of Seedream 5.0 will be.
You can argue things like code generation are an extension of the engineer wielding it. Image generation just seems like a net negative overall if it’s used at scale.
Edit: By scale, I mean large corporations putting content in front of millions. I understand the appeal for smaller businesses where they probably weren’t going to pay an artist anyway.
Now extrapolate to all other artforms. Sculpture seems safe, for now, but only barely so.
And not a (botched) fake white/gray grid background that is commonly used to visualize transparency?
Probably about half of us here remember photos before the cell phone era. They were rare, and special, and you'd have a few photos per YEAR to look back on. The feel of photos back then, was at least 100x stronger than now. They were a special item, could be given as a gift. But once they became freely available that same amount of emotion is now split across many thousands of photos. (not saying this is good or bad, just increased supply reducing value of each item)
With image/art generation the same thing will happen and I can already feel it happening. Things that used to be beautiful or fantastic looking now just feel flat and AI-ish. If claymation scenes can be generated in 1s, and I see a million claymation diagrams a year, then claymation will lose its charm. If I see a million fake Tom Cruise videos, then it oversaturates my desire for all Tom Cruise movies.
What a time to be alive.
- Base pricing for a 1024x1024 image is about 1.7x what normal Nano Banana is ($0.067 vs. $0.039); however, you can now get a 512x512 image for cheaper, or a 4k image for cheaper than four 1k images: https://ai.google.dev/gemini-api/docs/pricing#gemini-3.1-fla...
- Thinking is now configurable between `Minimal` and `High` (was not the case with Nano Banana Pro)
- Safety of the model appears to be increased so typical copyright infringing/NSFW content is difficult to generate (it refused to let me generate cartoon characters having taken psychedelics)
- Generation speed is really slow (2-3min per image) but that may be due to load.
- Prompt adherence to my trickier prompts for Nano Banana Pro (https://minimaxir.com/2025/12/nano-banana-pro/) is much worse, unsurprisingly. For example I asked it to make a 5x2 grid with 10 given inputs and it keeps making 4x3 grids with duplicate inputs.
However, I am skeptical of their marquee feature: image search. Anyone who has used Nano Banana Pro for a while knows that it will strongly overfit on any input images by copy/pasting the subject without changes, which is bad for creativity, and I suspect this implementation behaves the same.
Additionally I have a test prompt which exploits the January 2025 knowledge cutoff:
Generate a photo of the KPop Demon Hunters performing a concert at Golden Gate Park in their concert outfits.
That still fails even with Grounding with Google Search and Image Search enabled, and more charitable variants of the prompt.
tl;dr the example images (https://deepmind.google/models/gemini-image/flash/) seem similar to Nano Banana Pro which is indeed a big quality improvement but even relative to base Nano Banana it's unclear if it justifies a "2" subtitle especially given the increased cost.
I would be happy to never see any more AI slop.
And actually, the link I saw a bit ago was this [0] which is more in-depth and has a lot more examples + prompts.
Just think we conceptually know what a brushless motor design looks like and it's just pixels. I guess even if it did produce the image we wouldn't know what it means.
I find it does a good job at isometric views from floor plans. However, I needed Gemini 3.1 Pro to be able to have a chance at rendering 3D human point of view images from floor plans.
Like... What are your inputs to the model? Empty renders of the space, or more fully decorated views/ photos? Do you have a light harness around this to help you discover the style you like and then stay consistent with it?
Do you find that giving a lot of context around the space you're designing helps (it hasn't in my attempts)?
This is precisely and importantly true. I just wonder if most of the world cares. I'd like to think so, but experience tells me that most of the world is satisfied with mediocre stuff. And I don't say this as a criticism; it's just a fact that artists have to come to grips with.
I totally get this, but on the other hand, we have definitely benefited from being able to take more photos. I have some older friends (pushing 80 or so) who sucked at taking photos, so 9 of 10 photos they have from their prime adult years raising their family are blurry to the point of not recognizing the people if you don't already know who they are.
They have great photos from the last 15-20 years, but of course they do, phone cameras are vastly superior to the point-and-shoot cameras from the 70s, and when you reflexively shoot a dozen photos every time you pose for a picture your odds are way better that one will come out clear, everyone looking at the camera, smiling, etc.
Afaik the only real competitor is Riverflow V2.
Because it's real.
It’s a huge practical problem to try and figure out authentic nature over the Internet. It’s already clear that people will pay for it, but it’s not at all clear that they will get it. If we imagine that the tools get better and more sophisticated, then there is no reason whatsoever to assume that the tools won’t be deployed to give the impression that is needed to make money.
I don’t think any of the above survives if we allow for AI to be used as it is currently being used. It only survives if you pretend that ahead of us is some invisible gate past which this technology will not go.
I agree on current AI art taste, but disagree that it can't be improved. I think art AI companies can hire skilled "taste makers" and use their feedback loop as RL for AI art models. I think this area will always be in flux, and will vary by subpopulation so it will be a job role always in demand.
Do you think taste is something that cannot be taught/learned? Are certain individuals just born with good taste; it's an immutable property?
Things that would take me an hour or so the old way take three minutes with NB.
But I can see this applying to small businesses. Something that some random person would have to spend an hour photoshopping can be done in a few minutes with NB.
Let's give him 2015 tech instead. Imagine if he used Illustrator to create the Mona Lisa. Is that much better?
we have user-preference rankings that put NB2 on top: https://arena.ai/leaderboard/text-to-image
The obvious ones stand out, but there are so many that are indiscernible without spending lots of time digging through it. Even then there are ones that you can at best guess it's maybe AI gen.
Soon many real OF models will be out of a job when everyone is able to produce content to their personal taste from a few prompts.
What in the world is a fake OF model?
Does "OF" stand for "of food"?
> 1... The narrative/life of the artist becomes a lot more important.
When I watch a movie, I don't care about the artist's life. I care about character life, that's very different.
> 2... Originality matters more than ever. By design, these tools can only copy and mix things that already exist.
It's like you're assigning divine capabilities to humans :) . Hyperbolizing a little, humans also only copy and mix - where do you think originality comes from? Granted, AI isn't at the level of humans yet, but it is improving here.
> 4... It's not going to get better, because the lack of taste isn't a technical problem.
Engineers are in the business of converting non-technical problems into technical ones. Just like AI now is way more capable than it was 20 years ago, and able to write interesting texts and make interesting pictures - something which at the time wasn't considered a technical problem - with time what we perceive as "taste" may likely improve.
> 5... Above all, AI art is uncool, which means it has no real future as a leading art form.
AI critics have long been mistaking the level for the trend. To give a comparison with SpaceX's achievements, think of a "you're currently here" marker on a list of "first, get to orbit, then we'll talk", "first, start regular payload deliveries to orbit, then we'll talk", "first, land the stage... send a crewed capsule... do that in numbers..." and then, currently, "first, send the Starship to orbit". "You're currently here" is the ever-present milestone that hasn't been achieved yet, which gives critics something to point to when objecting to the process as a whole, because, see, this particular thing isn't achieved yet.
You assume AI won't be able to make cool art with time. AI critics were shown time and time again to be underestimating the possibilities. Some people find it hard to learn in some particular topics.
We are 50 years into post-modernism. Can't imagine it can get any more important.
I predict emergent design will be the next big thing. Czinger[1] is a great example of what it may look like. A Rick Rubin-esque world, where the creator is more of a guide.
[1] Czinger uses stochastic optimization to converge to designs - https://www.czinger.com/iconic-design
Also, using AI will not allow you to better express yourself. To use an analogy, it will not put your self-expression into any better focus, but just apply one of the stock IG filters to it.
Less the narrative of the art's production and more the message that it's conveying.
I don't mean (necessarily) a political message or a message that can be put in to words. But the abstract sense of connecting with the human who created it some way.
This isn't just art though. An example: soon, Sora will be able to generate very convincing footage of a football match. Would any football fan watch this? No. A big part of why we watch football is that in some sense we care about the people who are playing.
Same with visual art. AI art can be cool but in the end, I just don't really give a shit. Coz enjoying art is usually about the abstract sense that a human person decided to make the thing you are looking at, and now you are looking at it... And now what?
This is why every time someone says "AI art sucks" and someone replies "oh yeah? But look at THIS AI art" I always wonder... What do you think art is _for_?
I do wonder though… were there other innovations that were uncool in their early years, where now nobody bats an eyelid?
Is that point just a generational/passage of time issue?
When a company sends an email or docu-sign, they don’t want to pay a courier.
Technology supplements or replaces jobs, often reducing costs. This is no different.
I'm torn on the scale thing. It definitely seems net negative. But I think we collectively underestimate just how deeply sick the existing thing already is. We're repulsed by image gen at scale because it breaks our expectation that images are at least somewhat based on reality, that they reflect the natural world or what we can really expect from a product, from a company, from the future. But that was already a bad expectation: when's the last time you saw a mcdonalds meal that looked like the advert? Or a sub-30$ amazon product that wasn't a complete piece of shit? Advertisements were already actively malicious fantasies to exploit the way our brains react to pictures. They're just fantasies that required whole teams of humans doing weird bullshit with lighting and photoshop, and I'm not sure that's much better. It was already slop. All the grieving we do about the loss of truth, or the extent to which corps will gleefully spray us with mind-breaking waterfalls of outright lies, I think those ships sailed a long time ago. The disgust, deceit, the rage we feel about genAI slop is the way we should have felt about all commercials since at least the 80s IMO.
Artists aren't doing it for the money. With advanced tools like these they would've iterated much faster and created much grander designs.
Art is about pushing limits of what's possible and AI just raises those limits.
Likewise with the sort of resurgence of vinyl, and the obsession over "old" point and shoot digicams.
I don't think I fully agree. Sure, people make so many photos that they don't have the time or the will to start looking through them all.
You can't just whip out your phone and start scrolling through thousands of photos with friends. It would get so boring so fast.
But if you put some effort into making a nice little selection of the best photos, that emotion is 100% still there.
I guess my stick figure hand drawn diagrams, a doc with few mistakes in grammar or spelling would be seen as more worthy to read as long as my ideas are sound. Right? :-)
I sit here thinking how wonderful and terrible of a time it is. If you can afford to sit in the stands and watch, it's exciting. There's never been so much change in such a short period of time. But if you're in the arena, or expecting to end up in the arena at some point, what terrifying moments lie ahead of you.
I never thought I'd say this, but I expect the arena is where I'll end up...I've enjoyed my time in the stands, but I'm running low on energy, capital and the will to keep trying.
(except The Mandalorian, and I can't believe I'm using the word "content" :/)
edit: Totally forgot about Andor & Rogue One sorry, great film and two seasons of top-notch storytelling.
I suppose if the AI was able to tell me a true and compelling story, I might not even mind so much. I just don't want to be spoon fed drivel for 15 minutes to find it was all complete made up BS.
Just being able to generate a vision and then be able to capture it in a prompt is an art within itself.
Scott Alexander has written about it:
Original Nano Banana (gemini-2.5-flash-image): $0.039 per image (up to 1024×1024px)
Nano Banana 2 (gemini-3.1-flash-image-preview): $0.045 per 512px image, $0.067 per 1K (1024×1024) image, $0.101 per 2K image, $0.151 per 4K image
Nano Banana Pro (gemini-3-pro-image-preview): $0.134 per 1K/2K image, $0.240 per 4K image
So at the most common 1K resolution, NB2 is ~72% more expensive than the original NB ($0.067 vs $0.039), but still half the price of NB Pro ($0.134).
source: https://deepmind.google/models/model-cards/gemini-3-1-flash-...
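The comparison above is simple arithmetic, and as a minimal sketch it can be checked in a few lines of Python. The prices are copied from the figures quoted in this comment; the dictionary keys and the `ratio` helper are names made up for illustration and are not part of any Google API.

```python
# Per-image prices in USD, copied from the figures quoted in the comment.
# The model keys and size labels are illustrative names, not API identifiers.
PRICES = {
    "nano-banana": {"1K": 0.039},
    "nano-banana-2": {"512": 0.045, "1K": 0.067, "2K": 0.101, "4K": 0.151},
    "nano-banana-pro": {"1K": 0.134, "2K": 0.134, "4K": 0.240},
}


def ratio(model_a: str, model_b: str, size: str = "1K") -> float:
    """Return how many times more model_a costs than model_b at one size."""
    return PRICES[model_a][size] / PRICES[model_b][size]


nb2_vs_nb = ratio("nano-banana-2", "nano-banana")
nb2_vs_pro = ratio("nano-banana-2", "nano-banana-pro")

# Cost of tiling four 1K images versus requesting a single 4K image from NB2.
four_1k = 4 * PRICES["nano-banana-2"]["1K"]
one_4k = PRICES["nano-banana-2"]["4K"]

print(f"NB2 vs NB at 1K:  {nb2_vs_nb:.2f}x ({(nb2_vs_nb - 1) * 100:.0f}% more)")
print(f"NB2 vs Pro at 1K: {nb2_vs_pro:.2f}x")
print(f"Four 1K images: ${four_1k:.3f}, one 4K image: ${one_4k:.3f}")
```

With these inputs the 1K ratio comes out at roughly 1.72, which matches the "~72% more expensive" figure, and a single 4K image is cheaper than four 1K images.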
The banana models (image) are different from the mainline models, but they confusingly leverage the same naming scheme.
- https://hunyuan.tencent.com/image/en?tabIndex=0
- https://seed.bytedance.com/en/seedream5_0_lite
someone shared benchmarks that differ from my experience though, so I may be biased
But yeah I am slowly trying to incorporate AI into my life (the delegation, work in my sleep part). I develop it is the funny thing (RAG agents) but yeah. Sometimes I get sold on it like "wait a minute maybe it can do that" but no. Can probably tell I don't get deep into the technical part I'm an API consumer. That's the thing I realize too, can only know so much about a topic if you're spread thin/a generalist.
That was the beginning of my journey into understanding what proper verification/vetting of a source is. It's been going on for a long time and there are always new things to learn. This should be taught to every child, starting early on.
The positive aspect of this advance is that I've basically stopped using social media because of the creeping sense that everything is slop
a lot of these accounts mix old clips with new AI clips
or tag onto something emotional like a fake Epstein file image with your favorite politician, and pointing out it's AI has people thinking you’re deflecting because you support the politician
Meanwhile the engagement farmer is completely exempt from scrutiny
It's fascinating how fast and unexpected the direction goes
And here we come back to the age-old "can you separate an artist from their art" because I'd argue when you watch a movie you are watching a product of their life
I’m fairly certain the original comment was referring to instances where the artist is the character/primary subject.
Finally, someone pointing out all of this is just people announcing what has been in play for half a century.
Cameras are now "enhancing" photos with AI automatically. The contents of a 'real' photo are increasingly generated. The line is blurring and it's only going to get worse.
Or making video editing + free, global publishing platform did for film? (see: doom scrolling).
Depends on what the future of VR worlds looks like, and what the viewer's place is in them.
We have no idea, and most people are just guessing in a way that flatters some understanding of art that they have. We also frankly have no idea what the permanent relationship of humans to art is even without AI.
The television is less than 100 years old. There aren’t very many, but there are some people alive today who were alive before the television was created. The computer is about 80 years old. The whole idea of photography and of recorded audio is less than 150 years old.
We are still living in the aftershocks of industrial production of art. It is foolish to imagine that in the midst of this chaos, we can point the way forward with ease.
They said you couldn't become a good photographer if you didn't learn it with the limitation of film that forced you to make each shot count. Photoshopping a picture made it "not a real photo" and was banned from online communities and irl events, drawing in photoshop was not considered art. I find it very ironic that digital artists are repeating the exact same argument as the one used against their art
AI is incompatible with capitalism, but the world isn't ready for that. So we'll have a prolonged period of intense aggregation where more and more value is attributed to systems of control that already have more than they could ever spend, long after the free parts could have provided for basic human needs.
In other words, the masters existed because they had benefactors and a market for their art and inventions. Today there are better artists and inventors toiling in obscurity, but they won't be remembered because they merely make rent. Which gets harder every day, so there's a kind of deification of the working class hero NPC mindset and simultaneously no bandwidth for ingenuity (what we once thought of as divine inspiration).
Terence McKenna predicted this paradox that the future's going to get weirder and weirder back in 1998:
These days, through commissions, art is a much more viable profession than it ever was.
I take a hundred photos on a trip, my phone uses AI (not even the new fancy AI, but old 5-10 year old stuff to detect smiling faces and people in frame) to pull out less than a dozen that are worth keeping. Once a month or so I get fed a reminder of some past trip.
This isn't any different than before. The number of photos taken is greater, but the overall number of worthwhile photos from a given trip is about the same.
Even if there were a million fake Tom Cruise movies I would still like Edge of Tomorrow (even if it had been AI made).
I don't have inside info, but everything we've seen about gemini3.0 makes me think they aren't doing distillation for their models. They are likely training different arch/sizes in parallel. Gemini 3.0-flash was better than 3.0-pro on a bunch of tasks. That shouldn't happen with distillation. So my guess is that they are working in parallel, on different arches, and try out stuff on -flash first (since they're smaller and faster to train) and then apply the learnings to -pro training runs. (same thing kinda happened with 2.5-flash that got better upgrades than 2.5-pro at various points last year). Ofc I might be wrong, but that's my guess right now.
You're completely misunderstanding what the product being sold is.
Has this thought process ever worked in real life? I know plenty of seniors who still believe everything that comes out of Facebook, be it AI or not, and before that it was the TV, radio, newspapers, etc.
Most people choose to believe, which is why they have a hard time confronting facts.
net positive to society
-They simply aren't into real women/men (so you couldn't even pay a model to do what they're looking for).
-They want to play out fantasies that would be hard to coordinate even if you could pay models (I guess this is more on the video side of things, but a string of photos can be put together into a comic)
-They want to generate imagery that would be illegal
Based on this, I would guess fetish artists (as in illustrators) are more at risk than OF models. However, AI isn't free. Depending on what you're looking for, commissions might be cheaper still for quite a while...
In a hypothetical world of "AI can produce a lot of extremely high quality art", you can easily find (or commission) AI art you would absolutely love. But it probably wouldn't be something that anyone else would find a lot of value in?
There will be no AI-generated Titanic. There will be many AI-generated movies that are as good as Titanic, but none will become as popular as Titanic did.
Because when AI has won art on quality and quantity both, and the quality of the work itself is no longer a differentiator against the sea of other high quality works? The "narrative/life of the artist" is a fallback path to popularity. You will need something that's not just "it's damn good art" - an external factor - to make it impactful, make it stick in the culture field.
Already a thing in many areas where the supply of art outpaces demand. Pop music, for example, is often as much about making sound as it is about manufacturing narratives around the artists. K-pop being an extreme version of the latter lean.
I can't tell if you're being facetious. But being an embodied consciousness with the ability to create is as divine as it gets. We'd do well to remember.
At least in popular, mainstream culture, the viewer is heavily invested in the identity of the artist. The quality of the "art" is secondary. That's how we get music engineered by committee. And it's how we get paparazzi, People Magazine, and so forth.
On the other hand, this isn't anything new at all. We've had this kind of thing for decades. Real art still manages to survive at the margins.
But even then – people obviously go watch movies because they like the actor/director involved. It’s not really clear why anyone would care about an AI actor. People want to watch people, not imitations of them.
The rest of your comments seem to be summarized as “it has gotten better and therefore it will eventually solve all problems it has now.” Which may be true in a technical sense, but again this is not taste.
A technical company like Space X really has nothing to do with this conversation, and I think you missed my point about it being uncool. It’s not about critics, it’s about culture at large.
At this point I think identifying a work as AI-created makes people instantly devalue it. We are rapidly approaching the point where no one wants to admit something is AI-created, because it comes with negative perceptions.
Originality comes from humans experiencing the world and interacting with it. What AI tool is a living being interacting with the world? None, of course. Hence the constant generic slop images of Impressionism or some other already-existing art style.
Just look at the images in the link: this is the best they can do? A kangaroo at a cafe in Paris? Could anything be more devoid of good taste?
Every human being is unique, both biologically and experientially. Until an AI can feel and have a lived experience, it can not create art.
Art is not a problem to be solved.
It's an ethical conundrum because we're not paying anyone, but we don't have the money to pay anyone, and it's good enough for our budget.
But we're getting used to the process of changing a part of the text in a few seconds without any artist involved and for 0$.
I guess that soon we'll be able to create voice samples from known personalities for a few $, with prices based on the popularity of the artist and some sanity checks based on the artist's preferences.
Feb 26, 2026
Our latest image generation model offers advanced world knowledge, production-ready specs, subject consistency and more, all at Flash speed.
Naina Raisinghani
Product Manager, Google DeepMind
Google DeepMind is launching Nano Banana 2, a new image model that combines the advanced features of Nano Banana Pro with the speed of Gemini Flash. You can now access high-quality image generation with faster editing and iteration across Google products like the Gemini app and Google Search. Also, Google continues to improve its SynthID technology with C2PA Content Credentials to identify AI-generated content.
Summaries were generated by Google AI. Generative AI is experimental.
Google has a new AI image model called Nano Banana 2. It's super fast and combines the best parts of their other image models. You can now make images quicker with better quality and more control. It's available in Google apps like Gemini and even for creating ads.
In August of last year, our Gemini Image model, Nano Banana, became a viral sensation, redefining image generation and editing. Then in November, we released Nano Banana Pro, offering users advanced intelligence and studio-quality creative control. Today, we’re bringing the best of both worlds to users across Google.
Introducing Nano Banana 2 (Gemini 3.1 Flash Image), our latest state-of-the-art image model. Now you can get the advanced world knowledge, quality and reasoning you love in Nano Banana Pro, at lightning-fast speed.
Nano Banana 2 brings the high-speed intelligence of Gemini Flash to visual generation, making rapid edits and iteration possible. It makes once-exclusive Pro features accessible to a wider audience, including:
A flat lay infographic depicting the water cycle
Triptych infographic comparing cloud types
Museum Clos Lucé in Synthetic Cubism style
Localized "Native Wildlife" sign
Read the prompts: Water Cycle 1, Cloud Infographic 2, Cubism 3, Wildlife Sign 4
Nano Banana 2 also dramatically closes the gap between speed and visual fidelity, delivering high-quality, photorealistic imagery. Here’s what our newest model offers and has improved on from the original Nano Banana:
Joyful characters and items at a farm
Fluffy friends building a treehouse
Misty panoramic aerial shot of a verdant valley
Highly stylized pop-art fashion portrait in different aspect ratios
Read the prompts: Farm 5, Treehouse 6, Valley 7, Portrait 8
Whatever your needs, we now offer the perfect tool for every workflow: Nano Banana Pro for high-fidelity tasks requiring maximum factual accuracy, or Nano Banana 2 for rapid generation, precise instruction following and integrated image-search grounding.
Nano Banana 2 is rolling out today across Google products, including:
Try Nano Banana 2 in the Gemini app, using the new templates feature.
World knowledge from Nano Banana 2 in AI Mode in Search.
Subject preservation from Nano Banana 2 in Flow.
Read the prompts: AI Mode in Search 9, Flow 10
As generative media evolves, so must the tools we use to identify and understand it. We continue to deepen our provenance approach. By coupling our state-of-the-art SynthID technology with interoperable C2PA Content Credentials, we provide users with a more holistic and contextual view of not just if AI was used, but how.
Our provenance tools are already making an impact. Since its launch in November, our SynthID verification feature in the Gemini app has been used over 20 million times across various languages, helping people identify Google AI-generated images, video and audio. We’ll soon be bringing C2PA verification to the Gemini app, too.
Prompt: High-quality flat lay photography creating a DIY infographic that simply explains how the water cycle works, arranged on a clean, light gray textured background. The visual story flows from left to right in clear steps. Simple, clean black arrows are hand-drawn onto the background to guide the viewer's eye. The overall mood is educational, modern, and easy to understand. The image is shot from a top-down, bird's-eye view with soft, even lighting that minimizes shadows and keeps the focus on the process.
Prompt: Triptych infographic comparing three types of clouds: Cumulus, Stratus, and Cirrus. Each panel shows the cloud type in a dramatic sky with a bold label. High-contrast comic style. AR: 16:9
Prompt: Create an image of Museum Clos Lucé. In the style of bright colored Synthetic Cubism. No text. Your plan is to first search for visual references, and generate after. Aspect ratio 16:9
Prompt 1:
An intimate, naturalistic cinematic close-up reveals a small, intricately illustrated sign made of recycled material, showing drawings of local birds and flowers. Delicate script below reads: "Native Wildlife: Please Observe from a Distance." Soft, diffused light filters through the leaves of a nearby fern, casting gentle shadows. The background is a soft blur of vibrant green foliage, emphasizing respect for the delicate ecosystem.
Prompt 2:
Take this concept and localize it to an Indian setting, including translation of all the text to Hindi
Prompt: Create an image of these 14 characters and items having fun at the farm. The overall atmosphere is fun, silly and joyful. It is strictly important to keep identity consistent of all the 14 characters and items.
Prompt: Create a funny 6 part story with these 3 fluffy friends building a tree house. The story is thrilling throughout with emotional highs and lows and is ending in a happy moment. Keep the attire and identity consistent of all 3 characters, but their expressions and angles should vary throughout all 6 images. Make sure to only have one of each character in each image. Generate 6 images one at a time. Each image should be a separate output in 16:9 format.
Prompt: This aerial shot captures a dramatic, misty landscape, likely a valley or glen, characterized by rolling, verdant hills and a winding river or loch. The photography style leans towards a moody and atmospheric aesthetic, emphasizing the grandeur and isolation of nature. The camera angle is high, looking down into the valley, providing a sweeping panoramic view that highlights the immense scale of the surroundings. The dominant colors are various shades of deep green, ranging from lush emerald in the foreground fields to darker, more muted tones on the distant mountains. The water, a central feature, appears as a serene, dark blue-grey, reflecting the overcast sky. The sky itself is a blend of grey and white, heavily laden with clouds and mist that cling to the mountain peaks, creating a soft, diffused lighting style. This natural, soft light minimizes harsh shadows and enhances the overall ethereal feel of the scene. The subject is a majestic natural landscape. In the foreground, a darker body of water curves around a patch of bright green fields, bordered by scattered trees and shrubs. A narrow, winding road snakes through the green hills on the right side of the frame, disappearing into the distance. Further into the valley, a larger, lighter blue-grey body of water stretches between towering, green-clad mountains. These mountains rise steeply on both sides, their peaks shrouded in the low-hanging mist, creating a sense of depth and mystery. The overall impression is one of serene, untamed wilderness, hinting at the rugged beauty of places like the Scottish Highlands or the Lake District.
Prompt: Cinematic still, evoking a vibrant, dreamlike quality often found in highly stylized musical dramas or whimsical comedies, with a composition style reminiscent of a master of bold, graphic imagery. The camera is positioned slightly low, looking up at the subject, emphasizing their commanding presence and the dramatic flair of their outfit. The color palette is exceptionally bold and high-contrast, dominated by electric blue and shocking pink, with a bright yellow accent. The background is a solid, uniform cerulean blue, providing a stark, graphic backdrop that makes the subject pop. The subject is a young, dark-skinned individual with tightly coiled hair, wearing an incredibly striking suit. The suit's fabric features an audacious pattern of swirling, wavy lines in electric blue, interspersed with large, concentric circles in hot pink, overlapping and radiating outwards. The tailored blazer has wide lapels and bell sleeves, worn over a sharply pressed yellow collared shirt. The matching trousers are wide-legged, dramatically flaring out towards the ground, with sharp creases down the front. The individual wears bright yellow, heart-shaped sunglasses and large, pink, circular earrings. Their hands are placed on their hips in a confident, almost defiant pose, and their gaze, though hidden behind the sunglasses, projects an aura of cool assurance. The ambiance is one of high fashion, playfulness, and unadulterated self-expression, imbued with an almost surreal, pop-art energy.
Prompt: Create a pencil sketch of a pufferfish nest. Not a clean digital drawing but something with visible pencil strokes and that dusty graphite look
Prompts:
Boat: 35mm soft blur, vibrant colors, soft light. Sailing at the ocean this baby kangaroo wearing a lilac raincoat sailing, motion blur, this hat is sitting messy on the boat
Cafe: 35mm soft blur, vibrant colors, soft light. Sitting on this seat of the café this baby kangaroo sipping coffee, a person sitting behind at the cafe blurred is wearing this hat
Hotel: 35mm soft blur, vibrant colors, soft light. Relaxing on the bed this baby kangaroo wearing a bathrobe relaxing, at the back there is a coat rack with this hat
Castle: 35mm soft blur, vibrant colors, soft light. Jumping inside the inflatable castle this baby kangaroo wearing this hat jumping, motion blur
It wouldn’t show me the exact things I wanted, but got close enough that I could test ideas and iterate quickly.
Larian Studios most recently was under fire for this [1]. Like I can see a director going “what would X look like?” and then speeding over to the concept artists for a proper rendition if they liked it. I don’t think this is at scale though. Any large business is just going to get rid of the concept artists.
[1]: https://www.pcgamer.com/games/rpg/baldurs-gate-3-developer-l...
Humans do that a lot but it's not all we do. Go to a museum that has modern(ish) art. It's pretty incredible how diverse the styles and ideas are. Of course it's not representative of anything. These works were collected and curated exactly because they are not average. But it's still something that humans made.
I think what people can do is have conceptual ideas and then follow the "logic" of those ideas to places they themselves have never seen or expected. Artists can observe patterns, ask how they work and why they have the effect they do and then deliberately break them.
I'm not sure current genAI models do these sorts of things.
Nano Banana was technically impressive the first time, but after Seedance it's not really. It's all just an internet pollution machine anyway.
People who actually care about art, if given a chance to see it, yes.
Of course, it being done by da Vinci is not some random fact about the painting - as if a painting is a mere artifact.
However, I tried "a picture of jacquesm planting a flag on the Moon" for a laugh, and I have to hand it to Google as the person was in a spacesuit, as they should be, and totally unidentifiable! :-D
So you were making book covers? Ah, so sorry. Nobody really cared that it was you.
And you can probably extend that to what's between the covers...
And we were lucky if even 1 picture per roll was worth keeping long term. And my family almost never looks through those photo albums.
Digital picture frames with a curated rotation of old scans and new digital pictures are what made pictures great for my family.
And not just seniors. I see people of all ages who are perfectly happy to accept artificially generated images and video so long as it plays to their existing biases. My impression is that the majority of humanity is not very skeptical by default, and unwilling to learn.
I still think, even with that, that like most predictions of AI taking over any content industries, the short-term predictions are overblown.
This is a good point. My gut reaction is “well at least someone was paid to do it and can continue to keep society/the economy going”.
I can see the other side where that’s a soulless job. Not sure what’s worse: a soulless job where your skills apply, or even fewer jobs in a competitive industry.
That is unlike any artist that I know and I know quite a lot of them. They love their work and the process but they also need to eat. And that included those mentioned above.
The only thing AI art makes possible that wasn't possible before is the scale of slop
There was a study around this exact thing:
https://mitsloan.mit.edu/ideas-made-to-matter/study-gauges-h...
Also, I suspect that we'll soon see the same pattern of open weights models following several months behind frontier in every modality not just text.
It's just too easy for other labs to produce synthetic training data from the frontier models and then mimic their behavior. They'll never be as good, but they will certainly be good enough.
This is a very, very weak criterion for divinity. If this is truly it, we should prepare with great haste for the arrival of our artificial gods.
Because by this (IMO silly) metric it seems they will be more divine than us.
When I buy art, I have often spoken with the artist in the past couple days, or I am aware of their history and story and how they developed their art as a response to some other movement or artist collective.
It's rare for people to buy art just bc oil paints go brrrrrm
And we have AI generated influencers now, ex. https://www.instagram.com/imma.gram, so why wouldn't people care about an AI the same way they do about people they never meet?
My thought is the large corps that could afford it, still won’t because it’s a cost they don’t need to incur. For them it’s not even a moral conundrum.
You might be right here. Two points though - first, we don't know if current AI is actually incapable of something in particular; we didn't find this, didn't prove it. Second, we might have a different AI approach, which would actually be capable of these things you mention. To me, it's way too early to dismiss AIs - at least in principle - regarding all of this.
You could generate "pregnant Elon Musk with four arms and three eyes doing yoga poses" because the image models have enough visual concepts of each of those individual things, but that specific image is (likely) not in any training dataset.
A big part of it is also the feeling of "connection" with the creator via messages and whatnot, but that too can be replicated (arguably better) by AI. In fact, a lot of those messages are already being generated haha.
Agree that if you are an Artist this is not going to be a big concern to you.
That's engineering, if that.
Art isn't, and has never been about that.
AI is well on the way to eliminating human-made art, since the skills to actually make art will be lost to the skill of being able to describe art. You know, since the only thing that matters is reducing costs.
I suspect we have an underlying disagreement here regarding the assumption that AI - in general, not necessarily today's models - isn't qualitatively different from the human mind. The part "Originality comes from humans experiencing the world and interacting with it" isn't an accepted truth, and even today AIs do interact, in a limited sense, with the world - so "None, of course" is questionable. And even if so, concluding "Hence... slop..." seems like a jump in reasoning. For example, why don't you think this slop is more like a child's early paintings? Just because today's AIs have limited means to learn in the process?
> I think you missed my point about it being uncool. It’s not about critics, it’s about culture at large.
What is it about culture at large? The SpaceX analogy was brought up to illustrate how arguments about AI incapabilities are applicable today, but not necessarily tomorrow - just like arguments about SpaceX's inability to reach a particular goal quite a few times turned out to be a matter of - not so long - time.
I agree that many AI results today can be uncool. But how do you know it's not passing the uncanny valley period? How can you know they can't be cool eventually?
> people obviously go watch movies because they like the actor/director involved. It’s not really clear why anyone would care about an AI actor.
Let me stretch a little to illustrate here. Imagine "personal" experiences of AI - making AIs unique. One of those AIs consistently produces good movies, which, if you honestly don't judge by the authorship, are actually good. Yes, people may not care about non-existent AI actors, but they may still care about an existent AI author :) . Do you think it's impossible?
> People want to watch people, not imitations of them.
How can you tell the difference? You're watching a movie with actors who are not familiar to you. Would you refuse to watch just for this reason? You just came to somebody's party, and here's a movie going on, and you watched it to the end, because it looked interesting, and you don't know anything about producers, actors etc. - you still can talk about the movie, will you be predominantly worried that it's "AI slop" even if it looks great? Suspiciously great maybe?
> The rest of your comments seem to be summarized as “it has gotten better and therefore it will eventually solve all problems it has now.” Which may be true in a technical sense, but again this is not taste.
It's hard to define taste, to be honest. People can definitely have different tastes, almost by definition. But more importantly - why do you think AI products may not have tastes?
> At this point I think identifying a work as AI-created makes people instantly devalue it. We are rapidly approaching the point where no one wants to admit something is AI-created, because it comes with negative perceptions.
Yes. But doesn't it look like a prejudice? Of course we can point to how many times we looked at it and didn't get some perceived value out of the work, and got annoyed that we spent time and effort but didn't get results - but what if we mostly get results from AI works? Do you think that's impossible?
Also for VHS camcorder footage
* On first seeing a photograph around 1840, the influential French painter Paul Delaroche proclaimed, "From today, painting is dead!" [1]
* Charles Baudelaire, in 1859: "As the photographic industry was the refuge of all failed painters, too ill-equipped or too lazy to complete their studies, this universal infatuation bore not only the character of blindness and imbecility, but also the color of vengeance. [...] it is obvious that this industry, by invading the territories of art, has become art’s most mortal enemy" [2]
[1] https://www.barnesfoundation.org/whats-on/early-photography
Not only does 1999 prevent humans from becoming too advanced and inventing new AI again, it is a believable and comfortable era. A perfect time, perfectly balanced between analog and digital.
The introduction of massive amounts of low-quality creations has made high-quality art much more in demand. Low-quality AI art and music have become a huge blinking indicator that says "SLOP". Hand-made, uniquely styled, quality art now has a "luxury goods" vibe, and people are willing to pay a premium.
If this becomes a trust signal, you can prepare for next gen models to do stick figure hand-drawn-like diagrams with spelling mistakes.
Yes, it’s crude, and you have to do the face tagging, but I think it’s a huge improvement over not having that.
would you recommend this workflow to others, or just noting that it is what you did? any regrets, road blocks, frustrations?
a ballpark price would also be interesting: total cost of SketchUp license + AI token cost + Fiverr modeler + draftsman etc. I assume under $1k?
This is like the last mile for online presence. The average barber out here doesn't use Squarespace, barely knows how to use Facebook and doesn't touch GenAI. But they can still cut your hair pretty well - tech savviness doesn't have a huge connection to business competence out here.
Average person won't notice, and would not care either way.
To each their own, but I think Andor is, by far, the best post-ROTJ output.
Mandalorian started strong, with cool spaghetti Western vibes, and then ended up devolving into mediocrity too. In my opinion.
Haven't watched Andor yet.
You could ask "how many more movies should we make?" and the answer would be "there is no limit, I always want more"
"I like this thing therefore more of it is obviously better"
I think it takes maturity to say "I like this thing and I don't want more of it."
New generations get unlimited brain rot delivered through infinite scroll, don't know what a folder is, think everything is "an app" and keep falling for the "technology will free us from work and cure cancer"
There was a sweet spot during which you could grow alongside the internet at a pace that was still manageable and when companies and scammers weren't trying so hard to rob you of your time, money and attention
the high end probably pay the same sort of tax as professional footballers
It will be a golden age where the core differentiating factor is true talent and ideas and execution and not any gatekeeping by degrees, connections or budget.
But at this point, OnlyFans is so synonymous with egirls that suggesting someone has an account is used as a way to insinuate they sell pictures of themselves.
I begrudgingly have to admit it is a very good movie
With how much data goes into the frontier systems, and how much of it gets captured by them, an AI might have, in many ways, a richer grasp of human experience than the humans themselves do.
You were only ever one human. An LLM has skimmed from millions. You have seen a tree, and the AI has seen the forest it stands in.
This is...not true? Or at least I can find no basis for your claims.
UK Copyright for books and sculpture predated the invention of photography and existed in a completely recognizable form ("a copyright term of 14 years, with a provision for renewal for a similar term, during which only the author and the printers to whom they chose to license their works could publish the author's creations.[4] Following this, the work's copyright would expire, with the material falling into the public domain"[1]).
Paintings and photographs gained copyright protection at the same time, in the 1862 Fine Arts Copyright Act, seemingly because it seemed natural to extend the haphazardly covered fine arts more completely.
First, "AI is thereby incapable" is a hypothesis, not a fact - how would you prove that you have to "live" to produce art? You might feel this way, you may suggest some correlations here - but can you really prove that?
Second, I don't see it as impossible for AI to be - to various degrees - an agent in the world. I think that's already happening actually - they are interacting with the world even today, in some limited sense, through our computers and networks, and - today - not many of them actually "learn" from those interactions. But we're in the early days of this - I suspect.
What's astonishing about the present is that even PKD did not foresee the possibility of an artificial being not only being constructed from whole cloth but actually tailored to each individual.
Agreed. In my opinion, the primary limitation of the porn models is actually poor labeling of the training set. The company that manages to produce a well-labeled, porn-tuned AI image model is going to absolutely clean up.
The extractive dark patterns that will emerge from a parasocial chat "AI relationship" that can generate porn images relevant to the chat on the fly will be staggering. Once that proceeds to being able to generate relevant video, all holy hell is going to break loose.
And that is the gist of the problem, isn't it? As we approach our forties and beyond, chances are we have lived more than half our lives. So do I really want to spend hours watching something I might hate and that might leave a bad taste in my mouth? (See Game of Thrones season 8 or, worse, Westworld, the HBO series, where I don't even want to know what happened in season 3 or 4.) I am sure there are people who will enjoy those but for the average person it is highly unlikely.
See:
- All of Wookieepedia and most of Star Wars Expanded Universe.
- "The Hunt for Gollum".
- Every movie in the franchise after "Alien" and "Aliens".
- The sadly upcoming expanded universe/sequels/shows for Blade Runner.
Etc, etc. Everyone has their exceptions ("this one was cool"), but in general the point stands: fandoms ruin everything. They simply don't believe in the adage that "less is more". They always want MORE, and the industry is only happy to oblige.
He can also send back a picture of the real product for approval. I think the primary difference here is the level of involvement. A quick consult and then the professional "makes it all work" versus hands on design with the client figuring out all the details for himself.
Because it can't feel. Get used to it. It can't feel, and whatever it comes up with would be an imitation of someone real who can feel. So it can generate stuff that can cater to a taste, but the thing itself can't have taste.
It is fundamental. Arguing about it all day won't change it.
Mandalorian didn't do much for me; too gamey/Marvel-ey/cartooney.
Much like the Star-Bellied Sneetches, when the quality of some ad format becomes untethered from the cost of production and placement, marketers will flock to some alternative.
YouTube influencers fill[ed] that niche for a while because content milling SEO spam and fake reviews is a lot more expensive if you present the results in video form with good production values. (Not sure how long that will be true, since AI is getting better at short-term video).
For anime/non-photographic content that essentially exists (Pony, then Illustrious, then probably some new-fangled thing by now that I don't even know about), thanks to the meticulously tagged booru image corpus. However, as strong as these models are on matters of anatomy and kinks, they're limited in other ways due to the hugely biased dataset and dependence on tag soup prompts rather than natural language (many find the latter a plus, not a minus, though).
I haven't heard of any proprietary/cloud-based NSFW model that would be massively better than what's available for free. There are many NSFW-friendly services, but by and large they're just frontends to models trained by other people.
If so, I will like Andor. I really liked "The Convert".
These entities, whoever they are, act on our world, they are real, and more and more over time they will become independent from humans, eventually becoming a different species that can self-replicate.
For now they need legs and arms to interact with the physical world but I am certain that 100 years from now they will be an integral part of the society.
I already see today LLMs slowly taking actual legal decisions for example, having real world impact.
Once they get physical, perhaps it will be acceptable to become friends with a robot and go on adventures with it. Even getting robosexual?
We are not that far away. If I can have my buddy to carry my backpack and drive for me I'll take it. Already today. Not tomorrow.
We can mix and match the media we choose to view or keep so easily, when previously there was so much more material and opportunity cost to choosing what to shoot, develop, keep, and share. I think that inevitably loses some meaning.
See, I don't believe that for even one second. They are just very clever calculators, that's all. But they are also dumb like a brick most of the time. It's a pretend intelligence at best.
We will only prove humans are not.
The best time to start paying attention was ten years ago, when the first Go grandmaster was defeated by a "pretend intelligence." I sure wish I had.
The next best time to start paying attention is now.
A computer playing Go is intelligent now? Is this the kind of conversation we're having?
>>I sure wish I had.
And how would you have changed your decisions in those last 10 years if you did?
>>The next best time to start paying attention is now.
I am paying attention, I use these tools every day - the whole idea that they are intelligent and if only you gave them a robot body they would be just normal members of society is absurd. Despite the initial appearance of genius they are just dumb beyond belief, it's like talking to a savant 5 year old, except a 5 year old can actually retain information for more than a brief conversation.