The star of the show here is https://platform.worldlabs.ai/ (author works there, I don't) which is really good. There's also Meshy.ai (which this repo doesn't seem to use?) for non-scene stuff that's right up there in quality. There's texturing, auto-rigging, etc.
The latest VLMs have true pixel image grounding, which means you can totally ask your AI about the pixel coordinates of things, so you get 3D perception for edits and anything else you need.
I'm actually surprised I don't see this stuff being used more; I think it's because most pipelines are hard-baked with the assumption that your 3D assets are files you get from an artist, not something you can imagine up in minutes in a script. The technology is moving faster than the industry can keep up with.
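To make the jump from pixel coordinates to a 3D position concrete, here is a minimal sketch using the standard pinhole camera model. It assumes you also have a depth value for the grounded pixel and know the camera intrinsics (fx, fy, cx, cy); the function name and the numbers are made up for illustration and are not taken from any of the tools mentioned here.

```python
def unproject(u, v, depth, fx, fy, cx, cy):
    """Convert pixel (u, v) plus a depth value into a camera-space 3D point.

    Assumes a pinhole camera: fx, fy are focal lengths in pixels and
    (cx, cy) is the principal point. All names here are illustrative.
    """
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return (x, y, depth)


# A pixel at the image centre lies on the optical axis.
centre_point = unproject(320, 240, 2.0, fx=500.0, fy=500.0, cx=320.0, cy=240.0)

# A pixel 100 px to the right of centre, at the same depth.
right_point = unproject(420, 240, 2.0, fx=500.0, fy=500.0, cx=320.0, cy=240.0)
```

Where the depth comes from (a depth model, a splat, a mesh raycast) is a separate choice; the sketch only covers the geometry.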
There's very little incentive to publicly admit you're using this tech. In fact there are a lot of reasons not to.
May I ask if Claude is the only option to use the tool?
Sol Roth
I remember like seventeen years ago, Microsoft had "PhotoSynth", which would make 3D environments based on a bunch of images, and seventeen-year-old-tombert thought it was one of the most amazing things to ever be done on a computer.
Doing this with just one image makes this at least an order of magnitude cooler. I will be playing with this over the weekend.
I'm at a crossroads: do I opt for 3D mesh isometrics, with higher hardware requirements for mobile phones, or stick to isometric sprites, which nobody seems to be generating via AI reliably (happy to be corrected here if anybody does find a way)?
Example: https://uthana.com/app/preview/cXi2eAP19XwQ/mH7opbcqZE4P
https://github.com/Microsoft/TRELLIS
I've been trying to use this to generate 3d character models from images. I am enjoying 3d printing these models to mess with my kids.
Not much of what I've found runs on local models but I'm always on the lookout. Meshy.ai (mentioned here) offers really nice generation but the cost adds up quickly.
I used to spend all day on Bryce3D creating 3D landscapes, leaving the computer on all night to render like 10 seconds of video of a flyover sunset
bit of a rant here but we are definitely speedrunning 3D and it's just going to get wilder once we get glasses-free bounded AR...projecting 3D video streams and objects in front of our phones (this one I know Samsung is already working on) and rooms
Just find an artist or learn to draw
Tencent's Hunyuan3D (https://github.com/Tencent-Hunyuan) is a single/multi view photogrammetry replacement, which image-blaster is based on.
Facebook Research has extended SAM to 3D (https://github.com/facebookresearch/sam-3d-objects), separating as 'Objects' and 'Body'.
The workflows to make meshes watertight for 3D printing are all pretty effective.
Out of 30 objects I tried, only 4 had relative success. And even then, the topo ain't great.
My Pixel 6 has a photo sphere mode on the camera which is the same thing
If you aren't ready to rig and adjust model poses in a 3D tool, you might be better off generating each movable model part as a separate mesh and just arranging them in space before doing the above.
I ended up thinking it might be easier to generate rigged models, animate them, and capture from an iso perspective, then do some kind of pixel art style transfer on the masked sprite sheet. Eventually I realized my kid didn't really care too much about the visuals so I didn't get too far with it.
Also, try Meshy and look at how many polygons or triangles you get from any of the model objects. Hundreds of thousands, and even after you retopologize it's still in the high tens of thousands.
It's not too bad. It's fantastic for blocking out scenes for upscaling and conversion to video. Use it with 3D and a GPT Image 2 or Nano Banana Pro pass.
Eg.
https://getartcraft.com/media/m_xxtxfsdq3m5fwg5b8hjvkwxp4qyb...
https://getartcraft.com/media/m_gb4qexzqz1grskhn28f235fdtxfk...
https://getartcraft.com/media/m_p6wh750n0gc1s4a0ts88kp5pvnvn...
(All of ArtCraft is on github: https://github.com/storytold/artcraft )
> And even then, the topo ain't great.
This is true. The topo sucks.
Still, people are making delightful games like this (which was 100% vibe coded) :
From what I can tell, it takes an image and first segments it into objects versus environment, then sends the environment to Marble 1.1 to generate a Gaussian splat and sends all the isolated individual objects to Hunyuan to generate GLB model files.
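If that reading is right, the routing step could be sketched roughly like this. This is not the repo's code: `Job` and `plan_jobs` are names invented for illustration, segmentation is assumed to have already happened, and only the model names and file extensions are taken from the README.

```python
from dataclasses import dataclass


@dataclass
class Job:
    model: str       # generation model named in the README
    source: str      # path of the image sent to that model
    output_ext: str  # file type the README says comes back


def plan_jobs(environment_image, object_images):
    """Route the environment to Marble and each isolated object to Hunyuan."""
    jobs = [Job("marble-1.1", environment_image, ".spz")]
    jobs += [Job("hunyuan-3d", image, ".glb") for image in object_images]
    return jobs


# Hypothetical file names standing in for the segmentation output.
jobs = plan_jobs("environment.png", ["chair.png", "lamp.png"])
for job in jobs:
    print(f"{job.source} -> {job.model} -> {job.output_ext}")
```

One environment job plus one job per object also means the object jobs are independent of each other, so they could be run in parallel.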
But the esper interface is all voice activated, and doesn't talk back - which I think is very prescient, and more likely the way things will go. I'd much rather voice assistants just did the thing that I want them to do rather than talk back to me
Haven't used it professionally mainly because the titles I've worked on lately aren't realistic so you can't really procure the materials to scan.
image-blaster
Creates 3D environments, SFX, and meshes from a single image using Claude skills, World Labs, and FAL.
Can take you from an image to a fully meshed 3D environment in < 5 minutes, great for jumpstarting 3D work. Go full blast.
git clone https://github.com/neilsonnn/image-blaster
cd image-blaster
claude (install with curl -fsSL https://claude.ai/install.sh | bash)
input/ directory and ask Claude to blast it and confirm each step with me.
By default image-blaster will use your input image to create:
Meshes (.glb, .obj) of all dynamic objects
Gaussian splat (.spz) of the static environment
SFX (.mp3)
You can embed image-blaster under the assets of any game engine, DCC software, or web app.
IMAGE-BLASTER uses a few generation models:
marble-1.1 - World Labs Marble model creates the explorable environment.
nano-banana - default image edit preference for source cleanup, clean plates, and object reference images.
gpt-image-2 - alternate image edit provider when the edit skill is asked to prefer it.
hunyuan-3d - Hunyuan 3D model creates 3D object models through FAL.
elevenlabs-sfx - ElevenLabs sound effects model creates ambient and object-specific sounds.
3D model creation supports these Hunyuan parameters:
--face-count <40000-1500000>: target face count. IMAGE-BLASTER defaults to 50000; Hunyuan's API default is 500000.
--enable-pbr true|false: enable PBR material generation. Defaults to true.
--generate-type Normal|LowPoly|Geometry: Normal creates a textured model, LowPoly applies polygon reduction, and Geometry creates a white geometry-only model. Defaults to Normal.
--polygon-type triangle|quadrilateral: polygon type for LowPoly. Defaults to triangle.
IMAGE-BLAST it.
/app from the .claudeignore file to give Claude the ability to change the React viewer.
Ever since then, I have viewed scenes such as the "lingerie store scene" from Enemy of the State [2] with a little bit less eye rolling...
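Going back to the Hunyuan flags in the README excerpt above: a minimal sketch of how those ranges and defaults could be checked with Python's argparse. This is not image-blaster's actual parser, and the program name and helper functions are made up; only the flag names, allowed values, and defaults come from the README text.

```python
import argparse


def face_count(value):
    """Accept only face counts in the README's stated range."""
    n = int(value)
    if not 40_000 <= n <= 1_500_000:
        raise argparse.ArgumentTypeError("face count must be in 40000-1500000")
    return n


def true_or_false(value):
    """Parse the literal strings 'true' and 'false' into a bool."""
    if value not in ("true", "false"):
        raise argparse.ArgumentTypeError("expected true or false")
    return value == "true"


def build_parser():
    # Hypothetical program name; defaults mirror the README excerpt.
    parser = argparse.ArgumentParser(prog="hunyuan-params-sketch")
    parser.add_argument("--face-count", type=face_count, default=50_000)
    parser.add_argument("--enable-pbr", type=true_or_false, default=True)
    parser.add_argument("--generate-type",
                        choices=["Normal", "LowPoly", "Geometry"],
                        default="Normal")
    parser.add_argument("--polygon-type",
                        choices=["triangle", "quadrilateral"],
                        default="triangle")
    return parser


defaults = build_parser().parse_args([])
low_poly = build_parser().parse_args(
    ["--face-count", "100000", "--generate-type", "LowPoly",
     "--polygon-type", "quadrilateral", "--enable-pbr", "false"]
)
```

Passing a face count outside the range (say 10) makes argparse print an error and exit, which is the behaviour you would want before spending money on a generation call.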
I haven't found it though. Only some "Kiri Engine" which requires a phone.
Also Epic makes an app called RealityScan
The term is "photogrammetry" which might help in your searches
It's always weird to see her in stuff.