I feel like I don't know how to emotionally react to the AI part of this story. To begin with, it is fundamentally cool we have technology like that. At the same time it felt bittersweet, like an artisan being put out of business by the factory. The first part of the story felt like much of the love was in constructing everything by hand, it seems almost sad to lose that. There is also an element of dystopia in how the AI was able to cross reference everything, bank statements, Ticketmaster receipts, Shazam, etc. It is kind of unsettling the power of it all.
Not sure where I'm going with this comment. It's a super cool project, thanks for sharing.
Obviously not everyone has the same needs or wants to retain stories and memories, but the lack of social structures and solutions seems like a weird mishap.
When he died there was no way of transcribing them automatically (there still isn't really). The boxes stood in my mother's already cramped attic for 13 years, then she got cancer, and she felt a need to finish up things, so she got a scanner and started just scanning.
When my mother died she had scanned about a thousand pages, not transcribed, not anything.
The text in the diaries was fun at times, sometimes depressing; seeing how little he cared about my mother and his family was crushing.
My brother wanted to continue the scanning but I told him that I wanted to throw the diaries away. He kept half a year of writing around his birth (there's at least a sentence) and my uncle did the same, then we just watched it all burn (not literally, we threw it away at the recycling centre).
Not everything needs to be preserved. I'm happy some parts are preserved. I'm happy that those diaries are ash.
Though from the title I didn't expect family history, I thought it was going to be more of a project like this: https://shii.bibanon.org/shii.org/knows/Everything_Shii_Know...
Secondly, the home page seems like I am reading a family history page more than talking about the software. It is confusing to me.
Thanks for sharing.
I would probably have ended well before "I exported my Google Maps location history, Uber trips, bank transactions, and Shazam history."
Aside: I've started seeing a lot of AI projects in this category say some variation of:
> it runs on your machine, your data stays with you, and any model can read it
I don’t think people fully appreciate the tension in those claims, especially when the model most are reaching for is Claude or GPT or Gemini. I think these things need more precise language about where data actually goes and what tradeoffs users are implicitly accepting.
Nice work!
Unlike some of the comments herein, I find this a perfect use of technology in service of users. (Yes, with some limits). I liken this to Maggie Appleton's Home-cooked Software model [1], wherein barefoot developers use technology (AI-driven or not) for writing apps for their own purposes, nominally for a user base of 1 (or very few), with possibilities of expanding to a few dozen.
In that vein, I'm a barefoot developer, and much of the software I have written in the past few months (with help from Claude, ChatGPT) is very much for that tiny user base of a few dozen (=mostly me, if I'm honest). And that is perfectly fine by moi.
I wrote a utility to organize roughly 100K+ photographs (and videos) neatly by date and location, both for backup and to keep the memories organized. Asked Claude to look up location by EXIF; haven't yet asked it to "guess the location by photo" when no GPS info existed in the EXIFs. But I think I might do that.
(no, I haven't asked Claude to go thru my Uber trips or bank statements! I draw a line there!)
That is why the OP's personal wiki made me so excited - because the whole output resonates with me.
A few commenters mentioned their journaling experiences. I've started doing that with some of our trips (mostly post pandemic), both to remember our experiences better, and to come back to them as needed. The simple act of writing down places visited, experiences had (mostly hikes, mountains climbed, meals consumed in distant foreign places, weird/quirky experiences) causes them to be fresh in one's memories.
Thanks, this was a great project, and a great reminder as well.
I had started something similar with my mom over Christmas in '24. About halfway through the collection she asked to stop. We would do the rest on her next visit.
Well. It never came to that, as she passed away completely unexpectedly in March last year.
I’ll never get the chance to record the other stories. The stories from the second half of the photo collection.
I cheer for projects like this.
The bank transaction + location cross referencing to figure out which restaurants you went to is pretty cool. Would be great if this could pull in social media exports too. Point it at your X, IG, FB archives, let it draft pages/content from that.
Any plan for a timeline view? Wiki format works well for depth but sometimes you just want to scroll through a year.
The family has a TON of videos and photos, but no resource to guide us through what is what.
Right now, my wife and I are sticking to annual photo albums. They're already fun to flip through and we're not even that old yet.
Throughout the year we keep writing in it, things we learnt, discords we had and how we resolved them, recipes I experimented with and we loved, random thoughts; basically anything and everything. And that little diary becomes an embodiment of that year.
I would also like to point out the manual labor and writing into it and not using an obsidian++-AI-auto-categorizer-3000 is simply because it feels like it's worth something, it's a nice little routine we have at the start of every year, and it's really fun reading these from 2-3 years back. Also the kids will have some really interesting reading a few years down the line.
I imagine a future where this becomes a family tradition that transcends time, knowledge from different generations, living different lives all nicely recorded in these codices. Something about this whole thing feels really beautiful to me.
We have started asking old family members to send us WhatsApp audios with tales and things they remember about long-deceased family members, and what life was like in the 1930s, 40s, and 50s. I want to start organizing all the info and data we have. My father has built a couple of family trees, but this wiki format is indeed very promising. I'll keep an eye on this and see if we can use it.
The MediaWiki server died and I had backups, but... literally no one in the family would've tried to resurrect it.
They knew I'd worked on genealogy for a while but I don't think anyone would've thought to rebuild a linux box covered in dust and somehow find an old MediaWiki install on it.
I should've made simple markdown files with images in an image directory and printed out copies. That's a legacy. A consolidated, easy to drag from grandpa's house and throw on a shelf and flip through, even in 2097.
I wouldn't give an LLM run by a US corporation access to my private photographs.
Ideally square books that can go on a coffee table. At least when I am dead there will be some part of my existence in physical form, unlike all the digital things we spend decades creating.
I might tape an SD card with a video into the front of each one too, so someone can watch it in the future.
As a separate aside, I also found old Canon photo printers (Selphy models) on ebay for about £5! Some need the little white gear inside glueing back on (there's a video on YouTube about it), and they DO NOT work with Windows anymore, but gutenprint supports them fine on Linux, so I have been printing photos (postcard size) at home. The colour isn't going to win awards and the saturation needs boosting slightly in the printer options compared to default, but it's a wonderful way to finally get some photos from trips on the walls.
Is anyone else feeling uncomfortable with that? It is a great project and I don't want to bash it with general concerns, but sharing all my financial and location details with any service seems like opening the floodgates to my house.
My concern is not even strictly related to AI, but about sharing all my most private data with any service. There is always a significant chance all of it is leaked sooner or later.
I do like the idea of building up this history of people, and maybe when my parents pass I'll make theirs public and so on. Great work, dude! I love it.
The product naming is becoming harassment. When it's in the title, at least we know. When it's in the intro, we know what we are getting into.
What really pinches is that this project could have easily been done with some scripting, open sourced, and anyone could do it at zero cost, with total privacy.
> The model traced the arc of our friendships through the messages, pulled out the life episodes we had talked each other through, and wove them into multiple pages that read like it was written by someone who knew us both. When I shared the pages with my friends, they wanted to read every single one.
This is a stunning violation of the privacy of your friends.
If someone uploaded every single private conversation I had had with them to Anthropic, they would no longer be my friend.
I started running a private MediaWiki instance during the pandemic as I wanted something with a nice editing experience rather than editing markdown documents. I almost went with a self-hosted Confluence instance :P
MediaWiki is very very nice and it has a lot of cool features I've been loving over the years.
One of the things I like the most is the ability to embed a PDF document so that it's both downloadable and browsable from the wiki page itself (it embeds the browser pdf engine).
This means that I can, say, have a page for my microwave oven and have its user manual easily available.
Lately I've been thinking how to connect that with some LLM, most likely there's a chance to do some interesting things :)
Sure, the wiki is private. However, in the process your data is being uploaded straight to an AI company. Of course local LLMs exist but that’s seemingly not supported here and I think the statement on privacy could be clearer.
And I'm sorry your mom experienced that weight towards the end of her life. That sounds like a significant thing to grapple with, especially considering some of the not so pleasant content mentioned.
I'll look into this more: Most appreciated, thank you.
Each year I have the wife take curated photos from our shared accounts with an overview of the event photographed.
This is then bound into a 1/2 inch book with 50 pages. We now have a dozen years of annualized memories that we can pass around with physical access.
She has done this for others with great success. The personal touches make it well worth what she charges.
I was thinking the other day I need to go back to a physical recipe book too. I don't cook that many different things that I need to reference it for, but there was a charm in my old one of remembering the best recipes were the ones covered in spilled ingredients and filled with marginalia.
It is stalker-ish to write up biographies like this about your relatives. It's one thing to write up the weddings and upbeat things like this, but not all families' lives are just sunshine and rainbows.
How about that relative of the family who spent time in prison? Grandpa in war? Many old people don't naturally talk about some parts of their lives either because they suffered some injustice like (what as an Eastern European I can think of) their properties taken away by Nazis and Soviets, or they did something they aren't proud of. Are you going to oral history interview/interrogate them to fill in all the gaps? Do you tell them you're going to upload all they say to some servers where who knows who will have access to it?
There are also long-lasting family feuds between sides of families, like how one son was tricked out of the inheritance, maybe wrongly, or maybe he was an ass to his parents. People hold grudges and explain their life failures and derailments by wrongly or rightly blaming others.
Maybe your aunt is presenting a story that doesn't quite add up when you triangulate it from all OSINT and private sources. Maybe your cousin isn't the daughter of who you think she is. Is it your business?
Even if no such big thing factors in, a biography of a person will be very subjective. You can narrate the same life in many ways so they appear more or less successful or an asshole.
It's fine to keep these things as oral history and memory that fades.
I don't really care about what the regular people who were my great grandparents and their cousins did. Maybe if I could read all the drama, I'd end up hating a bunch of relatives. These things have a natural life cycle of forgetting. That's fine.
Again, it's all well if you live in a family where everyone is nice and everyone was successful and helpful. Otherwise it's a can of worms. Nerds can be a bit blind to this as they just want to play with the toys and treat it like some logic puzzle.
Last year, I visited my grandmother's house for the first time after the pandemic and came across a cupboard full of loose old photos. I counted 1,351 of them spanning all the way from my grandparents in their early 20s, my mom as a baby, to me in middle school, just around the time when we got our first smartphone and all photos since then were backed up online.


Everything was all over the place so I spent some time going through them individually and organizing them into groups. Some of the initial groups were based on the physical attributes of the photograph like similar aspect ratios or film stock. For example, there was a group of black/white 32mm square pictures that were taken around the time when my grandfather was in his mid 20s.
As I got done with grouping all of them, I was able to see flashes of stories in my head, but they were ephemeral and fragile. For instance, there was a group of photos that looked like it was taken during my grandparents' wedding but I didn't know the chronological order they were taken because EXIF metadata didn't exist around that time.


So I sat down with my grandmother and asked her to reorder the photos and tell me everything she could remember about her wedding. Her face lit up as she narrated the backstory behind the occasion, going from photo to photo, resurfacing details that had been dormant for decades. I wrote everything down, recorded the names of people in some of the photos, some of whom I recognized as younger versions of my uncles and aunts.
After the "interview", I had multiple pages of notes connecting the photos to events that happened 50 years ago. Since the account was historical, as an inside joke I wanted to see if I could clean it up and present it as a page on Wikipedia so I could print it and give it to her. So I cloned MediaWiki, spun up a local instance, and began my editorial work. I used the 2011 Royal Wedding as reference and drafted a page starting with the classic infobox and the lead paragraph.


I split up the rest of the content into sections and filled them with everything I could verify like dates, names, places, who sat where. I scanned all the photos and spent some time figuring out what to place where. For every photo placement, there was a follow up to include a descriptive caption too.
Whenever I mentioned a person, I linked them to an empty stub page. After I found out I could also link to the real Wikipedia, I was able to link things to real pages that provided wider context to things like venues, rituals, and the political climate around that time, like for instance a legal amendment that was relevant to the wedding ceremony.
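As a rough illustration of the stub-linking step, here is a minimal Python sketch that lists the pages a draft links to, using the standard `[[Target]]` and `[[Target|label]]` wikitext syntax. The function name and the sample names are hypothetical and not part of the author's setup. Anything with a colon in the target (files, categories, interwiki links to the real Wikipedia) is skipped, on the assumption that only local article pages need stubs.

```python
import re

# Matches [[Target]], [[Target|label]] and [[Target#Section|label]];
# only the target page name is captured.
WIKILINK = re.compile(r"\[\[([^\[\]|#]+)(?:#[^\[\]|]*)?(?:\|[^\[\]]*)?\]\]")

def linked_pages(wikitext: str) -> list[str]:
    """Return unique local link targets in order of first appearance."""
    seen = []
    for match in WIKILINK.finditer(wikitext):
        target = match.group(1).strip()
        # Assumption: a colon means a namespace or interwiki prefix
        # (File:, Category:, wikipedia:), which never needs a local stub.
        if ":" in target:
            continue
        if target not in seen:
            seen.append(target)
    return seen

# Made-up sample text for illustration only.
sample = (
    "[[Asha Rao]] married [[Ravi Rao|Ravi]] at the "
    "[[Town Hall#History|hall]]. [[File:wedding.jpg|thumb]] "
    "[[Asha Rao]] sang."
)
print(linked_pages(sample))
```

Comparing this list against the pages that already exist would give the set of stubs still to be written.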




In two evenings, I was able to document a full backstory for the photos into a neat article. These two evenings also made me realize just how powerful encyclopedia software is to record and preserve media and knowledge that would've otherwise been lost over time.
This was so much fun that I spent the following months writing pages to account for all the photos that needed to be stitched together.
I got help from r/genealogy about how to approach recording oral history and I was given resources to better conduct interviews, shoutout to u/stemmatis! I would get on calls with my grandmother and people in the family, ask them a couple of questions, and then write. It was also around this time that I began using audio transcription and language models to make the editorial process easier.
Over time, I managed to write a lot of pages connecting people to different life events. The encyclopedia format made it easy to connect dots I would have never found on my own, like discovering that one of the singers at my grandparents' wedding was the same nurse who helped deliver me.


After finding all the stories behind the physical photos, I started to work on digital photos and videos that I had stored on Google Photos. The wonderful thing about digital photos is that they come with EXIF metadata that can reveal extra information like date, time, and sometimes geographical coordinates.
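For readers curious what those geographical coordinates look like in practice: EXIF stores latitude and longitude as degrees, minutes, and seconds plus a hemisphere letter (N/S, E/W). The sketch below shows the usual conversion to signed decimal degrees. It assumes the three values have already been read out as plain numbers by whatever EXIF library you use; the function name is hypothetical.

```python
from fractions import Fraction

def dms_to_decimal(dms, ref):
    """Convert (degrees, minutes, seconds) plus a hemisphere reference
    ('N', 'S', 'E' or 'W') to signed decimal degrees."""
    degrees, minutes, seconds = (Fraction(x) for x in dms)
    value = degrees + minutes / 60 + seconds / 3600
    # South and west are negative by convention.
    if ref.upper() in ("S", "W"):
        value = -value
    return float(value)

print(dms_to_decimal((12, 30, 0), "N"))   # 12.5
print(dms_to_decimal((75, 45, 0), "W"))   # -75.75
```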
This time, without any interviews, I wanted to see if I could use a language model to create a page based on just browsing through the photos. As my first experiment, I created a folder with 625 photos of a family trip to Coorg back in 2012.


I pointed Claude Code at the directory and asked it to draft a wiki page by browsing through the images. I hinted at using ImageMagick to create contact sheets so it would help with browsing through multiple photos at once.
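The post doesn't show the exact commands used, so the following is only a sketch of how contact sheets could be batched with ImageMagick's `montage` tool. The function name, grid size, thumbnail size, and output naming are all assumptions. The function only builds argument lists and does not run anything. On ImageMagick 7 installs the command may need to be invoked as `magick montage` instead.

```python
from pathlib import Path

def contact_sheet_commands(photos, out_dir, cols=6, rows=5, thumb=256):
    """Build one `montage` command per batch of cols*rows photos.
    Returns a list of argument lists; nothing is executed here."""
    per_sheet = cols * rows
    commands = []
    for i in range(0, len(photos), per_sheet):
        batch = photos[i:i + per_sheet]
        sheet = Path(out_dir) / f"sheet_{i // per_sheet + 1:03d}.jpg"
        commands.append([
            "montage",
            "-label", "%f",            # caption each tile with its filename
            *map(str, batch),
            "-tile", f"{cols}x{rows}",
            "-geometry", f"{thumb}x{thumb}+4+4",
            str(sheet),
        ])
    return commands

# Hypothetical filenames standing in for a 625-photo folder.
photos = [f"IMG_{n:04d}.jpg" for n in range(625)]
commands = contact_sheet_commands(photos, "sheets")
print(len(commands))
```

Each argument list can then be passed to `subprocess.run(cmd, check=True)`. Labelling tiles with filenames is a design choice that lets whoever reads the sheet refer back to individual photos.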
A few minutes and a couple of tokens later, it had created a compelling draft with a detailed account of everything we did during the trip by time of day. The model had no location data to work with, just timestamps and visual content, but it was able to identify the places from the photos alone, including ones that I had forgotten by now. It picked up details on the modes of transportation we used to get between places just from what it could see.


After I had clarified who some of the people in the pictures were, it went on to identify them automatically in the captions. Now that I had a detailed outline ready, the page still only had content based on the available data, so to fill in the gaps I shared a list of anecdotes from my point of view and the model inserted them into places where the narrative called for them.
The Coorg trip only had photos to work with. My trip to Mexico City in 2022 had a lot more. I had taken 291 photos and 343 videos with an iPhone 12 Pro that included geographical coordinates as part of the EXIF metadata.
On top of that, I exported my location timeline from Google Maps, my Uber trips, my bank transactions, and Shazam history. I would ask Claude Code to start with the photos and then gradually give it access to the different data exports.
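To make the cross-referencing idea concrete, here is a minimal sketch of one way to pair payments with places: match each transaction to the location visit whose time window contains it. This is an illustration, not a description of what the model actually did. The record fields (`time`, `start`, `end`) and the function name are hypothetical, and real exports will differ; many bank exports, for instance, carry only a date rather than a timestamp.

```python
from datetime import datetime, timedelta

def match_transactions(transactions, visits, slack_minutes=30):
    """Pair each transaction with the visit whose time window, padded by
    `slack_minutes`, contains the transaction time.
    Returns a list of (transaction, visit_or_None) tuples."""
    slack = timedelta(minutes=slack_minutes)
    matches = []
    for tx in transactions:
        best = None
        for visit in visits:
            if visit["start"] - slack <= tx["time"] <= visit["end"] + slack:
                # Prefer the visit that ended closest to the payment,
                # since bills are usually paid on the way out.
                if best is None or (
                    abs(tx["time"] - visit["end"])
                    < abs(tx["time"] - best["end"])
                ):
                    best = visit
        matches.append((tx, best))
    return matches
```

Unmatched transactions come back paired with `None`, which is a reasonable prompt for a manual note on the page's talk page.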


Here are some of the things it did across multiple runs:
The MediaWiki architecture worked well with the edits, since for every new data source it would make amendments like a real Wikipedia contributor would. I leaned heavily on features that already existed: talk pages to clarify gaps and consolidate research notes, categories to group pages by theme, revision history to track how a page evolved as new data came in. I didn't have to build any of this, it was all just there.


What started as me helping the model fill in gaps from my memory gradually inverted. The model was now surfacing things I had completely forgotten, cross-referencing details across data sources in ways I never would have done manually.
So I started pointing Claude Code at other data exports. My Facebook, Instagram, and WhatsApp archives held around 100k messages and a couple thousand voice notes exchanged with close friends over a decade.
The model traced the arc of our friendships through the messages, pulled out the life episodes we had talked each other through, and wove them into multiple pages that read like it was written by someone who knew us both. When I shared the pages with my friends, they wanted to read every single one.


This is when I realized I was no longer working on a family history project. What I had been building, page by page, was a personal encyclopedia. A structured, browsable, interconnected account of my life compiled from the data I already had lying around.
I've been working on this as whoami.wiki. It uses MediaWiki as its foundation, which turns out to be a great fit because language models already understand Wikipedia conventions deeply from their training data. You bring your data exports, and agents draft the pages for you to review.
A page about your grandmother's wedding works the same way as a page about a royal wedding. A page about your best friend works the same way as a page about a public figure.
Oh and it's genuinely fun! Putting together the encyclopedia felt like the early days of Facebook timeline, browsing through finished pages, following links between people and events, and stumbling on a detail I forgot.
But more than the technology, it's the stories that stayed with me. Writing about my grandmother's life surfaced things I'd never known, her years as a single mother, the decisions she had to make, the resilience it took. She was a stronger woman than I ever realized. Going through my friendships, I found moments of endearment that I had nearly forgotten, the days friends went the extra mile to be good to me. Seeing those moments laid out on a page made me pick up the phone and call a few of them. The encyclopedia didn't just organize my data, it made me pay closer attention to the people in my life.
Today I'm releasing whoami.wiki as an open source project. The encyclopedia is yours, it runs on your machine, your data stays with you, and any model can read it. The project is early and I'm still figuring a lot of it out, but if this sounds interesting, you can get started here and tell me what you think!
Thanks to Vishnu Dut, Sarah Cheon, Andy Law, Vishhvak Srinivasan, and Raghav Rmadya, who read early drafts and gave great suggestions while I tinkered around.
I’ve also done some light-fastness testing. Laser prints (both B&W and color) survive a long while in direct harsh sunlight when left in the window of my Utah home. Every type of pen I tried had faded within a couple of years, but pencil survives.
A 360 degree stapler is a fantastic tool for quickly binding them.
I do something similar but with email, and more proactively [1]. I created an email address for my son when he was born, and I'm sending him things from our lives and asking family members to do the same: just to write to him about themselves and send photos of their current homes and gardens and partners.
I imagine him looking through his email when he's 18 and reading personalized messages sent by family members who might no longer be with us then.
[1] https://blog.haschek.at/2024/leaving-a-digital-legacy.html
As an adversarial/worst-case model, it can be useful to think of every service as potentially storing forever all the data that you ever give it access to. As a practical matter, services have terms of service that they follow. If your Claude Code terms say that your data will not be used for training, you can be reasonably confident that they will not be, and storing the raw inputs forever (as suggested by “significant chance all of it is leaked sooner or later”) would be even more unlikely. (For example, Google has entire teams dedicated to compliance with users' “wipeout” settings. You can take a look at https://myactivity.google.com and https://myadcenter.google.com to see some of what Google knows and thinks about you, and if you've chosen "Auto-Delete after 3 months" or whatever, you can be very sure it will be gone after that time. Every single team that stores user data is required to comply with this.)
I do think the services make it harder than it should be to find out what the terms are: for a given usage of their services, whether and for how long the details will be stored by them. I'm just saying that you can find this out and generally rely on it, at least at the time (under a reasonable threat model, e.g. not treating the service as a malicious adversary running a giant law-breaking conspiracy that has never been exposed).
Thanks for the resource!
What I don't want to do is give it to services with an agenda to abuse the data, particularly those profiling individuals for profit. Frankly, I'd trust a Chinese service more than I would an Adtech based one, but that's still not much.
https://i.kym-cdn.com/photos/images/original/001/259/257/342...
From my perspective, the American President has threatened to annex my country, American businesses have repeatedly violated my trust, spied on me and leaked my data, and American big tech is meddling in my country's politics. No other country has demonstrated such an ability and willingness to collect information about me and use it against me.
It would in fact make for a better result. A family wiki with AI-generated content may look accurate overall, but the sloppy parts look ridiculous. If I find an archive I would rather assemble the information myself.
https://wiki.roshangeorge.dev/w/Blog/2025-02-07/A_Confused_K...
It would cost around $3000 to have the diaries scanned today, this number was way higher a few years ago (which my mother didn't have). I know Americans have a lot more storage space in their houses and use storage facilities for a lot of things but there has to be a cutoff point. I have about 6 m³ of (already filled) extra storage space in central Stockholm and wouldn't want that much more. Throwing shit away is a part of life.
As I store everything in a local Vikunja instance for notes and WIP, here's the list of links I assembled relative to this (hopefully useful; it includes calendar templates so that I can make them for my mother-in-law):
https://github.com/berteh/ScribusGenerator
https://wiki.scribus.net/canvas/Useful_Free_Resources
https://www.opendesktop.org/p/1106678
https://www.opendesktop.org/browse?cat=196&page=1&ord=latest
https://www.pling.com/s/Artwork/browse?cat=196&ord=latest
https://wiki.scribus.net/canvas/CalendarWizard
https://github.com/RaffertyR/Year-Calendar-Script-for-Scribu...
https://wiki.scribus.net/canvas/Category:Scripts
https://wiki.scribus.net/canvas/Making_a_photobook_from_a_di...
https://wiki.rjcalow.co.uk/photography/make/designaphotobook...
https://github.com/PPSchL/scribus-photobook-scripts
https://github.com/RaffertyR/PhotoBookTools-for-Scribus
https://forums.scribus.net/index.php?topic=4081.0
https://wiki.scribus.net/canvas/Automatic_import_of_images_f...
https://wiki.scribus.net/canvas/Photo_Albums
https://github.com/hawbox/scribus-book-templates
https://forums.scribus.net/index.php?topic=3735.0
http://johnosterhout.com/basic-book-template-for-scribus/
When you find a print shop, they'll talk about margins and bleeds, so it might be worth finding a print shop first to know what bleed zones you want on the pages and whether they expect left page first, or right page first.
Once you know that, you can set up Scribus appropriately.
My partner's ancestors came from Sitges (in fact, one of them was the mayor of the town), back in 1820s or 1830s - to Argentina, and from there to Montevideo, Uruguay. Among the various marriages in the generations, there's a Scottish clan, and English ancestry intermixed with Spaniards. She can trace her roots back to some of the founding members and prominent political families there.
The last time we were in Scotland, we found the clan she's from - but couldn't ascertain the ship they took to Argentina :-/ That's left as an exercise for some future trips.
Given the US' NSA's long-standing violation of human rights at massive scale, and the proclivity of American society to be reasonable about kidnapping people, deemed unsavory, off the streets by jackboot thugs - and the fact that China builds roads, hospitals, ports, and communities around the world in nations considered 'inferior' by America's military junta/oligarch ruling class, while America bombs them into oblivion - I'm fine with the idea of eschewing American AI.
It's kind of necessary, I think, to resist this at the moment - at scale too, I might add.
If Americans want to fix this they still can - time is running out, however.
Dodgy companies are a bigger danger to my (and your) privacy and wellbeing than the American NSA will ever be.