Crucially, the purpose of Blacksky is to provide a service for the (US) black community which has its own moderation decisions while being substantially interoperable.
(Remember, the reasons people use one social network rather than another are almost always social first and technical second, where the social functions are enabled or hindered by the technology)
Despite its faults, ActivityPub is superior.
What are the differences between Bluesky and Blacksky? What does it mean to provide different services for the black community?
What moderation decisions were made regarding this "Link" user that were suspect, using the post author's word?
Their support also drops certain alias emails, so there's no way to appeal.
A "decentralized" network that locks you out for using basic email privacy tooling, with zero recourse, is centralized where it counts.
The Blacksky team has a much broader vision for which they decided ATProto was the right architecture.
There's a lot more to read about them up on their site: https://blackskyweb.xyz/
Some commentary ( https://bsky.app/profile/mackuba.eu/post/3m2jtzlznu22o ).
IIRC it came during a week of discourse & tensions around bluesky moderation concerning some controversial writers ( https://bsky.app/profile/jay.bsky.team/post/3m25esnq4t22y )
I share this just to be a tiny bit helpful, but there's likely a lot of context missing and different perspectives on this.
Atproto is terrible at decentralization however, because of the model where data is stored decentrally, but accessed centrally, in big servers that need to be aware of all the data. In the ActivityPub model there isn't such a thing as "all the data" - you see what you see.
You can create an account with any email address you want if you host your own PDS or you can find another PDS that someone else hosts that is willing to register you an account.
One participant's new account spam protection has nothing to do with the network at large being centralised
I am proud of my data, in its many media-type/lexicon feeds (browse my PDS directly at https://pdsls.dev/at://did:plc:zjbq26wybii5ojoypkso2mso), and I want people to know. I put my data on my PDS because it's good data and I believe that people contributing their "data" (sharing the world as they see it) is a democratic / open society virtue that makes the world better.
When you can meaningfully index all the data on a rpi4, I think that's awesome. That makes a lot of people scared or mad, it evokes many of the things people don't like. But it also stems from only ever having seen or known that situation when the entire stack is under corporate control and when it's a mega-corp harvesting the data. From being in captivity. Not when it's one dude bad-example.com running a link indexer for the entire site on an rpi4, and you can too. We don't know what that's like: it's never been possible. https://constellation.microcosm.blue/
There are legit reasons to have Fear Uncertainty and Doubt about atproto, and you don't have to have fun online. You are free to fuck off to less connected less online spaces if that's your bag. But I grew up wanting to be online and I still want to be online, and no service has ever actually done that before, not like this. This is dozens of times better than the next best thing as a distributed connected online system that I can be online with and that gives me the most freedom to build and use interesting neat new mini apps and tools, to be online with.
To say that like it's a bad thing, is, to me, a joke. I acknowledge your values differ, and respect your decision, but it seems so weird to not want to have fun being online, to get better at it, to make more nodes on the noospheric graph, and to make more edges between them. That still feels like the right choice for me, and it's never been tried socially, and I think it has potential to let humanity keep improving in radical ways. In contrast, renouncing the connected feels like a bad dumb move. But enjoy!! GL;HF.
This is Blacksky's fork of the AT Protocol reference implementation by Bluesky Social PBC. It powers the AppView at api.blacksky.community.
We're publishing this for transparency and so other communities can benefit from the work. This repository is not accepting contributions, issues, or PRs. If you want the canonical atproto implementation, use bluesky-social/atproto.
All changes are in packages/bsky (appview logic), services/bsky (runtime config), and one custom migration. Everything else is upstream.
The upstream dataplane includes a TypeScript firehose consumer (subscription.ts) that indexes events directly. We replaced it with rsky-wintermute, a Rust indexer, for several reasons:
The dataplane and appview from this repo still run as-is. They read from the PostgreSQL database that wintermute writes to. We just don't start the built-in firehose subscription.
These are broadly useful to anyone self-hosting an AppView at scale.
LATERAL JOIN query optimization (packages/bsky/src/data-plane/server/routes/feeds.ts)
getTimeline and getListFeed rewritten with PostgreSQL LATERAL JOINs to force per-user index usage instead of full table scans. Major improvement for users following thousands of accounts.
Redis caching layer (packages/bsky/src/data-plane/server/cache/)
Timestamp objects lose their .toDate() method after JSON round-tripping through Redis, causing incomplete profile hydration on cache hits. We currently run with Redis caching disabled. The fix is to serialize timestamps as ISO strings on cache write and reconstruct on read.
Notification preferences server-side enforcement (packages/bsky/src/api/app/bsky/notification/listNotifications.ts)
reasons, the server applies the user's saved notification preferences. Without this, preferences are only enforced client-side and have no effect.
Auth verifier stale signing key fix (packages/bsky/src/auth-verifier.ts)
forceRefresh), bypasses the dataplane's in-memory identity cache and resolves the DID document directly from PLC directory. Fixes authentication failures after account migration where the signing key rotates but the cache holds the old key.
JSON sanitization (packages/bsky/src/data-plane/server/routes/records.ts)
\u0000) and control characters from stored records before JSON parsing. These are valid per RFC 8259 but rejected by Node.js JSON.parse(), causing silent rowToRecord parse failures in the dataplane that surface as missing posts.
Infrastructure for private community posts that live on the AppView rather than individual PDSes. Specific to how Blacksky works, but could serve as a reference for other communities.
- community.blacksky.feed.* with endpoints for submit, get, delete, timeline, and thread views
- community_post table (migration: 20260202T120000000Z-add-community-post.ts)
- getPostThreadV2 for mixed standard/community post threads
- BLACKSKY_MEMBERSHIP_DB_URL)
Bluesky Relay (bsky.network)
       |
       v
rsky-wintermute -----> PostgreSQL 17 <----- Palomar
(Rust indexer)              |             (Go search)
- firehose consumer         |                  |
- backfiller                |                  v
- label indexer             |             OpenSearch
- direct indexer            |
                            v
               bsky-dataplane (gRPC :2585) <--- Redis (optional)
                            |
                            v
                bsky-appview (HTTP :2584)
                            |
                            v
               Reverse proxy (Caddy/nginx)
| Component | Source | Purpose |
|---|---|---|
| rsky-wintermute | blacksky-algorithms/rsky | Rust firehose indexer: consumes events, backfills repos, indexes records into PostgreSQL |
| rsky-relay | blacksky-algorithms/rsky | AT Protocol relay for receiving moderation labels from labeler services |
| rsky-video | blacksky-algorithms/rsky | Video upload service: transcodes via Bunny Stream CDN, uploads blob refs to user PDSes |
| bsky-dataplane | This repo (services/bsky) | gRPC data layer over PostgreSQL |
| bsky-appview | This repo (services/bsky) | HTTP API server for app.bsky.* XRPC endpoints |
| Palomar | blacksky-algorithms/indigo | Full-text search: indexes profiles and posts into OpenSearch with follower count boosting |
| palomar-sync | blacksky-algorithms/rsky | Syncs follower counts and PageRank scores from PostgreSQL to OpenSearch |
Wintermute is a monolithic Rust service with four parallel processing paths:
- bsky.network firehose via WebSocket, writes events to Fjall (embedded key-value store) queues
- ON CONFLICT for idempotency
Additional CLI tools included in the rsky repo:
- queue_backfill -- queue DIDs for backfill from CSV, PDS discovery, or direct DID lists
- direct_index -- fetch and index specific repos bypassing queues (useful for fixing individual accounts)
- label_sync -- replay label streams from cursor 0 to catch up on missed negations
- plc_import -- bulk import handle/DID mappings from PLC directory
- palomar-sync -- sync follower counts and PageRank to OpenSearch
Video upload service for users whose PDS doesn't support Bluesky's video.bsky.app. Uses its own DID (did:web:video.blacksky.community) to authenticate to user PDSes via service auth JWTs. Flow:
Moderation labels come from labeler services (e.g., Bluesky's Ozone) via WebSocket subscription. Wintermute's ingester processes labels in a dedicated label_live queue (low volume, separate from the main firehose). The label_sync tool can replay a labeler's full stream to catch up on missed negations (label removals) without reinserting labels.
bsky schema
The bsky schema is created by the dataplane's migrations. On first run, the dataplane will apply all migrations automatically. The only Blacksky-specific migration is 20260202T120000000Z-add-community-post.ts (community posts table). If you don't need community posts, you can remove it.
rsky-wintermute writes to this same schema. All its INSERT statements use ON CONFLICT so it's safe to run wintermute and the dataplane migrations in any order.
pnpm install
pnpm build
node services/bsky/dataplane.js
| Variable | Required | Description |
|---|---|---|
| DB_PRIMARY_URL | Yes | PostgreSQL connection string with ?options=-csearch_path%3Dbsky |
| DB_REPLICA_URL | No | Read replica connection string |
| BSKY_DATAPLANE_PORT | No | gRPC port (default 2585) |
| BSKY_REDIS_HOST | No | Redis host:port for caching (currently recommended to leave disabled) |
| BLACKSKY_MEMBERSHIP_DB_URL | No | Separate DB for community membership (Blacksky-specific) |
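As a small illustration of the DB_PRIMARY_URL format, the sketch below appends the search_path option to a connection string. The helper name, credentials, and host are made up for the example; only the `options=-csearch_path%3Dbsky` suffix comes from the table above.

```typescript
// Hypothetical helper (not part of this repo): append the libpq "options"
// parameter that pins the connection to a schema.
function withSearchPath(baseUrl: string, schema: string): string {
  // The "=" inside the option value must be percent-encoded (%3D) so it
  // is not read as a query-string separator.
  const options = encodeURIComponent(`-csearch_path=${schema}`);
  const separator = baseUrl.includes("?") ? "&" : "?";
  return `${baseUrl}${separator}options=${options}`;
}

// Placeholder credentials and host, for illustration only.
const dbPrimaryUrl = withSearchPath(
  "postgresql://bsky:secret@localhost:5432/bsky",
  "bsky",
);
// dbPrimaryUrl is now
// "postgresql://bsky:secret@localhost:5432/bsky?options=-csearch_path%3Dbsky"
```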
node services/bsky/api.js
| Variable | Required | Description |
|---|---|---|
| BSKY_APPVIEW_PORT | No | HTTP port (default 2584) |
| BSKY_DATAPLANE_URLS | Yes | Comma-separated dataplane gRPC URLs |
| BSKY_DID | Yes | The AppView's DID (e.g. did:web:api.example.com) |
| BSKY_MOD_SERVICE_DID | Yes | Ozone moderation service DID |
| BSKY_ADMIN_PASSWORDS | Yes | Comma-separated admin passwords for basic auth |
A full-network backfill (all ~42M users, ~18.5B records) takes weeks even with wintermute's parallel processing. Expect:
During backfill, the AppView is functional but will show incomplete data for users that haven't been backfilled yet. Live events are indexed immediately regardless of backfill progress.
These are issues we encountered bootstrapping a full-network AppView. If you're doing the same, you'll likely hit some of these:
COPY text format JSON corruption: PostgreSQL's COPY text protocol treats backslash as an escape character. If your bulk loader doesn't escape backslashes in JSON strings, \" becomes " and you get silently corrupted records. The record.json column is type text (not jsonb), so PostgreSQL won't catch this. We found ~66,000 corrupted records and had to repair them by re-fetching from the public API.
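A minimal sketch of the escaping a bulk loader needs before handing a field to COPY in text format, assuming the default tab delimiter. The function name and sample record are hypothetical; this is not the loader used by wintermute.

```typescript
// Escape one field value for PostgreSQL's COPY text format.
function escapeCopyText(value: string): string {
  return value
    .replace(/\\/g, "\\\\") // backslash first, so later escapes are not doubled
    .replace(/\n/g, "\\n") // newline terminates a row
    .replace(/\r/g, "\\r")
    .replace(/\t/g, "\\t"); // tab is the default column delimiter
}

// Hypothetical record: its JSON text contains \" sequences, which COPY
// would turn into bare quotes if the backslashes were not doubled.
const recordJson = JSON.stringify({ text: 'she said "hi"' });
const copyField = escapeCopyText(recordJson);
```

After COPY unescapes `copyField`, the stored text is byte-for-byte the original `recordJson`, so it still parses.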
Null bytes in JSON: Some AT Protocol records contain \u0000 (null byte), which is valid JSON per RFC 8259 but rejected by Node.js JSON.parse(). The dataplane silently returns null for these records. Strip null bytes before writing to the database.
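One way to do the stripping, sketched under the assumption that records are handled as JSON text before being written. The function name is hypothetical. It removes raw NUL characters and the escaped `\u0000` sequence, and it walks escape sequences in order so that an escaped backslash followed by the literal text `u0000` is left alone.

```typescript
// Hypothetical sanitizer: remove NUL characters from JSON text.
function stripNullBytes(jsonText: string): string {
  return jsonText
    .replace(/\u0000/g, "") // raw NUL characters
    // Consume one escape sequence at a time; drop it only if it is \u0000.
    .replace(/\\(u[0-9a-fA-F]{4}|[\s\S])/g, (seq) =>
      seq === "\\u0000" ? "" : seq,
    );
}
```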
Timestamp format sensitivity: The dataplane expects timestamps with millisecond precision and Z suffix (2026-01-12T19:45:23.307Z). Nanosecond precision or timezone offset format (+00:00) causes subtle sorting and comparison issues.
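A sketch of normalizing incoming timestamps to the millisecond-precision, Z-suffixed form described above. The function name is hypothetical, and extra fractional digits are truncated rather than rounded, which is an assumption about what is acceptable.

```typescript
// Hypothetical normalizer: returns e.g. 2026-01-12T19:45:23.307Z.
function normalizeTimestamp(input: string): string {
  // Cut fractional seconds down to three digits (truncation, not rounding).
  const trimmed = input.replace(/(\.\d{3})\d+/, "$1");
  const date = new Date(trimmed);
  if (Number.isNaN(date.getTime())) {
    throw new Error(`unparseable timestamp: ${input}`);
  }
  // toISOString always emits UTC with millisecond precision and a Z suffix.
  return date.toISOString();
}
```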
Notification table bloat: Without a unique constraint on (did, recordUri, reason), the notification table grows unbounded with duplicates. Ours reached 1.3 billion rows (663 GB) before we caught it. Adding ON CONFLICT DO NOTHING to INSERTs only helps if the unique index exists first, and creating the index requires deduplication of the existing data.
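The deduplication step can be pictured with an in-memory sketch. It keeps the first row seen for each (did, recordUri, reason) key, which is the effect a unique index plus ON CONFLICT DO NOTHING has on later inserts. The row shape and the `sortAt` field are assumptions for illustration; at the scale described above the real work would be done in SQL, not in application memory.

```typescript
// Assumed row shape; only did, recordUri, and reason come from the text.
interface NotificationRow {
  did: string;
  recordUri: string;
  reason: string;
  sortAt: string;
}

function dedupeNotifications(rows: NotificationRow[]): NotificationRow[] {
  const seen = new Set<string>();
  const kept: NotificationRow[] = [];
  for (const row of rows) {
    // JSON-encode the tuple so separators inside values cannot collide.
    const key = JSON.stringify([row.did, row.recordUri, row.reason]);
    if (seen.has(key)) continue; // duplicate: skip, like DO NOTHING
    seen.add(key);
    kept.push(row);
  }
  return kept;
}
```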
Post embed tables: The post_embed_image and post_embed_video tables aren't populated by default if your indexer doesn't handle them. Without these, the media filter on getAuthorFeed returns nothing. These need to be backfilled separately.
Label negation ordering: Label negation (removal) events reference the original label by source, URI, and value. If negations arrive before the original label (common during backfill), they're silently dropped. The label_sync tool replays the full stream to catch these.
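For comparison, here is a sketch of an order-independent way to apply label events: store the newest event per (src, uri, val) key, negations included, so a negation that arrives first is not lost. This is an alternative illustration, not how wintermute or label_sync works. The class is hypothetical, and the field names (src, uri, val, cts, neg) are assumed to follow the atproto label record.

```typescript
interface LabelEvent {
  src: string;
  uri: string;
  val: string;
  cts: string; // creation timestamp
  neg?: boolean; // true when this event removes the label
}

class LabelState {
  private latest = new Map<string, LabelEvent>();

  apply(event: LabelEvent): void {
    const key = JSON.stringify([event.src, event.uri, event.val]);
    const current = this.latest.get(key);
    // Keep whichever event was created last, so arrival order does not matter.
    if (!current || Date.parse(event.cts) >= Date.parse(current.cts)) {
      this.latest.set(key, event);
    }
  }

  isActive(src: string, uri: string, val: string): boolean {
    const current = this.latest.get(JSON.stringify([src, uri, val]));
    return current !== undefined && current.neg !== true;
  }
}
```

The trade-off is storage: negations are kept as tombstones instead of being dropped, in exchange for not needing a full replay.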
Fjall queue poisoning: The Fjall embedded database (used for wintermute's queues) can enter a "poisoned" state after crashes, blocking all queue operations. The fix is to delete the queue database directory and restart -- wintermute will catch up from the relay's cursor (relays keep ~72 hours of history).
TLS provider initialization: Rust's rustls requires explicitly installing a crypto provider before any TLS connection. Without rustls::crypto::aws_lc_rs::default_provider().install_default() at startup, the first WebSocket connection to the firehose panics.
Signing key rotation after account migration: When users migrate between PDSes, their signing key changes. The dataplane caches identity data with a staleTTL of 1 hour. During that window, JWT verification fails for migrated users. The fix is to bypass the cache on verification retry and resolve directly from PLC directory.
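The retry pattern can be sketched as follows. All names here are hypothetical and the lookup is synchronous to keep the example short; the real resolution against the PLC directory is a network call, and this is not the code in auth-verifier.ts.

```typescript
type KeyFetcher = (did: string) => string;

// Hypothetical in-memory identity cache with a forceRefresh escape hatch.
class IdentityCache {
  private keys = new Map<string, string>();
  private fetchFromDirectory: KeyFetcher;

  constructor(fetchFromDirectory: KeyFetcher) {
    this.fetchFromDirectory = fetchFromDirectory;
  }

  getSigningKey(did: string, opts: { forceRefresh?: boolean } = {}): string {
    const cached = this.keys.get(did);
    if (cached !== undefined && !opts.forceRefresh) return cached;
    const fresh = this.fetchFromDirectory(did); // bypasses the cached value
    this.keys.set(did, fresh);
    return fresh;
  }
}

function verifyWithRetry(
  cache: IdentityCache,
  did: string,
  verify: (signingKey: string) => boolean,
): boolean {
  if (verify(cache.getSigningKey(did))) return true;
  // The cached key may predate a migration, so refetch once and retry.
  return verify(cache.getSigningKey(did, { forceRefresh: true }));
}
```

Retrying only on failure keeps the common path on the cache, so the directory is hit once per rotated key instead of on every request.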
Based on running a full-network AppView (all ~42M users, ~18.5B records).
| Resource | Minimum | Recommended |
|---|---|---|
| CPU | 16 cores | 48+ cores |
| RAM | 64 GB | 256 GB |
| Storage | 10 TB NVMe | 28+ TB NVMe (RAID) |
| PostgreSQL | Dedicated, same machine or low-latency | Same machine recommended |
| Network | Sustained 100 Mbps | 1 Gbps+ |
Storage breakdown (approximate, full network):
| Table group | Size |
|---|---|
| Posts + records | ~3.5 TB |
| Likes | ~2 TB |
| Follows | ~500 GB |
| Notifications | ~600 GB |
| Indexes | ~4 TB |
| OpenSearch (Palomar) | ~500 GB |
For a smaller community running a partial AppView (indexing only community members), requirements scale roughly linearly with indexed accounts.
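As a back-of-the-envelope illustration of that linear scaling, the sketch below sums the storage breakdown above (about 11,100 GB) and scales it by account count. Treat the result as a rough lower bound: it ignores fixed overhead and assumes community members are about as active as the network average, which may not hold.

```typescript
// Figures taken from the tables above; everything else is an assumption.
const FULL_NETWORK_ACCOUNTS = 42_000_000;
const FULL_NETWORK_STORAGE_GB = 3500 + 2000 + 500 + 600 + 4000 + 500; // 11,100 GB

// Naive linear estimate of storage for a partial AppView.
function estimateStorageGb(indexedAccounts: number): number {
  return (FULL_NETWORK_STORAGE_GB * indexedAccounts) / FULL_NETWORK_ACCOUNTS;
}
```

For example, `estimateStorageGb(100_000)` comes out to roughly 26 GB.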
git remote add upstream https://github.com/bluesky-social/atproto.git
git fetch upstream
git merge upstream/main
Conflicts will typically be in packages/bsky/src/data-plane/server/routes/ and packages/bsky/src/api/. Resolve by keeping our additions alongside upstream changes.
Same as upstream: dual-licensed under MIT and Apache 2.0. See LICENSE-MIT.txt and LICENSE-APACHE.txt.
In practice, the vast majority of handles (98.9% as of 2024) are under bsky.social [1]. Yes, alternative PDS providers exist, but if the default onboarding funnels everyone into one provider, and the average user doesn't even know what a PDS is, then decentralization is an implementation detail, not a user-facing reality.
I think they might've been referring to the fact that ATProto requires the existence of big, central relay/BGS servers, which are forced to index all the data of everyone on the network for the whole "social" aspect to work well.
That requirement makes hosting a complete, independent ATProto stack much more expensive and resource intensive than hosting an ActivityPub server, thus making ATProto harder to like, actually decentralize. (Correct me if I'm wrong, but I think currently the only independent, full-network relay is the corporate Bluesky one?)
To conflate having doubts about ATProto's design with "not liking fun" feels silly to me; it's a much less battle-tested design, doubts are warranted.
Fascinating
edit: before people take it the wrong way, I mean it's fascinating in that I've never seen this type of moderation policy before. I've seen plenty of communities built around cultures (e.g. Ukrainian Discord servers), but not around race and not exclusionary to outsiders. I'm not making a moral judgement here.
Your opinion sounds too strongly held to be defended this half-heartedly.
There's no other network where anything like this is remotely even possible today, much less at such tiny costs! And it turns out it's actually computationally not hard to do so much stuff!
I ran into this really great thread from Henry Farrell on people who overly polarized themselves, who shut down being willing to hear anything else. https://bsky.app/profile/himself.bsky.social/post/3mgagtkjg7...
Imo a good time to remember the old chant: "the only war that matters is the war against imagination; all other wars are subsumed by this war." A lot of totally closed people around HN parts. Atproto is one of the new favs for the shallow hate-brigade (alongside systemd, Linux audio, k8s).
So it's multiple components really.
- The blacksky feed which is a curated feed and community built around the US black community on atproto.
- The blacksky client, appview, moderation team, and relay which provide the necessary infrastructure for blacksky to operate independent of the rest of the ecosystem if they need to and for them to tailor their experience to their community.
- The blacksky PDS which serves as a source of truth for data storage and auth for blacksky users that choose to use it.
And of course non-black users can use all of this infrastructure but if you aren't black and you want to host your account on a blacksky PDS you have to pay a small subscription/donation.
It's all for their community but they are more than willing to let other people use their infra as long as those people pay their fair share.
Maybe it's good for end users but that doesn't mean much if it can fall into the same enshittification trap as the others. Also, centralised moderation. I don't want to be dependent on an American company's moderation rules.
There was an article here recently about someone who really tried setting up their own including a did:web and they ran into many problems. https://notes.nora.codes/atproto-again/
There are still many little centralised ties in bluesky and I doubt they'll ever relinquish control completely.
Personally I like nostr a lot more, it seems to be more censorship-resistant and really decentralised.
No need to call it a "joke." Both solutions can co-exist on the vastness of the Internet.
Side note: it hasn't been called BGS for a very long time. Nowadays they are just called relays and since sync 1.1 the cost for running a relay decreased by multiple orders of magnitude.
Neither Bluesky, nor other ATProto services require any of this.
RedDwarf is an example of an ATProto client that connects only with your PDS, uses Microcosm (https://www.microcosm.blue/ )'s aggregation index (they calculate aggregate counts for a variety of things with a couple of raspberry pis run off of a home fiber connection): https://reddwarf.app/
No big iron anywhere in the picture.
You give up the ability to search the network, but okay, that's already a feature that all Mastodon users sacrifice as well.
There are lots and lots of full network relays. A couple run a full network relay on a < $5/mo VPS. See my other comments in this thread, https://news.ycombinator.com/item?id=47302514
If all you want is likes and replies (which is the biggest part of the app view's responsibility), Constellation has done that, is open source, and can do full-network link indexing on a Raspberry Pi. It also runs public endpoints you can just use, which many apps rely on.
Atproto unlike Mastodon also faces these challenges that, if you tried to do them on Mastodon, would get you screamed at and banned from instances. Fediverse broadly doesn't want you to use "their" data. You can't write search tools. As a result, Mastodon doesn't need anywhere near the complexity, because these unofficial but vociferously enforced terms-of-service disallow thorough interconnection to begin with. Makes it technically much simpler to pull off! No one is allowed to get a full fire hose. No broad app views are allowed. Dunno if this is still true, but for the longest time you wouldn't get likes or comments from someone unless they were already followed by someone on your fedi, and it's a direct result of this deliberate explicit lack of interconnection.
Those constraints greatly reduce the difficulty of scaling Mastodon: technical scaling is not a problem if you socially don't allow scale. So distributed you can't even read it.
Doubt is allowed. But from my view, Mastodon world deserves the doubt. It has stood still, barely budged. I'd love to be wrong. But it seems dominated by the one software that is Mastodon, and it doesn't seem to have a universe of interesting connected neat social softwares sprouting up left and right. The software centralization is near to total, the protocol centralization even worse. As a result, there is little distinguishing interesting novel Mastodon technology happening. The one API is deeply rooted. ActivityPub is trying to find some way to get started breaking this mono-culture & enable innovation but that's just started.
Meanwhile a casual glance at Atproto shows hundreds of amazing apps and systems and clients springing up, from amazing impassioned developers. 2025 saw a massive amount of technical decentralization. Npmx and Eurosky set up sizable independent public/public-ish PDS instances, and people have indeed been moving their core identity off Bluesky servers at some kind of scale. I forgive you for not knowing how far things have come, and I forget myself how incredibly quickly it's come together. It's still 99.99% Bluesky concentrated hosting, but it is distributed, has independent services for the full network, there is credible exit (1000 moved to npmx hosting in the past ~3 weeks), and to me the most important thing: there is technical diversity & independent exploration. There are so many devs building amazing things. That feels so so so absent on Mastodon.
That's the checkbox at the end of the Fermi great filter of interest for online social systems for me: can we permissionlessly build interesting social systems & experiences with these social protocols, or will each system have to look like a cookie cutter copy of a single instance?
Don't need extreme measures to keep bad actors out if you're able and willing to throw out anyone who obviously doesn't intend on playing nice.
But AFAIK the way blacksky operates is that they assume good faith when new users join. If it becomes obvious that you are not black then you will likely get reported or directly hit by moderation action and they will ask you to verify your identity at some level.
I think it's something along the lines of "send a photograph that would be non-trivial to fake". Not necessarily forcing you to dox yourself but requiring that you provide some level of evidence that's visibly resistant to AI/tampering. Now I have no idea the extent to which they do this to be entirely honest but I do know they don't mess around with people doing "digital blackface".
I'm not sure how well that moderation approach will scale at large but given they are a community that has carved out their own niche and not a corp just blindly driving to scale, I doubt they'll see the strain that the greater bluesky and atproto have experienced with moderation struggles at scale. And given all decisions around policy and moderation rules are decided by the Blacksky People's Assembly, as the community evolves participants can participate in governance and help craft the process if they are dissatisfied.
Like if you aren't being a niche internet celebrity and aren't trying to play main character on the internet it's unlikely you'd get caught unless you were particularly stupid but that's also kinda part of the point. It's a community and people in that community know each other both online and IRL. It'd be pretty hard to be involved in the community without leaving behind an evidence trail of you blatantly lying about who you are.