AT Protocol

The AT Protocol is designed to facilitate the creation of federated identities, so that users can retain, manage, and customize one online identity across multiple platforms and services. Independent hosts and other network participants can access and serve any user content within the network by fetching content formatted as predefined data schemas from federated network-wide data streams.

AT Protocol clients and services interoperate through an HTTP API called XRPC (Cross-organizational Remote Procedure Calls) that uses JSON for sending and receiving associated data. Additionally, all data within the protocol that must be authenticated, referenced, or stored is encoded in CBOR.

== User identity ==
The AT Protocol utilizes a dual identifier system: a mutable handle, in the form of a domain name, and an immutable decentralized identifier (DID).

A handle serves as a verifiable user identifier. It is verified by either of two equivalent methods that prove control of the domain name: a DNS query of a resource record with the same name as the handle, or a request for a text file from a Web service with the same name.

DIDs resolve to DID documents, which contain references to key user metadata, such as the user's handle, public keys, and data repository. While any DID method could, in theory, be used by the protocol if its components provide support for the method, in practice only two methods are supported ("blessed") by the protocol's reference implementations: did:plc and did:web. The validity of these identifiers can be verified, respectively, by a registry which hosts the DID's associated document and by a file hosted at a well-known location on the connected domain name.

The usage of these DID methods has been criticized for potentially compromising the decentralization of the protocol.
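
As an illustration, the locations consulted when verifying a handle or resolving a DID can be computed without any network access. The following is a minimal sketch rather than a reference implementation: the `_atproto.` DNS prefix, the `/.well-known/atproto-did` path, the `did=` record format, and the `plc.directory` registry host are assumptions drawn from the protocol's published specification rather than from the description above, and all function names are hypothetical.

```python
from typing import Dict, Optional
from urllib.parse import unquote

# Assumed host of the did:plc registry (not stated in the text above).
PLC_DIRECTORY = "https://plc.directory"


def handle_lookup_targets(handle: str) -> Dict[str, str]:
    """Return the DNS name and HTTPS URL a verifier would query for a handle."""
    handle = handle.lstrip("@").lower()
    return {
        "dns_txt_name": f"_atproto.{handle}",
        "https_url": f"https://{handle}/.well-known/atproto-did",
    }


def parse_handle_txt(value: str) -> Optional[str]:
    """Extract the DID from a TXT record value of the assumed form 'did=<DID>'."""
    value = value.strip().strip('"')
    if value.startswith("did="):
        return value[len("did="):]
    return None  # unrelated TXT record, e.g. an SPF entry


def did_document_url(did: str) -> str:
    """Return the URL from which a DID document would be fetched."""
    if did.startswith("did:plc:"):
        # did:plc documents are served by the registry.
        return f"{PLC_DIRECTORY}/{did}"
    if did.startswith("did:web:"):
        host = did[len("did:web:"):]
        if not host or ":" in host:
            # This sketch only handles bare hostnames, not path-based did:web.
            raise ValueError(f"unsupported did:web identifier: {did}")
        # A port, if present, is percent-encoded in the DID (e.g. %3A).
        return f"https://{unquote(host)}/.well-known/did.json"
    raise ValueError(f"unsupported DID method: {did}")
```

A verifier would fetch the DID through one of the two handle targets, resolve the DID document, and then confirm that the document lists the same handle, so that the link holds in both directions.
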
The did:plc method in particular has been noted as a single point of failure within the protocol, as its current implementation and general usage rely on a single registry hosted by Bluesky Social, with no system to independently verify the document's current state. The company has pledged to transfer the directory to an independent organization, which will be incorporated as a Swiss association, and has suggested that the architecture of the PLC method could be moved away from its current model as a centralized single-writer registry.

Services can assign handles to new users upon signup using subdomains (e.g. @username.bsky.social). Alternatively, users can set a custom domain or subdomain as their handle (e.g. @username.com or @username.wikipedia.org) by adding a TXT record to the domain's records or by responding to HTTP requests to a specific well-known URI, associating the domain or subdomain with the user's DID.

The protocol's dual identifier system provides both user-friendly identifiers for use in end-user services and consistent cryptographic identities within the protocol, while also providing a robust TCP/IP-based account verification mechanism at the protocol level.

== User data repositories ==
User data within the protocol is stored in dedicated data repositories, or "repos". Each user is associated with a single repository, over which they have exclusive management rights. Repositories contain mutable collections of user records, which log actions such as posts, likes, follows, and blocks. Records are persistent and can only be added or removed at the explicit request of the user.

Each record within a repository's collection is assigned a unique record key, which is used by network agents to reference records within a user's repository. The current implementation of record keys is the timestamp identifier (TID), derived from the record's creation time.
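
The derivation of a TID from a creation time can be sketched as follows. The bit layout used here (a 53-bit microsecond timestamp followed by a 10-bit clock identifier, encoded as 13 characters of a sortable base32 alphabet) is an assumption drawn from the protocol's published specification and is not described above; the function names are hypothetical.

```python
import secrets
import time
from typing import Tuple

# Assumed alphabet; its characters are in ascending ASCII order, so
# string comparison of two TIDs matches numeric comparison.
B32_SORTABLE = "234567abcdefghijklmnopqrstuvwxyz"


def encode_tid(timestamp_us: int, clock_id: int) -> str:
    """Pack a microsecond timestamp and a clock ID into a 13-character TID."""
    if not 0 <= timestamp_us < 2**53:
        raise ValueError("timestamp must fit in 53 bits")
    if not 0 <= clock_id < 2**10:
        raise ValueError("clock ID must fit in 10 bits")
    value = (timestamp_us << 10) | clock_id
    chars = []
    for _ in range(13):  # 5 bits per character, least significant first
        chars.append(B32_SORTABLE[value & 0x1F])
        value >>= 5
    return "".join(reversed(chars))


def decode_tid(tid: str) -> Tuple[int, int]:
    """Recover (timestamp_us, clock_id) from a TID string."""
    if len(tid) != 13:
        raise ValueError("a TID is exactly 13 characters long")
    value = 0
    for ch in tid:
        value = (value << 5) | B32_SORTABLE.index(ch)
    return value >> 10, value & 0x3FF


def new_tid() -> str:
    """Create a TID for the current time with a random clock ID."""
    return encode_tid(time.time_ns() // 1000, secrets.randbelow(1024))
```

Because the timestamp occupies the most significant bits, sorting TIDs as plain strings orders records by creation time.
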
Repositories store collections in a Merkle search tree, which sorts records chronologically based on their TID.

Media files, along with their metadata, size, and media type, are stored separately from repositories as blobs, a type of unstructured binary data, in the user's host server. This allows network agents to access and process arbitrary media files regardless of their original schema or upload context.

Currently all data in repositories is public, but there are plans to add private data to the protocol.

== Personal Data Servers ==
Personal Data Servers (PDSes) host user repositories and their associated media. They also serve as the network access point for users, facilitating repository updates, backups, data queries, and user requests.

Platform clients access the protocol on the user's behalf by querying their PDS, which, in turn, fetches the requested data from other services within the network. This design differs from ActivityPub, where protocol interactions and services are handled by monolithic host servers. Since network events are resolved through the protocol's network-wide indexing infrastructure, the availability of any single PDS is, by design, potentially inconsequential to the user experience.

The AT Protocol prioritizes data portability, enabling users to back up and migrate repositories and associated media without data loss, even in the event of an adversarial PDS.

The design of PDSes results in low computational requirements for operation, allowing individuals or groups to run their own PDSes without significant computational resources.

== Relays and the firehose ==
Relays are a key component of the protocol's indexing infrastructure, serving as the core indexers within the network. The firehose, the network-wide stream of repository updates that relays output, is available to all network agents and can be consumed by any service within the network.
Relays have been criticized as being the most centralized component in the protocol's design, given their near-indispensable role in the network and a lack of clear incentives for running a relay.

== AppViews ==
AppViews, analogous to current-day social networking services, are end-user platforms and services within the protocol that consume, process, and deliver data from the relay to user clients in response to queries from users' PDSes. They utilize network-wide information from the firehose, such as posts, likes, follows, and replies, to create customized user experiences within their clients. Although AppViews may differ in how they process and present this information, they all operate on the same data sourced from the firehose. This architecture reduces the computational load and storage requirements of AppViews, and prevents user lock-in by enabling users to easily switch between AppViews while retaining their posts, follows, likes, etc.

The largest AppView on the protocol is currently Bluesky, although other AppViews, such as Blacksky (a project supporting Black social media users), Frontpage (a Hacker News-style social news website) and Smoke Signal (an RSVP management service), are also available within the protocol.

== Lexicons ==
All records and XRPC calls within the AT Protocol follow a specific global schema language called a lexicon to support different service and platform modalities.

AppViews within the protocol have the flexibility to define their own unique lexicons, or utilize existing ones. This approach allows AppViews to create custom lexicons that are tailored to their specific use case while maintaining compatibility with the broader network. As an example, records displayed in an AppView focused on microblogging would likely use a different lexicon than one focused on video-sharing, as their content types require different sets of attributes.
However, AppViews can also choose to serve content using lexicons defined by other AppViews, even if the content was originally posted elsewhere in the network. For example, a new microblogging AppView could choose to serve previously posted content using the lexicon defined by an established competitor, enabling it to provide novel features and services while maintaining compatibility with existing content.

This schema design is intended to eliminate user lock-in and foster user-centric innovation by forcing AppViews to differentiate themselves through unique user experiences and additional functionality, rather than relying on exclusive access to content.

Lexicons are referenced within records using Namespaced Identifiers (NSIDs), which consist of a domain authority in reverse domain-name order, followed by an arbitrary name segment. For example, com.example.foo is a valid NSID, where com.example is the domain authority, and foo is the name segment.

The most popular record lexicon in the protocol, app.bsky, defines Bluesky's microblogging schema.

== Labelers ==
Labelers produce judgements about user-generated content, such as identifying spam or inappropriate material. These labels can be applied to various aspects of the network, including posts, images, or accounts. The output of labelers is consumed by AppViews and PDSes, which can then provide various strategies to users for handling labeled content, such as hiding, labeling, or blurring.

Bluesky Social has open-sourced its internal labeler moderation service "Ozone", allowing users to create custom moderation services for the network.

== Bluesky feed generators ==
Feed generators process Bluesky posts within the firehose for inclusion in custom Bluesky feeds. After a PDS query, they return a list of post IDs to the user's AppView, which can then be used to create curated feeds.
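
The core of such a generator can be sketched as a filter over a stream of posts. This is a simplified illustration under stated assumptions: the `Post` type, the `build_feed` function, the keyword-matching rule, and the use of `at://` URIs as post IDs are all hypothetical, and a real generator would read posts from the firehose and answer requests over XRPC.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class Post:
    """Hypothetical, simplified view of a post seen on the firehose."""
    uri: str   # identifier returned to the AppView
    text: str  # post body used for matching


def build_feed(posts: Iterable[Post], keywords: List[str], limit: int = 50) -> List[str]:
    """Return the IDs of matching posts, newest first.

    Assumes `posts` is ordered oldest to newest, as in a live stream.
    """
    lowered = [k.lower() for k in keywords]
    matches = []
    for post in posts:
        text = post.text.lower()
        if any(k in text for k in lowered):
            matches.append(post.uri)
    matches.reverse()        # newest first
    return matches[:limit]   # only IDs are returned, not post content
```

Returning only identifiers keeps the generator lightweight: the AppView, which already holds the post data, fills in the content when it assembles the feed.
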
While Bluesky does not provide any built-in tools for generating and hosting custom feeds, services such as Graze provide tools for users to create and host feeds using block coding, as well as to monetize them with injected advertisements.

== History ==