Run large language models across a network of independent nodes. No single node sees your full input. Cryptographic protocols protect your prompts. On-chain staking keeps nodes honest.
Three pillars that make decentralized inference private, secure, and fair.
Your prompt is secret-shared between two independent MPC nodes at the embedding layer. No single node ever sees the raw text you typed. Inference happens on encrypted shares.
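The splitting step can be pictured as two-party additive secret sharing. This is a minimal sketch, assuming an additive scheme over a prime field; the modulus, function names, and wire format are illustrative, not the protocol's actual parameters.

```python
import secrets

# A 61-bit Mersenne prime as the share field; the real protocol's
# modulus is unspecified here, so this choice is illustrative.
P = 2**61 - 1

def share(x: int) -> tuple[int, int]:
    """Split x into two additive shares with x = (s0 + s1) mod P."""
    s0 = secrets.randbelow(P)
    s1 = (x - s0) % P
    return s0, s1

def reconstruct(s0: int, s1: int) -> int:
    return (s0 + s1) % P

# Each MPC node receives one share; a single share is uniformly
# random, so it reveals nothing about the embedded token.
s0, s1 = share(50256)
assert reconstruct(s0, s1) == 50256
```

Because each share is uniformly random on its own, a node that holds only `s0` or only `s1` learns nothing about the token it encodes.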
Transformer models are split into shards served by independent operators. Each node holds only a slice of the model — no single entity controls the full pipeline.
Nodes stake tokens on-chain via an ERC-20 escrow contract. Cheating nodes are detected by verifiers and slashed automatically. Honest work is rewarded proportionally through a P2Pool-style share chain.
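Proportional settlement over a share chain can be sketched in a few lines. The function name and the round-down policy are assumptions; the real escrow contract's accounting is not specified here.

```python
def settle_rewards(shares: dict[str, int], block_reward: int) -> dict[str, int]:
    """Pay each node in proportion to its recorded compute shares.

    Integer payouts round down; what happens to the remainder
    (e.g. carried in escrow) is a policy detail this sketch omits.
    """
    total = sum(shares.values())
    return {node: block_reward * n // total for node, n in shares.items()}

payouts = settle_rewards({"node-a": 60, "node-b": 30, "node-c": 10},
                         block_reward=1_000)
assert payouts == {"node-a": 600, "node-b": 300, "node-c": 100}
```

As in P2Pool, rewards track contributed work rather than a fixed per-node split, so a node that serves more inference shares earns proportionally more.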
From prompt to response in four steps, across a privacy-preserving pipeline.
The client encrypts the request using onion routing layers — one layer per node in the circuit.
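The layering itself can be shown with a toy cipher. This sketch uses a SHA-256 counter keystream purely to illustrate wrap-and-peel; a real circuit would use an authenticated cipher, and all names here are assumptions.

```python
import hashlib

def keystream(key: bytes, n: int) -> bytes:
    """Toy keystream (SHA-256 in counter mode) — illustration only,
    not a secure cipher."""
    out, ctr = b"", 0
    while len(out) < n:
        out += hashlib.sha256(key + ctr.to_bytes(8, "big")).digest()
        ctr += 1
    return out[:n]

def _xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def wrap(payload: bytes, circuit_keys: list[bytes]) -> bytes:
    """Client side: add one layer per node, last hop's layer innermost."""
    for key in reversed(circuit_keys):
        payload = _xor(payload, keystream(key, len(payload)))
    return payload

def peel(payload: bytes, key: bytes) -> bytes:
    """Node side: each hop removes exactly its own layer."""
    return _xor(payload, keystream(key, len(payload)))

keys = [b"hop-0", b"hop-1", b"hop-2"]
msg = wrap(b"prompt tokens", keys)
for k in keys:                 # each hop peels its layer in circuit order
    msg = peel(msg, k)
assert msg == b"prompt tokens"
```

Each hop can remove only the layer keyed to it, so no intermediate node sees both the client's identity and the plaintext request.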
Shard 0 runs as an MPC pair — two nodes that perform the embedding and first transformer layers on secret-shared inputs. Neither node can reconstruct the original tokens.
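Why this works for the embedding and linear layers: matrix multiplication is linear, so each node can apply a weight matrix to its own share locally and the results still sum to the true output. A minimal sketch, with an illustrative weight slice; nonlinear operations need an interactive MPC protocol that this example elides.

```python
def linear_on_shares(W, share):
    """Apply a linear map to one additive share locally.
    Since W @ s0 + W @ s1 == W @ (s0 + s1), each MPC node computes
    on its share alone and never reconstructs the input."""
    return [sum(w * s for w, s in zip(row, share)) for row in W]

x  = [3.0, 1.0]                        # secret input vector
s0 = [1.25, -4.0]                      # node 0's random share
s1 = [x[i] - s0[i] for i in range(2)]  # node 1's share: x - s0
W  = [[2.0, 0.0], [1.0, 1.0]]          # illustrative embedding-matrix slice

y0 = linear_on_shares(W, s0)           # computed by node 0
y1 = linear_on_shares(W, s1)           # computed by node 1
y  = [y0[i] + y1[i] for i in range(2)]
assert y == [6.0, 4.0]                 # equals W @ x; shares recombine only downstream
```

The same trick covers every purely linear step; activations and softmax are where the two nodes must run an interactive protocol instead.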
Activations flow through the remaining shards (each run by an independent operator). Each shard processes its transformer layers and forwards the hidden states to the next node.
The last shard samples tokens from the LM head and streams them back through the circuit. Compute shares are recorded on the share chain for settlement and payment.
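The four steps above can be strung together as a toy pipeline. Every function here is a stand-in (the "layers" are one-line lambdas, the "LM head" a seeded random choice), and the secret-shared embedding of step 2 is shown on reconstructed values for brevity.

```python
import random

def embed(token_ids):
    # Step 2: in the real system this runs on secret shares
    # inside the MPC pair; shown reconstructed here.
    return [float(t % 7) for t in token_ids]

def make_layer(scale):
    # Stand-in for a transformer layer held by one shard.
    return lambda hidden: [scale * h + 1.0 for h in hidden]

def sample_token(hidden, vocab=("yes", "no", "maybe")):
    rng = random.Random(int(sum(hidden)))   # deterministic toy "LM head"
    return rng.choice(vocab)

def run_pipeline(token_ids, shards):
    hidden = embed(token_ids)           # step 2: shard 0 (MPC pair)
    for layers in shards:               # step 3: activations hop shard to shard
        for layer in layers:
            hidden = layer(hidden)
    return sample_token(hidden)         # step 4: final shard samples, streams back

shards = [[make_layer(1.1), make_layer(0.9)], [make_layer(1.2)]]
reply = run_pipeline([101, 2023, 2003], shards)
assert reply in ("yes", "no", "maybe")
```

The structural point survives the simplification: each shard only ever touches hidden states, never the raw prompt, and the output emerges only at the final hop.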
UNFED AI registries compete for nodes and clients in a free market, much as Monero mining pools compete for miners.
Any operator can deploy a registry, set economic rules (pricing, staking minimums, slash fractions), and publish model manifests. Each cluster is an independent business.
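A registry's published rules amount to a small record of economic parameters. This sketch is a plausible shape for that record; the field names, units, and values are assumptions, not the on-chain schema.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class RegistryRules:
    """Economic parameters a registry operator publishes on deployment.
    Field names and units are illustrative."""
    price_per_mtoken: int    # price per million tokens, smallest token units
    min_stake: int           # minimum ERC-20 stake required to join
    slash_fraction: float    # fraction of stake slashed on detected cheating
    models: tuple[str, ...]  # model manifests this cluster serves

rules = RegistryRules(price_per_mtoken=5_000, min_stake=10_000,
                      slash_fraction=0.5, models=("llama-3-8b",))
assert 0.0 <= rules.slash_fraction <= 1.0
```

Because each cluster sets its own numbers, operators differentiate on price, stake requirements, and model catalogue rather than on a protocol-wide tariff.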
Nodes join whichever cluster offers the best deal. Clients discover clusters via seed lists and peer-exchange gossip, then pick based on price, latency, and model availability.
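Client-side cluster selection reduces to filtering gossiped records and ranking them. A minimal sketch; the record shape, field names, and tie-breaking rule are assumptions about what peer exchange advertises.

```python
def pick_cluster(clusters, model, max_price):
    """Choose the cheapest advertised cluster that serves `model`,
    breaking price ties on latency. Returns None if nothing qualifies."""
    candidates = [c for c in clusters
                  if model in c["models"] and c["price"] <= max_price]
    return min(candidates,
               key=lambda c: (c["price"], c["latency_ms"]),
               default=None)

clusters = [
    {"name": "alpha", "models": ["llama-3-8b"], "price": 7, "latency_ms": 40},
    {"name": "beta",  "models": ["llama-3-8b"], "price": 5, "latency_ms": 90},
    {"name": "gamma", "models": ["mixtral"],    "price": 3, "latency_ms": 20},
]
best = pick_cluster(clusters, "llama-3-8b", max_price=10)
assert best["name"] == "beta"
```

A real client would re-rank as gossip updates arrive, but the market dynamic is the same: clusters that overprice or underperform simply stop being chosen.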