
Author: JacobZhao Source: mirror, zhaotaobo.eth
In the entire value chain of AI, model training is the link with the largest resource consumption and the highest technical threshold, which directly determines the upper limit of the model's capabilities and the actual application effect. Compared with the lightweight call in the inference stage, the training process requires continuous large-scale computing power investment, complex data processing processes and high-intensity optimization algorithm support, which is the real "heavy industry" for building AI systems. From the perspective of architectural paradigm, training methods can be divided into four categories: centralized training, distributed training, federated learning, and decentralized training, which is the focus of this article.
Centralized training is the most common traditional approach: a single organization completes the entire training process inside a local high-performance cluster. Every component, from hardware (e.g. NVIDIA GPUs), low-level software (CUDA, cuDNN), and cluster schedulers (e.g. Kubernetes) to training frameworks (e.g. PyTorch with the NCCL backend), is coordinated by a unified control system. This tightly coupled architecture maximizes the efficiency of memory sharing, gradient synchronization, and fault-tolerance mechanisms, making it well suited to training large-scale models such as GPT and Gemini. It offers high efficiency and controllable resources, but also brings problems of data monopoly, resource barriers, energy consumption, and single points of failure.
Distributed training is the current mainstream method for training large models. Its core idea is to decompose the training task and distribute it across multiple machines for collaborative execution, breaking through single-machine compute and storage bottlenecks. Although it is physically "distributed", the whole process is still scheduled and synchronized by a centralized organization, typically running in a high-speed LAN where a master node coordinates all subtasks over high-speed interconnects such as NVLink. Mainstream approaches include:
Data parallelism: each node trains on different data while sharing the same parameters; model weights must be kept in sync;
Model parallelism: different parts of the model are deployed on different nodes, giving strong scalability;
Pipeline parallelism: stages execute serially in a pipeline, improving throughput;
Tensor parallelism: matrix computations are partitioned at fine granularity, increasing the degree of parallelism.
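Of the four strategies, data parallelism is the easiest to sketch concretely. Below is a minimal pure-Python illustration (no real framework, a toy least-squares objective, and "workers" simulated sequentially): each worker computes a gradient on its own data shard, the gradients are averaged (the all-reduce step), and every replica applies the same update so the weights stay synchronized.

```python
# Minimal sketch of data parallelism: each worker computes gradients on its
# own shard, then gradients are averaged so all replicas stay in sync.

def local_gradient(w, shard):
    # Gradient of 0.5 * (w*x - y)^2 averaged over the worker's shard.
    g = 0.0
    for x, y in shard:
        g += (w * x - y) * x
    return g / len(shard)

def data_parallel_step(w, shards, lr=0.1):
    # All-reduce: average the per-worker gradients, then every replica
    # applies the same update, keeping weights identical across nodes.
    grads = [local_gradient(w, s) for s in shards]
    avg_g = sum(grads) / len(grads)
    return w - lr * avg_g

shards = [[(1.0, 2.0), (2.0, 4.0)], [(3.0, 6.0), (4.0, 8.0)]]  # true w = 2
w = 0.0
for _ in range(200):
    w = data_parallel_step(w, shards)
print(round(w, 3))  # converges toward 2.0
```

The averaging step is exactly what makes data parallelism communication-heavy at scale: every update requires synchronizing gradients across all workers.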
Distributed training thus combines "centralized control + distributed execution", analogous to one boss remotely directing employees in multiple "offices" to complete a task together. Almost all mainstream large models (GPT-4, Gemini, LLaMA, etc.) are trained this way.
Decentralized Training represents a future path that is more open and censorship-resistant. Its core feature is that multiple untrusted nodes (which may be home computers, cloud GPUs, or edge devices) collaborate to complete training tasks without a central coordinator, usually through protocol-driven task distribution and collaboration, and with the help of cryptographic incentive mechanisms to ensure the honesty of contributions. The main challenges faced by this model include:
Device heterogeneity and difficult segmentation: Heterogeneous devices are difficult to coordinate and task segmentation efficiency is low;
Communication efficiency bottleneck: Network communication is unstable and gradient synchronization bottleneck is obvious;
Lack of trusted execution: The lack of a trusted execution environment makes it difficult to verify whether the nodes are actually involved in the calculation;
Lack of unified coordination: There is no central scheduler, and the task distribution and exception rollback mechanisms are complex.
Decentralized training can be understood as a group of volunteers around the world, each contributing compute to collaboratively train a model. However, "truly feasible large-scale decentralized training" remains a systemic engineering challenge spanning system architecture, communication protocols, cryptographic security, economic mechanisms, and model verification; whether it can deliver "effective collaboration + honest incentives + correct results" is still at the early prototype stage.
Federated learning, as a transitional form between distributed and decentralized training, emphasizes keeping data local while aggregating model parameters centrally, and suits privacy-compliance scenarios such as healthcare and finance. It has the engineering structure and local coordination capability of distributed training together with the data-dispersion advantage of decentralized training, but it still depends on a trusted coordinator and is not fully open or censorship-resistant. It can be seen as a "controlled decentralization" solution for privacy-compliance scenarios: relatively mild in its training tasks, trust structure, and communication mechanisms, it is better suited as a transitional deployment architecture in industry.
From the perspective of training paradigms, decentralized training is not suitable for every task type. In some scenarios, complex task structure, extremely high resource demands, or difficult collaboration make a task naturally ill-suited to heterogeneous, trustless nodes. For example, large-model training often depends on large GPU memory, low latency, and high bandwidth, which are hard to split and synchronize effectively over an open network; tasks with strong data privacy or sovereignty constraints (such as medical, financial, or classified data) are bound by legal compliance and ethics and cannot be openly shared; and tasks lacking a collaborative incentive basis (such as closed-source corporate models or internal prototype training) offer no motivation for external participation. Together, these boundaries define the current practical limits of decentralized training.
But this does not mean decentralized training is a false proposition. In fact, it shows clear promise for task types that are lightweight, easy to parallelize, and incentivizable, including but not limited to: LoRA fine-tuning, behavior-alignment post-training tasks (such as RLHF and DPO), data crowdsourcing and annotation tasks, training of small resource-controllable base models, and collaborative training involving edge devices. These tasks generally feature high parallelism, low coupling, and tolerance of heterogeneous compute, making them well suited to collaborative training via P2P networks, Swarm protocols, distributed optimizers, and the like.
Overview of decentralized training task suitability
Currently, in the frontier fields of decentralized training and federated learning, representative blockchain projects mainly include Prime Intellect, Pluralis.ai, Gensyn, Nous Research and Flock.io. From the perspective of technological innovation and engineering difficulty, Prime Intellect, Nous Research and Pluralis.ai have proposed many original explorations in system architecture and algorithm design, representing the frontier direction of current theoretical research; while the implementation paths of Gensyn and Flock.io are relatively clear, and initial engineering progress can be seen. This article will analyze the core technologies and engineering architectures behind these five projects in turn, and further explore their differences and complementary relationships in the decentralized AI training system.
Prime Intellect is committed to building a trustless AI training network that allows anyone to participate in training and receive credible rewards for their computing contributions. Prime Intellect hopes to build a decentralized AI training system with verifiability, openness, and a complete incentive mechanism through the three modules of PRIME-RL + TOPLOC + SHARDCAST.
PRIME-RL: Decoupled asynchronous reinforcement learning task architecture
PRIME-RL is a task modeling and execution framework customized by Prime Intellect for decentralized training scenarios, designed for heterogeneous networks and asynchronous participation. It adopts reinforcement learning as its primary adaptation target and structurally decouples training, inference, and weight uploading, so that each training node can complete a task cycle independently and locally, cooperating with verification and aggregation mechanisms through standardized interfaces. Compared with traditional supervised learning pipelines, PRIME-RL is better suited to elastic training in environments without central scheduling, which both reduces system complexity and lays the foundation for multi-task parallelism and policy evolution.
TOPLOC: Lightweight Training Behavior Verification Mechanism
TOPLOC (Trusted Observation & Policy-Locality Check) is Prime Intellect's core mechanism for training verifiability, used to determine whether a node has actually completed effective policy learning on observed data. Unlike heavyweight schemes such as ZKML, TOPLOC does not rely on full model recomputation; instead, it performs lightweight structural verification by analyzing the local consistency trajectory between "observation sequence ↔ policy update". It is the first mechanism to turn the behavioral trajectories of the training process into verifiable objects, a key innovation for trustless training-reward distribution, and it offers a feasible path toward an auditable, incentivized decentralized collaborative training network.
SHARDCAST: Asynchronous Weight Aggregation and Propagation Protocol
SHARDCAST is a weight propagation and aggregation protocol designed by Prime Intellect, optimized for asynchronous, bandwidth-constrained, and real-world network environments with variable node states. It combines the gossip propagation mechanism with the local synchronization strategy to allow multiple nodes to continuously submit partial updates in an asynchronous state, achieving progressive convergence and multi-version evolution of weights. Compared with centralized or synchronous AllReduce methods, SHARDCAST significantly improves the scalability and fault tolerance of decentralized training, and is the core foundation for building stable weight consensus and continuous training iterations.
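The core idea of asynchronous partial aggregation can be illustrated with a toy sketch. The mixing rule below (staleness-discounted blending of incoming updates) is an illustrative assumption, not Prime Intellect's actual protocol; it only shows how an aggregator can fold in updates that arrive at different times against different weight versions.

```python
# Hedged sketch of asynchronous weight aggregation in the spirit of
# SHARDCAST: nodes submit updates at different times, and the aggregator
# folds each one into the current weights with a mixing factor that decays
# with staleness. Names and the decay rule are illustrative, not the
# real protocol.

def aggregate(current, update, base_version, current_version, alpha=0.5):
    # Discount updates computed against older weight versions.
    staleness = current_version - base_version
    mix = alpha / (1 + staleness)
    return [(1 - mix) * c + mix * u for c, u in zip(current, update)]

weights = [0.0, 0.0]
version = 0
# (update, version the node trained against), arriving asynchronously
incoming = [([1.0, 1.0], 0), ([2.0, 2.0], 0), ([1.5, 1.5], 1)]
for update, base in incoming:
    weights = aggregate(weights, update, base, version)
    version += 1
print([round(w, 3) for w in weights])
```

Because no update ever blocks on global synchronization, the weights evolve through multiple versions while converging progressively, which is the property the original text attributes to SHARDCAST.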
OpenDiLoCo: Sparse Asynchronous Communication Framework
OpenDiLoCo is a communication optimization framework independently implemented and open sourced by the Prime Intellect team based on the DiLoCo concept proposed by DeepMind. It is designed for challenges such as bandwidth constraints, device heterogeneity, and node instability that are common in decentralized training. Its architecture is based on data parallelism. By building sparse topological structures such as Ring, Expander, and Small-World, it avoids the high communication overhead of global synchronization, and only relies on local neighbor nodes to complete model collaborative training. Combined with asynchronous updates and breakpoint tolerance mechanisms, OpenDiLoCo enables consumer-grade GPUs and edge devices to stably participate in training tasks, significantly improving the participation of global collaborative training, and is one of the key communication infrastructures for building decentralized training networks.
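The benefit of sparse topologies such as Ring can be seen in a toy consensus example: instead of a global all-reduce where every node talks to every other, each node averages only with its two ring neighbors, yet all replicas still drift toward a common value. This is a generic illustration of the idea, not OpenDiLoCo's actual code.

```python
# Illustrative sketch of sparse, neighbor-only synchronization in a ring
# topology: each node averages its weights with its two ring neighbors
# instead of performing a global all-reduce.

def ring_sync_step(node_weights):
    n = len(node_weights)
    # Each node mixes with its left and right neighbors only: O(1) messages
    # per node per round, versus all-to-all participation in global sync.
    return [(node_weights[(i - 1) % n] + node_weights[i] + node_weights[(i + 1) % n]) / 3
            for i in range(n)]

weights = [0.0, 3.0, 6.0, 9.0]  # divergent local replicas
for _ in range(50):
    weights = ring_sync_step(weights)
print([round(w, 3) for w in weights])  # all nodes drift toward the mean 4.5
```

The trade-off is that consensus is reached over multiple rounds rather than in one step, which is acceptable when bandwidth, not latency, is the binding constraint.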
PCCL: Collaborative Communication Library
PCCL (Prime Collective Communication Library) is a lightweight communication library tailored by Prime Intellect for decentralized AI training environments, designed to solve the adaptation bottleneck of traditional communication libraries (such as NCCL, Gloo) in heterogeneous devices and low-bandwidth networks. PCCL supports sparse topology, gradient compression, low-precision synchronization and breakpoint recovery, and can run on consumer-grade GPUs and unstable nodes. It is the underlying component that supports the asynchronous communication capabilities of the OpenDiLoCo protocol. It significantly improves the bandwidth tolerance and device compatibility of the training network, and opens up the "last mile" communication foundation for building a truly open, trustless collaborative training network.
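One of the capabilities listed above, low-precision synchronization, can be sketched generically: gradients are quantized to 8-bit integers before transfer and dequantized on arrival, cutting bandwidth roughly 4x versus float32. The symmetric per-tensor scaling scheme below is a standard illustration, not PCCL's API.

```python
# Hedged sketch of low-precision gradient synchronization of the kind a
# library like PCCL supports: quantize to int8 before sending, dequantize
# on receipt. Quantization error per entry is bounded by the scale.

def quantize_int8(grads):
    scale = max(abs(g) for g in grads) / 127 or 1.0
    q = [round(g / scale) for g in grads]  # integers in [-127, 127]
    return q, scale

def dequantize(q, scale):
    return [v * scale for v in q]

grads = [0.51, -1.27, 0.003, 0.9]
q, scale = quantize_int8(grads)
restored = dequantize(q, scale)
max_err = max(abs(a - b) for a, b in zip(grads, restored))
print(q, round(max_err, 4))  # error stays below the quantization scale
```

Combined with sparse topologies and gradient compression, this is how consumer-grade GPUs behind slow links can still exchange usable updates.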
Prime Intellect has built a permissionless, verifiable, and economically incentivized training network that enables anyone to participate in tasks and receive rewards based on real contributions. The operation of the protocol is based on three core roles:
Task initiator: defines the training environment, initial model, reward function and verification criteria
Training node: performs local training, submits weight updates and observes trajectories
Verification node: uses the TOPLOC mechanism to verify the authenticity of the training behavior, and participates in reward calculation and strategy aggregation
The core process of the protocol includes task publishing, node training, trajectory verification, weight aggregation (SHARDCAST) and reward issuance, forming an incentive closed loop around "real training behavior".
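The incentive closed loop described above can be sketched as a toy round: nodes submit (update, trajectory) pairs, a verifier accepts or rejects each trajectory, and only accepted updates are aggregated and rewarded. The stub verifier and reward values are illustrative stand-ins, not the real protocol.

```python
# Toy sketch of the incentive loop: verification gates both aggregation
# and reward, so only "real training behavior" earns anything.

def run_round(submissions, verify, reward_per_update=10):
    accepted, rewards = [], {}
    for node, update, trajectory in submissions:
        if verify(trajectory):           # TOPLOC-style check (stubbed)
            accepted.append(update)
            rewards[node] = reward_per_update
        else:
            rewards[node] = 0            # no reward without a valid proof
    aggregate = sum(accepted) / len(accepted) if accepted else None
    return aggregate, rewards

# Stub verifier: a trajectory is "valid" if it is non-empty and monotone
# (a stand-in for consistency between observations and policy updates).
verify = lambda t: len(t) > 0 and all(a <= b for a, b in zip(t, t[1:]))

subs = [("A", 1.0, [0.1, 0.2, 0.3]), ("B", 5.0, [0.3, 0.1]), ("C", 2.0, [0.0, 0.5])]
agg, rew = run_round(subs, verify)
print(agg, rew)  # B's inconsistent trajectory earns nothing
```

Note how node B's large update is excluded from aggregation entirely: verification protects the model, not just the reward pool.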
Prime Intellect released INTELLECT-2 in May 2025, the world's first large reinforcement learning model trained by asynchronous, trustless decentralized nodes, with a parameter scale of 32B. The INTELLECT-2 model was trained by 100+ GPU heterogeneous nodes across three continents, using a fully asynchronous architecture and a training time of over 400 hours, demonstrating the feasibility and stability of asynchronous collaborative networks. This model is not only a breakthrough in performance, but also the first systematic implementation of the "training is consensus" paradigm proposed by Prime Intellect. INTELLECT-2 integrates core protocol modules such as PRIME-RL (asynchronous training structure), TOPLOC (training behavior verification) and SHARDCAST (asynchronous weight aggregation), marking the first time that a decentralized training network has achieved openness, verification and economic incentive closed loop in the training process.
In terms of performance, INTELLECT-2 is based on QwQ-32B training and has done special RL training in code and mathematics, which is at the forefront of current open source RL fine-tuning models. Although it has not yet surpassed closed-source models such as GPT-4 or Gemini, its real significance lies in: it is the world's first decentralized model experiment with a complete training process that is reproducible, verifiable and auditable. Prime Intellect not only open-sourced the model, but more importantly, the training process itself - the training data, strategy update trajectory, verification process and aggregation logic are all transparent and traceable, building a decentralized training network prototype that everyone can participate in, trustworthy collaboration, and share benefits.
Prime Intellect completed a $15 million seed round of financing in February 2025, led by Founders Fund, with participation from many industry leaders including Menlo Ventures, Andrej Karpathy, Clem Delangue, Dylan Patel, Balaji Srinivasan, Emad Mostaque, Sandeep Nailwal, etc. Prior to this, the project completed a $5.5 million early round of financing in April 2024, led by CoinFund and Distributed Global, with participation from Compound VC, Collab + Currency, Protocol Labs, etc. To date, Prime Intellect has raised more than $20 million in cumulative financing.
The co-founders of Prime Intellect are Vincent Weisser and Johannes Hagemann. Team members have backgrounds spanning AI and Web3, with core members coming from Meta AI, Google Research, OpenAI, Flashbots, Stability AI, and the Ethereum Foundation. They have deep capabilities in system architecture design and distributed engineering, and are among the very few teams to have successfully completed a real decentralized large-model training run.
Pluralis is a Web3 AI project focused on "trusted collaborative training networks". Its core goal is to promote a decentralized, openly participatory model-training paradigm with long-term incentives. Unlike today's mainstream centralized or closed training paths, Pluralis proposes a new concept called Protocol Learning: "protocolizing" the model-training process, and building an open training system with an intrinsic incentive loop through verifiable collaboration mechanisms and model-ownership mapping.
The Protocol Learning proposed by Pluralis contains three key pillars:
Unmaterializable Models: the model is distributed as fragments across multiple nodes, and no single node can reconstruct the complete weights, so the model stays closed source. This design makes the model a natural "in-protocol asset", enabling access-credential control, leakage protection, and revenue-attribution binding.
Model-parallel Training over the Internet: through an asynchronous pipeline model-parallel mechanism (the SWARM architecture), each node holds only part of the weights and collaborates over low-bandwidth networks to complete training or inference.
Partial Ownership for Incentives: all participating nodes obtain partial ownership of the model in proportion to their training contribution, entitling them to future revenue sharing and protocol governance rights.
Unmaterializable Models
The paper "A Third Path: Protocol Learning" first systematically proposed distributing model weights as fragments, ensuring that "model assets" can only run inside the Swarm network, with their access and revenue governed by the protocol. This mechanism is the prerequisite for a sustainable incentive structure in decentralized training.
Asynchronous Model-Parallel Training
In "SWARM Parallel with Asynchronous Updates", Pluralis built an asynchronous model parallel architecture based on Pipeline and demonstrated it on LLaMA-3 for the first time. The core innovation is the introduction of the Nesterov Accelerated Gradient (NAG) mechanism, which effectively corrects the gradient drift and convergence instability problems during the asynchronous update process, making training between heterogeneous devices practical in a low-bandwidth environment.
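The NAG step referenced above, in its standard textbook form, evaluates the gradient at a "look-ahead" point along the current velocity, which damps oscillation; Pluralis reports this stabilizes asynchronous updates. The snippet below is plain NAG on a toy quadratic, not Pluralis's training code.

```python
# Standard Nesterov Accelerated Gradient (NAG) on a toy 1-D quadratic:
# the gradient is taken at the look-ahead point w + momentum * v.

def nag_step(w, v, grad_fn, lr=0.1, momentum=0.9):
    lookahead = w + momentum * v         # peek ahead along the velocity
    g = grad_fn(lookahead)               # gradient at the look-ahead point
    v = momentum * v - lr * g
    return w + v, v

grad = lambda w: 2 * (w - 3.0)           # d/dw of (w - 3)^2
w, v = 0.0, 0.0
for _ in range(100):
    w, v = nag_step(w, v, grad)
print(round(w, 4))  # converges to the minimum at 3.0
```

The look-ahead evaluation is what distinguishes NAG from plain momentum, and is the property the paper leans on to correct gradient drift under asynchrony.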
Column-Space Sparsification
"Beyond Top-K" proposes replacing traditional Top-K with a structure-aware column-space compression method that avoids destroying semantic paths. The mechanism balances model accuracy against communication efficiency: measurements show that over 90% of communication data can be compressed in an asynchronous model-parallel setting, a key breakthrough for structure-aware efficient communication.
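For reference, the Top-K baseline that "Beyond Top-K" argues against works like this: keep only the k largest-magnitude entries of a gradient and transmit (index, value) pairs. The snippet below illustrates only this baseline and the compression arithmetic, not Pluralis's column-space method.

```python
# The Top-K sparsification baseline: transmit only the k entries with the
# largest magnitude, dropping the rest.

def top_k_sparsify(grad, k):
    idx = sorted(range(len(grad)), key=lambda i: abs(grad[i]), reverse=True)[:k]
    return {i: grad[i] for i in sorted(idx)}

grad = [0.01, -2.5, 0.003, 1.7, -0.02, 0.9, 0.0004, -1.1]
sparse = top_k_sparsify(grad, k=2)
ratio = 1 - len(sparse) / len(grad)
print(sparse, f"{ratio:.0%} of entries dropped")
```

The critique in the paper is that magnitude alone ignores structure: dropping small entries can sever semantically important paths through the model, which is what the column-space alternative is designed to avoid.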
Pluralis clearly takes "asynchronous model parallelism" as its core direction, emphasizing that it has the following advantages over data parallelism:
Supports low-bandwidth networks and non-consistent nodes;
Adapts to device heterogeneity and allows consumer-grade GPUs to participate;
Naturally possesses elastic scheduling capabilities and supports frequent online/offline nodes;
With structure compression + asynchronous update + weight non-extractability as the three major breakthrough points.
The six technical blog posts published on the official website can be organized along three main lines:
Philosophy and Vision: "A Third Path: Protocol Learning" "Why Decentralized Training Matters"
Technical Mechanism Details: "SWARM Parallel" "Beyond Top-K" "Asynchronous Updates"
Institutional Innovation Exploration: "Unmaterializable Models" "Partial Ownership Protocols"
At present, Pluralis has not launched a product, testnet, or open-source code, because the technical path it has chosen is extremely challenging: system-level problems in the underlying architecture, communication protocols, and non-exportable weights must be solved before product services can be built on top.
In a new paper published by Pluralis Research in June 2025, its decentralized training framework was expanded from model pre-training to model fine-tuning, supporting asynchronous updates, sparse communication and partial weight aggregation. Compared with the previous design that focused on theory and pre-training, this work pays more attention to the feasibility of implementation, marking its further maturity in the full-cycle training architecture.
Pluralis completed a $7.6 million seed round in 2025, led by Union Square Ventures (USV) and CoinFund. Founder Alexander Long holds a PhD in machine learning with a background spanning mathematics and systems research, and the core members are all machine-learning researchers with doctoral backgrounds. It is a typical technology-driven project, publishing primarily through high-density papers and technical blogs; it currently has no BD/growth team and is focused on cracking the infrastructure problems of low-bandwidth asynchronous model parallelism.
Gensyn: Decentralized training protocol layer driven by verifiable execution
Gensyn is a Web3 AI project focused on "trusted execution of deep learning training tasks". Its core is not to reinvent model architectures or training paradigms, but to build a verifiable distributed training execution network covering the full pipeline of "task distribution + training execution + result verification + fair incentives". Through an off-chain training + on-chain verification design, Gensyn establishes an efficient, open, and incentivized global training market, making "training is mining" a reality.
Gensyn's concern is not "how to train" but the infrastructure of "who trains, how results are verified, and how profits are shared". In essence it is a verifiable computing protocol for training tasks, solving three main problems:
Who will perform the training tasks (computing power distribution and dynamic matching)
How to verify the execution results (no need to recalculate, only verify the disputed operators)
How to distribute the training income (Stake, Slashing and multi-role game mechanism)
RL Swarm: Collaborative Reinforcement Learning Training System
RL Swarm, Gensyn's first product, is a decentralized multi-model collaborative optimization system for the post-training stage, with the following core features:
Distributed Reasoning and Learning Process:
Answering phase: each node produces an answer independently;
Critique phase: nodes critique each other's outputs and select the best answer and reasoning;
Resolving phase: each node predicts the majority preference and revises its own answer accordingly, performing a local weight update.
Each node in RL Swarm runs an independent model and trains locally without gradient synchronization, so the system naturally adapts to heterogeneous compute and unstable networks and supports elastic node entry and exit. The mechanism draws on RLHF and multi-agent game ideas, but is closer to the dynamic evolution of a collaborative reasoning network: nodes are rewarded according to how well they match the group-consensus result, driving continuous improvement and convergent learning of reasoning ability. RL Swarm significantly improves model robustness and generalization in open networks, and has been deployed as a core execution module in Gensyn's Testnet Phase 0, built on an Ethereum Rollup.
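The three phases can be sketched as a toy loop. The "models", voting rule, and rewards below are stand-ins: the real system runs an LLM per node and updates weights locally, while here tiny functions answer a numeric question and consensus is a simple majority.

```python
# Toy sketch of the RL Swarm phases: answer, critique, resolve.
from collections import Counter

def answering(nodes, question):
    # Answering phase: every node produces its answer independently.
    return {name: model(question) for name, model in nodes.items()}

def critique(answers):
    # Critique phase (toy rule): the answer most nodes produced wins.
    return Counter(answers.values()).most_common(1)[0][0]

def resolving(answers, consensus):
    # Resolving phase: reward equals agreement with the group consensus;
    # in the real system each node would also update its weights locally.
    return {n: 1 if a == consensus else 0 for n, a in answers.items()}

# Stand-in "models": tiny functions instead of per-node LLMs.
nodes = {"A": lambda q: q * 2, "B": lambda q: q * 2, "C": lambda q: q + 1}
answers = answering(nodes, 3)
consensus = critique(answers)
rewards = resolving(answers, consensus)
print(consensus, rewards)  # the majority answer wins; C earns no reward
```

Rewarding agreement with consensus is what drives the "convergent learning" the text describes, though it also explains why the mechanism resembles a game more than classic RLHF.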
Verde + Proof-of-Learning: Trusted Verification Mechanism
Gensyn's Verde module combines three mechanisms:
Proof-of-Learning: Determine whether the training actually occurred based on the gradient trajectory and training metadata;
Graph-Based Pinpoint: Locate the divergent nodes in the training calculation graph, and only need to recalculate the specific operations;
Refereed Delegation: Adopts an arbitration verification mechanism, in which the verifier and challenger raise disputes and verify locally, greatly reducing the verification cost.
Compared to ZKP or full recalculation verification schemes, the Verde scheme achieves a better balance between verifiability and efficiency.
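The pinpointing idea behind Graph-Based Pinpoint and refereed delegation can be sketched with a 1-D trace: solver and challenger each commit a full sequence of intermediate states, bisection finds a step where they go from agreement to disagreement, and the referee re-executes only that single step. The transition rule and traces below are toy values, not Gensyn's actual verification code.

```python
# Hedged sketch of dispute pinpointing: O(log n) comparisons locate the
# one step the referee must recompute, instead of replaying the whole run.

def locate_disputed_step(trace_a, trace_b):
    # Invariant: traces agree at index lo and disagree at index hi.
    lo, hi = 0, len(trace_a) - 1
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if trace_a[mid] == trace_b[mid]:
            lo = mid
        else:
            hi = mid
    return hi  # first step the referee must recompute

step = lambda state, i: state + i              # toy transition function
honest = [0]
for i in range(1, 17):
    honest.append(step(honest[-1], i))
cheater = honest[:11] + [s + 7 for s in honest[11:]]   # diverges at step 11

disputed = locate_disputed_step(honest, cheater)
# The referee re-executes just the disputed step from the last agreed state:
referee = step(honest[disputed - 1], disputed)
print(disputed, referee == honest[disputed], referee == cheater[disputed])
```

This is the efficiency argument in the text: verification cost scales with the logarithm of the trace length plus one re-executed operation, rather than with full recomputation.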
SkipPipe: Communication fault-tolerant optimization mechanism
SkipPipe is designed to solve the communication bottleneck problem in the "low bandwidth + node disconnection" scenario. Its core capabilities include:
Skip ratio: bottlenecked nodes are skipped to avoid stalling training;
Dynamic scheduling: the optimal execution path is generated in real time;
Fault-tolerant execution: even with 50% of nodes failing, inference accuracy drops only about 7%.
SkipPipe supports up to a 55% improvement in training throughput and enables key capabilities such as early-exit inference, seamless reordering, and inference completion.
HDEE: Cross-domain heterogeneous expert clusters
The HDEE (Heterogeneous Domain-Expert Ensembles) module is committed to optimizing the following scenarios:
Multi-domain, multi-modal, and multi-task training;
Uneven distribution of various types of training data and large differences in difficulty;
Task allocation and scheduling issues in an environment with heterogeneous device computing capabilities and inconsistent communication bandwidth.
Its core features:
MHe-IHo: Assign models of different sizes to tasks of different difficulty levels (heterogeneous models, consistent training step size);
MHo-IHe: Uniform task difficulty, but asynchronous adjustment of training step size;
Support heterogeneous expert models + pluggable training strategies to improve adaptability and fault tolerance;
Emphasis on "parallel collaboration + extremely low communication + dynamic expert allocation", suitable for complex task ecology in reality.
Multi-role game mechanism: trust and incentives in parallel
The Gensyn network introduces four types of participants:
Submitter: publish training tasks, set structure and budget;
Solver: execute training tasks and submit results;
Verifier: verify training behavior to ensure its compliance and effectiveness;
Whistleblower: challenge the verifier to obtain arbitration rewards or bear penalties.
This mechanism is inspired by Truebit's economic game design: by forcibly injecting errors and randomly arbitrating disputes, it makes honest collaboration the profitable strategy and ensures reliable operation of the network.
Gensyn was co-founded by Ben Fielding and Harry Grieve and is headquartered in London, UK. In May 2023, Gensyn announced the completion of a $43 million Series A round led by a16z crypto, with other investors including CoinFund, Canonical, Ethereal Ventures, Factor, and Eden Block. The team background combines distributed systems and machine learning engineering experience, and has long been committed to building a verifiable, trustless, large-scale AI training execution network.
Nous Research is one of the few decentralized training teams that has both philosophical height and engineering realization. Its core vision stems from the concept of "Desideratic AI": AI is regarded as an intelligent subject with subjectivity and evolutionary ability, rather than a simple controllable tool. The uniqueness of Nous Research lies in that it does not optimize AI training as an "efficiency problem", but regards it as the formation process of a "cognitive subject". Driven by this vision, Nous focuses on building an open training network that is collaboratively trained by heterogeneous nodes, does not require central scheduling, and is censorship-resistant, and is systematically implemented through a full-stack tool chain.
Nous did not invest too much in incentive design or protocol economics, but tried to change the philosophical premise of training itself:
Opposing "alignmentism": Nous disagrees with alignment-style training that takes human control as its sole goal, advocating instead that training should encourage models to form independent cognitive styles;
Emphasis on model subjectivity: It is believed that the basic model should retain uncertainty, diversity and hallucination generation ability (hallucination as virtue);
Model training is cognitive formation: the model is not "optimizing task completion", but an individual participating in the cognitive evolution process.
Although this training view is "romantic", it reflects the core logic of Nous' design of training infrastructure: how to allow heterogeneous models to evolve in an open network rather than being uniformly disciplined.
Nous' most critical contribution to decentralized training is the construction of the Psyche network and its underlying communication optimizer DisTrO (Distributed Training Over-the-Internet), which together form the execution hub for training tasks. DisTrO + Psyche provide several core capabilities:
Communication compression: DCT plus 1-bit sign encoding sharply reduces bandwidth requirements;
Node adaptability: support for heterogeneous GPUs, reconnection after dropout, and voluntary exit;
Asynchronous fault tolerance: training continues without synchronization, with high fault tolerance;
Decentralized scheduling: no central coordinator is required; consensus and task distribution are achieved via blockchain.
This architecture provides a realistic, feasible technical foundation for a low-cost, highly resilient, verifiable open training network.
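The "1-bit sign encoding" mentioned above belongs to the signSGD family of compression techniques, which can be sketched generically: only the sign of each (residual-corrected) gradient entry is transmitted, cutting per-entry bandwidth from 32 bits to 1, with an error-feedback residual preserving the dropped magnitude information. The DCT stage is omitted here; this is a generic illustration, not DisTrO's algorithm.

```python
# Hedged sketch of 1-bit sign compression with error feedback: transmit
# only signs; carry the quantization error forward so it is not lost.

def sign_compress(grad, residual, scale=0.1):
    corrected = [g + r for g, r in zip(grad, residual)]
    signs = [1 if c >= 0 else -1 for c in corrected]         # 1 bit per entry
    sent = [scale * s for s in signs]                        # decoded on receipt
    new_residual = [c - s for c, s in zip(corrected, sent)]  # error feedback
    return signs, new_residual

grad = [0.25, -0.04, 0.0, -0.3]
residual = [0.0] * 4
signs, residual = sign_compress(grad, residual)
print(signs, [round(r, 2) for r in residual])
```

Because the residual is added back on the next step, information dropped in one round is recovered over time, which is what makes such aggressive compression viable for training.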
This architectural design emphasizes practical feasibility: it does not rely on central servers, adapts to global volunteer nodes, and makes training results traceable on-chain.
3. Reasoning and agent systems: Hermes / Forge / TEE_HEE
Beyond decentralized training infrastructure, Nous Research has also run several exploratory system experiments around the concept of "AI subjectivity":
Hermes open source model series: Hermes 1 to 3 are representative open source large models launched by Nous, based on LLaMA 3.1 training, covering three parameter scales of 8B, 70B and 405B. This series aims to embody the "de-instruction, retain diversity" training concept advocated by Nous, and shows stronger expressiveness and generalization capabilities in long context retention, role-playing, multi-round dialogue, etc.
Forge Reasoning API: Multi-modal reasoning system Forge is a reasoning framework developed by Nous, combining three complementary mechanisms to achieve more flexible and creative reasoning capabilities: MCTS (Monte Carlo Tree Search): Strategy search for complex tasks; CoC (Chain of Code): Introducing a combined path of code chain and logical reasoning; MoA (Mixture of Agents): Allowing multiple models to negotiate and improve the breadth and diversity of output. The system emphasizes "non-deterministic reasoning" and combinatorial generation paths, which is a powerful response to the traditional instruction alignment paradigm.
TEE_HEE: AI autonomous agent experiment: TEE_HEE is Nous's cutting-edge exploration in the direction of autonomous agents, aiming to verify whether AI can run independently in a trusted execution environment (TEE) and have a unique digital identity. The agent has its own Twitter and Ethereum accounts, and all control permissions are managed by a remotely verifiable enclave, and developers cannot interfere with its behavior. The goal of the experiment is to build an AI subject with "immutability" and "independent behavioral intentions", taking an important step towards building an autonomous intelligent body.
AI behavior simulator platform: Nous has also developed several simulators, including WorldSim, Doomscroll, and Gods & S8n, to study how AI behavior and values evolve in multi-role social environments. Although not directly part of the training process, these experiments lay the semantic groundwork for modeling the cognitive behavior of long-term autonomous AI.

4. Team and Financing Overview

Nous Research was founded in 2023 by Jeffrey Quesnelle (CEO), Karan Malhotra, Teknium, Shivani Mitra, and others. The team pairs a philosophy-driven ethos with systems engineering, with backgrounds spanning machine learning, systems security, and decentralized networks. In 2024 it raised $5.2 million in seed financing; in April 2025 it closed a $50 million Series A led by Paradigm at a $1 billion valuation, making it one of the Web3 AI unicorns.
Flock: a blockchain-enhanced federated learning network

Flock.io is a blockchain-based federated learning platform that aims to decentralize the data, computing, and model layers of AI training. FLock favors an integrated "federated learning + blockchain reward layer" framework: essentially an on-chain evolution of the traditional FL architecture rather than a systematic attempt at a new training protocol. Compared with decentralized training projects such as Gensyn, Prime Intellect, Nous Research, and Pluralis, Flock focuses on privacy protection and usability rather than theoretical breakthroughs in communication, verification, or training methods; its real points of comparison are federated learning systems such as Flower, FedML, and OpenFL.

1. The core mechanism of Flock.io

Federated learning architecture: emphasizing data sovereignty and privacy protection. Flock builds on the classic Federated Learning (FL) paradigm, allowing multiple data owners to collaboratively train a unified model without sharing raw data, focusing on data sovereignty, security, and trust. The core process includes:
- Local training: each participant (Proposer) trains the model on its local device without uploading raw data;
- On-chain aggregation: after training, local weight updates are submitted and aggregated into a global model by on-chain Miners;
- Committee evaluation: Voter nodes, randomly elected via VRF, evaluate the aggregated model on an independent test set and score it;
- Incentives and penalties: rewards or slashing of staked collateral are executed according to the scores, achieving resistance to malicious behavior and dynamic trust maintenance.
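The Proposer / Miner / Voter loop described above can be sketched in a few lines of plain Python. This is a minimal FedAvg-style illustration with a one-parameter least-squares model; all names are hypothetical, not Flock's API, and the on-chain roles are reduced to local function calls:

```python
def local_train(w, data, lr=0.1):
    """Proposer step: one pass of gradient descent on a 1-D
    least-squares objective, without sharing the raw data."""
    for x, y in data:
        w -= lr * 2 * (w * x - y) * x
    return w

def aggregate(updates):
    """Miner step: FedAvg-style averaging of submitted weights."""
    return sum(updates) / len(updates)

def evaluate(w, test_set):
    """Voter step: score the aggregated model on an independent test set."""
    return -sum((w * x - y) ** 2 for x, y in test_set) / len(test_set)

# Hypothetical private datasets: each party holds samples of y = 2x.
parties = [[(1.0, 2.0), (2.0, 4.0)], [(2.0, 4.0), (3.0, 6.0)]]
global_w = 0.0
for _ in range(20):  # communication rounds
    updates = [local_train(global_w, d) for d in parties]
    global_w = aggregate(updates)

score = evaluate(global_w, [(4.0, 8.0)])  # higher (closer to 0) is better
```

The global weight converges to the true slope 2 even though no party ever reveals its samples; in Flock's design the `aggregate` and `evaluate` steps would additionally be executed and scored on-chain.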
Blockchain integration: trustless system coordination. Flock puts the core links of the training process (task allocation, model submission, evaluation and scoring, incentive execution) on-chain to achieve transparency, verifiability, and censorship resistance. The main mechanisms include:
- VRF random election: improves the fairness and manipulation resistance of Proposer and Voter rotation;
- Stake-based mechanism (PoS): constrains node behavior through token staking and slashing to improve system robustness;
- Automatic on-chain incentives: smart contracts bind reward distribution and slashing penalties to task completion and evaluation results, building a collaborative network that needs no trusted intermediaries.
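A hash-based sortition loop gives the flavor of VRF-style election. This is only an illustrative sketch: a real VRF additionally binds the output to a node's secret key and yields a publicly checkable proof, which a plain hash cannot provide.

```python
import hashlib

def sortition_score(seed, node_id):
    """Stand-in for a VRF output: deterministic and uniform-looking,
    recomputable by anyone who knows the round seed and node id.
    (A real VRF also produces a proof tied to the node's secret key.)"""
    digest = hashlib.sha256(f"{seed}:{node_id}".encode()).hexdigest()
    return int(digest, 16)

def elect(seed, nodes, committee_size):
    """Select the committee with the lowest scores for this round's seed."""
    ranked = sorted(nodes, key=lambda n: sortition_score(seed, n))
    return ranked[:committee_size]

nodes = [f"node-{i}" for i in range(10)]
voters = elect("round-42", nodes, committee_size=3)
```

Because the score depends only on the public seed and node id, every participant can verify that the same committee was elected, which is the property the rotation mechanism relies on for fairness and manipulation resistance.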
zkFL: a zero-knowledge aggregation mechanism for privacy protection. Flock introduces zkFL, which lets Proposers submit zero-knowledge proofs of their local updates so that Voters can verify correctness without accessing the raw gradients. This improves the credibility of the training process while preserving privacy, and represents an important step in combining privacy protection with verifiability in federated learning.
2. Flock's core product components

AI Arena: Flock.io's decentralized training platform. Users can join model tasks through train.flock.io as trainers, validators, or delegators, earning rewards by submitting models, evaluating performance, or delegating tokens. Tasks are currently released officially, with community co-creation to be opened gradually.
FL Alliance: Flock's federated learning client, which lets participants further fine-tune models with private data. VRF election, staking, and slashing mechanisms safeguard the honesty and efficiency of training; it is the key link between initial community training and real deployment.
AI Marketplace: a model co-creation and deployment platform where users can propose models, contribute data, and call model services. It supports database access and RAG-enhanced reasoning, promoting the adoption and circulation of AI models in practical scenarios.
3. Team and Financing Overview

Flock.io was founded by Sun Jiahao and has issued the platform token FLOCK. The project has raised a total of $11 million from investors including DCG, Lightspeed Faction, Tagus Capital, Animoca Brands, Fenbushi, and OKX Ventures. In March 2024, Flock completed a $6 million seed round to launch its testnet and federated learning client; in December of the same year it raised a further $3 million and received funding from the Ethereum Foundation to research blockchain-driven AI incentive mechanisms. To date, the platform has created 6,428 models and connects 176 training nodes, 236 validation nodes, and 1,178 delegators.
Compared with decentralized training projects, federated learning systems such as Flock hold advantages in training efficiency, scalability, and privacy protection, and are especially suited to collaborative training of small and medium-sized models. Their approach is pragmatic and easy to implement, favoring engineering-level feasibility, whereas projects such as Gensyn and Pluralis pursue deeper theoretical breakthroughs in training methods and communication mechanisms; their system challenges are greater, but they come closer to a truly "trustless, decentralized" training paradigm.
EXO: a decentralized training attempt for edge computing

EXO is a representative AI project for edge computing scenarios, dedicated to lightweight AI training, inference, and agent applications on consumer-grade home devices. Its decentralized training path emphasizes "low communication overhead + local autonomous execution", adopting the DiLoCo asynchronous delayed-synchronization algorithm and the SPARTA sparse parameter exchange mechanism to sharply reduce the bandwidth required for multi-device collaborative training. At the system level, EXO has not built an on-chain network or introduced an economic incentive mechanism; instead it has released EXO Gym, a single-machine multi-process simulation framework that lets researchers quickly verify and experiment with distributed training methods in a local environment.

1. Overview of core mechanisms
- DiLoCo asynchronous training: nodes synchronize every H steps, adapting to unstable networks;
- SPARTA sparse synchronization: only a tiny fraction of parameters (e.g. 0.1%) is exchanged at each step, keeping the models coupled while cutting bandwidth;
- Asynchronous combined optimization: the two can be used together for a better communication/performance trade-off;
- evML verification exploration: Edge-Verified Machine Learning (evML) proposes low-cost computation verification via TEE / secure contexts, using remote attestation plus spot checks to let edge devices participate credibly without staking — an engineering compromise between economic security and privacy protection.

2. Tools and scenario applications
- EXO Gym: simulates multi-node training environments on a single device and supports communication-strategy experiments on models such as NanoGPT, CNNs, and diffusion models;
- EXO Desktop App: a desktop AI tool for individual users, supporting privacy-friendly personalized features such as running large models locally, iPhone mirror control, and private context integration (SMS, calendar, video recordings).

EXO Gym is best seen as an exploration-oriented decentralized training experiment that mainly integrates existing communication-compression techniques (DiLoCo and SPARTA) into a lightweight training path. Compared with projects such as Gensyn, Nous, and Pluralis, EXO has not yet entered the core stages of on-chain collaboration, verifiable incentive mechanisms, or real distributed network deployment.
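A toy simulation, in the spirit of what EXO Gym enables, shows how the two mechanisms compose: SPARTA averages a tiny random slice of parameters every step, while DiLoCo performs a full synchronization only every H steps. The local "training" here is just a random perturbation, and all sizes are hypothetical:

```python
import random

def local_step(params, rng):
    """Stand-in for one local SGD step (random drift for illustration)."""
    return [p + rng.uniform(-0.01, 0.01) for p in params]

def sparta_exchange(replicas, frac, rng):
    """SPARTA: average only a small random fraction of indices each step."""
    n = len(replicas[0])
    for i in rng.sample(range(n), max(1, int(n * frac))):
        avg = sum(r[i] for r in replicas) / len(replicas)
        for r in replicas:
            r[i] = avg
    return replicas

def diloco_sync(replicas):
    """DiLoCo outer step: full averaging, performed only every H steps."""
    n = len(replicas[0])
    avg = [sum(r[i] for r in replicas) / len(replicas) for i in range(n)]
    return [avg[:] for _ in replicas]

rng = random.Random(0)
H, FRAC, N_PARAMS = 5, 0.001, 2000   # hypothetical sizes
replicas = [[0.0] * N_PARAMS for _ in range(4)]  # 4 simulated devices
for step in range(1, 21):
    replicas = [local_step(r, rng) for r in replicas]
    replicas = sparta_exchange(replicas, FRAC, rng)  # cheap, every step
    if step % H == 0:
        replicas = diloco_sync(replicas)             # full sync, every H

drift = max(abs(a - b) for a, b in zip(replicas[0], replicas[1]))
```

Between full syncs the replicas drift apart while exchanging only ~0.1% of parameters per step; after each DiLoCo round the drift collapses to zero, which is the compromise between bandwidth and consistency the text describes.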
Faced with the core challenges of decentralized training, such as device heterogeneity, communication bottlenecks, coordination difficulties, and lack of trusted execution, Gensyn, Prime Intellect, Pluralis, and Nous Research have proposed differentiated system architecture paths. From the perspectives of training methods and communication mechanisms, these four projects have demonstrated their unique technical focus and engineering implementation logic.
In terms of training method optimization, the four have explored key dimensions such as collaborative strategies, update mechanisms, and asynchronous control, covering different stages from pre-training to post-training.
Prime Intellect's PRIME-RL is an asynchronous scheduling structure for the pre-training stage. Through a "local training + periodic synchronization" strategy, it achieves an efficient, verifiable training-scheduling mechanism in heterogeneous environments, with strong versatility and flexibility. Its theoretical innovation is high, proposing a clear paradigm for training control structures; engineering difficulty is medium-to-high, with demanding requirements on the underlying communication and control modules.
The DeMo optimizer from Nous Research targets training stability in asynchronous low-bandwidth environments, achieving a highly fault-tolerant gradient update process on heterogeneous GPUs. It is one of the few solutions to unify theory and engineering in an "asynchronous communication-compression closed loop". Its theoretical innovation is very high, especially in the joint compression-and-scheduling path; engineering difficulty is also very high, hinging on the coordination precision of asynchronous parallelism.
Pluralis's SWARM + NAG is one of the most systematic and groundbreaking designs in the current asynchronous training path. It is based on the asynchronous model parallel framework, introduces Column-space sparse communication and NAG momentum correction, and constructs a large model training solution that can converge stably under low bandwidth conditions. It has a very high degree of theoretical innovation and is a structural pioneer of asynchronous collaborative training; the engineering difficulty is also very high, requiring deep integration of multi-level synchronization and model segmentation.
Gensyn's RL Swarm mainly serves the post-training stage, focusing on policy fine-tuning and agent collaborative learning. Its training process follows the three-step process of "generation-evaluation-voting", which is particularly suitable for the dynamic adjustment of complex behaviors in multi-agent systems. The theoretical innovation is medium-high, mainly reflected in the agent collaboration logic; the engineering implementation difficulty is moderate, and the main challenge lies in system scheduling and behavior convergence control.
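The generation–evaluation–voting loop described for RL Swarm can be caricatured in a few lines. This is a toy sketch, not Gensyn's protocol: candidates are random integers and the shared scoring function is a fixed distance measure, standing in for policy outputs and learned evaluators:

```python
import random

def generate(agent_rng):
    """Generation: each agent proposes a candidate answer (toy: an int)."""
    return agent_rng.randint(0, 10)

def evaluate(candidate, target=7):
    """Evaluation: agents score candidates (toy: distance to a target)."""
    return -abs(candidate - target)

def swarm_round(n_agents, seed):
    """One generation -> evaluation -> voting round of the swarm."""
    rngs = [random.Random(seed + i) for i in range(n_agents)]
    candidates = [generate(r) for r in rngs]
    # Voting: each agent votes for the best-scoring candidate it sees.
    votes = [max(range(n_agents), key=lambda j: evaluate(candidates[j]))
             for _ in range(n_agents)]
    winner = max(set(votes), key=votes.count)
    return candidates[winner]

best = swarm_round(n_agents=5, seed=0)
```

In a real system the winning trajectory would then be used as a fine-tuning signal for every agent's policy, which is what makes the loop a post-training mechanism rather than a one-shot ensemble.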
At the communication mechanism optimization level, these four projects also have their own targeted layouts, and generally focus on systematic solutions to bandwidth bottlenecks, node heterogeneity and scheduling stability problems.
Prime Intellect's PCCL is a low-level communication library used to replace the traditional NCCL, aiming to provide a more robust collective communication foundation for the upper-level training protocol. The theoretical innovation is medium-high, with certain breakthroughs in fault-tolerant communication algorithms; the engineering difficulty is medium, and it has strong module adaptability.
Nous Research's DisTrO is the core communication module of DeMo, emphasizing the minimum communication overhead under low bandwidth while ensuring the continuity of the training closed loop. The theoretical innovation is high, and it has universal design value in the scheduling coordination structure; the engineering difficulty is high, and it has high requirements for compression accuracy and training synchronization.
Pluralis' communication mechanism is deeply embedded in the SWARM architecture, significantly reducing the communication load in asynchronous training of large models, maintaining efficient throughput while ensuring convergence. It has high theoretical innovation and sets a paradigm for asynchronous model communication design; the engineering difficulty is extremely high, relying on distributed model orchestration and structural sparsity control.
Gensyn's SkipPipe is a fault-tolerant scheduling component for RL Swarm. This solution has low deployment cost and is mainly used to enhance training stability at the engineering landing layer. The theoretical innovation is average, and it is more of an engineering implementation of known mechanisms; the engineering difficulty is relatively low, but it is highly practical in actual deployment.
In addition, we can measure the value of decentralized training projects from two more macro categories: blockchain collaboration layer and AI training layer:
Blockchain collaboration layer: Emphasis on protocol credibility and incentive collaboration logic
Verifiability: whether the training process is verifiable, and whether game-theoretic or cryptographic mechanisms are introduced to establish trust;
Incentive mechanism: Whether a task-driven Token reward/role mechanism is designed;
Openness and entry threshold: whether nodes can join easily, and whether access is centralized or permissioned.
AI training system level: highlight engineering capabilities and performance accessibility
Scheduling and fault tolerance: whether scheduling is fault-tolerant, asynchronous, dynamic, and distributed;
Training method optimization: whether the model training algorithm or structure is optimized;
Communication path optimization: whether gradient compression or sparse communication is used to adapt to low bandwidth.
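Gradient compression of this kind is commonly implemented as top-k sparsification: only the largest-magnitude entries travel over the network as (index, value) pairs, and the receiver rebuilds a dense vector with zeros elsewhere. A minimal sketch:

```python
def topk_sparsify(grad, k):
    """Keep only the k largest-magnitude gradient entries; send
    (index, value) pairs instead of the dense vector to cut bandwidth."""
    idx = sorted(range(len(grad)), key=lambda i: abs(grad[i]), reverse=True)[:k]
    return [(i, grad[i]) for i in sorted(idx)]

def densify(sparse, n):
    """Receiver side: rebuild a dense vector, zeros elsewhere."""
    out = [0.0] * n
    for i, v in sparse:
        out[i] = v
    return out

grad = [0.01, -2.5, 0.0, 3.1, -0.02, 0.4]
msg = topk_sparsify(grad, k=2)      # only 2 of 6 entries travel
recovered = densify(msg, len(grad))
```

Production systems usually pair this with error feedback (accumulating the dropped residual into the next step's gradient) so that the sparsification bias does not hurt convergence.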
The following table systematically evaluates the technical depth, engineering maturity, and theoretical innovation of Gensyn, Prime Intellect, Pluralis, and Nous Research on the decentralized training path based on the above indicator system.
In the complete value chain of decentralized training, projects such as Prime Intellect, Pluralis.ai, Gensyn and Nous Research mainly focus on front-end infrastructure construction such as model pre-training, communication mechanism and collaborative optimization. However, another type of project focuses on post-training fine-tuning and inference delivery, and does not directly participate in systematic training processes such as pre-training, parameter synchronization, or communication optimization. Representative projects include Bagel, Pond, and RPS Labs, all of which are based on the LoRA fine-tuning method, forming a key "post-chain" link in the decentralized training ecosystem.
LoRA (Low-Rank Adaptation) is an efficient parameter fine-tuning method. Its core idea is to insert trainable low-rank matrices into a pre-trained large model to learn new tasks while freezing the original model parameters. This strategy dramatically reduces training cost and resource consumption, improves fine-tuning speed and deployment flexibility, and is particularly suited to Web3 scenarios characterized by modular, composable calls.
Traditional large language models such as LLaMA and GPT-3 often have billions or even hundreds of billions of parameters, and direct fine-tuning is expensive. LoRA achieves efficient adaptation of large models by only training a small number of inserted parameter matrices, becoming one of the most practical mainstream methods at present.
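The mechanics of LoRA fit in a few lines: the frozen weight W is applied as usual, and a rank-r update B·A, scaled by alpha/r, is added on top, so only the small B and A matrices are trained. A toy forward pass with hypothetical tiny dimensions (plain-list matrix math for self-containment):

```python
def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def add(A, B):
    return [[x + y for x, y in zip(r1, r2)] for r1, r2 in zip(A, B)]

def scale(A, s):
    return [[x * s for x in row] for row in A]

# Frozen pretrained weight W (d_out x d_in); hypothetical tiny sizes.
W = [[1.0, 0.0, 0.0],
     [0.0, 1.0, 0.0]]

# LoRA adapters with rank r = 1: only B (d_out x r) and A (r x d_in)
# are trained; W never changes.
B = [[0.5], [1.0]]
A = [[0.1, 0.2, 0.3]]
alpha, r = 2.0, 1

def lora_forward(x):
    """y = W x + (alpha / r) * B A x: low-rank update on the frozen path."""
    return add(matmul(W, x), scale(matmul(B, matmul(A, x)), alpha / r))

y = lora_forward([[1.0], [0.0], [0.0]])
```

For a model with billions of parameters, the trainable footprint is just 2·r·d per adapted matrix, which is why LoRA adapters are cheap to train, store, and swap — the property the composable Web3 use cases above depend on.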
**Direct Preference Optimization (DPO)** is a language model post-training method that has emerged in recent years. It is often used in conjunction with the LoRA fine-tuning mechanism for the model behavior alignment stage. Compared with the traditional RLHF (Reinforcement Learning from Human Feedback) method, DPO achieves preference learning by directly optimizing paired samples, eliminating the complex reward modeling and reinforcement learning process. It has a simpler structure and more stable convergence, which is especially suitable for fine-tuning tasks in lightweight and resource-constrained environments. Due to its high efficiency and ease of use, DPO is gradually becoming the preferred solution for many decentralized AI projects in the model alignment stage.
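The DPO objective itself is compact enough to state directly: for one preference pair it is -log σ(β[(log πθ(y_w) − log π_ref(y_w)) − (log πθ(y_l) − log π_ref(y_l))]), where y_w and y_l are the chosen and rejected answers and π_ref is the frozen reference model. A numeric sketch with illustrative log-probabilities:

```python
import math

def dpo_loss(logp_w, logp_l, ref_logp_w, ref_logp_l, beta=0.1):
    """DPO loss for one preference pair:
    -log sigmoid(beta * [(logp_w - ref_logp_w) - (logp_l - ref_logp_l)])
    logp_* are the policy's log-probs of the chosen/rejected answers,
    ref_logp_* those of the frozen reference model."""
    margin = (logp_w - ref_logp_w) - (logp_l - ref_logp_l)
    return -math.log(1.0 / (1.0 + math.exp(-beta * margin)))

# When the policy prefers the chosen answer more than the reference
# does, the margin is positive and the loss drops below log(2).
loss_good = dpo_loss(-1.0, -3.0, -2.0, -2.0)     # margin = +2
loss_neutral = dpo_loss(-2.0, -2.0, -2.0, -2.0)  # margin = 0
```

No reward model and no sampling loop are needed — just pairs of log-probabilities — which is why DPO is so much lighter than RLHF in resource-constrained settings.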
Reinforcement Learning (RL): The Future Evolution of Post-training Fine-tuning
From a long-term perspective, more and more projects regard reinforcement learning (RL) as a core path with more adaptability and evolutionary potential in decentralized training. Compared with supervised learning or parameter fine-tuning mechanisms that rely on static data, RL emphasizes continuous optimization of strategies in a dynamic environment, which naturally fits the asynchronous, heterogeneous and incentive-driven collaboration pattern in the Web3 network. Through continuous interaction with the environment, RL can achieve a highly personalized and continuous incremental learning process, providing an evolvable "behavioral intelligence" infrastructure for the construction of agent networks, on-chain task markets and smart economies.
This paradigm is not only highly consistent with the spirit of decentralization in concept, but also has significant system advantages. However, limited by the high engineering threshold and complex scheduling mechanism, RL still faces great challenges in its implementation at the current stage, and it is difficult to promote it widely in the short term.
It is worth noting that Prime Intellect's PRIME-RL and Gensyn's RL Swarm are driving the evolution of RL from a post-training fine-tuning mechanism to a pre-training main structure, attempting to build a collaborative training system centered on RL and without trust coordination.
Bagel is based on the LoRA fine-tuning mechanism and introduces zero-knowledge proof (ZK) technology to solve the credibility and privacy protection problems in the process of "on-chain model fine-tuning". zkLoRA does not participate in the actual training calculation, but provides a lightweight and verifiable mechanism that allows external users to confirm that a fine-tuned model is indeed derived from a specified base model and LoRA parameters without accessing the original data or weights.
Unlike Gensyn's Verde or Prime Intellect's TOPLOC, which focus on dynamic verification of "whether the behavior actually occurred" during the training process, Bagel focuses more on static verification of "whether the fine-tuning results are credible". The biggest advantage of zkLoRA is its low verification resource consumption and strong privacy protection, but its application scope is usually limited to fine-tuning tasks with small parameter changes.
Pond is the only decentralized training project in the industry that focuses on graph neural network (GNN) fine-tuning, serving structured data applications such as knowledge graphs, social networks, and transaction graphs. It provides a lightweight and controllable training and reasoning platform for personalized tasks by supporting users to upload graph structure data and participate in model training feedback.
Pond also uses efficient fine-tuning mechanisms such as LoRA. Its core goal is to realize a modular and deployable intelligent agent system on the GNN architecture, opening up a new exploration path of "small model fine-tuning + multi-agent collaboration" in a decentralized context.
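For intuition, one message-passing step of a GNN can be reduced to neighborhood averaging. This is a toy untrained layer (scalar features, a hypothetical 4-node graph); real GNN fine-tuning of the kind Pond targets would learn weight matrices on top of this aggregation:

```python
def gnn_layer(features, edges):
    """One message-passing step: each node's new feature is the mean
    of its neighbors' features plus its own (toy, untrained)."""
    n = len(features)
    neighbors = {i: [] for i in range(n)}
    for a, b in edges:
        neighbors[a].append(b)
        neighbors[b].append(a)
    out = []
    for i in range(n):
        msgs = [features[j] for j in neighbors[i]] + [features[i]]
        out.append(sum(msgs) / len(msgs))
    return out

# Hypothetical 4-node path graph (e.g. a tiny social network).
feats = [1.0, 0.0, 0.0, 1.0]
edges = [(0, 1), (1, 2), (2, 3)]
smoothed = gnn_layer(feats, edges)
```

Stacking such layers propagates information along graph structure, which is what makes GNNs suited to knowledge graphs, social networks, and transaction graphs rather than plain sequences.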
RPS Labs is a decentralized training project based on the Transformer architecture, dedicated to using fine-tuned AI models for DeFi liquidity management, mainly deployed in the Solana ecosystem. Its flagship product UltraLiquid is an active market-making engine that uses fine-tuned models to dynamically adjust liquidity parameters, reduce slippage, increase depth, and optimize token issuance and trading experience.
In addition, RPS also launched the UltraLP tool to support liquidity providers to optimize their fund allocation strategies on DEX in real time, thereby improving capital efficiency and reducing the risk of impermanent loss, reflecting the practical value of AI fine-tuning in financial scenarios.
In the complete ecological map of decentralized training, the field divides into two categories: the front-chain engine, corresponding to the model pre-training stage, and the post-chain ecology, corresponding to model fine-tuning and deployment, together forming a closed loop from infrastructure to application.
The front-chain engine focuses on the construction of the underlying protocol for model pre-training, represented by projects such as Prime Intellect, Nous Research, Pluralis.ai, and Gensyn. They are committed to creating a system architecture with asynchronous updates, sparse communication and training verifiability, achieving efficient and reliable distributed training capabilities in a trustless network environment, and forming the technical foundation of decentralized training.
At the same time, Flock, as a representative of the middle layer, through the federated learning path, integrates model aggregation, on-chain verification and multi-party incentive mechanisms, and establishes a feasible and collaborative bridge between training and deployment, providing a practical paradigm for multi-node collaborative learning.
The post-chain ecology focuses on model fine-tuning and application layer deployment. Projects such as Pond, Bagel and RPS Labs revolve around the LoRA fine-tuning method: Bagel provides an on-chain trusted verification mechanism, Pond focuses on the evolution of small models of graph neural networks, and RPS applies the fine-tuning model to smart market making in DeFi scenarios. Through components such as reasoning API and Agent SDK, they provide developers and end users with low-threshold, composable model calls and personalized customization solutions, and are an important entry point for the implementation of decentralized AI.
We believe that decentralized training is not only a natural extension of the blockchain spirit in the AI era, but also the prototype of the infrastructure of a global collaborative intelligent productivity system. In the future, when we look back on this challenging journey, we will still encourage each other with that original intention: decentralization is not just a means, it is value itself.