Author: Zack Pokorny; Translation: Chopper, Foresight News
The implementation of AI agents on blockchain has not been smooth. Although blockchains are programmable and permissionless, they lack a semantic abstraction and coordination layer suited to agents. A research report released by the crypto research firm Galaxy points out that agents face four major structural frictions on-chain: opportunity discovery, trust verification, data access, and execution workflows. The existing infrastructure is still designed around human interaction, making it poorly suited to AI's autonomous asset management and strategy execution. These have become the core bottlenecks for the large-scale deployment of agents on blockchain. The following is a full translation of the report:
The application scenarios and capabilities of AI agents have begun to evolve. They are beginning to perform tasks autonomously and are being developed to hold and allocate capital and to discover trading and yield strategies. Although this experimental shift is still in its very early stages, it is quite different from the previous development model, in which agents served mainly as social and analytical tools.
Blockchain is becoming a natural testing ground for this evolution. Blockchains are permissionless and composable, have open-source application ecosystems, give all participants equal access to data, and make every on-chain asset programmable by default. This raises a structural question: if blockchain is programmable and permissionless, why do autonomous agents still face friction? The answer lies not in the feasibility of execution, but in the semantic and coordination burdens layered on top of it. Blockchain guarantees the correctness of state transitions but typically provides no protocol-native abstractions for economic interpretation, canonical identity, or goal-level coordination. Some friction stems from the architectural trade-offs of permissionless systems, while some reflects the current state of tooling, content curation, and market infrastructure. In practice, many higher-level functions still rely on software and workflows whose construction requires human intervention.

Blockchain Architecture and AI Agents

Blockchain design revolves around consensus and deterministic execution, not semantic interpretation. It exposes low-level primitives such as storage slots, event logs, and call traces rather than standardized economic objects. Abstract concepts such as positions, yields, health factors, and liquidity depth therefore typically have to be reconstructed off-chain by indexers, analytics layers, front-end interfaces, and application programming interfaces (APIs), which transform protocol-specific state into more usable forms. Many mainstream decentralized finance (DeFi) workflows, especially those aimed at retail users or involving subjective judgment, still revolve around a user-interface-centric model in which users interact manually and sign individual transactions.
This user-centric model has scaled with the growth of retail participation, even though a large share of on-chain activity is now machine-driven. The dominant retail interaction pattern remains: Intent → User Interface → Transaction → Confirmation. Programmatic operation follows a different path but has its own limits: developers select the contract and asset set at build time and then run the algorithm within that fixed scope. Neither model can accommodate systems that must dynamically discover, evaluate, and compose operations at runtime against constantly changing objectives. Friction arises when infrastructure optimized for transaction verification is used by systems that must simultaneously interpret economic conditions, assess trust, and optimize behavior around specific goals. This gap stems partly from the permissionless and heterogeneous design of blockchains, and partly from the fact that interaction tooling is still built around human review and front-end intermediaries.

A Comparison of Intelligent Agent Behavior Flows and Traditional Algorithmic Strategies

Before discussing the gap between blockchain infrastructure and agent systems, it is worth clarifying what actually distinguishes more intelligent, autonomous behavior flows from traditional on-chain algorithmic systems. The difference is not the degree of automation, complexity, parameterization, or even dynamic adaptivity. Traditional algorithmic systems can be highly parameterized, automatically discover new contracts and tokens, allocate funds across strategy types, and rebalance based on performance. The real difference lies in a system's ability to handle scenarios not anticipated during development. Traditional algorithmic systems, however complex, only execute predefined logic against predefined patterns.
They require predefined interface resolvers for each protocol type, predefined evaluation logic mapping contract state to economic meaning, explicit trust and legitimacy rules, and hard-coded logic for every decision branch. When a situation deviates from the predefined pattern, the system either skips it or fails outright. It cannot reason about unfamiliar scenarios; it can only determine whether the current scenario matches a known template. Like the "Digesting Duck," the eighteenth-century mechanical automaton, it can mimic lifelike behavior, but every action is pre-programmed. A traditional algorithm scanning DeFi lending markets can identify newly deployed contracts that emit familiar events or match known factory patterns. But if a new lending primitive appears with an unfamiliar interface, the system cannot evaluate it. A human must examine the contract, understand its operation, determine whether it represents an exploitable opportunity, and write the integration logic; only then can the algorithm interact with it. Humans interpret; algorithms execute. Model-based agent systems shift this boundary. Through learned reasoning, they can interpret vague or incompletely expressed goals. An instruction such as "maximize returns while avoiding excessive risk" requires semantic interpretation: What counts as excessive risk? How should return and risk be weighed? A traditional algorithm needs these conditions precisely predefined; an agent can interpret intent, exercise judgment, and refine its understanding from feedback. Agents can also generalize to unfamiliar interfaces. An agent can read unfamiliar contract code, parse documentation, or inspect an application binary interface it has never seen before, and infer the system's economic function. It does not need a pre-built parser for every protocol type.
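The template-matching limitation above can be sketched as a dispatch table. Everything here is hypothetical — the protocol labels, state fields, and resolver logic are invented for illustration — but the shape is the point: anything outside the table is skipped, not interpreted.

```python
# Hypothetical build-time resolvers mapping known protocol templates to
# economic objects. Field names (liquidityRate, reserve0, ...) are
# illustrative, not any specific protocol's real storage layout.
KNOWN_RESOLVERS = {
    "LendingPoolV1": lambda s: {"kind": "lending", "supply_rate": s["liquidityRate"] / 1e27},
    "ConstantProductPair": lambda s: {"kind": "amm", "reserves": (s["reserve0"], s["reserve1"])},
}

def interpret(template: str, raw_state: dict):
    """A traditional algorithm can only dispatch on templates fixed at
    build time; an unfamiliar interface yields no interpretation at all."""
    resolver = KNOWN_RESOLVERS.get(template)
    if resolver is None:
        return None  # unknown pattern: skip entirely, no reasoning
    return resolver(raw_state)
```

A model-based agent replaces the `None` branch with an attempt at inference; a traditional system has nothing to put there.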
Although this capability is imperfect and the agent may misjudge what it sees, it can at least attempt to interact with systems not anticipated at build time. It can also reason under uncertainty about trust and legitimacy. When trust signals are ambiguous or incomplete, the underlying model can weigh them probabilistically rather than applying binary rules. Is this smart contract canonical? Given the available evidence, is this token legitimate? Traditional algorithms either have a rule for the question or simply fail; agents can reason about confidence levels. They can interpret errors and adjust. When something unexpected happens, an agent can hypothesize the root cause and decide how to respond, whereas a traditional algorithm simply runs its anomaly-detection module and forwards the exception without interpretation. These capabilities exist today but are imperfect. The underlying models can hallucinate, misread content, and make confidently wrong decisions. In adversarial, capital-intensive environments (i.e., where code controls or receives assets), "attempting to interact with unforeseen systems" can mean financial loss. The core argument of this report is not that agents can already perform these functions reliably, but that they can attempt them in ways traditional systems cannot, and that future infrastructure will make those attempts safer and more reliable. The distinction should be seen as a continuum rather than a hard classification boundary. Some traditional systems incorporate learned reasoning, and some agents rely on hard-coded rules on critical paths. The distinction is directional, not binary: agent systems shift more of the interpretation, evaluation, and adaptation work to runtime reasoning rather than to rules predefined at build time.
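One way to picture the contrast between binary rules and confidence-based reasoning is a weighted-evidence scorer. The signal names, weights, and threshold below are invented for illustration; a real agent would derive such weights from a model rather than a hand-written table.

```python
# Hypothetical trust signals and weights. Partial or missing evidence
# lowers the score instead of triggering a hard binary reject.
SIGNAL_WEIGHTS = {
    "source_code_verified": 0.30,
    "listed_by_aggregator": 0.25,
    "audit_published": 0.25,
    "liquidity_older_than_90d": 0.20,
}

def trust_score(observed: dict) -> float:
    """Combine whatever signals happen to be present into one score."""
    return sum(w for name, w in SIGNAL_WEIGHTS.items() if observed.get(name))

def admit(observed: dict, threshold: float = 0.5) -> bool:
    """A binary whitelist asks 'is it on the list?'; this asks 'does the
    accumulated evidence clear the bar?'"""
    return trust_score(observed) >= threshold
```

The interesting property is graceful degradation: two moderate signals can together clear the threshold even when no single rule fires.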
This matters for the discussion of friction because agent systems attempt what traditional algorithms deliberately avoid. Traditional algorithms avoid discovery friction by having humans curate the contract set at build time; avoid trust friction by relying on operator-maintained whitelists; avoid data friction by using parsers pre-built for known protocols; and avoid execution friction by operating within predefined safety boundaries. Humans complete the semantic, trust, and policy work in advance, and the algorithm executes within those boundaries. Early on-chain agent workflows may follow this model, but the core value of agents lies in shifting discovery, trust, and policy evaluation to runtime reasoning rather than build-time rules. They attempt to discover and evaluate unfamiliar opportunities, reason about legitimacy without hard-coded rules, interpret heterogeneous state without predefined parsers, and enforce policy constraints on potentially ambiguous goals. Friction arises not because agents are doing what algorithms do, only with greater difficulty, but because they are attempting something different in kind: operating in an open, dynamically interpreted action space rather than a closed, pre-integrated one. Structurally, this tension does not stem from a flaw in blockchain consensus but from how the interaction stack around it developed. Blockchain guarantees deterministic state transitions, consensus on state, and eventual finality. It does not attempt to encode economic interpretation, intent verification, or goal tracking at the protocol layer. Those responsibilities have historically fallen to front ends, wallets, indexers, and other off-chain coordination layers that have always required human involvement.
Even for seasoned participants, current mainstream interaction patterns reflect this design. Retail users interpret state through dashboards, select actions through user interfaces, sign transactions through wallets, and informally verify results. Algorithmic trading desks have automated execution but still rely on human operators to curate protocol sets, check for anomalies, and update integration logic when interfaces change. In both cases the protocol is only responsible for correct execution, while intent interpretation, anomaly handling, and adaptation to new opportunities are handled by humans. Agent systems compress or even eliminate this division of labor. They must programmatically reconstruct economically meaningful state, assess progress toward objectives, and verify execution results, rather than merely confirm that transactions landed on-chain. These burdens are especially pronounced on blockchains because agents operate in an open, adversarial, rapidly changing environment in which new contracts, assets, and execution paths can appear without centralized review. Protocols guarantee only that transactions execute correctly, not that economic state is easily interpretable, that contracts are legitimate, that execution paths match user intent, or that relevant opportunities are programmatically discoverable.

The following sections analyze these frictions along the stages of an agent's operating cycle: discovering contracts and opportunities, verifying their legitimacy, acquiring economically meaningful state, and executing operations against a goal. Friction arises because the action space of decentralized finance expands openly in a permissionless environment, while relevance and legitimacy are filtered by humans through social, market, and tooling layers.
New protocols surface through announcements and are further filtered through front-end integrations, token lists, analytics platforms, and liquidity formation. Over time these signals form a workable standard for judging which parts of the action space are economically meaningful and sufficiently credible, although the consensus is informal, uneven, and partly dependent on third-party and human screening. Agents can be fed the filtered data and trust signals this produces, but they lack the intuitive shortcuts humans use to interpret them. From an on-chain perspective, all deployed contracts are equally discoverable. Legitimate protocols, malicious forks, test deployments, and abandoned projects all exist as callable bytecode, and the chain itself does not encode which contracts matter or which are safe. Agents must therefore build their own discovery mechanisms: scanning deployment events, recognizing interface patterns, tracking factory contracts (contracts that programmatically deploy other contracts), and monitoring liquidity formation to decide which contracts enter the decision scope. This is not just about finding contracts, but about deciding whether they belong in the agent's action space. Identifying candidates is only the first step; after initial discovery and screening, contracts must pass the legitimacy and authenticity checks described in the next section before entering the decision space. The friction is not in detecting new deployments — mature algorithmic systems already do this within their own strategy scope. A searcher that monitors Uniswap factory events and automatically adds newly created pools to its strategy set is performing dynamic discovery.
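The factory-tracking pattern can be sketched over decoded logs. The addresses, event name, and argument keys below are placeholders; real discovery would work from raw log topics fetched via an RPC node rather than conveniently pre-decoded records.

```python
from dataclasses import dataclass

@dataclass
class DecodedLog:
    emitter: str   # address that emitted the event
    event: str     # decoded event name
    args: dict     # decoded event arguments

# Operator-seeded set of factory contracts worth watching (hypothetical).
TRACKED_FACTORIES = {"0xFactoryA", "0xFactoryB"}

def discover_pools(logs: list) -> list:
    """Nominate candidate pools from creation events emitted by tracked
    factories. Discovery only produces candidates; legitimacy checks
    happen in a separate verification stage."""
    return [
        log.args["pool"]
        for log in logs
        if log.emitter in TRACKED_FACTORIES and log.event == "PoolCreated"
    ]
```

Note that the filter is exactly the build-time curation the text describes: a contract emitting the same event from an untracked factory never enters the candidate list.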
Friction arises at two higher levels: determining whether discovered contracts are legitimate, and determining whether they are relevant to open-ended objectives rather than simply matching predefined strategy types. The searcher's discovery logic is tightly bound to its strategy: it knows which interface patterns to look for because the strategy defines them. An agent executing a broader instruction, such as "allocate to the best risk-adjusted opportunity," cannot rely solely on strategy-derived filters. It must evaluate newly encountered opportunities against the objective itself, which requires parsing unfamiliar interfaces, inferring economic function, and deciding whether the opportunity belongs in the decision space. This is partly a general autonomy problem, but blockchain sharpens it. Trust friction arises because identity and legitimacy are typically determined outside the protocol, through a combination of screening, governance, documentation, interfaces, and operator judgment. In many current workflows, humans remain a crucial part of that judgment. Blockchain guarantees deterministic execution and finality, but it does not guarantee that a caller is interacting with the contract it intends to. That intent determination is externalized into social context, websites, and human screening. In practice, humans use the credibility of web pages as an informal verification mechanism: they visit official domains (usually found through aggregators such as DeFiLlama or a project's verified social media accounts) and treat the website as the canonical mapping between human concepts and contract addresses. The front-end interface thereby establishes a workable set of trust anchors: which addresses are official, which tokens to use, and which entry points are safe.

The Mechanical Turk of 1770 was a chess-playing machine that appeared to operate autonomously but actually relied on a hidden human operator.
By default, an agent cannot interpret brand logos, authenticate social signals, or infer "officiality" from social context. It can be fed filtered data derived from those signals, but turning that into durable machine trust assumptions requires explicit registry entries, policies, or verification logic. An agent can be configured with operator-provided whitelists, verified addresses, and trust policies. The problem is not that social context is entirely inaccessible, but that maintaining these safeguards across a dynamically expanding action space is operationally expensive — and that when they are missing or incomplete, the agent lacks the fallback verification mechanisms humans use by default.
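A minimal sketch of the explicit-registry idea follows. The names, addresses, and verdict categories are hypothetical; the design point is that anything the registry cannot vouch for is routed to review rather than implicitly trusted, because the agent has no social-context fallback.

```python
from enum import Enum

class Verdict(Enum):
    ALLOW = "allow"
    DENY = "deny"
    NEEDS_REVIEW = "needs_review"

# Operator-maintained mapping from human concepts to canonical addresses
# (placeholder values) -- the machine substitute for a human checking
# the official website.
REGISTRY = {
    "USDC": "0xTokenCanonical",
    "MainLendingPool": "0xPoolCanonical",
}

def check_target(name: str, address: str) -> Verdict:
    """Verify a human-level name against a concrete contract address."""
    canonical = REGISTRY.get(name)
    if canonical is None:
        return Verdict.NEEDS_REVIEW  # no trust anchor: escalate, don't guess
    return Verdict.ALLOW if address == canonical else Verdict.DENY
```

The operational cost the text describes is concentrated in keeping `REGISTRY` complete and current as the action space expands; the lookup itself is trivial.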
Potential Data Flow Mismatch
Access to the economic state on the blockchain is essentially a pull model, even though execution signals can be streamed. External systems query nodes for the required state, rather than receiving continuous, structured updates. This model reflects the core function of the blockchain: verification on demand, rather than maintaining a persistent state view at the application level.
Push primitives exist. WebSocket subscriptions can stream new blocks and event logs in real time, but these do not include the stored state that carries most of the economic meaning, unless a protocol explicitly chooses to publish it redundantly. Agents cannot subscribe on-chain to lending-market utilization, pool reserves, or position health factors. These values live in contract storage, and most protocols provide no native mechanism to push them to downstream consumers. The current best pattern is to subscribe to new block headers and re-query state on each block. Logs only indicate that state may have changed; they do not encode the final economic state, so reconstructing it still requires explicit reads and access to historical state.
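The subscribe-and-requery pattern looks roughly like the loop below. The node interface is a stub standing in for an RPC client (it mirrors no specific library's API), but the shape is the one just described: a new header is only a hint, and every economic value of interest must be re-read.

```python
class StubNode:
    """Stand-in for an RPC client; real code would use a WebSocket
    header subscription plus eth_call-style state reads."""
    def __init__(self, blocks):
        self._blocks = iter(blocks)

    def next_header(self):
        # Returns the next block header, or None when the stream ends.
        return next(self._blocks, None)

    def read_utilization(self, market, block):
        # Pretend contract-storage read pinned to a given block.
        return block["utilization"][market]

def track_markets(node, markets):
    """Pull model: re-query every tracked value on every new block,
    because headers and logs alone do not carry stored economic state."""
    snapshots = []
    while (block := node.next_header()) is not None:
        snapshots.append({m: node.read_utilization(m, block) for m in markets})
    return snapshots
```

The cost the text criticizes is visible here: the read happens every block for every market, whether or not anything relevant changed.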
Agent systems might benefit from inverting this flow. Instead of polling hundreds of contracts for state changes, agents could receive structured, pre-computed state updates pushed directly into the runtime environment. A push architecture reduces redundant queries, lowers the latency between a state change and the agent perceiving it, and lets intermediate layers package state into semantically clear updates rather than forcing agents to interpret raw storage. This inversion is not easy. It requires subscription infrastructure, relevance-filtering logic, and patterns for translating storage changes into economic events agents can act on. But as agents become continuous participants rather than intermittent queriers, the inefficiency of the pull model grows more costly, and infrastructure that treats agents as continuous consumers rather than intermittent clients may better suit autonomous systems. Whether push infrastructure is truly superior remains an open question: at scale, state changes create a filtering problem, and agents must still determine which changes are relevant, reintroducing pull semantics at another level. The key issue is not pull architecture itself but that current architectures were not designed for persistent machine consumers. As agent usage grows, alternative models are worth exploring.

Execution friction arises because many current interaction layers bundle intent translation, transaction verification, and result validation into workflows built around front-end interfaces, wallets, and operator oversight. In retail and discretionary scenarios, that oversight is performed by humans. For autonomous systems, these functions must be formalized and encoded directly.
Blockchain guarantees deterministic execution of contract logic, but it does not guarantee that a transaction matches user intent, respects risk constraints, or achieves the expected economic outcome. In the current workflow, user interfaces and humans fill this gap. The interface composes a sequence of operations (swap, approve, deposit, borrow), and the wallet provides the final "review and send" checkpoint. Users or operators make informal judgment calls at that last step, assessing — often with incomplete information — whether a transaction looks safe and whether the quoted price is acceptable. If a transaction fails or produces an unexpected result, the user retries, adjusts slippage, changes the route, or abandons the operation. Agent systems remove humans from this execution loop, which means the system must replace three human functions with machine-native equivalents. **Intent Integration.** A human goal such as "move my stablecoins to the best risk-adjusted yield" must be compiled into a concrete action plan: which protocol, which market, which token path, what size, which approvals, and in what order. For humans this happens implicitly through the user interface; for agents it must be formalized. **Strategy Execution.** Clicking "send transaction" is not just a signature; it is an implicit check that the transaction satisfies constraints: slippage tolerance, leverage caps, minimum health factor, whitelisted contracts, or "no upgradeable contracts." The agent must encode these as explicit, machine-verifiable rules, and the execution system must verify that the proposed call graph satisfies them before broadcasting. **Result Verification.** A transaction landing on-chain does not mean the task is complete.
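The pre-broadcast constraint check just described can be made concrete as a validator over a proposed plan. The call representation, field names, and limits below are all invented for illustration; they stand in for whatever simulation output a real execution system would produce.

```python
from dataclasses import dataclass

@dataclass
class PlannedCall:
    target: str        # contract to be called
    action: str        # e.g. "swap", "borrow"
    slippage_bps: int  # simulated slippage for this step
    health_after: float  # simulated health factor after this step

@dataclass
class Policy:
    allowed_targets: frozenset
    max_slippage_bps: int
    min_health_factor: float

def policy_violations(plan: list, policy: Policy) -> list:
    """Machine-native 'review and send': every call in the proposed
    graph must satisfy the encoded constraints before broadcast."""
    issues = []
    for call in plan:
        if call.target not in policy.allowed_targets:
            issues.append(f"{call.action}: target {call.target} not whitelisted")
        if call.slippage_bps > policy.max_slippage_bps:
            issues.append(f"{call.action}: slippage {call.slippage_bps} bps over limit")
        if call.health_after < policy.min_health_factor:
            issues.append(f"{call.action}: health factor {call.health_after} below floor")
    return issues  # an empty list means the plan may be broadcast
```

An empty violation list is the machine analogue of a human deciding a transaction "looks safe" — except the criteria are explicit and auditable.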
A successful transaction may still fail to achieve the goal: slippage may exceed tolerance, the target position size may not be reached due to caps, or rates may change between simulation and on-chain execution. Humans verify this informally by reviewing the interface afterward; an agent must evaluate post-conditions programmatically. This introduces a requirement for completion checks, not just transaction inclusion. Intent-centric architectures can partially address this by shifting more of the "how to execute" burden from the agent to a dedicated solver: by broadcasting signed intents instead of raw calldata, agents can specify outcome-based constraints that a solver or protocol-level mechanism must satisfy for execution to be acceptable. Most execution in decentralized finance (DeFi) is inherently multi-step. A yield allocation might require approve → redeem → deposit → borrow → stake. Some steps may be independent transactions; others may be bundled through multicall or routing contracts. Humans tolerate partial completion and return to the interface to continue; agents need deterministic orchestration: if any step fails, the agent must decide whether to retry, reroute, roll back, or pause. This surfaces failure modes that human workflows largely mask. **State drift between decision and execution.** Rates, utilization, or liquidity may change between simulation and execution. Humans accept this variability; agents must set and enforce acceptable bounds. **Non-atomic execution and partial fills.** Operations may span multiple transactions or produce partial results. Agents must track intermediate state and confirm that the final state meets the objective. **Authorization limits and approval risks.**
Humans sign approvals almost reflexively through user interfaces; agents must reason about approval scope (amount, spender, duration) as part of a security policy, not as a UI step. **Path selection and implicit execution costs.** Humans rely on routing contracts and default interface settings; agents must fold slippage, maximal extractable value risk, gas costs, and price impact into their objective functions. The core point about execution friction is that DeFi's interaction layer uses the human wallet signature as the final control plane. That layer currently carries intent verification, risk tolerance, and informal "reasonableness" judgments. With humans removed, execution becomes a control problem: agents must translate goals into actions, enforce policy constraints automatically, and verify outcomes under uncertainty. This challenge exists in many autonomous systems, but it is especially harsh on-chain, where execution directly involves capital, composes unfamiliar contracts, and is exposed to adversarial state changes. Humans decide by heuristics and correct errors through trial and error; agents must do the same work programmatically, at machine speed, often in a dynamically changing action space. The claim that "agents only need to submit transactions" therefore understates the difficulty. Submitting transactions is the easy part.

Conclusion
Blockchain was not designed to provide the semantic and coordination layers agents require. Its design goal is deterministic execution and consensus on state transitions in adversarial environments. The interaction layer built on top of it has evolved around a model in which human users interpret state through dashboards, select operations through front-end interfaces, and verify results through human review.
Agent systems disrupt this architecture. They remove the human interpreter, approver, and verifier from the loop, requiring these functions to be implemented natively by machines. This shift exposes structural frictions in four dimensions: discovery, trust determination, data acquisition, and execution workflows. These frictions arise not because execution is infeasible, but because the infrastructure surrounding blockchain still largely assumes human involvement in state interpretation and transaction submission.
Bridging these gaps will likely require new infrastructure across the stack: middleware that normalizes cross-protocol economic state into machine-readable form; indexing services or RPC extensions for semantic primitives such as positions, health factors, and opportunity sets; registries providing canonical contract mappings and token authenticity verification; and execution frameworks that encode policy constraints, handle multi-step workflows, and programmatically verify goal completion. Some gaps stem from structural features of permissionless systems: open deployment, weak canonical identity, and heterogeneous interfaces. Others depend on current tooling, standards, and incentive design, and should narrow as agent usage scales and protocols compete to reduce integration friction for autonomous systems. As autonomous systems begin to manage capital, execute strategies, and interact directly with on-chain applications, the architectural assumptions of the current interaction layer will become increasingly visible. Most of the frictions described in this report reflect tooling and interaction patterns that evolved around human-mediated workflows; some stem from the openness, heterogeneity, and adversarial nature of permissionless systems; and some are problems common to autonomous systems in any complex environment. The core challenge is not getting agents to sign transactions, but giving them reliable means to perform the semantic interpretation, trust determination, and policy enforcement that software and humans currently carry out together between blockchain state and action.