Scaling The Interchain – A Deep Dive Into IBC Relayer Operations

November 3, 2023

CryptoCrew was part of the recent Celestia launch. Behind the scenes, we worked with various core teams on the IBC integration of the new and highly anticipated Cosmos chain.

What is an “IBC Relayer”?

The Inter-Blockchain Communication (IBC) Protocol is the Cosmos standard for blockchain interoperability, allowing blockchains to exchange data and value without centralized oversight or permissions. IBC Relayers facilitate message transfers between these chains by handling packet transmissions and proof verification.

A modern Relayer operation consists of several crucial components: IBC Relayer Software, which composes IBC messages and posts transactions to both interacting chains; RPC Endpoints for every chain the Relayer intends to connect to; and accounts on each chain to sign the transactions. All these components need to be monitored, balanced, and automated to maintain a reliable service.

Software

Our production automation at CryptoCrew is built around Hermes, a comprehensive and well-maintained IBC Relayer Software by Informal Systems, written in Rust. Hermes is capable of handling most of the different packet types. It is performant, very reliable, and offers extensive configuration options, enabling fine-grained control.

Endpoints

Performant and reliable RPC endpoints are another critical piece of a dependable Relayer operation. A healthy and synced RPC node is necessary for interacting with each blockchain. Professional operators often maintain private nodes to ensure service availability and to be able to react to possible issues with the node.
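What "healthy and synced" means can be made concrete with a small check. The sketch below assumes a CometBFT-style `/status` response that has already been fetched and decoded from JSON, and reads its `sync_info.catching_up` and `sync_info.latest_block_time` fields. The function names and the 30-second lag threshold are illustrative assumptions, not part of any particular relayer tooling.

```python
import re
from datetime import datetime, timezone

# Matches RFC 3339 UTC timestamps such as 2023-10-31T15:00:05.123456789Z
_BLOCK_TIME = re.compile(r"(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2})(?:\.(\d+))?Z")


def parse_block_time(value: str) -> datetime:
    """Parse a block timestamp, truncating nanoseconds to microseconds."""
    match = _BLOCK_TIME.fullmatch(value)
    if match is None:
        raise ValueError(f"unexpected block time format: {value!r}")
    base, fraction = match.groups()
    micros = (fraction or "0")[:6].ljust(6, "0")
    parsed = datetime.strptime(f"{base}.{micros}", "%Y-%m-%dT%H:%M:%S.%f")
    return parsed.replace(tzinfo=timezone.utc)


def is_node_healthy(status: dict, now: datetime, max_lag_seconds: float = 30.0) -> bool:
    """Return True if the node is not catching up and its latest block is recent.

    `max_lag_seconds` is a hypothetical threshold; a sensible value depends
    on the block time of the chain being monitored.
    """
    # The HTTP endpoint wraps the payload in a JSON-RPC "result" object.
    sync_info = status.get("result", status)["sync_info"]
    if sync_info["catching_up"]:
        return False
    lag = (now - parse_block_time(sync_info["latest_block_time"])).total_seconds()
    return lag <= max_lag_seconds
```

A monitoring loop could call `is_node_healthy` for each endpoint and switch the Relayer to a backup node when it returns `False`.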

We manage private RPC infrastructure across more than 50 mainnets, deployed in high-availability environments. By utilizing dedicated bare metal servers, we achieve an optimal balance of performance and cost-efficiency. Additionally, the exclusive (single-tenancy) use of these servers enhances security by ensuring isolated operations. Running virtualization technology on top of these machines enables powerful deployment automation, increasing service reliability and reducing the need for manual intervention.

Account Management

Relayers need to pay fees for every transaction, just like everybody else. If a relayer account balance depletes, no more transactions can be processed. To automate and manage this, we heavily utilize the Cosmos SDK’s feegrant module, combined with thorough monitoring of account activity, transaction success and balances.
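The balance-monitoring part of this can be sketched in a few lines. The example below is a minimal illustration, not our production tooling: it assumes each fee-paying account (for instance a feegrant granter) is described by its current balance and an observed average fee per transaction, and flags accounts that can fund fewer than a chosen number of further transactions. The class, function names, and all numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FeeAccount:
    chain_id: str
    balance: int         # in the chain's smallest fee denomination
    avg_fee_per_tx: int  # observed average fee per relayed transaction


def remaining_txs(account: FeeAccount) -> int:
    """Estimate how many more transactions the balance can pay for."""
    if account.avg_fee_per_tx <= 0:
        raise ValueError("avg_fee_per_tx must be positive")
    return account.balance // account.avg_fee_per_tx


def accounts_needing_top_up(accounts: list[FeeAccount], min_remaining_txs: int = 10_000) -> list[str]:
    """Return the chain IDs whose fee account is running low."""
    return [a.chain_id for a in accounts if remaining_txs(a) < min_remaining_txs]


accounts = [
    FeeAccount("osmosis-1", balance=50_000_000, avg_fee_per_tx=2_500),
    FeeAccount("celestia", balance=4_000_000, avg_fee_per_tx=2_000),
]
low = accounts_needing_top_up(accounts)
```

Expressing the threshold in remaining transactions rather than raw balance keeps one alert rule usable across chains with very different fee levels.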

Automation and Monitoring

Maintaining blockchain infrastructure demands more than just deploying nodes; it requires a continuous commitment to reliability. Our systems are designed to operate seamlessly, and automated tasks handle most of the common scenarios. Nevertheless, things can go wrong from time to time, and when they do, our on-call team is ready to respond around the clock. This setup allows us to maintain high uptime and manage incident responses efficiently.

A Celestial Launch?

In the days leading up to the genesis of the Celestia modular blockchain, our team made preparations to handle the expected increase in traffic. We also upgraded our Hermes instances to the latest version (v1.7) to manage some Celestia-specific peculiarities in RPC behavior.

Shortly after the first block was completed by Celestia validators on October 31, 2023, at 15:00 UTC, we began to establish new IBC connections with Osmosis, Neutron, Injective, and others. Once stability was confirmed, pools were created on Osmosis, and the first liquidity began flowing in from Celestia users. Acting quickly was crucial, as there was a definite interest in getting Celestia’s native token $TIA listed on Osmosis before Binance.

Success! About one hour after the Celestia genesis block. 

Release the degens

What followed was indeed unexpected: we witnessed a surge of activity to an extent we had never seen before. Not only did the Keplr infrastructure see its metrics soar (check out this very interesting retro by Josh Lee), but our Relayers also recorded all-time record numbers in IBC packet activity.


Up to 20,000 IBC packets per hour recorded on Osmosis

While our team was able to quickly respond to the traffic by scaling Relayer and RPC instances, total activity on counterparty chains also increased due to arbitrage opportunities. Blocks and mempools were filled with an excessive number of mostly invalid transactions from poorly configured bots, preventing relayers from processing enough transactions to handle the high IBC throughput. Thirty minutes after the Osmosis pools had been established, a queue of 3,000 IBC packets had amassed on the Celestia > Osmosis channel.
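Taking these figures at face value, a queue of 3,000 packets after 30 minutes means that incoming packets outpaced relaying by roughly 100 packets per minute on average. The small sketch below makes that arithmetic explicit and estimates how long a backlog takes to clear once relaying capacity exceeds arrivals again. The function names and the rates used in the drain example are hypothetical.

```python
def backlog_growth_per_minute(queued_packets: int, minutes: float) -> float:
    """Average net growth of the queue (arrivals minus relayed packets)."""
    return queued_packets / minutes


def minutes_to_drain(queued_packets: int, relay_rate_per_min: float, arrival_rate_per_min: float) -> float:
    """Time to clear the backlog; infinite if relaying does not outpace arrivals."""
    surplus = relay_rate_per_min - arrival_rate_per_min
    if surplus <= 0:
        return float("inf")
    return queued_packets / surplus


# Figures from the launch: 3,000 packets queued after 30 minutes.
growth = backlog_growth_per_minute(3_000, 30)

# Hypothetical rates: relaying 400 packets/min against 250 arriving per minute.
drain = minutes_to_drain(3_000, relay_rate_per_min=400, arrival_rate_per_min=250)
```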

Thanks to the fast action of the Osmosis core team and validators, a quick patch was deployed and IBC transactions were passing regularly again only two hours later. Liquidity in the TIA/USDC and TIA/OSMO pools was soon sufficient to handle larger swaps.

Only a few hours after the launch, Osmosis markets had done more $TIA volume than most well-known centralized exchanges:

Source: https://defillama.com/protocol/osmosis-dex 

Working on chain congestion issues

It’s true, arbitrage bots on Osmosis have been enjoying an environment a bit akin to a kindergarten, but this has been changing lately. The ProtoRev (Protocol Revenue) Module, as well as configuration options like “arbitrage-min-gas-fee” or “min-gas-price-for-high-gas-tx”, have been developed by Osmosis and contributing teams, but the values still need to be fine-tuned in collaboration with validators.
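To illustrate how such settings can act as a mempool filter, here is a simplified model in Python. It is not the Osmosis implementation: it assumes both options are per-gas prices, that the arbitrage floor applies to transactions classified as arbitrage, and that the high-gas floor applies above some gas threshold. The class, the function names, the threshold, and every value below are hypothetical.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class MempoolFeeConfig:
    min_gas_price: Decimal                  # node-wide floor
    arbitrage_min_gas_fee: Decimal          # floor for arbitrage transactions
    min_gas_price_for_high_gas_tx: Decimal  # floor for gas-heavy transactions
    high_gas_threshold: int                 # gas amount considered "high" (assumed)


def required_gas_price(cfg: MempoolFeeConfig, gas_wanted: int, is_arbitrage: bool) -> Decimal:
    """Return the strictest floor that applies to this transaction."""
    required = cfg.min_gas_price
    if is_arbitrage:
        required = max(required, cfg.arbitrage_min_gas_fee)
    if gas_wanted >= cfg.high_gas_threshold:
        required = max(required, cfg.min_gas_price_for_high_gas_tx)
    return required


def admits(cfg: MempoolFeeConfig, fee_amount: int, gas_wanted: int, is_arbitrage: bool) -> bool:
    """Accept the transaction only if its fee covers the required gas price."""
    return fee_amount >= required_gas_price(cfg, gas_wanted, is_arbitrage) * gas_wanted


# Hypothetical values, chosen only to make the example readable.
cfg = MempoolFeeConfig(
    min_gas_price=Decimal("0.0025"),
    arbitrage_min_gas_fee=Decimal("0.005"),
    min_gas_price_for_high_gas_tx=Decimal("0.01"),
    high_gas_threshold=1_000_000,
)
```

In this model, raising the arbitrage floor makes spamming failed arbitrage attempts more expensive without changing the cost of ordinary transfers, which is why the exact values matter and need tuning.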

Additionally, there is a need for better infrastructure to manage block space within the SDK, as well as for performance improvements to transaction propagation (“mempool gossiping”) and the peering strategy of CometBFT. Work on these issues has recently begun, for example with the development of Skip’s Block SDK, the collaborative work on throughput issues spearheaded by Notional, and a very spontaneous implementation of the EIP-1559 dynamic fee system by the Osmosis team.
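For readers unfamiliar with EIP-1559, its base-fee update rule can be sketched as follows. This is the rule as defined in the Ethereum specification, shown for illustration only; the Osmosis implementation may use different parameters or a different update rule. The function name is hypothetical.

```python
# Ethereum's constant: the base fee moves by at most 1/8 (12.5%) per block.
BASE_FEE_MAX_CHANGE_DENOMINATOR = 8


def next_base_fee(parent_base_fee: int, parent_gas_used: int, parent_gas_target: int) -> int:
    """Compute the next block's base fee from the parent block's gas usage."""
    if parent_gas_used == parent_gas_target:
        return parent_base_fee
    if parent_gas_used > parent_gas_target:
        # Block was fuller than the target: raise the base fee (by at least 1).
        gas_delta = parent_gas_used - parent_gas_target
        fee_delta = max(
            parent_base_fee * gas_delta // parent_gas_target // BASE_FEE_MAX_CHANGE_DENOMINATOR,
            1,
        )
        return parent_base_fee + fee_delta
    # Block was emptier than the target: lower the base fee.
    gas_delta = parent_gas_target - parent_gas_used
    fee_delta = parent_base_fee * gas_delta // parent_gas_target // BASE_FEE_MAX_CHANGE_DENOMINATOR
    return parent_base_fee - fee_delta
```

Under sustained congestion the base fee compounds block after block, so spam that is cheap at a static minimum fee quickly becomes expensive.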

To sum up, the launch of the Celestia mainnet has shown us that our infrastructure systems need to be well prepared for sudden traffic surges. Close collaboration in tackling these issues now is essential to improve performance and stability over the long term. It is incredible to see such numbers, especially in the midst of a bear market, and it should give us a good sense of what’s to come.

The future is bright for the Interchain.