Exploring msg.data Layout in Solidity: A Deep Dive
Chapter 1: Introduction to msg.data
When it comes to interacting with smart contracts at a low level, understanding the layout of msg.data is crucial. For the average web3 user, direct interaction with msg.data may not be necessary. However, for those keen on exploring deeper aspects of web3, grasping how calldata functions can be quite beneficial.
If you find yourself here, it likely means you're eager to uncover how to extract valuable information from calldata. So, what is msg.data exactly?
Most developers working with smart contracts are likely familiar with the web3.js library, which offers a suite of APIs for interacting with Ethereum contracts. To send a transaction to the Ethereum network, a user typically calls the web3.eth.sendTransaction() method. For instance, to call Contract A, one might execute:
await web3.eth.sendTransaction({from: ..., to: addressOfContractA, data: something});
In this case, msg.data is the "something" passed in the data field: the raw bytes that Contract A receives as calldata.
Understanding Call Data Structure
According to Solidity's official documentation, the input data for a function call is assumed to follow the format defined by the ABI (Application Binary Interface) specification. Among other things, the specification requires arguments to be padded to multiples of 32 bytes and encoded according to their types, as declared in the contract's ABI. The ABI specification in the Solidity documentation covers the format in full.
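The padding rule can be illustrated with a few lines of plain JavaScript. This is a minimal sketch rather than a full ABI encoder: it handles only addresses and unsigned integers, the helper names (padWord, encodeAddress, encodeUint, buildCalldata) are made up for illustration, and the selector 0xaabbccdd and the address are placeholders rather than values from any real contract.

```javascript
// Sketch: pack static arguments into 32-byte ABI words.
// The selector and address used below are placeholders, not real values.

// Left-pad a hex string (no 0x prefix) with zeros to 32 bytes (64 hex chars).
function padWord(hex) {
  if (hex.length > 64) throw new Error("value does not fit in 32 bytes");
  return hex.padStart(64, "0");
}

// Encode a 20-byte address as one 32-byte word.
function encodeAddress(address) {
  const hex = address.toLowerCase().replace(/^0x/, "");
  if (!/^[0-9a-f]{40}$/.test(hex)) throw new Error("not a 20-byte address");
  return padWord(hex);
}

// Encode a non-negative integer (uint256) as one 32-byte word.
function encodeUint(value) {
  const n = BigInt(value);
  if (n < 0n) throw new Error("uint must be non-negative");
  return padWord(n.toString(16));
}

// Calldata is the 4-byte selector followed by the encoded argument words.
function buildCalldata(selector, words) {
  return "0x" + selector.replace(/^0x/, "") + words.join("");
}

const exampleCalldata = buildCalldata("0xaabbccdd", [
  encodeAddress("0x1111111111111111111111111111111111111111"),
  encodeUint(100),
]);
console.log(exampleCalldata);
console.log((exampleCalldata.length - 2) / 2, "bytes"); // 4 + 32 + 32
```

In practice a library such as web3.js performs this encoding; the sketch only makes the 32-byte alignment visible.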
A Practical Example: DoubleEntryPoint Challenge
To truly grasp msg.data's layout, practical examples prove invaluable. My interest in this topic was sparked by the Ethernaut DoubleEntryPoint challenge, and it is worth reading the challenge description before continuing.
What Inspired the Challenge?
This challenge draws inspiration from a real-world vulnerability related to the Compound Finance protocol and the TrueUSD (TUSD) stablecoin on the Ethereum network. Noteworthy organizations like OpenZeppelin and ChainSecurity have analyzed the vulnerability and shared detailed reports, which can be found here and here. Below are three key factors that contributed to this vulnerability:
- The challenges associated with upgrading smart contracts.
- The tendency for web3 developers to mishandle stray tokens.
- The methodology behind establishing markets on Compound.
Smart contracts are inherently immutable, complicating efforts to rectify bugs or introduce new features. Modern upgrade mechanisms generally involve deploying two contracts: a proxy contract that users interact with and a logic contract that contains the actual functionalities. This allows for the logic contract to be swapped out without altering the proxy.
In a notable case, TrustToken's original deployment of TUSD predated this proxy model, so its upgrade was handled through function forwarding instead. When changes became necessary, TrustToken deployed a new upgradeable TUSD contract and had the legacy contract forward its calls to it. User interactions remained unaffected, but TUSD could now be reached through two addresses, the legacy contract and the new one, both acting on the same balances.
Addressing the Issue of Stray Tokens
In the web3 landscape, wallets can send tokens to any smart contract regardless of whether such actions align with the contract's intended purpose. This can lead to tokens being stranded. To mitigate this, many smart contracts implement functions that sweep stray tokens back to a designated wallet for potential recovery. All cToken contracts, for example, include a sweepToken function that transfers stray tokens into the Compound Timelock.
The Role of cToken Contracts
Establishing a market on Compound starts with the cToken contract, which serves as a template designed to integrate various ERC20 tokens with the Compound protocol. Once tailored for a specific asset and deployed, a cToken contract establishes a functioning market that accepts deposits and issues corresponding cTokens to depositors. These cToken templates have undergone thorough audits and are deemed secure.
Chapter 2: Analyzing the Vulnerability
As we investigate the Compound-TUSD integration vulnerability in relation to the DoubleEntryPoint challenge, the cTUSD Compound contract corresponds to the CryptoVault contract, while the Legacy TUSD contract aligns with the LegacyToken contract. The current TUSD contract is represented by the DoubleEntryPoint contract.
The goal of the DoubleEntryPoint challenge is to write a detection bot that flags token-draining attacks on the CryptoVault contract, but carrying out the exploit first makes it clearer what the bot has to detect.
Start by verifying that the CryptoVault contract contains 100 DET and 100 LGT tokens, as specified in the Ethernaut challenge. You can check this by inputting the CryptoVault’s address on the Rinkeby Etherscan website.
In your browser console, you can run the following commands:
- cryptovault = await contract.cryptoVault()
- functionSignature = { name: 'sweepToken', type: 'function', inputs: [ { type: 'address', name: 'token' }, ] }
- legacytoken = await contract.delegatedFrom()
- params = [legacytoken]
- data = web3.eth.abi.encodeFunctionCall(functionSignature, params)
- await web3.eth.sendTransaction({from: player, to: cryptovault, data})
Now check the token balances of the CryptoVault contract again: its DET tokens should be gone. The sweepToken() function only checks that the token being swept is not the vault's underlying token. The LegacyToken address passes that check, but LegacyToken forwards the transfer to the DoubleEntryPoint contract's delegateTransfer function, so it is the vault's DET that ends up being swept.
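Why the check passes can be modeled in plain JavaScript. The following is a toy sketch, not the challenge's Solidity code: the class and method names mirror the contracts, balances are simple maps, access control is left out, and the "recipient" string is a hypothetical stand-in for the address that receives swept tokens.

```javascript
// Toy model of the double entry point (plain JavaScript, not the real contracts).

class Token {
  constructor(name) {
    this.name = name;
    this.balances = new Map();
  }
  balanceOf(owner) {
    return this.balances.get(owner) ?? 0;
  }
  mint(owner, amount) {
    this.balances.set(owner, this.balanceOf(owner) + amount);
  }
  move(from, to, amount) {
    if (this.balanceOf(from) < amount) throw new Error("insufficient balance");
    this.balances.set(from, this.balanceOf(from) - amount);
    this.balances.set(to, this.balanceOf(to) + amount);
  }
}

// The new token: reachable directly or through the legacy token.
class DoubleEntryPointToken extends Token {
  transfer(sender, to, amount) {
    this.move(sender, to, amount);
  }
  delegateTransfer(to, amount, origSender) {
    this.move(origSender, to, amount);
  }
}

// The legacy token forwards every transfer to the new token.
class LegacyToken extends Token {
  constructor(name, delegate) {
    super(name);
    this.delegate = delegate;
  }
  transfer(sender, to, amount) {
    this.delegate.delegateTransfer(to, amount, sender);
  }
}

class CryptoVault {
  constructor(underlying, recipient) {
    this.underlying = underlying;
    this.recipient = recipient;
  }
  sweepToken(token) {
    // The only guard: the swept token must not be the underlying token itself.
    if (token === this.underlying) throw new Error("Can't transfer underlying token");
    token.transfer(this, this.recipient, token.balanceOf(this));
  }
}

const det = new DoubleEntryPointToken("DET");
const lgt = new LegacyToken("LGT", det);
const vault = new CryptoVault(det, "recipient");
det.mint(vault, 100);
lgt.mint(vault, 100);

let blocked = false;
try {
  vault.sweepToken(det); // rejected: DET is the underlying token
} catch {
  blocked = true;
}
vault.sweepToken(lgt); // accepted, yet it moves DET through delegateTransfer

console.log(blocked, det.balanceOf(vault), det.balanceOf("recipient"));
```

Sweeping DET directly is rejected, while sweeping LGT is accepted and drains the vault's DET, which is the behavior the detection bot has to catch.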
Preventing Exploitation
In the DoubleEntryPoint Ethernaut challenge, our task is to create a bot capable of detecting instances of token depletion in the CryptoVault. The detection bot's interface is already defined, requiring us only to implement the handleTransaction function.
interface IDetectionBot {
    function handleTransaction(address user, bytes calldata msgData) external;
}
Within the DoubleEntryPoint contract, the delegateTransfer function carries the fortaNotify() modifier. Whenever delegateTransfer runs, the modifier passes the call's msg.data to the detection bot's handleTransaction function, allowing us to analyze it.
To prevent the CryptoVault contract from transferring DET tokens, the condition to trigger a Forta alert should be:
if (origSender == CryptoVault's address) {
// Stop the transaction
}
Understanding msg.data
Returning to the layout of msg.data, it's straightforward to retrieve the CryptoVault's address with cryptovault = await contract.cryptoVault(). However, how can we ascertain if origSender matches the CryptoVault's address?
Familiarity with the ABI specification is vital here. The first four bytes of the call data form the function selector, which identifies the function to be executed. The selector is the first four bytes of the Keccak-256 hash of the function's signature.
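Splitting raw calldata into its selector and 32-byte argument words can be sketched as follows. The function names splitCalldata and wordToAddress are hypothetical helpers, and the sample calldata uses a placeholder selector and address; computing a real selector requires a Keccak-256 implementation, which is left out here.

```javascript
// Sketch: split calldata into the 4-byte selector and 32-byte argument words.
function splitCalldata(calldata) {
  const hex = calldata.replace(/^0x/, "");
  if (hex.length < 8) throw new Error("calldata shorter than a selector");
  const selector = "0x" + hex.slice(0, 8); // first 4 bytes
  const body = hex.slice(8);
  const words = [];
  for (let i = 0; i < body.length; i += 64) {
    words.push("0x" + body.slice(i, i + 64)); // one 32-byte word per entry
  }
  return { selector, words };
}

// An address occupies the low 20 bytes of its 32-byte word.
function wordToAddress(wordHex) {
  return "0x" + wordHex.replace(/^0x/, "").slice(24);
}

// Placeholder calldata: made-up selector plus one address argument.
const sampleCalldata =
  "0xaabbccdd" + "0".repeat(24) + "2222222222222222222222222222222222222222";

const parsed = splitCalldata(sampleCalldata);
console.log(parsed.selector);
console.log(wordToAddress(parsed.words[0]));
```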
Decoding Dynamic Types
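In our challenge, we must decode the bytes calldata msgData argument of the handleTransaction function. Dynamic types such as uint32[] and bytes are not encoded in place. Their argument slot instead holds an offset, measured from the start of the arguments block, to the position where the value's data begins.
Because the bytes type has no compile-time length, the calldata stores this offset in the argument's slot and then, at the offset, the length of the data followed by the data itself, padded to a multiple of 32 bytes.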
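The offset, length, data pattern for a bytes argument can be sketched in JavaScript. This is a simplified illustration for the argument list (address, bytes) only, without the selector; the helper names toWord, encodeBytesTail and encodeAddressAndBytes are made up, and the address and payload are placeholder values.

```javascript
// Left-pad a hex string to one 32-byte word.
const toWord = (hex) => hex.padStart(64, "0");

// Tail of a dynamic `bytes` value: 32-byte length, then the data
// right-padded with zeros to a multiple of 32 bytes.
function encodeBytesTail(dataHex) {
  const data = dataHex.replace(/^0x/, "");
  const byteLength = data.length / 2;
  const paddedLength = Math.ceil(byteLength / 32) * 32;
  return toWord(byteLength.toString(16)) + data.padEnd(paddedLength * 2, "0");
}

// Arguments (address, bytes): the head holds the address and the offset
// to the tail; the tail follows the head.
function encodeAddressAndBytes(addressHex, dataHex) {
  const headSize = 2 * 32; // two head words, so the tail starts at 0x40
  const head =
    toWord(addressHex.replace(/^0x/, "")) + toWord(headSize.toString(16));
  return head + encodeBytesTail(dataHex);
}

const encodedArgs = encodeAddressAndBytes(
  "0x3333333333333333333333333333333333333333", // placeholder address
  "0xdeadbeef" // placeholder 4-byte payload
);

// Print one 32-byte word per line: address, offset, length, padded data.
for (let i = 0; i < encodedArgs.length; i += 64) {
  console.log(encodedArgs.slice(i, i + 64));
}
```

The second word printed is the offset (0x40) and the third is the length (4), matching the order described above.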
Figure: the calldata layout when the handleTransaction function is called before delegateTransfer.
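To extract origSender, we use the assembly opcode calldataload(p), which loads 32 bytes of calldata starting at byte position p. Here origSender, the third argument of delegateTransfer(address to, uint256 value, address origSender), sits at position 0xa8: 4 bytes (handleTransaction selector) + 32 (user) + 32 (offset of msgData) + 32 (length of msgData) + 4 (delegateTransfer selector) + 32 (to) + 32 (value) = 168 bytes. The final DetectionBot contract may resemble the following: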
contract DetectionBot is IDetectionBot {
    address private vault;

    constructor(address _vault) public {
        vault = _vault;
    }

    function handleTransaction(address user, bytes calldata msgData) external override {
        address origSender;
        // Load the 32-byte word at byte 0xa8 of the calldata, where origSender is encoded.
        assembly {
            origSender := calldataload(0xa8)
        }
        // Raise an alert when the transfer originates from the vault.
        if (origSender == vault) {
            Forta(msg.sender).raiseAlert(user);
        }
    }
}
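The 0xa8 position can be checked offline by rebuilding the calldata and emulating calldataload in JavaScript. This is a sketch under stated assumptions: the two selectors are placeholders (the real ones come from hashing the function signatures), the addresses and value are made up, and readCalldataWord is a hypothetical stand-in for the opcode.

```javascript
// Placeholder selectors and sample values (not taken from a real transaction).
const HANDLE_TX_SELECTOR = "11111111";
const DELEGATE_TRANSFER_SELECTOR = "22222222";
const userAddr = "aa".repeat(20);
const toAddr = "bb".repeat(20);
const origSenderAddr = "cc".repeat(20);

const asWord = (hex) => hex.padStart(64, "0");

// Inner msg.data of delegateTransfer(address to, uint256 value, address origSender):
// 4 + 3 * 32 = 100 bytes.
const innerData =
  DELEGATE_TRANSFER_SELECTOR +
  asWord(toAddr) +
  asWord((100).toString(16)) +
  asWord(origSenderAddr);

// Outer calldata of handleTransaction(address user, bytes msgData).
const outerCalldata =
  HANDLE_TX_SELECTOR +
  asWord(userAddr) + // 0x04: user
  asWord("40") + // 0x24: offset of msgData within the arguments
  asWord((innerData.length / 2).toString(16)) + // 0x44: length of msgData (0x64)
  innerData.padEnd(256, "0"); // 0x64: msgData, padded to 128 bytes

// Emulate calldataload(p): 32 bytes from byte position p, zero-padded past the end.
function readCalldataWord(calldataHex, position) {
  return calldataHex.slice(position * 2, position * 2 + 64).padEnd(64, "0");
}

// An address variable keeps only the low 20 bytes of the loaded word.
const loadedSender = "0x" + readCalldataWord(outerCalldata, 0xa8).slice(24);

console.log(readCalldataWord(outerCalldata, 0x24)); // offset word
console.log(readCalldataWord(outerCalldata, 0x44)); // length word
console.log(loadedSender);
```

Changing the placeholder selectors does not move any of the positions, since a selector is always exactly four bytes.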
After deploying the DetectionBot contract, link it to the Forta contract using the setDetectionBot(address detectionBotAddress) function, which can be executed in the Remix IDE.
Bonus Section: The Implications of Token Draining in cTUSD
At first glance, the act of transferring underlying funds to an admin may not appear beneficial for an attacker. However, this impacts the TUSD/cTUSD exchange rate, defined as:
(totalCash + totalBorrows - totalReserves) / totalSupply
Here totalCash is the total TUSD held in the contract and totalSupply is the total cTUSD minted. Transferring all underlying tokens out drops totalCash to zero, so the exchange rate falls sharply and cTUSD loses much of its value. Should the funds later be restored, the exchange rate would rebound just as sharply.
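A quick worked example shows the size of the effect. The figures below are hypothetical and chosen only to keep the arithmetic simple; the sketch also uses ordinary floating-point numbers, whereas an on-chain implementation would use fixed-point integers.

```javascript
// Exchange rate as defined above:
// (totalCash + totalBorrows - totalReserves) / totalSupply
function exchangeRate({ totalCash, totalBorrows, totalReserves, totalSupply }) {
  if (totalSupply === 0) throw new Error("totalSupply must be non-zero");
  return (totalCash + totalBorrows - totalReserves) / totalSupply;
}

// Hypothetical market state before the sweep.
const market = {
  totalCash: 800,
  totalBorrows: 250,
  totalReserves: 50,
  totalSupply: 50000,
};

const rateBefore = exchangeRate(market); // (800 + 250 - 50) / 50000
const rateAfter = exchangeRate({ ...market, totalCash: 0 }); // (0 + 250 - 50) / 50000

console.log(rateBefore, rateAfter);
```

With these numbers the rate falls from 0.02 to 0.004, since only the borrowed funds still back the cTUSD supply once the cash has been swept.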
How might an attacker capitalize on this scenario? Several methods exist:
- Liquidate users who have TUSD as collateral for loans.
- Borrow TUSD, execute the attack, and then repay less TUSD.
- Execute the attack, mint cTUSD, and redeem it for profit once the funds are returned.
Currently, TUSD cannot serve as collateral, which rules out the first option. However, at the time the exploit was discovered, the second method could have yielded approximately 12% profit on the TUSD held in the contract, amounting to about $3.1 million, and the funds in the contract have only increased since then.