- Title: DO NOT RUG ON ME: ZERO-DIMENSIONAL SCAM DETECTION
- Description: Rug pulls and scam detection.
- Tags: DeFi, Machine learning, Scam Detection, Rug pulls
- Created: 2020-12-25
- Researcher: Bruno Mazorra and Victor Adan
- The researchers expand the dataset of Uniswap V2 scam tokens.
- They provide a theoretical classification of three different types of rug pulls, together with tools to identify them.
- The authors introduce two highly accurate and precise machine-learning-based models that discriminate between malicious and non-malicious tokens in different scenarios before the malicious maneuver takes place.
Uniswap, like other DEXs, has gained much attention over the last year because it is a non-custodial and publicly verifiable exchange that allows users to trade digital assets without trusted third parties. However, its simplicity and lack of regulation also make it easy to execute initial coin offering scams by listing non-valuable tokens. This method of performing scams is known as a rug pull, a phenomenon that already existed in traditional finance but has become more relevant in DeFi.
Do rug pulls in Constant Function Market Makers (CFMMs) share similar features? Can we predict whether a project is a rug pull before the malicious maneuver?
- Mazorra, Bruno, Victor Adan, and Vanesa Daza. "Do not rug on me: Zero-dimensional Scam Detection." arXiv preprint arXiv:2201.07220 (2022).
- Mazorra, B., Adan, V., & Daza, V. (2022). Do Not Rug on Me: Leveraging Machine Learning Techniques for Automated Scam Detection. Mathematics, 10(6), 949.
- Smart contract: A program that is deployed on a public blockchain ledger, is executed within transactions, and alters the state of the ledger atomically.
- Decentralized Exchange: Decentralized Exchanges (DEXs) are a category of Decentralized Finance (DeFi) protocol that allow the non-custodial exchange of digital assets. All trades are executed on-chain and are, thus, publicly verifiable. The policy that matches buyers and sellers (or traders and liquidity providers) is hard-coded in a smart contract.
- Rug pull: A malicious operation or set of operations in the cryptocurrency industry where the developers abandon the project and take the investors’ funds as profits.
- Transaction graph: Weighted graph induced by token transactions.
- Herfindahl-Hirschman Index: A measure of market concentration that is used to assess market competitiveness.
- Cluster coefficient: A measure of network segregation that captures the connections between individual nodes and their neighbors.
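- Precision: The fraction of predicted positives that are truly positive, defined by $\text{Precision} = \frac{TP}{TP + FP}$, where $TP$ and $FP$ are the numbers of true and false positives.
- Recall: The fraction of actual positives that are correctly predicted, defined by $\text{Recall} = \frac{TP}{TP + FN}$, where $FN$ is the number of false negatives.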
- Machine learning classifier: An algorithm that automatically assigns data to one or more classes.
- Cross validation: A resampling method that uses different portions of the data to train and test a model on different iterations. It provides information about how well a machine learning algorithm or model generalizes.
- Data augmentation: A technique for enlarging the training dataset in order to improve accuracy and generalisation and to control overfitting.
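Several of the terms above reduce to short formulas. The following is a minimal Python sketch of three of them, not the authors' implementation: the function names are hypothetical, the Herfindahl-Hirschman Index is computed over an assumed list of holder balances, and the cluster coefficient is computed on an unweighted, undirected view of the transaction graph (edge weights are ignored here for simplicity).

```python
from collections import defaultdict


def hhi(balances):
    """Herfindahl-Hirschman Index: sum of squared shares, in (0, 1].

    `balances` is an assumed input, e.g. token balances per holder.
    A value of 1.0 means a single holder owns everything.
    """
    total = sum(balances)
    if total <= 0:
        raise ValueError("balances must sum to a positive value")
    return sum((b / total) ** 2 for b in balances)


def local_clustering(edges, node):
    """Local clustering coefficient of `node`.

    `edges` is an iterable of (sender, receiver) pairs; direction and
    weights are dropped (a simplifying assumption of this sketch).
    """
    neighbours = defaultdict(set)
    for u, v in edges:
        if u != v:
            neighbours[u].add(v)
            neighbours[v].add(u)
    nbrs = list(neighbours[node])
    k = len(nbrs)
    if k < 2:
        return 0.0
    # Count the links that exist between pairs of neighbours.
    links = sum(
        1
        for i in range(k)
        for j in range(i + 1, k)
        if nbrs[j] in neighbours[nbrs[i]]
    )
    return 2 * links / (k * (k - 1))


def precision_recall(tp, fp, fn):
    """Precision = TP / (TP + FP), Recall = TP / (TP + FN)."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall
```

For instance, `hhi([50, 50])` evaluates to `0.5`, while a token held entirely by one address gives `1.0`.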
- Introduction.
- Related Work.
- Preliminaries
- Background.
- Malicious Uniswap Maneuvers: a malicious operation or set of operations in the cryptocurrency industry where the developers abandon the project and take the investors’ funds as profits.
- Classification of different types of rug pulls.
- Rug pulls
- Pump-and-dump schemes
- Money laundering
- Others
- Data Collection:
- Overview of the method used to extract all the necessary data.
- Token Labelling.
- Provide the methodology to label tokens as scams or non-scams.
- Overview of the results obtained by the labelling methodology proposed.
- Scam detection.
- Define two methods (Activity based Method and 24 Early Method) that use Machine Learning models to discriminate between malicious and non-malicious tokens in different scenarios.
- Conclusions.
- Future Work
- Data collection: To obtain all the data needed for the labelling and the analysis, we used an Infura archive node and the Etherscan API. To obtain the state of the Uniswap exchange and the tokens, we used the events emitted by their respective smart contracts. To obtain the token creation transactions and the source code, we used the Etherscan API.
- Labelling:
- First, we defined the maximum drop and the recovery of the token price and liquidity time series. The maximum drop measures the largest fall in the price or liquidity of the Uniswap listed pools. The recovery represents the largest pump from the bottom. In addition, we consider a token inactive if more than one month has passed since its last movement or transaction. This yielded a total of 27,588 tokens that could be tagged as malicious: inactive tokens that had, at some point, lost all their value in price or liquidity and had never recovered it.
- Non-malicious tokens cannot be identified from a liquidity, price, and activity analysis alone. A token can be considered malicious if it has suffered at least one rug pull at some point in its activity. However, a token that has not yet suffered a rug pull cannot be considered non-malicious, since it could experience one later on. Therefore, we rely on audits carried out by external companies (Certik, Quantstamp, Hacken...). In this way, a list of 674 tokens labelled as non-malicious was compiled from different sources: CoinMarketCap, CoinGecko, and Etherscan.
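The maximum drop, recovery, and inactivity criteria used in the labelling step can be illustrated with a short Python sketch. The exact formulas are not given in this summary, so the definitions below are assumptions: the maximum drop is taken as the largest relative fall from a running peak, the recovery as the fraction of that fall regained after the bottom, "one month" as 30 days, and the two thresholds in `looks_like_executed_rug_pull` are hypothetical values chosen only for illustration.

```python
SECONDS_PER_DAY = 86_400


def max_drop_and_recovery(series):
    """Return (max_drop, recovery) for a price or liquidity series.

    max_drop: largest relative fall from a running peak (assumed definition).
    recovery: share of that fall regained after the bottom (assumed definition).
    """
    if len(series) < 2:
        return 0.0, 0.0
    peak = series[0]
    best_drop, best_peak, trough_idx = 0.0, series[0], 0
    for i, value in enumerate(series):
        if value > peak:
            peak = value
        drop = (peak - value) / peak if peak > 0 else 0.0
        if drop > best_drop:
            best_drop, best_peak, trough_idx = drop, peak, i
    if best_drop == 0.0:
        return 0.0, 0.0
    trough = series[trough_idx]
    rebound = max(series[trough_idx:]) - trough
    return best_drop, rebound / (best_peak - trough)


def is_inactive(last_activity_ts, now_ts, days=30):
    """True if more than `days` days passed since the last activity (30 is assumed)."""
    return now_ts - last_activity_ts > days * SECONDS_PER_DAY


def looks_like_executed_rug_pull(series, last_activity_ts, now_ts,
                                 drop_threshold=0.9, recovery_threshold=0.1):
    """Hypothetical rule: big drop, no real recovery, and no recent activity."""
    drop, recovery = max_drop_and_recovery(series)
    return (drop >= drop_threshold
            and recovery <= recovery_threshold
            and is_inactive(last_activity_ts, now_ts))
```

With a series such as `[1, 10, 0.1, 0.2]`, the sketch reports a drop of about 0.99 and a recovery of about 0.01, which matches the intuition of a token that lost nearly all its value and never regained it.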
- Features: We compute the following features to extract relevant information about the tokens listed in Uniswap V2.
- Machine Learning: We defined two methods that use Machine Learning models to discriminate between malicious and non-malicious tokens: Activity based Method and 24 Early Method.
- Activity based Method: For each token labelled as malicious, we have randomly selected several evaluation points before the maximum drop. Non-malicious tokens have been evaluated throughout their activity. Then, for each evaluation point, we calculated the token features up to that block and used them to train two ML classifiers (XGBoost and FT-Transformer) to find patterns related to malicious activity.
- 24 Early Method: For each labelled token, we have computed its features in each of the 24 hours after its pool creation. In this case, we train the models separately for each hour, so we have only one evaluation point per token. This also implies that the dataset is smaller than in the other method.
Most tokens are labelled as malicious. Indeed, labelling all of them as malicious would be enough to achieve an accuracy of 97.7%. Therefore, we used a data augmentation technique that consists of choosing more evaluation points for non-malicious tokens than for malicious tokens. In particular, we selected five evaluation points for each non-malicious token and one for each malicious token. In addition, we labelled the non-malicious tokens as 1 and the malicious tokens as 0, and tried to increase the performance in predicting non-malicious tokens. To validate both methods we used 5-fold cross-validation, so all results are presented as the mean and standard deviation over the folds.
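The sampling of evaluation points and the 5-fold reporting described above can be sketched in Python as follows. This is an illustration under stated assumptions rather than the authors' code: the token dictionary fields (`address`, `malicious`, `first_block`, `max_drop_block`, `last_block`) and all function names are hypothetical, and feature computation and model training (XGBoost or FT-Transformer in the paper) are left out, so only the dataset construction and the fold bookkeeping are shown.

```python
import random
import statistics


def sample_evaluation_points(start_block, end_block, n_points, rng):
    """Pick up to n_points distinct blocks in [start_block, end_block)."""
    span = range(start_block, end_block)
    return sorted(rng.sample(span, min(n_points, len(span))))


def build_dataset(tokens, rng, points_non_malicious=5, points_malicious=1):
    """Return (address, block, label) rows; label 1 = non-malicious, 0 = malicious."""
    rows = []
    for tok in tokens:
        if tok["malicious"]:
            # Malicious tokens are only evaluated before their maximum drop.
            end, n, label = tok["max_drop_block"], points_malicious, 0
        else:
            # Non-malicious tokens are evaluated throughout their activity.
            end, n, label = tok["last_block"] + 1, points_non_malicious, 1
        for block in sample_evaluation_points(tok["first_block"], end, n, rng):
            rows.append((tok["address"], block, label))
    return rows


def k_fold_indices(n_rows, k, rng):
    """Yield (train_indices, test_indices) for each of k shuffled folds."""
    idx = list(range(n_rows))
    rng.shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    for i in range(k):
        train = [j for f_i, fold in enumerate(folds) if f_i != i for j in fold]
        yield train, folds[i]


def mean_and_std(scores):
    """Summarise per-fold scores as (mean, sample standard deviation)."""
    return statistics.mean(scores), statistics.stdev(scores)
```

In a full pipeline, each row would be turned into a feature vector computed up to its block, a classifier would be fitted on the training indices of every fold, and `mean_and_std` would be applied to the per-fold accuracy, precision, recall, and F1 scores.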
- Both XGBoost and FT-Transformer get high metrics for accuracy, recall, precision, and F1-Score. However, XGBoost outperforms FT-Transformer in all metrics.
- XGBoost obtains an accuracy of 0.9936, a recall of 0.9540, and a precision of 0.9838 in distinguishing non-malicious tokens from scams. In contrast, FT-Transformer obtains an accuracy of 0.9890, a recall of 0.9180, and a precision of 0.9752. Therefore, from now on we analyse only the XGBoost results.
- For each labelled token, we have computed its features in each of the 24 hours after its pool creation. In this case, we train both models separately for each hour, so we have only one evaluation point per token.
- Our algorithm obtains a very high accuracy even in the first hours. However, the precision, recall, and F1-score are lower than in the Activity based Method. In the best case, i.e. 20 hours after the creation of the pool, our best algorithm obtains a recall of 0.789. This could indicate that while malicious tokens are easily detectable in the first few hours, detecting non-malicious tokens requires more time.
- We provided a theoretical classification to understand the different ways of executing the scam, and through the process of identifying rug pulls we found new token smart contract vulnerabilities (composability attacks) and new ways of money laundering.
- We provided a methodology to find rug pulls that had already been executed. Not surprisingly, we found that more than 97.7% of the labelled tokens were rug pulls.
- We defined two methods that use ML models to distinguish non-malicious tokens from malicious ones. We also verified the high effectiveness of these models in both scenarios.
- In this paper we showed that different machine learning techniques can be used to detect scams before the malicious maneuver is executed, without the need for off-chain data.
- The efficiency and accuracy of the results could be improved using novel techniques such as topological data analysis.
- Since the market is shifting from Uniswap V2 to Uniswap V3, an obvious follow-up would be to study rug pulls in Uniswap V3 and develop new tools to detect them.
- The algorithm and the methodology provided in the paper could be developed to help and protect uninformed investors in blockchain distributed apps.
- With the help of exchanges, the algorithm developed to detect rug pulls could provide helpful forensic analysis to detect scammers.