A Smarter Way of Benchmarking Blockchains: From Overall Measurements of Chains to Smart Contract Timings

The Difficulties that Appear When Trying To Compare Blockchains

--

We will dive into benchmarking smart contract systems: testing code compiled to EVM, WASM & RustNative

Since the early modern period, there has been an obsession with perfecting human inventions. Human ingenuity keeps driving the creation of new technological machines, and something in our nature makes us, almost immediately after adopting them, want to compare how powerful they are and make them go faster and faster…

The focus always seems to end up on racing them.

For example, the world witnessed the invention of the petrol-based internal-combustion engine in the 1880s and, not long after, in 1895, the first true motor race, organised in France from Paris to Bordeaux.

The Peugeot Car that was declared the winner of the race (1895). Source here.

Unfortunately, the engineers’ fixation on speed took quite some time to fade. It’s not as if ordinary people’s need for speed in transport was that great, and it’s not as if the roads were good enough for cars to run on anyway.

1924 car race. Source: Caerphilly Mountain Hillclimb Race in Cardiff/Wales.

Nevertheless, car races continued, uninterrupted, in ways that perhaps look amusing to us now when we look back (there is even a movie about it).

Scene from: The Great Race (1965).

Finally, this type of automobile racing ended when people woke up and realised that other measurements were needed to optimise these inventions. In other words, other kinds of technical metrics were necessary, not only the ones directly connected to speed (like 0 to 100 km/h acceleration time, engine power, etc.).

Now, I know many of you might ask: what does blockchain benchmarking have to do with car benchmarking? Well, what is happening today in the blockchain software industry is similar to what happened in the automobile industry in the past. It’s as if blockchain devs, engineers and promoters are interested only in the speed of these systems and nothing else, speed measured in transactions per second (TPS). Every piece of blockchain marketing material seems to advertise the TPS first and everything else only as an afterthought, a bizarre choice to say the least.

Example of blockchain marketing. Source: 101Blockchains

When it comes to benchmarking blockchains, the methodology should certainly be different. Just as we use a different set of benchmarks depending on how a car is used (e.g. for off-road cars, the most important thing is horsepower output rather than top speed), we should use a different set of benchmarks for blockchains based on how they operate. To illustrate with a simple example: for blockchains that store a large quantity of information (like Filecoin.io), the main benchmark should be the storage footprint instead of TPS. The same reasoning extends to other blockchains.

In the IT world, hardware and even software engineers use benchmarking methodologies that cover a variety of things: central processing unit (CPU) testing; graphics processing unit (GPU) testing (e.g. multi-texturing fill-rate tests, vertex and pixel shader tests); database management system (DBMS) throughput testing; input/output (I/O) storage speed testing; compiler testing; and even virtualisation testing.

In a similar way, in the blockchain world, the developers who build these projects should perhaps keep in mind, first and foremost, the following:

  • The blockchain’s block time at a given throughput that does not produce forks (in blockchains there is a problem: if too many blocks are produced per second, the network is very likely to fork into sub-networks);
  • The blockchain’s storage footprint (for chains that allow smart contracts), which can be benchmarked in a similar way to databases (or DBMS systems);
  • The blockchain’s volatile memory footprint (relevant for fast, large-block blockchains like EOS, which uses a lot of RAM);
  • The blockchain’s contract-execution CPU footprint, which also indicates whether the validators’ incentives to execute contracts are proportional to the validators’ costs.

Of the approaches mentioned above, I decided to concentrate on the CPU footprint of smart contract execution, as the others don’t seem as relevant in today’s enterprise blockchain landscape. CPU power is still the most important resource for dev-ops, even today.

The dev team who created Ethereum advertised the project as the World Computer. On it, you can create and deploy your smart contracts at virtually any time and from any place, and later execute all the functions built into them. Your code is executed by thousands of computers, simultaneously, all around the world! For that reason, you ought to make the CPU footprint of your code as small as possible, and that can only happen through continuous benchmarking and optimisation of the blockchain software you want to use.

Of course, you could create a real network, use it to test the particular blockchain that interests you, and then optimise certain things in the blockchain software. Unfortunately, this method is rather costly and can cause large development delays. The alternative is to test each element of the software separately against benchmarks. In this case, the element responsible for the CPU footprint is the virtual machine (VM) in which the smart contract is executed.

With this way of innovating in mind, here at Digital Catapult we decided to use some internal resources and create a mini benchmarking project. The main idea of the project is to create a benchmarking system for smart contract VMs by running common tests (mainly CPU-intensive tests like hashing and big-number multiplication) coded into smart contracts. In the blockchain world, smart contracts are generally executed in some kind of VM. The main reason Ethereum, Hyperledger Fabric and other blockchains execute smart contracts within a VM is that execution directly on the native system could hit weird bugs that produce infinite loops, leading to really crazy results, like thousands of frozen machines and a broken chain at the same time. Substrate, on the other hand, takes a different approach: in some cases parts of the code can run as native Rust after thorough verification, while the same code can also be executed in some kind of WASM VM. An ingenious solution, though hard to explain briefly!
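
To make the idea concrete, here is a minimal, hypothetical sketch in Rust of what such a harness could look like: each engine wraps one way of executing the same workload, and the harness simply times it. The Engine trait, the NativeEngine type and the stand-in workload are illustrative assumptions, not code from the repository.

```rust
use std::time::{Duration, Instant};

/// One way of executing a benchmark workload (EVM, a Wasm engine, native code, ...).
/// Purely illustrative; the real project wires in concrete engines differently.
trait Engine {
    fn name(&self) -> &str;
    fn execute(&self, input: &[u8]) -> Vec<u8>;
}

/// A "native" engine that runs the workload directly on the host CPU.
struct NativeEngine;

impl Engine for NativeEngine {
    fn name(&self) -> &str { "rust-native" }
    fn execute(&self, input: &[u8]) -> Vec<u8> {
        // Stand-in CPU-intensive workload (the real tests use blake2b, sha1, bn128_mul).
        let mut acc: u64 = 0;
        for _ in 0..10_000 {
            for &b in input {
                acc = acc.wrapping_mul(31).wrapping_add(b as u64);
            }
        }
        acc.to_le_bytes().to_vec()
    }
}

/// Time a single run of an engine and return the wall-clock duration.
fn bench(engine: &dyn Engine, input: &[u8]) -> Duration {
    let start = Instant::now();
    let _out = engine.execute(input);
    start.elapsed()
}

fn main() {
    let input = vec![0xabu8; 2805]; // same order of magnitude as the blake2b test inputs
    let engines: Vec<Box<dyn Engine>> = vec![Box::new(NativeEngine)];
    for engine in &engines {
        println!("{}: {:?}", engine.name(), bench(engine.as_ref(), &input));
    }
}
```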

Going back to the main topic, the idea is to develop benchmarks for the various virtual machines used in real-world blockchains, so we can get an idea of how the resulting systems will perform. It is to be expected that code executed natively is faster than the same code executed in a virtual machine on the same system, but we can test this assumption and see how big the difference really is.

Example of VM benchmark. Source: Mitchellh.

The mini project at Digital Catapult has three parts: PART ONE — EVM, PART TWO — WASM, PART THREE — Rust Native.

So, in part ONE I am benchmarking the EVM (Ethereum Virtual Machine). The EVM is a powerful, sandboxed, stack-based virtual machine (similar to Bitcoin’s stack-based Script, but Turing-complete) embedded within each full Ethereum node. Each node is responsible for executing contract bytecode, which is the form smart contracts take on the blockchain after you deploy them. We are currently ignoring the precompiled contracts embedded within the system.

Then, in part TWO I am looking at WASM (WebAssembly) VMs. What is WebAssembly? It’s a binary format that is compiled for a sandboxed environment and runs within a lot of constraints to make sure it has no security vulnerabilities. The original goals of the WASM project are: FAST: executes with near-native performance, taking advantage of capabilities common to all contemporary hardware; SAFE: code is validated and executes in a memory-safe, sandboxed environment, preventing data corruption or security breaches; and, last but not least, LANGUAGE-INDEPENDENT: it does not privilege any particular language, programming model, or object model.
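
As a small illustration of how the same logic ends up inside a Wasm engine, a Rust function like the one below can be compiled to the wasm32-unknown-unknown target and then loaded by any of the engines under test. This is a generic sketch, not a snippet from the benchmark repository; the function name and the build command are assumptions.

```rust
// Assumed build step: cargo build --release --target wasm32-unknown-unknown
// (with the crate compiled as a cdylib so the engine gets a .wasm module).

/// A tiny stand-in for the CPU-intensive contract logic, exported with a C ABI
/// and an unmangled name so a Wasm engine can look it up and call it directly.
#[no_mangle]
pub extern "C" fn mix(mut x: u64, rounds: u32) -> u64 {
    for _ in 0..rounds {
        // Cheap, branch-free mixing, standing in for real hashing work.
        x = x.wrapping_mul(6364136223846793005).wrapping_add(1442695040888963407);
        x ^= x >> 33;
    }
    x
}
```

The same source, compiled natively instead of to Wasm, is what a “RustNative” baseline measurement would run.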

And last but not least, in part THREE I am looking at native Rust, which is even faster but perhaps a bit more exotic in the blockchain context. If you don’t know it, Rust is a relatively new programming language, much like C++, designed to empower everyone to build reliable and efficient software. It is blazingly fast (according to the official website) and memory-efficient: with no runtime or garbage collector, it can power performance-critical services, run on embedded devices, and easily integrate with other languages. According to some experts, native Rust today runs on average 120% faster than the same code in WASM.

The project is open source. The main inspiration for the code comes from the official Ethereum WebAssembly (Ewasm) project, and everything is available on GitHub, in this repository:

https://github.com/dc-andysign/rust-evm-ewasm-banchmark

This is a humble project whose main purpose is to help create benchmarks for blockchain projects, with all the sample CPU-intensive test contracts being, more or less, virtualised.

The main hope for this project is to help optimise the execution of smart contracts so much that people will be able to build logic of any complexity into future blockchains. Who knows, maybe in the future people will end up creating operating systems on top of blockchains. Operating systems on chains that could incorporate other chains on top? Blockchainception… Who knows? Certainly, the current DeFi world is built in layers like that. It seems to be the tendency.

Now, let's look at the results!

Execution times vary depending on the machine, but the resulting charts should look similar.

After running the tests, the data is written to four separate CSV files.
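
For reference, here is a minimal sketch, using only the Rust standard library, of how such per-engine timings might be dumped into a CSV file. The column names and the example rows are illustrative assumptions, not the repository’s actual output format or results.

```rust
use std::fs::File;
use std::io::{self, Write};

/// Write one (engine, test, milliseconds) row per measurement into a CSV file.
fn write_csv(path: &str, rows: &[(&str, &str, f64)]) -> io::Result<()> {
    let mut file = File::create(path)?;
    writeln!(file, "engine,test,millis")?;
    for (engine, test, millis) in rows {
        writeln!(file, "{},{},{:.3}", engine, test, millis)?;
    }
    Ok(())
}

fn main() -> io::Result<()> {
    let rows = [
        ("geth-evm", "blake2b_2805", 12.5), // illustrative numbers, not real results
        ("wasm3", "blake2b_2805", 3.2),
        ("rust-native", "blake2b_2805", 0.9),
    ];
    write_csv("blake2b.csv", &rows)
}
```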

The contract execution engines tested here are:

  1. EVM engine versions:
    · geth: v1.9.14 (+ Go 1.11);
    · parity/openethereum: v2.5.1 (May 2019).
  2. WebAssembly engine versions:
    · wasm3: v0.4.7;
    · 12 other Wasm engines.
  3. Rust Native:
    · rustc v1.46.0.

The test algorithms are:

  • blake2b (2805, 5610 & 8415 bytes)
  • sha1 (10808, 21896 & 42488 bits)
  • bn128_mul (chfast2 & cdetrio2)
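
To give a feel for what the native end of these tests looks like, here is a small sketch that hashes inputs of the listed sizes and times them. It assumes the RustCrypto blake2 and sha1 crates (the repository may wire things up differently), and the numbers it prints are of course machine-dependent.

```rust
use std::time::Instant;

use blake2::Blake2b512; // assumed dependencies: blake2 = "0.10", sha1 = "0.10"
use sha1::{Digest, Sha1};

/// Hash a dummy input of the given size with both algorithms and print the timings.
fn time_hashes(label: &str, len_bytes: usize) {
    // Deterministic dummy input of the requested size.
    let input: Vec<u8> = (0..len_bytes).map(|i| (i % 251) as u8).collect();

    let start = Instant::now();
    let _digest = Blake2b512::digest(&input);
    let blake = start.elapsed();

    let start = Instant::now();
    let _digest = Sha1::digest(&input);
    let sha = start.elapsed();

    println!("{}: blake2b {:?}, sha1 {:?}", label, blake, sha);
}

fn main() {
    // The blake2b test sizes are given in bytes; the sha1 sizes (10808, 21896 and
    // 42488 bits) correspond to 1351, 2737 and 5311 bytes respectively.
    for &len in &[2805usize, 5610, 8415] {
        time_hashes(&format!("{} bytes", len), len);
    }
}
```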

All the steps for generating the charts are described in the main README file.

EVM vs. Wasm vs. RustNative — Blake2b hashing

EVM vs. Wasm vs. RustNative — Sha1 hashing

EVM vs. Wasm vs. RustNative — Big Number Multiplication — This is the only outlier: the EVM is faster than WASM in this case, mainly because, according to Hung-Ying Tai, the EVM (the only 256-bit stack-based VM here) supports wide operations, including 128-bit and 256-bit arithmetic, while WASM supports only 32-bit and 64-bit operations, meaning the wide ones need additional steps.
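
To see why those extra steps matter, the sketch below (generic Rust, not project code) multiplies two 128-bit values using only 64-bit limb operations, roughly the way a 32/64-bit VM has to emulate wide arithmetic, and checks the result against the native u128 multiplication.

```rust
/// Multiply two 128-bit values using only 64x64-bit partial products,
/// keeping the low 128 bits (i.e. a wrapping multiplication).
fn mul128_via_64bit_limbs(a: u128, b: u128) -> u128 {
    let (a_lo, a_hi) = (a as u64, (a >> 64) as u64);
    let (b_lo, b_hi) = (b as u64, (b >> 64) as u64);

    // Even the low half alone needs three 64x64 partial products; a full
    // 256-bit product (like the EVM's MUL on 256-bit words) needs far more.
    let lo = (a_lo as u128) * (b_lo as u128);
    let cross = ((a_lo as u128) * (b_hi as u128))
        .wrapping_add((a_hi as u128) * (b_lo as u128));

    lo.wrapping_add(cross << 64)
}

fn main() {
    let a: u128 = 0x0123_4567_89ab_cdef_0123_4567_89ab_cdef;
    let b: u128 = 0xfedc_ba98_7654_3210_fedc_ba98_7654_3210;

    // The limb-based result must match the native wrapping 128-bit multiply.
    assert_eq!(mul128_via_64bit_limbs(a, b), a.wrapping_mul(b));
    println!("limb-based and native results match: {:#034x}", a.wrapping_mul(b));
}
```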

Geth vs. Parity (OpenEthereum) — Blake2b hashing

Wasm All Engines — Sha1 hashing

Conclusion

When it comes to finding the best approach to optimising the CPU usage of basically all smart contracts, a good idea is to virtualise them and run them in a VM like WebAssembly, which is safe and close to native speed. Something faster, like native Rust or native C, could also be a good choice, but probably only in the next few years, when the technology is more mature.

--


Andy Baloiu
Digital Catapult

Andy is a full-stack dev who loves blockchain R&D. He works as a Technologist and holds a BSc in Engineering and an MA in Design. He is also a meetup organiser.