I fell in love with Bitcoin back in 2012, and studied it obsessively, or, as we say in the crypto world, I fell down the rabbit hole. It was not my intention to become a crypto educator, but when you have a deep understanding of and appreciation for something, it becomes painful to see it misunderstood and/or misrepresented.
I began by writing blog posts, then hosted educational events, ran workshops, taught blockchain courses for undergraduate and graduate university students, managed a crypto education nonprofit, and produced a variety of crypto courses. I now work with some of the most impactful crypto educators.
I’ve watched the changing misconceptions about crypto and blockchain tech for 8 years now. Thankfully we’ve evolved a lot from the “Bitcoin is drug money” days, but there are still some terribly common and alarmingly inaccurate misconceptions about blockchain technology and what it does. As it continues to pain me to see my favorite tech misunderstood, below I walk through the three most common misconceptions about blockchain tech.
Note: I am speaking here about open, public blockchains such as Bitcoin and Ethereum. Private, permissioned blockchains are more appropriately called DLT, or Distributed Ledger Technology, and are a separate discussion. When looking at open blockchains, it is terribly difficult to understand what they are good at, and what they are very bad at, if you don’t understand what a blockchain actually does. Understanding the problem that blockchain tech was built to solve gives you a solid foundation for understanding what it does and does not do, and helps you to sort the interesting projects from the nonsense. Watch the 16-minute video linked at the end of this post for an understanding of blockchain mechanics.
Myth: Blockchains secure private data
Blockchains are not for privacy. They are not for preventing access to data. In fact, they are exactly the opposite. They were built for reaching agreement, or “consensus”, on public data. This fundamental misunderstanding of blockchain tech is currently the most persistent that I see, and I see it so often that I have to ponder how this happened. It seems to stem from a misunderstanding of what cryptography is used for on a public blockchain. You see, Bitcoin is called a “cryptocurrency” as it makes heavy use of cryptography, but it doesn’t actually encrypt anything… I’ll explain.
Cryptography, for those not familiar, is a branch of math that is used to secure communications, to hide them from adversaries. So to be fair, this misunderstanding is rather understandable. The “crypto” in “cryptography” comes from a Greek word that literally means “hidden”, or “secret”. And we are very used to thinking of cryptography as a tool for securing private data. Cryptography is what is used to create that green lock that you see in your browser when visiting a website with sensitive data. It’s what is used to hide your credit card number when you are shopping online. But the branch of math that is cryptography includes a wide variety of functions that serve many purposes besides just the hiding of data. Let’s take a look at two of those cryptographic functions that are used heavily in blockchain tech: digital signatures and hashing.
Note: Blockchains make heavy use of cryptographic keys, also known as public/private key pairs.
A digital signature is used to prove the origin and validity of data. If you want to broadcast a message and prove that the message came from you, you can sign that message with a private cryptographic key to produce a digital signature. You can then broadcast the message, with the digital signature, and anyone holding your public key can validate that only someone in possession of your private key could have sent that message. Bitcoin uses this functionality on transactions. All transactions must come with a valid digital signature to prove their origin and be entered into the blockchain.
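To make the sign-then-verify flow concrete, here is a toy sketch using textbook RSA and nothing but Python's standard library. Please treat it as an illustration only: Bitcoin signs transactions with elliptic-curve signatures (ECDSA on the secp256k1 curve), not RSA, and the key below is built from two publicly known Mersenne primes with no padding, so it is useless for real security. The function names `digest`, `sign`, and `verify` are my own.

```python
import hashlib

# Toy key pair built from two publicly known Mersenne primes.
# Insecure by design: this only demonstrates the sign/verify flow.
P = 2**127 - 1
Q = 2**89 - 1
N = P * Q                              # public modulus
E = 65537                              # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))      # private exponent (needs Python 3.8+)


def digest(message: bytes) -> int:
    """Hash the message and reduce it into the range of the modulus."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % N


def sign(message: bytes) -> int:
    """Only the holder of the private exponent D can produce this value."""
    return pow(digest(message), D, N)


def verify(message: bytes, signature: int) -> bool:
    """Anyone who knows the public values (N, E) can check the signature."""
    return pow(signature, E, N) == digest(message)


tx = b"Alice pays Bob 1 BTC"
sig = sign(tx)
print(verify(tx, sig))                        # True
print(verify(b"Alice pays Eve 1 BTC", sig))   # False
```

Notice that nothing here is hidden: the message is broadcast in the clear, and the signature only proves who authorized it and that it was not altered.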
A cryptographic hash is used as a unique identifier of data. Data can be run through a hashing algorithm, producing an effectively unique identifier for that data that cannot be reverse-engineered. If you have a hash, you cannot feasibly determine what data was used to produce that hash. However, if you have both the data and the hash, you can very quickly validate that the hash matches the data. Very simple, but impactful. While we won’t dive into the details here, the asymmetry in this functionality is fundamental to how Bitcoin “mining” works.
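Here is a minimal sketch of that asymmetry using SHA-256 from Python's standard library. The `find_nonce` function is a deliberately simplified stand-in for mining: real Bitcoin mining applies SHA-256 twice to a block header and compares the result against a numeric target, so the three-zero prefix and the helper names below are assumptions made for illustration.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as a hex string."""
    return hashlib.sha256(data).hexdigest()


def matches(data: bytes, expected_hash: str) -> bool:
    """Checking data against a known hash takes a single hash call."""
    return sha256_hex(data) == expected_hash


def find_nonce(data: bytes, prefix: str = "000") -> int:
    """Toy 'mining': try nonces until the hash starts with the prefix.

    Finding a nonce takes many attempts; checking one takes a single hash.
    """
    nonce = 0
    while not sha256_hex(data + str(nonce).encode()).startswith(prefix):
        nonce += 1
    return nonce


block = b"some block data"
fingerprint = sha256_hex(block)
print(matches(block, fingerprint))               # True
print(matches(b"some block dat4", fingerprint))  # False

nonce = find_nonce(block)
print(sha256_hex(block + str(nonce).encode())[:3])  # 000
```

Changing a single character of the input produces a completely different hash, which is why the hash works as an identifier, and the nonce search shows how a result can be expensive to find yet trivial for everyone else to check.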
The cryptographic functionality that we are generally most familiar with is encrypting data. Encryption scrambles data in such a way that it can only be decrypted and read by the entity in possession of a certain key. As everything that happens on the Bitcoin network is public, the software doesn’t really have use for this functionality.
[Figure: Visualization of encrypting vs. signing vs. hashing]
In summary, there are two primary cryptographic functions that are used in the Bitcoin blockchain:
- To “sign” transactions thus proving the authenticity of the transaction data.
- To “hash” data thus creating a unique identifier of that data.
There have been many, many developments in blockchain tech in the 11 years since its inception. There are now some very interesting tools for anonymizing or obscuring data on a blockchain, but these have been additions to the base functionality.
The entire system was built for the purpose of sharing and agreeing upon data. This is the core of its functionality. While it may be possible to retrofit some privacy into a blockchain system, I recommend being very cautious of any attempts to do so. At its core, blockchain is a tool for collaborating on shared, public data.
To understand why Bitcoin is all about sharing and collaborating on public data you’ll need to understand “the double-spend problem”, which is explained in the blockchain mechanics video.
Myth: A blockchain can control ‘real world’ items
I run across this misconception most commonly when discussing smart-contract-related projects. Whatever smart contract platform you want to use, your smart contract can only control what “lives” on that blockchain. There remains a very big problem of feeding accurate “real world” data to your smart contract, and this will very likely be the weakest point in your smart contract system.
We often call this “the oracle problem”, or use the term “garbage in, garbage out”.
Let me give you an example of how this can be a problem. I had a team of MBA students working on a blockchain business plan. They wanted to create a smart-contract-based sports betting system. Blockchain tech has some advantages in this situation, specifically “immutability” (you can’t edit a deployed smart contract, so no one can back out of the bet) and automatic execution.
The idea was interesting: build a platform that serves as a matching service. Say you want to bet that the Chicago Bears will win the Super Bowl this year and some, perhaps anonymous, person on the other side of the world thinks you’re crazy and wants to take the other side of that bet. Well, this service would connect the two of you, and then the bet would be made on the blockchain (Ethereum in this example). Your bet would be written in immutable (unchangeable) code in a smart contract and you would both put the appropriate amount of funds into this smart contract. Once someone has won the Super Bowl, the smart contract would automatically execute and the winning party would receive the funds. As this is done automatically via immutable code, neither of you can cheat. Awesome.
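The logic of such a bet can be modeled in a few lines. This is a plain-Python model of the idea, not contract code for Ethereum or any other platform, and the `BetEscrow` class, its fields, and the outcome labels are all hypothetical names chosen for the example.

```python
from dataclasses import dataclass, field


@dataclass
class BetEscrow:
    """Hypothetical two-party bet whose terms are fixed at creation."""
    stake: int                      # amount each party must deposit
    parties: dict                   # predicted outcome -> party name
    deposits: dict = field(default_factory=dict)
    settled: bool = False

    def deposit(self, party: str, amount: int) -> None:
        if party not in self.parties.values():
            raise ValueError("unknown party")
        if amount != self.stake:
            raise ValueError("deposit must equal the agreed stake")
        self.deposits[party] = amount

    def settle(self, reported_outcome: str) -> tuple:
        """Pay the whole pot to whoever predicted the reported outcome."""
        if self.settled:
            raise RuntimeError("already settled")
        if len(self.deposits) != len(self.parties):
            raise RuntimeError("not fully funded")
        winner = self.parties[reported_outcome]
        self.settled = True
        return winner, sum(self.deposits.values())


bet = BetEscrow(stake=100, parties={"bears_win": "you", "bears_lose": "stranger"})
bet.deposit("you", 100)
bet.deposit("stranger", 100)
result = bet.settle("bears_lose")
print(result)   # ('stranger', 200)
```

The escrow rules are easy to make tamper-proof. What the model cannot do on its own is know whether `reported_outcome` is true; it simply pays out on whatever value it is handed.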
But here is the problem: who tells the smart contract who won the Super Bowl? Well, you could set up a service to pull data from more traditional sports betting services and feed it to a smart contract. However, a clever thief, of which there are many in this space, can game this situation.
If a thief wants to game this setup, what they need to do is place large bets on unlikely outcomes, … the Bears winning the Super Bowl, and then hack one data feed for only a matter of minutes. They only need to compromise one website and wait for the right moment to feed bad data to that immutable code. This is a very, very weak point in this setup, and one that is often overlooked. A system set up in the above fashion would fail very quickly.
As this is a very common problem in smart contract systems, there are solutions in the works. A common solution is to collect data from many sources to prevent one single point of failure in the system. Essentially, to feed bad data to that smart contract, a thief would then have to hack 10 systems instead of just one.
For example, Chainlink is one currently popular service that provides assistance here.
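The multi-source idea can be sketched as a simple quorum check. This is my own minimal illustration and not a description of how any particular oracle service works; the feed names, the `aggregate_outcome` function, and the quorum values are assumptions.

```python
from collections import Counter


def aggregate_outcome(reports: dict, quorum: int) -> str:
    """Return the outcome reported by at least `quorum` feeds.

    Raises ValueError if no outcome reaches the quorum, so the bet is
    never settled on a weakly supported answer.
    """
    if not reports:
        raise ValueError("no reports")
    outcome, votes = Counter(reports.values()).most_common(1)[0]
    if votes < quorum:
        raise ValueError("no outcome reached the quorum")
    return outcome


# Ten hypothetical feeds, one of which has been compromised.
reports = {f"feed_{i}": "bears_lose" for i in range(10)}
reports["feed_3"] = "bears_win"

print(aggregate_outcome(reports, quorum=6))   # bears_lose
```

With a quorum of 6 out of 10, an attacker has to compromise six feeds rather than one. Requiring all 10 to agree raises the bar further, at the price of the system stalling whenever a single feed is down or wrong.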
However, this remains a fundamental weakness in smart contract tech. Smart contracts are great when dealing with any asset that “lives” on its blockchain. Examples include Ether, the currency used on the Ethereum blockchain, or any tokens created by other smart contracts. However, any smart contract that has to interact with anything outside its own blockchain has a very big problem to solve. As such, a question you must always ask when dealing with any smart contract of this sort is… How are we feeding real-world data to this immutable code?
Myth: Blockchains are databases
Space in a public blockchain is a scarce and valuable resource. If you try to use a blockchain as a database that will cost you, in both time and money.
Blockchains were designed to share data without intermediaries. But cutting out that intermediary isn’t easy, and it comes with some trade-offs.
Blockchains were built for sharing and agreeing on data out in public, in a decentralized fashion. As these systems exist in the wild, open to all kinds of attacks, they can only work if the security of the network and resilience to attack are the priorities. This means speed, efficiency, scaling, and low fees have all been traded in for the resilience that is needed for an open system to survive out in the wild.
While there are ways to embed data in blockchains (Bitcoin’s OP_RETURN outputs, for example), every time you enter data into the Bitcoin blockchain in this fashion you’ll need to pay a transaction fee. These open blockchain networks have transaction fees in part to provide financial incentive to the “miners” who maintain the network, but also to prevent spam and DDoS attacks. If anyone could enter data into the blockchain without a cost to doing so, the network would be crippled with a flood of low priority transactions.
When transaction volume on a blockchain increases, so do the transaction fees. If your application depends on consistently entering data into the blockchain, you may find your operating cost skyrockets when network traffic increases.
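A bit of back-of-the-envelope arithmetic shows how quickly this adds up. On Bitcoin, a fee is usually quoted as transaction size multiplied by a fee rate in satoshis per virtual byte. Every number below (writes per day, transaction size, and both fee rates) is a hypothetical value picked for illustration, not a measurement.

```python
SATS_PER_BTC = 100_000_000


def fee_sats(tx_size_vbytes: int, fee_rate_sat_per_vb: int) -> int:
    """Fee = transaction size (virtual bytes) x fee rate (satoshis per vbyte)."""
    return tx_size_vbytes * fee_rate_sat_per_vb


def daily_cost_sats(writes_per_day: int, tx_size_vbytes: int,
                    fee_rate_sat_per_vb: int) -> int:
    """Total fees for one day of writes at a given fee rate."""
    return writes_per_day * fee_sats(tx_size_vbytes, fee_rate_sat_per_vb)


# Hypothetical workload: 1,000 writes a day at 250 vbytes each.
quiet = daily_cost_sats(1_000, 250, fee_rate_sat_per_vb=2)
busy = daily_cost_sats(1_000, 250, fee_rate_sat_per_vb=100)

print(quiet / SATS_PER_BTC)   # 0.005
print(busy / SATS_PER_BTC)    # 0.25
print(busy // quiet)          # 50
```

Nothing about the application changed between the two lines; only the network got busier, and the operating cost went up fiftyfold.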
Blockchains are slow. New data is entered into the Bitcoin blockchain roughly every 10 minutes. While some blockchains are much faster (Ethereum’s block time is ~20 seconds), they are still dramatically slower than even a basic install of a traditional database. And even after your data has been included in the Bitcoin blockchain, you’ll need to wait about an hour to be very sure that your data is embedded there permanently. These things just don’t function like the SQL databases that you may be familiar with.
Sometimes when people think they need a blockchain, they don’t actually need the things that blockchains do best (resilience to attack and censorship, etc.). What they actually need are features such as auditability, cryptographically signed entries, or tamper evidence. And, thankfully, there are much simpler ways to achieve many of these functions. A tool such as Git will do much of this for you at a fraction of the cost and effort.
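To show how little machinery tamper evidence needs, here is a minimal sketch of a hash-chained log. Git relies on a similar idea, since each commit ID is a hash that covers the commit's content and its parent. The entry layout and the function names are my own assumptions for this example.

```python
import hashlib
import json


def entry_hash(prev_hash: str, payload: dict) -> str:
    """Hash an entry together with the hash of the entry before it."""
    body = json.dumps({"prev": prev_hash, "payload": payload}, sort_keys=True)
    return hashlib.sha256(body.encode()).hexdigest()


def append(log: list, payload: dict) -> None:
    """Add an entry that commits to everything recorded before it."""
    prev_hash = log[-1]["hash"] if log else "0" * 64
    log.append({"prev": prev_hash, "payload": payload,
                "hash": entry_hash(prev_hash, payload)})


def is_intact(log: list) -> bool:
    """Recompute every link; any edited entry breaks the chain."""
    prev_hash = "0" * 64
    for entry in log:
        if entry["prev"] != prev_hash:
            return False
        if entry["hash"] != entry_hash(prev_hash, entry["payload"]):
            return False
        prev_hash = entry["hash"]
    return True


log = []
append(log, {"event": "invoice created", "amount": 100})
append(log, {"event": "invoice paid", "amount": 100})
print(is_intact(log))            # True

log[0]["payload"]["amount"] = 1  # quietly rewrite history
print(is_intact(log))            # False
```

This gives you tamper evidence with no miners, no fees, and no waiting. It does rely on someone you trust keeping a copy of the latest hash, because whoever controls the whole log could recompute every link. Removing that trusted party is exactly the expensive part that a blockchain pays for.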
If it’s possible to use a traditional database for your use case, do so. If removing an intermediary isn’t entirely necessary, then leave it be. Removing the intermediary is expensive!
And again, space in a public blockchain is a scarce and valuable resource.
What Blockchains are good at
So if open blockchains don’t secure private data, if they aren’t good at managing anything outside the blockchain, and if they aren’t databases, what are they for?
Open, public blockchains such as Bitcoin or Ethereum were built to enable collaboration on data in a decentralized, censorship-resistant fashion. They are systems that reach consensus through a protocol rather than through a trusted third party. And they are fantastic at what they do.
Bitcoin has proven its value as a digital currency and Ethereum has proven its value as a smart contract platform. But blockchain tech also shows promise in other areas such as DeFi or Decentralized Finance (which is still a VERY immature technology, proceed with caution!); SSI or Self-Sovereign Identity; payment networks such as the Lightning Network; NFTs or Non-Fungible Tokens; and many more.
If you need cryptographically signed entries, auditability, time-stamping, and/or tamper evidence in a decentralized, censorship-resistant fashion, you may need a blockchain. If you need to collaborate on data (not just share it) with multiple entities that you do not trust, in a decentralized, immutable, censorship-resistant fashion, you may need a blockchain.
If you don’t need these features, you probably don’t need a blockchain.
Blockchain is a very new technology, and the frequent misconceptions around it are rather understandable. It uses cryptography but doesn’t actually encrypt anything. It’s a “shared ledger” but not a database. It cuts out the middleman but isn’t necessarily cheaper.
The cure for misconceptions is a real understanding of what this technology actually does.
Blockchain mechanics: https://youtu.be/GLD1ol3wRs8