---
title: "Making the Slow Explicit: DynamoDB vs SQL"
date: 2023-02-26T15:51:19-05:00
toc: false
images:
tags:
- dev
- web
---

SQL databases like MySQL, MariaDB, and PostgreSQL are highly performant and can
scale well. However, in practice it's not rare for people to run into
performance issues with these databases and turn to NoSQL solutions like
DynamoDB.

Proponents of DynamoDB like Alex DeBrie, the author of ["The DynamoDB Book"](https://www.dynamodbbook.com/),
point to a few explanations for this difference: the HTTP-based APIs of NoSQL
databases are more efficient than the TCP connections used by SQL databases,
table joins are slow, and SQL databases are designed to save disk space while
NoSQL databases take advantage of large modern disks.[^1]

[^1]: I don't have my copy of the book handy, so I wrote these arguments from
memory. I'm confident that I remember them correctly, but apologies if I
misremembered some details.

These claims don't make a lot of sense to me though. HTTP runs over TCP; it's
not going to be magically faster. Table joins do make queries complex, but they
are a common feature that SQL engines are designed to optimize. And I don't
understand the point about SQL databases being designed to save space. While
disk capacities have skyrocketed, even the fastest disks are extremely slow
compared to how fast CPUs can crunch numbers. A single cache miss can stall a
CPU core for hundreds of cycles, and a trip to disk costs millions, so it's
critical to fit data in cache. That means making your data take up as little
space as possible. Perhaps Alex is talking about data normalization, which is a
property of database schemas and not the database itself, but normalization is
not about saving space either; it's about keeping a single source of truth for
everything. I feel like at the end of the day, these arguments just boil down
to "SQL is old and ugly, NoSQL is new and fresh".

That being said, I think it's still undeniably true that, in practice, people
hit performance issues with SQL databases far more often than they do with
NoSQL databases like DynamoDB. And I think I know why: it's because DynamoDB
makes what is slow explicit.

Look at these two SQL queries. Can you spot the performance difference between
them?

```SQL
SELECT * FROM users WHERE user_id = ?;
SELECT * FROM users WHERE group_id = ?;
```

It's a trick question, of course you can't! Not without looking at the table
schema to check whether there are indexes on `user_id` or `group_id`. And you'd
likely have to run `EXPLAIN ...` if the query were more complex to make sure
the database will actually execute it the way you think it will.

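For instance, here's a quick sketch of what that check looks like, assuming a
hypothetical `users` table on MySQL or MariaDB (PostgreSQL's `EXPLAIN` output
looks different but serves the same purpose):

```SQL
-- Without an index on group_id, the plan for this query is a full table scan,
-- but nothing in the query text itself tells you that.
EXPLAIN SELECT * FROM users WHERE group_id = 42;

-- After adding an index, the exact same query text gets a completely different plan.
CREATE INDEX idx_users_group_id ON users (group_id);
EXPLAIN SELECT * FROM users WHERE group_id = 42;
```

The query you ship is identical in both cases; the difference only shows up if
you go looking for the plan.
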
I think this makes it easy to write bad queries. Look at [Jesse Skinner's article](https://www.codingwithjesse.com/blog/debugging-a-slow-web-app/)
about the time he found a web app where all the `SELECT` queries were using
`LIKE` instead of `=`, which meant that the queries were not using indexes at
all! While it's easy to think that the developer who used `LIKE` everywhere was
just a bad developer, I think the realization we need to come to is that these
mistakes are far too easy to make. The same `SELECT` query could be looking up
a single item by its primary key, or it could be doing a slow table scan. The
same syntax could return a single result, or it could return a million results.
If you make a mistake, there is no indication of it until your application has
been live for months or even years and your database has grown to a size where
these queries start choking.

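As a small illustration with a hypothetical `email` column: if there's a plain
B-tree index on `email`, the first query below can use it, while the leading
wildcard in the second forces the database to look at every row.

```SQL
-- Can use an index on email: the database jumps straight to the matching rows.
SELECT * FROM users WHERE email = 'someone@example.com';

-- Cannot use a plain B-tree index because of the leading wildcard: every row
-- gets checked, which is the kind of silent table scan described above.
SELECT * FROM users WHERE email LIKE '%someone@example.com%';
```
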
On one hand, I think this speaks to how performant SQL databases are. You can
write garbage queries and still get decent performance until your tables grow
to hundreds of thousands of rows! But at the same time, I think this is exactly
why DynamoDB ends up being more scalable in production: because bad queries are
explicit.

With DynamoDB, if you want to get just one item by its unique key, you use a
`Get` operation that makes this explicit. If you make a query that selects
items based on a key condition, that's an explicit `Query` operation. And your
query will return only a limited number of results and require you to paginate
with a cursor, again making it explicit that you could be querying for many
items! And a query never falls back to scanning an entire table; you do a
`Scan` operation for that, which makes it explicit that you are doing something
wrong.

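To make that concrete, here's a minimal sketch of those three operations with
the Python SDK (boto3); the table name, index name, and key attributes are made
up for illustration:

```python
# A minimal sketch with boto3; the table, index, and attribute names are hypothetical.
import boto3
from boto3.dynamodb.conditions import Key

table = boto3.resource("dynamodb").Table("users")

# Explicitly fetch a single item by its full primary key.
user = table.get_item(Key={"user_id": "user-123"}).get("Item")

# Explicitly query by a key condition, here against a hypothetical secondary
# index on group_id. Results arrive a page at a time, and you have to pass the
# cursor (LastEvaluatedKey) back in to get the next page.
items = []
kwargs = {
    "IndexName": "group_id-index",
    "KeyConditionExpression": Key("group_id").eq("group-42"),
}
while True:
    response = table.query(**kwargs)
    items.extend(response["Items"])
    if "LastEvaluatedKey" not in response:
        break
    kwargs["ExclusiveStartKey"] = response["LastEvaluatedKey"]

# Reading the whole table is a separate, unmistakable Scan operation.
everything = table.scan()
```

Each access pattern is a different call with a different cost profile, so a
slow pattern is visible right in the code instead of hiding behind the same
`SELECT` syntax.
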
Rather than any magic about table joins or differences in connection types, I
think this is the biggest reason DynamoDB ends up being more scalable. It's not
because DynamoDB is magic; it's because it makes bad patterns more visible. I
think it's critical that our tools are explicit, and even painful, when we use
them in bad patterns, because we will accidentally follow bad patterns if it's
easy to do so.

I want to add, though, that DynamoDB is not perfect in this regard either. I
particularly see this with filters. It's easy to see why Amazon added filters,
but it's not rare for people to use filters without understanding how they work
and end up making mistakes (for example, [here](https://stackoverflow.com/questions/64814040/dynamodb-scan-filter-not-returning-results-for-some-requests)).

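The subtle part is that a filter runs after the items are read: you are charged
for everything scanned, the 1 MB page limit applies before the filter, and so a
page can come back nearly empty even though matching items exist further along.
A rough sketch, reusing the hypothetical `users` table from above:

```python
# A sketch of the filter gotcha; table and attribute names are hypothetical.
import boto3
from boto3.dynamodb.conditions import Attr

table = boto3.resource("dynamodb").Table("users")

# The filter is applied after a page of items (up to 1 MB) has already been
# read, so you pay for all of it, and this page may contain few or no matches.
# You still have to follow LastEvaluatedKey to be sure you saw everything.
response = table.scan(FilterExpression=Attr("status").eq("active"))
active_users = response["Items"]
```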