Storage options and performance considerations
There is quite a bit more to storage than just the number of gigabytes you get. In this lesson we'll dive into some storage options, redundancy and performance considerations.
NVMe SSD > SATA SSD > HDD
I'm not going to go into the full details behind each storage media type, and unless you're hand-picking your own hardware, you typically don't get much of a say anyway.
The performance difference between HDDs and SSDs is astronomical. You might be tempted to use HDDs for something like backups to save a few bucks. However, unless you're storing hundreds of terabytes of data, the savings usually don't move the needle.
Operating systems and especially databases (tons of random IO) will greatly benefit from NVMe SSDs due to their lower latency, but otherwise SATA SSDs are generally okay. In a well-configured WordPress environment, disk access is rather minimal.
Direct vs network-attached
Most dedicated server offerings come with direct/local storage, while VM and cloud vendors typically prefer network-attached storage, with local storage only available on more expensive instances.
Direct storage will almost always win the performance comparison due to its low latency, especially in a WordPress application workload with a ton of small random IO. Network-attached storage can generally reach higher throughput, though with most cloud providers you'd need to provision a very large amount of storage to achieve it.
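To see why latency matters so much for small random IO, here's a rough back-of-the-envelope sketch. It applies Little's law: the number of requests completed per second is at most the number of requests in flight divided by the time each one takes. The latency figures are hypothetical, picked only to show the shape of the effect, and the function name is my own.

```python
def max_iops(latency_ms: float, queue_depth: int = 1) -> float:
    """Upper bound on IOPS when each request takes latency_ms to complete
    and queue_depth requests are in flight at once (Little's law)."""
    return queue_depth * 1000 / latency_ms


# Hypothetical per-request latencies, not measurements from any real device
for name, latency_ms in (("local drive", 0.1), ("network-attached volume", 1.0)):
    print(f"{name}: ~{max_iops(latency_ms):,.0f} IOPS at queue depth 1")
```

A workload that waits for each small read before issuing the next one sits close to queue depth 1, so every extra bit of latency translates directly into fewer operations per second.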
Network-attached storage is absolutely great for portability, since the volume is managed separately from the physical host running your VM. This allows you to resize your VMs and quickly move them between physical hosts. Using snapshots, you can move them between availability zones and even regions. With local storage you would have to manage such migrations yourself.
This also means that network-attached storage is usually redundant (within the AZ), so in the event of a hardware failure, your instance can resume on a different host with all your data intact. Data redundancy is managed by the provider and is completely transparent to the consumer.
You can still achieve some level of redundancy with directly-attached storage, typically through hardware or software RAID configurations, though this does add some complexity to server management, and it only protects against drive failure. A full system failure will still require somebody to physically detach the good drives and attach them to another server.
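The lesson doesn't name specific RAID levels, so treat the following as an illustrative sketch rather than a recommendation. It shows the usual trade-off for a few common levels: how much usable space you keep, and how many drive failures the array is guaranteed to survive. The function name and the example drive sizes are my own.

```python
def raid_summary(level: str, drives: int, drive_tb: float) -> tuple:
    """Return (usable_tb, drive_failures_always_survived) for an array of
    identical drives. Covers only a few common RAID levels."""
    if level == "raid1" and drives >= 2:
        # Mirror: every drive holds a full copy of the data
        return drive_tb, drives - 1
    if level == "raid5" and drives >= 3:
        # One drive's worth of capacity goes to parity
        return (drives - 1) * drive_tb, 1
    if level == "raid6" and drives >= 4:
        # Two drives' worth of capacity go to parity
        return (drives - 2) * drive_tb, 2
    if level == "raid10" and drives >= 4 and drives % 2 == 0:
        # Striped mirrored pairs: losing both drives of one pair is fatal,
        # so only a single failure is guaranteed to be survivable
        return drives / 2 * drive_tb, 1
    raise ValueError(f"unsupported combination: {level} with {drives} drives")


for level, drives in (("raid1", 2), ("raid5", 4), ("raid6", 6), ("raid10", 4)):
    usable, failures = raid_summary(level, drives, drive_tb=4.0)
    print(f"{level} x{drives}: {usable:.0f} TB usable, survives {failures} drive failure(s)")
```

None of these numbers help when the server itself dies, which is the limitation described above.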
Some hosting providers give you the option to have both: direct/local storage as well as some network-attached storage, usually for backups. This is a nice compromise, allowing for quicker recovery on another system, should things go wrong.
Block storage, object storage and network filesystems
Network-attached storage also comes in a few different shapes and sizes.
The fastest of them all is block storage, which allows your server or VM instance to see a raw block device, create a filesystem on it, and work with that filesystem as if it were a local drive.
Block storage is the most performant of the three network options because all operations on the data happen at the block level. Examples include Amazon Elastic Block Store (EBS), Google's Persistent Disk (PD) and DigitalOcean's Block Storage, as well as OpenEBS and Ceph Block Device (RBD) from the self-hosted world.
Object storage is another network-attached option. This is often used to store larger files with less frequent access, like images and especially videos. Amazon S3, Cloudflare R2, DigitalOcean Spaces, Google Cloud Storage and Ceph Object Storage are all examples of this. Object storage is much slower than block storage, especially for small files and random IO, because all operations happen at the object (file) level.
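Here's a rough sketch of what "block level" versus "object level" means for a small change. It assumes a 4 KiB block size and assumes that changing an object means uploading the whole object again, which is the general rule for object stores, though individual providers may offer ways around it. It also ignores filesystem metadata and assumes the change is block-aligned. All names and sizes are made up for illustration.

```python
import math

BLOCK_SIZE = 4096  # bytes; a common filesystem block size, assumed here


def bytes_rewritten_block(change_bytes: int, block_size: int = BLOCK_SIZE) -> int:
    """Block storage: only the blocks touched by the change are rewritten."""
    return math.ceil(change_bytes / block_size) * block_size


def bytes_rewritten_object(object_size_bytes: int) -> int:
    """Object storage: the whole object is uploaded again, whatever the change size."""
    return object_size_bytes


change = 200               # hypothetical: 200 bytes modified
object_size = 50_000_000   # hypothetical: inside a 50 MB file

print(f"block storage:  {bytes_rewritten_block(change):,} bytes written")
print(f"object storage: {bytes_rewritten_object(object_size):,} bytes written")
```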
The last in this list is a network filesystem (NFS, EFS, CephFS), which provides POSIX semantics over a network, and can often be mounted on multiple instances at the same time. Due to the additional metadata layer and network hops, latency is typically higher than with block storage.
WordPress
In the context of a WordPress site, all application files (including WordPress core, themes and plugins) as well as MariaDB/MySQL data should definitely live on a block device. A directly-attached block device would be best for performance, but a network-attached one is also fine.
Backups and media uploads (if implemented correctly) can live in an object store or a shared filesystem, assuming access is infrequent. An example of a poor implementation would be a WordPress plugin that frequently appends a line to a 2 GB log file in an uploads directory that lives on a network filesystem.
Benchmarking
If you're hand-picking drives, you can find plenty of benchmarks online to help you make a decision. However, with dedicated servers, chances are you'll be limited to the drives the provider has in stock.
When comparing VMs, it's important to remember that, just like with compute resources, storage is fully controlled by the vendor. They can obviously restrict the disk sizes, but they can also throttle bandwidth, IOPS and more through baseline/burst and credit systems.
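To get a feel for how a baseline/burst credit system behaves, here's a toy token-bucket model. It is not any particular vendor's policy: the function, its parameters and all the numbers are hypothetical, and real providers each have their own rules.

```python
def simulate_burst(baseline_iops, burst_iops, bucket_size, demand_iops, seconds):
    """Return the IOPS actually delivered for each second of the run.

    The volume starts with a full bucket of credits. Each second the bucket
    refills by baseline_iops (capped at bucket_size), and every IO spends
    one credit, up to burst_iops per second.
    """
    credits = bucket_size
    delivered = []
    for _ in range(seconds):
        credits = min(bucket_size, credits + baseline_iops)
        allowed = min(demand_iops, burst_iops, credits)
        credits -= allowed
        delivered.append(allowed)
    return delivered


# Hypothetical volume: 100 IOPS baseline, 3000 IOPS burst, 10,000 credits
for second, iops in enumerate(simulate_burst(100, 3000, 10_000, 3000, 6), start=1):
    print(f"second {second}: {iops} IOPS")
```

In this model the volume bursts for a few seconds, then drops to the baseline once the credits run out. This is also why a short benchmark on a fresh VM can look much better than the sustained performance you'll get later.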
You can use fio to run various IO benchmarks against your devices and filesystems. We'll do some of that in later modules, but for now here's a quick comparison of a random read test on a cloud VM with network-attached block storage versus a local drive:
fio --name=randread --filename=randfile --size=8G --rw=randread \
--bs=4k --iodepth=64 --ioengine=libaio --direct=1 \
--numjobs=4 --time_based=1 --runtime=60 --group_reporting
One of the benchmarks delivered 108k IOPS with 443MB/s throughput, while the other option came in at 1044k IOPS with 4276MB/s. I'll let you guess which one's the network-attached variant.
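The two figures in each result are tied together by the block size: throughput is roughly IOPS multiplied by the bytes per IO. Here's a minimal sketch of that arithmetic, using the 4k block size from the command above. The function name is mine, and I'm assuming the MB/s figures are decimal megabytes, which is consistent with the numbers shown.

```python
def throughput_mb_s(iops: float, block_size_bytes: int = 4096) -> float:
    """Approximate throughput in decimal MB/s for a given IOPS figure."""
    return iops * block_size_bytes / 1_000_000


for label, iops in (("first result", 108_000), ("second result", 1_044_000)):
    print(f"{label}: {iops:,} IOPS x 4 KiB = ~{throughput_mb_s(iops):,.0f} MB/s")
```

Both results land within a MB/s or so of the reported throughput. The small gap comes from the IOPS figures being rounded to the nearest thousand.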