A while back, we benchmarked our Proxmox infrastructure in various ways, mainly to experiment and to get the maximum out of the hardware we had. We wrote two articles on the subject to share our results and conclusions, but we did not take the time to share all of our results. To continue the series, this article shares our findings regarding a ZFS tuning parameter that can have a significant impact on your Proxmox infrastructure. The parameter in question is the primarycache option. It's not available in the Proxmox GUI; you must use the CLI to change its value, and you may configure it per ZFS dataset or volume.
Here is what the ZFS manual has to say about this option:
primarycache=all | none | metadata

    Controls what is cached in the primary cache (ARC). If this property is set to all, then both user data and metadata is cached. If this property is set to none, then neither user data nor metadata is cached. If this property is set to metadata, then only metadata is cached. The default value is all.
From this description, one would think caching is better and we should enable it. Wrong. In a virtual machine, if you give it enough memory, the guest OS is already caching the file system data. The guest OS can also make better decisions regarding what needs to be cached, since it's closer to the application. Effectively, enabling the ZFS primarycache for virtual machines is not useful because it creates two caches: one in the guest OS and another in the host OS. With this setup, it's quite likely that the same data is stored twice in memory. People may argue that the ARC (adaptive replacement cache) has better algorithms for caching, but it's a waste, because the guest OS doesn't have direct access to the ARC.
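You can observe the double caching yourself (a quick sketch; the exact output format varies between guest distributions and ZFS versions):

# Inside the guest: the buff/cache column shows the guest OS page cache at work
$ free -h
# On the Proxmox host: the ARC holds its own copy of recently read blocks
$ /usr/sbin/arc_summary.py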
As for LXC, it's a bit different: LXC does have direct access to the ARC. The performance boost provided by the primarycache highly depends on your workload. One would think primarycache=all should be beneficial for LXC, but our benchmarks showed mixed results. To check whether primarycache=all provides a benefit for your workload, the best approach is to test it and use the various ARC statistics to verify whether the ARC is in fact used or not. Have a look at /usr/sbin/arcstat.py and /usr/sbin/arc_summary.py.
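For example, you can watch ARC activity live while your benchmark runs (a sketch; both tools ship with ZFS on Linux, and the exact columns differ between versions):

# Print ARC accesses, misses and current ARC size once per second
$ /usr/sbin/arcstat.py 1
# Overall summary: ARC size, hit ratio, and data/metadata breakdown
$ /usr/sbin/arc_summary.py
# The raw counters behind both tools are exposed by the kernel module
$ grep -wE 'hits|misses|size' /proc/spl/kstat/zfs/arcstats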
To change this option, you must first identify the right dataset or zvol to update.
$ sudo zfs list
NAME                           USED  AVAIL  REFER  MOUNTPOINT
rpool                          111G   369G   140K  /rpool
rpool/ROOT                    30.9G   369G   140K  /rpool/ROOT
rpool/ROOT/pve-1              30.9G   369G  30.9G  /
rpool/data                    70.4G   369G  6.98G  /rpool/data
rpool/data/subvol-116-disk-1  2.14G  5.86G  2.14G  /rpool/data/subvol-116-disk-1
rpool/data/subvol-117-disk-1  2.15G  5.85G  2.15G  /rpool/data/subvol-117-disk-1
rpool/data/subvol-120-disk-1  2.14G  5.86G  2.14G  /rpool/data/subvol-120-disk-1
rpool/data/subvol-125-disk-1  3.37G  28.6G  3.37G  /rpool/data/subvol-125-disk-1
rpool/data/vm-112-disk-1      21.5G   369G  21.0G  -
rpool/data/vm-114-disk-1      8.02G   369G  8.02G  -
rpool/data/vm-119-disk-1      16.9G   369G  16.4G  -
rpool/data/vm-121-disk-1      1.96G   369G  1.96G  -
rpool/data/vm-121-disk-2      5.57M   369G  5.57M  -
rpool/data/vm-122-disk-1      1.45G   369G  1.45G  -
rpool/data/vm-123-disk-1      1.46G   369G  1.46G  -
rpool/data/vm-124-disk-1      1.46G   369G  1.46G  -
rpool/subvol-108-disk-1       1.03G  7.13G   893M  /rpool/subvol-108-disk-1
rpool/swap                    8.50G   375G  2.77G  -
In our environment, rpool/data is our storage for Proxmox virtual machines and LXC containers. If you want to change this option for your whole environment, you may set it on rpool/data, and every child dataset will inherit the value. Otherwise, you may choose to change the option for a single VM by setting the value on its specific zvol.
$ sudo zfs get primarycache
NAME                          PROPERTY      VALUE     SOURCE
rpool                         primarycache  all       default
rpool/ROOT                    primarycache  all       default
rpool/ROOT/pve-1              primarycache  all       default
rpool/data                    primarycache  metadata  local
rpool/data/subvol-116-disk-1  primarycache  metadata  inherited from rpool/data
rpool/data/subvol-117-disk-1  primarycache  metadata  inherited from rpool/data
rpool/data/subvol-120-disk-1  primarycache  metadata  inherited from rpool/data
rpool/data/subvol-125-disk-1  primarycache  all       local
rpool/data/vm-112-disk-1      primarycache  metadata  inherited from rpool/data
rpool/data/vm-114-disk-1      primarycache  metadata  inherited from rpool/data
rpool/data/vm-119-disk-1      primarycache  metadata  inherited from rpool/data
rpool/data/vm-121-disk-1      primarycache  all       local
rpool/data/vm-121-disk-2      primarycache  metadata  inherited from rpool/data
rpool/data/vm-122-disk-1      primarycache  metadata  inherited from rpool/data
rpool/data/vm-123-disk-1      primarycache  metadata  inherited from rpool/data
rpool/data/vm-124-disk-1      primarycache  metadata  inherited from rpool/data
rpool/subvol-108-disk-1       primarycache  all       default
rpool/swap                    primarycache  metadata  local

To change the value for a specific zvol:

$ sudo zfs set primarycache=metadata rpool/data/vm-112-disk-1
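If you prefer one policy for the whole environment, set the property once on the parent dataset and let every guest disk inherit it (a sketch using our pool layout):

# Set the default for every VM and LXC disk under rpool/data
$ sudo zfs set primarycache=metadata rpool/data
# Children now report "inherited from rpool/data" unless they have a local override
$ sudo zfs get -r primarycache rpool/data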
Results
With this test, we don't see a big difference between the two options. Still, it's enough to showcase the benefit of using primarycache=metadata for LXC.
With this test, we clearly see how LXC can benefit from setting primarycache=metadata. With KVM, on the other hand, we see little to no benefit.
Setting primarycache=metadata for LXC provides better throughput because the OS doesn't waste time storing data in the cache. On the other hand, the KVM result is puzzling: it performs better with primarycache=all.
The last two tests are not conclusive. The difference in results is not significant.
Conclusion
As you can see in the results, the primarycache option does have an impact on performance, but not for every workload. In some tests, we don't see any difference at all, while in others it provides a boost of more than 200%.
With all this information, you might be lost about whether it's good or not to enable the primarycache and which option is better for you. Here is a rule of thumb: set all VMs and LXC containers to primarycache=metadata, and for very, very specific workloads, set it to primarycache=all.
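Applied to our pool from earlier, that rule boils down to keeping primarycache=metadata on rpool/data as shown above, and overriding only the rare guest whose benchmarks justify full caching:

# Opt a single, well-tested workload back into full data caching
$ sudo zfs set primarycache=all rpool/data/vm-121-disk-1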
With these settings, your system is not wasting any memory on the ARC, and that memory can be used for something else, such as more memory for your VMs.
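To put a number on that saving, you can check how much memory the ARC currently occupies (a sketch; the size counter is reported in bytes):

# Report the current ARC size in GiB
$ awk '/^size/ { printf "%.1f GiB\n", $3 / 1024 / 1024 / 1024 }' /proc/spl/kstat/zfs/arcstats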