On 03/13/2018 02:53 PM, Wouter Verhelst wrote:
> On Wed, Feb 28, 2018 at 05:20:55PM -0600, Eric Blake wrote:
>> The previous patch mentioned that a server that honors larger TRIM/WRITE_ZEROES requests than accepted for WRITE has to choose whether to advertise the maximum block size as the smaller limit at which it does hard disconnect for WRITE, or the larger limit at which it returns EINVAL for too-large trim/zero. Let's make the situation less ambiguous by allowing a client and server to negotiate explicit alternate limits for these two commands, using the fact that NBD_OPT_GO already requires both client and server to request additional NBD_INFO items, and to ignore items that they don't recognize.
>
> This looks good. It should probably also say that a trim size and a zeroes size SHOULD NOT be smaller than the write size (because that makes no sense). Maybe even MUST NOT?
I did say that - the minimum trim size must be at least the preferred write size (where the preferred write size must be a power of 2 that is at least the minimum write size); and the maximum trim size must be at least the maximum write size.
You're right that the maximum size for zeroes should never be smaller than the regular write maximum (otherwise, there's more overhead to send a split write-zeroes request compared to just sending a single write request that happens to write zeroes). The minimum side is different; a server can support 1-byte writes but still request a minimum zero/trim size of 64k (if the file system granularity is 64k for holes, then you can't trim anything smaller than that, and writing zeroes smaller than that requires a read-modify-write rather than punching a hole in the file system, for example). But the minimum trim/zero should never be smaller than the minimum write.
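To make the relationships above concrete, here is a rough sketch of a sanity check over a set of advertised limits. It is only an illustration: the `Limits` class, its field names, and `check_limits` are hypothetical and do not come from the NBD spec or from the patch, and the example values are made up.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Limits:
    # Hypothetical container for advertised sizes, all in bytes.
    min_write: int
    pref_write: int
    max_write: int
    min_trim: int
    max_trim: int
    min_zero: int
    max_zero: int


def is_power_of_two(n: int) -> bool:
    return n > 0 and (n & (n - 1)) == 0


def check_limits(lim: Limits) -> List[str]:
    """Return a list of violated constraints (empty if all hold)."""
    problems = []
    if not is_power_of_two(lim.pref_write):
        problems.append("preferred write size is not a power of 2")
    if lim.pref_write < lim.min_write:
        problems.append("preferred write size < minimum write size")
    if lim.min_trim < lim.pref_write:
        problems.append("minimum trim size < preferred write size")
    if lim.max_trim < lim.max_write:
        problems.append("maximum trim size < maximum write size")
    if lim.min_zero < lim.min_write:
        problems.append("minimum zero size < minimum write size")
    if lim.max_zero < lim.max_write:
        problems.append("maximum zero size < maximum write size")
    return problems


# Illustrative values only: 1-byte writes, 64k hole granularity.
example = Limits(
    min_write=1,
    pref_write=4096,
    max_write=32 * 1024 * 1024,
    min_trim=64 * 1024,
    max_trim=1024 * 1024 * 1024,
    min_zero=64 * 1024,
    max_zero=1024 * 1024 * 1024,
)
print(check_limits(example))
```

A server matching the 64k example passes every check, while one that advertises a zero maximum below its write maximum would be flagged.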
--
Eric Blake, Principal Software Engineer
Red Hat, Inc. +1-919-301-3266
Virtualization: qemu.org | libvirt.org