I had to send M400 twice to get this working.
The reason is that the first M400 sends its ok before it waits.
So as far as the G-code stream itself is concerned it works, but the client still has no way of knowing whether the target position has been reached.
It is the second M400 (or any other command) that returns its ok only after the first M400 has finished, and I assume this only works because GCODE_BUFFER_SIZE is set to 1.
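
For anyone who wants to do the same from the host side, here is a rough sketch of the workaround in Python. It assumes a port object with `write()` and `readline()` methods (pySerial's `serial.Serial` has both) and that the firmware acknowledges each command with a line starting with `ok`. The function names and the `FakePort` stand-in are made up for illustration and are not part of any firmware or host software.

```python
def send_and_wait_ok(port, command, max_lines=100):
    """Send one G-code line and read replies until a line starting with 'ok'."""
    port.write(command.encode("ascii") + b"\n")
    for _ in range(max_lines):
        line = port.readline()
        if not line:
            break  # read timeout or closed port: give up
        if line.decode("ascii", "replace").strip().lower().startswith("ok"):
            return True
    return False


def wait_until_moves_finished(port):
    """Workaround: the ok for the first M400 arrives before the wait,
    so only the ok for the second M400 is treated as 'moves are done'."""
    return send_and_wait_ok(port, "M400") and send_and_wait_ok(port, "M400")


class FakePort:
    """Hypothetical stand-in for a serial port, used to try the logic offline."""

    def __init__(self, replies):
        self.replies = list(replies)
        self.sent = []

    def write(self, data):
        self.sent.append(data)

    def readline(self):
        # An empty bytes object mimics a read timeout.
        return self.replies.pop(0) if self.replies else b""
```

With real hardware you would pass an open `serial.Serial` instance (with a read timeout set) instead of `FakePort`. As noted above, this probably only tells you anything if GCODE_BUFFER_SIZE is 1.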
For me the workaround is good enough, but I'm not sure whether this behaviour is intended.
From reading the source code, it looks like it might be difficult to change.