qemu-devel@nongnu.org

Subject: Re: [kvm-devel] [Qemu-devel] [PATCH] QEMU: fsync AIO writes on flush request
From: Marcelo Tosatti
Date: Fri, 28 Mar 2008 15:36:28 -0300
On Fri, Mar 28, 2008 at 06:03:25PM +0000, Jamie Lokier wrote:
> Marcelo Tosatti wrote:
> > On Fri, Mar 28, 2008 at 03:07:03PM +0000, Jamie Lokier wrote:
> > > Marcelo Tosatti wrote:
> > > > It's necessary to guarantee that pending AIO writes have reached stable
> > > > storage when the flush request returns.
> > > > 
> > > > Also change fsync() to fdatasync(), since the modification time is not
> > > > critical data.
> > > > +    if (aio_fsync(O_DSYNC, &acb->aiocb) < 0) {
> > > 
> > > >      BDRVRawState *s = bs->opaque;
> > > > -    fsync(s->fd);
> > > > +    raw_aio_flush(bs);
> > > > +    fdatasync(s->fd);
> > > > +
> > > > +    /* We rely on the fact that no other AIO will be submitted
> > > > +     * in parallel, but this should be fixed by per-device
> > > > +     * AIO queues when allowing multiple CPUs to process IO
> > > > +     * in QEMU.
> > > > +     */
> > > > +    qemu_aio_flush();
> > > 
> > > I'm a bit confused by this.  Why do you need aio_fsync(O_DSYNC) _and_
> > > synchronous fdatasync() calls?  Aren't they equivalent?
> > 
> > fdatasync() will write and wait for completion of dirty file data
> > present in memory.
> > 
> > aio_write() only queues data for submission:
> > 
> >        The "asynchronous" means that this call returns as soon as the
> >        request has been enqueued; the write may or may not have completed
> >        when the call returns. One tests for completion using aio_error(3).
> > 
> 
> > So fdatasync() is not enough because data written via AIO may not
> > have been reflected as "dirty file data" through write() by the time
> > raw_flush() is called.
> 
> Sure.  But why isn't the aio_fsync(O_DSYNC) enough by itself?

It is enough; fdatasync() becomes redundant.

> It seems to me you should have something like this:
> 
>     /* Flush pending aio_writes until they are dirty data,
>        and wait before the aio_fsync. */
>     qemu_aio_flush();
> 
>     /* Call aio_fsync(O_DSYNC). */
>     raw_aio_flush(bs);
> 
>     /* Wait for the aio_fsync to complete. */
>     qemu_aio_flush();
> 
> What am I missing?

I don't think the first qemu_aio_flush() is necessary because the fsync
request will be enqueued after pending ones: 

       The aio_fsync() function does a sync on all outstanding asynchronous
       I/O operations associated with aiocbp->aio_fildes.

       More precisely, if op is O_SYNC, then all currently queued I/O
       operations shall be completed as if by a call of fsync(2), and if op
       is O_DSYNC, this call is the asynchronous analog of fdatasync(2).
       Note that this is a request only - this call does not wait for I/O
       completion.

glibc sets the priority of the fsync request to 0, which is the same
priority at which QEMU submits its AIO reads and writes.
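
The ordering argument can be shown with a toy model. This is not
glibc's code; it only assumes a request queue sorted by priority that
keeps submission order among requests of equal priority. The type and
function names are hypothetical. Under that assumption, an fsync
request queued at the same priority as the pending writes ends up
behind all of them.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical request kinds, for this model only. */
enum req_kind { REQ_WRITE, REQ_FSYNC };

struct req {
    enum req_kind kind;
    int prio;            /* higher value is served first */
    struct req *next;
};

/* Insert r after every queued request whose priority is >= r's,
 * so requests of equal priority keep their submission order. */
static void enqueue(struct req **head, struct req *r)
{
    assert(r != NULL);
    while (*head != NULL && (*head)->prio >= r->prio)
        head = &(*head)->next;
    r->next = *head;
    *head = r;
}

/* Count the writes that would be served before the first fsync. */
static int writes_before_fsync(const struct req *head)
{
    int n = 0;

    for (; head != NULL && head->kind != REQ_FSYNC; head = head->next)
        n++;
    return n;
}
```

With two writes and then one fsync enqueued at priority 0,
writes_before_fsync() reports that both writes come first.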


