mirror of
https://github.com/adulau/aha.git
synced 2025-01-01 13:46:24 +00:00
3d61f75eef
Currently fdatasync is identical to fsync in ext3. I think fdatasync should
skip the journal flush in data=ordered and data=writeback mode when it
overwrites already-instantiated blocks on HDD. When the I_DIRTY_DATASYNC
flag is not set, fdatasync should skip journal writeout, because this
indicates that only atime and/or mtime were updated. The following patch
takes the same approach as ext2's fsync code (ext2_sync_file).

I did a performance test using sysbench:

#sysbench --num-threads=128 --max-requests=50000 --test=fileio --file-total-size=128G --file-test-mode=rndwr --file-fsync-mode=fdatasync run

The result on ext3 was:

-2.6.24
Operations performed:  0 Read, 50080 Write, 59600 Other = 109680 Total
Read 0b  Written 782.5Mb  Total transferred 782.5Mb  (12.116Mb/sec)
  775.45 Requests/sec executed

Test execution summary:
    total time:                          64.5814s
    total number of events:              50080
    total time taken by event execution: 3713.9836
    per-request statistics:
         min:                            0.0000s
         avg:                            0.0742s
         max:                            0.9375s
         approx.  95 percentile:         0.2901s

Threads fairness:
    events (avg/stddev):           391.2500/23.26
    execution time (avg/stddev):   29.0155/1.99

-2.6.24-patched
Operations performed:  0 Read, 50009 Write, 61596 Other = 111605 Total
Read 0b  Written 781.39Mb  Total transferred 781.39Mb  (16.419Mb/sec)
 1050.83 Requests/sec executed

Test execution summary:
    total time:                          47.5900s
    total number of events:              50009
    total time taken by event execution: 2934.5768
    per-request statistics:
         min:                            0.0000s
         avg:                            0.0587s
         max:                            0.8938s
         approx.  95 percentile:         0.1993s

Threads fairness:
    events (avg/stddev):           390.6953/22.64
    execution time (avg/stddev):   22.9264/1.17

Filesystem I/O throughput improved from 775.45 to 1050.83 requests/sec
(12.116 to 16.419 Mb/sec).

Signed-off-by: Hisashi Hifumi <hifumi.hisashi@oss.ntt.co.jp>
Acked-by: Jan Kara <jack@suse.cz>
Cc: <linux-ext4@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
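The workload the commit message describes (overwriting blocks that are already allocated, then calling fdatasync) can be sketched from user space roughly as follows. This is a minimal illustration, not part of the patch: the helper names create_filled_file() and overwrite_and_fdatasync() are hypothetical, and only standard POSIX calls (open, write, pwrite, fsync, fdatasync) are assumed.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: create `path` with `size` zero bytes and fsync() it,
 * so that its blocks are allocated before any overwrite happens.
 * Returns 0 on success, -1 on error.
 */
int create_filled_file(const char *path, size_t size)
{
	char zeros[4096];
	int fd;

	assert(path != NULL);
	memset(zeros, 0, sizeof zeros);

	fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
	if (fd < 0)
		return -1;

	while (size > 0) {
		size_t chunk = size < sizeof zeros ? size : sizeof zeros;
		ssize_t n = write(fd, zeros, chunk);

		if (n <= 0) {
			close(fd);
			return -1;
		}
		size -= (size_t)n;
	}

	/* Full fsync: data and metadata (size, block mappings) go to disk. */
	if (fsync(fd) != 0) {
		close(fd);
		return -1;
	}
	return close(fd);
}

/*
 * Hypothetical helper: overwrite `len` bytes at `offset` in an existing
 * file, then flush with fdatasync(). Returns 0 on success, -1 on error.
 */
int overwrite_and_fdatasync(const char *path, off_t offset,
			    const void *buf, size_t len)
{
	int fd;
	ssize_t n;

	assert(path != NULL && buf != NULL);

	/* No O_CREAT or O_TRUNC: the existing blocks are kept. */
	fd = open(path, O_WRONLY);
	if (fd < 0)
		return -1;

	n = pwrite(fd, buf, len, offset);
	if (n < 0 || (size_t)n != len) {
		close(fd);
		return -1;
	}

	/*
	 * If the write stays inside the existing file size, only the data
	 * (and timestamps) changed, which is the case the patch targets.
	 */
	if (fdatasync(fd) != 0) {
		close(fd);
		return -1;
	}
	return close(fd);
}
```

A caller would first run create_filled_file("/tmp/demo.bin", 8192) and then call overwrite_and_fdatasync() with offsets below 8192. Whether the journal commit is actually skipped depends on the kernel and filesystem in use; the sketch only reproduces the access pattern.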
91 lines
2.7 KiB
C
/*
 *  linux/fs/ext3/fsync.c
 *
 *  Copyright (C) 1993  Stephen Tweedie (sct@redhat.com)
 *  from
 *  Copyright (C) 1992  Remy Card (card@masi.ibp.fr)
 *                      Laboratoire MASI - Institut Blaise Pascal
 *                      Universite Pierre et Marie Curie (Paris VI)
 *  from
 *  linux/fs/minix/truncate.c   Copyright (C) 1991, 1992  Linus Torvalds
 *
 *  ext3fs fsync primitive
 *
 *  Big-endian to little-endian byte-swapping/bitmaps by
 *        David S. Miller (davem@caip.rutgers.edu), 1995
 *
 *  Removed unnecessary code duplication for little endian machines
 *  and excessive __inline__s.
 *        Andi Kleen, 1997
 *
 * Major simplifications and cleanup - we only need to do the metadata, because
 * we can depend on generic_block_fdatasync() to sync the data blocks.
 */

#include <linux/time.h>
#include <linux/fs.h>
#include <linux/sched.h>
#include <linux/writeback.h>
#include <linux/jbd.h>
#include <linux/ext3_fs.h>
#include <linux/ext3_jbd.h>

/*
 * akpm: A new design for ext3_sync_file().
 *
 * This is only called from sys_fsync(), sys_fdatasync() and sys_msync().
 * There cannot be a transaction open by this task.
 * Another task could have dirtied this inode.  Its data can be in any
 * state in the journalling system.
 *
 * What we do is just kick off a commit and wait on it.  This will snapshot the
 * inode to disk.
 */

int ext3_sync_file(struct file * file, struct dentry *dentry, int datasync)
{
	struct inode *inode = dentry->d_inode;
	int ret = 0;

	J_ASSERT(ext3_journal_current_handle() == NULL);

	/*
	 * data=writeback:
	 *  The caller's filemap_fdatawrite()/wait will sync the data.
	 *  sync_inode() will sync the metadata
	 *
	 * data=ordered:
	 *  The caller's filemap_fdatawrite() will write the data and
	 *  sync_inode() will write the inode if it is dirty.  Then the caller's
	 *  filemap_fdatawait() will wait on the pages.
	 *
	 * data=journal:
	 *  filemap_fdatawrite won't do anything (the buffers are clean).
	 *  ext3_force_commit will write the file data into the journal and
	 *  will wait on that.
	 *  filemap_fdatawait() will encounter a ton of newly-dirtied pages
	 *  (they were dirtied by commit).  But that's OK - the blocks are
	 *  safe in-journal, which is all fsync() needs to ensure.
	 */
	if (ext3_should_journal_data(inode)) {
		ret = ext3_force_commit(inode->i_sb);
		goto out;
	}

	if (datasync && !(inode->i_state & I_DIRTY_DATASYNC))
		goto out;

	/*
	 * The VFS has written the file data.  If the inode is unaltered
	 * then we need not start a commit.
	 */
	if (inode->i_state & (I_DIRTY_SYNC|I_DIRTY_DATASYNC)) {
		struct writeback_control wbc = {
			.sync_mode = WB_SYNC_ALL,
			.nr_to_write = 0, /* sys_fsync did this */
		};
		ret = sync_inode(inode, &wbc);
	}
out:
	return ret;
}
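
The branch structure of ext3_sync_file() above can be modelled in user space as a small decision function. This is only a sketch for reasoning about the three cases: every MODEL_ name is hypothetical, the flag values are illustrative rather than the kernel's definitions, and nothing here touches a journal or an inode.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative flag bits; the values are assumptions, not kernel headers. */
enum {
	MODEL_I_DIRTY_SYNC     = 1 << 0, /* inode dirty, e.g. timestamps only */
	MODEL_I_DIRTY_DATASYNC = 1 << 1, /* change that fdatasync must persist */
};

static_assert((MODEL_I_DIRTY_SYNC & MODEL_I_DIRTY_DATASYNC) == 0,
	      "model flags must be distinct bits");

/* What the modelled sync path would do for a given inode state. */
enum model_action {
	MODEL_NOTHING,		/* return without writing metadata */
	MODEL_FORCE_COMMIT,	/* data=journal: commit the journal */
	MODEL_SYNC_INODE,	/* write the dirty inode synchronously */
};

/*
 * Mirrors the order of checks in the function above:
 *   1. data=journal always forces a commit;
 *   2. fdatasync with MODEL_I_DIRTY_DATASYNC clear does nothing more;
 *   3. otherwise a dirty inode is written, a clean one is left alone.
 */
enum model_action model_sync_action(bool journal_data, bool datasync,
				    unsigned long i_state)
{
	if (journal_data)
		return MODEL_FORCE_COMMIT;

	if (datasync && !(i_state & MODEL_I_DIRTY_DATASYNC))
		return MODEL_NOTHING;

	if (i_state & (MODEL_I_DIRTY_SYNC | MODEL_I_DIRTY_DATASYNC))
		return MODEL_SYNC_INODE;

	return MODEL_NOTHING;
}
```

For example, model_sync_action(false, true, MODEL_I_DIRTY_SYNC) yields MODEL_NOTHING, which corresponds to the fdatasync shortcut added by the patch, while the same state with datasync false yields MODEL_SYNC_INODE.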