Discussion:
[PATCH] fallocate() for FFS
Emmanuel Dreyfus
2014-09-25 01:19:35 UTC
Permalink
Implementing fallocate for FFS seems quite obvious. Is there anything I
missed in the patch below? Is it good enough to commit?

This lets me discover that posix_fallocate() libc stub is buggy. Using
it through #include <unistd.h> leads argument corruption. Defining it as
int posix_fallocate(int, int, off_t, off_t); let it work (note the
second int pad, as in syscalls.master). I guess we need to add a libc
stub too.

Index: sys/ufs/ffs/ffs_vnops.c
===================================================================
RCS file: /cvsroot/src/sys/ufs/ffs/ffs_vnops.c,v
retrieving revision 1.125
diff -U 4 -r1.125 ffs_vnops.c
--- sys/ufs/ffs/ffs_vnops.c 25 Jul 2014 08:20:53 -0000 1.125
+++ sys/ufs/ffs/ffs_vnops.c 25 Sep 2014 01:09:50 -0000
@@ -114,9 +114,9 @@
{ &vop_getattr_desc, ufs_getattr }, /* getattr */
{ &vop_setattr_desc, ufs_setattr }, /* setattr */
{ &vop_read_desc, ffs_read }, /* read */
{ &vop_write_desc, ffs_write }, /* write */
- { &vop_fallocate_desc, genfs_eopnotsupp }, /* fallocate */
+ { &vop_fallocate_desc, ffs_fallocate }, /* fallocate */
{ &vop_fdiscard_desc, genfs_eopnotsupp }, /* fdiscard */
{ &vop_ioctl_desc, ufs_ioctl }, /* ioctl */
{ &vop_fcntl_desc, ufs_fcntl }, /* fcntl */
{ &vop_poll_desc, ufs_poll }, /* poll */
@@ -326,8 +326,41 @@
return error;
}

int
+ffs_fallocate(void *v)
+{
+ struct vop_fallocate_args /* {
+ struct vnode *a_vp;
+ off_t a_pos;
+ off_t a_len;
+ } */ *ap = v;
+ struct vnode *vp;
+ struct mount *mp;
+ int error;
+
+
+ vp = ap->a_vp;
+ mp = vp->v_mount;
+
+ fstrans_start(mp, FSTRANS_LAZY);
+ error = UFS_WAPBL_BEGIN(vp->v_mount);
+ if (error)
+ goto out;
+
+ genfs_node_wrlock(vp);
+ error = GOP_ALLOC(vp, ap->a_pos, ap->a_len, 0, curlwp->l_cred);
+ genfs_node_unlock(vp);
+
+ UFS_WAPBL_UPDATE(vp, NULL, NULL, 0);
+ UFS_WAPBL_END(vp->v_mount);
+out:
+ fstrans_done(mp);
+
+ return error;
+}
+
+int
ffs_fsync(void *v)
{
struct vop_fsync_args /* {
struct vnode *a_vp;
--
Emmanuel Dreyfus
http://hcpnet.free.fr/pubz
***@netbsd.org
Emmanuel Dreyfus
2014-09-26 04:07:21 UTC
Permalink
Post by Emmanuel Dreyfus
Implementing fallocate for FFS seems quite obvious. Is there anything I
missed in the patch below? Is it good enough to commit?
Updated version that spares some panic by properly calling
uvm_vnp_setsize(). Any comment?

Index: sys/ufs/ffs/ffs_extern.h
===================================================================
RCS file: /cvsroot/src/sys/ufs/ffs/ffs_extern.h,v
retrieving revision 1.80
diff -U 4 -r1.80 ffs_extern.h
--- sys/ufs/ffs/ffs_extern.h 16 Jun 2013 13:33:30 -0000 1.80
+++ sys/ufs/ffs/ffs_extern.h 26 Sep 2014 03:59:05 -0000
@@ -126,8 +126,9 @@

/* ffs_vnops.c */
int ffs_read(void *);
int ffs_write(void *);
+int ffs_fallocate(void *);
int ffs_fsync(void *);
int ffs_spec_fsync(void *);
int ffs_reclaim(void *);
int ffs_getpages(void *);
Index: sys/ufs/ffs/ffs_vnops.c
===================================================================
RCS file: /cvsroot/src/sys/ufs/ffs/ffs_vnops.c,v
retrieving revision 1.125
diff -U 4 -r1.125 ffs_vnops.c
--- sys/ufs/ffs/ffs_vnops.c 25 Jul 2014 08:20:53 -0000 1.125
+++ sys/ufs/ffs/ffs_vnops.c 26 Sep 2014 03:59:05 -0000
@@ -114,9 +114,9 @@
{ &vop_getattr_desc, ufs_getattr }, /* getattr */
{ &vop_setattr_desc, ufs_setattr }, /* setattr */
{ &vop_read_desc, ffs_read }, /* read */
{ &vop_write_desc, ffs_write }, /* write */
- { &vop_fallocate_desc, genfs_eopnotsupp }, /* fallocate */
+ { &vop_fallocate_desc, ffs_fallocate }, /* fallocate */
{ &vop_fdiscard_desc, genfs_eopnotsupp }, /* fdiscard */
{ &vop_ioctl_desc, ufs_ioctl }, /* ioctl */
{ &vop_fcntl_desc, ufs_fcntl }, /* fcntl */
{ &vop_poll_desc, ufs_poll }, /* poll */
@@ -326,8 +326,43 @@
return error;
}

int
+ffs_fallocate(void *v)
+{
+ struct vop_fallocate_args /* {
+ struct vnode *a_vp;
+ off_t a_pos;
+ off_t a_len;
+ } */ *ap = v;
+ struct vnode *vp;
+ struct mount *mp;
+ int error;
+
+
+ vp = ap->a_vp;
+ mp = vp->v_mount;
+
+ fstrans_start(mp, FSTRANS_LAZY);
+ error = UFS_WAPBL_BEGIN(vp->v_mount);
+ if (error)
+ goto out;
+
+ genfs_node_wrlock(vp);
+ error = GOP_ALLOC(vp, ap->a_pos, ap->a_len, 0, curlwp->l_cred);
+ if (!error && ap->a_pos + ap->a_len > vp->v_size)
+ uvm_vnp_setsize(vp, ap->a_pos + ap->a_len);
+ genfs_node_unlock(vp);
+
+ UFS_WAPBL_UPDATE(vp, NULL, NULL, 0);
+ UFS_WAPBL_END(vp->v_mount);
+out:
+ fstrans_done(mp);
+
+ return error;
+}
+
+int
ffs_fsync(void *v)
{
struct vop_fsync_args /* {
struct vnode *a_vp;
--
Emmanuel Dreyfus
http://hcpnet.free.fr/pubz
***@netbsd.org
Chuck Silvers
2014-09-26 16:16:24 UTC
Permalink
Post by Emmanuel Dreyfus
Post by Emmanuel Dreyfus
Implementing fallocate for FFS seems quite obvious. Is there anything I
missed in the patch below? Is it good enough to commit?
Updated version that spares some panic by properly calling
uvm_vnp_setsize(). Any comment?
as I recall, GOP_ALLOC() doesn't initialize the blocks allocated, nor create
any zeroed pages in memory to cover the uninitialized blocks, so the patch
you gave would expose the current contents of the blocks which were allocated,
creating a security problem.

but even ignoring that, the patch doesn't seem to actually work:
"ls -ls" shows that the number of blocks allocated to the file doesn't change.
if the fs is unmounted and mounted again, the file size goes back to what it
was before fallocate was called.

I don't remember the sequence of operations that has to be used to allocate
space in ffs anymore, you could take a look at ffs_write() and do what it does.

the bigger problem is that fallocate() wants a way to record that the blocks
are not initialized, so that reading from those offsets can return zeroes
instead of reading the uninitialized blocks from disk, but the FFS metadata
has no way to represent that. we would need a change to the FFS on-disk format
for fallocate() to be useful for FFS.

-Chuck
Emmanuel Dreyfus
2014-09-26 16:51:56 UTC
Permalink
Post by Chuck Silvers
the bigger problem is that fallocate() wants a way to record that the blocks
are not initialized, so that reading from those offsets can return zeroes
instead of reading the uninitialized blocks from disk, but the FFS metadata
has no way to represent that. we would need a change to the FFS on-disk format
for fallocate() to be useful for FFS.
It looked too easy :-)
--
Emmanuel Dreyfus
***@netbsd.org
Emmanuel Dreyfus
2014-09-27 00:16:43 UTC
Permalink
Post by Chuck Silvers
"ls -ls" shows that the number of blocks allocated to the file doesn't change.
But OTOH I df shows a growing FS. Did I just managed to leak blocks that
are allocated but not to the file?
--
Emmanuel Dreyfus
***@netbsd.org
David Holland
2014-09-27 06:10:17 UTC
Permalink
Post by Emmanuel Dreyfus
Post by Chuck Silvers
"ls -ls" shows that the number of blocks allocated to the file doesn't change.
But OTOH I df shows a growing FS. Did I just managed to leak blocks that
are allocated but not to the file?
Probably, try fsck.

:-|
--
David A. Holland
***@netbsd.org
Loading...