Discussion:
Unallocated inode
Paul Ripke
2014-08-29 06:56:46 UTC
I'm currently running kernel:
NetBSD slave 6.1_STABLE NetBSD 6.1_STABLE (SLAVE) #4: Fri May 23 23:42:30 EST 2014
***@slave:/home/netbsd/netbsd-6/obj.amd64/home/netbsd/netbsd-6/src/sys/arch/amd64/compile/SLAVE amd64
Built from netbsd-6 branch synced around the build time. Over the
last year, I have seen 2 instances where I've had cleared inodes,
causing obvious errors:

slave:ksh$ sudo find /home -xdev -ls > /dev/null
find: /home/netbsd/cvsroot/pkgsrc/japanese/p5-Jcode/pkg/Attic/PLIST,v: Bad file descriptor
find: /home/netbsd/cvsroot/pkgsrc/print/texlive-pdftools/patches/Attic/patch-ac,v: Bad file descriptor

fsdb tells me they're "unallocated inode"s, which I can easily fix,
but does anyone have any idea what might be causing them? This
appears similar to issues reported previously:
http://mail-index.netbsd.org/tech-kern/2013/10/19/msg015770.html

My filesystem is FFSv2 with wapbl, sitting on a raidframe mirror
over two SATA drives.

Cheers,
--
Paul Ripke

"Great minds discuss ideas, average minds discuss events, small minds discuss people."
-- Disputed: Often attributed to Eleanor Roosevelt. 1948.
Manuel Bouyer
2014-08-29 07:01:48 UTC
Post by Paul Ripke
NetBSD slave 6.1_STABLE NetBSD 6.1_STABLE (SLAVE) #4: Fri May 23 23:42:30 EST 2014
Built from netbsd-6 branch synced around the build time. Over the
last year, I have seen 2 instances where I've had cleared inodes,
slave:ksh$ sudo find /home -xdev -ls > /dev/null
find: /home/netbsd/cvsroot/pkgsrc/japanese/p5-Jcode/pkg/Attic/PLIST,v: Bad file descriptor
find: /home/netbsd/cvsroot/pkgsrc/print/texlive-pdftools/patches/Attic/patch-ac,v: Bad file descriptor
fsdb tells me they're "unallocated inode"s, which I can easily fix,
but does anyone have any idea what might be causing them? This
http://mail-index.netbsd.org/tech-kern/2013/10/19/msg015770.html
I've seen this occasionally, after unclean shutdown (power loss or
panic) with WAPBL-enabled filesystems.
--
Manuel Bouyer <***@antioche.eu.org>
NetBSD: 26 years of experience will always make the difference
--
Christos Zoulas
2014-08-29 09:43:39 UTC
Post by Paul Ripke
NetBSD slave 6.1_STABLE NetBSD 6.1_STABLE (SLAVE) #4: Fri May 23 23:42:30 EST 2014
Built from netbsd-6 branch synced around the build time. Over the
last year, I have seen 2 instances where I've had cleared inodes,
slave:ksh$ sudo find /home -xdev -ls > /dev/null
find: /home/netbsd/cvsroot/pkgsrc/japanese/p5-Jcode/pkg/Attic/PLIST,v: Bad file descriptor
find: /home/netbsd/cvsroot/pkgsrc/print/texlive-pdftools/patches/Attic/patch-ac,v: Bad file descriptor
fsdb tells me they're "unallocated inode"s, which I can easily fix,
but does anyone have any idea what might be causing them? This
http://mail-index.netbsd.org/tech-kern/2013/10/19/msg015770.html
My filesystem is FFSv2 with wapbl, sitting on a raidframe mirror
over two SATA drives.
Try unmounting it, and then running fsck -fn on it. Does it report
errors?
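
A minimal sketch of that check, assuming the mount point /home and the raw device /dev/rraid0g that appear elsewhere in this thread. The `run` and `check_fs` helpers and the `DRY_RUN` variable are hypothetical names introduced only for illustration. With -n, fsck answers "no" to every question, so it reports problems without changing anything.

```shell
#!/bin/sh
# Hypothetical helper: run a command, or just print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Hypothetical helper: unmount a file system, then run a forced,
# report-only fsck on its raw device.
check_fs() {
    mountpoint=$1   # e.g. /home
    rawdev=$2       # e.g. /dev/rraid0g
    run umount "$mountpoint" || return 1
    # -f: check even if the file system is marked clean
    # -n: answer "no" to all questions, so nothing is modified
    run fsck -fn "$rawdev"
}
```

Calling `DRY_RUN=1 check_fs /home /dev/rraid0g` only prints the two commands; without DRY_RUN they are executed and need root.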

christos
Edgar Fuß
2014-08-29 10:33:02 UTC
Post by Paul Ripke
does anyone have any idea what might be causing them?
http://mail-index.netbsd.org/tech-kern/2013/10/19/msg015770.html
In my case, they were most probably caused by the disc firmware crashing,
the MPT SAS controller locking up and mpt(4) not properly dealing with that.
Post by Paul Ripke
My filesystem is FFSv2 with wapbl, sitting on a raidframe mirror
over two SATA drives.
In my case, it was FFSv2 with WAPBL on a level 5 RAIDframe across 5 SAS discs.

Btw.: Has anyone tried to import my mpt(4) enhancements? I had several
crashes of the disc firmware / controller hiccups that were caught by
my patch (I later updated the firmware and had no crashes since).
Edgar Fuß
2014-08-29 14:28:11 UTC
Post by Brian Buhrow
Hello. If the patches you're talking about are the ones we worked on
for the mpt(4) driver, they were committed around February of this year.
Ah, thanks. I'll have a look at what you committed.
Brian Buhrow
2014-08-29 14:08:10 UTC
On Aug 29, 12:33pm, Edgar Fuß wrote:
} Subject: Re: Unallocated inode
} Btw.: Has anyone tried to import my mpt(4) enhancements? I had several
} crashes of the disc firmware / controller hiccups that were caught by
} my patch (I later updated the firmware and had no crashes since).
-- End of excerpt from Edgar Fuß
Hello. If the patches you're talking about are the ones we worked on
for the mpt(4) driver, they were committed around February of this year.
They should be in NetBSD-7 when it comes out. I don't know if they were
pulled up to NetBSD-5 or NetBSD-6. I opened tickets, but I don't know if
they've been processed.
-thanks
-Brian
Paul Ripke
2014-09-01 04:54:48 UTC
Post by Christos Zoulas
Post by Paul Ripke
NetBSD slave 6.1_STABLE NetBSD 6.1_STABLE (SLAVE) #4: Fri May 23 23:42:30 EST 2014
Built from netbsd-6 branch synced around the build time. Over the
last year, I have seen 2 instances where I've had cleared inodes,
slave:ksh$ sudo find /home -xdev -ls > /dev/null
find: /home/netbsd/cvsroot/pkgsrc/japanese/p5-Jcode/pkg/Attic/PLIST,v: Bad file descriptor
find: /home/netbsd/cvsroot/pkgsrc/print/texlive-pdftools/patches/Attic/patch-ac,v: Bad file descriptor
fsdb tells me they're "unallocated inode"s, which I can easily fix,
but does anyone have any idea what might be causing them? This
http://mail-index.netbsd.org/tech-kern/2013/10/19/msg015770.html
My filesystem is FFSv2 with wapbl, sitting on a raidframe mirror
over two SATA drives.
Try unmounting it, and then running fsck -fn on it. Does it report
errors?
christos
Oh, yes, indeed. And fixes them fine:

** /dev/rraid0g
** File system is already clean
** Last Mounted on /home
** Phase 1 - Check Blocks and Sizes
PARTIALLY ALLOCATED INODE I=106999488
CLEAR? [yn] y

PARTIALLY ALLOCATED INODE I=106999489
CLEAR? [yn] y

** Phase 2 - Check Pathnames
UNALLOCATED I=106999489 OWNER=0 MODE=0
SIZE=0 MTIME=Jan 1 10:00 1970
NAME=/netbsd/cvsroot/pkgsrc/japanese/p5-Jcode/pkg/Attic/PLIST,v

REMOVE? [yn] y

UNALLOCATED I=106999488 OWNER=0 MODE=0
SIZE=0 MTIME=Jan 1 10:00 1970
NAME=/netbsd/cvsroot/pkgsrc/print/texlive-pdftools/patches/Attic/patch-ac,v

REMOVE? [yn] y

** Phase 3 - Check Connectivity
** Phase 4 - Check Reference Counts
** Phase 5 - Check Cyl groups
FREE BLK COUNT(S) WRONG IN SUPERBLK
SALVAGE? [yn] y

SUMMARY INFORMATION BAD
SALVAGE? [yn] y

BLK(S) MISSING IN BIT MAPS
SALVAGE? [yn] y

5197833 files, 528560982 used, 396039539 free (1740275 frags, 49287408 blocks, 0.2% fragmentation)

***** FILE SYSTEM WAS MODIFIED *****

Running a second fsck pass comes up clean. What surprises me is that my
machine has been up ~100 days... I find it hard to believe that a power
loss or similar unclean shutdown would generate filesystem corruption
that could sit silent for that long before suddenly emerging.
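
For anyone wanting to record which inodes were affected before they are cleared, here is a rough sketch that pulls the inode numbers out of a saved fsck transcript like the one above. The `flagged_inodes` name is hypothetical, and the patterns only cover the two message forms shown in this output.

```shell
#!/bin/sh
# Hypothetical helper: read an fsck transcript on standard input and
# print each flagged inode number once, in numeric order.
flagged_inodes() {
    # Match the "PARTIALLY ALLOCATED INODE I=n" and "UNALLOCATED I=n"
    # lines and keep only the number after "I=".
    sed -n \
        -e 's/^PARTIALLY ALLOCATED INODE I=\([0-9][0-9]*\).*/\1/p' \
        -e 's/^UNALLOCATED  *I=\([0-9][0-9]*\).*/\1/p' |
    sort -un
}
```

Assuming the transcript was saved to a file such as fsck.log (a name made up here), `flagged_inodes < fsck.log` lists the two inode numbers from this run.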

Cheers,
--
Paul Ripke

"Great minds discuss ideas, average minds discuss events, small minds discuss people."
-- Disputed: Often attributed to Eleanor Roosevelt. 1948.
Christos Zoulas
2014-09-01 07:01:01 UTC
On Sep 1, 2:54pm, ***@stix.id.au (Paul Ripke) wrote:
-- Subject: Re: Unallocated inode

| Running a second fsck pass comes up clean. What surprises me is that my
| machine has been up ~100 days... I find it hard to believe that a power
| loss or similar unclean shutdown would generate filesystem corruption
| that could sit silent for that long before suddenly emerging.

Great :-)
With the size of disks and filesystems these days, you can end up not
touching large portions of them for a long time. As to why you did not crash
when you touched them, you were lucky. Finally, this is probably a bug in
WAPBL, because it is supposed to maintain file system metadata integrity,
but in this case it did not.

christos
Paul Ripke
2014-09-02 04:27:03 UTC
Post by Christos Zoulas
-- Subject: Re: Unallocated inode
| Running a second fsck pass comes up clean. What surprises me is that my
| machine has been up ~100 days... I find it hard to believe that a power
| loss or similar unclean shutdown would generate filesystem corruption
| that could sit silent for that long before suddenly emerging.
Great :-)
With the size of disks and filesystems these days, you can end up not
touching large portions of them for a long time. As to why you did not crash
when you touched them, you were lucky. Finally, this is probably a bug in
WAPBL, because it is supposed to maintain file system metadata integrity,
but in this case it did not.
Yeah, that's what scares me - it was the daily rsync and security cron
jobs starting to generate errors that alerted me - those inodes must've
been marked partially allocated only recently. Makes me wish I dumped
out the contents of those two inodes before running the fsck (maybe
fsdb should be able to do that?).

Cheers,
--
Paul Ripke

"Great minds discuss ideas, average minds discuss events, small minds
discuss people."
-- Disputed: Often attributed to Eleanor Roosevelt. 1948.
Christos Zoulas
2014-09-02 22:15:49 UTC
On Sep 2, 2:27pm, ***@stix.id.au (Paul Ripke) wrote:
-- Subject: Re: Unallocated inode

| Yeah, that's what scares me - it was the daily rsync and security cron
| jobs starting to generate errors that alerted me - those inodes must've
| been marked partially allocated only recently. Makes me wish I dumped
| out the contents of those two inodes before running the fsck (maybe
| fsdb should be able to do that?).

Yes, I think fsdb can do that...
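
As a sketch only: fsdb's `inode` command selects an inode by number and `print` shows it, so a small command script can be generated per inode. The `fsdb_dump_script` name is hypothetical. Whether fsdb accepts commands on standard input, and how much it shows for a partially allocated inode, are assumptions not verified here; earlier in the thread it reported only "unallocated inode" for these.

```shell
#!/bin/sh
# Hypothetical helper: emit fsdb commands that select and print each
# inode number given as an argument, then quit.
fsdb_dump_script() {
    for ino in "$@"; do
        printf 'inode %s\nprint\n' "$ino"
    done
    echo quit
}
```

The output could be typed into an interactive session started with `fsdb -f /dev/rraid0g`, or piped to it if that turns out to work, before running the fsck that clears the inodes.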

christos
