Testing 7.0 Beta: FFS still very slow when creating files

Discussion:

Stephan

2014-08-22 15:59:37 UTC

Hi,

I´m testing a current build of NetBSD 7.0 BETA and found that an old
FFS issue is still there. It was discussed here:

https://mail-index.netbsd.org/tech-kern/2012/10/25/msg014236.html

Creating many files on FFSv2 is very slow. One example from this box:

# time sh -c 'seq 1 30000|xargs touch '
114.48s real 0.23s user 113.89s system

load averages: 0.28, 0.10, 0.03; up 0+00:14:04 17:53:22
23 processes: 21 sleeping, 2 on CPU
CPU states: 0.0% user, 0.0% nice, 50.0% system, 0.0% interrupt, 50.0% idle
Memory: 44M Act, 7844K Exec, 26M File, 1827M Free
Swap: 2040M Total, 2040M Free

Remarkable is that the disk does almost nothing during this, but one
CPU is completely utilized in kernel mode. Reopening those files is
much faster (as fast as creating them should be):

# time sh -c 'seq 1 30000|xargs touch '
1.09s real 0.16s user 1.01s system

I see something similar happening on tmpfs:

# time sh -c 'seq 1 30000|xargs touch '
36.34s real 0.38s user 36.04s system

Creating files is faster, but still not as fast as it should be. The
only filesystem that (like before) performs fast is the experimental
ZFS port:

# time sh -c 'seq 1 30000|xargs touch '
3.73s real 0.19s user 3.63s system

Has anybody an idea on this or how to track this down? At the moment,
I can´t even enter ddb using Strg+Alt+Esc keys for some reason. I´ve
also seen people playing with dtrace but that doesn´t seem to be
included.

Regards,

Stephan

PS:

My filesystems:

# mount
/dev/wd0a on / type ffs (local)
/dev/wd0e on /a type ffs (log, local)
/dev/wd0g on /c type ffs (local)
kernfs on /kern type kernfs (local)
ptyfs on /dev/pts type ptyfs (local)
procfs on /proc type procfs (local)
tmpfs on /var/shm type tmpfs (local)
mypool on /mypool type zfs (local)
tmpfs on /A type tmpfs (local)

BTW: I told sysinst to make the root filesystem a FFSv2 one, but I
think the "log" option is missing in /ets/fstab. I tested on /a.

This is what lockstat says on FFS:

# lockstat time sh -c 'seq 1 30000|xargs touch '
114.56 real 0.23 user 113.95 sys
Elapsed time: 114.58 seconds.

-- Adaptive mutex spin

Total% Count Time/ms Lock Caller
------ ------- --------- ---------------------- ------------------------------
98.59 15 81.10 mntvnode_lock <all>
98.59 14 81.10 mntvnode_lock vfs_insmntque+2a
0.00 1 0.00 mntvnode_lock vfs_vnode_iterator_destroy+14
0.78 47 0.64 fffffe807f4f2140 <all>
0.64 43 0.52 fffffe807f4f2140 uvm_fault_internal+857
0.13 3 0.11 fffffe807f4f2140 uvm_unmap_remove+200
0.01 1 0.01 fffffe807f4f2140 genfs_lock+83
0.56 4 0.46 fffffe807f4f2c80 cache_lookup+62
0.02 1 0.01 fffffe807f4f2580 uvm_fault_internal+857
0.02 1 0.01 fffffe80779f0840 uvm_fault_internal+363
0.01 20 0.01 bufcache_lock <all>
0.01 10 0.01 bufcache_lock getblk+2f
0.00 6 0.00 bufcache_lock bdwrite+146
0.00 2 0.00 bufcache_lock brelse+19
0.00 2 0.00 bufcache_lock bwrite+e5
0.01 1 0.01 fffffe8077679440 uvm_fault_internal+363
0.01 1 0.01 fffffe807eb8cf80 uvm_fault_internal+857
0.00 15 0.00 uvm_pageqlock <all>
0.00 6 0.00 uvm_pageqlock uvm_fault_internal+94d
0.00 5 0.00 uvm_pageqlock uvm_fault_lower_enter+189
0.00 4 0.00 uvm_pageqlock uvm_fault_upper_enter.isra.5+11
0.00 1 0.00 fffffe807d4b63f8 wapbl_add_buf+3f

-- Adaptive mutex sleep

Total% Count Time/ms Lock Caller
------ ------- --------- ---------------------- ------------------------------
100.00 1 4.76 mntvnode_lock vfs_insmntque+2a

-- Spin mutex spin

Total% Count Time/ms Lock Caller
------ ------- --------- ---------------------- ------------------------------
52.32 1 0.00 fffffe807f4f2bc0 mi_switch+bd
47.68 3 0.00 uvm_fpageqlock <all>
30.33 2 0.00 uvm_fpageqlock uvm_pagealloc_strat+c3
17.35 1 0.00 uvm_fpageqlock uvm_pagefree+24c

-- RW lock sleep (writer)

Total% Count Time/ms Lock Caller
------ ------- --------- ---------------------- ------------------------------
100.00 9 0.28 fffffe807d4b6400 wapbl_flush+84

-- RW lock sleep (reader)

Total% Count Time/ms Lock Caller
------ ------- --------- ---------------------- ------------------------------
100.00 10 275.20 fffffe807d4b6400 wapbl_begin+6a

-- RW lock spin

Total% Count Time/ms Lock Caller
------ ------- --------- ---------------------- ------------------------------
97.81 2 0.09 fffffe807d4b6400 wapbl_begin+6a
2.19 1 0.00 fffffe807ed96360 genfs_lock+7b

-- Kernel lock spin

Total% Count Time/ms Lock Caller
------ ------- --------- ---------------------- ------------------------------
100.00 42 4.94 kernel_lock <all>
97.51 10 4.81 kernel_lock bdev_strategy+53
2.02 18 0.10 kernel_lock VOP_POLL+43
0.31 9 0.02 kernel_lock intr_biglock_wrapper+12
0.14 4 0.01 kernel_lock biodone2+6f
0.02 1 0.00 kernel_lock VOP_LOCK+43

And the box´s dmesg:

# dmesg
Copyright (c) 1996, 1997, 1998, 1999, 2000, 2001, 2002, 2003, 2004, 2005,
2006, 2007, 2008, 2009, 2010, 2011, 2012, 2013, 2014
The NetBSD Foundation, Inc. All rights reserved.
Copyright (c) 1982, 1986, 1989, 1991, 1993
The Regents of the University of California. All rights reserved.

NetBSD 7.0_BETA (GENERIC.201408210820Z)
total memory = 2039 MB
avail memory = 1963 MB
kern.module.path=/stand/amd64/7.0/modules
timecounter: Timecounters tick every 10.000 msec
timecounter: Timecounter "i8254" frequency 1193182 Hz quality 100
MICRO-STAR INTERNATIONAL CO., LTD MS-7418 ( )
mainbus0 (root)
ACPI: RSDP 0xf91a0 000024 (v02 ACPIAM)
ACPI: XSDT 0x7f7c0100 000054 (v01 MSI_NB NOTEBOOK 20081119 MSFT 00000097)
ACPI: FACP 0x7f7c0290 0000F4 (v03 111908 FACP1105 20081119 MSFT 00000097)
ACPI: DSDT 0x7f7c05c0 003E75 (v01 A7418 A7418170 00000170 INTL 20051117)
ACPI: FACS 0x7f7ce000 000040
ACPI: APIC 0x7f7c0390 00006C (v01 111908 APIC1105 20081119 MSFT 00000097)
ACPI: MCFG 0x7f7c0400 00003C (v01 111908 OEMMCFG 20081119 MSFT 00000097)
ACPI: SLIC 0x7f7c0440 000176 (v01 MSI_NB NOTEBOOK 20081119 MSFT 00000097)
ACPI: OEMB 0x7f7ce040 000061 (v01 111908 OEMB1105 20081119 MSFT 00000097)
ACPI: HPET 0x7f7c4440 000038 (v01 111908 OEMHPET 20081119 MSFT 00000097)
ACPI: All ACPI Tables successfully acquired
cpu0 at mainbus0 apid 0: Intel(R) Atom(TM) CPU 230 @ 1.60GHz, id 0x106c2
cpu1 at mainbus0 apid 1: Intel(R) Atom(TM) CPU 230 @ 1.60GHz, id 0x106c2
ioapic0 at mainbus0 apid 2: pa 0xfec00000, version 0x20, 24 pins
acpi0 at mainbus0: Intel ACPICA 20131218
acpi0: X/RSDT: OemId <MSI_NB,NOTEBOOK,20081119>, AslId <MSFT,00000097>
acpi0: SCI interrupting at int 9
timecounter: Timecounter "ACPI-Fast" frequency 3579545 Hz quality 1000
hpet0 at acpi0: high precision event timer (mem 0xfed00000-0xfed00400)
timecounter: Timecounter "hpet0" frequency 14318180 Hz quality 2000
MCH (PNP0C01) at acpi0 not configured
attimer1 at acpi0 (TMR, PNP0100): io 0x40-0x43 irq 0
pcppi1 at acpi0 (SPKR, PNP0800): io 0x61
midi0 at pcppi1: PC speaker
sysbeep0 at pcppi1
RMSC (PNP0C02) at acpi0 not configured
FWH (INT0800) at acpi0 not configured
FWHE (PNP0C02) at acpi0 not configured
OMSC (PNP0C02) at acpi0 not configured
PCIE (PNP0C02) at acpi0 not configured
RMEM (PNP0C01) at acpi0 not configured
acpibut0 at acpi0 (PWRB, PNP0C0C-170): ACPI Power Button
ACPI: Enabled 1 GPEs in block 00 to 1F
ACPI Exception: AE_NOT_FOUND, While evaluating Sleep State [\_S2_]
(20131218/hwxface-646)
ACPI Exception: AE_NOT_FOUND, While evaluating Sleep State [\_S3_]
(20131218/hwxface-646)
attimer1: attached to pcppi1
pci0 at mainbus0 bus 0: configuration mode 1
pci0: i/o space, memory space enabled, rd/line, rd/mult, wr/inv ok
pchb0 at pci0 dev 0 function 0: vendor 0x8086 product 0x2770 (rev. 0x02)
agp0 at pchb0: i915-family chipset
agp0: detected 7932k stolen memory
agp0: aperture at 0xe0000000, size 0x10000000
vga0 at pci0 dev 2 function 0: vendor 0x8086 product 0x2772 (rev. 0x02)
wsdisplay0 at vga0 kbdmux 1: console (80x25, vt100 emulation)
wsmux1: connecting to wsdisplay0
i915drm0 at vga0: Intel i945G
i915drm0: AGP at 0xe0000000 256MB
i915drm0: Initialized i915 1.6.0 20080730
hdaudio0 at pci0 dev 27 function 0: HD Audio Controller
hdaudio0: interrupting at ioapic0 pin 16
hdafg0 at hdaudio0: Realtek ALC888
hdafg0: DAC00 8ch: Speaker [Jack]
hdafg0: DAC01 2ch: HP Out [Jack]
hdafg0: ADC02 2ch: Line In [Jack], Mic In [Jack]
hdafg0: ADC03 2ch: Mic In [Jack]
hdafg0: 8ch/2ch 44100Hz 48000Hz 96000Hz 192000Hz PCM16 PCM20 PCM24 AC3
audio0 at hdafg0: full duplex, playback, capture, mmap, independent
ppb0 at pci0 dev 28 function 0: vendor 0x8086 product 0x27d0 (rev. 0x01)
ppb0: PCI Express 1.0 <Root Port of PCI-E Root Complex> x1 @ 2.5Gb/s
pci1 at ppb0 bus 1
pci1: i/o space, memory space enabled, rd/line, wr/inv ok
re0 at pci1 dev 0 function 0: RealTek 8168/8111 PCIe Gigabit Ethernet
(rev. 0x02)
re0: interrupting at ioapic0 pin 16
re0: Ethernet address 00:21:85:97:da:09
re0: using 256 tx descriptors
rgephy0 at re0 phy 7: RTL8169S/8110S/8211 1000BASE-T media interface, rev. 2
rgephy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT,
1000baseT-FDX, auto
ppb1 at pci0 dev 28 function 1: vendor 0x8086 product 0x27d2 (rev. 0x01)
ppb1: PCI Express 1.0 <Root Port of PCI-E Root Complex> x1 @ 2.5Gb/s
pci2 at ppb1 bus 2
pci2: i/o space, memory space enabled, rd/line, wr/inv ok
ppb2 at pci0 dev 28 function 2: vendor 0x8086 product 0x27d4 (rev. 0x01)
ppb2: PCI Express 1.0 <Root Port of PCI-E Root Complex> x1 @ 2.5Gb/s
pci3 at ppb2 bus 3
pci3: i/o space, memory space enabled, rd/line, wr/inv ok
ppb3 at pci0 dev 28 function 3: vendor 0x8086 product 0x27d6 (rev. 0x01)
ppb3: PCI Express 1.0 <Root Port of PCI-E Root Complex> x1 @ 2.5Gb/s
pci4 at ppb3 bus 4
pci4: i/o space, memory space enabled, rd/line, wr/inv ok
uhci0 at pci0 dev 29 function 0: vendor 0x8086 product 0x27c8 (rev. 0x01)
uhci0: interrupting at ioapic0 pin 23
usb0 at uhci0: USB revision 1.0
uhci1 at pci0 dev 29 function 1: vendor 0x8086 product 0x27c9 (rev. 0x01)
uhci1: interrupting at ioapic0 pin 19
usb1 at uhci1: USB revision 1.0
uhci2 at pci0 dev 29 function 2: vendor 0x8086 product 0x27ca (rev. 0x01)
uhci2: interrupting at ioapic0 pin 18
usb2 at uhci2: USB revision 1.0
uhci3 at pci0 dev 29 function 3: vendor 0x8086 product 0x27cb (rev. 0x01)
uhci3: interrupting at ioapic0 pin 16
usb3 at uhci3: USB revision 1.0
ehci0 at pci0 dev 29 function 7: vendor 0x8086 product 0x27cc (rev. 0x01)
ehci0: interrupting at ioapic0 pin 23
ehci0: EHCI version 1.0
ehci0: companion controllers, 2 ports each: uhci0 uhci1 uhci2 uhci3
usb4 at ehci0: USB revision 2.0
ppb4 at pci0 dev 30 function 0: vendor 0x8086 product 0x244e (rev. 0xe1)
pci5 at ppb4 bus 5
pci5: i/o space, memory space enabled
ichlpcib0 at pci0 dev 31 function 0: vendor 0x8086 product 0x27b8 (rev. 0x01)
timecounter: Timecounter "ichlpcib0" frequency 3579545 Hz quality 1000
ichlpcib0: 24-bit timer
ichlpcib0: TCO (watchdog) timer configured.
piixide0 at pci0 dev 31 function 1: Intel 82801GB/GR IDE Controller
(ICH7) (rev. 0x01)
piixide0: bus-master DMA support present
piixide0: primary channel configured to compatibility mode
piixide0: primary channel interrupting at ioapic0 pin 14
atabus0 at piixide0 channel 0
piixide0: secondary channel configured to compatibility mode
piixide0: secondary channel interrupting at ioapic0 pin 15
atabus1 at piixide0 channel 1
piixide1 at pci0 dev 31 function 2: Intel 82801GB/GR Serial ATA/Raid
Controller (ICH7) (rev. 0x01)
piixide1: bus-master DMA support present
piixide1: primary channel configured to native-PCI mode
piixide1: using ioapic0 pin 19 for native-PCI interrupt
atabus2 at piixide1 channel 0
piixide1: secondary channel configured to native-PCI mode
atabus3 at piixide1 channel 1
ichsmb0 at pci0 dev 31 function 3: vendor 0x8086 product 0x27da (rev. 0x01)
ichsmb0: interrupting at ioapic0 pin 19
iic0 at ichsmb0: I2C bus
isa0 at ichlpcib0
pckbc0 at isa0 port 0x60-0x64
pckbd0 at pckbc0 (kbd slot)
pckbc0: using irq 1 for kbd slot
wskbd0 at pckbd0: console keyboard, using wsdisplay0
acpicpu0 at cpu0: ACPI CPU
acpicpu0: C1: HLT, lat 0 us, pow 0 mW
coretemp0 at cpu0: thermal sensor, 1 C resolution
acpicpu1 at cpu1: ACPI CPU
timecounter: Timecounter "clockinterrupt" frequency 100 Hz quality 0
uhub0 at usb1: vendor 0x8086 UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub0: 2 ports with 2 removable, self powered
uhub1 at usb3: vendor 0x8086 UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub1: 2 ports with 2 removable, self powered
uhub2 at usb0: vendor 0x8086 UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub2: 2 ports with 2 removable, self powered
uhub3 at usb2: vendor 0x8086 UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub3: 2 ports with 2 removable, self powered
uhub4 at usb4: vendor 0x8086 EHCI root hub, class 9/0, rev 2.00/1.00, addr 1
uhub4: 8 ports with 8 removable, self powered
IPsec: Initialized Security Association Processing.
umass0 at uhub4 port 6 configuration 1 interface 0
umass0: Generic USB2.0-CRW, rev 2.00/58.87, addr 2
umass0: using SCSI over Bulk-Only
scsibus0 at umass0: 2 targets, 1 lun per target
sd0 at scsibus0 target 0 lun 0: <Generic-, Multi-Card, 1.00> disk removable
sd0: drive offline
wd0 at atabus2 drive 0
wd0: <WDC WD3200AAJS-22B4A0>
wd0: drive supports 16-sector PIO transfers, LBA48 addressing
wd0: 298 GB, 620181 cyl, 16 head, 63 sec, 512 bytes/sect x 625142448 sectors
wd0: 32-bit data port
wd0: drive supports PIO mode 4, DMA mode 2, Ultra-DMA mode 6 (Ultra/133)
wd0(piixide1:0:0): using PIO mode 4, Ultra-DMA mode 6 (Ultra/133) (using DMA)
atapibus0 at atabus3: 2 targets
cd0 at atapibus0 drive 0: <ATAPI DVD A DH20A6S, , 7P55> cdrom removable
cd0: 32-bit data port
cd0: drive supports PIO mode 4, DMA mode 2, Ultra-DMA mode 5 (Ultra/100)
cd0(piixide1:1:0): using PIO mode 4, Ultra-DMA mode 5 (Ultra/100) (using DMA)
Kernelized RAIDframe activated
pad0: outputs: 44100Hz, 16-bit, stereo
audio1 at pad0: half duplex, playback, capture
boot device: wd0
root on wd0a dumps on wd0b
root file system type: ffs
wsdisplay0: screen 1 added (80x25, vt100 emulation)
wsdisplay0: screen 2 added (80x25, vt100 emulation)
wsdisplay0: screen 3 added (80x25, vt100 emulation)
wsdisplay0: screen 4 added (80x25, vt100 emulation)

Edgar Fuß

2014-08-22 16:11:57 UTC