Monday, December 5, 2011

Monitor a list of currently mounted filesystems

As you probably know, /proc/mounts and /proc/self/mountinfo contain the list of currently mounted filesystems. These files can be monitored with poll(2) or select(2). The findmnt(8) utility exposes this functionality on the command line.

session A (monitor):
   findmnt --poll

session B (event):
   mount /home/fs-images/ext2.img /mnt/test

session A (findmnt output after event):
   mount /mnt/test /dev/loop0 ext2 rw,relatime,user_xattr,acl,barrier=1

More examples. Wait until /mnt/test is unmounted:
   findmnt --poll=umount --first-only /mnt/test
Report all ext2, ext3, and ext4 remounts to read-only mode:
   findmnt --poll=remount --types ext2,ext3,ext4 --options ro
You can also define output columns, for example if you want to know more about "mount --move" operations:
   # findmnt --poll=move -o OLD-TARGET,TARGET,SOURCE
/mnt/test /mnt/foo /dev/loop0
The event in this example was generated by the "mount --move /mnt/test /mnt/foo" command.

And, for example, if you want to see the old and new options after a remount:
   # findmnt --poll=remount -o TARGET,OLD-OPTIONS,OPTIONS
/mnt/foo ro,relatime,user_xattr,acl,barrier=1 rw,user_xattr,acl,barrier=1
This event was generated by the "mount -o remount /mnt/foo -o rw,strictatime" command.

All this is available in util-linux 2.20 (e.g. Fedora 16).

Friday, November 25, 2011

wipefs(8) improvements

I finally found time to improve the wipefs(8) command. The most visible change is support for partition tables. You can use wipefs(8) to remove MBR as well as GPT and many other partition tables.

Another important change (well... a bugfix) is that "wipefs -a" now really erases everything that libblkid (blkid(8)) is able to detect.

It now calls the libblkid detection code again after erasing each magic string, to ensure that nothing can still be found on the device. This is important for things like GPT, where a backup table is stored in another place, or for filesystems like FAT, where there are more ways to detect the superblock.

The last important change is a new command line option, "-t". You can now specify a filesystem, RAID or partition table name. For example:
       wipefs -a -t ext4
will erase 'ext4' only. The option is interpreted in the same way as -t for mount(8) or findmnt(8), so you can specify more than one filesystem and you can prefix all or selected filesystems with 'no', for example:
       wipefs -a -t noext4,ext3,ext2
erases everything except ext4, ext3 and ext2 filesystems.
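
A rough sketch of how such a type list can be interpreted, written as a shell function. The name type_matches is hypothetical, and the rule it implements (a leading "no" negates the whole list, an item prefixed with "no" excludes that type) is an assumption based on the description above, not code taken from wipefs(8):

```shell
# type_matches TYPE LIST
# Returns 0 (true) if TYPE is selected by the comma-separated LIST.
type_matches() {
    type=$1
    list=$2
    negate=0

    # A leading "no" negates the whole list: "noext4,ext3" == "noext4,noext3".
    case $list in
        no*) negate=1; list=${list#no} ;;
    esac

    old_ifs=$IFS
    IFS=,
    for item in $list; do
        if [ "$item" = "no$type" ]; then
            IFS=$old_ifs
            return 1                # explicitly excluded
        fi
        if [ "$item" = "$type" ]; then
            IFS=$old_ifs
            return "$negate"        # listed: selected unless the list is negated
        fi
    done
    IFS=$old_ifs

    # Not listed: selected only when the list is negated.
    [ "$negate" -eq 1 ]
}
```

For example, "type_matches xfs noext4,ext3,ext2" succeeds (xfs would be erased), while "type_matches ext3 noext4,ext3,ext2" fails.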

If you are a developer of filesystem tools (e.g. mkfs.type), you should know that libblkid now contains a new function, blkid_do_wipe():
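/* needs <fcntl.h>, <unistd.h> and <blkid/blkid.h>; the device has to be open read-write */
int fd = open("/dev/sda1", O_RDWR | O_CLOEXEC);
blkid_probe pr = blkid_new_probe();

blkid_probe_set_device(pr, fd, 0, 0);
blkid_probe_enable_superblocks(pr, 1);
blkid_probe_set_superblocks_flags(pr, BLKID_SUBLKS_MAGIC);

while (blkid_do_probe(pr) == 0)
        blkid_do_wipe(pr, 0);   /* 0 = really erase, not a dry run */

blkid_free_probe(pr);
close(fd);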

and all superblocks are undetectable...

By the way, it's also a good idea to call wipefs -a from a system installer to avoid unexpected problems. I have seen many bug reports from people with a mess on their disks (an unexpected mix of partition table and RAID superblocks, swap and LUKS or ReiserFS, etc.).

The changes will be available in the next util-linux release 2.21 (beta planned next month).

Monday, November 7, 2011

util-linux stats

Today Sami Kerola (a util-linux contributor) sent me some nice R graphs generated from the git commit logs.

... yes, the project is growing

...but weekend is weekend

... oh, I thought that we all love summer!

Wednesday, July 20, 2011

dmesg(1) changes for util-linux 2.20

I have rewritten dmesg(1). It's the first large change to the code in the last 18 years.

New features:
  • --decode to translate facility and level numbers into human-readable prefixes
$ dmesg --decode
kern :info : [26443.677632] ata1.00: configured for UDMA/100
kern :info : [26443.830225] PM: resume of devices complete after 2452.856 msecs
kern :debug : [26443.830606] PM: Finishing wakeup.
kern :warn : [26443.830608] Restarting tasks ... done.
  • filter out messages according to the --facility and --level options, for example
$ dmesg --level=err,warn

$ dmesg --facility=daemon,user

$ dmesg --facility=daemon --level=debug

  • -u, --userspace to print only userspace messages

  • -k, --kernel to print only kernel messages

  • -t, --notime to skip [...] timestamps

  • -T, --ctime to print human readable timestamp in ctime()-like format. Unfortunately, this is useless on laptops if you have used suspend/resume. (The kernel does not use the standard system time as a source for printk() and it's not updated after resume.)

  • --show-delta to print time delta between printed messages
$ dmesg --show-delta
[35523.876281 < 4.016887>] usb 1-4.1: new low speed USB device using hci_hcd and address 12
[35523.968398 < 0.092117>] usb 1-4.1: New USB device found, idVendor=413c, idProduct=2003
[35523.968408 < 0.000010>] usb 1-4.1: New USB device strings: Mfr=1, Product=2, SerialNumber=0
[35523.968416 < 0.000008>] usb 1-4.1: Product: Dell USB Keyboard
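
To make the --ctime caveat concrete, here is a minimal sketch of the conversion involved: a kernel timestamp counts seconds since boot, so a wall-clock time has to be derived by adding it to the boot time. The function name stamp_to_ctime is hypothetical, the boot time is supplied by the caller, and "date -d @N" is assumed to be available (GNU date). This is an illustration, not the dmesg(1) implementation:

```shell
# stamp_to_ctime BOOT_EPOCH STAMP
# BOOT_EPOCH: boot time in seconds since the Unix epoch (assumed to be known)
# STAMP:      kernel timestamp as printed by dmesg, e.g. 26443.677632
stamp_to_ctime() {
    boot_epoch=$1
    seconds=${2%%.*}        # drop the fractional part of the timestamp
    date -u -d "@$((boot_epoch + seconds))" '+%Y-%m-%d %H:%M:%S'
}
```

On a live system the boot time could be taken from the btime line in /proc/stat. Because the kernel timestamp is not adjusted after resume, the computed time is wrong on machines that have been suspended.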

Wednesday, April 20, 2011

bind mounts, mtab and read-only

Bind mounts have been supported since Linux 2.4. That's a pretty long time, but many users still think that bind mounts are something completely different from normal mounts.

Example 1:
 # mount /dev/sdb1 /mnt/A
 # mount /dev/sdb1 /mnt/B
This is not a bug. It's possible to mount the same filesystem in two places.

Example 2:
 # mount /dev/sdb1 /mnt/A
 # mount --bind /mnt/A /mnt/B
The result from both examples is the same, see /proc/self/mountinfo:
 # grep mnt /proc/self/mountinfo
48 20 8:17 / /mnt/A rw,relatime - ext4 /dev/sdb1 rw,barrier=1,stripe=64,data=ordered
49 20 8:17 / /mnt/B rw,relatime - ext4 /dev/sdb1 rw,barrier=1,stripe=64,data=ordered
This is very important: from the kernel's point of view it is the same thing. The same filesystem is mounted in two places.

The kernel does not keep any record that /mnt/B was created by a bind mount (the MS_BIND flag of the mount(2) syscall). There is no dependency between /mnt/A and /mnt/B (for example, you can umount /mnt/A).

Unfortunately, the situation in the /etc/mtab file is completely different:
 # grep mnt /etc/mtab
/dev/sdb1 /mnt/A ext4 rw 0 0
/mnt/A /mnt/B none rw,bind 0 0
This is confusing for many users. Try:
 # umount /mnt/A
 # rm -rf /mnt/A

 # grep mnt /etc/mtab
/mnt/A /mnt/B none rw,bind 0 0
Does the information in mtab make any sense? I don't think so... Keeping this kind of information in userspace is a mistake. Yeah, mtab is evil.

Everyone who uses bind mounts on a system without mtab (where mtab is a symlink to /proc/mounts) has to understand that the "bind" flag is no longer stored anywhere. For example, you have to explicitly add the flag to the mount options if you want a read-only bind mount.
 # rm -f /etc/mtab
 # ln -s /proc/mounts /etc/mtab
(or install Fedora 15:-)

Let's use findmnt(8) rather than grep /proc/self/mountinfo:
 # findmnt -o TARGET,VFS-OPTIONS,FS-OPTIONS /dev/sda1
/mnt/A rw,relatime rw,errors=continue,user_xattr,acl,barrier=0,data=ordered
/mnt/B rw,relatime rw,errors=continue,user_xattr,acl,barrier=0,data=ordered
What happens if we try to remount with the bind flag? See:
 # mount -o remount,ro,bind /mnt/B

 # findmnt -o TARGET,VFS-OPTIONS,FS-OPTIONS /dev/sda1
/mnt/A rw,relatime rw,errors=continue,user_xattr,acl,barrier=0,data=ordered
/mnt/B ro,relatime rw,errors=continue,user_xattr,acl,barrier=0,data=ordered
The filesystem (superblock) is still read-write, but the /mnt/B mountpoint is marked as read-only in the VFS.

And now the same thing without the bind flag:
 # mount -o remount,ro /mnt/B

 # findmnt -o TARGET,VFS-OPTIONS,FS-OPTIONS /dev/sda1
/mnt/A rw,relatime ro,errors=continue,user_xattr,acl,barrier=0,data=ordered
/mnt/B ro,relatime ro,errors=continue,user_xattr,acl,barrier=0,data=ordered
The superblock has been remounted read-only, so the filesystem is read-only everywhere in the system.

Again, all of this works regardless of how /mnt/B was mounted (example 1 or example 2).

BTW, you can also set the block device read-only with blockdev --setro. So we have three layers (device -> FS -> VFS) where it is possible to set the read-only attribute :-)
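
The VFS and filesystem layers can also be told apart directly in /proc/self/mountinfo, where the per-mountpoint (VFS) options are the sixth field and the superblock options are the last field. A minimal sketch, assuming input in mountinfo format on standard input; the function name ro_layers is hypothetical and the block device layer is not covered:

```shell
# ro_layers < /proc/self/mountinfo
# Prints "mountpoint vfs=ro|rw fs=ro|rw" for every line of input.
ro_layers() {
    awk '
    # Return "ro" if the comma-separated option list contains "ro", else "rw".
    function rw_flag(opts,    n, a, i) {
        n = split(opts, a, ",")
        for (i = 1; i <= n; i++)
            if (a[i] == "ro")
                return "ro"
        return "rw"
    }
    # $5 = mountpoint, $6 = VFS options, $NF = superblock options
    { printf "%s vfs=%s fs=%s\n", $5, rw_flag($6), rw_flag($NF) }
    '
}
```

After "mount -o remount,ro,bind /mnt/B" one would expect /mnt/B to show vfs=ro fs=rw, and after a plain "mount -o remount,ro /mnt/B" every mountpoint of that filesystem to show fs=ro.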

Tuesday, January 4, 2011

findmnt(8) and submounts

I just applied (to util-linux upstream) a patch that makes it possible to list all submounts of the specified filesystem(s). For example:
$ findmnt --submounts /sys
/sys /sys sysfs rw,relatime
├─/sys/fs/cgroup tmpfs tmpfs rw,nosuid,nodev,noexec,relat
│ ├─/sys/fs/cgroup/systemd cgroup cgroup rw,nosuid,nodev,noexec,relat
│ ├─/sys/fs/cgroup/cpuset cgroup cgroup rw,nosuid,nodev,noexec,relat
│ ├─/sys/fs/cgroup/ns cgroup cgroup rw,nosuid,nodev,noexec,relat
│ ├─/sys/fs/cgroup/cpu cgroup cgroup rw,nosuid,nodev,noexec,relat
│ ├─/sys/fs/cgroup/cpuacct cgroup cgroup rw,nosuid,nodev,noexec,relat
│ ├─/sys/fs/cgroup/memory cgroup cgroup rw,nosuid,nodev,noexec,relat
│ ├─/sys/fs/cgroup/devices cgroup cgroup rw,nosuid,nodev,noexec,relat
│ ├─/sys/fs/cgroup/freezer cgroup cgroup rw,nosuid,nodev,noexec,relat
│ ├─/sys/fs/cgroup/net_cls cgroup cgroup rw,nosuid,nodev,noexec,relat
│ └─/sys/fs/cgroup/blkio cgroup cgroup rw,nosuid,nodev,noexec,relat
├─/sys/kernel/security systemd-1 autofs rw,relatime,fd=22,pgrp=1,tim
├─/sys/kernel/debug systemd-1 autofs rw,relatime,fd=24,pgrp=1,tim
└─/sys/fs/fuse/connections fusectl fusectl rw,relatime
This returns information about /sys and all of its submounts.

Now you can implement a recursive umount in shell, something like:
for d in $(findmnt --list --submounts "$MOUNTPOINT" -o TARGET -n | tac); do
    umount "$d"
done
I hope that umount(8) will support something like this ASAP.