coreutils inbox - Aug 2010

This status update comes about 4 months after the last one. As always, you can follow the latest changes as they're added to the NEWS file (changes via RSS). There is also a bug tracker available to record additions, feature requests and issues.

Rejected ideas

Some of the hardest work on coreutils is knowing what to reject, and providing appropriate justification to the contributors. The contributions below all arrived since the last update and, while good ideas, were not included for the reasons detailed on the mailing list.

fold --indent. `fmt -t | sed 's/^ / /'` was deemed sufficient
hostname -b. Setting a default hostname is too platform dependent
truncate -s +50%. Percentage calculation was thought best handled outside of truncate
cut --csv. A separate util was deemed best for this complicated task
mktemp -tp. It was thought better to create a fifo in a temp dir, rather than a temp fifo directly
rm -d. rmdir is equivalent and less confusing
date -iso8601. date --rfc-3339 was deemed sufficient
uniq -c --total. Piping to `awk '{t+=$1}END{print t,"total"}1'` was deemed sufficient
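
Two of the suggested alternatives above are easy to wrap up yourself. The following is only a sketch: the function names count_with_total and grow_by_percent are made up for illustration, and GNU stat and truncate are assumed.

```shell
# Hypothetical helpers illustrating two of the workarounds listed above.
# Assumes GNU coreutils (stat -c, truncate) and a POSIX shell.

# In place of "uniq -c --total": print the usual uniq -c lines,
# followed by a final line holding the sum of the counts.
count_with_total() {
  sort | uniq -c | awk '{t+=$1}END{print t,"total"}1'
}

# In place of "truncate -s +50%": do the percentage arithmetic in the
# shell and pass truncate an absolute size.
# usage: grow_by_percent FILE PERCENT
grow_by_percent() {
  size=$(stat -c %s "$1")
  truncate -s "$((size + size * $2 / 100))" "$1"
}
```

For example, `printf 'b\na\nb\n' | count_with_total` lists the count for each distinct line and then a "3 total" line, and `grow_by_percent file 50` extends file to one and a half times its current size.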

Additions

cp --attributes-only was added to only copy file metadata
ls uses the narrower traditional time/date style by default, which includes the locale's abbreviated month name
sort -g uses long doubles rather than just doubles for greater range and precision and similar performance
sort --debug was added to warn about questionable options and highlight key extents
sort uses all available processors by default, which can be limited with --parallel, `taskset` or OMP_NUM_THREADS
sort supports more options in combination with -R and -V. -R is now allowed with -dfiV, and -V with -di
stat supports %m to output the mount point. Similar to that output by df, but also handling bind mounts
truncate supports setting a file's size relative to an existing file
The POSIX_FADV_SEQUENTIAL hint is now passed to the kernel by the appropriate utils
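
To get a feel for a few of these additions, something like the following can be run in a scratch directory. This is a rough sketch rather than a test suite: it assumes a coreutils release that includes the changes above, and every file name in it is made up.

```shell
# Sketch only: tries out some of the additions listed above.
# Assumes a coreutils release containing them; file names are arbitrary.
tmp=$(mktemp -d)

# cp --attributes-only: create dst with src's metadata but none of its data.
printf 'payload\n' > "$tmp/src"
chmod 640 "$tmp/src"
cp --attributes-only --preserve=mode,timestamps "$tmp/src" "$tmp/dst"
dst_mode=$(stat -c %a "$tmp/dst")   # permission bits taken from src
dst_size=$(stat -c %s "$tmp/dst")   # size of dst; no data was copied

# sort -g: general numeric sort, restricted to one thread with --parallel.
sorted=$(printf '1e3\n5\n2.5e1\n' | LC_ALL=C sort -g --parallel=1)

# sort --debug: annotate the part of each line used as the sort key.
printf 'b 2\na 10\n' | sort --debug -k2,2n

# stat %m: report the mount point of the file system holding the file.
stat -c 'mount point: %m' "$tmp/src"

# truncate: size a file relative to a reference file (here, ref plus 10 bytes).
printf '12345' > "$tmp/ref"
truncate -r "$tmp/ref" -s +10 "$tmp/grown"
grown_size=$(stat -c %s "$tmp/grown")
```

The variables dst_mode, dst_size, sorted and grown_size are only there so the results can be inspected afterwards.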

TODO

These items mentioned in the last update are not done yet.

Integrate fallocate(2) into cp and mv. The interface hasn't been improved, so we'll just use it as is
posix_fallocate() is still not used due to its dependence on fallocate(2)
libunistring is now available in Debian and Fedora, and we're about to start using it in coreutils
Integrate Linus' faster sha1sum. The much faster SSE sha1sum implementation was also mentioned; although architecture specific, it is probably worth including given the ubiquity of the architecture and the performance gain it provides
Support an unlimited number of split files
split --number is mostly ready for inclusion with the interface fully defined
dd skip_bytes count_bytes to efficiently extract portions of a file
Speed up seq
sort --range to more efficiently output a subset of the input
Add an NSA/DoD verify function to shred
Fix column alignment in df (and pinky and stat)
More sensible cp --preserve=mode behavior
Handle ACLs by not using umask
Add a status=noinfo option to silence dd
Automatically use more CPU cache efficient buffer sizes in sort
Possibly integrate the threaded external sort patch
Add an inplace contrib/ script (or command) to robustly edit files in-place
uniq --key (like sort --key)
Use fiemap for mv and cp to efficiently copy sparse files
Add OCFS2 support to cp --reflink
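
As background for the dd skip_bytes/count_bytes item above, here is a sketch of how a byte range can be extracted with the existing tools. The function names are made up for illustration, and the skip_bytes and count_bytes names themselves are only proposals at this point, so they are not used here.

```shell
# Hypothetical helpers: print LENGTH bytes of FILE starting at byte OFFSET
# (counting from 0), using only options available today.

# usage: extract_range FILE OFFSET LENGTH
# tail -c +N starts output at byte N, counting from 1; head -c caps the length.
extract_range() {
  tail -c "+$(($2 + 1))" "$1" | head -c "$3"
}

# The same with dd alone. bs=1 makes skip= and count= byte granular,
# at the cost of one read and one write call per byte, which is the
# inefficiency the TODO item wants to avoid.
extract_range_dd() {
  dd if="$1" bs=1 skip="$2" count="$3" 2>/dev/null
}
```

Both `extract_range file 3 4` and `extract_range_dd file 3 4` print bytes 3 to 6 of file, counting from 0.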

New items have also cropped up since the last update.