coreutils inbox - Dec 2012

This status update (subscribe via RSS) comes about 4 months after the last one, and coincides loosely with release 8.20. Note also the bug tracker with stats which records additions, feature requests and issues.

Rejected ideas

Some of the hardest work on coreutils is knowing what to reject and giving contributors an appropriate justification. The contributions below all arrived since the last update and, while good ideas, were not included for the reasons detailed on the mailing list.

readlink -f output/trailing/slash/. It's easy to add the '/' in shell if needed.
dd --limit-speed. It was thought best to leave this to tools like pv or trickle.
sort to use /var/tmp by default. It was thought best to keep using /tmp as its tmp files are stateless.
expand --auto-tabs. It wasn't thought to give much benefit over just specifying --tabs.
rm -rf "." (the current directory). It was thought existing support for rm -rf "$PWD" suffices.
seq --format support for general printf formats. Prefixing etc. is best done outside of seq.
accurate du results for OCFS2 reflinked files. It was thought too complicated and specific to add to du.
head --read-all-input. It was thought adding more control to tee was a more general solution.
mv --symbolic-link. It was thought that mv and ln --relative separately give more control.
chmod b10111. Binary conversion can be done easily in bash or ksh.
mv --swap. To swap two files, a shell script (perhaps provided by coreutils) would be best.
df --without-header. --header options are only really useful for data consumers.
sort fixed width fields. This is already supported with: sort -t$'\n' -k1.5,1.9 ...
md5sum --pipe to output checksum to file and data to stdout. tee suffices for this.
dd conv=noerror should apply to writes as well as reads. shred is best used for this use case.
cut -f2,1 to reorder fields. Using awk or join is deemed sufficient.
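
Several of the rejections above point to the shell as the better place for the feature. The following is a minimal sketch of three of those alternatives (binary modes for chmod, reordering fields, and swapping two files), assuming bash and GNU mktemp. The swap function, the temporary directory and the sample data are hypothetical names chosen for illustration, and the swap is not atomic.

```shell
#!/usr/bin/env bash
# Illustrative sketch only: names and sample data are made up for this example.
tmpdir=$(mktemp -d)
trap 'rm -rf "$tmpdir"' EXIT

# chmod b10111: bash arithmetic accepts base-2 literals, and printf %o
# turns the value into the octal form chmod expects.
mode=$(printf '%o' "$((2#10111))")
touch "$tmpdir/file"
chmod "$mode" "$tmpdir/file"

# cut -f2,1: cut always emits fields in input order, so awk is used
# to print field 2 before field 1.
reordered=$(printf 'a\tb\tc\n' | awk -F'\t' -v OFS='\t' '{print $2, $1}')

# mv --swap: exchange two files through a temporary name created in the
# same directory, so that every step is a plain rename.
swap() {
  local tmp
  tmp=$(mktemp "$(dirname -- "$1")/swap.XXXXXX") || return 1
  mv -- "$1" "$tmp" && mv -- "$2" "$1" && mv -- "$tmp" "$2"
}

printf 'one\n' > "$tmpdir/x"
printf 'two\n' > "$tmpdir/y"
swap "$tmpdir/x" "$tmpdir/y"
```

If the swap is interrupted between renames, one of the two names may be missing until the remaining steps are redone by hand, which is one reason a carefully written script was preferred over a new mv option.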

Additions

Note you can see the latest changes as they're added in the NEWS file (subscribe via RSS).

cp --no-preserve=mode now works as expected.
The coreutils build process was changed to use non-recursive make.
timeout now propagates signals from its children.
timeout --preserve-status was added to always propagate the exit status from children.
seq is now 70 times faster in common cases, and has no range limits either in those cases.
df --output to output specific columns or all possible columns.
stat,tail now recognize vzfs, vmhgfs, ceph.
dd status=none to suppress all informational output.
factor has improved speed and range, going from 3 to 10000 times faster for small and large numbers respectively.
cp,mv,install support -Z to set the system defined context appropriate for the dest path (not completed).
numfmt. A new command to transform numbers (not completed).
df suppresses duplicates. This needs to be verified before release.
readlink supports multiple files, to match BSD and the new realpath command.
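
A few of the additions above are easy to try from a shell. This is a rough sketch, assuming a coreutils build recent enough to include them; the variable names are only for illustration, and the df output depends on the local system.

```shell
# seq: plain integer ranges like this one take the fast path.
last=$(seq 1 1000000 | tail -n 1)

# df --output: select specific columns for the file system holding /.
df --output=source,pcent,target /

# dd status=none: no transfer statistics are written to stderr.
dd_err=$(printf 'hello' | dd status=none of=/dev/null 2>&1)

# timeout --preserve-status: report the child's own exit status rather
# than 124 when the time limit is hit. sleep is killed by SIGTERM here.
timeout --preserve-status 0.2 sleep 5
status=$?

# readlink with several operands prints one result per line.
readlink -f / .
```

With the default TERM signal, the status reported for the timed-out sleep is 128 plus the signal number, which gives 143 on Linux.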

TODO

These new items were identified since the last update.

These items mentioned in the last update are not done yet.

Integrate fallocate(2) into cp and mv. The interface hasn't been improved, so we'll just use it as is
posix_fallocate() is still not used due to its dependence on fallocate(2)
libunistring is now available in debian and fedora and we're about to start using it in coreutils
integrate Linus' faster sha1sum though the benefits are arch and compiler specific. Also the much faster SSE sha1sum implementation was mentioned, which although architecture specific, is probably worth including given the ubiquity of the architecture and the performance gain it provides
sort --head or sort --range to more efficiently output a subset of the input
add a NSA/DoD verify function to shred
Handle ACLs by not using umask
Automatically use more CPU cache efficient buffer sizes in sort
Possibly integrate the threaded external sort patch
Add an inplace contrib/ script (or command) to robustly edit files in-place
Possibly add OCFS2 support to cp --reflink
uniq --key (like sort --key)
join should support sort options like -n
dd oflag={fsync,BLKFLSBUF}
cut --blank-separated
shuf --random-range=LO-HI to allow repetition within range
uniq --group to enhance grouping
csplit --suppress-matched to exclude delimiter lines from the output files
support SEEK_DATA/SEEK_HOLE in ZFS and elsewhere to efficiently copy sparse files
wc -b -M to output frequencies of characters
multiarch support in stdbuf
rename might be a candidate for coreutils
--noatime support to various recursive traversal tools
rm --no-traverse-mount-points which would be especially useful with bind mounts
du --size to filter results to above/below a specified size
stat(1) and ls(1) support for birth time. Dependent on xstat() being provided by the kernel
fmt -w should not have such low limits
groups -0 to support group names with spaces etc.
cp -u should be restartable. Currently may leave partial files in dest, or wrong files in the presence of hard links.
split --confirm-create. To allow one to insert a new disk or whatever. There might be a way to do this externally?
chmod -hHLP should be supported (like BSD).
support tee --write-error={[cont],ignore,exit}
support sleep,timeout --date="..." to specify absolute times independent of suspend/resume etc.
expand seq fast path to more cases like specifying hex, adding arbitrary integers and subtraction etc.
Possibly use sendfile in cp. Not under consideration until it has been benchmarked and the code made portable enough.
Add an sha-3 util. Work already started on such a GPL util elsewhere.
Possibly s/--first-only/--initial/ in unexpand, to be less ambiguous and match expand.
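
The in-place editing item in the list above can be approximated today with a small shell function. This is only a sketch of the general write-then-rename approach, not the proposed contrib/ script: the name inplace and its calling convention are hypothetical, and it assumes GNU mktemp and GNU chmod (for --reference).

```shell
# inplace FILE COMMAND [ARGS...]
# Run COMMAND as a filter with FILE on stdin, and replace FILE with the
# output only if every step succeeds. The original is left untouched on failure.
inplace() {
  local file=$1 tmp
  shift
  # Create the temporary file in the same directory so the final mv is a rename.
  tmp=$(mktemp "$(dirname -- "$file")/.inplace.XXXXXX") || return 1
  if "$@" < "$file" > "$tmp" &&
     chmod --reference="$file" "$tmp" &&
     mv -- "$tmp" "$file"
  then
    return 0
  fi
  rm -f -- "$tmp"
  return 1
}
```

For example, inplace notes.txt tr a-z A-Z would upper-case notes.txt. Ownership, ACLs, hard links and symlinks are not preserved by this sketch, which is part of what a robust tool would need to handle.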