This status update comes about 4 months after the last one and coincides with the release of coreutils 8.5. As always, you can see the latest changes as they're added in the NEWS file, and you can also subscribe to see those changes. Another resource available since the last release is a new bug tracker.
Rejected ideas
Some of the hardest work on coreutils is knowing what to reject and providing appropriate justification to the contributors. The contributions below all came since the last update and while good ideas, they were not included for various reasons detailed on the mailing list.
- cut -d 'string'. sed 's/string/\x00/g' | cut -d '' was deemed sufficient
- cut --output-delimiter short option. One can already do cut --ou
- An errno utility. A full C wrapper around strerror() was deemed overkill. Maybe we'll add a script to contrib/
- chmod maintains ctime when permissions unchanged. The proposed patch was deemed inefficient
- chmod -d to set perms on just directories. The 'X' mode, or `find` with `chmod` was deemed sufficient
- chmod +S to set setgid on just directories. `find` in combination with `chmod` was deemed sufficient
- configurable md5sum buffer size. It was thought better to use NFS parameters to minimize network latency, or the stdbuf utility to control the buffering more generally
- md5sum --threads. UNIX tools were deemed good enough to process separate files in parallel
- sort -V auto ignores white-space. One can do that more generally with -b
- dd conv=sparse. cp with fiemap and `cp --sparse=always /dev/stdin` were deemed sufficient until there is more file system support for punching holes in the middle of a file
- mv -p (create target dir). It was thought more functional to just `mkdir -p` first
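A few of the workarounds cited above can be sketched in shell. This is only an illustration, assuming GNU sed, cut and find; the function names (cut_by_string, chmod_dirs, mv_p) are hypothetical and not part of coreutils.

```shell
#!/bin/sh
# Hypothetical helpers illustrating the suggested workarounds.
# Assumes GNU sed (for \x00 in the replacement) and GNU cut
# (where an empty -d argument means a NUL delimiter).

# cut -d 'string': print field $2 of each stdin line, split on string $1.
# $1 is interpolated into a sed regex, so it must not contain '/' or
# regex metacharacters, and the input must not already contain NULs.
cut_by_string() {
    sed "s/$1/\\x00/g" | cut -d '' -f "$2"
}

# chmod -d: apply mode $1 to directories only, at and below $2.
chmod_dirs() {
    find "$2" -type d -exec chmod "$1" {} +
}

# mv -p: create target directory $2 if needed, then move $1 into it.
mv_p() {
    mkdir -p -- "$2" && mv -- "$1" "$2"/
}
```

For example, `printf 'a::b::c\n' | cut_by_string '::' 2` selects the middle field. Selecting several fields at once would leave NUL bytes between them in the output, so a final `tr '\0' ...` step may be wanted in that case.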
Additions
Once again, the last few months have been mainly concerned with stabilization, but there have been a few new features:
- join got the --header option which essentially allows one to use the recent --check-order option with headings
- join -t '' (empty) now operates on the whole line (like sort does by default)
- sort uses posix_fadvise() to indicate it will stream its input. This was seen to increase performance in certain cases, and will probably be applied to other utilities
- timeout got the --kill-after option to send a KILL signal after the specified duration
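As a rough sketch of how two of these additions might be used, assuming coreutils 8.5 or later: the wrapper names (join_with_header, run_with_deadline) and the sample data are made up for illustration.

```shell
#!/bin/sh
# Hypothetical wrappers around the new join and timeout options.

# Join two files whose first line is a heading. Without --header,
# --check-order could flag the heading line as being out of order.
join_with_header() {
    join --header --check-order "$1" "$2"
}

# Run a command with a time limit of $2 seconds (TERM is sent at that
# point), then send KILL if it is still running $1 seconds later.
run_with_deadline() {
    grace=$1
    limit=$2
    shift 2
    timeout --kill-after="$grace" "$limit" "$@"
}
```

Given one file containing the lines `id name`, `1 alice`, `2 bob` and another containing `id city`, `1 paris`, `2 rome`, join_with_header prints the combined heading followed by the joined rows. For run_with_deadline, something like `run_with_deadline 1 1 sh -c 'trap "" TERM; sleep 30'` exercises the KILL path, since the command ignores TERM; the exact non-zero exit status in that case is best checked against the timeout documentation for the installed version.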
TODO
These items mentioned in the last update are not done yet.
- Integrate fallocate(2) into cp and mv. The interface hasn't been improved, so we'll just use it as is
- posix_fallocate() is still not used due to its dependence on fallocate(2)
- libunistring is now available in Debian and Fedora and we're about to start using it in coreutils
- add sort --debug to help figure out the complicated key selection rules in sort
- integrate Linus' faster sha1sum. Also the much faster SSE sha1sum implementation was mentioned, which although architecture specific, is probably worth including given the ubiquity of the architecture and the performance gain it provides
- support unlimited number of split files
- split --number is mostly ready for inclusion with the interface fully defined
- dd skip_bytes count_bytes to efficiently extract portions of a file
- speed up seq
- sort --range to more efficiently output a subset of the input
- add an NSA/DoD verify function to shred
- cp --attributes-only to support the inplace script
- Fix column alignment in df (and pinky and stat)
- More sensible cp --preserve=mode behavior
- Handle ACLs by not using umask
- Add a status=noinfo option to silence dd
- Automatically use more CPU cache efficient buffer sizes in sort
- Possibly integrate the threaded external sort patch
- Integrate the threaded internal sort patch
- Possibly integrate the rm -d patch
- Add an inplace contrib/ script (or perhaps a stand alone utility) to robustly edit files in-place
- uniq --key (like sort --key)
- Use fiemap for mv and cp to efficiently copy sparse files
- Add OCFS2 support to cp --reflink
© Apr 27 2010