Re: The shell and its crappy handling of whitespace
Tuesday, August 1st, 2023
I saw an interesting article on The Universe of Discourse titled "The shell and its crappy handling of whitespace".
I’m about thirty-five years into Unix shell programming now, and I continue to despise it. The shell’s treatment of whitespace is a constant problem. The fact that
    for i in *.jpg; do
      cp $i /tmp
    done

doesn’t work is a constant pain. The problem here is that if one of the filenames is bite me.jpg then the cp command will turn into

    cp bite me.jpg /tmp

and fail, saying

    cp: cannot stat 'bite': No such file or directory
    cp: cannot stat 'me.jpg': No such file or directory

or worse there is a file named bite that is copied even though you did not want to copy it, maybe overwriting /tmp/bite that you wanted to keep.

To make it work properly you have to say

    for i in *; do
      cp "$i" /tmp
    done

with the quotes around the $i.
The article then goes on:
Suppose I want to change the names of all the .jpeg files to the corresponding names with .jpg instead. I can do it like this:
    for i in *.jpeg; do
      mv $i $(suf $i).jpg
    done

Ha ha, no, some of the files might have spaces in their names. […]
before finally settling on the quote-hell version:
    for i in *.jpeg; do
      mv "$i" "$(suf "$i")".jpg    # three sets of quotes
    done
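The `suf` helper is not defined in the excerpt quoted here. As a minimal sketch, assuming it simply prints a name without its last extension, the fully quoted loop could be tried out as follows. The body of `suf`, the scratch directory and the test file names are all assumptions for illustration, not the original article's code.

```shell
#!/bin/sh
# Hypothetical stand-in for the article's `suf`:
# print the argument without its last extension.
suf() {
  printf '%s\n' "${1%.*}"
}

# Work in a scratch directory so no real files are touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
: > "plain.jpeg"
: > "bite me.jpeg"

for i in *.jpeg; do
  mv "$i" "$(suf "$i")".jpg    # every expansion quoted, as in the article
done

ls
```

With the quotes in place, the name containing a space reaches `mv` as a single argument both as source and as target.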
This sparked some interesting discussions on Lobste.rs and Hacker News, and several people pointed out that other shells handle this properly, implying that there is no proper solution for this in standard shells such as bash.
A proper solution
However, this problem has long been solved, and the solution is in fact part of the POSIX standard. It is the IFS variable, or the Internal Field Separator:
The shell shall treat each character of the IFS as a delimiter and use the delimiters as field terminators to split the results of parameter expansion, command substitution, and arithmetic expansion into fields.
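To see what that sentence means in practice, here is a small sketch that counts the fields produced by an unquoted expansion, first with the default IFS and then with an empty one. The function name `count_fields` and the variable names are made up for this illustration, and the script assumes it starts with the default IFS (space, tab, newline).

```shell
#!/bin/sh
# Report how many arguments (fields) the function received.
count_fields() {
  echo "$#"
}

name="bite me.jpg"

# Default IFS: the unquoted expansion is split at the space.
default_count=$(count_fields $name)

# Empty IFS: field splitting is switched off, the value stays one field.
old_ifs=$IFS
IFS=""
empty_count=$(count_fields $name)
IFS=$old_ifs

echo "default IFS: $default_count field(s)"   # prints: default IFS: 2 field(s)
echo "empty IFS:   $empty_count field(s)"     # prints: empty IFS:   1 field(s)
```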
I’m quite surprised that no one in the Hacker News or Lobste.rs discussions mentioned it. You can simply set IFS to an empty value, which disables field splitting, and things work a lot more sanely. For example, to achieve the post’s example:
Suppose I want to change the names of all the .jpeg files to the corresponding names with .jpg instead.
You can simply do something like this (the test run renames in the other direction, from .jpg to .jpeg):

    # Create some test files
    echo "foo" > "foo.jpg"
    echo "bar" > "bar.jpg"
    echo "baz quux" > "baz cuux.jpg"

    # Store old IFS and set it to empty value
    OLD_IFS="$IFS"; IFS=""

    for I in *.jpg; do
      # No need to quote anything at all!
      mv $I $(basename $I .jpg).jpeg
    done

    # Reset IFS to previous value
    IFS="$OLD_IFS"

    ls -l
Which results in:
    -rw-r--r-- 1 Gebruiker Geen 0 Aug 1 10:07 bar.jpeg
    -rw-r--r-- 1 Gebruiker Geen 0 Aug 1 10:07 'baz cuux.jpeg'
    -rw-r--r-- 1 Gebruiker Geen 0 Aug 1 10:07 foo.jpeg
Sure, it feels a little bit uncomfortable, but it’s a perfectly fine solution nonetheless.
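If saving and restoring IFS by hand is the uncomfortable part, one possible variation is to make the change inside a subshell, so that it disappears on its own when the subshell exits. The sketch below assumes a scratch directory created with `mktemp -d` and two made-up test files. Note that an empty IFS only switches off field splitting: pathname expansion still applies to unquoted expansions, so a file name containing `*` or `?` would still need quotes (or `set -f`).

```shell
#!/bin/sh
# Scratch directory with test files, so nothing real is renamed.
workdir=$(mktemp -d)
: > "$workdir/foo.jpg"
: > "$workdir/baz cuux.jpg"

# The parentheses start a subshell: the cd and the IFS change end with it.
(
  cd "$workdir" || exit 1
  IFS=""
  for I in *.jpg; do
    # Unquoted on purpose; not split because IFS is empty here.
    mv $I $(basename $I .jpg).jpeg
  done
)

# Back in the parent shell IFS is untouched, so splitting works as usual.
set -- $(echo "one two")
parent_fields=$#
echo "fields in parent shell: $parent_fields"

ls "$workdir"
```

The trade-off is that variables set inside the parentheses are not visible afterwards, which is fine for a rename loop like this one.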