IFS stands for Internal (or Input) Field Separator: a set of characters that act as delimiters when Word Splitting is performed on an input string or on a parameter reference

By default, the IFS parameter holds a space, a tab and a newline →

$ printf "%q\n" "$IFS" # or printf "%s\n" "${IFS@Q}" | ANSI-C output Format
$' \t\n'

If IFS is unset, Word Splitting behaves as if IFS had its default value →

# IFS default value → $' \t\n'
$ foo="This is a test"
$ printf "=%s= " $foo # Not double quoting
=This= =is= =a= =test= # 4 Words/Fields
$ ( unset -v -- IFS ; printf "=%s= " $foo ) # IFS Unset
=This= =is= =a= =test= # 4 Words

No Word Splitting is performed if IFS is set to an empty string →

$ ( IFS='' ; printf "=%s= " $foo ) # IFS with empty string as value
=This is a test= # 1 Word
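
As a further illustration of how the IFS value drives splitting, the sketch below sets IFS to : so that an unquoted expansion is split on colons. The names path_like and split_output and the sample value are assumptions made for this example only →

```shell
#!/usr/bin/env bash
# Hypothetical sample value; any colon-separated string would do.
path_like="/usr/local/bin:/usr/bin:/bin"

# The command substitution runs in a subshell, so the IFS change and
# set -f (globbing disabled) do not leak into the calling shell.
split_output=$( IFS=: ; set -f ; printf '=%s= ' $path_like )

printf '%s\n' "$split_output"
```

Each colon delimits one field, so printf receives three separate arguments.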

When is IFS relevant?

Read Shell builtin

read’s input string processing steps →

  • read receives a string as input
  • That string undergoes Word Splitting according to the characters in IFS
  • Each field resulting from that splitting is assigned to one of read’s parameters
$ read -r _A _B _C <<< "Ubuntu Arch Debian" # IFS Default Value
$ printf "=%s= " "$_A" "$_B" "$_C"
=Ubuntu= =Arch= =Debian=
$ IFS=: read -r _A _B _C <<< "Ubuntu:Arch:Debian" # : as IFS's value
$ printf "=%s= " "$_A" "$_B" "$_C"
=Ubuntu= =Arch= =Debian=

If splitting yields more fields than the number of parameters passed to read, the last parameter receives all the remaining words

$ read -r _A _B <<< "Alex John Doe" # IFS Default Value
$ printf "=%s= " "$_A" "$_B"
=Alex= =John Doe= # _B receives the remaining args
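
The same rule applies with a non-whitespace delimiter. The following sketch assumes a made-up passwd-style record and hypothetical parameter names (_user, _pass, _rest) to show that the last parameter keeps the remaining text, inner delimiters included →

```shell
#!/usr/bin/env bash
# Hypothetical passwd-style record; the field values are made up.
record="alice:x:1000:1000:Alice Doe:/home/alice:/bin/bash"

# Three parameters for seven fields: _rest receives everything after the
# second delimiter, with its inner colons left intact.
IFS=: read -r _user _pass _rest <<< "$record"

printf 'user=%s\npass=%s\nrest=%s\n' "$_user" "$_pass" "$_rest"
```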

If read -a is specified, the fields resulting from splitting according to IFS are assigned to consecutive elements of an array

$ read -ra _distros <<< "Ubuntu Arch Debian Fedora"
$ printf "=%s= " "${_distros[@]}"
=Ubuntu= =Arch= =Debian= =Fedora=
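
read -a can be combined with a per-command IFS assignment. The sketch below assumes a comma-separated record and hypothetical names (record, _fields, ifs_before), and also checks that the IFS of the current shell is untouched once read returns →

```shell
#!/usr/bin/env bash
ifs_before=$IFS

# Hypothetical comma-separated record.
record="Ubuntu,Arch Linux,Debian"

# The IFS=, assignment only applies to this read invocation.
IFS=, read -ra _fields <<< "$record"

printf 'count=%d\n' "${#_fields[@]}"
printf '=%s= ' "${_fields[@]}" ; echo

# IFS in the current shell is unchanged after read returns.
[[ $IFS == "$ifs_before" ]] && echo "IFS unchanged"
```

Because the space is not part of IFS here, "Arch Linux" stays in a single array element.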

Leading and trailing spaces and tabs are trimmed from the input string if IFS contains those characters (e.g. IFS has its default value or is unset)

$ read -r _A <<< $'     \t \t foo \t   ' # IFS → $' \t\n'
$ printf "%q\n" "$_A" # ANSI-C Formatted String
foo

Likewise, each run of spaces and tabs inside the string is treated as a single delimiter →

$ read -r _A _B <<< $'  foo    bar   '
$ printf "=%s= " "$_A" "$_B"
=foo= =bar=

If IFS is set to an empty string, field splitting is not performed, with the following consequences →

  • Leading and Trailing blanks and tabs are not stripped out
  • The inner ones are not consolidated
$ IFS='' read -r _A <<< $'     \t \t foo \t   ' # IFS → empty string
$ printf "%q\n" "$_A"
$'     \t \t foo \t   '

A similar case arises when IFS contains only non-whitespace characters. Again, no leading or trailing trimming and no consolidation is performed on IFS characters: every occurrence delimits a field, so adjacent delimiters produce empty fields →

$ IFS=: read -r _A _B _C _D <<< ":test::foo::::bar::::"
$ printf "=%s= " "$_A" "$_B" "$_C" "$_D"
== =test= == =foo::::bar::::=
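
The difference between whitespace and non-whitespace delimiters is easy to see by counting fields with read -a. This is a minimal sketch with made-up inputs and hypothetical array names (_colon, _blank) →

```shell
#!/usr/bin/env bash
# Non-whitespace delimiter: the two adjacent colons enclose an empty field.
IFS=: read -ra _colon <<< "a::b"

# Whitespace delimiters (the default IFS value, set explicitly here):
# the run of blanks counts as a single delimiter.
IFS=$' \t\n' read -ra _blank <<< "a   b"

printf 'colon-separated -> %d fields\n' "${#_colon[@]}"
printf 'blank-separated -> %d fields\n' "${#_blank[@]}"
```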

Unquoted Shell Expansion

During Shell Expansion, every unquoted expansion undergoes Word Splitting and Globbing

As mentioned above, Word Splitting breaks the result of the expansion into fields (words) using the characters in IFS

Therefore, the following case undergoes Word Splitting →

$ foo="God's time is perfect"
$ printf "|%s| " $foo # Variable reference not quoted
|God's| |time| |is| |perfect|

While this one does not →

$ printf "|%s| " "$foo" # Variable reference quoted
|God's time is perfect|

The same happens with other Shell Expansions such as Command Substitution

Be aware that any variable reference inside a command substitution must be quoted, even if the command substitution itself is quoted

$ file="Lord of the Rings.mp3"
$ echo "$( rm -f -- $file )" # Wrong!
$ echo "$( rm -f -- "$file" )" # Correct! Inner variable reference quoted
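
The effect can be observed without deleting anything by counting the arguments a command receives. In this sketch count_args is a hypothetical helper, not a Bash builtin, and the default IFS value is assumed →

```shell
#!/usr/bin/env bash
# Hypothetical helper: prints how many arguments it received.
count_args() { printf '%d\n' "$#"; }

file="Lord of the Rings.mp3"

# The outer quotes do not protect the inner, unquoted reference.
unquoted="$( count_args $file )"
# The inner reference is quoted, so the file name stays in one piece.
quoted="$( count_args "$file" )"

printf 'unquoted -> %s args, quoted -> %s args\n' "$unquoted" "$quoted"
```

With the unquoted reference, rm would receive four separate names instead of one.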

$* Quoted Form

When positional parameters are expanded and the expansion is not quoted, Word Splitting is applied to the resulting string according to IFS

All positional parameters can be expanded with $* or $@, or with their quoted forms, "$*" and "$@"

Let’s create a function that prints the number of arguments and each argument, to make the following behavior easier to see →

foo ()
{
    printf "Args -> %d |" "$#" # Number of arguments
    printf " =%s=" "$@" ; echo # Each argument
}
$ set -- "Ubuntu Focal" "Linux Mint" "Debian Bookworm" # Set Positional Args
  • The unquoted $* expansion undergoes Word Splitting, which is applied to each positional parameter
$ foo $*
Args -> 6 | =Ubuntu= =Focal= =Linux= =Mint= =Debian= =Bookworm=
  • Likewise, the unquoted $@ expansion undergoes Word Splitting. As mentioned repeatedly above, an unquoted expansion is subject to splitting and globbing
$ foo $@
Args -> 6 | =Ubuntu= =Focal= =Linux= =Mint= =Debian= =Bookworm=

If above expansions are quoted:

  • "$*" → Expands all positional parameters as a single string in which the parameters are separated by the first character of IFS
$ export -f -- foo
$ ( IFS=: ; foo "$*" )
Args -> 1 | =Ubuntu Focal:Linux Mint:Debian Bookworm=
  • "$@" → Expands all positional parameters but, unlike the previous form, each positional parameter becomes a separate word, as if it were individually quoted
$ foo "$@"
Args -> 3 | =Ubuntu Focal= =Linux Mint= =Debian Bookworm=

That is, "$@" is the same as "$1" "$2" "$3" ...

The same occurs with the array expansions that extract all array elements →

$ declare -a -- _names=( "John Doe" "Richard Stallman" )
$ foo ${_names[*]} # Same as $*
Args -> 4 | =John= =Doe= =Richard= =Stallman=
$ foo ${_names[@]} # Same as $@
Args -> 4 | =John= =Doe= =Richard= =Stallman=
$ ( IFS=: ; foo "${_names[*]}" ) # Same as "$*"
Args -> 1 | =John Doe:Richard Stallman=
$ foo "${_names[@]}" # Same as "$@"
Args -> 2 | =John Doe= =Richard Stallman=
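
A common use of the "$*" behavior is joining elements with a chosen separator. The sketch below assumes a hypothetical helper named join_by (not a Bash builtin) that uses local, a Bash feature, so that the IFS change ends when the function returns →

```shell
#!/usr/bin/env bash
# join_by is a hypothetical helper: join_by SEPARATOR ELEMENT...
join_by() {
        local IFS=$1 # "$*" uses only the first character of IFS
        shift
        printf '%s\n' "$*"
}

declare -a -- _names=( "John Doe" "Richard Stallman" "Linus Torvalds" )
joined=$( join_by , "${_names[@]}" )

printf '%s\n' "$joined"
```

The elements are passed with "${_names[@]}" so that each one reaches the function as a single argument before being joined.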

Ways to set IFS’s values

There are several ways to assign a value to the IFS parameter, both POSIX and non-POSIX compliant

Keep in mind that situations may arise where IFS must be modified without changing its value globally, such as the following:

  • A function that returns/prints array elements as a single string, with the elements separated by the first character of IFS →
(
        declare -a -- _array=( A B C )
        IFS=: # : as IFS value
        printf "%s\n" "${_array[*]}"
)
  • Iterating over the file names printed by a command such as find, one per line →
(
        IFS=$'\n' ; set -f # Set IFS to newline and Disable Globs
        for _file in $( find . -name '.' -o -print ) # Omit . directory
        do
                printf "File -> %s\n" "$_file"
        done
)

That is, performing any action on system files. See Globbing for more information

  • As explained in the opening sections, processing a file line by line via command substitution →
(
        IFS=$'\n' ; set -f # IFS to \n and Globbing Disabled
        for _line in $(< ./foo ) # Same as $( cat ./foo )
        do
                printf "%s\n" "$_line"
        done
)
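
To see why set -f accompanies the IFS change, the sketch below expands a line that consists of a glob character, with and without globbing. It works in a temporary directory created with mktemp; the file names and variable names are assumptions made for this example →

```shell
#!/usr/bin/env bash
# Temporary directory with two hypothetical files.
workdir=$( mktemp -d ) || exit 1
touch "$workdir/a.txt" "$workdir/b.txt"

line='*' # A "line" that happens to be a glob pattern

# Globbing enabled: the pattern is replaced by the matching file names.
with_glob=$( cd "$workdir" && IFS=$'\n' && printf '[%s]' $line )

# Globbing disabled with set -f: the pattern is kept literally.
without_glob=$( cd "$workdir" && IFS=$'\n' && set -f && printf '[%s]' $line )

rm -rf -- "$workdir"

printf 'glob on : %s\nglob off: %s\n' "$with_glob" "$without_glob"
```

Both command substitutions run in subshells, so neither the directory change nor set -f affects the calling shell.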

Note that all the examples above run inside a subshell ( ) to avoid globally modifying IFS and Shell Options such as noglob (set -f)

As mentioned several times, however, any parameter created or modified inside a subshell is not reflected outside it (i.e. those changes do not apply to the parent shell’s environment)

To handle that situation, there are several ways to modify IFS and Shell Options without affecting them globally while still making parameter modifications in the current shell rather than in a child process

Local Shell Builtin + ANSI-C Quoting

Non-POSIX Compliant

foo()
{
        local -- _line= IFS=$'\n' # \n as IFS value through ANSI-C Quoting
        local -- _oldSetOptions=
        _oldSetOptions=$( set +o ) # Store Shell Options
        set -f # Disable Globbing (noglob opt)
 
        for _line in $( < ./foo ) # Non-POSIX Compliant
        do
                printf "%s\n" "$_line"
        done
 
        eval "$_oldSetOptions" # Restore Shell Options
}

Eval + Printf

POSIX Compliant

foo()
{
        _line=
        unset -v _savedIFS # Must start unset so the restore test below is meaningful
        _oldSetOptions=$( set +o ) # Save Shell Options
 
        [ -n "${IFS+set}" ] && _savedIFS=$IFS # If set, saves IFS's value
 
        set -f # Disable Globbing
        eval "$( printf 'IFS="\n"' )" # Assign \n to IFS as its value
 
        for _line in $( cat < ./foo ) # Command Substitution processing
        do
                printf "%s\n" "$_line"
        done
 
        unset -v -- IFS # IFS Cleanup
 
        [ -n "${_savedIFS+set}" ] && { # If IFS was set, restore it
 
                IFS=$_savedIFS ; unset -- _savedIFS ; }
 
        eval "$_oldSetOptions" # Restore Shell Options
}

Printf + Parameter Expansion

POSIX Compliant

$ IFS=$( printf '\nX' )
$ IFS=${IFS%X}
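
The trailing X is needed because command substitution removes trailing newlines from its output. The following sketch, which only uses POSIX features, compares both cases by measuring the resulting string length; stripped and nl are hypothetical variable names →

```shell
#!/bin/sh
# A bare newline is removed by the command substitution itself.
stripped=$( printf '\n' )

# The trailing X shields the newline; it is removed afterwards.
nl=$( printf '\nX' )
nl=${nl%X}

printf 'without guard: %d char(s)\n' "${#stripped}"
printf 'with guard   : %d char(s)\n' "${#nl}"
```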

Having seen the above, it must be said that file lines should not be read with a for loop, since this approach has to process the output of a Command Substitution

That expansion cannot be quoted, because it would then be treated as a single string and the loop would run only once, with the whole output as the loop parameter’s value

Left unquoted, the expansion undergoes Word Splitting and Globbing, and the command substitution also trims trailing newlines from its output

Remember too that, once the expansion is performed and those actions have been applied to the output string, the for loop assigns each resulting word/field in turn to the declared parameter

$ for _line in $(< ./foo) ; do printf "%s\n" "$_line" ; done

As mentioned earlier, this can be improved by setting IFS to a newline and disabling globbing with set -f

(
        IFS=$'\n'
        set -f
 
        for _line in $( < ./foo)
        do
                printf "%s\n" "$_line"
        done
)

With the above code, because IFS is limited to a newline, a line containing blanks between non-whitespace characters is no longer split into several words by Word Splitting and the for loop

Globbing characters are not interpreted either, so no filename expansion is performed and no extra word is generated for each file matching a glob pattern

However, the expansion still trims trailing newlines from its output

Likewise, since IFS is set to \n and newline is an IFS whitespace character, any sequence of consecutive newlines is treated as a single delimiter

In other words, empty lines are skipped

Blank lines cannot be preserved when relying on IFS to split on newlines

Moreover, the IFS modification remains in effect inside the loop body, which means that any unquoted expansion, or any other situation where Word Splitting applies, is split according to that IFS value

That is why A FOR LOOP IS NOT A RECOMMENDED WAY TO PROCESS FILE LINES

Instead, this is the recommended way →

Correct Processing of a File’s Lines

POSIX Compliant

while IFS= read -r _line
do
        printf "%s\n" "$_line"
 
done < ./foo

That’s all.

Note that this way of handling file lines is more reliable and shorter than the for loop approaches shown earlier
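
The difference in reliability can be checked on a file that contains an empty line. This is a minimal sketch that writes a made-up three-line sample to a temporary file and counts the iterations of each loop; the variable names are assumptions →

```shell
#!/usr/bin/env bash
# Hypothetical sample: three lines, the second one empty.
sample=$( mktemp ) || exit 1
printf 'first\n\nthird line\n' > "$sample"

# for loop over a command substitution, run in a subshell.
for_count=$(
        IFS=$'\n' ; set -f ; n=0
        for _line in $( < "$sample" )
        do
                n=$(( n + 1 ))
        done
        printf '%d' "$n"
)

# while read loop over the same file.
while_count=0
while IFS= read -r _line
do
        while_count=$(( while_count + 1 ))
done < "$sample"

rm -f -- "$sample"

printf 'for loop: %d lines, while read: %d lines\n' "$for_count" "$while_count"
```

The for loop sees only two words because the empty line is swallowed by the newline consolidation, while the while read loop iterates once per line.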

If a file’s last line does not end with a newline character \n, read still stores it in the parameter but returns false

Since the while loop iterates only while read returns true, that remaining line is stored in read’s parameter but is not processed inside the loop

To prevent this, process the remaining line separately →

while IFS= read -r _line
do
        printf "%s\n" "$_line"
 
done < ./foo
 
[[ -n $_line ]] && printf "%s\n" "$_line"

The while loop stops when read returns false, which means that it has reached EOF (i.e. there is no more input, or the last line does not end with \n)

Then $_line’s value is checked. If it has content (i.e. it is not an empty string), it is processed

Basically, this processes a last line that lacks a trailing newline but has already been read by read

The same applies to →

while IFS= read -r _line || [[ -n $_line ]]
do
        printf "%s\n" "$_line"
 
done < ./foo

The while loop iterates until read returns false; then the [[ ]] keyword checks whether read stored any content after the last \n. If so, that content (the parameter’s value) is processed

This makes the while loop iterate one last time even though read returned false, so that a last line without a trailing newline is processed
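
The effect of the extra test can be measured by counting iterations over a file whose last line lacks the trailing newline. In this sketch the temporary file and the counters plain and guarded are assumptions made for illustration →

```shell
#!/usr/bin/env bash
# Hypothetical sample: two lines, the last one without a trailing newline.
sample=$( mktemp ) || exit 1
printf 'foo\nbar' > "$sample"

# Plain loop: stops as soon as read returns false.
plain=0
while IFS= read -r _line
do
        plain=$(( plain + 1 ))
done < "$sample"

# Guarded loop: iterates once more if read left content in _line.
guarded=0
while IFS= read -r _line || [[ -n $_line ]]
do
        guarded=$(( guarded + 1 ))
done < "$sample"

rm -f -- "$sample"

printf 'plain: %d, guarded: %d\n' "$plain" "$guarded"
```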

Note that this does not work, since the last line (bar) lacks a trailing newline and is never printed →

printf 'foo\nbar' | while IFS= read -r _line
do
        printf "%s\n" "$_line"
done

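And this does not work either, because the while loop runs in a subshell as part of the pipeline, so the value that read assigned to _line is not available after the loop →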

printf 'foo\nbar' | while IFS= read -r _line
do
        printf "%s\n" "$_line"
done
 
[[ -n $_line ]] && printf "%s\n" "$_line"

The following is the correct way to iterate over an input stream when a pipeline feeds the first command’s output into the while loop →

printf 'foo\nbar' | {
 
        while IFS= read -r _line # Or || [[ -n $_line ]] ; do ...
        do
                printf "%s\n" "$_line"
        done
 
        [[ -n $_line ]] && printf "%s\n" "$_line"
}
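
An alternative not covered above is Bash’s process substitution, which feeds the loop through a redirection so that it runs in the current shell. The sketch below is non-POSIX, assumes default Bash settings (the lastpipe option is off) and uses hypothetical counters to show which assignments survive the loop →

```shell
#!/usr/bin/env bash
# Pipeline: the loop runs in a subshell, so its counter is lost afterwards
# (default Bash settings assumed, i.e. the lastpipe option is off).
_piped=0
printf 'foo\nbar' | while IFS= read -r _line || [[ -n $_line ]]
do
        _piped=$(( _piped + 1 ))
done

# Process substitution: the loop runs in the current shell.
_count=0
while IFS= read -r _line || [[ -n $_line ]]
do
        _count=$(( _count + 1 ))
done < <( printf 'foo\nbar' )

printf 'after pipeline: %d, after process substitution: %d\n' "$_piped" "$_count"
```

This form is useful when the loop has to modify parameters that are needed after it finishes.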