Thursday, May 28, 2009

Pure bash cat

So just to see if I could, I wrote a version of cat using pure bash. A pure bash script is one that uses nothing but bash builtins to accomplish its goal. To determine whether a particular command is a builtin, you can run type -t "command" (type is itself a builtin). Some notable builtins include echo, read, exec, and return. Some notable commands which are not builtins include cat and grep. Here is my implementation of cat in pure bash.
#!/bin/bash
INPUTS=( "${@:-"-"}" )
for i in "${INPUTS[@]}"; do
    if [[ "$i" != "-" ]]; then
        exec 3< "$i" || exit 1
    else
        exec 3<&0
    fi
    while read -ru 3; do
        printf '%s\n' "$REPLY"  # printf is a builtin too; echo -E would swallow a line that is just "-n" or "-e"
    done
done
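The builtin check mentioned earlier is easy to try for yourself. This is a minimal sketch, and kind_of is a made-up helper name, not something from the script above. It prints what type -t reports for each name it is given.

```bash
#!/bin/bash
# kind_of is a hypothetical helper used only for this illustration.
# For each argument it prints the name and the one-word answer from type -t.
kind_of() {
    local cmd
    for cmd in "$@"; do
        # type -t prints alias, keyword, function, builtin, or file,
        # and prints nothing if the name cannot be found.
        printf '%s: %s\n' "$cmd" "$(type -t "$cmd")"
    done
}

kind_of echo read exec return cat grep
```

On a typical system the first four should come back as builtin and the last two as file, though the answer for cat and grep depends on what is installed and on any aliases or functions you have defined.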
Now, keep reading if you want a small lesson in advanced bash. I'll go line by line to explain what this is doing.
#!/bin/bash
INPUTS=( "${@:-"-"}" )
Line 1 is the shebang.
#!/bin/bash
INPUTS=( "${@:-"-"}" )
for i in "${INPUTS[@]}"; do
Line 2 assigns the array variable INPUTS either the arguments provided on the command line, if there are any, or the single character "-". The way this happens is as follows: $@ is the variable that references the positional parameters (the arguments to your program). If you have not heard of $*, read this. I reference the positional parameters as ${@} because the braces allow me to add a "default value" to the expansion. A default value is the value the expansion produces if the variable is unset or empty. The way to use a default value is with :-, like so: ${@:-"hello"}. So if no arguments were given, the expansion produces "hello" instead. You will then notice that it is all enclosed in (). That makes an array out of the positional parameters (the first argument to the program becomes the first element in the array, the second argument becomes the second element, etc.).
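To see the default value and the array assignment working together, here is a small sketch. The function show_inputs and the variable inputs are names invented for this example. A function has its own positional parameters, so it stands in for the script.

```bash
#!/bin/bash
# show_inputs is a hypothetical helper: it builds the same kind of array
# as line 2 of the script, then prints the element count and each element.
show_inputs() {
    local inputs=( "${@:-"-"}" )
    printf '%d:' "${#inputs[@]}"
    printf ' <%s>' "${inputs[@]}"
    printf '\n'
}

show_inputs            # no arguments: prints 1: <->
show_inputs a "b c"    # two arguments: prints 2: <a> <b c>
```

Note that "b c" stays a single element. The double quotes around the whole expansion are what keep arguments containing spaces intact.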
INPUTS=( "${@:-"-"}" )
for i in "${INPUTS[@]}"; do
    if [[ "$i" != "-" ]]; then
Line 3 begins a for loop which assigns to i, in turn, each value stored in the array INPUTS discussed earlier. The @ index works for arrays the same way $@ works for the positional parameters: inside double quotes, each element expands to its own word.
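The quotes around "${INPUTS[@]}" matter more than they look. The sketch below compares the quoted and unquoted forms. The file name is made up, and quoted_loop and unquoted_loop are hypothetical helpers written only for the comparison.

```bash
#!/bin/bash
# A made-up input list; the first name contains a space on purpose.
INPUTS=( "my file.txt" "-" )

# Quoted: one loop iteration per array element.
quoted_loop() {
    local i
    for i in "${INPUTS[@]}"; do
        printf '[%s]\n' "$i"
    done
}

# Unquoted: each element is split on whitespace first (and glob-expanded).
unquoted_loop() {
    local i
    for i in ${INPUTS[@]}; do
        printf '[%s]\n' "$i"
    done
}

quoted_loop      # prints [my file.txt] then [-]
unquoted_loop    # prints [my] then [file.txt] then [-]
```

With the unquoted form, a file name containing a space would be opened as two separate files, which is why the script uses the quoted form.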

Maybe I'll explain more when I'm less lazy.

6 comments:

Unknown said...

cat() { while IFS="" read l ; do echo "$l" ; done < $1 ; }

Russ said...

While that works for one file, you can't do it on multiple files. Also, you can't use - to cat from stdin.

TheBonsai said...

You must not use a variable for read. After that, simply printf "%s\n" "$REPLY".

TheBonsai said...

Sorry, my fault. You set IFS. But you really should set it for the read command only, IFS influences enough things to be careful with it. In any case, with the read using REPLY, you don't have to bother with IFS at all.

Anonymous said...

Too bad neither example can handle binary data like the real cat. Anything with an EOF character, for example :(

Good try though, can't expect much from bash when dealing with extended ASCII characters.

Russ said...

From some tests I just did, EOF seems to work, but NUL bytes seem to break it. Oh well, I don't think it can be fixed =(.