Linux Data Streams Overview

Linux data streams are fundamental to how the operating system and its applications communicate and manage data. They are channels through which data is transmitted. Every Linux process normally starts with three standard streams, each already opened as a file when the program starts. Each stream is associated with a file handle (also called a file descriptor), a small non-negative integer that the process uses to refer to an open file. File handles 0, 1, and 2 are defined by convention and long practice as STDIN, STDOUT, and STDERR, respectively.

STDIN, file handle 0, is standard input, which usually comes from the keyboard. STDIN can be redirected from any file, including device files, instead of the keyboard. Redirecting STDIN is needed less often than redirecting STDOUT, but it can be done. When you type commands or data into the terminal, that input is processed through STDIN.

STDOUT, file handle 1, is standard output, which is sent to the display by default. It is common to redirect STDOUT to a file or to pipe it to another program for further processing. When a command prints results on the terminal, it is using STDOUT.

STDERR, file handle 2, is standard error, the stream used specifically for error messages. Like STDOUT, it is sent to the display by default, but it can be redirected separately to a file or another output. This separation lets users tell normal output apart from error messages, which is useful for debugging and logging.
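
As a minimal sketch of the three streams in action, the snippet below defines a hypothetical function named greet (it is not part of any standard tool) that reads one line from STDIN, writes a greeting to STDOUT, and writes a diagnostic to STDERR. It then captures each output stream separately.

```shell
# greet is a hypothetical helper used only for illustration.
greet() {
  read -r name                              # file handle 0 (STDIN)
  echo "Hello, $name"                       # file handle 1 (STDOUT)
  echo "greet: finished reading input" >&2  # file handle 2 (STDERR)
}

# Capture only STDOUT; STDERR is discarded.
out=$(echo "world" | greet 2>/dev/null)

# Capture only STDERR: 2>&1 first points STDERR at the capture,
# then >/dev/null discards STDOUT.
err=$(echo "world" | greet 2>&1 >/dev/null)

echo "STDOUT carried: $out"
echo "STDERR carried: $err"
```

Because the two streams are separate, each capture contains only the text written to the stream it selects.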

Exit Codes

Exit codes are integer values returned by a program or command to the shell upon its completion. They signify the success or failure of the command's execution, with a convention of 0 indicating success and non-zero values indicating different types of errors or specific outcomes.

While executing, a program writes its output to STDOUT and its error messages to STDERR. When it finishes, it exits with an exit code, which is available to the shell or calling program. The shell stores the exit code of the most recently executed command in the special variable $?. To view it, use the echo command as follows:

echo $?
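
The following sketch shows how $? changes from one command to the next. The child shells started with sh -c and their exit values (0 and 3) are arbitrary choices made for illustration.

```shell
# A command that succeeds leaves 0 in $?.
sh -c 'exit 0'
first=$?

# A command that fails leaves its non-zero code in $?.
# The "|| second=$?" form saves the code in the same step, so it is not lost
# and the script keeps running even if "set -e" is active.
sh -c 'exit 3' || second=$?

# $? always reflects the most recent command, so here it holds the status
# of the assignment above, which succeeded.
third=$?

echo "first=$first second=$second third=$third"
```

Since every command overwrites $?, a script that needs the value later should copy it into a variable right away.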

Why Exit Codes Are Useful

  • Automation and Scripting: In shell scripting and automation tasks, checking the exit status of a command is crucial for conditional execution. For instance, a script might execute a follow-up command only if the previous command succeeded, as indicated by an exit code of 0.

The following script runs a second command only if the first one succeeds. It checks $?, which holds the exit status of the most recently executed command; a value of 0 indicates success.

    #!/bin/bash

    # First command: trying to create a directory
    mkdir new_directory

    # Check if the mkdir command succeeded
    if [ $? -eq 0 ]; then
      echo "Directory created successfully. Proceeding with the next command."
      # Follow-up command: create a new file inside the newly created directory
      touch new_directory/new_file.txt
    else
      echo "Failed to create directory. Aborting follow-up command."
    fi
  • Error Handling: Exit codes enable scripts and programs to detect errors and handle them appropriately. For example, a backup script might send a notification or attempt a retry if a critical command exits with a failure code.

  • Debugging: When debugging scripts or complex sequences of commands, understanding the exit codes of each command can help identify which part of the process failed and why.

  • Integration: In more complex systems, different components can communicate execution results through exit codes, allowing for sophisticated control flows based on the success or failure of various parts.
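
The error-handling idea above (retrying a failed command) can be sketched as follows. The retry function is a hypothetical helper written for this illustration, and the scratch directory from mktemp is an assumption that keeps the sketch repeatable.

```shell
# retry is a hypothetical helper: run a command up to N times and stop at
# the first success. Its variables are global for the sake of brevity.
retry() {
  attempts=$1
  shift
  n=1
  status=1
  while [ "$n" -le "$attempts" ]; do
    "$@" && return 0          # exit code 0: success, stop retrying
    status=$?                 # non-zero exit code of the failed attempt
    echo "attempt $n of $attempts failed with exit code $status" >&2
    n=$((n + 1))
  done
  return "$status"
}

workdir=$(mktemp -d)          # scratch directory so the sketch is repeatable

# Testing the command directly in "if" is equivalent to checking $? afterwards.
if retry 3 mkdir "$workdir/new_directory"; then
  touch "$workdir/new_directory/new_file.txt"
else
  echo "Failed to create directory. Aborting follow-up command." >&2
fi
```

The helper returns the exit code of the last failed attempt, so a caller can still distinguish between different kinds of failure.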

Common Exit Codes

Exit codes are integers ranging from 0 to 255, where each number can signify a different outcome. Here are some of the most commonly encountered exit codes and their general meanings:

  • 0: Success. The command or program executed successfully without any errors.

  • 1: General error. A catch-all for general errors when a more specific exit code is not provided.

  • 2: Misuse of shell builtins. This often indicates incorrect command usage or syntax.

  • 126: Command invoked cannot execute. This might be due to permissions issues or attempting to execute a non-executable file.

  • 127: Command not found. The shell could not locate the command to execute.

  • 128: Invalid exit argument. This exit status indicates an exit argument to a command was out of the allowed range.

  • 128+n: Fatal error signal "n". If a program is terminated by a signal, the exit status is 128 plus the signal number. For example, if a program is killed by signal 9 (SIGKILL), the exit status is 128 + 9 = 137.

  • 130: Script terminated by Control-C. This is the exit status when a command is terminated by the user with an interrupt signal (SIGINT, signal 2), that is, 128 + 2 = 130.

  • 255: Exit status out of range. An exit status can only hold values from 0 to 255. A larger value passed to the exit command is reduced modulo 256, so exit 256 yields 0, and in Bash exit -1 yields 255.
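
A rough way to observe several of these codes is sketched below. It assumes a typical Linux system where sh is Bash or dash; the command name no_such_command_for_demo is made up, and the temporary file stands in for any non-executable file.

```shell
# 127: the shell cannot find the command (the name is made up).
sh -c 'no_such_command_for_demo' 2>/dev/null || not_found=$?

# 126: the file exists but is not executable (mktemp creates it without
# execute permission).
plain_file=$(mktemp)
"$plain_file" 2>/dev/null || not_executable=$?

# 128+n: the child shell kills itself with signal 9 (SIGKILL), so 128 + 9.
{ sh -c 'kill -9 $$'; } 2>/dev/null || killed=$?

# Values above 255 wrap around modulo 256: 256 becomes 0 and 300 becomes 44.
sh -c 'exit 256'
wrapped_256=$?
sh -c 'exit 300' || wrapped_300=$?

rm -f "$plain_file"
echo "$not_found $not_executable $killed $wrapped_256 $wrapped_300"
```

Each status is saved immediately after the command that produced it, because the next command would overwrite $?.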

It's worth noting that exit codes above 0 can be specific to the program or script being executed. While some exit codes are standardized (like those listed above), many programs define their own exit codes to communicate more specific errors or statuses. This means that understanding the context of the command or application that produced the exit code is important for interpreting its meaning accurately.

For scripts and many commands, checking their documentation or source code (if available) can provide insights into the specific meanings of their exit codes.

Redirection and Pipes

The concept of standard streams is fundamental to the Unix philosophy of "everything is a file," allowing for powerful and flexible inter-process communication. For instance, pipes (|) and redirections (>, <, 2>, etc.) are built on the idea of connecting these standard streams between processes or to files.
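
A small sketch of these operators follows. The file names and the sample data are assumptions chosen for illustration, and everything is written to a scratch directory created with mktemp.

```shell
workdir=$(mktemp -d)

# ">" sends STDOUT to a file instead of the terminal.
printf 'pear\napple\nfig\n' > "$workdir/fruits.txt"

# "<" feeds a file to STDIN; ">" stores the sorted result.
sort < "$workdir/fruits.txt" > "$workdir/sorted.txt"

# "|" connects the STDOUT of one process to the STDIN of the next.
first=$(sort < "$workdir/fruits.txt" | head -n 1)
count=$(grep 'p' "$workdir/fruits.txt" | wc -l | tr -d ' ')

echo "first item after sorting: $first"
echo "lines containing the letter p: $count"
```

None of the programs involved (sort, head, grep, wc) needs to know whether it is reading from a file, a pipe, or the keyboard; the shell wires up the file handles before each program starts.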

Hands-on Exercise Overview

Let's consider an example where the ls command writes to both STDERR and STDOUT in a single run, and try to redirect each stream to a different file.

Hands-on Exercise

  1. Change to the /proc/1 directory and run ls to list its contents.

    The ls command returns both error messages and a list of files at once. This means that ls wrote to the STDERR and STDOUT data streams in the same run.

  2. To split STDOUT and STDERR messages and handle them differently, the redirection operator > can be used:

     ls > ~/results.txt 2> ~/errors.txt
    
    • ls: This is the command used for listing directory contents.

    • > This is a redirection operator. It redirects the output of the ls command to a file instead of displaying it in the terminal.

    • 2> The number 2 represents the file handle (file descriptor) for the standard error stream (STDERR). The > symbol indicates redirection.

    • ~/results.txt and ~/errors.txt These are the file paths where the output of the ls command will be redirected. The ~ symbol represents the user's home directory.

💡 When the > redirection operator is used without any preceding number, it means that STDOUT (standard output) is being redirected by default. STDOUT has a file descriptor of 1, so redirecting STDOUT to a file without specifying a number is the same as using 1>.

  3. To list the contents of the current directory but discard any error messages that may occur during the process, run:

     ls 2> /dev/null
    
    • /dev/null: This is a special device file in Unix-like operating systems that discards all data written to it. It essentially acts as a black hole for data. So, in this context, any error messages generated by the ls command will be redirected to /dev/null, effectively discarding them.
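
The exercise above can also be reproduced in a repeatable way, without depending on /proc/1 or on files in the home directory. The sketch below is an assumption-laden variant: it uses a scratch directory from mktemp and asks ls for one file that exists and one that does not, so that both streams receive data.

```shell
workdir=$(mktemp -d)
touch "$workdir/exists.txt"

# List one file that exists and one that does not, so ls writes to both streams.
# ls exits non-zero because of the missing file; "|| ls_status=$?" records that.
ls "$workdir/exists.txt" "$workdir/missing.txt" \
  > "$workdir/results.txt" 2> "$workdir/errors.txt" || ls_status=$?

echo "results.txt:"; cat "$workdir/results.txt"
echo "errors.txt:";  cat "$workdir/errors.txt"

# Same listing with the error messages discarded.
visible=$(ls "$workdir/exists.txt" "$workdir/missing.txt" 2> /dev/null) || true
echo "with errors discarded: $visible"
```

After it runs, results.txt holds only the normal listing and errors.txt holds only the error message about the missing file.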
