Multi-threaded processing can speed up Bash scripts by spreading work across the multiple cores of your computer. Strictly speaking, Bash does not create threads: what this article calls a thread is a separate process that Bash runs in the background. There are a few things to keep in mind when using multi-threaded processing in your Bash scripts:
- Make sure your script is structured so that work can actually run in parallel. Tasks that depend on each other's results still have to run one after another, so only independent tasks benefit from the multiple cores on your computer.
- Make sure you use the right tools for multi-threading. Bash itself provides the & operator, the ${!} variable, and the wait and jobs builtins covered in this article; external tools such as xargs -P and GNU parallel can manage larger batches of jobs. Make sure any external tool you rely on is installed, in a suitable version, before trying to use it in your script.
- Be aware of potential race conditions when using multi-threaded processing in your script. Race conditions occur when two or more threads attempt to access the same piece of data at the same time, resulting in unpredictable behavior. Be careful not to create any race conditions when writing your script; if you do, be sure to debug it carefully so that you can find and fix the problem.
- Use caution when using multi-threading in production environments. It can improve performance, but it also adds complexity, so reserve it for cases where simpler options are not enough. If multi-threading is going to be used in production, make sure that it is done correctly and with proper safeguards in place.
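As a small illustration of the race-condition point above, one common way to keep background jobs from touching the same data is to give each job its own output file and merge the results only after every job has finished. This is a minimal sketch under that assumption; the task (a plain echo) and the file names are hypothetical.

```shell
#!/bin/bash
# Hypothetical example: three background jobs, each writing to its OWN file,
# so no two jobs ever write to the same file at the same time.
workdir=$(mktemp -d)

for i in 1 2 3; do
  # Each job gets a private output file named after its index.
  echo "result from job ${i}" > "${workdir}/job_${i}.out" &
done

# Block until every background job has finished.
wait

# Only now, with a single process left, combine the results in a fixed order.
merged=$(cat "${workdir}/job_1.out" "${workdir}/job_2.out" "${workdir}/job_3.out")
echo "${merged}"

rm -rf "${workdir}"
```

Because the merge happens after wait, the order of the combined output does not depend on which job happened to finish first.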
What Is Multi-threaded Programming?
A picture is worth a thousand words, and this holds true when it comes to showing the difference between single-threaded (1 thread) and multi-threaded (>1 thread) programming in Bash:
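The original illustration is not reproduced here. Based on the description that follows, the two command lines it contrasts can be sketched roughly as below; treat this as a reconstruction, not the exact listing.

```shell
# Line 1: a single one-second sleep, run in the foreground.
sleep 1

# Line 2: two one-second sleeps. The & sends the first one to the background,
# so the second one starts immediately instead of waiting for it.
sleep 1 & sleep 1
```

When typed at an interactive prompt, Bash additionally prints a job number and PID for the background process, and a Done line once it has finished. Inside a script these notifications are not shown.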
Our first multi-threaded programming setup or mini one-liner script could not have been simpler; in the first line, we sleep for one second using the sleep 1 command. As far as the user is concerned, a single thread was executing a single sleep of one second.
In the second line, we have two one-second sleep commands. We join them with an & separator, which acts not only as a separator between the two sleep commands but also as an instruction to Bash to start the first command in a background thread.
Normally, one would terminate a command by using a semicolon (;). Doing so would execute the command and only then proceed to the next command listed behind the semicolon. For example, executing sleep 1; sleep 1 would take just over two seconds – exactly one second for the first command, one second for the second, and a tiny amount of system overhead for each of the two commands.
However, instead of terminating a command with a semicolon, one can use other control operators that Bash recognizes, such as &, && and ||. The && operator is unrelated to multi-threaded programming: it executes the second command only if the first command was successful. The || operator is the opposite of && and executes the second command only if the first command failed.
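A short sketch of how && and || behave, using the standard true and false commands (which do nothing except succeed and fail, respectively); the messages are made up for the example:

```shell
#!/bin/bash
# 'true' always succeeds (exit status 0); 'false' always fails (non-zero status).

# && runs the second command only if the first one succeeded.
true  && echo "after &&: printed, because 'true' succeeded"
false && echo "after &&: never printed"

# || runs the second command only if the first one failed.
false || echo "after ||: printed, because 'false' failed"
true  || echo "after ||: never printed"
```

Neither operator puts anything into the background: in both cases the first command runs to completion before Bash decides whether to run the second.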
Returning to multi-threaded programming, using & as our command terminator will initiate a background process executing the command preceding it. It then immediately proceeds with executing the next command in the current shell while leaving the background process (thread) to execute by itself.
In the output of the command we can see a background process being started (as indicated by [1] 445317, where 445317 is the Process ID or PID of the just-started background process and [1] is an indication that this is our first background process) and subsequently finishing (as indicated by [1]+ Done sleep 1).
If you would like to view an additional example of background process handling, please see our Bash Automation and Scripting Basics (Part 3) article. Additionally, Bash Process Termination Hacks may be of interest.
Let’s now prove that we are effectively running two sleep processes at the same time:
Here we start our sleep process under time, and we can see that our single-threaded command ran for 1.003 seconds before our command line prompt was returned.
However, in the second example, it took about the same time (1.005 seconds) even though we were executing two periods (and processes) of sleep, though not consecutively. Again we used a background process for the first sleep command, leading to (semi-)parallel execution, i.e., multi-threaded.
We also wrapped our two sleep commands in $(…), a command substitution that runs its contents in a subshell, to combine them under time. As we can see, our done output appears after 1.005 seconds, and thus the two sleep 1 commands must have run simultaneously. Interestingly, the overall processing time increased only very slightly (0.002 seconds), which is easily explained by the time required to start a subshell and to initiate a background process.
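The comparison can also be scripted. The sketch below is a variation on the commands described above, not the article's exact listing: it uses the Bash time keyword with TIMEFORMAT set to print only the elapsed time, and the variable names sequential and parallel are chosen for this example.

```shell
#!/bin/bash
# Print only the elapsed wall-clock time in seconds (for example 1.005),
# and force '.' as the decimal point regardless of the system locale.
TIMEFORMAT='%R'
LC_ALL=C

# Sequential: the second sleep starts only after the first has finished.
sequential=$( { time { sleep 1; sleep 1; }; } 2>&1 )

# Parallel: the first sleep runs in the background while the second runs in
# the foreground; 'wait' ensures both have finished before the timer stops.
parallel=$( { time { sleep 1 & sleep 1; wait; }; } 2>&1 )

echo "sequential: ${sequential}s"
echo "parallel:   ${parallel}s"
```

On an otherwise idle machine the sequential figure should come out slightly above two seconds and the parallel one slightly above one second; the exact overhead will vary from run to run.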
Multi-threaded (and Background) Process Management
In Bash, multi-threaded coding will normally involve background threads started from a main one-line script or a full Bash script. In essence, one may think about multi-threaded coding in Bash as starting several background threads. When one starts to code using multiple threads, it quickly becomes clear that such threads will usually require some handling. For example, take the hypothetical case where we start five concurrent periods (and processes) of sleep in a Bash script:
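The original listing is not reproduced here. A minimal sketch of what such a rest.sh could look like, assuming a sleep duration of 10 seconds for each process:

```shell
#!/bin/bash
# rest.sh (sketch): start five sleep processes in the background.
# The 10-second duration is an assumption made for illustration.
sleep 10 &
sleep 10 &
sleep 10 &
sleep 10 &
sleep 10 &
```

The script itself ends as soon as the fifth process has been started; the five sleep processes keep running on their own until they finish.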
When we start the script (after making it executable using chmod +x rest.sh), we see no output! Even if we execute jobs (the command which shows any background jobs in progress), there is no output. Why?
The reason is that the shell which was used to start this script (i.e., the current shell) is not the shell that executed the actual sleep commands and placed them into the background (nor the same thread, if we start thinking of subshells as threads in and of themselves). That was the (sub)shell which was started when ./rest.sh was executed.
Let’s change our script by adding jobs inside the script. This will ensure that jobs is executed from within the (sub)shell where it is relevant: the same one in which the periods (and processes) of sleep were started.
This time we can see the list of background processes being started, thanks to the jobs command at the end of the script. We can also see their PIDs (Process Identifiers). These PIDs are very important when it comes to handling and managing background processes.
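A sketch of the changed script follows. Whether the original used plain jobs or jobs -l is an assumption here; -l is the option that makes jobs list the PID of each job, which matches the output described above. The 10-second sleeps are likewise assumed.

```shell
#!/bin/bash
# rest.sh (sketch): start five background sleeps, then list them from inside
# the script, i.e. from the shell that actually started them.
sleep 10 &
sleep 10 &
sleep 10 &
sleep 10 &
sleep 10 &

# -l adds the PID of each job to the listing.
jobs -l
```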
Another way to obtain the background Process Identifier is to query for it immediately after placing a program/process into the background:
Similar to our jobs command (with new PIDs now, as we restarted our rest.sh script), thanks to the Bash ${!} variable being echoed, we now see the five PIDs displayed almost immediately after the script starts: the various sleep processes were placed into background threads one after the other.
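A sketch of this variant, written as a loop for brevity. The pids array, the message text, and the 10-second duration are assumptions made for this example:

```shell
#!/bin/bash
# rest.sh (sketch): report the PID of each background process right after
# starting it. ${!} expands to the PID of the most recently started
# background process.
pids=()
for i in 1 2 3 4 5; do
  sleep 10 &
  echo "Started background process ${i} with PID ${!}"
  pids+=("${!}")   # remember the PID for later use (e.g. with wait or kill)
done
```

Reading ${!} immediately after the & matters: as soon as another background process is started, the variable is overwritten with the new PID.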
The wait Command
Once we have started our background processes, we have nothing further to do than wait for them to be finished. However, when each background process is executing a complex subtask, and we need the main script (which started the background processes) to resume executing when one or more of the background processes terminates, we need additional code to handle this.
Let’s expand our script now with the wait command to handle our background threads:
Here we expanded our script with two wait commands, which wait for the processes with the PIDs of the first and second threads to terminate. After 10 seconds, our first thread exits, and we are notified of this. Step by step, this script does the following: it starts five threads at almost the same time (though the starting of the threads itself is still sequential, not parallel), after which the five sleeps execute in parallel.
The main script then (sequentially) reports on the thread created and subsequently waits for the Process ID of the first thread to terminate. When that happens, it will sequentially report on the first thread finishing and commence a wait for the second thread to finish, etc.
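The steps above can be sketched as follows. The variable names (T1 to T5), the messages, and the final bare wait are assumptions added for this example, and the sleep durations are shortened to 2 and 3 seconds so that the sketch finishes quickly.

```shell
#!/bin/bash
# rest.sh (sketch): start five background sleeps, remember their PIDs,
# then wait for the first and the second one in turn.
sleep 2 & T1=${!}
sleep 3 & T2=${!}
sleep 3 & T3=${!}
sleep 3 & T4=${!}
sleep 3 & T5=${!}

echo "Started threads with PIDs ${T1} ${T2} ${T3} ${T4} ${T5}"

# wait <PID> blocks until that process ends and returns its exit status.
wait "${T1}"; status1=${?}
echo "Thread 1 (PID ${T1}) finished with exit status ${status1}"

wait "${T2}"; status2=${?}
echo "Thread 2 (PID ${T2}) finished with exit status ${status2}"

# Not part of the description above: a bare 'wait' blocks until all
# remaining background processes have finished.
wait
echo "All threads finished"
```

Capturing ${?} directly after each wait gives the main script the exit status of that particular background process, which is useful when the subtasks can fail.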
Using the Bash idioms &, ${!} and the wait command gives us great flexibility when it comes to running multiple threads in parallel (as background threads) in Bash.
Wrapping Up
In this article we explored the basics of multi-threaded scripting in Bash. We introduced the background process operator (&) using some easy-to-follow examples showing both single-threaded and multi-threaded sleep commands. Next, we looked at how to handle background processes via the commonly used Bash idioms ${!} and wait. We also explored the jobs command to see running background threads/processes.
If you enjoyed reading this article, have a look at our Bash Process Termination Hacks article.