Pipelines and automating workflows
Pipelines
Arguably, the shell’s most powerful feature is the pipeline, which lets us combine existing programs in new ways. A closely related feature is redirection: we can ask the shell to capture the output of a command in a file.
Input
$ ls -l > list.txt
The greater-than symbol, >, tells the shell to redirect the command’s output to a file instead of printing it to the screen (the default output). The shell will create the file if it does not exist, or silently overwrite it if it does.
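The overwrite behaviour is easy to miss, so here is a minimal sketch of it. It runs in a throwaway directory created with mktemp, and echo stands in for any command that prints output; both are choices made for this illustration only.

```shell
# Work in a throwaway directory so no real files are touched.
cd "$(mktemp -d)"

echo "first run" > list.txt     # list.txt does not exist yet, so it is created
echo "second run" > list.txt    # list.txt exists, so its contents are replaced

cat list.txt                    # only the second line is left in the file
```

Only the text from the second command survives, and the shell gives no warning that the first version was lost.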
To pass the output of one command to another command, we can use pipelines. The vertical bar, |, between two commands is called a pipe and tells the shell that we want to use the output of the command on the left as the input to the command on the right.
You can also use grep to search the history of commands run in the shell. To get a list of past commands:
Input
$ history
Now we use a pipeline to send the output of history to a grep command, which keeps only the lines that contain cd:
Input
$ history | grep cd
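The same idea can be tried without relying on your own command history. In the sketch below, printf produces three sample lines that stand in for the output of history, and wc -l (not used elsewhere in this lesson) counts lines; the sample text is made up for illustration.

```shell
# Three sample lines stand in for the output of a real command such as history.
printf 'cd projects\nls -l\ncd ..\n' | grep cd
# grep passes on only the lines that contain "cd":
#   cd projects
#   cd ..

# Pipes can be chained: wc -l counts the lines that grep let through.
printf 'cd projects\nls -l\ncd ..\n' | grep cd | wc -l
```

Each command in the chain only sees the output of the command to its left, so a pipeline can be extended one step at a time.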
Variables in Shell
As in other scripting languages, you can define variables in the shell and store parameters and values in them. A variable name can contain only letters, numbers, and the underscore character, and it cannot begin with a number. By convention, Unix shell variables have their names in UPPERCASE.
You can assign values to variables using the = sign, with no spaces on either side of it.
Input
$ NAME="shayan"
To retrieve the value stored in a variable, precede the variable name with the dollar sign ($). This is useful, for example, when you connect to remote machines to run tasks or processes: storing directory names or file names in variables makes them easier to reference and manipulate in your scripts and commands. To print the value of a variable, use the echo command followed by the variable name preceded by the dollar sign:
Input
$ echo $NAME
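As a minimal sketch of storing a directory name and a file name in variables, consider the following. BACKUP_DIR and REPORT are hypothetical names chosen for this illustration, and the directory is a throwaway one created with mktemp.

```shell
# BACKUP_DIR and REPORT are hypothetical names used only for this example.
BACKUP_DIR="$(mktemp -d)"    # a throwaway directory; note: no spaces around =
REPORT="report.txt"

# Wrapping $VARIABLE in double quotes keeps the value in one piece,
# even if it contains spaces.
echo "draft" > "$BACKUP_DIR/$REPORT"
echo "Saved $REPORT in $BACKUP_DIR"
```

Once the names live in variables, changing the directory or the file means editing one line rather than every command that mentions them.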
Automating with loops
Loops are used to repeat a command or set of commands for each item in a list. Similar to wildcards, using loops can save a lot of time and reduce the amount of typing required.
The structure of a loop is:
for thing in list_of_things
do
    operation_using $thing    # Indentation is not required, but it helps legibility.
done
For example, if you want to copy all of the files with the .doc extension and add backup_ in front of their names, you can use the following loop:
for filename in *.doc
do
    echo "$filename"
    cp "$filename" backup_"$filename"
done
Notice that as you write a loop in the shell, the prompt changes from $ to >, which means the shell is waiting for you to finish typing the command. A ; can be used to separate commands written on a single line.
In this example we are saying that every time a filename has the “.doc” extension, we want to take the name of the file, and then make a copy of it while prefixing the file with “backup_”. The result is a backup of every file with a .doc extension in our directory.
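Let’s say we make four files called one.doc, two.doc, three.doc, and four.doc. The touch command prints nothing itself, so we run ls afterwards to list the new files (ls sorts the names alphabetically).
Input
$ touch one.doc two.doc three.doc four.doc
$ ls
Output
four.doc
one.doc
three.doc
two.doc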
Let’s run our loop and see what happens. Note that we are telling the script to “echo” the filenames that are getting acted on. This is a way to check that the script is okay and to confirm which files we are affecting.
Input
$ for filename in *.doc
> do
>     echo "$filename"
>     cp "$filename" backup_"$filename"
> done
Output
four.doc
one.doc
three.doc
two.doc
If we now run ls in the working folder, we see the four backup files alongside the four originals.
Output
backup_four.doc
backup_one.doc
backup_three.doc
backup_two.doc
four.doc
one.doc
three.doc
two.doc
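The backup loop can also be written on a single line, using ; in place of the line breaks. The sketch below recreates the four sample files in a throwaway directory (created with mktemp, an assumption of this example) so it can be run safely anywhere.

```shell
# Work in a throwaway directory with the same four sample files.
cd "$(mktemp -d)"
touch one.doc two.doc three.doc four.doc

# The same loop on one line: each ; replaces a line break.
for filename in *.doc; do echo "$filename"; cp "$filename" backup_"$filename"; done

ls    # lists the four originals and their four backups
```

The shell expands *.doc into the list of matching names once, before the loop starts, so the backup copies created along the way are not themselves backed up again.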
Shell Scripts
We can take the commands we repeat frequently and save them in a file to re-run the operations again and again by running a single command. Shell scripts also make it possible to reproduce the same results simply by running your script rather than finding/remembering a long list of commands.
By convention, a shell script has a .sh extension, but it is nothing more than a text file containing commands. You can use nano to write a shell script.
Input
$ nano script.sh
To execute the commands in a shell script, we ask the shell (here, bash) to read and run them:
Input
$ bash script.sh
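To make this concrete, here is a sketch of what script.sh might contain and how it runs. The script body (a backup loop over .doc files) is a hypothetical example, and the cat command with a here-document simply stands in for typing the same lines into nano.

```shell
# Work in a throwaway directory with two sample files.
cd "$(mktemp -d)"
touch one.doc two.doc

# Write script.sh; the lines between the EOF markers are the file's contents,
# exactly as you would type them into nano.
cat > script.sh <<'EOF'
# Back up every .doc file in the current directory.
for filename in *.doc
do
    cp "$filename" backup_"$filename"
done
echo "Backup finished"
EOF

# Ask bash to read and run the commands stored in the file.
bash script.sh
```

Running bash script.sh again later repeats exactly the same steps, which is what makes the result reproducible.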
Downloading and archiving
The tar command archives multiple files into a single TAR file, a common Linux format similar to ZIP (although, unlike ZIP, a plain TAR file is not compressed). The basic syntax looks like this:
Input
$ tar [options] [archive_file] [file or directory to be archived]
For example, to create a new TAR archive named newarchive.tar from the contents of the /home/user/Documents directory:
Input
$ tar -cvf newarchive.tar /home/user/Documents
In the command above, c tells tar to create a new archive, v sets the output to verbose so that each file is shown on the screen as it is processed, and f indicates that the next argument is the file name of the archive. You can also extract a tar archive by passing -x, or list the contents of an archive with -t.
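The three operations can be tried together on a small example. This sketch uses a throwaway directory and a made-up Documents folder containing a single file, notes.txt; those names are assumptions for illustration and are not the /home/user/Documents directory mentioned above.

```shell
# Build a small sample directory inside a throwaway location.
cd "$(mktemp -d)"
mkdir Documents
echo "hello" > Documents/notes.txt

tar -cvf newarchive.tar Documents    # c: create, v: verbose, f: archive file name
tar -tf newarchive.tar               # t: list the contents without extracting

# Extract into a separate folder so the originals are not overwritten.
mkdir restored
cd restored
tar -xvf ../newarchive.tar           # x: extract the archive into the current directory
cat Documents/notes.txt
```

Extraction recreates the archived paths relative to the current directory, which is why the file reappears as Documents/notes.txt inside restored.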
To download content and files from web servers, you can use wget:
Input
$ wget [file-address]