The tr command is designed to translate, squeeze and/or delete characters in standard input (stdin) and write them to standard output (stdout). It is often used on the command line for string manipulation in a pipeline.
If you are unsure of what stdin and stdout are, we recommend you read our "Introduction to Linux IO, Standard Streams, and Redirection".
In this article we will demonstrate the basic usage of the tr command with practical examples beyond converting lowercase to uppercase. We also take a VERY interesting and in-depth look at two oddly named options, complement and truncate.
Table of Contents
Using tr Command to Delete Characters
The most common usage for tr is to delete characters from an input stream. You can use the -d (--delete) option followed by the character, set of characters or an interpreted sequence.
Delete all instances of lower case "t":
$ echo "Tongue tied and twisted just an earth bound misfit" | tr -d t
Tongue ied and wised jus an earh bound misfi
Remove/Delete all instances of the letter "t" lowercase or uppercase:
$ echo "Tongue tied and twisted just an earth bound misfit" | tr -d 'Tt'
ongue ied and wised jus an earh bound misfi
Delete all instances of letters in the range of "a-j":
$ echo "Tongue tied and twisted just an earth bound misfit" | tr -d 'a-j'
Tonu t n twst ust n rt oun mst
You can also use interpreted sequences, for example, to delete all uppercase letters:
$ echo "Tongue tied and twisted just an earth bound misfit" | tr -d [:upper:]
ongue tied and twisted just an earth bound misfit
We will discuss interpreted sequences more later in the article.
Using tr Command for Substitution
The tr command can be used for character substitution. It takes two sets of characters as arguments and takes the characters in the first argument and replaces them with the characters in the second argument.
Replace all lower case "t" with a uppercase "T":
$ echo "Tongue tied and twisted just an earth bound misfit" | tr 't' 'T'
Tongue Tied and TwisTed jusT an earTh bound misfiT
You can also use ranges here as well. As an extreme example, here we will replace letters a-d with the numbers 1-4. This effectively maps a to 1, b to 2, c to 3, and d to 4. Example:
$ echo "Tongue tied and twisted just an earth bound misfit" | tr 'a-d' '1-4'
Tongue tie4 1n4 twiste4 just 1n e1rth 2oun4 misfit
One of the most famous uses of the tr command is using it to substitute all lowercase for uppercase letters.
$ echo "Tongue tied and twisted just an earth bound misfit" | tr [:lower:] [:upper:]
TONGUE TIED AND TWISTED JUST AN EARTH BOUND MISFIT
Using tr Command to Remove Repeated Characters (Squeeze)
Another champion option of the tr command is -s (--squeeze-repeats). Squeeze means to take any number of repeated characters and replace then with a single instance of that character. This is often used to remove double spacing or excessive empty lines.
To demonstrate this I created the following test.txt file:
$ cat test.txt
First Line with weird spacing.
Followed by a bunch of empty lines.
Then the last line with weird spacing.
We will use the tr command to squeeze any repeated spaces into a single space using the -s option.
$ cat test.txt | tr -s " "
First Line with weird spacing.
...OUTPUT TRUNCATED...
Now we will use the tr command to squeeze any repeated new lines.
$ cat test.txt | tr -s "\n"
First Line with weird spacing.
Followed by a bunch of empty lines.
Then the last line with weird spacing.
A little more straight forward example:
$ echo "putorius,,,,,,,,,,,,,,,net" | tr -s ,
putorius,net
As you can see, tr removed all the repeated commas and replaced them with a single comma.
You can also specify which character you want to replace it with. For example, it makes no sense to use a comma in a domain name. Let's replace the long string of commas with a single period.
$ echo "putorius,,,,,,,,,,,,,,,,,,,,,net" | tr -s "," "."
putorius.net
Using tr Command Complement Option
I am not too proud to admit it took me a while to understand this option when I first came across it. The man page simply says "use the complement of SET1" which at first just confused me. But after I thought about it for a while and did some additional testing it started to make sense.
Think about the complete set of alphanumeric characters, which basically is A-Z,a-z, and 0-9. Now take the character "s". The complement of "s" is the rest of the alphanumeric characters. The "s" would need all the other characters to make it a complete set.
Complement Definition: A number or quantity of something, especially that required to make a group complete.
-Google Dictionary
Another (more simple) way to look at it...
The tr command complement option is basically a logical NOT.
- Me (My aha! moment)
The -c (--complement) option is basically a logical not. Here is an example:
Replace anything that is NOT a letter "a" with a letter "x":
$ echo "Tongue tied and twisted just an earth bound misfit" | tr -c "a" "x"
xxxxxxxxxxxxaxxxxxxxxxxxxxxxxaxxxaxxxxxxxxxxxxxxxxx
Replace anything that is NOT a digit with an "X":
$ echo "1 mississippi 2 mississippi 3 mississippi" | tr -c [:digit:] X
1XXXXXXXXXXXXX2XXXXXXXXXXXXX3XXXXXXXXXXXXX
Delete anything that is NOT a digit:
$ echo "1 Mississippi 2 Mississippi 3 Mississippi" | tr -cd [0-9]
123
I am baffled as to why the creators decided to call it complement. It seems like an odd way to name something. But I am going to assume that they are much smarter than I and have a darn good reason for it.
Using the tr Command Truncate Option
Here is another option that had my head spinning for a while. If you first read the word truncate you would assume that you are going to use this option to truncate the input string.
It is actually used to truncate the first argument to the size of the second argument. This changes the way the translation works.
Typically the second argument would be extended to match the size of the first set. For example:
tr "happy" "sad"
Since happy is longer than sad, the tr command extends sad to saddd.
SET2 is extended to length of SET1 by repeating its last character as necessary.
-tr man page
So the -t (truncate) option reverses that logic and truncates the first set to the size of the second set. Which turns this:
tr "happy" "sad"
into this:
tr "hap" sad"
I hope it comes across is my writing as well as it is coming across in my head. Let's run through some examples.
$ echo "Pink Floyd is awesome in concert" | tr in 9
P99k Floyd 9s awesome 99 co9cert
In the example above set 2 (9) is shorter that set 1 (in) so set 2 is extended by repeating the last character. Essentially making it 99:
$ echo "Pink Floyd is awesome in concert" | tr in 99
P99k Floyd 9s awesome 99 co9cert
Now let's add the -t option which will shorten (truncate) set 1 (in) to match the length of set 2 (9).
$ echo "Pink Floyd is awesome in concert" | tr -t in 9
P9nk Floyd 9s awesome 9n concert
Ahhh, I love it when a plan comes together.
Interpreted Sequences
Interpreted sequences are used in many Linux commands. They are basically a short way of denoting a larger set of characters.
Any digit [0-9]:
[:digit:]
Lowercase letters:
[:lower:]
Letters:
[:alpha:]
Upper case letters:
[:upper:]
You can find a complete list of interpreted sequences used by tr in the man page.
Conclusion
The tr command may not have the most intuitive options, but it certainly is a useful tool. Even if you have no intention of using the more complex options (Complement/Truncate) I hope you found it to be as interesting as I did. Feel free to leave any comments below.
Resources
Leave a Reply Cancel reply
This site uses Akismet to reduce spam. Learn how your comment data is processed.
2 Comments
Join Our Newsletter
Categories
- Bash Scripting (17)
- Basic Commands (51)
- Featured (7)
- Just for Fun (5)
- Linux Quick Tips (98)
- Linux Tutorials (65)
- Miscellaneous (15)
- Network Tools (6)
- Reviews (2)
- Security (32)
What a great little read!
It's nice to see some uses other than the age old capitalization sanitizing.
Just be aware that tr is old as dirt, and as a result doesn't support any characters outside of the ASCII range.