Filter commands - wc, grep, head, tail, sort, awk

wc command - word count

wc command is used to count words, lines, characters, or bytes in a file or input.

What is the `wc` Command?

wc stands for "word count."
It is used to count the number of lines, words, characters, or bytes in files or standard input.
The wc command is one of the Linux filter commands: it reads data from a file or standard input, processes it, and writes the result to standard output.

Syntax of `wc`

The basic syntax of the wc command is:

wc [OPTION]... [FILE]...

Key Points:

If you provide a file, wc will analyze it.
If you don’t provide a file, wc will work with standard input (text you type or pipe into the command).

Basic Usage

Let’s start with some examples to understand how wc works:

Counting Lines, Words, and Characters in a File

wc example.txt
10 50 300 example.txt

This means:

10: Number of lines in the file.
50: Number of words in the file.
300: Number of bytes in the file (the same as the number of characters for plain ASCII text).
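
As a small sketch of how to read this output, the following creates a throwaway file and runs wc on it. The file name sample.txt and its contents are assumptions made for illustration only.

```shell
# Create a temporary directory and a two-line sample file (hypothetical content).
tmp=$(mktemp -d)
printf 'one two\nthree\n' > "$tmp/sample.txt"

# Default output: lines, words, bytes, then the file name.
wc "$tmp/sample.txt"

# Reading from standard input omits the file name, which is handy in scripts.
lines=$(wc -l < "$tmp/sample.txt")
words=$(wc -w < "$tmp/sample.txt")
bytes=$(wc -c < "$tmp/sample.txt")
echo "lines=$lines words=$words bytes=$bytes"

rm -rf "$tmp"
```

The file has 2 lines, 3 words, and 14 bytes (8 for the first line and 6 for the second, newlines included).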

Common Options in `wc`

You can control what wc displays by using options:

1. Count Lines (`-l`)

To display only the number of lines in a file:

wc -l example.txt

10 example.txt

2. Count Words (`-w`)

To display only the number of words:

wc -w example.txt

50 example.txt

3. Count Characters (`-m`)

To count the total number of characters:

wc -m example.txt

300 example.txt

4. Count Bytes (`-c`)

To count the number of bytes:

wc -c example.txt

300 example.txt

For plain ASCII text the byte count equals the character count. The two differ when the file contains multibyte characters, such as accented letters encoded in UTF-8.
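
The difference between -c and -m can be seen with a single accented letter. This is a minimal sketch; the sample word is made up, and the character count it reports depends on the locale your shell is running in.

```shell
# "héllo" plus a newline; \303\251 is the two-byte UTF-8 encoding of é.
text='h\303\251llo\n'

# Byte count: h + 2 bytes for é + l + l + o + newline.
byte_count=$(printf "$text" | wc -c)
echo "bytes: $byte_count"

# Character count: in a UTF-8 locale é counts as one character;
# in the C locale every byte is treated as a character.
char_count=$(printf "$text" | wc -m)
echo "characters: $char_count"
```

The byte count is 7. The character count is 6 in a UTF-8 locale and 7 in the C locale.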

5. Count Longest Line (`-L`)

To find the length of the longest line, use -L (an extension that is not part of POSIX wc, but is available in GNU wc):

wc -L example.txt

45 example.txt

This means the longest line in the file has 45 characters.

Using `wc` with Standard Input

The wc command can also be used with standard input (text you type directly or send via a pipe).

1. Entering Text Manually

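Run wc without a file name, type your text, and press Ctrl+D at the start of a new line when you're done. wc then prints the counts for what you typed.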
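
In a script, a here-document can stand in for typed input. The sketch below feeds two made-up lines to wc through standard input.

```shell
# wc reads the here-document from standard input, just as if it had been typed.
counts=$(wc -lw <<'EOF'
first line of text
second line
EOF
)

# Prints the line count followed by the word count.
echo "$counts"
```

For this input wc reports 2 lines and 6 words.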

2. Piping with Other Commands

You can combine wc with other commands using pipes (|).

Example 1: Counting Lines in a Directory Listing

ls -l | wc -l

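This counts the lines printed by ls -l. Because ls -l starts its output with a "total" line, the result is one more than the number of files and directories listed; use ls | wc -l to count the entries themselves.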

Example 2: Counting Lines and Words in a File (combining options)

cat example.txt | wc -lw
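
When counting directory entries through a pipe, keep in mind that ls -l prints an extra "total" line. A small sketch using a temporary directory with three hypothetical files:

```shell
tmp=$(mktemp -d)
touch "$tmp/a.txt" "$tmp/b.txt" "$tmp/c.txt"

# When ls writes to a pipe it prints one name per line: this counts the entries.
entries=$(ls "$tmp" | wc -l)

# ls -l adds a leading "total" line, so this count is one higher.
long_lines=$(ls -l "$tmp" | wc -l)

echo "entries=$entries long_lines=$long_lines"
rm -rf "$tmp"
```

Here the first count is 3 and the second is 4.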

Conclusion

The wc command is a simple but powerful tool for analyzing files and input. Whether you're working with logs, coding files, or just exploring Linux, it provides valuable insights with minimal effort.

grep command - global regular expression print

What is the `grep` Command?

The grep command is used to search for specific patterns or words in a file or output. It scans the input and prints the lines containing the pattern you're looking for.

Syntax of `grep`

The basic syntax is:

grep [OPTIONS] PATTERN [FILE]

Here’s what each part means:

PATTERN: The word or text you’re searching for.
FILE: The file(s) where you want to search.
OPTIONS: Flags that modify how the command works.

Basic Examples

1. Searching for a Word in a File

grep "Linux" sample.txt

This searches for the word "Linux" in sample.txt.
If found, it outputs the entire line containing "Linux".

2. Case-Insensitive Search

By default, grep is case-sensitive. To ignore case, use the -i option:

grep -i "linux" sample.txt

This will match "Linux", "LINUX", or any variation of capitalization.

3. Searching in Multiple Files

You can search across multiple files at once:

grep "error" file1.txt file2.txt

4. Show Line Numbers

To display the line number where the pattern is found, use the -n option:

grep -n "error" sample.txt

5. Highlight Matches

To highlight the matching text, use the --color option (many distributions enable it by default through a shell alias):

grep --color "Linux" sample.txt

6. Count Matches (`-c`)

To count the number of lines that contain the pattern (not the total number of occurrences):

grep -c "Linux" sample.txt
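
The distinction between matching lines and individual occurrences matters when a line contains the pattern more than once. A minimal sketch with made-up file contents, using -o (print only the matched text, one match per line) to count occurrences:

```shell
tmp=$(mktemp -d)
printf 'Linux is Linux\nI use Linux\nno match here\n' > "$tmp/sample.txt"

# -c counts matching lines: two lines contain "Linux".
matching_lines=$(grep -c "Linux" "$tmp/sample.txt")

# -o prints each match on its own line, so wc -l counts occurrences: three.
occurrences=$(grep -o "Linux" "$tmp/sample.txt" | wc -l)

echo "matching lines: $matching_lines, occurrences: $occurrences"
rm -rf "$tmp"
```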

7. Invert Match (`-v`)

To display lines that do not contain the pattern:

grep -v "Linux" sample.txt

8. Search for Whole Words (`-w`)

To match the pattern only as a whole word (so that "Linux" does not match inside "Linuxes"):

grep -w "Linux" sample.txt
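
A quick sketch of what -w changes, using three made-up lines:

```shell
tmp=$(mktemp -d)
printf 'Linux\nLinuxes\nGNU/Linux\n' > "$tmp/sample.txt"

# Without -w, "Linux" also matches inside "Linuxes": all three lines match.
substring_matches=$(grep -c "Linux" "$tmp/sample.txt")

# With -w, the match must be bounded by non-word characters or line edges,
# so "Linuxes" is skipped while "GNU/Linux" still matches.
word_matches=$(grep -cw "Linux" "$tmp/sample.txt")

echo "substring: $substring_matches, whole word: $word_matches"
rm -rf "$tmp"
```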

9. Recursive Search (`-r`)

To search in all files within a directory (including subdirectories):

grep -r "Linux" /home/user/documents

10. Show Only Filenames (`-l`)

To display only the names of files containing the pattern:

grep -l "error" *.txt

Using `grep` with Pipes

The grep command becomes even more powerful when combined with pipes (|), allowing you to filter the output of other commands.

1. Filtering the Output of `ls`

ls -l | grep "txt"

This filters the listing and shows only the lines containing "txt", which usually means files with "txt" in their names.

2. Searching Logs

cat logfile.txt | grep "ERROR"

This will search for the word "ERROR" in a log file.

Regular Expressions in `grep`

The grep command supports regular expressions for more advanced searches:

1. Search for Multiple Patterns

grep -E "error|warning" sample.txt

This matches lines containing either "error" or "warning".

2. Match Lines Starting with a Pattern

grep "^Linux" sample.txt

3. Match Lines Ending with a Pattern

grep "Linux$" sample.txt
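
The three patterns above can be tried against one small file. This is an illustrative sketch; the file contents are invented for the example. Single quotes keep the shell from interpreting the $ in the pattern.

```shell
tmp=$(mktemp -d)
cat > "$tmp/sample.txt" <<'EOF'
Linux is free
I like Linux
error: disk full
warning: low memory
all good
EOF

# ^ anchors the pattern to the start of the line.
starts=$(grep '^Linux' "$tmp/sample.txt")

# $ anchors the pattern to the end of the line.
ends=$(grep 'Linux$' "$tmp/sample.txt")

# -E enables extended regular expressions, where | means "or".
problems=$(grep -cE 'error|warning' "$tmp/sample.txt")

echo "starts: $starts"
echo "ends: $ends"
echo "problem lines: $problems"
rm -rf "$tmp"
```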

Conclusion

grep is a very useful command for searching text for patterns. It has several advanced features and variants, such as egrep and fgrep (deprecated in current GNU grep in favor of grep -E and grep -F). Explore the manual (man grep) to learn more.

head command

What is the `head` Command?

The head command is used to display the first few lines of a file. By default, it shows the first 10 lines, but you can customize the number of lines to view.

Think of it as a quick peek at the beginning of a file to understand what it contains.

Syntax of the `head` Command

The basic syntax of the command is:

head [OPTIONS] [FILE]

Here’s what each part means:

OPTIONS: Modify the behavior of the head command.
FILE: The file(s) you want to read.

How to Use the `head` Command

Let’s start with a simple example.

1. Display the First 10 Lines of a File

head example.txt

This will display the first 10 lines of the file example.txt.

2. Specify the Number of Lines to Display

If you want a different number of lines, use the -n option to specify how many to display.

head -n 5 example.txt

This will display the first 5 lines of the file.

3. Display Multiple Files

You can use head to view the first few lines of multiple files at the same time.

head file1.txt file2.txt

This will display the first 10 lines of both file1.txt and file2.txt. Each file's output is preceded by a header showing its name.

4. Combine `head` with Other Commands

The head command can be used with other commands using pipes (|). For example:

ls -l | head -n 3

This shows the first 3 lines of the output from the ls -l command.

5. Display Bytes Instead of Lines (`-c`)

The -c option lets you display a specific number of bytes instead of lines:

head -c 50 example.txt

This displays the first 50 bytes of the file. For plain ASCII text that is 50 characters, with spaces and newlines counted too.
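
The line and byte options can be compared on a small file of numbered lines. A minimal sketch, assuming a hypothetical numbers.txt with the values 1 to 12, one per line:

```shell
tmp=$(mktemp -d)
printf '%s\n' 1 2 3 4 5 6 7 8 9 10 11 12 > "$tmp/numbers.txt"

# With no options, head prints the first 10 lines.
default_count=$(head "$tmp/numbers.txt" | wc -l)

# -n 3 prints lines 1, 2 and 3.
first_three=$(head -n 3 "$tmp/numbers.txt")

# -c 4 prints the first four bytes: "1", newline, "2", newline.
first_bytes=$(head -c 4 "$tmp/numbers.txt")

echo "default line count: $default_count"
echo "$first_three"
rm -rf "$tmp"
```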

6. Quiet Mode (`-q`)

If you’re displaying multiple files and don’t want to see their names in the output, use the -q option:

head -q file1.txt file2.txt

Common Use Cases for `head`

Quickly Preview a File: Use head to check the first few lines of a large file before opening it.
View Log Files: System administrators use head to inspect the start of log files:
```
head /var/log/syslog
```
Check File Structure: When dealing with data files (like CSVs), use head to check the header or column names.

Conclusion

The head command is an essential tool for quickly examining files in Linux. It’s simple to use, but it can save you a lot of time when working with large files. With options like -n and -c, you can customize it to suit your needs.

tail command

This command is the opposite of the head command, and it is used to view the last few lines of a file.

What is the `tail` Command?

The tail command is used to display the last lines of a file. By default, it shows the last 10 lines, but just like the head command, you can customize the number of lines or bytes to display. Additionally, tail has a unique ability to monitor files in real-time.

Syntax of the `tail` Command

Here’s the basic syntax:

tail [OPTIONS] [FILE]

OPTIONS: Modify the behavior of the tail command.
FILE: The file you want to read.

How to Use the `tail` Command

1. Display the Last 10 Lines of a File

Let’s start with a simple example. If you want to see the last 10 lines of a file named example.txt, use:

tail example.txt

2. Display a Custom Number of Lines

You can use the -n option to specify the exact number of lines you want to display:

tail -n 5 example.txt

3. Display the Last N Bytes

Instead of lines, you can display the last N bytes of a file using the -c option:

tail -c 20 example.txt

This will display the last 20 bytes of the file. For plain ASCII text that is 20 characters, with spaces and newlines counted too.
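
The same options can be tried on a small file of numbered lines. The sketch below assumes a hypothetical numbers.txt holding 1 to 12, one per line, and also shows how head and tail can be combined to pick out a range of lines.

```shell
tmp=$(mktemp -d)
printf '%s\n' 1 2 3 4 5 6 7 8 9 10 11 12 > "$tmp/numbers.txt"

# -n 2 prints the last two lines: 11 and 12.
last_two=$(tail -n 2 "$tmp/numbers.txt")

# -c 3 prints the last three bytes: "1", "2" and the final newline.
last_bytes=$(tail -c 3 "$tmp/numbers.txt")

# Take the first 5 lines, then the last 2 of those: lines 4 and 5.
lines_4_to_5=$(head -n 5 "$tmp/numbers.txt" | tail -n 2)

echo "$last_two"
echo "$lines_4_to_5"
rm -rf "$tmp"
```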

4. Monitor a File in Real-Time

The -f option (short for follow) is one of the most powerful features of the tail command. It is used to monitor a file continuously as new data is added. This is particularly useful for watching log files.

tail -f /var/log/syslog

This will display the last 10 lines of the file and update the output in real-time whenever new lines are added.

5. View Multiple Files

You can use tail with multiple files at once:

tail file1.txt file2.txt

This will display the last 10 lines of each file, separating the output with headers.

Common Use Cases for `tail`

Viewing Logs: System administrators often use tail -f to monitor log files for troubleshooting:
tail -f /var/log/auth.log
Checking the End of a File: Quickly check the last few lines of a data or text file:
tail data.csv
Debugging Applications: Developers monitor output logs to debug applications in real-time:
tail -f app.log

Conclusion

The tail command is an essential tool for inspecting the end of files and monitoring them in real-time. Whether you’re debugging, analyzing logs, or just peeking at a file’s content, tail is your go-to command.

sort command

What is the `sort` Command?

The sort command in Linux is used to arrange lines of text in a file in either ascending or descending order. Sorting is performed based on the alphabetical or numerical values of the data.

Syntax of the `sort` Command

Here’s the basic syntax:

sort [OPTIONS] [FILE]

OPTIONS: Specify how you want the data to be sorted.
FILE: The name of the file containing the data to sort. If no file is specified, sort reads from standard input (e.g., data you type or pipe from another command).

How to Use the `sort` Command

1. Sorting Alphabetically (Default)

By default, sort arranges lines in alphabetical order. For example, to sort a file called names.txt:

sort names.txt

This prints the lines of names.txt in alphabetical order. The file itself is not changed.

2. Sorting in Reverse Order

To sort the lines in reverse order, use the -r option:

sort -r names.txt

3. Sorting Numerically

If your file contains numbers, use the -n option to sort them by numeric value. Without -n, sort compares lines as text, so 10 is placed before 9. For a file named numbers.txt:

sort -n numbers.txt

For reverse numerical order:

sort -n -r numbers.txt
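
The effect of -n is easiest to see by sorting the same input with and without it. A minimal sketch with three made-up numbers:

```shell
# Text comparison: "10" and "100" sort before "9" because "1" comes before "9".
text_order=$(printf '10\n9\n100\n' | sort)

# Numeric comparison: 9, 10, 100.
numeric_order=$(printf '10\n9\n100\n' | sort -n)

# Numeric comparison, reversed: 100, 10, 9.
reverse_numeric=$(printf '10\n9\n100\n' | sort -n -r)

echo "$text_order"
echo "$numeric_order"
echo "$reverse_numeric"
```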

4. Ignoring Case While Sorting

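How sort treats uppercase and lowercase letters depends on the locale. In the C (POSIX) locale, sort compares byte values, so all uppercase letters are sorted before lowercase ones. To ignore case when comparing, use the -f option: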

sort -f names.txt
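
Because the default ordering depends on the locale, the sketch below pins the locale to C with LC_ALL=C so that the result is predictable. The three names are made up for illustration.

```shell
# Byte order in the C locale: uppercase "B" sorts before lowercase "a".
byte_order=$(printf 'apple\nBanana\ncherry\n' | LC_ALL=C sort)

# -f folds lowercase to uppercase for the comparison, so case is ignored.
folded=$(printf 'apple\nBanana\ncherry\n' | LC_ALL=C sort -f)

echo "$byte_order"
echo "$folded"
```

The first command prints Banana, apple, cherry; the second prints apple, Banana, cherry.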

5. Sorting by a Specific Column

The -k option allows you to sort based on a specific column (field). Suppose we have a file names.txt containing a name and an age on each line:

Alice 25
Bob 30
Charlie 20
Diana 22

To sort by age (the second field), numerically:

sort -k 2n names.txt
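
A key written as -k 2 runs from the second field to the end of the line. Writing it as -k 2,2 limits the key to the second field only, which is usually what is wanted when a file has more columns. A sketch using the same data:

```shell
tmp=$(mktemp -d)
cat > "$tmp/names.txt" <<'EOF'
Alice 25
Bob 30
Charlie 20
Diana 22
EOF

# Sort on field 2 only (2,2), comparing it as a number (n).
by_age=$(sort -k 2,2n "$tmp/names.txt")

echo "$by_age"
rm -rf "$tmp"
```

This prints Charlie 20, Diana 22, Alice 25, Bob 30, one per line.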

6. Removing Duplicate Lines

If your file contains duplicate lines and you want to keep only unique entries while sorting, use the -u option:

sort -u names.txt

7. Sorting by Month Names

To sort lines containing month names in chronological order, use the -M option. For example:

sort -M months.txt
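
A minimal sketch of month sorting with three abbreviated month names as made-up input. Month names are locale-dependent, so the example pins the locale to C, where the English abbreviations are recognized.

```shell
# -M compares the first three letters of each line as a month name.
sorted_months=$(printf 'Mar\nJan\nFeb\n' | LC_ALL=C sort -M)

echo "$sorted_months"
```

This prints Jan, Feb, Mar, one per line.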

8. Sorting by File Size or Numeric Strings

If a file contains sizes with unit suffixes such as K, M, or G, use the -h option (human-readable sizes) to compare them by magnitude. For example, with a file file_sizes:

10K
500K

sort -h file_sizes

Output:

10K
500K

Using `sort` with Other Commands

The sort command becomes even more powerful when combined with other commands using pipes (|).

Example: Sorting the Output of `ls`

To list files in a directory and sort them alphabetically:

ls | sort

To sort a long listing by file size (the fifth column of ls -l), largest first:

ls -l | sort -k 5,5nr
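
When reversing a sort on a specific column, it is safest to attach the r modifier to the key itself. Ordering options attached to a key (such as n) override global options like -r for that key, so a separate -r may not reverse the column as expected. A minimal sketch with made-up two-column data:

```shell
data='a 1\nb 3\nc 2\n'

# Field 2 only (2,2), numeric (n), reversed (r): largest value first.
largest_first=$(printf "$data" | sort -k 2,2nr)

echo "$largest_first"
```

This prints b 3, c 2, a 1, one per line.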

Common Options for `sort`

Option	Description
`-r`	Sorts in reverse order.
`-n`	Sorts numerically.
`-f`	Ignores case when comparing lines.
`-k`	Sorts by a specific column or field.
`-u`	Removes duplicate lines while sorting.
`-M`	Sorts by month names.
`-h`	Sorts by human-readable sizes (e.g., 1K, 1M, 1G).
`-o`	Writes the sorted output to a file (e.g., `sort file.txt -o sorted.txt`).

Conclusion

The sort command is incredibly versatile and helps you organize text data efficiently. Whether you're working with logs, data files, or even file lists, sort has an option to meet your needs.

awk command

The awk command is a text-processing tool that allows you to:

Search for patterns.
Extract specific fields or columns from a file.
Perform calculations and formatting on text data.

It gets its name from its creators: Aho, Weinberger, and Kernighan.

Syntax of the `awk` Command

Here’s the general syntax:


awk 'pattern {action}' file

pattern: Specifies the condition to match (optional).
action: Defines what to do when the pattern is matched.
file: The file you want to process.

Basic Structure of an `awk` Program

An awk command can contain three sections:

BEGIN block: Executed once before processing the file.
Pattern-action statements: Applied to each line of the file.
END block: Executed once after processing the file.

Example:

awk 'BEGIN {print "Start"} {print $0} END {print "End"}' file.txt

How to Use `awk`

1. Print the Entire File

By default, awk processes each line of a file. To print the entire content:

awk '{print}' file.txt

2. Print Specific Columns

To extract specific columns (fields), use $ followed by the column number. For example, with a file data.txt:

Alice     25 HR
Bob       30 IT
Charlie   28 Finance

To print the names (1st column):

awk '{print $1}' data.txt

Output:

Alice
Bob
Charlie

To print the names and ages (1st and 2nd columns):

awk '{print $1, $2}' data.txt

Output:

Alice 25
Bob 30
Charlie 28

3. Filter Lines Based on Patterns

You can specify patterns to filter lines. For example, print lines where the department is "IT":

awk '$3 == "IT"' data.txt

Output:

Bob       30 IT

4. Perform Calculations

You can use awk to perform calculations on numerical fields. For example, using the names and ages from data.txt:

Alice 25
Bob 30
Charlie 28

To add 5 years to each person's age:

awk '{print $1, $2 + 5}' data.txt

Output:

Alice 30
Bob 35
Charlie 33
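
Calculations can also run across lines by accumulating a value in a variable and printing it in the END block. A minimal sketch that sums the ages, using the same names and ages as above written to a temporary file:

```shell
tmp=$(mktemp -d)
printf 'Alice 25\nBob 30\nCharlie 28\n' > "$tmp/data.txt"

# Add field 2 to sum on every line, then print the total after the last line.
total=$(awk '{sum += $2} END {print sum}' "$tmp/data.txt")

echo "total age: $total"
rm -rf "$tmp"
```

The total is 83 (25 + 30 + 28).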

5. Use Field Separator

By default, awk assumes fields are separated by spaces or tabs. To specify a different delimiter, use the -F option. For example, with a CSV file data.csv:

Alice,25,HR
Bob,30,IT
Charlie,28,Finance

To extract the first column:

awk -F ',' '{print $1}' data.csv

Output:

Alice
Bob
Charlie

6. Add a Header Using the BEGIN Block

To add a custom header to your output:

awk 'BEGIN {print "Name Age Department"} {print $1, $2, $3}' data.txt

Output:

Name Age Department
Alice 25 HR
Bob 30 IT
Charlie 28 Finance

7. Count the Number of Lines

To count the total number of lines in a file:

awk 'END {print NR}' data.txt

Output:

3

8. Use Conditions

You can use conditions like greater than (>) or less than (<). For example, to print the lines where the age is greater than 25:

awk '$2 > 25' data.txt

Output:

Bob       30 IT
Charlie   28 Finance

9. Combine Multiple Actions

You can combine actions using ;. For example:

awk '{print $1; print $2}' data.txt

Output:

Alice
25
Bob
30
Charlie
28

10. Write Output to a File

To save the output to a new file:

awk '{print $1, $2}' data.txt > output.txt

Key Built-in Variables in `awk`

Variable	Description
`$0`	The entire line.
`$1, $2`	The first, second, etc., fields.
`NR`	The number of the current record (line), counted across all input files.
`NF`	The number of fields in the current line.
`FS`	The input field separator (default is a single space, which makes awk split on any run of spaces or tabs).
`OFS`	The output field separator (default is a single space).
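
Several of these variables can be combined in one command. The sketch below uses the CSV data from earlier, written to a temporary file: -F sets FS to a comma, OFS is set in the BEGIN block, and each output line shows the record number, the name, and the number of fields.

```shell
tmp=$(mktemp -d)
printf 'Alice,25,HR\nBob,30,IT\nCharlie,28,Finance\n' > "$tmp/data.csv"

# OFS is inserted wherever print's arguments are separated by commas.
report=$(awk -F ',' 'BEGIN {OFS = " | "} {print NR, $1, NF}' "$tmp/data.csv")

echo "$report"
rm -rf "$tmp"
```

The first line of output is "1 | Alice | 3".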

Conclusion

The awk command is like a mini-programming language designed for text processing. It’s incredibly versatile and can save you time when analyzing or manipulating data. With practice, you’ll be able to create more complex awk scripts for real-world tasks.