
Complete Guide to Text Processing with AWK on Linux

Editor: Hendra Wijaya
Updated: 3 February 2026
Reading time: 4 minutes


AWK is a powerful programming language for text processing. Created by Alfred Aho, Peter Weinberger, and Brian Kernighan (hence the name AWK), it is highly effective for manipulating structured data such as log files, CSV, and command output.

1. Introduction to AWK and Basic Syntax

AWK Program Structure

pattern { action }
  • Pattern: the condition that must be met for the action to run
  • Action: the command(s) executed when the pattern matches
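A minimal sketch of the pattern { action } structure, run on inline sample data (the values and threshold here are made up for illustration):

```shell
# Pattern: $2 > 10 — Action: print $1.
# Only lines whose second field exceeds 10 print their first field.
printf 'apple 5\nbanana 12\ncherry 20\n' | awk '$2 > 10 {print $1}'
# prints: banana
#         cherry
```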

AWK One-Liners

# Print the entire file
awk '{print}' file.txt

# Print a specific line (line 5)
awk 'NR==5' file.txt

# Print lines matching a pattern
awk '/pattern/' file.txt

# Print specific fields (default delimiter: whitespace)
awk '{print $1}' file.txt    # First field
awk '{print $NF}' file.txt   # Last field
awk '{print $1, $3}' file.txt # Fields 1 and 3

2. Field Processing and Delimiters

Using a Custom Delimiter

# CSV file with comma delimiter
awk -F',' '{print $1, $2}' data.csv

# TSV file with tab delimiter
awk -F'\t' '{print $1}' data.tsv

# Multiple delimiters
awk -F'[:,]' '{print $1}' file.txt  # Delimiter : or ,

# Regular expression as delimiter
awk -F'[ \t]+' '{print $1}' file.txt  # Whitespace

Field Operations

# Print with a custom separator
awk '{print $1 " - " $2}' file.txt

# Calculate the total of a field
awk '{sum += $3} END {print sum}' numbers.txt

# Average of a field
awk '{sum += $1; count++} END {print sum/count}' data.txt

# Find max/min
awk 'max < $1 || NR==1 {max = $1} END {print max}' data.txt
awk 'min > $1 || NR==1 {min = $1} END {print min}' data.txt

# Count non-empty fields
awk '{for(i=1;i<=NF;i++) if($i!="") count++} END {print count}' file.txt
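The sum and average one-liners above can be sanity-checked on inline data; the numbers here are arbitrary:

```shell
# Sum and average of column 1 over three sample values.
printf '10\n20\n30\n' | awk '{sum += $1; count++} END {print sum, sum/count}'
# prints: 60 20
```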

3. Pattern Matching

Pattern Types

# Exact match
awk '$1 == "value"' file.txt

# Regex match
awk '$2 ~ /regex/' file.txt

# Negation
awk '$2 !~ /regex/' file.txt

# Numeric comparison
awk '$3 > 100' file.txt
awk '$3 < 50' file.txt
awk '$3 >= 100 && $3 <= 200' file.txt

# String comparison
awk '$1 > "m"' file.txt  # Alphabetic comparison

# Multiple conditions
awk '$1 == "admin" && $3 > 1000' file.txt
awk '$1 == "user" || $1 == "admin"' file.txt

Built-in Patterns

# BEGIN - runs before the file is processed
awk 'BEGIN {print "Header"} {print}' file.txt

# END - runs after the file is processed
awk '{sum += $1} END {print "Total:", sum}' file.txt

# NR - Record/line number
awk 'NR==1 {print "First line"}' file.txt
awk 'NR%2==0' file.txt  # Even lines
awk 'NR>1 && NR<=10' file.txt  # Lines 2-10

# NF - Number of fields
awk 'NF > 5' file.txt  # Lines with more than 5 fields
awk 'NF == 0 {print "Empty line"}' file.txt
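NR and NF combine naturally; as a sketch, here is how to skip a header row and blank lines in one pass (the sample data is invented for illustration):

```shell
# NR > 1 skips the header line; NF > 0 skips empty lines.
printf 'name value\nfoo 1\n\nbar 2\n' | awk 'NR > 1 && NF > 0 {print $1}'
# prints: foo
#         bar
```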

4. Log Processing with AWK

Apache/Nginx Log Analysis

# Count requests per IP
awk '{print $1}' access.log | sort | uniq -c | sort -rn | head -20

# Or with AWK alone
awk '{count[$1]++} END {for(ip in count) print count[ip], ip}' access.log | sort -rn | head -20

# Find 404 errors with referer (combined log format: status is $9, referer is $11)
awk '$9 == 404 {print $1, $7, $11}' access.log

# Average response time (assumes a custom log format that records response time in field 10)
awk '{sum += $10; count++} END {print "Avg response time:", sum/count "ms"}' access.log

# Find most requested URLs
awk '{url[$7]++} END {for(u in url) print url[u], u}' access.log | sort -rn | head -10

# Bandwidth usage per IP
awk '{bytes[$1] += $10} END {for(ip in bytes) print bytes[ip], ip}' access.log | sort -rn | head -20
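The associative-array counting idiom used above can be sketched on fabricated log lines (only the first field, the client IP, matters here):

```shell
# count[] is an associative array keyed by IP; END iterates over the keys.
printf '1.1.1.1 GET /\n2.2.2.2 GET /\n1.1.1.1 GET /about\n' | \
  awk '{count[$1]++} END {for (ip in count) print count[ip], ip}' | sort -rn
# prints: 2 1.1.1.1
#         1 2.2.2.2
```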

System Log Analysis

# Error count per hour from syslog
awk '/error/ {hour=substr($3,1,2); count[hour]++} END {for(h in count) print h, count[h]}' /var/log/syslog

# Failed SSH login attempts ($(NF-3) is the source IP even on "invalid user" lines,
# where fixed positions like $11 shift)
awk '/Failed password/ {print $(NF-3)}' /var/log/auth.log | sort | uniq -c | sort -rn | head -10

# Total disk space used, from df output (drop -h so $3 is numeric 1K blocks)
df | awk 'NR>1 {sum += $3} END {print "Total used:", sum/1024/1024, "GB"}'

5. AWK Scripting and File Processing

Multi-line AWK Script

Save as process.awk:

#!/usr/bin/awk -f

BEGIN {
    FS=","
    OFS=" | "
    print "Name", "Department", "Salary"
    print "----", "----------", "------"
}

{
    total += $3
    count++
    
    if ($3 > 50000) {
        print $1, $2, "$" $3
    }
}

END {
    print ""
    print "Average Salary: $", total/count
    print "Total Employees:", count
}

Run it with:

chmod +x process.awk
./process.awk employees.csv
# or
awk -f process.awk employees.csv
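A quick way to sanity-check a script like this is a throwaway CSV with the same name,department,salary layout; the file path and values below are hypothetical:

```shell
# Create a two-row sample matching the expected column layout.
printf 'Alice,Engineering,60000\nBob,Sales,40000\n' > /tmp/employees.csv
# Same filter as the script body: salaries above 50000.
awk -F',' '$3 > 50000 {print $1}' /tmp/employees.csv
# prints: Alice
rm /tmp/employees.csv
```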

Data Transformation

# Convert CSV to TSV
awk 'BEGIN {FS=","; OFS="\t"} {$1=$1; print}' input.csv > output.tsv

# Format currency
awk '{printf "$%.2f\n", $1}' prices.txt

# Pad numbers with leading zeros
awk '{printf "%04d\n", $1}' numbers.txt

# Date formatting
awk '{gsub(/-/,"/"); print}' dates.txt  # Replace - with /
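The $1=$1 trick in the CSV-to-TSV example above works because assigning to any field makes awk rebuild $0 using OFS; a sketch on inline input:

```shell
# Without the assignment, print would emit $0 unchanged (commas intact);
# $1=$1 forces the record to be rejoined with the tab OFS.
printf 'a,b,c\n' | awk 'BEGIN {FS=","; OFS="\t"} {$1=$1; print}'
# prints: a	b	c  (tab-separated)
```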

Conclusion

AWK is a very powerful tool for text processing and data extraction. By combining pattern matching, field processing, and arithmetic, AWK can replace many more complex text processing tools.

When to Use AWK:

  • Processing structured text data (CSV, TSV, logs)
  • Extracting and transforming data
  • Calculations on data
  • Reporting and summarization
  • One-liner text processing

Alternatives:

  • sed for simple text substitution
  • grep for pattern matching
  • cut for simple field extraction
  • perl for complex scripting

Tips:

  • Always test on sample data first
  • Use -F to set the delimiter
  • Print intermediate results when debugging
  • Combine with pipes for complex workflows


Post link: https://www.tirinfo.com/panduan-lengkap-text-processing-awk-linux/