Complete Guide to Text Processing with AWK on Linux
AWK is a powerful programming language for text processing. It was created by Alfred Aho, Peter Weinberger, and Brian Kernighan (their initials give the language its name), and it is very effective for manipulating structured data such as log files, CSV, and command output.
1. Introduction to AWK and Basic Syntax
AWK Program Structure
pattern { action }
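- Pattern: The condition a line must satisfy for the action to run. If the pattern is omitted, the action runs for every line.
- Action: The commands that run when the pattern matches. If the action is omitted, the matching line is printed.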
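The three forms of a rule (pattern only, action only, both) can be sketched as follows. The sample data and the function names are hypothetical and exist only for this illustration.

```shell
# Hypothetical sample data: name and score, whitespace-separated.
sample_scores() {
    printf '%s\n' 'alice 90' 'bob 45' 'carol 72'
}

# Pattern only: the default action prints the whole matching line.
pattern_only() { sample_scores | awk '$2 >= 70'; }

# Action only: with no pattern, the action runs for every line.
action_only() { sample_scores | awk '{ print $1 }'; }

# Pattern and action: print the name only when the score is below 70.
pattern_and_action() { sample_scores | awk '$2 < 70 { print $1 }'; }

pattern_only
action_only
pattern_and_action
```

Running the block prints the two lines with a score of at least 70, then all three names, then `bob`.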
AWK One-Liners
# Print the whole file
awk '{print}' file.txt
# Print a specific line (line 5)
awk 'NR==5' file.txt
# Print lines that match a pattern
awk '/pattern/' file.txt
# Print specific fields (default delimiter: whitespace)
awk '{print $1}' file.txt # First field
awk '{print $NF}' file.txt # Last field
awk '{print $1, $3}' file.txt # Fields 1 and 3
2. Field Processing and Delimiters
Using Custom Delimiters
# CSV file with a comma delimiter (quoted fields that contain commas are not handled)
awk -F',' '{print $1, $2}' data.csv
# TSV file with a tab delimiter
awk -F'\t' '{print $1}' data.tsv
# Multiple delimiters
awk -F'[:,]' '{print $1}' file.txt # Delimiter is : or ,
# Regular expression as the delimiter
awk -F'[ \t]+' '{print $1}' file.txt # Runs of spaces or tabs
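A small sketch of how `-F` changes field splitting, including the quoted-CSV limitation. The record contents and function names below are hypothetical.

```shell
# Hypothetical passwd-style record used only for illustration.
record='deploy:x:1001:1001:Deploy User:/home/deploy:/bin/bash'

# Split on ":" and print the login name and shell (fields 1 and 7).
login_and_shell() {
    printf '%s\n' "$record" | awk -F':' '{ print $1, $7 }'
}

# With a tab delimiter, spaces inside a field are preserved.
tab_second_field() {
    printf 'a b\tc d\n' | awk -F'\t' '{ print $2 }'
}

# Splitting CSV on "," is naive: the quoted name below contains a comma,
# so the two-column line is reported as three fields.
naive_csv_field_count() {
    printf '%s\n' '"Doe, Jane",42' | awk -F',' '{ print NF }'
}

login_and_shell
tab_second_field
naive_csv_field_count
```

For CSV files that contain quoted commas, a dedicated CSV parser is usually the safer choice.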
Field Operations
# Print with a custom separator
awk '{print $1 " - " $2}' file.txt
# Sum a field
awk '{sum += $3} END {print sum}' numbers.txt
# Average of a field (the count check avoids dividing by zero on an empty file)
awk '{sum += $1; count++} END {if (count) print sum/count}' data.txt
# Find max/min
awk 'max < $1 || NR==1 {max = $1} END {print max}' data.txt
awk 'min > $1 || NR==1 {min = $1} END {print min}' data.txt
# Count non-empty fields
awk '{for(i=1;i<=NF;i++) if($i!="") count++} END {print count}' file.txt
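The separate sum, average, and max/min one-liners above can be combined into a single pass. This is a minimal sketch, assuming one numeric value in the first column; `column_stats` is a hypothetical name.

```shell
# One pass over column 1: sum, average, min and max.
# Adding 0 forces a numeric comparison; the END guard handles empty input.
column_stats() {
    awk '
        NR == 1 { min = max = $1 + 0 }
        {
            value = $1 + 0
            sum += value
            if (value < min) min = value
            if (value > max) max = value
        }
        END {
            if (NR == 0) { print "no data"; exit }
            printf "sum=%s avg=%.2f min=%s max=%s\n", sum, sum / NR, min, max
        }
    '
}

printf '%s\n' 4 10 7 | column_stats
```

With the three sample values the function prints `sum=21 avg=7.00 min=4 max=10`.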
3. Pattern Matching
Pattern Types
# Exact match
awk '$1 == "value"' file.txt
# Regex match
awk '$2 ~ /regex/' file.txt
# Negation
awk '$2 !~ /regex/' file.txt
# Numeric comparison
awk '$3 > 100' file.txt
awk '$3 < 50' file.txt
awk '$3 >= 100 && $3 <= 200' file.txt
# String comparison
awk '$1 > "m"' file.txt # Alphabetic comparison
# Multiple conditions
awk '$1 == "admin" && $3 > 1000' file.txt
awk '$1 == "user" || $1 == "admin"' file.txt
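To make the difference between these pattern types concrete, here is a short sketch on hypothetical data (role, account name, amount). The function names are illustrative only.

```shell
# Hypothetical sample: role, account name, amount.
sample_accounts() {
    printf '%s\n' \
        'admin root 1500' \
        'user guest 200' \
        'admin ops 800' \
        'user administrator 3000'
}

# Exact string match on field 1 combined with a numeric test on field 3.
big_admins() {
    sample_accounts | awk '$1 == "admin" && $3 > 1000 { print $2 }'
}

# An anchored regex tested against field 2 only, not the whole line.
names_starting_adm() {
    sample_accounts | awk '$2 ~ /^adm/ { print $2 }'
}

# A bare /admin/ is tested against the whole line, so it also matches
# the line whose second field is "administrator".
lines_mentioning_admin() {
    sample_accounts | awk '/admin/ { print NR }'
}

big_admins
names_starting_adm
lines_mentioning_admin
```

Comparing a specific field with `==` or `~` is usually more precise than a bare `/regex/`, which matches anywhere in the line.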
Built-in Patterns
# BEGIN - runs before any input is processed
awk 'BEGIN {print "Header"} {print}' file.txt
# END - runs after all input has been processed
awk '{sum += $1} END {print "Total:", sum}' file.txt
# NR - Record/line number
awk 'NR==1 {print "First line"}' file.txt
awk 'NR%2==0' file.txt # Even lines
awk 'NR>1 && NR<=10' file.txt # Lines 2-10
# NF - Number of fields
awk 'NF > 5' file.txt # Lines with more than 5 fields
awk 'NF == 0 {print "Empty line"}' file.txt
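BEGIN, END, NR, and NF are often used together in one program. The sketch below prints a header, reports the line number and field count of each non-blank line, and counts blank lines; `summarize` is a hypothetical name.

```shell
summarize() {
    awk '
        BEGIN   { print "line fields" }   # runs once, before any input
        NF == 0 { blank++; next }         # blank line: count it and skip
                { print NR, NF }          # line number and field count
        END     { print "blank lines:", blank + 0 }
    '
}

printf 'a b c\n\nd e\n' | summarize
```

Note that NR keeps counting blank lines, so the second reported line is numbered 3.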
4. Log Processing with AWK
Apache/Nginx Log Analysis
# Count requests per IP
cat access.log | awk '{print $1}' | sort | uniq -c | sort -rn | head -20
# Or do the counting in AWK alone
awk '{count[$1]++} END {for(ip in count) print count[ip], ip}' access.log | sort -rn | head -20
# Find 404 errors with their referer
awk '$9 == 404 {print $1, $7, $11}' access.log
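# Average response time. In the default combined log format $10 is the response size in bytes,
# not a time, so this assumes a custom log format that appends the request time as the last field.
awk '{sum += $NF; count++} END {if (count) print "Avg response time:", sum/count}' access.log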
# Find most requested URLs
awk '{url[$7]++} END {for(u in url) print url[u], u}' access.log | sort -rn | head -10
# Bandwidth usage per IP
awk '{bytes[$1] += $10} END {for(ip in bytes) print bytes[ip], ip}' access.log | sort -rn | head -20
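The field numbers used above ($1 client IP, $7 URL, $9 status, $10 bytes) assume the combined log format. The sketch below checks them against three hypothetical log lines; the addresses, paths, and function names are made up for illustration.

```shell
# Three hypothetical lines in combined log format.
sample_access_log() {
    cat <<'EOF'
203.0.113.5 - - [10/Oct/2025:13:55:36 +0000] "GET /index.html HTTP/1.1" 200 1024 "-" "curl/8.0"
198.51.100.7 - - [10/Oct/2025:13:55:40 +0000] "GET /missing HTTP/1.1" 404 512 "-" "curl/8.0"
203.0.113.5 - - [10/Oct/2025:13:56:02 +0000] "GET /about.html HTTP/1.1" 200 2048 "-" "curl/8.0"
EOF
}

# Requests per status code. The order of "for (s in c)" is unspecified,
# so the output is sorted afterwards.
status_counts() {
    awk '{ c[$9]++ } END { for (s in c) print s, c[s] }' | sort
}

# Bytes sent per client IP, largest first.
bytes_per_ip() {
    awk '{ b[$1] += $10 } END { for (ip in b) print b[ip], ip }' | sort -rn
}

sample_access_log | status_counts
sample_access_log | bytes_per_ip
```

If the server uses a different log format, the field numbers need to be adjusted to match it.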
System Log Analysis
# Error count per hour from syslog (assumes the traditional "Mon DD HH:MM:SS" timestamp, so $3 is the time)
awk '/error/ {hour=substr($3,1,2); count[hour]++} END {for(h in count) print h, count[h]}' /var/log/syslog
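# Failed SSH login attempts per source IP. The address is taken relative to the end of the line,
# because "invalid user" messages shift the field positions counted from the start.
awk '/Failed password/ {print $(NF-3)}' /var/log/auth.log | sort | uniq -c | sort -rn | head -10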
# Total used disk space from df output (-k reports 1K blocks, so the sum converts cleanly to GB)
df -Pk | awk 'NR>1 {sum += $3} END {printf "Total used: %.2f GB\n", sum/1024/1024}'
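The sketch below runs the SSH analysis on hypothetical auth.log lines in the traditional syslog layout. It reads the address as the fourth field from the end, which stays correct for both message variants; the host names, addresses, and function names are made up.

```shell
# Hypothetical auth.log lines in the traditional syslog layout.
sample_auth_log() {
    cat <<'EOF'
Jan 10 03:12:45 host sshd[811]: Failed password for root from 192.0.2.10 port 50022 ssh2
Jan 10 03:12:49 host sshd[811]: Failed password for invalid user admin from 192.0.2.10 port 50030 ssh2
Jan 10 04:01:02 host sshd[902]: Failed password for root from 198.51.100.9 port 40100 ssh2
Jan 10 04:05:00 host sshd[903]: Accepted password for deploy from 192.0.2.44 port 40200 ssh2
EOF
}

# Failed attempts per source IP. The line always ends with
# "<ip> port <n> ssh2", so the address is field NF-3.
failed_by_ip() {
    awk '/Failed password/ { c[$(NF-3)]++ }
         END { for (ip in c) print c[ip], ip }' | sort -rn
}

# Failed attempts per hour, taken from the HH part of the time in field 3.
failed_by_hour() {
    awk '/Failed password/ { h[substr($3, 1, 2)]++ }
         END { for (k in h) print k, h[k] }' | sort
}

sample_auth_log | failed_by_ip
sample_auth_log | failed_by_hour
```

Systems that log ISO 8601 timestamps put the time in a different field, so the `substr($3, 1, 2)` extraction would need to change there.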
5. AWK Scripting and File Processing
Multi-line AWK Script
Save the following as process.awk:
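#!/usr/bin/awk -f
# Lists employees earning more than 50000, then prints overall totals.
BEGIN {
    FS = ","       # input columns are comma-separated
    OFS = " | "    # output columns are joined with " | "
    print "Name", "Department", "Salary"
    print "----", "----------", "------"
}
{
    total += $3
    count++
    if ($3 > 50000) {
        print $1, $2, "$" $3
    }
}
END {
    print ""
    # printf keeps OFS out of the summary lines and avoids dividing by zero
    if (count > 0) printf "Average Salary: $%.2f\n", total / count
    printf "Total Employees: %d\n", count
}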
Run it with:
chmod +x process.awk
./process.awk employees.csv
# or
awk -f process.awk employees.csv
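If the CSV file starts with a header row, the script above would count that row as an employee. The variant below is a sketch that skips the first line; it assumes a hypothetical three-column file (name, department, salary), and the function names are illustrative.

```shell
# Hypothetical employees.csv content, including a header row.
sample_employees() {
    printf '%s\n' \
        'name,department,salary' \
        'Ana,Engineering,72000' \
        'Budi,Support,45000' \
        'Citra,Finance,58000'
}

salary_report() {
    awk '
        BEGIN { FS = ","; OFS = " | " }
        NR == 1 { next }                      # skip the header row
        { total += $3; count++ }              # every data row counts
        $3 > 50000 { print $1, $2, "$" $3 }   # list only the higher salaries
        END {
            if (count > 0)
                printf "Average Salary: $%.2f\n", total / count
            printf "Total Employees: %d\n", count
        }
    '
}

sample_employees | salary_report
```

On the sample data this lists Ana and Citra, then reports an average of $58333.33 across 3 employees.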
Data Transformation
# Convert CSV to TSV
awk 'BEGIN {FS=","; OFS="\t"} {$1=$1; print}' input.csv > output.tsv
# Format currency
awk '{printf "$%.2f\n", $1}' prices.txt
# Pad numbers with leading zeros
awk '{printf "%04d\n", $1}' numbers.txt
# Date formatting
awk '{gsub(/-/,"/"); print}' dates.txt # Replace - with /
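The `$1=$1` assignment in the CSV-to-TSV command is easy to overlook. AWK only rebuilds a record with the new OFS after a field has been assigned, as this short sketch shows. The function names are hypothetical.

```shell
# Assigning a field forces awk to rebuild $0 using OFS.
csv_to_tsv() {
    awk 'BEGIN { FS = ","; OFS = "\t" } { $1 = $1; print }'
}

# Without the assignment the line is printed exactly as it was read.
csv_unchanged() {
    awk 'BEGIN { FS = ","; OFS = "\t" } { print }'
}

# printf formats: two decimal places, and zero-padding to width 4.
format_price() { awk '{ printf "$%.2f\n", $1 }'; }
zero_pad()     { awk '{ printf "%04d\n", $1 }'; }

printf 'a,b,c\n' | csv_to_tsv
printf 'a,b,c\n' | csv_unchanged
printf '3.5\n' | format_price
printf '42\n' | zero_pad
```

The first command prints the three values separated by tabs, while the second still prints `a,b,c`.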
Conclusion
AWK is a powerful tool for text processing and data extraction. By combining pattern matching, field processing, and arithmetic, it can often replace more complex text-processing tools.
When to Use AWK:
- Processing structured text data (CSV, TSV, logs)
- Extracting and transforming data
- Calculations on data
- Reporting and summarization
- One-liner text processing
Alternatives:
- sed for simple text substitution
- grep for pattern matching
- cut for simple field extraction
- perl for complex scripting
Tips:
- Always test with sample data first
- Use -F to set the delimiter
- Print intermediate results when debugging
- Combine with pipes for complex workflows
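One way to print intermediate results without polluting the real output is to send them to standard error. This is a minimal sketch, assuming a Linux system where `/dev/stderr` is available; `sum_with_trace` is a hypothetical name.

```shell
# Sum column 1 and trace every step on stderr.
# Only the final total goes to stdout, so the function stays pipe-friendly.
sum_with_trace() {
    awk '
        {
            sum += $1
            print "debug: line " NR " value=" $1 " sum=" sum > "/dev/stderr"
        }
        END { print sum }
    '
}

# Show the trace while developing:
printf '%s\n' 5 10 | sum_with_trace
# Hide it once the program works:
printf '%s\n' 5 10 | sum_with_trace 2>/dev/null
```

Because the trace goes to a separate stream, the result can still be piped into sort, head, or another awk program.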
Post link : https://www.tirinfo.com/panduan-lengkap-text-processing-awk-linux/
Editor : Hendra Wijaya
Publisher : Tirinfo
Read : 4 minutes.
Update : 3 February 2026