
Troubleshooting MySQL: A Complete Guide to Resolving Production Database Problems

A production database outage can cost thousands of dollars per minute. After years of handling production incidents, I have developed a systematic approach to troubleshooting MySQL. This article is a complete playbook for emergency situations.

Troubleshooting Methodology

1. Systematic Approach

  1. Identify: What is the problem?
  2. Isolate: Where is the problem?
  3. Investigate: Why did it happen?
  4. Implement: What is the fix?
  5. Prevent: How do we keep it from happening again?

2. Data Gathering Checklist

# Save the current state before making any changes
mysql -u root -p -e "SHOW GLOBAL STATUS;" > /tmp/status_before.txt
mysql -u root -p -e "SHOW FULL PROCESSLIST;" > /tmp/processlist.txt
mysql -u root -p -e "SHOW ENGINE INNODB STATUS\G" > /tmp/innodb_status.txt
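The checklist above can be wrapped in a small helper so that every incident starts from a snapshot in its own directory. This is only a minimal sketch: the function name `collect_snapshot` and the file layout are assumptions, and it expects the `mysql` client to authenticate without prompting (for example through an option file such as `~/.my.cnf`).

```shell
# Hypothetical helper (not part of MySQL): save server state into a directory.
# Assumes the mysql client can log in without a password prompt.
collect_snapshot() (
    out_dir=$1
    mkdir -p "$out_dir" || exit 1
    mysql -e "SHOW GLOBAL STATUS;"         > "$out_dir/status.txt"        || exit 1
    mysql -e "SHOW FULL PROCESSLIST;"      > "$out_dir/processlist.txt"   || exit 1
    mysql -e "SHOW ENGINE INNODB STATUS\G" > "$out_dir/innodb_status.txt" || exit 1
    # Print the directory so the caller can log where the snapshot went
    echo "$out_dir"
)

# Usage: collect_snapshot "/tmp/mysql_snapshot_$(date +%Y%m%d_%H%M%S)"
```

Taking a second snapshot after the fix makes it possible to diff the two directories and see which counters actually changed.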

Common Errors and Solutions

1. “Can’t connect to local MySQL server through socket”

Cause: MySQL is not running, or the socket file does not exist

Solution:

# Check the service status
sudo systemctl status mysql

# If it is stopped, start it
sudo systemctl start mysql

# Check the error log
tail -f /var/log/mysql/error.log

# If the socket file is missing, connect over TCP/IP to bypass it
mysql -u root -p -h 127.0.0.1

# Then fix the socket path in my.cnf so the server and client agree
[mysqld]
socket = /var/run/mysqld/mysqld.sock

[client]
socket = /var/run/mysqld/mysqld.sock
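When this error appears, it helps to confirm that the socket path in the configuration actually exists on disk. The sketch below assumes a single option file; the helper names `socket_from_cnf` and `check_socket` are hypothetical, and `!include` / `!includedir` directives are not followed.

```shell
# Hypothetical helper: print the socket path configured in the [mysqld] section.
# Only the given file is read; included files are ignored.
socket_from_cnf() {
    awk -F= '
        /^[ \t]*\[/ { in_mysqld = ($0 ~ /^[ \t]*\[mysqld\]/); next }
        in_mysqld && $1 ~ /^[ \t]*socket[ \t]*$/ {
            gsub(/^[ \t]+|[ \t]+$/, "", $2)   # trim whitespace around the value
            print $2
            exit
        }
    ' "$1"
}

# Report whether the configured socket file is present.
# Returns 0 if present, 1 if missing, 2 if no socket setting was found.
check_socket() {
    sock=$(socket_from_cnf "$1")
    if [ -z "$sock" ]; then
        echo "no socket setting in [mysqld] section of $1"
        return 2
    elif [ -S "$sock" ]; then
        echo "socket present: $sock"
    else
        echo "socket missing: $sock"
        return 1
    fi
}

# Usage: check_socket /etc/mysql/my.cnf
```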

2. “Too many connections”

See the related article: How to Fix the Too Many Connections Error

Quick Fix:

-- Kill idle connections
-- KILL accepts a single connection ID, so generate one statement per idle connection
SELECT CONCAT('KILL ', ID, ';') AS kill_stmt
FROM INFORMATION_SCHEMA.PROCESSLIST
WHERE COMMAND = 'Sleep'
  AND TIME > 300
  AND ID <> CONNECTION_ID();

-- Review the generated statements, then run them one by one, for example:
-- KILL 1234;
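Because `KILL` takes one connection ID per statement, one option is to generate the statements outside the server and review them before running anything. The filter below is a sketch under that assumption; `idle_kill_statements` is a hypothetical name, and it expects tab-separated `ID`, `TIME`, `COMMAND` rows such as those produced by `mysql -N -B`.

```shell
# Hypothetical filter: read "ID<TAB>TIME<TAB>COMMAND" rows on stdin and print one
# KILL statement per connection that has been sleeping longer than $1 seconds.
idle_kill_statements() {
    awk -F '\t' -v limit="$1" '
        $3 == "Sleep" && ($2 + 0) > (limit + 0) { print "KILL " $1 ";" }
    '
}

# Usage (inspect the output before piping it back into mysql):
#   mysql -N -B -e "SELECT ID, TIME, COMMAND FROM information_schema.PROCESSLIST" \
#     | idle_kill_statements 300
```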

3. “Table is full”

Cause: The disk is full, or the table has hit a size limit

Solution:

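# Check disk space
df -h

# Check how much space the database directory uses
du -sh /var/lib/mysql/database_name/

# If the disk is full, list old log files first and review them before deleting anything
sudo find /var/lib/mysql -name "*.log" -mtime +7 -ls

# Remove old binary logs through MySQL rather than deleting the files by hand
mysql -u root -p -e "PURGE BINARY LOGS BEFORE DATE_SUB(NOW(), INTERVAL 7 DAY);"

# Or expand the disk (cloud)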
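A "Table is full" error is easier to avoid than to fix, so the disk check can be turned into a threshold test that runs from cron or a monitoring agent. This is a sketch only: `disk_usage_pct` and `check_disk` are hypothetical names, and the threshold is whatever the caller passes in.

```shell
# Print the used-space percentage (integer, no % sign) of the filesystem holding $1.
disk_usage_pct() {
    df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Hypothetical check: warn (and return 1) when usage reaches the threshold in $2.
check_disk() {
    pct=$(disk_usage_pct "$1")
    if [ "$pct" -ge "$2" ]; then
        echo "WARNING: $1 is ${pct}% full"
        return 1
    fi
    echo "OK: $1 is ${pct}% full"
}

# Usage: check_disk /var/lib/mysql 90
```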

4. “Lock wait timeout exceeded”

Cause: A long-running transaction is holding locks

Solution:

-- Identify the blocking transaction (MySQL 5.7 and earlier).
-- MySQL 8.0 removed information_schema.innodb_lock_waits; query sys.innodb_lock_waits there instead.
SELECT 
    r.trx_id AS waiting_trx_id,
    r.trx_mysql_thread_id AS waiting_thread,
    r.trx_query AS waiting_query,
    b.trx_id AS blocking_trx_id,
    b.trx_mysql_thread_id AS blocking_thread,
    b.trx_query AS blocking_query
FROM information_schema.innodb_lock_waits w
INNER JOIN information_schema.innodb_trx b ON b.trx_id = w.blocking_trx_id
INNER JOIN information_schema.innodb_trx r ON r.trx_id = w.requesting_trx_id;

-- Kill the blocking thread
KILL <blocking_thread_id>;

5. “InnoDB: Database page corruption”

Cause: Hardware failure, power outage, or disk corruption

Solution:

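# 1. Take a backup immediately (the dump may fail when it reaches a corrupted page)
mysqldump -u root -p --all-databases > /backup/corruption_backup.sql

# 2. Stop MySQL, then check the tablespace file (innochecksum needs the server stopped)
sudo systemctl stop mysql
cd /var/lib/mysql
sudo innochecksum database_name/table_name.ibd

# 3. Enable forced recovery in my.cnf
[mysqld]
innodb_force_recovery = 1  # Try 1-6 in order; higher is more aggressive, and 4+ risks permanent data loss

# 4. Start MySQL and dump the data
sudo systemctl start mysql
mysqldump -u root -p database_name > /backup/recovery_dump.sql

# 5. Drop the damaged database while forced recovery is still active
#    (DROP is supported at levels 1-3; higher levels put InnoDB in read-only mode)
mysql -u root -p -e "DROP DATABASE database_name;"

# 6. Remove innodb_force_recovery from my.cnf and restart normally;
#    InnoDB blocks INSERT, UPDATE, and DELETE while it is set
sudo systemctl restart mysql

# 7. Recreate the database and restore
mysql -u root -p -e "CREATE DATABASE database_name;"
mysql -u root -p database_name < /backup/recovery_dump.sql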

Performance Issues

1. Queries Suddenly Become Slow

Diagnosis:

-- Check the slow query log (mysql.slow_log is populated only when log_output includes TABLE)
SELECT * FROM mysql.slow_log 
WHERE start_time > DATE_SUB(NOW(), INTERVAL 1 HOUR)
ORDER BY query_time DESC;

-- Check whether the execution plan has changed
EXPLAIN FORMAT=JSON SELECT ... FROM slow_query;

-- Check table statistics and refresh them
SHOW TABLE STATUS LIKE 'problematic_table';
ANALYZE TABLE problematic_table;

Common Causes:

  • Outdated statistics: run ANALYZE TABLE
  • Index corruption: run OPTIMIZE TABLE or recreate the index
  • Data distribution changed: reconsider the index strategy
  • Server restarted: the InnoDB buffer pool is cold, so reads hit disk until it warms up

2. High CPU Usage

Diagnosis:

-- Statements with the highest total latency, a proxy for CPU cost (timers are in picoseconds)
SELECT 
    DIGEST_TEXT,
    COUNT_STAR,
    AVG_TIMER_WAIT/1000000000 as avg_ms,
    SUM_CREATED_TMP_DISK_TABLES as disk_tmp_tables
FROM performance_schema.events_statements_summary_by_digest
ORDER BY SUM_TIMER_WAIT DESC
LIMIT 10;

Solution:

-- Enable profiling (deprecated but still available; Performance Schema is the long-term replacement)
SET profiling = 1;

-- Run query
SELECT ... FROM ...;

-- Analyze
SHOW PROFILES;
SHOW PROFILE FOR QUERY 1;

3. Memory Exhaustion

Diagnosis:

# Check memory usage
top -p "$(pgrep -d, -x mysqld)"

# Or
ps aux | grep mysqld | grep -v grep

Memory Configuration:

# my.cnf
[mysqld]
# 70-80% of RAM for InnoDB on a dedicated database server
innodb_buffer_pool_size = 4G

# Connection memory
max_connections = 100
# Each connection uses roughly 256KB - 1MB

# Query cache (MySQL 5.7 and earlier only; removed in 8.0, where these settings prevent startup)
query_cache_size = 256M
query_cache_type = 1
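The sizing rules in the configuration comments reduce to simple arithmetic, sketched below. The function names are hypothetical, the 70-80% figure and the 256KB - 1MB per-connection range come from the comments above, and the results are rough planning numbers rather than measured usage.

```shell
# Suggested buffer pool size in MB: $2 percent of $1 MB total RAM (rounds down).
suggest_buffer_pool_mb() {
    echo $(( $1 * $2 / 100 ))
}

# Rough upper bound, in MB, of connection memory:
# $1 connections at $2 KB each.
connection_memory_mb() {
    echo $(( $1 * $2 / 1024 ))
}

# Example: a dedicated 8 GB server, 75% for InnoDB, 100 connections at 1 MB each.
suggest_buffer_pool_mb 8192 75    # prints 6144
connection_memory_mb 100 1024     # prints 100
```

On Linux, total RAM in MB can be read with `awk '/MemTotal/ { print int($2 / 1024) }' /proc/meminfo` and passed in as the first argument.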

4. Disk I/O Bottleneck

Diagnosis:

# Monitor disk I/O
iostat -x 1 10

# Or
iotop -o -b -d 1

Solution:

-- Find the tables with the highest I/O latency (timers are in picoseconds)
SELECT 
    OBJECT_SCHEMA,
    OBJECT_NAME,
    SUM_TIMER_WAIT/1000000000 as latency_ms
FROM performance_schema.table_io_waits_summary_by_table
ORDER BY SUM_TIMER_WAIT DESC
LIMIT 10;

-- Check temporary tables created on disk
SHOW GLOBAL STATUS LIKE 'Created_tmp%';
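The `Created_tmp%` counters are easier to judge as a ratio than as raw numbers. The sketch below computes the share of temporary tables that went to disk; `tmp_disk_table_pct` is a hypothetical name, and it assumes tab-separated `Variable_name`, `Value` rows as printed by `mysql -N -B`.

```shell
# Read "Variable_name<TAB>Value" rows on stdin and print the share of temporary
# tables that were created on disk, as an integer percentage (0 if none exist).
tmp_disk_table_pct() {
    awk -F '\t' '
        $1 == "Created_tmp_disk_tables" { disk = $2 }
        $1 == "Created_tmp_tables"      { total = $2 }
        END {
            if (total > 0) printf "%d\n", disk * 100 / total
            else print 0
        }
    '
}

# Usage: mysql -N -B -e "SHOW GLOBAL STATUS LIKE 'Created_tmp%'" | tmp_disk_table_pct
```

A persistently high percentage suggests that queries build temporary tables too large for memory, which adds to the disk I/O load diagnosed above.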

Replication Issues

1. High Replication Lag

Diagnosis:

-- On the slave
SHOW SLAVE STATUS\G
-- Watch: Seconds_Behind_Master

-- Identify the cause
SELECT * FROM performance_schema.replication_applier_status;

Solution:

-- Enable parallel replication (MySQL 5.7+ variable names;
-- slave_parallel_threads and slave_parallel_mode exist only in MariaDB)
STOP SLAVE SQL_THREAD;
SET GLOBAL slave_parallel_type = 'LOGICAL_CLOCK';
SET GLOBAL slave_parallel_workers = 4;
START SLAVE SQL_THREAD;
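Lag is worth checking automatically rather than by eye. The sketch below pulls `Seconds_Behind_Master` out of the vertical status output and compares it with a threshold; the function names are hypothetical, and on MySQL 8.0.22 or later the statement is `SHOW REPLICA STATUS` and the field is `Seconds_Behind_Source`, so the pattern would need adjusting there.

```shell
# Extract Seconds_Behind_Master from "SHOW SLAVE STATUS\G" output on stdin.
# The value is NULL when the SQL thread is not running.
replication_lag() {
    awk -F ': ' '$1 ~ /Seconds_Behind_Master$/ { print $2; exit }'
}

# Hypothetical check: return 0 if lag <= $1 seconds, 1 if higher,
# 2 if the lag is not a number (for example NULL or missing).
lag_within() {
    lag=$(replication_lag)
    case $lag in
        ''|*[!0-9]*) echo "lag unknown: ${lag:-no value}"; return 2 ;;
    esac
    if [ "$lag" -le "$1" ]; then
        echo "lag ok: ${lag}s"
    else
        echo "lag too high: ${lag}s"
        return 1
    fi
}

# Usage: mysql -e "SHOW SLAVE STATUS\G" | lag_within 60
```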

2. Replication Error

Diagnosis:

SHOW SLAVE STATUS\G
-- Watch: Last_SQL_Error, Last_IO_Error

Solutions by error type:

-- Error: Duplicate entry
-- Skips one event; intended for file-position replication (GTID-based setups
-- generally need a different procedure) and can hide real data drift
STOP SLAVE;
SET GLOBAL SQL_SLAVE_SKIP_COUNTER = 1;
START SLAVE;

-- Error: Table doesn't exist
-- Recreate the table from the master

-- Error: Foreign key constraint
-- Check data consistency

3. Master-Slave Data Inconsistency

Check:

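# Percona Toolkit (--ask-pass prompts for the password)
pt-table-checksum --user=root --ask-pass --databases=mydb

# Or check manually; a schema-only dump (--no-data) would not compare row data
mysql -u root -p -e "CHECKSUM TABLE mydb.table_name;"
# Run on both master and slave, then compare the results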
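When many tables are involved, comparing results by hand is error-prone. As a sketch, per-table checksums from each server can be saved to files and compared; the name `diff_checksums` and the `table<TAB>checksum` file layout are assumptions (this is the shape `mysql -N -B -e "CHECKSUM TABLE ..."` prints), and checksums are only comparable when both servers run the same MySQL version and row format.

```shell
# Compare two files of "table<TAB>checksum" rows (first: master, second: slave).
# Prints tables whose checksums differ, then tables missing on the slave.
diff_checksums() {
    awk -F '\t' '
        NR == FNR { master_sum[$1] = $2; next }        # first file: remember checksums
        { seen[$1] = 1 }                               # second file: note each table
        ($1 in master_sum) && master_sum[$1] != $2 { print $1 "\tMISMATCH" }
        END {
            for (t in master_sum) if (!(t in seen)) print t "\tMISSING"
        }
    ' "$1" "$2"
}

# Usage:
#   mysql -h master_host -N -B -e "CHECKSUM TABLE mydb.orders, mydb.users" > master.txt
#   mysql -h slave_host  -N -B -e "CHECKSUM TABLE mydb.orders, mydb.users" > slave.txt
#   diff_checksums master.txt slave.txt
```

The host names in the usage lines are placeholders. Empty output means every table listed on the master has a matching checksum on the slave.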

Crash Recovery

1. MySQL Won’t Start

Steps:

# 1. Check the error log
sudo tail -n 100 /var/log/mysql/error.log

# 2. Check disk space
df -h

# 3. Check permissions
ls -la /var/lib/mysql/
sudo chown -R mysql:mysql /var/lib/mysql

# 4. Recovery mode: stop MySQL, add the setting below to my.cnf, then start again
sudo systemctl stop mysql
sudo nano /etc/mysql/my.cnf
#   [mysqld]
#   innodb_force_recovery = 1  # Start at 1 and go up to 6 only if needed
sudo systemctl start mysql

# Dump all data if recovery succeeds
mysqldump -u root -p --all-databases > /backup/emergency_dump.sql

# Restore into a new instance

2. Corrupted Table (MyISAM)

-- Repair a MyISAM table
REPAIR TABLE corrupted_table;

-- Or from the command line (stop MySQL first so the server does not touch the files)
myisamchk --recover /var/lib/mysql/db_name/table_name

3. Corrupted Table (InnoDB)

-- 1. Back up the data using forced recovery
-- 2. Drop the table
DROP TABLE corrupted_table;

-- 3. Recreate it from a backup or from the schema
CREATE TABLE corrupted_table (...);

-- 4. Restore the data

Deadlock Resolution

1. Identify Deadlock

-- Check the most recent deadlock
SHOW ENGINE INNODB STATUS\G
-- Look for the "LATEST DETECTED DEADLOCK" section
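The status output is long, so it can help to cut out just the deadlock section. The filter below is a sketch that assumes the usual layout, in which the `LATEST DETECTED DEADLOCK` section is followed by the `TRANSACTIONS` section; `extract_deadlock` is a hypothetical name, and it prints nothing if no deadlock has been recorded.

```shell
# Print only the LATEST DETECTED DEADLOCK section of InnoDB status output on stdin.
# Dash-only separator lines are dropped; output stops at the TRANSACTIONS header.
extract_deadlock() {
    awk '
        /^LATEST DETECTED DEADLOCK$/ { printing = 1 }
        /^TRANSACTIONS$/             { printing = 0 }
        printing && !/^-+$/
    '
}

# Usage: mysql -e "SHOW ENGINE INNODB STATUS\G" | extract_deadlock
```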

2. Analyzing the Deadlock Log

LATEST DETECTED DEADLOCK
*** (1) TRANSACTION:
TRANSACTION 12345, ACTIVE 11 sec starting index read
mysql tables in use 1, locked 1
LOCK WAIT 3 lock struct(s), heap size 1136, 2 row lock(s)
MySQL thread id 1234, OS thread handle 12345678, query id 123 localhost user
SELECT * FROM orders WHERE id = 100 FOR UPDATE
*** (1) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 55 page no 3 n bits 72 index PRIMARY of table `db`.`orders`

3. Prevention

-- Access tables in a consistent order
-- Use ORDER BY so rows are locked in a predictable order
-- Keep transactions short
-- Use indexes to reduce lock contention

Emergency Playbook

1. Complete Crash Recovery

#!/bin/bash
# Emergency recovery script
set -euo pipefail  # Abort on the first failure so data is never wiped after a failed dump

# One dump file name, reused for the restore at the end
DUMP_FILE="/backup/emergency_$(date +%Y%m%d_%H%M%S).sql"

# 1. Stop MySQL
sudo systemctl stop mysql

# 2. Back up the data files (even if corrupt), preserving ownership
sudo cp -a /var/lib/mysql "/var/lib/mysql_backup_$(date +%s)"

# 3. Enable recovery mode
sudo tee -a /etc/mysql/my.cnf <<EOF
[mysqld]
innodb_force_recovery = 4
EOF

# 4. Start MySQL
sudo systemctl start mysql

# 5. Dump the data
mysqldump -u root -p --all-databases > "$DUMP_FILE"

# 6. Stop MySQL and reset the config
sudo systemctl stop mysql
sudo sed -i '/innodb_force_recovery/d' /etc/mysql/my.cnf

# 7. Clear the data directory (find runs as root, so no shell glob is needed)
sudo find /var/lib/mysql -mindepth 1 -delete

# 8. Reinitialize (creates root with an empty password)
sudo mysqld --initialize-insecure --user=mysql

# 9. Start MySQL
sudo systemctl start mysql

# 10. Restore the data
mysql -u root < "$DUMP_FILE"

2. Data Corruption Prevention

-- Enable checksums
SET GLOBAL innodb_checksum_algorithm = 'strict_crc32';

-- Regular checks
CHECK TABLE important_table;

Conclusion

Effective troubleshooting requires:

  1. Systematic approach: Don't panic; follow the methodology
  2. Proper logging: Always check the logs first
  3. Backup before fix: Save the current state before making changes
  4. Test in staging: Validate the fix before applying it in production
  5. Document: Record everything for future reference

With this playbook, you are prepared to handle most database crises.

Post link: https://www.tirinfo.com/troubleshooting-mysql-panduan-lengkap-masalah-production/

Hendra WIjaya
Tirinfo
3 February 2026