We ran into 'out of space' errors on disk even after clearing old data from the database server (MySQL) and temporary data from the application. About 90% of the space was occupied by database tables that store very large de-normalized statistical data. When we inspected those tables, we found that they held little data but were still occupying a lot of space on disk.
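One way to confirm this kind of fragmentation is to compare the size of each table with the space MySQL reports as allocated but unused (the DATA_FREE column of information_schema.TABLES). The following is a minimal sketch, not the query we actually ran; the function name and the schema name passed to it are hypothetical.

```shell
# Print a query that lists the tables of schema $1, most reclaimable space first.
# DATA_FREE is the number of allocated-but-unused bytes reported by MySQL.
# Note: $1 is inserted into the SQL as-is, so only pass trusted schema names.
fragmentation_query() {
    cat <<EOF
SELECT TABLE_NAME, DATA_LENGTH, INDEX_LENGTH, DATA_FREE
FROM information_schema.TABLES
WHERE TABLE_SCHEMA = '$1'
ORDER BY DATA_FREE DESC;
EOF
}
```

The output can be piped into the mysql client, for example `fragmentation_query stats_db | mysql`, where stats_db stands in for the real schema name.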
Increased the disk space. (This was not a solution; it only delayed the problem until the next defragmentation.)
Executed the following command to defragment the affected tables:
OPTIMIZE [NO_WRITE_TO_BINLOG | LOCAL] TABLE tbl_name [, tbl_name] ...
But this approach has the following issues:
- OPTIMIZE TABLE is a heavy operation.
- While it is running, the affected tables remain inaccessible.
- So it has to be run during off hours, when nobody is using the system.
- It also has to be repeated at regular intervals.
Solution to the above issues:
- Implemented a Java program that runs the optimization command for the database tables listed in a CSV file.
- Divided the heavy tables into two sets of roughly equal total size.
- Scheduled the Java program to process one set of tables on one weekend and the other set on the next. The program thus runs early in the morning every weekend and optimizes one set of tables. In other words, every statistics table gets optimized once every two weeks.
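The splitting and alternating logic can be sketched as follows. This is an illustration in shell rather than the original Java code, and the CSV layout (one `table,set` pair per line, with sets named A and B), the function names, and the greedy balancing rule are all assumptions made for the example.

```shell
# Read "table size" lines on stdin and print "table,set" lines, balancing the
# two sets greedily: largest tables first, each one goes to the lighter set.
assign_sets() {
    sort -k2,2nr | awk '
        BEGIN { a = 0; b = 0 }
        {
            if (a <= b) { a += $2; s = "A" } else { b += $2; s = "B" }
            print $1 "," s
        }'
}

# Return "A" or "B" for a Unix timestamp; the answer flips every 7 days.
set_for_epoch() {
    if [ $(( $1 / 604800 % 2 )) -eq 0 ]; then echo A; else echo B; fi
}

# Print the table names that belong to set $2 in CSV file $1.
tables_in_set() {
    awk -F, -v want="$2" '$2 == want { print $1 }' "$1"
}
```

A weekly job could then pick its tables with `tables_in_set tables.csv "$(set_for_epoch "$(date +%s)")"`. Counting weeks from the Unix epoch, instead of using the calendar week number, keeps the alternation intact across year boundaries.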
This approach fixed most of the out-of-space issues, but more problems followed.
Optimizing, defragmenting, or repairing a table requires free space on disk. If an optimization fails midway, it may corrupt the table, which then needs to be repaired by running
We hit this on our production server: a big table (40+ GB of data and a 10 GB index) got corrupted when its optimization failed. Optimizing and repairing a table use the system's temporary space to rebuild the table's indexes, and the temp space (the /tmp folder, which was a mounted drive) was just 9 GB, so the operation ran out of space.
The temp space required for a table can be calculated as:
(largest_key + row_pointer_length) * number_of_rows * 2
See http://dev.mysql.com/doc/refman/5.1/en/myisamchk-memory.html for more details
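As a worked example of the formula, here is a small shell sketch. The input values (a 20-byte largest key, a 6-byte row pointer, and 300 million rows) are invented for illustration and are not the figures of our production table.

```shell
# Temp space in bytes = (largest_key + row_pointer_length) * number_of_rows * 2
# $1 = largest key length in bytes
# $2 = row pointer length in bytes
# $3 = number of rows
tmp_space_bytes() {
    echo $(( ($1 + $2) * $3 * 2 ))
}

tmp_space_bytes 20 6 300000000   # prints 15600000000
```

With these made-up numbers the index rebuild alone would need about 15.6 GB of temporary space, which is more than a 9 GB /tmp can hold.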
Also, optimizing a table requires free space equal to the table's size on the disk where the table resides.
To fix the issue, we increased the temp space by extending the tmp directory onto another drive that has more space, as follows:
- Create a folder say
- Give permission on this folder so that MySQL can write temporary files to it.
- Edit the MySQL configuration file /etc/my.cnf and add following line
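The resulting value can be checked from the MySQL client:
mysql> show variables like "tmpdir";
+---------------+---------------------------+
| Variable_name | Value                     |
+---------------+---------------------------+
| tmpdir        | /tmp/:/var/lib/mysql/tmp/ |
+---------------+---------------------------+
1 row in set (0.00 sec)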
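The steps above can be sketched in shell. The folder /var/lib/mysql/tmp is an assumption read off the tmpdir value shown above, and the mysql:mysql owner, the 750 mode, and both function names are likewise hypothetical. On Unix, tmpdir accepts a colon-separated list of directories, and the setting belongs in the [mysqld] section of my.cnf.

```shell
# Create a directory that MySQL can use for temporary files.
# $1 = directory, e.g. /var/lib/mysql/tmp (assumed from the tmpdir value above)
# $2 = owner, e.g. mysql:mysql (assumption; chown is skipped when empty)
make_mysql_tmpdir() {
    dir=$1
    owner=${2:-}
    mkdir -p "$dir" || return 1
    chmod 750 "$dir" || return 1
    if [ -n "$owner" ]; then
        chown "$owner" "$dir" || return 1
    fi
}

# Print the my.cnf line for the given temp directories, joined with ":".
tmpdir_setting() {
    ( IFS=:; printf 'tmpdir=%s\n' "$*" )
}
```

For example, `make_mysql_tmpdir /var/lib/mysql/tmp mysql:mysql` (run as root) prepares the folder, and `tmpdir_setting /tmp/ /var/lib/mysql/tmp/` prints `tmpdir=/tmp/:/var/lib/mysql/tmp/`. The server has to be restarted to pick up the new value.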
We also updated the Java program to check the available disk space before optimizing a table. If there is not enough free space to optimize the table safely, the program skips that table and notifies the administrator about the error. The program also sends an optimization status report by email, including the tables optimized, the space recovered, and so on.
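The free-space check described above can be sketched like this. It is a shell illustration, not the Java code; the function names are hypothetical, and the caller is assumed to already know the table size in KB and the directory whose free space matters.

```shell
# Print the free space, in KB, of the filesystem that holds directory $1.
free_kb() {
    df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

# Succeed if the filesystem holding $2 has at least $1 KB free.
can_optimize() {
    [ "$(free_kb "$2")" -ge "$1" ]
}

# Print a decision line for one table; a non-zero status means "skip and notify".
# $1 = table name, $2 = table size in KB, $3 = directory to check
check_table() {
    if can_optimize "$2" "$3"; then
        echo "OPTIMIZE $1"
    else
        echo "SKIP $1: needs $2 KB free in $3"
        return 1
    fi
}
```

The SKIP lines can be collected into the status report that goes to the administrator. The same check can be run against both the data directory and the temp directory, since both need free space.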
After the above, 95% of the issues were fixed, but some remained. Since optimization is a heavy operation, it took 4 hours on average. We also saw swap memory being used on the system even when nobody was using the application. Admittedly, this is a very memory-intensive application, as it has to process a lot of simulation data in memory. But we noticed that after 2-3 months the system started using a lot of swap memory, which led to slow performance. After analysis, we found that the Java program did not release its memory for a long time and that the garbage collector did not behave well for such a long-running program. It also put a lot of load on the CPU.
We fixed this by replacing the Java program with a shell script, which now does everything the Java program used to do. It works great, and we never faced the out-of-space issue again for the same set of data.
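The script itself is not reproduced here. The following is only a minimal sketch of what the core loop of such a script might look like; the function name and the status lines are assumptions. It reads the rows that OPTIMIZE TABLE returns (Table, Op, Msg_type, Msg_text) and treats any row with Msg_type "error" as a failure.

```shell
# Optimize every table named on stdin (one per line) in database $1.
# Prints one status line per table; returns non-zero if any table failed.
optimize_tables() {
    db=$1
    failed=0
    while IFS= read -r table; do
        [ -n "$table" ] || continue    # skip blank lines
        # -N -B gives tab-separated rows without a header; stdin is redirected
        # so that the client cannot consume the list of table names.
        if out=$(mysql -N -B "$db" -e "OPTIMIZE TABLE \`$table\`;" </dev/null) &&
           printf '%s\n' "$out" |
               awk -F'\t' '$3 == "error" { bad = 1 } END { exit bad + 0 }'
        then
            echo "OK $table"
        else
            echo "FAILED $table"
            failed=1
        fi
    done
    return "$failed"
}
```

A script built around this function can be started by cron early on a weekend morning, with the table list for that week piped in. Because each run is a short-lived process, nothing stays resident in memory between runs.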