Strange JSP Compilation Issue
We were getting the following error on the production server (Linux, JBoss). We were in the middle of user acceptance testing with the client; many users were accessing the application to simulate various scenarios in a live session.
The application was crashing within 2-3 minutes of startup with the error below, and no user was able to access it. We were expected to fix the issue ASAP, as the users were assembled for the web session, all waiting for the server to come up.
ERROR [org.apache.jasper.compiler.Compiler] Compilation error
java.io.FileNotFoundException: /home/user/jboss/server/user/work/jboss.web/localhost/user/org/apache/jsp/jsp/common/header_jsp.java (Too many open files)
    at java.io.FileInputStream.open(Native Method)
    at java.io.FileInputStream.<init>(FileInputStream.java:106)
    at java.io.FileInputStream.<init>(FileInputStream.java:66)
    at org.apache.jasper.compiler.JDTCompiler$1CompilationUnit.getContents(JDTCompiler.java:103)
Solving the Chaos!
We followed the approach and steps below:
Recorded the logs.
Server reboot – this was the quickest option, so we rebooted the server.
But this option did not give us much time to analyze the recorded logs; the application crashed again after 5 minutes, as soon as the number of concurrent users increased.
And once again we noticed the same exception in the logs: “Too many open files”. Actually, the exception message was misleading.
Then we borrowed some time from the users and continued the analysis.
We had the following points in mind:
Files/resources were not being closed properly – I was sure that no file-handling or resource-handling code had been introduced in that release, and I/O handling was correct in previous releases. We had not faced such an issue before; if there were a coding issue, we would have encountered it earlier.
The next thing that came to mind: what is the allowed limit on the number of open files? A number of JSP files had been introduced in the release, and it was possible that the compiler or the OS allowed only a certain number of open files at a time.
We started analyzing; the problem looked more related to the platform (OS, hardware, …), as this was the only difference between the QA and production servers. The production server was owned by the client.
We found the following:
Run the following command to list the files currently in use:
ls -l /proc/[jbosspid]/fd
Replace [jbosspid] with the JBoss process id; get the pid with:
ps -ef | grep java
[user@system ~]$ ps -ef | grep java
user 23096 23094 0 Apr27 ?     00:04:55 java -Xms3072m -Xmx3072m -jar /opt/user/app/deploy/app-server.jar
user 31090 29459 0 12:16 pts/2 00:00:00 grep java
Locate the process id of the JBoss process; in the output above it is 23096. Then run the next command to get the list of file descriptors currently in use:
[user@system ~]$ ls -l /proc/23096/fd
total 0
lr-x------ 1 remate remate 64 May 1 12:13 0 -> /dev/null
l-wx------ 1 remate remate 64 May 1 12:13 1 -> /opt/user/jboss/server/user/log/console.log
lr-x------ 1 remate remate 64 May 1 12:13 10 -> /opt/user/jboss/lib/endorsed/jbossws-native-saaj.jar
lr-x------ 1 remate remate 64 May 1 12:13 101 -> /usr/share/fonts/liberation/LiberationSerif-Italic.ttf
lr-x------ 1 remate remate 64 May 1 12:13 102 -> /usr/share/fonts/dejavu-lgc/DejaVuLGCSans-BoldOblique.ttf
lrwx------ 1 remate remate 64 May 1 12:13 105 -> socket:[33496164
.. ..
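Rather than eyeballing the listing, it can help to simply count the descriptors and compare the count against the per-process limit. A minimal sketch, using this shell's own PID (`$$`) as a stand-in for the JBoss PID:

```shell
# Count open file descriptors for a process and compare to its limit.
# $$ is this shell's own PID; substitute the JBoss PID in practice.
pid=$$
open_fds=$(ls /proc/$pid/fd | wc -l)
limit=$(ulimit -n)
echo "process $pid holds $open_fds open fds (limit: $limit)"
```

Watching this count climb while users hit the application makes a leak (or, as in our case, legitimately high demand) obvious.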
We analyzed the results of the above commands and did not notice any abnormality.
- System-level file descriptor limit. Run the following command:
[user@system ~]$ cat /proc/sys/fs/file-max
743813
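Alongside file-max, the kernel also reports current system-wide usage in /proc/sys/fs/file-nr; a quick sketch to put the two side by side:

```shell
# /proc/sys/fs/file-nr holds three numbers: allocated file handles,
# free allocated handles, and the system-wide maximum (file-max).
cat /proc/sys/fs/file-nr
# Print usage as "in use of maximum" (allocated minus free):
awk '{ printf "in use: %d of %d\n", $1 - $2, $3 }' /proc/sys/fs/file-nr
```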
It was already quite large; we did not find any need to change it. Anyway, it can be changed by adding ‘fs.file-max = 1000000’ to /etc/sysctl.conf, where ‘1000000’ is the limit you want to set.
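For reference, this change can also be applied at runtime without a reboot (root required); a sketch:

```shell
# Show the current system-wide limit (read-only, no root needed):
cat /proc/sys/fs/file-max
# To raise it immediately, one would run (as root):
#   sysctl -w fs.file-max=1000000
# and persist it across reboots by adding the line
#   fs.file-max = 1000000
# to /etc/sysctl.conf, then reloading with:
#   sysctl -p
```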
- User-level file descriptor limit. Run the following command:
[user@system ~]$ ulimit -n
1024
‘1024’ file descriptors per process for a user could be a low value for a concurrent JSP application, where a lot of JSPs need to be compiled initially. From our analysis, we found that the number of open FDs was close to the 1024 limit.
We changed this limit by appending the following lines to /etc/security/limits.conf:
user soft nofile 16384
user hard nofile 16384
and then checked again with ulimit -n:
[user@system ~]$ ulimit -n
16384
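One caveat worth noting: limits.conf changes apply only to new login sessions, so a JBoss instance that is already running keeps its old limit until it is restarted from a fresh session. The limit a live process actually has can be read from /proc; a sketch using this shell's own PID (`$$`) in place of the JBoss PID:

```shell
# "Max open files" here is the limit the process was started with,
# regardless of what ulimit -n reports in your current shell.
grep 'Max open files' /proc/$$/limits
```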
It looked like we had solved it.
We requested our client to apply the above changes on the production server. After this change we never faced the issue again. We were able to solve the issue in 20 minutes, and in another 10 minutes (taken by the client's IT process) the server was up again; the live session went smoothly and the UAT went well.