Over the last four days at work I have learned a lot about operating systems, networks and Java, and also that a problem can be very real even when its source is something you would never expect.
The source of my problem was that, for whatever insane reason, the loopback device on the production server was dead.
The symptoms were not that funky and, looking back, should have pointed me to the cause quickly, if only I had known such a problem could exist:
# my program hung on startup
# kill -3 gave me a stack trace, and the source code told me that Tomcat was simply trying to create a java.net.ServerSocket at that point. There’s not much that can go wrong there, hmmm? This one-line method should have given me the first hint that it was not a “Java issue”:
at java.net.PlainSocketImpl.initProto(Native Method)
# somebody else found out that Tomcat eventually got a timeout at that point
# after the timeout Tomcat proceeded to run without further trouble
# stopping Tomcat took just as long as starting it
# tcpdump told me that there was not really any traffic on that port on eth0, which finally led me to realize that this might not be the correct interface ...
# then everything resolved quickly: ifup lo reported “device already configured” while ifconfig didn’t show the device. The sysadmin, who had told me “the OS is healthy, must be a java issue”, agreed that it wasn’t good to have no loopback device
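For next time, here is a minimal sketch of how the same two checks could be done from inside the JVM: ask Java which interfaces it sees and whether a loopback one is up, then time how long it takes to open and close a ServerSocket. The class and method names (LoopbackCheck, verdict, checkLoopback, timeServerSocketMillis) are made up for this post, and it assumes Java 7 or later, which is newer than what I was running. Whether a dead loopback device is reported as "down" or simply not listed at all probably depends on the platform, so the sketch treats both cases as bad news. I have not verified that it reproduces the hang described above.

```java
import java.io.IOException;
import java.net.NetworkInterface;
import java.net.ServerSocket;
import java.net.SocketException;
import java.util.Enumeration;

public class LoopbackCheck {

    /** Pure helper: turns the two observations into a human-readable verdict. */
    static String verdict(boolean found, boolean up) {
        if (!found) {
            return "no loopback interface found";
        }
        if (!up) {
            return "loopback interface is down";
        }
        return "loopback interface is up";
    }

    /** Looks at the network interfaces as the JVM sees them. */
    static String checkLoopback() throws SocketException {
        boolean found = false;
        boolean up = false;
        Enumeration<NetworkInterface> nics = NetworkInterface.getNetworkInterfaces();
        // getNetworkInterfaces() may return null when no interface is found.
        while (nics != null && nics.hasMoreElements()) {
            NetworkInterface nic = nics.nextElement();
            if (nic.isLoopback()) {
                found = true;
                up = up || nic.isUp();
            }
        }
        return verdict(found, up);
    }

    /** Measures how long opening and closing a ServerSocket takes, in milliseconds. */
    static long timeServerSocketMillis() throws IOException {
        long start = System.nanoTime();
        // Port 0 lets the OS pick any free port; nothing ever connects to it.
        try (ServerSocket socket = new ServerSocket(0)) {
            // Opened only to be timed; closed immediately by try-with-resources.
        }
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(checkLoopback());
        System.out.println("ServerSocket open/close took "
                + timeServerSocketMillis() + " ms");
    }
}
```

On a healthy machine the second number should be tiny. If it is anywhere near the length of a timeout, the network setup is a better suspect than the application.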
Now my Tomcat starts as fast as it did before last Wednesday, and I wish I could catch the program that was responsible for sending the lo device into nirvana. By the way, this only happened when I was trying to run Tomcat as a standalone server with the org.apache.coyote.tomcat5.CoyoteConnector instead of plugging it into Apache.