Solr, Jetty, and daemons: debugging jetty.sh
Jaza, GreenAsh, 2011-02-10
https://greenash.net.au/thoughts/2011/02/solr-jetty-and-daemons-debugging-jettysh/

I recently added a Solr-powered search feature to this site (using django-haystack). Rather than go to the trouble (and the drain on server resources) of deploying Solr via Tomcat, I decided to deploy it via Jetty instead. There's a wiki page with detailed instructions for deploying Solr with Jetty, which also includes a link to the jetty.sh startup script.

The instructions seem simple enough. However, I ran into some serious problems when trying to get the startup script to work. The standard java -jar start.jar was working fine for me. But after following the instructions to the letter, and after double-checking everything, a call to:

sudo /etc/init.d/jetty start

still resulted in my getting the (incredibly unhelpful) error message:

Starting Jetty: FAILED

My server is running Ubuntu Jaunty (9.04), and in my experience, the start-stop-daemon command in jetty.sh doesn't work on that platform. Let me know if you've experienced the same or similar issues on other *nix flavours or on other Ubuntu versions. Your mileage may vary.

When Jetty fails to start, it doesn't log the details of the failure anywhere. So, in attempting to nail down the problem, I had no choice but to open up the jetty.sh script, and to get my hands dirty with some old-skool debugging. It didn't take me too long to figure out which part of the script I should be concentrating my efforts on: lines 397-425.

##################################################
# Do the action
##################################################
case "$ACTION" in
  start)
    echo -n "Starting Jetty: "

    if (( NO_START )); then
      echo "Not starting jetty - NO_START=1";
      exit
    fi

    if type start-stop-daemon > /dev/null 2>&1
    then
      unset CH_USER
      if [ -n "$JETTY_USER" ]
      then
        CH_USER="-c$JETTY_USER"
      fi
      if start-stop-daemon -S -p"$JETTY_PID" $CH_USER -d"$JETTY_HOME" -b -m -a "$JAVA" -- "${RUN_ARGS[@]}" --daemon
      then
        sleep 1
        if running "$JETTY_PID"
        then
          echo "OK"
        else
          echo "FAILED"
        fi
      fi

To be specific, the line with if start-stop-daemon … (line 416) was clearly where the problem lay for me. So, I decided to see exactly what this command looks like (after all the variables have been substituted), by adding a line to the script that echoes it:

echo start-stop-daemon -S -p"$JETTY_PID" $CH_USER -d"$JETTY_HOME" -b -m -a "$JAVA" -- "${RUN_ARGS[@]}" --daemon

And the result of that debugging statement looked something like:

start-stop-daemon -S -p/var/run/jetty.pid -cjetty -d/path/to/solr -b -m -a /usr/bin/java -- -Dsolr.solr.home=/path/to/solr/solr -Djetty.logs=/path/to/solr/logs -Djetty.home=/path/to/solr -Djava.io.tmpdir=/tmp -jar /path/to/solr/start.jar --daemon
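One caveat with a plain echo is that it drops the quoting, so an argument containing a space would no longer paste back into a terminal as a single argument. The paths in this example contain no spaces, so nothing is lost here. Still, a minimal sketch of a quote-preserving alternative might look like this; quote_args is a hypothetical helper name, not something from jetty.sh, and the values in the example call are placeholders:

```shell
# Hypothetical helper (not part of jetty.sh): print a command line with every
# argument wrapped in single quotes, so it can be pasted back into a terminal.
quote_args() {
  sep=''
  for arg in "$@"; do
    # Escape embedded single quotes as '\'' before wrapping the argument.
    escaped=$(printf '%s' "$arg" | sed "s/'/'\\\\''/g")
    printf "%s'%s'" "$sep" "$escaped"
    sep=' '
  done
  printf '\n'
}

# Example with placeholder values; note the argument containing a space.
quote_args start-stop-daemon -S -p/var/run/jetty.pid -a /usr/bin/java -- -jar "/path with space/start.jar"
```

In the script itself, the debugging line would then call quote_args with the same arguments as the start-stop-daemon line, instead of echo.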

That's a good start. Now, I have a command that I can try to run manually myself, as a debugging test. So, I took the above statement, pasted it into my terminal, and whacked a sudo in front of it:

sudo start-stop-daemon -S -p/var/run/jetty.pid -cjetty -d/path/to/solr -b -m -a /usr/bin/java -- -Dsolr.solr.home=/path/to/solr/solr -Djetty.logs=/path/to/solr/logs -Djetty.home=/path/to/solr -Djava.io.tmpdir=/tmp -jar /path/to/solr/start.jar --daemon

Well, that didn't give me any error messages; but then again, no positive feedback, either. To see if this command was successful in launching the Jetty daemon, I tried:

ps aux | grep java

But all that resulted in was:

myuser      3710  0.0  0.0   3048   796 pts/0    S+   19:35   0:00 grep java

That is, the command failed to launch the daemon.
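The only match in that output is the grep command itself. A check that avoids this noise is to test the PID file directly, similar in spirit to the running "$JETTY_PID" call in the script excerpt above (whose implementation isn't shown here). A minimal sketch, assuming the PID file holds a single process ID; pid_file_running is a hypothetical name:

```shell
# Hypothetical helper: succeed only if the PID recorded in the given file
# belongs to a live process that we are allowed to signal.
pid_file_running() {
  pid_file=$1
  [ -f "$pid_file" ] || return 1
  pid=$(cat "$pid_file")
  [ -n "$pid" ] || return 1
  # Signal 0 sends nothing; it only checks that the process can be signalled.
  # It also fails for another user's process unless run as root.
  kill -0 "$pid" 2>/dev/null
}

# Example, using the PID file path from the command above:
if pid_file_running /var/run/jetty.pid; then
  echo "Jetty is running"
else
  echo "Jetty is not running"
fi
```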

Next, I decided to investigate the man page for the start-stop-daemon command. I'm no sysadmin or Unix guru — I've never dealt with this command before, and I have no idea what its options are.

When I have a Unix command that doesn't work, and that doesn't output or log any useful information about the failure, the first thing I look for is a "verbose" option. And it just so turns out that start-stop-daemon has a -v option. So, next step for me was to add that option and try again:

sudo start-stop-daemon -S -p/var/run/jetty.pid -cjetty -d/path/to/solr -v -b -m -a /usr/bin/java -- -Dsolr.solr.home=/path/to/solr/solr -Djetty.logs=/path/to/solr/logs -Djetty.home=/path/to/solr -Djava.io.tmpdir=/tmp -jar /path/to/solr/start.jar --daemon

Unfortunately, no cigar; the result of running that was exactly the same. Still absolutely no output (so much for verbose mode!), and ps aux showed the daemon had not launched.

Next, I decided to read up (in the man page) on the various options that the script was using with the start-stop-daemon command. Turns out that the -b option is rather a problematic one — as the manual says:

Typically used with programs that don't detach on their own. This option will force start-stop-daemon to fork before starting the process, and force it into the background. WARNING: start-stop-daemon cannot check the exit status if the process fails to execute for any reason. This is a last resort, and is only meant for programs that either make no sense forking on their own, or where it's not feasible to add the code for them to do this themselves.

Ouch — that sounds suspicious. Ergo, next step: remove that option, and try again:

sudo start-stop-daemon -S -p/var/run/jetty.pid -cjetty -d/path/to/solr -v -m -a /usr/bin/java -- -Dsolr.solr.home=/path/to/solr/solr -Djetty.logs=/path/to/solr/logs -Djetty.home=/path/to/solr -Djava.io.tmpdir=/tmp -jar /path/to/solr/start.jar --daemon

Running that command resulted in me seeing a fairly long Java exception report, the main line of which was:

java.io.FileNotFoundException: /path/to/solr/--daemon (No such file or directory)

Great: removing the -b option meant that I was finally able to see the error that was occurring. And… it seems that the --daemon option is being treated as a file name and tacked onto the Solr path.
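The man page warning about -b can be illustrated with plain shell, without start-stop-daemon at all. This is only an analogy, and run_foreground and run_background are hypothetical names: a command launched in the background reports success to its caller the moment it is launched, whatever happens to it afterwards, and if its output is discarded the error message vanishes too.

```shell
# Analogy only: neither function calls start-stop-daemon.

# Foreground: the caller sees the command's real exit status and its output.
run_foreground() {
  "$@"
}

# Background: the launch returns 0 immediately; output is thrown away.
run_background() {
  "$@" >/dev/null 2>&1 &
}

# The same failing command, run both ways.
run_foreground ls /no/such/dir || echo "foreground: failed with status $?"
run_background ls /no/such/dir && echo "background: launch reported success"
wait
```

The foreground call prints the error from ls and a non-zero status, while the background call looks like a success even though the same command failed.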

I decided that this might be a good time to read up on what exactly the --daemon option is. And as it turns out, the start-stop-daemon command has no such option. No wonder it wasn't working! (There's no such option in the java command-line app either, nor in any other standard *nix util that I was able to find.)

I have no idea what this option is doing in the jetty.sh script. Perhaps it's available on some other *nix variants? Anyway, doesn't seem to be recognised at all on Ubuntu. Any info that may shed some light on this mystery would be greatly appreciated, if there are any start-stop-daemon experts out there.
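One detail that may explain part of the mystery: start-stop-daemon hands everything after the bare -- to the program it launches, so --daemon is never parsed by start-stop-daemon itself. Here it lands after -jar /path/to/solr/start.jar, which makes it an argument to start.jar, and the FileNotFoundException suggests that this version of start.jar tried to open it as a file. The sketch below only mimics that argument splitting in plain shell; program_args is a hypothetical helper, not a real tool.

```shell
# Hypothetical helper: print, one per line, the arguments that follow the
# first bare "--", i.e. the ones passed through to the launched program.
program_args() {
  while [ "$#" -gt 0 ]; do
    if [ "$1" = "--" ]; then
      shift
      break
    fi
    shift
  done
  for arg in "$@"; do
    printf '%s\n' "$arg"
  done
}

# A shortened version of the command from the script:
program_args -S -p/var/run/jetty.pid -b -m -a /usr/bin/java -- \
  -Djetty.home=/path/to/solr -jar /path/to/solr/start.jar --daemon
```

The last line printed is --daemon, which is what java (and in turn start.jar) receives as its final argument.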

Next step: remove the --daemon option, re-add the -b option, remove the -v option, and try again:

sudo start-stop-daemon -S -p/var/run/jetty.pid -cjetty -d/path/to/solr -b -m -a /usr/bin/java -- -Dsolr.solr.home=/path/to/solr/solr -Djetty.logs=/path/to/solr/logs -Djetty.home=/path/to/solr -Djava.io.tmpdir=/tmp -jar /path/to/solr/start.jar

And… success! Running that command resulted in no output; and when I tried a quick ps aux | grep java, I could see the daemon running:

myuser      3801 75.7  1.9 1069776 68980 ?       Sl   19:57   0:03 /usr/bin/java -Dsolr.solr.home=/path/to/solr/solr -Djetty.logs=/path/to/solr/logs -Djetty.home=/path/to/solr -Djava.io.tmpdir=/tmp -jar /path/to/solr/start.jar
myuser      3828  0.0  0.0   3048   796 pts/0    S+   19:57   0:00 grep java

Now that I'd successfully managed to launch the daemon with a manual terminal command, all that remained was to modify the jetty.sh script, and to do some integration testing. So, I removed the --daemon option from the relevant line of the script (line 416), and I tried:

sudo /etc/init.d/jetty start

And it worked. That command gave me the output:

Starting Jetty: OK

And a call to ps aux | grep java was also able to verify that the daemon was running.

Just one final step left in testing: restart the server (assuming that the Jetty startup script was added to Ubuntu's startup list at some point, manually or using update-rc.d), and see if Jetty is running. So, I restarted (sudo reboot), and… bup-bummmmm. No good. A call to ps aux | grep java showed that Jetty had not launched automatically after restart.

I remembered the discovery I'd made earlier, that the -b option is "dangerous". So, I removed this option from the relevant line of the script (line 416), and restarted the server again.

And, at long last, it worked! After restarting, a call to ps aux | grep java verified that the daemon was running. Apparently, Ubuntu doesn't like its startup daemons forking as background processes; this seems to result in things not working.

However, there is one lingering caveat. With this final solution — i.e. both the --daemon and the -b options removed from the start-stop-daemon call in the script — the daemon launches just fine after restarting the server. However, with this solution, if the daemon stops for some reason, and you need to manually invoke:

sudo /etc/init.d/jetty start

then the daemon will effectively be running as a terminal process, not as a daemon process. This means that if you close your terminal session, or if you press Ctrl+C, the process will end. Not exactly what init.d scripts are designed for! So, if you do need to manually start Jetty for some reason, you'll have to use another version of the script that keeps the -b option (adding an ampersand, i.e. the & symbol, to the end of the command should also do the trick, although that's not 100% reliable).
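For the manual case, one slightly sturdier variant of the ampersand trick is to combine it with nohup, so that the process ignores the hangup signal sent when the terminal closes. This is a generic shell sketch, not something jetty.sh does, and start_detached is a hypothetical name:

```shell
# Hypothetical helper: launch a command in the background, immune to SIGHUP,
# with its output discarded, and print the PID of the new process.
start_detached() {
  nohup "$@" >/dev/null 2>&1 &
  printf '%s\n' "$!"
}

# Example with a harmless placeholder command:
pid=$(start_detached sleep 2)
echo "started process $pid"
```

Output is discarded here; redirecting to a log file instead of /dev/null would keep it around for later debugging.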

So, that's the long and winding story of my recent trials and tribulations with Solr, Jetty, and start-stop-daemon. If you're experiencing similar problems, hope this explanation is of use to you.
