The Current installation#

Our live public installation of NewPred and Bioserf runs the following services:

PSIPRED
GenTHREADER
pGenTHREADER
pDomTHREADER
DOMPRED
DISOPRED
METSITE
HSPred
MemPack
MEMSAT-SVM
FFPRED

Before you go any further, you need to know a little about the server architecture. The server runs across 8 machines. The frontend machine hosts bioinf4 and bioinfdev, and there are 6 backend machines, bios1 through bios6. bioinfdev hosts a private installation of the frontend and backend software for development purposes only. It is only visible within the department, and you probably have no need to touch it. bioinf4 hosts the frontend Rails software and the queue daemon. The 6 bios machines host the backend NewPredServer software, which runs the jobs, and each also runs an instance of the fragfold client. The MySQL database is hosted on bioinf5.
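If it helps to keep the layout straight, the host-to-role mapping can be written down as a small lookup table. This is only an illustrative sketch: the host names and roles come from the paragraph above, but the dictionary layout and the hosts_with_role helper are made up for this example and do not exist anywhere in the server code.

```python
# Host names and roles as described in the text; the structure is illustrative only.
HOST_ROLES = {
    "bioinf4": ["rails frontend", "queue daemon"],
    "bioinfdev": ["development frontend", "development backend"],
    "bioinf5": ["mysql database"],
}

# The six backend machines, bios1 to bios6, all carry the same two roles.
for n in range(1, 7):
    HOST_ROLES["bios%d" % n] = ["NewPredServer backend", "fragfold client"]


def hosts_with_role(role):
    """Return the sorted host names that carry the given role (hypothetical helper)."""
    return sorted(host for host, roles in HOST_ROLES.items() if role in roles)
```

For example, hosts_with_role("NewPredServer backend") returns the six bios machines.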


The Job Queue#

If you want to see what jobs are currently in the queue to be processed by the backend, go to http://bioinf4.cs.ucl.ac.uk/bio_serf. This gives you an overview of the currently running jobs. Most of what's in the columns is self-explanatory, but the State and Id columns are probably the most interesting. The Id column gives you a link to the backend process output; from there you can cancel jobs (not really something that you should need to do). The State column lists what the whole job is currently doing. There are 6 status codes:

-1: The server can not find a job of the name you specified in the rails controller
0: Unknown
1: JobRunning
2: JobPause
3: JobError
4: JobComplete
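When scripting against the queue it can be handy to have these codes as named constants. The following is a minimal sketch: the numeric values are the ones listed above, but the Python names and the describe_state helper are assumptions made for this example, not identifiers from the NewPred code.

```python
from enum import IntEnum


class JobState(IntEnum):
    """Status codes from the State column; the member names are illustrative."""
    NOT_FOUND = -1  # no job of the name given in the rails controller
    UNKNOWN = 0     # queued, not yet assigned to a backend
    RUNNING = 1
    PAUSED = 2
    ERROR = 3
    COMPLETE = 4


def describe_state(code):
    """Map a numeric status code to a readable name (hypothetical helper)."""
    try:
        return JobState(code).name
    except ValueError:
        # Anything outside -1..4 is not a code the queue is documented to use.
        return "UNRECOGNISED(%d)" % code
```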

Jobs arrive in the queue with a 0 status, but once they are assigned to a backend server they should switch to a 1. Jobs should never sit with a 0 status and stack up, because that means they are not being allocated to backend machines. If that's the case then something somewhere in the frontend has probably crashed (usually the runner daemon, more on this below). A 1 status means that the backend server that was assigned the job is happily getting on with processing it. A 2 status is rarely seen unless you wrote a bit of code that paused the job. A 3 status has many potential meanings; most commonly something caused the backend to throw an exception and terminate the job, and most frequently these are bugs in the backend code, possibly in handling the job-running threads. A 4 status means that the job completed successfully and the results were mailed to the user (joy!).
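The "jobs stacking up at 0" symptom is easy to check for mechanically. Here is a rough sketch of the idea, assuming each job is available as a dictionary with id, state and submitted_at keys. Those field names and the 10 minute threshold are assumptions for illustration; the real queue table may name and time things differently.

```python
from datetime import datetime, timedelta


def find_stuck_jobs(jobs, now, max_wait=timedelta(minutes=10)):
    """Return the ids of jobs that are still in state 0 after max_wait.

    jobs is a list of dicts with hypothetical keys: id, state, submitted_at.
    """
    return [
        job["id"]
        for job in jobs
        if job["state"] == 0 and now - job["submitted_at"] > max_wait
    ]
```

If this returns more than a handful of ids, the runner daemon is the first thing to look at.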

The program_X columns (program_psipred, program_dompred, etc.) show you which sub-jobs a seqJob or structJob process wants to run. A 1 in one of these columns indicates a job type that will be or has been run; a 0 is a job type that the user did not select. If the job ended in status 4 and all the program_X columns remain as 1s or 0s, this indicates that everything was successful. If the job ended with status 4 and any of the program_X columns contains a 3, this indicates that the higher-level marshalling of the job ran but one of the sub-job types threw an exception and did not complete; this will typically be down to problems with the executables or data sources that the backend/sub-job is trying to use.
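The rule for reading a finished job can be sketched as a small function. It assumes a queue row is available as a dictionary keyed by column name, with the overall status under a key called state; that key name and both helper names are assumptions for this example.

```python
def failed_subjobs(row):
    """Return the names of sub-jobs whose program_X column holds a 3."""
    prefix = "program_"
    return sorted(
        name[len(prefix):]
        for name, value in row.items()
        if name.startswith(prefix) and value == 3
    )


def summarise(row):
    """Describe a job row using the status 4 / program_X rule from the text."""
    failed = failed_subjobs(row)
    if row["state"] != 4:
        return "not complete (state %d)" % row["state"]
    if failed:
        return "complete, but failed sub-jobs: " + ", ".join(failed)
    return "complete"
```

A row with state 4, program_psipred 1 and program_dompred 3 is reported as complete with dompred failed, which is the cue to check the executables and data sources that sub-job uses.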


Job Configurations#

Each job type has a default configuration. There are really only 2 important job types: structJob and seqJob. You can view, edit and display them, and add configurations for new job types, at http://bioinf4.cs.ucl.ac.uk/configurations. You can also add new backend server locations. As a general rule you do not want to touch these unless you've chosen to move some executables or some data around. If you want to play with this functionality I would advise you to do so in the bioinfdev instance of the server and not the live server. Really, I implore you not to touch them: rogue spaces or a single missing '/' will cause services to fail.

That said, if a backend node dies for whatever reason, you need to take it out of the list of available backend nodes. Select edit for a given configuration; at the bottom you'll notice a selection list of servers. To reassign the available servers, set the Active drop-down to True, select only those bios machines you know to be available in the server list, then click Edit.


The Live Frontend#

To log in to the live frontend you must log in directly as the rails user (password in the usual location):

> ssh rails@bioinf4

The rails account is running 3 concurrent terminal sessions, each one running a process necessary for the web services. Do not log in as yourself and then su to the rails user; if you do this you will not get access to the live terminal sessions that are running the server. Once logged in you can view the currently running sessions by typing:

> screen -r

You should get some output that looks something like:

There are a couple of suitable screens on:
2460.frontend (Detached)
2520.runner (Detached)

This lists the process id for each live session and the name that's been assigned to it. The names should be reasonably self-explanatory: frontend is the session the Ruby on Rails frontend is running in, and runner is the session the Rails queue daemon is running in. The daemon handles frontend and backend to database communication; it outputs its running logs to runner_log and runner_errors. If you want to view one of those sessions, type something like:

> screen -r frontend

This command will attach you to that running session. To detach and return to the login session hit ctrl+a ctrl+d. Once in a session you can stop the running process with ctrl+c. You'll be left with a prompt in the directory the process was started in. Processes can be restarted from those specific directories as follows:

Frontend:
dir: /home/rails/NewPred/
> ./script/server --environment=production --port=80

Runner:
dir: /home/rails/
> ./script/runner --environment=development Job.poll_all_loop

However, the runner has a propensity to die for various reasons (mostly to do with mysql buffer floods), so there is a small wrapper script in /home/rails/ that restarts it when it dies. To start this:
dir: /home/rails
> ./runner_daemon.pl

Fragfold_server:
dir: /webdata/binaries/current/fragfold/Beta/RubyFragMan/lib
> ruby server_main.rb
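For anyone curious how a restart wrapper like the one used for the runner works in principle, here is a minimal sketch of the pattern. It is not the contents of runner_daemon.pl, which is not reproduced on this page; the function name, the delay and the max_restarts argument are all assumptions made for illustration.

```python
import subprocess
import time


def keep_alive(command, max_restarts=None, delay=5.0):
    """Run command and start it again every time it exits.

    command is an argument list as accepted by subprocess.call.
    With max_restarts=None the loop never ends, which is what a real
    wrapper wants; a number is only useful for trying the function out.
    Returns how many times the command was launched.
    """
    launches = 0
    while True:
        # Blocks until the child process dies or exits.
        subprocess.call(command)
        launches += 1
        if max_restarts is not None and launches > max_restarts:
            return launches
        # Short pause so a command that fails instantly does not spin the CPU.
        time.sleep(delay)
```

Calling keep_alive with the runner command as an argument list would reproduce the behaviour described above, restarting the runner each time it dies.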


The live backend#

You can additionally ssh to the backend nodes bios1, bios2, bios3, bios4, bios5 and bios6. Once there you can restart the backend server with the following command. Anything already running will be terminated and a new instance of the backend will be spawned. Use this to push out changes that you've made to the backend:

> sudo /root/start.bioserv

Incidentally, you can read what the backend has been spewing out in the bioinfd log (/var/log/bioinfd).
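Usually only the most recent output matters when something has just gone wrong. A small sketch of reading the last few lines of a log file follows; the tail helper is made up for this example and makes no assumption about the format of the bioinfd log beyond it being line-based text.

```python
from collections import deque


def tail(path, n=20):
    """Return the last n lines of a text file, without trailing newlines."""
    with open(path, errors="replace") as handle:
        # deque with maxlen keeps only the final n lines while streaming the file.
        return [line.rstrip("\n") for line in deque(handle, maxlen=n)]
```

On a backend node, tail("/var/log/bioinfd", 50) would give the last 50 lines, provided your account can read the file.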


Some other things worth knowing#

bioinf4 has a range of cron jobs running which build the tdb, clear the tmp directories and clear the rails public directories.

You should keep an eye on bioinf4's disk space.

The rails log can get quite intensely large.
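Both of these checks are easy to script. The following is a rough sketch using only the Python standard library; the helper names are assumptions for this example, and the path of the rails log is deliberately left as a parameter because this page does not say where it lives.

```python
import os
import shutil


def percent_used(path="/"):
    """Return how full the filesystem holding path is, as a percentage."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total


def size_mb(path):
    """Return the size of a single file (for example a log) in megabytes."""
    return os.path.getsize(path) / (1024.0 * 1024.0)
```

Run from cron, something like this could warn you when percent_used passes a threshold you choose, or when the rails log grows past a size you are comfortable with.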

This page (revision-1) was last changed on 15-Apr-2013 18:13 by UnknownAuthor