Opened 3 years ago

Last modified 2 years ago

#8006 assigned enhancement

Improve granularity of webservice poll timing — at Version 5

Reported by: Tom Goddard Owned by: Zach Pearson
Priority: moderate Milestone:
Component: Web Services Version:
Keywords: Cc:
Blocked By: Blocking:
Notify when closed: Platform: all
Project: ChimeraX

Description (last modified by Zach Pearson)

We use client-side polling to determine when webservice jobs are complete, but it's possible for the backend to advise the client on when the next poll should be.

Whether on the frontend or the backend, these decisions are currently made on a per-service basis, but they should be made at least per resource (e.g. PDB will need fast polling, while AlphaFold will be fine with less frequent polling), if not per job.
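As a rough illustration of the backend advising the client, here is a minimal sketch of a client loop that waits for however long the server suggests. The reply shape (a dict with "status" and an optional "next_poll" value in seconds) and the function name are assumptions made for this example, not the actual ChimeraX or Webservices API.

```python
import time
from typing import Callable, Optional


def poll_until_done(
    fetch_status: Callable[[], dict],
    sleep: Callable[[float], None] = time.sleep,
    default_delay: float = 1.0,
    max_polls: int = 1000,
) -> Optional[dict]:
    """Poll until the job reports "finished", waiting as the server advises.

    fetch_status is a hypothetical callable that performs one status
    request and returns the decoded reply as a dict.
    """
    for _ in range(max_polls):
        reply = fetch_status()
        if reply.get("status") == "finished":
            return reply
        # Hypothetical field: the server's advice, in seconds, on when to
        # check back. Fall back to a client-side default if it is absent.
        delay = reply.get("next_poll", default_delay)
        sleep(delay)
    # Gave up after max_polls requests without seeing a finished job.
    return None
```

Passing `sleep` as a parameter keeps the loop easy to exercise without real waiting, and the client-side default covers services that do not send any advice.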

Split off from #7725

Change History (5)

comment:1 by Zach Pearson, 3 years ago

Server-side polling can be improved by passing the job's parameters to the implementing module's PollTimer class. The PollTimer can then decide how to advise clients on when to check back in for results. There are generic fallback timers both in ChimeraX and in Webservices.
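To make the idea concrete, here is a sketch of what a per-job timer could look like. Only the name PollTimer comes from this ticket; the constructor, the `next_poll` method, the `BlastPollTimer` subclass, the "database" parameter, and all of the intervals are illustrative assumptions rather than the real interface.

```python
class PollTimer:
    """Generic fallback timer: advise a fixed interval for any job."""

    def __init__(self, params=None):
        # Job parameters as submitted by the client (hypothetical shape).
        self.params = params or {}

    def next_poll(self, elapsed: float) -> float:
        """Return the advised wait in seconds, given the job's age in seconds."""
        return 5.0


class BlastPollTimer(PollTimer):
    """Hypothetical per-module timer that looks at the job's parameters."""

    # Assumed for illustration: databases small enough to finish quickly.
    FAST_DATABASES = {"pdb"}

    def next_poll(self, elapsed: float) -> float:
        if self.params.get("database") in self.FAST_DATABASES:
            # Quick searches: poll often so results show up promptly.
            return 1.0
        # Larger databases: wait about a tenth of the job's age,
        # but never less than 5 s and never more than 30 s.
        return min(30.0, max(5.0, elapsed / 10))
```

With this shape, a module that defines no timer of its own would get the generic `PollTimer`, and a module that knows more about its jobs can override `next_poll`.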

So the question is just how often you think ChimeraX should ask the backend for results of AlphaFold, NR, and ESMFold jobs.

in reply to:  2 ; comment:2 by goddard@…, 3 years ago

We want the user to get results as soon as they are available. Polling is a poor way to achieve that. While we can probably estimate the time BLAST will take to run based on the size of the database, the estimate is not very reliable, more an order-of-magnitude estimate. The time will vary if we change the number of threads, the hardware, the OS, or the disks the database is on, or if we update the database or BLAST, and so on. So if the estimate is not reliable, either we poll a lot, like the once-per-second polling it does now (meaning 600 network requests for a 10-minute job), or we guess when to poll and end up delivering the results possibly minutes after they are available. It seems clear that polling is both more complex and poorer performing than simply having the client query the server in a separate thread that blocks until the server replies. But I don't know if that is viable, because I don't know how time-outs are handled by https requests. At any rate, it seems the choices are to use non-polling methods that offer good performance and are probably simpler too, or to just accept that it is going to require a ton of network requests to know when a job is done without much delay. Either approach seems fine since BLAST of large databases is probably rarely used in ChimeraX. It might be useful to quantify how many times ChimeraX BLAST is being run per month on large databases before proceeding.
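The trade-off between request count and delivery delay can be put into numbers with a small simulation. This is only a sketch: the function names and the backoff schedule (doubling from 1 second, capped at 60 seconds) are assumptions chosen for illustration, not anything ChimeraX currently does.

```python
import itertools


def capped_backoff(start=1.0, factor=2.0, cap=60.0):
    """Yield waits that grow by `factor` each time, never exceeding `cap`."""
    delay = start
    while True:
        yield delay
        delay = min(delay * factor, cap)


def simulate_polling(job_seconds, delays):
    """Return (polls, lag): requests sent and seconds the results sat unread.

    The client waits delays[0], polls, waits delays[1], polls, and so on;
    the first poll at or after job_seconds finds the results.
    """
    elapsed = 0.0
    polls = 0
    for delay in delays:
        elapsed += delay
        polls += 1
        if elapsed >= job_seconds:
            return polls, elapsed - job_seconds
    raise ValueError("polling schedule ended before the job finished")


# Once-per-second polling of a 10-minute job.
print(simulate_polling(600, itertools.repeat(1.0)))  # (600, 0.0)
# Doubling backoff capped at 60 s for the same job.
print(simulate_polling(600, capped_backoff()))       # (15, 3.0)
```

In this example the backoff schedule cuts 600 requests to 15 and happens to deliver only 3 seconds late, but with a 60-second cap the worst case is close to a full minute of delay, which is the unreliability described above.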

comment:3 by Zach Pearson, 3 years ago

I've added a method to the job API that lets programmers choose which of the input parameters they want to log; it goes on the same line that logs the job's ID and the requested service.

There's also now a line to log when a job's results are requested from the server, so we can know the lead time between a job finishing and its results being requested.

It doesn't really matter to me either way whether the client polls for results or not, but:

It seems clear that polling is both more complex and poorer performing than simply having the client query the server in a separate thread that blocks until the server replies.

I think this will also be blocking on the server. We have 20 threads available in production to take in requests for jobs and push them over to Redis. That is not many people at a time, and it's not inconceivable that 20 simultaneous users could hog every thread that should be used for taking in jobs, enqueueing them, and sending their results out. I have to read more about e.g. WebSockets to see how it can be done.

comment:4 by Zach Pearson, 3 years ago

Milestone: 1.6 → 1.7

comment:5 by Zach Pearson, 2 years ago

Description: modified (diff)
Milestone: 1.7