Difference between revisions of "HeavyJobs"

Revision as of 16:48, 11 April 2008


What (summary)

Manage long-running jobs on available compute resources (servers), using database tables to keep track of work and inter-process communication to keep track of workers.
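
The table-based bookkeeping described above could look roughly like the following. This is only a minimal sketch, not the actual HeavyJobs schema: the `chunks` table, its columns, the status values, and the function names are all assumptions made for illustration, and SQLite stands in for whatever database is really used. The inter-process communication side is not shown.

```python
import sqlite3


def create_schema(conn):
    # Hypothetical work table: one row per chunk of a long-running job.
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS chunks (
            id     INTEGER PRIMARY KEY,
            job    TEXT NOT NULL,
            status TEXT NOT NULL DEFAULT 'pending',
            worker TEXT
        )
        """
    )


def add_chunk(conn, job):
    """Insert a pending chunk for the given job and return its id."""
    with conn:  # commits on success, rolls back on error
        cur = conn.execute("INSERT INTO chunks (job) VALUES (?)", (job,))
    return cur.lastrowid


def claim_chunk(conn, worker):
    """Mark the oldest pending chunk as running for `worker`.

    Returns the chunk id, or None when nothing is pending. With several
    worker processes, a real deployment would need stronger locking than
    this single-connection sketch provides.
    """
    with conn:
        row = conn.execute(
            "SELECT id FROM chunks WHERE status = 'pending' ORDER BY id LIMIT 1"
        ).fetchone()
        if row is None:
            return None
        cur = conn.execute(
            "UPDATE chunks SET status = 'running', worker = ? "
            "WHERE id = ? AND status = 'pending'",
            (worker, row[0]),
        )
        # rowcount is 0 if another claimant changed the row in between.
        return row[0] if cur.rowcount == 1 else None


def finish_chunk(conn, chunk_id):
    """Record that a chunk has been fully processed."""
    with conn:
        conn.execute("UPDATE chunks SET status = 'done' WHERE id = ?", (chunk_id,))
```

Keeping the state in a table means a restarted worker or manager can rebuild its view of outstanding work from the database rather than from memory.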

Why this is important

We will use this infrastructure to manage our algorithmic data collection. This is a strategic direction for the company.

DoneDone

We will be satisfied with this infrastructure when:

  • we can launch, balance, and diagnose all steps of our pilot WHOIS refresh path:
    • fetchers
    • parsers
    • aggregators
  • we have startup scripts that will resume proper job processing after a machine reboot
  • we can monitor the overall health of all heavy job processing with Zabbix, including alerts to system administrators

Bugs and Todos

(non-prioritized at the moment)

  • Workers should finish partially completed chunks before starting new ones.
  • A worker should terminate when its manager has no more work for it.
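
The two items above can be sketched together as a worker loop. This is an illustration of the intended behavior under assumed names, not the existing HeavyJobs code: `Chunk`, `Manager`, `next_chunk`, and `run_worker` are hypothetical, and the "work" is a placeholder that simply marks the chunk complete.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    chunk_id: int
    total_items: int
    done_items: int = 0

    @property
    def is_complete(self):
        return self.done_items >= self.total_items

    @property
    def is_partial(self):
        # Started but not finished.
        return 0 < self.done_items < self.total_items


class Manager:
    """Hypothetical manager that hands out chunks to workers."""

    def __init__(self, chunks):
        self.chunks = list(chunks)

    def next_chunk(self):
        """Return a partial chunk if one exists, else an unstarted one, else None."""
        unfinished = [c for c in self.chunks if not c.is_complete]
        if not unfinished:
            return None
        # False sorts before True, so partially completed chunks come first;
        # ties are broken by the lowest chunk id.
        return min(unfinished, key=lambda c: (not c.is_partial, c.chunk_id))


def run_worker(manager):
    """Process chunks until the manager has none left, then return.

    Returns the chunk ids in the order they were processed.
    """
    processed = []
    while True:
        chunk = manager.next_chunk()
        if chunk is None:
            break  # no more work: the worker terminates
        chunk.done_items = chunk.total_items  # placeholder for the real work
        processed.append(chunk.chunk_id)
    return processed
```

For example, with chunks `Chunk(1, 10)`, `Chunk(2, 10, 4)`, `Chunk(3, 10, 10)`, and `Chunk(4, 5, 1)`, `run_worker` returns `[2, 4, 1]`: the two partially completed chunks are finished first, the unstarted chunk follows, the already complete chunk is skipped, and the loop then exits.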

Pilot Workflow

[Image: HeavyJobsWorkflow.png]


