Difference between revisions of "HeavyJobs"
Latest revision as of 18:34, 4 May 2008
What (summary)
Manage long-running jobs on available compute resources (servers) using db tables to keep track of work, and inter-process communication to keep track of workers.
Why this is important
We will use this infrastructure to manage our algorithmic data collection. This is a strategic direction for the company.
DoneDone
We will be satisfied with this infrastructure when:
- we can launch, balance, and diagnose all steps of our pilot whois refresh path:
  - fetchers
  - parsers
  - aggregators
- we have startup scripts that will resume proper job processing after a machine reboot or other operational events.
- we can monitor overall health and productivity of all heavy job processing through a web interface.
Bugs and Todos
(new items)
- Detect when worker goes dark > 2 min. Record last status in chunk; terminate and restart.
- from feed_aggregator: :error=>"private method `log_error' called for #<HeavyWorker:0xb7e9c318>"
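One way the "worker goes dark > 2 min" item could be approached is sketched below. This is only an illustration under stated assumptions, not the HeavyJobs implementation: `WorkerStatus`, `dark_workers`, `recover`, and the `:last` hash layout are hypothetical names chosen for the example, and the restart step is left as a caller-supplied callback because the page does not say how workers are launched.

```ruby
# Hypothetical sketch: every name below is an assumption, not HeavyJobs code.
DARK_THRESHOLD_SECONDS = 120 # "dark > 2 min" from the todo item

# Assumed shape of what the monitor knows about each worker.
WorkerStatus = Struct.new(:pid, :last_seen_at, :last_status)

# Returns the workers whose last report is older than the threshold.
def dark_workers(statuses, now = Time.now, threshold = DARK_THRESHOLD_SECONDS)
  statuses.select { |s| now - s.last_seen_at > threshold }
end

# Records the last known status in the chunk, terminates the worker,
# then calls the restart callback. `kill` is injectable so the logic
# can be exercised without signalling real processes.
def recover(worker, chunk, restart: -> {}, kill: ->(pid) { Process.kill("TERM", pid) })
  chunk[:last] = { :pid => worker.pid, :status => worker.last_status }
  begin
    kill.call(worker.pid)
  rescue Errno::ESRCH
    # The process is already gone; nothing left to terminate.
  end
  restart.call
end
```

A monitor loop would call `dark_workers` periodically and pass each result to `recover` together with that worker's chunk record. Taking `now` as a parameter keeps the two-minute rule testable without sleeping.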