Restarting Results ad infinitum

Message boards : Number crunching : Restarting Results ad infinitum


AMDave

Joined: 16 Dec 05
Posts: 35
Credit: 12,576,896
RAC: 0
Message 17524 - Posted: 1 Jun 2006, 23:42:21 UTC

Running BOINC Mgr v5.2.13, Rosetta v5.16

Beginning 5/30/06, virtually all WUs are restarted every 50 to 55 minutes. Below is a partial copy of the Messages screen. Note the lines "If this happens repeatedly you may need to reset the project." I did reset the project, but it didn't fix the problem. What's going on?

Another repeated line is "exited with zero status but no 'finished' file." What does this indicate?

5/30/2006 10:02:42 PM||libcurl/7.14.0 OpenSSL/0.9.8 zlib/1.2.3
5/30/2006 10:02:42 PM||Data directory: C:\Program Files\BOINC
5/30/2006 10:02:42 PM||Processor: 1 AuthenticAMD AMD Athlon(tm) XP 2700+
5/30/2006 10:02:42 PM||Memory: 1023.48 MB physical, 1.90 GB virtual
5/30/2006 10:02:42 PM||Disk: 5.00 GB total, 1.80 GB free
5/30/2006 10:02:42 PM|rosetta@home|Computer ID: 103601; location: home; project prefs: home
5/30/2006 10:02:42 PM||General prefs: from rosetta@home (last modified 2006-03-07 07:45:59)
5/30/2006 10:02:42 PM||General prefs: no separate prefs for home; using your defaults
5/30/2006 10:02:43 PM||Remote control not allowed; using loopback address
5/30/2006 10:02:43 PM|rosetta@home|Deferring computation for result t299__CASP7_ABRELAX_SAVE_ALL_OUT_nterm_hom009__554_1804_0
5/30/2006 10:02:47 PM||request_reschedule_cpus: project op
5/30/2006 10:02:48 PM|rosetta@home|Restarting result t299__CASP7_ABRELAX_SAVE_ALL_OUT_nterm_hom009__554_1804_0 using rosetta version 516
5/30/2006 10:21:19 PM||request_reschedule_cpus: process exited
5/30/2006 10:21:19 PM|rosetta@home|Computation for result t299__CASP7_ABRELAX_SAVE_ALL_OUT_nterm_hom009__554_1804_0 finished
5/30/2006 10:21:19 PM|rosetta@home|Starting result t285_HOMOLOG_ABRELAX_hom001__518_94157_0 using rosetta version 516
5/30/2006 10:21:20 PM|rosetta@home|Sending scheduler request to https://boinc.bakerlab.org/rosetta_cgi/cgi
5/30/2006 10:21:20 PM|rosetta@home|Reason: To fetch work
5/30/2006 10:21:20 PM|rosetta@home|Requesting 3195 seconds of new work
5/30/2006 10:21:21 PM|rosetta@home|Started upload of t299__CASP7_ABRELAX_SAVE_ALL_OUT_nterm_hom009__554_1804_0_0
5/30/2006 10:21:25 PM|rosetta@home|Scheduler request to https://boinc.bakerlab.org/rosetta_cgi/cgi succeeded
5/30/2006 10:21:28 PM||request_reschedule_cpus: files downloaded
5/30/2006 10:21:29 PM|rosetta@home|Finished upload of t299__CASP7_ABRELAX_SAVE_ALL_OUT_nterm_hom009__554_1804_0_0
5/30/2006 10:21:29 PM|rosetta@home|Throughput 10481 bytes/sec
5/30/2006 10:35:45 PM|rosetta@home|Result t285_HOMOLOG_ABRELAX_hom001__518_94157_0 exited with zero status but no 'finished' file
5/30/2006 10:35:45 PM|rosetta@home|If this happens repeatedly you may need to reset the project.
5/30/2006 10:35:45 PM||request_reschedule_cpus: process exited
5/30/2006 10:35:45 PM|rosetta@home|Restarting result t285_HOMOLOG_ABRELAX_hom001__518_94157_0 using rosetta version 516
5/30/2006 11:29:44 PM|rosetta@home|Result t285_HOMOLOG_ABRELAX_hom001__518_94157_0 exited with zero status but no 'finished' file
5/30/2006 11:29:44 PM|rosetta@home|If this happens repeatedly you may need to reset the project.
5/30/2006 11:29:44 PM||request_reschedule_cpus: process exited
5/30/2006 11:29:44 PM|rosetta@home|Restarting result t285_HOMOLOG_ABRELAX_hom001__518_94157_0 using rosetta version 516
5/31/2006 12:22:27 AM|rosetta@home|Result t285_HOMOLOG_ABRELAX_hom001__518_94157_0 exited with zero status but no 'finished' file
5/31/2006 12:22:27 AM|rosetta@home|If this happens repeatedly you may need to reset the project.
5/31/2006 12:22:27 AM||request_reschedule_cpus: process exited
5/31/2006 12:22:27 AM|rosetta@home|Restarting result t285_HOMOLOG_ABRELAX_hom001__518_94157_0 using rosetta version 516
5/31/2006 1:14:27 AM|rosetta@home|Result t285_HOMOLOG_ABRELAX_hom001__518_94157_0 exited with zero status but no 'finished' file
5/31/2006 1:14:27 AM|rosetta@home|If this happens repeatedly you may need to reset the project.
5/31/2006 1:14:27 AM||request_reschedule_cpus: process exited
5/31/2006 1:14:27 AM|rosetta@home|Restarting result t285_HOMOLOG_ABRELAX_hom001__518_94157_0 using rosetta version 516
5/31/2006 2:05:58 AM|rosetta@home|Result t285_HOMOLOG_ABRELAX_hom001__518_94157_0 exited with zero status but no 'finished' file
5/31/2006 2:05:58 AM|rosetta@home|If this happens repeatedly you may need to reset the project.
5/31/2006 2:05:58 AM||request_reschedule_cpus: process exited
5/31/2006 2:05:58 AM|rosetta@home|Restarting result t285_HOMOLOG_ABRELAX_hom001__518_94157_0 using rosetta version 516
5/31/2006 2:57:19 AM|rosetta@home|Result t285_HOMOLOG_ABRELAX_hom001__518_94157_0 exited with zero status but no 'finished' file
5/31/2006 2:57:19 AM|rosetta@home|If this happens repeatedly you may need to reset the project.
5/31/2006 2:57:19 AM||request_reschedule_cpus: process exited
5/31/2006 2:57:19 AM|rosetta@home|Restarting result t285_HOMOLOG_ABRELAX_hom001__518_94157_0 using rosetta version 516
5/31/2006 3:48:31 AM|rosetta@home|Result t285_HOMOLOG_ABRELAX_hom001__518_94157_0 exited with zero status but no 'finished' file
5/31/2006 3:48:31 AM|rosetta@home|If this happens repeatedly you may need to reset the project.
5/31/2006 3:48:31 AM||request_reschedule_cpus: process exited

<continued - here's where I reset>

6/1/2006 9:10:15 AM|rosetta@home|Sending scheduler request to https://boinc.bakerlab.org/rosetta_cgi/cgi
6/1/2006 9:10:15 AM|rosetta@home|Reason: To fetch work
6/1/2006 9:10:15 AM|rosetta@home|Requesting 25560 seconds of new work
6/1/2006 9:10:20 AM|rosetta@home|Scheduler request to https://boinc.bakerlab.org/rosetta_cgi/cgi succeeded
6/1/2006 9:10:20 AM|rosetta@home|Message from server: Not sending work - last RPC too recent: 226 sec
6/1/2006 9:10:20 AM|rosetta@home|No work from project
6/1/2006 9:11:20 AM|rosetta@home|Result T0283_CONTACTS_CONSERVATIVE_CALPHA_HALFHB_MAP_FROM_hom024_593_39206_0 exited with zero status but no 'finished' file
6/1/2006 9:11:20 AM|rosetta@home|If this happens repeatedly you may need to reset the project.
6/1/2006 9:11:20 AM||request_reschedule_cpus: process exited
6/1/2006 9:11:20 AM|rosetta@home|Restarting result T0283_CONTACTS_CONSERVATIVE_CALPHA_HALFHB_MAP_FROM_hom024_593_39206_0 using rosetta version 516
6/1/2006 9:14:27 AM|rosetta@home|Sending scheduler request to https://boinc.bakerlab.org/rosetta_cgi/cgi
6/1/2006 9:14:27 AM|rosetta@home|Reason: To fetch work
6/1/2006 9:14:27 AM|rosetta@home|Requesting 26172 seconds of new work
6/1/2006 9:14:32 AM|rosetta@home|Scheduler request to https://boinc.bakerlab.org/rosetta_cgi/cgi succeeded
6/1/2006 9:14:34 AM||request_reschedule_cpus: files downloaded
6/1/2006 9:14:34 AM||request_reschedule_cpus: files downloaded
6/1/2006 10:05:24 AM|rosetta@home|Result T0283_CONTACTS_CONSERVATIVE_CALPHA_HALFHB_MAP_FROM_hom024_593_39206_0 exited with zero status but no 'finished' file
6/1/2006 10:05:24 AM|rosetta@home|If this happens repeatedly you may need to reset the project.
6/1/2006 10:05:24 AM||request_reschedule_cpus: process exited
6/1/2006 10:05:24 AM|rosetta@home|Restarting result T0283_CONTACTS_CONSERVATIVE_CALPHA_HALFHB_MAP_FROM_hom024_593_39206_0 using rosetta version 516
6/1/2006 10:58:53 AM||request_reschedule_cpus: project op
6/1/2006 10:58:54 AM|rosetta@home|Sending scheduler request to https://boinc.bakerlab.org/rosetta_cgi/cgi
6/1/2006 10:58:54 AM|rosetta@home|Reason: Requested by user
6/1/2006 10:58:54 AM|rosetta@home|Note: not requesting new work or reporting results
6/1/2006 10:58:59 AM|rosetta@home|Scheduler request to https://boinc.bakerlab.org/rosetta_cgi/cgi succeeded
6/1/2006 10:58:37 AM|rosetta@home|Resetting project
6/1/2006 10:58:37 AM||request_reschedule_cpus: exit_tasks
6/1/2006 10:58:37 AM||request_reschedule_cpus: project op
6/1/2006 10:58:43 AM||request_reschedule_cpus: project op
6/1/2006 10:59:33 AM||request_reschedule_cpus: project op

<continued - here's the most recent>

6/1/2006 12:53:31 PM|rosetta@home|Restarting result t285_HOMOLOG_ABRELAX_hom001__514_4147_1 using rosetta version 516
6/1/2006 1:51:41 PM|rosetta@home|Result t285_HOMOLOG_ABRELAX_hom001__514_4147_1 exited with zero status but no 'finished' file
6/1/2006 1:51:41 PM|rosetta@home|If this happens repeatedly you may need to reset the project.
6/1/2006 1:51:41 PM||request_reschedule_cpus: process exited
6/1/2006 1:51:41 PM|rosetta@home|Restarting result t285_HOMOLOG_ABRELAX_hom001__514_4147_1 using rosetta version 516
6/1/2006 3:10:45 PM|rosetta@home|Sending scheduler request to https://boinc.bakerlab.org/rosetta_cgi/cgi
6/1/2006 3:10:45 PM|rosetta@home|Reason: To fetch work
6/1/2006 3:10:45 PM|rosetta@home|Requesting 2805 seconds of new work
6/1/2006 3:10:50 PM|rosetta@home|Scheduler request to https://boinc.bakerlab.org/rosetta_cgi/cgi succeeded
6/1/2006 3:10:52 PM||request_reschedule_cpus: files downloaded
6/1/2006 3:50:32 PM|rosetta@home|Result t285_HOMOLOG_ABRELAX_hom001__514_4147_1 exited with zero status but no 'finished' file
6/1/2006 3:50:32 PM|rosetta@home|If this happens repeatedly you may need to reset the project.
6/1/2006 3:50:32 PM||request_reschedule_cpus: process exited
6/1/2006 3:50:32 PM|rosetta@home|Restarting result t285_HOMOLOG_ABRELAX_hom001__514_4147_1 using rosetta version 516
6/1/2006 4:27:30 PM||request_reschedule_cpus: process exited
6/1/2006 4:27:30 PM|rosetta@home|Computation for result t285_HOMOLOG_ABRELAX_hom001__514_4147_1 finished
6/1/2006 4:27:30 PM|rosetta@home|Starting result T0283_CONTACTS_MAP_FROM_hom006_535_1783_1 using rosetta version 516
6/1/2006 4:27:32 PM|rosetta@home|Started upload of t285_HOMOLOG_ABRELAX_hom001__514_4147_1_0
6/1/2006 4:27:37 PM|rosetta@home|Finished upload of t285_HOMOLOG_ABRELAX_hom001__514_4147_1_0
6/1/2006 4:27:37 PM|rosetta@home|Throughput 22457 bytes/sec
6/1/2006 5:46:29 PM||request_reschedule_cpus: project op
6/1/2006 5:46:32 PM|rosetta@home|Sending scheduler request to https://boinc.bakerlab.org/rosetta_cgi/cgi
6/1/2006 5:46:32 PM|rosetta@home|Reason: Requested by user
6/1/2006 5:46:32 PM|rosetta@home|Reporting 1 results
6/1/2006 5:46:37 PM|rosetta@home|Scheduler request to https://boinc.bakerlab.org/rosetta_cgi/cgi succeeded
6/1/2006 6:22:59 PM||request_reschedule_cpus: project op
6/1/2006 6:23:00 PM|rosetta@home|Pausing result T0283_CONTACTS_MAP_FROM_hom006_535_1783_1 (left in memory)
6/1/2006 6:23:53 PM||request_reschedule_cpus: project op
6/1/2006 6:23:54 PM|rosetta@home|Resuming result T0283_CONTACTS_MAP_FROM_hom006_535_1783_1 using rosetta version 516
6/1/2006 6:49:01 PM|rosetta@home|Result T0283_CONTACTS_MAP_FROM_hom006_535_1783_1 exited with zero status but no 'finished' file
6/1/2006 6:49:01 PM|rosetta@home|If this happens repeatedly you may need to reset the project.
6/1/2006 6:49:01 PM||request_reschedule_cpus: process exited
6/1/2006 6:49:01 PM|rosetta@home|Restarting result T0283_CONTACTS_MAP_FROM_hom006_535_1783_1 using rosetta version 516
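The restart interval can be checked directly from the Messages text rather than by eye. Below is a minimal sketch, assuming each line has the `timestamp|project|message` layout seen in the log above; the function name `restart_gaps_minutes` and the `SAMPLE_LOG` excerpt are illustrative choices, not part of BOINC.

```python
from datetime import datetime

# Three consecutive warnings copied from the log above.
SAMPLE_LOG = """\
5/30/2006 10:35:45 PM|rosetta@home|Result t285_HOMOLOG_ABRELAX_hom001__518_94157_0 exited with zero status but no 'finished' file
5/30/2006 11:29:44 PM|rosetta@home|Result t285_HOMOLOG_ABRELAX_hom001__518_94157_0 exited with zero status but no 'finished' file
5/31/2006 12:22:27 AM|rosetta@home|Result t285_HOMOLOG_ABRELAX_hom001__518_94157_0 exited with zero status but no 'finished' file
5/31/2006 1:14:27 AM|rosetta@home|Result t285_HOMOLOG_ABRELAX_hom001__518_94157_0 exited with zero status but no 'finished' file
"""

MARKER = "exited with zero status but no 'finished' file"


def restart_gaps_minutes(log_text):
    """Return minutes between consecutive 'zero status' exits of the same result."""
    last_seen = {}  # result name -> timestamp of its previous warning
    gaps = []
    for line in log_text.splitlines():
        parts = line.split("|", 2)
        if len(parts) != 3 or MARKER not in parts[2]:
            continue  # not a 'zero status' warning line
        stamp = datetime.strptime(parts[0], "%m/%d/%Y %I:%M:%S %p")
        result_name = parts[2].split()[1]  # second word of "Result <name> exited ..."
        if result_name in last_seen:
            delta = stamp - last_seen[result_name]
            gaps.append(round(delta.total_seconds() / 60, 1))
        last_seen[result_name] = stamp
    return gaps


if __name__ == "__main__":
    print(restart_gaps_minutes(SAMPLE_LOG))  # [54.0, 52.7, 52.0]
```

For the excerpt shown, the gaps come out at 52 to 54 minutes, consistent with the interval described at the top of this post. A roughly constant gap like this hints at something periodic rather than a random crash, though the log alone does not say what.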
mikus

Joined: 7 Nov 05
Posts: 58
Credit: 700,115
RAC: 0
Message 17526 - Posted: 2 Jun 2006, 0:49:05 UTC

The "zero status" warnings are typically caused by the client having trouble communicating internally (usually over a port on 127.0.0.1).

Try leaving your boincmgr running all the time, and see if those messages diminish. If that doesn't help, there is probably something on your system that interferes with the handling of internal TCP/IP messages.

In any case, as long as those WUs eventually complete o.k., you can probably ignore the "zero status" warnings -- i.e., no need for you to "reset".
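One rough way to see whether loopback TCP traffic works at all on the machine is a small echo round trip on 127.0.0.1. This is only a sketch under stated assumptions: it does not talk to BOINC, it uses a port chosen by the operating system rather than whatever port the client uses, and `loopback_roundtrip` is a hypothetical helper name.

```python
import socket
import threading


def loopback_roundtrip(payload=b"ping", timeout=5.0):
    """Send payload to a throwaway echo server on 127.0.0.1 and check the reply."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
    server.listen(1)
    server.settimeout(timeout)
    port = server.getsockname()[1]

    def echo_once():
        # Accept a single connection and send back whatever arrives.
        try:
            conn, _ = server.accept()
        except OSError:
            return  # nothing connected before the timeout, or socket closed
        with conn:
            conn.settimeout(timeout)
            conn.sendall(conn.recv(1024))

    worker = threading.Thread(target=echo_once, daemon=True)
    worker.start()
    try:
        with socket.create_connection(("127.0.0.1", port), timeout=timeout) as client:
            client.sendall(payload)
            reply = client.recv(1024)
    finally:
        worker.join(timeout)
        server.close()
    return reply == payload


if __name__ == "__main__":
    print("loopback OK" if loopback_roundtrip() else "loopback reply mismatch")
```

If this raises a connection error or times out, a firewall or a damaged TCP/IP stack is a plausible suspect. A clean run only shows that basic loopback traffic works, not that the BOINC client itself is healthy.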
tralala

Joined: 8 Apr 06
Posts: 376
Credit: 581,806
RAC: 0
Message 17535 - Posted: 2 Jun 2006, 8:45:31 UTC

Do you have any other CPU-intensive applications running?

I would try closing all running apps, including the virus scanner, firewall, tweaker and optimizer apps, etc., disconnect from the internet, and set BOINC to Rosetta-only mode (in case you're attached to other projects as well). Then let it run overnight and check whether these error messages persist. If they don't, gradually switch back on everything you usually run on your machine. If the error messages remain, you will have to look more closely at your internal TCP/IP settings.
Nite Owl

Joined: 2 Nov 05
Posts: 87
Credit: 3,019,449
RAC: 0
Message 17537 - Posted: 2 Jun 2006, 9:20:47 UTC

I get the same message on several of my machines, and all they do is run Rosetta.... <shrug>
Join the Teddies@WCG
Feet1st

Joined: 30 Dec 05
Posts: 1755
Credit: 4,690,520
RAC: 0
Message 17563 - Posted: 3 Jun 2006, 3:05:22 UTC

I WAS getting that message frequently, as well as the manager losing contact with the crunching thread. It was suggested to me that the new BOINC version would resolve or improve the problem once it was available. So when it was released, I installed BOINC 5.4.9, and I believe those machines are no longer seeing the problem.
Add this signature to your EMail:
Running Microsoft's "System Idle Process" will never help cure cancer, AIDS nor Alzheimer's. But running Rosetta@home just might!
https://boinc.bakerlab.org/rosetta/
m.mitch

Joined: 10 Feb 06
Posts: 34
Credit: 1,928,904
RAC: 0
Message 17896 - Posted: 7 Jun 2006, 6:20:22 UTC
Last modified: 7 Jun 2006, 6:22:18 UTC


If you're using Win XP, try resetting the TCP/IP stack.

Start -> Run. Type in: netsh int ip reset outfile.txt (the command needs a file name to write its log to).

If you're on dial-up, you're supposed to delete the DUN entry and add a new one with a different name. The usual suggestion is ISP1 (where ISP is your ISP's initials), so that support desks know what has happened.

Close everything and reboot.
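For anyone who prefers to script the reset, here is a minimal sketch that wraps the same command. It assumes a Windows machine with `netsh` on the PATH and an administrator account; `build_reset_command` and `reset_tcpip_stack` are hypothetical helper names, and the default is a dry run that only shows the command.

```python
import subprocess


def build_reset_command(log_file="outfile.txt"):
    """Return the netsh command from the steps above as an argument list."""
    return ["netsh", "int", "ip", "reset", log_file]


def reset_tcpip_stack(log_file="outfile.txt", dry_run=True):
    """Return the command line; execute it only when dry_run is False."""
    command = build_reset_command(log_file)
    if not dry_run:
        # Raises CalledProcessError if netsh reports a failure.
        subprocess.run(command, check=True)
    return " ".join(command)


if __name__ == "__main__":
    # Dry run by default: prints the command without changing anything.
    print(reset_tcpip_stack())
```

Pass `dry_run=False` only once you actually intend to reset the stack, and reboot afterwards as described above.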








©2024 University of Washington
https://www.bakerlab.org