Message boards : Number crunching : Workunit series 1hz6a
Author | Message |
---|---|
Christian Barrett Joined: 17 Sep 05 Posts: 11 Credit: 14,933 RAC: 0 |
I just received new workunits for this series and the first two I started crashed during the run. Is this series stable? I haven't had any problems with the other series. 11/13/2005 6:30:43 PM|rosetta@home|Pausing result 1hz6A_abrelaxmode_random_length20_jitter02_omega_00594_0 (removed from memory) 11/13/2005 6:30:45 PM|rosetta@home|Unrecoverable error for result 1hz6A_abrelaxmode_random_length20_jitter02_omega_00594_0 ( - exit code -1073741819 (0xc0000005)) 11/13/2005 6:30:45 PM||request_reschedule_cpus: process exited and 11/14/2005 1:07:15 AM|rosetta@home|Unrecoverable error for result 1hz6A_abrelaxmode_random_length20_jitter02_omega_sim_aneal_01678_0 ( - exit code -1073741819 (0xc0000005)) I know from the FAQ that these are general client errors, but they didn't start until this new series. |
Vester Joined: 2 Nov 05 Posts: 258 Credit: 3,651,260 RAC: 521 |
No problems on my two computers which have uploaded about 60 of these jobs. |
stephan_t Joined: 20 Oct 05 Posts: 129 Credit: 35,464 RAC: 0 |
Mine is stuck at 70% for the past 6 hours. Never had any problems before. Team CFVault.com http://www.cfvault.com |
Christian Barrett Joined: 17 Sep 05 Posts: 11 Credit: 14,933 RAC: 0 |
stephan_t wrote: "Mine is stuck at 70% for the past 6 hours. Never had any problems before." Well, I have had 3 fail on me now. The fourth has run twice, if not three times, as long as the previous set but hasn't hit a client error yet. We shall see. |
Foxfire Joined: 3 Nov 05 Posts: 12 Credit: 582,360 RAC: 0 |
stephan_t wrote: "Mine is stuck at 70% for the past 6 hours. Never had any problems before." I have not had any failures yet, but my WUs went from an average of 10-15 min (1.5 weeks ago) to an average of 130 min now. |
Divide Overflow Joined: 17 Sep 05 Posts: 82 Credit: 921,382 RAC: 0 |
I haven't had any errors with this new series. Some of them can take much longer than other recent protein WUs, but this is normal and should be no cause for alarm. @Stephan: Are you still stuck at 70%? |
stephan_t Joined: 20 Oct 05 Posts: 129 Credit: 35,464 RAC: 0 |
Hello there - no, I'm no longer stuck. I used some patience and it ended up taking 8h30 total to get processed. It was this result for this WU on this box. Got me a cool 52.54 points. EDIT: I should add, looking at my benchmarks for the machine, they seem very low. It's not the first time this has happened - I posted a graph of my RAC for that box before... so I reckon it might be overheating and the P4 is throttling back. Will investigate. Team CFVault.com http://www.cfvault.com |
Red Squirrel Joined: 26 Sep 05 Posts: 13 Credit: 3,613 RAC: 0 |
Yes, I've got one of the 1hz6a work units and it's got to 90% after 3 hours 30 mins. Most of the other work units have taken just over an hour. I wonder if we could be given some idea of how long a WU is going to take in comparison with the original WUs that were given out. The project team must have some idea how complex each different protein WU is. |
Christian Barrett Joined: 17 Sep 05 Posts: 11 Credit: 14,933 RAC: 0 |
Red Squirrel wrote: "Yes, I've got one of the 1hz6a work units and it's got to 90% after 3 hours 30 mins. Most of the other work units have taken just over an hour." Mine went for 4 hours and 40 min before I got this: 11/14/2005 5:48:49 PM|rosetta@home|Unrecoverable error for result 1hz6A_abrelaxmode_random_length20_jitter02_omega_sim_aneal_03386_0 ( - exit code -164 (0xffffff5c)) That's 4 failures out of 4 different units. They also failed at various times during the run. The unit is located here: https://boinc.bakerlab.org/rosetta/workunit.php?wuid=1680973 |
David E K Volunteer moderator Project administrator Project developer Project scientist Joined: 1 Jul 05 Posts: 1018 Credit: 4,334,829 RAC: 0 |
Christian, This is probably due to the increased size of the work units and/or an older client version. If you are running multiple projects, be sure to keep the application in memory and set the "Switch between applications every" option in your general preferences to at least 2 hours. You may want to try the most recent version of the BOINC client. I have reduced the size of new work units, but there will likely be large work units in the future (larger proteins, longer methods, etc.). |
Scribe Joined: 2 Nov 05 Posts: 284 Credit: 157,359 RAC: 0 |
No problems for me on these longer ones; they went from about 1 hour to 3 hours. We ex-FaD will not find a problem with even longer ones....at FaD there were some that took a couple of DAYS....even on my machines... |
Christian Barrett Joined: 17 Sep 05 Posts: 11 Credit: 14,933 RAC: 0 |
David E K wrote: "Christian, ..." OK, thanks. I am upgrading to the new 5.2.* tonight. I didn't want to upgrade earlier because I was running a spinup for another project and was worried about the future stability, but they assured me it won't crash. We shall see. |
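
The log lines quoted in this thread pair each signed decimal exit code with a hex value, e.g. `-1073741819 (0xc0000005)` and `-164 (0xffffff5c)`. The two numbers are the same 32-bit value read as signed and unsigned. The sketch below shows that conversion and pulls both forms out of a log line. The helper names `to_hex32` and `parse_exit_code` are hypothetical and not part of BOINC, and the regular expression only assumes the `exit code N (0x...)` shape seen in the posts above. For reference, 0xC0000005 is the Windows access-violation status code; the thread does not say what -164 means, so the code does not try to interpret it.

```python
import re

# Matches the "exit code N (0x...)" fragment as it appears in the quoted logs.
# This pattern is an assumption based on those lines, not a documented format.
EXIT_CODE_RE = re.compile(r"exit code (-?\d+) \((0x[0-9a-fA-F]+)\)")


def to_hex32(code: int) -> str:
    """Return the unsigned 32-bit two's-complement hex form of a signed code."""
    return f"0x{code & 0xFFFFFFFF:08x}"


def parse_exit_code(line: str):
    """Return (signed_code, hex_string) from a log line, or None if absent."""
    match = EXIT_CODE_RE.search(line)
    if match is None:
        return None
    return int(match.group(1)), match.group(2).lower()


if __name__ == "__main__":
    sample = (
        "11/14/2005 5:48:49 PM|rosetta@home|Unrecoverable error for result "
        "1hz6A_abrelaxmode_random_length20_jitter02_omega_sim_aneal_03386_0 "
        "( - exit code -164 (0xffffff5c))"
    )
    code, logged_hex = parse_exit_code(sample)
    # The hex printed by the client should equal the masked signed value.
    print(code, logged_hex, to_hex32(code) == logged_hex)
```

Checking that the logged hex matches `to_hex32` of the decimal is a quick way to confirm a pasted log line was not garbled before searching for the code.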
©2024 University of Washington
https://www.bakerlab.org