Message boards : Number crunching : Help us solve the 1% bug!
Author | Message |
---|---|
Ledo Send message Joined: 22 Feb 06 Posts: 3 Credit: 42,171 RAC: 0 |
I attached to the project yesterday, and since then I can't finish the download of the following file: avgE_from_pdb.gz. This is the error I get after the download fails: 2/23/2006 9:54:38 AM|rosetta@home|Started download of avgE_from_pdb.gz; 2/23/2006 9:54:39 AM|rosetta@home|Temporarily failed download of avgE_from_pdb.gz: error 403. Should I detach from the project and try again to see if the problem is solved? (Without this file, my WUs are permanently stuck at "Downloading" on the Work tab.) |
Astro Send message Joined: 2 Oct 05 Posts: 987 Credit: 500,253 RAC: 0 |
Ledo wrote: "I attached to the project yesterday, and since then I can't finish the download of the following file: avgE_from_pdb.gz." It appears you lost your socket connection to Rosetta. Can you retry the communication? Does it yield the same result? It should automatically try to finish the download that got interrupted. |
Snake Doctor Send message Joined: 17 Sep 05 Posts: 182 Credit: 6,401,938 RAC: 0 |
Quote: "Today I have the same problem, but I'm not sure whether this is really a bug or a very large WU. The client isn't frozen and the step counter is rising (step 1.544.555 so far), but progress has been at 1% for 1.20 hours." There is an item in the FAQs thread that discusses this issue. We must look for intelligent life on other planets, as it is becoming increasingly apparent we will not find any on our own. |
Ledo Send message Joined: 22 Feb 06 Posts: 3 Credit: 42,171 RAC: 0 |
Astro wrote: "It appears you lost your socket connection to Rosetta. Can you retry the communication? Does it yield the same result? It should automatically try to finish the download that got interrupted." I aborted the download of this file, and the WU immediately finished with the status "client error downloading". It then downloaded another WU, which has several files, and got stuck on that same file again. I hit the Retry button several times and the result is the same: error 403. I have other projects running on this PC and there is no problem with them. Edit: I have now seen the system requirements. The PC I attached is a 350 MHz machine, which is below the minimum required for this project; that's why I can't finish the download. Time to change to another project. |
aguiar@carrier.com.br Send message Joined: 19 Feb 06 Posts: 6 Credit: 367,089 RAC: 0 |
Hi! I'm a newcomer to BOINC and very interested in Rosetta. I downloaded a WU yesterday and now it is running on my computer. I am following the graphics to find out if it is working well (since the BOINC program says it is only 1% complete and the time to completion is now 15:23 and increasing instead of decreasing). The graphics now read as follows: Workunit: PRODUCTION_ABINITIO_INCREASECYCLES50_1urnA_317_213; 1% Complete; CPU time: 2 hr 18 min 23 sec (and counting); Stage: Ab initio; Model: 1; Step: 1299394 (and counting). Is that OK? Is it running correctly? If so, approximately how much time would be required to finish this WU? Thanks and regards, Valter Aguiar. Edit: I just noticed that it stepped back. It is now at step 151736 and counting. This is the second time this has happened with this same WU. |
Moderator9 Volunteer moderator Send message Joined: 22 Jan 06 Posts: 1014 Credit: 0 RAC: 0 |
For people having many Work Unit errors! I have received an e-mail from Dr. Baker with information for any of you who are having a lot of Work Unit errors. "Could you help us to recommend to people having problems with lots of WU to set the target run time to a smaller value like 2 hours. We think there aren't any new bugs, just with longer run times it is more likely for a WU to have problems." So if you are having a lot of errors, please reset your target run time setting to 2 hours and see if that helps. Moderator9 ROSETTA@home FAQ Moderator Contact |
Nite Owl Send message Joined: 2 Nov 05 Posts: 87 Credit: 3,019,449 RAC: 0 |
Moderator9 wrote: "For people having many Work Unit errors!" That kinda defeats the whole purpose of having an adjustable Target Runtime, doesn't it? |
Angus Send message Joined: 17 Sep 05 Posts: 412 Credit: 321,053 RAC: 0 |
Nite Owl wrote: "That kinda defeats the whole purpose of having an adjustable Target Runtime, doesn't it?" Right. It doesn't do much for the dial-up users and the restricted-bandwidth users. The whole point of extending the run time was to reduce the download frequency. If the WUs will only run reliably for 2 hours, then there are still problems to be solved. Proudly Banned from Predictator@Home and now Cosmology@home as well. Added SETI to the list today. Temporary ban only - so need to work harder :) "You can't fix stupid" (Ron White) |
AMD_is_logical Send message Joined: 20 Dec 05 Posts: 299 Credit: 31,460,681 RAC: 0 |
Moderator9 wrote: "For people having many Work Unit errors!" I don't think a lot of people are having many of these crunching errors. I haven't seen such an error on any of the 13 systems I have crunching on Rosetta. (I am currently using a 10 hour target run time.) The suggestion for a shorter target run time was for the few people who are having a lot of errors. |
Nite Owl Send message Joined: 2 Nov 05 Posts: 87 Credit: 3,019,449 RAC: 0 |
Angus wrote: If the WUs will only run reliably for 2 hours, then there are still problems to be solved. My point exactly... Join the Teddies@WCG |
Snake Doctor Send message Joined: 17 Sep 05 Posts: 182 Credit: 6,401,938 RAC: 0 |
aguiar@carrier.com.br wrote: "Hi! I'm a newcomer to BOINC and very interested in Rosetta. I downloaded a WU yesterday and now it is running on my computer." You should read the FAQ list. Everything you have reported is normal behavior. We must look for intelligent life on other planets, as it is becoming increasingly apparent we will not find any on our own. |
ecafkid Send message Joined: 5 Oct 05 Posts: 40 Credit: 15,177,319 RAC: 0 |
24 hrs in and 27 hrs left on this one, so I aborted it. 3/2/2006 5:22:32 PM|rosetta@home|Unrecoverable error for result ABINITew_hom002_1ew4A_322_56_0 (aborted via GUI RPC) |
rbpeake Send message Joined: 25 Sep 05 Posts: 168 Credit: 247,828 RAC: 0 |
I believe this is the first 1% error WU that I have ever received: SSFEATURES_BARCODE_ABINITIO_1ew4A_334_354 Regards, Bob P. |
ecafkid Send message Joined: 5 Oct 05 Posts: 40 Credit: 15,177,319 RAC: 0 |
I am running at 2 hrs and still getting errors, so those who think there is no new bug should think again. I didn't have these problems until recently. Is processing these WUs doing any good, or should we wait until they find a fix? I am crunching as many hours of work but getting a lot less done. |
Los Alcoholicos~La Muis Send message Joined: 4 Nov 05 Posts: 34 Credit: 1,041,724 RAC: 0 |
My first 1% WU on this PC. It was stuck at 1% for 6:30 hrs before I noticed it. After a restart of the BOINC Manager it froze again at model 2106, so I aborted it. HBLR_1.0_2reb_332_989_0 |
[B@H] Ray Send message Joined: 20 Sep 05 Posts: 118 Credit: 100,251 RAC: 0 |
I am running BOINC 4.72 on two machines and have never had one stuck at 1%. Also I do not run P@H, could it be a Rosetta & newer BOINC problem? Or a Rosetta & P@H problem. In reading this thread I see others running BOINC 4.72 who never had one stuck at 1%. Pizza@Home Rays Place Rays place Forums |
Morten Starkeby Send message Joined: 18 Feb 06 Posts: 10 Credit: 472,142 RAC: 0 |
Got my first 1% stuck bug. The work unit has been stuck at 1% for 1 hour and 54 minutes at the time of this writing. It is stuck at model 1, step 20880, in the Ab initio stage. Name of work unit: HB_BARCODE_30_1a32__351_1964. I suspended BOINC and ran the following command from the command prompt: rosetta_4.82_windows_intelx86.exe cc 1a32 _ -abrelax -stringent_relax -more_relax_cycles -output_chi_silent -vary_omega -rand_envpair_res_wt -rand_SS_wt -farlx -ex1 -ex2 -silent -barcode_from_fragments -new_centroid_packing -barcode_from_fragments_length 30 -ssblocks -barcode_mode 3 -omega_weight 0.5 -jitter_frag -jitter_variation gauss -output_silent_gz -nstruct 10 -paths ccfrags200.txt -relax_score_filter -filter1 -115 -filter2 -130 -short_range_hb_weight 0.50 -long_range_hb_weight 1.0 -increase_cycles 10 -cpu_run_time 7200 -constant_seed -jran 3046839 The command executed perfectly and proceeded quickly beyond 1% (at the time of writing it is at model 3, 20.9%). I then aborted the standalone Rosetta execution. I am using version 5.3.24 of BOINC (a beta version). |
Zazie Send message Joined: 1 Mar 06 Posts: 2 Credit: 159,032 RAC: 0 |
Hi, with half of my workunits errored out and the last one (HB_BARCODE_30_1ten__351_2528_0) stuck at 1% for 10+ hours, I decided not to waste my CPU time anymore and withdrew from the project. Sorry guys, I would have loved to help you with my tiny contribution, but there are too many bugs in Rosetta and, from what I see on the message boards, no one on your staff is too worried. |
rbpeake Send message Joined: 25 Sep 05 Posts: 168 Credit: 247,828 RAC: 0 |
Zazie wrote: "...from what I see on the message boards, no-one from your staff is too worried." I don't know what message boards you are reading, but fixing the bugs is the top priority. See David Baker's latest Journal entry. Regards, Bob P. |
ecafkid Send message Joined: 5 Oct 05 Posts: 40 Credit: 15,177,319 RAC: 0 |
81 hrs and no credit to show for it, except an ever-declining RAC and a lot of wasted cycles. 3/9/2006 5:13:49 PM|rosetta@home|Unrecoverable error for result HOMSdc_homDB002_1dcj__339_185_0 (aborted via GUI RPC) |
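
For anyone collecting reports like the ones above, the message lines pasted in this thread all share the same pipe-separated shape: timestamp, project name, message text. The following is a minimal sketch, not an official BOINC tool, of how such lines could be split up and filtered for failures. The names `BoincMessage`, `parse_message`, and `looks_like_failure` are hypothetical, and the timestamp format is an assumption inferred only from the lines quoted in this thread.

```python
from datetime import datetime
from typing import NamedTuple


class BoincMessage(NamedTuple):
    """One pasted message line (hypothetical helper type)."""
    timestamp: datetime
    project: str
    text: str


def parse_message(line: str) -> BoincMessage:
    # Assumed layout, taken from the lines quoted in this thread:
    #   M/D/YYYY H:MM:SS AM|project|message text
    # Split on the first two pipes only, in case the text contains one.
    stamp, project, text = line.split("|", 2)
    ts = datetime.strptime(stamp.strip(), "%m/%d/%Y %I:%M:%S %p")
    return BoincMessage(ts, project.strip(), text.strip())


def looks_like_failure(msg: BoincMessage) -> bool:
    # Crude keyword check; a guess at what is worth reporting.
    lowered = msg.text.lower()
    return "error" in lowered or "failed" in lowered


if __name__ == "__main__":
    pasted = [
        "2/23/2006 9:54:38 AM|rosetta@home|Started download of avgE_from_pdb.gz",
        "2/23/2006 9:54:39 AM|rosetta@home|Temporarily failed download of avgE_from_pdb.gz: error 403",
    ]
    for raw in pasted:
        msg = parse_message(raw)
        if looks_like_failure(msg):
            print(msg.timestamp.isoformat(), msg.text)
```

The keyword check is deliberately simple; it only flags lines for a human to read and does not try to classify the cause.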
©2024 University of Washington
https://www.bakerlab.org