Message boards : Number crunching : Report Problems with Rosetta Version 5.22
Author | Message |
---|---|
tralala Joined: 8 Apr 06 Posts: 376 Credit: 581,806 RAC: 0 |
Wait and don't abort. It will finish within an hour at most after reaching "completion". Rosetta waits for the watchdog to shut down. This was introduced in 5.19 for better debugging; it was reported over at RALPH and supposedly fixed in 5.22. It is very good that you reported this here. If you happen to observe this again, please check whether the graphics show 100% as well or something lower, and take a screenshot of the graphics window in its "idling" state. |
Astro Joined: 2 Oct 05 Posts: 987 Credit: 500,253 RAC: 0 |
The Fatal Windows Error Bug is still with us, I'm afraid. wuid=19791659. Result ID: 23483927; Name: t309__CASP7_ABRELAX_SAVE_ALL_OUT_nohistag_hom001__661_7645_0; Workunit: 19791659; Created: 9 Jun 2006 11:23:04 UTC; Sent: 9 Jun 2006 12:59:31 UTC; Received: 10 Jun 2006 19:23:54 UTC; Server state: Over; Outcome: Client error; Client state: Computing; Exit status: -1073741811 (0xc000000d); Computer ID: 212252; Report deadline: 16 Jun 2006 12:59:31 UTC; CPU time: 28426.171875; stderr out: <core_client_version>5.4.9</core_client_version> <message> - exit code -1073741811 (0xc000000d) </message> <stderr_txt> # cpu_run_time_pref: 28800 # random seed: 1655106 </stderr_txt>; Validate state: Invalid; Claimed credit: 109.225790694355; Granted credit: 0; Application version: 5.22 |
Rhiju Volunteer moderator Joined: 8 Jan 06 Posts: 223 Credit: 3,546 RAC: 0 |
Hi Alan: Thanks for reporting. There seem to be numerous little issues with the screensaver, and we've been trying to track them down one by one over on the test project, ralph. But I haven't seen a lot of problems like the one you describe -- has it happened in previous work units before this double batch? I wonder if something went haywire with the core boinc application -- you may need to restart. [Quote: "Hello Moderator9"] |
Rhiju Volunteer moderator Joined: 8 Jan 06 Posts: 223 Credit: 3,546 RAC: 0 |
Hi mmciastro... yes, we know it's still there. You might be happy to know that the error -1073741811 (0xc000000d) is currently number one on our list of things to kill. It's been the most common error for a while, but Rom has only now come up with a hypothesis for what the cause might be. He has just put some extra debugging code on ralph to track it down -- maybe that will let us unravel this puzzle! [Quote: "The Fatal Windows Error Bug is still with us, I'm afraid. wuid=19791659"] |
Astro Joined: 2 Oct 05 Posts: 987 Credit: 500,253 RAC: 0 |
[Quote: "Hi mmciastro... yes, we know it's still there."] It's only a problem for certain video cards, .net, whatever it is. If regular users who get this turn OFF the screensaver, they won't see it again while waiting for the fix. I'm in direct communication with Rom on this bug, just FYI. I happen to have a machine that regularly gets this error (lucky me, and I guess, lucky Rosetta/Rom). tony |
Alan Roberts Joined: 7 Jun 06 Posts: 61 Credit: 6,901,926 RAC: 0 |
[Quote: "Hi Alan:"] Rhiju, based on a look at my results, the work units before and after the failures completed. I may have pulled a boinc restart somewhere in there ... I'm pitching this as an employee contribution/team-effort project at one of my customer sites, and the three of us who are the test cases have been grabbing our volunteer minutes here and there to get our sample desktops running, to demonstrate safety (at least lack of harm and impact on the "real work") and stability. When I saw the comments about screensaver issues on the forum and noticed my failures, I may have restarted boinc in a quick-and-dirty quest for a fix. I set the test machines to not use the screensaver over the weekend, but if there is a better procedure for providing debugging information (i.e., "run the screensaver and do the following if you get another error"), please let me know. Cheers, Alan |
RWIoffice Joined: 7 Jun 06 Posts: 4 Credit: 37,344 RAC: 0 |
Screensaver or possibly some other flavor of "completion" problem with a t299__CASP7 work unit. I noticed lack of completion progress from home this morning. When I got to the office, the screensaver was sitting on "Model 9, Step 0." Once I got past the screensaver, BOINC Manager was reporting the work unit as "Running" and "100%" for Progress. BOINC Manager would not display the graphics. Shutdown of BOINC Manager seemed to take a long time, but it finally happened. I rebooted the system. Once BOINC Manager launched it reported the status on this work unit as completed, "Ready to Report" and started the next work unit. I forced an update so the result would be available prior to my posting this report. Keeping in mind that I'm a newbie and could easily be misinterpreting, the result seems to be referring to only 8 models, so the screensaver graphic's Model 9 reference doesn't make sense to me. |
tralala Joined: 8 Apr 06 Posts: 376 Credit: 581,806 RAC: 0 |
[Quote: "Screensaver or possibly some other flavor of "completion" problem with a t299__CASP7 work unit."] This is related to a bug report on Ralph. The behaviour is exactly the same. It was supposed to be fixed in 5.22; obviously it is not. |
Craig Miller Joined: 5 Jun 06 Posts: 1 Credit: 241,534 RAC: 0 |
I am having a problem running Rosetta. I attach to Rosetta using BOINC Manager and receive the notice of a successful attachment. When I look at BOINC Manager, it shows Rosetta running while Einstein and SETI are suspended. But when I come back several hours later, Rosetta is not present in either Projects or Tasks. The messages seem to show Rosetta being loaded and started, but then the log ends with "Detaching from project", as shown below. ------------- 11-Jun-06 12:41:20|rosetta@home|Starting task t314__CASP7_ABRELAX_SAVE_ALL_OUT_hom002__666_13970_1 using rosetta version 522 11-Jun-06 12:49:01||Contacting account manager at http://bam.boincstats.com/ 11-Jun-06 12:49:03||Account manager: BAM Host-ID: 2098 11-Jun-06 12:49:03||Account manager contact succeeded 11-Jun-06 12:49:03|rosetta@home|Resetting project 11-Jun-06 12:49:04||Rescheduling CPU: exit_tasks 11-Jun-06 12:49:04|rosetta@home|Detaching from project When I check with BAM, my resources are shown as Einstein, SETI, and Rosetta. What could be causing this problem? Craig Miller |
Dogbytes Joined: 4 Dec 05 Posts: 37 Credit: 207,563 RAC: 0 |
The WU linked below crunched for over 2 hours and yet was stuck at 0.00%. I aborted the unit because it appeared to be completely hung and had stopped crunching. I would be interested to know what the problem was... Aborted 5.22 WU link |
Ian Joined: 14 Apr 06 Posts: 29 Credit: 317,303 RAC: 866 |
Here's one from Saturday: https://boinc.bakerlab.org/rosetta/result.php?resultid=23575087 And one from Friday: https://boinc.bakerlab.org/rosetta/result.php?resultid=23484615 These are the only two errors for quite a while. Ian Cundell, St Albans, UK |
Dogbytes Joined: 4 Dec 05 Posts: 37 Credit: 207,563 RAC: 0 |
The WU linked below has a severe memory leak: it was using more than 275 MB of memory, bringing the host's commit charge to nearly 600 MB. The WU was aborted by the user. Aborted WU |
Feet1st Joined: 30 Dec 05 Posts: 1755 Credit: 4,690,520 RAC: 0 |
[Quote: "When I check with BAM, my resources are shown as Einstein, SETI, and Rosetta."] Rosetta's servers were just upgraded to support BAM last week. But it looks like BAM, not Rosetta, triggered the reset and detach. Add this signature to your EMail: Running Microsoft's "System Idle Process" will never help cure cancer, AIDS nor Alzheimer's. But running Rosetta@home just might! https://boinc.bakerlab.org/rosetta/ |
scsimodo Joined: 17 Sep 05 Posts: 93 Credit: 946,359 RAC: 0 |
Had a few WUs crash when hitting the "show graphics" button. The window popped up, closed immediately and trashed the WU. The WUs are: WU1 WU2 WU3 WU4 Host list is unhidden; the host is a Mac Mini Core Duo, 1.66 GHz, 2 GB RAM. Please drop a short notice when I can hide my hosts again... What's strange is that hitting the "show graphics" button a few minutes earlier worked perfectly, so it seems to be a random problem... |
Billy Joined: 29 May 06 Posts: 13 Credit: 1,536,368 RAC: 0 |
I had a work unit processing at about 80% complete and it seemed to be going normally. I suspended the project (as well as Einstein and Seti) and quit Boinc. I shut down the computer and restarted. When Boinc started again, it reported this work unit as complete and uploaded it. Either it was stuck before, or it isn't actually complete. I had a similar thing happen a couple of days ago, when it also reported work units complete even though the completion times were unusually short. https://boinc.bakerlab.org/rosetta/result.php?resultid=23946621 iMac Core Duo, Rosetta version 5.22 |
Stwato Joined: 11 Jan 06 Posts: 150 Credit: 655,634 RAC: 0 |
This work unit looks very strange in the graphics. Parts of the protein do not appear to be connected to the rest of it. It's like some of it is missing. The protein appears very small. I first noticed the problem when I saw that the work unit is only using 20 MB of memory and ~70 MB virtual but has nearly 9 million page faults. Assuming it was a small protein, I took a look at the graphics and noticed the disjointed parts. Any ideas? I'll let it continue for now (5 hours into an 8 hour work unit). If you would like me to take a screenshot, please let me know the best dimensions for posting into the forum, as I have nowhere to upload it to. Cheers Stwato |
Keith Akins Joined: 22 Oct 05 Posts: 176 Credit: 71,779 RAC: 0 |
After patting myself on the back for so many successful WUs, I get the following error on this unit: 6/12/2006 11:05:25 PM|rosetta@home|Unrecoverable error for result t306__CASP7_JUMPABINITIO_SAVE_ALL_OUT_BARCODE_hom001__680_902_0 ( - exit code -1073741811 (0xc000000d)) This could be a v5.22 issue, a BOINC 5.4.9 issue, or a conflict when checking mail with Mozilla Thunderbird. System: Win XP Home Service Pack 2 with a Mozilla Firefox/Thunderbird combo. Computers are visible and BOINC 5.4.9 should be debug reporting. Ignore the Linux computer, as mine is a dual booter. |
Snake Doctor Joined: 17 Sep 05 Posts: 182 Credit: 6,401,938 RAC: 0 |
[Quote: "This work unit looks very strange in the graphics. Parts of the protein do not appear to be connected to the rest of it. It's like some of it is missing. The protein appears very small. I first noticed the problem when I saw that the work unit is only using 20 MB of memory and ~70 MB virtual but has nearly 9 million page faults. Assuming it was a small protein, I took a look at the graphics and noticed the disjointed parts. Any ideas? I'll let it continue for now (5 hours into an 8 hour work unit)"] This is a known problem with a few of the processing techniques being used. Not all the work units use the same processing approach. In some cases they are only looking at parts of the protein structure, and that somehow affects the display. We must look for intelligent life on other planets as it is becoming increasingly apparent we will not find any on our own. |
David E K Volunteer moderator Project administrator Project developer Project scientist Joined: 1 Jul 05 Posts: 1018 Credit: 4,334,829 RAC: 0 |
See Rhiju's post describing "Jumping" https://boinc.bakerlab.org/rosetta/forum_thread.php?id=1453#15060 |
TCU Computer Science Joined: 7 Dec 05 Posts: 28 Credit: 12,861,977 RAC: 0 |
Rosetta 5.22, WU name: t314__CASP7_ABRELAX_SAVE_ALL_OUT_hom004__666_16529_0, running on Mac OS 10.4.6. The BOINC Manager Tasks tab shows CPU Time stuck at 01:30:40 and 15%. The top command shows TIME = 28:53:41 and climbing. I stopped and restarted BOINC; CPU Time reverted to 01:13:00 and 15% but is no longer stuck. Symptoms are identical to my post for ralph 5.18. |
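
Several posts above report the same exit status in two forms, -1073741811 and 0xc000000d. These are the same 32-bit value read as signed and unsigned, which the rough Python sketch below checks against the message-log line Keith Akins posted. The helper names (`to_unsigned32`, `parse_message`, `extract_exit_code`) are made up for illustration, and the `timestamp|project|text` layout is only inferred from the log excerpts quoted in this thread, not taken from BOINC documentation. On Windows, 0xC000000D is the NTSTATUS value STATUS_INVALID_PARAMETER; the thread does not establish whether that points at the actual root cause.

```python
import re

# Hypothetical helpers: the pipe-separated layout is an assumption inferred
# from the message-log excerpts quoted in this thread.
EXIT_CODE_RE = re.compile(r"exit code (-?\d+) \((0x[0-9a-fA-F]+)\)")


def to_unsigned32(code: int) -> int:
    """Reinterpret a signed 32-bit exit status as its unsigned value."""
    return code & 0xFFFFFFFF


def parse_message(line: str) -> dict:
    """Split one 'timestamp|project|text' message-log line into fields."""
    timestamp, project, text = line.split("|", 2)
    return {"timestamp": timestamp, "project": project, "text": text}


def extract_exit_code(text: str):
    """Return (signed_decimal, hex_string), or None if no exit code is reported."""
    match = EXIT_CODE_RE.search(text)
    if match is None:
        return None
    return int(match.group(1)), match.group(2)


# Log line copied from the post above.
line = ("6/12/2006 11:05:25 PM|rosetta@home|Unrecoverable error for result "
        "t306__CASP7_JUMPABINITIO_SAVE_ALL_OUT_BARCODE_hom001__680_902_0 "
        "( - exit code -1073741811 (0xc000000d))")

entry = parse_message(line)
signed, hex_text = extract_exit_code(entry["text"])

# The signed decimal and the hex form should describe the same 32 bits.
print(entry["project"], signed, hex(to_unsigned32(signed)))
# prints: rosetta@home -1073741811 0xc000000d
```

Lines with an empty project field, such as the account manager messages in Craig Miller's log, still split cleanly because only the first two pipes are treated as separators.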
©2024 University of Washington
https://www.bakerlab.org