Unrecoverable error???

KamiKatze
Joined: 20 Jan 06
Posts: 7
Credit: 1,531
RAC: 0
Message 9723 - Posted: 24 Jan 2006, 19:46:15 UTC

Hello Rosetta team,
this error message occurred a few hours ago:

24.01.2006 19:10:03|rosetta@home|Pausing result 17535_looprlx_round1_ev1b4fA_.0239_0001_273_17_0 (removed from memory)
24.01.2006 19:10:08|rosetta@home|Unrecoverable error for result 17535_looprlx_round1_ev1b4fA_.0239_0001_273_17_0 ( - exit code -1073741819 (0xc0000005))

I hope it's not a problem on my side; otherwise, please tell me how to proceed.
Best regards,
Karsten
Life is one of the hardest...

anders n
Joined: 19 Sep 05
Posts: 403
Credit: 537,991
RAC: 0
Message 9724 - Posted: 24 Jan 2006, 19:49:30 UTC - in response to Message 9723.  

Hi Karsten

You need to change your preferences to leave the app in memory. If you go into your online account, you'll find the option under your general preferences. Just change the "leave applications in memory while preempted" setting to "yes".

Anders n
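
As a side note on the exit code in that log: the negative number and the hex value are the same 32-bit quantity, and 0xC0000005 is the Windows status code for an access violation. A minimal Python sketch of the conversion follows; the helper name `to_ntstatus` and the one-entry lookup table are only illustrative and are not part of BOINC or Rosetta.

```python
def to_ntstatus(exit_code: int) -> str:
    """Reinterpret a signed 32-bit exit code as an unsigned hex string."""
    return f"0x{exit_code & 0xFFFFFFFF:08X}"


# Deliberately tiny lookup table, just enough for the code seen in this thread.
KNOWN_STATUS = {"0xC0000005": "STATUS_ACCESS_VIOLATION"}

code = to_ntstatus(-1073741819)
print(code, KNOWN_STATUS.get(code, "unknown"))
```

This only explains what the number means. It does not say why the application crashed after being removed from memory.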

Kevin
Joined: 15 Jan 06
Posts: 21
Credit: 109,496
RAC: 0
Message 9894 - Posted: 26 Jan 2006, 6:18:10 UTC

I had an unrecoverable error today which read:
Unrecoverable error for result DEFAULT_RLX_NATIVE_1hz6_280_159_3
(Incorrect function.(0X1)-exit code 1 (0X1))

This unit seems to be bad as it has been sent out 4 times and all returned a client error. Is there a place to report units that crash constantly or is that monitored?

David Baker
Volunteer moderator
Project administrator
Project developer
Project scientist
Joined: 17 Sep 05
Posts: 705
Credit: 559,847
RAC: 0
Message 9897 - Posted: 26 Jan 2006, 6:41:28 UTC - in response to Message 9894.  

Thanks, we'll look into it. David

Kevin
Joined: 15 Jan 06
Posts: 21
Credit: 109,496
RAC: 0
Message 10119 - Posted: 28 Jan 2006, 16:27:32 UTC

A lot of errors have been generated with this unit: 17535_test_279_1_0

https://boinc.bakerlab.org/rosetta/workunit.php?wuid=6516852



Here is the error:
<core_client_version>5.2.13</core_client_version>
<stderr_txt>
<soft_link>../../projects/boinc.bakerlab.org_rosetta/17535_test_279_1_0_2</soft_link>

</stderr_txt>
<message><file_xfer_error>
<file_name>17535_test_279_1_0_1</file_name>
<error_code>-161</error_code>
<error_message></error_message>
</file_xfer_error>
<file_xfer_error>
<file_name>17535_test_279_1_0_2</file_name>
<error_code>-161</error_code>
<error_message></error_message>
</file_xfer_error>
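
For anyone who wants to pull the failing file names and error codes out of a report like this one, a rough Python sketch follows. It uses a regular expression rather than an XML parser because the pasted fragment is not well-formed (the `<message>` element is never closed). The function name `transfer_errors` is made up for this example. Code -161 appears to correspond to ERR_NOT_FOUND in BOINC's error list, but treat that reading as an assumption.

```python
import re

# The relevant part of the report above, pasted as-is.
REPORT = """
<message><file_xfer_error>
<file_name>17535_test_279_1_0_1</file_name>
<error_code>-161</error_code>
<error_message></error_message>
</file_xfer_error>
<file_xfer_error>
<file_name>17535_test_279_1_0_2</file_name>
<error_code>-161</error_code>
<error_message></error_message>
</file_xfer_error>
"""

# Match each <file_xfer_error> entry and capture its file name and error code.
ENTRY = re.compile(
    r"<file_xfer_error>\s*"
    r"<file_name>(.*?)</file_name>\s*"
    r"<error_code>(-?\d+)</error_code>",
    re.S,
)


def transfer_errors(text: str) -> list[tuple[str, int]]:
    """Return (file_name, error_code) pairs found in a result report."""
    return [(name, int(code)) for name, code in ENTRY.findall(text)]


for name, code in transfer_errors(REPORT):
    print(f"{name}: error {code}")
```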

Natty
Joined: 10 Nov 05
Posts: 2
Credit: 18,983
RAC: 0
Message 10770 - Posted: 15 Feb 2006, 12:29:07 UTC

Hi,
I just got this error message. Can you please look into it?

MD5 check failed for aa1cei_09_05.200_v1_3gz
expected 8884afefe4c336c61568bc09ec0a627c, got 96d1de9e5a87f51eac091ca3ff8fb495
Checksum or signature error for aa1cei_09_05.200_v1_3.gz
Unrecoverable error for result FAST_ABINITIO_DEFAULT_1CEI__306_3430_0 (WU download error: couldn't get input files:<file_xfr_ .......

Before this message appeared, I changed the "leave applications in memory while preempted" setting under my general preferences to "yes", because I kept getting unrecoverable errors when most of the work was 100% done, so I have not been able to get any credit.

thanks


Andrew
Joined: 19 Sep 05
Posts: 162
Credit: 105,512
RAC: 0
Message 10771 - Posted: 15 Feb 2006, 13:27:25 UTC

The transfer/MD5 errors are due to the web server being overloaded.
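
To illustrate what the "MD5 check failed" line means: the client hashes the downloaded file and compares the digest with the one the server advertised, so a truncated or corrupted download yields a different digest. Below is a minimal Python sketch of that comparison, not the BOINC client's actual code. The function name `md5_matches` and the sample bytes are hypothetical; only the expected digest is taken from the log above.

```python
import hashlib


def md5_matches(data: bytes, expected_hex: str) -> bool:
    """Compare the MD5 digest of the downloaded bytes with the expected digest."""
    return hashlib.md5(data).hexdigest() == expected_hex.strip().lower()


# Expected digest copied from the log message in this thread.
EXPECTED = "8884afefe4c336c61568bc09ec0a627c"

# Hypothetical stand-in for a download that was cut short by an overloaded server.
partial_download = b"incomplete file contents"

if not md5_matches(partial_download, EXPECTED):
    print("MD5 check failed; the file should be downloaded again")
```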

humanoid
Joined: 22 Dec 05
Posts: 4
Credit: 31,590
RAC: 0
Message 11084 - Posted: 21 Feb 2006, 8:09:59 UTC

For the past few days I've been getting the error message "unrecoverable error for result...". Every time it finishes the work unit I get this message. What the heck is going on here?

David E K
Volunteer moderator
Project administrator
Project developer
Project scientist
Joined: 1 Jul 05
Posts: 1018
Credit: 4,334,829
RAC: 0
Message 11088 - Posted: 21 Feb 2006, 8:42:30 UTC

This is odd. I would suspend the R@h project on your clients for now, attach the problem computers to the Ralph test server and see what happens when we start doing more tests.

humanoid
Joined: 22 Dec 05
Posts: 4
Credit: 31,590
RAC: 0
Message 11126 - Posted: 21 Feb 2006, 17:04:44 UTC - in response to Message 11088.  

Thanks for the reply, David. I'm a noob to all of this, so bear with me. What is the Ralph test and how can I attach to it? Thanks!

David E K
Volunteer moderator
Project administrator
Project developer
Project scientist
Joined: 1 Jul 05
Posts: 1018
Credit: 4,334,829
RAC: 0
Message 11137 - Posted: 21 Feb 2006, 19:14:10 UTC

If you are using a recent version of the client, you can attach via the BOINC Manager by selecting "Attach to project" from the "Projects" pull-down menu and going from there. The project URL is:

http://ralph.bakerlab.org

Or you can go to this web site and click on "Join RALPH@home".


RALPH is our new alpha testing project for R@h. We are using this project to help fix bugs and to test new work units, application updates, etc. If you attach a host that is consistently having problems with R@h, we will be able to try to troubleshoot and debug via RALPH.

humanoid
Joined: 22 Dec 05
Posts: 4
Credit: 31,590
RAC: 0
Message 11209 - Posted: 22 Feb 2006, 7:51:58 UTC - in response to Message 11137.  

OK, I've attached it to Ralph, anything else I need to do?

Moderator9
Volunteer moderator
Joined: 22 Jan 06
Posts: 1014
Credit: 0
RAC: 0
Message 11215 - Posted: 22 Feb 2006, 14:19:26 UTC - in response to Message 11209.  

Wait for the next batch of Ralph test workunits. Currently Ralph is between tests, so it may be a few days before you see any workunits. When they arrive, let the system run them, and report any errors in the RALPH error reporting forums.

{NOTE: This thread will soon be moved to the NUMBER CRUNCHING forum.}


Moderator9
ROSETTA@home FAQ
Moderator Contact

Conan
Joined: 11 Oct 05
Posts: 151
Credit: 4,244,078
RAC: 2,770
Message 11366 - Posted: 25 Feb 2006, 4:20:25 UTC

Has anyone had the "Unrecoverable Error" when running both Rosetta@home and Ralph@home together? I had no problems with Rosetta till I started to run Ralph. Ironic, really, as Ralph is supposed to make Rosetta better but instead crashes it.
I have had this error a number of times over the past few days and am no longer getting credit for much at all. I run Einstein on the same machine and it has no problems. All projects were set to stay in memory, but I have now made Ralph not stay in memory and dropped CPU usage down to 4 hours to try to stop the jobs corrupting. When the graphic is running and I move the mouse to go back to the programme, I get the following error: "rosetta_4.82_windows_intelx86.exe has encountered a problem and needs to close. We are sorry for any inconvenience." I then see the corrupted jobs in the list.
Any help would be appreciated.

John Crowley
Joined: 26 Nov 05
Posts: 2
Credit: 951,224
RAC: 0
Message 11382 - Posted: 25 Feb 2006, 12:41:18 UTC - in response to Message 11366.  

I just started getting the same error on Thursday. I have aborted and restarted, to no avail. I'm not running Ralph.

Moderator9
Volunteer moderator
Joined: 22 Jan 06
Posts: 1014
Credit: 0
RAC: 0
Message 11385 - Posted: 25 Feb 2006, 16:01:44 UTC - in response to Message 11382.  

I only see 1 Rosetta error in your stats before the time period you mentioned, and it was before Ralph started. So I have to assume the errors you are talking about are all in Ralph. Have you posted any problem reports there? If not, please do so and provide a link to the error results.


Moderator9
ROSETTA@home FAQ
Moderator Contact

Moderator9
Volunteer moderator
Joined: 22 Jan 06
Posts: 1014
Credit: 0
RAC: 0
Message 11386 - Posted: 25 Feb 2006, 16:09:19 UTC - in response to Message 11366.  

Conan,

Looking at your errors, they are at least consistently the same error, and at least two are on the same WU type. What type of graphics card do you have in your system? Also, I assume that you have also had errors in Ralph. Could you provide a link to those so they can be compared to the Rosetta errors?

Moderator9
ROSETTA@home FAQ
Moderator Contact

Conan
Joined: 11 Oct 05
Posts: 151
Credit: 4,244,078
RAC: 2,770
Message 11414 - Posted: 26 Feb 2006, 9:24:58 UTC

Moderator9, thanks for the reply.
The computer is an AMD 4800+ with 4 GB RAM (BOINC shows 2.78 GB), 2x Winfast PX6600GT TDH graphics cards (SLI, but only 1 connected), 2x 250 GB HD and about a 4 GB swap file.
Settings (same for Rosetta and RALPH):
Use no more than 210 GB hard disk (was set to 100)
Leave 1 GB free (was 0.1)
Use no more than 85% of total disk space (was 50%)
Write to disk every 60 sec
Use no more than 90% of total virtual memory (was 70%, then 85%)
These changes were made over the last 2 days.
Rosetta workunits that have failed:
17/02/06 WU8383366
24/02/06 WU9556896
25/02/06 WU9556791, WU9556728
26/02/06 WU9556807
RALPH workunits that have failed (I don't have WU numbers for 17/18):
17/02/06 BARCODE_30_1shfA_209_12_0
18/02/06 BARCODE_30_256bA_209_18_0
25/02/06 WU6428, WU8592
26/02/06 WU9854, WU9855, WU9761
This may not be the way you wanted the links, but I was unsure how to give them to you, sorry. I joined RALPH on 16/02/06 and my Rosetta has suffered markedly since (only the occasional WU gets out).
Does this help you?

Conan
Joined: 11 Oct 05
Posts: 151
Credit: 4,244,078
RAC: 2,770
Message 11454 - Posted: 27 Feb 2006, 11:59:09 UTC

To Moderator9.
Just a follow-up on my unrecoverable error problems. I extended the time before the screen saver comes on from 10 minutes to 700 minutes, as I believe the errors happen when the screen saver is running. I had a Ralph download, and both it and Rosetta were running side by side when I left for work. I came home over 10.5 hours later and both projects had worked fine, completing the workunits and sending them off with no errors. So that may be something to look at.

Conan
Joined: 11 Oct 05
Posts: 151
Credit: 4,244,078
RAC: 2,770
Message 11491 - Posted: 28 Feb 2006, 12:43:55 UTC
Last modified: 28 Feb 2006, 12:46:18 UTC

A follow-up on this. After setting my screen saver to wait 700 minutes before coming on and reducing both Rosetta and Ralph to 4 hours of CPU time, Rosetta now appears to be running OK. Ralph still gives unrecoverable errors saying it is running out of disk space (with 210 GB available and 85% total usable), so I have increased the disk limit to 220 GB and 95% total usable and dropped the RALPH CPU time down to 2 hours to see what happens.

See http://ralph.bakerlab.org/workunit.php?wuid=10305 for the latest failed workunit.
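
For comparing these disk settings, here is a rough Python sketch of how the three preferences might combine, assuming the client simply applies whichever limit is most restrictive. That rule, the function name `disk_allowance_gb`, and the example figures for total disk, free space and current BOINC usage are all assumptions made for illustration; only the 210 GB, 85% and 1 GB values come from the settings posted earlier in the thread.

```python
def disk_allowance_gb(max_gb, max_pct, min_free_gb, total_gb, free_gb, boinc_used_gb):
    """Return the tightest of the three disk limits, in GB.

    Assumes the most restrictive preference wins; this is a sketch,
    not the BOINC client's own calculation.
    """
    by_absolute = max_gb                                 # "use no more than N GB"
    by_percent = total_gb * max_pct / 100                # "use no more than P% of total"
    by_free = boinc_used_gb + free_gb - min_free_gb      # "leave at least F GB free"
    return min(by_absolute, by_percent, by_free)


# Hypothetical machine state: 250 GB disk, 120 GB free, BOINC already using 2 GB.
print(disk_allowance_gb(210, 85, 1, total_gb=250, free_gb=120, boinc_used_gb=2))
```

Under these assumed figures the free-space rule is the binding one, so raising the 210 GB or 85% settings would not change the result. The "out of disk space" message on a single result could also come from a limit attached to that work unit rather than from these global settings.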
©2024 University of Washington
https://www.bakerlab.org