Rosetta STOPS Running!

Ricky@SETI.USA
Joined: 13 Dec 05
Posts: 20
Credit: 97,355
RAC: 0
Message 32548 - Posted: 12 Dec 2006, 23:32:15 UTC

I have noticed that the last few times I have gotten WUs from Rosetta, when I open the BOINC Manager to check whether any WUs need to be uploaded, Rosetta has sometimes stopped running even when the Status says it is running.

Also, when it is running, the "To Completion" time goes UP instead of down (as it does on the other projects) until you close the BOINC Manager screen.

"Life is like an Ice Cream cone, just when you think you got it licked, it drips all over you!"

Mod.Sense
Volunteer moderator

Joined: 22 Aug 06
Posts: 4018
Credit: 0
RAC: 0
Message 32563 - Posted: 13 Dec 2006, 3:32:04 UTC

How have you determined that Rosetta has stopped running? Perhaps you are having the screensaver problems? Are you running BOINC as your screensaver?

Time to completion going up is normal. It will drop again once the current model is completed. This is described in the QA.

Happy Rosetta birthday to you!
Rosetta Moderator: Mod.Sense
Ricky@SETI.USA
Joined: 13 Dec 05
Posts: 20
Credit: 97,355
RAC: 0
Message 32668 - Posted: 15 Dec 2006, 0:20:13 UTC - in response to Message 32563.  
Last modified: 15 Dec 2006, 0:30:45 UTC

How have you determined that Rosetta has stopped running? Perhaps you are having the screensaver problems? Are you running BOINC as your screensaver?

Time to completion going up is normal. It will drop again once the current model is completed. This is described in the QA.

Happy Rosetta birthday to you!


It ran all night last night and is still on the SAME WU, at the SAME place where it was when I restarted BOINC, and NO other project has been worked on!

Yes the BOINC screensaver is set to NONE.

Well, if this is normal, then I need to stop R@H and let the other projects finish before they miss their deadlines.

Thank you!


"Life is like an Ice Cream cone, just when you think you got it licked, it drips all over you!"

Mod.Sense
Volunteer moderator

Joined: 22 Aug 06
Posts: 4018
Credit: 0
RAC: 0
Message 32678 - Posted: 15 Dec 2006, 3:15:19 UTC - in response to Message 32668.  

...ran all night last night and is still on the SAME WU! AT the SAME place where it was when I restarted BOINC and NO other project has been done!

Ya, that would not be normal. The time to completion will be recomputed when the end of a model is reached. It sounds like you are not reaching the end of the model for some reason.

You might suspend that task. Then BOINC will reschedule work, and crunch other projects if it needs to, and then see if the next Rosetta task proceeds normally.

Rosetta Moderator: Mod.Sense
Ricky@SETI.USA
Joined: 13 Dec 05
Posts: 20
Credit: 97,355
RAC: 0
Message 32768 - Posted: 16 Dec 2006, 21:40:16 UTC - in response to Message 32678.  

...ran all night last night and is still on the SAME WU! AT the SAME place where it was when I restarted BOINC and NO other project has been done!

Ya, that would not be normal. The time to completion will be recomputed when the end of a model is reached. It sounds like you are not reaching the end of the model for some reason.

You might suspend that task. Then BOINC will reschedule work, and crunch other projects if it needs to, and then see if the next Rosetta task proceeds normally.


OK, it took 24 hours to do one 3-hour WU.

I have 24 SpinHenge WUs due by the 23rd, so I think I will let it do them.


"Life is like an Ice Cream cone, just when you think you got it licked, it drips all over you!"

paulcsteiner
Joined: 15 Oct 05
Posts: 19
Credit: 3,144,322
RAC: 0
Message 33070 - Posted: 21 Dec 2006, 19:40:44 UTC

Something similar is happening here too. The machine is a P4 3 GHz HT, 512 MB RAM, WinXP Pro. If I reboot and relaunch BOINC Manager, it lists Rosetta as the project and loads the tasks it was working on. But when I leave it for the night, at some point the client just stops.
The screensaver is set to blank (the Windows screensaver, not the BOINC one) and the monitor powers off after 2 min. The machine is not going into hibernation and the HDs are set to never power off.
When I go to check the machine, the BOINC client comes up blank: no messages, no tasks, nothing listed in projects, definitely no activity. If I exit the app and relaunch it, it then shows the project and its tasks, some with processing done and some pending.
Any help is greatly appreciated as this was my best producer.

Maurizio
Joined: 20 Jun 06
Posts: 8
Credit: 642,308
RAC: 0
Message 39704 - Posted: 21 Apr 2007, 16:49:29 UTC

I have a problem similar to paulcsteiner, Message 33070.
My PC is:
Pentium 4 3 GHz HT, 1.5 GB RAM, HD with 80 GB free, WinXP Home, ADSL 4M

General Rosetta setup:
- Always use 1 CPU at 100%, so my PC constantly gives 50% to Rosetta (it's the only project I run) for 15 or 16 hours a day
- No screensaver

At least once or twice a day, when I open the BOINC Manager to check the work in progress, I find all the windows blank, the status bar showing 'Connecting to', and Rosetta's CPU usage at 0%.
If I exit BOINC via 'File', 'Exit' and then relaunch the application, everything starts perfectly, but obviously the time remaining for the WU that was running when the failure occurred is longer.

I attach the last section of the stderrdae.txt file, which I suppose logs the errors that occur while BOINC is running. I think this section relates to the last time the failure occurred.

****************************************************************************
ModLoad: 00e00000 00031000 C:ProgrammiBOINCsrcsrv.dll (6.5.3.7) (PDB Symbols Loaded)
File Version : 6.5.0003.7 (vbl_core_fbrel(jshay).050527-1915)
Company Name : Microsoft Corporation
Product Name : Debugging Tools for Windows(R)
Product Version: 6.5.0003.7



*** UNHANDLED EXCEPTION ****
Reason: Access Violation (0xc0000005) at address 0x0033B014 read attempt to address 0x00000008

*** Dump of the (offending) thread: ***
eax=00cb2c90 ebx=00bf4160 ecx=00000000 edx=00bf4208 esi=00c87ff0 edi=00bf4208
eip=0033b014 esp=016bfee0 ebp=00c7dff0
cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00010202

ChildEBP RetAddr Args to Child
016bfee0 0033adcd 00bf4208 00000000 00c87ff0 00000015 libcurl!Curl_llist_insert_next+0x5 (c:boincsrcsdkscurllibllist.c:78) FPO: [3,0,0]
016bff00 0032f7b3 00bf4160 00bf4208 00000015 00c7dff0 libcurl!Curl_hash_add+0xb (c:boincsrcsdkscurllibhash.c:165) FPO: [4,0,0]
016bff24 0032fae5 011c7c30 00c8a8e8 01222188 00000050 libcurl!Curl_cache_addr+0x19 (c:boincsrcsdkscurllibhostip.c:361) FPO: [4,1,0]
016bff48 0032fb52 003c72e0 0032fd7c 00c70ba8 00000000 libcurl!addrinfo_callback+0x15 (c:boincsrcsdkscurllibhostasyn.c:131) FPO: [0,1,0]
016bff50 0032fd7c 00c70ba8 00000000 003c72e0 00000013 libcurl!Curl_addrinfo4_callback+0x12 (c:boincsrcsdkscurllibhostasyn.c:161) FPO: [3,0,0]
016bff80 7c349565 00000000 00000013 00970a18 00c93dd8 libcurl!gethostbyname_thread+0x0 (c:boincsrcsdkscurllibhostthre.c:335) FPO: [1,4,0]
016bffb4 77e5d33b 00c93dd8 00000013 00970a18 00c93dd8 MSVCR71!__endthreadex+0x0 (c:boincsrcsdkscurllibhostthre.c:335)
016bffec 00000000 7c3494f6 00c93dd8 00000000 00000000 kernel32!_BaseThreadStart@8+0x0 (c:boincsrcsdkscurllibhostthre.c:335)

Exiting...
****************************************************************************

When Rosetta fails, compute time is wasted, so I would like to fix it.
Any suggestions?
Thanks
Maurizio

Mod.Sense
Volunteer moderator

Joined: 22 Aug 06
Posts: 4018
Credit: 0
RAC: 0
Message 39746 - Posted: 23 Apr 2007, 2:40:20 UTC

Maurizio, looking at your results page, it appears you are still running BOINC Manager version 5.4.11. At that release the BOINC Manager was having some problems maintaining contact with the crunching threads. These problems seem to have been resolved now in the current BOINC version.

Here is an outline of how to update your BOINC version.
Rosetta Moderator: Mod.Sense
Maurizio
Joined: 20 Jun 06
Posts: 8
Credit: 642,308
RAC: 0
Message 39750 - Posted: 23 Apr 2007, 5:52:59 UTC - in response to Message 39746.  

Maurizio, looking at your results page, it appears you are still running BOINC Manager version 5.4.11. At that release the BOINC Manager was having some problems maintaining contact with the crunching threads. These problems seem to have been resolved now in the current BOINC version.

Here is an outline of how to update your BOINC version.


Thank you for the help.
I've just installed the new BOINC version, so in a few days I will know if the problem is gone.
Thank you again.

Bye
Maurizio
Maurizio
Joined: 20 Jun 06
Posts: 8
Credit: 642,308
RAC: 0
Message 40082 - Posted: 30 Apr 2007, 14:09:47 UTC

During this week of work with the new BOINC release I've noticed that the behaviour is greatly improved, but sometimes BOINC still disconnects from the project.
What can I do?

Thank you for the support.
Bye
Maurizio
Mod.Sense
Volunteer moderator

Joined: 22 Aug 06
Posts: 4018
Credit: 0
RAC: 0
Message 40151 - Posted: 1 May 2007, 16:16:37 UTC

What are you seeing that tells you BOINC has disconnected from the project?
Rosetta Moderator: Mod.Sense
Maurizio
Joined: 20 Jun 06
Posts: 8
Credit: 642,308
RAC: 0
Message 40225 - Posted: 2 May 2007, 17:24:58 UTC - in response to Message 40151.  

What are you seeing that tells you BOINC has disconnected from the project?


In the lower right part of the status bar instead of 'Connected to localhost' BOINC states 'Disconnected', the CPU usage of Rosetta is 0% instead of the usual 50%, and the BOINC icon on the task bar has a red dot on white background superimposed.
After exiting BOINC with 'File', 'Exit' and relaunching it, all goes well.

Bye
Maurizio

Mod.Sense
Volunteer moderator

Joined: 22 Aug 06
Posts: 4018
Credit: 0
RAC: 0
Message 40230 - Posted: 2 May 2007, 18:47:52 UTC
Last modified: 2 May 2007, 18:48:34 UTC

I haven't seen that problem with the newer versions of BOINC, but I see you've already installed the current BOINC version. But it sounds like a BOINC problem, although I believe I've heard others indicate that other projects don't seem to have this problem.

How often do you see that happen?

If you would like to allow BOINC to use both CPUs, you can change your General Preferences and indicate that the maximum number of CPUs BOINC can use is 2, or something higher.
Rosetta Moderator: Mod.Sense
Maurizio
Joined: 20 Jun 06
Posts: 8
Credit: 642,308
RAC: 0
Message 40256 - Posted: 3 May 2007, 6:00:33 UTC - in response to Message 40230.  
Last modified: 3 May 2007, 6:01:28 UTC

I haven't seen that problem with the newer versions of BOINC, but I see you've already installed the current BOINC version. But it sounds like a BOINC problem, although I believe I've heard others indicate that other projects don't seem to have this problem.

How often do you see that happen?

If you would like to allow BOINC to use both CPUs, you can change your General Preferences and indicate that the maximum number of CPUs BOINC can use is 2, or something higher.


BOINC disconnects at random, but not more than once a day. Some days it doesn't disconnect at all.
One strange thing that has begun to happen only with the new BOINC version, however, is that when I switch on the PC and BOINC starts automatically, it sometimes doesn't connect to the project and stays disconnected with no further action.
I tried waiting some minutes to let BOINC finish initializing, but nothing happened. Then, with 'Advanced', 'Select computer...', 'localhost', the connection is established and BOINC begins working. Manually relaunching BOINC is also effective.

A suggestion: an external watchdog process could check whether BOINC is running and automatically restart it if it finds that BOINC has hung.
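
A minimal sketch of such a watchdog, in Python, might look like the following. It rests on assumptions that are not from this thread: that the BOINC client accepts GUI RPC connections on TCP port 31416 of localhost (the default), and that a restart command exists for the machine in question. The names `client_reachable`, `next_state`, `watch`, and `restart_cmd` are hypothetical, chosen for illustration.

```python
import socket
import subprocess
import time

# Assumption: the BOINC client listens for GUI RPC connections on the
# default TCP port 31416 of localhost.
GUI_RPC_PORT = 31416


def client_reachable(host="127.0.0.1", port=GUI_RPC_PORT, timeout=5.0):
    """Return True if a TCP connection to the client can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def next_state(failures, reachable, threshold=3):
    """Update the count of consecutive failed checks.

    Returns (new_failures, restart). restart becomes True once the client
    has been unreachable for `threshold` checks in a row, and the count
    is then reset.
    """
    if reachable:
        return 0, False
    failures += 1
    if failures >= threshold:
        return 0, True
    return failures, False


def watch(restart_cmd, interval=60.0, max_checks=None, probe=client_reachable):
    """Poll the client and launch restart_cmd when it looks hung.

    restart_cmd is a hypothetical argument list for whatever starts BOINC
    on this machine. max_checks=None means poll forever.
    Returns the number of restarts that were triggered.
    """
    failures = 0
    restarts = 0
    checks = 0
    while max_checks is None or checks < max_checks:
        failures, restart = next_state(failures, probe())
        if restart:
            subprocess.Popen(restart_cmd)
            restarts += 1
        checks += 1
        if max_checks is None or checks < max_checks:
            time.sleep(interval)
    return restarts
```

It would be started with something like `watch(["path-to-boinc-launcher"])`, where the command is a placeholder. Note the limits of this check: an open port only shows that the client process is alive and listening, not that a task is making progress, and it says nothing about a Manager window that has merely lost its connection while crunching continues.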

Bye
Maurizio

Maurizio
Joined: 20 Jun 06
Posts: 8
Credit: 642,308
RAC: 0
Message 40694 - Posted: 11 May 2007, 5:45:08 UTC

I have to correct my last post.
After another week of crunching, I can say that BOINC no longer disconnects while working.
And what I stated in my last post was not fully correct. Sometimes on PC startup the BOINC Manager doesn't connect to the client, but the crunching is nonetheless running. In that condition the CPU usage is 50% as expected, and when I connect BOINC using 'Advanced', 'Select computer...', 'localhost', the Message Window lists all the actions taken since startup, including task restarting, task reporting and so on, all stamped with the date and time of the PC startup.

Bye
Maurizio





©2024 University of Washington
https://www.bakerlab.org