Computation Error

Message boards : Number crunching : Computation Error


truckpuller
Joined: 5 Nov 05
Posts: 40
Credit: 229,134
RAC: 0
Message 7304 - Posted: 23 Dec 2005, 0:25:13 UTC

Just lost 15 jobs due to this "computation error" in the work section. Anyone here have any ideas? Using version 4.81 running Windows XP Pro; CPU is a Barton 2500+ with 1 GB of memory.
Visit us at Christianboards.org
ID: 7304 · Rating: 0
Andrew
Joined: 19 Sep 05
Posts: 162
Credit: 105,512
RAC: 0
Message 7305 - Posted: 23 Dec 2005, 0:30:06 UTC

This is a known issue...

Read this thread: https://boinc.bakerlab.org/rosetta/forum_thread.php?id=726
ID: 7305 · Rating: 0
truckpuller
Joined: 5 Nov 05
Posts: 40
Credit: 229,134
RAC: 0
Message 7315 - Posted: 23 Dec 2005, 1:17:15 UTC

So I take it that there is no cure for this? So I spent 1-2 hours downloading jobs only to have them all be of no use (by the way, I am on dial-up at 32 Kbps). It seems almost a waste of time, then.
Visit us at Christianboards.org
ID: 7315 · Rating: 0
Tern
Joined: 25 Oct 05
Posts: 576
Credit: 4,695,450
RAC: 13
Message 7318 - Posted: 23 Dec 2005, 1:23:09 UTC - in response to Message 7315.  

So I take it that there is no cure for this? So I spent 1-2 hours downloading jobs only to have them all be of no use (by the way, I am on dial-up at 32 Kbps). It seems almost a waste of time, then.


If you are on dial-up, the best thing you can do is "Suspend" Rosetta until after the holidays. Let your CPU work on some other projects, and then we'll be here waiting for you with all problems solved! :-)

(Fingers crossed...)

ID: 7318 · Rating: 0
Paul D. Buck
Joined: 17 Sep 05
Posts: 815
Credit: 1,812,737
RAC: 0
Message 7361 - Posted: 23 Dec 2005, 12:53:46 UTC - in response to Message 7315.  

So I take it that there is no cure for this? So I spent 1-2 hours downloading jobs only to have them all be of no use (by the way, I am on dial-up at 32 Kbps). It seems almost a waste of time, then.

Though all projects try to be problem free, so far none of them have quite made it. So, we live with it and work around the issues. If you want *MY* list of known issues (though some of these issues do not affect me, like screen savers; I don't use the screen savers) ...

CPDN: Model *ALWAYS* crashes on the PowerMac, usually on model re-starts
SAH: Enhanced application and AstroPulse not out yet
EAH: Screen saver crashes on some systems
PPAH: Crash pops dialog box halting system processing till clicked (real bad for systems I don't check daily)
LHC: Screen saver crashes; intermittent work
RAH: 1% bug (cost me 25 hours once ... sob)
SDG: Web site still partly written in Hungarian, no interaction to speak of with participants
PG: Not really science
WCG: 31 digit Account Key, I can get a 32 digit if I detach and lose work then reattach (will likely drain WCG when LHC gets work and do it at that time, giving LHC roughly 20%).

So, there you have it ... all projects have problems ... we just have to make the best of it ... though it is easier for me with cable access ...

If you can afford the connect time, it still is useful to run and crash and let the project have the bad result back. How can they fix it if they don't know what is broken? :)

Hope you can stay, or, if you detach, you come back soon ...
ID: 7361 · Rating: 0
Christianb
Joined: 5 Nov 05
Posts: 25
Credit: 364,297
RAC: 0
Message 7365 - Posted: 23 Dec 2005, 13:32:35 UTC - in response to Message 7318.  

If you are on dial-up, the best thing you can do is "Suspend" Rosetta until after the holidays. Let your CPU work on some other projects, and then we'll be here waiting for you with all problems solved! :-)

(Fingers crossed...)
That's not exactly a fix.

Visit us at Christianboards.org
ID: 7365 · Rating: 0
Paul D. Buck
Joined: 17 Sep 05
Posts: 815
Credit: 1,812,737
RAC: 0
Message 7369 - Posted: 23 Dec 2005, 13:50:32 UTC - in response to Message 7365.  

That's not exactly a fix.

No, it isn't.

It *IS* an appropriate and responsible solution to the immediate problem.

I will bet that you did not schedule a lot of work over the holidays ... we have to remember that the project people are just that sometimes ... people I mean ... they need time off too ...

Only on TV can problems be solved in 30 minutes (well, 20) ... the real world takes longer ... :)
ID: 7369 · Rating: 0
Tern
Joined: 25 Oct 05
Posts: 576
Credit: 4,695,450
RAC: 13
Message 7370 - Posted: 23 Dec 2005, 13:52:14 UTC - in response to Message 7365.  

That's not exactly a fix.


No, it isn't. It's a temporary workaround to avoid wasting a participant's resources.

And?

ID: 7370 · Rating: 0
Pphalan
Joined: 5 Nov 05
Posts: 53
Credit: 291,580
RAC: 0
Message 7371 - Posted: 23 Dec 2005, 13:59:07 UTC

So I have the same problem with 6 machines, 3 different OSes, and cable internet, and the best solution is to work on another project? That's 2 members of the same team with CPUs working at 100% for nothing... Sounds like my team will have to see about a new project if that keeps up.
http://www.christianboards.org/forum.php
http://usalug.org/phpBB2/index.php
ID: 7371 · Rating: 0
Tern
Joined: 25 Oct 05
Posts: 576
Credit: 4,695,450
RAC: 13
Message 7373 - Posted: 23 Dec 2005, 14:19:33 UTC - in response to Message 7371.  

So I have the same problem with 6 machines, 3 different OSes, and cable internet, and the best solution is to work on another project? That's 2 members of the same team with CPUs working at 100% for nothing... Sounds like my team will have to see about a new project if that keeps up.


The suggestion was "if you are on dial-up, you should probably suspend Rosetta until after the holidays, when they can get the problem solved". This is because of the ratio of download time to crunching time; dial-up people are spending too high a percentage of their time on non-productive work.

If you can afford the connect time, it still is useful to run and crash and let the project have the bad result back. How can they fix it if they don't know what is broken? :)


So, the recommendation for _you_ is to keep on crunching. You certainly _could_ suspend Rosetta, but you could do that just because it's Friday and you prefer not to do bio projects on Fridays. That's up to you. From the project's perspective, it would be far better if you and the rest of your team members who are not on dialup just continued processing. That would be the best contribution to the project that you can make, assisting in getting the "short WUs" flushed through.

On dial-up or paying by the MB for bandwidth - Suspend Rosetta until after holidays.
Have full-time internet connection -
a) Wish to make contribution to project - keep crunching
b) Wish to ensure that you don't lose any credit - Suspend Rosetta until after holidays.

It all depends on what your goals are, but it's ultimately up to you.

ID: 7373 · Rating: 0
Scribe
Joined: 2 Nov 05
Posts: 284
Credit: 157,359
RAC: 0
Message 7374 - Posted: 23 Dec 2005, 15:06:41 UTC

I will keep flushing them through for you on my two machines... broadband and no limit... Merry Christmas
ID: 7374 · Rating: 0
Johnathon
Joined: 5 Nov 05
Posts: 120
Credit: 138,226
RAC: 0
Message 7418 - Posted: 23 Dec 2005, 21:04:39 UTC
Last modified: 23 Dec 2005, 21:05:29 UTC

I'll run through what I can, but if I get too many, I'll shut down my machines on dialup and save the 'leccy bills. I can't afford to run through too much. If I can catch some of the bad ones, I'll be happy to help as much as poss.


<edit> And oh yes.. Merry Christmas =D </edit>

ID: 7418 · Rating: 0
Pphalan
Joined: 5 Nov 05
Posts: 53
Credit: 291,580
RAC: 0
Message 7441 - Posted: 23 Dec 2005, 23:17:37 UTC
Last modified: 23 Dec 2005, 23:56:04 UTC

After a review of all machines on this project, 7 are showing a 60% (correction: it's 90%) failure rate with the client. I am ending my involvement in this project since I can't seem to get a straight answer to any of the questions I ask. GOOD BYE...
http://www.christianboards.org/forum.php
http://usalug.org/phpBB2/index.php
ID: 7441 · Rating: 0
Tern
Joined: 25 Oct 05
Posts: 576
Credit: 4,695,450
RAC: 13
Message 7449 - Posted: 24 Dec 2005, 0:35:21 UTC - in response to Message 7441.  

I can't seem to get a straight answer to any of the questions I ask. GOOD BYE...


Could have sworn I gave a very straight and definitive answer to the question... it's a temporary problem, it will be solved after the holidays, there is NO way to get it solved before then, so, anyone wanting to continue can continue, anyone wanting to suspend Rosetta until it IS solved, should do so.

???

ID: 7449 · Rating: 0
Tern
Joined: 25 Oct 05
Posts: 576
Credit: 4,695,450
RAC: 13
Message 7456 - Posted: 24 Dec 2005, 1:10:34 UTC

Just for info - SETI sent out a bunch of bad WUs not too long ago. Anyone complaining about running for 30 seconds and not getting credit, should look here.

Nobody's perfect. But at least the Rosetta folks are _trying_ to do the right things, and giving credit when possible.

ID: 7456 · Rating: 0
David E K
Volunteer moderator
Project administrator
Project developer
Project scientist
Joined: 1 Jul 05
Posts: 1018
Credit: 4,334,829
RAC: 0
Message 7457 - Posted: 24 Dec 2005, 1:11:58 UTC

The random number seed issue should be resolved by now. There is still a 7% chance of getting a bad seed but this is no different than previous runs weeks before. When I and others get back from holiday break, we will fix it completely and grant credit to those affected by the recent issues.
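
As a rough illustration of what a 7% bad-seed rate means for a batch of work units, here is a minimal sketch. It assumes each work unit independently has a 0.07 chance of drawing a bad seed; the independence assumption and the function names are illustrative only and do not come from the Rosetta code.

```python
from math import comb

# 7% is the figure quoted above; treating work units as independent
# trials is an assumption made only for this illustration.
BAD_SEED_RATE = 0.07


def expected_bad(n_units: int, p: float = BAD_SEED_RATE) -> float:
    """Expected number of failed work units out of n_units."""
    return n_units * p


def prob_at_least(k: int, n_units: int, p: float = BAD_SEED_RATE) -> float:
    """Binomial probability that at least k of n_units fail."""
    return sum(
        comb(n_units, i) * p**i * (1 - p) ** (n_units - i)
        for i in range(k, n_units + 1)
    )


if __name__ == "__main__":
    n = 15  # the number of jobs lost in the first post of this thread
    print(f"Expected failures in {n} units: {expected_bad(n):.2f}")
    print(f"P(at least 1 of {n} fails): {prob_at_least(1, n):.3f}")
    print(f"P(all {n} fail): {prob_at_least(n, n):.3e}")
```

Under these assumptions, about one failure in 15 work units would be normal at the baseline rate, while losing all 15 by chance would be extremely unlikely, which suggests a run like that comes from a bad batch rather than from ordinary bad luck.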
ID: 7457 · Rating: 0
Pphalan
Joined: 5 Nov 05
Posts: 53
Credit: 291,580
RAC: 0
Message 7461 - Posted: 24 Dec 2005, 1:33:07 UTC

I still have no understanding as to why all of my machines have had 3 days of these failures. Was I just plain hit with bad odds of receiving so many bad jobs? I have to leave for Iraq, and I want to know if keeping these machines running is worth it. These machines ran without problem for the month I was in Afghanistan for Find-a-Drug. Remotes keep dropping the client and I have to restart them about every week. Now I look at my results for this day alone and have not contributed any good job all day.
http://www.christianboards.org/forum.php
http://usalug.org/phpBB2/index.php
ID: 7461 · Rating: 0
Webmaster Yoda
Joined: 17 Sep 05
Posts: 161
Credit: 162,253
RAC: 0
Message 7463 - Posted: 24 Dec 2005, 1:33:53 UTC - in response to Message 7458.  
Last modified: 24 Dec 2005, 1:52:25 UTC

We live in an imperfect world where not everything runs smoothly. That's life.

Yes, Rosetta has had problems and is still recovering from it. So have other projects (and I've participated in a few). Some of those took a lot longer to recover.

You're not the only one who has had problems with this batch of bad work units. I've had dozens, if not hundreds of them myself. I decided to not download any new Rosetta work for a few days and reduce my resource share temporarily. This problem will take some time to clear up but signs are that it is getting there.

Most of the problems were in jobs from batches 204 to 207. If you are having problems with other batches, perhaps the problem is elsewhere, e.g. hardware problems.

And you have been given straight answers.

So I have the same problem with 6 machines, 3 different OSes, and cable internet, and the best solution is to work on another project?


Yes. That's the beauty of BOINC.

That's 2 members of the same team with CPUs working at 100% for nothing...


More like thousands of members of hundreds of teams having CPUs working in short bursts on work units that have a problem. It pales into insignificance with problems I have experienced on other projects.

Sounds like my team will have to see about a new project if that keeps up.


That's your choice and I have done so myself with some projects, after giving them time to sort out their problems. Let us know when you find a project that runs perfectly - I am yet to find one.

Edit: Some of your machines also fall below the recommended specs to run Rosetta. For instance, one of your machines has only 128MB RAM and a number of them are running Windows ME, which is not officially supported.
*** Join BOINC@Australia today ***
ID: 7463 · Rating: 0
Pphalan
Joined: 5 Nov 05
Posts: 53
Credit: 291,580
RAC: 0
Message 7467 - Posted: 24 Dec 2005, 2:09:16 UTC

Remove me as a member of this forum Administrator.
http://www.christianboards.org/forum.php
http://usalug.org/phpBB2/index.php
ID: 7467 · Rating: 0
UBT - Rick Horn
Joined: 17 Dec 05
Posts: 7
Credit: 283,961
RAC: 0
Message 7491 - Posted: 24 Dec 2005, 7:14:44 UTC - in response to Message 7304.  

Just lost 15 jobs due to this "computation error" in the work section. Anyone here have any ideas? Using version 4.81 running Windows XP Pro; CPU is a Barton 2500+ with 1 GB of memory.

I'm using 4.8 and am also losing at least 50% of WUs to "computation error". It doesn't happen on other BOINC projects, and I'm sure my computer is OK. I guess Rosetta will have to sort out its act before it loses all its clients.
ID: 7491 · Rating: 0



©2024 University of Washington
https://www.bakerlab.org