Work Units that fail in under one minute - Report HERE

Moderator9
Volunteer moderator
Joined: 22 Jan 06
Posts: 1014
Credit: 0
RAC: 0
Message 11957 - Posted: 12 Mar 2006, 21:16:24 UTC

Please report workunits that fail in under one minute in this thread. Please include a link to the result and, if possible, your computer type and operating system. This matters because the problem is sometimes related to system type.
Moderator9
ROSETTA@home FAQ
Moderator Contact

Los Alcoholicos~La Muis
Joined: 4 Nov 05
Posts: 34
Credit: 1,041,724
RAC: 0
Message 11961 - Posted: 12 Mar 2006, 23:27:53 UTC
Last modified: 12 Mar 2006, 23:28:17 UTC

HOMSdt_homDB027_1dtj__352_898_0 Intel celeron tualatin 1,4GHz - Win98

HOMSdt_homDB004_1dtj__352_1699_0 AMD Athlon XP 2600+ - WinXP Professional SP2
HOMSdt_homDB030_1dtj__352_1518_0 AMD Athlon XP 2600+ - WinXP Professional SP2


HOMSdt_homDB009_1dtj__352_1987_1 AMD Athlon XP 2400+ - WinXP Professional SP2


HOMSdt_homDB004_1dtj__352_798_2 Intel PII 350MHz - WinXP Professional SP2
(It took a little longer to fail, but it is a very slow PC.)

HOMSdt_homDB011_1dtj__352_1628_0 PPC G4 2GHz - MacOSX 10.3.9


HOMSti_homDB025_1tif__352_601_2 PPC Dual G5 2GHz - MacOSX 10.4.5

BennyRop
Joined: 17 Dec 05
Posts: 555
Credit: 140,800
RAC: 0
Message 11962 - Posted: 12 Mar 2006, 23:33:11 UTC

3/4/2006 6:26:59 PM|rosetta@home|Unrecoverable error for result HOMSdt_homDB004_1dtj__340_50_0 (Incorrect function. (0x1) - exit code 1 (0x1))

https://boinc.bakerlab.org/rosetta/workunit.php?wuid=10015363
And what's with the claimed credit? "0.0793899404026963" .. You don't really keep track of the credits with that precision do you? *snicker* (There's no chance that I'll be turning in the trillions of results to ever get a point out of the numbers on the far right.. :)

Previously posted:
-------
stderr.txt
# random seed: 3468381
# cpu_run_time_pref: 86400
---
2/19/2006 6:24:18 PM||Starting BOINC client version 5.2.13 for windows_intelx86
2/19/2006 6:24:18 PM||libcurl/7.14.0 OpenSSL/0.9.8 zlib/1.2.3
2/19/2006 6:24:18 PM||Executing as a daemon
2/19/2006 6:24:18 PM||Data directory: C:\Program Files\BOINC
2/19/2006 6:24:18 PM||BOINC is running as a service and as a non-system user.
2/19/2006 6:24:18 PM||No application graphics will be available.
2/19/2006 6:24:18 PM||Processor: 1 AuthenticAMD AMD Athlon(tm) 64 Processor 3000+
2/19/2006 6:24:18 PM||Memory: 1023.48 MB physical, 1.65 GB virtual
2/19/2006 6:24:18 PM||Disk: 29.29 GB total, 3.84 GB free
2/19/2006 6:24:18 PM|rosetta@home|Computer ID: 121218; location: home; project prefs: default
2/19/2006 6:24:18 PM||General prefs: from rosetta@home (last modified 2005-12-29 13:52:58)
2/19/2006 6:24:18 PM||General prefs: no separate prefs for home; using your defaults
2/19/2006 6:24:19 PM||Remote control not allowed; using loopback address

------
3/4/2006 6:26:59 PM|rosetta@home|Unrecoverable error for result HOMSdt_homDB004_1dtj__340_50_0 (Incorrect function. (0x1) - exit code 1 (0x1))
3/4/2006 6:26:59 PM||request_reschedule_cpus: process exited
3/4/2006 6:26:59 PM|rosetta@home|Computation for result HOMSdt_homDB004_1dtj__340_50_0 finished
------

25.98 seconds.. it sure failed quickly.
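
For anyone who wants to pull these failures out of their own client messages instead of copying them by hand, here is a minimal sketch. It assumes the pipe-separated line layout shown in the log excerpt above (timestamp|project|message); the names `FailedResult` and `parse_failure` are made up for this example and are not part of BOINC.

```python
import re
from typing import NamedTuple, Optional


class FailedResult(NamedTuple):
    """One 'Unrecoverable error' line, split into its parts (hypothetical type)."""
    timestamp: str
    project: str
    result_name: str
    exit_code: int


# Assumed layout, taken from the lines quoted in this thread:
# <timestamp>|<project>|Unrecoverable error for result <name> (<text> - exit code <n> (0x..))
_FAILURE = re.compile(
    r"^(?P<ts>[^|]+)\|(?P<project>[^|]*)\|Unrecoverable error for result "
    r"(?P<name>\S+) \(.*exit code (?P<code>-?\d+)"
)


def parse_failure(line: str) -> Optional[FailedResult]:
    """Return the parsed failure, or None if the line is not a failure line."""
    match = _FAILURE.match(line.strip())
    if match is None:
        return None
    return FailedResult(
        timestamp=match["ts"],
        project=match["project"],
        result_name=match["name"],
        exit_code=int(match["code"]),
    )
```

Feeding it each line of the message log and keeping the non-None results gives a list of result names ready to paste into a report.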

Mine is a 754-pin Athlon 64 running WinXP Pro SP2 (supposedly fully updated, minus the Microsoft anti-spyware package), with Panda Titanium antivirus.

Running on a MSI K8T Neo motherboard.

----
The WU in question failed 3 times..

Scribe
Joined: 2 Nov 05
Posts: 284
Credit: 157,359
RAC: 0
Message 11972 - Posted: 13 Mar 2006, 6:38:01 UTC
Last modified: 13 Mar 2006, 6:38:18 UTC

As Dr Baker said elsewhere -

Only a subset failed; Divya identified the problem and fixed it. For experts, the problem was that the "-termini" option adds a proton to the N terminus, but for proline there is no place to put the proton, and a subset of the 1dtj homologues had an N-terminal proline. This is the sort of mistake that only gets made once, and it has now been fixed.
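
To illustrate the kind of guard that explanation implies, here is a minimal sketch that flags sequences starting with proline before any N-terminal handling is applied. This is not Rosetta code: the function names and the sample sequences are invented for illustration, and it assumes sequences are given as one-letter amino-acid strings.

```python
from typing import Dict, List


def n_terminal_is_proline(sequence: str) -> bool:
    """Return True if the first residue of a one-letter sequence is proline ('P')."""
    if not sequence:
        raise ValueError("empty sequence")
    return sequence[0].upper() == "P"


def flag_n_terminal_proline(sequences: Dict[str, str]) -> List[str]:
    """Return the names (sorted) of sequences whose N-terminal residue is proline.

    These are the inputs that, per the explanation above, would need to be
    skipped or handled specially when a proton is added to the N terminus.
    """
    return sorted(
        name for name, seq in sequences.items() if n_terminal_is_proline(seq)
    )


# Hypothetical example input; these are not real 1dtj homologue sequences.
example = {"homologue_a": "PKVLT", "homologue_b": "MKVLT"}
flagged = flag_n_terminal_proline(example)
```

Running a check like this over a batch before sending it out would single out the affected subset rather than letting each one fail on a volunteer's machine.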


Do we now need this sticky?

dag
Joined: 16 Dec 05
Posts: 106
Credit: 1,000,020
RAC: 0
Message 11996 - Posted: 14 Mar 2006, 3:01:18 UTC

I'm tired. HEY MOD9!! Why don't you write a query to do this??

ON INTEL PENTIUM R OS = WINDOZ XP + SP2
https://boinc.bakerlab.org/rosetta/result.php?resultid=12543070
https://boinc.bakerlab.org/rosetta/result.php?resultid=12886882
https://boinc.bakerlab.org/rosetta/result.php?resultid=13121126
https://boinc.bakerlab.org/rosetta/result.php?resultid=13142734
https://boinc.bakerlab.org/rosetta/result.php?resultid=13391574
https://boinc.bakerlab.org/rosetta/result.php?resultid=13391593

ON INTEL OS = WIN2K
https://boinc.bakerlab.org/rosetta/result.php?resultid=13120701
https://boinc.bakerlab.org/rosetta/result.php?resultid=13200669
https://boinc.bakerlab.org/rosetta/result.php?resultid=13207899



ON INTEL XEON OS = LINUX 2.4 (at least two different systems)
https://boinc.bakerlab.org/rosetta/result.php?resultid=12612776
https://boinc.bakerlab.org/rosetta/result.php?resultid=12797558
https://boinc.bakerlab.org/rosetta/result.php?resultid=12817678
https://boinc.bakerlab.org/rosetta/result.php?resultid=12837769
https://boinc.bakerlab.org/rosetta/result.php?resultid=12843045
https://boinc.bakerlab.org/rosetta/result.php?resultid=12871697
https://boinc.bakerlab.org/rosetta/result.php?resultid=12885622
https://boinc.bakerlab.org/rosetta/result.php?resultid=12893021
https://boinc.bakerlab.org/rosetta/result.php?resultid=12923657
https://boinc.bakerlab.org/rosetta/result.php?resultid=12927304
https://boinc.bakerlab.org/rosetta/result.php?resultid=13010256
https://boinc.bakerlab.org/rosetta/result.php?resultid=13012697
https://boinc.bakerlab.org/rosetta/result.php?resultid=13014030
https://boinc.bakerlab.org/rosetta/result.php?resultid=13105842
https://boinc.bakerlab.org/rosetta/result.php?resultid=13116789
https://boinc.bakerlab.org/rosetta/result.php?resultid=13116799
https://boinc.bakerlab.org/rosetta/result.php?resultid=13116800
https://boinc.bakerlab.org/rosetta/result.php?resultid=13116808
https://boinc.bakerlab.org/rosetta/result.php?resultid=13116813
https://boinc.bakerlab.org/rosetta/result.php?resultid=13116842
https://boinc.bakerlab.org/rosetta/result.php?resultid=13116853
https://boinc.bakerlab.org/rosetta/result.php?resultid=13116857
https://boinc.bakerlab.org/rosetta/result.php?resultid=13116864
https://boinc.bakerlab.org/rosetta/result.php?resultid=13116865
https://boinc.bakerlab.org/rosetta/result.php?resultid=13149591
https://boinc.bakerlab.org/rosetta/result.php?resultid=13209450
https://boinc.bakerlab.org/rosetta/result.php?resultid=13215689
https://boinc.bakerlab.org/rosetta/result.php?resultid=13217161
https://boinc.bakerlab.org/rosetta/result.php?resultid=13220449
https://boinc.bakerlab.org/rosetta/result.php?resultid=13401124
https://boinc.bakerlab.org/rosetta/result.php?resultid=13402189



dag
--Finding aliens is cool, but understanding the structure of proteins is useful.

Moderator9
Volunteer moderator
Joined: 22 Jan 06
Posts: 1014
Credit: 0
RAC: 0
Message 11997 - Posted: 14 Mar 2006, 3:13:19 UTC - in response to Message 11996.  

I'm tired. HEY MOD9!! Why don't you write a query to do this??...


dgeiser,

Well, to be honest I would not have thought the problem would be this big. I will suggest that.
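
As a rough idea of what such a query could look like, here is a sketch against a small in-memory SQLite table. The table name `result_sample` and its columns are assumptions made up for this example and are not the project's actual database schema. The first sample row uses the result name, computer ID and 25.98-second run time reported earlier in this thread; the other two rows are invented.

```python
import sqlite3
from typing import List, Tuple


def demo_connection() -> sqlite3.Connection:
    """Build an in-memory table with a hypothetical layout for the demo."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE result_sample ("
        "name TEXT, host_id INTEGER, cpu_time REAL, failed INTEGER)"
    )
    conn.executemany(
        "INSERT INTO result_sample VALUES (?, ?, ?, ?)",
        [
            # Values reported earlier in this thread.
            ("HOMSdt_homDB004_1dtj__340_50_0", 121218, 25.98, 1),
            # Invented rows, only here to show what the filter excludes.
            ("example_long_failure", 1, 3600.0, 1),
            ("example_quick_success", 2, 30.0, 0),
        ],
    )
    return conn


def failed_under(
    conn: sqlite3.Connection, max_seconds: float = 60.0
) -> List[Tuple[str, int, float]]:
    """Return (name, host_id, cpu_time) for failed results shorter than max_seconds."""
    cursor = conn.execute(
        "SELECT name, host_id, cpu_time FROM result_sample "
        "WHERE failed = 1 AND cpu_time < ? ORDER BY cpu_time",
        (max_seconds,),
    )
    return cursor.fetchall()


quick_failures = failed_under(demo_connection())
```

Joining the output to host details would also give the CPU and operating system breakdown that people have been typing in by hand.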


Scribe,

I think I could safely say that maybe we still need a thread for this.


BennyRop
Joined: 17 Dec 05
Posts: 555
Credit: 140,800
RAC: 0
Message 12018 - Posted: 14 Mar 2006, 21:00:51 UTC

From Mikus: https://boinc.bakerlab.org/rosetta/forum_thread.php?id=1106#12013

HOMSdt_homDB030_1dtj__352_577
https://boinc.bakerlab.org/rosetta/workunit.php?wuid=10322648

HOMSdt_homDB003_1dtj__352_744
https://boinc.bakerlab.org/rosetta/workunit.php?wuid=10339303

After failing on the 8th, they were handed out again on the 14th.






JChojnacki
Joined: 17 Sep 05
Posts: 71
Credit: 10,633,777
RAC: 3,777
Message 12024 - Posted: 14 Mar 2006, 23:22:37 UTC
Last modified: 14 Mar 2006, 23:24:00 UTC

All these failed in under one minute.

WU's:
HOMSti_homDB025_1tif__352_437_0
HOMSdt_homDB030_1dtj__352_723_1
HOMSdt_homDB003_1dtj__352_583_1

Computer 420:
CPU type GenuineIntel
Intel(R) Pentium(R) 4 CPU 3.40GHz
Number of CPUs 2
Operating System Microsoft Windows XP
Professional Edition, Service Pack 2, (05.01.2600.00)
Memory 2045.52 MB
Cache 976.56 KB
Swap space 3939.82 MB
Total disk space 172.76 GB
Free Disk Space 93.22 GB

Hope it helps.

Joel


Jon Kennedy
Joined: 1 Oct 05
Posts: 6
Credit: 418,027
RAC: 0
Message 12040 - Posted: 15 Mar 2006, 5:34:38 UTC

My results that failed at around one minute, with their workunits:

https://boinc.bakerlab.org/rosetta/result.php?resultid=12943155
https://boinc.bakerlab.org/rosetta/workunit.php?wuid=10294520

https://boinc.bakerlab.org/rosetta/result.php?resultid=12792095
https://boinc.bakerlab.org/rosetta/workunit.php?wuid=10278745

Both had three tries and three failures.

My computer
https://boinc.bakerlab.org/rosetta/show_host_detail.php?hostid=7476

Evan
Joined: 23 Dec 05
Posts: 268
Credit: 402,585
RAC: 0
Message 12049 - Posted: 15 Mar 2006, 9:44:25 UTC

[this one failed at model 1 step 21985 twice]
https://boinc.bakerlab.org/rosetta/show_host_detail.php?hostid=143996

BennyRop
Joined: 17 Dec 05
Posts: 555
Credit: 140,800
RAC: 0
Message 12067 - Posted: 15 Mar 2006, 19:43:19 UTC - in response to Message 12049.  

[this one failed at model 1 step 21985 twice]
https://boinc.bakerlab.org/rosetta/show_host_detail.php?hostid=143996

https://boinc.bakerlab.org/rosetta/workunit.php?wuid=11046932 is the workunit. FA_RLXc9_hom009_1c9oA_359_287
failed on a p4, but succeeded on a mobile pIII. Strange.

BennyRop
Joined: 17 Dec 05
Posts: 555
Credit: 140,800
RAC: 0
Message 12068 - Posted: 15 Mar 2006, 19:50:53 UTC

https://boinc.bakerlab.org/rosetta/workunit.php?wuid=10816289
HOMSti_homDB025_1tif__352_1905_1
March 12th.


mewbysea
Joined: 29 Jan 06
Posts: 17
Credit: 15,843,832
RAC: 1,618
Message 12122 - Posted: 17 Mar 2006, 2:59:38 UTC - in response to Message 11957.  

Here's another one -- failed for 3 users

HOMSti_homDB025_1tif__352_1177

wuid=10684378

computer 169851 = HP D530 SFF w/ Pentium 4 @ 2.6 GHz (stock) running WIN XP SP2

bruce boytler
Joined: 17 Sep 05
Posts: 68
Credit: 3,565,442
RAC: 0
Message 12267 - Posted: 19 Mar 2006, 15:44:00 UTC

HOMSdt_homDB027_1dtj__352_875
HOMSdt_homDB030_1dtj__352_163
HOMSdt_homDB004_1dtj__352_123
HOMSti_homDB025_1tif__346_132
HOMSdt_homDB011_1dtj__340_114
NO_MORE_RELAX_CYCLES_1dtj_214_18

Moderator9
Volunteer moderator
Joined: 22 Jan 06
Posts: 1014
Credit: 0
RAC: 0
Message 12269 - Posted: 19 Mar 2006, 16:32:43 UTC
Last modified: 19 Mar 2006, 22:29:12 UTC

For those who may be interested, Rom has posted information about Rosetta Work Unit errors and the status of the ongoing work to fix the bugs in Rosetta on his "Blog".

Joe59
Joined: 16 Dec 05
Posts: 2
Credit: 43,529
RAC: 0
Message 12672 - Posted: 25 Mar 2006, 10:27:39 UTC
Last modified: 25 Mar 2006, 10:31:26 UTC

Unit homsdt_homdb004_1dtj__352_1433 failed on my computer (AMD Athlon64 X2 4200+, WinXP SP2).

This unit also failed for two other people. The WU is here:

https://boinc.bakerlab.org/rosetta/workunit.php?wuid=10408204

Christoph
Joined: 10 Dec 05
Posts: 57
Credit: 1,512,386
RAC: 0
Message 12673 - Posted: 25 Mar 2006, 11:21:54 UTC

[This WU stops at step 20995]
HB_BARCODE_30_1b3aA_351_15946
https://boinc.bakerlab.org/rosetta/workunit.php?wuid=12025181
---
P4 1500 MHZ
640 MB RAM
WinXP SP2

Jon Kennedy
Joined: 1 Oct 05
Posts: 6
Credit: 418,027
RAC: 0
Message 12864 - Posted: 31 Mar 2006, 2:42:45 UTC

Here's another WU that stopped for me in just under a minute, and failed even earlier on two other systems:
https://boinc.bakerlab.org/rosetta/workunit.php?wuid=10812981

BennyRop
Joined: 17 Dec 05
Posts: 555
Credit: 140,800
RAC: 0
Message 12886 - Posted: 31 Mar 2006, 17:58:18 UTC

I got two more of the 25 second failures. It looked like they instantly failed on the first machine, were ghosted or lost on the second machine, and had to wait this long for me to become the 3rd failure. :)

That's probably what's happening with the other failures that are showing up all of a sudden after a couple of weeks of silence.

David Baker
Volunteer moderator
Project administrator
Project developer
Project scientist
Joined: 17 Sep 05
Posts: 705
Credit: 559,847
RAC: 0
Message 12889 - Posted: 31 Mar 2006, 19:29:37 UTC - in response to Message 12886.  

I got two more of the 25 second failures. It looked like they instantly failed on the first machine, were ghosted or lost on the second machine, and had to wait this long for me to become the 3rd failure. :)

That's probably what's happening with the other failures that are showing up all of a sudden after a couple of weeks of silence.


Thanks for letting us know. These work units should have been out of the system weeks ago; I'll look into why they are still around.