Questions and Answers : Unix/Linux : some WU's stop executing on linux
Author | Message |
---|---|
NilsB Send message Joined: 6 May 06 Posts: 1 Credit: 821 RAC: 0 |
There are several WUs that hang. Work unit IDs: 19785174, 19758621. The symptom is that BOINC stops spending time on the WU after 1:28 hours. It simply stops executing and the CPU consumption goes to 0, even if I let BOINC run for several hours. Other WUs on this system work fine, and so do other projects. BOINC Manager 5.4.9 on Linux. |
hugothehermit Send message Joined: 26 Sep 05 Posts: 238 Credit: 314,893 RAC: 0 |
G'day NilsB, welcome to Rosetta@Home. Rosetta does occasionally have errors on Linux (3.52% last time I saw). You can of course abort them if you see them, but the programme will eventually stop itself. The programme will also send debugging information about the work unit that failed, so the Rosetta@Home team can reduce these errors even further. Hope that helps, Hugo. |
Christian Send message Joined: 24 Nov 05 Posts: 1 Credit: 221,416 RAC: 0 |
I have the same problem on 2 Linux machines... |
Jean-David Beyer Send message Joined: 2 Nov 05 Posts: 188 Credit: 6,431,332 RAC: 4,520 |
G'day NilsB, I also have this problem. I noticed it yesterday and it is still stuck today. Work unit 1n0u_HIGHFREQ_ABRELAX_7_1_NATIVe_ONLY_BARCODE__1312_9043_0. It has accumulated 00:58:44. The BOINC client gives it an hour of CPU from time to time and it seems to use none of it. I am running Red Hat Enterprise Linux 3 ES (up to date) on a dual 3.06 GHz Xeon hyperthreaded processor with 8 GBytes RAM, and this leaves one hyperthreaded processor idle all the time it is scheduled. Other Rosetta applications run just fine and one completed sometime yesterday. You say "the programme will eventually stop itself." How long is eventually? Because eventually I will wish to abort it. |
bozho Send message Joined: 21 Dec 05 Posts: 1 Credit: 46,904 RAC: 0 |
I have a similar problem. Rosetta hangs: 6275 ? SN 59:56 rosetta_5.54_i686-pc-linux-gnu aa z025 _ -relax -looprlx -nstruct 5 -farlx -ex1 -ex2 -random_loop -loop_model -termini -short_range_hb_weight 0.50 -long_range_hb_weight 1.0 -farlx_cycle_ratio 1.0 -idl_no_chain_break -vary_omega -output_silent_gz -output_chi_silent -protein_name_prefix hom002_ -frags_name_prefix boinc_hom002_ -s z025_4_1g1cA__96.pdb -paths paths_200_z025.txt -do_farlx_checkpointing -checkpointing_interval 10 -fix_disulf disulf -cpu_run_time 10800 -watchdog -constant_seed -jran 3770064 6276 ? SN 0:00 rosetta_5.54_i686-pc-linux-gnu aa z025 _ -relax -looprlx -nstruct 5 -farlx -ex1 -ex2 -random_loop -loop_model -termini -short_range_hb_weight 0.50 -long_range_hb_weight 1.0 -farlx_cycle_ratio 1.0 -idl_no_chain_break -vary_omega -output_silent_gz -output_chi_silent -protein_name_prefix hom002_ -frags_name_prefix boinc_hom002_ -s z025_4_1g1cA__96.pdb -paths paths_200_z025.txt -do_farlx_checkpointing -checkpointing_interval 10 -fix_disulf disulf -cpu_run_time 10800 -watchdog -constant_seed -jran 3770064 6277 ? SN 0:00 rosetta_5.54_i686-pc-linux-gnu aa z025 _ -relax -looprlx -nstruct 5 -farlx -ex1 -ex2 -random_loop -loop_model -termini -short_range_hb_weight 0.50 -long_range_hb_weight 1.0 -farlx_cycle_ratio 1.0 -idl_no_chain_break -vary_omega -output_silent_gz -output_chi_silent -protein_name_prefix hom002_ -frags_name_prefix boinc_hom002_ -s z025_4_1g1cA__96.pdb -paths paths_200_z025.txt -do_farlx_checkpointing -checkpointing_interval 10 -fix_disulf disulf -cpu_run_time 10800 -watchdog -constant_seed -jran 3770064 6278 ? SN 0:00 rosetta_5.54_i686-pc-linux-gnu aa z025 _ -relax -looprlx -nstruct 5 -farlx -ex1 -ex2 -random_loop -loop_model -termini -short_range_hb_weight 0.50 -long_range_hb_weight 1.0 -farlx_cycle_ratio 1.0 -idl_no_chain_break -vary_omega -output_silent_gz -output_chi_silent -protein_name_prefix hom002_ -frags_name_prefix boinc_hom002_ -s z025_4_1g1cA__96.pdb -paths paths_200_z025.txt -do_farlx_checkpointing -checkpointing_interval 10 -fix_disulf disulf -cpu_run_time 10800 -watchdog -constant_seed -jran 3770064 The process has to be killed manually (I think a week is enough time to wait). OS: Slackware 11, updated to current; rosetta_5.54_i686-pc-linux-gnu |
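
The symptom reported throughout this thread is a task whose accumulated CPU time stops growing. As a rough sketch only (not part of BOINC or Rosetta), the Python below reads `ps`-style lines of the form `PID TTY STAT TIME COMMAND`, as in the listing above, converts the TIME column to seconds, and compares two samples taken some minutes apart. The function names `cpu_time_to_seconds`, `parse_ps_line` and `stalled_pids`, and the `min_progress` threshold, are hypothetical names chosen for this illustration.

```python
def cpu_time_to_seconds(text):
    """Convert a ps-style cumulative CPU time ([[DD-]HH:]MM:SS) to seconds."""
    days = 0
    if "-" in text:
        day_part, text = text.split("-", 1)
        days = int(day_part)
    parts = [int(p) for p in text.split(":")]
    while len(parts) < 3:  # pad missing hours (and minutes) with zero
        parts.insert(0, 0)
    hours, minutes, seconds = parts
    return ((days * 24 + hours) * 60 + minutes) * 60 + seconds


def parse_ps_line(line):
    """Split one 'PID TTY STAT TIME COMMAND' line into (pid, cpu_seconds, command)."""
    pid, _tty, _stat, cpu_time, command = line.split(None, 4)
    return int(pid), cpu_time_to_seconds(cpu_time), command


def stalled_pids(earlier, later, min_progress=1):
    """Return PIDs present in both samples whose CPU time grew by less
    than min_progress seconds between the two samples.

    earlier and later map PID -> accumulated CPU seconds.
    """
    return sorted(
        pid
        for pid, before in earlier.items()
        if pid in later and later[pid] - before < min_progress
    )


if __name__ == "__main__":
    sample = "6275 ? SN 59:56 rosetta_5.54_i686-pc-linux-gnu aa z025 _ -relax"
    pid, seconds, command = parse_ps_line(sample)
    print(pid, seconds, command.split()[0])
```

Entries that sit at 0:00 in both samples, like PIDs 6276 to 6278 above, would also be reported by `stalled_pids`; if those are helper threads of the same task (an assumption, the thread does not say), it is probably more useful to compare only the entry that has accumulated time. A task is only worth suspecting if it was actually scheduled to run between the two samples.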
©2024 University of Washington
https://www.bakerlab.org