Message boards : Number crunching : Lot of failures
Author | Message |
---|---|
adrianxw Send message Joined: 18 Sep 05 Posts: 653 Credit: 11,840,739 RAC: 28 |
A large percentage of the work units sent here today failed quickly (~20 seconds) with: -1073741819 (0xC0000005) STATUS_ACCESS_VIOLATION. Two, however, have started, have been running for a couple of hours, and are showing 12% complete. Wave upon wave of demented avengers march cheerfully out of obscurity into the dream. |
Bryn Mawr Send message Joined: 26 Dec 18 Posts: 393 Credit: 12,114,842 RAC: 4,200 |
>>> A large percentage of the work units sent here today quickly, (~20 seconds), failed with... They’re all working fine on my Ubuntu boxes. If this is another case of work on Linux / crash on Windows, could those who are crashing and do not run VBox tasks set NNT (No New Tasks), so that the tasks last longer for us work-starved souls who don’t have VBox-level resources? |
adrianxw Send message Joined: 18 Sep 05 Posts: 653 Credit: 11,840,739 RAC: 28 |
The two I mentioned earlier are still crunching away, 24% and 26% respectively. VBox is here, and a couple of projects use it without issue. I know what you mean by "work starved". I always regarded Rosetta as an endless supply of work units which caused no problems for me, (Windows 8.1 x64), but work units are few and far between nowadays. The Italian project TN-Grid, (http://gene.disi.unitn.it/test/), is taking up most of the slack though. Generally, the number of projects I consider to be worth supporting has fallen markedly. Wave upon wave of demented avengers march cheerfully out of obscurity into the dream. |
Bryn Mawr Send message Joined: 26 Dec 18 Posts: 393 Credit: 12,114,842 RAC: 4,200 |
>>> The two I mentioned earlier are still crunching away, 24% and 26% respectively. VBox is here, and a couple of projects use it without issue. Of my 5+1 projects (Ralph is the +1), I’m down to 2 giving me work, and of those, TN-Grid can’t take the strain: it’s bumping along, with all the work going out as soon as it’s loaded into the queue, because the work generator is unable to keep up with demand. |
Jean-David Beyer Send message Joined: 2 Nov 05 Posts: 188 Credit: 6,461,090 RAC: 6,053 |
Not here. Of the work units I have received, the first six completed successfully. I am currently running six more, and each has accumulated around four hours of CPU time. I am running: Computer: 5910575; CPU type: GenuineIntel Intel(R) Xeon(R) W-2245 CPU @ 3.90GHz [Family 6 Model 85 Stepping 7]; Number of processors: 16; Operating System: Linux Red Hat Enterprise Linux Red Hat Enterprise Linux 8.5 (Ootpa) [4.18.0-348.20.1.el8_5.x86_64|libc 2.28 (GNU libc)]; BOINC version: 7.16.11 |
adrianxw Send message Joined: 18 Sep 05 Posts: 653 Credit: 11,840,739 RAC: 28 |
Hi JD, yes, Linux systems certainly do seem immune to this problem. If the project is getting enough work done, there is little impetus to fix the problem with Windows, or to issue VBox work to the same crunchers. As long as the work is getting done, fair enough. Twenty years ago, comms were so slow that it would have been a serious annoyance. Wave upon wave of demented avengers march cheerfully out of obscurity into the dream. |
[VENETO] boboviz Send message Joined: 1 Dec 05 Posts: 1995 Credit: 9,634,307 RAC: 6,840 |
>>> Hi JD, yes, Linux systems certainly do seem immune from this problem. If the project is getting enough work done, there is little impetus to fix the problem with Windows Yes. But the LARGE part of clients/volunteers runs Windows. P.S. Same error on my Win11: <message> |
adrianxw Send message Joined: 18 Sep 05 Posts: 653 Credit: 11,840,739 RAC: 28 |
>>> LARGE part of clients/volunteers runs Windows Certainly. Let us, however, consider the purpose. We are trying to help them. It is up to them to decide if Linux users are sufficient in number to achieve the result they require. It has already been mentioned in this thread that the amount of work from the project is a lot less than it used to be. They will, however, pick up on the fact that Windows users are seeing crashed work, and stop issuing it to these people, which, obviously, includes me. Wave upon wave of demented avengers march cheerfully out of obscurity into the dream. |
rjs5 Send message Joined: 22 Nov 10 Posts: 273 Credit: 23,054,272 RAC: 5,361 |
>>> LARGE part of clients/volunteers runs Windows I looked at a number of your failing WU details, and the other machine running each WU had the same failure. It looks like the WUs are bad and you are OK. |
adrianxw Send message Joined: 18 Sep 05 Posts: 653 Credit: 11,840,739 RAC: 28 |
>>> It looks like the WU are bad and you are OK. Yes, I know. Wave upon wave of demented avengers march cheerfully out of obscurity into the dream. |
Greg_BE Send message Joined: 30 May 06 Posts: 5691 Credit: 5,859,226 RAC: 0 |
We have discovered over time that a certain researcher (unknown, but this bug seems to be related to them) compiles tasks just for Linux and not for Windows. So they bomb on Windows machines (no matter what kind) and work just fine on Linux. We have tried to reach out to the project about this, but are ignored (this is also typical now). So if you get a task from 4.2 that bombs in a few seconds or under a minute, then it is that researcher. The best way to check whether it is your machine or the task is to find the task that crashed in the error results on the tasks page of your account. Click on the workunit, not the task, and see what your wingman has done. If they completed it (4.2 tasks), then look at the machine OS. If it is Windows, then there is also an issue we are exploring (mainly with Python) where certain older CPUs do not run certain tasks while newer CPUs do (a long, complicated, unfolding story). If the task completes on a Linux machine, then you know the answer. In short, certain 4.2 tasks are designed for Linux-only machines and not Windows. This is one of them. For example, look here: https://boinc.bakerlab.org/rosetta/workunit.php?wuid=1318176948 This is one that bombed on my system, was resent, another Windows user got it, and it bombed. Exact same error code as yours. Look through the Problems and Technical Issues with Rosetta@home thread above for more information on all the current bugs and complaints. |
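
For anyone who wants the wingman check written down, here is a rough Python sketch of the reasoning in this thread. It is only an illustration: the `Result` record, the `triage` function and the labels it returns are made up for this example and are not part of BOINC or Rosetta, and the OS and outcome values are assumed to be read off the workunit page by hand. The `to_ntstatus` helper just shows why -1073741819 and 0xC0000005 are the same 32-bit value.

```python
from dataclasses import dataclass
from typing import List


def to_ntstatus(exit_code: int) -> int:
    """Reinterpret a signed 32-bit exit code as an unsigned 32-bit value."""
    return exit_code & 0xFFFFFFFF


@dataclass
class Result:
    # Hypothetical record of one host's attempt at a workunit (not a BOINC type).
    os_name: str        # "Windows" or "Linux", copied by hand from the host page
    succeeded: bool
    exit_code: int = 0  # signed exit code as shown on the task page


def triage(mine: Result, wingmen: List[Result]) -> str:
    """Rough guess at whether a failure is the host's fault or the task's."""
    ok_windows = any(w.succeeded and w.os_name == "Windows" for w in wingmen)
    ok_linux = any(w.succeeded and w.os_name == "Linux" for w in wingmen)
    same_failure = any(
        not w.succeeded and w.exit_code == mine.exit_code for w in wingmen
    )
    if ok_windows:
        # Another Windows host finished it, so suspect this host (e.g. older CPU).
        return "host-specific"
    if ok_linux:
        # Only Linux hosts finished it: matches the Linux-only pattern.
        return "linux-only task"
    if same_failure:
        # A wingman hit the same error code, so the task itself looks bad.
        return "bad task"
    return "inconclusive"


mine = Result("Windows", False, -1073741819)
print(hex(to_ntstatus(mine.exit_code)))
print(triage(mine, [Result("Windows", False, -1073741819)]))
```

The Windows-success check comes first on purpose: if any Windows wingman finished the task, the Linux-only explanation does not hold, whatever the other hosts did.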
©2024 University of Washington
https://www.bakerlab.org