Rosetta 4.0+

[AF>Le_Pommier] Jerome_C2005
Joined: 22 Aug 06
Posts: 42
Credit: 1,258,039
RAC: 0
Message 94667 - Posted: 17 Apr 2020, 10:30:35 UTC - in response to Message 94651.  
Last modified: 17 Apr 2020, 10:31:03 UTC

But this is on Debian Linux...?

I see there is a 7.16.6 development version for Linux, but the "stable" version is supposed to be 7.4.22 (old!). I got my 7.14.2 from the default repositories (I used the apt-get install command).
ID: 94667 · Rating: 0
Bryn Mawr
Joined: 26 Dec 18
Posts: 397
Credit: 12,254,928
RAC: 11,616
Message 94684 - Posted: 17 Apr 2020, 14:55:39 UTC - in response to Message 94667.  

But this is on Debian Linux...?

I see there is a 7.16.6 development version for Linux, but the "stable" version is supposed to be 7.4.22 (old!). I got my 7.14.2 from the default repositories (I used the apt-get install command).


I’m running version 7.17.0 and it’s working well for me :-)

To get it, add

ppa:costamagnagianfranco/boinc

to the "Other Software" tab of the software updater.
ID: 94684 · Rating: 0
[AF>Le_Pommier] Jerome_C2005
Joined: 22 Aug 06
Posts: 42
Credit: 1,258,039
RAC: 0
Message 94988 - Posted: 20 Apr 2020, 12:20:04 UTC - in response to Message 94684.  

Using an app_info.xml file that describes only the rosetta app and not the mini app allowed me to get rid of the horrible mini tasks for good; the rosetta tasks are now crunching like a charm.
ID: 94988 · Rating: 0
Grant (SSSF)
Joined: 28 Mar 20
Posts: 1699
Credit: 18,186,917
RAC: 24,275
Message 94990 - Posted: 20 Apr 2020, 12:53:53 UTC - in response to Message 94988.  

Using an app_info.xml file that describes only the rosetta app and not the mini app allowed me to get rid of the horrible mini tasks for good; the rosetta tasks are now crunching like a charm.
My understanding is Rosetta doesn't support Anonymous Platform, and it looks like many of your Anonymous Platform tasks are Erroring out, cancelled by the project.
Along with all the ones you haven't processed by the deadline.


Try setting a smaller cache, and you might be able to return some Results without trashing hundreds.
Grant
Darwin NT
ID: 94990 · Rating: 0
[AF>Le_Pommier] Jerome_C2005
Joined: 22 Aug 06
Posts: 42
Credit: 1,258,039
RAC: 0
Message 95036 - Posted: 21 Apr 2020, 8:39:33 UTC - in response to Message 94990.  

The platform was not anonymous before, and my cache was 2 days. I don't know why BOINC got so many tasks at the beginning, but it's not a problem for me if the project cancels them as long as I have rosetta tasks running, which is the case.

Mini tasks were failing after starting to run, not before.

You only see canceled tasks because rosetta is purging the list all the time and I have no real history.

The platform is now anonymous because I created the app_info file (this is a direct consequence) in order to stop the project from sending mini tasks, and now things are OK for me.
ID: 95036 · Rating: 0
Grant (SSSF)
Joined: 28 Mar 20
Posts: 1699
Credit: 18,186,917
RAC: 24,275
Message 95039 - Posted: 21 Apr 2020, 10:08:33 UTC - in response to Message 95036.  

The platform was not anonymous before, and my cache was 2 days. I don't know why BOINC got so many tasks at the beginning, but it's not a problem for me if the project cancels them as long as I have rosetta tasks running, which is the case.
I'm not concerned about you as you obviously don't care about the project.
But I am concerned about the project, as trashing Tasks doesn't help. Having a reasonably sized cache would.
Grant
Darwin NT
ID: 95039 · Rating: 0
Sid Celery
Joined: 11 Feb 08
Posts: 2130
Credit: 41,424,155
RAC: 16,102
Message 95052 - Posted: 21 Apr 2020, 18:07:13 UTC - in response to Message 95036.  

The platform was not anonymous before, and my cache was 2 days. I don't know why BOINC got so many tasks at the beginning, but it's not a problem for me if the project cancels them as long as I have rosetta tasks running, which is the case.

Mini tasks were failing after starting to run, not before

The very first download of tasks is always ridiculous. As you complete tasks it corrects things. Very annoying, but fortunately doesn't last long.

MiniRosetta is being deprecated and won't be around once in-progress tasks are completed/aborted, so hopefully that goes away too. I don't think any more tasks are going out.
ID: 95052 · Rating: 0
adrianxw
Joined: 18 Sep 05
Posts: 653
Credit: 11,840,739
RAC: 10
Message 95055 - Posted: 21 Apr 2020, 18:36:01 UTC
Last modified: 21 Apr 2020, 18:39:35 UTC

For information only in case others are worried by this behaviour. It is not just your machine!!!

One of my machines (4GHz i7) has four odd work units running at the moment. I have 12 hours set for the target run time, and jobs normally finish around that, but these four are going to run well over it. One is already over 18 hours in. The remaining time IS decreasing, but not in a very linear way: it jumps up and down for a while, then drops a bit, then goes up and down again.
Wave upon wave of demented avengers march cheerfully out of obscurity into the dream.
ID: 95055 · Rating: 0
Grant (SSSF)
Joined: 28 Mar 20
Posts: 1699
Credit: 18,186,917
RAC: 24,275
Message 95075 - Posted: 21 Apr 2020, 22:56:01 UTC - in response to Message 95052.  

MiniRosetta is being deprecated and won't be around once in-progress tasks are completed/aborted, so hopefully that goes away too. I don't think any more tasks are going out.
Nah, I'm still getting new Mini Rosetta work, but I have been getting more resends than new work lately.
Grant
Darwin NT
ID: 95075 · Rating: 0
adrianxw
Joined: 18 Sep 05
Posts: 653
Credit: 11,840,739
RAC: 10
Message 95119 - Posted: 22 Apr 2020, 6:47:42 UTC
Last modified: 22 Apr 2020, 7:21:34 UTC

This extended run time is actually getting to be a little bizarre! Looking at my results page, the run times do show this extended run time, up by 50%, but the CPU time does not reflect this; it remains fairly static, within the realms of normality. The question that therefore arises is: what are these jobs doing? Useless credit varies wildly of course, as has always been the case.

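Task 1156283316 · Work unit 1040082343 · Computer 3117659 · Sent 20 Apr 2020, 23:28:39 UTC · Reported 21 Apr 2020, 20:07:56 UTC · Completed and validated · Run time 43,240.01 s · CPU time 42,839.94 s · Credit 633.48 · Rosetta v4.15 windows_x86_64
Task 1156260164 · Work unit 1040062029 · Computer 3161065 · Sent 20 Apr 2020, 22:52:12 UTC · Reported 22 Apr 2020, 2:05:01 UTC · Completed and validated · Run time 68,470.36 s · CPU time 42,776.83 s · Credit 549.05 · Rosetta v4.15 windows_x86_64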
Wave upon wave of demented avengers march cheerfully out of obscurity into the dream.
ID: 95119 · Rating: 0
Grant (SSSF)
Joined: 28 Mar 20
Posts: 1699
Credit: 18,186,917
RAC: 24,275
Message 95122 - Posted: 22 Apr 2020, 9:14:16 UTC - in response to Message 95119.  

Looking at my results page, the run times do show this extended run time, up by 50%, but the CPU time does not reflect this; it remains fairly static, within the realms of normality. The question that therefore arises is: what are these jobs doing?
A difference between Runtime & CPU time has nothing to do with the Task; it has to do with your Computation settings and/or other programmes running on your computer. If you make heavy use of your system, or you have "Use at most 100 % of CPU time" set to anything other than 100%, then there will be a difference between the times. The lower that percentage value and the greater the CPU usage by other programmes, the bigger the difference.
Your system is showing that other programmes are making use of the CPU almost half the time, meaning Rosetta can't. For a lightly used system, the difference between Runtime & CPU time should be around 4 min for an 8-hour CPU Target time. For a dedicated cruncher, the difference will be a minute or 2 (if that).
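As a rough check of the Runtime vs CPU time point, the ratio of the two figures can be worked out from the two results quoted in Message 95119. This is only a sketch: the function name cpu_efficiency is made up for illustration, and it assumes the two large numbers in each result row are run time and CPU time in seconds, in that order.

```python
def cpu_efficiency(run_time_s: float, cpu_time_s: float) -> float:
    """Fraction of wall-clock run time during which the task actually had the CPU."""
    if run_time_s <= 0:
        raise ValueError("run time must be positive")
    return cpu_time_s / run_time_s


# (run time, CPU time) in seconds, copied from the two results in Message 95119.
# Assumption: the column order is run time first, CPU time second.
results = {
    "1156283316": (43_240.01, 42_839.94),
    "1156260164": (68_470.36, 42_776.83),
}

for task_id, (run_s, cpu_s) in results.items():
    eff = cpu_efficiency(run_s, cpu_s)
    waiting_h = (run_s - cpu_s) / 3600  # hours the task spent not on the CPU
    print(f"{task_id}: {eff:.1%} CPU efficiency, {waiting_h:.2f} h not on the CPU")
```

The first task comes out at roughly 99% and the second at roughly 62%, so the second one spent several hours waiting for the CPU rather than computing.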
Grant
Darwin NT
ID: 95122 · Rating: 0
adrianxw
Joined: 18 Sep 05
Posts: 653
Credit: 11,840,739
RAC: 10
Message 95124 - Posted: 22 Apr 2020, 9:30:51 UTC - in response to Message 95122.  
Last modified: 22 Apr 2020, 9:34:27 UTC

The difference I am seeing is new. I have only seen it recently, and only on one machine, which, coincidentally, is the machine I use least. I doubt anything other than BOINC has been run on there for weeks. "Use at most" is 100% on both and always has been.

<edit>
Normally, I do not have a screen/keyboard/mouse on that machine. I moved the gear over there and had a look. I saw Folding@Home was running on there. I looked at that a while back but decided against it - I deleted it from this machine and THOUGHT I had deleted it from the other. That could be an aspect of the effect I am seeing.
</edit>
Wave upon wave of demented avengers march cheerfully out of obscurity into the dream.
ID: 95124 · Rating: 0
Grant (SSSF)
Joined: 28 Mar 20
Posts: 1699
Credit: 18,186,917
RAC: 24,275
Message 95125 - Posted: 22 Apr 2020, 10:10:06 UTC - in response to Message 95124.  

<edit>
Normally, I do not have a screen/keyboard/mouse on that machine. I moved the gear over there and had a look. I saw Folding@Home was running on there. I looked at that a while back but decided against it - I deleted it from this machine and THOUGHT I had deleted it from the other. That could be an aspect of the effect I am seeing.
</edit>
If it is running CPU tasks (or even GPU tasks, as they require CPU support) then that explains the big difference between CPU time & Runtime. Folding@Home is using the CPU, so Rosetta can't.

You can use a configuration file to limit the number of cores/threads Rosetta uses on that system (never done it myself so i can't help there unfortunately), and Folding probably allows similar configurations so the two can coexist without impacting on each other's processing.
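For anyone wanting to try the configuration-file route: the BOINC client reads an optional app_config.xml from the project's folder, and its project_max_concurrent element caps how many of that project's tasks run at once. The sketch below only generates the file contents. The value 4 and the function name build_app_config are illustrative assumptions, and where the project folder lives depends on your install.

```python
import xml.etree.ElementTree as ET


def build_app_config(max_tasks: int) -> str:
    """Return app_config.xml content that caps concurrently running tasks for one project."""
    if max_tasks < 1:
        raise ValueError("max_tasks must be at least 1")
    root = ET.Element("app_config")
    # project_max_concurrent limits running tasks across all of the project's apps
    ET.SubElement(root, "project_max_concurrent").text = str(max_tasks)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Assumption: 4 concurrent Rosetta tasks, leaving the remaining threads to other work
    print(build_app_config(4))
```

The output would be saved as app_config.xml in the Rosetta project folder, after which the client needs to re-read its config files (or be restarted) before the limit takes effect.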
Grant
Darwin NT
ID: 95125 · Rating: 0
adrianxw
Joined: 18 Sep 05
Posts: 653
Credit: 11,840,739
RAC: 10
Message 95126 - Posted: 22 Apr 2020, 10:38:21 UTC

I'm pretty sure it was Folding that had created the effect I was seeing. When I was investigating it, they told me that I could prohibit F@H using the GPU, but then it would run multithreaded. I looked at it because the GPU on that machine was starting to suffer, arrays of dark spots in distinct rhomboidal shapes over parts of the display, so removed the GPU projects from its portfolio. It is running Rosetta and TN-Grid now.
Wave upon wave of demented avengers march cheerfully out of obscurity into the dream.
ID: 95126 · Rating: 0
[AF>Le_Pommier] Jerome_C2005
Joined: 22 Aug 06
Posts: 42
Credit: 1,258,039
RAC: 0
Message 95198 - Posted: 23 Apr 2020, 9:55:59 UTC - in response to Message 95039.  

The platform was not anonymous before, and my cache was 2 days. I don't know why BOINC got so many tasks at the beginning, but it's not a problem for me if the project cancels them as long as I have rosetta tasks running, which is the case.
I'm not concerned about you as you obviously don't care about the project.
But I am concerned about the project, as trashing Tasks doesn't help. Having a reasonably sized cache would.


???

Instead of judging completely off base maybe you can read the problems that I have reported here, from the beginning, and that have not been solved.

I don't want to have mini tasks running forever with errors in the log, no longer using the CPU and never terminating, and so blocking other tasks from running. All the mini tasks were behaving the same way on that host. None of the rosetta tasks did.

This is what I call "caring about the project", not yelping like a twitter troll.

Since when is 2 days not a "reasonable cache" for BOINC?
ID: 95198 · Rating: 0
Grant (SSSF)
Joined: 28 Mar 20
Posts: 1699
Credit: 18,186,917
RAC: 24,275
Message 95199 - Posted: 23 Apr 2020, 10:11:05 UTC - in response to Message 95198.  
Last modified: 23 Apr 2020, 10:13:18 UTC

Since when is 2 days not a "reasonable cache" for BOINC?
When you run more than 1 project.
And since you run multiple projects, there is absolutely no need at all for any sort of cache. None at all. The only reasons for having a cache are so that you have work to do if your system is offline for an extended time, or in case the project(s) you are attached to run out of work or are offline. Given the number of projects you are attached to, there is no way on Earth (other than maybe the end of the Earth) that all of them would be out of work or offline at the same time.
So for you to have a cache of any size is completely unnecessary.


Even so, the biggest argument against the size of the cache you have is that you produce more errors due to missed deadlines than you produce Valid results.
I have seen it mentioned here several times, when people have posted about a lack of WUs to process, how much time & effort it takes to actually produce those WUs so that we are able to process them. So if people really cared about the project, they wouldn't download more work than they could ever hope to do, or, once they found out that is what is happening, they would stop doing it.
Particularly so when all it would require is reducing the size of their cache: not a lot of effort for what would be a significant contribution to the project.
Grant
Darwin NT
ID: 95199 · Rating: 0
[AF>Le_Pommier] Jerome_C2005
Joined: 22 Aug 06
Posts: 42
Credit: 1,258,039
RAC: 0
Message 95217 - Posted: 23 Apr 2020, 17:23:05 UTC - in response to Message 95199.  

The problems I had with mini tasks had *nothing* to do with missed deadline.

The "canceled by the server" mini tasks had not started to run, so they were sent on to other users: no problem.

The mini tasks that started to run on that machine never terminated / succeeded due to the problems I documented before.

Besides I had solved the problem on that host by blocking mini tasks execution (documented also, app_info = anonymous platform = no problem).

Rosetta tasks were running fine.

For the moment I have switched this host to another project, so even fewer problems :)
ID: 95217 · Rating: 0
Grant (SSSF)
Joined: 28 Mar 20
Posts: 1699
Credit: 18,186,917
RAC: 24,275
Message 95249 - Posted: 23 Apr 2020, 22:49:09 UTC - in response to Message 95217.  

Rosetta tasks were running fine.
Except for all the ones that missed the deadlines- which were more than the number of Tasks that you actually processed. You produced more errors with missed deadlines than your issues with Rosetta Mini did. Hence missed deadlines was the bigger problem.
Grant
Darwin NT
ID: 95249 · Rating: 0
Sid Celery
Joined: 11 Feb 08
Posts: 2130
Credit: 41,424,155
RAC: 16,102
Message 95253 - Posted: 23 Apr 2020, 23:02:24 UTC - in response to Message 95075.  

MiniRosetta is being deprecated and won't be around once in-progress tasks are completed/aborted, so hopefully that goes away too. I don't think any more tasks are going out.
Nah, I'm still getting new Mini Rosetta work, but I have been getting more resends than new work lately.

It definitely is. I had one earlier today, but it's just the dregs/resends. The server status page tells the story.
I'm surprised there are even that many still floating around by now
ID: 95253 · Rating: 0
Sid Celery
Joined: 11 Feb 08
Posts: 2130
Credit: 41,424,155
RAC: 16,102
Message 95256 - Posted: 23 Apr 2020, 23:20:41 UTC - in response to Message 95198.  
Last modified: 23 Apr 2020, 23:20:56 UTC

The platform was not anonymous before, and my cache was 2 days. I don't know why BOINC got so many tasks at the beginning, but it's not a problem for me if the project cancels them as long as I have rosetta tasks running, which is the case.
I'm not concerned about you as you obviously don't care about the project.
But I am concerned about the project, as trashing Tasks doesn't help. Having a reasonably sized cache would.

???

That was an unjustified comment - ignore it

I don't want to have mini tasks running forever with errors in the log, no longer using the CPU and never terminating, and so blocking other tasks from running. All the mini tasks were behaving the same way on that host. None of the rosetta tasks did.

It was announced a few days ago MiniRosetta is ended now - only Rosetta tasks. Some resends may arrive, but cancel them if you want.

Since when is 2 days not a "reasonable cache" for BOINC?

It's not a matter of Boinc, it's a matter of the Rosetta project.
There used to be mainly 8-day deadlines, but since it started running COVID19 tasks, the project has effectively been an exclusively 3-day-deadline project for the last 3 weeks, with one researcher saying they started to look at returned results 2 days after the tasks had been released.
You're right: 2 days *was* fine. But once COVID19 tasks began and the deadline changed, it's probably the case that 1.5 days is the maximum you can hold for the results you return to be meaningful here, allowing an amount for task overruns.
I now use 1.45 days, for no good reason tbh, but we should consider 1.5 to be the maximum cache.
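The arithmetic behind the 1.5-day figure can be sketched as follows. This is a deliberately simplified model, not how the BOINC scheduler actually works: it assumes a task sits in the queue for the whole cache period and then runs once without interruption, and it ignores other projects sharing the CPU. The function names are made up for illustration, and the 12-hour run time is just the target mentioned earlier in the thread.

```python
def worst_case_return_days(cache_days: float, run_time_hours: float) -> float:
    """Days from download to upload if a task waits out the whole cache, then runs once."""
    return cache_days + run_time_hours / 24.0


def fits(cache_days: float, run_time_hours: float, limit_days: float) -> bool:
    """True if the worst-case return time is within the given limit."""
    return worst_case_return_days(cache_days, run_time_hours) <= limit_days


# Assumptions: 12 h target run time, 3-day deadline, researchers looking at 2 days.
for cache in (2.0, 1.5):
    total = worst_case_return_days(cache, 12)
    print(f"cache {cache} d -> returned after about {total:.2f} d; "
          f"within 3-day deadline: {fits(cache, 12, 3.0)}; "
          f"within 2-day window: {fits(cache, 12, 2.0)}")
```

Under these assumptions a 2-day cache still beats the 3-day deadline but misses the 2-day window, while a 1.5-day cache lands right on it, with any overrun pushing it past.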
ID: 95256 · Rating: 0



©2024 University of Washington
https://www.bakerlab.org