all 16 of my boxes can't communicate with Rosetta anymore

amgthis

Joined: 25 Mar 06
Posts: 81
Credit: 203,879,282
RAC: 0
Message 26222 - Posted: 7 Sep 2006, 0:56:48 UTC

A snippet from the log:

2006-09-06 17:48:33 [---] Starting BOINC client version 5.4.11 for windows_intelx86
2006-09-06 17:48:33 [---] libcurl/7.15.3 OpenSSL/0.9.8a zlib/1.2.3
2006-09-06 17:48:33 [---] Data directory: D:\Program Files\BOINC
2006-09-06 17:48:33 [---] Processor: 1 AuthenticAMD AMD Athlon(tm) 64 Processor 4000+
2006-09-06 17:48:33 [---] Memory: 510.42 MB physical, 1.22 GB virtual
2006-09-06 17:48:33 [---] Disk: 71.52 GB total, 64.99 GB free
2006-09-06 17:48:33 [rosetta@home] URL: https://boinc.bakerlab.org/rosetta/; Computer ID: 224998; location: home; project prefs: default
2006-09-06 17:48:33 [---] General prefs: from rosetta@home (last modified 2006-03-25 11:18:10)
2006-09-06 17:48:33 [---] General prefs: no separate prefs for home; using your defaults
2006-09-06 17:48:33 [---] Local control only allowed
2006-09-06 17:48:33 [---] Listening on port 31416
2006-09-06 17:48:33 [rosetta@home] Started upload of file 1dtj__CHEAT_ABRELAX_SAVE_ALL_OUT_BARCODE__1222_8420_0_0
2006-09-06 17:48:33 [rosetta@home] Sending scheduler request to https://boinc.bakerlab.org/rosetta_cgi/cgi
2006-09-06 17:48:33 [rosetta@home] Reason: To fetch work
2006-09-06 17:48:33 [rosetta@home] Requesting 8640 seconds of new work
2006-09-06 17:48:41 [---] Project communication failed: attempting access to reference site
2006-09-06 17:48:41 [---] Access to reference web site failed - check network connection or proxy configuration.
2006-09-06 17:48:42 [rosetta@home] Temporarily failed upload of 1dtj__CHEAT_ABRELAX_SAVE_ALL_OUT_BARCODE__1222_8420_0_0: http error
2006-09-06 17:48:42 [rosetta@home] Backing off 1 minutes and 0 seconds on upload of file 1dtj__CHEAT_ABRELAX_SAVE_ALL_OUT_BARCODE__1222_8420_0_0
2006-09-06 17:48:43 [rosetta@home] Scheduler request failed: Unrecognized HTTP Content-Encoding
2006-09-06 17:48:43 [rosetta@home] Deferring scheduler requests for 1 minutes and 5 seconds
2006-09-06 17:49:02 [---] Project communication failed: attempting access to reference site
2006-09-06 17:49:05 [---] Access to reference site failed - check network connection or proxy configuration.
2006-09-06 17:49:06 [---] Project communication failed: attempting access to reference site
2006-09-06 17:49:11 [---] Project communication failed: attempting access to reference site
2006-09-06 17:49:14 [---] Project communication failed: attempting access to reference site
2006-09-06 17:49:16 [---] Rescheduling CPU: project reset by user
2006-09-06 17:49:16 [rosetta@home] Resetting project
2006-09-06 17:49:16 [---] Rescheduling CPU: exit_tasks
2006-09-06 17:49:16 [rosetta@home] Persistent file transfer object not found
2006-09-06 17:49:16 [rosetta@home] Persistent file transfer object not found

I haven't changed my proxy settings from what I've been running for many months.

/Mike
BennyRop

Joined: 17 Dec 05
Posts: 555
Credit: 140,800
RAC: 0
Message 26232 - Posted: 7 Sep 2006, 6:12:09 UTC

What do you get if you click on Start, choose Run, type in "cmd" and hit Enter, then click on the DOS box, type "ping boinc.bakerlab.org" and hit Enter?
If it converts the domain name into an IP address, then DNS is working; and if you get ping times, then at least some communication is possible.

Does tracert show any network issues between you and boinc.bakerlab.org?
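
Ping only exercises ICMP, while the BOINC client talks to the project over HTTP(S), so a DNS lookup followed by a plain TCP connect is a slightly closer test. Below is a minimal Python sketch of that check, assuming a direct connection (it does not go through a proxy) and port 443 because the log shows an https URL. `check_host` and its parameters are names invented for this illustration, not part of BOINC.

```python
import socket


def check_host(host, port=443, timeout=10.0,
               resolve=socket.gethostbyname,
               connect=socket.create_connection):
    """Return (dns_ok, tcp_ok, detail) for host:port.

    resolve and connect are injectable so the function can be
    exercised without touching the network.
    """
    try:
        ip = resolve(host)
    except OSError as exc:  # socket.gaierror is a subclass of OSError
        return (False, False, f"DNS lookup failed: {exc}")
    try:
        conn = connect((ip, port), timeout=timeout)
    except OSError as exc:  # refused, unreachable, or timed out
        return (True, False, f"resolved to {ip}, but TCP connect failed: {exc}")
    conn.close()
    return (True, True, f"resolved to {ip}, TCP port {port} reachable")


# Example call (needs network access):
#   print(check_host("boinc.bakerlab.org"))
```

If DNS and the TCP connect both succeed but the client still fails, the problem is more likely in the proxy or in how the link handles longer transfers than in basic reachability.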


If you're well beyond the tech level of the first two questions, then run Ethereal and capture the first few minutes of BOINC starting up Rosetta. Posting the network messages that show the results of attempting to contact the Rosetta servers should give better clues as to the problem. Posting those results at https://boinc.bakerlab.org/rosetta/forum_thread.php?id=1891 will get more attention.

What proxy and version number are you using? Has the proxy server been rebooted?

One of our clients decided to save money by not renewing their antivirus subscriptions for a year. They plugged an infected machine into the network, which infected all the other machines on the network, and all the infected machines managed to overload the router. You might want to verify that your machines haven't been compromised.


amgthis

Joined: 25 Mar 06
Posts: 81
Credit: 203,879,282
RAC: 0
Message 26254 - Posted: 7 Sep 2006, 12:59:10 UTC - in response to Message 26232.  


Thanks for the quick reply, BennyRop! I fired up Folding@home on all 16 machines last night when I couldn't resolve this issue. Folding won't run very well over my satellite ISP, though, because the returned work result packets are much too large for the satellite's very slow uplink speeds.

[quote]What do you get if you click on Start, choose Run, type in "cmd" and hit enter.
Click on the dos box, and type "ping boinc.bakerlab.org" and hit enter.
If it converts the domain name into an IP#, then DNS is working; and if you get ping times, then at least some communication is possible.[/quote]

Yes, I can ping this domain and load all of the Rosetta sites with my browser.

[quote]Does tracert show any network issues between you and boinc.bakerlab.org?[/quote]

I haven't checked that yet but will try.

[quote]If you're well beyond the tech level of the first two questions, then run Ethereal, capture the first few minutes of Boinc starting up Rosetta. Posting the network messages that show the results of attempting to contact the Rosetta servers should give better clues as to the problem. Posting those results at https://boinc.bakerlab.org/rosetta/forum_thread.php?id=1891 will get more attention.[/quote]

I'll try that, too. I rebooted my router. I think the real problem is that my satellite ISP has become less than impressed with me tying up too much bandwidth with 16 boxes over one proxy server, and somehow they've managed to block traffic for work packets while still allowing some connections to boinc.bakerlab.org. Is this a possible scenario?

[quote]What proxy and version number are you using? Has the proxy server been rebooted?[/quote]

I'm just proxying all boxes to the gateway IP address of my Linksys router and the satellite ISP's default gateway IP.

[quote]One of our clients decided to save money by not renewing their Anti Virus subscriptions for a year. They plugged an infected machine into the network, which infected all the other machines on the network, and all the infected machines managed to overload the router. You might want to verify that they haven't been compromised.[/quote]

I turned off ZoneAlarm and my A/V just to test, rebooted everything, and got the same problem. I'll keep you posted on what happens with the other tests you suggested. Thanks again.

/mike



BennyRop

Joined: 17 Dec 05
Posts: 555
Credit: 140,800
RAC: 0
Message 26288 - Posted: 7 Sep 2006, 18:21:16 UTC - in response to Message 26254.  


[quote]Thanks for the quick reply BennyRop! I fired up folding on all 16 machines last night when I couldn't resolve this issue. Folding won't run real well though over my satellite ISP because the returned work result packets are much too large for the very slow uplink speeds of the satellite.[/quote]


Have you switched to 24 hour WUs, so only 16 tasks are uploaded and downloaded each day?

There was a description somewhere of setting up machines so they would only communicate with the server during certain hours of the day. Perhaps you could set up your computers as 8 sets of 2 systems, with each set of 2 only allowed to communicate with the server for a 3-hour block of time each day.
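
The arithmetic behind that split is 24 hours / 8 sets = 3 hours per set. A small Python sketch of how the windows could be laid out is below. It only computes the schedule and does not configure BOINC; the machine names `box01` to `box16` and the function `comm_windows` are hypothetical.

```python
def comm_windows(machines, per_set=2, day_hours=24):
    """Split machines into sets and give each set an equal,
    non-overlapping block of the day.

    Returns a list of (machines_in_set, start_hour, end_hour).
    """
    sets = [machines[i:i + per_set] for i in range(0, len(machines), per_set)]
    block = day_hours / len(sets)  # hours available to each set
    return [(group, i * block, (i + 1) * block) for i, group in enumerate(sets)]


# Hypothetical names for the 16 boxes.
boxes = [f"box{n:02d}" for n in range(1, 17)]
windows = comm_windows(boxes)
# windows[0] is (["box01", "box02"], 0.0, 3.0); the last set ends at hour 24.0.
```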

If you turn everything off and then turn the machines on one at a time, letting each machine upload its data and download the next WU, does any machine still have errors uploading or downloading?
amgthis

Joined: 25 Mar 06
Posts: 81
Credit: 203,879,282
RAC: 0
Message 26323 - Posted: 8 Sep 2006, 1:14:41 UTC - in response to Message 26232.  




BennyRop - tonight it's back up and talking to the project. I leave everything on 24/7, so I don't know what changed, but I suspect it was something to do with my ISP. I just wish that if they had issues with my bandwidth consumption, etc., they would email first. But then again, I'm not sure what happened. I haven't changed anything since yesterday other than rebooting, and that wasn't enough as of this morning.

Thanks for your suggestions and quick replies.

happy crunching,

/Mike



Feet1st

Joined: 30 Dec 05
Posts: 1755
Credit: 4,690,520
RAC: 0
Message 26328 - Posted: 8 Sep 2006, 2:08:29 UTC

I have a satellite ISP as well. There's no other "high-speed" option for me. When you use a lot of bandwidth, I think they just penalize you by slowing you down. It shouldn't mean being unable to connect, just that your speed would be slow.

Also, with Rosetta, you download a fairly large (3-5MB) WU, you crunch it for roughly your WU runtime preference (up to 24 hrs!) and then you send back a fairly small (~250K) file with the results. So, the limited upload bandwidth should not be a problem for you.
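
To put rough numbers on that, here is a back-of-the-envelope Python sketch. It assumes each of the 16 machines crunches one WU at a time and finishes 24 / runtime WUs per day; the 4-hour runtime in the second call is a made-up comparison value, not a project default. `daily_traffic_mb` is a hypothetical helper name.

```python
def daily_traffic_mb(machines, runtime_hours, download_mb, upload_kb):
    """Estimate (download_MB, upload_MB) per day for a group of machines.

    Assumes one WU at a time per machine, so each machine completes
    24 / runtime_hours WUs per day.
    """
    tasks_per_day = machines * 24 / runtime_hours
    return tasks_per_day * download_mb, tasks_per_day * upload_kb / 1000


# 16 machines, 24-hour WUs, 5 MB down and 250 KB up per WU.
down, up = daily_traffic_mb(16, 24, 5, 250)   # (80.0, 4.0)

# Same machines with a hypothetical 4-hour runtime: six times the traffic.
down4, up4 = daily_traffic_mb(16, 4, 5, 250)  # (480.0, 24.0)
```

With 24-hour WUs, the whole group would upload only about 4 MB a day under these assumptions, which supports the point that the uplink should not be the bottleneck.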

I find my PC seems to lose internet access and I have to reboot. I'm thinking it's a Windows problem, because it worked GREAT until my PC crashed and I had to reload it. Perhaps I still need a BIOS update to get back where I was, not sure.

Anyway, keep crunchin'! And be sure to post any problems you run across.
Add this signature to your EMail:
Running Microsoft's "System Idle Process" will never help cure cancer, AIDS nor Alzheimer's. But running Rosetta@home just might!
https://boinc.bakerlab.org/rosetta/
BennyRop

Joined: 17 Dec 05
Posts: 555
Credit: 140,800
RAC: 0
Message 26329 - Posted: 8 Sep 2006, 2:08:43 UTC - in response to Message 26323.  



If you think it's a matter of their complaining about your bandwidth usage, switch to 24-hour WUs to minimize the amount of bandwidth you use.
amgthis

Joined: 25 Mar 06
Posts: 81
Credit: 203,879,282
RAC: 0
Message 26331 - Posted: 8 Sep 2006, 2:18:22 UTC - in response to Message 26329.  
Last modified: 8 Sep 2006, 2:20:08 UTC



Thanks for that idea, too, Benny. I don't know if it's a problem yet, as I haven't heard anything from the ISP, but if it happens again I'll look into the 24-hour WUs.

/Mike

amgthis

Joined: 25 Mar 06
Posts: 81
Credit: 203,879,282
RAC: 0
Message 28595 - Posted: 27 Sep 2006, 20:57:00 UTC - in response to Message 26232.  

All I can figure is that it's a StarBand satellite latency issue. Sometimes I can connect, sometimes I can't. Usually I can connect and get work maybe once a day, but that is all. The units get crunched, and there are no more communications for another day or so, usually after I quit and restart BOINC.
I have returned to Folding@home, and I seem to be able to run steadily over the same satellite and network connections. Maybe Stanford has a more relaxed protocol with their server network.
Someone on the BOINC support site suggested many tests. I pinged sites with options to set the packet size (to test the MTU) and had numerous timeouts with many missed packets, almost no matter what size I set the packet to. That is why I'm betting it's a satellite delay problem. For whatever reason, BOINC and Rosetta hate it, while I can browse the web and even download complete Linux ISOs with no problems. Folding@home will have to do for now, but I'll keep coming back and trying the latest BOINC clients until I find one that works more smoothly with my admittedly marginal satellite setup.
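
For reference, that kind of MTU test on Windows is usually run as `ping -f -l <size> <host>`, where `-f` sets the don't-fragment flag, `-l` sets the payload size and `-n` sets the count. The path MTU is the largest payload that gets through plus 28 bytes (20 for the IPv4 header, 8 for ICMP). The Python sketch below builds the command and binary-searches the payload size. The `probe` callback is an assumption: it stands in for whatever runs the ping and decides whether it succeeded, and on a lossy link like the one described here it would need retries to give a stable answer.

```python
def ping_command(host, size, count=4):
    """Windows ping command line with the don't-fragment flag set."""
    return ["ping", "-f", "-l", str(size), "-n", str(count), host]


def largest_payload(probe, low=0, high=1472):
    """Binary-search the largest payload size for which probe(size) is True.

    probe is a caller-supplied function (hypothetical here) that sends a
    don't-fragment ping of the given size and reports success.
    1472 bytes corresponds to a 1500-byte MTU (1472 + 28).
    Returns None if no size in the range gets through.
    """
    best = None
    while low <= high:
        mid = (low + high) // 2
        if probe(mid):
            best = mid      # mid works, try larger
            low = mid + 1
        else:
            high = mid - 1  # mid fails, try smaller
    return best
```

If timeouts show up at nearly every size, as reported above, the search has nothing stable to converge on, which points toward loss or delay on the link and away from a simple MTU limit.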

regards and keep crunching,

/mike





