not much to report today
no members and no milestone today
so here's some news from seti instead
23 Feb 2009 21:06:51 UTC
Our outbound traffic has been pegged since Friday. This may seem like only a download problem, but it even affects uploads, as the basic syn/ack handshaking packets on the upload server get dropped along with the rest of the download packets that can't make it through the dam.
After discussions with Eric and Jeff, here's what we gather is happening. We use coral cache to reduce our bandwidth needs. Coral cache is an easy-to-use, free, third-party system which does some nice distributed caching just by redirecting the right apache requests to their servers. For example, somebody wants to download the latest astropulse client: they go to our download server, and then they are redirected automatically to the coral cache server. The redirect is of such a form that, if the coral cache server hasn't done so already, it downloads the latest astropulse client from us, caches it, and then sends it to the requester. Once cached, it doesn't need to contact our servers again. So, in essence, all but one of the client download requests are served from sources outside our lab, thus saving us lots of bandwidth.
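To make the "all but one request" point concrete, here is a minimal sketch of a pull-through cache in Python. It is only a toy model of the behavior described above, not coral cache's actual implementation; the class name, the `origin` function and the download path are all made up for illustration.

```python
class PullThroughCache:
    """Toy stand-in for a caching proxy (hypothetical, not coral cache's code)."""

    def __init__(self, fetch_from_origin):
        self._fetch = fetch_from_origin   # callable that contacts the origin server
        self._store = {}                  # path -> cached bytes
        self.origin_hits = 0              # how often the origin was contacted

    def get(self, path):
        # Only the first request for a path goes back to the origin;
        # every later request is answered from the cache.
        if path not in self._store:
            self.origin_hits += 1
            self._store[path] = self._fetch(path)
        return self._store[path]


def origin(path):
    # Hypothetical origin server: returns some bytes for the requested path.
    return f"contents of {path}".encode()


cache = PullThroughCache(origin)
for _ in range(1000):
    cache.get("/download/astropulse_client")  # hypothetical path

print(cache.origin_hits)  # 1
```

A thousand client requests cost the origin a single download, which is where the bandwidth saving comes from.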
That brings us to problem 1. Many ISPs don't like redirects to third-party IPs. This is understandable. What happens in this case is a client downloads a new application, but instead of getting the actual executable they get a blob of HTML saying "this ISP doesn't like third party redirects," etc. Obviously the checksum of this HTML blob won't match the executable checksum, resulting in an application download checksum error. This has been a known problem. So we've been only using coral cache during the first couple of weeks after a new application is made available to reduce the pain of the download rush. A small fraction of our users will be inconvenienced by those redirect errors, but they'll get their clients in due time when coral cache is turned off after the initial "wave."
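The checksum failure can be sketched in a few lines of Python. This is an illustration only: the hash algorithm (MD5 via `hashlib`), the function names and the byte strings are assumptions, not taken from the BOINC source.

```python
import hashlib


def checksum(data: bytes) -> str:
    # Illustrative choice of hash; the real client may use something else.
    return hashlib.md5(data).hexdigest()


def verify_download(body: bytes, expected: str) -> bool:
    """Return True only if the downloaded bytes match the published checksum."""
    return checksum(body) == expected


# Pretend payloads (hypothetical): the real executable and an ISP block page.
executable = b"\x7fELF pretend this is the astropulse binary"
isp_page = b"<html><body>Third-party redirects are blocked.</body></html>"

expected = checksum(executable)  # what the project would publish

print(verify_download(executable, expected))  # True
print(verify_download(isp_page, expected))    # False
```

The client has no way of knowing that the bytes are an HTML page; it only sees that the hash is wrong and reports a checksum error.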
But then there's problem 2. An application download checksum error (a) doesn't cause exponential backoff and (b) causes all workunits also requested by this particular client to be errored out and resent. This is at least the behavior in older, yet still commonly used, boinc clients. Dave said most of that has been addressed, but if there are still bugs they'll be fixed.
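For readers unfamiliar with the term, exponential backoff means doubling the wait after each consecutive failure, up to some cap. The Python sketch below shows the general idea; the base delay, the cap and the jitter range are hypothetical values, not the ones any boinc client uses.

```python
import random

BASE_DELAY = 60.0        # seconds; hypothetical starting delay
MAX_DELAY = 4 * 3600.0   # seconds; hypothetical upper limit


def backoff_ceiling(failures: int) -> float:
    """Largest wait allowed after `failures` consecutive failed attempts."""
    return min(MAX_DELAY, BASE_DELAY * 2 ** failures)


def backoff_delay(failures: int) -> float:
    """Pick a random wait in the upper half of the ceiling, so that many
    clients failing at the same moment do not all retry at the same moment."""
    ceiling = backoff_ceiling(failures)
    return random.uniform(ceiling / 2, ceiling)


for n in range(6):
    print(n, backoff_ceiling(n))
```

A client that skips this step retries at full speed after every checksum error, so a handful of busy hosts can generate a lot of wasted traffic.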
In any case, what we saw this weekend was a confluence of these two problems. This may not have been an issue before due to lighter traffic patterns, but we sure fell off the deep end this time. Maybe there was a small set of heavily active clients this time around causing most of the pain. And once the network gets pegged, all hell breaks loose, and it takes a while to heal itself.
Eric actually had most of this figured out before we arrived today, and already turned off coral cache. At least the broken redirects spiraling out of control would stop happening. He also adjusted the tcp settings on the upload server to help get uploads partially working again (instead of only 2% of uploads getting through, now it's about 50%).
The plan is to let this current state of indigestion pass on its own, and if needed change some BOINC settings (if not also BOINC code) so that future coral cache attempts will be direct links as opposed to apache redirects.
- Matt
so this is about the upload and download problem
which we can also see in the stats, man what a low output

i say tada

DPC SETI@Home hit parade for 23 February 2009
Daily Top 30
Flushers: 64 / 962 (6,7 %)
pos | daily | member | total | overall pos
1. | 4.723 | [DPC] hansR | 14.498.103 | (1)
2. | 3.888 | Searchy! Internet Services V.O.F. | 8.036.637 | (5)
3. | 3.268 | [DPC]TeamGrazzie | 11.989.418 | (2)
4. | 2.796 | [DPC] Switch | 8.326.360 | (4)
5. | 1.932 | F@stmem | 1.901.889 | (10)
6. | 1.255 | ray | 1.145.097 | (17)
7. | 1.074 | Sp@ceNv@der | 10.405.082 | (3)
8. | 1.024 | [DPC] Brainy007 | 537.895 | (31)
9. | 897 | o25o & Dharkon | 4.621.239 | (6)
10. | 894 | Jox444 | 769.525 | (25)
11. | 852 | Teun Leijten | 135.079 | (102)
12. | 839 | Ensign Wildfire | 2.885.554 | (7)
13. | 798 | Dadoke | 849.595 | (24)
14. | 770 | Shuwi | 99.965 | (125)
15. | 541 | dpc_kluizenaar | 206.880 | (74)
16. | 514 | Bram | 35.332 | (224)
17. | 496 | dr bommel | 342.143 | (54)
18. | 437 | Sylvester | 132.817 | (104)
19. | 302 | Herby | 347.717 | (51)
20. | 290 | Jan Zandvliet & x-RaY99 | 1.625.052 | (12)
21. | 272 | Robin Ovaere | 315.187 | (57)
22. | 270 | Joey van den Berge | 182.891 | (83)
23. | 270 | Mr Beamer | 1.752.893 | (11)
24. | 242 | BlueTooth76 | 244.763 | (66)
25. | 221 | seridan | 401.441 | (40)
26. | 170 | Tijntje | 1.284.196 | (16)
27. | 170 | trout | 168.234 | (86)
28. | 131 | grass460 | 94.397 | (128)
29. | 130 | Toaby | 40.576 | (207)
30. | 129 | ozzo | 74.001 | (143)
Overall Top 30
pos | total | member | daily | daily pos
1. | 14.498.103 | [DPC] hansR | 4.723 | (1)
2. | 11.989.418 | [DPC]TeamGrazzie | 3.268 | (3)
3. | 10.405.082 | Sp@ceNv@der | 1.074 | (7)
4. | 8.326.360 | [DPC] Switch | 2.796 | (4)
5. | 8.036.637 | Searchy! Internet Services V.O.F. | 3.888 | (2)
6. | 4.621.239 | o25o & Dharkon | 897 | (9)
7. | 2.885.554 | Ensign Wildfire | 839 | (12)
8. | 2.537.723 | [DPC]Spinpoint | 0 | (-)
9. | 2.197.833 | [DPC]Trancemaus | 85 | (37)
10. | 1.901.889 | F@stmem | 1.932 | (5)
11. | 1.752.893 | Mr Beamer | 270 | (23)
12. | 1.625.052 | Jan Zandvliet & x-RaY99 | 290 | (20)
13. | 1.618.781 | AmeComputers | 72 | (40)
14. | 1.502.870 | MAX3400 | 0 | (-)
15. | 1.463.564 | [DPC]Team Zuid-Holland | 48 | (43)
16. | 1.284.196 | Tijntje | 170 | (26)
17. | 1.145.097 | ray | 1.255 | (6)
18. | 1.129.217 | Speedy67 & Friends | 0 | (-)
19. | 1.115.935 | visvogel | 0 | (-)
20. | 1.069.176 | [DPC] Anticimex & Lock Metaaldetectie | 0 | (-)
21. | 1.049.597 | [DPC]SjoQing | 0 | (-)
22. | 1.033.087 | Berkie | 54 | (41)
23. | 852.862 | Override | 17 | (63)
24. | 849.595 | Dadoke | 798 | (13)
25. | 769.525 | Jox444 | 894 | (10)
26. | 747.082 | Rydo | 0 | (-)
27. | 673.789 | Psyed | 0 | (-)
28. | 626.692 | MaNDaRK | 0 | (-)
29. | 563.965 | holy-shit | 0 | (-)
30. | 557.899 | sebastiaan | 0 | (-)
Teams Daily Top 15
pos | daily | team | total | overall pos
1. | 245.263 | SETI.Germany | 757.168.993 | (2)
26. | 42.533 | OcUK - Overclockers UK | 191.959.294 | (16)
27. | 42.453 | Elite Games | 94.221.472 | (37)
28. | 40.506 | UK BOINC Team | 121.689.065 | (29)
29. | 40.477 | Ars Technica | 140.751.649 | (20)
30. | 34.698 | AUSTRIA - NATIONAL - TEAM | 116.815.461 | (30)
31. | 34.375 | Team NIPPON | 105.130.456 | (34)
32. | 34.041 | Universe Examiners | 122.130.321 | (28)
33. | 32.744 | Dutch Power Cows | 122.796.907 | (27)
34. | 32.111 | BOINC SETI@home RUSSIA | 111.003.701 | (32)
35. | 27.567 | BOINC.Italy | 140.408.938 | (21)
36. | 27.114 | Team Norway | 54.125.016 | (57)
37. | 26.985 | RoEduNet | 58.491.038 | (51)
38. | 25.981 | Planet 3DNow! | 89.210.850 | (38)
39. | 24.554 | Hungary | 94.694.972 | (36)
40. | 24.275 | Microsoft | 58.456.669 | (52)
Teams Overall Top 15
pos | total | team | daily | daily pos
1. | 963.141.916 | SETI.USA | 222.819 | (2)
20. | 140.751.649 | Ars Technica | 40.477 | (29)
21. | 140.408.938 | BOINC.Italy | 27.567 | (35)
22. | 139.815.178 | Canada | 49.471 | (21)
23. | 139.771.737 | Phoenix Rising | 42.757 | (25)
24. | 135.338.953 | U.S.Air Force | 44.797 | (23)
25. | 133.985.779 | Amateur Radio Operators | 53.495 | (19)
26. | 126.765.825 | BOINC@AUSTRALIA | 62.618 | (16)
27. | 122.796.907 | Dutch Power Cows | 32.744 | (33)
28. | 122.130.321 | Universe Examiners | 34.041 | (32)
29. | 121.689.065 | UK BOINC Team | 40.506 | (28)
30. | 116.815.461 | AUSTRIA - NATIONAL - TEAM | 34.698 | (30)
31. | 113.233.285 | BOINC@Poland | 74.097 | (14)
32. | 111.003.701 | BOINC SETI@home RUSSIA | 32.111 | (34)
33. | 110.470.463 | PC Perspective Killer Frogs | 19.913 | (48)
34. | 105.130.456 | Team NIPPON | 34.375 | (31)
Megaflush Top 5
pos | Team | Flush | Date
1. | [DPC]TeamGrazzie | 553.056 | 04-01-2008
2. | chelloo.com | 261.227 | 09-01-2008
3. | [DPC]Spinpoint | 174.423 | 14-06-2007
4. | hansR | 170.409 | 17-08-2008
5. | REISinformatiegroep | 165.307 | 18-04-2007
When do you get them
Team | Average | Days
BOINC.Italy | 129.902 | 385
Ars Technica | 137.366 | 469
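The post doesn't say how the "Days" column is calculated. One plausible reading is the gap in total credit divided by the difference in average daily output, sketched below in Python. The formula and the function name are assumptions. Our own average is not in the table either: the 175.648 used here is a hypothetical figure, picked because it makes this formula land close to the table's numbers.

```python
def days_to_overtake(gap: float, own_average: float, their_average: float):
    """Days until we pass a team that is `gap` credits ahead, assuming both
    teams keep their current daily averages. Returns None if we never do."""
    closing_speed = own_average - their_average
    if closing_speed <= 0:
        return None  # they gain at least as fast as we do
    return gap / closing_speed


# Totals from the Teams Overall table: BOINC.Italy vs Dutch Power Cows.
gap = 140_408_938 - 122_796_907

# own_average is HYPOTHETICAL (not stated in the post); 129_902 is
# BOINC.Italy's average from the table above.
print(round(days_to_overtake(gap, 175_648, 129_902)))
```

If our own average drops, as it did during this outage, the number of days goes up fast, and once it falls below the other team's average there is no catching up at all.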
SETI@Home Links
SETI@Home webpage
SETI@Home forum
DPCH suggestion page
Source