Discussion: the squid round-robin option balances very unevenly

火星人 @ 2014-03-04, reply: 0

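  In "Squid: The Definitive Guide", the "round-robin" option is explained as follows:
  This option is a simple load-sharing technique. It is useful only when you designate two or more parent caches as round-robin. Squid keeps a counter for each parent cache. When it needs to forward a cache miss, squid selects the parent whose counter has the lowest value.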
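
The rule described above can be sketched in a few lines of Python. This is only an illustration of "pick the parent with the lowest counter", not squid's actual code; the function names are made up for the example, and ties are resolved in favour of the first parent, which is an assumption.

```python
from collections import Counter


def pick_lowest(counters):
    """Return the parent with the lowest counter (first one wins ties; an assumption)."""
    return min(counters, key=counters.get)


def simulate(parents, misses):
    """Forward `misses` cache misses and return how many each parent received."""
    counters = {name: 0 for name in parents}
    served = Counter()
    for _ in range(misses):
        chosen = pick_lowest(counters)
        counters[chosen] += 1  # the counter grows every time the parent is used
        served[chosen] += 1
    return dict(served)


# 234 is the total number of misses in the statistics discussed in this thread.
result = simulate(["ha", "hb"], 234)
print(result)
```

Under this idealised rule, with both counters starting at zero, the two parents end up with the same number of requests.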

  In my squid configuration below there are two parents, one named ha and one named hb, and "round-robin" is used to rotate between them.
################### round-robin #####################
cache_peer source1.parent.com parent 80 3100 no-digest no-query no-netdb-exchange originserver name=ha round-robin
cache_peer source2.parent.com parent 80 3100 no-digest no-query no-netdb-exchange originserver name=hb round-robin
cache_peer_domain ha flv.domain.com
cache_peer_domain hb flv.domain.com

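According to the description of "round-robin", after some time the two parents should have received roughly the same number of requests. After about a day, however, the request counts were: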
5 TCP_DENIED NONE -
203 TCP_HIT NONE -
22 TCP_MISS ROUNDROBIN_PARENT ha
212 TCP_MISS ROUNDROBIN_PARENT hb
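Requests that HIT locally never go to a parent, so the HIT rows can be ignored. MISS requests do have to go through a parent, yet the split between the two parents is badly skewed: in the figures above hb received roughly ten times as many requests as ha (212 versus 22). This is very strange, because both parents are on the same network segment with identical network quality, and I do not know what could cause it.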
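
To put numbers on the imbalance, here is a small Python sketch that reads the four summary rows exactly as posted and works out each parent's share of the forwarded misses. The column layout (count, result code, hierarchy code, peer) is taken from the rows above; the function name is made up for the example.

```python
SUMMARY = """\
5 TCP_DENIED NONE -
203 TCP_HIT NONE -
22 TCP_MISS ROUNDROBIN_PARENT ha
212 TCP_MISS ROUNDROBIN_PARENT hb
"""


def parent_counts(summary):
    """Return {parent: requests} for the rows that were forwarded via round-robin."""
    counts = {}
    for row in summary.splitlines():
        count, _result, hierarchy, peer = row.split()
        if hierarchy == "ROUNDROBIN_PARENT":
            counts[peer] = counts.get(peer, 0) + int(count)
    return counts


counts = parent_counts(SUMMARY)
total = sum(counts.values())
for peer, n in counts.items():
    print(f"{peer}: {n} requests, {n / total:.1%} of forwarded misses")
ratio = counts["hb"] / counts["ha"]
print(f"hb/ha ratio: {ratio:.1f}")
```

With the posted figures, ha handled about 9.4% of the forwarded misses and hb about 90.6%, a ratio of about 9.6 to 1.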

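《Solution》

Separately, while testing round-robin I also tested how squid decides that a parent is dead. The way squid checks whether a parent is alive turned out to be interesting, so I have summarized it below:
《How squid decides whether a parent is dead or live》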

When more than one parent is configured and one of them cannot be reached and is judged to be dead, squid writes an entry like the following to cache.log:
2012/08/18 11:26:08| Detected DEAD Parent: source2.parent.com
Once the dead parent can be reached again, squid logs an entry like this:
2012/08/18 11:27:51| Detected REVIVED Parent: source2.parent.com

So what does squid actually do when deciding whether a parent is dead or live? Here are some test results.

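Configuration 1: with only one parent, every failed connection is logged. After 10 consecutive connection failures the parent is marked dead, and further failures are no longer logged until it recovers: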
2012/08/18 11:40:48| TCP connection to ctsource.parent.com (ctsource.parent.com:80) failed
2012/08/18 11:41:40| TCP connection to ctsource.parent.com (ctsource.parent.com:80) failed
2012/08/18 11:43:46| TCP connection to ctsource.parent.com (ctsource.parent.com:80) failed
2012/08/18 11:43:46| TCP connection to ctsource.parent.com (ctsource.parent.com:80) failed
2012/08/18 11:43:46| TCP connection to ctsource.parent.com (ctsource.parent.com:80) failed
2012/08/18 11:43:46| TCP connection to ctsource.parent.com (ctsource.parent.com:80) failed
2012/08/18 11:43:46| TCP connection to ctsource.parent.com (ctsource.parent.com:80) failed
2012/08/18 11:43:46| TCP connection to ctsource.parent.com (ctsource.parent.com:80) failed
2012/08/18 11:43:46| TCP connection to ctsource.parent.com (ctsource.parent.com:80) failed
2012/08/18 11:44:24| TCP connection to ctsource.parent.com (ctsource.parent.com:80) failed
2012/08/18 11:44:24| Detected DEAD Parent: ctsource.parent.com

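Configuration 2: with multiple parents.
1) With round-robin configured, when one parent goes down squid logs the failure and enters the failure path. After 10 consecutive failures the parent is marked dead and excluded from the "round-robin" algorithm.
2) Without round-robin, the parents are tried in order. If the first parent goes down, squid logs the failure and enters the failure path. After 10 consecutive failures the parent is marked dead, and subsequent requests are sent to the second parent.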
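
The behaviour observed in these tests can be modelled as a small state machine. The Python sketch below is a model of the observations, not squid's implementation: the class and function names are invented, the log messages are shortened, and resetting the failure count on a successful connection is an assumption drawn from the word "consecutive".

```python
FAIL_LIMIT = 10  # consecutive failures before DEAD, as observed in the tests above


class ParentState:
    """Tracks whether one parent is currently considered alive."""

    def __init__(self, host):
        self.host = host
        self.failures = 0
        self.alive = True
        self.log = []

    def connect_failed(self):
        if not self.alive:
            return  # already dead: further failures are not logged
        self.failures += 1
        self.log.append(f"TCP connection to {self.host} failed")
        if self.failures >= FAIL_LIMIT:
            self.alive = False
            self.log.append(f"Detected DEAD Parent: {self.host}")

    def connect_ok(self):
        self.failures = 0  # assumption: a success resets the consecutive count
        if not self.alive:
            self.alive = True
            self.log.append(f"Detected REVIVED Parent: {self.host}")


def usable(parents):
    """Hosts that selection (round-robin or ordered) may still pick."""
    return [p.host for p in parents if p.alive]


ha = ParentState("source1.parent.com")
hb = ParentState("source2.parent.com")
for _ in range(12):          # 12 failures in a row; only the first 10 are logged
    hb.connect_failed()
usable_while_dead = usable([ha, hb])
hb.connect_ok()              # the parent becomes reachable again
print(usable_while_dead, hb.log[-2:])
```

While hb is dead, only ha remains selectable, which matches both cases above: round-robin skips the dead parent, and ordered selection falls through to the next one.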

《Solution》

Looking forward to replies about the round-robin imbalance.

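《Solution》

Buddy, I have run into something similar. I also use round-robin, and it goes much the same way every time: the first cache_peer carries a lot of traffic at first, then after a few minutes its volume drops off and the second cache_peer takes over as the main carrier, while the third cache_peer's volume sits between the first and the second.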

《Solution》

Reading the squid source makes this fairly easy to understand.

In peer_select.cc
look at
peerGetSomeParent

The rr policy is implemented on this line:
} else if ((p = getRoundRobinParent(request))) {
      code = ROUNDROBIN_PARENT;

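This calls getRoundRobinParent.
Its implementation is as follows; both the rr counter and the weight affect the choice:
         /* q: best candidate so far; p: peer currently being examined */
         if (p->weight == q->weight) {
             if (q->rr_count < p->rr_count)
                 continue;
         } else if (((double) q->rr_count / q->weight) < ((double) p->rr_count / p->weight)) {
             continue;
         }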
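
To see what this comparison does in practice, here is a Python port wrapped in a small simulation. Only the comparison itself comes from the fragment above; the surrounding loop, the skipping of dead or zero-weight peers, the counter increment after selection, and all names and starting values are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Peer:
    name: str
    weight: int = 1
    rr_count: int = 0
    alive: bool = True


def get_round_robin_parent(peers):
    """Pick a peer with the comparison quoted above and bump its counter."""
    q = None  # best candidate so far
    for p in peers:
        if not p.alive or p.weight == 0:  # assumption: such peers are skipped
            continue
        if q is not None:
            if p.weight == q.weight:
                if q.rr_count < p.rr_count:
                    continue
            elif q.rr_count / q.weight < p.rr_count / p.weight:
                continue
        q = p
    if q is not None:
        q.rr_count += 1  # assumption: the counter is incremented on selection
    return q


def run(peers, n):
    """Send n misses through the selector and count who served them."""
    served = {p.name: 0 for p in peers}
    for _ in range(n):
        served[get_round_robin_parent(peers).name] += 1
    return served


even = run([Peer("ha"), Peer("hb")], 200)                       # equal weights and counters
weighted = run([Peer("ha", weight=1), Peer("hb", weight=3)], 200)  # hypothetical 1:3 weights
lagging = run([Peer("ha", rr_count=150), Peer("hb")], 200)      # hb's counter starts 150 behind
print(even, weighted, lagging)
```

In this model equal peers split 100/100, a 1:3 weighting gives 50/150, and a peer whose counter starts 150 behind receives 175 of 200 requests, because it is chosen every time until its counter catches up. The starting gap is hypothetical; the thread does not establish that this is what happened in the configuration above, only that the counters and weights are what drive the selection.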


Article URL: http://coctec.com/docs/service/show-post-11214.html
