<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=us-ascii">
<style style="display: none;" id="owaParaStyle" type="text/css">P {margin-top:0;margin-bottom:0;}</style>
</head>
<body id="" tabindex="0" aria-label="Nachrichtentext" fpstyle="1" dir="ltr">
<div name="divtagdefaultwrapper" id="divtagdefaultwrapper" style="font-family: Tahoma, Geneva, sans-serif; font-size: 10pt; color: #000000; margin: 0">
Hi!<br>
<br>
I installed Munin on two of our machines. The master runs in an OpenStack VM; the first node I configured is the controller node of another OpenStack installation. Because of the large number of disks, the "diskstats" plugin generates a lot of output (6154 lines, 218698 bytes). This much data kills the TCP connection between the master and the node.<br>
<br>
I can reproduce this with a plain telnet to port 4949 on the node:<br>
<font face="'Courier New',monospace"># telnet 192.168.104.61 4949<br>
Trying 192.168.104.61...<br>
Connected to 192.168.104.61.<br>
Escape character is '^]'.<br>
# munin node at CNT64IB003.example.com<br>
config diskstats<br>
... lots of config data ...<br>
graph_info This graph shows the number of IO operations pr second and the average size of these requests. Lots of small requests should result in in lower throughput (separate graph) and higher service time (separate graph). Please note Connection closed
by foreign host.<br>
</font><br>
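The telnet session above can also be scripted, which makes it easier to repeat the test from different hosts. The following is only a minimal sketch: the function name <code>fetch_plugin_config</code> is hypothetical, and it assumes the node ends a config response with a line containing a single "." as munin-node normally does.<br>

```python
import socket


def fetch_plugin_config(host, plugin, port=4949, timeout=10.0):
    """Send 'config PLUGIN' to a munin-node.

    Returns (bytes_received, completed). 'completed' is True only if the
    terminating "." line arrived before the connection ended.
    """
    received = bytearray()
    completed = False
    with socket.create_connection((host, port), timeout=timeout) as sock:
        reader = sock.makefile("rb")
        try:
            reader.readline()  # banner line: "# munin node at ..."
            sock.sendall(f"config {plugin}\n".encode("ascii"))
            for line in reader:
                if line.rstrip(b"\r\n") == b".":
                    completed = True
                    break
                received += line
        except (ConnectionResetError, socket.timeout):
            # The connection was reset or stalled before the response ended.
            pass
    return len(received), completed
```

Calling <code>fetch_plugin_config("192.168.104.61", "diskstats")</code> from the master VM and again from a host outside OpenStack would show whether the transfer stops short of the roughly 218698 bytes the plugin produces.<br>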
<div><br>
<basefont size="2" face="Arial">The connection goes from the VM on the internal network through an OpenStack router to the external network, and from there over a "real" router to the node. Two tests point to the OpenStack L3 router rather than the internal networking: the same connection works from a hardware machine outside of OpenStack, and it also works when another OpenStack VM acts as the Munin node. For completeness, this is up-to-date Grizzly RDO running on up-to-date CentOS 6.4.<br>
<br>
Networks:<br>
<ul style="font-family: Tahoma; font-size: 10pt; margin-top: 0px; margin-bottom: 0px;">
<li><font face="Tahoma, Geneva, sans-serif">192.168.163.0/24 OpenStack internal</font></li><li><font face="Tahoma, Geneva, sans-serif">192.168.142.0/24 OpenStack external</font></li><li><font face="'Courier New',monospace"><font face="Tahoma, Geneva, sans-serif">192.168.104.0/24 outside of OpenStack</font><br>
</font></li></ul>
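To make it easier to read the addresses in the captures below, here is a small sketch that maps an address to one of the three networks listed above. The helper name <code>classify</code> is hypothetical. Note that the master shows up as 192.168.163.12 in its own capture and as 192.168.142.11 on the node; presumably the OpenStack router translates the source address, but that is an assumption.<br>

```python
import ipaddress

# The three networks from the list above.
NETWORKS = {
    "OpenStack internal": ipaddress.ip_network("192.168.163.0/24"),
    "OpenStack external": ipaddress.ip_network("192.168.142.0/24"),
    "outside of OpenStack": ipaddress.ip_network("192.168.104.0/24"),
}


def classify(address):
    """Return the name of the network that contains the address."""
    ip = ipaddress.ip_address(address)
    for name, network in NETWORKS.items():
        if ip in network:
            return name
    return "unknown"


for addr in ("192.168.163.12", "192.168.142.11", "192.168.104.61"):
    print(addr, "->", classify(addr))
```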
<br>
Here is the tcpdump output from the Munin master:<br>
<br>
<font face="'Courier New',monospace">13:35:02.464812 IP 192.168.163.12.46630 > 192.168.104.61.munin: Flags [.], ack 94081, win 143, options [nop,nop,TS val 83989877 ecr 338498688], length 0<br>
13:35:02.465086 IP 192.168.104.61.munin > 192.168.163.12.46630: Flags [.], seq 94081:95529, ack 727, win 114, options [nop,nop,TS val 338498690 ecr 83989877], length 1448<br>
13:35:02.465282 IP 192.168.104.61.munin > 192.168.163.12.46630: Flags [.], seq 95529:98425, ack 727, win 114, options [nop,nop,TS val 338498690 ecr 83989877], length 2896<br>
13:35:02.469397 IP 192.168.163.12.46630 > 192.168.104.61.munin: Flags [.], ack 98425, win 205, options [nop,nop,TS val 83989881 ecr 338498690], length 0<br>
13:35:02.469686 IP 192.168.104.61.munin > 192.168.163.12.46630: Flags [.], seq 98425:99873, ack 727, win 114, options [nop,nop,TS val 338498694 ecr 83989881], length 1448<br>
13:35:02.469881 IP 192.168.104.61.munin > 192.168.163.12.46630: Flags [.], seq 99873:104217, ack 727, win 114, options [nop,nop,TS val 338498694 ecr 83989881], length 4344<br>
13:35:02.474029 IP 192.168.163.12.46630 > 192.168.104.61.munin: Flags [.], ack 104217, win 189, options [nop,nop,TS val 83989885 ecr 338498694], length 0<br>
13:35:02.474292 IP 192.168.104.61.munin > 192.168.163.12.46630: Flags [R], seq 4001796938, win 0, length 0<br>
13:35:02.605976 IP 192.168.104.61.munin > 192.168.163.12.46630: Flags [.], seq 39738:41186, ack 727, win 114, options [nop,nop,TS val 338498680 ecr 83989867], length 1448<br>
13:35:02.606042 IP 192.168.163.12.46630 > 192.168.104.61.munin: Flags [R], seq 3552918741, win 0, length 0<br>
13:35:02.615483 IP 192.168.104.61.munin > 192.168.163.12.46630: Flags [P.], seq 41186:41197, ack 727, win 114, options [nop,nop,TS val 338498680 ecr 83989867], length 11<br>
13:35:02.615512 IP 192.168.163.12.46630 > 192.168.104.61.munin: Flags [R], seq 3552918741, win 0, length 0<br>
13:35:02.618052 IP 192.168.104.61.munin > 192.168.163.12.46630: Flags [.], seq 41197:42645, ack 727, win 114, options [nop,nop,TS val 338498680 ecr 83989867], length 1448</font><br>
<br>
You can see that the Munin node terminates the connection with a RESET packet (13:35:02.474292).<font face="'Courier New',monospace"><font face="Tahoma, Geneva, sans-serif"> The Munin master isn't running NTP, so please disregard the offset between the timestamps of the two captures.
</font><br>
<br>
<font face="Tahoma, Geneva, sans-serif">Now, here is a tcpdump output from the node:</font><br>
<br>
13:35:17.485792 IP 192.168.142.11.46630 > 192.168.104.61.munin: Flags [.], ack 94081, win 143, options [nop,nop,TS val 83989877 ecr 338498688], length 0<br>
13:35:17.485812 IP 192.168.104.61.munin > 192.168.142.11.46630: Flags [.], seq 94081:98425, ack 727, win 114, options [nop,nop,TS val 338498690 ecr 83989877], length 4344<br>
13:35:17.490390 IP 192.168.142.11.46630 > 192.168.104.61.munin: Flags [.], ack 98425, win 205, options [nop,nop,TS val 83989881 ecr 338498690], length 0<br>
13:35:17.490406 IP 192.168.104.61.munin > 192.168.142.11.46630: Flags [.], seq 98425:104217, ack 727, win 114, options [nop,nop,TS val 338498694 ecr 83989881], length 5792<br>
13:35:17.492061 IP 192.168.142.11.46630 > 192.168.104.61.munin: Flags [R], seq 3552918741, win 0, length 0<br>
13:35:17.495034 IP 192.168.142.11.46630 > 192.168.104.61.munin: Flags [.], ack 104217, win 189, options [nop,nop,TS val 83989885 ecr 338498694], length 0<br>
13:35:17.495055 IP 192.168.104.61.munin > 192.168.142.11.46630: Flags [R], seq 4001796938, win 0, length 0<br>
13:35:17.501622 IP 192.168.142.11.46630 > 192.168.104.61.munin: Flags [R], seq 3552918741, win 0, length 0<br>
13:35:17.511234 IP 192.168.142.11.46630 > 192.168.104.61.munin: Flags [R], seq 3552918741, win 0, length 0<br>
13:35:17.520809 IP 192.168.142.11.46630 > 192.168.104.61.munin: Flags [R], seq 3552918741, win 0, length 0<br>
</font><br>
This capture shows the master sending the RESET packet (13:35:17.492061).<br>
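Because the two clocks are not synchronized, comparing the captures by eye is error-prone. The sketch below parses a few lines copied from both captures, reports the first RESET each side saw, and estimates the clock offset by matching a TCP timestamp value (TS val) that appears in both. It assumes tcpdump's default one-line text format as shown above; the function names are hypothetical.<br>

```python
import re
from datetime import datetime

# time, source, destination, flags, optional TCP timestamp value
LINE_RE = re.compile(
    r"^(\d{2}:\d{2}:\d{2}\.\d{6}) IP (\S+) > (\S+): Flags \[([^\]]+)\]"
    r"(?:.*?TS val (\d+))?"
)

MASTER_CAPTURE = [
    "13:35:02.464812 IP 192.168.163.12.46630 > 192.168.104.61.munin: Flags [.], ack 94081, win 143, options [nop,nop,TS val 83989877 ecr 338498688], length 0",
    "13:35:02.474292 IP 192.168.104.61.munin > 192.168.163.12.46630: Flags [R], seq 4001796938, win 0, length 0",
    "13:35:02.606042 IP 192.168.163.12.46630 > 192.168.104.61.munin: Flags [R], seq 3552918741, win 0, length 0",
]
NODE_CAPTURE = [
    "13:35:17.485792 IP 192.168.142.11.46630 > 192.168.104.61.munin: Flags [.], ack 94081, win 143, options [nop,nop,TS val 83989877 ecr 338498688], length 0",
    "13:35:17.492061 IP 192.168.142.11.46630 > 192.168.104.61.munin: Flags [R], seq 3552918741, win 0, length 0",
    "13:35:17.495055 IP 192.168.104.61.munin > 192.168.142.11.46630: Flags [R], seq 4001796938, win 0, length 0",
]


def parse(lines):
    """Turn tcpdump text lines into dicts; lines that do not match are skipped."""
    packets = []
    for line in lines:
        m = LINE_RE.match(line.strip())
        if m:
            time, src, dst, flags, tsval = m.groups()
            packets.append({
                "time": datetime.strptime(time, "%H:%M:%S.%f"),
                "src": src, "dst": dst, "flags": flags, "tsval": tsval,
            })
    return packets


def first_reset(packets):
    """Return the first packet with the RST flag set, or None."""
    return next((p for p in packets if "R" in p["flags"]), None)


def clock_offset(master_packets, node_packets):
    """Estimate node clock minus master clock from a packet seen in both captures.

    Packets are matched on (source port, TS val), because the source
    address differs between the two captures.
    """
    seen = {}
    for p in master_packets:
        if p["tsval"] is not None:
            seen.setdefault((p["src"].rsplit(".", 1)[-1], p["tsval"]), p["time"])
    for p in node_packets:
        key = (p["src"].rsplit(".", 1)[-1], p["tsval"])
        if key in seen:
            return p["time"] - seen[key]
    return None


master = parse(MASTER_CAPTURE)
node = parse(NODE_CAPTURE)
print("first RST seen on master comes from", first_reset(master)["src"])
print("first RST seen on node comes from", first_reset(node)["src"])
print("clock offset (node - master):", clock_offset(master, node))
```

With the lines excerpted here, the estimated offset is about 15.02 seconds. That figure includes the transit time of the matched packet, so it is only an approximation.<br>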
<br>
I turned on debugging for the L3 agent, but I can't tell anything from the output. Without debugging, the log file has no new entries.<br>
<br>
Please advise.<br>
<br>
<font face="Tahoma">Best regards <br>
Lutz Christoph <br>
<br>
-- <br>
<br>
Lutz Christoph <br>
<br>
arago Institut für komplexes Datenmanagement AG <br>
<br>
Eschersheimer Landstraße 526 - 532 <br>
60433 Frankfurt am Main <br>
<br>
eMail: lchristoph@arago.de - www: http://www.arago.de <br>
Tel: 0172/6301004 <br>
Mobile: 0172/6301004 </font><font face="Tahoma"><br>
<br>
-- <br>
Bank details: Frankfurter Sparkasse, bank code (BLZ): 500 502 01, account no.: 79343 <br>
Executive board: Hans-Christian Boos, Martin Friedrich <br>
Chairman of the supervisory board: Dr. Bernhard Walther <br>
Registered office: Kronberg im Taunus - HRB 5731 - Register court: Königstein i.Ts <br>
VAT ID: DE 178572359 - Tax number: 2603 003 228 43435 </font></div>
</div>
</body>
</html>