Discussion:
Undefined behaviour when setting sFlowCpInterval to zero
andy kitchingman
2010-02-11 22:59:00 UTC
Permalink
Hi

In the poller function:

void
sfl_poller_set_sFlowCpInterval (SFLPoller * poller, u_int32_t sFlowCpInterval)
{
poller->sFlowCpInterval = sFlowCpInterval;
/* Set the countersCountdown to be a randomly selected value between 1 and
sFlowCpInterval. That way the counter polling would be desynchronised
(on a 200-port switch, polling all the counters in one second could be
harmful). */
poller->countersCountdown = 1 + (random () % sFlowCpInterval); <--- MAY NOT
DO WHAT YOU THINK IF sFlowCpInterval == 0!!
}

The subexpression "random () % sFlowCpInterval" results in undefined behaviour
according to the C and C++ standards, if sFlowCpInterval is zero.

Depending on what platform you compile this on, you get different results.

For example, on the i686, I got a floating point error at runtime.

On a PowerPC, the expression evaluated to 1 + random().

Zero is a valid value for sFlowCpInterval according to the sFlow specification
V5, so shouldn't poller->countersCountdown be set to zero in this case?

Regards

Andy


NOTICE: This message contains privileged and confidential
information intended only for the use of the addressee
named above. If you are not the intended recipient of
this message you are hereby notified that you must not
disseminate, copy or take any action in reliance on it.
If you have received this message in error please
notify Allied Telesis Labs Ltd immediately.
Any views expressed in this message are those of the
individual sender, except where the sender has the
authority to issue and specifically states them to
be the views of Allied Telesis Labs.
Neil McKee
2010-02-11 23:53:52 UTC
Permalink
Well spotted. Do you agree that the patch below fixes the mistake? (I took the opportunity to enhance the comment as well, just to emphasize the importance of this kind of desynchronization in a large deployment.)

Regards,
Neil


% svn diff sflow_poller.C
Index: sflow_poller.C
===================================================================
--- sflow_poller.C (revision 1929)
+++ sflow_poller.C (working copy)
@@ -79,10 +79,29 @@

void sfl_poller_set_sFlowCpInterval(SFLPoller *poller, u_int32_t sFlowCpInterval) {
poller->sFlowCpInterval = sFlowCpInterval;
- /* Set the countersCountdown to be a randomly selected value between 1 and
- sFlowCpInterval. That way the counter polling would be desynchronised
- (on a 200-port switch, polling all the counters in one second could be harmful). */
- poller->countersCountdown = 1 + (random() % sFlowCpInterval);
+ if(sFlowCpInterval) {
+ /* Set the countersCountdown to be a randomly selected value between 1 and
+ sFlowCpInterval. That way the counter polling will be desynchronised
+ (on a 200-port switch, polling all the counters in one second could be harmful).
+ In a large network, even this might not be ideal if time-synchroniziation
+ between devices is close and counters are always polled on second boundaries. If
+ 1000 different devices all send an sFlow datagram on the same second boundary
+ it could result in an antisocial burst.
+ However when counter-samples are packed into the export datagram they do not
+ always result in that datagram being sent immediately. It is more likely that
+ a subsequent packet-sample will be the one that triggers the datagram to be sent.
+ The packet-sample events are not sychronized to any clock, so that results in
+ excellent desynchronization (http://blog.sflow.com/2009/05/measurement-traffic.html).
+ Another smoothing factor is that the tick() function called here is usually
+ driven from a fairly "soft" polling loop rather than a hard real-time event.
+ */
+ poller->countersCountdown = 1 + (random() % sFlowCpInterval);
+ }
+ else {
+ /* Setting sFlowCpInterval to 0 disables counter polling altogether. Thanks to
+ Andy Kitchingman for spotting this ommission. */
+ poller->countersCountdown = 0;
+ }
}
Post by andy kitchingman
Hi
void
sfl_poller_set_sFlowCpInterval (SFLPoller * poller, u_int32_t sFlowCpInterval)
{
poller->sFlowCpInterval = sFlowCpInterval;
/* Set the countersCountdown to be a randomly selected value between 1 and
sFlowCpInterval. That way the counter polling would be desynchronised
(on a 200-port switch, polling all the counters in one second could be
harmful). */
poller->countersCountdown = 1 + (random () % sFlowCpInterval); <--- MAY NOT
DO WHAT YOU THINK IF sFlowCpInterval == 0!!
}
The subexpression "random () % sFlowCpInterval" results in undefined behaviour
according to the C and C++ standards, if sFlowCpInterval is zero.
Depending on what platform you compile this on, you get different results.
For example, on the i686, I got a floating point error at runtime.
On a PowerPC, the expression evaluated to 1 + random().
Zero is a valid value for sFlowCpInterval according to the sFlow specification
V5, so shouldn't poller->countersCountdown be set to zero in this case?
Regards
Andy
NOTICE: This message contains privileged and confidential
information intended only for the use of the addressee
named above. If you are not the intended recipient of
this message you are hereby notified that you must not
disseminate, copy or take any action in reliance on it.
If you have received this message in error please
notify Allied Telesis Labs Ltd immediately.
Any views expressed in this message are those of the
individual sender, except where the sender has the
authority to issue and specifically states them to
be the views of Allied Telesis Labs.
----
Neil McKee
InMon Corp.
http://www.inmon.com
andy kitchingman
2010-02-11 23:56:09 UTC
Permalink
That should do it.
Well spotted. Do you agree that the patch below fixes the mistake? (I took
the opportunity to enhance the comment as well, just to emphasize the
importance of this kind of desynchronization in a large deployment.)

Regards,
Neil


% svn diff sflow_poller.C
Index: sflow_poller.C
===================================================================
--- sflow_poller.C(revision 1929)
+++ sflow_poller.C(working copy)
@@ -79,10 +79,29 @@

void sfl_poller_set_sFlowCpInterval(SFLPoller *poller, u_int32_t
sFlowCpInterval) {
poller->sFlowCpInterval = sFlowCpInterval;
- /* Set the countersCountdown to be a randomly selected value between 1 and
- sFlowCpInterval. That way the counter polling would be desynchronised
- (on a 200-port switch, polling all the counters in one second could be
harmful). */
- poller->countersCountdown = 1 + (random() % sFlowCpInterval);
+ if(sFlowCpInterval) {
+ /* Set the countersCountdown to be a randomly selected value between 1
and
+ sFlowCpInterval. That way the counter polling will be desynchronised
+ (on a 200-port switch, polling all the counters in one second could be
harmful).
+ In a large network, even this might not be ideal if
time-synchroniziation
+ between devices is close and counters are always polled on second
boundaries. If
+ 1000 different devices all send an sFlow datagram on the same second
boundary
+ it could result in an antisocial burst.
+ However when counter-samples are packed into the export datagram they
do not
+ always result in that datagram being sent immediately. It is more
likely that
+ a subsequent packet-sample will be the one that triggers the datagram
to be sent.
+ The packet-sample events are not sychronized to any clock, so that
results in
+ excellent desynchronization
(http://blog.sflow.com/2009/05/measurement-traffic.html).
+ Another smoothing factor is that the tick() function called here is
usually
+ driven from a fairly "soft" polling loop rather than a hard real-time
event.
+ */
+ poller->countersCountdown = 1 + (random() % sFlowCpInterval);
+ }
+ else {
+ /* Setting sFlowCpInterval to 0 disables counter polling altogether.
Thanks to
+ Andy Kitchingman for spotting this ommission. */
+ poller->countersCountdown = 0;
+ }
}
Post by andy kitchingman
Hi
void
sfl_poller_set_sFlowCpInterval (SFLPoller * poller, u_int32_t
sFlowCpInterval)
Post by andy kitchingman
{
poller->sFlowCpInterval = sFlowCpInterval;
/* Set the countersCountdown to be a randomly selected value between 1 and
sFlowCpInterval. That way the counter polling would be desynchronised
(on a 200-port switch, polling all the counters in one second could be
harmful). */
poller->countersCountdown = 1 + (random () % sFlowCpInterval); <--- MAY NOT
DO WHAT YOU THINK IF sFlowCpInterval == 0!!
}
The subexpression "random () % sFlowCpInterval" results in undefined behaviour
according to the C and C++ standards, if sFlowCpInterval is zero.
Depending on what platform you compile this on, you get different results.
For example, on the i686, I got a floating point error at runtime.
On a PowerPC, the expression evaluated to 1 + random().
Zero is a valid value for sFlowCpInterval according to the sFlow specification
V5, so shouldn't poller->countersCountdown be set to zero in this case?
Regards
Andy
NOTICE: This message contains privileged and confidential
information intended only for the use of the addressee
named above. If you are not the intended recipient of
this message you are hereby notified that you must not
disseminate, copy or take any action in reliance on it.
If you have received this message in error please
notify Allied Telesis Labs Ltd immediately.
Any views expressed in this message are those of the
individual sender, except where the sender has the
authority to issue and specifically states them to
be the views of Allied Telesis Labs.
----
Neil McKee
InMon Corp.
http://www.inmon.com




NOTICE: This message contains privileged and confidential
information intended only for the use of the addressee
named above. If you are not the intended recipient of
this message you are hereby notified that you must not
disseminate, copy or take any action in reliance on it.
If you have received this message in error please
notify Allied Telesis Labs Ltd immediately.
Any views expressed in this message are those of the
individual sender, except where the sender has the
authority to issue and specifically states them to
be the views of Allied Telesis Labs.

Loading...