Bandwidth Management
TODO

Finding who's causing a bandwidth spike

We find out about bandwidth usage spikes in one of several ways:

  • NOC calls and tells us they notice a large usage spike
  • we see a system-generated email telling us a customer has exceeded their usage cap
  • speed complaints are coming in
  • we notice the spike on the mrtg page

Determining the cause of the spike is fairly easy with a bit of looking.

Castle:
Open up the mrtg graph for p1a (the top-level switch for most of the machines at castle): mgmt -> monitoring -> p1a -> bytes/sec

i2b:
Open up the mrtg graph for p20 (the top-level switch for most of the machines at i2b): mgmt -> monitoring -> p20

From there, you can begin to narrow down which switch the spike is coming from, then load the mrtg graph for that switch and narrow down further by port/device. A word of caution: even though the mrtg graphs show labels indicating which device is connected to which port, you should take follow-up steps to confirm which machine is actually on that port (except for the 3750, p1a, p1b, and p20, where the labeling should be accurate; in general the switches at i2b are also mostly labeled correctly). See Finding which IPs are on a port
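
If you want to cross-check the mrtg graphs straight from a switch, a short script can rank ports by current byte rate using the same SNMP interface counters mrtg polls. The sketch below is only an illustration, not part of the standard procedure: it assumes net-snmp's snmpwalk is installed and that the switch answers SNMP v2c reads, and the hostname and community string are placeholders rather than our real values.

  #!/usr/bin/env python3
  # Rough sketch: sample a switch's 64-bit interface counters twice over
  # SNMP and rank ports by byte rate, as a cross-check against mrtg.
  # SWITCH and COMMUNITY are placeholders -- substitute the real ones.
  import subprocess
  import time

  SWITCH = "p20.example.com"   # placeholder: switch you suspect
  COMMUNITY = "public"         # placeholder read-only community string
  INTERVAL = 30                # seconds between the two samples

  IF_NAME  = ".1.3.6.1.2.1.31.1.1.1.1"    # IF-MIB::ifName
  IN_OCTS  = ".1.3.6.1.2.1.31.1.1.1.6"    # IF-MIB::ifHCInOctets
  OUT_OCTS = ".1.3.6.1.2.1.31.1.1.1.10"   # IF-MIB::ifHCOutOctets

  def walk(oid):
      """Return {ifIndex: value-as-string} for one IF-MIB column."""
      out = subprocess.run(
          ["snmpwalk", "-v2c", "-c", COMMUNITY, "-Oqn", SWITCH, oid],
          capture_output=True, text=True, check=True).stdout
      table = {}
      for line in out.splitlines():
          full_oid, _, value = line.partition(" ")
          table[int(full_oid.rsplit(".", 1)[-1])] = value.strip().strip('"')
      return table

  names = walk(IF_NAME)
  in1, out1 = walk(IN_OCTS), walk(OUT_OCTS)
  time.sleep(INTERVAL)
  in2, out2 = walk(IN_OCTS), walk(OUT_OCTS)

  rates = []
  for idx in in1:
      if idx not in in2 or idx not in out1 or idx not in out2:
          continue
      mbps_in  = (int(in2[idx]) - int(in1[idx])) * 8 / INTERVAL / 1e6
      mbps_out = (int(out2[idx]) - int(out1[idx])) * 8 / INTERVAL / 1e6
      rates.append((max(mbps_in, mbps_out), names.get(idx, str(idx)),
                    mbps_in, mbps_out))

  # Print the five busiest ports; the top one is usually your spike.
  for _, name, mbps_in, mbps_out in sorted(rates, reverse=True)[:5]:
      print(f"{name:14s} in {mbps_in:8.1f} Mbit/s  out {mbps_out:8.1f} Mbit/s")

Run it against whichever switch you've narrowed down to; the busiest port in the output is normally the one behind the spike, but still confirm which machine is actually on that port as described above.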

Once you've determined and confirmed which server is connected to a port, there are a few ways to curb the traffic.

  • you can turn off the port entirely (last resort). See Shutting down a port (an illustrative SNMP sketch follows below)
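
For the port-shutdown case, the Shutting down a port page is the actual procedure. Purely as an illustration, the same effect can be achieved over SNMP by setting IF-MIB::ifAdminStatus to down(2) for the offending ifIndex, assuming the switch accepts SNMP writes at all (ours may not). The hostname, write community, and ifIndex in this sketch are placeholders, not real values.

  #!/usr/bin/env python3
  # Illustrative sketch only -- the wiki's "Shutting down a port" page is
  # the real procedure. This sets IF-MIB::ifAdminStatus to down(2) for one
  # port via net-snmp's snmpset, which requires an SNMP write community.
  import subprocess

  SWITCH = "p20.example.com"     # placeholder switch hostname
  WRITE_COMMUNITY = "private"    # placeholder write community
  IFINDEX = 10101                # placeholder ifIndex of the offending port

  ADMIN_STATUS = f".1.3.6.1.2.1.2.2.1.7.{IFINDEX}"   # IF-MIB::ifAdminStatus

  # "i 2" = integer 2 = down; "i 1" would re-enable the port.
  subprocess.run(
      ["snmpset", "-v2c", "-c", WRITE_COMMUNITY, SWITCH,
       ADMIN_STATUS, "i", "2"],
      check=True)
  print(f"ifIndex {IFINDEX} on {SWITCH} administratively shut down")

Setting the same OID back to 1 re-enables the port once the traffic issue is resolved.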

Caps

Setting up bandwidth caps

Reporting

Notices