Q: I am a network administrator managing a small network of about 20 servers, one Cisco router, one Cisco VPN concentrator and a Cisco 260e PIX firewall with three switches (150-200 nodes). Every couple of months there is huge traffic that brings the network down for about two hours. The origin of this traffic is unknown. Earlier, a worm was suspected to be causing this, but after talking to other network administrators, we figured it may be a wrong broadcast setting of some sort. We have a Norton AntiVirus server installed. How can I find the source and prevent this from happening again?
Thurman: If you know that the suspect traffic is consistently impacting a certain network, you can install a stand-alone packet sniffer with an IP address of one of the nodes on that particular network, then analyze the network traffic after the attack. Because the activity occurs only once every couple of months, you will have to keep an eye on the logs of the packet sniffer.
Another option would be to have your network engineers configure a span (switch port analyzer -- a feature on Cisco Catalyst switches) port on your switch, to monitor the traffic at your router's trunk point. If they can't, you could possibly utilize a tap (such as Finisar's) to get a look at the network traffic during the surge of suspicious traffic.
Another suggestion is to look at the timing of the attacks. What type of business activities are occurring during this time frame? Is there some sort of backup activity occurring? Perhaps the accounting department is reconciling pay records.
Hofmeyr: The easiest method for determining the source of the traffic would be to use a protocol analysis tool such as a sniffer. Attach the sniffer to either a saturated VLAN, or set up a span port on one of your switches to capture all of the traffic. By analyzing the resulting capture file, you should be able to determine the source and type(s) of traffic that are causing these spikes. Once you locate this information, a variety of solutions can then be applied to alleviate the situation, depending upon the specific issues you find.
Schweitzer: It's quite possible that you're experiencing what's known as a broadcast storm. A broadcast storm is a peculiar phenomenon in which a message is broadcast across a network, causing a response to be returned. Each such response results in still more responses, bringing about a snowball effect. Such "traffic" can occupy so much bandwidth that it can block all of your remaining network traffic -- in effect, a total meltdown.
In addition, unnecessary broadcasts consume processor (CPU) cycles from workstations and servers across the network, slowing them down. Keep in mind that if your Layer 2 switches are using redundant connections across multiple paths via the spanning tree protocol, when the primary path fails, the alternate path will be activated in order to maintain connectivity. If the spanning tree protocol were to somehow become disabled, a broadcast storm would result.
This is because any broadcast packet received on one of the redundant connections is rebroadcast to all interfaces, with the exception of the interface on which it was received. Many switches these days have the ability to limit broadcasts to reduce the effect of any storm or chatty NIC.
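The flooding rule described above can be illustrated with a toy simulation. This is only a sketch under simplifying assumptions: the switch names, the `flood` function and the two topologies are hypothetical, hosts are ignored, and every link delivers a frame in one "round."

```python
def flood(links, start, rounds):
    """Simulate flooding one broadcast frame between switches.

    links maps a switch name to the list of its neighbor switches.
    Each switch re-sends a received broadcast on every link except the
    one it arrived on (no spanning tree, no loop suppression).
    Returns the number of frame copies in flight after each round.
    """
    # Each in-flight copy is (current_switch, switch_it_came_from).
    in_flight = [(start, None)]
    history = []
    for _ in range(rounds):
        next_flight = []
        for here, came_from in in_flight:
            for neighbor in links[here]:
                if neighbor != came_from:
                    next_flight.append((neighbor, here))
        in_flight = next_flight
        history.append(len(in_flight))
    return history


# Loop-free chain: A - B - C
chain = {"A": ["B"], "B": ["A", "C"], "C": ["B"]}

# Four switches, every pair linked, spanning tree assumed disabled.
mesh = {
    "A": ["B", "C", "D"],
    "B": ["A", "C", "D"],
    "C": ["A", "B", "D"],
    "D": ["A", "B", "C"],
}

print(flood(chain, "A", 4))  # [1, 1, 0, 0]
print(flood(mesh, "A", 4))   # [3, 6, 12, 24]
```

In the loop-free chain the single broadcast dies out after two hops. In the meshed topology with no loop protection, the number of copies doubles every round, which is the snowball effect behind a storm.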
Really, the only option is to use a sniffer to capture packet information, or to use accounting on a good managed switch to figure out the traffic hogs involved and see what is actually happening.
Wilson: Fortunately, your network is relatively simple. However, I think you must have a Cisco PIX 500 Series firewall, since I'm not aware of a 260e PIX. I am assuming a flat IP scheme (private address space), 10/100 Ethernet, TCP/IP (as opposed to AppleTalk, for instance), RIP (Routing Information Protocol -- a basic distance vector protocol), a Cisco router and Cisco Catalyst switches, possibly using subnets, but not a segmented network. As networks grow and become more complex, it's more difficult to troubleshoot this kind of problem. In a small, flat, single-protocol switched network, this problem is relatively straightforward to troubleshoot. I didn't say "fix" because the issue could be originating from a number of sources, including an incorrectly configured application.
The goal is to find out what kind of traffic is hampering network performance, where the traffic is coming from (source IP address) and where the traffic is going to (destination IP address). Discovering this information can be accomplished by use of a sniffer, a network analyzer designed to capture packets. Ethernet traffic consists of three different types of packets: unicast, multicast and broadcast. Unicast packets are addressed to a single destination. This type typically comprises the bulk of traffic on an Ethernet LAN. Multicast refers to a single transmission sent to a group of users. Broadcast packets are sent to all nodes within a single network segment and can be a major source of congestion. As you stated, broadcast traffic may be a problem, but if you run a packet capture, you will find out if that is the problem.
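The three frame types can be told apart from the destination MAC address alone. A minimal sketch follows; the function name `classify_destination` is made up for illustration. The all-ones address is the Ethernet broadcast address, and any other address with the lowest bit of the first octet set is a group (multicast) address.

```python
def classify_destination(mac):
    """Return 'broadcast', 'multicast' or 'unicast' for a destination MAC."""
    octets = [int(part, 16) for part in mac.replace("-", ":").split(":")]
    if len(octets) != 6 or any(o > 0xFF for o in octets):
        raise ValueError(f"not a 48-bit MAC address: {mac!r}")
    if all(o == 0xFF for o in octets):
        return "broadcast"
    # The group bit is the least significant bit of the first octet.
    if octets[0] & 0x01:
        return "multicast"
    return "unicast"


print(classify_destination("ff:ff:ff:ff:ff:ff"))  # broadcast
print(classify_destination("01:00:5e:00:00:fb"))  # multicast
print(classify_destination("00:1a:2b:3c:4d:5e"))  # unicast
```

Counting captured frames by this classification is a quick way to see whether broadcast traffic really dominates during a slowdown.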
The easiest and simplest method of determining the problem is to capture traffic and inspect it. A sniffer will do the job relatively quickly. However, making sure you have captured all the traffic is the key to success.
The quickest method to ensure that you are capturing all the traffic, in the absence of more sophisticated tools, is to configure a span port on a primary switch and attach a network sniffer or analyzer to the destination span port. The span port is important because there is a fundamental difference between a switch and a hub. When a hub receives a packet on a port, it sends a copy of that packet to all ports on the hub except the one it was received on. This is very inefficient, but if your sniffer is attached to a hub, it will see all the traffic traversing the hub.
A switch, from the moment it boots up, begins to build a Layer 2 forwarding table based on the source MAC address of the packets received. The switch forwards traffic destined for a MAC address directly to the corresponding port, not to all ports. This is very efficient, but the sniffer will not capture all the packets if you haven't configured a local span port on the primary switch. It's also important to configure RSPAN (remote SPAN) ports on the secondary switches and configure a special VLAN to carry the traffic to the destination span port where your sniffer will be attached. Not all switches support RSPAN, so you may have to manually go from switch to switch to run a packet capture.
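The hub-versus-switch difference can be modeled in a few lines. This is a simplified sketch, not how any particular Catalyst model is implemented: the class `LearningSwitch`, the function `hub_receive` and the port numbers are hypothetical, and table aging, VLANs and span ports are left out.

```python
BROADCAST = "ff:ff:ff:ff:ff:ff"


def hub_receive(port_count, in_port):
    """A hub repeats every frame out of all ports except the ingress port."""
    return [p for p in range(port_count) if p != in_port]


class LearningSwitch:
    """Toy Layer 2 switch that learns source MACs and forwards by table."""

    def __init__(self, port_count):
        self.ports = list(range(port_count))
        self.mac_table = {}  # source MAC -> port it was last seen on

    def receive(self, in_port, src_mac, dst_mac):
        """Return the list of ports the frame is sent out of."""
        self.mac_table[src_mac] = in_port  # learn where the sender lives
        out_port = self.mac_table.get(dst_mac)
        if dst_mac == BROADCAST or out_port is None:
            # Broadcast or unknown destination: flood like a hub.
            return [p for p in self.ports if p != in_port]
        if out_port == in_port:
            return []  # destination is on the same port; nothing to forward
        return [out_port]


switch = LearningSwitch(4)  # assume the sniffer sits on port 3
print(switch.receive(0, "aa:aa:aa:aa:aa:aa", "bb:bb:bb:bb:bb:bb"))  # [1, 2, 3]
print(switch.receive(1, "bb:bb:bb:bb:bb:bb", "aa:aa:aa:aa:aa:aa"))  # [0]
print(switch.receive(0, "aa:aa:aa:aa:aa:aa", "bb:bb:bb:bb:bb:bb"))  # [1]
print(hub_receive(4, 0))                                            # [1, 2, 3]
```

Once both hosts have been learned, the sniffer on port 3 no longer sees their conversation, which is why a span port is needed on a switch.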
Once you have correctly configured the span ports and have attached the network sniffer, you can inspect the results of the packet capture for large packet size, or for an unusually high number of small packets, unusual traffic, bandwidth utilization, etc. The results will give you the information you need, such as source and destination IP address, type of traffic and size of packets. Setting up the monitoring environment is another topic to consider, including which product to use (hardware- or software-based), how often to run the packet capture and how much disk space to devote to the effort. There are plenty of products on the market.
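As a rough illustration of that inspection step, the sketch below ranks sources by bytes sent. It assumes the capture has already been exported as (source IP, destination IP, packet length) records; the record layout, the function name `top_talkers` and the sample addresses are assumptions made for the example, not the output format of any specific sniffer.

```python
from collections import defaultdict


def top_talkers(records, limit=5):
    """Rank source addresses by total bytes sent.

    records is an iterable of (src_ip, dst_ip, length_in_bytes) tuples.
    Returns a list of (src_ip, packet_count, total_bytes), largest first.
    """
    totals = defaultdict(lambda: [0, 0])  # src -> [packets, bytes]
    for src, _dst, length in records:
        totals[src][0] += 1
        totals[src][1] += length
    ranked = sorted(totals.items(), key=lambda item: item[1][1], reverse=True)
    return [(src, pkts, size) for src, (pkts, size) in ranked[:limit]]


sample = [
    ("10.0.0.5", "10.0.0.255", 60),
    ("10.0.0.5", "10.0.0.255", 60),
    ("10.0.0.9", "10.0.0.20", 1500),
    ("10.0.0.5", "10.0.0.255", 60),
]
for src, packets, size in top_talkers(sample):
    print(src, packets, size)
```

Sorting by the packet count instead of the byte total would surface a host that floods the network with many small packets, the other pattern mentioned above.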
Once you have identified what is happening on your network, the real sleuthing begins. I have seen cases where the network appeared to be down when the real issue was an incorrectly configured Microsoft Windows domain controller. Users and systems could not authenticate properly, causing a slowdown of network access and the perception that the network was bogged down. Patching the operating system solved the problem. I have seen another case where a poorly written application (developed using Microsoft Access) spewed bad data onto the network when an employee performed a particular function in the application. The problem was identified by running a packet capture when network performance appeared to degrade, identifying the problem system, interviewing the end user to find out what functions were being invoked, and turning the issue over to the application development support team with a copy of the decoded packet capture. A code fix solved the problem.
Q: I am currently deploying Microsoft BizTalk for a company that I am consulting for. The basic design is as follows: An external company will FTP a PGP-encrypted file to a server in our DMZ. Due to a lack of choices, we are forced to open up file access from our internal BizTalk server to the DMZ server to retrieve the files for processing using BizTalk's file receive function. We have taken a few precautions -- only the BizTalk server is allowed to initiate a connection to the DMZ, and only one port is open.
Unfortunately, there are additional security concerns. The share on the DMZ server must be secured using a local account on the DMZ server. This set of username and password must be used for the file receive function in BizTalk. Granted, it is not the same account, just the same username and password. To compound the problem, the file receive function must have access to the SQL server (which houses the BizTalk databases) and runs under that username and password combination. This combo also must have access to the PGP software and private key set.
What type of risks am I susceptible to? Do you have any suggestions as to what we can do to reduce the security exposure?
Avanade: We've come across the same kind of scenario on a recent project. It sounds like you've effectively mitigated the inherent security risks associated with FTP via the use of PGP. But this still leaves the risks of opening up remote procedure calls (RPC) in your DMZ for the share access as well as using the same username and password combination in three places (FTP, BizTalk and SQL). That exposes you to three other types of attack: brute force (due to the username and password duplication), spoofing and denial of service (uploading an excessive number of files or triggering buffer overflows).
Although reducing the security exposure in situations like this can often be more of a business issue than a technical one, there are a few technical configurations we have used that might help your project as well.
In order to mitigate brute-force attacks, we didn't use share access for BizTalk's file receive function. This meant we didn't have to duplicate the usernames and passwords, and we didn't have to open up RPC. Instead, we used a scheduled FTP "get" process to poll the FTP server directories and place the data into BizTalk's file receive function directories. We also used IPSec filters to lock down every service on the FTP server except for FTP, and we used nonstandard FTP ports. In addition to firewall rules limiting communication between the FTP and BizTalk servers, we also specified endpoint IP addresses for business partners and used separate FTP directories (with strong passwords and nonstandard FTP ports) with strict connection limits and timeouts in order to mitigate spoofing and denial-of-service attacks.
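A scheduled FTP "get" of this kind might look roughly like the sketch below, which uses Python's standard `ftplib`. It is not the process Avanade actually ran: the function names, the staging directory and the delete-after-copy step are assumptions made for illustration. It also assumes the remote directory holds only files and that the staging and receive directories are on the same volume. Some FTP servers report an error rather than an empty list for an empty directory, which a production job would need to handle.

```python
import os
from ftplib import FTP


def fetch_new_files(ftp, remote_dir, staging_dir, receive_dir):
    """Pull every file from remote_dir and hand it to the receive directory.

    Files are downloaded into staging_dir first and moved only when
    complete, so the consumer never sees a half-written file.
    Returns the names of the files that were fetched.
    """
    ftp.cwd(remote_dir)
    fetched = []
    for name in ftp.nlst():
        base = os.path.basename(name)
        staged = os.path.join(staging_dir, base)
        with open(staged, "wb") as handle:
            ftp.retrbinary(f"RETR {name}", handle.write)
        # os.replace is atomic when both paths are on the same volume.
        os.replace(staged, os.path.join(receive_dir, base))
        ftp.delete(name)  # remove from the DMZ server once safely copied
        fetched.append(base)
    return fetched


def poll_once(host, port, user, password, remote_dir, staging_dir, receive_dir):
    """Open one outbound FTP session from the internal side and fetch files."""
    with FTP() as ftp:
        ftp.connect(host, port, timeout=30)  # port may be nonstandard
        ftp.login(user, password)
        return fetch_new_files(ftp, remote_dir, staging_dir, receive_dir)
```

Run from a scheduler on the internal server, `poll_once` keeps the connection outbound-only and uses an FTP credential that has no rights on the BizTalk or SQL servers.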
Q: I would like to be able to keep a database of all known "good" devices and be able to take some sort of action if another device tries to enter my network. I would like to be able to do this via the network and not through domain logon/asset management utilities. Actions to take could include disallow and alert; scan, alert, if clean let on, etc. Are there any products that do this?
Wilson: Very simply, you need an intrusion-detection system (IDS) or an intrusion-prevention system (IPS). The goal is to know your network and be alerted if an unknown or hostile device attaches to it. You may want to automate the response to the unknown device or instead automate an alert about the device and manually respond. In either case, you want to know when a device attaches to your network and you want to be able to control that access at the network layer rather than at the application layer (i.e., domain management).
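At its simplest, the "database of known good devices" is an allowlist that observed devices are checked against. The sketch below assumes devices are identified by MAC address and that the observed (MAC, IP) pairs come from somewhere such as a switch or ARP table export; the function names and sample addresses are hypothetical. MAC addresses can be spoofed, so treat this as an alerting aid rather than an access control.

```python
def normalize_mac(mac):
    """Convert common MAC notations to lowercase colon-separated form."""
    digits = "".join(ch for ch in mac.lower() if ch in "0123456789abcdef")
    if len(digits) != 12:
        raise ValueError(f"not a 48-bit MAC address: {mac!r}")
    return ":".join(digits[i:i + 2] for i in range(0, 12, 2))


def find_unknown_devices(known_good, observed):
    """Return (mac, ip) for every observed device not in the allowlist.

    known_good is an iterable of MAC strings; observed is an iterable of
    (mac, ip) pairs seen on the network.
    """
    allowed = {normalize_mac(mac) for mac in known_good}
    unknown = []
    for mac, ip in observed:
        normalized = normalize_mac(mac)
        if normalized not in allowed:
            unknown.append((normalized, ip))
    return unknown


known = ["00:1A:2B:3C:4D:5E"]
seen = [("001a.2b3c.4d5e", "10.0.0.5"), ("de-ad-be-ef-00-01", "10.0.0.77")]
print(find_unknown_devices(known, seen))  # [('de:ad:be:ef:00:01', '10.0.0.77')]
```

The returned list is where an alert or an automated response would be attached.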
An IDS is composed of sensors or agents that monitor data sources, apply some type of detection algorithm (signature- or anomaly-based) and initiate responses or alerts to events detected. Generally, there is a management system that allows for monitoring and analysis, along with system configuration functionality. There are approaches that are host-based, network-based and a combination of both.
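The two detection styles can be contrasted in a small sketch. Both functions are deliberately naive and are not how any particular IDS product works: the signature name and byte pattern are invented for the example, and the anomaly test is a plain "mean plus k standard deviations" threshold over a baseline of packets-per-second samples.

```python
from statistics import mean, pstdev


def signature_match(payload, signatures):
    """Signature-based: return names of known byte patterns found in payload."""
    return [name for name, pattern in signatures.items() if pattern in payload]


def is_anomalous(baseline, current, k=3.0):
    """Anomaly-based: flag a rate far above what the baseline considers normal."""
    threshold = mean(baseline) + k * pstdev(baseline)
    return current > threshold


# Hypothetical signature set and traffic samples.
signatures = {"example-cmd-exe-probe": b"cmd.exe"}
print(signature_match(b"GET /scripts/cmd.exe HTTP/1.0", signatures))

baseline_pps = [100, 110, 90, 105, 95]
print(is_anomalous(baseline_pps, 115))   # False
print(is_anomalous(baseline_pps, 5000))  # True
```

Signature matching only catches what it has a pattern for, while the anomaly check can flag something new at the cost of false alarms when normal traffic changes.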
There is also a network intrusion-detection system (NIDS), which is an IDS with a network focus. The strategically placed sensors sniff network traffic on an assigned segment and report back to a central database engine whose job it is to collect and correlate data. A paper by Alan McCarty titled "Distributed NIDS: A How-to Guide," published in the SANS Institute's Information Security Reading Room, is excellent, though lengthy. His solution is entirely open-source (Linux, Snort, MySQL, ACID).
IPSs are a relatively new idea. They are in-line systems that drop unwanted packets in real time. The primary problem with an IPS is the risk that it could block wanted traffic. The appliance is only as good as its configuration. Internet Security Systems Inc. has upgraded its product offering in the Proventia Enterprise Protection suite.
But getting back to your question, an IDS/NIDS/IPS system will provide a database of "good" devices and will either alert you to, or block packets from, an unknown or "bad" system that attaches to your network. You stated, "Actions to take could include disallow and alert; scan, alert, if clean let on, etc." If you want to manage this at the network layer, you need to have an NIDS- or IPS-type solution to "disallow and alert." Scanning and cleaning is an activity that happens at the application layer, and there are a number of products that will allow you to manage systems in this way -- GFI LANguard Network Security Scanner or HFNetChk, for example.
There's no single engine today that does it all and does it well. What you really want is something that protects your network from Layers 1-7 and is fully redundant and easily configured, managed and administered. The best advice is to develop a "defense in depth" strategy, which is a layered approach to network and security management.
Our expert panel:
Vince Tuesday and Mathias Thurman write Computerworld's weekly feature Security Manager's Journal. They are real security managers, but their names have been disguised for obvious reasons (their journals report in detail on security issues within their organizations).
Steven Hofmeyr, recently named one of MIT's top 100 innovators under 35, is founder and CTO of Sana Security Inc. He has spent years researching the analogy between computer security and the human immune system and has worked with top CIOs in organizations such as the U.S. Air Force, Federal Aviation Administration, RSA Inc. and Merrill Lynch & Co.
Avanade Inc.'s responses were contributed by Christopher Burry, technology infrastructure practice director and Avanade fellow; Rick Birkenstock, Western region technology infrastructure practice director; Ryan McCune, MCSE in the technology infrastructure practice; and David Bleecker, senior systems engineer. Avanade is a Seattle-based integrator for Microsoft technology that's a joint venture of Accenture Ltd. and Microsoft.
Marcia J. Wilson holds the CISSP designation and is the founder and CEO of Wilson Secure LLC, a company focused on providing independent network security auditing and risk analysis. She can be reached at marcia@wilsonsecure.com.
Douglas Schweitzer is an Internet security specialist with a focus on malicious code. He is the author of several books, including Internet Security Made Easy, Securing the Network from Malicious Code and the recently released Incident Response: Computer Forensics Toolkit.