Extract from a presentation by Luca "ntop" Deri
With todays fast networks packet capture has become a problem.
The problem is two fold, first the packets have to be captured, and second there must be spare cpu cycles for analyzing the packets.
The tool of the trade is the pcap library which provides a unified interface to packet capture. It presents the same API on every OS while it is highly customized on the machine side.
The problem is that pcap performance is not very performant. Especially in problem cases when denial of service attacks happen pcap is not able to capture the traffic because it does not deliver the necessary performance.
There are specialized cards available for packet capture but they do not have pulic APIs and are very expensive.
Lucas tests have shown that normal Linux packet capture is really the worst among Linux (0.2%) , FreeBSD (34%) and Windows (68%).
There have been various attempts to improve this, but even the best solutions were only able to capture (11.7%) of the traffic in the worst case.
In essence this mean using Linux for packet capture is a very bad idea. FreeBSD is better by almost a magnitude.
Lucas idea was to create a special package capture architecture for Linux by providing a new socket type Socket Packet Ring (PF_RING). PF_RING provides a ring buffer in memory for each socket. The application can then read from the ring buffer with mmap. The socket has facilities to record the fact that it had to overwrite a packet in the ring before it was read. This decouples the application from the kernel and improves performance substantially. This implementation is network driver neutral and quite fast (47%) still a bit slower than FreeBSD and still over 50% of the traffic lost in the worst case.
The interesting thing at that point was that the CPU was running at 30% in that situation and loosing 50% of the traffic. Luca found that disabling and enabling the interrupts in the kernel were preventing it from going into packet capturing mode fast enough. The rtirq patch was able to solve this last problem. Luca ended up with a System that was as fast as FreeBSD in capturing packets but with much more CPU to spare.
This solution is about twice as fast as commercial netflow capturing probe selling for much higher prices than the hardware cost for running Lucas solution.
Luca has been investigating ways to further improve performance and found that Gigabit Ethernet drivers on Linux could be programmed much more efficiently by exploiting the cards local packet buffers. A second issue is that Linux does memory allocation and de-allocation whenever it reads from the network card which takes a long time.
Luca has published his work in a project called nCap which provides a accelerated variant of libpcap that lets you recompile your old libpcap applications to reach much better speed.
In an attempt to further improve performance Luca has created a custom gigabit Ethernet card driver that programs the Ethernet card to make its traffic data available directly in the computers memory, freeing the CPU totally from this task, letting it work on traffic analysis exclusively. Luca calls this new method 'straight capture'. This method gives you traffic capture at device speed. With the limitation that only one application per card.