5/2/2014
Understanding Linux Network Device Driver and NAPI Mechanism
Xinying Wang, Cong Xu
CS 423 Project
Outline
• Ethernet Introduction
o Ethernet Frame
o MAC address
• Linux Network Driver
o Intel e1000 driver
o Important data structures
• Ethernet Driver Initialization & Registration
o Initialize net_device
o Set up device operation
o Set up DMA & NAPI
• Interrupt Profiler To Test NAPI
o Profiler implementation
o Experiment results
• Summary & Future Work
Introduction to Ethernet
• A family of computer networking
technologies for local area networks
(LANs)
• Commercially introduced in 1980
and standardized in 1983 as IEEE
802.3.
• The most popular LAN technology, with a good degree of
compatibility
• Features:
o Ethernet frame
o MAC Address
Ethernet frame
• Transported by Ethernet packet (a data packet on an
Ethernet)
• Example of the Ethernet frame structure for data sent through a TCP socket:
Ethernet header | IP header | TCP header | Data | Frame Check Sequence (FCS)
• Ethernet header
o Header: a set of bytes (octets*) prepended to a packet
o Includes the destination MAC address and the source MAC address
• FCS: to detect any in-transit corruption of data
*octet: a group of eight bits
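The header layout above can be sketched in user-space C. This is a minimal sketch, not kernel code: `struct eth_header` and `parse_eth_header()` are hypothetical names, and the two-octet EtherType field that follows the two addresses is added here from the standard 14-octet header layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ETH_ADDR_LEN 6   /* a MAC address is six octets */
#define ETH_HDR_LEN  14  /* destination + source + EtherType */

/* Hypothetical user-space view of an Ethernet header (not a kernel type). */
struct eth_header {
    uint8_t  dst[ETH_ADDR_LEN];  /* destination MAC address */
    uint8_t  src[ETH_ADDR_LEN];  /* source MAC address */
    uint16_t ethertype;          /* payload type, in host byte order */
};

/* Parse the first 14 octets of a frame.
 * Returns 0 on success, -1 if the buffer is too short for a header. */
int parse_eth_header(const uint8_t *frame, size_t len, struct eth_header *out)
{
    assert(frame != NULL && out != NULL);
    if (len < ETH_HDR_LEN)
        return -1;
    memcpy(out->dst, frame, ETH_ADDR_LEN);
    memcpy(out->src, frame + ETH_ADDR_LEN, ETH_ADDR_LEN);
    /* The EtherType is transmitted most significant octet first. */
    out->ethertype = (uint16_t)((frame[12] << 8) | frame[13]);
    return 0;
}
```

Passing a 14-octet buffer whose last two octets are 0x08 0x00 yields an `ethertype` of 0x0800, the value used for IPv4 payloads.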
MAC address
• Media Access Control address
• Often stored in hardware’s
read-only memory
• First three octets:
Organizationally Unique
Identifier (OUI)
• Following octets: assigned freely by
the vendor, as long as each address is unique
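The split of a MAC address into its two halves can be sketched as follows. The function names `mac_oui()` and `mac_vendor_part()` are hypothetical helpers introduced for illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Return the Organizationally Unique Identifier: the first three octets of
 * a six-octet MAC address, packed into the low 24 bits of the result. */
uint32_t mac_oui(const uint8_t mac[6])
{
    assert(mac != NULL);
    return ((uint32_t)mac[0] << 16) | ((uint32_t)mac[1] << 8) | (uint32_t)mac[2];
}

/* Return the last three octets, which the vendor assigns per device. */
uint32_t mac_vendor_part(const uint8_t mac[6])
{
    assert(mac != NULL);
    return ((uint32_t)mac[3] << 16) | ((uint32_t)mac[4] << 8) | (uint32_t)mac[5];
}
```

For an arbitrarily chosen example address 00:1b:21:0a:0b:0c, `mac_oui()` returns 0x001b21 and `mac_vendor_part()` returns 0x0a0b0c.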
Linux network driver
• Linux kernel handles MAC
address resolution.
• Network drivers are still
needed
o Without a driver, the kernel cannot do anything with the network hardware
o Network drivers are different from character drivers and block
drivers
• Intel e1000 driver for Ethernet
adapter
o /drivers/net/ethernet/intel/e1000
Data structure: struct net_device
• Global information
o char name[IFNAMSIZ]:
• The name of the device.
o unsigned long state:
• Device state.
o struct net_device *next:
• Pointer to the next device in the global linked list.
o int (*init)(struct net_device *dev):
• An initialization function.
Data structure: struct net_device
• Hardware information:
o unsigned long rmem_end, rmem_start, mem_end,
mem_start:
• Device memory information.
o unsigned long base_addr:
• The I/O base address of the network interface.
o unsigned char irq:
• The assigned interrupt number.
o unsigned char dma:
• The DMA channel allocated by the device.
Data structure: struct e1000_adapter
• struct net_device *netdev;
o Pointer to net_device struct
• struct pci_dev *pdev;
o Pointer to pci_dev struct
• struct e1000_hw hw;
o An e1000_hw struct
• struct e1000_hw_stats stats;
o Statistics counters collected by the MAC
Data structure: struct e1000_hw
• e1000_mac_type mac_type;
o An enum for currently available devices
• u8 mac_addr[NODE_ADDRESS_SIZE];
o MAC address
• u16 device_id;
o Device identification information
• u16 vendor_id;
o Vendor information
Ethernet driver initialization and registration
• Driver entrance: init_module calls pci_register_driver(&e1000_driver)
o .name = e1000_driver_name,
o .id_table = e1000_pci_tbl,
o .probe = e1000_probe,
o ….
• e1000_probe( )
o Create and initialize struct net_device
o Set up device operation
o Do hardware and software initialization
o Copy out MAC address from EEPROM
o Set up DMA and NAPI
o Set up timers
o Register device
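The registration pattern (a driver structure with a name, an ID table, and a probe callback) can be modeled in user space. This is only a sketch under assumptions: every `mock_` type and function is hypothetical and stands in for the kernel's PCI core, and the vendor/device ID pair is an illustrative value.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel's PCI types (not the real API). */
struct mock_pci_id  { uint16_t vendor, device; };
struct mock_pci_dev { uint16_t vendor, device; int bound; };

struct mock_pci_driver {
    const char *name;
    const struct mock_pci_id *id_table;      /* terminated by {0, 0} */
    int (*probe)(struct mock_pci_dev *dev);
};

/* Probe callback. The real e1000_probe() would create the net_device, read
 * the MAC address, set up DMA and NAPI, and register the device. */
int mock_probe(struct mock_pci_dev *dev)
{
    assert(dev != NULL);
    dev->bound = 1;
    return 0;
}

/* Offer one device to a driver and call probe() only if an ID matches.
 * Returns probe()'s result, or -1 when the driver does not claim it. */
int mock_bus_match(const struct mock_pci_driver *drv, struct mock_pci_dev *dev)
{
    const struct mock_pci_id *id;

    assert(drv != NULL && dev != NULL);
    for (id = drv->id_table; id->vendor != 0; id++)
        if (id->vendor == dev->vendor && id->device == dev->device)
            return drv->probe(dev);
    return -1;
}

/* Illustrative ID table and driver instance. */
const struct mock_pci_id mock_ids[] = { {0x8086, 0x100E}, {0, 0} };
const struct mock_pci_driver mock_driver = {
    .name = "mock_e1000", .id_table = mock_ids, .probe = mock_probe,
};
```

A device whose IDs appear in `mock_ids` gets probed and bound; any other device is left untouched, which mirrors why probe runs once per matching adapter.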
Initialize struct net_device
• Initialization is done by calling the macro alloc_etherdev(sizeof_priv)
• Track down to function
struct net_device *alloc_netdev_mq(int sizeof_priv, const char *name, void (*setup)(struct net_device *), unsigned int queue_count) in net/core/dev.c
• What does this function do?
o Create a net_device struct
o Allocate kernel memory for packet receiving and transmitting queues
o Initiate lists: ethtool_ntuple_list, napi_list, unreg_list, link_watch_list
o Return a pointer to the new struct
Set up device operation
• It is defined in struct net_device_ops
• What does device operation do?
o Open
o Close
o Get system network statistics
o Configure hardware for unicast or multicast
o Set transmission time-out
o Change MTU
o I/O control
o Validate Ethernet address
o Change Ethernet address
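The idea of an operations table can be sketched in user space as a struct of function pointers. This is a reduced, hypothetical model: `struct mock_netdev_ops` has only three of the operations listed above, and the MTU range check (68 to 1500) is an assumption made for the example.

```c
#include <assert.h>
#include <stddef.h>

struct mock_netdev;

/* Hypothetical, reduced operations table in the spirit of net_device_ops. */
struct mock_netdev_ops {
    int (*open)(struct mock_netdev *dev);
    int (*stop)(struct mock_netdev *dev);
    int (*change_mtu)(struct mock_netdev *dev, int new_mtu);
};

struct mock_netdev {
    int up;                              /* 1 while the interface is open */
    int mtu;                             /* current MTU in bytes */
    const struct mock_netdev_ops *ops;   /* driver-provided operations */
};

int mock_open(struct mock_netdev *dev) { dev->up = 1; return 0; }
int mock_stop(struct mock_netdev *dev) { dev->up = 0; return 0; }

/* Reject MTUs outside an illustrative range (an assumption of this sketch). */
int mock_change_mtu(struct mock_netdev *dev, int new_mtu)
{
    if (new_mtu < 68 || new_mtu > 1500)
        return -1;
    dev->mtu = new_mtu;
    return 0;
}

const struct mock_netdev_ops mock_ops = {
    .open = mock_open, .stop = mock_stop, .change_mtu = mock_change_mtu,
};

/* The core calls through the table without knowing which driver is behind it. */
int core_set_mtu(struct mock_netdev *dev, int new_mtu)
{
    assert(dev != NULL && dev->ops != NULL);
    if (dev->ops->change_mtu == NULL)
        return -1;                       /* operation not supported */
    return dev->ops->change_mtu(dev, new_mtu);
}
```

The design choice is indirection: the networking core only sees the table, so each driver can supply its own implementation of every operation.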
Hardware and software initialization
• Hardware initialization
Initialize members of the hw struct; read the vendor ID, device
ID, and subsystem ID; identify the MAC type; set the MTU size.
• Software initialization
This is done after hardware initialization; initialize general
software structures (struct e1000_adapter)
Set up DMA and NAPI
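• What is NAPI and why do we need NAPI?
o NAPI ("New API") combines interrupts with polling: under heavy traffic the driver disables receive interrupts and the kernel polls the device, which lowers interrupt overhead
• Allocate buffer skb
o e1000_rx_ring
• Remap DMA
o dma_map_single()
• NAPI add
o netif_napi_add(struct net_device *dev, struct napi_struct *napi, int (*poll)(struct napi_struct *, int), int weight)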
How a packet is received
• Received a packet: is NAPI enabled?
o No: normal interrupt handling of the packet
o Yes: the interrupt handler e1000_intr() disables the IRQ and schedules NAPI polling
• The e1000 driver is NAPI-enabled
NAPI implementation
• In interrupt handler function e1000_intr()
o Make sure the net device is working properly: netif_rx_schedule_prep()
o Add net device into poll list: __netif_rx_schedule() -> napi_schedule()
o __raise_softirq_irqoff(NET_RX_SOFTIRQ) for switching to bottom half
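The top-half steps can be modeled in user space to show why a second interrupt does not queue the device twice. This is a sketch under assumptions: all `mock_` names are hypothetical, a plain boolean replaces the kernel's atomic state bit, and the device is pushed to the head of the list for brevity.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space model; none of these are kernel types. */
struct mock_napi {
    bool scheduled;            /* models the "already scheduled" state bit */
    bool irq_enabled;          /* models the device's receive interrupt */
    struct mock_napi *next;    /* link in the poll list */
};

struct mock_softnet {
    struct mock_napi *poll_list;
    bool rx_softirq_pending;   /* models a raised NET_RX_SOFTIRQ */
};

/* Test-and-set: true only for the caller that wins the right to schedule. */
bool mock_schedule_prep(struct mock_napi *n)
{
    if (n->scheduled)
        return false;
    n->scheduled = true;
    return true;
}

/* Put the device on the poll list and request the bottom half. */
void mock_schedule(struct mock_softnet *sd, struct mock_napi *n)
{
    n->next = sd->poll_list;
    sd->poll_list = n;
    sd->rx_softirq_pending = true;
}

/* Top half: do the minimum, then defer packet processing to polling. */
void mock_intr(struct mock_softnet *sd, struct mock_napi *n)
{
    assert(sd != NULL && n != NULL);
    if (mock_schedule_prep(n)) {
        n->irq_enabled = false;   /* no further RX interrupts while polling */
        mock_schedule(sd, n);
    }
}
```

Calling `mock_intr()` twice in a row leaves exactly one entry on the poll list, because the second call fails the prep check.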
NAPI implementation
• Bottom half function net_rx_action()
o Set budget
o Work > budget or process time > time limit (2 jiffies in kernel 3.13)?
• Yes: return
• No: calculate weight for device, call poll function e1000_clean(), move device to the list end
NAPI implementation
• Poll function e1000_clean()
o Clean_rx()
o Work_done < weight?
• Yes: remove device from list, then return work_done
• No: return work_done
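The interplay of budget, weight, and work_done can be modeled in user space. This is a simplified sketch, not the kernel's code: the `mock_` names are hypothetical, only one device is modeled, the time limit is left out, and the values 64 (weight) and 300 (budget) used in the note below are illustrative.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of one NAPI-enabled device. */
struct mock_napi {
    int pending;        /* packets waiting in the RX ring */
    int weight;         /* max packets one poll call may process */
    bool on_list;       /* still scheduled for polling */
    bool irq_enabled;   /* receive interrupt state */
};

/* Poll function in the role of e1000_clean(): process up to `budget`
 * packets; if the ring drained early, leave polling mode. */
int mock_poll(struct mock_napi *n, int budget)
{
    int work_done = n->pending < budget ? n->pending : budget;

    n->pending -= work_done;
    if (work_done < budget) {
        n->on_list = false;      /* remove device from the poll list */
        n->irq_enabled = true;   /* go back to interrupt mode */
    }
    return work_done;
}

/* Bottom half in the role of net_rx_action(): poll until the device leaves
 * the list or the overall budget is used up. Returns packets processed. */
int mock_rx_action(struct mock_napi *n, int budget)
{
    int total = 0;

    assert(n != NULL && n->weight > 0);
    while (n->on_list && budget > 0) {
        int work = mock_poll(n, n->weight);

        total += work;
        budget -= work;
    }
    return total;
}
```

With 150 pending packets, a weight of 64, and a budget of 300, the device is polled three times and then returns to interrupt mode. With 1000 pending packets the loop stops once the budget is exhausted and the device stays on the list with interrupts still disabled.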
Experiment
• An experiment was designed to test the NAPI mechanism
• An interrupt profiler was designed to record the interrupt counts
over a specified period
• Linux kernel 3.13.6 was used for the experiment
• Experiment platform: CPU: Intel Core i5 dual-core, 2.53 GHz;
Memory: 4 GB; Network card: Intel 82577 Gigabit card
Profiler implementation
[Diagram: the kernel built-in network driver maintains `unsigned long Interrupt_counter;`. A profiler kernel module runs a monitor work queue whose handler function fills the profiler buffer. A monitor process accesses the buffer with mmap() through a character device driver interface.]
Results
[Figure: interrupts while streaming YouTube video. Left panel: accumulated interrupts, interrupt (counts) vs. time (s). Right panel: differentiated interrupts, interrupts/second vs. time (s).]
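The differentiated curve follows from the accumulated one by taking the difference between consecutive samples and dividing by the sampling period. A minimal sketch, assuming the counter is sampled at a fixed period; `differentiate_counts()` is a hypothetical helper, not part of the profiler.

```c
#include <assert.h>
#include <stddef.h>

/* Convert accumulated interrupt counts, sampled every `period_s` seconds,
 * into interrupts per second. Writes n - 1 values to `rate` and returns
 * that count, or 0 if there are fewer than two samples. */
size_t differentiate_counts(const unsigned long *acc, size_t n,
                            double period_s, double *rate)
{
    size_t i;

    assert(acc != NULL && rate != NULL && period_s > 0.0);
    if (n < 2)
        return 0;
    for (i = 1; i < n; i++)
        rate[i - 1] = (double)(acc[i] - acc[i - 1]) / period_s;
    return n - 1;
}
```

A flat stretch in the accumulated counts shows up as a rate of zero, so bursts of network activity stand out more clearly in the differentiated view.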
Summary and future work
• The Linux network device driver was analyzed based on the Intel e1000
driver source code.
• The mechanism and implementation of NAPI were detailed.
• An experiment was designed to further understand the NAPI
mechanism.
• Future work: a more thorough understanding of the Linux network device
driver could be gained by analyzing more sub-functions.
Q&A
Thanks!