More 82573L details
Getting ready to write and test a character-mode device-driver for our anchor-LAN’s ethernet controllers

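A ‘nic.c’ character driver?
[Figure: structure of the ‘nic.c’ module]
my_fops (the driver’s file-operations table):
    ioctl     ->  my_ioctl()
    open      ->  my_open()
    read      ->  my_read()
    write     ->  my_write()
    release   ->  my_release()
Other functions: my_isr(), module_init(), module_exit()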
Statistics registers
• The 82573L has several dozen statistical
counters which automatically operate to
keep track of significant events affecting
the ethernet controller’s performance
• Most are 32-bit ‘read-only’ registers, and
they are automatically cleared when read
• Your module’s initialization routine could
read them all (to start counting from zero)
Initializing the nic’s counters
• The statistical counters all have address-offsets in the range 0x04000 – 0x04FFF
• You can use a very simple program-loop to
‘clear’ each of these read-only registers
// Here ‘io’ is the virtual base-address of the nic’s i/o-memory region
{
        int     r;

        // clear all of the Pro/1000 controller’s statistical counters
        for (r = 0x4000; r < 0x4FFF; r += 4) ioread32( io + r );
}
A few ‘counter’ examples
offset    name       description
0x4000    CRCERRS    CRC Errors Count
0x400C    RXERRC     Receive Error Count
0x4014    SCC        Single Collision Count
0x4018    ECOL       Excessive Collision Count
0x4074    GPRC       Good Packets Received
0x4078    BPRC       Broadcast Packets Received
0x407C    MPRC       Multicast Packets Received
0x40D0    TPR        Total Packets Received
0x40D4    TPT        Total Packets Transmitted
0x40F0    MPTC       Multicast Packets Transmitted
0x40F4    BPTC       Broadcast Packets Transmitted
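Because these counters clear themselves when read, a driver that wants running totals has to accumulate each reading in software. The sketch below illustrates that idea under stated assumptions: the struct `nic_totals`, the callback type `reg_read_fn`, and the `fake_nic` stand-in are hypothetical names invented for this example, and in a real module the callback would wrap `ioread32( io + offset )`.

```c
#include <stdint.h>

/* Offsets copied from the counter table above */
enum { NIC_CRCERRS = 0x4000, NIC_GPRC = 0x4074, NIC_TPR = 0x40D0, NIC_TPT = 0x40D4 };

/* Hypothetical software-side totals (64-bit, so they will not wrap soon) */
struct nic_totals {
    uint64_t crc_errors;
    uint64_t good_rx;
    uint64_t total_rx;
    uint64_t total_tx;
};

/* Hypothetical register-read callback; a driver would wrap ioread32() here */
typedef uint32_t (*reg_read_fn)(void *ctx, uint32_t offset);

/* Each read returns the events since the previous read, so we add it on */
void nic_update_totals(struct nic_totals *t, reg_read_fn rd, void *ctx)
{
    t->crc_errors += rd(ctx, NIC_CRCERRS);
    t->good_rx    += rd(ctx, NIC_GPRC);
    t->total_rx   += rd(ctx, NIC_TPR);
    t->total_tx   += rd(ctx, NIC_TPT);
}

/* Stand-in for the hardware, used only to try the idea in user space */
struct fake_nic { uint32_t counter[0x1000 / 4]; };

uint32_t fake_read(void *ctx, uint32_t offset)
{
    struct fake_nic *nic = ctx;
    uint32_t *slot = &nic->counter[(offset - 0x4000) / 4];
    uint32_t value = *slot;
    *slot = 0;                      /* imitate the clear-on-read behavior */
    return value;
}
```

Calling `nic_update_totals()` periodically (for example from the driver's read or ioctl path) keeps the totals current without losing any counts between calls.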
Ethernet packet layout
• Total size normally can vary from 64 bytes
up to 1536 bytes (unless ‘jumbo’ packets
and/or ‘undersized’ packets are enabled)
• The NIC expects a 14-byte packet ‘header’
and it appends a 4-byte CRC check-sum
offset   field                               size
0        destination MAC address             6 bytes
6        source MAC address                  6 bytes
12       Type/length                         2 bytes
14       the packet’s data ‘payload’         usually varies from 46 to 1500 bytes
(end)    Cyclic Redundancy Checksum          4 bytes
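The header layout above maps directly onto a small C structure. The following is a minimal sketch, not code from the course: `struct eth_header` and `eth_build_frame()` are hypothetical names, and the function leaves out the 4-byte CRC because the NIC appends it.

```c
#include <stdint.h>
#include <stddef.h>
#include <string.h>

/* The 14-byte header: only byte arrays, so the compiler adds no padding */
struct eth_header {
    uint8_t dst[6];        /* destination MAC address          */
    uint8_t src[6];        /* source MAC address               */
    uint8_t type_len[2];   /* Type/length, big-endian on wire  */
};

/* Copies header and payload into 'frame'; returns the number of bytes
 * written, or 0 if the buffer is too small.  No CRC is written here. */
size_t eth_build_frame(uint8_t *frame, size_t cap,
                       const uint8_t dst[6], const uint8_t src[6],
                       uint16_t type_len,
                       const void *payload, size_t payload_len)
{
    struct eth_header hdr;

    if (cap < sizeof hdr + payload_len)
        return 0;
    memcpy(hdr.dst, dst, 6);
    memcpy(hdr.src, src, 6);
    hdr.type_len[0] = (uint8_t)(type_len >> 8);     /* high byte first */
    hdr.type_len[1] = (uint8_t)(type_len & 0xFF);
    memcpy(frame, &hdr, sizeof hdr);
    memcpy(frame + sizeof hdr, payload, payload_len);
    return sizeof hdr + payload_len;
}
```

Storing the Type/length field as two bytes avoids any dependence on the host's byte order.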
Filter registers
• All the modern ethernet controllers have a
built-in ‘filtering’ capability which allows the
NIC to automatically discard any packets
having a destination-address different from
the controller’s own unique MAC address
• But the 82573L offers a more elaborate
filtering mechanism (and can also ‘reject’
packets based on the ‘source’ addresses)
How ‘receive’ works
• We setup memory-buffers where we want
received packets to be placed by the NIC
• We also create a list of buffer-descriptors
and inform the NIC of its location and size
• Then, when ready, we tell the NIC to ‘Go!’
(i.e., start receiving), but to let us know
when these receptions have occurred
[Figure: Random Access Memory holding a List of Buffer-Descriptors (descriptor0 through descriptor3) alongside the packet-buffers Buffer0 through Buffer3]
Receive Control (0x0100)
bits     field
31       R (=0)
30-27    FLXBUF
26       SECRC
25       BSEX
24       R (=0)
23       PMCF
22       DPF
21       R (=0)
20       CFI
19       CFIEN
18       VFE
17-16    BSIZE
15       BAM
14       R (=0)
13-12    MO
11-10    DTYP
9-8      RDMTS
7-6      LBM
5        LPE
4        MPE
3        UPE
2        SBP
1        EN
0        R (=0)
EN = Receive Enable
SBP = Store Bad Packets
UPE = Unicast Promiscuous Enable
MPE = Multicast Promiscuous Enable
LPE = Long Packet reception Enable
LBM = Loopback Mode
RDMTS = Rx-Descriptor Minimum Threshold Size
DTYP = Descriptor Type
MO = Multicast Offset
BAM = Broadcast Accept Mode
BSIZE = Receive Buffer Size
VFE = VLAN Filter Enable
CFIEN = Canonical Form Indicator Enable
CFI = Canonical Form Indicator bit-value
DPF = Discard Pause Frames
PMCF = Pass MAC Control Frames
BSEX = Buffer Size Extension
SECRC = Strip Ethernet CRC
FLXBUF = Flexible Buffer size
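One way to use the field list above is to give each bit a named mask and build the register value from those. This is a sketch under assumptions: the `RCTL_*` macro names and `rctl_basic_config()` are invented here, the bit positions are read from the register diagram above, and the claim that BSIZE code 0 (with BSEX clear) selects 2048-byte buffers is our recollection of Intel's datasheet and should be checked there.

```c
#include <stdint.h>

/* Bit positions as read from the Receive Control diagram (names are ours) */
#define RCTL_EN      (1u << 1)    /* Receive Enable               */
#define RCTL_SBP     (1u << 2)    /* Store Bad Packets            */
#define RCTL_UPE     (1u << 3)    /* Unicast Promiscuous Enable   */
#define RCTL_MPE     (1u << 4)    /* Multicast Promiscuous Enable */
#define RCTL_LPE     (1u << 5)    /* Long Packet reception Enable */
#define RCTL_BAM     (1u << 15)   /* Broadcast Accept Mode        */
#define RCTL_BSEX    (1u << 25)   /* Buffer Size Extension        */
#define RCTL_SECRC   (1u << 26)   /* Strip Ethernet CRC           */
#define RCTL_BSIZE(code)  (((uint32_t)(code) & 3u) << 16)   /* bits 17-16 */

/* Builds a plain receive configuration with EN deliberately left clear,
 * so that the descriptor ring can be programmed before reception starts.
 * BSIZE code 0 is assumed here to mean 2048-byte buffers. */
uint32_t rctl_basic_config(int promiscuous)
{
    uint32_t rctl = RCTL_BAM | RCTL_SECRC | RCTL_BSIZE(0);

    if (promiscuous)
        rctl |= RCTL_UPE | RCTL_MPE;
    return rctl;
}
```

A driver would write this value to offset 0x0100 first, and later write the same value with `RCTL_EN` OR-ed in once the receive ring is ready.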
Registers’ Names
Memory-information registers
RDBA(L/H) = Receive-Descriptor Base-Address Low/High (64-bits)
RDLEN = Receive-Descriptor array Length
RDH = Receive-Descriptor Head
RDT = Receive-Descriptor Tail
Receive-engine control registers
RXDCTL = Receive-Descriptor Control Register
RCTL = Receive Control Register
Notification timing registers
RDTR = Receive-interrupt packet Delay Timer
RADV = Receive-interrupt Absolute Delay Value
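The memory-information registers are typically programmed as a group before reception is enabled. The sketch below shows one plausible order, with several assumptions: the offsets other than RCTL (0x0100) and RXDCTL (0x2828) do not appear in these slides and are quoted from our recollection of Intel's developer manual, so verify them there; `reg_write()` and `rx_ring_program()` are hypothetical helpers; and a plain array stands in for the NIC's register window, where a real driver would call `iowrite32( value, io + offset )`.

```c
#include <stdint.h>

/* Register offsets; only RCTL and RXDCTL are confirmed by these slides,
 * the others are assumptions to be checked against Intel's manual */
enum {
    NIC_RCTL   = 0x0100,
    NIC_RDBAL  = 0x2800,
    NIC_RDBAH  = 0x2804,
    NIC_RDLEN  = 0x2808,
    NIC_RDH    = 0x2810,
    NIC_RDT    = 0x2818,
    NIC_RDTR   = 0x2820,
    NIC_RXDCTL = 0x2828,
    NIC_RADV   = 0x282C
};

/* Stand-in for a memory-mapped write: 'regs' is an ordinary array here */
static inline void reg_write(uint32_t *regs, uint32_t offset, uint32_t value)
{
    regs[offset / 4] = value;
}

/* Tells the NIC where the descriptor array lives and how big it is.
 * 'ring_phys' is the array's physical address; 16 bytes per descriptor. */
void rx_ring_program(uint32_t *regs, uint64_t ring_phys, uint32_t n_desc)
{
    reg_write(regs, NIC_RDBAL, (uint32_t)(ring_phys & 0xFFFFFFFFu));
    reg_write(regs, NIC_RDBAH, (uint32_t)(ring_phys >> 32));
    reg_write(regs, NIC_RDLEN, n_desc * 16u);
    reg_write(regs, NIC_RDH, 0);
    /* One common choice: hand over every descriptor except the last */
    reg_write(regs, NIC_RDT, n_desc - 1);
}
```

Setting the tail one short of a full ring is a design choice, not a requirement stated in the slides; it keeps head and tail from being equal, which the hardware would read as "no descriptors available".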
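Rx-Desc Ring-Buffer
[Figure: the descriptor array drawn as a circular buffer of 16-byte slots at offsets 0x00 through 0x80. RDBA holds the base-address, RDLEN the array’s length (in bytes), and RDH (head) and RDT (tail) mark the boundary between slots owned by hardware (nic) and slots owned by software (cpu).]
Circular buffer (128-bytes minimum)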
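The head and tail indices wrap around the end of the array, so the bookkeeping is modular arithmetic. The helpers below are a sketch with hypothetical names. They assume 16-byte descriptors, a ring length that is a multiple of 128 bytes, and the usual convention that the hardware owns the slots from the head up to (but not including) the tail.

```c
#include <stdint.h>

#define RX_DESC_BYTES 16u   /* size of one legacy receive descriptor */

/* Assumed rule: the ring's byte length is a nonzero multiple of 128,
 * which means descriptors come in groups of eight */
int rdlen_is_valid(uint32_t rdlen_bytes)
{
    return rdlen_bytes != 0 && rdlen_bytes % 128u == 0;
}

/* Advances an index by one slot, wrapping at the end of the array */
uint32_t ring_next(uint32_t index, uint32_t n_desc)
{
    return (index + 1u) % n_desc;
}

/* Number of slots the hardware may still fill: head up to, not including,
 * tail.  Equal head and tail means the hardware has nothing left to use. */
uint32_t ring_hw_owned(uint32_t head, uint32_t tail, uint32_t n_desc)
{
    return (tail + n_desc - head) % n_desc;
}
```

After the driver has consumed a received packet and recycled its buffer, it would write `ring_next(tail, n_desc)` to RDT to return that slot to the hardware.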
Rx-Descriptor Control (0x2828)
bits     field
31-25    R (=0)
24       GRAN
23-22    R (=0)
21-16    WTHRESH (Writeback Threshold)
15-14    R (=0)
13-8     HTHRESH (Host Threshold)
7-6      R (=0)
5-0      PTHRESH (Prefetch Threshold)
Prefetch Threshold – A prefetch operation is considered when the number of valid, but unprocessed,
receive descriptors that the ethernet controller has in its on-chip buffer drops below this threshold.
Host Threshold – A prefetch occurs if at least this many valid descriptors are available in host memory.
Writeback Threshold – This field controls the writing back to host memory of already processed receive
descriptors in the ethernet controller’s on-chip buffer which are ready to be written back to host memory.
GRAN (Granularity): 1=descriptor-size, 0=cacheline-size
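The three thresholds and the granularity bit can be packed into one 32-bit value for this register. The following sketch assumes each threshold is a 6-bit field at the positions shown in the register diagram; `rxdctl_make()` is a hypothetical helper, and the sample threshold values in the usage note are arbitrary, not recommendations.

```c
#include <stdint.h>

#define RXDCTL_GRAN   (1u << 24)   /* 1 = thresholds count descriptors */

/* Packs the fields; values wider than six bits are masked off */
uint32_t rxdctl_make(unsigned pthresh, unsigned hthresh,
                     unsigned wthresh, int desc_granularity)
{
    uint32_t value = 0;

    value |= ((uint32_t)pthresh & 0x3Fu);          /* bits 5-0   */
    value |= ((uint32_t)hthresh & 0x3Fu) << 8;     /* bits 13-8  */
    value |= ((uint32_t)wthresh & 0x3Fu) << 16;    /* bits 21-16 */
    if (desc_granularity)
        value |= RXDCTL_GRAN;                      /* bit 24     */
    return value;
}
```

For example, `rxdctl_make(8, 4, 1, 1)` yields 0x01010408, which a driver would write to offset 0x2828.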
Legacy Rx-Descriptor Layout
offset   bits 31..0
0x0      Buffer-Address low (bits 31..0)
0x4      Buffer-Address high (bits 63..32)
0x8      Packet Checksum (bits 31..16) | Packet Length in bytes (bits 15..0)
0xC      VLAN tag (bits 31..16) | Errors (bits 15..8) | Status (bits 7..0)
Buffer-Address = the packet-buffer’s 64-bit address in physical memory
Packet Length = number of bytes in the data-packet that was received
Packet Checksum = the 16-bit one’s-complement checksum of the entire logical packet
Status = shows if descriptor has been used and if it’s last in a logical packet
Errors = valid only when DD and EOP are set in the descriptor’s Status field
Suggested C syntax
typedef struct  {
                unsigned long long      base_addr;
                unsigned short          pkt_length;
                unsigned short          checksum;
                unsigned char           desc_stat;
                unsigned char           desc_errs;
                unsigned short          vlan_tag;
                } rx_descriptor;
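Since the NIC reads and writes these descriptors directly, the C structure must be exactly 16 bytes with every member at the offset the hardware expects. The sketch below restates the suggested structure with fixed-width types and lets the compiler verify the layout. The type name `rx_descriptor_checked` is ours, and `_Static_assert` assumes a C11 compiler.

```c
#include <stdint.h>
#include <stddef.h>

/* Same members as the suggested structure, using fixed-width types */
typedef struct {
    uint64_t base_addr;    /* offset 0x0: buffer's physical address */
    uint16_t pkt_length;   /* offset 0x8                            */
    uint16_t checksum;     /* offset 0xA                            */
    uint8_t  desc_stat;    /* offset 0xC                            */
    uint8_t  desc_errs;    /* offset 0xD                            */
    uint16_t vlan_tag;     /* offset 0xE                            */
} rx_descriptor_checked;

/* Compilation fails if padding ever changes the layout */
_Static_assert(sizeof(rx_descriptor_checked) == 16,
               "legacy rx descriptor must be 16 bytes");
_Static_assert(offsetof(rx_descriptor_checked, pkt_length) == 8,
               "length must follow the 64-bit address");
_Static_assert(offsetof(rx_descriptor_checked, desc_stat) == 12,
               "status byte must sit at offset 0xC");
```

Every member falls on its natural alignment, so common compilers insert no padding; the assertions simply make that expectation explicit.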
RxDesc Status-field
bit 7    PIF
bit 6    IPCS
bit 5    TCPCS
bit 4    UDPCS
bit 3    VP
bit 2    IXSM
bit 1    EOP
bit 0    DD
DD = Descriptor Done (1=yes, 0=no) shows if nic is finished with descriptor
EOP = End Of Packet (1=yes, 0=no) shows if this packet is logically last
IXSM = Ignore Checksum Indications (1=yes, 0=no)
VP = VLAN Packet match (1=yes, 0=no)
UDPCS = UDP Checksum calculated in packet (1=yes, 0=no)
TCPCS = TCP Checksum calculated in packet (1=yes, 0=no)
IPCS = IPv4 Checksum calculated on packet (1=yes, 0=no)
PIF = Passed In-exact Filter (1=yes, 0=no) shows if software must check
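A driver's read routine or interrupt handler would test these bits to decide whether a descriptor holds a finished packet. Here is a minimal sketch; the macro and function names are ours, and the bit positions follow the Status-field layout above.

```c
#include <stdint.h>

#define RXD_STAT_DD    (1u << 0)   /* Descriptor Done */
#define RXD_STAT_EOP   (1u << 1)   /* End Of Packet   */

/* True once the NIC has finished writing this descriptor */
int rx_desc_done(uint8_t status)
{
    return (status & RXD_STAT_DD) != 0;
}

/* True if the descriptor is done AND it ends a logical packet */
int rx_packet_complete(uint8_t status)
{
    const uint8_t both = RXD_STAT_DD | RXD_STAT_EOP;
    return (status & both) == both;
}
```

Before handing a descriptor back to the NIC, software would normally reset its status byte to zero so that a stale DD bit is not mistaken for a new packet.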
RxDesc Error-field
bit 7    RXE
bit 6    IPE
bit 5    TCPE
bit 4    reserved (=0)
bit 3    reserved (=0)
bit 2    SEQ
bit 1    SE
bit 0    CE
RXE = Received-data Error (1=yes, 0=no)
IPE = IPv4-checksum error (1=yes, 0=no)
TCPE = TCP/UDP checksum error (1=yes, 0=no)
SEQ = Sequence error (1=yes, 0=no)
SE = Symbol Error (1=yes, 0=no)
CE = CRC Error or alignment error (1=yes, 0=no)
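When debugging, it helps to turn the Error-field byte into readable names. The sketch below does that for the six defined bits. `rx_error_names()` is a hypothetical helper, and (as noted for the descriptor layout) the byte is only meaningful when DD and EOP are both set in the Status field.

```c
#include <stdint.h>
#include <stddef.h>
#include <stdio.h>

/* Defined error bits, lowest bit first (bits 3 and 4 are reserved) */
static const struct { uint8_t mask; const char *name; } rx_err_bits[] = {
    { 1u << 0, "CE"   },
    { 1u << 1, "SE"   },
    { 1u << 2, "SEQ"  },
    { 1u << 5, "TCPE" },
    { 1u << 6, "IPE"  },
    { 1u << 7, "RXE"  },
};

/* Writes the names of all set error bits into 'buf', space-separated.
 * Returns the number of characters written (0 means "no errors"). */
size_t rx_error_names(uint8_t errs, char *buf, size_t len)
{
    size_t used = 0;
    size_t i;

    if (len == 0)
        return 0;
    buf[0] = '\0';
    for (i = 0; i < sizeof rx_err_bits / sizeof rx_err_bits[0]; i++) {
        int n;

        if (!(errs & rx_err_bits[i].mask))
            continue;
        n = snprintf(buf + used, len - used, "%s%s",
                     used ? " " : "", rx_err_bits[i].name);
        if (n < 0 || (size_t)n >= len - used)
            break;                      /* out of room: stop early */
        used += (size_t)n;
    }
    return used;
}
```

For instance, an error byte of 0x05 produces the string `CE SEQ`.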
Network Administration
• Some higher-level networking protocols
require the Operating System to setup a
translation between the ‘hostname’ for a
workstation and the hardware-address of
its Network Interface Controller
• One mechanism for doing this is the creation
of a specially-named textfile (‘/etc/ethers’)
that provides a database for these translations
In-class exercise #1
• We put a file named ‘ethers’ on our course
website that offers a template for defining the
translation database that software can consult
on our ‘anchor’ cluster’s LAN
• One of the eight workstations’ entries has been
filled in already:
00:30:48:8A:30:03    anchor00.cs.usfca.edu
• Can you complete this database by adding the
MAC addresses for the other 7 machines?
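Software that consults such a database has to split each entry into a hardware-address and a hostname. The sketch below parses one line of the form shown in the exercise. `parse_ethers_line()` is a hypothetical helper, and it assumes the simple "address, whitespace, name" format with hostnames shorter than 64 characters.

```c
#include <stdint.h>
#include <stdio.h>

/* Parses "xx:xx:xx:xx:xx:xx hostname".  Returns 0 on success, -1 if the
 * line does not match.  'host' must have room for 64 characters. */
int parse_ethers_line(const char *line, uint8_t mac[6], char host[64])
{
    unsigned int b[6];
    int i;
    int n = sscanf(line, "%2x:%2x:%2x:%2x:%2x:%2x %63s",
                   &b[0], &b[1], &b[2], &b[3], &b[4], &b[5], host);

    if (n != 7)                 /* six address bytes plus the name */
        return -1;
    for (i = 0; i < 6; i++)
        mac[i] = (uint8_t)b[i];
    return 0;
}
```

The single space in the format string matches any run of blanks or tabs, so entries aligned in columns parse the same way.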
Our ‘seereset.c’ demo
• We created this LKM to demonstrate the
sequence of ‘state-changes’ that three of
our network controller’s registers undergo
in response to initiating a ‘reset’ operation
• The programming technique used here is
one which we think could be useful in lots
of other hardware programming situations
where a vendor’s manual may not answer
all our questions about how devices work
In-class exercise #2
• Try redirecting the output from this ‘cat’
command to a file, like this:
$ cat /proc/seereset > seereset.out
• Then edit this textfile, adding a comment
to each line which indicates the bit(s) that
experienced a ‘change-of-state’ from the
line that came before it (thereby providing
yourself with a running commentary as to
how the NIC proceeds through a ‘reset’)
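Spotting which bits changed between two successive register values is an exclusive-OR. The sketch below shows the idea on a pair of 32-bit values; `describe_changes()` is a hypothetical helper, and it makes no assumption about the actual layout of the ‘/proc/seereset’ output, which you would read by eye or parse separately.

```c
#include <stdint.h>
#include <stddef.h>
#include <stdio.h>

/* A set bit in the result marks a position where the two values differ */
uint32_t changed_bits(uint32_t before, uint32_t after)
{
    return before ^ after;
}

/* Lists each changed bit, highest first, with its old and new value.
 * Returns the number of characters written into 'buf'. */
size_t describe_changes(uint32_t before, uint32_t after,
                        char *buf, size_t len)
{
    uint32_t diff = changed_bits(before, after);
    size_t used = 0;
    int bit;

    if (len == 0)
        return 0;
    buf[0] = '\0';
    for (bit = 31; bit >= 0; bit--) {
        int now, n;

        if (!((diff >> bit) & 1u))
            continue;
        now = (int)((after >> bit) & 1u);
        n = snprintf(buf + used, len - used, "%sbit %d: %d->%d",
                     used ? ", " : "", bit, !now, now);
        if (n < 0 || (size_t)n >= len - used)
            break;                      /* out of room: stop early */
        used += (size_t)n;
    }
    return used;
}
```

Applied to each pair of adjacent lines, this produces exactly the kind of per-line comment the exercise asks for.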