82573L
Initializing our Pro/1000
Chicken-and-Egg?
• We want to create a Linux Kernel Module
that can serve application-programs as a
character-mode device-driver for our NIC
• So, as with the UART device, we will need
to implement ‘read()’ and ‘write()’ methods
• But which method should we do first?
• No way to “test” a ‘read()’ method without
having a way to send packets to our NIC
How ‘transmit’ works
[Figure: a list of Buffer-Descriptors (descriptor0 through descriptor3) in Random Access Memory, each one referring to its own packet buffer (Buffer0 through Buffer3)]
We set up each data-packet that we want
to be transmitted in a ‘Buffer’ area in RAM
We also create a list of buffer-descriptors
and inform the NIC of its location and size
Then, when ready, we tell the NIC to ‘Go!’
(i.e., start transmitting), and to let us know
when these transmissions are ‘Done’
Registers’ Names
Memory-information registers
TDBA(L/H) = Transmit-Descriptor Base-Address Low/High (64-bits)
TDLEN = Transmit-Descriptor array Length
TDH = Transmit-Descriptor Head
TDT = Transmit-Descriptor Tail
Transmit-engine control registers
TXDCTL = Transmit-Descriptor Control Register
TCTL = Transmit Control Register
Notification timing registers
TIDV = Transmit Interrupt Delay Value
TADV = Transmit-interrupt Absolute Delay Value
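As a minimal sketch of how the memory-information registers might be programmed, the fragment below writes a descriptor-array's address and length into an in-memory image of the register space. The offsets for TCTL and TXDCTL appear on these slides; the other offsets are the values used by the Linux ‘e1000’ driver's header-file. The helper names ‘reg_write()’ and ‘tx_ring_describe()’ are hypothetical, and in a real module ‘reg_write()’ would be replaced by ‘iowrite32()’.

```c
#include <assert.h>
#include <stdint.h>

/* Register offsets (in bytes) from the NIC's memory-mapped base address. */
enum {
    E1000_TCTL   = 0x0400,  /* Transmit Control                      */
    E1000_TDBAL  = 0x3800,  /* Transmit-Descriptor Base-Address Low  */
    E1000_TDBAH  = 0x3804,  /* Transmit-Descriptor Base-Address High */
    E1000_TDLEN  = 0x3808,  /* Transmit-Descriptor array Length      */
    E1000_TDH    = 0x3810,  /* Transmit-Descriptor Head              */
    E1000_TDT    = 0x3818,  /* Transmit-Descriptor Tail              */
    E1000_TIDV   = 0x3820,  /* Transmit Interrupt Delay Value        */
    E1000_TXDCTL = 0x3828,  /* Transmit-Descriptor Control           */
    E1000_TADV   = 0x382C   /* Transmit-interrupt Absolute Delay     */
};

/* Hypothetical stand-in for iowrite32(): stores into a register image. */
static void reg_write(volatile uint32_t *regs, unsigned offset, uint32_t value)
{
    regs[offset / 4] = value;
}

/* Tell the NIC where the descriptor array lives and how long it is. */
static void tx_ring_describe(volatile uint32_t *regs,
                             uint64_t ring_phys, uint32_t ring_bytes)
{
    assert((ring_phys & 0xF) == 0);                    /* 16-byte aligned   */
    assert(ring_bytes >= 128 && ring_bytes % 128 == 0); /* multiples of 128 */

    reg_write(regs, E1000_TDBAL, (uint32_t)(ring_phys & 0xFFFFFFFFu));
    reg_write(regs, E1000_TDBAH, (uint32_t)(ring_phys >> 32));
    reg_write(regs, E1000_TDLEN, ring_bytes);
    reg_write(regs, E1000_TDH, 0);   /* head and tail equal: ring is empty */
    reg_write(regs, E1000_TDT, 0);
}
```

Splitting the 64-bit address into two 32-bit writes mirrors the TDBA(L/H) register-pair named above.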
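Tx-Desc Ring-Buffer
[Figure: the descriptor array drawn as 16-byte slots at offsets 0x00 through 0x80 from the TDBA base-address, with TDLEN giving the array’s length (in bytes), TDH (head) and TDT (tail) each marking one slot, and shading to show which slots are owned by hardware (nic) and which are owned by software (cpu)]
Circular buffer (128-bytes minimum)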
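The head/tail bookkeeping for such a circular buffer can be sketched as plain index arithmetic. This assumes the usual convention for this NIC family that the descriptors from TDH up to (but not including) TDT belong to the hardware; the function names are hypothetical, and keeping one slot unused is a common software convention rather than something these slides require.

```c
#include <assert.h>
#include <stdint.h>

/* Advance a ring index by one, wrapping from the last slot back to slot 0. */
static uint32_t ring_next(uint32_t index, uint32_t count)
{
    assert(count > 0 && index < count);
    return (index + 1 == count) ? 0 : index + 1;
}

/* Number of descriptors from head up to (not including) tail. */
static uint32_t ring_owned_by_nic(uint32_t head, uint32_t tail, uint32_t count)
{
    assert(head < count && tail < count);
    return (tail + count - head) % count;
}

/* Slots software may still fill, keeping one slot unused so that
   head == tail can only mean "empty" (an assumed convention). */
static uint32_t ring_free_for_cpu(uint32_t head, uint32_t tail, uint32_t count)
{
    return count - 1 - ring_owned_by_nic(head, tail, count);
}
```

With the minimum 128-byte ring (8 descriptors), head=2 and tail=6 leaves four descriptors with the NIC and three slots free for software.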
Tx-Descriptor Control (0x3828)
bits 5..0 = PTHRESH (Prefetch Threshold)
bits 13..8 = HTHRESH (Host Threshold)
bits 21..16 = WTHRESH (Writeback Threshold)
bit 24 = GRAN (Granularity)
(the remaining bits are shown as 0)
“This register controls the fetching and write back of transmit descriptors.
The three threshold values are used to determine when descriptors are
read from, and written to, host memory. Their values can be in units of
cache lines or of descriptors (each descriptor is 16 bytes), based on the
value of the GRAN bit (0=cache lines, 1=descriptors). When GRAN = 1,
all descriptors are written back (even if not requested).” --Intel manual
Recommended for 82573: 0x01010000 (GRAN=1, WTHRESH=1)
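To see where the recommended value 0x01010000 comes from, here is a small sketch that packs the three thresholds and the GRAN bit into one 32-bit word. The field positions are those of the register layout (PTHRESH in bits 5..0, HTHRESH in bits 13..8, WTHRESH in bits 21..16, GRAN in bit 24); the function name ‘txdctl_pack()’ is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Build a TXDCTL value from its fields; each threshold is a 6-bit number. */
static uint32_t txdctl_pack(unsigned pthresh, unsigned hthresh,
                            unsigned wthresh, int gran)
{
    assert(pthresh < 64 && hthresh < 64 && wthresh < 64);
    return ((uint32_t)pthresh << 0)      /* Prefetch Threshold      */
         | ((uint32_t)hthresh << 8)      /* Host Threshold          */
         | ((uint32_t)wthresh << 16)     /* Writeback Threshold     */
         | (gran ? (UINT32_C(1) << 24) : 0); /* 1 = descriptor units */
}
```

Calling ‘txdctl_pack(0, 0, 1, 1)’ yields 0x01010000, i.e., GRAN=1 and WTHRESH=1.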
Transmit Control (0x0400)
bit 1 = EN
bit 3 = PSP
bits 11..4 = CT (COLLISION THRESHOLD)
bits 21..12 = COLD (COLLISION DISTANCE)
bit 22 = SWXOFF
bit 24 = RTLC
bit 25 = UNORTX
bits 27..26 = TXCSCMT
bit 28 = MULR
(the remaining bits are reserved, R=0)
EN = Transmit Enable
PSP = Pad Short Packets
CT = Collision Threshold (=0xF)
COLD = Collision Distance (=0x3F)
SWXOFF = Software XOFF Transmission
RTLC = Retransmit on Late Collision
UNORTX = Underrun No Re-Transmit
TXCSCMT = TxDescriptor Minimum Threshold
MULR = Multiple Request Support
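A minimal sketch of composing a Transmit Control value from the fields named above, assuming EN is bit 1, PSP is bit 3, CT occupies bits 11..4 and COLD occupies bits 21..12. The names ‘TCTL_EN’, ‘TCTL_PSP’ and ‘tctl_pack()’ are hypothetical, and the other control bits are simply left at zero here.

```c
#include <assert.h>
#include <stdint.h>

enum {
    TCTL_EN  = 1 << 1,   /* Transmit Enable   */
    TCTL_PSP = 1 << 3    /* Pad Short Packets */
};

/* Enable the transmitter with padding, a collision threshold (8 bits)
   and a collision distance (10 bits). */
static uint32_t tctl_pack(unsigned ct, unsigned cold)
{
    assert(ct <= 0xFF && cold <= 0x3FF);
    return (uint32_t)TCTL_EN | (uint32_t)TCTL_PSP
         | ((uint32_t)ct << 4)
         | ((uint32_t)cold << 12);
}
```

With the suggested settings CT=0xF and COLD=0x3F, ‘tctl_pack(0x0F, 0x3F)’ gives 0x0003F0FA.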
Tx Configuration Word (0x0178)
bit 31 = ANE
bit 30 = TxConfig
bits 15..0 = TxConfigWord
ANE = Auto-Negotiation Enable
TxConfig = Transmit Configuration Control bit
TxConfigWord = Transmit Configuration Word
This register has two meanings, depending on the state of the ANE bit
(i.e., setting ANE=1 enables the hardware auto-negotiation machine).
Applicable only in SerDes mode; program as 0 for internal-PHY mode.
Legacy Tx-Descriptor Layout
(each row is one 32-bit word, shown from bit 31 down to bit 0)
0x0: Buffer-Address low (bits 31..0)
0x4: Buffer-Address high (bits 63..32)
0x8: CMD | CSO | Packet Length (in bytes)
0xC: special | CSS | reserved=0 | status
Buffer-Address = the packet-buffer’s 64-bit address in physical memory
Packet-Length = number of bytes in the data-packet to be transmitted
CMD = Command-field
CSO/CSS = Checksum Offset/Start (in bytes)
STA = Status-field
Suggested C syntax
typedef struct {
        unsigned long long  base_addr;   // packet-buffer's physical address
        unsigned short      pkt_length;  // packet length (in bytes)
        unsigned char       cksum_off;   // CSO
        unsigned char       desc_cmd;    // CMD
        unsigned char       desc_stat;   // STA
        unsigned char       cksum_org;   // CSS
        unsigned short      special;
        } tx_descriptor;
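Because the NIC reads descriptors straight out of memory, the structure's layout matters. The sketch below restates the structure with fixed-width types (an assumption: ‘uint64_t’, ‘uint16_t’ and ‘uint8_t’ stand in for the types above) and checks that each field lands at the byte-offset implied by the legacy layout on a little-endian machine. The function name ‘tx_descriptor_check_layout()’ is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint64_t base_addr;    /* bytes 0x0..0x7: buffer address        */
    uint16_t pkt_length;   /* bytes 0x8..0x9: packet length         */
    uint8_t  cksum_off;    /* byte  0xA: CSO                        */
    uint8_t  desc_cmd;     /* byte  0xB: CMD                        */
    uint8_t  desc_stat;    /* byte  0xC: STA (low 4 bits)           */
    uint8_t  cksum_org;    /* byte  0xD: CSS                        */
    uint16_t special;      /* bytes 0xE..0xF                        */
} tx_descriptor;

/* Assert every field offset, then report the structure's total size. */
static size_t tx_descriptor_check_layout(void)
{
    assert(offsetof(tx_descriptor, base_addr)  == 0x0);
    assert(offsetof(tx_descriptor, pkt_length) == 0x8);
    assert(offsetof(tx_descriptor, cksum_off)  == 0xA);
    assert(offsetof(tx_descriptor, desc_cmd)   == 0xB);
    assert(offsetof(tx_descriptor, desc_stat)  == 0xC);
    assert(offsetof(tx_descriptor, cksum_org)  == 0xD);
    assert(offsetof(tx_descriptor, special)    == 0xE);
    return sizeof(tx_descriptor);
}
```

The returned size should be 16 bytes, matching the descriptor size quoted from the Intel manual.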
TxDesc Command-field
bit 7 = IDE
bit 6 = VLE
bit 5 = DEXT
bit 4 = reserved (=0)
bit 3 = RS
bit 2 = IC
bit 1 = IFCS
bit 0 = EOP
EOP = End Of Packet (1=yes, 0=no)
IFCS = Insert Frame CheckSum (1=yes, 0=no) – provided EOP is set
IC = Insert CheckSum (1=yes, 0=no) as indicated by CSO/CSS fields
RS = Report Status (1=yes, 0=no)
DEXT = Descriptor Extension (1=yes, 0=no) use ‘0’ for Legacy-Mode
VLE = VLAN-Packet Enable (1=yes, 0=no) – provided EOP is set
IDE = Interrupt-Delay Enable (1=yes, 0=no)
TxDesc Status field
bit 3 = reserved (=0)
bit 2 = LC
bit 1 = EC
bit 0 = DD
DD = Descriptor Done
this bit is written back after the NIC processes the descriptor
provided the descriptor’s RS-bit was set (i.e., Report Status)
EC = Excess Collisions
indicates that the packet has experienced more than the
maximum number of excessive collisions (as defined by
the TCTL.CT field) and therefore was not transmitted.
(This bit is meaningful only in HALF-DUPLEX mode.)
LC = Late Collision
indicates that a Late Collision occurred while operating in
HALF-DUPLEX mode. Note that the collision window size
is dependent on the SPEED: 64-bytes for 10/100-Mbps, or
512-bytes for 1000-Mbps.
Bit-mask definitions
enum {
        DD   = (1<<0), // Descriptor Done
        EC   = (1<<1), // Excess Collisions
        LC   = (1<<2), // Late Collision
        EOP  = (1<<0), // End Of Packet
        IFCS = (1<<1), // Insert Frame CheckSum
        IC   = (1<<2), // Insert CheckSum as per CSO/CSS
        RS   = (1<<3), // Report Status
        DEXT = (1<<5), // Descriptor Extension
        VLE  = (1<<6), // VLAN packet
        IDE  = (1<<7)  // Interrupt-Delay Enable
        };
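The command and status bits might be used together roughly as follows: software fills in a descriptor with EOP, IFCS and RS set, and later inspects the status-field for DD (and the two collision bits). This is only a sketch under those assumptions; the prefixed names (‘TXD_CMD_…’, ‘TXD_STAT_…’) and both function names are hypothetical, chosen so the block stands alone.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum {
    TXD_STAT_DD  = 1 << 0,  /* Descriptor Done       */
    TXD_STAT_EC  = 1 << 1,  /* Excess Collisions     */
    TXD_STAT_LC  = 1 << 2,  /* Late Collision        */
    TXD_CMD_EOP  = 1 << 0,  /* End Of Packet         */
    TXD_CMD_IFCS = 1 << 1,  /* Insert Frame CheckSum */
    TXD_CMD_RS   = 1 << 3   /* Report Status         */
};

typedef struct {
    uint64_t base_addr;
    uint16_t pkt_length;
    uint8_t  cksum_off;
    uint8_t  desc_cmd;
    uint8_t  desc_stat;
    uint8_t  cksum_org;
    uint16_t special;
} tx_descriptor;

/* Describe one complete packet held in a single buffer. */
static void tx_descriptor_prepare(tx_descriptor *d,
                                  uint64_t buf_phys, uint16_t len)
{
    assert(d != NULL && len > 0);
    memset(d, 0, sizeof *d);          /* status starts out as zero */
    d->base_addr  = buf_phys;
    d->pkt_length = len;
    d->desc_cmd   = TXD_CMD_EOP | TXD_CMD_IFCS | TXD_CMD_RS;
}

/* Returns 0 while DD is still clear, 1 once DD is set without errors,
   and -1 if DD is set together with a collision bit. */
static int tx_descriptor_result(const volatile tx_descriptor *d)
{
    uint8_t stat = d->desc_stat;

    if (!(stat & TXD_STAT_DD))
        return 0;
    return (stat & (TXD_STAT_EC | TXD_STAT_LC)) ? -1 : 1;
}
```

Setting RS matters here: without it the NIC is not asked to write DD back, so the status check would have nothing to observe.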
Allocating kernel-memory
• Our 82573L device-driver will need to use
a segment of contiguous physical memory
which is cache-aligned and non-pageable
• As explained in our LDD3 textbook, such a
memory-block can be allocated using the
Linux kernel’s ‘kmalloc()’ function (and it
can later be deallocated using ‘kfree()’)
• The maximum-size allocation is 128-KB
• You should use the ‘GFP_KERNEL’ flag
Network MTU
• Unless the ‘Large-Send’ functionality has
been enabled, there will be a maximum
length for your network ‘datagrams’ equal
to 1536 bytes (=0x0600)
• So if you reused the same Packet-Buffer
for successive transmissions, you could fit
your packet-buffer and a moderate-sized
Descriptor-Buffer into one 4KB-pageframe
Single page-frame option
[Figure: one 4KB PageFrame]
Descriptor-Buffer (1-KB)
(room for up to 64 descriptors)
Packet-Buffer (3-KB)
(reused for successive transmissions)
Another design-option…
[Figure: one 4KB PageFrame]
Descriptor-Buffer (128-bytes)
(room for 8 descriptors)
8 Packet-Buffers (3968-bytes)
(496-bytes per buffer)
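The arithmetic behind such page-frame layouts can be sketched as below, taking the 16-byte descriptor size from the Intel manual quotation and a 4096-byte page-frame. The type ‘page_plan’ and the function ‘plan_page()’ are hypothetical names for illustration only.

```c
#include <assert.h>

enum { PAGE_BYTES = 4096, DESC_BYTES = 16 };

typedef struct {
    unsigned descriptors;    /* how many descriptors the area holds    */
    unsigned buffers;        /* how many packet-buffers were requested */
    unsigned buffer_bytes;   /* bytes available to each packet-buffer  */
} page_plan;

/* Split one page-frame into a descriptor area followed by equal-sized
   packet-buffers that share whatever space remains. */
static page_plan plan_page(unsigned desc_area_bytes, unsigned buffers)
{
    page_plan p;

    assert(desc_area_bytes >= 128 && desc_area_bytes % 128 == 0);
    assert(desc_area_bytes < PAGE_BYTES && buffers > 0);

    p.descriptors  = desc_area_bytes / DESC_BYTES;
    p.buffers      = buffers;
    p.buffer_bytes = (PAGE_BYTES - desc_area_bytes) / buffers;
    return p;
}
```

For example, ‘plan_page(1024, 1)’ leaves one 3072-byte packet-buffer beside a 1-KB descriptor area, while ‘plan_page(128, 8)’ leaves 496 bytes for each of eight packet-buffers.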
Initialization
• Your device-driver needs to initialize your
82573L hardware to a known state, and
configure its options for your desired mode
of operation
• The Device Control register has bits which
let you initiate a ‘device reset’ operation
• The Device Status register has bits which
inform you when a ‘reset’ has completed
Device Status (0x0008)
bit 31 = ? (some undocumented functionality?)
bit 19 = GIO Master EN
bit 10 = PHY reset
bits 9..8 = ASDV
bits 7..6 = SPEED
bit 4 = TXOFF
bits 3..2 = Function ID
bit 1 = LU
bit 0 = FD
(the remaining bits are shown as 0)
FD = Full-Duplex
LU = Link Up
TXOFF = Transmission Paused
SPEED (00=10Mbps, 01=100Mbps, 10=1000Mbps, 11=reserved)
ASDV = Auto-negotiation Speed Detection Value
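A value read from the Device Status register could be decoded along these lines. The sketch assumes FD is bit 0, LU is bit 1 and SPEED occupies bits 7..6 with the encodings listed above; all of the names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

enum {
    STATUS_FD = 1 << 0,   /* Full-Duplex */
    STATUS_LU = 1 << 1    /* Link Up     */
};

static int status_full_duplex(uint32_t status)
{
    return (status & STATUS_FD) != 0;
}

static int status_link_up(uint32_t status)
{
    return (status & STATUS_LU) != 0;
}

/* Link speed in Mbps, or 0 for the reserved encoding (11). */
static unsigned status_speed_mbps(uint32_t status)
{
    static const unsigned mbps[4] = { 10, 100, 1000, 0 };
    return mbps[(status >> 6) & 3];
}
```

A status value of 0x00000083, for instance, would decode as link up, full-duplex, 1000Mbps.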
Device Control (0x0000)
bit 31 = PHYRST
bit 30 = VME
bit 28 = TFCE
bit 27 = RFCE
bit 26 = RST
bit 20 = ADVD3WUC
bit 12 = FRCDPLX
bit 11 = FRCSPD
bits 9..8 = SPEED
bit 6 = SLU
bit 2 = GIOMD
bit 0 = FD
(the diagram also shows a D/UD status field; the remaining bits are reserved)
FD = Full-Duplex
GIOMD = GIO Master Disable
SLU = Set Link Up
FRCSPD = Force Speed
FRCDPLX = Force Duplex
SPEED (00=10Mbps, 01=100Mbps, 10=1000Mbps, 11=reserved)
ADVD3WUC = Advertise Cold Wake Up Capability
D/UD = Dock/Undock status
RFCE = Rx Flow-Control Enable
RST = Device Reset
TFCE = Tx Flow-Control Enable
PHYRST = Phy Reset
VME = VLAN Mode Enable
Extended Control (0x0018)
[Figure: 32-bit register layout; the named fields are listed below and the remaining bits are reserved]
ASDCHK = AutoSpeed Detection Check
EERST = EEPROM Reset
SPDBYPS = Speed-selection Bypass
RODIS = Relaxed-Ordering Disable
DMADynGE = DMA Dynamic-Gating Enable
PhyPwrDownEn = Phy PowerDown Enable
TxLSFlow = Tx Large-Send Flow
TxLS = Tx Large-Send functionality
PBPAREN = Packet-Buffer Parity-Error Detect
DFPAREN = Descriptor-FIFO Parity-Error Detect
IAME = Interrupt-Acknowledge Auto-Mask Enable
ITCE = Interrupt Timers Cleared Enable
Example
// clear STATUS bit #31
iowrite32( 0x00000000, io + E1000_STATUS );
// initiate Device-Reset and Phy-Reset
iowrite32( 0x84000000, io + E1000_CTRL );
// wait until STATUS bit #31 is set
while ( ( ioread32( io + E1000_STATUS )&(1<<31)) == 0 );
// program Link Up with desired operating-mode settings
iowrite32( 0x00040241, io + E1000_CTRL );
// wait until LU-bit (bit #1) in STATUS is set
while ( ( ioread32( io + E1000_STATUS )&(1<<1)) == 0 );
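The two magic numbers written to the Device Control register can be spelled out from named bits, as in this sketch. It assumes FD is bit 0, SLU is bit 6, SPEED is bits 9..8, RST is bit 26 and PHYRST is bit 31; the macro and function names are hypothetical. The example's second value also sets bit 18, which is passed in here as an unnamed extra because its meaning is not given on these slides.

```c
#include <assert.h>
#include <stdint.h>

#define CTRL_FD          (UINT32_C(1) << 0)    /* Full-Duplex         */
#define CTRL_SLU         (UINT32_C(1) << 6)    /* Set Link Up         */
#define CTRL_SPEED_1000  (UINT32_C(2) << 8)    /* SPEED = 10 (binary) */
#define CTRL_RST         (UINT32_C(1) << 26)   /* Device Reset        */
#define CTRL_PHYRST      (UINT32_C(1) << 31)   /* Phy Reset           */

/* Value that initiates both a Device-Reset and a Phy-Reset. */
static uint32_t ctrl_reset_value(void)
{
    return CTRL_RST | CTRL_PHYRST;
}

/* Value that requests link-up at 1000Mbps full-duplex, plus any
   additional bits the caller wants set. */
static uint32_t ctrl_linkup_value(uint32_t extra_bits)
{
    return CTRL_FD | CTRL_SLU | CTRL_SPEED_1000 | extra_bits;
}
```

Here ‘ctrl_reset_value()’ reproduces 0x84000000, and ‘ctrl_linkup_value(1u << 18)’ reproduces 0x00040241.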
Interrupt Cause Read (0x00C0)
Mechanism for NIC-event notifications
bit 31 = INT-Assert
bit 17 = ACK
bit 16 = SRPD
bit 15 = TXDLOW
bit 9 = MDAC
bit 7 = RXT0
bit 6 = RXO
bit 4 = RXDMT0
bit 2 = LSC
bit 1 = TXQE
bit 0 = TXDW
(the remaining bits are reserved, R=0)
TXDW = Transmit Descriptor Written back
LSC = Link Status Changed
TXQE = Transmit Queue Empty
MDAC = MDI/O Access Completed
SRPD = Small Receive Packet Detected
ACK = Receive ACK-frame detected
RXT0 = Receiver Timer Interrupt
RXO = Receiver Overrun
TXDLOW = Transmit Descriptor Low Threshold Reached
RXDMT0 = Receive Descriptor Minimum Threshold Reached
INT-Assert = Interrupt Assertion is still pending
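When examining values read from this register, it can help to turn the set bits into their names. The following is a sketch only, assuming the bit positions TXDW=0, TXQE=1, LSC=2, RXDMT0=4, RXO=6, RXT0=7, MDAC=9, TXDLOW=15, SRPD=16, ACK=17 and INT-Assert=31; the table ‘icr_bits’ and the function ‘icr_decode()’ are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

struct icr_bit { uint32_t mask; const char *name; };

static const struct icr_bit icr_bits[] = {
    { UINT32_C(1) << 0,  "TXDW"       },
    { UINT32_C(1) << 1,  "TXQE"       },
    { UINT32_C(1) << 2,  "LSC"        },
    { UINT32_C(1) << 4,  "RXDMT0"     },
    { UINT32_C(1) << 6,  "RXO"        },
    { UINT32_C(1) << 7,  "RXT0"       },
    { UINT32_C(1) << 9,  "MDAC"       },
    { UINT32_C(1) << 15, "TXDLOW"     },
    { UINT32_C(1) << 16, "SRPD"       },
    { UINT32_C(1) << 17, "ACK"        },
    { UINT32_C(1) << 31, "INT-Assert" },
};

/* Writes the names of the set cause-bits into 'out', separated by
   single spaces, and returns how many names were written. */
static unsigned icr_decode(uint32_t icr, char *out, size_t out_size)
{
    unsigned found = 0;
    size_t used = 0;
    size_t i;

    assert(out != NULL && out_size > 0);
    out[0] = '\0';
    for (i = 0; i < sizeof icr_bits / sizeof icr_bits[0]; i++) {
        int n;

        if (!(icr & icr_bits[i].mask))
            continue;
        n = snprintf(out + used, out_size - used, "%s%s",
                     found ? " " : "", icr_bits[i].name);
        if (n < 0 || (size_t)n >= out_size - used) {
            out[used] = '\0';   /* no room: drop the partial name */
            break;
        }
        used += (size_t)n;
        found++;
    }
    return found;
}
```

In a kernel module the resulting string could be handed to ‘printk()’ alongside the raw register value.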
In-Class Exercise #1
• Try compiling and installing our ‘tryreset.c’
demo-module, and examine the messages
put in the kernel’s log-file (use ‘dmesg’)
• Then modify the module-code so that it
also outputs the value in the ICR register
(Interrupt Cause Read) during each pass
through the two ‘busy-waiting’ loops
• #define E1000_ICR 0x00C0
In-Class Exercise #2
• Apply the same techniques we employed in
our earlier ‘announce.c’ demo-module so
that the ‘printk()’ statements in ‘tryreset.c’
get replaced by statements that will show
the messages onscreen, or in the current
desktop window, rather than writing them
to the kernel’s (out-of-view) log-file