bge(4) transmit performance improvement (mailing.openbsd.tech, OpenBSD)
I'm interested in finding out whether anyone on this list has a lab setup with bge gear where the following diff could be benchmarked, ideally in a router- or firewall-like configuration. I'd like to see whether this diff makes a noticeable difference in transmit performance, or whether it amounts to a micro-optimization with little or no measurable effect. Please try it out anyway and let me know how it goes; if you do test the diff, please send me a dmesg too.

Correct a performance bug from Bill Paul's original FreeBSD bge(4) driver:

Each call to the FreeBSD bge_start() routine reads the transmit producer index from the chip mailbox register BGE_MBX_TX_HOST_PROD0_LO. The local copy of that value is then updated by bge_encap() as bge_encap() encapsulates packets in the Tx ring. If bge_encap() succeeds in encapsulating one or more packets, bge_start() tells the chip to start sending the newly encapsulated packets by writing the new value back to the chip mailbox register.

However, comparison with the Linux drivers (Broadcom-supplied and open-source tg3.c) and with the OpenSolaris driver confirms that register BGE_MBX_TX_HOST_PROD0_LO is only ever written by software. Thus, we can just keep a copy in the softc and eliminate the (expensive) PCI register read on each call to bge_start().

From jonathan (NetBSD)

Index: if_bge.c
===================================================================
RCS file: /cvs/src/sys/dev/pci/if_bge.c,v
retrieving revision 1.92
diff -u -p -r1.92 if_bge.c
--- if_bge.c	14 Nov 2005 13:11:40 -0000	1.92
+++ if_bge.c	18 Nov 2005 18:29:02 -0000
@@ -1058,10 +1058,14 @@ bge_init_tx_ring(struct bge_softc *sc)
 
 	sc->bge_txcnt = 0;
 	sc->bge_tx_saved_considx = 0;
-	CSR_WRITE_4(sc, BGE_MBX_TX_HOST_PROD0_LO, 0);
+
+	/* Initialize transmit producer index for host-memory send ring. */
+	sc->bge_tx_prodidx = 0;
+	CSR_WRITE_4(sc, BGE_MBX_TX_HOST_PROD0_LO, sc->bge_tx_prodidx);
 	if (sc->bge_quirks & BGE_QUIRK_PRODUCER_BUG)
-		CSR_WRITE_4(sc, BGE_MBX_TX_HOST_PROD0_LO, 0);
+		CSR_WRITE_4(sc, BGE_MBX_TX_HOST_PROD0_LO, sc->bge_tx_prodidx);
 
+	/* NIC-memory send ring not used; initialize to zero. */
 	CSR_WRITE_4(sc, BGE_MBX_TX_NIC_PROD0_LO, 0);
 	if (sc->bge_quirks & BGE_QUIRK_PRODUCER_BUG)
 		CSR_WRITE_4(sc, BGE_MBX_TX_NIC_PROD0_LO, 0);
@@ -2805,7 +2809,7 @@ bge_start(struct ifnet *ifp)
 {
 	struct bge_softc *sc;
 	struct mbuf *m_head = NULL;
-	u_int32_t prodidx = 0;
+	u_int32_t prodidx;
 	int pkts = 0;
 
 	sc = ifp->if_softc;
@@ -2813,7 +2817,7 @@ bge_start(struct ifnet *ifp)
 	if (!sc->bge_link && ifp->if_snd.ifq_len < 10)
 		return;
 
-	prodidx = CSR_READ_4(sc, BGE_MBX_TX_HOST_PROD0_LO);
+	prodidx = sc->bge_tx_prodidx;
 
 	while(sc->bge_cdata.bge_tx_chain[prodidx] == NULL) {
 		IFQ_POLL(&ifp->if_snd, m_head);
@@ -2869,6 +2873,8 @@ bge_start(struct ifnet *ifp)
 	CSR_WRITE_4(sc, BGE_MBX_TX_HOST_PROD0_LO, prodidx);
 	if (sc->bge_quirks & BGE_QUIRK_PRODUCER_BUG)
 		CSR_WRITE_4(sc, BGE_MBX_TX_HOST_PROD0_LO, prodidx);
+
+	sc->bge_tx_prodidx = prodidx;
 
 	/*
 	 * Set a timeout in case the chip goes out to lunch.
Index: if_bgereg.h
===================================================================
RCS file: /cvs/src/sys/dev/pci/if_bgereg.h,v
retrieving revision 1.30
diff -u -p -r1.30 if_bgereg.h
--- if_bgereg.h	9 Oct 2005 23:41:55 -0000	1.30
+++ if_bgereg.h	18 Nov 2005 18:29:03 -0000
@@ -2328,6 +2328,7 @@ struct bge_softc {
 	u_int16_t		bge_rx_saved_considx;
 	u_int16_t		bge_ev_saved_considx;
 	u_int16_t		bge_return_ring_cnt;
+	u_int32_t		bge_tx_prodidx;
 	u_int16_t		bge_std;	/* current std ring head */
 	u_int16_t		bge_jumbo;	/* current jumo ring head */
 	SLIST_HEAD(__bge_jfreehead, bge_jpool_entry)	bge_jfree_listhead;
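
For anyone who wants to reason about the change without hardware at hand, here is a rough userland model of the idea, not driver code. Everything in it is hypothetical and made up for illustration (fake_chip, fake_softc, reg_read(), reg_write(), start_with_read(), start_with_shadow(), and the ring size of 512); it only counts simulated register accesses and says nothing about real bus latency. It assumes, as the diff does, that nothing other than the driver ever changes the producer index.

```c
#include <assert.h>
#include <stdint.h>

#define RING_CNT 512u	/* hypothetical Tx ring size, for illustration only */

/* Stand-in for the chip: one mailbox register plus access counters. */
struct fake_chip {
	uint32_t mbx_tx_prod;	/* models BGE_MBX_TX_HOST_PROD0_LO */
	unsigned reads;		/* simulated register reads */
	unsigned writes;	/* simulated register writes */
};

/* Stand-in for the driver softc. */
struct fake_softc {
	struct fake_chip *chip;
	uint32_t tx_prodidx;	/* shadow copy of the producer index */
};

uint32_t
reg_read(struct fake_chip *c)
{
	c->reads++;
	return c->mbx_tx_prod;
}

void
reg_write(struct fake_chip *c, uint32_t v)
{
	assert(v < RING_CNT);	/* index must stay inside the ring */
	c->writes++;
	c->mbx_tx_prod = v;
}

/* Old scheme: fetch the producer index from the chip on every call. */
void
start_with_read(struct fake_softc *sc, unsigned npkts)
{
	uint32_t prodidx = reg_read(sc->chip);

	if (npkts == 0)
		return;		/* nothing queued, nothing to tell the chip */
	prodidx = (prodidx + npkts) % RING_CNT;
	reg_write(sc->chip, prodidx);
}

/* New scheme: start from the shadow copy, so no register read is needed. */
void
start_with_shadow(struct fake_softc *sc, unsigned npkts)
{
	uint32_t prodidx = sc->tx_prodidx;

	if (npkts == 0)
		return;
	prodidx = (prodidx + npkts) % RING_CNT;
	reg_write(sc->chip, prodidx);
	sc->tx_prodidx = prodidx;	/* keep the shadow copy in sync */
}
```

Both variants issue the same register writes; the shadow-copy variant simply never issues a read. Whether dropping that read is measurable on real hardware is exactly what the benchmark request above is meant to answer.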