Unix Technical Forum

stability problem, disappearing interfaces

#1
02-16-2008, 05:33 AM
Alexander Ost

Hello,

I encountered a rather strange stability problem. My box runs 24/7 as
a router on a PPPoE DSL line. The setup runs smoothly for days and even
weeks, but at some point...

1. All network interfaces suddenly disappear:

# ifconfig -a
: no such interface

# ifconfig tun0
tun0: no such interface

The connection is still up (it is impossible, though, to modify
/etc/pf.conf, since pf ceases to know the interfaces (like tun0)
referenced therein).

2. All traffic stops, and socket-related operations fail with
"No buffer space available" errors:

Mar 26 05:27:15 l ppp[26918]: tun0: Warning: iface add: ioctl(SIOCAIFADDR, 80.146.107.110 -> 217.5.98.182
): No buffer space available
Mar 26 05:27:15 l ppp[26918]: tun0: Error: ipcp_InterfaceUp: unable to set ip address
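
For reference, the "No buffer space available" text in the ppp log is the standard message for the ENOBUFS errno, which is returned when the system lacks sufficient buffer space or a queue is full. A minimal Python sketch, assuming one wants a monitoring script to tell this condition apart from other socket errors (the helper name is_buffer_exhaustion is made up for illustration):

```python
import errno
import os


def is_buffer_exhaustion(exc: OSError) -> bool:
    """Return True if the error is ENOBUFS ("No buffer space available")."""
    return exc.errno == errno.ENOBUFS


if __name__ == "__main__":
    # Simulate the failure instead of waiting for a real one.
    simulated = OSError(errno.ENOBUFS, os.strerror(errno.ENOBUFS))
    print(is_buffer_exhaustion(simulated))
    # Any other socket error should not be classified as buffer exhaustion.
    print(is_buffer_exhaustion(OSError(errno.ECONNREFUSED, "refused")))
```

Such a check only names the symptom; it does not say which part of the kernel is holding the buffers.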



Googling around, I found others having similar problems (e.g.,
<20031101004342.GH25238@griffon.lucky.openbsd.misc>,
<47bdb55e.0211300833.23bf2420@posting.google.com>,
<3FBB2396.9050508@bbyrd.net.lucky.openbsd.misc>,
<000001c3e42e$0953c7e0$6500a8c0@shuttle.lucky.openbsd.misc>), but no
solution yet. According to these postings, the problems exist (at
least) on 3.2, 3.3 and 3.4, and also 3.5-current.

Details of my setup:

- the problems appear on two completely different h/w setups
  o desktop system with two PCI-based Realtek 8139 NICs
  o laptop system with two PCMCIA NICs: one 3Com 3C589D (connected to
    the DSL modem), one Realtek-based

- ppp is restarted every 24 hours

- I'm running pf with altq on tun0

- recovery from the above error messages is only possible by rebooting;
  successively killing all processes does not help

- Kernel is 3.3Rel with the altq/tun patch from benzedrine.cx... looks a
  bit shaky, but the above postings indicate there's little chance
  that the problems disappear even in 3.5 (which I'll try next).

I suspect the problems _might_ be related to either the Realtek NIC
drivers (less likely) or ppp with pppoe, possibly in conjunction with
the support for altq on tunneling devices... but these are just rough
guesses. Any further ideas, or suggestions on how to find more
evidence...?

Thanks,

/alex


ob-dmesg:
vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv vvvvvvvvvvvvvvvvvvvv
OpenBSD 3.3 (build) #0: Thu Sep 4 00:52:08 CEST 2003
arost@localhost:/tmp/build
cpu0: F00F bug workaround installed
cpu0: Intel Pentium/MMX ("GenuineIntel" 586-class) 166 MHz
cpu0: FPU,V86,DE,PSE,TSC,MSR,MCE,CX8,MMX
real mem = 66699264 (65136K)
avail mem = 56258560 (54940K)
using 839 buffers containing 3436544 bytes (3356K) of memory
mainbus0 (root)
bios0 at mainbus0: AT/286+(c6) BIOS, date 04/28/99, BIOS32 rev. 0 @ 0xf0400
apm0 at bios0: Power Management spec V1.2
apm0: AC on, battery charge unknown
pcibios0 at bios0: rev. 2.1 @ 0xf0000/0xa22
pcibios0: PCI IRQ Routing Table rev. 1.0 @ 0xf09b0/112 (5 entries)
pcibios0: PCI Interrupt Router at 000:07:0 ("Intel 82371FB PCI-ISA" rev 0x00)
pcibios0: PCI bus #0 is the last bus
bios0: ROM list: 0xc0000/0x8000
pci0 at mainbus0 bus 0: configuration mode 1 (bios)
pchb0 at pci0 dev 0 function 0 "Intel 82439HX" rev 0x03
pcib0 at pci0 dev 7 function 0 "Intel 82371SB PCI-ISA" rev 0x01
pciide0 at pci0 dev 7 function 1 "Intel 82371SB IDE" rev 0x00: DMA, channel 0 wired to compatibility, channel 1 wired to compatibility
wd0 at pciide0 channel 0 drive 0: <Maxtor 94098U8>
wd0: 16-sector PIO, LBA, 39082MB, 16383 cyl, 16 head, 63 sec, 80041248 sectors
wd0(pciide0:0:0): using PIO mode 4, DMA mode 2
pciide0: channel 1 disabled (no drives)
rl0 at pci0 dev 11 function 0 "Realtek 8139" rev 0x10: irq 10 address 00:e0:4c:e6:cd:7b
rlphy0 at rl0 phy 0: RTL internal phy
rl1 at pci0 dev 12 function 0 "Realtek 8139" rev 0x10: irq 11 address 00:e0:4c:e7:07:ef
rlphy1 at rl1 phy 0: RTL internal phy
isa0 at pcib0
isadma0 at isa0
pckbc0 at isa0 port 0x60/5
pckbd0 at pckbc0 (kbd slot)
pckbc0: using irq 1 for kbd slot
wskbd0 at pckbd0: console keyboard
vga0 at isa0 port 0x3b0/48 iomem 0xa0000/131072
wsdisplay0 at vga0: console (80x25, vt100 emulation), using wskbd0
wsdisplay0: screen 1-5 added (80x25, vt100 emulation)
pcppi0 at isa0 port 0x61
midi0 at pcppi0: <PC speaker>
sysbeep0 at pcppi0
lpt0 at isa0 port 0x378/4 irq 7
npx0 at isa0 port 0xf0/16: using exception 16
pccom0 at isa0 port 0x3f8/8 irq 4: ns16550a, 16 byte fifo
pccom1 at isa0 port 0x2f8/8 irq 3: ns16550a, 16 byte fifo
fdc0 at isa0 port 0x3f0/6 irq 6 drq 2
fd0 at fdc0 drive 0: 1.44MB 80 cyl, 2 head, 18 sec
biomask 4040 netmask 4c40 ttymask 4cc2
pctr: 586-class performance counters and user-level cycle counter enabled
dkcsum: wd0 matched BIOS disk 80
root on wd0a
rootdev=0x0 rrootdev=0x300 rawdev=0x302
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ ^^^^^^^^^^^^^^^^^^^^

#2
02-16-2008, 05:33 AM
bards

Alexander Ost wrote:
> - the problems appear on two completely different h/w setups
>   o desktop system with two PCI-based Realtek 8139 NICs
> [rest of the original post and dmesg snipped]


Yep, getting rid of the Realtek NICs would be a start; try to get Intel
8255x cards (fxp driver). Also, try an upgrade to 3.4, or wait a week and
go to 3.5.

#3
02-16-2008, 05:33 AM
beldar

bards wrote:
> Alexander Ost wrote:
>> - the problems appear on two completely different h/w setups
>>   o desktop system with two PCI-based Realtek 8139 NICs
>
> Yep, getting rid of the Realtek NICs would be a start,

this must be one of those YMMV issues. I've been using Realtek8139 nics
for years because they've been cheap, fast and flawless for me. Others
say they suck. go figure

#4
02-16-2008, 05:33 AM
bards

beldar wrote:
> bards wrote:
>> Alexander Ost wrote:
>>> - the problems appear on two completely different h/w setups
>>>   o desktop system with two PCI-based Realtek 8139 NICs
>>
>> Yep, getting rid of the Realtek NICs would be a start,
>
> this must be one of those YMMV issues. I've been using Realtek8139 nics
> for years because they've been cheap, fast and flawless for me. Others
> say they suck. go figure


Indeed, I think they get a mention on the OpenBSD website too. Can't find
it but I'm sure I've read an 'avoid like plague' warning.

#5
02-16-2008, 05:33 AM
Ted Unangst

On Mon, 26 Apr 2004, bards wrote:

> > this must be one of those YMMV issues. I've been using Realtek8139 nics
> > for years because they've been cheap, fast and flawless for me. Others
> > say they suck. go figure
>
> Indeed, I think they get a mention on the OpenBSD website too. Can't find
> it but I'm sure I've read an 'avoid like plague' warning.


i know i've shoveled gigabytes and gigabytes through rl nics without any
trouble. i can't actually recall anybody with a network problem that was
confirmed to be a realtek's fault. i've seen lots of posts where people
with rl nics had network trouble, but never a followup to say swapping the
nic solved the problem. in short, people have heard they suck, and tell
other people they suck, but no one has experienced said suckage.

people key in on the word realtek every time it's in a post, and are
always willing to blame the card, all evidence to the contrary, which
rarely solves the problem and is just more runaround.

they may not be the best performers (which just means they use more cpu,
not that they can't handle 100mbit), but if your connection is pppoe,
the realtek is really, really not the bottleneck. lots of people are fond
of quoting the driver comment which says you need "a 400MHz PII or some
equally overmuscled CPU to drive it." that was seven years ago. cpus get
twice as fast every 1.5 years. you do the math.
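
Doing that math as a rough sketch, taking the figures in the paragraph above at face value (a doubling every 1.5 years over seven years, starting from the 400 MHz in the driver comment); the variable names are just for illustration:

```python
YEARS_ELAPSED = 7.0      # "that was seven years ago"
DOUBLING_PERIOD = 1.5    # "twice as fast every 1.5 years"
BASELINE_MHZ = 400       # CPU named in the quoted driver comment

# Number of doublings that fit into the elapsed time, and the resulting factor.
doublings = YEARS_ELAPSED / DOUBLING_PERIOD
speedup = 2 ** doublings

print(f"{doublings:.2f} doublings, about {speedup:.1f}x faster")
print(f"roughly {BASELINE_MHZ * speedup:.0f} MHz in {BASELINE_MHZ} MHz-era terms")
```

This only restates the rule of thumb from the post; it says nothing about the CPU in any particular machine.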

for the OP, output of vmstat -m and netstat -m may be informative. it
sounds like a memory leak. a lot of these were fixed with 3.4, and
another lot in 3.5.
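
One way to act on this, sketched in Python: save the output of netstat -m (or vmstat -m) to a file now and again a day or so later, then look for counters that only ever grow. The sketch below makes no assumption about the exact output format; it simply treats the first number on each labelled line as a counter. The function names and the two sample snapshots are made up for illustration and are not real netstat output.

```python
import re


def counters(snapshot):
    """Map each labelled line (digits masked as '#') to the integers on it."""
    found = {}
    for line in snapshot.splitlines():
        numbers = [int(n) for n in re.findall(r"\d+", line)]
        label = re.sub(r"\d+", "#", line).strip()
        # Skip lines with no numbers or no text to identify them by.
        if numbers and re.search(r"[A-Za-z]", label):
            found[label] = numbers
    return found


def grown(before, after):
    """Return labels whose first number increased, with (old, new) values."""
    old, new = counters(before), counters(after)
    return {
        label: (old[label][0], new[label][0])
        for label in new
        if label in old and new[label][0] > old[label][0]
    }


if __name__ == "__main__":
    # Hypothetical snapshots for illustration only, NOT real netstat output.
    before = "120 mbufs in use\n30 clusters in use\n"
    after = "4100 mbufs in use\n30 clusters in use\n"
    for label, (was, now) in grown(before, after).items():
        print(f"{label}: {was} -> {now}")
```

Lines that differ only in their numbers collapse onto the same label, so this suits labelled counters rather than bare numeric tables. A counter that keeps climbing across several snapshots and never falls back would fit the memory-leak theory.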



#6
02-16-2008, 05:33 AM
mips

On Sun, 25 Apr 2004 20:59:02 -0700
Ted Unangst <tedu@stanford.edu> wrote:

> i know i've shoveled gigabytes and gigabytes through rl nics without
> any trouble. i can't actually recall anybody with a network problem
> that was confirmed to be a realtek's fault. i've seen lots of posts
> where people with rl nics had network trouble, but never a followup
> to say swapping the nic solved the problem. in short, people have
> heard they suck, and tell other people they suck, but no one has
> experienced said suckage.


I had trouble with 8139s at work, and since they were swapped for
3Com gear the trouble has disappeared. I also know other people who had
trouble with Realtek cards, so no, it's not a myth.

mips

#7
02-16-2008, 05:33 AM
Ted Unangst

On Mon, 26 Apr 2004, mips wrote:

> I had trouble with 8139s at work, and since they were swapped for
> 3Com gear the trouble has disappeared. I also know other people who had
> trouble with Realtek cards, so no, it's not a myth.


at last, a confirmed report. what kind of trouble were you having?
i'd like to separate out the problems that come from the nic and the ones
that don't.

#8
02-16-2008, 05:33 AM
jpd

On 2004-04-26, mips <anti@spam.gov> wrote:
> On Sun, 25 Apr 2004 20:59:02 -0700
> Ted Unangst <tedu@stanford.edu> wrote:
>
>> i know i've shoveled gigabytes and gigabytes through rl nics without
>> any trouble. i can't actually recall anybody with a network problem
>> that was confirmed to be a realtek's fault. i've seen lots of posts
>> where people with rl nics had network trouble, but never a followup
>> to say swapping the nic solved the problem.


I've seen enough of those in the comp.unix.bsd.*.misc groups, and I do
recall those including followups.


>> in short, people have
>> heard they suck, and tell other people they suck, but no one has
>> experienced said suckage.
>
> I had trouble with 8139s at work, and since they were swapped for
> 3Com gear the trouble has disappeared. I also know other people who had
> trouble with Realtek cards, so no, it's not a myth.


Another datapoint; I had problems with realteks[1], like consistent
low throughput or random failures. Some cards seem to be ok (when not
pounded on), some just fail when you look at them.

Add to that recommendations from people running large enough shops to
see scale differences and some technical discussion with people In The
Know of stuff low-level enough to see realteks have severe technical
limitations built-in.

If your time is worth nothing, realteks are cheap. If it's not, better
look for something that doesn't break as easily. The price difference
is offset by less downtime, time wasted and pissed off people.[2]

I like NICs that you can trample with data and they still survive.
Realteks do the trample thing themselves, but not the surviving.


[1] Not always entirely the hardware's fault, sometimes the software
    sucks too. That the sucky software sucked somewhat less with
    different hardware means the hardware is at least partially
    responsible.

[2] Which is no excuse to overprice quality cards, but I digress.

--
j p d (at) d s b (dot) t u d e l f t (dot) n l .

#9
02-16-2008, 05:33 AM
Keith Matthews

Ted Unangst wrote:

> On Mon, 26 Apr 2004, bards wrote:
>
>> > this must be one of those YMMV issues. I've been using Realtek8139 nics
>> > for years because they've been cheap, fast and flawless for me. Others
>> > say they suck. go figure
>>
>> Indeed, I think they get a mention on the OpenBSD website too. Can't find
>> it but I'm sure I've read an 'avoid like plague' warning.
>
> i know i've shoveled gigabytes and gigabytes through rl nics without any
> trouble. i can't actually recall anybody with a network problem that was
> confirmed to be a realtek's fault. i've seen lots of posts where people
> with rl nics had network trouble, but never a followup to say swapping the
> nic solved the problem. in short, people have heard they suck, and tell
> other people they suck, but no one has experienced said suckage.


I've had a problem with a realtek, but it is of the unusual kind. Card
wouldn't talk at all on a 100Base-TX network. Realtek's own diagnostics
package showed problems too (can't remember exactly what - too much water
under the bridge).

Once I rebooted the host while the card was connected to a 10Base-T hub the
problem vanished, 10Base-T and 100Base-TX working fine.

Identical card in an identical host with same version of OS had no problems.

It was an early chip though. The chipset seems to have changed slightly, and
it would be interesting to know whether the problems relate to one particular
version or not.

Perhaps manufacturing issues are the cause and chips from different lines in
the fab behave differently.

#10
02-16-2008, 05:33 AM
Alexander Ost

Ted Unangst <tedu@stanford.edu> writes:

> On Mon, 26 Apr 2004, bards wrote:
>
> > > this must be one of those YMMV issues. I've been using Realtek8139 nics
> > > for years because they've been cheap, fast and flawless for me. Others
> > > say they suck. go figure
> >
> > Indeed, I think they get a mention on the OpenBSD website too. Can't find
> > it but I'm sure I've read an 'avoid like plague' warning.
>
> i know i've shoveled gigabytes and gigabytes through rl nics without any
> trouble. i can't actually recall anybody with a network problem that was
> confirmed to be a realtek's fault. i've seen lots of posts where people
> with rl nics had network trouble, but never a followup to say swapping the
> nic solved the problem. in short, people have heard they suck, and tell
> other people they suck, but no one has experienced said suckage.


That's my impression as well. As my original problems also appear on a
laptop with a 3Com card connected to tun0, I'm reluctant to blame the
Realteks. Also, my setups _were_ running very reliably for weeks and
months on 3.2, using Realteks (but not using altq then).

> for the OP, output of vmstat -m and netstat -m may be informative. it
> sounds like a memory leak. a lot of these were fixed with 3.4, and
> another lot in 3.5.


Thanks for the hint, I'll check the outputs. And 3.5 is on my list as
well.

/alex