Unix Technical Forum

Non reproducible NIS failure

#1
01-17-2008, 06:13 PM
Cyril Bouthors

I'm having non-reproducible problems with NIS. From time to time I
get this error when looking up any user info (uid, homedir, ...),
as in the command "chown www-data file":

do_ypcall: clnt_call: RPC: Unable to send; errno = Operation not permitted

and chown fails, although the same command works 99% of the time. It seems
to happen more often when the servers are loaded, for example during
cron.daily.

I'm using one NIS master with ~4000 users and 15 NIS slaves. This
error happens on any of them.

I suspect some limit is being reached under certain circumstances,
such as a maximum number of concurrent connections.
Google returns only 4 pages about this error message on the web and
none on Usenet, and none of them has an answer. Do you have any idea
where I should look? Where should I ask about this?

Merry Xmas with NIS.
--
Cyril Bouthors

#2
01-17-2008, 06:13 PM
P.T. Breuer
Re: Non reproducible NIS failure

Cyril Bouthors <cyril@bouthors.org> wrote:
> I'm having non reproducible problems with NIS, from time to time I
> have this error when trying to get any user info (uid, homedir, ...)
> like in the command "chown www-data file":


> do_ypcall: clnt_call: RPC: Unable to send; errno = Operation not permitted


Probably a server failure due to overloading. What's your fanout? I try
never to exceed 20:1.

Or it could be local, and ypbind that's overloaded. Can happen with bad
network cable. Needs precise debugging to know.

> and chown fails but the same command works 99% of the time. It seems
> to happen more often when the servers are loaded, for example during
> cron.daily.


Ah. Yes.

> I'm using a master NIS master with ~4000 users and 15 NIS slaves. This
> error happens on any of them.


And how fast is the server? Anyway, you mean 15 slave servers? Not 15
clients?

> I'm thinking of a limit reached under some circumstances like the
> maximum number of concurrent connections or something like that.
> Google only returns 4 pages about this error message on the web and
> none on usenet, no answer. Do you have any idea where I should look
> at? Where should I ask about that?


You should talk to the author.

Peter

#3
01-17-2008, 06:13 PM
Cyril Bouthors
Re: Non reproducible NIS failure

ptb@oboe.it.uc3m.es (P.T. Breuer) writes:
> Probably a server failure due to overloading.


It should delay the answer but not fail. Is there a way to change the
timeout?

> What's your fanout? I try never to exceed 20:1.


I'm not sure I understand the question, but my load is around 0.5 most
of the time and between 1 and 4 during cron.daily (processing
gigabytes of statistics).

> it could be local, and ypbind that's overloaded.


Is there a way to know that? Number of queries, ... ?

> Can happen with bad network cable. Needs precise debugging to know.


The network is not used since the server is a slave. By the way, there
are no problems with the network cables or switches.

> And how fast is the server?


Both master and slaves are fast.

> Anyway, you mean 15 slave servers? Not 15 clients?


I used to have 15 clients and 1 master; I've switched to 15 slaves and
1 master for performance reasons.

> You should talk to the author.


I've already asked Thorsten Kukuk.
--
Cyril Bouthors

#4
01-17-2008, 06:13 PM
P.T. Breuer
Re: Non reproducible NIS failure

Cyril Bouthors <cyril@bouthors.org> wrote:
> ptb@oboe.it.uc3m.es (P.T. Breuer) writes:
> > Probably a server failure due to overloading.


> It should delay the answer but not fail, is there a way to change the
> timeout?


Oh - it'll fail. You can recompile.

> > What's your fanout? I try never to exceed 20:1.


> I'm not sure to understand the question but my load is around 0.5 most


No, the number of clients accessing each server; "fanout".

> > it could be local, and ypbind that's overloaded.


> Is there a way to know that? Number of queries, ... ?


Debug.

> > Can happen with bad network cable. Needs precise debugging to know.


> Network is not used since the server is a slave. By the way, there's


Are you saying that the server is only used locally? By people on the
localhost machine?

> no problems with the network cables nor switches.


> > And how fast is the server?


> Both master and slaves are fast.


> > Anyway, you mean 15 slave servers? Not 15 clients?


> I used to have 15 clients and 1 master, I've switched to 15 slaves and
> 1 master for performance issues.


And how many clients? I think that perhaps you are describing a
situation in which every client is also a slave server which is accessed
only from localhost?

In that case the (slave) server will go down for a while when you do
the ypxfr of the maps, or the flush of the server caches. (there are
many different ways of transferring maps, but ypxfr via pull is quite
common).

That may cause the client (ypbind) to switch to a broadcast mode
temporarily.
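
To see whether ypbind really does rebind around the time of the failures,
something like the loop below could log every change in the bound server.
This is only a sketch: watch_binding is a made-up helper name, and it
assumes ypwhich prints the bound server and exits non-zero when ypbind has
no binding.

```shell
#!/bin/sh
# Print a timestamped line whenever the bound NIS server changes.
# Usage: watch_binding COUNT DELAY_SECONDS [CMD]
# CMD defaults to ypwhich; it is a parameter only so the loop can be
# tried out without a running NIS setup.
watch_binding() {
    count=$1 delay=$2 cmd=${3:-ypwhich}
    last=""
    i=0
    while [ "$i" -lt "$count" ]; do
        # $cmd is left unquoted on purpose so "cmd arg" splits into words
        current=$($cmd 2>&1) || current="unbound: $current"
        if [ "$current" != "$last" ]; then
            echo "$(date '+%Y-%m-%d %H:%M:%S') $current"
            last=$current
        fi
        i=$((i + 1))
        [ "$i" -lt "$count" ] && sleep "$delay"
    done
    return 0
}
```

For example, "watch_binding 86400 1 >> /var/tmp/nis-binding.log" would
sample once per second for a day (the log path is arbitrary). Any lines
logged near a failed chown would point at a rebind.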

> > You should talk to the author.


> I've already asked Thorsten Kukuk.


And what did he say?

Peter

#5
01-17-2008, 06:14 PM
Cyril Bouthors
Re: Non reproducible NIS failure

ptb@oboe.it.uc3m.es (P.T. Breuer) writes:

> No, the number of clients accessing each server; "fanout".


1 client = 1 server

> Are you saying that the server is only used locally? By people on
> the localhost machine?


Exactly.

> you are describing a situation in which every client is also a slave
> server which is accessed only from localhost?


Yes.

> In that case the (slave) server will go down for a while when you do
> the ypxfr


I've already thought about that, but in this architecture ypxfr runs
only when new users are added, and those runs do not coincide with the
"Operation not permitted" error messages.

> And what did he say?


No answer at this time but it's quite normal during Xmas.
--
Cyril Bouthors

#6
01-17-2008, 06:14 PM
P.T. Breuer
Re: Non reproducible NIS failure

Cyril Bouthors <cyril@jexiste.fr> wrote:
> ptb@oboe.it.uc3m.es (P.T. Breuer) writes:
> > you are describing a situation in which every client is also a slave
> > server which is accessed only from localhost?


> Yes.


> > In that case the (slave) server will go down for a while when you do
> > the ypxfr


> I've already thought about that but ypxfr in this architecture are
> only made when adding new users and does not coincide with "Operation


You sure? There should be scripts in cron like ypxfr_1perday.

> not permitted" error messages.


Peter

#7
01-17-2008, 06:14 PM
Nico Kadel-Garcia
Re: Non reproducible NIS failure


"Cyril Bouthors" <cyril@jexiste.fr> wrote in message
news:878ykzh7kp.fsf@wide.bouthors.org...
> ptb@oboe.it.uc3m.es (P.T. Breuer) writes:
>
> > No, the number of clients accessing each server; "fanout".

>
> 1 client = 1 server
>
> > Are you saying that the server is only used locally? By people on
> > the localhost machine?

>
> Exactly.
>
> > you are describing a situation in which every client is also a slave
> > server which is accessed only from localhost?

>
> Yes.
>
> > In that case the (slave) server will go down for a while when you do
> > the ypxfr

>
> I've already thought about that but ypxfr in this architecture are
> only made when adding new users and does not coincide with "Operation
> not permitted" error messages.


1: Look for cron jobs that tickle the ypxfr. Different OS's have different
hourly, daily, etc. cron jobs for NIS.

2: Consider using dual hosts in your /etc/yp.conf: Linux NIS can certainly
tolerate this, and it gives you a failover to use if one of them fails at
NIS slave service.
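
As a rough illustration of point 2, the snippet below prints a yp.conf
with two servers and counts the server entries in a given file. The domain
name example.nis, the host nis-master.example.com and both function names
are placeholders invented for this sketch; only the "domain ... server ..."
and "ypserver ..." line forms are taken from the Linux ypbind yp.conf
format.

```shell
#!/bin/sh
# Print a sample yp.conf listing the local slave and the master.
# Domain and host names are placeholders, not values from this thread.
sample_yp_conf() {
    cat <<'EOF'
# local slave and the master
domain example.nis server 127.0.0.1
domain example.nis server nis-master.example.com
EOF
}

# Count the server entries ("ypserver HOST" or "domain D server HOST")
# declared in the yp.conf-style file given as $1.
count_yp_servers() {
    grep -Ec '^[[:space:]]*(ypserver|domain[[:space:]]+[^[:space:]]+[[:space:]]+server)[[:space:]]' "$1"
}
```

Running count_yp_servers /etc/yp.conf on each host is a quick way to spot
machines that still list a single server. ypbind generally has to be
restarted before it picks up a changed yp.conf.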



#8
01-17-2008, 06:15 PM
Cyril Bouthors
Re: Non reproducible NIS failure

ptb@oboe.it.uc3m.es (P.T. Breuer) writes:
> You sure? There should be scripts in cron like ypxfr_1perday.


I've already checked:

# find /etc -type f | xargs grep ypxfr
/etc/init.d/nis: echo -n "ypxfrd "
/etc/init.d/nis: --exec ${NET}/rpc.ypxfrd
/etc/init.d/nis: --name rpc.ypxfrd
/etc/rpc:ypxfrd 100069
/etc/ypserv.securenets:# for NIS clients (and slave servers - ypxfrd uses this
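
The search above only covers /etc. A wider sweep could also look at user
crontabs and the NIS helper directory. The following is a rough sketch:
find_ypxfr_refs is a made-up helper, and the extra paths in the usage note
are guesses for a Debian-like layout, not something confirmed in this
thread.

```shell
#!/bin/sh
# List files that mention ypxfr under each directory given as an
# argument, silently skipping directories that do not exist.
find_ypxfr_refs() {
    for dir in "$@"; do
        [ -d "$dir" ] || continue
        # grep exits non-zero when nothing matches; that is not an error here
        grep -rl ypxfr "$dir" 2>/dev/null || true
    done
}
```

A possible invocation would be
"find_ypxfr_refs /etc /var/spool/cron /usr/lib/yp", run as root on both
the master and a slave.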
--
Cyril Bouthors

#9
01-17-2008, 06:15 PM
Cyril Bouthors
Re: Non reproducible NIS failure

"Nico Kadel-Garcia" <nkadel@comcast.net> writes:

> Look for cron jobs that tickle the ypxfr.


I've already checked, look at my other post.

> Consider using dual hosts in your /etc/yp.conf


I think this one is a good idea. I've changed yp.conf but since I
can't reproduce the bug easily, I have to wait a few hours/days to see
if cron scripts happen to fail again.
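
Rather than waiting for cron to trip over it, a small probe loop might
catch the failure sooner and give timestamps to compare against the load.
This is a sketch only: probe_user is a made-up name, and it assumes that
"getent passwd" goes through the same NSS/NIS lookup path as chown does.

```shell
#!/bin/sh
# Look a user up repeatedly and log each failure with a timestamp.
# Usage: probe_user NAME ATTEMPTS DELAY_SECONDS [LOOKUP_CMD]
# Failures go to stderr; the total failure count is printed on stdout.
probe_user() {
    name=$1 attempts=$2 delay=$3
    lookup=${4:-"getent passwd"}
    failures=0
    i=0
    while [ "$i" -lt "$attempts" ]; do
        # Capture only stderr of the lookup so any error text is kept.
        # $lookup is unquoted on purpose so "getent passwd" splits into words.
        if ! err=$($lookup "$name" 2>&1 >/dev/null); then
            failures=$((failures + 1))
            echo "$(date '+%Y-%m-%d %H:%M:%S') lookup of $name failed: $err" >&2
        fi
        i=$((i + 1))
        [ "$i" -lt "$attempts" ] && sleep "$delay"
    done
    echo "$failures"
}
```

For instance, "probe_user www-data 86400 1 2>> /var/tmp/nis-probe.log"
would try once per second for a day (the log path is arbitrary).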

Nico, Thank you.
--
Cyril Bouthors

#10
01-17-2008, 06:15 PM
Nico Kadel-Garcia
Re: Non reproducible NIS failure


"Cyril Bouthors" <cyril@jexiste.fr> wrote in message
news:87zndefctt.fsf@wide.bouthors.org...
> "Nico Kadel-Garcia" <nkadel@comcast.net> writes:
>
> > Look for cron jobs that tickle the ypxfr.

>
> I've already checked, look at my other post.
>
> > Consider using dual hosts in your /etc/yp.conf

>
> I think this one is a good idea. I've changed yp.conf but since I
> can't reproduce the bug easily, I have to wait few hours/days to see
> if cron scripts happen to fail again.


It was unclear to me that you checked on both clients and the server, but
you're entirely welcome.

Also, are your clients the same OS as your server? I've run into real "fun"
with distinct OS's ideas about NIS....


All times are GMT.