Problem with system() calls in a multithreaded program on HPUX 11 (HP-UX Operating System forums, Unix Operating Systems category)
Hello,

I am porting a multithreaded program from Solaris to HPUX 11, in which threads make calls to the system() function. The program creates a number of threads and runs each of them a specified number of times. Each thread performs some task and creates a trace file, then verifies that trace file against a standard one to check whether the run was successful. The number of threads is variable.

Problem:
========
As I increase the number of threads, the program hangs while trying to run one of the system() calls. If I remove the system() calls altogether, the program runs fine. Below is an explanation of the problem, followed by the code. The program uses pthreads and runs fine on Solaris.

Some variable names used in the explanation appear in the code given after it.

Explanation:
============
If I increase the value of noOfThreads to 3, 4 and so on, the program hangs at around 6 or 7 threads. When the problem occurs, two or three defunct processes are created. I ran "ps -f -u" and the output was something like this (mtreg is the name of the program):

-bash-2.05b$ ps -f -u mkumar
     UID   PID  PPID  C    STIME TTY       TIME COMMAND
  mkumar  1726  1190  0 00:06:12 pts/ta    0:10 mtreg
  mkumar  1190  1189  0 23:04:02 pts/ta    0:01 -bash
  mkumar  1731  1726  0 00:06:20 pts/ta    0:00 <defunct>
  mkumar  1730  1726  2 00:06:20 pts/ta    0:00 <defunct>
  mkumar  1743     0  0 00:06:20 pts/ta    0:00 mtreg
  mkumar  1741  1726  0 00:06:21 pts/ta    0:00 sh -c perl strip.pl /export/home/configdev/tmp/FAAa01726mod0
  mkumar  1742  1741  0 00:06:21 pts/ta    0:00 perl strip.pl /export/home/configdev/tmp/FAAa01726mod0456a.m
  mkumar  1751  1190  5 00:07:48 pts/ta    0:00 ps -f -u mkumar

Before hanging, the output at the console was:
================================================================================
Running perl strip.pl /export/home/configdev/tmp/EAAa07614mod0456a.myt
Running perl strip.pl /export/home/configdev/tmp/DAAa07614mod0456a.myt
Running perl strip.pl /export/home/configdev/tmp/AAAa07614mod0456a.myt
Running perl strip.pl /export/home/configdev/tmp/CAAa07614mod0456a.myt
Finished running: perl strip.pl /export/home/configdev/tmp/EAAa07614mod0456a.myt
Running diff -w mod0456a.trc /export/home/configdev/tmp/EAAa07614mod0456a.myt > /export/home/configdev/tmp/EAAa07614mod0456a.myt.diff
Finished running diff -w mod0456a.trc /export/home/configdev/tmp/EAAa07614mod0456a.myt > /export/home/configdev/tmp/EAAa07614mod0456a.myt.diff
Running perl strip.pl /export/home/configdev/tmp/BAAa07614mod0456a.myt
Finished running: perl strip.pl /export/home/configdev/tmp/DAAa07614mod0456a.myt
Running diff -w mod0456a.trc /export/home/configdev/tmp/DAAa07614mod0456a.myt > /export/home/configdev/tmp/DAAa07614mod0456a.myt.diff
Finished running: perl strip.pl /export/home/configdev/tmp/AAAa07614mod0456a.myt
Running perl strip.pl /export/home/configdev/tmp/FAAa01726mod0456a.myt
Running diff -w mod0456a.trc /export/home/configdev/tmp/AAAa07614mod0456a.myt > /export/home/configdev/tmp/AAAa07614mod0456a.myt.diff
Finished running diff -w mod0456a.trc /export/home/configdev/tmp/DAAa07614mod0456a.myt > /export/home/configdev/tmp/DAAa07614mod0456a.myt.diff
Running diff -w mod0456a.trc /export/home/configdev/tmp/CAAa07614mod0456a.myt > /export/home/configdev/tmp/CAAa07614mod0456a.myt.diff
================================================================================

Some things that I observed:
1. I started only one mtreg process (PID 1726). But when the program hung, there was one more mtreg process, with PPID 0. It was idle.
2. Each time the program hangs, there are one or more defunct processes.
3. I am unable to kill the program once it hangs, and the system has to be rebooted.
4. The number of threads at which the program hangs is not fixed. It can hang at 5, 6, 7 or 8 threads. It even hung once with only 4 threads.
5. Although the last statement printed is Running "diff", the diff has not yet started.
6. I tried an experiment in which I removed all the system() function calls and instead placed fclose(fopen(diffFileName, "w")), which just creates the file without doing anything. This time I was able to run the program even with 10 threads each doing 10 iterations. It seems that the program might run fine for any number of threads (I checked up to 15 threads).

================================================================================

CODE: (The code is representative of the whole code. It may not be compilable.)

#include <pthread.h>
#include <iostream.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/stat.h>

#define noOfThreads 1
#define noOfIterations 1

char outFileName[512];
char standardTraceFile[512];

void * threadStartRoutine(void* p);
int doOneIterationOfThread();
int verifyTrace(char* standardTraceFile, char* newTraceFile);

/*
 * Creates a number of threads and runs them. Waits for their completion and then exits.
 */
void createThreadsAndRun()
{
    pthread_t threadList[noOfThreads];
    int i;
    for(i=0; i<noOfThreads; i++)
    {
        pthread_attr_t attr;
        pthread_attr_init(&attr);
        pthread_attr_setscope(&attr, PTHREAD_SCOPE_SYSTEM );
        pthread_create(&threadList[i],&attr, threadStartRoutine, nextReq());
        cerr << "start thread " << i << endl;
    }
    for (i = 0; i < noOfThreads; i++)
    {
        pthread_join(threadList[i], NULL);
        cerr << "finish thread " << i << endl;
    }
}

int main(int argc, char* argv[])
{
    //The arguments are not shown, as the functions are just representative of the function they are intended to perform
    setOutFileName(); //depending on argc and argv set the value of outFileName; outFileName is the filename for trace file
    setStandardTraceFileName(); //obtained from one of the arguments. sets the variable standard trace file name
    createThreadsAndRun();
    return 0;
}

void * threadStartRoutine(void* p)
{
    //newTraceFile is the trace file name used by this thread (declaration not shown)
    char* prefix = tempnam(NULL,"");
    sprintf(newTraceFile, "%s%s", prefix, outFileName); // set the tracefile name
    for(int i = 0; i<noOfIterations; i++)
    {
        //do some initializations
        if(!doOneIterationOfThread())
        {
            cerr<<"Run failed: "<<newTraceFile<<"for run "<<i<<endl;
        }
        else
        {
            cerr<<"Run successful: "<<newTraceFile<<"for run "<<i<<endl;
        }
    }
    return NULL;
}

int doOneIterationOfThread()
{
    doCoreWork(); //writes trace into the tracefile with actual values
    if(!verifyTrace(standardTraceFile, newTraceFile)) //verifyTrace returns true when the traces match
    {
        return false;
    }
    else
    {
        return true;
    }
}

/*
 * tracefile names are with full path
 */
int verifyTrace(char* standardTraceFile, char* newTraceFile)
{
    char cmd[512];
    char diffFileName[512];
    sprintf(diffFileName, "%s.diff", newTraceFile);

    sprintf(cmd, "perl strip.pl %s", newTraceFile);
    cerr<<"Running "<<cmd<<endl;
    system(cmd);
    cerr<<"Finished running: "<<cmd<<endl;

    sprintf(cmd, "diff -w %s %s > %s", standardTraceFile, newTraceFile, diffFileName);
    cerr<<"Running "<<cmd<<endl;
    system(cmd);
    cerr<<"Finished running: "<<cmd<<endl;

    struct stat buf;
    stat(diffFileName, &buf);

    unlink(diffFileName);

    if(buf.st_size == 0)
        return true;
    else
        return false;
}

/*
 * Note ******************
 * "perl strip.pl <new-trace-file-name>" brings the file into a normalized form. That is, it
 * changes the run-dependent values in the trace file, like time stamps and some other info, to a
 * predetermined normal value (for example, time stamps may be converted to 0x0). This makes the
 * new trace file and the standard trace file comparable. strip.pl (a perl script) performs this
 * task by substituting regular expressions.
 * Note Ends *************
 */

================================================================================

Can anyone please tell me why system() calls are causing a problem on HPUX 11 whereas the same thing runs fine on Solaris? It would be really great if you could suggest a possible solution.

Thanks and regards,

Mahesh Kumar
| |||
On 11 Feb 2005 03:46:23 -0800, Mahesh Kumar <maheshkumarjha@gmail.com> wrote:

> Hello,
>
> I am porting a multithreaded program to HPUX 11 from Solaris, in
> which threads make calls to system() functions. The program basically
> creates a number of threads and runs them specified number of times.
> Each thread performs some task and creates a trace file. The threads
> then verify the trace file against standard ones to check whether the
> run was successful or not. The number of threads is variable.
>
> Problem:
> ========
> As I increase the number of threads, the program hangs, while trying
> to run some system() call. If I remove the system() calls altogether,
> the program runs fine. Below is an explanation of the problem followed
> by the code. The program uses pthreads and runs fine on Solaris.
>
>
> There may be some variable names used in explanation. These names
> appear in code given after that.
>
> Explanation:
> ============
> If I increase the value of noOfThreads to say 3, 4 and so on. The
> program hangs say around when noOfThreads is 6 or 7. Now as the
> problem occurs, two three defunct processes are created. I ran "ps -f
> -u" command and output was something like this (mtreg is the name of
> above program)
> -bash-2.05b$ ps -f -u mkumar
> UID PID PPID C STIME TTY TIME COMMAND
> mkumar 1726 1190 0 00:06:12 pts/ta 0:10 mtreg
> mkumar 1190 1189 0 23:04:02 pts/ta 0:01 -bash
> mkumar 1731 1726 0 00:06:20 pts/ta 0:00 <defunct>
> mkumar 1730 1726 2 00:06:20 pts/ta 0:00 <defunct>
> mkumar 1743 0 0 00:06:20 pts/ta 0:00 mtreg
> mkumar 1741 1726 0 00:06:21 pts/ta 0:00 sh -c perl strip.pl
> /export/home/configdev/tmp/FAAa01726mod0
> mkumar 1742 1741 0 00:06:21 pts/ta 0:00 perl strip.pl
> /export/home/configdev/tmp/FAAa01726mod0456a.m
> mkumar 1751 1190 5 00:07:48 pts/ta 0:00 ps -f -u mkumar
>
[...]
>
> Can anyone please tell me why system() calls are causing problem in
> HPUX 11 whereas the same thing runs fine on Solaris? It would be
> really great if you can suggest a possible solution?
>

Probably the SIGCHLD signal handling got messed up. You have defunct programs that have finished but have not had their exit status collected yet. Even though system() is supposed to be thread safe, it's way too sensitive to signal disposition to be used in anything but a single-threaded program.

Change your program to be single threaded and use fork(), exec(), and wait(). See the unix programming books by Stevens on how to do it.

Using system() from threads was a major violation of the KISS rule and you should expect to have problems when that happens. And you should expect that we aren't going to try to make programs that are much more complicated than they should be work.

--
Joe Seigh
Lock-free synchronization primitives
http://atomic-ptr-plus.sourceforge.net/
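A minimal sketch of the fork(), exec(), wait() sequence suggested above, assuming a single-threaded caller. The helper name run_and_wait is hypothetical, not something from the thread. It skips the shell and passes an argument vector straight to execvp(), so redirections such as "> file" would have to be done by hand in the child:

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run argv[0] with its arguments, wait for it,
 * and return its exit code. Returns -1 if fork() or waitpid() fails
 * or if the child did not exit normally (e.g. killed by a signal). */
int run_and_wait(char *const argv[])
{
    pid_t pid = fork();
    if (pid == -1)
        return -1;                  /* fork failed */

    if (pid == 0) {
        execvp(argv[0], argv);      /* returns only on error */
        _exit(127);                 /* same convention the shell uses */
    }

    int status;
    while (waitpid(pid, &status, 0) == -1) {
        if (errno != EINTR)
            return -1;              /* a real error, not an interrupted call */
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

A call for the first step of verifyTrace() might then pass an array such as { "perl", "strip.pl", newTraceFile, NULL }.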
| |||
maheshkumarjha@gmail.com (Mahesh Kumar) writes:

> I am porting a multithreaded program to HPUX 11 from Solaris, in
> which threads make calls to system() functions.

This is an extremely bad idea (TM).

Writing multithreaded programs that fork() correctly requires careful use of pthread_atfork handlers for every possible lock used, both in your code and in every library you link against.

> The program basically
> creates a number of threads and runs them specified number of times.

I don't see why you couldn't just fork() N copies of your program. Your threads don't appear to exchange any info while they run ...

> Now some things that I observed are:
...

These are all consistent with a race condition, where some mutex is held across the fork().

> 2. Each time the program hangs, there are one or more defunct
> processes.

What you want to do is attach a debugger to the parent of 'defunct', and see why that parent is not wait()ing for the zombie. The parent is likely deadlocked somewhere.

Cheers,
--
In order to understand recursion you must first understand recursion.
Remove /-nsp/ for email.
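The "fork() N copies" idea above could be sketched roughly as follows, assuming the per-thread work can be expressed as a function that reports success through its return value. The names run_workers and demo_worker are hypothetical and only for illustration:

```c
#include <errno.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for the real per-worker job:
 * returns 0 (success) for even indexes, 1 (failure) for odd ones. */
int demo_worker(int index)
{
    return index % 2;
}

/* Run worker(0..n-1) in n separate child processes and return how
 * many of them exited with status 0. */
int run_workers(int n, int (*worker)(int))
{
    int started = 0;
    for (int i = 0; i < n; i++) {
        fflush(NULL);                   /* avoid duplicated stdio buffers */
        pid_t pid = fork();
        if (pid == -1) {
            perror("fork");
            break;
        }
        if (pid == 0) {
            int rc = worker(i);         /* child: do the work */
            fflush(NULL);
            _exit(rc);                  /* report the result via exit status */
        }
        started++;
    }

    int ok = 0;
    while (started > 0) {
        int status;
        if (wait(&status) == -1) {
            if (errno == EINTR)
                continue;               /* interrupted, try again */
            break;
        }
        started--;
        if (WIFEXITED(status) && WEXITSTATUS(status) == 0)
            ok++;
    }
    return ok;
}
```

Because each worker is its own single-threaded process, calling system() inside the worker no longer mixes threads with signal dispositions.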
| |||
On Fri, 11 Feb 2005 07:48:37 -0500 "Joe Seigh" <jseigh_01@xemaps.com> wrote:

> On 11 Feb 2005 03:46:23 -0800, Mahesh Kumar <maheshkumarjha@gmail.com>
> wrote:
[...]
> > Explanation:
> > ============
> > If I increase the value of noOfThreads to say 3, 4 and so on. The
> > program hangs say around when noOfThreads is 6 or 7. Now as the
> > problem occurs, two three defunct processes are created. I ran "ps
> > -f
> > -u" command and output was something like this (mtreg is the name of
> > above program)
> > -bash-2.05b$ ps -f -u mkumar
> > UID PID PPID C STIME TTY TIME COMMAND
> > mkumar 1726 1190 0 00:06:12 pts/ta 0:10 mtreg
> > mkumar 1190 1189 0 23:04:02 pts/ta 0:01 -bash
> > mkumar 1731 1726 0 00:06:20 pts/ta 0:00 <defunct>
> > mkumar 1730 1726 2 00:06:20 pts/ta 0:00 <defunct>
> > mkumar 1743 0 0 00:06:20 pts/ta 0:00 mtreg
> > mkumar 1741 1726 0 00:06:21 pts/ta 0:00 sh -c perl strip.pl
> > /export/home/configdev/tmp/FAAa01726mod0
> > mkumar 1742 1741 0 00:06:21 pts/ta 0:00 perl strip.pl
> > /export/home/configdev/tmp/FAAa01726mod0456a.m
> > mkumar 1751 1190 5 00:07:48 pts/ta 0:00 ps -f -u mkumar
> >
> [...]
> >
> > Can anyone please tell me why system() calls are causing problem in
> > HPUX 11 whereas the same thing runs fine on Solaris? It would be
> > really great if you can suggest a possible solution?
> >
>
> Probably the SIGCHLD signal handing got messed up. You have defunct
> programs that have finished but have not had their exit status
> collected
> yet. Even though system() is supposed to be thread safe it's way too
> sensitive to signal disposition to be using in anything but a a single
> threaded program.

The Solaris 9 man page says that system() isn't thread-safe:

USAGE
     The system() function manipulates the signal handlers for
     SIGINT, SIGQUIT, and SIGCHLD. For this reason it is not safe
     to call system() in a multithreaded process. Concurrent calls
     to system() will interfere destructively with the disposition
     of these signals, even if they are not manipulated by other
     threads in the application. See popen(3C) for a replacement
     for system() that is thread-safe.

So the fact that the program runs on Solaris is pure luck.

> Change your program to be single threaded and use fork(), exec(),
> and wait(). See the unix programming books by Stevens on how to
> do it.

The correct way to go about this is to use popen(). There's no need to be so drastic and forgo multi-threading altogether, but the OP should ask himself if it's really required to achieve the desired result.

> Using system() from threads was a major violation of the KISS rule
> and you should expect to have problems when that happens. And you
> should expect that we aren't going to try to make programs, that are
> much more complicated than they should be, work.

Well, that depends on what one considers to be "simple". At first sight, "system()" is less complex than "popen()". system() abuse lures the inexperienced programmer.

--
Stefaan
--
As complexity rises, precise statements lose meaning,
and meaningful statements lose precision. -- Lotfi Zadeh
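As a rough illustration of the popen() route recommended above: the diff step in the original code only needs to know whether diff printed anything, so the temporary .diff file and the stat() call could be replaced by reading the command's output through a pipe. The helper name command_has_output is hypothetical, and this is a sketch rather than a tested HP-UX fix:

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L   /* ask for popen()/pclose() declarations */
#endif
#include <stdio.h>

/* Hypothetical helper: run cmd through the shell and report whether it
 * wrote anything to standard output.
 * Returns 1 if there was output, 0 if there was none, -1 on failure. */
int command_has_output(const char *cmd)
{
    FILE *fp = popen(cmd, "r");
    if (fp == NULL)
        return -1;                  /* could not start the command */

    char buf[4096];
    int saw_output = 0;
    /* Drain the pipe completely so the child never blocks on a full pipe. */
    while (fread(buf, 1, sizeof buf, fp) > 0)
        saw_output = 1;

    if (pclose(fp) == -1)
        return -1;                  /* could not collect the exit status */
    return saw_output;
}
```

With this, a result of 0 from something like command_has_output("diff -w standard.trc new.myt") would mean the traces match (the file names here are placeholders).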
| |||
In article <m3mzub6rjq.fsf@salmon.parasoft.com>,
Paul Pluzhnikov <ppluzhnikov-nsp@charter.net> wrote:
% maheshkumarjha@gmail.com (Mahesh Kumar) writes:
%
% > I am porting a multithreaded program to HPUX 11 from Solaris, in
% > which threads make calls to system() functions.
%
% This is extremely bad idea (TM).
%
% Writing multithreaded programs that fork() correctly requires
% careful use of pthread_atfork handlers for every possible lock used,
% both in your code and in every library you link against.

I'd say that atfork handlers are largely useless. Whatever you do with mutexes, for instance, is irrelevant because the child process isn't allowed to lock or unlock them anyway. Execing a new process image is allowed, though. system() is problematic, but the problem is with waiting for the child process to finish, rather than with forking and execing.

--
Patrick TJ McPhee
North York Canada
ptjm@interlog.com
| |||
ptjm@interlog.com (Patrick TJ McPhee) wrote in message news:<37a1suF55hncsU2@uni-berlin.de>...
> In article <m3mzub6rjq.fsf@salmon.parasoft.com>,
> Paul Pluzhnikov <ppluzhnikov-nsp@charter.net> wrote:
> % maheshkumarjha@gmail.com (Mahesh Kumar) writes:
> %
> % > I am porting a multithreaded program to HPUX 11 from Solaris, in
> % > which threads make calls to system() functions.
> %
> % This is extremely bad idea (TM).
> %
> % Writing multithreaded programs that fork() correctly requires
> % careful use of pthread_atfork handlers for every possible lock used,
> % both in your code and in every library you link against.
>
> I'd say that atfork handlers are largely useless. Whatever you do with
> mutexes, for instance, is irrelevant because the child process isn't
> allowed to lock or unlock them anyway. execing a new process image is
> allowed, though. system() is problematic, but the problem is with
> waiting for the child process to finish, rather than with forking and
> execing.

Well, just to make sure that there was no problem with fork and wait, I tried replacing the system() function call with the following:

int my_system (char *command)
{
    int status;
    if (command == 0)
        return 1;
    pid_t pid = fork();
    if (pid == -1)
        return -1;
    if (pid == 0)
    {
        char *argv[4];
        argv[0] = "sh";
        argv[1] = "-c";
        argv[2] = command;
        argv[3] = NULL;
        execve("/bin/sh", argv, environ);
        exit(127);
    }
    do
    {
        if (waitpid(pid, &status, 0) == -1)
        {
            if (errno != EINTR)
                return -1;
        }
        else
            return status;
    } while(1);
}

But it still hung when I tried to run it with 6 threads. (As I already said, the number of threads at which it hangs is not consistent.) The only difference is that there were no <defunct> processes this time.

Can anyone suggest a possible reason why it still hangs?

Thanks and regards.

Mahesh Kumar
| |||
On 13 Feb 2005 23:09:37 -0800 maheshkumarjha@gmail.com (Mahesh Kumar) wrote:

> Can anyone suggest a possible reason for still hanging?

Remember that the advice was to use fork()/exec() _AND_ to go single-threaded.

The fork() functions suspend all threads in the process before they run. Threads that are already running and are in an uninterruptible wait inside the kernel cause the fork() to pause until they exit the uninterruptible wait. As a result, fork(), and your whole process, appears to be hung.

Using fork() in a multithreaded process can result in interrupted blocking system calls - you should be prepared to handle EINTR errors. fork() per se isn't thread-safe, and requires TLC to mesh well with threads.

If you really have to continue with your multi-threaded approach, read the popen(3) man page carefully. Notice where it says:

| popen() and pclose() are thread-safe. These interfaces are not
| async-cancel-safe. A cancellation point may occur when a thread is
| executing popen() or pclose().

and consider whether your objectives cannot be achieved through a less complex approach.

Take care,

--
Stefaan
--
As complexity rises, precise statements lose meaning,
and meaningful statements lose precision. -- Lotfi Zadeh
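Being "prepared to handle EINTR errors", as mentioned above, usually means retrying the interrupted call. A small sketch for read(), with read_retry as a hypothetical wrapper name; the same loop shape applies to other blocking calls such as write() or waitpid():

```c
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical wrapper: behaves like read(), but restarts the call when
 * it was interrupted by a signal before any data was transferred. */
ssize_t read_retry(int fd, void *buf, size_t len)
{
    ssize_t n;
    do {
        n = read(fd, buf, len);
    } while (n == -1 && errno == EINTR);
    return n;   /* bytes read, 0 at end of file, or -1 on a real error */
}
```

This only covers the EINTR part of the advice; it does not address whether fork() itself stalls on HP-UX as described.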
| |||
The briefest answer to your question is that system() is not MT-safe, and is almost guaranteed to break if invoked from a multi-threaded program.

Replace 'system(cmd)' by 'pclose(popen(cmd, "w"))' and things should work as long as 'cmd' is reasonably well behaved.

dk

"Mahesh Kumar" <maheshkumarjha@gmail.com> wrote in message news:5bf55c06.0502110346.e055886@posting.google.com...
> Hello,
>
> I am porting a multithreaded program to HPUX 11 from Solaris, in
> which threads make calls to system() functions.
[...]
| |||
In article <5bf55c06.0502132309.4dccb49a@posting.google.com>,
Mahesh Kumar <maheshkumarjha@gmail.com> wrote:

[...]

% if (waitpid(pid, &status, 0) == -1) {

If the implementation of waitpid depends on SIGCHLD being delivered to the waiting thread, then you could have problems because the signal could be delivered to any thread. I'm not sure if this is what's happening here, but it's worth considering.

% said number of threads is not consistent.) The only difference being
% that there were no <defunct> processes this time.

That suggests all the waits completed. Curious.

--
Patrick TJ McPhee
North York Canada
ptjm@interlog.com
| ||||
In article <4211a1e8@news.meer.net>,
"Dan Koren" <dankoren@yahoo.com> writes:
>
> Replace 'system(cmd)' by 'pclose(popen(cmd, "w"))' and
> things should work as long as 'cmd' is reasonably well
> behaved.

Note that this will crash if popen() fails, since passing NULL to pclose() will cause a segfault. I see this is used as a suggestion in the popen man page on Solaris (in fact both examples they give are buggy); they should really fix this so that naive users don't write broken programs.

Otto
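A sketch of the pclose(popen(cmd, "w")) replacement with the NULL check that Otto points out is missing. The function name run_command is hypothetical; the return value follows pclose(), i.e. a wait status to be decoded with the macros from <sys/wait.h>:

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L   /* ask for popen()/pclose() declarations */
#endif
#include <stdio.h>
#include <sys/wait.h>

/* Hypothetical drop-in for system(cmd): returns the wait status of the
 * shell that ran cmd, or -1 if the command could not be started or its
 * status could not be collected. */
int run_command(const char *cmd)
{
    FILE *fp = popen(cmd, "w");
    if (fp == NULL)
        return -1;          /* never hand NULL to pclose() */
    return pclose(fp);      /* waits for the command to finish */
}
```

A caller would check the result the same way as with system(), for example WIFEXITED(st) && WEXITSTATUS(st) == 0 for success.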