Tech Tip: Debugging Processd System Errors with Strace on Linux 

Sep 01, 2015 11:13 AM

Note: This document is intended for an audience that is intimately familiar with Spectrum architecture and debugging.  If you have questions, or if you are unsure about the impact of any of these commands, please contact Spectrum support first.

 

This document is applicable to all versions of Spectrum on Linux.

 

 

processd is responsible for launching and managing interactions between Spectrum processes.  In some situations (an online backup failure, for example) it helps to run processd in debug mode.  You can toggle debug mode by sending processd a signal with the kill command.  To start debug:

 

kill -TRAP <processd PID>

 

You will see this in the processd_log:

 

Sep 08 10:35:26 DEBUG START

 

 

 

To stop debug, send the same signal:

 

kill -TRAP <processd PID>

 

You will see this in the processd_log:

 

Sep 08 10:36:33 DEBUG END

 

 

Although kill is best known for terminating processes, it is really a general-purpose tool for sending signals to a process.  processd is instrumented to handle the TRAP signal (SIGTRAP), which toggles debug output on or off while leaving the process running.
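The toggle behaviour can be illustrated with a small, self-contained shell sketch.  This is only a toy stand-in, not processd itself: the toggle_debug function, the debug variable, and the temporary log file are all hypothetical names chosen for the illustration.  The script installs a handler for TRAP, signals itself twice, and prints the resulting log.

```shell
#!/bin/sh
# Toy stand-in for a daemon that toggles debug logging on SIGTRAP.
# This is NOT processd; it only imitates the toggle behaviour described above.

log=$(mktemp)   # hypothetical log file standing in for processd_log
debug=0         # 0 = debug off, 1 = debug on

toggle_debug() {
    if [ "$debug" -eq 0 ]; then
        debug=1
        echo "$(date '+%b %d %H:%M:%S') DEBUG START" >> "$log"
    else
        debug=0
        echo "$(date '+%b %d %H:%M:%S') DEBUG END" >> "$log"
    fi
}

# Install the handler; without it, SIGTRAP would terminate the shell.
trap toggle_debug TRAP

kill -TRAP "$$"   # first signal turns debug on
kill -TRAP "$$"   # the same signal turns it off again

cat "$log"
```

Run as a script, this should print one DEBUG START line followed by one DEBUG END line, mirroring the pair of entries you see in the processd_log.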

 

 

Sometimes when you are running processd in debug mode, a ticket fails with an exit code that doesn’t give much detail.  If you’re running an online backup on Linux, for example, exit code 25 means "an Unrecorded Exception."  That code suggests the failure was somewhere in the system: for example, a file could not be read or written, or does not exist.

 

 

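In order to get some visibility into which files are being accessed, and by which processes, you can run strace.  With this method you could determine, for example, that SSdbsave failed because Install-Tools/gzip was missing, or that SSdbload failed because it lacked permission to write the SSdbload.log file.  The following command traces all of the file and process system calls made by processd and writes a separate log file for each of processd's child processes:

strace -s9999 -ff -e trace=file,process -v -o olb_processd_fork_strace.out -p <processd PID>

Here -p attaches to the running processd, -ff follows forked children and writes each process's trace to its own file, -e trace=file,process limits the trace to file and process-management system calls, -s9999 keeps long string arguments such as paths from being truncated, -v prints structures unabbreviated, and -o sets the output file name.  strace keeps running until you stop it; press Ctrl-C to detach once the backup has finished, which leaves processd running.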

 

 

Because of -ff, strace appends the process ID to the output file name.  By the time the online backup has completed, the directory where you executed strace will contain an olb_processd_fork_strace.out.<PID> file for processd itself and one for each child process created by processd.

 

 

If you grep through the output files, you might see:

 

 

chdir("/usr/SPECTRUM/SS")                  = 0
getcwd("/usr/SPECTRUM/SS", 1024)           = 14
access("/usr/SPECTRUM/SS/SSdbload.log", R_OK) = 0

 

 

...which would indicate that whichever process tried to access SSdbload.log had read permission on it.
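When you are hunting for the cause of a failure, it is usually quicker to search for calls that returned an error than to read every line.  strace reports a failed call as "= -1" followed by the errno name, so a grep for ENOENT (missing file) and EACCES (permission denied) across the per-PID files is a reasonable first pass.  The sketch below builds two small sample files so that it can run anywhere; the sample lines, the PIDs 1001 and 1002, and the full gzip path are made up for illustration and were not captured from a real Spectrum system.

```shell
#!/bin/sh
# Sketch: list failed file-related calls recorded in strace output files.
# The sample trace lines below are invented for illustration only.

workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical trace of a child process in which every call succeeded.
cat > olb_processd_fork_strace.out.1001 <<'EOF'
chdir("/usr/SPECTRUM/SS")                  = 0
access("/usr/SPECTRUM/SS/SSdbload.log", R_OK) = 0
EOF

# Hypothetical trace of a child process that hit two errors.
cat > olb_processd_fork_strace.out.1002 <<'EOF'
access("/usr/SPECTRUM/Install-Tools/gzip", X_OK) = -1 ENOENT (No such file or directory)
access("/usr/SPECTRUM/SS/SSdbload.log", W_OK) = -1 EACCES (Permission denied)
EOF

# Keep only calls that failed with "missing file" or "permission denied".
# With several input files, grep prefixes each match with its file name,
# which tells you the PID of the process that made the call.
failures=$(grep -E ' = -1 E(NOENT|ACCES)' olb_processd_fork_strace.out.*)
printf '%s\n' "$failures"
```

Against real output you would run only the grep line, in the directory where strace wrote its files.  Not every ENOENT is a problem, since programs routinely probe several paths before finding a file, so treat the matches as leads rather than conclusions.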

 

 

If you're troubleshooting an issue that happens at processd startup, you can use the --debug option from the command line:

 

SPECTRUM/lib/SDPM/# ./processd --debug

 

Then, once you have reproduced the problem, you can turn debug off by sending the TRAP signal to the processd PID, as described above.

Comments

Sep 03, 2015 02:04 AM

Thanks Scott!

This is helpful

Sep 01, 2015 11:38 AM

Thank you for sharing this with the community!
