Hello everyone, it is a pleasure to be here. I am new to this community.
The scenario is as follows:
Environment 1: Linux Red Hat 7.5
Spectrum 10.4.0
1 VM OneClick Server: 54 GB memory, 63 SWAP, 300 GB disk partition, 20 CPU cores, Intel(R) Xeon(R) CPU E5-2680 v4 @ 2.40GHz
1 VM SpectroSERVER MLS: 23 GB memory, 9 SWAP, 39 GB disk partition, 16 CPU cores, Intel(R) Xeon(R) CPU E5-2680 v4 @ 2.40GHz
1 VM SpectroSERVER secondary: 62 GB memory, 127 SWAP, 600 GB disk partition, 16 CPU cores, Intel(R) Xeon(R) CPU E5-2680 v4 @ 2.40GHz
1 VM SpectroSERVER secondary: 62 GB memory, 127 SWAP, 600 GB disk partition, 16 CPU cores, Intel(R) Xeon(R) CPU E5-2680 v4 @ 2.40GHz
Environment 2: Linux Red Hat 7.2
Spectrum 10.3.0
1 VM OneClick Server: 31 GB memory, 9 SWAP, 440 GB disk partition, 16 CPU cores, Intel(R) Xeon(R) CPU E5-2680 v4 @ 2.40GHz
1 VM SpectroSERVER MLS: 31 GB memory, 9 SWAP, 49 GB disk partition, 4 CPU cores, Intel(R) Xeon(R) CPU E5-2680 v4 @ 2.40GHz
1 VM SpectroSERVER secondary: 78 GB memory, 9 SWAP, 540 GB disk partition, 16 CPU cores, Intel(R) Xeon(R) CPU E5-2680 v4 @ 2.40GHz
On 2 occasions, the following errors appeared in the Linux messages log on several servers at the same time, causing the SpectroSERVER process to become corrupted or to show the error message SS "Terminated".
SS HOSTNAME-SS-MLS Environment 1 Crash
./audit/audit.log:type=ANOM_ABEND msg=audit(1589468891.480:208339): auid=4294967295 uid=1000 gid=1000 ses=4294967295 pid=29884 comm="SpectroSERVER" reason="memory violation" sig=11
./audit/audit.log:type=ANOM_ABEND msg=audit(1589473208.345:208499): auid=4294967295 uid=1000 gid=1000 ses=4294967295 pid=24787 comm="SpectroSERVER" reason="memory violation" sig=11
./audit/audit.log:type=ANOM_ABEND msg=audit(1589561673.021:2249): auid=4294967295 uid=1000 gid=1000 ses=4294967295 pid=2749 comm="SpectroSERVER" reason="memory violation" sig=11
./messages-20200517:May 14 10:08:11 HOSTNAME-SS-MLS kernel: SpectroSERVER[29884]: segfault at 7fa227b0d000 ip 00007fa26f441f00 sp 00007fa22cfb96a0 error 4 in libGlobl.so.1[7fa26f3f3000+d4000]
./messages-20200517:May 14 10:08:11 HOSTNAME-SS-MLS abrt-hook-ccpp: Process 29884 (SpectroSERVER) of user 1000 killed by SIGSEGV - dumping core
./messages-20200517:May 14 10:08:16 HOSTNAME-SS-MLS abrt-server: Executable '/home/SPECTRUM/SS/SpectroSERVER' doesn't belong to any package and ProcessUnpackaged is set to 'no'
./messages-20200517:May 14 11:20:08 HOSTNAME-SS-MLS kernel: SpectroSERVER[24787]: segfault at 7feaee361000 ip 00007feb2aa1ef00 sp 00007feae82c96a0 error 4 in libGlobl.so.1[7feb2a9d0000+d4000]
./messages-20200517:May 14 11:20:08 HOSTNAME-SS-MLS abrt-hook-ccpp: Process 24787 (SpectroSERVER) of user 1000 killed by SIGSEGV - dumping core
./messages-20200517:May 14 11:20:11 HOSTNAME-SS-MLS abrt-server: Executable '/home/SPECTRUM/SS/SpectroSERVER' doesn't belong to any package and ProcessUnpackaged is set to 'no'
./messages-20200517:May 15 11:54:33 HOSTNAME-SS-MLS kernel: SpectroSERVER[2749]: segfault at 7f17376a4000 ip 00007f177bed3f00 sp 00007f1737f856a0 error 4 in libGlobl.so.1[7f177be85000+d4000]
./messages-20200517:May 15 11:54:33 HOSTNAME-SS-MLS abrt-hook-ccpp: Process 2749 (SpectroSERVER) of user 1000 killed by SIGSEGV - dumping core
./messages-20200517:May 15 11:54:36 HOSTNAME-SS-MLS abrt-server: Executable '/home/SPECTRUM/SS/SpectroSERVER' doesn't belong to any package and ProcessUnpackaged is set to 'no'
SS HOSTNAME-SS-SECONDARY-1 Environment 1 Crash
./audit/audit.log:type=ANOM_ABEND msg=audit(1589468877.979:114432): auid=4294967295 uid=1000 gid=1000 ses=4294967295 pid=19059 comm="SpectroSERVER" reason="memory violation" sig=11
./messages-20200517:May 14 10:07:57 HOSTNAME-SS-SECONDARY-1 kernel: SpectroSERVER[19059]: segfault at 7ff50f7cf000 ip 00007ff8b8c6af00 sp 00007ff3c73856a0 error 4 in libGlobl.so.1[7ff8b8c1c000+d4000]
./messages-20200517:May 14 10:07:58 HOSTNAME-SS-SECONDARY-1 abrt-hook-ccpp: Process 19059 (SpectroSERVER) of user 1000 killed by SIGSEGV - dumping core
./messages-20200517:May 14 10:10:32 HOSTNAME-SS-SECONDARY-1 abrt-server: Executable '/home/SPECTRUM/SS/SpectroSERVER' doesn't belong to any package and ProcessUnpackaged is set to 'no'
SS HOSTNAME-SS-SECONDARY-2 Environment 1 Crash
./audit/audit.log.3:type=ANOM_ABEND msg=audit(1581442972.218:127312): auid=4294967295 uid=1000 gid=1000 ses=4294967295 pid=114545 comm="SpectroSERVER" reason="memory violation" sig=11
./audit/audit.log:type=ANOM_ABEND msg=audit(1589473208.254:191): auid=4294967295 uid=1000 gid=1000 ses=4294967295 pid=3356 comm="SpectroSERVER" reason="memory violation" sig=11
./audit/audit.log:type=ANOM_ABEND msg=audit(1589561636.412:1928): auid=4294967295 uid=1000 gid=1000 ses=4294967295 pid=6643 comm="SpectroSERVER" reason="memory violation" sig=11
./messages-20200517:May 14 11:20:08 HOSTNAME-SS-SECONDARY-2 kernel: SpectroSERVER[3356]: segfault at 7f471805e000 ip 00007f4784cdcf00 sp 00007f4723d696a0 error 4 in libGlobl.so.1[7f4784c8e000+d4000]
./messages-20200517:May 14 11:20:08 HOSTNAME-SS-SECONDARY-2 abrt-hook-ccpp: Process 3356 (SpectroSERVER) of user 1000 killed by SIGSEGV - dumping core
./messages-20200517:May 14 11:20:17 HOSTNAME-SS-SECONDARY-2 abrt-server: Executable '/home/SPECTRUM/SS/SpectroSERVER' doesn't belong to any package and ProcessUnpackaged is set to 'no'
./messages-20200517:May 15 11:53:56 HOSTNAME-SS-SECONDARY-2 kernel: SpectroSERVER[6643]: segfault at 7f936bf55000 ip 00007f93e5c1ef00 sp 00007f9384dd56a0 error 4 in libGlobl.so.1[7f93e5bd0000+d4000]
./messages-20200517:May 15 11:53:56 HOSTNAME-SS-SECONDARY-2 abrt-hook-ccpp: Process 6643 (SpectroSERVER) of user 1000 killed by SIGSEGV - dumping core
SS HOSTNAME-SS-MLS2 Environment 2 Crash
./messages-20200517:May 15 11:54:09 HOSTNAME-SS-MLS2 kernel: SpectroSERVER[10939]: segfault at 7f6e3cf74000 ip 00007f6e7fcc32b0 sp 00007f6e3ebe16a0 error 4 in libGlobl.so.1[7f6e7fc71000+d4000]
./messages-20200517:May 15 11:54:09 HOSTNAME-SS-MLS2 kernel: type=1701 audit(1589561649.470:55151873): auid=4294967295 uid=1001 gid=1001 ses=4294967295 pid=10939 comm="SpectroSERVER" reason="memory violation" sig=11
./messages-20200517:May 15 11:54:12 HOSTNAME-SS-MLS2 abrt-server: Executable '/home/SPECTRUM/SS/SpectroSERVER' doesn't belong to any package and ProcessUnpackaged is set to 'no'
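One way to compare these crashes is to compute where inside libGlobl.so.1 the faulting instruction sits: the kernel line gives the instruction pointer (ip) and, in the brackets, the load address of the library, so the offset is ip minus that base. Below is a minimal Python sketch of that calculation, assuming the kernel line format shown above. The function name segfault_offset and the SAMPLE list are my own illustration, not part of Spectrum or any tool.

```python
import re

# Pattern for kernel segfault lines in the format shown in the logs above.
SEGFAULT_RE = re.compile(
    r"(?P<host>\S+) kernel: (?P<comm>\w+)\[(?P<pid>\d+)\]: "
    r"segfault at (?P<addr>[0-9a-f]+) ip (?P<ip>[0-9a-f]+) "
    r"sp (?P<sp>[0-9a-f]+) error (?P<err>\d+) "
    r"in (?P<lib>\S+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)


def segfault_offset(line):
    """Return (host, pid, library, offset) for a kernel segfault line.

    offset is the instruction pointer minus the library load address,
    i.e. the position of the faulting instruction inside the library.
    Returns None if the line is not a segfault line.
    (Hypothetical helper written for this post.)
    """
    m = SEGFAULT_RE.search(line)
    if m is None:
        return None
    offset = int(m.group("ip"), 16) - int(m.group("base"), 16)
    return m.group("host"), int(m.group("pid")), m.group("lib"), offset


# Two lines copied from the logs above: one per environment.
SAMPLE = [
    "./messages-20200517:May 14 10:08:11 HOSTNAME-SS-MLS kernel: "
    "SpectroSERVER[29884]: segfault at 7fa227b0d000 ip 00007fa26f441f00 "
    "sp 00007fa22cfb96a0 error 4 in libGlobl.so.1[7fa26f3f3000+d4000]",
    "./messages-20200517:May 15 11:54:09 HOSTNAME-SS-MLS2 kernel: "
    "SpectroSERVER[10939]: segfault at 7f6e3cf74000 ip 00007f6e7fcc32b0 "
    "sp 00007f6e3ebe16a0 error 4 in libGlobl.so.1[7f6e7fc71000+d4000]",
]

if __name__ == "__main__":
    for entry in SAMPLE:
        result = segfault_offset(entry)
        if result is not None:
            host, pid, lib, offset = result
            print(f"{host} pid={pid} {lib}+{offset:#x}")
```

Applied to the kernel lines above, every Environment 1 crash (Spectrum 10.4.0) lands at offset 0x4ef00 in libGlobl.so.1, and the Environment 2 crash (Spectrum 10.3.0) lands at 0x522b0. That suggests, but does not prove, that all the Environment 1 processes faulted in the same piece of library code. With a core dump or a symbolized copy of the library, that offset could be mapped to a function name.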
The SpectroSERVERs carry different loads; the MLS servers in particular have no load.
The virtual machines reside on different ESX hosts.
Performance is not critically affected.
The VNM.OUT log does not show much information about the affected process.
Only one secondary SS shows the following lines in its VNM.OUT file.
may 15 12:54:34 ERROR TRACE at CsIHCrMdlEv.cc(354): Model Name is not set after re-evaluation for mh:0x2157f88
may 15 12:59:54 WARNING at CsIHOverCapacity.cc(330): SpectroSERVER is over capacity threshold of 95%, generating 5 performance dumps to determine source of overload:
Saved compact diagnostic file to '/home/SPECTRUM/SS/support/SpectroSERVER_20200515_1259.dmp'
Saved compact diagnostic file to '/home/SPECTRUM/SS/support/SpectroSERVER_20200515_1300.dmp'
Saved compact diagnostic file to '/home/SPECTRUM/SS/support/SpectroSERVER_20200515_1301.dmp'
Saved compact diagnostic file to '/home/SPECTRUM/SS/support/SpectroSERVER_20200515_1303.dmp'
Saved compact diagnostic file to '/home/SPECTRUM/SS/support/SpectroSERVER_20200515_1304.dmp'
may 15 14:07:55 ERROR TRACE at CsIHPrtIPLS.cc(1090): Waited 60000ms for IPLS evaluate lock for mh: 0x209294e, continuing without lock
may 15 14:08:56 ERROR TRACE at CsIHPrtIPLS.cc(1090): Waited 60000ms for IPLS evaluate lock for mh: 0x209294e, continuing without lock
may 15 14:09:56 ERROR TRACE at CsIHPrtIPLS.cc(1090): Waited 60000ms for IPLS evaluate lock for mh: 0x209294e, continuing without lock
may 15 14:10:56 ERROR TRACE at CsIHPrtIPLS.cc(1090): Waited 60000ms for IPLS evaluate lock for mh: 0x209294e, continuing without lock
may 15 14:10:58 ERROR TRACE at CsIHPrtIPLS.cc(1090): Waited 60000ms for IPLS evaluate lock for mh: 0x209294e, continuing without lock
may 18 17:30:04 WARNING at CsIHOverCapacity.cc(330): SpectroSERVER is over capacity threshold of 95%, generating 5 performance dumps to determine source of overload:
Saved compact diagnostic file to '/home/SPECTRUM/SS/support/SpectroSERVER_20200518_1730.dmp'
Saved compact diagnostic file to '/home/SPECTRUM/SS/support/SpectroSERVER_20200518_1731.dmp'
Saved compact diagnostic file to '/home/SPECTRUM/SS/support/SpectroSERVER_20200518_1732.dmp'
Saved compact diagnostic file to '/home/SPECTRUM/SS/support/SpectroSERVER_20200518_1733.dmp'
Saved compact diagnostic file to '/home/SPECTRUM/SS/support/SpectroSERVER_20200518_1734.dmp'
I have one question:
1. Is it possible for one SpectroSERVER to kill the processes of other SpectroSERVERs at the same time in a distributed environment?
Do you have any idea what could cause this behavior?
I would appreciate your help. Regards.