Deployment Solution

  • 1.  Deployment problem

    Posted Aug 04, 2020 02:42 PM
    Symantec™ Management Platform Version 8.5 RU3

    Problem when deploying an image

    Ghost does not start during the image deployment process, and I don't see the job being logged in Altiris when I select the job to deploy. I have attached the logs and representative images for analysis.



    ------------------------------
    Pan American Energy LLC
    ------------------------------

    Attachment(s)

    SMP.log (24 KB)
    PectAgent.log (63 KB)


  • 2.  RE: Deployment problem

    Broadcom Employee
    Posted Aug 05, 2020 02:32 AM
    Looks like the problem is on the server side, PECTAgent cannot properly register on the server. Could you provide the server side logs please? And please check if your DS license is OK in SIM.


  • 3.  RE: Deployment problem

    Posted Aug 05, 2020 07:13 AM
    Hi Sergei,

    I am attaching the logs from the server and the site server. Our deployment license is fine.


    Attachment(s)

    Bootwiz.log (3.32 MB)
    Log Server.zip (438 KB)


  • 4.  RE: Deployment problem

    Posted Aug 06, 2020 02:57 AM
    I think that we may be seeing a similar problem with our own installation of IT Management Suite 8.5 RU3. I have attached our own log files and have attempted to compare & contrast with mmarmori's findings.

    == PectAgent.log ==
    1. Our log file is version 8.05.4252 and starts with a few RegisterLibrary clauses. mmarmori's log file is version 8.05.5077 and starts with Init: clauses. Maybe a difference between ITMS and other SMP-derived bundles?
    2. Our PectAgent calculates the agent GUID with help from the ITMS Notification Server while mmarmori's agent retrieves it from PectAgent.ini.
    3. mmarmori's agent fails to communicate with a few API endpoints ("request XML is invalid").
    4. Our PectAgent log lacks some of the verbose information (e.g., the PostBasicInventory XML contents) which is present in mmarmori's.
    5. Our client is identified as a managed computer while mmarmori's is identified as an unmanaged computer.

    == SMP.log ==
    1. Our log file includes some generic initialization information before catching up to mmarmori's first few records regarding "Agent storage FULL integrity check."
    2. Our WinPE includes the "Client Task Agent," which seems to be lacking in mmarmori's WinPE (probably another difference between Symantec/Broadcom SMP suites).
    3. Our connection profile is evidently messed-up as shown by the "root certificate which is not trusted by the trust provider" on port 443, but it does fall back to port 80 successfully.
    4. Our SMP agent trails off into endless repetition of "Check for tasks," which is not explicitly shown in mmarmori's log file (maybe truncated for size?).

    Despite the differences, I would propose that both sets of log files suggest that the agent reaches a steady state but then fails to be controlled further by the Initial Deployment job. Indeed, we tried launching an isolated Client Task from the ITMS Console and saw that it was processed immediately within WinPE (though unfortunately we captured the logs before attempting this).

    I have included a short snippet including only the "burst" of logs which appear in the Altiris Log Viewer shortly after the client system enters WinPE. These clearly show that PectAgent is registering on the server. Of particular interest is the Informational (as opposed to Warning or Error) message Adding machine into Exception with exceptionId as {guid}, which seems to suggest that some issue occurred which isn't being handled or reported properly.

    Interestingly, the Computer object's state seems to persist somehow between deployment attempts for this PC. Consider the following sequence of steps, which we have attempted multiple times with identical results (and the same GUID for the Computer object every time):
    1. Access Manage -> Computers from the ITMS Console
    2. Search for "PECTAgent" in the list of computers
    3. Click to view the Task/Job History for the selected computer; see attached image which shows that history
    4. Delete the Computer object from ITMS (and confirm if prompted)
    5. Repeat step 2 to verify that the system is gone
    6. PXE boot the client system and wait for WinPE to start
    7. Select a Job from the Initial Deployment list
    8. Observe that the deployment never starts and that the job is never logged, just like in mmarmori's docx attachment
    9. Turn off the computer
    10. Repeat step 2 to locate PECTAgent -- it is back now
    11. Click to view the Task/Job History for the object, and see that the new attempt is still not logged -- however, all of the same old jobs have been restored from somewhere!

    Is there perhaps a way to purge this computer's GUID from the SQL Server database? I don't need any history for this object, and can't help but suspect that some stale database relation is preventing the build from working properly. Other deployments on different systems are able to complete -- this computer is somewhat unique.

    Attachment(s)

    SMP.log (56 KB)
    PectAgent.log (29 KB)


  • 5.  RE: Deployment problem

    Broadcom Employee
    Posted Aug 06, 2020 04:40 AM
    The server tries to redirect the resource creation call to the NS web. Since the initial call to the DS web service was made over HTTPS, the call to the NS is also an HTTPS call. That call fails with the error "The underlying connection was closed: Could not establish trust relationship for the SSL/TLS secure channel". At this point the exact reason is not clear. Can you get the error that is logged in the IIS logs? The best way is to capture this call with Wireshark; the capture will show why the SSL handshake fails.

    The problem may be related to the server name used in the call and in the NS web certificate. You can execute NScript.exe from the Notification Server\Bin folder with the attached script. It will show the server name that is used during the call and check that it matches the server name in the web certificate.
    Thanks,
    Ed.
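
For readers without the attached script, the same idea (list the names a web certificate covers, then compare them with the name used in the call) can be sketched in Python. This is not the NScript.exe script mentioned above; the function names `dns_names_from_cert` and `fetch_cert` are made up for illustration, and the sketch assumes the certificate chain is already trusted on the machine where it runs, so that only the name check is in question.

```python
import socket
import ssl


def dns_names_from_cert(cert):
    """Return the DNS names a certificate is valid for.

    `cert` is the dict returned by ssl.SSLSocket.getpeercert().
    DNS Subject Alternative Names take precedence; the Common Name is
    used only when the certificate carries no DNS SANs at all.
    """
    sans = [value for kind, value in cert.get("subjectAltName", ()) if kind == "DNS"]
    if sans:
        return sans
    names = []
    for rdn in cert.get("subject", ()):
        for key, value in rdn:
            if key == "commonName":
                names.append(value)
    return names


def fetch_cert(host, port=443, timeout=5.0):
    """Fetch the server certificate with chain validation enabled but the
    hostname check disabled, so a name mismatch does not abort the handshake."""
    context = ssl.create_default_context()
    context.check_hostname = False  # the names are inspected by the caller instead
    with socket.create_connection((host, port), timeout=timeout) as raw:
        with context.wrap_socket(raw, server_hostname=host) as tls:
            return tls.getpeercert()


# Example (needs network access; the host name is a placeholder):
#   names = dns_names_from_cert(fetch_cert("ns.ad.example.com"))
#   print(names)
```

If the name that the server uses for itself is missing from the printed list, the HTTPS call would be expected to fail with a trust relationship error like the one quoted above.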


  • 6.  RE: Deployment problem

    Posted Sep 04, 2020 12:23 PM
    I think that you are exactly right, Ed. Using Wireshark, I looked at all of the SSL/TLS handshakes which occur around the time of the "trust relationship" error. The only one which seemed problematic was the one launched by a local process (w3wp presumably) to the loopback address on port 443 (w3wp again). In this case, there was no "Application Data" sent back and forth between the two endpoints -- instead, the connection was terminated immediately after the handshake. There doesn't seem to be a hint at the protocol layer as to what happened, and the FIN packet is sent from client to server, which suggests that the server cert failed validation.

    I agree that running a C# script with NScript.exe (with all of the Altiris/ITMS assemblies available) will help to pinpoint the problem. Would you mind reattaching the script that you mentioned? I couldn't seem to find it in the previous post.



  • 7.  RE: Deployment problem

    Posted Sep 04, 2020 03:57 PM
    OK! I actually borrowed a script from this Stack Overflow post and substituted a relevant WebRequest call:

    # Save and run this as a .ps1 file so that $PSScriptRoot is set.
    # Build a unique log file name next to the script.
    $id = [Environment]::TickCount
    $fileName = "${PSScriptRoot}\Powershell_log_${id}.txt"
    # Send trace output to both the log file and the console.
    $listener1 = New-Object "System.Diagnostics.TextWriterTraceListener" @($fileName, "text_listener")
    $listener2 = New-Object "System.Diagnostics.ConsoleTraceListener"
    $listener2.Name = "console_listener"
    [System.Diagnostics.Trace]::AutoFlush = $true
    [System.Diagnostics.Trace]::Listeners.Add($listener1) | Out-Null
    [System.Diagnostics.Trace]::Listeners.Add($listener2) | Out-Null
    # Use reflection to enable and hook up the System.Net TraceSource.
    # This relies on non-public .NET Framework members and may not exist in other runtimes.
    $logging = [System.Net.Sockets.Socket].Assembly.GetType("System.Net.Logging")
    $flags = [System.Reflection.BindingFlags]::NonPublic -bor [System.Reflection.BindingFlags]::Static
    $logging.GetField("s_LoggingEnabled", $flags).SetValue($null, $true)
    $webTracing = $logging.GetProperty("Web", $flags)
    $webTraceSource = [System.Diagnostics.TraceSource]$webTracing.GetValue($null, $null)
    $webTraceSource.Switch.Level = [System.Diagnostics.SourceLevels]::Information
    $webTraceSource.Listeners.Add($listener1) | Out-Null
    $webTraceSource.Listeners.Add($listener2) | Out-Null
    [System.Diagnostics.Trace]::TraceInformation("About to do net stuff")
    # The request under test: substitute the URL that fails in your environment.
    $wr = [System.Net.WebRequest]::Create("https://ns.ad.example.com:443/Altiris/NS/Agent/CreateResource.aspx")
    $response = $wr.GetResponse()
    if ($response) { $response.Close() }
    [System.Diagnostics.Trace]::TraceInformation("Finished doing net stuff")
    # Get rid of the listeners.
    [System.Diagnostics.Trace]::Listeners.Clear()
    $webTraceSource.Listeners.Clear()
    $listener1.Dispose()
    $listener2.Dispose()

    Sure enough, the resulting logfile identified the exact problem:

    System.Net Information: 0 : [12808] SecureChannel#20268497 - Remote certificate has errors:
    System.Net Information: 0 : [12808] SecureChannel#20268497 - Certificate name mismatch.
    System.Net Information: 0 : [12808] SecureChannel#20268497 - Remote certificate was verified as invalid by the user.
    System.Net Error: 0 : [12808] Exception in HttpWebRequest#47759218:: - The underlying connection was closed: Could not establish trust relationship for the SSL/TLS secure channel..
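
Trace files like this can get long, so it may help to filter them down to the certificate findings. The following is a rough Python sketch that assumes every entry follows the `System.Net <Level>: <n> : [<id>] <message>` shape of the excerpt above. The function name `certificate_findings` is made up for illustration, and entries that wrap onto a second line would need to be joined first.

```python
import re

# One trace entry per line, e.g.
# "System.Net Information: 0 : [12808] SecureChannel#20268497 - Certificate name mismatch."
TRACE_LINE = re.compile(
    r"^System\.Net (?P<level>\w+): \d+ : \[(?P<ident>\d+)\] (?P<message>.*)$"
)


def certificate_findings(lines):
    """Return (level, text) pairs for SecureChannel entries that mention a certificate."""
    findings = []
    for line in lines:
        match = TRACE_LINE.match(line.strip())
        if not match:
            continue  # not a System.Net trace entry
        message = match.group("message")
        # SecureChannel entries look like "SecureChannel#<id> - <text>".
        if message.startswith("SecureChannel#") and " - " in message:
            text = message.split(" - ", 1)[1]
            if "certificate" in text.lower():
                findings.append((match.group("level"), text))
    return findings
```

Applied to the excerpt above, this keeps the three SecureChannel lines and skips the HttpWebRequest exception entry.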

    We issued a Notification Server web certificate in September 2019 with a Common Name itms.example.com and two Subject Alternative Names: itms.example.com and ns.example.com. I issued and installed a new certificate with three SANs: itms.example.com, ns.example.com (like before), plus ns.ad.example.com (which is named in the URL). I'm not sure why this didn't become a problem until last month... but it's fixed now!
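
To illustrate why the old certificate was rejected for this URL, here is a simplified sketch of host name matching in Python. It is an approximation for illustration, not the exact algorithm used by .NET or Windows: it handles only exact matches and a single left-most wildcard label, and the helper names `hostname_matches` and `covered_by` are made up.

```python
def hostname_matches(hostname, pattern):
    """Simplified certificate name match: case-insensitive, label by label,
    with "*" allowed only as the complete left-most label of the pattern."""
    host_labels = hostname.lower().rstrip(".").split(".")
    pattern_labels = pattern.lower().rstrip(".").split(".")
    if len(host_labels) != len(pattern_labels):
        return False  # a wildcard never spans more than one label
    if pattern_labels[0] == "*":
        return host_labels[1:] == pattern_labels[1:]
    return host_labels == pattern_labels


def covered_by(hostname, san_list):
    """True if any Subject Alternative Name in san_list matches hostname."""
    return any(hostname_matches(hostname, san) for san in san_list)


# The SAN lists described in the post above.
old_sans = ["itms.example.com", "ns.example.com"]
new_sans = old_sans + ["ns.ad.example.com"]

print(covered_by("ns.ad.example.com", old_sans))  # False: name mismatch
print(covered_by("ns.ad.example.com", new_sans))  # True: the added SAN matches
```

Under these simplified rules even a wildcard SAN such as `*.example.com` would not cover `ns.ad.example.com`, because the wildcard stands for exactly one label.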

    Thank you very much for your help, EduardSch. BTW, I think that mmarmori would benefit from running a script like this on the Notification Server as well.


  • 8.  RE: Deployment problem

    Broadcom Employee
    Posted Sep 07, 2020 01:06 AM
    Glad to hear that.
    I assume that something changed in the network/domain configuration and the NS started to resolve itself with a different name.