Ghost Solution Suite

  • 1.  RPC Server is unavailable

    Posted Jun 22, 2006 11:11 AM
    When a PC is trying to join the domain we get a warning on "Post-configuration status" that states the following:
    "Failed to join domain WWCS.priv: The RPC server is unavailable."

    The PC fails to join the domain and therefore fails to grab packages from a UNC path. You can run the same task again, without the clone option (leaving just configuration, deploy AI package, and refresh config) and usually get the PC deployed correctly.

    Not every PC gets this error. If you were to clone, configure, and deploy packages on a lab of 10 PCs, maybe 4 of them would get this error. It appears to be random: it's never the same physical PCs that fail, and the percentage or number of failing computers varies, though it's usually less than 50%. I've tried 2 completely different images and they show the same issue and a similar number of failures.

    I installed Ghost 8.3 on a new server recently, and our old server is running Ghost 8.2. Both servers get the same warning message. I don't know exactly when the error started, but I do not recall any changes to our servers at that time.

    What could be causing this error? I think it may have something to do with our domain controllers, but beyond that I don't know how to troubleshoot the issue further.

    Thanks for any help you can provide!

    Message was edited by: Matt Jones


  • 2.  RE: RPC Server is unavailable

    Posted Jun 22, 2006 07:12 PM
    If this is in the post-configuration status, then it's happening in the clients as a return code from calling (or setting up to call) the NetJoinDomain() API.

    Normally, this wouldn't have much to do with the domain controller - this error is almost always a local thing happening on the client machine. You can verify this by checking the %WINDIR%\Debug\NetSetup.log file on your client machines after the error occurs and seeing what the last entry says. If the last entry in the log is for a call to NetJoinDomain() and it returns an error, Windows at least got that far, and depending on what is in that log file we may be able to divine the cause.
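
    As a rough aid for that check, here is a minimal Python sketch that pulls the timestamped entries out of a NetSetup.log and reports the final NetpDoDomainJoin status. It assumes the "MM/DD HH:MM:SS message" layout seen in the excerpts posted in this thread; the function names are made up for illustration and are not part of Windows or Ghost.

```python
import re

# Assumed layout: "06/23 08:19:20 NetpDoDomainJoin: status: 0x6ba"
ENTRY_RE = re.compile(r"^\s*(\d\d/\d\d \d\d:\d\d:\d\d)\s+(.*\S)\s*$")
JOIN_STATUS_RE = re.compile(r"^NetpDoDomainJoin: status: (0x[0-9a-fA-F]+)$")


def parse_entries(text):
    """Return (timestamp, message) pairs for every timestamped line."""
    entries = []
    for line in text.splitlines():
        match = ENTRY_RE.match(line)
        if match:
            entries.append((match.group(1), match.group(2)))
    return entries


def last_join_status(text):
    """Return the status of the last NetpDoDomainJoin entry, or None."""
    status = None
    for _, message in parse_entries(text):
        match = JOIN_STATUS_RE.match(message)
        if match:
            status = int(match.group(1), 16)
    return status
```

    On a client you would read the file first (for example via os.path.expandvars(r"%WINDIR%\Debug\NetSetup.log")) and pass its text to last_join_status(); a result of None means no join attempt was logged at all.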

    However, my conjecture is that you won't see even a call to NetJoinDomain(). Many Windows APIs have hidden, undocumented requirements that particular services be running in order for them to work. For instance, all a call to NetJoinDomain() does is in fact try to connect to a second service (in this case LanManWorkstation, these days more commonly known as the Workstation service) where the real implementation of this API call is.

    If the underlying service that the API needs is not running, or does not respond, then the caller of the API receives that rather vague error. It's called "RPC" because it actually does use the same Remote Procedure Call system used to connect machines, but in this case it's actually trying to connect to a service process on the same machine.

    Now, the console client does internally start the Workstation service itself and waits for that service to report that it is running before it tries to do anything at all related to domains, and as far as I'm aware that has always been enough.
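
    To illustrate the general "start it, then wait until it reports running" pattern (this is not the console client's actual code), here is a small Python sketch. The query_state callable is a hypothetical stand-in for whatever asks the Service Control Manager for the service state; the state string "RUNNING" and the timeout values are likewise assumptions.

```python
import time


def wait_until_running(query_state, timeout=30.0, interval=0.5,
                       sleep=time.sleep, clock=time.monotonic):
    """Poll query_state() until it returns "RUNNING" or the timeout expires.

    Returns True if the service reported running, False on timeout.
    sleep and clock are injectable so the loop can be exercised without
    real delays.
    """
    deadline = clock() + timeout
    while True:
        if query_state() == "RUNNING":
            return True
        if clock() >= deadline:
            return False
        sleep(interval)
```

    Note that a wait like this only covers the one service being polled. Anything that service in turn depends on at call time is not covered, which is why startup-order problems can still slip through.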

    Unfortunately these kinds of startup-time problems are extremely difficult to debug, even when they occur for us under laboratory conditions. The first thing I'd probably look at is trying to reduce the number of services set to auto-start on the client machines to see if you can streamline the startup process (or even disable some services outright). Alternatively, if you can alter the client configurations in a way that makes the error happen repeatably, we can probably reproduce the situation here in the lab.


  • 3.  RE: RPC Server is unavailable

    Posted Jun 23, 2006 08:57 AM
    Below are NetSetup.log files from a PC that failed with "The RPC server is unavailable" and from one that completed successfully. Thanks for your insight; this is a much better response than Symantec's Gold support gave me when I first started having this issue.

    Failure:
    06/23 08:19:16 -----------------------------------------------------------------
    06/23 08:19:16 NetpDoDomainJoin
    06/23 08:19:16 NetpMachineValidToJoin: 'IMAGINGRAK00016'
    06/23 08:19:16 NetpGetLsaPrimaryDomain: status: 0x0
    06/23 08:19:16 NetpMachineValidToJoin: status: 0x0
    06/23 08:19:16 NetpJoinDomain
    06/23 08:19:16 Machine: IMAGINGRAK00016
    06/23 08:19:16 Domain: WWCS.priv
    06/23 08:19:16 MachineAccountOU: (NULL)
    06/23 08:19:16 Account: (NULL)
    06/23 08:19:16 Options: 0xc1
    06/23 08:19:16 OS Version: 5.1
    06/23 08:19:16 Build number: 2600
    06/23 08:19:16 ServicePack: Service Pack 2
    06/23 08:19:16 NetpValidateName: checking to see if 'WWCS.priv' is valid as type 3 name
    06/23 08:19:17 NetpCheckDomainNameIsValid for 'WWCS.priv' returned 0x0
    06/23 08:19:17 NetpValidateName: name 'WWCS.priv' is valid for type 3
    06/23 08:19:17 NetpDsGetDcName: trying to find DC in domain 'WWCS.priv', flags: 0x1020
    06/23 08:19:17 NetpGetLsaPrimaryDomain: status: 0x0
    06/23 08:19:17 NetpDsGetDcName: found DC '\\wwoods-dc2.WWCS.priv' in the specified domain
    06/23 08:19:17 NetUseAdd to \\wwoods-dc2.WWCS.priv\IPC$ returned 1326
    06/23 08:19:17 Trying add to \\wwoods-dc2.WWCS.priv\IPC$ using NULL Session
    06/23 08:19:17 NetpJoinDomain: status of connecting to dc '\\wwoods-dc2.WWCS.priv': 0x0
    06/23 08:19:17 NetpGetLsaPrimaryDomain: status: 0x0
    06/23 08:19:17 NetpGetDnsHostName: Read NV Hostname: IMAGINGRAK00016
    06/23 08:19:17 NetpGetDnsHostName: PrimaryDnsSuffix defaulted to DNS domain name: WWCS.priv
    06/23 08:19:17 NetpLsaOpenSecret: status: 0xc0000034
    06/23 08:19:18 NetpGetLsaPrimaryDomain: status: 0x0
    06/23 08:19:18 NetpJoinDomain: w9x: status of validating account: 0x0
    06/23 08:19:18 NetpGetLsaPrimaryDomain: status: 0x0
    06/23 08:19:19 NetpSetLsaPrimaryDomain: for 'WWCS' status: 0x0
    06/23 08:19:19 NetpJoinDomain: status of setting LSA pri. domain: 0x0
    06/23 08:19:19 NetpJoinDomain: status of managing local groups: 0x0
    06/23 08:19:19 NetpSetNetlogonDomainCache: NlWriteFileForestTrustList failed: 0x6ba
    06/23 08:19:19 NetpJoinDomain: status of setting netlogon cache: 0x6ba
    06/23 08:19:19 NetpJoinDomain: initiaing a rollback due to earlier errors
    06/23 08:19:20 NetpJoinDomain: rollback: local group management: 0x0
    06/23 08:19:20 NetpSetLsaPrimaryDomain: for 'UNATTEND' status: 0x0
    06/23 08:19:20 NetpJoinDomain: rollback: status of setting NULL domain sid: 0x0
    06/23 08:19:20 NetpLsaOpenSecret: status: 0x0
    06/23 08:19:20 NetpJoinDomain: rollback: status of deleting secret: 0x0
    06/23 08:19:20 NetpJoinDomain: status of disconnecting from '\\wwoods-dc2.WWCS.priv': 0x0
    06/23 08:19:20 NetpDoDomainJoin: status: 0x6ba

    Success:
    06/23 08:22:59 -----------------------------------------------------------------
    06/23 08:22:59 NetpDoDomainJoin
    06/23 08:22:59 NetpMachineValidToJoin: 'IMAGINGRAK00017'
    06/23 08:22:59 NetpGetLsaPrimaryDomain: status: 0x0
    06/23 08:22:59 NetpMachineValidToJoin: status: 0x0
    06/23 08:22:59 NetpJoinDomain
    06/23 08:22:59 Machine: IMAGINGRAK00017
    06/23 08:22:59 Domain: WWCS.priv
    06/23 08:22:59 MachineAccountOU: (NULL)
    06/23 08:22:59 Account: (NULL)
    06/23 08:22:59 Options: 0xc1
    06/23 08:22:59 OS Version: 5.1
    06/23 08:22:59 Build number: 2600
    06/23 08:22:59 ServicePack: Service Pack 2
    06/23 08:22:59 NetpValidateName: checking to see if 'WWCS.priv' is valid as type 3 name
    06/23 08:23:00 NetpCheckDomainNameIsValid for 'WWCS.priv' returned 0x0
    06/23 08:23:00 NetpValidateName: name 'WWCS.priv' is valid for type 3
    06/23 08:23:00 NetpDsGetDcName: trying to find DC in domain 'WWCS.priv', flags: 0x1020
    06/23 08:23:00 NetpGetLsaPrimaryDomain: status: 0x0
    06/23 08:23:01 NetpDsGetDcName: found DC '\\wwoods-dc2.WWCS.priv' in the specified domain
    06/23 08:23:01 NetUseAdd to \\wwoods-dc2.WWCS.priv\IPC$ returned 1326
    06/23 08:23:01 Trying add to \\wwoods-dc2.WWCS.priv\IPC$ using NULL Session
    06/23 08:23:01 NetpJoinDomain: status of connecting to dc '\\wwoods-dc2.WWCS.priv': 0x0
    06/23 08:23:01 NetpGetLsaPrimaryDomain: status: 0x0
    06/23 08:23:01 NetpGetDnsHostName: Read NV Hostname: IMAGINGRAK00017
    06/23 08:23:01 NetpGetDnsHostName: PrimaryDnsSuffix defaulted to DNS domain name: WWCS.priv
    06/23 08:23:01 NetpLsaOpenSecret: status: 0xc0000034
    06/23 08:23:02 NetpJoinDomain: w9x: status of validating account: 0x0
    06/23 08:23:02 NetpGetLsaPrimaryDomain: status: 0x0
    06/23 08:23:03 NetpGetLsaPrimaryDomain: status: 0x0
    06/23 08:23:03 NetpSetLsaPrimaryDomain: for 'WWCS' status: 0x0
    06/23 08:23:03 NetpJoinDomain: status of setting LSA pri. domain: 0x0
    06/23 08:23:04 NetpJoinDomain: status of managing local groups: 0x0
    06/23 08:23:04 NetpJoinDomain: status of setting netlogon cache: 0x0
    06/23 08:23:05 NetpJoinDomain: status of setting ComputerNamePhysicalDnsDomain to 'WWCS.priv': 0x0
    06/23 08:23:05 NetpUpdateW32timeConfig: 0x0
    06/23 08:23:05 NetpJoinDomain: status of disconnecting from '\\wwoods-dc2.WWCS.priv': 0x0
    06/23 08:23:05 NetpDoDomainJoin: status: 0x0
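
    One way to compare two logs like these is to strip the timestamps and machine names and list the messages that appear only in the failing log. The Python sketch below does that; it assumes the same "MM/DD HH:MM:SS message" layout as the excerpts above, and the function names are illustrative only.

```python
import re

TIMESTAMP_RE = re.compile(r"^\s*\d\d/\d\d \d\d:\d\d:\d\d\s+")


def messages(log_text, machine_name):
    """Return log messages without timestamps, with the machine name masked."""
    result = []
    for line in log_text.splitlines():
        if not TIMESTAMP_RE.match(line):
            continue  # skip headings and blank lines
        message = TIMESTAMP_RE.sub("", line).strip()
        result.append(message.replace(machine_name, "<MACHINE>"))
    return result


def only_in_failure(failure_text, failure_name, success_text, success_name):
    """Return, in order, the messages that occur only in the failing log."""
    seen = set(messages(success_text, success_name))
    return [m for m in messages(failure_text, failure_name) if m not in seen]
```

    Entries that show up in both logs (such as the NetUseAdd result of 1326 followed by the NULL session retry) are filtered out, so the first item returned is the first step where the failing join diverges.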


  • 4.  RE: RPC Server is unavailable

    Posted Jun 23, 2006 08:25 PM
    > 06/23 08:19:19 NetpSetNetlogonDomainCache: NlWriteFileForestTrustList failed: 0x6ba

    Interesting. I've not seen this specific error before. If I had to guess, although the Workstation service has started and we get into NetJoinDomain() correctly, this tiny inner sub-step at the end of the join process has bumped into a prerequisite service that hasn't yet started.
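
    For reference, the numeric codes in those logs map onto standard Windows error names. The Python sketch below is a small hand-written lookup covering only the codes that appear in this thread; the table and the describe() helper are illustrative, not an exhaustive or official decoder.

```python
# Codes taken from the NetSetup.log excerpts in this thread.
KNOWN_CODES = {
    0x0: "success",
    1326: "ERROR_LOGON_FAILURE: unknown user name or bad password",
    0x6BA: "RPC_S_SERVER_UNAVAILABLE (1722): the RPC server is unavailable",
    0xC0000034: "STATUS_OBJECT_NAME_NOT_FOUND (NTSTATUS): object name not found",
}


def describe(code_text):
    """Describe a code written as decimal ("1326") or hex ("0x6ba")."""
    value = int(code_text, 0)  # base 0 accepts both decimal and 0x-prefixed hex
    return KNOWN_CODES.get(value, "unknown code %d (0x%x)" % (value, value))
```

    So the 0x6ba in the failing log is the same error 1722 that the console reports as "The RPC server is unavailable".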

    In order to know more, I'll need to spend some quality time exploring the innards of NETAPI32.DLL to see what it's trying to do at this point. Then there is the question of exactly why your situation is different than most others - I've never heard this error reported to us before.

    Give me a day or so to do some research to determine the next stage in diagnosis.


  • 5.  RE: RPC Server is unavailable

    Posted Jun 24, 2006 12:30 AM
    It looks like my first guess was wrong, but I have an excuse since the logging code inside Windows for the error is somewhat misleading. :-)

    NlWriteFileForestTrustList() basically just writes the list of trusted domains to %WINDIR%\system32\config\netlogon.ftl, which doesn't RPC anywhere. The error is actually coming from the call immediately before it inside NetpSetNetlogonDomainCache(), which populates the list of trusted domains by calling the documented API DsEnumerateDomainTrusts(), which is also located in NETAPI32.DLL.

    DsEnumerateDomainTrusts() begins by doing a local RPC to the NetLogon service to call the real implementation, which is DsrEnumerateDomainTrusts() in NETLOGON.DLL.

    Now, the NetLogon service is special - it won't run until the machine is at least partially configured as a domain member, and it's NetJoinDomain() that takes care of setting it up and running it. So it's unlikely that this RPC is the one that is failing; it's more probably one of the inner RPCs inside DsrEnumerateDomainTrusts().

    Fortunately there is detailed debug logging inside this code, but you need to enable it in the registry on the client machine. You can read about this at http://support.microsoft.com/Default.aspx?id=109626

    The DsrEnumerateDomainTrusts() log entries are classified as "misc" events, but you may as well set the flag to log everything temporarily just so we can get a bit more context. Set DBFlag to at least 0xFF and then we can inspect the %WINDIR%\Debug\NetLogon.log file to see more.
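
    If it helps when baking the setting into an image, here is a minimal Python sketch that generates a .reg fragment for the DBFlag value. It assumes the value lives under the Netlogon service's Parameters key and is stored as a REG_DWORD; check both against the Microsoft article above before relying on it. The dbflag_reg_file() helper is made up for illustration.

```python
# Assumed location of the Netlogon debug flag; verify against the KB article.
NETLOGON_PARAMETERS_KEY = (
    r"HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Netlogon\Parameters"
)


def dbflag_reg_file(flags):
    """Return the text of a .reg file that sets DBFlag to the given value."""
    if not 0 <= flags <= 0xFFFFFFFF:
        raise ValueError("DBFlag must fit in 32 bits")
    return (
        "Windows Registry Editor Version 5.00\r\n"
        "\r\n"
        "[%s]\r\n" % NETLOGON_PARAMETERS_KEY
        + '"DBFlag"=dword:%08x\r\n' % flags
    )
```

    Writing dbflag_reg_file(0xFF) to a file and importing it on the client (or into the image) would set the minimum value suggested above; a larger mask logs more.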


  • 6.  RE: RPC Server is unavailable

    Posted Jun 26, 2006 09:40 AM
    When trying to recreate the issue with debugging turned on, I could not get a PC to generate that error! It seems to only occur immediately after we image a PC. I renamed 5 PCs (using a configuration task) 7 times each with no errors.

    I added the netlogon debug mode registry entry to the image itself so hopefully I can capture the error. I will update when I get more info.