Deployment Solution


SplashDiag -Diagnostic Splash For LinPE 

May 27, 2011 11:05 AM

One thing which the Altiris automation environment lacks is a nice diagnostic splash screen for troubleshooting. At the moment, should you hit a problem when imaging, you have to ferret around within Linux running a multitude of commands like fdisk, dmesg, lspci & ifconfig just to figure out what's gone amiss and why.

Today's download aims to take a lot of that pain away. From today, your Linux automation can boot like this!

Pretty nifty. What we have between the yellow box markers is,

  1. Disk information (both fixed and removable)
  2. Basic Network Data
  3. Altiris Server Connectivity status 

At a glance you can now see a lot of what your Linux environment sees. This is great for those IT staff who are less familiar with Linux -they can now confidently determine if the computer,

  • has a valid IP address
  • can see the host's harddisks
  • can connect to the Altiris server

If something is amiss, they can now directly give you a lot of useful information which can only accelerate resolution for when something goes awry...

 

How to Install SplashDiag.sh

Installing SplashDiag into Linux Automation is pretty straightforward,

  1. Download ZIP file
    Download splash_diag.zip from this article and extract the file splash_diag.sh to your desktop
     
  2. Open up BootDisk Creator
    From the Deployment Console, open BootDisk Creator and expand the "Linux Additional Files" section
    Within the "Linux Additional Files" section, ensure you have the startup folder created. If it is not present, right-click "Linux Additional Files" and select New -> Folder from the context menu.




     
  3. Add Script to Linux Startup
    Add this article's shell script file here by right-clicking the startup folder and selecting the "Add File" option. Browse to the splash_diag.sh script to add it to the folder.

    Below I show a screenshot of a BDC screen which has a more involved startup folder. Here you can see I've got three shell scripts added;

    a-mountUSB.sh
    This is the shell script from "Mounts Removable Disks in Linux Automation". It ensures that, should a USB flash drive be connected, it is mounted to /mnt/usb.

    b-splash_diag.sh
    This is the shell script from today's download, but with a prefix of 'b-' added (which I explain below).

    menu.sh
    The shell script which calls ImageInvoker, a client-server app to allow the DS imaging process to be initiated from the client. 



    What I've done here then is to rename the splash_diag.sh script to b-splash_diag.sh -this is to ensure it executes after a-mountUSB.sh and before ImageInvoker's menu.sh. If you have other startup scripts, then have a think here about the ordering you want. Of course, if you don't have anything pre-existing in startup, then just add splash_diag.sh as it is -no need to rename.
     
  4. Rebuild Linux automation
    Right-click your target Linux automation configuration, and select the option to create your bootdisks.


     

If you use PXE and require SplashDiag there, then just regenerate your Linux PXE options using the PXE Configuration utility.

 

Testing SplashDiag

All you have to do now is boot from your newly created media, and as Linux automation loads you should see the screenshot shown at the beginning of today's article.

Now press ALT-F2 on your keyboard and watch what happens... With luck you'll see this.

 

A bonus feature of the SplashDiag output is that it is sent to both of Linux automation's consoles, formally known as tty1 and tty2. To switch back to the default tty1 console hit ALT-F1. This second output is handy as it keeps the diagnostic info cleanly presented on the alternate screen, even when it has long since scrolled off the default tty1 console.
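
As a rough illustration of the idea of writing the same text to more than one console, here is a minimal sketch. It is not taken from splash_diag.sh; the function name print_to_consoles is hypothetical, and it simply assumes each console can be written to as a file such as /dev/tty1 or /dev/tty2.

```shell
#!/bin/bash
# Illustrative helper (name is hypothetical): copy whatever arrives on stdin
# to every file or device named on the command line.
print_to_consoles() {
  local text target
  text=$(cat)                          # slurp stdin once so it can be replayed
  for target in "$@"; do
    printf '%s\n' "$text" > "$target"  # write a full copy to each target
  done
}
```

In Linux automation this might be called as `some_diag_function | print_to_consoles /dev/tty1 /dev/tty2`, where some_diag_function stands in for whatever produces the splash text.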

 

SplashDiag -Helping you to Image From USB Sticks!!

A special mention now goes to a small piece of real estate on the above screenshot -the text (1st Disk) which appears highlighted in the top right-hand corner of the screen. This text is there to tell you which disk is the first fixed disk. Now, why is this important? Well, when imaging or booting from USB sticks, some hardware can see the USB stick as the first disk and put it in the Disk 1 slot. Should you deploy an image to your computer in this scenario, your USB drive will be imaged, not the computer!!

So, this first disk tag acts as a warning. If the tag is not against Disk 1, because Disk 1 is instead a flash drive, then take note that by default images will be deployed to your flash device.

To assist you here, SplashDiag stores the disk index of the first fixed disk in /tmp/firstdisk.txt. If you have this problem when imaging from USB sticks in your environment, just grab this index from the file and call RDeploy in a script which explicitly references this disk index.
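
Grabbing the index might look something like the sketch below. This is only a sketch under assumptions: the helper name first_fixed_disk is hypothetical, and it assumes the file holds nothing but the index as a plain number. The RDeploy call itself is left as a comment, as the exact switch should be taken from the RDeploy documentation.

```shell
#!/bin/bash
# Illustrative helper (name is hypothetical): print the disk index stored in
# the given file (default /tmp/firstdisk.txt), or fail if the file is missing
# or does not hold a plain number.
# Assumption: the file contains just the index, e.g. "2".
first_fixed_disk() {
  local file="${1:-/tmp/firstdisk.txt}" index
  [[ -r "$file" ]] || return 1
  # read fails on a final line without a newline, yet still fills $index
  read -r index < "$file" || [[ -n "$index" ]] || return 1
  [[ "$index" =~ ^[0-9]+$ ]] || return 1
  echo "$index"
}

if disk=$(first_fixed_disk); then
  echo "Image should be deployed to disk $disk"
  # Call RDeploy here, passing $disk via its disk-selection switch.
else
  echo "Could not determine the first fixed disk" >&2
fi
```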

 

How SplashDiag Works

SplashDiag works by munging data from the following sources,

  1. Linux's FDisk utility
  2. The Linux Kernel's message buffer, dmesg
  3. The proc filesystem /proc
  4. The Linux network interface configuration utility, ifconfig
  5. The Linux lspci utility, which lists detailed PCI bus device data 

 

Getting the Network Data

Let's first start with how we get all this network info. First, the lspci utility can be used to great effect for zooming in on PCI device classes. For example, Ethernet controllers are a subclass of the network device class. As Network Controllers have a class code of 0x02 and Ethernet Controllers have a sub-class code of 0x00, we can quickly list any Ethernet presence on our PCI bus with the following command,

 /# lspci -n | grep "0200:" 

Where we've called lspci with the '-n' option to show numeric IDs. The output on my VM is,

 02:01.0 0200: 8086:100f (rev 01) 

This might not look visually stunning, but each of these numbers tells us something, 

02 : the bus number the device is attached to
01 : the device number
.0 : PCI device function 

02 : Device Class ID
00 : Device Subclass ID

8086: Vendor  ID
100f : Vendor Device ID

If we now run lspci again in the more human-readable mode (without the -n option) we can narrow down on the PCI bus entry directly for our controller,

 /# lspci | grep "02:01.0" 

And this gives us now the extra information of the Ethernet controller's name,

 02:01.0 Ethernet Controller: Intel Corporation 82545EM Gigabit Ethernet Controller 
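
The two lspci steps above can be stitched together in a short script. What follows is a minimal sketch rather than the code shipped in splash_diag.sh; the function name ethernet_slots is purely illustrative, and it assumes `lspci -n` prints the class code followed by a colon, as the grep above expects.

```shell
#!/bin/bash
# Illustrative helper (not from splash_diag.sh): read `lspci -n` style text on
# stdin and print the slot address (bus:device.function) of each Ethernet
# controller found.
ethernet_slots() {
  local slot rest
  while read -r slot rest; do
    # Class 02, subclass 00 shows up as "0200:". Some lspci versions print
    # "Class 0200:", so look for it anywhere after the slot address.
    case "$rest" in
      *0200:*) echo "$slot" ;;
    esac
  done
}

# Demo using the line from my VM
printf '%s\n' '02:01.0 0200: 8086:100f (rev 01)' | ethernet_slots   # prints 02:01.0
```

On a live system the input would come from `lspci -n | ethernet_slots`, and each slot printed could then be handed to `lspci -s` to pick out the human-readable entry for just that device.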

 

Now we have the name of our Ethernet card, what about its IP and MAC address? For this, we turn to Linux's interface configuration utility ifconfig,

 /# ifconfig 

 

Which gives us the following output,

 

 eth0      Link encap:Ethernet  HWaddr 00:0C:29:1F:60:BF  
          inet addr:192.168.157.10  Bcast:192.168.157.255  Mask:255.255.255.0
          UP BROADCAST NOTRAILERS RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:1395 errors:0 dropped:0 overruns:0 frame:0
          TX packets:809 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:1717855 (1.6 Mb)  TX bytes:67270 (65.6 Kb)

lo        Link encap:Local Loopback 
          inet addr:127.0.0.1  Mask:255.0.0.0
          UP LOOPBACK RUNNING  MTU:16436  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0 
          RX bytes:0 (0.0 b)  TX bytes:0 (0.0 b)
 
Now, all the information we need is here, but it's a little hidden away. The MAC address appears after the HWaddr substring and the IP address appears after the inet addr substring. Luckily, for those occasions where we need to extract data from a regular text pattern we can turn to regular expressions. These expressions are code formulations which represent text patterns. By defining your text pattern in regular expression code, it is then a simple operation to scan any text and extract the data you need.
 
So, for example I'm looking for a pattern which,
  1. Looks for the string "eth" followed by a number
  2. Which is followed by some white space
  3. Which is followed by some text, leading up to a colon
  4. Which is followed by the word Ethernet
  5. Which is followed by some whitespace
  6. Which is followed by the text "HWaddr"
  7. Which is followed by some whitespace
  8. Which is followed by two letters/numbers
  9. Which is followed by a colon
  10. Which is followed by two letters/numbers
  11. Which is followed by a colon
  12. ...etc..

I should point out here that our pattern only has to match up to the limit of the data we are looking for -it behaves like a substring match to find your pattern anywhere within the text. So I start at the "eth" entry and end after the IP address is found.

For example, some code which looks in the ifconfig output for the device name, hardware address and IP address might be as follows.

 

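  # Keep the pattern in a variable: bash 3.2 and later treat a quoted pattern
  # on the right of =~ as a literal string rather than a regular expression
  re='(eth[0-9])[^:]*:Ethernet[[:space:]]*HWaddr[[:space:]]*(\w\w:\w\w:\w\w:\w\w:\w\w:\w\w)[^:]*:([0-9.]+)'
  if [[ "$output" =~ $re ]]
  then
    echo "Device is : ${BASH_REMATCH[1]}"
    echo "IP Address is ${BASH_REMATCH[3]}"
    echo "MAC is :  ${BASH_REMATCH[2]}"
  fi 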
 

Which is a pretty complex regular expression. In the if clause we have, on the left of the regular expression operator =~, the input to scan; to the right of the operator we have the pattern we want to search for -our regular expression. Here are some simple tips to help you decipher the code,

  • [0-9] matches any single digit
  • [^:] matches any character that isn't a colon
  • [^:]* matches any number of consecutive characters that are not colons (including none)
  • [[:space:]] matches a single whitespace character (space, tab or newline)
  • \w matches a character which is a letter, a number or an underscore


If you are not familiar with regular expressions and want to know more I've included a couple of references at the end of this download to help you get started.

The important characters for me in the regular expression are the brackets ( ) which surround specific entries of interest. They tell the regular expression engine that, when it finds a match to the regular expression within the string, it should keep hold of the text matched within these brackets so that we can retrieve it later.

So for example I've bracketed "eth[0-9]" so that the ethernet device gets pushed into the variable ${BASH_REMATCH[1]}. I've then bracketed the hardware address and IP address entries in the same way to retrieve as bash rematch variables. This is what then allows me to store just the specific info I need from the output.
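
To see the captures being stored, here is a small self-contained sketch which runs against the eth0 lines from the ifconfig output shown earlier. It is an illustration rather than the exact code in splash_diag.sh: the variable names are my own, and [[:xdigit:]] (any hexadecimal digit) is swapped in for \w as an assumption made for portability.

```shell
#!/bin/bash
# Sample text taken from the eth0 entry in the ifconfig output above
output='eth0      Link encap:Ethernet  HWaddr 00:0C:29:1F:60:BF
          inet addr:192.168.157.10  Bcast:192.168.157.255  Mask:255.255.255.0'

# Build the pattern in pieces so the MAC address part stays readable
hex='[[:xdigit:]][[:xdigit:]]'
mac_re="$hex:$hex:$hex:$hex:$hex:$hex"
re="(eth[0-9])[^:]*:Ethernet[[:space:]]*HWaddr[[:space:]]*($mac_re)[^:]*:([0-9.]+)"

if [[ "$output" =~ $re ]]; then
  nic="${BASH_REMATCH[1]}"   # first bracketed group: device name
  mac="${BASH_REMATCH[2]}"   # second group: hardware address
  ip="${BASH_REMATCH[3]}"    # third group: IP address after "inet addr:"
  echo "$nic $mac $ip"       # prints: eth0 00:0C:29:1F:60:BF 192.168.157.10
fi
```

Note that ${BASH_REMATCH[0]} always holds the whole matched text, which is why the bracketed groups are numbered from 1.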

 

Getting the Disk Data

Getting all the pieces of data together to correlate the disk information was actually quite a challenge. For example, I wanted the user to see,

  1. What any presented disk actually is. For example, is it a Seagate or Western Digital disk, or a USB flash device?
  2. How big it is
  3. Where in the disk order it lies

I also had on my wish list for the user to be able to see the partition structures, but I didn't manage to get that far in the time I had. And as I haven't actually needed that info I've not yet reserved the time to add it in.

The first task of finding out what a disk actually is turned out to be quite the challenge. After rummaging through Linux Automation's sock drawers for a little while, I found part of my answer in Linux's filesystem interface into the kernel -the /proc filesystem. In particular the file /proc/scsi/scsi

 

 /# more /proc/scsi/scsi

Attached devices:

Host: scsi0 Channel: 00 Id: 00 Lun: 00
  Vendor: ATA Model: VMware Virtual I Rev: 0000
  Type: Direct-Access ANSI SCSI revision: 05

Host: scsi0 Channel: 00 Id: 01 Lun: 00
  Vendor: NECVMWar Model: VMware IDE CDR01 Rev: 1.00
  Type: CD-ROM ANSI SCSI revision: 05

Host: scsi3 Channel: 00 Id: 00 Lun: 00
  Vendor: Generic Model: Flash Disk Rev: 1.68
  Type: Direct-Access ANSI SCSI revision: 02
 

So here I have a link between the hierarchical SCSI addressing scheme (adapter number, channel number, id number & LUN) and the device's model information as received by the SCSI 'INQUIRY' command.

So here we can see that we have a VMware IDE disk on SCSI address 0:0:0:0 and a Flash disk on 3:0:0:0.
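
Pulling the address and model out of this file programmatically might look like the following. This is a minimal sketch under assumptions, not the code from splash_diag.sh: the function name scsi_models and the "address|model" output format are my own inventions for illustration.

```shell
#!/bin/bash
# Illustrative parser (name is hypothetical): read /proc/scsi/scsi style text
# on stdin and print "host:channel:id:lun|model" for each attached device.
scsi_models() {
  local line addr=""
  local host_re='^Host: scsi([0-9]+) Channel: ([0-9]+) Id: ([0-9]+) Lun: ([0-9]+)'
  local model_re='Model: (.*) Rev:'
  while IFS= read -r line; do
    if [[ "$line" =~ $host_re ]]; then
      # 10# forces base 10, so zero-padded fields like "08" are not read as octal
      addr="$((10#${BASH_REMATCH[1]})):$((10#${BASH_REMATCH[2]}))"
      addr="$addr:$((10#${BASH_REMATCH[3]})):$((10#${BASH_REMATCH[4]}))"
    elif [[ -n "$addr" && "$line" =~ $model_re ]]; then
      echo "$addr|${BASH_REMATCH[1]}"
      addr=""
    fi
  done
}

# Demo using two of the entries shown above
scsi_models <<'EOF'
Attached devices:
Host: scsi0 Channel: 00 Id: 00 Lun: 00
  Vendor: ATA Model: VMware Virtual I Rev: 0000
  Type: Direct-Access ANSI SCSI revision: 05
Host: scsi3 Channel: 00 Id: 00 Lun: 00
  Vendor: Generic Model: Flash Disk Rev: 1.68
  Type: Direct-Access ANSI SCSI revision: 02
EOF
```

On a live system the same function would be fed with `scsi_models < /proc/scsi/scsi`.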

This addressing scheme is also used by the Linux kernel when presenting messages about device attachment. These messages help tell us where the device is attached in the /dev/ virtual filesystem.

 

 /# dmesg | grep "Attached" | grep "0:0:0:0"
sd 0:0:0:0 [sda] Attached SCSI disk

/# dmesg | grep "Attached" | grep "3:0:0:0"
sd 3:0:0:0 [sdb] Attached SCSI removable disk

 

So this gives us our link between disk number (i.e. whether it's sda, sdb, sdc etc) and what the device actually is. A quick fdisk query can then tell us the disk's size,
 
 /# fdisk -l | grep "Disk /dev/sda"
Disk /dev/sda: 8589 MB, 8589934592 bytes

/# fdisk -l | grep "Disk /dev/sdb"
Disk /dev/sdb: 3994 MB, 3994025984 bytes
 
 
Programmatically, we can attack this with loops and regular expression matching. This helps us merge the results of an fdisk query with those we already have from dmesg and /proc/scsi/scsi. The objective is to link each device number to its name and size.
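
A loop of this kind could be sketched as below. Treat it as an illustration under assumptions rather than the script's actual code: the sample data is modelled on the command output above, and the function name summarise_disks and its "device|address|kind|size" output format are hypothetical.

```shell
#!/bin/bash
# Sample data modelled on the dmesg and fdisk output above; on a live system
# these would be captured from `dmesg` and `fdisk -l` instead.
dmesg_out='sd 0:0:0:0 [sda] Attached SCSI disk
sd 3:0:0:0 [sdb] Attached SCSI removable disk'
fdisk_out='Disk /dev/sda: 8589 MB, 8589934592 bytes
Disk /dev/sdb: 3994 MB, 3994025984 bytes'

# The optional colon after the address allows for kernels which print
# "sd 0:0:0:0: [sda]" rather than "sd 0:0:0:0 [sda]".
attach_re='^sd ([0-9]+:[0-9]+:[0-9]+:[0-9]+):? \[(sd[a-z]+)\] Attached SCSI (removable )?disk'

# Print "device|scsi address|kind|size" for every attached disk.
summarise_disks() {
  local line addr dev kind size size_re
  while IFS= read -r line; do
    [[ "$line" =~ $attach_re ]] || continue
    addr="${BASH_REMATCH[1]}"
    dev="${BASH_REMATCH[2]}"
    kind="fixed"
    if [[ -n "${BASH_REMATCH[3]}" ]]; then kind="removable"; fi
    # Look up this device's size in the fdisk text
    size="unknown"
    size_re="Disk /dev/$dev: ([0-9.]+ [A-Za-z]+),"
    if [[ "$fdisk_out" =~ $size_re ]]; then size="${BASH_REMATCH[1]}"; fi
    echo "$dev|$addr|$kind|$size"
  done <<< "$dmesg_out"
}

summarise_disks
```

The SCSI address in the second field is the key which would let each line be joined to the model text from /proc/scsi/scsi.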
 
Finally, as in Altiris terms we talk of disk numbers rather than letters, we convert the disk letters to numbers (a=1, b=2 etc) to make the resulting data that little bit more presentable.
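
The letter-to-number conversion is simple enough to sketch in a few lines. The helper name disk_number is hypothetical, and this version only copes with single-letter suffixes (sda to sdz), which is an assumption on my part.

```shell
#!/bin/bash
# Illustrative helper (name is hypothetical): convert a device name such as
# "sda" or "/dev/sdb" to its 1-based disk number (a=1, b=2, ...).
disk_number() {
  local letters="abcdefghijklmnopqrstuvwxyz"
  local dev="${1##*/}"          # strip any leading /dev/
  local letter="${dev: -1}"     # last character, e.g. "a"
  # Reject anything that does not end in a lower-case letter (e.g. "sda1")
  [[ -n "$letter" && "$letters" == *"$letter"* ]] || return 1
  local before="${letters%%"$letter"*}"   # the letters that come before ours
  echo $(( ${#before} + 1 ))
}

disk_number sda        # prints 1
disk_number /dev/sdb   # prints 2
```

Looking the letter up in a fixed alphabet string avoids any dependence on character codes or locale settings.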
 
And I have to say if you got to reading this far give yourself a pat on the back! The last section could never be claimed to be easy going ;-) 
 
Kind Regards,
Ian./

 

References

Attachment(s)
zip file
splash_diag.zip   1 KB   1 version
Uploaded - Feb 25, 2020


Comments

Jun 23, 2011 04:07 AM

Just to let you know I'm still thinking on the best way to add enumeration in to maintain the disk ordering -it's not as simple as changing the 's' to 'h' in the regular expressions unfortunately. Will provide an update when I come up with something....

Just some live working... to keep this in my mind...

 

 
 
 sd 0:0:0:0: [sda] Attached SCSI disk (USB DRIVE)
  Disk /dev/sda: 40.0 GB, 40060403712 bytes
  Device     Boot      Start         End      Blocks     Id  System
  /dev/sda1            1             5174     39115408+  7   HPFS/NTFS


sd 1:0:0:0: [sdb] Attached SCSI removable disk (USB FLASH DISK)
   Disk /dev/sdb: 262 MB, 262144000 bytes
   Device     Boot      Start         End      Blocks     Id  System
   /dev/sdb1   *        2             16000    255984     b   W95 FAT32

hda: 1024MB ATA Flash Disk, ATA DISK drive
hda: host max PIO4 wanted PIO255(auto-tune) selected PIO4
hda: MWDMA2 mode selected
hda: max request size: 128KiB
hda: 2001888 sectors (1024 MB) w/1KiB Cache, CHS=1986/16/63
hda: cache flushes not supported
hda: hda1 hda2
   
   Disk /dev/hda: 1024 MB, 1024966656 bytes
   Device     Boot      Start         End      Blocks   Id  System
   /dev/hda1   *        1             1984     999904+  7   HPFS/NTFS
   /dev/hda2            1985          1985          63  45  Unknown
 

Jun 16, 2011 08:21 AM

Ian,

I welcome your work in the extra info that this splash-screen provides.

The diskinfo is not recognising the local disk on my HP thinclient T5730. I attached the output of dmesg, showdisk and fdisk for you. The disk in this hardware is not /dev/sda* but /dev/hda*.

The attached zip-file T5730_1.zip contains the output of a vanilla T5730.

The attached zip-file T5730_2.zip contains the output of a T5730 client with a connected USB stick and a connected USB disk.

Eventually I want to use your generated /tmp/firstdisk.txt as a parameter to my rdeploy task.

Maybe you can have a look at it.

Jun 16, 2011 05:42 AM

Amazing!! I couldn't actually find a computer here which mounted disks under /dev/hd ! Thanks for the logs -will take a look.

Kind Regards,
Ian./

Jun 12, 2011 02:29 PM

Hi Kyle -thinking about doing something similar for WinPE as it's always a pain to figure out if the NIC/Mass-storage stuff is loaded properly there too.

Other fish to fry first... but will get there!

 

Jun 01, 2011 01:20 PM

More fine work as always Ian!  We don't use Linux automation here (yet...there is talk of a Linux workstation build!), but I'm sure this would be a big help to those who do.
