How to Enable SSL for CSWapache2

If you’ve spent any time at all around Solaris 10, you know that Sun has invested a fair amount of effort developing a pretty snazzy Service Management Facility (SMF). It is extremely flexible and feature rich, but it’s not quite as straightforward as the old legacy /etc/init.d scripts. If you’re running the OpenCSW Apache package, it installs a service manifest into SMF, so you’ll have to edit a service property to run Apache with SSL… Here’s how:


# svccfg

svc:> select cswapache2
svc:/network/http:cswapache2> listprop httpd/ssl

httpd/ssl  boolean  false

svc:/network/http:cswapache2> setprop httpd/ssl=true
svc:/network/http:cswapache2> exit

Now, make the change active. Notice that simply bouncing the service is not enough; svcprop keeps reporting the old value until the service is refreshed:


# svcadm disable cswapache2
# svcadm enable cswapache2
# svcprop -p httpd/ssl svc:/network/http:cswapache2

false

# svcadm refresh cswapache2
# svcprop -p httpd/ssl svc:/network/http:cswapache2

true
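
If you prefer to skip the interactive session, the same change can be made non-interactively. This is just a sketch, assuming the cswapache2 abbreviation resolves the same way it does above:

# svccfg -s cswapache2 setprop httpd/ssl = true
# svcadm refresh cswapache2
# svcadm restart cswapache2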

ZoneType.sh Version 2.0

We just started supporting Solaris 10 in our VMware cluster, so I had to update my zone type script to detect whether the OS is running there. I’m not sure how I feel about depending on the output of prtdiag since the interface is labeled “unstable”, but it works for now, and I really don’t see Sun changing the first line of output where the system configuration is listed. Anyhow, when issued with the -v or --vmware flag, the script returns 0 if it’s running on the cluster and 1 if it is not.

Usage:

# zonetype.sh -g or --global
Return 0: The machine is a global zone with 1 or more local zones
Return 1: The machine is not a global zone

# zonetype.sh -l or --local
Return 0: The machine is a local zone
Return 1: The machine is not a local zone

# zonetype.sh -v or --vmware
Return 0: The machine is running on a VMware hypervisor
Return 1: The machine is not running in VMware

#! /bin/bash
#
# When issued with the -g or --global flag, this script will return:
# 0 if the machine is a global zone and has one or more local zones. 
# Otherwise, it will return 1
#
# When issued with the -l or --local flag, this script will return:
# 0 if it is a local zone and 1 if it is not
#
# When issued with the -v or --vmware flag, this script will return:
# 0 if it is a vmware host and 1 if not.
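#
# The script keys off the first column of "zoneadm list -civ" output; list[0]
# picks up the header's "ID" token, so list[1] and list[2] hold the IDs of the
# first two zones listed.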
#

list=( `/usr/sbin/zoneadm list -civ | awk '{ print $1 }'`)

  case "$1" in
    -g|--global)
        # If the third element in our array is null, set it to 0
        if [ "${list[2]}" == "" ]; then
                list[2]=0
        fi
        # This is a global zone only if it has one or more local zones.
        if [ "${list[1]}" -eq 0 ] && [ "${list[2]}" -ge 1 ]; then
                # 0 is returned if we have a global zone with local zones,
                # otherwise, we return 1
                exit 0
        else
                exit 1
        fi
        ;;
    -l|--local)
        # If the second element in our array is >= 1, this is a local zone.
        if [ "${list[1]}" -ge 1 ]; then
                # Return 0 if this is a local zone, otherwise return 1.
                exit 0
        else
                exit 1
        fi
        ;;
    -v|--vmware)
        # Don't run our check in local zones... prtdiag can't run there
        if [ "${list[1]}" != 0 ]; then
                exit 1
        else
                vmhost=( `/usr/sbin/prtdiag | grep System | awk '{ print $5 }'` )
                # If the host is running on the VMware cluster return 0,
                # otherwise, return 1
                if [ "${vmhost[0]}" == "VMware" ]; then
                        exit 0
                else
                        exit 1
                fi
        fi
        ;;
    *)
        echo "Usage: /local/adm/zonetype.sh {-l | --local | -g | --global | -v | --vmware}"
        exit 1
        ;;
  esac
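
Called from another script, the exit status is all you need. Here is a rough sketch of how I branch on it (the path matches the usage message above; the echo messages are just placeholders):

#! /bin/bash
# Pick an action based on where we are running.
if /local/adm/zonetype.sh -v; then
        echo "On the VMware cluster; skip the hardware checks."
elif /local/adm/zonetype.sh -g; then
        echo "Global zone with local zones; include the zone roots."
else
        echo "Local zone or standalone global zone; run the standard tasks."
fi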

Script to Determine Solaris 10 Zone Type

We use a lot of local zones in our Solaris 10 environment. We also use cfengine pretty heavily and there are some instances when we need to include or exclude certain automated tasks based on what type of zone we are working with. I wrote this little script that checks to see what type of zone we are dealing with. Based on the return value, I can set a cfengine class and control what gets run and where.

  • Return 0 if the machine is a global zone with 1 or more local zones
  • Return 1 if the machine is either a local zone or a global zone with 0 local zones

#! /bin/bash
#
# When issued with the -g or --global flag, this script will return:
# 0 if the machine is a global zone and has one or more local zones.
# Otherwise, it will return 1
#
# When issued with the -l or --local flag, this script will return:
# 0 if it is a local zone and 1 if it is not
#

list=( `/usr/sbin/zoneadm list -civ | awk '{ print $1 }'`)
  case "$1" in
    -g|--global)
        # If the third element in our array is null, set it to 0
        if [ "${list[2]}" == "" ]; then
                list[2]=0
        fi
        # This is a global zone only if it has one or more local zones.
        if [ "${list[1]}" -eq 0 ] && [ "${list[2]}" -ge 1 ]; then
                # 0 is returned if we have a global zone with local zones, otherwise we return 1
                exit 0
        else
                exit 1
        fi
        ;;
    -l|--local)
        # If the second element in our array is >= 1, this is a local zone.
        if [ "${list[1]}" -ge 1 ]; then
                # Return 0 if this is a local zone, otherwise return 1.
                exit 0
        else
                exit 1
        fi
        ;;
    *)
        echo "Usage: /local/adm/zonetype.sh {-l | --local | -g | --global}"
        exit 1
        ;;
  esac
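
For reference, both flags key off the first column of zoneadm list -civ. On a global zone with one running local zone the output looks roughly like this (the zone name is just an example, and newer Solaris 10 updates tack on BRAND and IP columns). The header’s “ID” token lands in list[0], so list[1] and list[2] end up holding the first two zone IDs:

# zoneadm list -civ
  ID NAME             STATUS     PATH
   0 global           running    /
   1 testzone         running    /zones/testzone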

Replace Failed SVM Mirror Drive

So you have used SVM to mirror your disk, and one of the two drives fails. Aren’t you glad you mirrored them! You don’t have to do a restore from tape, but you are going to have to replace the failed drive.

Many modern RAID arrays just require you to take out the bad drive and plug in the new one, while everything else is taken care of automatically. It’s not quite that easy on a Sun server, but it’s really just a few simple steps. I just had to do this, so I thought I would write down the procedure here.

Basically, the process boils down to the following steps:

  • Detach the failed meta devices from the failed drive
  • Delete the meta devices from the failed drive
  • Delete the meta databases from the failed drive
  • Unconfigure the failed drive
  • Remove and replace the failed drive
  • Configure the new drive
  • Copy the remaining drive’s partition table to the new drive
  • Re-create the meta databases on the new drive
  • Install the bootblocks on the new drive
  • Recreate the meta devices
  • Attach the meta devices

Let’s look at each step individually. In my case, c0t1d0 has failed, so I detach all the metadevices on that disk and then delete them:


# metadetach -f d0 d2
# metadetach -f d10 d12
# metadetach -f d40 d42
# metaclear d2
# metaclear d12
# metaclear d42

Next I take a look at the status of my meta databases. Below we can see that the replicas on that disk have write errors:

# metadb -i
        flags           first blk       block count
     a m  p  luo        16              8192            /dev/dsk/c0t0d0s3
     a    p  luo        8208            8192            /dev/dsk/c0t0d0s3
     W    p  luo        16              8192            /dev/dsk/c0t1d0s3
     W    p  luo        8208            8192            /dev/dsk/c0t1d0s3
 r - replica does not have device relocation information
 o - replica active prior to last mddb configuration change
 u - replica is up to date
 l - locator for this replica was read successfully
 c - replica's location was in /etc/lvm/mddb.cf
 p - replica's location was patched in kernel
 m - replica is master, this is replica selected as input
 W - replica has device write errors
 a - replica is active, commits are occurring to this replica
 M - replica had problem with master blocks
 D - replica had problem with data blocks
 F - replica had format problems
 S - replica is too small to hold current data base
 R - replica had device read errors

The replicas on c0t1d0s3 are dead to us, so let’s wipe them out!


# metadb -d c0t1d0s3
# metadb -i

        flags           first blk       block count
     a m  p  luo        16               8192            /dev/dsk/c0t0d0s3
     a    p  luo        8208             8192            /dev/dsk/c0t0d0s3

The only replicas we have left are on c0t0d0s3, so I’m all clear to unconfigure the device. I run cfgadm to get the c0 path:


# cfgadm -al

Ap_Id                          Type         Receptacle   Occupant     Condition
c0                             scsi-bus     connected    configured   unknown
c0::dsk/c0t0d0                 disk         connected    configured   unknown
c0::dsk/c0t1d0                 disk         connected    configured   unknown
c0::dsk/c0t2d0                 disk         connected    configured   unknown
c0::dsk/c0t3d0                 disk         connected    configured   unknown
c1                             scsi-bus     connected    configured   unknown
c1::dsk/c1t0d0                 CD-ROM       connected    configured   unknown
usb0/1                         unknown      empty        unconfigured ok
usb0/2                         unknown      empty        unconfigured ok
usb1/1.1                       unknown      empty        unconfigured ok
usb1/1.2                       unknown      empty        unconfigured ok
usb1/1.3                       unknown      empty        unconfigured ok
usb1/1.4                       unknown      empty        unconfigured ok
usb1/2                         unknown      empty        unconfigured ok

I run the following command to unconfigure the failed drive:


# cfgadm -c unconfigure c0::dsk/c0t1d0
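
Before physically pulling the disk, it doesn’t hurt to run cfgadm against that one attachment point again and confirm the Occupant column now reads unconfigured:

# cfgadm -al c0::dsk/c0t1d0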

  • The drive light turns blue
  • Pull the failed drive out
  • Insert the new drive

Configure the new drive:


# cfgadm -c configure c0::dsk/c0t1d0

Now that the drive is configured and visible from within the format command, we can copy the partition table from the remaining mirror member:


# prtvtoc /dev/rdsk/c0t0d0s2 | fmthard -s - /dev/rdsk/c0t1d0s2
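
If you want to double-check the copy, print the new drive’s partition table and eyeball it against the survivor’s:

# prtvtoc /dev/rdsk/c0t1d0s2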

Next, I install the bootblocks onto the new drive:


# installboot /usr/platform/`uname -i`/lib/fs/ufs/bootblk /dev/rdsk/c0t1d0s2

Create the state database replicas on the new drive:


# metadb -a -c 2 c0t1d0s3

Recreate the metadevices:


# metainit -f d2 1 1 c0t1d0s0
# metainit -f d12 1 1 c0t1d0s1
# metainit -f d42 1 1 c0t1d0s4

And finally, reattach the metadevices, which kicks off a resync from the surviving half of each mirror:


# metattach d0 d2
# metattach d10 d12
# metattach d40 d42
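
The resync runs in the background and can take a while on large slices. metastat will show the progress for each mirror, with output along the lines of:

# metastat d0 | grep -i resync
    Resync in progress: 12 % done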

Install Solaris Package in Alternate Base Directory

Unless you specify a different administrative file, the pkgadd command reads “/var/sadm/install/admin/default”, which sets “basedir=default” and leaves the base directory up to the package itself (commonly “/opt”). Do not change the settings in this file, but rather create a custom admin file with an alternate “basedir” directive if you want to install your package into a different directory. We are going to install our package into “/var/applications”, and call our custom admin file “custom”.

First, create and edit “/var/sadm/install/admin/custom”, adding a line similar to this:
basedir=/var/applications/$PKGINST
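
An easy way to build it is to copy the stock admin file and change only the basedir line; a quick sketch:

# cp /var/sadm/install/admin/default /var/sadm/install/admin/custom
# grep basedir /var/sadm/install/admin/custom
basedir=/var/applications/$PKGINST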

Next, issue the pkgadd command with the “-a” flag to call your custom admin file:

pkgadd -d device -a custom PackageName
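
After the install, pkgparam can confirm where a relocatable package actually landed (the package name here is only an example):

# pkgparam SUNWexample BASEDIR
/var/applications/SUNWexample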

This really comes in handy when your customers want to retain control over their packages, but you don’t want to give them access to write packages into the system area. More detailed instructions can be found here.

X11 Forwarding Broken on Solaris

If you’re running Solaris 8 or 9 and an upgrade results in broken SSH X11 forwarding, the problem may be Sun’s sockfs bug. The symptom will be SSH’s failure to set the $DISPLAY variable and an error in your system log looking something like this:

Jun 3 09:40:24 servername sshd[26432]: [ID 800057 auth.error] error: Failed to allocate internet-domain X11 display socket.

To fix this, you can either install Sun’s latest sockfs patch for your version of the OS, or simply force sshd into IPv4 mode by doing the following:

Edit your sshd_config file, adding the following:

# IPv4 only
ListenAddress 0.0.0.0

Edit your sshd startup script to issue a “-4” to sshd on start:

case "$1" in
'start')
echo 'starting ssh daemon'
/usr/local/sbin/sshd -4
;;

Restart sshd, and that should pretty much do it… Enjoy.
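
Once sshd is back up, a new forwarded session should get its DISPLAY set by the server. The user and host names below are just placeholders, and the display number will vary:

$ ssh -X user@servername
$ echo $DISPLAY
localhost:10.0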

Sun Project Blackbox – Datacenter in a Can

Lots of small companies want to hire an IT department in a can… You know, the ones who hire only one person to run their Linux servers, code their websites, architect their networks, support their users and order more printer toner. It’s a hard job, but it’s pretty common to see them advertised. What I never dreamed I would see is an entire data center in a can… Literally, in a can… Or at least a shipping container, which is really not that far off.

Leave it to Sun though. Not only have they packed an entire datacenter into a shipping container, they have packed a really good datacenter into a shipping container. Complete with integrated power, cooling, fire suppression, cable management and redundant everything, this little server room-in-a-box has it all. They even showed off how tough it is by putting it through an earthquake!

All told, I really like the idea of my brand new datacenter rolling in on the back of a tractor-trailer truck. It kinda reminds me of the setup the bad guys had in the latest Die Hard movie. I just hope nobody buys one and hires only one person to run it.