Archive for the ‘Systems Management’ Category

Debian-Powered Drupal Configuration Policy

Thursday, October 4th, 2012

By Elizabeth Krumbach

LinuxForce’s web hosting services are designed to provide our customers with the benefits of strong security, simple on-going administration and maintenance, and support for the web software to work well in a large number of situations including multi-site support for applications.

We achieve these objectives by operating a Debian infrastructure with close adherence to Debian’s policy, web application policy and PHP policy.

In particular, we use the Debian package for Drupal 6. This provides community supported upgrades with a strong, well-documented policy. Like other software which ships in Debian, the software version is often somewhat older than the most recent upstream release, but it is regularly patched for security by the maintainer and the Debian security team.

In addition, this infrastructure also offers:

  • Only one package needs to be upgraded whenever there is a security patch, instead of every site individually
  • Strong separation from site specific themes, modules, libraries, uploaded files helps keep files you want to edit away from the Drupal core files (which you don’t want to edit)
  • The automation the infrastructure provides disk, memory, and sysadmin time to be minimized, thus reducing costs
  • The benefits of code maturity, as the Debian Drupal maintainers have thought through many boundary cases which it would take our staff time and trial-and-error to re-discover

Site layout

Experienced Drupal admins may find some of the file locations for the package confusing at first, so let’s clarify the differences to ease the transition.

Our infrastructure supports multiple sites per server. This is implemented by providing each site with their own dbconfig.php and settings.php files, plus the following directories:

  • files/
  • themes/
  • modules/
  • libraries/

All of these are configurable by the user, and are located in /etc/drupal/6/sites/drupal.example.com/. The site also inherits the contents of these default directories from core Drupal.

Access to these files via FTP is discussed below.

Core Drupal files are located in /usr/share/drupal6/ and are shared between all the Drupal sites on the server. They should not be edited since all changes to these files will be lost upon upgrade of the Debian Drupal package.

Users & Permissions

A jailed userdrupal FTP account is created to manage the files for the Drupal install.

The userdrupal account is the owner of all the configurable files located in /etc/drupal/6/sites/drupal.example.com/

The system-wide www-data user must have access to the following by having the www-data group own them and be given the appropriate permissions:

  • Read access to dbconfig.php
  • Write access to the files/ directory (this is where Drupal typically stores uploaded files)

All files (with the exception of dbconfig.php) should be writable by userdrupal, the userdrupal group itself is enforced by the system but all users should be configured client-side to respect group read-write permissions to maintain strict security, this is a umask of 002.

Additional Users and non-Drupal Directories

If there is a non-developer who needs to have access to the files directory, for instance, a specific FTP user for that use may be created and added to the userdrupal group.

For ease of administration, if multiple users exist and we are able to support an ssh account (static IP from client required) for handling administration, sudo can be configured to allow said user to execute commands on behalf of the userdrupal ftp user. Remember to add “umask 002” to the .bashrc to respect group read-write permissions.

If additional directories outside the Drupal infrastructure are required for a site, they will be placed in /srv/www/example.com/ and a separate jailed ftp user created to manage the files here. If a cgi-bin is required, it will be placed in /srv/www/example.com/cgi-bin/

Caveats and Cautions

Because of shared use for core Drupal files located in /usr/share/drupal6/ you may experience problems with the following:

  • Some drush commands and plugins assume the drupal files are editable files within the same document root as the rest of your files, thus may not work as expected
  • When you upgrade the drupal6 package all sites are upgraded at once without testing, customers who are concerned about the impact of the upgrade changes and require very high uptimes can be accommodated by testing the site with the upgraded versions of PHP and Drupal in a testing VM
  • Since the Debian package name does not change, you cannot install the drupal6 package from an older Debian version alongside one from a current version, a separate virtualized Debian environment may be needed for testing upgrades if support is uncertain (this issue does not exist with upgrades from drupal6 to drupal7, as they can be installed alongside each other)
  • A policy for handling root level .htaccess files should be developed if they wish to be used

Conclusion

Although the Debian way differs from the Drupal tarball approach, it makes it possible to scale the service to many sites saving disk, memory, and sysadmin effort. By leveraging this Drupal infrastructure provided by Debian, Linuxforce provides one-off Debian package deployments to dedicated systems, shared arrangements for small businesses who are running several sites, and infrastructure deployments for businesses who provide hosting services. We also offer a boutique hosting service for select customers on one of our systems.

5 things about FOSS Linux virtualization you may not know

Tuesday, July 17th, 2012

By Elizabeth Krumbach

In January I attended the 10th annual Southern California Linux Expo. In addition to speaking and running the Ubuntu booth, I had an opportunity to talk to other sysadmins about everything from selection of distribution to the latest in configuration management tools and virtualization technology.

I ended up in a conversation with a fellow sysadmin who was using a proprietary virtualization technology on Red Hat Enterprise Linux. Not only did he have surprising misconceptions about the FOSS (Free and Open Source Software) virtualization tools available, he assumed that some of the features he was paying extra for (or not, as the case may be) wouldn’t be in the FOSS versions of the software available.

Here are five features that you might be surprised to find included in the space of FOSS virtualization tools:

1. Data replication with verification for storage in server clusters

When you consider storage for a cluster there are several things to keep in mind:

  • Storage is part of your cluster too, you want it to be redundant
  • For immediate failover, you need replication between your storage devices
  • For data integrity, you want a verification mechanism to confirm the replication is working

Regardless of what you use for storage (a single hard drive, a RAID array, or an iSCSI device), the open source DRBD (Distributed Replicated Block Device) offers quick replication over a network backplane and verification tools you can run at regular intervals to ensure deta integrity.

Looking to the future, the FOSS distributed object store and file system Ceph is showing great promise for more extensive data replication.

2. Automatic failover in cluster configurations

Whether you’re using KVM Kernel-based Virtual Machine or Xen, automatic failover can be handled via a couple of closely integrated FOSS tools, Pacemaker and Corosync. At the core, Pacemaker handles core configuration of the resources themselves and Corosync handles quorum and “aliveness” checks of the hosts and resources and logic to manage moving Virtual Machines.

3. Graphical interface for administration

While development of graphical interfaces for administration is an active area, many of the basic tasks (and increasingly, more complicated ones) can be made available through the Virtual Machine Manager application. This manager uses the libvirt toolkit, which can also be used to build custom interfaces for management.

The KVM website has a list of other management tools, ranging from command-line (CLI) to Web-based: www.linux-kvm.org/page/Management_Tools

As does the Xen wiki: wiki.xen.org/wiki/Xen_Management_Tools

4. Live migrations to other hosts

In virtualized environments it’s common to reboot a virtual machine to move it from one host to another, but when shared storage is used it is also possible to do live migrations on KVM and Xen. During these live migrations, the virtual machine retains state as it moves between the physical machines. Since there is no reboot, connections stay intact and sessions and services continue to run with only a short blip of unavailability during the switch over.

Documentation for KVM, including hardware and software requirements for such support, can be found here: www.linux-kvm.org/page/Migration

5. Over-allocating shared hardware

KVM has the option to take full advantage of hardware resources by over-allocating both RAM (with adequate swap space available) and CPU. Details about over-allocation and key warnings can be found here: Overcommitting with KVM.

Conclusion

Data replication with verification for storage, automatic failover, graphical interface for administration, live migrations and over-allocating shared hardware are currently available with the FOSS virtualization tools included in many modern Linux distributions. As with all moves to a more virtualized environment, deployments require diligent testing procedures and configuration but there are many on-line resources available and the friendly folks at LinuxForce to help.

An Infrastructure for Server Clusters for High Availability

Thursday, May 17th, 2012

By Elizabeth Krumbach

As announced in our Cluster Services Built With FOSS post, LinuxForce’s Cluster Services are built exclusively with Free and Open Source Software (FOSS). Here is an expanded outline of the basic architecture of our approach to High-Availability (HA) clustering.

Overview

In any HA deployment there are two main components: hosts and guests. The hosts are the systems which are the core of the cluster itself. The host runs with very limited services dedicated for the use and functioning of the cluster. The host systems handle resource allocation, from persistent storage to RAM to the number of CPUs each guest gets. The host machines give an “outside” look at guest performance and give the opportunity to manipulate them from outside the guest operating system. This offers significant advantages when there are boot or other failures which traditionally would require physical (or at least console) access to debug. The guests in this infrastructure are the virtual machines (VMs) which will be running the public-facing services.

On the host, we define a number of “resources” to manage the guest systems. Resources are defined for ping checking the hosts, bringing up shared storage or storage replication (like drbd) as primary on one machine or the other and launching the VMs.

In the simplest case, the cluster infrastructure is used for new server deployments, in which case the operating system installs are fresh and the services are new. More likely a migration from an existing infrastructure will be necessary. Migrations from a variety of sources are possible including from physical hardware, other virtualization technologies (like Xen) or different KVM infrastructures which may already use many of the same core features, like shared storage. When a migration is required downtime can be kept to a minimum through several techniques.

Hardware configuration

The first consideration when you begin to build a cluster is the hardware. The basic requirement for a small cluster is 3 servers and a fast dedicated network backplane to connect the servers. The three servers can all be active as hosts, but we typically have a configuration where two machines are the hosts and a third, less powerful arbitrator system is available to make sure there is a way to break ties when there is resource confusion.

Two live resource hosts

These systems will be where the guests are run. They should be as similar as possible down to the selection of processor brand and amount of RAM and storage capabilities so that both machines are capable of fully taking over for the other in case of a failure, thus ensuring high availability.

The amount of resources required will be heavily dependent upon the services you’re running. When planning we recommend thinking about each guest as a physical machine and how many resources it needs, allowing room for inevitable expansion of services over time. You can over-commit both CPU and RAM on KVM, so you will want to read a best practices guide such as Chapter 6: Overcommitting with KVM. Disk space requirements and configuration will vary greatly depending upon your deployment, including the ability to use shared storage backplanes and replicated RAID arrays, but Linux Software RAID will typically be used for the core operating system install controlling each physical server. Additionally, using a thorough testing process so you know how your services will behave if they run out of resources is critical to any infrastructure change.

Tie-breaker (arbitrator)

A third server is required to complete quorum for the cluster. In our configuration this machine doesn’t need to have high specs or a lot of storage space. We typically use at least RAID1 so we have file system redundancy for this host.

1000M switch

A fast switch whose only job is to handle traffic between the three machines is highly recommended for assured speed of these two vital resources:

  1. Storage backplane
  2. Corosync/Pacemaker communication

It’s best to keep these off a shared network, which may be prone to congestion or failure, since fast speeds for both these resources are important for a properly functioning cluster.

Key software components

There are many options when it comes to selecting your HA stack, from which Linux distribution to use, to what storage replication system to use. We have selected the following:

Debian GNU/Linux

Like most LinuxForce solutions, we start with a base of Debian stable, currently Debian “Squeeze” 6.0. All of the software mentioned in this article comes from the standard Debian stable repository and is open source and completely free of charge.

Logical Volume Manager (LVM)

We use LVM extensively throughout our deployments for the flexibility of easy reallocation of filesystem resources. In a cluster infrastructure it is used to create separate disk images for each guest and then may be used again inside this disk image for partitioning.

Distributed Replicated Block Device (DRBD)

DRBD is used for replicating storage between the two hosts which have their own storage. Storage needs could also be met by shared storage or other data replication mechanisms.

Kernel-based Virtual Machine (KVM)

Since hardware-based virtualization is now ubiquitous on modern server hardware we use KVM for our virtualization technology. It allows fully virtualized VMs running their own unmodified kernels to run directly on the hardware without the overhead of a hypervisor or emulation.

Pacemaker & Corosync

Pacemaker and Corosync are used together to do the heavy lifting of the cluster management. The two services are deeply intertwined, but at the core Pacemaker handles core configuration of the resources themselves, and Corosync handles quorum and “aliveness” checks of the hosts and resources and determination of where resources should go.

Conclusion

We have deployed this infrastructure for mission critical services including DNS, FTP and web server infrastructures serving everything from internal ticketing systems to high-traffic public-facing websites. For a specific example of implementation of this infrastructure, see Laird Hariu’s report File Servers – The Business Case for High Availability where he covers the benefits of HA for file servers.

Image in this post by Jeannie Moberly, licensed under CC BY-SA 3.0.

Cluster Services Built With FOSS

Thursday, January 12th, 2012

By Elizabeth Krumbach

Built on the Free/Open Source Software (FOSS) model for cluster deployments, LinuxForce staff has been hard at work over the past months developing and deploying LinuxForce Cluster Services built upon exclusively FOSS technologies and on December 15th we put out a press release:

Announcing LinuxForce Cluster Services

In September Laird Hariu wrote the article “File Servers – The Business Case for High Availability” where, in addition to building a case to use clusters, he also briefly outlined how Debian and other FOSS could be used to create a cluster for a file server. File servers are just the beginning, we have deployed clusters which host web, mail, DNS and more.

The core of this infrastructure uses Debian 6.0 (Squeeze) 64-bit and then depending upon the needs and budget of the customer, and whether they have a need for high availability, we use tools including Pacemaker, Corosync, rsync, drbd and KVM. Management of this infrastructure is handled remotely through the virtualization API libvirt using the virsh and Virtual Machine Manager.

The ability to use such high-quality tools directly from the repositories in the stable Debian distribution keeps our maintenance costs down, avoids vendor lock-in and gives companies like ours the ability offer these enterprise-level clustering solutions to small and medium size businesses for reasonable prices.

File Servers – The Business Case for High Availability

Tuesday, September 13th, 2011

By Laird Hariu

Introduction

You have probably heard of high availability transaction processing servers.  You have most likely read about the sophisticated systems used by the airlines to sell tickets online.  They have to be non-stop because downtime translates to lost orders and revenue.  In this article I will discuss the economics of using non-stop technologies for everyday applications.  I will show that even ordinary file sharing applications can benefit from inexpensive Linux based Pacemaker clustering technology.

Availability Goal

What is our availability goal?  Our goal should be to take prudent and cost effective measures to reduce computer downtime to nil in the required service window.  I’m not talking about 99.999 % (five 9s) up time.  This is the popular (and very expensive) claim made by high availability vendors.  I’m talking about maintaining enough up time to service the application.  Take a simple example, for office document preparation the service time window is office hours (9-5).  The rest of the time the desktop PCs can be turned off, nobody is there to operate them anyway.  You only need the PCs for 5 days a week for 8 hours a day or for 2080 hours per desktop PC per year.  This translates into an up time requirement of 24 percent.   Ideally you want the desktop PCs to be available all the time during office hours but are willing to give up availability for routine maintenance and for the infrequent breakdowns that may occur only once per workstation every five years or so.  Perhaps you have two spare desktop PC workstations for every 100.  This extra capacity allows your office workers to resume their work on a spare while their workstation is being repaired.  In this example the cost of maintaining adequate availability is the cost of maintaining two spare desktop PCs.  You might adjust this cost to account for real world conditions at the work site.  Wide swings in operating temperature or poor quality electricity supply, might dictate that you increase the number of spare PCs.  Sounds like a low stress, straightforward availability solution.

Network Effects

The problem gets more complicated when the desktop PCs are networked together and all the documents are stored on a central file server rather than on each workstation’s hard drive.  There is a multiplicative effect.  If the file server is not available then all 100 document processing PCs are rendered unavailable.  Then you have 98 (remember the 2 spares from above) workers being paid but not producing documents.  A failure during office hours can become expensive.  One hour of downtime can cost as much as $1500 in lost worker wages.  A day of downtime can cost $12,000 of lost worker wages.  How long will it take for a hardware repair person to travel to your site?  How long will it take for spare parts to arrive?  How long will it take the repair person to replace the parts?  How long will it take for damaged files to be replaced from backup by your own people?  A serious but not unlikely failure can take several days to be completely resolved.  Its not unreasonable to assume that such a $24,000 failure can occur once every 5 years.  This is a very simple example.  We are not talking about a complicated order-entry or inventory control system.  We are talking about 98 office workers saving files to a central file share so that they can be indexed and backed up.

The Effects of Time

I’m going to add another wrinkle to our office document processing example.  This file sharing setup has been in use for 4 years.  Time flies.  The hardware is getting old faster than you realize.  Old hardware is more likely to fail.  It has been through more thunderstorms, more A/C breakdowns, people knocking the server by accident and all that.  You’ve been noticing that your hardware maintenance plan is costing more every year.  How long is the hardware vendor going to stock spare parts for your obsolete office equipment?  Please forgive me for playing on your paranoia but the real world can be rude.

Time for an Upgrade

In this scenario you conclude that you are going to have to replace that file server soon.  Its going to be a pain to migrate all the files to a new unit.  I am going to have to upgrade to a new version of Windows server.  How much is that going to cost?  How much has Windows changed?  If I am going to have to go to all this trouble, why not get some new improvement out of it.  I know I can get bigger disks and more RAM (random access memory) for less money than I paid for the old server.  Whoops.  Windows is going to cost more.  I have to pay a charge for every workstation attached to it.  That CAL (Client Access License) price has gone up.  I read something about high availability clustering in Windows.  Enterprise Server does that.  Wow.  Look at the price of that!  Remember that $12,000 per day of downtime cost overhang?  It’s more of an issue now that you are dealing with an old system.

A Debian Cluster Solution

Enough of this already.  Since I asked so many questions and raised so many doubts, I owe you, the reader, some answers.  Debian Linux provides a very nice high availability solution for file servers.  You need two servers with directly attached storage and also a third little server that can be little more than a glorified workstation.  You need Debian Squeeze 64 bit edition that has the Pacemaker, Corosync, drbd and Samba packages installed for each server.  The software is free.  You pay for the hardware and a trustworthy Linux consultant who can set everything up for you.  What you get is a fully redundant quorum cluster with fully redundant storage, multiple CPU cores on each node, much more RAM than you had before and much more storage capacity.

Here are hardware price estimates:

Tie breaker node: Two hard drives, 512MB RAM $500.00
Name brand file server node: 8 2TB SATA drives, 24GB RAM, 1 4 core CPU chip,  3 year on site parts and labor warranty. $6,000.00
Second file server node like above. $6,000.00
Misc parts for storage and control networks. $200.00
Total: $12,700.00

Each file server node has software RAID 5 and each node holds 14 terabytes of disk storage.  Because it is completely redundant across nodes, total cluster storage capacity is 14 terabytes.  Performance of this unit will be much better than the old unit.  It effectively has 4 CPUs per file storage node and much more RAM for file buffering.  Software updates from Debian are free.  You just need someone to apply the security patches and version upgrades.

The best feature is complete redundancy for file processing.  In our file server example, any one of the nodes can completely fail and file server processing will continue.  Based on the lost labor time cost estimates above, this system pays for itself if it eliminates 1 day of downtime in a five year period.  You also have hardware maintenance savings of whatever the yearly charge is for your old system times 3 years because you get 3 years of warranty coverage on the new hardware.  You have the consultant’s charges for converting to the new system, but remember, you were going to have to pay that fee for a new Windows system as well.

Conclusion

I hope I have stirred your interest in Linux Pacemaker based clusters.  I have shown a file server upgrade that pays for itself by reducing downtime.  You also upgrade your file server’s performance while reducing out of pocket expenses for software and hardware maintenance.  Not a bad deal.

One way to migrate Xen virtual machines to KVM in Debian

Thursday, November 18th, 2010

By Elizabeth Krumbach

There are dozens of virtualization products on the market. When we launched our first high-availability cluster in early 2008 we chose Xen due to it’s ability to run on non-virtualized hardware, support in Debian 4.0 (Etch) and general flexibility. We’ve learned a lot about the upstream of Xen, including the challenges that Debian maintainers face, and we were increasingly drawn to another free and open source virtualization technology, Kernel Based Virtual Machine (KVM). The primary downside to KVM is that it requires special CPU hardware support to run, but this hardware support is now almost ubiquitous on modern servers. KVM has the advantage of being supported upstream in the Linux kernel itself, removing the onus of difficult kernel patching from the Debian Developers and has become the supported virtualization option for Ubuntu, Fedora and Red Hat. Additionally, KVM allows the guests to run unaltered, meaning you don’t need a special kernel and can run many OSes, from Linux to FreeBSD to Windows 7.

We still work on a number of machines which lack hardware virtualization support, but as our customers upgrade hardware we’ve begun moving production virtual machines from Xen to KVM. In tackling the migrations of these production virtual machines we encountered several challenges, the major ones being:

  • In Xen, partitions were created in separate logical volumes on the host and mounted by Xen itself and as a result we didn’t require Logical Volume Manager (LVM) within our Xen guests, in KVM the logical volume on the host for a virtual machine is a single disk image, not separate partitions.
  • In Xen, the kernel package is not installed, in KVM it is required
  • In Xen, you don’t have an independent bootloader on the OS, in KVM you need one to boot

The first step was to create a partition table on the new KVM image which is identical to the one on Xen. We wanted to use LVM within the guest, which required a Matryoshka (Russian) doll approach. First we’d create a volume group on the host to give us the typical flexibility of LVM host-side, and then we’d create one on the disk image giving us the flexibility required within the guest itself to expand any partitions. Finally we’d need to bootstrap the new system and copy the files over.

One way to go about all of this is a manual process. This solution would allow for scripting the procedure but requires a significant investment of time to get everything right so there is the least amount of down time possible. Since we only have a half dozen of these VMs to migrate in total, we looked for some way which already existed for handling all these steps in a familiar way, which is when we looked to the Debian Installer. Assuming a local mirror of core Debian packages (recommended), we could install a skeleton system on a test IP address which was properly partitioned, had LVM configured and bootstrapped in less than 20 minutes. We could then take this skeleton system that we’re sure is an functioning, bootable system and move the files over from the Xen installs, arguably decreasing our risk with each migration and the downtime required.

To get started, you’ll first need to calculate the total size of the current VM and lvcreate a slice of that size on the KVM system, then launch the Debian installer against the image using virt-install, something like:

virt-install --connect qemu:///system -n guest.example.com -r 768 -f /dev/mapper/VolumeGroup-guest.example.com -s 12 -c /var/lib/libvirt/images/debian-506-i386-netinst.iso --vnc --noautoconsole --accelerate --os-type linux --os-variant debianLenny

In this example we assume the debian-506-i386-netinst.iso is in /var/lib/libvirt/images/ and we want 768M of RAM, the information you put here is similar to the information that you would have previously defined in your /etc/xen/guest.example.com.cfg file for Xen.

Then use virt-manager to connect (we actually connected from a remote desktop running virt-manager) to the running install session (the standard Debian installer does not provide serial access) and install Debian, you will need the root password to launch the installer. Proceed to install.

When you get to the step to partition the disks, lay out the partition table to be identical to the VM you want to migrate to it except put it on LVM, put /boot on a separate partition outside of LVM. Complete the install, including installing grub.

Confirm that the system will boot and works on a test IP address and make a copy of your /etc/fstab to the host system, you’ll need this later.

You now have a skeleton install of Debian which runs on KVM with the LVM partitions you need.

To begin the actual data migration, you’ll want to mount the volumes within your new KVM disk, this can be done with the help of a great little mapping tool called kpartx. To map and activate the volume group on the guest, following these steps:

kpartx -av /dev/VolumeGroup/guest.example.com
vgscan
vgchange -ay guest

In this example we assume the Linux Volume on the host is called “guest.example.com” and the Linux Volume within the guest is called “guest”.

Now that the host can see the Volume Group on the guest, and all the Volumes in /dev/mapper/ you’ll want to mount them.

Once mounted, you’ll be able to start your rsync. To incur the least amount of downtime, you’ll want to rsync the large data partitions (like /srv, /home, /var, perhaps /usr) while your production host is still running. Note: All rsyncs completed during this process must be done with the “–numeric-ids” option so the permissions are not inherited from the host!

While you’re rsyncing the data, you’ll want to go into your Xen system and install the following packages (installing these will no impact your Xen system, it doesn’t strictly use them so it will simply ignore them):

  • linux-image-2.6-686
  • grub
  • lvm2

When you’ve completed moving the largest portions of your Xen guest, bring down the production Xen guest (downtime starts now!) and mount the filesystems. And begin rsyncing the data over (preferably over a crossover cable for the fastest transfer, remember to use –numeric-ids in your rsync).

Once the rsync is complete, edit the following files on the KVM host:

  • /mnt/guest/etc/fstab – use the version you saved to the host in a previous step
  • /mnt/guest/etc/inittab – uncomment the ttyS0 line to allow for serial access from the KVM host for virsh
  • /mnt/guest/etc/udev/rules.d/70-persistent-net.rules – comment out eth0 line so eth0 can come up on the new KVM MAC address

Unmount the filesystems on both sides and on the KVM side disable the volume group and use kpartx again to unmap the filesystem from the host:

sudo vgchange -an guest
sudo kpartx -dv /dev/VolumeGroup/guest.example.com

You’re now ready to boot the VM on the KVM side. This can be done with virt-manager or virsh.

Since you just moved the machines to a new server, and probably new MAC addresses, you will probably need to run the arping command to reclaim the IP address of the VM and all its service IP addresses.

Some things to confirm are working:

  • Networking (confirm there are no lingering arp caching issues)
  • Email (where applicable, confirm system messages etc are being sent)
  • All services running (confirm key services, review monitoring dashboard, log in via ssh)
  • Confirm that you have contiguous logs

Now that we’ve completed one of these migrations we have a lot of ideas about how to improve the process, including the possibility of making the whole process more scriptable, but this quick method leveraging the Debian installer for easier disk configuration and bootstrapping worked very well.

Licensing Considerations When Integrating FOSS and Proprietary Software

Friday, October 29th, 2010

By CJ Fearnley

Recently, I was looking for resources to help explain the implications of integrating FOSS (Free and Open Source Software) with proprietary software. This question is important for any organization who might want to build an “embedded” or “dedicated” system or product which might include either their own proprietary software or third party applications. I discovered that most of the information available about FOSS licensing addresses this issue rather obliquely. This post will cover the basics so that such organizations can see how straightforward it would be to include FOSS in their projects. Standard disclaimers about my not being a lawyer apply.

Use of any software system requires understanding the software licensing involved. There are many dozens of FOSS licenses that could apply in your situation, so understanding the details is necessary to assess compatibility. In broad terms, these can be effectively summarized by the Debian Free Software Guidelines and The Open Source Definition. Any software that complies with those specifications will freely (in both the “freedom” and money senses of the word) permit running proprietary and FOSS applications on the same system. This is easier said than done. Since Debian has analyzed most common FOSS and quasi-FOSS licenses, their archive and their license page can be used to assess how the license might work in your situation (for example, anything in “main” would be OK for co-distribution and running in mixed environments). So we always start with Debian’s meticulous and well-documented analysis to assess any license.

From a high-level perspective, the requirements that FOSS licensing will impose on an organization wanting to include it with proprietary software will primarily be in the form of providing appropriate attribution (acknowledgement) and providing the source code for any FOSS software (including any modifications) shipped as part of an integrated system. Typically the attribution requirement can be satisfied by simply referencing the source code of the FOSS components. The source code requirement can be met by providing the source code for all of the FOSS distributed as part of the integrated system, for example, by placing it on a documented ftp site.

There are some subtle issues that may arise during the software development process if proprietary and FOSS code are “linked” together. In such cases it is necessary to ensure that code over which one wants to assert proprietary control is only combined with FOSS code that explicitly supports such commingling. So it is necessary for a company wanting to keep its code proprietary to keep it “separate” from the FOSS code used in the integrated system. The use of the words “linked” and “separate” is intentionally vague because the terms of each FOSS license will need to be examined by legal counsel to understand the precise requirements. In these situations, there are license compliance management systems that can be adopted to help ensure that this separation is maintained.

Those are the basic issues. The references below describe these and related issues involved with Linux & FOSS licensing in much more depth:

Please Document the Shop: On the importance of good systems documentation

Friday, May 21st, 2010

By Laird Hariu

We have all heard this: You need to document the computer infrastructure. You never know when you might be “hit by a bus”. We hear this and think many frightening things, reassure ourselves that it will never happen and then put the request on the back burner. In this article I will expand on the phrase “hit by a bus” and then look at the consequences.

Things do happen to prevent people from coming into work. The boss calls home. Talks to the wife and makes the sad discovery that Mike wont be coming in anymore. He passed away last night in bed. People get sudden illnesses that disable them. Car accidents happen.

More often than these tragedies occur, thank goodness, business conditions change without warning. In reorganizations whole departments disappear, computer rooms are consolidated and moved, companies are bought and whole workforces replaced. I have had the unhappy experience of living through some of this.

Some organizations have highly transient workforces because of the environment that they operate in. Companies located near universities benefit from an influx of eager young, upwardly mobile university graduates. These workers are eager to gain experience but soon find higher paying jobs in the “real world” further away from campus. These companies have real turnover problems. People are moving up so quickly, they don’t have time to write things down.

Even when you keep people in place and maintain a fairly stable environment, people discover that what they have documented in their heads can just fade away. This is getting to be more and more of an issue. Networks and servers and other such infrastructure functions have been around for 20 years in many organizations. Fred the maintainer retired five years ago. Fred the maintainer was transferred to sales. The longer systems are around, the more things can happen to Fred. Fred might be right where he was 20 years ago. He just can’t remember what he did.

What does all this mean? What are the consequences of losing organizational knowledge in a computer organization? To be blunt, it creates a hideous environment for your computer people. The system is a black box to them. They are paralyzed. They are rightfully afraid. Every small move they make can bring down the system in ways they cannot predict. Newcomers take much longer to train. Old-timers learn to survive by looking busy while doing nothing. The politics of the shop and the whole company is made bloody by the various interpretations of the folklore of the black box. He/she who waves their arms hardest rules the day. This is no way for your people to live.

This is no way for the computer infrastructure to live as well. While the games are played the infrastructure evolves more slowly and slowly. Before long the infrastructure is frozen. Nobody dares to touch it. The only way to fix it is to completely replace it at considerable expense. In elaborate infrastructures this is easier said than done. The productive lifetime of the platform is shortened. It was not allowed to grow and evolve to lengthen its lifetime. Think of the Hubble Telescope without all the repairs and enhancements over the years. It would have burned out in re-entry long ago.

Having made my case, I ask again; for your own good, please document the shop. Make these documents public and make them accurate. Record what actually is rather than what you wish it to be. It is better to be a little embarrassed for a short while than to be mislead later on. Update the documentation when changes occur. An out of date document can be as bad as no document at all. Make an effort to record facts. At the same time don’t leave out general philosophies that guided the design and other qualitative information because it helps your successors interpret the facts when ambiguities occur.

Think of what you leave behind. Persuade your boss to make this a priority as well. Hopefully the people at your next workplace will do the same.

The Nature and Importance of Source Code and Learning Programming with Python

Thursday, February 25th, 2010

By CJ Fearnley

Last year a client asked us for advice on getting started with programming. So I thought I’d share some thoughts about programming, its relationship with FOSS (Free and Open Source Software) management and why Python is a good language for learning programming including some great on-line resources. But first I want to make sure our business-oriented readers understand the nature and importance of source code.

The “source” aka “the code” provides a language in which computer users can create or change software. One does not have to be a programmer to work on the code. In fact, every computer user is, ipso facto, a programmer! Menus, web interfaces, and graphical user interfaces (GUIs) are some of the more facile “languages” for computer programming that everyone, even children, can readily learn and use. Of course, building complex software systems requires a more expressive specification language than a web form, for instance, can provide.

Although all computer software is specified with source code, FOSS systems are unique in that the source code is made available with the software. In contradistinction, software lock-in or vendor lock-in describes the unfortunately all too common practice of many organizations to block access to their source code.

Having access to the source code provides huge operational benefits. For one, the source can be used to understand how the software works: it is a form of software documentation (indeed, it is the most definitive form of software documentation possible!). Also, code can be easily changed to add diagnostics or to test a possible solution to a problem or to modify or add functionality. In addition, the source is a language both for specifying features to the computer and for discussing computing with others. So most mature FOSS languages have vibrant support communities in which one can participate, learn and get help.

The source is a tool: a powerful, multi-purpose, critically important tool.

Since LinuxForce focuses on FOSS, we are able to take full advantage of the availability of the code. We are always working with the source! Since most of our work is systems administration, we usually “program” configuration files. However, we also write systems software and scripts and we support software developers extensively, so we have a persistent, deep, and productive relationship with code.

But what to suggest to someone like our customer who wants to learn programming?

I remembered seeing a blurb in Linux Journal referencing an article they published in May 2000 by Eric Raymond entitled "Why Python" which argues persuasively for the virtues of the programming language Python. I had often felt that Perl‘s idiosyncrasies made it difficult to use, so Eric’s critique of Perl and accolades for Python were convincing to me. In addition, I follow FOSS mathematics software and I was aware that Sage is a Python “glue” to more than fifty FOSS math libraries. I’ve been meaning to look into Python so I could use Sage. Another pull comes from my work at LinuxForce where we use a lot of Python-based software including mailman, fail2ban, Plone, and several tools used for virtual machine management such as kvm, virtinst and xen-tools. Python has a huge software repository and community. So one is likely to find good libraries to build upon (thus avoiding the extra learning curve of building everything from scratch). Python is an interpreted language which makes it easier to debug and use so the learning process is smoother.

To finish the recommendation, I just needed to find some on-line resources. First, Kirby Urner suggested these two: Wikieducator’s Python Tutorials and "Mathematics for the Digital Age and Programming in Python". Then, I checked out the Massachusetts Institute of Technology’s (MIT) OpenCourseWare which provides extensive course materials for many of their classes (I’ve already watched the full video set for a couple of MIT’s courses including the legendary Walter Lewin’s "Classical Mechanics" and have been very impressed by the quality and content of their materials). After nearly 30 years of introducing students to programming with Scheme, MIT switched to Python in 2008! The materials for their introductory Python-based course "6.00 Introduction to Computer Science and Programming" are very thorough, accessible and helpful. Their free on-line materials include the full video lectures of the class plus assignments, sample test problems, class handouts, and an excellent Readings section with references to "the Python Tutorial" and a very good free on-line textbook "How to Think Like a Computer Scientist: Learning with Python".

In conclusion, if you or anyone you know wants to learn how to program computers, I recommend starting with Python using MIT’s on-line course materials supplemented with the other on-line resources mentioned above (and summarized in the table below). I’ve now watched more than half of the videos from the MIT 6.00 course and I’ve worked through several of their assignments: this is a great course! Even with nearly three decades experience programming including a couple of college-level courses in the 1980s, I’m finding the class is more than just good review for me: I’ve learned a few new things (in particular, dynamic programming and the knapsack problem). Python’s clean syntax and elegant design will help as one delves into writing code for the first time. Its extensive libraries and repositories will support the application of one’s newly acquired computing skills to solve problems in the area of the student’s special interests whatever they may be … and that’s the way we learn best: by doing something that we personally care about!

Summary of On-Line Resources for Learning Python

Some thoughts on best practices for SMTP blocking of e-mail spam

Thursday, January 21st, 2010

By CJ Fearnley

Blocking e-mail spam at the time of SMTP (Simple Mail Transfer Protocol) transfer has become a best practice. There is no point wasting precious bandwidth & disk space and spending time browsing a huge spambox when most of the incoming flow is clearly spam. At LinuxForce our e-mail hygiene service, LinuxForceMail, makes extensive use of SMTP blocking techniques (using free and open source software such as Exim, Clam AV, SpamAssassin and Policyd-weight). But we are extremely careful to only block sites and e-mails that are so “spammy” that we are justified in blocking it. That doesn’t prevent false positives, but it keeps them to a minimum.

Recently we investigated an incident where one of our users had their e-mail blocked by another company’s anti-spam system. In investigating the problem, we learned that some vendors support an option to block e-mail whose Received header is on a blacklist (in our case it was Barracuda, but other vendors are also guilty). Let me be blunt: this is boneheaded, but the reason is subtle so I can understand how the mistake might be made.

First, blocking senders appearing on a blacklist at SMTP time is good practice. But to understand why blocking Received headers at SMTP time is bad, it is important to understand how e-mail transport works. The sending system opens a TCP/IP connection from a particular IP address. That IP address should be checked against blacklists. And other tests on the envelope can help identify spam. But the message headers including the Received header are not so definite. We shall see that even a blacklisted IP in these headers may be legitimate. So blocking such e-mail incurs unnecessary risks.

The problem occurs when a user of an ISP (Internet Service Provider) sends an e-mail from home, they are typically using a transient, “dynamic” IP address. Indeed it is possible that their IP address has just changed. Since the new address may have been previously used by someone infected with a virus sending out spam, this “new” IP address may be on the blacklists. So, due to no fault of your own, you have a blacklisted IP address (I will suppress my urge to rant for IPv6 when everyone can finally have their own IP address and be responsible for its security).

Now, when you send an e-mail through your ISP’s mail server, it records your (blacklisted) IP as the first Received header. So your (presumably secure) system sending a legitimate message through your ISP’s legitimate, authenticating mail server is blacklisted by your recipients’ overambitious anti-spam system. Ouch. That is why blocking such an e-mail is just wrong. This kind of blocking creates annoying unnecessary complications for the users and admins at both sides. Using e-mail filtering to put such e-mails into a spam folder would be a reasonable way to handle the situation. Filtering is able to handle false positives whereas blocking generates unrecoverable errors.

Do not block e-mail based on the Received header!