Attending Debian Day and DebConf10 Next Week

By CJ Fearnley

Since I’ve been involved with Debian GNU/Linux for over 15 years, it is exciting that I will be able to attend the first two and a half days of DebConf10 including Debian Day from Sunday to Tuesday August 1–3.

I am particularly looking forward to the following sessions, and more (see the schedule for each day):

  • Pedagogical Freedom: Debian, Free Software, and Education
  • Beyond Sharing: Open Source Design (What are the challenges for the collaborative design process?)
  • FLOSS Manuals: A Vibrant Community for Documentation Development
  • Bits from the DPL
  • The Java Packaging Nightmare
  • Collaboration between Ubuntu and Debian
  • How We Can Be the Silver Lining of the Cloud
  • Enterprise Infrastructure BOF (How enterprise technologies such as Kerberos, LDAP, Samba, etc. can work better together in Debian)
  • Using Debian for Enterprise Infrastructure (Stanford University: A Case Study)

I’m also hoping I can attend on Thursday when the math and science focused sessions will be held, but I’ll have to see how next week’s schedule works out in the office. If you are coming to DebConf, I’ll see you there!

Beyond the Cloud: The Comprehensive Flexibility of FOSS May Bring Clearer Skies

By CJ Fearnley

Last week’s InformationWeek has a good article on cloud computing, Cloud ROI: A Grounded View.  It seems that even with all the hype (or because of it?) most are not “running blindly” to adopt “the cloud”.  I must admit the cloud metaphor has a powerful poetic charm to it.  That is probably why it has grabbed the attention of so many over the past few years. Everything in our world is ephemeral, so there is an aptness to the concept of a “cloud”. Moreover, I too like and use cloud analogies. But I am now looking for clearer skies!  Here is a short list of my gripes about "the cloud":

  • What does “cloud computing” mean? It isn’t at all clear! Here is some data: CIO magazine cites a Forrester report that says "the number one challenge in cloud computing today is determining what it really is". CIO also reported on a McKinsey study that "found 22 separate definitions of cloud computing"! And that leads to my second point:
  • The word “cloud” is so … vacuous and amorphous … “A cloud: it looks like Zeus!” only to transform in shape before your very eyes: “Wait, now it looks like Aphrodite!” … and then it’s gone! Is this the kind of model people should entrust with their business data? It inherently has no structural stability: it is just rapidly moving gases … far out of reach … away in the sky. What kind of business model is that?
  • Although RADLab (Reliable Adaptive Distributed Systems Laboratory) has put out some interesting papers, I was a little surprised when I read their acknowledgements in the CACM article A View of Cloud Computing.  It reads like a who’s who in cloud computing: Amazon, Google, Microsoft, Sun, eBay, Cloudera, Facebook, and so on. The original Berkeley paper has a shorter list of cloud companies funding their work. I’m sure they are maintaining their academic integrity, but it does show that they are not wholly independent. Remember what Kitty Foyle said:

    I’ve taught myself a lesson, or I hope I have: when I find myself thinking something I stop a minute and ask myself, Now who had it all figured out beforehand that was the way they wanted me to think?
    — From Christopher Morley’s novel “Kitty Foyle”

  • Although the Berkeley papers raise a number of very interesting issues, none of them requires vacuous meaningless jargon to further obscure the subtleties and complexities of emerging technology trends! So my final gripe is that the name “cloud” tends to obscure what is really important even when I agree with “the cloud thinkers”.

Perhaps the most important issue “the cloud people” are missing is what might be called comprehensive flexibility. As a user of software technology, I want my computing functionality everywhere … in every imaginable format. For example, I’d like the software that I’ve invested the time to learn to be available on my desktop (32-bit, 64-bit, Mac, Windows, or Linux), whether the Net is working or not; on my cell phone and other portable devices (again with network or not); in the data center (clustered or not); in the WAN (Wide Area Network; note that the Internet is our shared, global WAN), perhaps distributed among several hosting providers; and perhaps even provided by “utilities” (to save the trouble of maintenance and scaling costs). In short, I think software should be so flexible that it can live in each of those environments. Talk about utility computing: wouldn’t software have so much more utility if it worked everywhere instead of being beholden to whatever your software provider offers or what hardware you happen to have in front of you right now?

Fortunately this type of flexible software does exist. It is called Free and Open Source Software (FOSS) and it is becoming ubiquitous. In fact, whether you know it or not, you are using FOSS: Apache, the FOSS web server, runs this web site and indeed the majority of all web sites. WordPress, the blogging software we use here, is also “everywhere”, and you can purchase it from “cloud” utility providers or install, run, and modify it yourself. The list of important FOSS goes on and on, and this blog is dedicated to helping elucidate its importance as well as the issues involved in managing it.

So I would argue that instead of letting our heads go to the “clouds” we need to ask how can we make software that works in all environments, on all hardware, and for all people? … how can we make software that is comprehensively flexible?

Please Document the Shop: On the importance of good systems documentation

By Laird Hariu

We have all heard this: You need to document the computer infrastructure. You never know when you might be “hit by a bus”. We hear this and think many frightening things, reassure ourselves that it will never happen and then put the request on the back burner. In this article I will expand on the phrase “hit by a bus” and then look at the consequences.

Things do happen to prevent people from coming into work. The boss calls home, talks to the wife, and makes the sad discovery that Mike won’t be coming in anymore. He passed away last night in bed. People get sudden illnesses that disable them. Car accidents happen.

More often than these tragedies occur, thank goodness, business conditions change without warning. In reorganizations whole departments disappear, computer rooms are consolidated and moved, companies are bought and whole workforces replaced. I have had the unhappy experience of living through some of this.

Some organizations have highly transient workforces because of the environment that they operate in. Companies located near universities benefit from an influx of eager young, upwardly mobile university graduates. These workers are eager to gain experience but soon find higher paying jobs in the “real world” further away from campus. These companies have real turnover problems. People are moving up so quickly, they don’t have time to write things down.

Even when you keep people in place and maintain a fairly stable environment, people discover that what they have documented in their heads can just fade away. This is getting to be more and more of an issue. Networks and servers and other such infrastructure functions have been around for 20 years in many organizations. Fred the maintainer retired five years ago. Or Fred the maintainer was transferred to sales. The longer systems are around, the more things can happen to Fred. Fred might be right where he was 20 years ago. He just can’t remember what he did.

What does all this mean? What are the consequences of losing organizational knowledge in a computer organization? To be blunt, it creates a hideous environment for your computer people. The system is a black box to them. They are paralyzed. They are rightfully afraid. Every small move they make can bring down the system in ways they cannot predict. Newcomers take much longer to train. Old-timers learn to survive by looking busy while doing nothing. The politics of the shop and the whole company is made bloody by the various interpretations of the folklore of the black box. He/she who waves their arms hardest rules the day. This is no way for your people to live.

This is no way for the computer infrastructure to live either. While the games are played, the infrastructure evolves more and more slowly. Before long the infrastructure is frozen. Nobody dares to touch it. The only way to fix it is to completely replace it at considerable expense. In elaborate infrastructures this is easier said than done. The productive lifetime of the platform is shortened. It was not allowed to grow and evolve to lengthen its lifetime. Think of the Hubble Telescope without all the repairs and enhancements over the years. It would have burned out in re-entry long ago.

Having made my case, I ask again: for your own good, please document the shop. Make these documents public and make them accurate. Record what actually is rather than what you wish it to be. It is better to be a little embarrassed for a short while than to be misled later on. Update the documentation when changes occur. An out-of-date document can be as bad as no document at all. Make an effort to record facts. At the same time, don’t leave out the general philosophies that guided the design and other qualitative information, because they help your successors interpret the facts when ambiguities occur.

Think of what you leave behind. Persuade your boss to make this a priority as well. Hopefully the people at your next workplace will do the same.

Organizations Learning to Contribute to FOSS “The Right Way”

By Elizabeth Krumbach

A couple weeks ago I wrote that I would be attending the 4th Annual Linux Foundation Collaboration Summit. I wrote about much of my experience there and at the Open Source Business Conference back in March over in my personal blog: “Lessons from Open Source Business Conference and the Linux Foundation Collaboration Summit”.

However, I also wanted to make a post here to cycle back to some of what I learned from the Collaboration Summit in relation to my March 30th post about contributing, “How and why contributing to FOSS can benefit your organization”. In this post I discussed using community tools, getting involved in the community and what steps you could take to get there. This was based upon several years of my own involvement in the FOSS (Free/Open Source Software) community directly and now my experience working for a company which makes FOSS contributions.

The talks at the Collaboration Summit strengthened my resolve and clarified my understanding of the right way to go about contributing to FOSS as a company. At this conference there were multiple talks from major companies and figures within the FOSS business world which drove home the need for working with the community. All of these companies had stories about how they had tried to contribute to FOSS and struggled because they went about contributing as a walled-off company rather than contributing just like other contributors did and using the same tools that contributors did.

A keynote which really stood out and succinctly discussed all of this was Dan Frye’s talk, “10+ Years of Linux at IBM” (video). The first half of the keynote discusses the progress of Linux within IBM, but then he moves into discussing contributing itself. Some of IBM’s take-aways were that they needed to get involved directly with small contributions and do away with closed-door meetings and canned corporate responses; IBM employees were empowered to become community members. They needed to learn to collaborate with the community to develop higher quality solutions than they could have in-house, and to start these discussions with the community early in the brainstorming process. Related to collaboration, he also discusses control: a company does not have it within a community and needs to learn to deal with that. Instead, what a company should strive for is influence within a project to help guide direction and priorities. He also suggests never creating a project. Instead he encourages companies to join a project that’s close to what they need and work with it to take it in a direction that can benefit everyone, reach their goals, and scratch their itches.

What struck me most at the conference regarding the subject of contributing is that these companies are all reaching the same conclusions about the proper ways to successfully contribute. In the end, they learned that they must collaborate fully and openly throughout development with the open source communities they’re working with.

Attending the Linux Foundation Collaboration Summit 2010

By Elizabeth Krumbach

On the heels of the 5th Annual Emerging Technologies for the Enterprise Conference (ETE 2010) in Philadelphia that CJ attended last week, I’ll be attending the 4th Annual Linux Foundation Collaboration Summit tomorrow through Friday in San Francisco.

The Linux Foundation Collaboration Summit is an exclusive, invitation-only summit gathering core kernel developers, distribution maintainers, ISVs, end users, system vendors and other community organizations for plenary sessions and workgroup meetings to meet face-to-face to tackle and solve the most pressing issues facing Linux today.

My attendance will be in my capacity as a member of the Ubuntu Community Council as well as my role as a Debian Systems Administrator. As such, my attention will be split at the summit between community and governance interests, like the FOSSBazaar Workgroup and Josh Berkus’ How to Prevent Community: Making Sure Your Pond Stays Small, and talks and panels like Does Open Source Mean Open Cloud? where Ubuntu founder Mark Shuttleworth will be a panelist, and the Linux Standard Base Workgroup and Virtualization discussions.

It’s shaping up to be an exciting summit. If you are also attending, be sure to say “Hello”!

Anticipating the Emerging Technologies for the Enterprise (ETE 2010) Event

By CJ Fearnley

I will be attending the 5th Annual Emerging Technologies for the Enterprise Conference (ETE 2010) this Thursday and Friday, April 8-9, 2010. The event is billed for “developers, architects, and IT executives” and attempts to provide a dynamic forum for “emerging technology and Open Source”.

I look forward to seeing Robert C. (Uncle Bob) Martin’s keynote on “Bad Code, Craftsmanship, Engineering, and Certification”, a panel discussion on “Open source is a commercial enterprise”, another panel on “Social Media: Why should I care?”, a second Bob Martin presentation on “Agility and Architecture”, Mary Poppendieck on “Cost Center Disease”, Bonnie Aumann on managing developers, Michael Coté’s keynote on “The Pragmatic Cloud”, Geir Magnusson Jr. on “Project Voldemort”, and Brian McCallister on “Failure Happens” (one of the very few talks on systems administration). Then there’s an interesting panel on “Battle of the Frameworks II” (its predecessor, the ETE 2008 “Web Framework Shootout”, is on-line in two parts: I (here) and II (here). Hopefully this year people will respect each other’s frameworks more and have a mature discussion about the tradeoffs that each incurs. I was impressed with Marjan Bace, the moderator, for helping facilitate some reasonable comments amidst too much hyperbole and for bringing the discussion to an effective conclusion). Finally, I think I’ll attend the presentations by Molly Holzschlag on “Demystifying HTML5”, David A. Black on some CS (computer science) precepts, and Audrey Troutt on “Influencing your way to agile”.

It looks like it will be an engaging two-day event. I’m looking forward to meeting many leaders in the local Philly and broader FOSS (Free and Open Source Software) technology community and getting to downtown Philly for some out of the office learning and networking.

While I’m mentioning events, for those who do not know, I moderate the Q&A for the first Wednesday of the month meeting of the Philadelphia Linux User’s Group (PLUG) which will be on “Functional Programming Using Haskell” this month. It is going to be a busy week! If you plan to attend either event, I’ll see you there.

How and why contributing to FOSS can benefit your organization

By Elizabeth Krumbach

At first glance, the ecosystem in the Free and Open Source Software (FOSS) world can seem a bit complicated. There are several ways to get software: you can download it directly from a project’s website, use a software management tool that your Linux distribution provides, or install a Linux distribution that includes everything you need right out of the box! Once you understand this ecosystem, you can find where your contributions would be most useful, and why contributing is beneficial to your organization and the FOSS community.

So, where does this all begin? FOSS often originates with a project which maintains the source code for the software and provides its own development and support infrastructure.

A Linux distribution is a carefully culled collection of software from these upstream projects which makes a complete operating system and even includes a lot of application software. This collection of software is tested and prepared to run securely and maintainably together. Debian is built upon this model.

Some distributions of Linux use Debian as a source project unto itself. There are a number of Linux distributions based on Debian, including the popular KNOPPIX and Ubuntu distributions. Being “based on Debian” can mean several things, but it primarily means they draw from the software repository at some point in the release cycle, and they use the Advanced Packaging Tool (apt) to manage this software. In these cases Debian is an intermediary between the original FOSS project and the “children” distributions which may also pull from original software projects to expand upon what Debian provides to target their particular focus.

So where in this software ecosystem should your organization contribute? Why would your organization choose to contribute to Debian rather than to the original project (“upstream” of Debian) or a project like Ubuntu (“downstream” of Debian)? It really depends on your goals.

If your organization is interested in using FOSS in a way which requires rapid development, new and diverse features released quickly, or specializations that the distribution may not easily support, you will probably want to work directly on the upstream project. Frequently this requires programming experience, but many projects need other kinds of help such as bug reports in the form of feature requests which they may be able to satisfy in later releases. In these cases, contributing to development in these projects directly is the best way to meet your needs in using and building upon the software.

If your organization needs to use FOSS in a stable, maintainable and secure way, you should probably work directly with Debian. The primary duty of most developers within the Debian community is working on the “packages” which make up the operating system: creating, updating, patching, tracking their security and handling bugs, forwarding details and patches to the upstream projects when applicable. This is what maintains the solid, core operating system that makes up not only Debian, but the child distributions which depend on it, and which could not exist without it. By contributing to Debian you’re also contributing to Ubuntu, Knoppix, and dozens more, improving the tool shelf for everyone (related: Given 250,000 tools on the shelf, how do you manage them?). Contributing to Debian also helps the upstream projects, taking the burden off of them to provide installation documents and support on Debian and placing that upon you, plus making their software more readily available to users through a simple search through the Debian repository.

If one of Debian’s children targets needs of your organization that cannot be met through Debian directly, then by all means contribute directly to it. Child distributions already exist which focus on everything from being an Open Source LiveCD toolbox (like KNOPPIX) to being a polished desktop operating system (like Ubuntu). As an example, even within Ubuntu’s family there are targeted projects, like Edubuntu, which focuses on education by packaging and shipping a collection of educational software, and Mythbuntu, a project devoted to making your computer a PVR like TiVo, which works with the MythTV project to easily deliver their software on a platform. Contributing to projects like these also expands the open source ecosystem and may be the preferred method to reach your organization’s goals.

Understanding the way in which these projects and distributions work together and selecting a place in the workflow for your organization to contribute is the first step. But perhaps a more important question is why you’d want to work on a FOSS project instead of doing in-house development. The benefits for the FOSS community are obvious: it will reap the benefits of your expertise and of having the packages in Debian and beyond. But are there benefits for your organization?

I believe there are big benefits, which include:

  • Peer review of packages and software now and in the future
  • Processes for asking the community for assistance
  • Bug reporting infrastructure, which may include patches submitted by community members
  • Procedures to become informed about security problems and policy changes
  • Free collaborative resources provided for FOSS projects (Alioth for Debian, SourceForge, LaunchPad or the Apache Foundation, etc.) for development, including development mailing lists and hosted revision control systems like git, bazaar, and svn
  • Opportunity to learn key FOSS development strategies and industry “best practices” via freely available documentation, chat rooms, forums and mailing lists

In short, by putting the time into releasing software, packaging for Debian, or working in child distributions, you are not only doing good for the FOSS community, you also get to take advantage of the plethora of tools, resources and people available to assist in the development process.

The Nature and Importance of Source Code and Learning Programming with Python

By CJ Fearnley

Last year a client asked us for advice on getting started with programming. So I thought I’d share some thoughts about programming, its relationship with FOSS (Free and Open Source Software) management and why Python is a good language for learning programming including some great on-line resources. But first I want to make sure our business-oriented readers understand the nature and importance of source code.

The “source” aka “the code” provides a language in which computer users can create or change software. One does not have to be a programmer to work on the code. In fact, every computer user is, ipso facto, a programmer! Menus, web interfaces, and graphical user interfaces (GUIs) are some of the more facile “languages” for computer programming that everyone, even children, can readily learn and use. Of course, building complex software systems requires a more expressive specification language than a web form, for instance, can provide.

Although all computer software is specified with source code, FOSS systems are unique in that the source code is made available with the software. In contradistinction, software lock-in or vendor lock-in describes the unfortunately all too common practice of many organizations to block access to their source code.

Having access to the source code provides huge operational benefits. For one, the source can be used to understand how the software works: it is a form of software documentation (indeed, it is the most definitive form of software documentation possible!). Also, code can be easily changed to add diagnostics or to test a possible solution to a problem or to modify or add functionality. In addition, the source is a language both for specifying features to the computer and for discussing computing with others. So most mature FOSS languages have vibrant support communities in which one can participate, learn and get help.

The source is a tool: a powerful, multi-purpose, critically important tool.

Since LinuxForce focuses on FOSS, we are able to take full advantage of the availability of the code. We are always working with the source! Since most of our work is systems administration, we usually “program” configuration files. However, we also write systems software and scripts and we support software developers extensively, so we have a persistent, deep, and productive relationship with code.

But what to suggest to someone like our customer who wants to learn programming?

I remembered seeing a blurb in Linux Journal referencing an article they published in May 2000 by Eric Raymond entitled "Why Python" which argues persuasively for the virtues of the programming language Python. I had often felt that Perl’s idiosyncrasies made it difficult to use, so Eric’s critique of Perl and accolades for Python were convincing to me. In addition, I follow FOSS mathematics software and I was aware that Sage is a Python “glue” to more than fifty FOSS math libraries. I’ve been meaning to look into Python so I could use Sage. Another pull comes from my work at LinuxForce where we use a lot of Python-based software including mailman, fail2ban, Plone, and several tools used for virtual machine management such as kvm, virtinst and xen-tools. Python has a huge software repository and community. So one is likely to find good libraries to build upon (thus avoiding the extra learning curve of building everything from scratch). Python is an interpreted language which makes it easier to debug and use so the learning process is smoother.

To finish the recommendation, I just needed to find some on-line resources. First, Kirby Urner suggested these two: Wikieducator’s Python Tutorials and "Mathematics for the Digital Age and Programming in Python". Then, I checked out the Massachusetts Institute of Technology’s (MIT) OpenCourseWare which provides extensive course materials for many of their classes (I’ve already watched the full video set for a couple of MIT’s courses including the legendary Walter Lewin’s "Classical Mechanics" and have been very impressed by the quality and content of their materials). After nearly 30 years of introducing students to programming with Scheme, MIT switched to Python in 2008! The materials for their introductory Python-based course "6.00 Introduction to Computer Science and Programming" are very thorough, accessible and helpful. Their free on-line materials include the full video lectures of the class plus assignments, sample test problems, class handouts, and an excellent Readings section with references to "the Python Tutorial" and a very good free on-line textbook "How to Think Like a Computer Scientist: Learning with Python".

In conclusion, if you or anyone you know wants to learn how to program computers, I recommend starting with Python using MIT’s on-line course materials supplemented with the other on-line resources mentioned above (and summarized below). I’ve now watched more than half of the videos from the MIT 6.00 course and I’ve worked through several of their assignments: this is a great course! Even with nearly three decades of experience programming, including a couple of college-level courses in the 1980s, I’m finding the class is more than just good review for me: I’ve learned a few new things (in particular, dynamic programming and the knapsack problem). Python’s clean syntax and elegant design will help as one delves into writing code for the first time. Its extensive libraries and repositories will support the application of one’s newly acquired computing skills to solve problems in the area of the student’s special interests whatever they may be … and that’s the way we learn best: by doing something that we personally care about!
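To give a flavor of the dynamic programming and knapsack material mentioned above, here is a minimal sketch of the classic 0/1 knapsack problem in Python. It is my own illustration rather than code from the MIT course; the function name and the sample items are made up for the example, and weights are assumed to be non-negative integers.

```python
def knapsack(items, capacity):
    """Best total value of items that fit within an integer capacity.

    items is a list of (value, weight) pairs; each item may be used at
    most once. best[c] holds the best value achievable with capacity c
    using the items considered so far.
    """
    best = [0] * (capacity + 1)
    for value, weight in items:
        # Walk capacities downward so each item is counted only once.
        for cap in range(capacity, weight - 1, -1):
            best[cap] = max(best[cap], best[cap - weight] + value)
    return best[capacity]


# Hypothetical sample data: (value, weight) pairs.
sample_items = [(60, 10), (100, 20), (120, 30)]
print(knapsack(sample_items, 50))  # prints 220
```

The point of the technique is that the table of partial answers is filled in once and reused, instead of re-exploring every subset of items.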

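Summary of On-Line Resources for Learning Python

  • MIT OpenCourseWare: "6.00 Introduction to Computer Science and Programming" (video lectures, assignments, sample test problems, class handouts, and readings)
  • "the Python Tutorial"
  • "How to Think Like a Computer Scientist: Learning with Python" (free on-line textbook)
  • Wikieducator’s Python Tutorials
  • "Mathematics for the Digital Age and Programming in Python"
  • Eric Raymond’s "Why Python" (Linux Journal, May 2000)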

Some thoughts on best practices for SMTP blocking of e-mail spam

By CJ Fearnley

Blocking e-mail spam at the time of SMTP (Simple Mail Transfer Protocol) transfer has become a best practice. There is no point wasting precious bandwidth & disk space and spending time browsing a huge spambox when most of the incoming flow is clearly spam. At LinuxForce our e-mail hygiene service, LinuxForceMail, makes extensive use of SMTP blocking techniques (using free and open source software such as Exim, ClamAV, SpamAssassin and Policyd-weight). But we are extremely careful to only block sites and e-mails that are so “spammy” that we are justified in blocking them. That doesn’t prevent false positives, but it keeps them to a minimum.

Recently we investigated an incident where one of our users had their e-mail blocked by another company’s anti-spam system. In investigating the problem, we learned that some vendors support an option to block e-mail when an IP address in its Received headers is on a blacklist (in our case it was Barracuda, but other vendors are also guilty). Let me be blunt: this is boneheaded, but the reason is subtle so I can understand how the mistake might be made.

First, blocking senders appearing on a blacklist at SMTP time is good practice. But to understand why blocking Received headers at SMTP time is bad, it is important to understand how e-mail transport works. The sending system opens a TCP/IP connection from a particular IP address. That IP address should be checked against blacklists. And other tests on the envelope can help identify spam. But the message headers including the Received header are not so definite. We shall see that even a blacklisted IP in these headers may be legitimate. So blocking such e-mail incurs unnecessary risks.
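To make the first check concrete, here is a minimal Python sketch of how a connecting IP address is typically tested against a DNS-based blacklist: the IPv4 octets are reversed and looked up under the blacklist’s zone, and any answer means the address is listed. The zone name dnsbl.example.org and the function names are placeholders invented for illustration; this is not the configuration of LinuxForceMail or of any particular vendor.

```python
import ipaddress
import socket


def dnsbl_query_name(client_ip, zone="dnsbl.example.org"):
    """Return the DNS name to look up for an IPv4 client address.

    The default zone is a hypothetical placeholder, not a real blacklist.
    """
    addr = ipaddress.IPv4Address(client_ip)  # raises ValueError if malformed
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone}"


def is_listed(client_ip, zone="dnsbl.example.org", resolve=socket.gethostbyname):
    """True if the blacklist zone returns any address record for the client."""
    try:
        resolve(dnsbl_query_name(client_ip, zone))
    except socket.gaierror:
        return False  # the lookup failed, so treat the address as not listed
    return True
```

The resolver is passed in as a parameter so the lookup can be exercised without touching the network. The important detail is the input: the address checked is the one that actually opened the TCP/IP connection, which the receiving server observes directly.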

The problem occurs when a user of an ISP (Internet Service Provider) sends an e-mail from home: they are typically using a transient, “dynamic” IP address. Indeed it is possible that their IP address has just changed. Since the new address may have been previously used by someone infected with a virus sending out spam, this “new” IP address may be on the blacklists. So, through no fault of your own, you have a blacklisted IP address (I will suppress my urge to rant for IPv6 when everyone can finally have their own IP address and be responsible for its security).

Now, when you send an e-mail through your ISP’s mail server, it records your (blacklisted) IP as the first Received header. So your (presumably secure) system sending a legitimate message through your ISP’s legitimate, authenticating mail server is blocked by your recipients’ overambitious anti-spam system. Ouch. That is why blocking such an e-mail is just wrong. This kind of blocking creates annoying unnecessary complications for the users and admins on both sides. Using e-mail filtering to put such e-mails into a spam folder would be a reasonable way to handle the situation. Filtering is able to handle false positives whereas blocking generates unrecoverable errors.
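The policy argued for here can be sketched in a few lines of Python. This is only an illustration of the decision logic, not how any real mail server or anti-spam product is configured: the Message class, the smtp_decision function, and the three outcome strings are all names I made up for the example, and the blacklist is a plain set of documentation addresses.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass
class Message:
    connecting_ip: str           # address that opened the SMTP connection
    received_ips: Sequence[str]  # addresses parsed from Received headers


def smtp_decision(msg: Message, is_blacklisted: Callable[[str], bool]) -> str:
    """Return 'reject', 'spam-folder', or 'accept' for an incoming message."""
    # The connecting address is observed directly, so blocking on it is safe.
    if is_blacklisted(msg.connecting_ip):
        return "reject"
    # Received headers may name a dynamic address recycled from a spammer,
    # so a hit here only filters the message; it never blocks it.
    if any(is_blacklisted(ip) for ip in msg.received_ips):
        return "spam-folder"
    return "accept"


# Hypothetical example: a home user on a recycled dynamic address sends
# mail through the ISP's legitimate, authenticating mail server.
blacklist = {"203.0.113.7"}
home_user = Message(connecting_ip="198.51.100.10", received_ips=["203.0.113.7"])
print(smtp_decision(home_user, blacklist.__contains__))  # prints spam-folder
```

A message filed in a spam folder can still be found and rescued by its recipient, which is exactly the recoverability that an SMTP-time rejection lacks.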

Do not block e-mail based on the Received header!

2009 Milestones for Debian GNU/Linux

By CJ Fearnley

In February 2009 Debian released version 5.0, “Lenny” with more than 25 000 packages including many security enhancements such as PHP’s Suhosin system. LinuxForce has compiled a list of 49 news articles documenting milestones of the Debian GNU/Linux project in 2009:
http://www.linuxforce.net/debian.milestones.html#2009.