Getting started at CDOT, High Availability Virtualization Project

I’m very happy to say that I’m starting work on a research project at CDOT, the Centre for Development of Open Technology.

I’ve been following the awesome work being done at CDOT for more than two years, and I’m very excited to get a chance to become part of the team.
I’ll be working with Kieran Sedgwick under the supervision of Andrew Smith.

The goal of the project is to research open source alternatives for high availability virtualization tools and, in the end, combine them all into a simple, ready-to-use package.

So far we’ve just discussed the project requirements and some tools that we are considering using, but the project is still in its initial stages and we’ll be evaluating which tools best fit the end solution.

Below are some of the tools and technologies that we plan to start researching, to see how they fit into the overall goal of the project.

1. OpenNebula

A short definition taken from their website: OpenNebula “is an open-source project developing the industry standard solution for building and managing virtualized enterprise data centers and IaaS clouds.”

It looks like OpenNebula aggregates a number of different services and provides an all-in-one interface to manage the separate parts of a cloud infrastructure, for example:

  • Virtualization
  • Networking
  • Storage
  • Hosts & Clusters
  • Users & Groups
  • Other Subsystems

More info here
There is also a good book published about OpenNebula.

2. KVM – Kernel-based Virtual Machine

KVM is the virtualization solution we’ll be using in this project.

3. iSCSI – Internet Small Computer System Interface

As Wikipedia summarizes:

It is an Internet Protocol (IP)-based storage networking standard for linking data storage facilities

A few interesting points:

  • iSCSI allows the creation of SANs (Storage Area networks)
  • It uses TCP to establish a connection so that the “initiator” can send SCSI commands to storage devices (targets) on a remote server.

An important point about iSCSI and other SAN protocols is that they do not encrypt the data sent over the network; all traffic is sent as cleartext.
iSCSI uses CHAP (Challenge-Handshake Authentication Protocol) to authenticate the supplicant and verifier during the initial stage of the connection, but after that all communication happens in the open.
Without encryption, an attacker with access to the network could:

  • reconstruct and copy the files and filesystems being transferred on the wire
  • alter the contents of files by injecting fake iSCSI frames
  • corrupt filesystems being accessed by initiators, exposing servers to software flaws in poorly tested filesystem code.

IPsec could be used to encrypt the communication. However, that would add a significant performance overhead.
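Going back to the CHAP handshake mentioned above, here is a minimal Python sketch of the challenge/response idea. It assumes the MD5 variant of CHAP described in RFC 1994, where the response is the hash of the identifier, the shared secret and the challenge. The function names and the example secret are made up for illustration; this is not how any particular iSCSI initiator or target implements it.

```python
import hashlib
import hmac
import os


def chap_response(identifier: int, secret: bytes, challenge: bytes) -> bytes:
    """Return MD5(identifier || secret || challenge), as in RFC 1994 CHAP."""
    return hashlib.md5(bytes([identifier]) + secret + challenge).digest()


def verify(identifier: int, secret: bytes, challenge: bytes, response: bytes) -> bool:
    """Verifier side: recompute the expected response and compare in constant time."""
    expected = chap_response(identifier, secret, challenge)
    return hmac.compare_digest(expected, response)


if __name__ == "__main__":
    secret = b"example-shared-secret"   # hypothetical secret known to both sides
    identifier = 1                      # one-byte identifier chosen by the verifier
    challenge = os.urandom(16)          # random challenge sent by the verifier

    # The supplicant answers the challenge without ever sending the secret itself.
    response = chap_response(identifier, secret, challenge)

    print(verify(identifier, secret, challenge, response))           # True
    print(verify(identifier, b"wrong-secret", challenge, response))  # False
```

Note that this only proves that both sides know the secret. Nothing here encrypts the SCSI traffic that follows, which is exactly the weakness described above.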

More info can be found:

4. Linux-HA

The definition from Heartbeat’s wiki:

Heartbeat is a daemon that provides cluster infrastructure (communication and membership) services to its clients. This allows clients to know about the presence (or disappearance!) of peer processes on other machines and to easily exchange messages with them.

The Heartbeat project is under the umbrella of Linux-HA (High Availability).
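To get a feel for what “knowing about the presence or disappearance of peers” means, here is a small Python sketch of the general idea: each peer sends a periodic heartbeat, and a peer that has been silent for too long is considered gone. The class name, the `dead_after` parameter and the node names are my own assumptions for illustration. This is not Heartbeat’s actual implementation or API.

```python
import time


class MembershipTracker:
    """Track which peers are alive based on the last heartbeat received."""

    def __init__(self, dead_after: float, clock=time.monotonic):
        self.dead_after = dead_after  # seconds of silence before a peer is declared dead
        self.clock = clock            # injectable clock, handy for testing
        self.last_seen = {}           # peer name -> time of the last heartbeat

    def heartbeat(self, peer: str) -> None:
        """Record that a heartbeat message just arrived from `peer`."""
        self.last_seen[peer] = self.clock()

    def alive(self) -> set:
        """Peers whose last heartbeat is recent enough."""
        now = self.clock()
        return {p for p, t in self.last_seen.items() if now - t <= self.dead_after}

    def dead(self) -> set:
        """Peers that were seen before but have been silent for too long."""
        return set(self.last_seen) - self.alive()


if __name__ == "__main__":
    now = [0.0]  # fake clock so the example runs instantly
    tracker = MembershipTracker(dead_after=3.0, clock=lambda: now[0])

    tracker.heartbeat("node-a")
    tracker.heartbeat("node-b")

    now[0] = 2.0
    tracker.heartbeat("node-a")  # node-b stays silent from here on

    now[0] = 4.0
    print(sorted(tracker.alive()))  # ['node-a']
    print(sorted(tracker.dead()))   # ['node-b']
```

In a real cluster the heartbeats would arrive over the network and a dead peer would trigger some kind of failover, but the bookkeeping is roughly this simple.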

Some of the packages from Linux-HA are:

I just started reading the Linux-HA user guide, which, by the way, is very detailed and contains a lot of information.

5. CentOS

We’ll most likely use CentOS as our main Linux distro.

CentOS is based on Red Hat Enterprise Linux.
It has a growing community and lots of documentation online.
A lot of useful information can be found on their wiki.

6. libvirt

oVirt is an open source, web-based management tool for platform virtualization.
Red Hat is one of the main contributors to the project.

oVirt is built on top of libvirt, the actual library that does the heavy lifting. libvirt can manage VirtualBox, KVM and Xen instances.

Other useful links: