One of the beautiful things about working with IaaS is the disposable nature of instances. If an instance misbehaves due to a bug or a misconfiguration, it can be terminated and rebuilt with far more ease than debugging a long-lived, churned-through Linux system. As a quality engineer, I have found this disposability invaluable for testing and developing new tools without needing to babysit physical or virtual machines.
One of the projects I have been working on lately is an easy deployment of Riak CS into the cloud in order to quickly and repeatedly test the object storage integration provided by Eucalyptus in the 4.0 release. Riak CS is a scalable and distributed object store that provides an S3 interface for managing objects and buckets.
Before testing the Eucalyptus orchestration of Riak CS (or any tool/backend/service that Euca supports for that matter), it is important to understand the basic activities that Eucalyptus will be performing on behalf of the user. Thankfully, Neil Soman wrote a great blog post about how our Riak CS integration is designed.
In this model we can see that we require:
- A multi-node Riak CS cluster
- A load balancer
- A machine to run the Eucalyptus Object Storage Gateway (OSG)
This topology is extremely simple to deploy in Eucalyptus 3.4 using our ELB, with Vagrant deploying the Riak CS cluster. Here’s how to get your groove on.
Prerequisites:
- CentOS 6 image loaded into your cloud
- Keypair imported or created in the cloud
- Security group authorized for ports 8080, 8000, and 22
- Vagrant installed on your workstation
Deploy Riak CS
In order to deploy Riak CS in our cloud we will use Vagrant+Chef+Berkshelf as follows:
- Install Vagrant plugins using the following commands:
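This plugin set is a sketch of what this Vagrant+Chef+Berkshelf workflow typically needs (vagrant-aws for the cloud provider, vagrant-berkshelf and vagrant-omnibus for the Chef tooling); check the repository's README in case it expects different versions:

```shell
# Install the plugins Vagrant needs to talk to an AWS-compatible
# cloud and to provision with Chef/Berkshelf (assumed plugin set)
vagrant plugin install vagrant-aws
vagrant plugin install vagrant-berkshelf
vagrant plugin install vagrant-omnibus
```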
- Import the dummy vagrant box necessary to use vagrant-aws:
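vagrant-aws ships a placeholder box for exactly this purpose; the command below follows the vagrant-aws README:

```shell
# Add the dummy box that vagrant-aws uses as a stand-in for a real base box
vagrant box add dummy https://github.com/mitchellh/vagrant-aws/raw/master/dummy.box
```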
- Clone the following repository
- Edit the following items in the Vagrantfile to reflect the pre-requisites above and to point to your target cloud
- Set the number of nodes to deploy at the top of the Vagrantfile:
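The edits above amount to a provider block like the sketch below; the option names follow the vagrant-aws provider, and every value shown is a placeholder you must swap for your own cloud's details:

```ruby
# Sketch of vagrant-aws settings pointing at a Eucalyptus cloud;
# all values are placeholders -- substitute your own
RIAK_CS_NODES = 3  # number of Riak CS nodes to deploy

Vagrant.configure("2") do |config|
  config.vm.provider :aws do |aws, override|
    aws.access_key_id     = "YOUR-ACCESS-KEY"
    aws.secret_access_key = "YOUR-SECRET-KEY"
    aws.keypair_name      = "my-keypair"
    aws.ami               = "emi-XXXXXXXX"   # the CentOS 6 image loaded earlier
    aws.security_groups   = ["riak-cs"]      # group with 8080, 8000, 22 open
    aws.endpoint          = "http://<clc-ip>:8773/services/compute"
    override.ssh.username         = "root"
    override.ssh.private_key_path = "~/.ssh/my-keypair.pem"
  end
end
```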
- Once the cloud options are set start the Vagrant “up” process which will deploy the Riak CS nodes and Stanchion:
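Because the boxes live in the cloud rather than in a local hypervisor, the provider has to be passed explicitly; something like:

```shell
# Deploy the Riak CS nodes and Stanchion via the aws provider
vagrant up --provider=aws
```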
- Once Vagrant completes, log in to the first Riak CS node to get its private hostname:
- Join each node to the first that was deployed. For example, to join the second node to the cluster I would run:
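Riak's own cluster tooling handles the join; a sketch using the standard riak-admin commands, run on the joining node with riak1's private hostname from the previous step:

```shell
# On riak2: stage a join to the first node, then review and commit the plan
riak-admin cluster join riak@<riak1-private-hostname>
riak-admin cluster plan
riak-admin cluster commit
```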
To get your access and secret keys, log in to http://riak1-public-ip:8000
Load Balance Your Riak CS Cluster
- Create an ELB with the following command:
eulb-create-lb -z <AZ-of-your-riak-nodes> -l "lb-port=80, protocol=TCP, instance-port=8080, instance-protocol=TCP" RiakCS
- The command above returns the DNS name that you will use as the endpoint for the “objectstorage.s3provider.s3endpoint” property when setting up the OSG. For example, from sample output we would use “RiakCS-229524229045.lb.home”
- Register your Riak CS nodes with that load balancer:
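With euca2ools this is a single command; the instance IDs below are placeholders for your Riak CS nodes:

```shell
# Register each Riak CS instance (IDs are placeholders) with the ELB
eulb-register-instances-with-lb --instances i-11111111,i-22222222 RiakCS
```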
You have now successfully deployed a Riak CS cluster. You can stop here if you’d like, but the real fun starts when you add IAM, ACL, versioning, multipart upload, and bucket lifecycle support to the mix using the Eucalyptus OSG.
True enthusiasts continue below.
Install and Configure the Eucalyptus OSG Tech Preview
- Spin up another CentOS 6 instance in the same security group as used above
- Follow the instructions found here to finish the OSG installation and configuration; remember to use the DNS name returned in step 1 above as the s3endpoint.
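On the Eucalyptus side, wiring the OSG to Riak CS comes down to a few cloud properties. The property names below follow the 4.0 tech preview object storage docs, so verify them against your release; the key values are the ones shown in the Riak CS admin UI earlier:

```shell
# Point the OSG at the Riak CS cluster through the ELB
# (values are placeholders; substitute your own ELB DNS name and keys)
euca-modify-property -p objectstorage.providerclient=riakcs
euca-modify-property -p objectstorage.s3provider.s3endpoint=RiakCS-229524229045.lb.home
euca-modify-property -p objectstorage.s3provider.s3accesskey=<riak-cs-access-key>
euca-modify-property -p objectstorage.s3provider.s3secretkey=<riak-cs-secret-key>
```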
Introduction to Aminator
The Netflix Cloud Platform has shown how a large-scale system can be deployed in a public cloud while maintaining an extreme level of performance and reliability. As Adrian Cockcroft has said in the past, Netflix focused on having functional and scalable code rather than worrying immediately about how to make it portable. At Eucalyptus we have been working over the past year on making sure that as many as possible of the NetflixOSS projects that interact directly with the cloud can be used on Eucalyptus. This level of portability means that anyone with even a single Linux box can use their system as a test bed for deploying NetflixOSS more broadly on a public cloud. So far at Eucalyptus we have working versions of Asgard, Simian Army, Aminator, and Edda. These tools are cornerstones for app deployment and monitoring with the NetflixOSS stack. In this post I will show how to use Aminator with your Eucalyptus cloud.
Aminator is a tool created by Netflix to populate and catalog application images that are the building blocks for any service infrastructure you can dream up. Aminator works by taking a “Foundation AMI” and mounting a snapshot from it in order to provision your application. It does this by mounting a volume, created from an image snapshot, to a running instance, then performing a chroot that runs provisioners such as Chef or Ansible. Once the provisioning step is complete, the volume is snapshotted and registered as an AMI. Aminator doesn’t stop there, however. It also tags the snapshot and the AMI so that they can be easily identified. Some of the information included in the tags is:
- Package version info
- Name of application
- Base AMI
Having this information allows an Aminator user to trace the history of how one of their applications was deployed, and also pin the deployment to a particular person.
Some of the benefits of using Aminator for app deployment include:
- Ensuring exact and dependable recovery of previous application stack (including dependencies and software)
- Allows applications to be deployed in Auto Scaling groups, as each AMI is a completely self-contained version of the application
- Ensures application images are tagged with appropriate meta data for traceability
- Allows traceability of ownership of images (since Netflix uses one large AWS account)
With the addition of a Eucalyptus cloud as a deployment target, you can enjoy the following:
- An internal test and development platform for NetflixOSS
- Gives application developers an easy way to catalog, build, and deploy their test applications
- Ensures a repeatable process is in place for creating an image that will eventually go into production
- Test changes to an image quickly/cheaply on local private infrastructure before deploying into production
Using Aminator in Eucalyptus
In order to run Aminator we will first need to build our Aminator instance (which will also be the AMI we use as the “Foundation AMI”).
- Download the Ubuntu Precise QCOW disk image to a machine that has the qemu-img tool
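The download URL below follows Ubuntu's cloud-images layout for the Precise release; adjust if the mirror structure has changed:

```shell
# Fetch the Ubuntu 12.04 (Precise) cloud image in QCOW2 format
wget https://cloud-images.ubuntu.com/releases/precise/release/ubuntu-12.04-server-cloudimg-amd64-disk1.img
```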
- Convert the QCOW image to RAW format using the following command:
- qemu-img convert -O raw ubuntu-12.04-server-cloudimg-amd64-disk1.img ubuntu-12.04-server-cloudimg-amd64-disk1.raw
- Once the image is converted start up an instance so we can create our “Foundation AMI”
- After the instance is booted, copy the raw image to the instance’s ephemeral storage
- Attach a 2G volume to the instance
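With euca2ools this is roughly the following; the zone, instance ID, and volume ID are placeholders:

```shell
# Create a 2 GB volume in the instance's zone, then attach it as /dev/vdb
euca-create-volume -z <zone> -s 2
euca-attach-volume -i i-XXXXXXXX -d /dev/vdb vol-XXXXXXXX
```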
- Copy the disk file to the volume using dd:
- dd if=ubuntu-12.04-server-cloudimg-amd64-disk1.raw of=/dev/vdb
- Create a snapshot from your volume
- Register the snapshot as an image in your cloud. Remember this AMI ID as it will be what we pass to Aminator in later steps.
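These two steps look roughly like the commands below with euca2ools; IDs are placeholders, and the image name is an arbitrary example:

```shell
# Snapshot the populated volume, then register it as the "Foundation AMI";
# note the returned image ID for the aminate -B flag later
euca-create-snapshot vol-XXXXXXXX
euca-register --name ubuntu-precise-foundation --snapshot snap-XXXXXXXX --root-device-name /dev/sda
```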
- Run an instance from the newly created image and log into it
In these steps we have now created the base image for future application deployments and created the instance where we will run our Aminator tasks. Next up we will install Aminator and the Eucalyptus plugin:
- Clone the Aminator repository
- git clone https://github.com/Netflix/aminator.git
- Edit the aminator/default_conf/environments.yml file and add the following block:
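The exact block depends on the plugin, but Aminator environments map a name to a set of plugins. Below is a hypothetical sketch matching the euca_apt_linux environment name used in the aminate command later; the field names mirror the stock entries in default_conf/environments.yml, so model yours on those:

```yaml
# Hypothetical environment entry wiring the Eucalyptus cloud plugin
# into an apt-based Linux provisioning run
environments:
  euca_apt_linux:
    cloud: euca
    distro: debian
    provisioner: apt
    volume: linux
    blockdevice: linux
    finalizer: tagging_ec2
```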
- Run the setup script twice from inside the aminator directory
- cd aminator; python setup.py install; python setup.py install
- Now clone the eucalyptus-cloud Aminator plugin and install it
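The commands follow the usual clone-and-install pattern; the post does not name the plugin repository, so the URL below is a placeholder:

```shell
# Clone and install the Eucalyptus cloud plugin
# (repository URL is a placeholder -- substitute the actual plugin repo)
git clone https://github.com/<org>/aminator-plugin-eucalyptus-cloud.git
cd aminator-plugin-eucalyptus-cloud
sudo python setup.py install
```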
- Now that you have all the dependencies, let’s run an amination to install and label an Apache web server image:
- sudo aminate -e euca_apt_linux --ec2-endpoint <clc-ip> -B emi-CD544111 apache2
In the above command we have told Aminator a few things about what we want to do:
- -e: Use the Eucalyptus environment for provisioning a Linux machine with APT
- --ec2-endpoint: Use this IP to connect to the Eucalyptus cloud
- -B: Use this AMI (the one we registered in the steps above) as the base image
- apache2: Install the apache2 package and tag the appropriate version information onto the snapshot and image
After a few minutes Aminator should complete, letting you know which AMI it has registered for you.
Enjoy your new application deployment tool!
Throughout my tenure as a Quality Engineer, I have had a love/hate relationship with logs. On one hand, they can be my proof that a problem is occurring and possibly the key to tracking down a fix. On the other hand, they can be an endless stream of seemingly unintelligible information. In debugging a distributed system, such as Eucalyptus, logging can be your only hope in tracing down issues with operations that require the coordination of many components.
Logs are generally presented to users by applications as flat text files that rotate their contents over time in order to bound the amount of space they will take up on the filesystem. Gathering information from these files often involves terminal windows, tail, less, and timestamp correlation. The process of manually aggregating, analyzing and correlating logs can be extremely taxing on the eyes and brain. Having a centralized logging mechanism is a great leap forward in streamlining the debug process but still leaves flat text files around for system administrators or testers to analyze for valuable information.
A month or so ago I set out to reinvigorate my relationship with logs by making them sexy again. I looked around at the various open source and proprietary tools on the market and decided to give Logstash a shot at teaching me something new about Eucalyptus through its logs. The “getting started” links I found on the docs page presented a quick and easy way to see what Logstash could do for my use case, namely ingesting and indexing logs sent from rsyslog. Once I got some logs to appear in the ElasticSearch backend, I got a bit giddy: I was now able to search and filter the logs through an API. But alas! I was still looking at text on a freaking black and green screen. BORING! There had to be a better way to visualize this data.
I looked around a bit and found Kibana. This beautiful frontend to ElasticSearch gives you a simple and clean interface for creating/saving dashboards that reflect interesting information from your logs. Within minutes of installing Kibana, I had a personalized dashboard setup that was showing me the following statistics from my Eucalyptus install that was undergoing a stress test:
- Instances run
- Instances terminated
- Volumes created
- Volumes deleted
I had proven that there was value in using Logstash, and it was not complicated to set up or use. I then began to use other dashboards, filters, and search terms to look for anomalous patterns in the log messages. This type of analysis resulted in a couple of issues being opened that I would not have found looking at one screen of text at a time.
Below I will outline the steps to begin your own Logstash journey with Eucalyptus or any other system/application that logs to a filesystem on a Linux box.
- Install packages
- Set proper timezone
- Download Logstash
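On a CentOS 6 box, the three setup bullets above amount to something like the following; the Logstash version and download host are from the 1.x flatjar era, so adjust to whatever release you are using:

```shell
# Logstash 1.x ships as a self-contained "flatjar" and needs a JRE
yum install -y java-1.7.0-openjdk
# Keep timestamps consistent across components (UTC assumed here)
ln -sf /usr/share/zoneinfo/UTC /etc/localtime
# Download the Logstash flatjar (version and URL are era-specific)
wget https://download.elasticsearch.org/logstash/logstash/logstash-1.2.2-flatjar.jar
```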
- Create a Logstash config file for rsyslog input by creating and editing a file named logstash.conf
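A minimal config for this setup listens for rsyslog traffic on a TCP port and indexes into the embedded ElasticSearch instance; port 5544 is an arbitrary choice that just has to match the rsyslog forwarding rule:

```
# logstash.conf -- receive rsyslog messages and index into ElasticSearch
input {
  syslog {
    type => "syslog"
    port => 5544
  }
}
output {
  elasticsearch { embedded => true }
}
```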
- Run logstash JAR
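Assuming the 1.2.x flatjar downloaded earlier, the agent and the built-in web interface can be started in one command:

```shell
# Start the Logstash agent with our config, plus the bundled web UI
java -jar logstash-1.2.2-flatjar.jar agent -f logstash.conf -- web
```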
- Configure rsyslog on Eucalyptus components by adding the following to the /etc/rsyslog.conf file and replacing <your-logstash-ip>
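The forwarding rule is a single line; `@@` sends over TCP (a single `@` would be UDP), and the port must match the Logstash syslog input (5544 assumed here):

```
# /etc/rsyslog.conf -- forward everything to the Logstash box over TCP
*.* @@<your-logstash-ip>:5544
```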
- Restart rsyslog
Installing Kibana 3
- Clone the repository from GitHub
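Kibana 3 is a static HTML/JS application, so a clone is all the "installation" it needs; the repository lived under the elasticsearch organization at the time:

```shell
# Fetch Kibana 3 (a pure browser-side app)
git clone https://github.com/elasticsearch/kibana.git
```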
- Edit the kibana/config.js file and set the elasticsearch line to:
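This line tells the browser-side app where ElasticSearch lives; since the embedded instance runs alongside Logstash, point it there (9200 is the ElasticSearch default port):

```js
// kibana/config.js -- the browser connects to ElasticSearch directly
elasticsearch: "http://<your-logstash-public-ip>:9200",
```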
- Copy the Kibana repository to your web server directory
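Assuming Apache httpd with the default CentOS docroot (an assumption; adjust the path for your web server):

```shell
# Serve Kibana from the web server's document root
cp -a kibana/* /var/www/html/
```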
Point your browser to http://<your-logstash-public-ip> and you should be presented with the Kibana interface. Kibana is not specifically a frontend for Logstash but rather a frontend to any ElasticSearch installation. Kibana does provide a default Logstash dashboard as a starting point for your customizations: http://<your-logstash-public-ip>/index.html#/dashboard/file/logstash.json