
Setting up an OpenStack Cloud using Ansible

I use Ansible and OpenStack quite a bit, so I wanted to check out the work the community has done with the openstack-ansible project and get an OpenStack environment set up in my lab. I encourage you to read through the documentation as it is really detailed. Let’s do this!


My Lab Environment

My setup consists of:

4 x Dell PowerEdge R720s with 128GB of RAM
Quad 10G Intel NICs
Cisco Nexus 3k switches

I set aside one of the nodes for deployment and used the other three as targets. openstack-ansible currently supports Ubuntu 14.04 LTS (Trusty), so the first order of business was to install the OS onto the servers. Support for 16.04 LTS (Xenial) and CentOS 7 may also be coming at some point.

Setting up Networking

Once the OS was installed, the first thing to do was to set up the initial networking config in /etc/network/interfaces. For my setup, I assigned the various networks to VLANs.

Add some initial packages on the target host and enable some modules:

apt-get install bridge-utils debootstrap ifenslave ifenslave-2.6 \
lsof lvm2 ntp ntpdate openssh-server sudo tcpdump vlan
echo 'bonding' >> /etc/modules
echo '8021q' >> /etc/modules

Drop your interfaces file onto each of the hosts you’ll be deploying to and reboot them to apply the changes and bring up all of the bridges for your containers and instances. In my example, this configuration sets up dual bonds, VLANs, and bridges that OpenStack-Ansible will plug everything into.
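
The exact interfaces file depends on your NIC names, VLAN IDs, and addressing, so here is only a trimmed-down sketch (hypothetical interface names, VLAN ID, and addresses) of the bond/VLAN/bridge pattern, with br-mgmt, br-vxlan, br-storage, and br-vlan being the bridges openstack-ansible expects:

# /etc/network/interfaces (abbreviated sketch; p1p1/p1p2, VLAN 10, and the
# 172.29.236.0/22 addressing are examples only)
auto bond0
iface bond0 inet manual
    bond-slaves p1p1 p1p2
    bond-mode active-backup
    bond-miimon 100

# container management VLAN riding on the bond
auto bond0.10
iface bond0.10 inet manual
    vlan-raw-device bond0

# bridge that openstack-ansible plugs the containers into
auto br-mgmt
iface br-mgmt inet static
    bridge_ports bond0.10
    bridge_stp off
    address 172.29.236.11
    netmask 255.255.252.0

# br-vxlan, br-storage, and br-vlan follow the same pattern on their own
# VLAN interfaces (br-vlan typically attaches to the raw bond for provider
# VLAN networks)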

Initial Bootstrap

You’ll want to use one server as the deployment host so log into that server, check out openstack-ansible, and run the initial Ansible bootstrap:

git clone https://github.com/openstack/openstack-ansible.git /opt/openstack-ansible
cd /opt/openstack-ansible
git checkout stable/mitaka
scripts/bootstrap-ansible.sh

The bootstrap-ansible.sh script will generate SSH keys, so make sure to copy the contents of the public key file on the deployment host into the /root/.ssh/authorized_keys file on each target host.
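
One quick way to do that, assuming root password logins are still enabled on the targets (host names here are just examples):

for host in node1 node2 node3; do
    ssh-copy-id -i /root/.ssh/id_rsa.pub root@$host
done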

Copy the example openstack_deploy directory to /etc/:

cp -R /opt/openstack-ansible/etc/openstack_deploy /etc/openstack_deploy
cp /etc/openstack_deploy/openstack_user_config.yml.example /etc/openstack_deploy/openstack_user_config.yml

Modify openstack_user_config.yml with the settings you want. You’ll need to specify which servers should run each role. The file is pretty well commented and provides plenty of documentation to get started.

My config:

If you have enough memory and CPU on the hosts, you can also reuse the infrastructure nodes as compute_nodes to avoid having to set up dedicated nodes for compute.
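
As a rough sketch of what the host-to-role mapping can look like (hypothetical host names and addresses on the default 172.29.236.0/22 management network; many sections from the example file are omitted):

---
cidr_networks:
  container: 172.29.236.0/22
  tunnel: 172.29.240.0/22
  storage: 172.29.244.0/22

global_overrides:
  internal_lb_vip_address: 172.29.236.10
  external_lb_vip_address: 203.0.113.10
  management_bridge: "br-mgmt"
  # used_ips, provider_networks, etc. omitted for brevity

shared-infra_hosts:
  node1:
    ip: 172.29.236.11
  node2:
    ip: 172.29.236.12
  node3:
    ip: 172.29.236.13

compute_hosts:
  node2:
    ip: 172.29.236.12
  node3:
    ip: 172.29.236.13

# the other *_hosts groups (os-infra_hosts, identity_hosts, network_hosts,
# storage_hosts, haproxy_hosts, log_hosts, ...) follow the same pattern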

Credentials

Generate random passwords and tokens for all of the services:

cd /opt/openstack-ansible/scripts
python pw-token-gen.py --file /etc/openstack_deploy/user_secrets.yml

User Variables

We’ll start with just the basics for now to get operational. Make sure to enable at least a few options in /etc/openstack_deploy/user_variables.yml, otherwise the playbooks will have a hard time assembling their variables (these defaults haven’t made it into the mitaka stable branch yet):

## Debug and Verbose options.
debug: false
verbose: false

Run the playbooks

cd /opt/openstack-ansible/playbooks
openstack-ansible setup-hosts.yml
openstack-ansible haproxy-install.yml
openstack-ansible setup-infrastructure.yml
openstack-ansible setup-openstack.yml

If there are no errors, the initial cluster should be set up. The playbooks are all idempotent, so you can rerun them at any time.
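
The openstack-ansible wrapper passes extra arguments straight through to ansible-playbook, so if a run dies partway you can, for example, rerun against just one group or host (the group and host names below depend on your config):

openstack-ansible setup-openstack.yml --limit compute_hosts
openstack-ansible setup-hosts.yml --limit node3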

Using the Cluster

Once these playbooks complete, you should have a functional OpenStack cluster. To get started, you can log into Horizon with either the external VIP you set up in openstack_user_config.yml or by hitting the server directly.

You’ll use the user name “admin” and the password will be in your /etc/openstack_deploy/user_secrets.yml file that you generated earlier:

grep keystone_auth_admin_password /etc/openstack_deploy/user_secrets.yml
keystone_auth_admin_password: 4lkwtwtpmasldfqsdf

Each target node will have a utility container that you can SSH into to grab the OpenStack client credentials or run the client from. You can find it with:

root@osa-node1:~# lxc-ls | grep -i util
node1_utility_container-860a6cd9
root@osa-node1:~# ssh root@node1_utility_container-860a6cd9
Welcome to Ubuntu 14.04.4 LTS (GNU/Linux 3.13.0-85-generic x86_64)
root@node1-utility-container-860a6cd9:~# openstack server list
+--------------------------------------+-------------+--------+-----------------------+
| ID                                   | Name        | Status | Networks              |
+--------------------------------------+-------------+--------+-----------------------+
| 1b7f1a7f-db87-47fe-a884-c66875ceed00 | my-instance | ACTIVE | Public=192.168.20.165 |
+--------------------------------------+-------------+--------+-----------------------+
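
The utility container is built with an openrc file (typically /root/openrc in openstack-ansible deployments) that you can source before running other client commands, for example:

root@node1-utility-container-860a6cd9:~# source /root/openrc
root@node1-utility-container-860a6cd9:~# openstack network list
root@node1-utility-container-860a6cd9:~# openstack image list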

Create and Setup Your Network

In Horizon under the System tab, select Networks and then “+Create Network”. The main thing to note: depending on the type of network you are setting up, make sure to specify that type in the Physical Network box as well. In my case, I set up a VLAN network, so I set:

Name: Public
Project: admin
Provider Network Type: VLAN
Physical Network: vlan
Admin State: UP

Once the network is created, click on the network name and then “+Create Subnet”. Add your subnet details (a CLI equivalent for both steps follows the list):

Subnet Name: VLAN_854
Network Address: 10.127.95.0/24
Gateway IP: 10.127.95.1
Allocation Pools: <Start Address>,<End Address>
DNS Name Servers: <DNS Servers>
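
If you’d rather do this from the utility container than Horizon, roughly the same thing can be done with the neutron client. A sketch using the values above (the VLAN ID 854 is inferred from the subnet name, and the allocation pool and DNS servers are examples):

neutron net-create Public \
    --provider:network_type vlan \
    --provider:physical_network vlan \
    --provider:segmentation_id 854
neutron subnet-create Public 10.127.95.0/24 --name VLAN_854 \
    --gateway 10.127.95.1 \
    --allocation-pool start=10.127.95.100,end=10.127.95.200 \
    --dns-nameserver 8.8.8.8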

Add Images to Glance

You’ll need to add some images to get up and running. You can find a list of supported images that include cloud-init here. In Horizon, create an image with the following settings (a CLI example follows the list):

Name: Image Name
Image Source: Image Location
Image Location: Enter in URL of Cloud Image
Format: QCOW2, RAW, or whatever the image format may be
Architecture: x86_64 or whatever hardware you might be using
Public: Checked
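
This can also be scripted from the utility container; for example, to pull down an Ubuntu cloud image and register it (the URL and names are just examples):

wget https://cloud-images.ubuntu.com/trusty/current/trusty-server-cloudimg-amd64-disk1.img
openstack image create "ubuntu-14.04" \
    --file trusty-server-cloudimg-amd64-disk1.img \
    --disk-format qcow2 --container-format bare --public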

Security Groups

Security groups are enabled by default, so you’ll want to add some ingress rules, such as SSH and ICMP, so that you can connect to your instance.
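
For example, from the CLI (exact flags vary slightly between client versions; this targets the default security group):

openstack security group rule create --proto tcp --dst-port 22 default
openstack security group rule create --proto icmp default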

Start an Instance

Under the Instances tab, click “Launch Instance”. Fill in your desired options: boot from image, add any keypairs you might want, and make sure to select the security group you set up previously. You’ll also want to make sure you are plugged into the right network. Once all of that is set, you should be able to launch the instance and connect to it.
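
The CLI equivalent looks roughly like this (the flavor, key, image, and network names are whatever you created above):

openstack server create my-instance \
    --image ubuntu-14.04 \
    --flavor m1.small \
    --key-name mykey \
    --security-group default \
    --nic net-id=$(neutron net-show Public -f value -c id)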

Things to Note

Cluster Recovery

The target hosts are in a DB cluster, so if you need to reboot them, make sure to stagger the reboots so that the cluster doesn’t fail. If you find the DB is not coming up, you can run the galera-install playbook with the galera-bootstrap tag, which should bring the cluster back up (docs):

openstack-ansible galera-install.yml --tags galera-bootstrap
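
To check whether the cluster has formed afterwards, you can poke the Galera status from the deployment host with an ad-hoc command along these lines (a sketch; the galera_container group comes from the generated inventory, and this assumes the root MySQL client config that openstack-ansible drops into the container):

cd /opt/openstack-ansible/playbooks
ansible galera_container -m shell \
    -a "mysql -h localhost -e 'SHOW STATUS LIKE \"wsrep_cluster%\";'"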

If you run into any issues running through this, please let me know in the comments or ping me in #openstack-ansible on Freenode as antonym.

Stateless Hypervisors at Scale

Running a public cloud that provisions infrastructure has many challenges, especially once you get to very large scale. Today I’m going to touch on the hypervisor piece: the part of a public cloud that holds the customers’ data running in their instances.

Hypervisors typically run on bare metal and carry some sort of operating system, the host configuration, the customers’ instance settings, and, if using local storage, the virtual disks.

Traditionally, an operating system is installed and configuration management such as Puppet, Chef, or Ansible is run to bring the host machine up to the deployed specification; the host is then added to the automation that ultimately provisions instances for the public cloud.

Over time, new features get implemented, bugs are fixed, and operational knowledge is gained, so your deployed infrastructure evolves. As it grows, the older, legacy-style infrastructure starts to look very different and becomes out of sync and inconsistent.

To break it down into a few points:

  • Hypervisors become inconsistent over time through ongoing maintenance, code releases, and manual troubleshooting by operations.
  • Optimizations, patches, and security fixes are pushed with newer builds but older builds in production never get caught up.
  • Critical kernel or hypervisor updates that require reboots are hard to do because of the uptime requirements of a public cloud.

So what if we got rid of the traditional methods of OS installation and configuration management and instead created a snapshot of your server build once and then deployed that to thousands of servers?

“We’ll Do It Live”

If you’ve ever installed Ubuntu, typically you’ll use what’s called a Live CD to install the OS. The CD loads an OS into RAM and brings up the GUI so that you can then run the install from there. Many distributions over the years have used Live CDs for installation, rescue, or to serve as a tool for recovering from data loss.

The same concept can be applied to a hypervisor or any server running a workload. If you think about it, hypervisors typically have one purpose: to run virtual instances for users. Why have thousands of independent installs?

Creating a Live Image

The process I’ve been using to create live images is relatively simple. I’ve detailed the very high-level basics below (a condensed sketch of the commands follows the list) and will deep dive into each step at a later date:

  • Create an initial minimal chroot of the filesystem
  • Using Ansible, run configuration management one time within the chroot. This includes all additional packages needed, any customizations, and anything else you’d normally do in your configuration run.
  • Install the tools that allow live booting to work
      • CentOS/Debian/Fedora/OpenSUSE/Ubuntu – dracut
  • Regenerate the initrd to inject the live boot tooling
  • Copy the kernel and initrd out
  • Create an image file and sync the filesystem into it
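
A condensed sketch of those steps on an Ubuntu base (package names, dracut modules, paths, sizes, and the kernel version are illustrative, and site.yml is a placeholder for your own playbook):

# 1. minimal chroot of the target filesystem
debootstrap trusty /srv/image http://archive.ubuntu.com/ubuntu/

# 2. one-shot configuration run inside the chroot (Ansible chroot connection;
#    the inventory "host" is the chroot path)
ansible-playbook -i '/srv/image,' -c chroot site.yml

# 3. live boot tooling, then regenerate the initrd with live support baked in
KVER=3.13.0-85-generic   # whatever kernel the chroot ended up with
chroot /srv/image apt-get install -y dracut dracut-network
chroot /srv/image dracut --force --add dmsquash-live \
    /boot/initrd.img-$KVER $KVER

# 4. copy the kernel and initrd out, then sync the filesystem into an image file
mkdir -p /srv/output
cp /srv/image/boot/vmlinuz-$KVER /srv/image/boot/initrd.img-$KVER /srv/output/
truncate -s 4G /srv/output/rootfs.img
mkfs.ext4 -F /srv/output/rootfs.img
mount -o loop /srv/output/rootfs.img /mnt
rsync -aHAX /srv/image/ /mnt/
umount /mnt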

From there you now have the entire build of your OS represented by three files (kernel, initrd, and root filesystem image) that can be used to boot the operating system over the network, from GRUB, or via kexec.

To persist or not persist?

At this point you essentially have an image that can boot into RAM using iPXE, GRUB, or even kexec, and is fully stateless. But what if you want some data to actually persist? With a few scripts added at boot time, you can very easily separate the operating system and applications, which will need updating over time, from the user’s data, which needs to persist and remain constant.

The scripts create symlinks from the filesystem in RAM to local storage on the server, so that when an application tries to write to a directory it gets redirected to persistent storage on the local disk. The scripts that build the symlinks are part of the image, so they are recreated every time the server boots the image.

In the example of an OpenStack Nova compute node running libvirt+KVM and booting as a live OS, I have just a few locations on the filesystem that symlink to /data, which is mounted on local storage at /dev/sda2:

  • /etc/libvirt – libvirt configurations
  • /etc/nova – OpenStack Nova configuration
  • /etc/openvswitch – openvswitch settings and config
  • /etc/systemd/network – systemd networking configs
  • /var/lib/libvirt/ – libvirt files
  • /var/lib/nova/ – instance location
  • /var/lib/openvswitch/ – openvswitch database

Those locations and files within them make up the unique part of each hypervisor and keep them separate from the rest of the overall OS which will need to go through constant upgrading or changes.
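
The scripts themselves are simple; a stripped-down sketch (not the exact version baked into the image, and assuming /dev/sda2 is already mounted at /data) looks something like this:

#!/bin/sh
# redirect the stateful paths to persistent storage mounted at /data
PERSIST=/data

for dir in /etc/libvirt /etc/nova /etc/openvswitch /etc/systemd/network \
           /var/lib/libvirt /var/lib/nova /var/lib/openvswitch; do
    target="${PERSIST}${dir}"
    mkdir -p "$target"

    # first boot: seed the persistent copy from the image, then drop the RAM copy
    if [ -d "$dir" ] && [ ! -L "$dir" ]; then
        cp -a "$dir/." "$target/"
        rm -rf "$dir"
    fi

    # every boot: point the path in the RAM filesystem at the persistent copy
    ln -sfn "$target" "$dir"
done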

Squashible

I’ve been working on making some of these bits available to the public in a project called Squashible. The name came from mashing SquashFS together with Ansible; we’ve since switched away from SquashFS, but the name has stuck until I can come up with a better one.

You can play around with it here. It’s a constant work in progress so please use at your own risk. It currently runs through various roles to create an image with the minimal set of packages you need to run a hypervisor of a certain type. Many thanks to Major Hayden for working with me side by side on a lot of this project over the past year.

OpenStack

A video of my presentation and the slides from OpenStack Summit Austin 2016 – Stateless Hypervisors at Scale – are below.


Feedback

Comments, concerns, ideas? Let me know!

Developing boot.rackspace.com

When I started down the path of building osimag.es, I realized that it could be really useful for others, especially in a cloud environment. Since my main focus has been Rackspace Cloud Servers for a number of years, I decided to see how feasible it would be to put together a menu-driven installer for any operating system in an Infrastructure-as-a-Service environment. I figured there are probably a number of power users who might not want to start with the default images provided, but would want the opportunity to create their own custom image from scratch.

Will it even work?

I started testing out the XenServer boot-from-ISO code in OpenStack to see if someone might have already gotten that working for another use case. To my delight, it worked out pretty well: I was able to upload the 1MB iPXE ISO into Glance and boot from that image type.

The next problem to solve was that Rackspace Cloud Servers assigns static IP addresses and does not currently run a DHCP service to hand out networking. iPXE usually works best with DHCP, since the network stack gets set up automatically. Because of this, a customer launching a cloud server could boot the iPXE image but would have to configure the instance’s networking manually in order to chainload boot.rackspace.com.

We started thinking about how to automate this and, with the help of a few developers, came up with a solution: on boot, an iPXE image is retrieved and brought down to the hypervisor, the iPXE kernel is extracted, and the ISO is regenerated with a new iPXE startup script containing the networking information of the instance. When the instance starts, iPXE can then get on the network and load boot.rackspace.com automatically. Once iPXE has those values, they can also be passed on the kernel command line for distributions that support network options, so the user doesn’t have to enter any networking details during installation.
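
Conceptually, the injected startup script is just a static iPXE network configuration followed by a chainload. A sketch (the addresses are examples, and the exact menu URL may differ):

#!ipxe
# static networking injected when the ISO is regenerated
ifopen net0
set net0/ip 192.0.2.25
set net0/netmask 255.255.255.0
set net0/gateway 192.0.2.1
set dns 8.8.8.8
chain http://boot.rackspace.com/menu.ipxe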

Hosting the Menu

Because boot.rackspace.com is just a bunch of iPXE scripts, they are hosted in a Cloud Files container. The domain is a CNAME to the container’s URL, served through the Akamai CDN. The source is deployed from GitHub to the Cloud Files container by a Jenkins job whenever new commits are checked in, which makes it very lightweight and scalable to run. The next thing I’ll probably look at is whether I can remove the Jenkins server completely and run the deploy straight out of GitHub. I was also able to enable CDN logs within the container, and I’m using a service called Qloudstat to parse those logs and provide metrics on script usage.
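
The deploy step itself is essentially just an upload of the scripts into the container; with the swift client and your credentials already exported, it boils down to something like (container name is a placeholder):

cd boot.rackspace.com
swift upload <container> *.ipxe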

Delete those old ISOs

Having a small 1MB image is really nice for those times when you need to deploy an OS onto a remote server, or just need to install something into VirtualBox or VMware. There’s really no point in storing tons of ISOs on your machine if you can just stream the packages you need.

What’s Next?

I have a few ideas for new features. I’d like to add a menu of experimental items, and I’d also like the ability to generate a new version of the menu from a pull request so that changes can be quickly validated before being merged into the main code base. If you haven’t tried boot.rackspace.com yet, I encourage you to check it out. You can get a quick overview from my Rackspace blog post.

Citrix XenServer 6.1 Automated Installer for OpenStack

I’ve put my OpenStack XenServer 6.1 (Tampa) installer on GitHub: https://github.com/amesserl/xs-tampa-openstack.

It has all of the modifications needed to get it running with the OpenStack Nova XenAPI code and also includes the latest hotfixes. All you need to do is snag the latest CD and drop it in. I’ll continue to publish repos for new versions as they come out (Clearwater should be released soon). You can also boot it from osimag.es.