Author: antonym

Leveraging GitLab for CI/CD Deployments of OpenStack

This blog post covers a way to set up Continuous Integration and Continuous Deployment (CI/CD) for your OpenStack cluster and is easily adaptable to various types of deployments like openstack-ansible or kolla-ansible.

An important part of any OpenStack deployment is the configuration for the environment. Once you have your ideal configuration set up and have production loads running, it's important to ensure your environment is deployed and updated in a consistent and controlled way. By utilizing version control, you can track all changes made to your configuration, ensure that new changes are validated and approved by your peers, and know which previous changes deployed successfully. Then, by having a common method of deployment once those changes have merged in, you can ensure the environment is deployed the same way every time.

Enter GitLab

We'll do this by leveraging GitLab and its all-in-one functionality for version control with git and CI/CD automation.

We'll store the configurations in a git repository and encrypt the secrets with Ansible Vault. Then we will add a .gitlab-ci.yml to that repository that triggers actions when merge requests come in or merges occur.

We will then set up a GitLab runner on the deployment host within the environment you want to deploy to automatically. GitLab will pass the job to that deployment host and run the deployment and configuration changes. Then it'll gather up all the logs and save them as artifacts so that the deployment is fully logged and can be debugged if needed.

Configure your OpenStack Environment

We'll make the assumption you've already done a fresh install and have your environment the way you want it. If you're new to OpenStack or just getting started, these are some of the popular tools for deploying OpenStack:

openstack-ansible - https://docs.openstack.org/openstack-ansible/latest/
kolla-ansible - https://docs.openstack.org/kolla-ansible/latest/

Set up GitLab

You can either use a self-hosted GitLab server or the publicly hosted gitlab.com. Either will work; you just need to make sure that whatever host you are using for deployment has access to the internet, as it will need to talk to GitLab to retrieve jobs.

Create your configuration repo

mkdir -p ~/my-openstack-configs/osa-configs
cp -R /etc/openstack_deploy/* ~/my-openstack-configs/osa-configs/

Encrypt your credentials

# create a vault password file
openssl rand -base64 40 > ~/.vault_pass.txt
# encrypt your user_secrets if OSA, or passwords.yml if kolla
ansible-vault encrypt ~/my-openstack-configs/osa-configs/user_secrets.yml --vault-password-file ~/.vault_pass.txt
# check your file and ensure it's encrypted after running

Set up your git repo

cd ~/my-openstack-configs
git init
git remote add origin git@gitlab.com:your_repo/my-openstack-configs.git
git add .
git commit -m "Initial commit"
git push -u origin master

Your configuration repo is now set up! Now to automate deployments!

Set up your .gitlab-ci.yml

First we'll create a .gitlab-ci.yml that contains the Deployment job:

stages:
  - deploy
variables:
  ANSIBLE_VAULT_PASSWORD_FILE: /root/.vault_pass.txt
include:
  - local: .gitlab-ci-deploy.yml

#########################
## Deployment Pipeline ##
#########################

deploy_dc1:
  stage: deploy
  image: ubuntu:18.04
  extends: .deploy_script
  variables:
    DATACENTER: dc1
    OSA_RELEASE: stable/stein
  environment:
    name: production
  when: manual
  only:
  - master
  artifacts:
    paths:
      - artifacts/
    expire_in: 1 week 
  tags:
    - acme_dc1

#############################
## End Deployment Pipeline ##
#############################

Set up .gitlab-ci-deploy.yml

Then we'll create a .gitlab-ci-deploy.yml that contains most of the common deployment scripting. This gets included in the original .gitlab-ci.yml. It's broken out into a separate file to keep the main pipeline file a bit cleaner. This logic could also be put into standalone scripts and called from the job, but it's shown inline here as an example.

.deploy_script:
  before_script:
    # install ssh-agent
    - 'which ssh-agent || ( apt-get update -y && apt-get install openssh-client -y )'
    # run ssh-agent
    - eval $(ssh-agent -s)
    - export SSH_PRIVATE_KEY=${DATACENTER}_SSH_PRIVATE_KEY
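    # ${!SSH_PRIVATE_KEY} indirectly expands the per-datacenter CI/CD variable (dc1_SSH_PRIVATE_KEY in this example)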
    - echo "${!SSH_PRIVATE_KEY}" | tr -d '\r' | ssh-add - > /dev/null
    - mkdir -p ~/.ssh
    - chmod 700 ~/.ssh
    - echo -e "Host *\n\tStrictHostKeyChecking no\n\n" > ~/.ssh/config
    - ln -s $CI_PROJECT_DIR/osa-configs/ /etc/openstack_deploy
    - echo ${ANSIBLE_VAULT_PASSWORD} > /root/.vault_pass.txt
  script:
    - apt update
    - apt install -y python-pip git
    - git clone --branch $OSA_RELEASE https://github.com/openstack/openstack-ansible /opt/openstack-ansible
    - cd /opt/openstack-ansible
    - scripts/bootstrap-ansible.sh
    - cd /opt/openstack-ansible/playbooks
    - ansible-vault decrypt /etc/openstack_deploy/user*secret*.yml
    - openstack-ansible setup-hosts.yml
    - openstack-ansible setup-infrastructure.yml
    - openstack-ansible setup-openstack.yml
  after_script:
    - cp -R /openstack/log ${CI_PROJECT_DIR}/artifacts
    - ansible-vault encrypt /etc/openstack_deploy/user*secret*.yml
    - rm /root/.vault_pass.txt

CI/CD Repo Settings to Tune on Gitlab Repo

The following settings are needed to ensure CI/CD has enough time to run and can inject the SSH private key into the Docker container during the deployment run so that it can talk to the other hosts.

  • Add a per-datacenter SSH private key variable to the GitLab Settings CI/CD variables (dc1_SSH_PRIVATE_KEY in this example, matching the ${DATACENTER}_SSH_PRIVATE_KEY lookup in the deploy script)
  • Increase Job Timeout from 1h to 8h (or desired timeout for larger environments) in CI/CD Settings, General Pipeline settings of configuration repo
  • Set the ANSIBLE_VAULT_PASSWORD CI/CD variable to the contents of .vault_pass.txt so that the secrets can be decrypted during the job.

Set up the GitLab runner on your deployment host

You can choose to have the runner execute on the host directly with SSH or have it run all scripting in Docker. I'm using Docker in this example as it keeps the environment clean and uncluttered. The examples also currently build up the Docker environment from scratch each time, but this can be optimized by capturing a Docker image with the dependencies and configurations needed and then using that newly created image, as sketched below.
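
As a rough sketch (the image name and registry here are placeholders, not anything I'm actually running), you could bake the deploy-time dependencies into an image once, push it to a registry the runner can reach, and then reference it from the image: keyword in .gitlab-ci.yml:

# illustrative only: bake the deploy-time dependencies into a reusable image
cat > Dockerfile.deploy <<'EOF'
FROM ubuntu:18.04
RUN apt-get update && apt-get install -y python-pip git openssh-client
EOF
docker build -t registry.example.com/acme/osa-deploy:stein -f Dockerfile.deploy .
docker push registry.example.com/acme/osa-deploy:stein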

Install docker to deployment host

apt install docker.io
systemctl enable docker
systemctl start docker

Install gitlab runner

More information about gitlab runner is here: https://docs.gitlab.com/runner/

curl -L https://packages.gitlab.com/install/repositories/runner/gitlab-runner/script.deb.sh | sudo bash
sudo apt-get install gitlab-runner
systemctl enable gitlab-runner
systemctl start gitlab-runner

Configure Gitlab runner

# configure gitlab runner, headless options also available to skip user input, easy to automate
gitlab-runner register
# Server: https://gitlab.com or location of self hosted gitlab server
# Token: Found from repo's CI/CD Runners section in settings
# Description: Set description, i.e. "dc1 production runner" 
# Tag: tag used by the .gitlab-ci.yml file to send the job to this runner, e.g. acme_dc1.  This is how you ensure it's listening for the right deploy job
# Type of deployer, docker or ssh (Please enter the executor: custom, docker, parallels, ssh, docker+machine, kubernetes, docker-ssh, shell, virtualbox, docker-ssh+machine)

Once set up, the runner will listen to the GitLab server you configured and act on any jobs requesting the tag you registered the gitlab-runner with. The runner can be shut down outside of a maintenance window to ensure deploys don't happen outside of that window, if so desired.
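
For example, on the deployment host:

# pause deployments outside of the maintenance window
systemctl stop gitlab-runner
# start picking up jobs again once the window opens
systemctl start gitlab-runner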

From there, changes that are merged in will kick off the pipeline and pause before deployment. Pressing the play button on the deployment will send the job to the runner and kick off the deployment.

Upgrading Kolla Ansible for Deploying OpenStack

If you are already running an environment with Kolla and want to upgrade it to the next release of OpenStack, it's pretty easy and quick to do. First you'll want to determine the next version you want to deploy to and install kolla-ansible for that version of OpenStack. You can view the releases of kolla-ansible here: https://releases.openstack.org/teams/kolla.html

For example:

Stein (8.x): 8.0.0.rc1
Rocky (7.x): 7.0.2
Queens (6.x): 6.2.1

So say you are running Rocky (7.x) and want to upgrade to Stein (8.x). You'll want to:

pip install --upgrade kolla-ansible==8.0.0

You will now have the latest kolla-ansible code to do the upgrade to Stein. Upgrades need to be run incrementally, which means going major version to major version. It is not a good idea to skip versions, as you may miss things like needed database migrations for the environment.

Next we will need to update the configuration and inventory files to ensure we have all of the latest values from the new version of kolla-ansible. The files you need are the globals.yml, passwords.yml, and inventory files. Their locations are:

CentOS

/usr/share/kolla-ansible/etc_examples/kolla/{globals.yml,passwords.yml} # configuration files
/usr/share/kolla-ansible/ansible/inventory/{aio,multinode} # inventory

Ubuntu

/usr/local/share/kolla-ansible/etc_examples/kolla/{globals.yml,passwords.yml} # configuration files
/usr/local/share/kolla-ansible/ansible/inventory/{aio,multinode} # inventory

I would recommend starting with those example files and porting any deviations or changes specific to your environment into them. Each release can have some deviations in the inventory, globals, or passwords, so it is important to redo the configurations for each release so that you do not run into unexpected issues when doing the upgrade.
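
For example, on an Ubuntu deployment host you can diff what you are currently running against the new release's example file to spot what changed (assuming your current config lives in /etc/kolla):

diff -u /etc/kolla/globals.yml /usr/local/share/kolla-ansible/etc_examples/kolla/globals.yml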

Make sure to set the openstack_release in globals.yml to the version of kolla-ansible you want to deploy:

openstack_release: 8.0.0.rc1

You will also need to make sure you have the latest list of passwords by merging the new list of passwords with the old list:

mv /etc/kolla/passwords.yml passwords.yml.old
cp kolla-ansible/etc/kolla/passwords.yml passwords.yml.new
kolla-genpwd -p passwords.yml.new
kolla-mergepwd --old passwords.yml.old --new passwords.yml.new --final /etc/kolla/passwords.yml

Once you have the configurations set up, you can begin the process of pulling down the images.

kolla-ansible -i multinode pull

This will identify all images that need to be pulled down to the hosts and prestage them on the machines. By prestaging the Docker images on the hosts, we can rapidly upgrade the environment and reduce downtime. Once the images are pulled down, we can begin the process of upgrading.

kolla-ansible -i multinode upgrade

This will begin the process of restarting the Docker containers one by one and handle the upgrade from one image to the next. At the end of the upgrade, all containers will be upgraded from Rocky to Stein and the environment will be available for use. The upgrade per node is usually pretty quick, as it is just reloading a new image and doing things like running database migrations during each container restart. Because all of the images have been pulled down ahead of time, deployment and upgrading can be very fast, reducing downtime and maintenance windows for large scale environments.

Official upgrade documentation can be found here:
https://docs.openstack.org/kolla-ansible/latest/user/operating-kolla.html

My Home Networking Setup

I've had a few asks from various people about my networking setup at home so I thought I'd throw together a blog post detailing it.

I wanted something that was going to be stable, performant, and reliable so I bit the bullet and decided to go with equipment that typically isn't used in most households. No Linksys or Netgear consumer grade equipment here!

My current setup consists of:

1 x Ubiquiti EdgeRouter 4
1 x Ubiquiti Edge Switch ES-24-250W
4 x Ubiquiti Unifi 802.11ac Dual-Radio PRO Access Point (UAP-AC-PRO-US)

My EdgeRouter 4 (datasheet) is connected to AT&T Fiber for a lightning fast 1 Gbps up and down. Eventually, I may add a secondary failover provider to ensure the internet is always up and keep the family happy at all times. The router also runs Debian, which lets you SSH into the router and use the CLI to configure or hack around on it.

The 24-port PoE switch is a relatively cheap switch that provides PoE (Power over Ethernet). All of the AC-Pro WiFi access points are powered with PoE from the switch and distributed around the house for maximum signal. This is great because installing a device just requires an Ethernet line instead of an additional power run. An added benefit is that all of the equipment can be battery backed up from one central location. My home is hard-wired for Ethernet, so all wiring aggregates in one location at the switch. WiFi is nice to have for roaming devices, but for dedicated devices that usually require higher bandwidth, like gaming consoles, Apple TVs, and the TVs themselves, it's nice to have those hardwired into the network.

Overall I've been really happy with the setup and haven't really had to mess with it much other than the occasional firmware updates to keep the devices up to date.

Setting up python-openstackclient with Rackspace

I finally got around to switching to python-openstackclient after using python-novaclient and supernova for a number of years. Here's a quick way to get it set up if you're using Rackspace as a public cloud OpenStack provider and want to use multiple clouds or regions from one client.

Install python-openstackclient:

pip install python-openstackclient

Create a directory to hold your cloud config file:

mkdir -p $HOME/.config/openstack

Generate a clouds.yaml file, but make sure to put valid information for logging in (make sure to use your password instead of API key):

cat <<EOF > $HOME/.config/openstack/clouds.yaml
clouds:
  rackspace:
    cloud: rackspace
    auth:
      auth_url: 'https://identity.api.rackspacecloud.com/v2.0/'
      project_id: rackspace-account-number
      username: rackspace-account-username
      password: rackspace-account-password
    region_name: DFW,ORD,IAD,LON,SYD,HKG
EOF

When running the openstack client command, you'll need to specify the cloud you want to use, and then specify the region you want to call. By default, if --os-region-name isn't specified, it will use the first entry set in region_name. To list servers in an account:

openstack --os-cloud rackspace --os-region-name=IAD server list

To boot a quick server, first grab the flavor id and image id that you want to boot:

openstack --os-cloud rackspace --os-region-name=IAD flavor list
openstack --os-cloud rackspace --os-region-name=IAD image list

Then take the values you want and boot the server:

openstack --os-cloud rackspace --os-region-name=IAD server create --image <image-uuid> --flavor <flavor-name> --key-name my_key mytestserver

If for any reason you get an error that states the following:

No module named openstack.common

Check and see if you have some older Rackspace novaclient pip packages installed and remove them with pip uninstall:

positron:~ ant$ pip freeze | grep ext
os-diskconfig-python-novaclient-ext==0.1.2
os-networksv2-python-novaclient-ext==0.25
os-virtual-interfacesv2-python-novaclient-ext==0.19
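
If any of those show up, remove them, for example:

pip uninstall os-diskconfig-python-novaclient-ext os-networksv2-python-novaclient-ext os-virtual-interfacesv2-python-novaclient-ext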

For more information on configuration options, check out the python-openstackclient configuration docs.

Setting up an OpenStack Cloud using Ansible

I use Ansible and OpenStack quite a bit on a daily basis, so I wanted to check out the work the community has done with the openstack-ansible project and get an OpenStack environment set up in my lab. I encourage you to read through the documentation as it is really detailed. Let's do this!

My Lab Environment

My setup consists of:

4 x Dell PowerEdge R720s with 128GB of RAM
Quad 10G Intel NICs
Cisco Nexus 3k switches

I set aside one of the nodes for deployment and the other three were going to be used as targets. openstack-ansible currently supports Ubuntu 14.04 LTS (Trusty), so the first order of business was to install the OS on the servers. Future support for 16.04 LTS (Xenial) and CentOS 7 may be coming down the road as well.

Setting up Networking

Once the OS was installed, the first thing to do was to set up the initial networking config in /etc/network/interfaces. In my setup, I'll be assigning networks to VLANs.

Add some initial packages on the target host and enable some modules:

apt-get install bridge-utils debootstrap ifenslave ifenslave-2.6 \
lsof lvm2 ntp ntpdate openssh-server sudo tcpdump vlan
echo 'bonding' >> /etc/modules
echo '8021q' >> /etc/modules

Drop your interfaces file onto all of the hosts you'll be deploying to and reboot them to apply the changes so that they set up all of the bridges for your containers and instances. In my example, this configuration sets up dual bonds, VLANs, and bridges that OpenStack Ansible will plug everything into.
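
A heavily trimmed sketch of what one bond, VLAN, and bridge stanza looks like (interface names, VLAN IDs, and addresses here are placeholders, not my actual values):

# /etc/network/interfaces (partial)
auto bond0
iface bond0 inet manual
    bond-slaves p1p1 p3p1
    bond-mode 802.3ad
    bond-miimon 100

# container management VLAN
auto bond0.236
iface bond0.236 inet manual
    vlan-raw-device bond0

# container management bridge that openstack-ansible plugs the containers into
auto br-mgmt
iface br-mgmt inet static
    bridge_stp off
    bridge_waitport 0
    bridge_fd 0
    bridge_ports bond0.236
    address 172.29.236.11
    netmask 255.255.252.0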

Initial Bootstrap

You'll want to use one server as the deployment host so log into that server, check out openstack-ansible, and run the initial Ansible bootstrap:

git clone https://github.com/openstack/openstack-ansible.git /opt/openstack-ansible
cd /opt/openstack-ansible
git checkout stable/mitaka
scripts/bootstrap-ansible.sh

The bootstrap-ansible.sh script will generate keys so make sure to copy the contents of the public key file on the deployment host to the /root/.ssh/authorized_keys file on each target host.
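
One way to do that from the deployment host (assuming root SSH is allowed to the targets):

ssh-copy-id -i /root/.ssh/id_rsa.pub root@<target-host>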

Copy the example openstack_deploy directory to /etc/:

cp -R /opt/openstack-ansible/etc/openstack_deploy /etc/openstack_deploy
cp /etc/openstack_deploy/openstack_user_config.yml.example /etc/openstack_deploy/openstack_user_config.yml

Modify the openstack_user_config.yml for the settings you want. You'll need to specify which servers should run each role. The openstack_user_config.yml is pretty well commented and provides lots of docs to get started.

My config:

If you have enough memory and CPU on the hosts, you can also reuse the infrastructure nodes as compute_nodes to avoid having to set up dedicated nodes for compute.
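
As a rough illustration only (hostnames, IPs, and networks below are placeholders rather than my actual values), a heavily trimmed openstack_user_config.yml that reuses two of the infra nodes for compute looks something like this:

---
cidr_networks:
  container: 172.29.236.0/22
  tunnel: 172.29.240.0/22
  storage: 172.29.244.0/22

global_overrides:
  internal_lb_vip_address: 172.29.236.10
  external_lb_vip_address: 203.0.113.10

shared-infra_hosts:
  node1:
    ip: 172.29.236.11
  node2:
    ip: 172.29.236.12
  node3:
    ip: 172.29.236.13

compute_hosts:
  node2:
    ip: 172.29.236.12
  node3:
    ip: 172.29.236.13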

Credentials

cd /opt/openstack-ansible/scripts
python pw-token-gen.py --file /etc/openstack_deploy/user_secrets.yml

User Variables

We'll start with just the basics for now to get operational. Make sure to enable at least a few options in the /etc/openstack_deploy/user_variables.yml otherwise it will have a hard time assembling the variables (these haven't made it into the mitaka stable branch yet):

## Debug and Verbose options.
debug: false
verbose: false

Run the playbooks

cd /opt/openstack-ansible/playbooks
openstack-ansible setup-hosts.yml
openstack-ansible haproxy-install.yml
openstack-ansible setup-infrastructure.yml
openstack-ansible setup-openstack.yml

If there are no errors, then the initial cluster should be set up. The playbooks are all idempotent, so you can rerun them at any time.

Using the Cluster

Once these playbooks complete, you should have a functional OpenStack cluster. To get started, you can log into Horizon with either the external VIP you set up in openstack_user_config.yml or by hitting the server directly.

You'll use the user name "admin" and the password will be in your /etc/openstack_deploy/user_secrets.yml file that you generated earlier:

grep keystone_auth_admin_password /etc/openstack_deploy/user_secrets.yml
keystone_auth_admin_password: 4lkwtwtpmasldfqsdf

Each target node will have a utility container that you can SSH into to grab the openstack client credentials or run the client from the container. You can find it with:

root@osa-node1:~# lxc-ls | grep -i util
node1_utility_container-860a6cd9
root@osa-node1:~# ssh root@node1_utility_container-860a6cd9
Welcome to Ubuntu 14.04.4 LTS (GNU/Linux 3.13.0-85-generic x86_64)
root@node1-utility-container-860a6cd9:~# openstack server list
+--------------------------------------+-------------+--------+----------------------+
| ID                                   | Name        | Status | Networks             |
+--------------------------------------+-------------+--------+----------------------+
| 1b7f1a7f-db87-47fe-a884-c66875ceed00 | my-instance | ACTIVE | Public=192.168.20.165|
+--------------------------------------+-------------+--------+----------------------+

Create and Setup Your Network

In Horizon under the System tab, select Networks and then "+Create Network". The main thing to note is that, depending on the type of network you are setting up, you should specify that type in the Physical Network box as well. In my case, I set up a VLAN network, so I made sure to set:

Name: Public
Project: admin
Provider Network Type: VLAN
Physical Network: vlan
Admin State: UP

Once the network is created, click on the Network Name and click "+Create Subnet". Add your:

Subnet Name: VLAN_854
Network Address: 10.127.95.0/24
Gateway IP: 10.127.95.1
Allocation Pools: <Start Address>,<End Address>
DNS Name Servers: <DNS Servers>

Add Images to Glance

You'll need to add some images to get up and running. You can find a list of supported images that include Cloud-Init here.

Name: Image Name
Image Source: Image Location
Image Location: Enter in URL of Cloud Image
Format: QCOW2, RAW, or whatever the image format may be
Architecture: x86_64 or whatever hardware you might be using
Public: Checked

Security Groups

By default, security groups are enabled, so you'll want to add some ingress rules like SSH and ICMP so you can connect to your instance.
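
For example, from the utility container with the openstack client (assuming the default security group and a reasonably recent python-openstackclient):

openstack security group rule create --protocol tcp --dst-port 22 default
openstack security group rule create --protocol icmp default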

Start an Instance

Under the instances tab, click "Launch Instance". Fill in your desired options, including boot from image, add any keypairs you might want, and make sure to select the Security Group you set up previously. You'll also want to make sure you are plugged into the right network as well. Once all of those things are set up, you should be able to launch the instance and attempt to connect to it.

Things to Note

Cluster Recovery

The target hosts are in a DB cluster so if you need to reboot them, make sure to stagger them, so that the cluster doesn't fail. If you find the DB is not coming up, you can run the galera-bootstrap playbook which should bring the cluster back up (docs):

openstack-ansible galera-install.yml --tags galera-bootstrap

If you run into any issues running through this, please let me know in the comments or ping me in #openstack-ansible on Freenode as antonym.

Stateless Hypervisors at Scale

Running a public cloud that provisions infrastructure has many challenges, especially when you start getting to very large scale. Today I'm going to touch on the hypervisor piece, the main part of a public cloud that contains the customers' data running in their instances.

Hypervisors typically run on bare metal, have some sort of operating system, host configuration, the customer's instance settings and then if using local storage, the virtual disks.

Traditionally, an operating system is installed, configuration management like Puppet, Chef, or Ansible is run to bring the host machine up to the deployed specification, and the host is then added to the automation that ultimately provisions instances for a public cloud.

Over time, new features get implemented, bugs are fixed, and operational knowledge is gained, so your deployed infrastructure will evolve. As your infrastructure grows, your older legacy-style infrastructure can start looking a lot different from newer builds and become out of sync and inconsistent.

To break it down into a few points:

  • Hypervisors become inconsistent over time by ongoing maintenance, code releases, and manual troubleshooting by operations.
  • Optimizations, patches, and security fixes are pushed with newer builds but older builds in production never get caught up.
  • Critical kernel or hypervisor updates that require reboots are hard to do because of the uptime requirements of a public cloud.

So what if we got rid of the traditional methods of OS installation and configuration management and instead created a snapshot of your server build once and then deployed that to thousands of servers?

"We’ll Do It Live"

If you’ve ever installed Ubuntu, typically you’ll use what’s called a Live CD to install the OS. The CD loads an OS into RAM and brings up the GUI so that you can then run the install from there. Many distributions over the years have used Live CDs for installation, rescue, or to serve as a tool for recovering from data loss.

The same concept can be applied to a hypervisor or a server running a workload. If you think about it, hypervisors typically have one purpose: to run instances for a user. Why have thousands of independent installs?

Creating a Live Image

The process I've been using to create live images is relatively simple. I've detailed some very high level basics and will deep dive into each one of these at a later date:

  • Create an initial minimal chroot of the filesystem
  • Using Ansible, run configuration management one time within the chroot. This includes all additional packages needed, any customizations, and other additional things you'd normally do in your configuration run.
  • Install tools to allow live booting to work (dracut for CentOS/Debian/Fedora/OpenSUSE/Ubuntu)
  • Regenerate the initrd to inject the live boot tools into the initrd
  • Copy the kernel and initrd out
  • Create an image file and sync the filesystem into the image file.

From there you now have the entire build of your OS represented by three files (kernel, initrd, and root filesystem image) that can be used to boot the operating system over the network, from GRUB, or via kexec.
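
A very rough sketch of that flow, assuming a Debian/Ubuntu-style chroot (paths, playbook name, package names, and the kernel version are all placeholders and vary by distribution):

# build a minimal chroot
mkdir -p /build/rootfs /build/output
debootstrap --variant=minbase bionic /build/rootfs http://archive.ubuntu.com/ubuntu

# run configuration management one time against the chroot
ansible-playbook -c chroot -i '/build/rootfs,' hypervisor.yml

# install live boot tooling and regenerate the initrd inside the chroot
chroot /build/rootfs apt-get update
chroot /build/rootfs apt-get install -y linux-image-generic dracut dracut-live
chroot /build/rootfs dracut --force --add dmsquash-live /boot/initrd.img-live 4.15.0-20-generic

# copy the kernel and initrd out
cp /build/rootfs/boot/vmlinuz-4.15.0-20-generic /build/output/vmlinuz
cp /build/rootfs/boot/initrd.img-live /build/output/initrd.img

# create an image file and sync the filesystem into it
dd if=/dev/zero of=/build/output/rootfs.img bs=1M count=4096
mkfs.ext4 -F /build/output/rootfs.img
mount -o loop /build/output/rootfs.img /mnt
rsync -a /build/rootfs/ /mnt/
umount /mnt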

To persist or not persist?

Now at this point you essentially have an image that can boot into RAM using iPXE, GRUB, or even kexec and is fully stateless. But what if you want to actually make the data persist? With a few scripts added at boot time, you can very easily separate the actual operating system and applications, which will need updating over time, from the user's data, which needs to persist and remain constant.

The scripts create symlinks from the filesystem in RAM to local storage on the server so that when an application tries to write to a directory, it gets redirected to persistent storage on the local disk. The scripts that build the symlinks are part of the image, so they are recreated every time the server boots the image.

In the example of an Openstack Nova Compute running Libvirt+KVM booting as a LiveOS, I have just a few locations on the filesystem that symlink to /data which is mounted on local storage on /dev/sda2:

  • /etc/libvirt - libvirt configurations
  • /etc/nova - Openstack Nova configuration
  • /etc/openvswitch - openvswitch settings and config
  • /etc/systemd/network - systemd networking configs
  • /var/lib/libvirt/ - libvirt files
  • /var/lib/nova/ - instance location
  • /var/lib/openvswitch/ - openvswitch database

Those locations and files within them make up the unique part of each hypervisor and keep them separate from the rest of the overall OS which will need to go through constant upgrading or changes.
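
A simplified sketch of the kind of boot-time script that wires this up (the real scripts handle more edge cases):

# mount the persistent disk and point the live filesystem at it
mkdir -p /data
mount /dev/sda2 /data

for dir in /etc/libvirt /etc/nova /etc/openvswitch /etc/systemd/network \
           /var/lib/libvirt /var/lib/nova /var/lib/openvswitch; do
    mkdir -p "/data${dir}"
    # seed persistent storage from the image on first boot, then replace with a symlink
    if [ -d "${dir}" ] && [ ! -L "${dir}" ]; then
        cp -a "${dir}/." "/data${dir}/"
        rm -rf "${dir}"
    fi
    ln -sfn "/data${dir}" "${dir}"
done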

Squashible

I've been working on making some of the bits we've been working on available to the public. It's a project called Squashible. The name came from mashing SquashFS with Ansible. We switched away from using SquashFS for the time being but the name stuck for now until I can come up with a better name.

You can play around with it here. It's a constant work in progress so please use at your own risk. It currently runs through various roles to create an image with the minimal set of packages you need to run a hypervisor of a certain type. Many thanks to Major Hayden for working with me side by side on a lot of this project over the past year.

Openstack

A video of my presentation and the slides are below for OpenStack Austin 2016 - Stateless Hypervisors at Scale.


Feedback

Comments, concerns, ideas? Let me know!

Booting Linux ISOs with Memdisk and iPXE

There are a number of distributions out there that provide proper support for booting the distribution over the network. A lot of the more popular distributions provide installer kernels that can be easily downloaded for use. You point at the vmlinuz and the initrd and can then immediately proceed with the install, streaming down packages as needed. These distributions make it great for tools like netboot.xyz to install using iPXE.

There are some distributions out there that don't have this functionality and typically only produce the ISO without any repositories that provide installer kernels or the rootfs.

In those cases, occasionally you can use memdisk and iPXE to boot those ISOs but they don't always work. In doing some research, I ran across one of the major issues as to why.
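
For reference, a typical memdisk boot in iPXE looks something like this (memdisk comes from the syslinux package, and the URLs here are placeholders):

:distro-iso
kernel http://boot.example.com/memdisk iso raw
initrd http://mirror.example.com/distro/distro-live.iso
boot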

Syslinux - Memdisk

The following was taken from syslinux - memdisk.

The majority of Linux based CD images will also fail to work with MEMDISK ISO emulation. Linux distributions require kernel and initrd files to be specified, as soon as these files are loaded the protected mode kernel driver(s) take control and the virtual CD will no longer be accessible. If any other files are required from the CD/DVD they will be missing, resulting in boot error(s). Linux distributions that only require kernel and initrd files function fully via ISO emulation, as no other data needs accessing from the virtual CD/DVD drive once they have been loaded. The boot loader has read all necessary files to memory by using INT 13h, before booting the kernel.

There is also another solution, which requires the phram and mtdblock kernel module and memdiskfind utility of the Syslinux package (utils/memdiskfind). memdiskfind will detect the MEMDISK mapped image and will print the start and length of the found MEMDISK mapped image in a format phram understands:

modprobe phram phram=memdisk,$(memdiskfind)
modprobe mtdblock

This will create a /dev/mtdblock0 device, which should be the .ISO image, and should be mountable.

If your image is bigger than 128MiB and you have a 32-bit OS, then you have to increase the maximum memory usage of vmalloc, by adding:

vmalloc=<at_least_size_of_your_image_in_MiB>Mi

Example: vmalloc=256Mi to your kernel parameters.

memdiskfind can be compiled with the klibc instead of with the glibc C library to get a much smaller binary for use in the initramfs:

cd ./syslinux-4.04/utils/
make spotless
make CC=klcc memdiskfind

Implementations of phram and mtdblock

ArchLinux has implemented the above concept here and here.

Debian Live used it here.

It's also been implemented in Clonezilla and GParted.

Antergos Linux based on Arch Linux works great with memdisk using the phram module.

Conclusion

I think it would be great for more distributions to attempt to implement something like this so that iPXE tools can be used to load the ISOs instead of actually having to burn or look for the location of the latest ISO every time.

Some of the distributions I'd love to see add network install support or better memdisk support are:

Linux Mint
Manjaro
Elementary
Solus Project

There are also many other new distributions being released all the time. I typically use DistroWatch to determine the most popular distributions to attempt to add to netboot.xyz. I'd love to get a lot of these added to make it really easy to install anything on the fly.

I'd also love to see some of the hypervisors out there crack open the ISOs, pull them outside of their paywalls, and host the bits on their servers so that it's much easier to immediately boot an install to test something out without having to jump through many hoops. I have working installs for VMware ESX and Citrix XenServer but I'd need to have them host the bits or allow permission to do so for a public facing installer menu.

netboot.xyz

My newest project on the side is netboot.xyz. If you've seen boot.rackspace.com, this should look pretty familiar to you. I ran across cheap .xyz domains from Namecheap (one dollar at the time!), and figured the netboot.xyz name space was much easier to remember and was more neutral to the goal I was trying to accomplish. I forked boot.rackspace.com (still doing basic maintenance) and am now focusing my efforts on netboot.xyz.

My goal with the project is to make it as easy as possible to boot many of the popular operating systems from bare metal, virtual machines, etc. without the hassle of hunting down the latest ISO for the OS you want to download. Ideally it's usable with any service provider or by someone who maintains their own servers.

I usually try and use operating systems that make their boot loaders available via mirrors, although there are occasionally some exceptions. I'm also experimenting with various new builds like WinPE, Live Booted OS, and I'd like to even pursue getting some hypervisors on there as well to make it as easy as possible to install everything.

It's also a great place to just let people play around with new operating systems with just a menu and learn about the many many distributions out there.

Check it out when you get a chance and drop me some feedback, or make a pull request if you see something I'm missing. I've added a really easy way to test your pull request from the utility menu; all you need to do is enter your GitHub username and the branch or hash of the commit you want to test.

I'm still working on a bunch of documents demonstrating how easy it is to plug the 1MB iPXE ISO into things like VMware Fusion, VirtualBox, and OpenStack, so bear with me while I try and get all of those available.

Enjoy!

Creating Custom Security Updates In XenServer

Some of you may have heard about the latest vulnerability affecting QEMU codenamed VENOM.

Sometimes security vulnerabilities are released faster than the vendor can qualify a valid hot fix. In this post, I'll walk you through how to generate your own XenServer hotfix in order to rapidly patch the issue.

How XenServer Patching Works

The sources for XenServer are provided with each release, usually in a binpkg.iso. Here are some links for the latest version of XenServer 6.5:

XenServer Primary Download Page

XenServer 6.5 Hypervisor

XenServer 6.5 Sources

XenServer 6.5 DDK

Creating your Own Custom Patch

The first thing you'll need to do is download the DDK for the affected version. The DDKs are released for each version of XenServer and also released any time the kernel revs within the major release. The DDK provides the same environment that the SRPMs were created under, so it makes it really easy to rebuild the RPMs. It comes packaged as an appliance, so you'll want to import that appliance into a build of XenServer and boot it up.
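
On the XenServer host that would look something like this (the filename is just an example):

xe vm-import filename=XenServer-6.5.0-DDK.xva
xe vm-start uuid=<uuid returned by the import>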

Determine What Needs to be Patched

If QEMU needs patching, more than likely it's the qemu-dm binary (/usr/lib64/xen/bin/qemu-dm). To determine which package's sources you need to retrieve, run an rpm query on that binary:

[root@hostname /]# rpm -qf /usr/lib64/xen/bin/qemu-dm
xen-device-model-1.9.0-199.7656

Now we know that we need to make the changes to the xen-device-model.

If we needed to patch Xen:

[root@hostname boot]# rpm -qf xen-4.4.1-xs100346.gz
xen-hypervisor-4.4.1-1.9.0.462.28802

And so on. Once you know what the package is, then we can go about finding the source rpm.

Obtaining the Source RPM

Assuming the version of XenServer you're using is up to date on patches, you'll want to grab either the latest deployed patch to your environment, or grab the latest patch that contained the version you want to update. Each xsupdate contains updated RPMs, so you might need to run through all of the latest patches to find the right one.

Anytime a hotfix is released, the hotfix will include the sources that were changed as part of the update release. For example, within the zip of a hotfixed release, 6.5SP1 in this case, you'll have two files:

  • the xsupdate that is used to apply to the server, XS65ESP1.xsupdate
  • the sources package, XS65ESP1-src-pkgs.tar.bz2

The sources package includes all of the SRPMs that were used to create the latest xsupdate.

Extracting the Sources

We'll want to take the latest available sources, grab the Source RPM, and install it to the DDK server. We'll use the one out of this hotfix to simulate updating QEMU for VENOM:

wget http://downloadns.citrix.com.edgesuite.net/10325/XS62ESP1021.zip
unzip XS62ESP1021.zip
bunzip2 XS62ESP1021-src-pkgs.tar.bz2
tar xvf XS62ESP1021-src-pkgs.tar 

Create .rpmmacros so that the sources extract to a known location:

# ~/.rpmmacros
%packager %(echo "$USER")
%_topdir %(echo "$HOME")/rpmbuild

Make directories:

mkdir ~/rpmbuild ~/rpmbuild/SOURCES ~/rpmbuild/RPMS ~/rpmbuild/BUILD ~/rpmbuild/SRPMS ~/rpmbuild/SPECS 

Install the sources:

rpm -i xen-device-model-1.8.0-105.7582.i686.src.rpm

Copy the patch file to ~/rpmbuild/SOURCES/:

cp xsa133-qemut.patch ~/rpmbuild/SOURCES/

Update the SPEC file to include the new patch and bump the release from 105.7582 to 105.7582.1custom. We do this so we can prevent conflicts from future versions but still differentiate which version we're on:

[root@localhost]# diff -u xen-device-model.spec xen-device-model.spec.mod
--- xen-device-model.spec   2015-03-17 12:02:05.000000000 -0400
+++ xen-device-model.spec.mod   2015-05-12 19:35:53.000000000 -0400
@@ -1,11 +1,12 @@ 
 Summary: qemu-dm device model
 Name: xen-device-model
 Version: 1.8.0
-Release: 105.7582
+Release: 105.7582.1custom
 License: GPL
 Group: System/Hypervisor 
 Source0: xen-device-model-%{version}.tar.bz2
 Patch0: xen-device-model-development.patch
+Patch1: xsa133-qemut.patch
 BuildRoot: %{_tmppath}/%{name}-%{version}-%{release}-buildroot
 BuildRequires: SDL-devel, zlib-devel, xen-devel, ncurses-devel, pciutils-devel
@@ -14,6 +15,7 @@
 %prep
 %setup -q
 %patch0 -p1
+%patch1 -p1
 %build
 ./xen-setup --disable-opengl --disable-vnc-tls --disable-blobs
@@ -37,6 +39,9 @@
 %dir /var/xen/qemu
 %changelog
+* Tue Mar 17 2015 MyPatch <www.mypatch.com> [1.8.0 105.7582.1custom]
+- xsa133-qemu
+
 * Tue Mar 17 2015 Citrix Systems, Inc. <www.citrix.com> [1.8.0 105.7582]
 - Build ioemu.

Regenerate the RPM from the sources, and watch for errors.

rpmbuild -ba xen-device-model.spec

Make sure your patches apply cleanly; if they do, after the compile has completed the fresh RPMs will be present in ~/rpmbuild/RPMS:

ls ~/rpmbuild/RPMS/i386/xen-device-model* 
xen-device-model-1.8.0-105.7582.1custom.i386.rpm
xen-device-model-debuginfo-1.8.0-105.7582.1custom.i386.rpm 

Deploying the RPMs to XenServer

You'll want to take the new RPM and deploy it using:

rpm -Uvh xen-device-model-1.8.0-105.7582.1custom.i386.rpm

If you need to revert to the original version, you can run

rpm --force -Uvh xen-device-model-1.8.0-105.7582.i386.rpm

Depending on the type of patching you're doing, you'll need to determine your reload strategy. If it's Xen or a kernel, for instance, you know you'll have to reboot. If it's QEMU, you know that you'll have to detach the disks and reload them so that they pick up the newly patched process.

Booting VMware ESXi in iPXE

This is a quick method of setting up VMware ESXi installers to run in iPXE. This particular version works with the 5.5 Update 2 ISO. You'll need to drop the ISO contents into a directory and then place these other files in either the same directory or a separate one to keep things cleaner:

iPXE Code:

:esx55u2
kernel http://httpserver/configs/vmware/esx55u2/mboot.c32 -c http://httpserver/configs/vmware/esx55u2/boot.cfg
boot

boot.cfg contents:

bootstate=0
title=Loading ESXi installer
prefix=http://httpserver/configs/vmware/esx55u2/bits
kernel=tboot.b00
kernelopt=runweasel ks=http://httpserver/configs/vmware/esx55u2/ks.cfg
modules=b.b00 --- jumpstrt.gz --- useropts.gz --- k.b00 --- chardevs.b00 --- a.b00 --- user.b00 --- sb.v00 --- s.v00 --- ata_pata.v00 --- ata_pata.v01 --- ata_pata.v02 --- ata_pata.v03 --- ata_pata.v04 --- ata_pata.v05 --- ata_pata.v06 --- ata_pata.v07 --- block_cc.v00 --- ehci_ehc.v00 --- elxnet.v00 --- weaselin.t00 --- esx_dvfi.v00 --- xlibs.v00 --- ima_qla4.v00 --- ipmi_ipm.v00 --- ipmi_ipm.v01 --- ipmi_ipm.v02 --- lpfc.v00 --- lsi_mr3.v00 --- lsi_msgp.v00 --- misc_cni.v00 --- misc_dri.v00 --- mtip32xx.v00 --- net_be2n.v00 --- net_bnx2.v00 --- net_bnx2.v01 --- net_cnic.v00 --- net_e100.v00 --- net_e100.v01 --- net_enic.v00 --- net_forc.v00 --- net_igb.v00 --- net_ixgb.v00 --- net_mlx4.v00 --- net_mlx4.v01 --- net_nx_n.v00 --- net_tg3.v00 --- net_vmxn.v00 --- ohci_usb.v00 --- qlnative.v00 --- rste.v00 --- sata_ahc.v00 --- sata_ata.v00 --- sata_sat.v00 --- sata_sat.v01 --- sata_sat.v02 --- sata_sat.v03 --- sata_sat.v04 --- scsi_aac.v00 --- scsi_adp.v00 --- scsi_aic.v00 --- scsi_bnx.v00 --- scsi_bnx.v01 --- scsi_fni.v00 --- scsi_hps.v00 --- scsi_ips.v00 --- scsi_lpf.v00 --- scsi_meg.v00 --- scsi_meg.v01 --- scsi_meg.v02 --- scsi_mpt.v00 --- scsi_mpt.v01 --- scsi_mpt.v02 --- scsi_qla.v00 --- scsi_qla.v01 --- uhci_usb.v00 --- tools.t00 --- xorg.v00 --- imgdb.tgz --- imgpayld.tgz
build=
updated=0

On future versions, you have to make sure the modules line in your custom boot.cfg lines up with the boot.cfg from the ISO.

And a quick example ks.cfg for some automation:

# Sample scripted installation file
# Accept the VMware End User License Agreement
vmaccepteula
# Set the root password for the DCUI and ESXi Shell
rootpw mypassword
# Install on the first local disk available on machine
install --firstdisk --overwritevmfs
# Set the network to DHCP on the first network adapter, use the specified hostname and do not create a portgroup for the VMs
network --bootproto=dhcp --device=vmnic0 --addvmportgroup=0
# reboots the host after the scripted installation is completed
reboot

Update 11/11/2015: You'll need to enable COMBOOT support within iPXE in order to properly boot ESXi. You can do this by creating config/local/general.h in the iPXE source tree with the following contents:

#define IMAGE_COMBOOT

Then you can recompile iPXE and use the new binaries to boot.
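
For example (the checkout location and build targets are just illustrative; build whichever binaries you actually boot from):

git clone https://github.com/ipxe/ipxe.git
cd ipxe/src
# config/local/general.h contains the IMAGE_COMBOOT define from above
make bin/ipxe.iso bin/undionly.kpxe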