typedef int (*funcptr)();

An engineers technical notebook

Neutron L3 agent with multiple provider networks

Due to requirements outside of my control, I needed to run multiple "provider" networks, each providing its own floating address pool, from a single network node. I wanted to do this as simply as possible using a single l3 agent, rather than having to figure out how to get systemd to start multiple agents with different configuration files.

Currently I've installed and configured an OpenStack instance that looks like this:

+---------------------+
|                     |
|                  +--+----+
|                  |       |
|      +-----------+-+  +--+----------+
|      | Compute     |  | Compute     |
|      |     01      |  |     02      |
|      +------+------+  +-----+-------+
|             |               |
|             |               +----------+
|             +------------+--+          |
|                          |             |
| +-------------+    +-----+-------+     |
| | Controller  |    |   Network   |     |
| |             |    |             |     +---+  Tenant Networks (vlan tagged) (vlan ID's 350 - 400)
| +-----+----+--+    +------+----+-+
|       |    |              |    |
|       |    |              |    +-----------+  Floating Networks (vlan tagged) (vlan ID's 340 - 349)
|       |    |              |
|       |    |              |
+------------+--------------+----------------+  Management Network (10.5.2.0/25)
        |
        |
        +------------------------------------+  External API Network (10.5.2.128/25)

There are two compute nodes, a controller node that runs all of the API services, and a network node that is strictly used for providing network functions (routers, load balancers, firewalls, all that fun stuff!).

There are two flat networks that provide the following:

  1. External API access
  2. A management network that OpenStack uses internally to communicate between instances and to manage it, which is not accessible from the other three networks.

The other two networks are both vlan tagged:

  1. Tenant networks, with the possibility of 50 vlan ID's
  2. Floating networks, with existing vlan ID's for existing networks

Since the OpenStack Icehouse release, the l3 agent has been able to use the Open vSwitch configuration to decide how traffic should be routed, rather than statically defining that a single l3 agent routes certain traffic to a single Linux bridge. Setting this up is fairly simple if you follow the documentation, with one caveat: variables you would expect to default to no value actually have a value, and thus need to be explicitly set to empty.

On the network node

First, we need to configure the l3 agent, so let's set some extra variables in /etc/neutron/l3-agent.ini:

gateway_external_network_id =
external_network_bridge =

It is important that these two are set rather than left commented out. When commented out they fall back to defaults that will cause this setup to fail, so explicitly setting them to blank fixes that issue.

Next, we need to set up our Open vSwitch configuration. In /etc/neutron/plugin.ini the following needs to be configured:

  • bridge_mappings
  • network_vlan_ranges

Note that these may already be configured, in which case there is nothing left to do. Mine currently looks like this:

bridge_mappings = tenant1:br-tnt,provider1:br-ex

This basically specifies that any networks created under "provider name" tenant1 are going to be mapped to the Open vSwitch bridge br-tnt and any networks with "provider name" provider1 will be mapped to br-ex.

br-tnt is mapped to my tenant network and on the switch has vlan ID's 350 - 400 assigned, and br-ex has vlan ID's 340 - 349 assigned.

Following the above knowledge, my network_vlan_ranges is configured as such:

network_vlan_ranges = tenant1:350:400,provider1:340:349
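
To make the relationship between the two settings concrete, here is a minimal Python sketch that parses both values and looks up which bridge a given vlan ID on a given "provider name" would end up on. The helper functions are purely illustrative and are not part of Neutron; the sketch also assumes exactly one vlan range per provider name, as in my configuration.

```python
# Sketch: parse the two Open vSwitch plugin settings and look up a VLAN ID.
# These helpers are illustrative only; they are not Neutron code.

def parse_bridge_mappings(value):
    """Turn 'tenant1:br-tnt,provider1:br-ex' into {'tenant1': 'br-tnt', ...}."""
    mappings = {}
    for entry in value.split(","):
        physnet, bridge = entry.strip().split(":")
        mappings[physnet] = bridge
    return mappings


def parse_vlan_ranges(value):
    """Turn 'tenant1:350:400,...' into {'tenant1': (350, 400), ...}."""
    ranges = {}
    for entry in value.split(","):
        physnet, low, high = entry.strip().split(":")
        ranges[physnet] = (int(low), int(high))
    return ranges


def bridge_for(physnet, vlan_id, mappings, ranges):
    """Return the bridge carrying vlan_id on physnet, or None if out of range."""
    low, high = ranges.get(physnet, (None, None))
    if low is None or not low <= vlan_id <= high:
        return None
    return mappings.get(physnet)


mappings = parse_bridge_mappings("tenant1:br-tnt,provider1:br-ex")
ranges = parse_vlan_ranges("tenant1:350:400,provider1:340:349")

print(bridge_for("provider1", 340, mappings, ranges))  # br-ex
print(bridge_for("tenant1", 340, mappings, ranges))    # None
```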

Make sure to restart all neutron services:

openstack-service restart neutron

On the controller (where neutron-server lives)

On the controller we just need to make sure that our network_vlan_ranges matches what is on the network node, with one exception: we do not list our provider1 vlan ranges, since we don't want them to be accidentally assigned when a regular tenant creates a new network.

So our configuration should list:

network_vlan_ranges = tenant1:350:400

Make sure that all neutron services are restarted:

openstack-service restart neutron

Create the Neutron networks

Now, as an administrative user we need to create the provider networks.

source ~/keystonerc_admin

neutron net-create "192.168.1.0/24-floating" \
--router:external True \
--provider:network_type vlan \
--provider:physical_network provider1 \
--provider:segmentation_id 340

neutron net-create "192.168.2.0/24-floating" \
--router:external True \
--provider:network_type vlan \
--provider:physical_network provider1 \
--provider:segmentation_id 341

Notice how we've created two networks, given each its own name (I like to use the name of the network they are going to be used for), and attached both to provider1. Note that provider1 is completely administratively defined, and could just as well have been physnet1, so long as it is consistent across all of the configuration files.

Now let's create a subnet on each of these networks:

neutron subnet-create "192.168.1.0/24-floating" 192.168.1.0/24 \
--allocation-pool start=192.168.1.4,end=192.168.1.254 \
--disable-dhcp --gateway 192.168.1.1

neutron subnet-create "192.168.2.0/24-floating" 192.168.2.0/24 \
--allocation-pool start=192.168.2.4,end=192.168.2.254 \
--disable-dhcp --gateway 192.168.2.1
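
Before running the subnet-create commands it can be handy to sanity check the numbers. The following is a small sketch using Python's standard ipaddress module; the function name is my own invention for illustration. It checks that the pool and gateway are inside the subnet, that the gateway sits outside the allocation pool (as it does in the commands above), and reports how many floating addresses the pool provides.

```python
import ipaddress


def check_floating_subnet(cidr, pool_start, pool_end, gateway):
    """Validate a floating subnet layout and return the pool size."""
    network = ipaddress.ip_network(cidr)
    start = ipaddress.ip_address(pool_start)
    end = ipaddress.ip_address(pool_end)
    gw = ipaddress.ip_address(gateway)

    # Every address has to live inside the subnet itself
    for address in (start, end, gw):
        if address not in network:
            raise ValueError("%s is not inside %s" % (address, cidr))
    if start > end:
        raise ValueError("allocation pool start is after its end")
    # The gateway belongs to the upstream router, not to the pool
    if start <= gw <= end:
        raise ValueError("gateway is inside the allocation pool")

    return int(end) - int(start) + 1


print(check_floating_subnet("192.168.1.0/24", "192.168.1.4",
                            "192.168.1.254", "192.168.1.1"))  # 251
```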

Now that these networks are defined, you should be able to have tenants create routers and set their gateways to either of these new networks by selecting from the drop-down in Horizon or by calling neutron router-gateway-set <router id> <network id> on the command line.

The l3 agent will automatically configure and set up the router as required on the network node, and traffic will flow to either vlan 340 or vlan 341 as defined above depending on what floating network the user uses as a gateway.

This drastically simplifies the configuration of multiple floating IP networks since no longer is there a requirement to start up and configure multiple l3 agents each with their own network ID configured. This makes configuration less brittle and easier to maintain over time.

OpenStack resizing of instances

One thing that is not always adequately explained in the OpenStack documentation is how exactly instance resizing works, and what is required, especially while using KVM as the virtualisation provider, with multiple compute nodes.

You might find something similar to the following in your logs, and no good documentation on how to fix it.

ERROR nova.compute.manager [req-7cb1c029-beb4-4905-a9d9-62d488540eda f542d1b5afeb4908b8b132c4486f9fa8 c2bfab5ad24642359f43cdff9bb00047] [instance: 99736f90-db0f-4cba-8f44-a73a603eee0b] Setting instance vm_state to ERROR
TRACE nova.compute.manager [instance: 99736f90-db0f-4cba-8f44-a73a603eee0b] Traceback (most recent call last):
TRACE nova.compute.manager [instance: 99736f90-db0f-4cba-8f44-a73a603eee0b]   File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 5596, in _error_out_instance_on_exception
TRACE nova.compute.manager [instance: 99736f90-db0f-4cba-8f44-a73a603eee0b]     yield
TRACE nova.compute.manager [instance: 99736f90-db0f-4cba-8f44-a73a603eee0b]   File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 3459, in resize_instance
TRACE nova.compute.manager [instance: 99736f90-db0f-4cba-8f44-a73a603eee0b]     block_device_info)
TRACE nova.compute.manager [instance: 99736f90-db0f-4cba-8f44-a73a603eee0b]   File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 4980, in migrate_disk_and_power_off
TRACE nova.compute.manager [instance: 99736f90-db0f-4cba-8f44-a73a603eee0b]     utils.execute('ssh', dest, 'mkdir', '-p', inst_base)
TRACE nova.compute.manager [instance: 99736f90-db0f-4cba-8f44-a73a603eee0b]   File "/usr/lib/python2.7/site-packages/nova/utils.py", line 165, in execute
TRACE nova.compute.manager [instance: 99736f90-db0f-4cba-8f44-a73a603eee0b]     return processutils.execute(*cmd, **kwargs)
TRACE nova.compute.manager [instance: 99736f90-db0f-4cba-8f44-a73a603eee0b]   File "/usr/lib/python2.7/site-packages/nova/openstack/common/processutils.py", line 193, in execute
TRACE nova.compute.manager [instance: 99736f90-db0f-4cba-8f44-a73a603eee0b]     cmd=' '.join(cmd))
TRACE nova.compute.manager [instance: 99736f90-db0f-4cba-8f44-a73a603eee0b] ProcessExecutionError: Unexpected error while running command.
TRACE nova.compute.manager [instance: 99736f90-db0f-4cba-8f44-a73a603eee0b] Command: ssh 10.5.2.20 mkdir -p /var/lib/nova/instances/99736f90-db0f-4cba-8f44-a73a603eee0b
TRACE nova.compute.manager [instance: 99736f90-db0f-4cba-8f44-a73a603eee0b] Exit code: 255
TRACE nova.compute.manager [instance: 99736f90-db0f-4cba-8f44-a73a603eee0b] Stdout: ''
TRACE nova.compute.manager [instance: 99736f90-db0f-4cba-8f44-a73a603eee0b] Stderr: 'Host key verification failed.\r\n'
TRACE nova.compute.manager [instance: 99736f90-db0f-4cba-8f44-a73a603eee0b] 
ERROR oslo.messaging.rpc.dispatcher [-] Exception during message handling: Unexpected error while running command.
Command: ssh 10.5.2.20 mkdir -p /var/lib/nova/instances/99736f90-db0f-4cba-8f44-a73a603eee0b
Exit code: 255
Stdout: ''
Stderr: 'Host key verification failed.\r\n'

When OpenStack's nova is instructed to resize an instance, it will usually also move it to a different host; it will almost never schedule the resized instance on the host where it already exists. There is a configuration flag to change this, however in my case I would rather the scheduler be run again, especially if the instance size is changing drastically. During the resize process, the node where the instance is currently running will use SSH to connect to the host where the resized instance will live, and copy over the instance and associated files.

There are a couple of assumptions I will be making:

  1. Your nova, and qemu user both have the same UID on all compute nodes
  2. The path for your instances is the same on all of your compute nodes

Configure the nova user

First things first, let's make sure our nova user has an appropriate shell set:

cat /etc/passwd | grep nova

Verify that the last field is /bin/bash.

If not, let's modify the user and make it so:

usermod -s /bin/bash nova

Generate SSH key and configuration

After doing this the next steps are all run as the nova user.

su - nova

We need to generate an SSH key:

ssh-keygen -t rsa

Follow the directions, and save the key WITHOUT a passphrase.

Next up we need to configure SSH to not do host key verification, unless you want to manually SSH to all compute nodes that exist and accept the key (and continue to do so for each new compute node you add).

cat << EOF > ~/.ssh/config
Host *
    StrictHostKeyChecking no
    UserKnownHostsFile=/dev/null
EOF

Next we need to make sure we copy the contents of id_rsa.pub to authorized_keys and set the mode on it correctly.

cat ~/.ssh/id_rsa.pub > ~/.ssh/authorized_keys
chmod 600 ~/.ssh/authorized_keys

This should be all the configuration for SSH you need to do. Now comes the important part: you will need to tar up and copy the ~nova/.ssh directory to every single compute node you have provisioned. This way all compute nodes will be able to SSH to the remote host to run the commands required to copy an instance over, and resize it.
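
Once the directory has been copied around, it is worth checking from each compute node, as the nova user, that SSH to every other node works without a prompt. Below is a rough Python sketch of such a check. The host list is a placeholder (10.5.2.20 is the address from the log above, the second address is made up), and the function names are my own. BatchMode=yes makes ssh fail instead of asking for a password or host key confirmation, which is the behaviour nova sees.

```python
import subprocess

# Placeholder management addresses of the compute nodes; replace with your own.
COMPUTE_NODES = ["10.5.2.20", "10.5.2.21"]


def ssh_check_command(host):
    """Build a non-interactive ssh command that fails instead of prompting."""
    return ["ssh", "-o", "BatchMode=yes", "-o", "ConnectTimeout=5",
            host, "true"]


def check_nodes(hosts, runner=subprocess.call):
    """Return {host: True/False} depending on whether ssh exited with 0."""
    results = {}
    for host in hosts:
        try:
            exit_code = runner(ssh_check_command(host))
        except OSError:
            exit_code = 255  # the ssh binary itself could not be started
        results[host] = (exit_code == 0)
    return results


# Run as the nova user on a compute node, for example:
#   for host, ok in sorted(check_nodes(COMPUTE_NODES).items()):
#       print(host, "ok" if ok else "FAILED")
```

The runner argument exists only so that the check can be exercised without a network; by default it calls the real ssh binary.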

Reset state on existing ERROR'ed instances

If you have any instances that are currently in the ERROR state due to a failed resize, you will be able to issue the following command to reset the state back to running and try again:

nova reset-state --active <ID of instance>

This will mark the instance as active again, and you will be able to once again issue the resize command to resize the instance.

Build numbers in binaries using waf

My build system of choice these days for any C++ project is waf. One of the things I always like having is the build number included in the final binary, so that a simple ./binary --version or even ./binary prints the version it was built from. This can make it much simpler to debug any potential issues, especially if fixes may have already been made but a bad binary was deployed.

Setup the wscript

Make sure that your wscript somewhere near the top contains the following:

APPNAME = 'myapp'
VERSION = '0.0.0'

Then in your configure(cfg) add the following:

cfg.env.VERSION = VERSION
cfg.env.APPNAME = APPNAME

git_version = try_git_version()

if git_version:
    cfg.env.VERSION += '-' + git_version

The try_git_version() function is fairly simple and looks like this:

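def try_git_version():
    import os

    version = None
    try:
        # Yields an empty string when git is missing or this is not a repository
        version = os.popen('git describe --always --dirty --long').read().strip()
    except Exception as e:
        print(e)
    # Normalise an empty result to None
    return version or None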

It runs git describe --always --dirty --long which will return something along these lines: 401b85f-dirty. If you have any annotated tags, it will return the tag name as well.

If git is not installed, or it is not a valid git directory, then it will simply return None. At that point all we have to go on is the VERSION variable set at the top of the wscript.
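
For comparison, here is a sketch of the same idea written against the subprocess module, which makes the two failure cases (no git binary, not a git repository) explicit. It assumes Python 3 and is only an alternative illustration; full_version() is a helper name I made up to show how the result is combined with VERSION.

```python
import subprocess


def try_git_version():
    """Return the `git describe` output, or None if it is unavailable."""
    cmd = ["git", "describe", "--always", "--dirty", "--long"]
    try:
        output = subprocess.check_output(cmd, stderr=subprocess.DEVNULL)
    except (OSError, subprocess.CalledProcessError):
        # OSError: git is not installed
        # CalledProcessError: git ran but failed, e.g. not a repository
        return None
    return output.decode("utf-8").strip() or None


def full_version(base, git_version):
    """Append the git description to the base version when there is one."""
    if git_version:
        return base + "-" + git_version
    return base


print(full_version("0.0.0", try_git_version()))
```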

Now that we have our configuration environment set up with the VERSION we want to get that into a file that we can then include in our C++ source code.

Create a build_version.h.in file

#ifndef BUILD_VERSION_H_IN_941AD1F24D0A9D
#define BUILD_VERSION_H_IN_941AD1F24D0A9D

static const char VERSION[] = "@VERSION@";

#endif /* BUILD_VERSION_H_IN_941AD1F24D0A9D */

Add the following to build(ctx)

ctx(features='subst',
        source='build_version.h.in',
        target='build_version.h',
        VERSION = ctx.env['VERSION'],
        )

This uses the substitution feature to transform build_version.h.in into build_version.h, while inserting the version into the file.
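
Conceptually the substitution is nothing more than replacing every @NAME@ placeholder with the matching value. The following Python sketch illustrates the idea; it is not waf's actual implementation, and the substitute() helper is hypothetical.

```python
import re


def substitute(template, values):
    """Replace each @NAME@ with values['NAME']; leave unknown names alone."""
    def replace(match):
        name = match.group(1)
        return str(values.get(name, match.group(0)))
    return re.sub(r"@(\w+)@", replace, template)


print(substitute("@APPNAME@ @VERSION@",
                 {"APPNAME": "myapp", "VERSION": "0.0.0-401b85f-dirty"}))
# myapp 0.0.0-401b85f-dirty
```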

Include build_version.h in your source code

#include "build_version.h"

And add something along these lines to your main():

std::cerr << "Version: " << VERSION << std::endl;

This will print out the VERSION that has been stored in build_version.h.

Full example

Check out my mdns-announce project on GitHub for an example of how this is implemented.

Tamper proof session cookies and session storage

As a follow-up to my previous article regarding User sessions, what data should be stored where?, I wanted to discuss how to store the session, and how to generate cookies that are tamper proof.

What are we trying to accomplish?

Ultimately we want to be able to have some number of pieces of data that are tied to a particular user. Unfortunately, due to the fact that HTTP is a stateless protocol, we have to use cookies. Cookies are small pieces of data that are transmitted from the server to the client (generally done once), and then transmitted from the client back to the server whenever the user returns to the website. This allows us to uniquely track a single user across connections to our website.

If the website allows a user to authenticate, and the fact that they are authenticated is stored in the session, we also want to make sure that we can aggressively expire a session. Whether this is possible depends on our session storage.

Session storage

There are a multitude of ways to store the session data, but it ultimately boils down to server-side or client-side. Server-side can be done in Cassandra, Memcache, Redis or even in a SQL database.

Server-side storage

The approach that has been used for years is server-side storage: a small file containing the data is stored on the server's hard drive, and the client is sent a cookie containing a unique identifier that is linked to the on-disk storage.

For example:

1 => /tmp/session_1
2 => /tmp/session_2
...
N => /tmp/session_N

Easily expire sessions

The upside to server-side storage is that we can very easily expire a session: simply remove the associated file/data and the user's session becomes invalid.
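
As a rough illustration, here is a minimal in-memory stand-in for such a store in Python. The class and method names are made up for this sketch, and a real application would back it with files, Redis, Memcache or a database. Unlike the sequential numbers in the example above, the sketch hands out random identifiers so that one user cannot guess another user's session ID.

```python
import secrets


class SessionStore:
    """In-memory stand-in for files, Redis, Memcache or a SQL table."""

    def __init__(self):
        self._sessions = {}

    def create(self, data):
        """Store the data and return the ID that goes into the cookie."""
        session_id = secrets.token_urlsafe(32)  # unguessable identifier
        self._sessions[session_id] = dict(data)
        return session_id

    def load(self, session_id):
        """Return the stored data, or None if the session does not exist."""
        return self._sessions.get(session_id)

    def expire(self, session_id):
        """Expiring a session is just deleting the server-side record."""
        self._sessions.pop(session_id, None)


store = SessionStore()
sid = store.create({"user_id": 950})
print(store.load(sid))  # {'user_id': 950}
store.expire(sid)
print(store.load(sid))  # None
```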

Client-side storage

The other method, which has recently become more popular because it makes the server side easier to scale, is to store the session data, encoded in base64, in the cookie itself. In this case there is no unique session ID, and no data is stored server side.

Expiration is more difficult

The downside to using client-side storage is that there is no way, short of the expiration on the cookie itself, for the website to expire a session. There are work-arounds, but they all require storing state server-side. A hybrid approach, for example, is possible: store a unique ID along with the session data in the cookie, and store that unique ID server side, but none of the extra data. To expire a session, remove the unique ID server side; if we then receive a session that contains a unique ID we don't recognise, we simply clear the session.
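
The hybrid approach can be sketched in a few lines of Python. All names here are hypothetical, the set of valid IDs would live in something like Redis rather than in process memory, and the sketch assumes the session dict itself is what gets serialised into the cookie.

```python
import secrets

valid_ids = set()  # the only state kept server side


def new_session(data):
    """Create the session dict that would be stored in the cookie."""
    session = dict(data)
    session["sid"] = secrets.token_urlsafe(16)
    valid_ids.add(session["sid"])
    return session


def load_session(session):
    """Accept the session only if its unique ID is still known to us."""
    if session.get("sid") in valid_ids:
        return session
    return {}  # unknown or revoked ID: clear the session


def revoke(session):
    """Expire a session by forgetting its unique ID."""
    valid_ids.discard(session.get("sid"))


session = new_session({"user_id": 950})
print(load_session(session) == session)  # True
revoke(session)
print(load_session(session))             # {}
```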

Expiration, why do I care?

Being able to easily expire a user's sessions allows for extra security measures. For example, in Google Mail it is possible to sign out all other locations; this forces those other locations to re-authenticate before gaining access to your account.

This is a good security measure to have: if a user's cookie is stolen or their credentials are compromised, then upon changing their password all of their sessions are invalidated, and an attacker using an old cookie/session ID can't continue to wreak havoc on the user's account.

Cookie format

Whether we are storing just a session ID or the full session, the cookie should be hardened so that it cannot be tampered with by a client. Even if you are protecting the cookie in transit using SSL, we still don't want to allow a malicious user to modify the cookie to change the session ID or the session itself.

Signing your cookie

The single best way to make sure your cookie has not been tampered with is to cryptographically sign it, and upon receiving the cookie from the client verify that the signature matches what you are expecting. This is especially important if you are using client-side storage, because you don't want someone to be able to change the user ID from 950 to 1 and suddenly impersonate a different user.

Use an HMAC

HMAC (hash-based message authentication code) is a cryptographic construct that uses a hashing algorithm (SHA-1, SHA-256, SHA-3) to create a MAC (message authentication code) with a secret key. Given the secret key and the original data it is very easy to create the MAC, but given only the original data and the MAC it is computationally infeasible to recover the secret key.

This allows us to do the following:

data = "Hello World"
mac = HMAC(data, sha256, "SEEKRIT")

Our mac would now be equal to:

e655f98cb9b3c02f45576f7906d64b0b7f8731f25a5319c42ca666917aca45a4

If we now create our cookie as follows:

cookie = mac + " " + data

It would look as follows:

cookie = e655f98cb9b3c02f45576f7906d64b0b7f8731f25a5319c42ca666917aca45a4 Hello World

We can then send that to the client that requested the page. Once the client visits the next page, their browser will send that same cookie back to us. If we split the mac from the data, we can then do the following operation:

cookie = e655f98cb9b3c02f45576f7906d64b0b7f8731f25a5319c42ca666917aca45a4 Hello World
data = "Hello World"
mac = e655f98cb9b3c02f45576f7906d64b0b7f8731f25a5319c42ca666917aca45a4

mac_verify = HMAC(data, sha256, "SEEKRIT")

mac_verify == mac

If and only if mac_verify and mac are the same can we be sure that the cookie has not been tampered with.
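
In Python the whole round trip can be sketched with the standard hmac and hashlib modules. The function names below are my own, and the key is the example value from this article, not something to use in production. One detail worth adding over the pseudocode above: compare the two MACs with hmac.compare_digest() rather than ==, so that the comparison takes the same time regardless of where the first mismatch is.

```python
import hashlib
import hmac

# Example key from the text; load a real one from configuration instead.
SECRET_KEY = b"SEEKRIT"


def _mac(data):
    """Hex encoded HMAC-SHA256 of data under SECRET_KEY."""
    return hmac.new(SECRET_KEY, data.encode("utf-8"),
                    hashlib.sha256).hexdigest()


def sign(data):
    """Build the cookie value: the MAC, a space, then the data."""
    return _mac(data) + " " + data


def verify(cookie):
    """Return the data if the MAC matches, otherwise None."""
    mac, separator, data = cookie.partition(" ")
    if not separator:
        return None  # no MAC/data split possible
    expected = _mac(data)
    # Constant-time comparison; bytes avoid errors on non-ASCII input
    if hmac.compare_digest(mac.encode("utf-8"), expected.encode("utf-8")):
        return data
    return None


cookie = sign("Hello World")
print(verify(cookie))                              # Hello World
print(verify(cookie.replace("World", "Wxrld")))    # None
```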

This requires that the client is NEVER aware of what we are using as our secret key. In the above examples that is "SEEKRIT". In your web application you will need to make this a configuration variable, and you will have to take care not to commit that configuration variable to a git repository and upload it to GitHub (for example).

Do not use a bare hash algorithm

Using a bare hash algorithm allows for length extension attacks if used incorrectly. This would allow an attacker to append extra data to the end of our existing data and compute a matching "MAC", which the server would then accept.

This construct is thus very dangerous:

data = "Hello World"
key = "SEEKRIT"
mac = SHA1(key + data)

The following construct is still not recommended, but is not nearly as dangerous:

mac = SHA1(data + key)

Due to the key being last, this is not vulnerable to a length extension attack. However, please don't do this; stick to using an HMAC instead.

Encrypting session data

When using client-side storage, it may be beneficial to encrypt the data to add an extra layer of security. Even if encrypting the data you need to continue using a MAC.

Encryption alone will not stop you from decrypting bad data that an attacker chose to provide. Signing the cookie data with a MAC makes sure that the attacker is not able to mess with the ciphertext.

What are web frameworks/languages doing by default?

I am most familiar with the Pylons Project's Pyramid Web Framework. The default session implementation provided by the project is named SignedCookieSessionFactory. As the name implies, it uses a client-side cookie to store the session data, which is signed using a secret key that is provided upon instantiation of the factory.

Flask sessions also use a signed cookie for client-side session storage.

Ruby on Rails uses a signed/encrypted cookie for client-side session storage by default.

PHP does not sign the session cookie by default; it does, however, use server-side storage for session data by default. Extra security can be added by installing the PHP Suhosin extension, which adds session cookie encryption/signing.

Building custom ports with Poudriere and Portshaker

Guest post by Scott Sturdivant.

Maintaining custom ports and integrating them into your build process doesn't need to be difficult. The documentation surrounding this process however is either non-existent, or lacking in its clarity. At the end of the day, it really is as simple as maintaining a repository whose structure matches the ports tree layout, then managing that repository and the standard ports tree with portshaker, and finally handing the end result off to poudriere.

Your Custom Repository

For this example, we'll assume a git repo is used and that you're already familiar with how to build FreeBSD ports. We'll also assume that we have but a single port that we're maintaining and that it is called myport. The hierarchy of your repo should simply be category/myport. We'll refer to this repo simply as myrepo.
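
If you want to double check that a repo follows this layout before handing it to portshaker, something like the following Python sketch can list what it finds. It simply treats every category/port directory containing a Makefile as a port; the function name and the temporary demo tree are invented for the example.

```python
import os
import tempfile


def find_ports(repo_root):
    """Return sorted 'category/port' names for directories with a Makefile."""
    ports = []
    for category in sorted(os.listdir(repo_root)):
        category_dir = os.path.join(repo_root, category)
        # Skip hidden entries such as .git and any top-level files
        if category.startswith(".") or not os.path.isdir(category_dir):
            continue
        for port in sorted(os.listdir(category_dir)):
            if os.path.isfile(os.path.join(category_dir, port, "Makefile")):
                ports.append(category + "/" + port)
    return ports


# Build a throwaway tree that mimics myrepo and inspect it
root = tempfile.mkdtemp()
os.makedirs(os.path.join(root, "category", "myport"))
open(os.path.join(root, "category", "myport", "Makefile"), "w").close()
os.makedirs(os.path.join(root, ".git"))

print(find_ports(root))  # ['category/myport']
```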

Portshaker

Portshaker is the tool responsible for taking multiple ports sources and then merging them down into a single target. In our case, we have two sources: our git repo (myrepo) containing myport, and the standard FreeBSD ports tree. We aim to merge this down into a single ports tree that poudriere will then use for its builds.

To configure portshaker, add the following to the /usr/local/etc/portshaker.conf file:

# vim:set syntax=sh:
# $Id: portshaker.conf.sample 116 2008-09-30 16:15:02Z romain.tartiere $

#---[ Base directory for mirrored Ports Trees ]---
mirror_base_dir="/var/cache/portshaker"

#---[ Directories where to merge ports ]---
ports_trees="default"

use_zfs="no"
poudriere_ports_mountpoint="/usr/local/poudriere/ports"
default_poudriere_tree="default"
default_merge_from="freebsd myrepo"

Some key points here are that the two items listed for the default_merge_from argument need to have scripts present in the /usr/local/etc/portshaker.d directory. Furthermore, the combination of poudriere_ports_mountpoint and default_poudriere_tree needs to be a ports tree that is then registered with poudriere.

Next, we need to tell portshaker how to go off and fetch our two types of ports trees, freebsd and myrepo. For the freebsd ports tree, create /usr/local/etc/portshaker.d/freebsd with the following contents and make it executable:

#!/bin/sh
. /usr/local/share/portshaker/portshaker.subr
method="portsnap"
run_portshaker_command $*

Next, create a similar script to handle our repository containing our custom port. /usr/local/etc/portshaker.d/myrepo should contain the following and similarly be executable:

#!/bin/sh
. /usr/local/share/portshaker/portshaker.subr
method="git"
git_clone_uri="http://github.com/scott.sturdivant/packaging.git"
git_branch="master"
run_portshaker_command $*

Obviously replace the git_clone_uri and git_branch variables to reflect your actual configuration. For more information about the values and what they can contain, consult man portshaker.d.

Now, portshaker should be all set. Execute portshaker -U to update your merge_from ports trees (freebsd and myrepo). You'll see the standard portsnap fetch and extract process as well as a git clone. After a good bit of time, these will both be present in the /var/cache/portshaker directory. Go ahead and merge them together by executing portshaker -M.

Hooray! You now have /usr/local/poudriere/ports/default/ports that is a combination of the normal ports tree and your custom one.

We're effectively complete with configuring portshaker. Whenever your port is updated, just re-run portshaker -U and portshaker -M to grab the latest changes and perform the merge.

Poudriere

Poudriere is a good tool for building ports. We will use it to handle our merged directory. Begin by configuring poudriere (/usr/local/etc/poudriere.conf):

NO_ZFS=yes
FREEBSD_HOST=ftp://ftp.freebsd.org
RESOLV_CONF=/etc/resolv.conf
BASEFS=/usr/local/poudriere
USE_PORTLINT=no
USE_TMPFS=yes
DISTFILES_CACHE=/usr/ports/distfiles
CHECK_CHANGED_OPTIONS=yes

Really there's nothing here that is specific to the problem at hand, so feel free to consult the provided configuration file to tune it to your needs.

Now, the step that is specific is to set poudriere up with a ports tree that it does not manage, specifically our resultant merged directory. If you consult man poudriere, it specifies that for the ports subcommand, there is a -m method switch which controls the methodology used to create the ports tree. By default, it is portsnap. This is confusing as in our case, we do not want poudriere to actually do anything. We want it to just use an existing path. Fortunately, there is a way!

The poudriere wiki has an entry for using the system ports tree, so we adapt it for our needs by executing:

poudriere ports -c -F -f none -M /usr/local/poudriere/ports/default \
-p default

If you've consulted the poudriere manpage, you'll see that the -F and -f switches both reference ZFS in their help. As we're not using ZFS, it's not clear how they will behave. However, in conjunction with the custom mountpoint (-M /usr/local/poudriere/ports/default), we ultimately wind up with what we want, a ports tree that poudriere can use, but does not manage:

# poudriere ports -l
PORTSTREE            METHOD     PATH
default              -          /usr/local/poudriere/ports/default

Note that this resulting PATH is the combination of the poudriere_ports_mountpoint and default_poudriere_tree variables present in our /usr/local/etc/portshaker.conf configuration file.

Building software from your custom ports tree

Go ahead and create your jail(s) like you normally would (e.g. poudriere jail -c -j 92amd64 -v 9.2-RELEASE -a amd64), apply any other configuration you would like, and then build myport with poudriere bulk -j 92amd64 -p default category/myport. Success!