OpenStack Part 10: Getting it up

I’ve been having a whole lot of problems launching an instance.  The error messages from OpenStack said ‘Not enough hosts available’, which didn’t really help me in finding the cause.  The python stack trace pointed me to the network configuration, with no idea what was wrong.

So after a long time staring at the traces and looking over the neutron configuration, with little or no effect, I decided to go for broke: I’d just redo the entire neutron configuration step. In the worst case I could scrap it all and redo the entire installation tutorial, this time without deviating too much from the example configuration.

First: trying to fix the neutron config.  I followed these guides, instead of the example installation:
Deploy OVS provider: First I needed to create a provider network.
Deploy OVS self service: Then make it self service with vxlan.
I ended up removing all the “vlan” interfaces and virtual network, as I didn’t need these anymore.

I was getting a different error message at this point, which I could resolve with this suggestion: Failed to allocate the networks, not rescheduling.

IT WORKS!

I can create an instance!

Still to be investigated: I can’t log in on the cirros instance via the console, because it keeps complaining about “unknown key pressed” when I try to log in.
And the routing to the external network doesn’t work yet; I will probably need to add the external interface to the provider OVS bridge somehow.
Still a bit of work to be done, but I’m getting closer.

Update 1: the console issue is probably related to the antiquated version of noVNC that’s packaged by Ubuntu. It includes version 0.4 instead of 0.6.2 (the latest), and since 0.4 a lot of improvements in foreign keyboard support have been added (mine is a Belgian layout). If I set the keyboard layout to US on my machine, the console works just fine. I could either try to upgrade the version of noVNC on my installation, or switch to SPICE instead.

Update 2: the networking issue has also been resolved: I just needed to add the external port to the br-provider bridge.
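
For reference, adding a port to an Open vSwitch bridge comes down to a single ovs-vsctl call. A minimal sketch, assuming the external interface is called eth1 (a placeholder, the real name isn’t mentioned here). The function name and the OVS_VSCTL override are made up for illustration; the override only exists so the call can be dry-run with echo.

```shell
#!/bin/bash
# OVS_VSCTL is a hypothetical override for dry runs; it defaults to the real tool.
OVS_VSCTL="${OVS_VSCTL:-ovs-vsctl}"

# Attach a physical interface to an OVS bridge.
# --may-exist turns the call into a no-op if the port is already on the bridge.
attach_external_port() {
    local bridge="$1"
    local iface="$2"
    "${OVS_VSCTL}" --may-exist add-port "${bridge}" "${iface}"
}
```

Running attach_external_port br-provider eth1 as root would plug eth1 into the provider bridge; ovs-vsctl show lists the result.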

OpenStack part 9: Using a Raspberry Pi as out-of-band server management

As an intermezzo to all my OpenStack woes, I’ve built an out-of-band management device to control the OpenStack machine, based on a Raspberry Pi 2 I had lying around.

The problem I’m trying to solve is twofold. First, to save energy I’ve removed the graphics card from the server, and I turn the machine off when I’m not using it. But that means that whenever I do need it, I have to physically be at the machine to push the power button. When I’m not at home, I’m out of luck. Second, removing the graphics card means that when I screw up the boot process of the machine (as I’ve done a couple of times when playing with the LVM partitions), I need to pull the machine out from under the desk, open it up, install the graphics card, attach a screen and keyboard, debug, and when done, undo it all again.

Normally the Core i7 CPU comes with a feature called vPro (link), which should allow remote access to basic functionality, even with the machine powered down (there is always a 5V standby available in a standard ATX power supply, which is used for USB, keyboard, network cards to allow a wake up of the machine via WoL/keyboard shortcut), but the BIOS on my motherboard doesn’t support this.

So I built my own: the Raspberry Pi will control a relay switch to simulate the power button being pushed, and it will connect to the server via serial connection (set up as a console) to be able to debug and fix issues in the grub bootloader, linux OS, network, etc…

The Raspberry Pi is powered via USB, directly connected to the motherboard on one of the connections meant for the front panel USB connector and card readers.

Hardware:

  1. Raspberry Pi 2: with Debian Jessie
  2. RS232 to TTL female serial convertor, based on the MAX3232
  3. 2 channel relay module for arduino
  4. DB9 to 10 pin header cable (those who’ve tried this before might have noticed already that this is exactly the wrong cable for the job)
  5. Pin header to USB (female) cable
  6. Pin header to pin header

 

Connecting all the hardware

I connect the serial convertor like this: TX is connected to GPIO14 on pin 8, RX is connected to GPIO15 on pin 10, VCC is connected to 3.3V on pin 1 and GND is connected to ground on pin 14.

The relay module is connected to ground on pin 9, VCC to 5V on pin 2 and I’m using GPIO4 on pin 7 to control IN1.

I also have a double pin header where I soldered all the rows together.  This way I can plug in the cable to the front panel switch, the cable to the relay and the cable to the motherboard power button input, all in parallel. This way, either a connection made by the relay, or by the front panel power switch will turn on the computer.

Configuration on the Raspberry Pi:

  1. The GPIO serial is disabled by default, to enable it edit /boot/config.txt and add a line at the end:
    enable_uart=1
  2. By default the Raspberry Pi is configured to have a console output on the serial interface. To disable the console:
    $ sudo systemctl stop serial-getty@ttyAMA0.service
    $ sudo systemctl disable serial-getty@ttyAMA0.service
  3. Remove the console from the kernel command line: edit the file /boot/cmdline.txt and remove the part “console=serial0,115200”
  4. Reboot for the changes to take effect
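
Steps 1 and 3 above can also be scripted. This is only a sketch: the function name and the boot-directory argument are made up for illustration (the argument mainly allows a dry run on copies of the files), and it assumes GNU sed. Step 2 (the systemctl commands) and the reboot still have to be done by hand.

```shell
#!/bin/bash
# Enable the GPIO UART and remove the serial console from the kernel command line.
# The optional argument is the directory holding config.txt and cmdline.txt.
enable_gpio_uart() {
    local boot="${1:-/boot}"

    # Step 1: add enable_uart=1 unless the line is already present.
    grep -qx 'enable_uart=1' "${boot}/config.txt" \
        || echo 'enable_uart=1' >> "${boot}/config.txt"

    # Step 3: drop the serial console entry, leaving the other parameters alone.
    sed -i 's/console=serial0,115200 *//' "${boot}/cmdline.txt"
}
```

On the Pi itself this would be called without arguments, as root. Calling it a second time changes nothing.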

Information has been found here: Link.

Configuration on the server

The machine I’m using as an OpenStack server will need to be configured to use the serial output port as a console:

  1. Configure grub to use ttyS0 as console.  For this edit /etc/default/grub and append or modify:
    GRUB_CMDLINE_LINUX='console=tty0 console=ttyS0,19200n8'
    GRUB_TERMINAL=serial
    GRUB_SERIAL_COMMAND="serial --speed=19200 --unit=0 --word=8 --parity=no --stop=1"

    update-grub

  2. List the working serial port in Linux
    # setserial -g /dev/ttyS[0123]

    sample output:

    /dev/ttyS0, UART: 16550A, Port: 0x03f8, IRQ: 4
    /dev/ttyS1, UART: unknown, Port: 0x02f8, IRQ: 3
    /dev/ttyS2, UART: unknown, Port: 0x03e8, IRQ: 4
    /dev/ttyS3, UART: unknown, Port: 0x02e8, IRQ: 3
    

    This means the serial port is at /dev/ttyS0

  3. Start the serial-getty service on /dev/ttyS0:
    sudo systemctl enable serial-getty@ttyS0.service
    sudo systemctl start serial-getty@ttyS0.service
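
A mismatch between the speed GRUB uses and the speed the kernel console uses is an easy mistake to make here. The helper below is a hypothetical sanity check, not part of any existing tool: it prints every distinct serial speed found in the grub defaults file, so a single line of output means the settings agree.

```shell
#!/bin/bash
# Print the distinct serial speeds mentioned in a grub defaults file.
# Looks at "ttySn,<speed>" (kernel console) and "--speed=<speed>" (GRUB serial).
grub_serial_speeds() {
    local file="${1:-/etc/default/grub}"
    grep -oE '(ttyS[0-9]+,|--speed=)[0-9]+' "${file}" \
        | grep -oE '[0-9]+$' \
        | sort -u
}
```

With the configuration from step 1 this prints 19200, which is also the speed to use on the Raspberry Pi side, for example with screen /dev/ttyAMA0 19200 (assuming screen is installed on the Pi).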

 

Triggering power ON

The relay comes with three connections, making up two contacts which are triggered together when the relay is activated: one normally open (NO) and one normally closed (NC).  I’m using the NO contact to simulate a power button press.  However, it is important to note that the input is stated to be active low: a 0 will activate the relay and a 1 will deactivate it.

Controlling the GPIO pins is pretty easy on the command line:

  1. echo 4 > /sys/class/gpio/export
    This will export the pin to userspace through sysfs; by default its direction will be input.
  2. echo out > /sys/class/gpio/gpio4/direction
    This will set the GPIO to an output pin; the default value will be 0.  As I said before, since the input on the relay is active low, this will activate the relay!  I tried several things to change this, but setting the value to 1 before setting the direction is not possible, because the value is not writable while the GPIO is configured as an input.
  3. Wait for half a second, so that it looks like the power button has been pressed.
  4. echo 1 > /sys/class/gpio/gpio4/value
    This will deactivate the relay again, the server is now starting up.

Script:

#!/bin/bash

GPIO_NUM=4

# the relay that powers on the server is controlled by GPIO 4
echo ${GPIO_NUM} > /sys/class/gpio/export
sleep 0.5

# trigger the relay (note that the relay is active LOW, and the GPIO will
# always be LOW right after the direction is set to "out")
echo out > /sys/class/gpio/gpio${GPIO_NUM}/direction
echo 0 > /sys/class/gpio/gpio${GPIO_NUM}/value
sleep 0.5

# release the relay
echo 1 > /sys/class/gpio/gpio${GPIO_NUM}/value

# cleanup
echo in > /sys/class/gpio/gpio${GPIO_NUM}/direction
echo ${GPIO_NUM} > /sys/class/gpio/unexport
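
The kernel’s sysfs GPIO interface also accepts “high” and “low” as values for the direction file, which configures the pin as an output with that initial value. That should avoid the brief activation of the relay described above, and it makes the length of the “button press” explicit. A sketch only, not tested on this relay module; the function name and the GPIO_ROOT override (there for dry runs against a scratch directory) are made up for illustration.

```shell
#!/bin/bash
# GPIO_ROOT is a hypothetical override for dry runs; on the Pi it stays at the default.
GPIO_ROOT="${GPIO_ROOT:-/sys/class/gpio}"

# Pulse an active-low relay: press_power_button <gpio number> [hold seconds]
press_power_button() {
    local gpio="$1"
    local hold="${2:-0.5}"
    local pin="${GPIO_ROOT}/gpio${gpio}"

    # Export the pin unless it is already exported, and give sysfs a moment.
    if [ ! -d "${pin}" ]; then
        echo "${gpio}" > "${GPIO_ROOT}/export"
        sleep 0.5
    fi

    # "high" = output with initial value 1, so the active-low relay stays released.
    echo high > "${pin}/direction"

    echo 0 > "${pin}/value"    # activate the relay: button pressed
    sleep "${hold}"
    echo 1 > "${pin}/value"    # deactivate the relay: button released

    # Cleanup: back to input and hand the pin back to the kernel.
    echo in > "${pin}/direction"
    echo "${gpio}" > "${GPIO_ROOT}/unexport"
}
```

On the Pi, press_power_button 4 would give the same half-second press as the script above; a longer hold, such as press_power_button 4 5, would correspond to holding the power button down.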

Problems encountered during setting up the serial connection

It took a while to set up the serial connection.  This was because the serial cable I ordered was of the wrong type.   The proper pinout:

While in the cable I received it was something like 1-6-2-7-3-8-4-9-5.  I had a spare DB-9 connector in my stash, and I soldered it on the correct way.

Also, connect the Rx of the convertor to the Rx pin on the Raspberry Pi, and the Tx of the convertor to the Tx pin on the Pi…  Some sources claim you need to cross the Tx and Rx, but in my case this wasn’t necessary.

OpenStack Part 8: Upgrade and failures

When I started with my OpenStack try-out I made a crucial error in my choice of operating system. I chose to begin by installing Ubuntu 16.04, as I was unaware that the version of OpenStack would be strongly tied to the version of the underlying operating system; and at the time, the OpenStack version supported on Ubuntu 16.04 was still in development. Instead of stepping back to an earlier version of Ubuntu, I soldiered on…

Last September OpenStack Newton was officially released for Ubuntu 16.04, and I decided to go for the upgrade. A few quick “apt-get update/upgrade” commands later, nothing worked anymore 🙁

Interfaces and options had become deprecated or mismatched, and each one had to be painstakingly found and corrected. It took me months of effort, a couple of hours here and there whenever I found the courage, to dig through all the configuration files again and locate the few errors.

In the end I managed to get it back to the situation before the upgrade: Invalid Block Device. It took me a while, but I found the error in the cinder.conf configuration file: the option that points to the LVM configuration file was wrong. By default it points to /etc/cinder/lvm.conf, while the Ubuntu default location for that file is /etc/lvm/lvm.conf.

Next failure to figure out, when trying to create a VM instance, I get the “No valid host was found. There are not enough hosts available”. As far as I can tell from the logs it appears to be caused by an issue in the networking subsystem, it seems to claim that it can’t create a port. I’m stuck at this point, and I will probably need to post some question on the OpenStack support forum…

TO BE CONTINUED…

OpenStack part 7: Storage (cinder)

Installation of cinder was quite straightforward. I created a new VM with the same specifications as on the network node, and this will be my first storage node.

I assigned a 250GB logical volume to the VM in libvirt. And once booted I inserted that disk into another LVM volume group, so I can assign it to cinder for creation of the volumes.

I had to set up the networking (just 1 interface is needed), the ntp, and add the proper apt repositories, just like the other machines.

Installation of cinder went without a hitch: guide. And that’s all there is to that.

OpenStack Part 6.5: Adding disks

As announced in the previous post, I bought 4 used disks from eBay and inserted them into the machine. This is a synopsis of creating a RAID5 volume, putting it into an LVM volume group, and assigning that VG to libvirt. Libvirt can then create logical volumes in that VG and attach them to the virtual machines.

A Debian configuration guide I followed:
https://debian-handbook.info/browse/stable/advanced-administration.html#sect.raid-and-lvm

Checking the presence of the new SATA disks in the system:
root@PANICLOUD:/home/nicky# lsblk
NAME                            MAJ:MIN RM   SIZE RO TYPE  MOUNTPOINT
fd0                               2:0    1     4K  0 disk
sda                               8:0    0 232.9G  0 disk
└─sda1                            8:1    0 232.9G  0 part
  └─md0                           9:0    0 465.5G  0 raid0
    ├─md0p1                     259:0    0 243.1M  0 md    /boot
    ├─md0p2                     259:1    0     1K  0 md
    └─md0p5                     259:2    0 465.3G  0 md
      ├─PANICLOUD--vg-root      253:0    0   9.3G  0 lvm   /
      ├─PANICLOUD--vg-swap_1    253:1    0  18.1G  0 lvm   [SWAP]
      ├─PANICLOUD--vg-home      253:2    0   100G  0 lvm   /home
      ├─PANICLOUD--vg-images    253:3    0   100G  0 lvm   /opt/images
      └─PANICLOUD--vg-instances 253:4    0   100G  0 lvm   /opt/instances
sdb                               8:16   0 232.9G  0 disk
└─sdb1                            8:17   0 232.9G  0 part
  └─md0                           9:0    0 465.5G  0 raid0
    ├─md0p1                     259:0    0 243.1M  0 md    /boot
    ├─md0p2                     259:1    0     1K  0 md
    └─md0p5                     259:2    0 465.3G  0 md
      ├─PANICLOUD--vg-root      253:0    0   9.3G  0 lvm   /
      ├─PANICLOUD--vg-swap_1    253:1    0  18.1G  0 lvm   [SWAP]
      ├─PANICLOUD--vg-home      253:2    0   100G  0 lvm   /home
      ├─PANICLOUD--vg-images    253:3    0   100G  0 lvm   /opt/images
      └─PANICLOUD--vg-instances 253:4    0   100G  0 lvm   /opt/instances
sdg                               8:96   0 931.5G  0 disk
sdh                               8:112  0 931.5G  0 disk
sdi                               8:128  0 931.5G  0 disk
sdj                               8:144  0 931.5G  0 disk

 

Creating the RAID5 volume:
root@PANICLOUD:/home/nicky# mdadm --create /dev/md1 --level=5 --raid-devices=4 /dev/sdg /dev/sdh /dev/sdi /dev/sdj

Creating the LVM physical volume:
root@PANICLOUD:/home/nicky# pvcreate /dev/md1
Physical volume "/dev/md1" successfully created
root@PANICLOUD:/home/nicky# pvdisplay
--- Physical volume ---
PV Name               /dev/md0p5
VG Name               PANICLOUD-vg
PV Size               465.28 GiB / not usable 2.02 MiB
Allocatable           yes
PE Size               4.00 MiB
Total PE              119110
Free PE               35298
Allocated PE          83812
PV UUID               UdCEwD-mlv1-EuIw-L0jc-lrgK-QGjX-RM1AdD

"/dev/md1" is a new physical volume of "2.73 TiB"
--- NEW Physical volume ---
PV Name               /dev/md1
VG Name
PV Size               2.73 TiB
Allocatable           NO
PE Size               0
Total PE              0
Free PE               0
Allocated PE          0
PV UUID               FRsRMz-6HNs-S6d6-QXOw-beOf-DbB8-MGuLu1
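
The 2.73 TiB reported here matches what RAID5 should give: one disk’s worth of space goes to parity, so the usable size is (number of disks - 1) × disk size. A quick check of that arithmetic, with a helper name made up for the occasion:

```shell
#!/bin/bash
# Usable RAID5 capacity in TiB for N equally sized disks (size given in GiB).
raid5_usable_tib() {
    local disks="$1"
    local size_gib="$2"
    awk -v n="${disks}" -v s="${size_gib}" \
        'BEGIN { printf "%.2f\n", (n - 1) * s / 1024 }'
}

raid5_usable_tib 4 931.5   # four 931.5G disks, as listed by lsblk
```

This prints 2.73, in line with the PV size shown by pvdisplay.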

Creating the volume group:
root@PANICLOUD:/home/nicky# vgcreate PANICLOUD_STORAGE-vg /dev/md1
Volume group "PANICLOUD_STORAGE-vg" successfully created
root@PANICLOUD:/home/nicky# vgdisplay
--- Volume group ---
VG Name               PANICLOUD-vg
System ID
Format                lvm2
Metadata Areas        1
Metadata Sequence No  9
VG Access             read/write
VG Status             resizable
MAX LV                0
Cur LV                5
Open LV               5
Max PV                0
Cur PV                1
Act PV                1
VG Size               465.27 GiB
PE Size               4.00 MiB
Total PE              119110
Alloc PE / Size       83812 / 327.39 GiB
Free  PE / Size       35298 / 137.88 GiB
VG UUID               nWgVHx-Xuq9-AGAh-iI6g-4DoR-HcIn-Nq1WhR

--- Volume group ---
VG Name               PANICLOUD_STORAGE-vg
System ID
Format                lvm2
Metadata Areas        1
Metadata Sequence No  1
VG Access             read/write
VG Status             resizable
MAX LV                0
Cur LV                0
Open LV               0
Max PV                0
Cur PV                1
Act PV                1
VG Size               2.73 TiB
PE Size               4.00 MiB
Total PE              715305
Alloc PE / Size       0 / 0
Free  PE / Size       715305 / 2.73 TiB
VG UUID               AmZ7Yx-4p37-ZgLy-wIvC-1pOU-wdwP-JK64m1

Make sure that afterwards you run “update-initramfs -u” so that the RAID array is assembled again after a reboot of the machine.
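
On Debian and Ubuntu the array is usually also recorded in /etc/mdadm/mdadm.conf before the initramfs is rebuilt, so that it comes back under the same name. A sketch of that step, to be run as root; the function name and the optional path argument are assumptions for illustration, and existing entries may be formatted differently from what mdadm prints, so the file deserves a look afterwards.

```shell
#!/bin/bash
# Record the running md arrays in mdadm.conf (skipping lines already present),
# then rebuild the initramfs so the arrays are assembled at boot.
persist_md_arrays() {
    local conf="${1:-/etc/mdadm/mdadm.conf}"
    local line

    mdadm --detail --scan | while read -r line; do
        grep -qxF -- "${line}" "${conf}" 2>/dev/null || echo "${line}" >> "${conf}"
    done

    update-initramfs -u
}
```

After calling persist_md_arrays, cat /proc/mdstat following a reboot shows whether /dev/md1 was assembled.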

This guide summarizes the process to assign the volume group to libvirt with virt-manager.

OpenStack part 6: Networking and problem solving

I found a second networking guide, where I followed the instructions for the classic openvswitch implementation and got it up and running.

I’ve done all this for neutron:

  • Disable libvirt networking on the compute node: guide. This wasn’t really necessary, but it avoids confusion with the virbr0 networks that libvirt creates.

  • Create the basic networks in openvswitch and configure the proper interfaces.
  • Connect to the internet.

After all this I still wasn’t able to launch instances via the dashboard:

  1. Configuration errors: it is very easy to make a lot of typos 🙁
  2. After that it seemed like creating an instance failed because no volume could be created, so I bought four used hard drives from eBay, inserted them in the OpenStack machine and created a RAID5 volume. I installed cinder on a separate VM and assigned it 250GB. Volumes can be created now, but I still get errors when trying to create an instance. I will create a separate post on that subject.
  3. I’ve now been searching for the cause of the failures, and I believe it is located somewhere in the network configuration. It can’t create ports on the proper networks, and it fails because of that. A few things that I’ve found, besides more typos:
    • The documentation says to set interface_driver = neutron.agent.linux.interface.OVSInterfaceDriver in /etc/neutron/l3_agent.ini and /etc/neutron/dhcp_agent.ini, but this class can’t be loaded. I had to dig through the neutron python code, and found another way: interface_driver = openvswitch. It is an alias that does seem to work. I think this is a regression in the Newton code, as the code should be backwards compatible with the class path.
    • I also have to have the ip_gre, 8021q and vxlan kernel modules loaded. These are essential if you want to create GRE/VXLAN tunnels or VLAN networks. Just modprobe them, and put them in /etc/modules.
    • Still not working though, but now I do see the dhcp server plugged in on the network, so progress…
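
Loading the modules and making them persistent could be scripted roughly like this. The module names come from the list above; the function name and the optional file argument are made up for illustration, and it needs to run as root.

```shell
#!/bin/bash
# Load the tunnel/VLAN kernel modules now and register them for the next boot.
load_tunnel_modules() {
    local modules_file="${1:-/etc/modules}"
    local mod

    for mod in ip_gre 8021q vxlan; do
        modprobe "${mod}" || return 1
        # Add the module to the boot-time list unless it is already there.
        grep -qx "${mod}" "${modules_file}" 2>/dev/null \
            || echo "${mod}" >> "${modules_file}"
    done
}
```

Afterwards, lsmod | grep -E 'ip_gre|8021q|vxlan' shows whether all three are loaded.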

OpenStack Part 5: Setting up OpenStack

Still following this guide.

For the controller node I needed to install first:

  • SQL database: MariaDB
  • NoSQL database: MongoDB
  • Message queue: RabbitMQ
  • Memcached

to support these services on the controller node:

  • Identity (keystone)
  • Image service (glance)
  • Dashboard (horizon) – will only be installed after setting up the networking node

these on the network node:

  • Networking (neutron)

and these on the compute node:

  • Compute (nova)

First I install keystone and glance on the controller node, then nova on the compute node, and neutron on the network node. Back to the controller node, I will add horizon and ceilometer.

keystone

While installing keystone, and populating the database, I ran into an issue where the script returned an error:

2016-07-27 22:36:21.749 3111 ERROR oslo_db.sqlalchemy.exc_filters [-] DBAPIError exception wrapped from (pymysql.err.InternalError) (1071, u'Specified key was too long; max key length is 767 bytes')

A solution was found here: https://bugs.launchpad.net/openstack-manuals/+bug/1575688

It basically comes down to this: Ubuntu now uses a character set with 4 bytes per character by default, while the database can index at most 767 bytes per key. A field of 250 characters would take 1000 bytes, which is more than that limit. The fix described will reconfigure MariaDB to use a character set with fewer bytes per character instead, and recreate the keystone database.
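
The arithmetic behind that error is simple enough to check. A small sketch, using the 250-character field and the 767-byte limit from the text above; the function name is made up, and the 3- and 1-byte widths are only there for comparison.

```shell
#!/bin/bash
MAX_KEY_BYTES=767   # limit quoted in the error message

# Succeeds if an index on a field of <chars> characters fits within the limit.
index_fits() {
    local chars="$1"
    local bytes_per_char="$2"
    [ $((chars * bytes_per_char)) -le "${MAX_KEY_BYTES}" ]
}

for width in 4 3 1; do
    if index_fits 250 "${width}"; then
        echo "${width} bytes/char: $((250 * width)) bytes, fits"
    else
        echo "${width} bytes/char: $((250 * width)) bytes, too long"
    fi
done
```

Only the 4-byte case, at 1000 bytes, exceeds the limit.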

glance

As described in the guide

nova

As described in the guide, but I did have some trouble because I entered the wrong password for the nova user while creating it in openstack.

neutron

I found this example architecture, but then I would need to add another VLAN interface to the compute and network node machines. The controller node only needs the management interface, so I can remove its interface to the tunnel network again:
http://docs.openstack.org/mitaka/networking-guide/scenario-classic-ovs.html

On the controller node I would need to install:

  • neutron-server
  • neutron-plugin-ml2

On the network node:

  • openvswitch-switch
  • neutron-plugin-ml2
  • neutron-openvswitch-agent
  • neutron-l3-agent
  • neutron-dhcp-agent
  • neutron-metadata-agent

And on the compute node:

  • openvswitch-switch
  • neutron-plugin-ml2
  • neutron-openvswitch-agent

The guide describes how to install a basic neutron service on the controller node, but I would like it to run on the network node instead.

Installation of the above mentioned services has been done, but setting it up proved too much at this point.  I will cover this in a following post.

More information will probably come from this http://docs.openstack.org/mitaka/networking-guide/, instead of the basic installation guide.

horizon

This was very straightforward, as one would expect.  Installation followed the guide.

ceilometer (still optional)

Postponing installation of ceilometer for later.

OpenStack Part 4: Preparing the nodes

I’ll be starting with setting up a general purpose OpenStack environment as described in the architecture design description. Only the most basic of components will be considered at this point:

I’ll be skipping Object storage (swift) and Block storage (cinder) for now.
Ceilometer is also optional, but will be included to collect usage data for learning purposes.

Several virtual machines will be set up:

  • A controller node: 1 CPU, 4GB RAM, 20GB storage
    • Identity (keystone)
    • Dashboard (horizon)
    • Telemetry (ceilometer)
    • Image service (glance)
    • This node will have 2 networking interfaces:
      • Management network
      • Tunnel network
  • A networking node: 1 CPU, 4GB RAM, 10GB storage
    • Networking (neutron)
    • This node will have 3 networking interfaces:
      • Management network
      • Tunnel network
      • The Internet
  • And a compute node: 2 CPU, 8GB RAM, 20GB storage
    • Compute (nova)
    • This node will have 2 networking interfaces:
      • Management network
      • Tunnel network

Sizing and resizing of these virtual machines can be done later on with: http://libguestfs.org/virt-resize.1.html

I’ve decided to go for an Ubuntu Server 16.04 installation for the OpenStack virtual machines, because it is claimed to have better support for the OpenStack components than debian.

Three networks will be created:

  • Management network: NAT to eth0 on the host machine
  • Tunnel network: host only network bridge
  • Internet: directly assigned to networking node (if possible)

Following the installation guide here, I installed three Ubuntu virtual machines, version 16.04.1 LTS.  I’ve set up the provider network, configured DNS and NTP, and added the OpenStack apt repository.  Do note that the install guide is written for OpenStack version Mitaka; Ubuntu Xenial will only support OpenStack Newton, which is still under development at this moment.

OpenStack Part 3: DevStack

I’ve been trying DevStack on Ubuntu and Debian, just to get a feel for what an OpenStack installation would require from a VM (disk space and the like).

Lessons learned:

  • Use static IP addresses on your VMs; changing them later on means updating all the entries in the database that contain this IP address.
  • One does not simply reboot a devstack machine! Before the reboot: unstack.sh, then reboot, and afterwards rejoin-stack.sh. Even then, most of the time the cinder volume service doesn’t seem to come back online after the rejoin-stack, and I have to run stack.sh to make it work again. I’m not sure whether this is a side effect of the devstack installation or whether this is an openstack thing; I would’ve hoped that virtual machines with properly configured services would just register themselves automatically when started and stopped. (Update: it appears that the openstack services under the devstack tool are started in a screen session, so that’s yet another tool I need to learn to master if I want to investigate this implementation further. But it does imply that this reboot behavior is not indicative of openstack, but only of devstack.)
  • I still have a lot to learn about what all these things actually are. The dashboard isn’t really making it any easier…
