ESXi VM – The CPU has been disabled by the guest operating system

For some weeks now, a couple of my virtual machines on ESXi would stop working out of nowhere. They were completely unresponsive (including via the ESXi VM Console). Nothing would help, except a shutdown / start of the VM. Just to find out later that, randomly, the VM would become unresponsive again.

The only human readable information about these failures was in the ESXi host Events and was saying something like this (among other things):

 The CPU has been disabled by the guest operating system

One other thing which I should mention is that all my VM encountering this issue where Linux based, mainly Ubuntu 20.04 as OS distribution.

Not much to work with, but I gave it a try and searching for the error did point me to this VMware KB: https://kb.vmware.com/s/article/2000542

The KB is clearly accurate, just that it didn’t help me at all to resolve my problem. The troubleshooting process explain in the KB lead me to a dead end.

Other web resources (for the above error) pointed to articles which explained a procedure for VMware Workstation / Player. Not my case, since I’m using ESXi.

More research done, which took a while – that’s why I’m writing this article, hopefully others with this problem will find it easier – pointed to a BUG. Seems this BUG is a particular case between my VM Linux kernel and the version of the ESXi I’m using currently.

I’ve arrived to this VMware KB https://kb.vmware.com/s/article/2151480 which was a game changer. In my case this KB was hard to find, because the title – Linux VM fails with the error “kernel BUG at drivers/net/vmxnet3/vmxnet3_drv.c:1413!” (2151480) – is completely different than the error I was seeing and which I used searching the web.

Skipping the long output at the beginning of the KB, I saw something interesting in lower part of the page:

This issue occurs due to a bug in VMXNET3 vNIC backend which is part of the vmkernel. This issue occurs if the following conditions are met:

    Linux VM is running kernel >= 4.8
    HW version of VM is >=13
    ESXi version is 6.5

All the above fits my scenario, VMXNET3 as vNIC, Kernel 5.4, VM HW version 13 and ESXi 6.5

Like in most of BUG cases, the obvious solution is upgrade. Same here:

This issue is resolved in VMware ESXi 6.5 U1

Just that I cannot upgrade now for various reasons.

So, I’ve decide to look into the workarounds.

Second workaround on the page seems to be more simple and I don’t even have to restart the VM:

ethtool -G ethX rx-mini 0

Of course replace the ethX with your interface name.

Worked like a charm without any visible side-effects.

The other workaround is also doable, but I didn’t want to modify the .vmx file

Power off the virtual machine
         
Edit the vmx file and add the below parameter:
vmxnet3.rev.30 = FALSE
         
Power on the virtual machine

Now I’m just curious if I would encounter the same issues using another vNIC adapter type, like E1000 or E1000E instead of VMXNET3. Maybe I’ll give it a try…

Ubuntu image for EVE-NG – Python for network engineers

Lately I’ve started working more and more with EVE-NG to test various network scenarios, automation and in general to try and learn something everyday.

If you’re familiar with EVE-NG, you know where to find various Linux images which you can download and install . Very helpful indeed, however all of them are coming without any pre-installed tools which I need for network oriented tests. I need Python, IPerf, Ansible, various Python libraries for network automation, etc.
Basically every time when I setup a new lab in EVE-NG, I need to make sure that the Linux image has a connection to Internet to download all these tools. Doable, but too much time consuming.

Lately EVE-NG has the Pro version, where you have Docker images which support some of the tools for a network engineer needs to test automation. If you already have EVE-NG Pro, then maybe this is a bit redundant. However if you’re still using the Community version, it may sounds interesting.

I’ve developed the Ubuntu (18.04) image using the same tools that you can find in my Docker image (Ubuntu 16:04 Pfne):
* If you’re not sure what I’m talking about, please read my previous post.

  • Openssl
  • Net-tools (ifconfig..)
  • IPutils (ping, arping, traceroute…)
  • IProute
  • IPerf
  • TCPDump
  • NMAP
  • Python 2
  • Python 3
  • Paramiko (python ssh support)
  • Netmiko (python ssh support)
  • Ansible (automation)
  • Pyntc
  • NAPALM

The image is hosted on my Firstdigest Project at Sourceforge.
If you are in a hurry, download directly using this link: Ubuntu 18.04 Pfne for EVE-NG.

For convenience here are the steps, but if you run into trouble be sure to check the EVE-NG Documentation.

  • Download the image
  • Using favorite SFTP Client (WinSCP, FileZilla) connect to your EVE-NG and upload the image to the location: /opt/unetlab/addons/qemu/
  • Connect via SSH to your EVE-NG machine and go to location:
cd /opt/unetlab/addons/qemu/
  • Unzip your uploaded image file.
tar xzvf linux-ubuntu-server-18.04-pfne.tar.gz
  • Remove the archived image file (be sure to have a copy somewhere to avoid you have to download it again)
rm -f linux-ubuntu-server-18.04-pfne.tar.gz
  • Fix permissions
/opt/unetlab/wrappers/unl_wrapper -a fixpermissions

The image comes with the following predefined username and password (security was not the point here):

User: root
Password: root
User: pfne
Password: pfne

With this image you have everything ready for your tests. You want to test QoS? Just design a network and two (client / server pair) machine using this image and push some packets with IPerf. Or maybe you want to test some automation. Here you have it, just start playing with.

Btw, I assume you have the EVE-NG installed. If not and you’re into learning topics, I’ll advise you to install this great application. You can start with Community version which is free (and honestly has enough features for most of the self-teaching engineers out there) and if you feel like go with the Pro version.

Let me know if you find it useful. In case of problems, please comment and I’ll try to help in my spare time.

Docker image – Python for network engineers

Lately I’m looking more and more into Python, with respect to automation implementations useful for network engineers. In the learning process I’ve used different materials, like the excellent video trainings Python Programming for Network Engineers from David Bombal which are available free on Youtube.

This training in particular relies on a Ubuntu Docker image in order to support Python learning following interaction with Cisco devices in GNS3. Everything is great, just that the image doesn’t contain all necessary tools (like Paramiko, Netmiko, Ansible…). As you can guess, whenever you close / open the Project in GNS3, all the installed packages installed in the Ubuntu Docker image are gone.

Since we’re talking automation, I got bored to install the necessary tools everytime I wanted to start a new project or I had to close GNS3 for some reason. I’ve tried to find a Docker image that suits my needs, but I couldn’t (please point me to one if you know it).

So, I’ve build a Docker image, based on Ubuntu 16.04, which contains the necessary tools to start learning Python programming oriented for network engineers:

  • Openssl
  • Net-tools (ifconfig..)
  • IPutils (ping, arping, traceroute…)
  • IProute
  • IPerf
  • TCPDump
  • NMAP
  • Python 2
  • Python 3
  • Paramiko (python ssh support)
  • Netmiko (python ssh support)
  • Ansible (automation)
  • Pyntc
  • NAPALM

The above list can be extended, but I would like to keep it to the minimum necessary (I want to keep the image size at decent level).

If you’re interested, please find the image at: https://hub.docker.com/r/yotis/ubuntu1604-pfne/, or you can download it:

$ docker pull yotis/ubuntu1604-pfne

I’ve tested the image for couple of days and it works fine. However if something doesn’t work as expected, please let me know and I’ll try to fix it.

For those using GNS3 is possible to import the image above directly into GNS3 using the PFNE Appliance.

Ubuntu OVF images for download

Lately I’m playing a lot with virtualization features and for this I needed a rapid way to deploy from scratch new instances. First I had the virtual machines converted to templates, but then I had to rebuild from zero the entire ESXi environment and those images were gone.

I realized then it was more easier to have OVF images saved on a distributed storage and deploy them as soon as I need them. I start looking around Internet and I could not find something that suit my needs.
Don’t get me wrong, there are plenty of OVF images around, but mostly have GUI and a lot of packages already installed that I do not need.

I wanted to have OVF files with low-end hardware and only CLI interface. Why should I download and deploy a 20 or 30 GB instance if the only things I need is CLI? From this I could customize it everytime exactly the way I wanted.

I started to create my OVF files and I’m pretty satisfied with them. Then I said why not to share them with the community?

I did chose Sourceforge to host my files because of their CDN and because it is free. On this blog I have to think how to organize them, because I don’t know if “post” format is the best idea. Until then, please find below the first two OVF images for Ubuntu 12.04 LTS.

All archives contain a text files with details about distro, user/passwd and services enabled. There may be other services enabled like postfix, but the listed ones are mandatory if you want network functionality and remote connection.

Here are the details for the below listed downloads:

Server images 32/64 bits

username: notroot
passwd: 123qweASD!

username: root
passwd: firstdigest
HDD: 8GB, ext4, 1 partition, thin provision
RAM: 256 CPU Core: 1

Services enabled:
SSHd
DHCP client

Downloads

Ubuntu 12.04 Server (i386) OVF

Ubuntu 12.04 Server (x86_64) OVF

If you encounter problems with these images please let me know here in Comments or on Sourceforge Project Discussions page.

In the upcoming days I will add here more images from different distros.

CCIE home rack – Ubuntu persistent net rules

In one of my last posts, I was writing about my CCIE home rack which has one server that runs Ubuntu + Ethernet Quad cards + Cisco switches. Before connecting the cables between server and Cisco switches, I had one problem that was driving me crazy.

I have three Ethernet Quad cards connected in 3 PCI slots in my server. The issue is that sometimes (quite often) the port numbers were changing during the reboot of my client. To give you and example, an ethernet port that was  eth1 during one server boot could change to eth2 next time. For twelve ports you can imagine what mess this creates after almost each server reboot. When this was happening, the ports were not matched correctly in Dynamips and from it would result in a lot of connectivity problems.

I will try to explain in a few word why this is happening. With Udev and modular network drivers, the network interface numbering is not persistent across reboots by default, because the drivers are loaded in parallel and, thus, in random order. For example, on a computer having two network cards made by Intel and Realtek, the network card manufactured by Intel may become eth0 and the Realtek card becomes eth1. In some cases, after a reboot the cards get renumbered the other way around. To avoid this, create Udev rules that assign stable names to network cards based on their MAC addresses or bus positions.

My solution is based on MAC addresses. I’m not saying that’s the best, but it works for me and I hope it will help you as well. Before editing the file for persistent Udev rules, we have to gather some information.

First I needed to check what cards I have in my server:

root@athos:/etc/udev/rules.d# lspci | grep Ethernet
01:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168B PCI Express Gigabit Ethernet controller (rev 01)
04:00.0 Ethernet controller: Intel Corporation 82557/8/9/0/1 Ethernet Pro 100 (rev 05)
04:01.0 Ethernet controller: Intel Corporation 82557/8/9/0/1 Ethernet Pro 100 (rev 05)
04:02.0 Ethernet controller: Intel Corporation 82557/8/9/0/1 Ethernet Pro 100 (rev 05)
04:03.0 Ethernet controller: Intel Corporation 82557/8/9/0/1 Ethernet Pro 100 (rev 05)
05:04.0 Ethernet controller: Digital Equipment Corporation DECchip 21142/43 (rev 41)
05:05.0 Ethernet controller: Digital Equipment Corporation DECchip 21142/43 (rev 41)
05:06.0 Ethernet controller: Digital Equipment Corporation DECchip 21142/43 (rev 41)
05:07.0 Ethernet controller: Digital Equipment Corporation DECchip 21142/43 (rev 41)
06:04.0 Ethernet controller: Digital Equipment Corporation DECchip 21142/43 (rev 41)
06:05.0 Ethernet controller: Digital Equipment Corporation DECchip 21142/43 (rev 41)
06:06.0 Ethernet controller: Digital Equipment Corporation DECchip 21142/43 (rev 41)
06:07.0 Ethernet controller: Digital Equipment Corporation DECchip 21142/43 (rev 41)

So, I have my onboard Realtek card, then one Intel quad card and then two D-Link DFE-570TX quad cards. The Intel card I got it from a friend and the D-Link I bought over eBay. I must say that Ethernet quad cards tend to be a little bit too expensive, but I’ve found 2 brands over eBay that were cheaper, around 25 Eur / piece. One brand is the D-Link and the other one is from Sun. There are some arguments to choose D-Link over Sun on the INE Online Community discussion. You should check that discussion as it has a lot of good tips and tricks. All cards are supported natively in Ubuntu.

Next, I had to collect the MAC addresses of all Ethernet cards.

root@athos:/etc/udev/rules.d# grep -H . /sys/class/net/*/address | grep eth
/sys/class/net/eth0/address:00:25:22:53:57:40
/sys/class/net/eth10/address:00:80:c8:ca:d8:7e
/sys/class/net/eth11/address:00:80:c8:ca:d8:7f
/sys/class/net/eth12/address:00:80:c8:ca:d8:80
/sys/class/net/eth1/address:00:e0:b6:06:a6:3b
/sys/class/net/eth2/address:00:e0:b6:06:a6:3a
/sys/class/net/eth3/address:00:e0:b6:06:a6:39
/sys/class/net/eth4/address:00:e0:b6:06:a6:38
/sys/class/net/eth5/address:00:80:c8:ca:bb:59
/sys/class/net/eth6/address:00:80:c8:ca:bb:5a
/sys/class/net/eth7/address:00:80:c8:ca:bb:5b
/sys/class/net/eth8/address:00:80:c8:ca:bb:5c
/sys/class/net/eth9/address:00:80:c8:ca:d8:7d

On the last step I had to edit the file in charge for Udev persistent rules for network cards. In Ubuntu, this file is under:

/etc/udev/rules.d/70-persistent-net.rules

I already had a sample file, so I just modified it so match my own rules. My file looks like this:

# This file maintains persistent names for network interfaces.
# See udev(7) for syntax.
#
# Entries are automatically added by the 75-persistent-net-generator.rules
# file; however you are also free to add your own entries.
#
### File generated by cc ####
### Count start from top card left side with eth1 ####
### # eth4  # eth3  # eth2  # eth1 ###
### # eth8  # eth7  # eth6  # eth5 ###
### # eth12 # eth11 # eth10 # eth9 ###
 
# PCI device 0x10ec:0x8168 (r8169)
SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="00:25:22:53:57:40", ATTR{dev_id}=="0x0", ATTR{type}=="1", KERNEL=="eth*", NAME="eth0"
 
# PCI device 0x1011:0x0019 (tulip)
SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="00:80:c8:ca:d8:80", ATTR{dev_id}=="0x0", ATTR{type}=="1", KERNEL=="eth*", NAME="eth12"
 
# PCI device 0x1011:0x0019 (tulip)
SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="00:80:c8:ca:d8:7f", ATTR{dev_id}=="0x0", ATTR{type}=="1", KERNEL=="eth*", NAME="eth11"
 
# PCI device 0x1011:0x0019 (tulip)
SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="00:80:c8:ca:d8:7e", ATTR{dev_id}=="0x0", ATTR{type}=="1", KERNEL=="eth*", NAME="eth10"
 
# PCI device 0x1011:0x0019 (tulip)
SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="00:80:c8:ca:d8:7d", ATTR{dev_id}=="0x0", ATTR{type}=="1", KERNEL=="eth*", NAME="eth9"
 
# PCI device 0x1011:0x0019 (tulip)
SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="00:80:c8:ca:bb:5c", ATTR{dev_id}=="0x0", ATTR{type}=="1", KERNEL=="eth*", NAME="eth8"
 
# PCI device 0x1011:0x0019 (tulip)
SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="00:80:c8:ca:bb:5b", ATTR{dev_id}=="0x0", ATTR{type}=="1", KERNEL=="eth*", NAME="eth7"
 
# PCI device 0x1011:0x0019 (tulip)
SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="00:80:c8:ca:bb:5a", ATTR{dev_id}=="0x0", ATTR{type}=="1", KERNEL=="eth*", NAME="eth6"
 
# PCI device 0x1011:0x0019 (tulip)
SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="00:80:c8:ca:bb:59", ATTR{dev_id}=="0x0", ATTR{type}=="1", KERNEL=="eth*", NAME="eth5"
 
# PCI device 0x8086:0x1229 (e100)
SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="00:e0:b6:06:a6:38", ATTR{dev_id}=="0x0", ATTR{type}=="1", KERNEL=="eth*", NAME="eth4"
 
# PCI device 0x8086:0x1229 (e100)
SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="00:e0:b6:06:a6:39", ATTR{dev_id}=="0x0", ATTR{type}=="1", KERNEL=="eth*", NAME="eth3"
 
# PCI device 0x8086:0x1229 (e100)
SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="00:e0:b6:06:a6:3a", ATTR{dev_id}=="0x0", ATTR{type}=="1", KERNEL=="eth*", NAME="eth2"
 
# PCI device 0x8086:0x1229 (e100)
SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="00:e0:b6:06:a6:3b", ATTR{dev_id}=="0x0", ATTR{type}=="1", KERNEL=="eth*", NAME="eth1"

I did some remarks for myself on the top file, just to know in future how the ports are arranged. Save this file and you’re good to go.

Please remember, don’t just copy / paste the output from this file. I mean you can do it, but change at least the ATTR{address} and NAME. I don’t know what other attributes (ATTR{dev_id}==”0x0″, ATTR{type}==”1″) are doing. They were in the original file and I just copy/paste them. Everything is working and for me that is enough for me.

Good luck and let me know if you have any problems implementing the above solution.