VCSA, 503 Service Unavailable – possible fix

My ESXi hosting the VCSA crashed for whatever reason and after reboot the VCSA was displaying a “503 Service Unavailable” error.

What I was seeing actually was a blabbering long line:

503 Service Unavailable (Failed to connect to endpoint: [N7Vmacore4Http20NamedPipeServiceSpecE:0x00007fa69401a900] _serverNamespace = / action = Allow _pipeName =/var/run/vmware/vpxd-webserver-pipe)

The ESXi hosting my VCSA is not the fastest in the world, so I’ve waited a while, but the error was still there. Searching the Interne returned a lot of possible root causes for this errors, ranging from simple to complex one (like duplicate database table entry where you have to manually touch the postgresql instance).

I didn’t want to jump directly into touching things like the database, so I started with something more simple.

Below is what worked for me, maybe you’ll find it useful and can try before going into advanced troubleshooting.

I’ve connected to the VCSA CLI using the root credentials.

[email protected]'s password:
Connected to service

* List APIs: "help api list"
* List Plugins: "help pi list"
* Launch BASH: "shell"

Command>

Launched BASH by typing shell at the Command> prompt.

Now I have a Linux like CLI terminal.

Next step I’ve ran

service-control --status --all

which resulted in the following output:

root@vcsa [ ~ ]# service-control --status --all
Running:
 lwsmd vmafdd vmcad vmdird vmdnsd vmonapi vmware-cis-license vmware-cm vmware-eam vmware-rhttpproxy vmware-sca vmware-sts-idmd vmware-stsd vmware-vapi-endpoint vmware-vmon vmware-vpostgres vmware-vpxd-svcs vsphere-client
Stopped:
 applmgmt pschealth vmcam vmware-content-library vmware-imagebuilder vmware-mbcs vmware-netdumper vmware-perfcharts vmware-psc-client vmware-rbd-watchdog vmware-sps vmware-statsmonitor vmware-updatemgr vmware-vcha vmware-vpxd vmware-vsan-health vmware-vsm vsphere-ui

I’m not a certified expert in VCSA, but this doesn’t look good. Too many stopped services.

So, I just give it a try to see if I can start them by running

service-control --start --all

The next output is a long one, but basically it will check what services are up and start the ones which are stopped

root@vcsa [ ~ ]# service-control --start --all
Perform start operation. vmon_profile=ALL, svc_names=None, include_coreossvcs=True, include_leafossvcs=True
2020-04-06T17:31:57.180Z   Running command: ['/usr/bin/systemctl', 'is-enabled', u'lwsmd']
2020-04-06T17:31:57.185Z   Done running command
2020-04-06T17:31:57.188Z   Service lwsmd does not seem to be registered with vMon. If this is unexpected please make sure your service config is a valid json. Also check vmon logs for warnings.
2020-04-06T17:31:57.188Z   Running command: ['/sbin/service', u'lwsmd', 'status']
2020-04-06T17:31:57.213Z   Done running command
Successfully started service lwsmd
2020-04-06T17:31:57.217Z   Running command: ['/usr/bin/systemctl', 'is-enabled', u'vmafdd']
2020-04-06T17:31:57.589Z   Done running command
2020-04-06T17:31:57.593Z   Service vmafdd does not seem to be registered with vMon. If this is unexpected please make sure your service config is a valid json. Also check vmon logs for warnings.
2020-04-06T17:31:57.593Z   Running command: ['/sbin/service', u'vmafdd', 'status']
2020-04-06T17:31:57.617Z   Done running command
Successfully started service vmafdd
2020-04-06T17:31:57.621Z   Running command: ['/usr/bin/systemctl', 'is-enabled', u'vmdird']
2020-04-06T17:31:57.627Z   Done running command
2020-04-06T17:31:57.630Z   Service vmdird does not seem to be registered with vMon. If this is unexpected please make sure your service config is a valid json. Also check vmon logs for warnings.
2020-04-06T17:31:57.630Z   Running command: ['/sbin/service', u'vmdird', 'status']
2020-04-06T17:31:57.654Z   Done running command
Successfully started service vmdird
2020-04-06T17:31:57.657Z   Running command: ['/usr/bin/systemctl', 'is-enabled', u'vmcad']
2020-04-06T17:31:57.663Z   Done running command
2020-04-06T17:31:57.667Z   Service vmcad does not seem to be registered with vMon. If this is unexpected please make sure your service config is a valid json. Also check vmon logs for warnings.
2020-04-06T17:31:57.667Z   Running command: ['/sbin/service', u'vmcad', 'status']
2020-04-06T17:31:57.690Z   Done running command
Successfully started service vmcad
2020-04-06T17:31:57.694Z   Running command: ['/usr/bin/systemctl', 'is-enabled', u'vmware-sts-idmd']
2020-04-06T17:31:57.700Z   Done running command
2020-04-06T17:31:57.703Z   Service vmware-sts-idmd does not seem to be registered with vMon. If this is unexpected please make sure your service config is a valid json. Also check vmon logs for warnings.
2020-04-06T17:31:57.703Z   Running command: ['/sbin/service', u'vmware-sts-idmd', 'status']
2020-04-06T17:31:57.727Z   Done running command
Successfully started service vmware-sts-idmd
2020-04-06T17:31:57.730Z   Running command: ['/usr/bin/systemctl', 'is-enabled', u'vmware-stsd']
2020-04-06T17:31:57.736Z   Done running command
2020-04-06T17:31:57.739Z   Service vmware-stsd does not seem to be registered with vMon. If this is unexpected please make sure your service config is a valid json. Also check vmon logs for warnings.
2020-04-06T17:31:57.740Z   Running command: ['/sbin/service', u'vmware-stsd', 'status']
2020-04-06T17:31:57.763Z   Done running command
Successfully started service vmware-stsd
2020-04-06T17:31:57.767Z   Running command: ['/usr/bin/systemctl', 'is-enabled', u'vmdnsd']
2020-04-06T17:31:57.773Z   Done running command
2020-04-06T17:31:57.777Z   Service vmdnsd does not seem to be registered with vMon. If this is unexpected please make sure your service config is a valid json. Also check vmon logs for warnings.
2020-04-06T17:31:57.777Z   Running command: ['/sbin/service', u'vmdnsd', 'status']
2020-04-06T17:31:57.801Z   Done running command
Successfully started service vmdnsd
2020-04-06T17:31:57.805Z   Running command: ['/usr/bin/systemctl', 'is-enabled', u'vmware-psc-client']
2020-04-06T17:31:57.812Z   Done running command
2020-04-06T17:31:57.815Z   Service vmware-psc-client does not seem to be registered with vMon. If this is unexpected please make sure your service config is a valid json. Also check vmon logs for warnings.
2020-04-06T17:31:57.815Z   Running command: ['/sbin/service', u'vmware-psc-client', 'status']
2020-04-06T17:31:57.839Z   Done running command
2020-04-06T17:31:57.843Z   Running command: ['/usr/bin/systemctl', 'daemon-reload']
2020-04-06T17:31:57.927Z   Done running command
2020-04-06T17:31:57.927Z   Running command: ['/usr/bin/systemctl', 'set-property', u'vmware-psc-client.service', 'MemoryAccounting=true', 'CPUAccounting=true', 'BlockIOAccounting=true']
2020-04-06T17:31:57.943Z   Done running command
Successfully started service vmware-psc-client
Service-control failed. Error Failed to start vmon services.vmon-cli RC=1, stderr=Failed to start statsmonitor services. Error: Operation timed out

The last line above is not too encouraging, “failed” keywords is not something to I wanted to see in the output. I was thinking my attempt didn’t work.

However checking the service status again, I’ve seen the following:

root@vcsa [ ~ ]# service-control --status --all
Running:
 applmgmt lwsmd pschealth vmafdd vmcad vmdird vmdnsd vmonapi vmware-cis-license vmware-cm vmware-content-library vmware-eam vmware-perfcharts vmware-psc-client vmware-rhttpproxy vmware-sca vmware-sps vmware-sts-idmd vmware-stsd vmware-updatemgr vmware-vapi-endpoint vmware-vmon vmware-vpostgres vmware-vpxd vmware-vpxd-svcs vmware-vsan-health vmware-vsm vsphere-client vsphere-ui
Stopped:
 vmcam vmware-imagebuilder vmware-mbcs vmware-netdumper vmware-rbd-watchdog vmware-statsmonitor vmware-vcha

This was for sure better than before.

I gave it a try by opening the https://vcsa.local.domain and there it was, the webpag working fine.

I’m not sure exactly why the restart of the VCSA resulted in some services not to start properly, but seems that a kick will do the job.

How to create your own Docker image

I mentioned in my previous post that I’ll explain how to create your own Docker image and customize it however you’d like. While is great to just use an image from Docker Hub, it can be that you need some customized image to fit your needs. As said before, is not hard at all to create the image and worth knowing how to do it.

I’ll use for this tutorial a fresh Ubuntu 18.04 minimal installation. You can follow the same steps (or almost) using different Linux distro, Microsoft Windows or MacOS. The reason why I chose Ubuntu is simply because is the distro that I’m most familiar and enjoy working with.

For all steps below you need to be root or run the commands via sudo. So you’ll see either # at the begining of the command if you’re root or $ sudo if you pick to run it with elevated rights.

Install Docker

# apt install -y docker.io

A word of advice here. Be sure to have docker.io typed. If you miss the .io, the system will install a docker, but that’s a different package:

docker/bionic 1.5-1build1 amd64
  System tray for KDE3/GNOME2 docklet applications

You’ll end up with something that cannot be used for what we want to achieve, since the docker command isn’t even there.

You can test if the installation completed successfully by using the following command:

# service docker status

You should see something like this in the output:

Docker service successful status

Since this is a new installation, you’ll have no images, no containers, nothing.
You can check, just to be sure.

# docker image ls

The result should be:

Docker image ls return nothing

I’ll add at the end of the post some basic (and most important) Docker commands to get you started.

Pull Ubuntu 18.04 image – Optional step

This step is optional, but I’d advise to do it, just to test that everything is fine with your Docker installation In this case we’re going to use the official Ubuntu 18.04 minimal Docker image. If you want to read more about this image you can check the explanation on Ubuntu 18.04 minimal Docker image and check their repository on Docker Hub – Ubuntu.

# docker pull ubuntu:18.04

If everything goes well you should see a message ending with “Status: Downloaded newer image for ubuntu:18.04” :

Docker successful download of Ubuntu image

Time to run our first container:

# docker run -i -t ubuntu:18.04 /bin/bash

You should be now in container shell:

Docker container

Now that we tested you can type exit to leave the container.

Create Dockerfile

The Dockerfile is nothing more than a text document which contains all the commands a user could call on the command line to create an image.
A detailed explanation is beyond the scope of this post, but if you’d like to learn more, you can check the Docker Documentation – Dockerfile

Here is a sample that’s good to start with:

# My custom Ubuntu 18.04 with 
# various network tools installed
# Build image with:  docker build mycustomlinux01 .


FROM ubuntu:18.04
MAINTAINER Calin C., https://github.com/yotis1982
RUN apt-get update --fix-missing
RUN apt-get upgrade -y
RUN apt-get install -y software-properties-common
RUN apt-get install -y build-essential
RUN apt-get install -y net-tools mtr curl host
RUN apt-get install -y iputils-arping iputils-ping iputils-tracepath
RUN apt-get install -y iproute2
RUN apt-get install -y traceroute
RUN apt-get install -y tcpdump

A short explanation:

# – This is a comment, add here whatever you think is useful. I’ve picked the name “mycustomlinux01”, but you can add whatever you like.
FROM – is always your first instruction, because it names the base image you’re building your new image from.
MAINTAINER – is the creator of the Dockerfile.
RUN – instruction to run the specified command, in this case apt-get to install various packages

There are multiple instructions for setting environment variables like ADD, COPY, ENV, EXPOSE, LABEL, USER, WORKDIR, VOLUME, STOPSIGNAL, and ONBUILD. You can read all about them in the Docker Documentation – Dockerfile

Using RUN you can add whatever package you need in your custom image. The same like you would do on a regular Ubuntu installation.
Yes, all the packages above could have been added in one RUN line, but for the sake of better visibility I would suggest to have separate lines.

Create your custom Docker image

After you save the Dockerfile is time to create your image

# docker build -t mycustomlinux01 .

You’ll see a lot of output, the same like when you’re installing new packages in any Linux distro. When you see the following lines, you’ll know that the image was successful created:

Docker successful image creation

Let’s check if the image is listed using:

# docker image ls

You should see the mycustomlinux01 image listed:

List my Docker image

Since the image is created successful I’d suggest that you run a container using this image following the same steps like in the “Pull Ubuntu 18.04 image”

Basically that’s it, you just created your custom image.

As mentioned above, here is a list of commands that I find useful to have at hand when working with Docker containers.

List images:

# docker image ls

Start a container from an image:

# docker run -i -t ubuntu:12.04 /bin/bash

Using an ID (you get the ID from List image command):

# docker run -i -t 8dbd9e392a96 /bin/bash

List all containers:

# docker ps -a

List running containers:

# docker ps -l

Attach running container:

# docker attach “container ID”

Remove a container:

# docker rm “container ID”

Last but not least. If you liked my Ubuntu 18.04 Docker image customized for network engineers who wants to learn Python and you would like to install additional packages, here is the Dockerfile:

# Ubuntu 18.04 with Python, Paramiko, Netmiko, Ansible
# various other network tools installed and SSH activated
# Build image with:  docker build -t yotis/ubuntu1804-pfne .

FROM ubuntu:18.04
MAINTAINER Calin C., https://github.com/yotis1982
RUN apt-get update --fix-missing
RUN apt-get upgrade -y
RUN apt-get install -y software-properties-common
RUN apt-get install -y build-essential
RUN apt-get install -y openssl libssl-dev libffi-dev
RUN apt-get install -y net-tools mtr curl host socat
RUN apt-get install -y iputils-arping iputils-ping iputils-tracepath
RUN apt-get install -y iproute2
RUN apt-get install -y iptraf-ng traceroute
RUN apt-get install -y tcpdump nmap
RUN apt-get install -y iperf iperf3
RUN apt-get install -y python python-pip python-dev
RUN apt-get install -y python3 python3-pip python3-dev
RUN apt-get install -y openssh-client telnet
RUN apt-get install -y nano
RUN apt-get install -y netcat
RUN apt-get install -y socat
RUN pip install --upgrade pip
RUN pip install cryptography
RUN pip install paramiko
RUN pip install netmiko
RUN pip install pyntc
RUN pip install napalm
RUN apt-add-repository ppa:ansible/ansible
RUN apt-get update
RUN apt-get install -y ansible
RUN apt-get clean
VOLUME [ "/root" ]
WORKDIR [ "/root" ]
CMD [ "sh", "-c", "cd; exec bash -i" ]

Obviously there is more about Docker than is covered on this post. It wasn’t in my scope to make a detailed analyze of Docker, rather a cheatsheet on how to create your custom image. If you want to learn more there are plenty resources out there and a good starting point is the Docker website.

I hope you find this how-to useful. As always, if you need to add something or you have questions about, please use the Comments form to get in contact with me.

New Ubuntu 18.04 Docker image – Python For Network Engineers

About one year ago I’ve created the Ubuntu 16.04 PFNE Docker image. It’s time for a new version of the Ubuntu PFNE Docker image to support Network engineers learn Python and test automation.

Recently, Ubuntu announced that on the Ubuntu Docker Hub the 18.04 LTS version is using the minimal image.

With this change when launching a Docker instance using

$ docker run ubuntu:18.04

you’ll have an instance with the latest Minimal Ubuntu.

While this is great, especially if you need to quickly pull an image, the fact stays that it doesn’t have preinstalled the necessary tools to test network automation, learn Python or run some QoS tests using packages like IPerf.

Based on My previous Ubuntu 16.04 PFNE Docker image, I’ve created the same using the new Ubuntu 18.04 LTS minimal image.

It contains all the tools found in Ubuntu 16.04 PFNE:

  • Openssl
  • Net-tools (ifconfig..)
  • IPutils (ping, arping, traceroute…)
  • IProute
  • IPerf
  • TCPDump
  • NMAP
  • Python 2
  • Python 3
  • Paramiko (python ssh support)
  • Netmiko (python ssh support)
  • Ansible (automation)
  • Pyntc
  • NAPALM

and two new additions:

  • Netcat
  • Socat

I’ve added these two because some blog followers asked me, after reading the Ubuntu image for eve-ng – Python For Network Engineers post, if I can add to image servers installation like web, ftp, etc.

Personally, I don’t think is needed to burden the image with these extra packages. You can have tools like Netcat testing various servers. This is one of the reasons I’ve added Netcat and Socat.

It’s easy for me to add them to this image or future ones (and I’ll do it if I get more requests), however I’m planning some articles on how to do your own Docker images and add whatever packages you need.

While writing this post, time to push it to Docker Hub :)
Docker push

If you want to test the new Ubuntu 18.04 PFNE Docker image, please pull it from Docker Hub:

$ docker pull yotis/ubuntu1804-pfne

To start it use:

$ docker run -i -t yotis/ubuntu1804-pfne /bin/bash

Let me know if you find this useful, happy testing and most important Never Stop Learning!

Goodbye firstdigest.com, welcome ipnet.xyz

I hope you’ve noticed that when you access firstdigest.com (or a link associated with this domain) you’re getting redirected to ipnet.xyz. No, I didn’t got my blog hacked.
I’ve decided to go ahead with another domain for this blog.

If you ever had to curiosity to read the About page (yes, it needs badly an update), you’ve seen that around 2007 when I bought firstdigest.com, the plan was to use it for a kind of review blog (travel, food, social life, etc.) FirstDigest (inspired by Reader’s Digest name) seemed like a good name and to be fully honest was the only one free and for an acceptable fee.
Remember this was pre-ICANN-era generic top-level domains and there were limited readable names for .com or .net.

That project went well (please feel the sarcasm). It was mostly because I had no idea what to write about and no time to do proper research. Instead I was more drawn to write about topics that I deal with everyday in my network engineer life.

Since I had the domain already, I said why not to use it. The name never fit the topics on this blog, but I’ve made the best of it. And here is 10 years later, I’ve decided to change it. I was thinking for a long time to adapt the domain name to the topics of this blog, but I never took the time to do it.

The difference between 2007 and now is that you have other possibilities due to the surge of new TLDs. However, premium is premium and names like net [dot] something is still expensive. This is why I’ve came with IP Networks (ipnet). Looking at various TLDs, the .xyz seems to be good one. If it was good enough for Alphabet (aka. Google), it sure does the thing for me. Still, it may have some people raise an eyebrow wondering, holy cow what is .xyz, but at least the domain name match the topics of this blog.

Due to the large number of links pointing back to firstdigest.com and because most of my visitors are coming via a link posted somewhere, I’m keeping this domain alive with a 301 Permanent Redirect to ipnet.xyz. I think I’ll get some impact from search engines like Google, but eh, I’m not that keen on statistics. Is not like I’m making a living out of this blog :)

Now to get back and write something which is actually interesting for the tech savvy ones out there.

INE’s CCIE R&S v5 topology for EVE-NG using CSR1000v

In my previous blog post I’ve adapted the INE’s CCIE Routing and Switching topology to be used with EVE-NG using IOSv (or vIOS) L3 images for routers and L2 images for switches.

Following the promise in that blog post, I’ve adapted the same topology using Cisco CSR1000v images for routers and IOSv L2 images for switches. There isn’t much to say about this topology since mostly is matching the original INE’s one for routers (including the configuration files) and the major difference is the utilization of virtual switches.

Since we’re using virtual switches, the configuration files for switches are still different. I’ve adapted these ones to match the interface names of IOSv-L2:

Real Switches – vIOS-L2

Fa0/1  - Gi0/0 - SW1 only connection to bridge

Fa0/19 - Gi0/2
Fa0/20 - Gi0/3
Fa0/21 - Gi1/0
Fa0/22 - Gi1/1
Fa0/23 - Gi1/2
Fa0/24 - Gi1/3

For convenience, the switches topology looks like this in EVE-NG:

CCIE R&S V5 Switches

For routers, the interfaces stays the same since we’re using CSR1000v images.

Here is how the network topology looks like. No surprise here, just added for convenience:

INE CCIE R&S v5 topology with csr1000v

If you don’t have the CSR1000v image added to your installation of EVE-NG, please download it (Cisco.com is a good starting point) and add it following these How-to:
HowTo add Cisco Cloud Service Router (CSR1000V NG) Denali and Everest
or
HowTo add Cisco Cloud Service Router (CSR1000V) – for pre Denali or Everest images.

If you’re curious I’m using csr1000v-universalk9.03.12.03.S.154-2.S3-std image when testing for this lab.

Before sharing the download links, a word of advice.

It may be that CSR1000v images are more stable and support all features when compared with IOSv-L3, however this comes with a cost in term of resources. Each node has by default 3GB RAM assigned and I wouldn’t recommend trying to decrease it.

Once the nodes boot up, the actual used RAM will be less, but still you need more resources to use CSR1000v.
My recommendation for 10 routers using CSR1000v images would be at least 16GB RAM assigned to the EVE-NG virtual machine. The same if you’re using EVE-NG on a bare metal machine.

Last but not least, somebody asked me when I’m going to provide the same topology with 20 routers.
No need. You can extend the default topology with as many devices as you want. The modified configuration files for labs with 20 routers are already modified and present in the archive you download. Just add the missing R11 until R20 devices.

If you encounter errors that are critical, please let me know and I’ll try to correct them.

Download files:
EVE-NG-INE-CCIEv5-Topology-CSR1000v.zip
INE-CCIEv5-RS-Initial-Configuration-for-EVE-NG-CSR1000v.zip

Happy labbing and let me know if you find these materials useful!