Working with Cisco PSS APIs

As I work for a Cisco Partner at the moment I have been looking to get access to the Cisco PSS APIs specifically to get coverage status on a Cisco device serial number.

If you have a Cisco account you can access the Device Coverage Checker online and check up to 20 serial numbers at a time. I have used this extensively. The same information can also be viewed if you have access to Intersight.

But I am looking to integrate with our DCIM tool Netbox to allow for easy check of coverage via API calls. Those API calls are for us available via the PSS API call to the endpoint SN2INFOv2.

Now of course this requires some sort of authentication and Cisco has an intricate process for getting access which boils down to creating a TAC case an request access.

Once you have access you need to create an application and grant that application access to the SN2INFOv2 APIs with “Client Credential” privileges. This generates a Key and a Client Secret unique to the application which is needed to get access.

Now here’s the problem. The Cisco API Developer has great documentation on the SN2INFOv2 API and how to format the request – but those need a Token to be accessed. The token needs to be generated first which was not immediately clear how to do.

I deciphered that I needed to do a OAUTH2 login agains cloudsso.cisco.com but could not find the documentation on how to format the request. I searched around to figure out how and found reference to a different API that showed an example on how to do this.

Problem was it refenced a “Client ID” which I did not seem to have. So I guessed a bit and assumed that “Client ID” must be the “Key” I had as the login required “Client ID” and “Client Secret” and I had “Key” and “Client Secret”.

So formatted the GET request but got a 405 Method not allowed. Now I was a bit lost. But searching a bit more I fell upon a dodgy PHP developer forum which I will not link to. But here was an example of a cURL request that showed me an approach. The request looked like this:

 curl -s -k -H "Content-Type: application/x-www-form-urlencoded" -X POST -d "client_id=..." -d "client_secret=..." -d "grant_type=client_credentials" https://cloudsso.cisco.com/as/token.oauth2

Now there was still reference to “Client ID” but again assumed it to be the “Key” I had and would you know – the API returned me an access token.

This access token needs to be passed on requests to the SN2INFOv2 API as:

curl -X GET -s -k -H "Accept: application/json" -H "Authorization: Bearer <TOKEN>" https://api.cisco.com/sn2info/v2/coverage/status/serial_numbers/<SERIALNUMBER>

And there you go! Easy to setup in Postman or Golang or Python or what ever you prefer!

WSL2 issues – and how to fix some of them

I have been waiting in anticipation for WSL2 (Windows Subsystem for Linux) and on May 28th when the update released for general availability I updated immediately.

At first I was super hyped. WSL2 and the Ubuntu 20.04 image just worked and ran smoothly and quickly. Combined it with the release version of Windows Terminal it was a real delight.

I also went and grabbed Docker Desktop for Windows as it now has support for WSL2 as the underlying system. And joy it just installed and worked. Now being capable of running Docker containers directly from my shell without doing some of doing it the way I did before having a Ubuntu VM running in VMware Workstation and connecting to it via docker-machine on my WSL1 Ubuntu image. A hassle to get to work and not a very smooth operation.

Having the option to just start Docker containers is amazing!

But then I had to get some actual work done and booted up VMware Workstation to boot a VM. And it failed. With a Device Guard error. I followed the guides and attempted to disable Device Guard to no avail. Then it dawned on my. WSL2 probably enables the Hyper-V role! And that is exactly what happened.

Hyper-V and Workstation (or VirtualBox for that matter) do not mix well – that is until VMware released Workstation 15.5.5 to fix this exact problem just the day after WSL2 released. Perfect timing!

Simple fix – just update Workstation to 15.5.5 and reboot and WSL2 and Workstation now coexisted fine!

I played a bit more with WSL2 in the following days but ended up hitting some wierd issues where networking would stop working in the WSL2 image. No real fixes found. Many indicate DNS issues and stuff like that. Just Google “WSL2 DNS not working” and look at the mountains of issues.

But I suspected something else because DNS not working was just a symptom – routing out of the WSL2 image was not working. Pinging IPs outside the image did not work. Not even the gateway IP. And if the default gateway is not working of course DNS is not working.

I found that restarting fixed the issue so got past it that way but today it was back. I was very interested in figuring out what happened. And then I realized the potential problem and tested the fix. I was connected to my work network via Cisco AnyConnect. I tried disconnecting from VPN and testing connectivity in WSL again – now it works. Connected to VPN again and connectivity was gone.

Okay – source found – what’s the fix? I found this thread on Github that mentions issues with other VPN providers even when not connected. Looking through the comments I found a reference to a different issue of the same problem but regarding AnyConnect specifically.

I looked through the comments and many fixes around changing DNS IP and other things but the fix that seem to do the trick was running the following two lines of Powershell in an elevated shell after connecting to VPN

Get-NetIPInterface -InterfaceAlias "vEthernet (WSL)" | Set-NetIPInterface -InterfaceMetric 1
Get-NetAdapter | Where-Object {$_.InterfaceDescription -Match "Cisco AnyConnect"} | Set-NetIPInterface -InterfaceMetric 6000

Those two lines change the Interface Metric so that the WSL interface has a higher priority than the VPN connection. This inadvertently also fixed an issue that I had with local breakout when on VPN not working correctly.

Downside of the fix is that this needs to be run every time you connect to VPN. I implemented a simple Powershell function in my profile so I just have to open an elevated shell and type “Fix-WSLNet”.

That is all for now!

vRealize Orchestrator 8.1 (and others) announced!

I’m late to the party as usual but simply needed to write up a little quick post on this.

VMware announced a whole slew of new releases yesterday with the primary focus being on vSphere 7 and the new Kubernetes integrations that brings. I hope to get time to look more into Kubernetes on vSphere once that becomes available as this is an area I have much interest in learning more about.

But the biggest thing for me as of right now is the announcements for vRealize Orchestrator 8.1!

I have really wanted to like the new HTML 5 interface that came in 7.6 but it had issues! No lie there. And as I have not had the time to test it in 8.0 yet I hope that 8.1 will bring back some of the glory to vRO!

Among the features I will look forward to the most is the return of the “Tree-View” to show a hierarchical sorting and bundling of related workflows. The tag based approach used in 7.6 and 8.0 don’t really appeal to me. I like to be able to tag workflows but not being able to sort and organize them in any other way is not optimal.

But that said. The absolute biggest wish on my wishlist for vRO has come true! To quote the announcement:

Multiple Scripting Languages: PowerShell, Node.js,Python. Support for multiple scripting languages have been added: PowerShell, Node.js, and Python. This makes vRealize Orchestrator more accessible and easier to use for non-JavaScript users. “

Finally Powershell will be directly available in vRO not requiring a complicated setup using a Windows host and all of the double hop authentication issues that arise from this. And to get Python as well! It’s almost Christmas!

I can’t and won’t go over all the announcements yesterday – other bloggers out there are already doing this and I’d like to give some credit to those working hard on this. For that reason I will point you all to Eric Siebert’s list of links to articles and annoucements regarding vSphere 7 and related releases.

Take look at the list here: http://vsphere-land.com/news/vsphere-7-0-link-o-rama.html

vRealize Orchestrator VC plugin version

I keep forgetting this to be a problem so might as well write it down for myself and anyone else stumbling upon this.

When using vRO, in my case 7.5 or 7.6, you might get a problem where you are unable to add a vCenter instance of a vCenter version 6.7. The error is not very informative:

It doesn’t really scream out what the error is. But as I had seen the error before I had a hunch when my colleague was configuring vRO in our vRealize Automation platform.

On the vRO VMTN forum there is a post that contains the latest release of the vRO VC plugin – https://communities.vmware.com/docs/DOC-32872

Simply download the zip attached. Unpack the vmoapp file. Login to the vRO control-center on https://<FQDNorIP>:8283/vco-controlcenter/ and select “Manage Plug-ins”. Here under “Install plug-in” click browse and select the vmoapp file and upload. Accept EULA and install. After about 2 minutes the vRO will have restarted and the plugin updated.

vCenter instances can now be added 🙂

Updated udp_client.py for testing UDP heartbeats

A while back i stumbled on a set of KB’s for testing UDP heartbeat connectivity between ESXi and vCenter. I wrote this article to describe how to do it.

Now today I had to do the same and went back to these KB’s to find the script. This was however on newer 6.5 U2 hosts and not old 5.5 hosts. And as KB1029919 describes it is only applicable to 4.0.x to 5.5.x versions of ESXi.

Why is this important? Because between ESXi 5.5.x and 6.5 U2 the included Python was updated from 2 to 3. Some of you may know that there are many breaking changes in Python 3 when compared to Python 2 and some of those were present in the original udp_client.py script.

So I took the time to fix the few issues that the script had and upload a version to GitHub here. In the Python folder there is a version of udp_client.py that is Python 3 compatible and I included the original script as udp_client-v2.py for reference.

The major changes were in line 25 that print is now a function and has to be used with parentheses and the “%” change to “,” as seen here:

original:
print "\nSent %s UDP packets over port %s to %s" % (numtimes,port,host) 

python 3:
print("\nSent %s UDP packets over port %s to %s", (numtimes, port, host)) 

After syntax error was fixed I found that there was a change to how “socket.sendto” works and it now expects a bytearray instead of a string. Simple fix was to introduce a int variable “datasize” set to 100 and change the “data” variable from “100” to “bytearray(datasize)” as seen here:

original:
data = "100" 

python 3:
datasize = 100       
data = bytearray(datasize) 

After this the scipt works on a 6.5 U2 host and I was able to UDP connectivity.

This also marks the first time I have my own public Github repsitory so – yay! 🙂

System logging not Configured on host

A few weeks ago I noticed a warning on some of our hosts in our HyperFlex clusters and wondered what was going on. It was only hitting Compute Only nodes in the clusters.

The warning is indicating that the Syslog.global.logDir is not set as per KB2006834. But when I looked via ssh on the host it was logging data and the config option was set so it was working – so why the warning?

Well it turns out to be something not that complicated to fix. The admin who set up the nodes set the option to:

[] /vmfs/volumes/<UUID>/logs/hostname

That is giving it an absolute path on the host like you would do with the ScratchConfig.ConfiguredScratchLocation option. This works but triggers the warning as if it was not set.

The fix is simple. Simply change it to use the DatastoreName notation as this:

[DatastoreName] logs/hostname

This immediately removed the warning and everything continued as it had before.

Migrating VMs to an older ESXi version

Hi all, just a short post about a small task I had on my desk last week. Customer needed to migrate a 3 servers from current provider to one of our older platforms. Few issues to overcome. First off only had VPN access to the provider and access to the ESXi 6.0 web UI that was running the VMs. So how to export them without downtime? No way to do something like a Veeam replication as I only had the VPN connection. Had to do an export-import scenario. Clone them? Requires vCenter. Hmm.

So what did I do? Well I asked nicely and was allowed to deploy a temporary VCSA onto the host and add the host itself to the vCenter. This allowed me to clone the 3 VMs (after removing a metric f***ton of snapshots). Then I could export the cloned VMs as OVFs to a machine in our network. It was lucky that I could do this and did not need to do the entire operation in a service window. The last copy of the machines were not needed as it was more of a “configuration copy” than anything else. So while the customers systems were running we could move the cloned VMs.

Now came the tricky part that I did not foresee but quickly identified! I exported from a vCSA 6.7 U1 and an ESXi 6.0 host. This makes a SHA256 signed OVF. Trying to import this to a 5.5 vCenter fails as the 5.5 vCenter does not support SHA256. OVFtool has a nice feature where you can convert the OVF from SHA256 to SHA1 by making a new package with the following command:

ovftool --shaAlgorithm=SHA1 .\path\to\file.ovf .\path\to\destionation.ova

Simple! Converts the OVF to an OVA with SHA1 instead of SHA256. Import now works. Wait not it did not! The machines are VMX-11 which does not run on ESXi 5.5. What to do. Recommended approaches are to use VMware converter to convert the VM and downgrade the hardware version. I took a slightly more simple but probably also more unsported route.

VMX version is defined in the OVF som it was simple to open the OVF file and locate the “SystemType” parameter and change it from “vmx-11” to “vmx-10”. This however breaks the manifest files SHA256 hash of the OVF file. This is simple to fix aswell using “certutil” on Windows. Following command generates a SHA256 has for a file:

certutil -hashfile .\path\to\file.ovf sha256

Simple replace the SHA256 thumbprint in the manifest file with the one generated above. Rerun the SHA1 conversion above and import now works. My colleague who needed the VMs converted reported back later that day and confirmed machines were booting fine and running as they should so he could continue reconfiguration of the machines.

vExpert 2018

Hi All

Just a quick update today to announce that I was able to get accepted as a vExpert 2018 in the second half as announced by VMware here: https://blogs.vmware.com/vexpert/2018/08/03/vexpert-2018-second-half-award-announcement/

After a long break in 2017 when I was at home with my twins I did not manage to get out much content to the community but after coming back in full force it is nice to be allowed back in to the program for my 4th consecutive year! I am very humbled to be let into this group of highly talented people who share so much great information with the community and the world.

As I mentioned in previous posts my role switched a bit when I changed jobs so I have a lot more areas to cover and less focus on VMware. I try and get as much time doing stuff with vSphere and hopefully vRO/vRA again soon but for now my focus is mainly developing our cloud platform along with my colleagues.

That is it for now – back into the machine room!

vSphere HA virtual machine monitoring error on VM

Hi all

Today I found a couple of VMs in a cluster, that had HA with VM monitoring enabled, that were showing a “vSphere HA virtual machine monitoring error” with a couple of different dates.

Looking into the event log of the VM via vCenter I could see the following events about every 20 seconds:

This indicated that HA VM monitoring wanted to reset the VM but failed. I tried searching for answers on first Google but with no luck. I remembered that the setup had vRealize Log Insight installed and collecting data so perhaps Log Insight had more logs to look at.

Made a simple filter on the VM name and found repeating logs starting with this error from “fdm” which is the HA component on the ESXi host.

This error looks bad to me. Not being able to find the MoRef of a running VM? Hmm. I asked out on Twitter if anyone had seen this before with out much luck. Think about it over lunch I figured that maybe it was the host that was running the VM that had gotten into some silly state about the VM and not knowing it’s MoRef. So the quick fix I tried was to simply do a compute and storage migration of the VM to a different host and a different datastore to clean up any stale references to files or the VM world running. The event log of the VM immediately stopped spamming the HA message and returned to normal after migration.

I cannot say for sure what caused the issue and what the root cause is but that at least solved it.

Simple network debugging – ESXi and VCSA

I ran into an old issue yesterday with ESXi 5.5 and vCenter disconnecting after 60 seconds. The issues is described in KB1029919 and is due to heartbeat UDP packets from ESXi not reaching vCenter on port 902 within 60 seconds or at all.

Now how to test this! Luckily the KB has a simple Python script that allows you to send UDP packets. Just edit the script and insert the IP of your vCenter server and run it. It will default send 10 packets of 100 bytes.

Now the KB mentions using Wireshark as an option on the vCenter side but when using VCSA that is not really an option. Luckily VCSA 6.5 comes with tcpdump installed! 5.5 and 6.0 don’t but worry not, VMware to the rescue. You can install tcpdump on 5.5 and 6.0 by following KB2084896. This gives you access to a client on ESXi to send UDP packets and a monitoring option on VCSA to see if the packets arrive.

All that needs to be done then is run the following:

#on the VCSA
tcpdump -i eth0 udp port 902 -vv | grep <srcIP>

#on ESXi
python udp_client.py

And then look at the results. You can lose the “| grep <srcIP>” if you want to see all packets but depending on your setup there may be many ESXi hosts sending heartbeat.