How to tell Ansible to use GCP IAP tunneling

Hiring

We are Binx. We make every organization cloud-native.

At a current customer, we’re using GCP’s IAP tunnel feature to connect to the VMs.
When deploying fluentbit packages on existing servers for this customer, I decided it would save some time if I would make an Ansible playbook for this job.
As I only have access to them through IAP, I ran into a problem; Ansible does not have an option to use gcloud compute ssh as a connection type.

It turns out, you do have the option to override the actual ssh executable used by Ansible. With some help from this post, and some custom changes, I was able to run my playbook on the GCP servers.

Below you will find the configurations used.

ansible.cfg

contents:

[inventory]
enable_plugins = gcp_compute

[defaults]
inventory = misc/inventory.gcp.yml
interpreter_python = /usr/bin/python

[ssh_connection]
# Enabling pipelining reduces the number of SSH operations required
# to execute a module on the remote server.
# This can result in a significant performance improvement 
# when enabled.
pipelining = True
scp_if_ssh = False
ssh_executable = misc/gcp-ssh-wrapper.sh
ssh_args = None
# Tell ansible to use SCP for file transfers when connection is set to SSH
scp_if_ssh = True
scp_executable = misc/gcp-scp-wrapper.sh

First we tell Ansible that we want to use the gcp_compute plugin for our inventory.
Then we will point Ansible to our inventory configuration file from which the contents can be found below.

The ssh_connection configuration allows us to use gcloud compute ssh/scp commands for our remote connections.

misc/inventory.gcp.yml

contents:

plugin: gcp_compute
projects:
  - my-project
auth_kind: application
keyed_groups:
  - key: labels
    prefix: label
  - key: zone
    prefix: zone
  - key: (tags.items|list)
    prefix: tag
groups:
  gke          : "'gke' in name"
compose:
  # set the ansible_host variable to connect with the private IP address without changing the hostname
  ansible_host: name

This will enable automatic inventory of all the compute instances running in the my-project GCP project.
Groups will be automatically generated based on the given keyed_groups configuration and in addition I’ve added a gke group based on the VM’s name.
Setting the Ansible_host to the name will make sure our gcloud ssh command will work. Otherwise Ansible will pass you the instance IP address.

misc/gcp-ssh-wrapper.sh

contents:

#!/bin/bash
# This is a wrapper script allowing to use GCP's IAP SSH option to connect
# to our servers.

# Ansible passes a large number of SSH parameters along with the hostname as the
# second to last argument and the command as the last. We will pop the last two
# arguments off of the list and then pass all of the other SSH flags through
# without modification:
host="${@: -2: 1}"
cmd="${@: -1: 1}"

# Unfortunately ansible has hardcoded ssh options, so we need to filter these out
# It's an ugly hack, but for now we'll only accept the options starting with '--'
declare -a opts
for ssh_arg in "${@: 1: $# -3}" ; do
        if [[ "${ssh_arg}" == --* ]] ; then
                opts+="${ssh_arg} "
        fi
done

exec gcloud compute ssh $opts "${host}" -- -C "${cmd}"

Ansible will call this script for all remote commands when the connection is set to ssh

misc/gcp-scp-wrapper.sh

contents:

#!/bin/bash
# This is a wrapper script allowing to use GCP's IAP option to connect
# to our servers.

# Ansible passes a large number of SSH parameters along with the hostname as the
# second to last argument and the command as the last. We will pop the last two
# arguments off of the list and then pass all of the other SSH flags through
# without modification:
host="${@: -2: 1}"
cmd="${@: -1: 1}"

# Unfortunately ansible has hardcoded scp options, so we need to filter these out
# It's an ugly hack, but for now we'll only accept the options starting with '--'
declare -a opts
for scp_arg in "${@: 1: $# -3}" ; do
        if [[ "${scp_arg}" == --* ]] ; then
                opts+="${scp_arg} "
        fi
done

# Remove [] around our host, as gcloud scp doesn't understand this syntax
cmd=`echo "${cmd}" | tr -d []`

exec gcloud compute scp $opts "${host}" "${cmd}"

This script will be called by Ansible for all the copy tasks when the connection is set to ssh

group_vars/all.yml

contents:

---
ansible_ssh_args: --tunnel-through-iap --zone={{ zone }} --no-user-output-enabled --quiet
ansible_scp_extra_args: --tunnel-through-iap --zone={{ zone }} --quiet

Passing the ssh and scp args through the group-vars makes it possible for us to set the zone to the VM’s zone already known through Ansible’s inventory.
Without specifying the zone, gcloud will throw the following error:

ERROR: (gcloud.compute.ssh) Underspecified resource [streaming-portal]. Specify the [--zone] flag.

Hopefully this post will help anyone that ran into the same issue!

Share this article: Tweet this post / Post on LinkedIn