Getting started with an automation project can be a daunting task, to say the least. As a consultant and architect at Red Hat, the question I get most often is, "How do I get started doing a network automation project?" In this post, I'll share five things you need to do to get started. 

I talk to all sorts of people about network automation. After spending years building out massive network automation projects with tens of thousands, and even hundreds of thousands, of devices, I think that device management is easier and more accessible than ever before.

Ansible has evolved quickly, and device configuration is a problem that’s been solved in a number of ways. Often, the biggest hurdle is just getting your head wrapped around the various options and ideas about how to do things one way or another.

And then, how do you even begin thinking about going from a small lab with a few or maybe a dozen devices to a real production network with tens, hundreds, or possibly thousands upon thousands of devices that you’re supposed to manage and automate?

Here’s my take on the easiest way for people to begin managing their network in a practical way. 

Just the Facts

Ansible’s network automation begins and ends with Ansible Facts -- simple variables/details about your inventory devices. "Facts" can be just about anything, from predefined command output to the full running-config, stored line-by-line. We can gather facts automatically, or create and set them ourselves.

Either way, Ansible Facts are the backbone of everything we build, from command orchestration and state management to performing backups and restores and creating or syncing a CMDB. And good news! There are numerous options for collecting network facts!

The first thing I do is use Ansible’s native configuration parsers (Network Resource Modules, more on those soon) to parse the raw device configs into a data model. For Cisco, Arista, and Junos, it’s this simple:

# at the play level:
gather_facts: true

# or, as a task:
- name: ios facts
  ios_facts:
    gather_subset: min  # compatible with Ansible <= 2.8
    gather_network_resources: all  # network resource modules (2.9+)

The result:

ansible_net_version = 14.22.0F
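
For context, here is a minimal sketch of how the fact-gathering task could sit inside a complete play. The inventory group name `ios` is hypothetical, and the sketch assumes the inventory already sets `ansible_network_os: ios` and credentials for those hosts.

```yaml
---
- name: gather facts from IOS devices
  hosts: ios                  # hypothetical inventory group
  connection: network_cli
  gather_facts: false         # facts are collected by the task below instead
  tasks:
    - name: ios facts
      ios_facts:
        gather_subset: min
        gather_network_resources: all   # requires Ansible 2.9+

    - name: show the parsed software version
      debug:
        var: ansible_net_version
```

Turning off play-level `gather_facts` here is a design choice, not a requirement. It just keeps fact collection in one explicit task so it is easy to time and tune later.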

And in Ansible versions before 2.9 (that is, 2.8 and earlier), I use a combination of the default fact modules and custom facts that I set from the output of my own ad-hoc commands:

- name: collect output from ios device
  ios_command:
    commands:
      - show version
  register: output

- name: set version fact
  set_fact:
    ansible_net_version: "{{ output.stdout[0] | regex_search('Version (\\S+)', '\\1') | first }}"

The result:

ansible_net_version = 14.22.0F

Facts can be cached as well. Options include storing encrypted facts in Red Hat Ansible Tower’s database, caching facts to disk or memory, or using an external caching service (e.g., memcached or redis). There are a lot of options -- find one that works best for you and your environment!

Start with state and configuration management

Building network automation playbooks has never been quicker or easier. Along with storing useful device info, facts are also used by Ansible to create a vendor-agnostic data model. Network Resource Modules then let you post that data back to Ansible to build that same device state. Config-to-code, and vice versa!

ansible_facts:
  ansible_net_fqdn: rtr2
  ansible_net_gather_subset:
  - interfaces
  ansible_net_hostname: rtr2
  ansible_net_serialnum: D01E1309…
  ansible_net_system: nxos
  ansible_net_model: 93180yc-ex
  ansible_net_version: 14.22.0F
  ansible_network_resources:
    interfaces:
    - name: Ethernet1/1
      enabled: true
      mode: trunk
    - name: Ethernet1/2
      enabled: false

Identify variables, define your desired device state, and the modules do the rest. These modules know how to run the behind-the-scenes commands that get you the desired configuration state. Using these, Ansible will determine which commands need to be sent and in which order, whether certain lines need to be removed first, etc. Resource Modules have the logic built in to know how config properties need to be orchestrated, and in which specific ways. 
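
As a rough illustration of posting that data model back, the task below is a sketch that feeds a desired interface state to the `nxos_interfaces` resource module (available in Ansible 2.9+). The interface names mirror the facts shown earlier; the description text is a made-up value for illustration only.

```yaml
- name: apply desired interface state
  nxos_interfaces:
    config:
      - name: Ethernet1/1
        description: uplink to core   # hypothetical value
        enabled: true
      - name: Ethernet1/2
        enabled: false
    state: merged   # add or update these settings, leave everything else as-is
```

With `state: merged`, the module only sends the commands needed to bring the listed attributes in line. Other states such as `replaced`, `overridden`, and `deleted` change how aggressively existing configuration is removed, so it is worth trying them in a lab first.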

Use your network facts to build or enhance your configuration management database (CMDB)

Establishing a CMDB is the prerequisite to establishing long-term stability in your infrastructure. And with all of these fancy new inventory details, we have everything we need to build a CMDB from Ansible Facts, with nothing else required. Gather facts and configs from everything on the network, and store them somewhere such as an ELK stack or Git repos.
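
One simple way to get facts into a Git-friendly form is to write each device's facts to its own file on the control node. This is only a sketch: the `facts/` directory under the playbook is a hypothetical location, and it assumes facts have already been gathered earlier in the play.

```yaml
- name: make sure the local output directory exists
  file:
    path: "{{ playbook_dir }}/facts"   # hypothetical location
    state: directory
  delegate_to: localhost
  run_once: true

- name: write each device's facts to its own file
  copy:
    content: "{{ ansible_facts | to_nice_yaml }}"
    dest: "{{ playbook_dir }}/facts/{{ inventory_hostname }}.yml"
  delegate_to: localhost
```

From there, committing the directory to a repository gives you a diffable history of what was running on each device at each run.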

Keep in mind that Tower itself is often not the best place to be doing heavy searching and log/job analysis. In general, we recommend you offload search and analytics to an external service. And at large scale — and certainly at high volume — facts and logging are the gateway to a big data project.

To get started quickly, NetBox is a lightweight CMDB that I use anytime I have the option. See this post on Ansible.com to learn more about it: Using NetBox for Ansible Source of Truth
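
If NetBox does become your source of truth, Ansible can read its inventory straight from it. The inventory file below is a sketch that assumes the `netbox.netbox` collection is installed; the URL and token are placeholders, and the grouping choices are just one possibility.

```yaml
# hypothetical inventory source file for the NetBox inventory plugin
plugin: netbox.netbox.nb_inventory
api_endpoint: https://netbox.example.com   # placeholder URL
token: "REPLACE_WITH_API_TOKEN"            # placeholder, keep real tokens out of Git
validate_certs: true
group_by:
  - device_roles
  - platforms
```

Pointing `ansible-inventory -i <that file> --list` at it is a quick way to check what hosts and groups come back before running any playbooks.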

Start thinking about scale and performance 

Now that we have all of these basic functions in place, it’s time to begin considering scale and performance testing. Simply noting Ansible CLI or Tower Job run times is the quickest place to start. Additionally, we can look at timing for individual tasks.

To aid us in the process, there are a number of Ansible callback plugins that will help with performance tuning. The first, profile_tasks, prints out a detailed breakdown of task execution times, sorted from longest to shortest, as well as a running timer during play execution. Speaking of, timer is another useful plugin that shows us total execution time:

[defaults]
callback_whitelist = profile_tasks, timer

Enable the profile_tasks and timer callback plugins in your ansible.cfg, run your playbook again, and you’ll see more output. For example, profiling fact collection tasks on a single Cisco inventory host:

ansible-playbook facts.yml

ansible_facts : collect output from ios device ------------ 1.94s
ansible_facts : include cisco-ios tasks ------------------- 0.50s
ansible_facts : set config_lines fact --------------------- 0.26s
ansible_facts : set version fact -------------------------- 0.07s
ansible_facts : set management interface name fact -------- 0.07s
ansible_facts : set model number -------------------------- 0.07s
ansible_facts : set config fact --------------------------- 0.07s

Automation will be unique to every organization, and it’s important to regularly track performance benchmarks as your roles evolve. Beyond the obvious benefit of being able to accurately estimate your automation run times, you can determine where improvements can be made while proactively monitoring for faulty code/logic that will inevitably slip through peer reviews.

Don't start from scratch, use the Ansible Universe

It takes time to build out to full scale in a large network, but this approach is a tried-and-true, practical way to begin your network automation adoption. The fact collection and logging that we’re doing through Ansible and Tower both lend themselves well to a quick implementation and gradual scale-up to running against networks of all sizes.

This framework — and the fundamental objective of knowing what’s running on your network at any given time — has been implemented with tremendous success in every network infrastructure project I’ve worked on. The day one results are immediate, and the foundation for your future automation can be built in the time it takes to do a proof of concept (POC).

For more information, view this webinar where we explore some of the key first steps and challenges in applying automation beyond infrastructure! You can also learn more by watching a recording on this topic from Ansible Automates Tokyo 2020 below. 


About the author

Landon Holley is a Consulting Architect and Ansible SME, and he has spent the past 15+ years working on physical infrastructure. Since joining Red Hat in 2015, Holley has spent the majority of his time building solutions and mentoring network automation users.
