Thursday, June 20, 2013

Ansible overview - implementing ntp.


Ansible == control all now to go to the pub sooner.  

Or break it all in an instant and cry lots.  It's all up to you :)


s/chef/ansible/g

Why Ansible and not something else?

Automating sysadmin work is nothing new, we've all been doing that for ages.  Automation tools are nothing new either, and there are plenty around to choose from.  After a failed attempt at puppet i tried ansible after hearing about it at #lca2013.  I liked ansible because it felt as comfortable to use as ssh and vi, but had the power of idempotence, parallel execution, platform independence, operating system agnosticism, it was simple for me to understand and so on.  It might not work for you :) but if you do try it, please let me know how you get on.

So, ansible is a tool which lets you perform actions on servers.  They can be single actions, or multiple actions based on a definition called a playbook. You write a playbook (don’t worry, it’s not hard) where you define what needs to be done and you execute it with ansible. There’s no server-side software needed (well, except for python) and there’s hardly any configuration. By managing your servers with ansible you not only make sure their configuration is always identical and you always have the right tools installed, and you also have your server configuration documented at all times.

How to install ansible

Ansible is a python app, and it can be installed using the python software manager, "pip"; have a look at http://www.ansibleworks.com/docs/gettingstarted.html.

Architecture and authentication

Ansible works by connecting to your nodes via SSH (or locally) and pushing out small programs, called “Ansible Modules” to them. These programs are written to be resource models of the desired state of the system. Ansible then executes these modules over SSH, and deletes them when finished. Your library of modules can reside on any machine, and there are no servers, daemons, or databases required. Typically you’ll work with your favorite terminal program, a text editor, and probably a version control system to keep track of changes to your content.

Bootstrapping a host for ansible.

As mentioned earlier, ansible isn't a client/server model, in that it has no server daemon nor clients.  It just runs over ssh from your machine to the remote host.  The remote host sometimes need remote hosts may need python-simplejson installed.  They also need to allow ssh via ssh certificates, such as via:

$ ssh-copy-id remotehost
$ ssh-copy-id root@remotehost
$ ssh root@remotehost apt-get install python-simplejson

You can also have ansible use sudo instead of root@ however that option is not explored in this doco.  And Yes, somebody worked out how you can bootstrap a host for ansible, using ansible ..

Hosts definition

For ansible to work, you need a hosts definition. The default host definition is located in /etc/ansible/hosts. You can use a custom hosts definition (located outside /etc, for example) by defining them elsewhere and passing the -i [host-file location] parameter to the ansible-playbook command when running a playbook. We’re just going to work with /etc/ansible/hosts for now.

Now let’s open up the empty hosts file and put the following contents in it:
[kvmhosts]
thylacine
feathertail
bettong
kaluta
#lycosa

What we’ve done here, is define a group named ‘kvmhosts’. A group can be used in a playbook to run the playbook against a number of hosts; you can't run ansible against a host if it's not mentioned in this file.  We’ve added some hosts to that group by their hostname, but you can use IP addresses as well; either way they need to be reachable and bootstrapped as discussed earlier. You can also define just one host here and you don’t even need a group. Multiple groups are OK too, of course.  Even groups of groups.  It’s all up to you and it all happens in this file.

Task execution with ./ansible

So lets say we want to make sure all our kvmhosts have the correct time.  How can we check the time on all the hosts at once? we can run ansible with the command module, call date and see what is returned;

$ ansible kvmhosts -a 'date'
bettong | success | rc=0 >>
Mon Jun 17 11:42:34 EST 2013

kvmtest | success | rc=0 >>
Mon Jun 17 10:46:02 EST 2013

feathertail | success | rc=0 >>
Mon Jun 17 11:46:02 EST 2013

thylacine | success | rc=0 >>
Mon Jun 17 10:44:02 EST 2013

kaluta | success | rc=0 >>
Mon Jun 17 11:46:11 EST 2013

Notice we didn't actually say "run the command module"?  if we don't specify a module name explicitly, then ansible assumes that's the one you want; but there are dozens of others.  If you don't specify a user to connect as, then ansible will assume you mean to connect as yourself.

Ansible modules

So there is the command module (like we used above) which just allows us to run 'date' against a group, but what if we want to do something more interesting?  enter ansible modules.  Don't worry though, you don't need to know how they work internally, just that they just do, and there's lots of them at http://www.ansibleworks.com/docs/modules.Sohtml.

How about we want to make sure the httpd is running on all our webservers?  easy;
$ ansible webservers -u root -m service -a "name=httpd state=running"

Or that htop is installed on all your debian machines;
$ ansible debianservers -u root -m apt -a "pkg:htop state=present"

Or create a symlink;
$ ansible debianservers -u root -m src=/file/to/link/to dest=/path/to/symlink owner=foo group=foo state=link

You can do any *single task* you like in this way.  There is no program control or audit or any real repeatability in this mode, but sometimes this is all you might need.

Writing your first playbook

Obviously however, doing anything significant takes more than one task   In a playbook you define what "task"ansible should take to get the server in the state you want. Only what you define gets done.  In ansible, all tasks are idempotent which means that they only change what need to be changed and otherwise do nothing.  So you can run a playbook against a host repeatedly, and after the first iteration, nothing will be changed.  Or add a host to a group, run the playbook against the group, and only the new host gets configured etc.

Playbooks are executed with ./ansible-playbook.  Playbooks are written in YAML which means they are fussy about indenting and syntax and their special markers.  it's a bit annoying.

Making NTP work 

Lets imagine the time on all our kvm hosts is wrong.  We want to use ansible to make sure they all have NTP installed and that it's configured sanely and running etc.  Doing this manually would take a good number of steps which has to be repeated on each host; when we have multiple tasks like this, that's when we write an ansible playbook. When we get a new host, or change the config somehow, we just run it again and voila! it's done.

So lets imagine that we need to complete the following in order to get NTP working on a debian host;

  1. Do an apt-get update.
  2. Install the ntp package
  3. Unless this is called timelord, configure NTP to point to timelord
  4. Configure the timezone files appropriately.
  5. Restart NTP
  6. Wait awhile and check to see it's working OK.

For any given host, if any of these steps fail then we'd want to stop configuring it to see any error message(s) etc.

ntp.yml

An ansible playbook to complete those steps looks largely like a script written in plain english.

If you have a look below, you can see that this playbook is able to be run against all hosts, and will connect to each of them as the root user.

##
# Demo Ansible playbook for installing and configuring NTP
#
---
- hosts: all
  name: install, configure and start NTP
  user: root
  tasks:
    # Ensure the ntp package is installed
    - name: Ensure the ntp package is installed      
      action: apt update_cache=yes pkg=ntp state=installed

    # Our local time server is called timelord.  all servers need to point to that.
    - name: EXCLUDING timelord, copy over our local ntp.conf
      when_string: $ansible_hostname != 'timelord' 
      action: copy src=files/etc-ntp.conf dest=/etc/ntp.conf mode=755
    # Update /etc/localtime to point to files/usr-share-zoneinfo-Australia-Hobart
    - name: Update timezone link foo
      action: copy src=files/usr-share-zoneinfo-Australia-Hobart dest=/etc/localtime mode=755
    # Update /etc/timezone to show Australia/Hobart
    - name: Update timezone link foo
      action: copy src=files/etc-timezone-hobart dest=/etc/timezone mode=755
    # Restart ntp to apply new ntp.conf
    - name: Restart ntp to apply new ntp.conf
      action: service name=ntp state=restarted

And it is executed simply running;
$ ansible-playbook ntp.yml --limit kvmhosts

If a task fails on a given host, then execution of the remaining tasks on that host would be stopped.  You would be able to see what happened and fix it etc.

And finally, check your work and make sure NTP is indeed up and running by using ansible in task execution mode like we did earlier;

$ ansible kvmhosts -m command -a 'date'
bettong | success | rc=0 >>
Mon Jun 17 14:22:20 EST 2013

kaluta | success | rc=0 >>
Mon Jun 17 14:22:20 EST 2013

kvmtest | success | rc=0 >>
Mon Jun 17 14:22:20 EST 2013

thylacine | success | rc=0 >>
Mon Jun 17 14:22:20 EST 2013

feathertail | success | rc=0 >>
Mon Jun 17 14:22:20 EST 2013

Facts and logic and all the rest of it

This document shows the basics of ansible, and doesn't explore some of the more interesting aspects, such as conditional processing (ie when_string: $ansible_hostname != 'timelord'), or how to manage playbook execution between different host groups (hosts: all:!depricatedmachines) and so on.

There is alot of documentation on the ansible website, so if you're tempted, then please do have a look.

Internet resources and thanks to;