<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0"><channel><title><![CDATA[Tim Birkett ]]></title><description><![CDATA[cat /dev/urandom]]></description><link>http://www.pysysops.com</link><image><url>/images/index.jpg</url><title>Tim Birkett </title><link>http://www.pysysops.com</link></image><generator>RSS for Node</generator><lastBuildDate>Tue, 15 Oct 2019 20:41:54 GMT</lastBuildDate><atom:link href="http://www.pysysops.com/rss/" rel="self" type="application/rss+xml"/><ttl>60</ttl><item><title><![CDATA[Taming Terraform with Modules]]></title><description><![CDATA[<div id="preamble">
<div class="sectionbody">
<div class="paragraph">
<p>Some of us know what Terraform is. It&#8217;s an Open Source tool created and maintained by HashiCorp to help us specify and manage our Infrastructure as Code across multiple Infrastructure as a Service (IaaS) and Cloud providers.</p>
</div>
<div class="paragraph">
<p>When you start learning Terraform you make use of its primitives: <code>data</code> sources and <code>resource</code> statements. If you&#8217;re trying it out and simply creating a Ghost publication in <em>The Cloud</em>, these primitives do the job. It&#8217;s a great way to tie together your understanding of the building blocks to make something on whatever Cloud provider you&#8217;re using. Typically, a VPC, a few Subnets, a Routing Table, a NAT Gateway, an ELB, a handful of Security Groups, an Instance and a Database will have you up and running with a shiny new Ghost publication. A bit of an expensive Ghost publication.</p>
</div>
<div class="paragraph">
<p>Unfortunately, most of us aren&#8217;t doing simple, one-off things in our day jobs. We&#8217;re building complex and often evolutionary infrastructures across multiple projects, products, teams, accounts, locations and cloud providers. Handcrafting all of that Terraform code from the primitives offered would be a nightmare from the beginning. Maintaining consistency across <em>all the things</em> would be pretty much impossible. I&#8217;ve been there and I ran away. Fast! This is where Terraform <strong>modules</strong> help us out.</p>
</div>
<div class="paragraph">
<p>Terraform has some good documentation on writing basic modules:</p>
</div>
<div class="ulist">
<ul>
<li>
<p><a href="https://www.terraform.io/docs/modules/index.html" class="bare">https://www.terraform.io/docs/modules/index.html</a></p>
</li>
</ul>
</div>
<div class="paragraph">
<p>Modules allow us to create reusable collections of resources that are often used together, and expose them as a single <code>module</code> block in Terraform.</p>
</div>
<div class="paragraph">
<p>As an example: EC2 instances will usually belong to an Autoscaling Group, have a Launch Configuration and be attached to some Security Groups. By bundling these into a module you can reuse the same code by passing different parameters into the module and you get a similar result each time. Before we break out the <em>module shotgun</em> it&#8217;s worth understanding that there are multiple types of module.</p>
</div>
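<div class="paragraph">
<p>As a rough sketch, calling such a module might look like this (the module name, source path and inputs here are all hypothetical):</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="highlight"><code># Hypothetical module call: same code, different parameters each time
module "web_asg" {
  source = "./modules/asg" # hypothetical local module bundling ASG + LC + SG

  name          = "web"
  instance_type = "t3.micro"
  min_size      = 2
  max_size      = 6
}</code></pre>
</div>
</div>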
</div>
</div>
<div class="sect1">
<h2 id="_component_modules">Component Modules</h2>
<div class="sectionbody">
<div class="paragraph">
<p>A component module has a focussed job: it interacts with a single cloud provider and uses only the low-level Terraform resources. It might group together an ASG, Launch Configuration and Security Group, or a VPC, Subnets, Route Tables and Internet / NAT Gateways.</p>
</div>
<div class="paragraph">
<p>Component modules are typically the type of module you&#8217;ll find on the Terraform Registry. Some good examples are:</p>
</div>
<div class="ulist">
<ul>
<li>
<p><a href="https://registry.terraform.io/modules/terraform-aws-modules/vpc/aws" class="bare">https://registry.terraform.io/modules/terraform-aws-modules/vpc/aws</a></p>
</li>
<li>
<p><a href="https://registry.terraform.io/modules/terraform-aws-modules/autoscaling/aws" class="bare">https://registry.terraform.io/modules/terraform-aws-modules/autoscaling/aws</a></p>
</li>
<li>
<p><a href="https://registry.terraform.io/modules/terraform-aws-modules/security-group/aws" class="bare">https://registry.terraform.io/modules/terraform-aws-modules/security-group/aws</a></p>
</li>
<li>
<p><a href="https://registry.terraform.io/modules/terraform-aws-modules/alb/aws" class="bare">https://registry.terraform.io/modules/terraform-aws-modules/alb/aws</a></p>
</li>
</ul>
</div>
</div>
</div>
<div class="sect1">
<h2 id="_infrastructure_or_service_modules">Infrastructure or Service Modules</h2>
<div class="sectionbody">
<div class="paragraph">
<p>So, you&#8217;ve started using upstream component modules and you start to realise that you&#8217;re doing the same things, adding the same tags, using <code>autoscaling</code>, <code>security-group</code> and <code>alb</code> together to create a service and inputting a lot of default boilerplate values.</p>
</div>
<div class="paragraph">
<p>This is where Infrastructure modules take over. You can use Terraform modules within other modules. You can create modules that enforce best practice or certain requirements and create things in a consistent way across your infrastructures. They sometimes make use of multiple providers. Some examples:</p>
</div>
<div class="ulist">
<ul>
<li>
<p><a href="https://github.com/dinocorp/webs-infra/tree/master/terraform/local-modules/asg-elb" class="bare">https://github.com/dinocorp/webs-infra/tree/master/terraform/local-modules/asg-elb</a></p>
</li>
<li>
<p><a href="https://github.com/dinocorp/webs-infra/tree/master/terraform/local-modules/mysql" class="bare">https://github.com/dinocorp/webs-infra/tree/master/terraform/local-modules/mysql</a></p>
</li>
</ul>
</div>
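<div class="paragraph">
<p>A hedged sketch of what the inside of such a module might look like — composing registry component modules and enforcing common tags (the variable names are illustrative, and the exact input names vary between registry modules and versions):</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="highlight"><code># Illustrative infrastructure module: one "service" built from component modules
locals {
  common_tags = {
    Team        = var.team
    Environment = var.environment
    ManagedBy   = "terraform"
  }
}

module "asg" {
  source = "terraform-aws-modules/autoscaling/aws"

  name = var.service_name
  # ...other inputs, with sensible defaults baked in...
}

module "alb" {
  source = "terraform-aws-modules/alb/aws"

  name = var.service_name
  tags = local.common_tags
  # ...
}</code></pre>
</div>
</div>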
</div>
</div>
<div class="sect1">
<h2 id="_data_modules">Data Modules</h2>
<div class="sectionbody">
<div class="paragraph">
<p>Modules that do nothing more than give static output for use in other modules. They might be used to enforce available instance types, provide some data like an instance price to use for spot prices or normalise instance sizes across cloud providers. An example of a data module might be something like:</p>
</div>
<div class="ulist">
<ul>
<li>
<p><a href="https://github.com/dinocorp/webs-infra/tree/master/terraform/local-modules/ec2_spot_prices" class="bare">https://github.com/dinocorp/webs-infra/tree/master/terraform/local-modules/ec2_spot_prices</a></p>
</li>
<li>
<p>Implemented here: <a href="https://github.com/dinocorp/webs-infra/blob/master/terraform/local-modules/asg-elb/main.tf#L16" class="bare">https://github.com/dinocorp/webs-infra/blob/master/terraform/local-modules/asg-elb/main.tf#L16</a></p>
</li>
</ul>
</div>
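<div class="paragraph">
<p>The shape of a data module is simple: no resources at all, just inputs mapped to outputs. A minimal sketch (the prices here are made up):</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="highlight"><code># Illustrative data module: map an instance type to a maximum spot price
variable "instance_type" {
  type = string
}

locals {
  spot_prices = {
    "t3.micro" = "0.0047" # made-up values for illustration
    "m5.large" = "0.0430"
  }
}

output "spot_price" {
  value = local.spot_prices[var.instance_type]
}</code></pre>
</div>
</div>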
<div class="paragraph">
<p>Hopefully some of this makes sense and can help you evolve your Terraform code into something more beautiful than it was previously.</p>
</div>
<div class="paragraph">
<p>Something I haven&#8217;t touched on here is managing Terraform module versions and dependencies. I might braindump a post on this soon. Thanks for reading!</p>
</div>
</div>
</div>]]></description><link>http://www.pysysops.com/2019/10/03/Taming-Terraform-with-Modules.html</link><guid isPermaLink="true">http://www.pysysops.com/2019/10/03/Taming-Terraform-with-Modules.html</guid><category><![CDATA[Terraform]]></category><category><![CDATA[Terrafile]]></category><category><![CDATA[xterrafile]]></category><category><![CDATA[Infrastructure as Code]]></category><dc:creator><![CDATA[Tim Birkett]]></dc:creator><pubDate>Thu, 03 Oct 2019 00:00:00 GMT</pubDate></item><item><title><![CDATA[Running Tasks Based on Public Holidays]]></title><description><![CDATA[<div id="preamble">
<div class="sectionbody">
<div class="paragraph">
<p>Over the past few years I&#8217;ve worked for various financial organisations in different areas. Something that has come up quite often is public holidays, or bank holidays in the UK. They&#8217;re a non-working day for most people but I&#8217;ve still encountered organisations that need certain things, like cron jobs, to run or not to run on these days.</p>
</div>
<div class="paragraph">
<p>Recently the scheduled task in question was a file transfer to a payment provider which required manual acceptance from someone in the business. I noticed a calendar invite pop-up for me to "disable epayment file transfer" before every bank holiday and another calendar invite to "re-enable epayment file transfer" before the next file transfer at 6am. HELL NO! I hate repetitive manual things with a passion, even if it&#8217;s only 8 days a year.</p>
</div>
<div class="paragraph">
<p>I began my search for a cli tool to help me. Failing to find such a tool I began a hunt for a Python package to help me out as the scheduled task was written in Python. I found a few packages but some required a call to external web services. Then I found the <code>holidays</code> package: <a href="https://pypi.org/project/holidays/" class="bare">https://pypi.org/project/holidays/</a></p>
</div>
<div class="paragraph">
<p>As I started to write the code to handle public holidays&#8230;&#8203; <code>import holidays</code>&#8230;&#8203; I stopped myself. I had originally been searching for a cli tool to run as part of the systemd timer so I could run something like this:</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="highlight"><code>is_it_a_holiday_command || /opt/scripts/epayment_transfer</code></pre>
</div>
</div>
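<div class="paragraph">
<p>In systemd terms, that condition can live in the service unit that the timer triggers — a hedged sketch (the unit name and schedule are made up, and <code>is_it_a_holiday_command</code> is the placeholder from above):</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="highlight"><code># /etc/systemd/system/epayment-transfer.service (illustrative)
[Service]
Type=oneshot
ExecStart=/bin/sh -c 'is_it_a_holiday_command || /opt/scripts/epayment_transfer'

# /etc/systemd/system/epayment-transfer.timer (illustrative)
[Timer]
OnCalendar=*-*-* 06:00:00
Persistent=true

[Install]
WantedBy=timers.target</code></pre>
</div>
</div>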
<div class="paragraph">
<p>So, rather than building the logic into the transfer script I wrote a cli tool to do it.</p>
</div>
<div class="paragraph">
<p>The tool which I&#8217;ve named <code>publicholiday</code> is on Github: <a href="https://github.com/timbirk/python-publicholiday" class="bare">https://github.com/timbirk/python-publicholiday</a> and has been published to PyPi: <a href="https://pypi.org/project/publicholiday/" class="bare">https://pypi.org/project/publicholiday/</a></p>
</div>
</div>
</div>
<div class="sect1">
<h2 id="_installation">Installation</h2>
<div class="sectionbody">
<div class="paragraph">
<p>Installation is as simple as: <code>pip install publicholiday</code></p>
</div>
</div>
</div>
<div class="sect1">
<h2 id="_usage">Usage</h2>
<div class="sectionbody">
<div class="paragraph">
<p>Using <code>publicholiday</code> is easy: the command will exit with a status code of 0 if today is a public holiday and a status code of 1 if it is not:</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="highlight"><code>$ publicholiday --help
Usage: publicholiday [OPTIONS]

  Is it a public holiday?

Options:
  -c, --country TEXT  Supported country name or code.
  --help              Show this message and exit.

# Run a script on a public holiday
$ publicholiday &amp;&amp; /thing/to/run.sh

# Don't run a script on public holidays (run it on all other days).
$ publicholiday || /thing/to/run.sh</code></pre>
</div>
</div>
<div class="paragraph">
<p>By default, <code>publicholiday</code> uses UK bank holidays. It is possible to pass in a country using either the name or the short code, as defined by the <code>holidays</code> package: <a href="https://pypi.org/project/holidays/" class="bare">https://pypi.org/project/holidays/</a>. Examples:</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="highlight"><code># Run a script on an Argentinian public holiday
$ publicholiday -c Argentina &amp;&amp; /thing/to/run.sh

# Don't run a script on US public holidays
$ publicholiday -c US || /thing/to/run.sh</code></pre>
</div>
</div>
<div class="paragraph">
<p>Currently, only countries are supported. Province / state level holidays would be reasonably easy to implement and I&#8217;d be open to a PR with tests and documentation for that but it&#8217;s outside of the scope of my current needs.</p>
</div>
<div class="paragraph">
<p>Hopefully if you stumble upon this post it helps solve your problem.</p>
</div>
</div>
</div>]]></description><link>http://www.pysysops.com/2018/07/23/Running-Tasks-Based-on-Public-Holidays.html</link><guid isPermaLink="true">http://www.pysysops.com/2018/07/23/Running-Tasks-Based-on-Public-Holidays.html</guid><category><![CDATA[Python]]></category><category><![CDATA[Bank Holidays]]></category><category><![CDATA[Automation]]></category><category><![CDATA[Public Holidays]]></category><dc:creator><![CDATA[Tim Birkett]]></dc:creator><pubDate>Mon, 23 Jul 2018 00:00:00 GMT</pubDate></item><item><title><![CDATA[Securing Jenkins Workspaces]]></title><description><![CDATA[<div id="preamble">
<div class="sectionbody">
<div class="paragraph">
<p>Jenkins is a powerful tool for building, packaging and deploying your software. It can stand in as an ad-hoc task runner or an orchestrator, and replace your cron jobs to give some visibility into what tasks have run, when, and by whom or what.</p>
</div>
<div class="paragraph">
<p>I&#8217;ve been using Jenkins for most of my tech life and think that in terms of flexibility, automation and power it&#8217;s one of the best tools to have in your box. There is one problem I have with Jenkins though: by default it isn&#8217;t very secure.</p>
</div>
<div class="paragraph">
<p>Even after following <a href="https://jenkins.io/doc/book/system-administration/security/">Securing Jenkins</a>, it has, amongst other things, one feature which can be a security concern: workspace access through the web UI.</p>
</div>
<div class="paragraph">
<p>By default you have 3 options to control access to job workspaces:</p>
</div>
<div class="olist arabic">
<ol class="arabic">
<li>
<p>they are readable (anonymously or by a group / specified users)</p>
</li>
<li>
<p>they are not readable</p>
</li>
<li>
<p>they are deleted / cleaned up after a build</p>
</li>
</ol>
</div>
<div class="paragraph">
<p>I generally stick to options 2 and 3: remove read access and clean the job workspace post build. This keeps build agents&#8217; disk space happy and reduces the chance of leaking secrets or sensitive information through logs or files in the workspace.</p>
</div>
</div>
</div>
<div class="sect1">
<h2 id="_overview">Overview</h2>
<div class="sectionbody">
<div class="paragraph">
<p>I&#8217;ve set up automated internal CAs to handle issuing certificates and re-generating them on expiry. In recent years, I&#8217;ve moved to using LetsEncrypt wherever possible.</p>
</div>
<div class="paragraph">
<p>I&#8217;ve found that in cloud environments that often scale up and down, running an agent like certbot on each instance is an anti-pattern.</p>
</div>
<div class="paragraph">
<p>Each time an instance is provisioned it requests some certificates, and LetsEncrypt will issue a new certificate to each instance. If you&#8217;ve ever tried the same thing, you&#8217;ll find that you can hit LetsEncrypt rate limits pretty quickly.</p>
</div>
<div class="paragraph">
<p>To solve this problem I created a Jenkins job to handle the certificate generation with the Certbot Docker container (<code>lego</code> is a good alternative), build an rpm package containing the certificates and push the rpm up to our private yum repos. From there all instances can have the <code>cp-certs</code> package installed as part of their bootstrapping / initial provisioning.</p>
</div>
</div>
</div>
<div class="sect1">
<h2 id="_the_problem">The Problem</h2>
<div class="sectionbody">
<div class="paragraph">
<p>This job runs on a schedule every 7 days. On the first run of certbot, various configuration files and private keys are generated. Whilst Jenkins itself is relatively secure I don&#8217;t like the idea of these files sitting there in the workspace for anyone to get hold of, try to use, accidentally commit, use to decode https traffic&#8230;&#8203;</p>
</div>
</div>
</div>
<div class="sect1">
<h2 id="_the_solution">The Solution</h2>
<div class="sectionbody">
<div class="paragraph">
<p>Initially I hunted round for a while for some sort of "Secure" or "Hidden" workspace plugin. There&#8217;s nothing. I sat and thought: "What is the actual risk I&#8217;m trying to reduce?" and came up with:</p>
</div>
<div class="olist arabic">
<ol class="arabic">
<li>
<p>access via the Jenkins UI</p>
</li>
<li>
<p>user access on the file system</p>
</li>
</ol>
</div>
<div class="paragraph">
<p>It dawned on me that the solution might be easier than I thought (damn brain getting in the way).</p>
</div>
<div class="paragraph">
<p>If I move the relevant sensitive files out of the workspace, there&#8217;s no longer a risk of someone borrowing them from the UI and if I encrypt / archive the files with a strong secret then they&#8217;ll be reasonably secure on the file system.</p>
</div>
<div class="paragraph">
<p>Obviously, someone who knows or can access the secret (me) and has access to the file system as the root or Jenkins user (me) could get at them but it&#8217;s a reasonable attempt at improving security.</p>
</div>
<div class="sect2">
<h3 id="_how_do_you_do_it">How do you do it?</h3>
<div class="paragraph">
<p>⚠️ You&#8217;ll need GPG installed on your Jenkins CI agents.</p>
</div>
<div class="paragraph">
<p>First, create a strong secret in the Jenkins credentials store. In the job you want to secure, add a credentials binding to expose the strong secret as an environment variable: <code>SECURE_KEY</code> in this example.</p>
</div>
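<div class="paragraph">
<p>If you&#8217;re using Pipeline jobs rather than freestyle jobs, the Credentials Binding plugin gives you the same thing — a hedged sketch, where the credential ID and script name are hypothetical:</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="highlight"><code>// Illustrative Pipeline step; 'secure-workspace-key' is a hypothetical credential ID
withCredentials([string(credentialsId: 'secure-workspace-key', variable: 'SECURE_KEY')]) {
    sh './restore_secured_workspace.sh' // hypothetical script wrapping the steps below
}</code></pre>
</div>
</div>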
<div class="paragraph">
<p>Add something like the following to your job as the first shell step:</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="highlight"><code>#!/bin/bash
set -eu -o pipefail

#
# Check for encrypted workspace archive and extract or create directory.
#

SECURE_DIR="/var/lib/jenkins/secured-workspace/$JOB_NAME"

if [[ -f "${SECURE_DIR}/secured_state.tgz.gpg" ]]
then
    echo 'INFO: Restoring secured workspace'
    gpg --yes --batch --passphrase="${SECURE_KEY}" \
        "${SECURE_DIR}/secured_state.tgz.gpg"

    tar -xzf "${SECURE_DIR}/secured_state.tgz"
else
    echo 'INFO: Creating directory for secured workspace'
    mkdir -m 700 -p "${SECURE_DIR}"
fi
exit 0</code></pre>
</div>
</div>
<div class="paragraph">
<p>As the last shell step or a postbuild script something like the following should work:</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="highlight"><code>#!/bin/bash
# Avoid -x here: xtrace would print ${SECURE_KEY} into the build log
set -eu -o pipefail

#
# GPG encrypt workspace and remove all insecure files.
#

SECURE_DIR="/var/lib/jenkins/secured-workspace/$JOB_NAME"

echo 'INFO: Creating secured workspace'
tar -czf "${SECURE_DIR}/secured_state.tgz" .
gpg --yes --batch --passphrase="${SECURE_KEY}" -c "${SECURE_DIR}/secured_state.tgz"
rm -f "${SECURE_DIR}/secured_state.tgz"
# Delete workspace contents including dotfiles ('${WORKSPACE}/.*' would match '..')
find "${WORKSPACE}" -mindepth 1 -delete</code></pre>
</div>
</div>
<div class="paragraph">
<p>At this point you&#8217;ll have a workspace with only safe files in it. You can always apply a workspace cleanup after the archive / encryption has run to get rid of all files.</p>
</div>
<div class="paragraph">
<p>Hopefully you find this helpful if you need to secure your workspace contents. Thanks for reading!</p>
</div>
</div>
</div>
</div>]]></description><link>http://www.pysysops.com/2018/06/09/Securing-Jenkins-Workspaces.html</link><guid isPermaLink="true">http://www.pysysops.com/2018/06/09/Securing-Jenkins-Workspaces.html</guid><category><![CDATA[Jenkins]]></category><category><![CDATA[CI]]></category><category><![CDATA[Security]]></category><category><![CDATA[Secrets]]></category><dc:creator><![CDATA[Tim Birkett]]></dc:creator><pubDate>Sat, 09 Jun 2018 00:00:00 GMT</pubDate></item><item><title><![CDATA[Testing DNS Infrastructure with Goss]]></title><description><![CDATA[<div id="preamble">
<div class="sectionbody">
<div class="paragraph">
<p>In a previous post we got an introduction to <a href="http://www.pysysops.com/2017/01/10/Easy-Infrastructure-Testing-with-Goss.html">"Easy Infrastructure Testing with Goss"</a>.</p>
</div>
<div class="paragraph">
<p>In this post we&#8217;ll take a look at a feature I added to <a href="http://goss.rocks">Goss</a> a while ago: enhanced DNS validation.</p>
</div>
</div>
</div>
<div class="sect1">
<h2 id="_why_test_dns">Why test DNS?</h2>
<div class="sectionbody">
<div class="paragraph">
<p>DNS is easy right?! It&#8217;s just an IP address and a hostname. Easy&#8230;&#8203; We&#8217;ve definitely never had an outage or failed to deploy a new application because of a DNS issue have we?</p>
</div>
<div class="paragraph">
<p>DNS can get a little more interesting when you start chaining CNAMEs, have multiple A records for a hostname and introduce DNSSEC.</p>
</div>
<div class="paragraph">
<p>PTR records which reverse map an IP to a hostname are often used by various server applications for security purposes <a href="https://community.oracle.com/message/6415013">(Java + SSL)</a>.</p>
</div>
<div class="paragraph">
<p>If DNS configuration is out of your control and another team forgets to add the records you need correctly you can end up wasting hours troubleshooting why various applications won&#8217;t start up, clients fail to connect and you have SSL connection errors.</p>
</div>
<div class="paragraph">
<p><strong>Testing your DNS with Goss will solve ALL these problems!</strong> Okay, that&#8217;s a lie. It can however help you identify when DNS records aren&#8217;t quite right, have changed, or are missing before deploying a new application.</p>
</div>
</div>
</div>
<div class="sect1">
<h2 id="_what_can_goss_test">What can Goss test?</h2>
<div class="sectionbody">
<div class="paragraph">
<p>Goss can validate that any of the following record types are resolvable, and can validate the values of the records.</p>
</div>
<div class="ulist">
<ul>
<li>
<p>A</p>
</li>
<li>
<p>AAAA</p>
</li>
<li>
<p>CAA</p>
</li>
<li>
<p>CNAME</p>
</li>
<li>
<p>MX</p>
</li>
<li>
<p>NS</p>
</li>
<li>
<p>PTR</p>
</li>
<li>
<p>SRV</p>
</li>
<li>
<p>TXT</p>
</li>
</ul>
</div>
</div>
</div>
<div class="sect1">
<h2 id="_how_do_i_test_dns_records">How do I test DNS records?</h2>
<div class="sectionbody">
<div class="paragraph">
<p>Here are a few examples of DNS record tests:</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="highlight"><code>dns:
  # Validate a CAA record
  CAA:dnstest.io:
    resolvable: true
    addrs:
    - 0 issue comodoca.com
    - 0 issue letsencrypt.org
    - 0 issuewild ;
    timeout: 2000
    server: 8.8.8.8

  # Validate a CNAME record
  CNAME:dnstest.github.io:
    resolvable: true
    server: 8.8.8.8
    addrs:
    - "github.map.fastly.net."

  # Validate a PTR record
  PTR:8.8.8.8:
    resolvable: true
    server: 8.8.8.8
    addrs:
    - "google-public-dns-a.google.com."

  # Validate an SRV record
  SRV:_https._tcp.dnstest.io:
    resolvable: true
    server: 8.8.8.8
    addrs:
    - "0 5 443 a.dnstest.io."
    - "10 10 443 b.dnstest.io."

  # Validate an MX record
  MX:dnstest.io:
    resolvable: true
    addrs:
    - 10 b.dnstest.io.
    - 5 a.dnstest.io.
    timeout: 2000
    server: 8.8.8.8</code></pre>
</div>
</div>
<div class="paragraph">
<p>The above examples will query Google&#8217;s public DNS server: <code>8.8.8.8</code> for results. You can remove the <code>server</code> parameter which will result in the system DNS resolver being used.</p>
</div>
<div class="paragraph">
<p>Combining this with the <a href="https://github.com/aelsabbahy/goss/blob/master/docs/manual.md#example-4">nagios</a> output and creating a monitoring check from it could be helpful in identifying future issues or alerting when a record might have been "cleaned up".</p>
</div>
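<div class="paragraph">
<p>For example, a Nagios / Sensu check command could be as simple as the following (assuming your DNS tests live in a gossfile called <code>goss-dns.yaml</code>):</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="highlight"><code>$ goss -g goss-dns.yaml validate --format nagios_verbose</code></pre>
</div>
</div>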
</div>
</div>]]></description><link>http://www.pysysops.com/2017/02/20/Testing-DNS-Infrastructure-with-Goss.html</link><guid isPermaLink="true">http://www.pysysops.com/2017/02/20/Testing-DNS-Infrastructure-with-Goss.html</guid><category><![CDATA[goss]]></category><category><![CDATA[DNS]]></category><category><![CDATA[Testing]]></category><category><![CDATA[DevOps]]></category><category><![CDATA[Linux]]></category><category><![CDATA[Monitoring]]></category><dc:creator><![CDATA[Tim Birkett]]></dc:creator><pubDate>Mon, 20 Feb 2017 00:00:00 GMT</pubDate></item><item><title><![CDATA[Easy Infrastructure Testing with Goss]]></title><description><![CDATA[<div id="preamble">
<div class="sectionbody">
<div class="paragraph">
<p>In the world of infrastructure; new servers, VMs, containers or applications are often manually validated by a human. Some form of build document or checkbox exercise takes place to confirm that the piece of infrastructure is ready for use. Even with configuration management, mistakes happen and humans make mistakes (damn humans).</p>
</div>
<div class="paragraph">
<p>Wouldn&#8217;t it be great if there were an easy way to automatically validate new servers before they go live, rather than finding the problems in production? Well, there is! Try <a href="https://github.com/aelsabbahy/goss">Goss</a>.</p>
</div>
</div>
</div>
<div class="sect1">
<h2 id="_what_is_goss">What is Goss?</h2>
<div class="sectionbody">
<div class="paragraph">
<p><a href="https://github.com/aelsabbahy/goss">Goss</a> is a tool that let&#8217;s you easily and quickly validate infrastructure. Like <a href="http://serverspec.org/">Serverspec</a> but without all the code. Goss allows you to define what a piece of infrastructure should look like with YAML or JSON. This is made even easier for us with the ability to auto add resources to the Goss configuration on the command line.</p>
</div>
<div class="paragraph">
<p>Goss allows you to validate many different resource types such as files, users, groups, packages, services and http connectivity. You can read the full Goss documentation <a href="https://github.com/aelsabbahy/goss/blob/master/docs/manual.md#available-tests">here</a>.</p>
</div>
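<div class="paragraph">
<p>For instance, you can have Goss generate the configuration from the current system state and then run the tests (the resource names here are just examples):</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="highlight"><code># Add resources to goss.yaml based on the current system state
$ goss add package httpd
$ goss add service httpd
$ goss add user deployment

# Run the validation
$ goss validate</code></pre>
</div>
</div>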
</div>
</div>
<div class="sect1">
<h2 id="_why_goss">Why Goss?</h2>
<div class="sectionbody">
<div class="paragraph">
<p>Well, a few things make goss an awesome tool for server validation:</p>
</div>
<div class="ulist">
<ul>
<li>
<p>Written in Go - This means it&#8217;s a self contained binary with no dependencies on other libraries or interpreters.</p>
</li>
<li>
<p>It&#8217;s super fast - Taking advantage of Go&#8217;s concurrency model: tests are executed and returned almost instantly.</p>
</li>
<li>
<p>It&#8217;s easy to get started with - Defining resources in YAML or JSON makes it easy for your entire team to get to grips with.</p>
</li>
</ul>
</div>
<div class="sect2">
<h3 id="_an_example">An Example</h3>
<div class="paragraph">
<p>We build a web server running Apache.</p>
</div>
<div class="paragraph">
<p>Before going live someone checks that the Apache <code>httpd</code> package is the correct version of <code>2.4.25</code>, the <code>deployment</code> user is in the <code>www-data</code> group and there is an application directory at <code>/srv/www/app</code>. They also check the <code>httpd</code> service is running, we can connect to the application at <code><a href="http://localhost/app" class="bare">http://localhost/app</a></code> and going to <code><a href="http://localhost/" class="bare">http://localhost/</a></code> gives a <code>404</code> error page.</p>
</div>
<div class="paragraph">
<p>To automate the above procedure with Goss, the YAML configuration would look like this:</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="highlight"><code class="language-yaml" data-lang="yaml">---
package:
  httpd:
    installed: true
    versions:
    - 2.4.25
user:
  deployment:
    exists: true
    groups:
    - deployment
    - www-data
file:
  /srv/www/app:
    exists: true
    filetype: directory
service:
  httpd:
    enabled: true
    running: true
http:
  http://localhost/app:
    status: 200
    timeout: 1000
  http://localhost:
    status: 404
    timeout: 1000</code></pre>
</div>
</div>
<div class="paragraph">
<p>You can see clearly what is being validated and this goss.yaml file can be used to consistently validate all servers of the same configuration.</p>
</div>
</div>
</div>
</div>
<div class="sect1">
<h2 id="_server_validation_and_monitoring">Server Validation and Monitoring</h2>
<div class="sectionbody">
<div class="paragraph">
<p>All this validation of files, processes, services, ports and connectivity sounds familiar. It&#8217;s something that we quite often try to achieve with our monitoring tools like Nagios, Zabbix or Sensu.</p>
</div>
<div class="paragraph">
<p>With Goss we can create a single monitoring check that tests many resources at once. Goss has several different <a href="https://github.com/aelsabbahy/goss/#supported-output-formats">outputs</a>. The <code>nagios_verbose</code> output gives you out-of-the-box compatibility with Nagios or Sensu and gives Nagios long output explaining failures:</p>
</div>
<div class="imageblock">
<div class="content">
<img src="https://cloud.githubusercontent.com/assets/1253072/18037748/76f65a32-6d83-11e6-9aba-bceabb8430a3.png" alt="Goss - nagios_verbose output">
</div>
</div>
<div class="paragraph">
<p>Server validation can now become part of your monitoring ecosystem ensuring that any problems are identified quickly with a single monitoring check.</p>
</div>
<div class="admonitionblock warning">
<table>
<tr>
<td class="icon">
<i class="fa icon-warning" title="Warning"></i>
</td>
<td class="content">
Goss won&#8217;t replace all of your checks, it doesn&#8217;t check things like HDD space, RAM usage or errors in log files. But it makes a sweet addition to server / service monitoring.
</td>
</tr>
</table>
</div>
</div>
</div>
<div class="sect1">
<h2 id="_http_health_endpoint">HTTP Health Endpoint</h2>
<div class="sectionbody">
<div class="paragraph">
<p>Many applications expose "health" endpoints for applications and services.</p>
</div>
<div class="paragraph">
<p>Tools like Google&#8217;s Borgmon monitoring system and <a href="https://prometheus.io/">Prometheus</a> use this HTTP(S) scrape or pull model to retrieve monitoring and metrics results from their services.</p>
</div>
<div class="paragraph">
<p>Goss has a <a href="https://github.com/aelsabbahy/goss/blob/master/docs/manual.md#serve-s---serve-a-health-endpoint"><code>serve</code> command</a> that exposes an HTTP endpoint for scraping. You can then use something like <a href="https://www.phpservermonitor.org/">PHP Server Monitor</a> to show the validation status of each piece of infrastructure.</p>
</div>
<div class="admonitionblock note">
<table>
<tr>
<td class="icon">
<i class="fa icon-note" title="Note"></i>
</td>
<td class="content">
If instrumenting your applications interests you, there&#8217;s plenty of libraries to assist you. Check out: <a href="http://metrics.dropwizard.io/3.1.0/">Dropwizard Metrics</a> and <a href="http://blog.kristian.io/django-health-check/">Django Health Check</a>.
</td>
</tr>
</table>
</div>
</div>
</div>
<div class="sect1">
<h2 id="_final_thoughts">Final Thoughts</h2>
<div class="sectionbody">
<div class="paragraph">
<p>Hopefully you can see the value of validation testing your infrastructure and building validation into your monitoring systems.</p>
</div>
<div class="paragraph">
<p>Goss is a young project and currently only supports Linux but it&#8217;s very active and open to contributions. You can help out by opening issues, discussing enhancements and submitting pull requests for review.</p>
</div>
<div class="paragraph">
<p>I&#8217;ll cover some advanced Goss usage in future posts. Thanks for reading!</p>
</div>
</div>
</div>]]></description><link>http://www.pysysops.com/2017/01/10/Easy-Infrastructure-Testing-with-Goss.html</link><guid isPermaLink="true">http://www.pysysops.com/2017/01/10/Easy-Infrastructure-Testing-with-Goss.html</guid><category><![CDATA[Configuration Management]]></category><category><![CDATA[Testing]]></category><category><![CDATA[Security]]></category><category><![CDATA[goss]]></category><category><![CDATA[Monitoring]]></category><dc:creator><![CDATA[Tim Birkett]]></dc:creator><pubDate>Tue, 10 Jan 2017 00:00:00 GMT</pubDate></item><item><title><![CDATA[Puppet Anti-Patterns]]></title><description><![CDATA[<div id="preamble">
<div class="sectionbody">
<div class="paragraph">
<p>Over my years in the tech industry I&#8217;ve gained a lot of experience with Configuration Management tools such as Puppet, Chef and Ansible. In this post I&#8217;d like to share with you my experiences, opinions and advice on using Puppet as a Configuration Management tool. Hopefully this helps some of you out there to beat Puppet into submission.</p>
</div>
<div class="paragraph">
<p>Let&#8217;s just jump straight in with the patterns that aren&#8217;t too great.</p>
</div>
</div>
</div>
<div class="sect1">
<h2 id="_everything_in_manifests">Everything in Manifests</h2>
<div class="sectionbody">
<div class="paragraph">
<p>In many beginner tutorials you&#8217;re taught to put all of your code in manifests such as <code>site.pp</code> or <code>nodes.pp</code>. For example:</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="highlight"><code>node 'puppetclient1.mydomain.net' {
  include httpd_class
}

node 'puppetclient2.mydomain.net' {
  include nginx_class
  file {'/opt/deployment_script':
    ensure =&gt; 'file',
    owner  =&gt; 'deploy',
    group  =&gt; 'deploy',
    mode   =&gt; '0750'
  }
}

node default {
  package { 'perl':
    ensure =&gt; present
  }
}</code></pre>
</div>
</div>
<div class="paragraph">
<p>This is great when you&#8217;re just starting out with a few servers to manage. You think you get it. Then you add a few more servers, you start adding more node-specific config, and before you know it you&#8217;ve got 10,000 lines of hand-crafted artisanal Puppet code. This was common in the early days of Puppet use. It was how I started back with Puppet 0.24.</p>
</div>
<div class="paragraph">
<p>Although it&#8217;s not the best way to manage your entire infrastructure, this approach is actually a reasonably simple way to bootstrap cloud instances, with separate manifests based on server type (<code>web.pp</code>, <code>app.pp</code>, <code>lb.pp</code> etc.). These can then be applied using <a href="https://cloudinit.readthedocs.io/en/latest/">cloud-init</a> to create an immutable bootstrapped node.</p>
</div>
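<div class="paragraph">
<p>To sketch that idea, here&#8217;s a hypothetical cloud-init user-data fragment (the manifest path and server type are illustrative, assuming the manifests are baked into the image):</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="highlight"><code>#cloud-config
# Apply the server-type-specific manifest once at first boot
runcmd:
  - [puppet, apply, /etc/puppet/manifests/web.pp]</code></pre>
</div>
</div>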
</div>
</div>
<div class="sect1">
<h2 id="_monolithic_code_modules_code_directory">Monolithic <code>modules</code> Directory</h2>
<div class="sectionbody">
<div class="paragraph">
<p>Quite often I see repos where people have <code>puppet module install</code>&#8217;d straight into the <code>modules</code> directory, or they&#8217;ve downloaded a module and extracted it there. The whole repo, including their own modules mixed in with upstream modules, is then committed to source control.</p>
</div>
<div class="paragraph">
<p>This pattern has a few problems:</p>
</div>
<div class="olist arabic">
<ol class="arabic">
<li>
<p>you don&#8217;t know what is a locally developed module and what is an upstream module</p>
</li>
<li>
<p>there&#8217;s no way of easily seeing what versions of modules are deployed</p>
</li>
<li>
<p>it adds a lot of extra code to your Puppet repository</p>
</li>
</ol>
</div>
<div class="paragraph">
<p>Although this approach works and your module versions are effectively pinned, there are tools out there that make it much easier to manage your Puppet modules, such as <a href="http://librarian-puppet.com/">librarian-puppet</a> and <a href="https://github.com/puppetlabs/r10k">r10k</a>.</p>
</div>
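<div class="paragraph">
<p>Both tools read a <code>Puppetfile</code> that pins each upstream module to a version, keeping vendored code out of your repo. A minimal sketch (the module names, versions and git remote are illustrative):</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="highlight"><code># Puppetfile - consumed by librarian-puppet or r10k
forge 'https://forge.puppet.com'

mod 'puppetlabs/stdlib', '4.13.1'
mod 'puppetlabs/ntp', '6.0.0'

# An internal module pinned to a git tag
mod 'profile',
  :git =&gt; 'git@github.com:example/puppet-profile.git',
  :ref =&gt; 'v1.2.0'</code></pre>
</div>
</div>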
</div>
</div>
<div class="sect1">
<h2 id="_configuration_data_in_code">Configuration Data in Code</h2>
<div class="sectionbody">
<div class="paragraph">
<p>When writing Puppet code it&#8217;s sometimes tempting to hard-code things like IP addresses or node specific things. For example:</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="highlight"><code># DNS Config
class profile::base::dns {
  $dns_servers = ['192.168.1.1', '192.168.1.2']
  file { '/etc/resolv.conf':
    ensure  =&gt; present,
    owner   =&gt; 'root',
    group   =&gt; 'root',
    mode    =&gt; '0444',
    content =&gt; template('profile/resolv.conf.erb')
  }
}</code></pre>
</div>
</div>
<div class="paragraph">
<p>This works, but the code isn&#8217;t re-usable. If you deploy to a different network or DC, are your DNS servers still the same?</p>
</div>
<div class="paragraph">
<p>To improve re-usability, change the variable to be a class parameter with an optional default value:</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="highlight"><code># DNS Config
class profile::base::dns (
  $dns_servers = ['8.8.8.8', '8.8.4.4']
){
  file { '/etc/resolv.conf':
    ensure  =&gt; present,
    owner   =&gt; 'root',
    group   =&gt; 'root',
    mode    =&gt; '0444',
    content =&gt; template('profile/resolv.conf.erb')
  }
}</code></pre>
</div>
</div>
<div class="paragraph">
<p>Then you can specify environment (network, node, DC) specific configuration in Hiera:</p>
</div>
<div class="paragraph">
<p>dc1.example.com.yaml:</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="highlight"><code>---
profile::base::dns::dns_servers:
  - '192.168.1.1'
  - '192.168.1.2'</code></pre>
</div>
</div>
<div class="paragraph">
<p>dc2.example.com.yaml:</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="highlight"><code>---
profile::base::dns::dns_servers:
  - '10.10.0.1'
  - '10.10.1.1'</code></pre>
</div>
</div>
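<div class="paragraph">
<p>Which file wins is decided by your Hiera hierarchy. A minimal Hiera 3 <code>hiera.yaml</code> along these lines would select the per-DC files above by the node&#8217;s <code>domain</code> fact (the datadir path is illustrative):</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="highlight"><code>---
:backends:
  - yaml
:yaml:
  :datadir: '/etc/puppetlabs/code/environments/%{environment}/hieradata'
:hierarchy:
  - "%{::domain}"
  - common</code></pre>
</div>
</div>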
<div class="paragraph">
<p>Now you avoid multiple classes or writing <code>case</code> or <code>if {&#8230;&#8203;} else {&#8230;&#8203;}</code> logic in your class file.</p>
</div>
</div>
</div>
<div class="sect1">
<h2 id="_everything_in_separate_repos">Everything in Separate Repos</h2>
<div class="sectionbody">
<div class="paragraph">
<p>In the example above, I&#8217;ve mentioned a "profile" class. A common pattern when dealing with Puppet code is the <strong>Roles and Profiles Pattern</strong>. The idea is that you assign one "Role" per server and the role is made up of individual bite-sized "Profiles". Roles and profiles are just your own or your company&#8217;s custom Puppet modules that make use of upstream modules or Puppet resources to configure systems.</p>
</div>
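<div class="paragraph">
<p>A hypothetical sketch of the pattern (the class names and the upstream module parameter are made up for illustration):</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="highlight"><code># One role per server; a role only includes profiles
class role::webserver {
  include profile::base
  include profile::nginx
}

# A bite-sized profile wrapping an upstream module
class profile::nginx {
  class { 'nginx':
    manage_repo =&gt; true,
  }
}</code></pre>
</div>
</div>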
<div class="paragraph">
<p>A common and problematic pattern I&#8217;ve seen is maintaining an SCM repo for roles, an SCM repo for profiles, an SCM repo for the "control repo" and a separate SCM repo for hieradata. This can then turn into multiple branches of each, and before you know it you&#8217;re maintaining 12 versions of a diverging code base. It&#8217;s common to see this when people have followed some "best practice" blog post without fully thinking things through. You also often see it when the code was initially created by developers; they love git branches and complexity, especially if they&#8217;re trying to bring Gitflow to Puppet code.</p>
</div>
<div class="paragraph">
<p>Here&#8217;s my top tip when deciding how to organize your repos: KISS. Have no more than one repo for your product&#8217;s Puppet code. Multiple repos and branches lead to managing multiple very different code bases, different module versions in each environment, merge tasks, complex git fixing&#8230;&#8203; a nightmare. There is an awesome control-repo by the guys at Example42 here: <a href="https://github.com/example42/control-repo" class="bare">https://github.com/example42/control-repo</a></p>
</div>
</div>
</div>
<div class="sect1">
<h2 id="_misuse_of_puppet_environments">Misuse of Puppet Environments</h2>
<div class="sectionbody">
<div class="paragraph">
<p>With the r10k tool it&#8217;s easy to manage environments dynamically with Git branches. You can then assign subsets of servers to use different environments (branches of Puppet code). This sounds great: we have applications deployed to different environments like test, dev, stage, uat or production. Cool! Let&#8217;s create all those branches&#8230;&#8203; STOP!!</p>
</div>
<div class="paragraph">
<p>Puppet environments are a powerful thing, but I don&#8217;t believe you should confuse Puppet environments with Application environments. You should aim to manage your infrastructure in a single Puppet environment: "production". If you arrange your hieradata hierarchy sensibly you can manage the differences in configuration in a single branch.</p>
</div>
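<div class="paragraph">
<p>Under r10k that means a single control repo whose <code>production</code> branch is the one environment nearly everything tracks, and the tool&#8217;s configuration stays tiny (the git remote here is, of course, illustrative):</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="highlight"><code># /etc/puppetlabs/r10k/r10k.yaml
:sources:
  :puppet:
    remote: 'git@github.com:example/control-repo.git'
    basedir: '/etc/puppetlabs/code/environments'</code></pre>
</div>
</div>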
<div class="paragraph">
<p>Puppet branches should be used when you need to test big changes or new features in a controlled way (of course you&#8217;re developing and testing on Vagrant). Create a branch like "new_feature", develop it locally, testing it on Vagrant, then test the changes on a suitable piece of infrastructure by running <code>puppet agent --test --environment new_feature</code>. Don&#8217;t forget your security and perf testing at this point ;) If everything looks good, open a PR, get it reviewed and merged to production, and delete the branch.</p>
</div>
</div>
</div>
<div class="sect1">
<h2 id="_closing_thoughts">Closing Thoughts</h2>
<div class="sectionbody">
<div class="paragraph">
<p>Most of the opinions I&#8217;ve formed are from being forced to work with some painful Puppet code setups and processes. My best advice to anyone developing their infrastructure code: Keep it simple, think about it before you go ahead, keep an open mind and don&#8217;t be afraid to change your mind or refactor when necessary.</p>
</div>
</div>
</div>]]></description><link>http://www.pysysops.com/2016/11/10/1123-Puppet-Anti-Patterns.html</link><guid isPermaLink="true">http://www.pysysops.com/2016/11/10/1123-Puppet-Anti-Patterns.html</guid><category><![CDATA[Puppet]]></category><category><![CDATA[Automation]]></category><category><![CDATA[Configuration Management]]></category><category><![CDATA[Devops]]></category><dc:creator><![CDATA[Tim Birkett]]></dc:creator><pubDate>Thu, 10 Nov 2016 00:00:00 GMT</pubDate></item></channel></rss>