g0blin - Structure and Hosting

  1. Humble beginnings
  2. Death to Wordpress - Long live the new CMS
  3. Hosting
  4. CloudFront
  5. End result

Time for something slightly different. How is g0blinResearch generated, how was it hosted, how is it hosted now, and how did it get to where it is?

Humble beginnings

Back in 2004, I started to explore Wordpress plugins for vulnerabilities. In short order, I'd built up a decent list of findings. Throughout my investigations, I ensured that I disclosed the vulnerabilities to the vendors prior to public disclosure. Public disclosure was then made a short time after a patch was made available.

To start with, g0blinResearch was actually hosted on none other than Wordpress. I assure you, the irony was not lost on me.

Something had to change.

Death to Wordpress - Long live the new CMS

Late in 2015, I decided to check out static site generators. In my position at the time, I was working heavily in nodejs, so it made sense to pick from one of many nodejs based static site generators. In the end, I settled on metalsmith due to its very basic structure, wealth of plugins and ease of development.

So, what did I need? I needed a generator that would not only support the template (and associated metadata) for my CVE posts, but also general blog posts. For this, I'd need to chain together a good number of plugins.

Below is a full list of plugins I use, with a short description as to their purpose. This list is also in the order defined in my metalsmith.json file.

  • metalsmith-drafts
    • Controls whether or not a post is included in the generated site
  • metalsmith-markdown
    • Parses .md files into the corresponding html
  • metalsmith-date-formatter
    • Formats dates in metadata to a specific format
  • metalsmith-paths
    • Adds file paths to metadata for linking
  • metalsmith-in-place
    • Enables handlebars templating engine within content
  • metalsmith-register-helpers
    • Registers arbitrary handlebars helpers for use within content
  • metalsmith-collections
    • Groups content types based upon their defined collection
  • metalsmith-collections-metadata
    • Custom plugin that enriches metadata within collections
  • metalsmith-pagination
    • Generates paginated collections
  • metalsmith-autotoc
    • Generates Table of Contents for documents from header tags
  • metalsmith-excerpts
    • Generates an excerpt from the first paragraph of documents
  • metalsmith-layouts
    • Allows for use of handlebars layouts
  • metalsmith-less
    • Compiles less into css files
  • metalsmith-assets
    • Copies static assets to build directory
  • metalsmith-convert
    • Generates thumbnails for images
  • metalsmith-feed
    • Generates an xml feed for all posts
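
The order of the list matters because each plugin works on whatever the previous one left behind. As a minimal sketch of that idea (it does not use metalsmith itself; `runChain`, `dropDrafts` and `renameMarkdown` are hypothetical stand-ins that only mimic the `(files, metalsmith, done)` plugin signature), a chain could be run like this:

```javascript
// Hypothetical stand-in for a plugin runner: every plugin receives the same
// `files` object and mutates it before the next plugin in the list runs.
function runChain(plugins, files, callback) {
  let index = 0;
  function next(err) {
    if (err) return callback(err);
    if (index === plugins.length) return callback(null, files);
    const plugin = plugins[index++];
    plugin(files, {}, next);
  }
  next();
}

// Toy version of a drafts plugin: removes files flagged `draft: true`.
function dropDrafts(files, metalsmith, done) {
  for (const name of Object.keys(files)) {
    if (files[name].draft) delete files[name];
  }
  done();
}

// Toy version of a markdown plugin: it only renames .md to .html here.
function renameMarkdown(files, metalsmith, done) {
  for (const name of Object.keys(files)) {
    if (name.endsWith('.md')) {
      files[name.replace(/\.md$/, '.html')] = files[name];
      delete files[name];
    }
  }
  done();
}

// Made-up input: one published post and one draft.
const exampleFiles = {
  'blog-post/index.md': { title: 'Example blog post' },
  'unfinished/index.md': { title: 'Work in progress', draft: true }
};

let chainResult = null;
runChain([dropDrafts, renameMarkdown], exampleFiles, (err, files) => {
  if (err) throw err;
  chainResult = Object.keys(files);
  console.log(chainResult);
});
```

Because the drafts step runs first, the unfinished post never reaches the later steps, which is why it sits at the top of the list.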

Putting this all together in a metalsmith.json config file (as below) - when provided with the expected file structure - results in generation of the g0blinResearch blog.

{
    "source": "src",
    "destination": "build",
    "metadata": {
        "site": {
            "title": "g0blin Research",
            "url": "https://g0blin.co.uk",
            "author": "James Hooker <research@g0blin.co.uk>"
        }
    },
    "plugins": {
        "metalsmith-drafts": true,
        "metalsmith-markdown": true,
        "metalsmith-date-formatter": {
            "format": "YYYY-MM-DD"
        },
        "metalsmith-paths": {
            "property": "paths",
            "directoryIndex": "index.html"
        },
        "metalsmith-in-place": {
            "engine": "handlebars",
            "partials": "partials"
        },
        "metalsmith-register-helpers": {
            "directory": "handlebars-helpers"
        },
        "metalsmith-collections": {
            "blog": {
                "sortBy": "date",
                "reverse": true,
                "metadata": {
                    "layout": "blog.html"
                }
            },
            "research": {
                "sortBy": "date",
                "reverse": true,
                "metadata": {
                    "layout": "research.html"
                }
            },
            "all": {
                "sortBy": "date",
                "reverse": true
            },
            "error": {
                "metadata": {
                    "layout": "error.html"
                }
            }
        },
        "metalsmith-collections-metadata": {
            "blog": {
                "layout": "blog.html"
            },
            "research": {
                "layout": "research.html"
            },
            "error": {
                "layout": "error.html"
            }
        },
        "metalsmith-pagination": {
            "collections.all": {
                "perPage": 9,
                "layout": "home.html",
                "first": "index.html",
                "path": "page/:num/index.html"
            }
        },
        "metalsmith-autotoc": {
            "selector": "h2, h3, h4"
        },
        "metalsmith-excerpts": true,
        "metalsmith-layouts": {
            "engine": "handlebars"
        },
        "metalsmith-less": {
            "pattern": "**/*.less",
            "render": {
                "paths": [
                    "build/css"
                ]
            }
        },
        "metalsmith-assets": {
            "source": "./assets",
            "destination": "."
        },
        "metalsmith-convert": [{
            "src": "**/*.png",
            "target": "png",
            "resize": {
                "width": 320,
                "height": 240,
                "resizeStyle": "aspectfit"
            },
            "nameFormat": "%b_thumb%e"
        }, {
            "src": "**/*.jpg",
            "target": "png",
            "resize": {
                "width": 320,
                "height": 240,
                "resizeStyle": "aspectfit"
            },
            "nameFormat": "%b_thumb%e"
        }, {
            "src": "**/*.gif",
            "target": "png",
            "resize": {
                "width": 320,
                "height": 240,
                "resizeStyle": "aspectfit"
            },
            "nameFormat": "%b_thumb%e"
        }],
        "metalsmith-feed": {
            "collection": "all",
            "limit": false
        }
    }
}
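
The one entry in that config that is not an off-the-shelf plugin is metalsmith-collections-metadata. Its source is not shown in this post, so the following is only a guess at the kind of behaviour its config implies: copying default metadata (such as `layout`) onto every file in a collection. The function name `collectionsMetadata` and the rule of not overwriting existing values are assumptions.

```javascript
// Assumed behaviour: copy default metadata onto every file in a collection,
// without overwriting values the file already sets in its front matter.
function collectionsMetadata(options) {
  return function (files, metalsmith, done) {
    for (const file of Object.values(files)) {
      // `collection` may be a single name or a list of names.
      const names = [].concat(file.collection || []);
      for (const name of names) {
        const defaults = options[name] || {};
        for (const [key, value] of Object.entries(defaults)) {
          if (file[key] === undefined) file[key] = value;
        }
      }
    }
    done();
  };
}

// Same options as the "metalsmith-collections-metadata" block in the config.
const plugin = collectionsMetadata({
  blog: { layout: 'blog.html' },
  research: { layout: 'research.html' },
  error: { layout: 'error.html' }
});

const files = {
  'blog-post/index.html': { collection: ['blog', 'all'] },
  'cve-post/index.html': { collection: ['research', 'all'] }
};

plugin(files, {}, () => {
  console.log(files['blog-post/index.html'].layout);
  console.log(files['cve-post/index.html'].layout);
});
```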

File structure

The structure for the generator goes something like this.

assets/
   css/
   img/
   js/
handlebars-helpers/
layouts/
  blog.html
  error.html
  home.html
  research.html
partials/
  disqus.html
  footer.html
  header.html
src/    
  blog-post/
    index.md
    image1.png
  cve-post/
    index.md
    image1.png

The contents of these files can be found in the repo at https://github.com/g0blinResearch/g0blin.co.uk-public.

So, what does an example blog post look like? Let's check out the content of blog-post/index.md.

---
title: 'Example blog post'
date: 2016-09-20
autotoc: true
collection:
- blog
- all
---

This is an example blog post. This paragraph will be used in the excerpt - unless it goes over `255` characters in length, then it will be truncated.

### Header 1

This is our first section of content.

### Header 2

Yet another section of content. Let's output an image!

![](image1_thumb.png)

Here comes a code block - are you ready?

    #!/bin/bash
    echo "Booya!"

That's it! The metadata contains everything metalsmith needs to categorise and generate the post, and the following content is the actual..well..content of the post. Simple, no?
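
To illustrate how the `collection` and `date` metadata drive the categorisation, here is a simplified sketch of grouping posts and sorting each group newest-first, in the spirit of `"sortBy": "date"` with `"reverse": true`. It is not the metalsmith-collections implementation; `groupByCollection` and the second sample post are made up for the example.

```javascript
// Simplified, hypothetical take on grouping posts by their `collection`
// front matter and sorting each group newest-first.
function groupByCollection(posts) {
  const collections = {};
  for (const post of posts) {
    for (const name of [].concat(post.collection || [])) {
      (collections[name] = collections[name] || []).push(post);
    }
  }
  for (const group of Object.values(collections)) {
    // Equivalent in spirit to "sortBy": "date" with "reverse": true.
    group.sort((a, b) => new Date(b.date) - new Date(a.date));
  }
  return collections;
}

const posts = [
  { title: 'Older research post', date: '2016-08-18', collection: ['research', 'all'] },
  { title: 'Example blog post', date: '2016-09-20', collection: ['blog', 'all'] }
];

const grouped = groupByCollection(posts);
console.log(grouped.all.map((post) => post.title));
```

A post listed under both `blog` and `all` ends up in both groups, which is what lets the paginated home page draw from `all` while each layout is picked per type.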

What about the more complicated research or CVE content type? Here's the content for cve-post/index.md.

---
title: Super Awesome Uploader 0.1, Arbitrary File Upload
date: "2016-09-20"
software: Super Awesome Uploader
vendor_notified: true
vendor_responded: false
fix_available: false
version: "0.1"
homepage: https://wordpress.org/plugins/super-awesome-uploader
cve: CVE-2016-XXXX
cvss_score: "9"
cvss_vector: (AV:N/AC:L/Au:N/C:P/I:P/A:C)
attack_scope: remote
authorization_required: None
mitigation: |
  Filter file types prior to accepting an upload. Place .htaccess file in super_awesome directory that prevents PHP/script execution under it.
timeline:
- date: "2016-08-18"
  text: Discovered
- date: "2016-08-18"
  text: Reported to WP.org
- date: "2016-08-18"
  text: CVE ID Assigned
- date: "2016-09-20"
  text: Advisory released
poc:
- desc: "We can upload an arbitrary file by posting to the path /wp-content/plugins/super-awesome-uploader/upload.php with the parameter `file`.\n\n"
  text: ""
vulnerability_type: Arbitrary File Upload
collection:
- research
- all
---
Arbitrary file upload in Super Awesome Uploader 0.1 allows remote unauthenticated user to upload files of any type. Provides the ability to upload a PHP shell.

So, that's it. Just a ton more metadata associated.

I feel I've rambled on enough about the structure of the site itself. Time to see how it's actually hosted.

Hosting

As I mentioned, up until late 2015, g0blinResearch was being hosted on Wordpress. Following the move to being hosted via a static site generator, the only real change was symlinking the build directory to the web root of the site.

Early in 2016, I started to think. We're only hosting static files now - why not host them on S3?

Fast forward 9 months, and I've finally achieved my goal over an evening of hacking about with various services, including Github, CircleCI, S3, Route 53 and CloudFront.

Github

There's not much to speak of here. The project is simply hosted on github in a (currently) private repository. The package.json has the start command configured to run node_modules/.bin/metalsmith. The contents of package.json can be seen below.

{
  "name": "g0blin-metalsmith",
  "version": "1.0.0",
  "description": "",
  "main": "index.js",
  "scripts": {
    "test": "",
    "start": "node_modules/.bin/metalsmith"
  },
  "author": "",
  "license": "ISC",
  "dependencies": {
    "bootstrap": "3.3.5",
    "grunt": "^0.4.5",
    "handlebars": "^4.0.1",
    "handlebars-helpers": "^0.5.8",
    "metalsmith": "^2.0.1",
    "metalsmith-assets": "^0.1.0",
    "metalsmith-autotoc": "^0.1.2",
    "metalsmith-broken-link-checker": "^0.1.6",
    "metalsmith-collections": "^0.7.0",
    "metalsmith-collections-metadata": "0.0.1",
    "metalsmith-convert": "^0.3.0",
    "metalsmith-date-formatter": "^1.0.2",
    "metalsmith-debug": "0.0.2",
    "metalsmith-drafts": "0.0.1",
    "metalsmith-excerpts": "^1.0.0",
    "metalsmith-feed": "^0.2.0",
    "metalsmith-filemetadata": "0.0.4",
    "metalsmith-in-place": "^1.3.1",
    "metalsmith-layouts": "^1.4.0",
    "metalsmith-less": "^2.0.0",
    "metalsmith-markdown": "^0.2.1",
    "metalsmith-pagination": "^1.1.1",
    "metalsmith-paths": "^2.1.1",
    "metalsmith-permalinks": "^0.4.0",
    "metalsmith-register-helpers": "^0.1.2",
    "metalsmith-serve": "0.0.3",
    "metalsmith-watch": "^1.0.1"
  }
}

There is also a circle.yml file which defines how CircleCI will behave when it sees a new commit.

dependencies:
  pre:
    - sudo apt-get update; sudo apt-get install libmagick++-dev
deployment:
  prod:
    branch: master
    commands:
      - npm start
      - aws configure set preview.cloudfront true
      - aws s3 sync build s3://g0blin.co.uk/ --delete
      - aws cloudfront create-invalidation --distribution-id XXXXXXXXXXXXXX --paths "/*"
machine:
  node:
    version: 0.12.7

Essentially, this will install the pre-requisites for our nodejs modules, and trigger a build when a commit is made to the master branch. It will then configure the aws client to enable the preview features of CloudFront, sync our build directory to our hosting bucket, and then tell CloudFront to invalidate all paths. I'd like to improve the invalidation so that only posts that were updated are invalidated, but that can wait for another time.
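
A narrower invalidation could start from something like the sketch below. This is not part of the current setup; `invalidationPaths` is a hypothetical helper, and it assumes a list of changed build-relative files is already available from somewhere (for example a diff of the build directory). The file name `css/site.css` is made up.

```javascript
// Hypothetical helper: turn build-relative file paths into CloudFront
// invalidation paths. index.html files also get their directory URL,
// since visitors normally request the directory rather than the file.
function invalidationPaths(changedFiles) {
  const paths = new Set();
  for (const file of changedFiles) {
    const urlPath = '/' + file.replace(/^\/+/, '');
    paths.add(urlPath);
    if (urlPath.endsWith('/index.html')) {
      paths.add(urlPath.slice(0, -'index.html'.length));
    }
  }
  return Array.from(paths).sort();
}

const changed = ['blog-post/index.html', 'css/site.css', 'index.html'];
console.log(invalidationPaths(changed).join(' '));
```

The resulting list could then be passed to `--paths` in place of `"/*"`.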

S3

First of all, we need to take into account that, as part of this move, I want to change domain from research.g0blin.co.uk to just g0blin.co.uk. I also want to direct www.g0blin.co.uk to g0blin.co.uk without the www prefix. I proceed to set up three buckets on S3.

  • g0blin.co.uk
    • This will be used to host our static files
  • research.g0blin.co.uk
  • www.g0blin.co.uk
    • Both of these will purely be used to redirect traffic to g0blin.co.uk

The two buckets for redirects are straightforward to set up. We simply create the buckets, and then enable Redirect all requests to another host name under Static Website Hosting with the value of https://g0blin.co.uk.

The bucket for our content is equally straightforward to set up. We create the bucket, and then select Enable website hosting under Static Website Hosting, specifying the Index Document and Error Document.

And that's it - we're done with the buckets.
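
For anyone who prefers to script this rather than click through the console, the three bucket setups can be described as data. The sketch below builds objects in the shape of S3's WebsiteConfiguration structure (the kind of document `aws s3api put-bucket-website` accepts). The helper names are hypothetical, and the `404.html` error document is an assumption, as the value used for the bucket is not stated above.

```javascript
// Website configuration for the bucket that serves the generated site.
function contentBucketWebsite(indexDocument, errorDocument) {
  return {
    IndexDocument: { Suffix: indexDocument },
    ErrorDocument: { Key: errorDocument }
  };
}

// Website configuration for a bucket that only redirects to another host.
function redirectBucketWebsite(hostName, protocol) {
  return {
    RedirectAllRequestsTo: { HostName: hostName, Protocol: protocol }
  };
}

const bucketConfigs = {
  // The error document name here is an assumption.
  'g0blin.co.uk': contentBucketWebsite('index.html', '404.html'),
  'research.g0blin.co.uk': redirectBucketWebsite('g0blin.co.uk', 'https'),
  'www.g0blin.co.uk': redirectBucketWebsite('g0blin.co.uk', 'https')
};

console.log(JSON.stringify(bucketConfigs, null, 2));
```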

CircleCI

Again, there's very little setup involved in CircleCI. All we need to do is sign up with our github account, and then add our project to our account. After signing up, we browse to https://circleci.com/add-projects, click on our associated github account and then click on the repository we want to build - in this case, research.g0blin.co.uk.

Once this build starts, we should see something like this.

In order to allow CircleCI to make changes to our S3 bucket, and to invalidate paths on CloudFront, we create a new policy with the following content.

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "g0blincouks3deploy",
            "Effect": "Allow",
            "Action": [
                "s3:*"
            ],
            "Resource": [
                "arn:aws:s3:::g0blin.co.uk",
                "arn:aws:s3:::g0blin.co.uk/*"
            ]
        },
        {
            "Sid": "g0blincoukcloudfrontinvalidate",
            "Effect": "Allow",
            "Action": [
                "cloudfront:CreateInvalidation"
            ],
            "Resource": [
                "*"
            ]
        }
    ]
}

Next I create a new user and attach this policy. After creating the user, I should have an Access Key ID and Secret Access Key. I proceed to set these in the Project Settings under AWS Permissions in CircleCI.

That's it for CircleCI - it will now build on commits to our repo, and has permission to perform actions on our target S3 bucket, and to invalidate in CloudFront.

CloudFront

Next up is CloudFront, which will act as the CDN for our static site.

g0blin.co.uk

Firstly, I create a distribution for g0blin.co.uk. I choose to use a Custom SSL Certificate, and proceed to create a wildcard certificate via Amazon's ACM service. I specify the Default Root Object as index.html. I also set the Alternate Domain Names (CNAMEs) as g0blin.co.uk.

For the origin, I specify the Origin Domain Name as the domain provided when I set up static hosting for the g0blin.co.uk bucket - g0blin.co.uk.s3-website-eu-west-1.amazonaws.com. I set the Origin Protocol Policy as HTTP Only.

For the behaviours, I essentially leave the defaults as they will suffice, with the exception of switching Viewer Protocol Policy to Redirect HTTP to HTTPS. I set the origin to the origin we defined above.

The error pages I defined as /500.html and /404.html for their respective status codes.

research/www subdomains

These are pretty straightforward to set up. We essentially follow the setup for g0blin.co.uk, but change the Origin Domain Name to the website endpoints of the two redirect buckets we created above - research.g0blin.co.uk.s3-website-eu-west-1.amazonaws.com and www.g0blin.co.uk.s3-website-eu-west-1.amazonaws.com. We also keep Viewer Protocol Policy as HTTP and HTTPS. We do not need to set the error pages for these distributions, as requests to them will simply be redirected to the g0blin.co.uk distribution.

Route 53

Finally we set up Route 53, which will manage our DNS entries.

First of all, I migrated my domain across to Route 53. This is a pretty straightforward process, and is well documented.

Next I create an A record as an alias with an empty value in Name, pointing towards our g0blin.co.uk CloudFront distribution.

I repeat the above for the domains research.g0blin.co.uk and www.g0blin.co.uk, pointing them to the appropriate CloudFront distributions.
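
The same three records can be written down as a Route 53 change batch. The sketch below only builds the data structure in the shape `ChangeResourceRecordSets` expects; it does not call AWS. `aliasChange` is a hypothetical helper and the `*.cloudfront.net` names are placeholders, not the real distribution domains. `Z2FDTNDATAQYW2` is the fixed hosted zone ID that AWS documents for CloudFront alias targets.

```javascript
// Fixed hosted zone ID used for every alias record that targets CloudFront.
const CLOUDFRONT_HOSTED_ZONE_ID = 'Z2FDTNDATAQYW2';

// Hypothetical helper: one alias A record pointing at a distribution.
function aliasChange(recordName, distributionDomain) {
  return {
    Action: 'UPSERT',
    ResourceRecordSet: {
      Name: recordName,
      Type: 'A',
      AliasTarget: {
        HostedZoneId: CLOUDFRONT_HOSTED_ZONE_ID,
        DNSName: distributionDomain,
        EvaluateTargetHealth: false
      }
    }
  };
}

// Placeholder distribution domains; substitute the real ones.
const changeBatch = {
  Changes: [
    aliasChange('g0blin.co.uk.', 'placeholder-apex.cloudfront.net.'),
    aliasChange('research.g0blin.co.uk.', 'placeholder-research.cloudfront.net.'),
    aliasChange('www.g0blin.co.uk.', 'placeholder-www.cloudfront.net.')
  ]
};

console.log(JSON.stringify(changeBatch, null, 2));
```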

End result

So, what all of this setup means is that upon a commit to the master branch in github, CircleCI will pull down the changes, run the npm start command and build our static site. It will then sync up with our g0blin.co.uk bucket, and trigger an invalidation in CloudFront.

Visitors to the domains research.g0blin.co.uk and www.g0blin.co.uk are subsequently redirected to g0blin.co.uk, retaining the path requested. The distribution for g0blin.co.uk then will serve up the generated static site from our S3 bucket.
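
Put as a small model (purely illustrative; the real redirect is performed by the S3 redirect buckets, and `redirectLocation` is a hypothetical name), the routing described above behaves roughly like this:

```javascript
const CANONICAL_HOST = 'g0blin.co.uk';
const REDIRECTED_HOSTS = ['research.g0blin.co.uk', 'www.g0blin.co.uk'];

// Returns the Location a visitor would be redirected to, or null when the
// host is already canonical and the content is served directly.
function redirectLocation(host, path) {
  if (!REDIRECTED_HOSTS.includes(host)) return null;
  return 'https://' + CANONICAL_HOST + path;
}

console.log(redirectLocation('www.g0blin.co.uk', '/blog-post/'));
console.log(redirectLocation('g0blin.co.uk', '/blog-post/'));
```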

It took quite a bit of time tinkering with CloudFront, S3 and Route 53 in order to get the redirection working, but all in all I'm really pleased with the end result. My site is no longer hosted on redundant hardware, and I no longer need to ssh on to a box in order to pull down changes and deploy them to nginx.

If anyone is particularly interested in the structure of the static site, I've created a public Github repo containing the example content referenced further up in this blog post. You can find it at https://github.com/g0blinResearch/g0blin.co.uk-public.

This blog post was mostly so I could remember how I achieved this, but I know that I spent a good amount of time putting together the metalsmith site itself, as well as connecting all the pieces of the puzzle for the hosting solution, so I thought it may be useful to others as well.