g0blin - Structure and Hosting
Time for something slightly different. How is g0blinResearch generated, how was it hosted, how is it hosted now, and how did it get to where it is?
Humble beginnings
Back in 2004, I started to explore Wordpress plugins for vulnerabilities. In short order, I'd built up a decent list of findings. Throughout my investigations, I ensured I disclosed the vulnerabilities to the vendors prior to public disclosure. Public disclosure was then made a short time after a patch was made available.
To start with, g0blinResearch was actually hosted on none other than Wordpress. I assure you, the irony was not lost on me.
Something had to change.
Death to Wordpress - Long live the new CMS
Late in 2015, I decided to check out static site generators. In my position at the time, I was working heavily in nodejs, so it made sense to pick from one of many nodejs based static site generators. In the end, I settled on metalsmith due to its very basic structure, wealth of plugins and ease of development.
So, what did I need? I needed a generator that would not only support the template (and associated metadata) for my CVE posts, but also general blog posts. For this, I'd need to chain together a good number of plugins.
Below is a full list of plugins I use, with a short description as to their purpose. This list is also in the order defined in my metalsmith.json file.
- metalsmith-drafts
  - Controls whether or not a post is included in the generated site
- metalsmith-markdown
  - Parses .md files into the corresponding html
- metalsmith-date-formatter
  - Formats dates in metadata to a specific format
- metalsmith-paths
  - Adds file paths to metadata for linking
- metalsmith-in-place
  - Enables handlebars templating engine within content
- metalsmith-register-helpers
  - Registers arbitrary handlebars helpers for use within content
- metalsmith-collections
  - Groups content types based upon their defined collection
- metalsmith-collections-metadata
  - Custom plugin that enriches metadata within collections
- metalsmith-pagination
  - Generates paginated collections
- metalsmith-autotoc
  - Generates Table of Contents for documents from header tags
- metalsmith-excerpts
  - Generates an excerpt from the first paragraph of documents
- metalsmith-layouts
  - Allows for use of handlebars layouts
- metalsmith-less
  - Compiles less into css files
- metalsmith-assets
  - Copies static assets to build directory
- metalsmith-convert
  - Generates thumbnails for images
- metalsmith-feed
  - Generates an xml feed for all posts
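The order of the list matters because each plugin works on the output of the one before it. The following is a rough sketch of that idea only, not metalsmith's actual implementation: the `drafts` and `renameMarkdown` functions are hypothetical stand-ins written for illustration, and real metalsmith plugins take additional arguments.

```javascript
// Simplified stand-in for a plugin chain: each plugin receives the in-memory
// `files` object and mutates it; plugins run in the order they are listed.
// Both plugins below are hypothetical, illustration-only versions.
function drafts(files) {
  for (const name of Object.keys(files)) {
    if (files[name].draft) delete files[name];
  }
}

function renameMarkdown(files) {
  // Stand-in for a markdown step: only renames .md to .html, no real parsing.
  for (const name of Object.keys(files)) {
    if (name.endsWith('.md')) {
      files[name.replace(/\.md$/, '.html')] = files[name];
      delete files[name];
    }
  }
}

function runPipeline(files, plugins) {
  for (const plugin of plugins) plugin(files);
  return files;
}

const files = {
  'blog-post/index.md': { title: 'Example blog post' },
  'unfinished/index.md': { title: 'Work in progress', draft: true },
};

const built = runPipeline(files, [drafts, renameMarkdown]);
console.log(Object.keys(built)); // [ 'blog-post/index.html' ]
```

Running the draft filter first means later steps never spend time on posts that will not be published.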
Putting this all together in a metalsmith.json config file (as below) - when provided with the expected file structure - results in generation of the g0blinResearch blog.
{
  "source": "src",
  "destination": "build",
  "metadata": {
    "site": {
      "title": "g0blin Research",
      "url": "https://g0blin.co.uk",
      "author": "James Hooker <research@g0blin.co.uk>"
    }
  },
  "plugins": {
    "metalsmith-drafts": true,
    "metalsmith-markdown": true,
    "metalsmith-date-formatter": {
      "format": "YYYY-MM-DD"
    },
    "metalsmith-paths": {
      "property": "paths",
      "directoryIndex": "index.html"
    },
    "metalsmith-in-place": {
      "engine": "handlebars",
      "partials": "partials"
    },
    "metalsmith-register-helpers": {
      "directory": "handlebars-helpers"
    },
    "metalsmith-collections": {
      "blog": {
        "sortBy": "date",
        "reverse": true,
        "metadata": {
          "layout": "blog.html"
        }
      },
      "research": {
        "sortBy": "date",
        "reverse": true,
        "metadata": {
          "layout": "research.html"
        }
      },
      "all": {
        "sortBy": "date",
        "reverse": true
      },
      "error": {
        "metadata": {
          "layout": "error.html"
        }
      }
    },
    "metalsmith-collections-metadata": {
      "blog": {
        "layout": "blog.html"
      },
      "research": {
        "layout": "research.html"
      },
      "error": {
        "layout": "error.html"
      }
    },
    "metalsmith-pagination": {
      "collections.all": {
        "perPage": 9,
        "layout": "home.html",
        "first": "index.html",
        "path": "page/:num/index.html"
      }
    },
    "metalsmith-autotoc": {
      "selector": "h2, h3, h4"
    },
    "metalsmith-excerpts": true,
    "metalsmith-layouts": {
      "engine": "handlebars"
    },
    "metalsmith-less": {
      "pattern": "**/*.less",
      "render": {
        "paths": [
          "build/css"
        ]
      }
    },
    "metalsmith-assets": {
      "source": "./assets",
      "destination": "."
    },
    "metalsmith-convert": [{
      "src": "**/*.png",
      "target": "png",
      "resize": {
        "width": 320,
        "height": 240,
        "resizeStyle": "aspectfit"
      },
      "nameFormat": "%b_thumb%e"
    }, {
      "src": "**/*.jpg",
      "target": "png",
      "resize": {
        "width": 320,
        "height": 240,
        "resizeStyle": "aspectfit"
      },
      "nameFormat": "%b_thumb%e"
    }, {
      "src": "**/*.gif",
      "target": "png",
      "resize": {
        "width": 320,
        "height": 240,
        "resizeStyle": "aspectfit"
      },
      "nameFormat": "%b_thumb%e"
    }],
    "metalsmith-feed": {
      "collection": "all",
      "limit": false
    }
  }
}
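The metalsmith-collections-metadata entry in the config maps a collection name to metadata (here, a layout) that should end up on every file in that collection. The plugin itself is custom and its source is not shown here, so the following is only a guess at the general shape: `applyCollectionMetadata` is a hypothetical name, and the choice not to overwrite values a file already defines is an assumption.

```javascript
// Hypothetical sketch: copy per-collection defaults (e.g. a layout) onto every
// file that belongs to that collection. Assumption: values a file already
// sets in its own front matter are left untouched.
function applyCollectionMetadata(files, options) {
  for (const file of Object.values(files)) {
    // `collection` may be a single name or a list of names.
    const collections = [].concat(file.collection || []);
    for (const name of collections) {
      const defaults = options[name] || {};
      for (const [key, value] of Object.entries(defaults)) {
        if (!(key in file)) file[key] = value;
      }
    }
  }
  return files;
}

const files = {
  'blog-post/index.html': { title: 'Example blog post', collection: ['blog', 'all'] },
  'cve-post/index.html': { title: 'Super Awesome Uploader 0.1', collection: ['research', 'all'] },
};

applyCollectionMetadata(files, {
  blog: { layout: 'blog.html' },
  research: { layout: 'research.html' },
  error: { layout: 'error.html' },
});

console.log(files['blog-post/index.html'].layout); // blog.html
console.log(files['cve-post/index.html'].layout);  // research.html
```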
File structure
The structure for the generator goes something like this.
assets/
    css/
    img/
    js/
handlebars-helpers/
layouts/
    blog.html
    error.html
    home.html
    research.html
partials/
    disqus.html
    footer.html
    header.html
src/
    blog-post/
        index.md
        image1.png
    cve-post/
        index.md
        image1.png
The contents of these files can be found in the repo at https://github.com/g0blinResearch/g0blin.co.uk-public.
So, what does an example blog post look like? Let's check out the content of blog-post/index.md.
---
title: 'Example blog post'
date: 2016-09-20
autotoc: true
collection:
- blog
- all
---
This is an example blog post. This paragraph will be used in the excerpt - unless it goes over `255` characters in length, then it will be truncated.
### Header 1
This is our first section of content.
### Header 2
Yet another section of content. Let's output an image!
![](image1_thumb.png)
Here comes a code block - are you ready?
    #!/bin/bash
    echo "Booya!"
That's it! The metadata contains everything metalsmith needs to categorise and generate the post, and the following content is the actual... well... content of the post. Simple, no?
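The example post links to image1_thumb.png even though only image1.png exists in the source directory; the thumbnail name comes from the "nameFormat": "%b_thumb%e" setting in the metalsmith-convert config. A small sketch of that naming, assuming %b stands for the file name without its extension and %e for the extension of the converted file (`thumbnailName` is a hypothetical helper, not part of the plugin):

```javascript
const path = require('path');

// Assumption: %b is the file name without its extension and %e is the
// extension of the converted file, including the leading dot.
function thumbnailName(sourcePath, targetExt, nameFormat) {
  const parsed = path.posix.parse(sourcePath);
  const name = nameFormat
    .replace(/%b/g, parsed.name)
    .replace(/%e/g, '.' + targetExt);
  return path.posix.join(parsed.dir, name);
}

console.log(thumbnailName('blog-post/image1.png', 'png', '%b_thumb%e'));
// blog-post/image1_thumb.png
console.log(thumbnailName('cve-post/photo.jpg', 'png', '%b_thumb%e'));
// cve-post/photo_thumb.png
```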
What about the more complicated research or CVE content type? Here's the content for cve-post/index.md.
---
title: Super Awesome Uploader 0.1, Arbitrary File Upload
date: "2016-09-20"
software: Super Awesome Uploader
vendor_notified: true
vendor_responded: false
fix_available: false
version: "0.1"
homepage: https://wordpress.org/plugins/super-awesome-uploader
cve: CVE-2016-XXXX
cvss_score: "9"
cvss_vector: (AV:N/AC:L/Au:N/C:P/I:P/A:C)
attack_scope: remote
authorization_required: None
mitigation: |
  Filter file types prior to accepting an upload. Place .htaccess file in super_awesome directory that prevents PHP/script execution under it.
timeline:
  - date: "2016-08-18"
    text: Discovered
  - date: "2016-08-18"
    text: Reported to WP.org
  - date: "2016-08-18"
    text: CVE ID Assigned
  - date: "2016-09-20"
    text: Advisory released
poc:
  - desc: "We can upload an arbitrary file by posting to the path /wp-content/plugins/super-awesome-uploader/upload.php with the parameter `file`.\n\n"
    text: ""
vulnerability_type: Arbitrary File Upload
collection:
  - research
  - all
---
Arbitrary file upload in Super Awesome Uploader 0.1 allows remote unauthenticated user to upload files of any type. Provides the ability to upload a PHP shell.
So, that's it. Just a ton more metadata associated.
I feel I've rambled on enough about the structure of the site itself. Time to see how it's actually hosted.
Hosting
As I mentioned, up until late 2015, g0blinResearch was being hosted on Wordpress. Following the move to a static site generator, the only real change was symlinking the build directory to the web root of the site.
Early in 2016, I started to think. We're only hosting static files now - why not host it via s3?
Fast forward 9 months, and I've finally achieved my goal over an evening of hacking about with various services, including Github, CircleCI, S3, Route 53 and CloudFront.
Github
There's not much to speak of here. The project is simply hosted on github in a (currently) private repository. The package.json has the start command configured to run node_modules/.bin/metalsmith. The contents of package.json can be seen below.
{
  "name": "g0blin-metalsmith",
  "version": "1.0.0",
  "description": "",
  "main": "index.js",
  "scripts": {
    "test": "",
    "start": "node_modules/.bin/metalsmith"
  },
  "author": "",
  "license": "ISC",
  "dependencies": {
    "bootstrap": "3.3.5",
    "grunt": "^0.4.5",
    "handlebars": "^4.0.1",
    "handlebars-helpers": "^0.5.8",
    "metalsmith": "^2.0.1",
    "metalsmith-assets": "^0.1.0",
    "metalsmith-autotoc": "^0.1.2",
    "metalsmith-broken-link-checker": "^0.1.6",
    "metalsmith-collections": "^0.7.0",
    "metalsmith-collections-metadata": "0.0.1",
    "metalsmith-convert": "^0.3.0",
    "metalsmith-date-formatter": "^1.0.2",
    "metalsmith-debug": "0.0.2",
    "metalsmith-drafts": "0.0.1",
    "metalsmith-excerpts": "^1.0.0",
    "metalsmith-feed": "^0.2.0",
    "metalsmith-filemetadata": "0.0.4",
    "metalsmith-in-place": "^1.3.1",
    "metalsmith-layouts": "^1.4.0",
    "metalsmith-less": "^2.0.0",
    "metalsmith-markdown": "^0.2.1",
    "metalsmith-pagination": "^1.1.1",
    "metalsmith-paths": "^2.1.1",
    "metalsmith-permalinks": "^0.4.0",
    "metalsmith-register-helpers": "^0.1.2",
    "metalsmith-serve": "0.0.3",
    "metalsmith-watch": "^1.0.1"
  }
}
There is also a circle.yml file which defines how CircleCI will behave when it sees a new commit.
dependencies:
  pre:
    - sudo apt-get update; sudo apt-get install libmagick++-dev
deployment:
  prod:
    branch: master
    commands:
      - npm start
      - aws configure set preview.cloudfront true
      - aws s3 sync build s3://g0blin.co.uk/ --delete
      - aws cloudfront create-invalidation --distribution-id XXXXXXXXXXXXXX --paths "/*"
machine:
  node:
    version: 0.12.7
Essentially, this will install the prerequisites for our nodejs modules, and trigger a build when a commit is made to the master branch. It will then configure the aws client to enable the preview features of CloudFront, sync our build directory to our hosting bucket, and then tell CloudFront to invalidate all paths. I want to improve the invalidation so that only posts that were updated are invalidated, but that can wait for another time.
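One possible direction for that narrower invalidation is sketched below. It is not something the current setup does: `invalidationPaths` is a hypothetical helper, and how the list of changed build files would be obtained is left open.

```javascript
// Hypothetical helper: turn a list of changed files in the build directory
// into CloudFront invalidation paths. Directory index files are listed both
// as /dir/index.html and /dir/, since either URL may have been cached.
function invalidationPaths(changedFiles) {
  const paths = new Set();
  for (const file of changedFiles) {
    const urlPath = '/' + file.replace(/^build\//, '');
    paths.add(urlPath);
    if (urlPath.endsWith('/index.html')) {
      paths.add(urlPath.slice(0, -'index.html'.length));
    }
  }
  return [...paths].sort();
}

const changed = ['build/blog-post/index.html', 'build/blog-post/image1_thumb.png'];
console.log(invalidationPaths(changed));
// [ '/blog-post/', '/blog-post/image1_thumb.png', '/blog-post/index.html' ]
```

The resulting paths could then be passed to the --paths argument of aws cloudfront create-invalidation in place of "/*".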
S3
First of all, we need to take into account that as part of this move I want to change domain from research.g0blin.co.uk to just g0blin.co.uk. I also want to direct www.g0blin.co.uk to g0blin.co.uk without the www prefix. I proceed to set up three buckets on S3.
- g0blin.co.uk
  - This will be used to host our static files
- research.g0blin.co.uk
- www.g0blin.co.uk
  - Both of these will purely be used to redirect traffic to g0blin.co.uk
The two buckets for redirects are straightforward to set up. We simply create the buckets, and then enable Redirect all requests to another host name under Static Website Hosting with the value of https://g0blin.co.uk.
The bucket for our content is equally straightforward to set up. We create the bucket, and then enable Enable website hosting under Static Website Hosting, specifying the Index Document and Error Document.
And that's it - we're done with the buckets.
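To make the intended behaviour of the two redirect buckets concrete, here is a small sketch of the mapping from an incoming URL to its redirect target. It only models the outcome (host swapped for g0blin.co.uk, https forced, path kept); `redirectTarget` is a hypothetical function and says nothing about how S3 implements the redirect internally.

```javascript
// Sketch of the redirect outcome: swap the host for g0blin.co.uk, force
// https, and keep the requested path.
function redirectTarget(requestUrl) {
  const url = new URL(requestUrl);
  url.protocol = 'https:';
  url.hostname = 'g0blin.co.uk';
  url.port = '';
  return url.toString();
}

console.log(redirectTarget('http://research.g0blin.co.uk/cve-post/'));
// https://g0blin.co.uk/cve-post/
console.log(redirectTarget('https://www.g0blin.co.uk/page/2/index.html'));
// https://g0blin.co.uk/page/2/index.html
```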
CircleCI
Again, there's very little setup involved in CircleCI. All we need to do is sign up with our github account, and then add our project to our account. After signing up, we browse to https://circleci.com/add-projects, click on our associated github account and then click on the repository we want to build - in this case, research.g0blin.co.uk.
Once this build starts, we should see something like this.
In order to allow CircleCI to make changes to our S3 bucket, and to invalidate paths on CloudFront, we create a new policy with the following content.
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "g0blincouks3deploy",
      "Effect": "Allow",
      "Action": [
        "s3:*"
      ],
      "Resource": [
        "arn:aws:s3:::g0blin.co.uk",
        "arn:aws:s3:::g0blin.co.uk/*"
      ]
    },
    {
      "Sid": "g0blincoukcloudfrontinvalidate",
      "Effect": "Allow",
      "Action": [
        "cloudfront:CreateInvalidation"
      ],
      "Resource": [
        "*"
      ]
    }
  ]
}
Next I create a new user and attach this policy. After creating the user, I should have an Access Key ID and Secret Access Key. I proceed to set these in the Project Settings under AWS Permissions in CircleCI.
That's it for CircleCI - it will now build on commits to our repo, and has permission to perform actions on our target S3 bucket, and to invalidate in CloudFront.
CloudFront
Next up is CloudFront, which will act as the CDN for our static site.
g0blin.co.uk
Firstly, I create a distribution for g0blin.co.uk. I choose to use a Custom SSL Certificate, and proceed to create a wildcard certificate via Amazon's ACM service. I specify the Default Root Object as index.html. I also set the Alternate Domain Names (CNAMEs) as g0blin.co.uk.
For the origin, I specify the Origin Domain Name as the domain provided when I set up static hosting for the g0blin.co.uk bucket - g0blin.co.uk.s3-website-eu-west-1.amazonaws.com. I set the Origin Protocol Policy as HTTP Only.
For the behaviours, I essentially leave the defaults as they will suffice, with the exception of switching Viewer Protocol Policy to Redirect HTTP to HTTPS. I set the origin to the origin we defined above.
The error pages I defined as /500.html and /404.html for their respective status codes.
research/www subdomains
These are pretty straightforward to set up. We essentially follow the setup for g0blin.co.uk, but change the Origin Domain Name to match the two bucket URLs we created above - research.g0blin.co.uk.s3-website-eu-west-1.amazonaws.com and www.g0blin.co.uk.s3-website-eu-west-1.amazonaws.com. We also keep Viewer Protocol Policy as HTTP and HTTPS. We do not need to set the error pages for these distributions, as they will simply be redirected to the g0blin.co.uk distribution.
Route 53
Finally we set up Route 53, which will manage our DNS entries.
First of all, I migrated my domain across to Route 53. This is a pretty straightforward process, and is well documented here.
Next I create an A record as an alias with an empty value in Name, pointing towards our g0blin.co.uk CloudFront distribution.
I repeat the above for the domains research.g0blin.co.uk and www.g0blin.co.uk, pointing them to the appropriate CloudFront distributions.
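The records above were created through the console, but the same alias record can be written down as a Route 53 change batch. The sketch below builds one; `aliasRecordChange` is a hypothetical helper and the distribution domain is a placeholder, not the real value. Z2FDTNDATAQYW2 is the fixed hosted zone ID AWS uses for CloudFront alias targets.

```javascript
// Hypothetical helper building a Route 53 change batch for an alias A record
// that targets a CloudFront distribution.
const CLOUDFRONT_HOSTED_ZONE_ID = 'Z2FDTNDATAQYW2'; // fixed ID for CloudFront alias targets

function aliasRecordChange(recordName, distributionDomain) {
  return {
    Changes: [{
      Action: 'UPSERT',
      ResourceRecordSet: {
        Name: recordName,
        Type: 'A',
        AliasTarget: {
          HostedZoneId: CLOUDFRONT_HOSTED_ZONE_ID,
          DNSName: distributionDomain,
          EvaluateTargetHealth: false,
        },
      },
    }],
  };
}

// Placeholder distribution domain; substitute the one shown for the distribution.
const batch = aliasRecordChange('g0blin.co.uk.', 'dXXXXXXXXXXXXX.cloudfront.net.');
console.log(JSON.stringify(batch, null, 2));
```

Saved to a file, a batch like this could be applied with aws route53 change-resource-record-sets and its --change-batch argument, which would make the three records repeatable rather than hand-entered.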
End result
So, what all of this setup means is that upon a commit to the master branch in github, CircleCI will pull down the changes, run the npm start command and build our static site. It will then sync up with our g0blin.co.uk bucket, and trigger an invalidation in CloudFront.
Visitors to the domains research.g0blin.co.uk and www.g0blin.co.uk are subsequently redirected to g0blin.co.uk, retaining the path requested. The distribution for g0blin.co.uk then will serve up the generated static site from our S3 bucket.
It took quite a bit of time tinkering with CloudFront, S3 and Route 53 in order to get the redirection working, but all in all I'm really pleased with the end result. My site is no longer hosted on redundant hardware, and I no longer need to ssh on to a box in order to pull down changes and deploy them to nginx.
If anyone is particularly interested in the structure of the static site, I've created a public Github repo containing the example content referenced further up in this blog post. You can find it at https://github.com/g0blinResearch/g0blin.co.uk-public.
This blog post was mostly so I could remember how I achieved this, but I know that I spent a good amount of time putting together the metalsmith site itself, as well as connecting all the pieces of the puzzle for the hosting solution, so I thought it may be useful to others as well.