Author Archive for Jeremy Kelaher


Why an Apple Watch that costs tens of thousands of dollars?

On many online forums people are asking how Apple can justify an item that will last maybe two years yet costs as much as a car. The answer is “anchoring and the contrast principle”.


We cannot help ourselves – the stupidly expensive watch makes the cheapest one look cheap (which it is not), and the middle-priced one has the same effect on people looking for value.

The whole pricing structure is designed to optimize revenue and is based on good science.


WordPress VIP Development environment setup (Mac)

Oh boy, what a pain! Setting up Unix tools on a Mac is one of my least favourite things because:

  1. Apple puts Unix stuff in very odd places, mucking up install assumptions
  2. Apple messes with the whole root concept in sometimes counterintuitive ways: sudo effectively means “local admin” – not really root – and lots of installers these days block sudo for safety reasons, which makes installing on a Mac tricky because you NEED “elevated” permission to do many things.
  3. Apple builds in key “optional extras” that have now become core to Unix-style installers – Python, Perl and Ruby. The Apple versions, even in up-to-date installs, are … pants (English insult meaning “bad”). This means lots of parallel installs of key scripting enablers (more on this later)
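You can check point 3 for yourself before starting any install session. A quick sanity check (plain POSIX shell, nothing assumed beyond standard tools) to see which interpreters your shell will actually pick up:

```shell
# Report which ruby/python/perl the shell will use – on a stock Mac
# these resolve to Apple's copies under /usr/bin until you install
# parallel versions and put them first in PATH.
for tool in ruby python perl; do
  command -v "$tool" || echo "$tool: not found"
done
```

If everything points into /usr/bin, you are on Apple's versions.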

So here is what I had to do, more or less blow-by-blow.

The WordPress “official” instructions are here: (slightly ironic title 🙂

So … I am behind a corporate proxy, and I am going to need authoxy (or similar)

download authoxy, then:

  1. run the download
  2. double-click the Authoxy pkg (control-click –> open –> open might be needed depending on your “unknown package” settings)
  3. you might need local admin to install this! Type in your username and password
  4. open “System Preferences”
  5. go to Network, choose the network you connect to your corp network on, and open it. Go to Proxies, select “Automatic Proxy Configuration”, copy the “URL”, and back out to the top level
  6. authoxy settings are in the bottom section of “System Preferences”; open them
  7. select “use automatic configuration (pac) file” and paste the “URL” in
  8. put in your network username/password
  9. change the authoxy port to (say) 9000
  10. start authoxy with the button top left
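One gotcha: command-line tools do not read the System Preferences proxy settings, so once authoxy is running you also want the usual environment variables pointing at it. A sketch, assuming you picked port 9000 as above:

```shell
# Route CLI tools (curl, git, gem, etc.) through the local authoxy instance.
export http_proxy="http://localhost:9000"
export https_proxy="http://localhost:9000"
# Some tools only honour the uppercase spellings:
export HTTP_PROXY="$http_proxy"
export HTTPS_PROXY="$https_proxy"
```

Put these in your ~/.profile if you want them in every terminal.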

download VirtualBox, install.

download Vagrant, install.

When I saw that Vagrant uses Ruby, my heart sank. Not because I hate Ruby, but because I always seem to have issues with it on Macs.

Your built-in Ruby may well be no good. If later steps fail with random build errors and dependency issues, try a newer ruby.

Now crack open a terminal.

> ruby --version

Mine showed 2.0. As things turned out, I needed at least 2.1. So I used the tasty ruby-install to give me a better copy and then changed my PATH to look there first.
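For the record, that dance looked roughly like this – a sketch only: the ruby-install lines are commented out since they assume Homebrew and take a while to build, and 2.1.5 is just the 2.1.x version I happened to grab:

```shell
# brew install ruby-install        # get the installer itself
# ruby-install ruby 2.1.5          # builds into ~/.rubies/ruby-2.1.5
# Put the new Ruby ahead of Apple's /usr/bin/ruby:
export PATH="$HOME/.rubies/ruby-2.1.5/bin:$PATH"
# ruby --version                   # should now report 2.1.x
```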

> vagrant plugin install vagrant-proxyconf

If that works, you need to create a Vagrant proxy file. I used vi:

> vi ~/.vagrant.d/Vagrantfile


Vagrant.configure("2") do |config|
  if Vagrant.has_plugin?("vagrant-proxyconf")
    # point these at your local proxy – with authoxy set up as above that is port 9000
    config.proxy.http     = "http://localhost:9000"
    config.proxy.https    = "http://localhost:9000"
    config.svn_proxy.http = "http://localhost:9000"
  end
end

> cd vip-quickstart

> vagrant up

wait for it … pray … wait

> vagrant provision

Wait, and maybe repeat the provision … if you get any network errors, your proxy might not be set up right.

The first time I ran provision I got a blank 200 response on vip.local (where your dev site will “live”) and some odd network errors. A second go resolved that.

Victory!



The tyranny of distance, or why webscale in Australia is tough

While trying to understand some site performance observations with a guru from WordPress VIP, Mr Barry Abrahamson, I was reminded just how odd routes from Australia to the USA can sometimes be.

In my case from a Telstra ISP 4G service in Sydney:

  • one host resolved to Brisbane, Australia, via the Sydney Kent St exchange about 2 km from my location in inner Sydney – an odd little route given EdgeCast has a POP in Sydney!
  • another routed to Hong Kong, then on to the Bay Area and the landings near LA
  • a third routed to Texas (San Antonio, via Dallas peering), having come Hong Kong –> Taiwan before jumping the Pacific – seemingly not the most direct route 🙂

3 distinct locations and routes for one web page.

I see these odd routes a lot, so perhaps a little look at how to trace and decode the output is in order.

The first thing you need is a machine that can run traceroute, ping and curl – I will use my trusty Mac Pro. Secondly you need a connection that is not excessively firewalled – many corporate environments block these useful tools, and some ISPs and home firewalls might too. If you have a firewall at home you might need to allow ICMP.

Ok so first we ping each host:

ping -c3 -n
PING ( 56 data bytes
64 bytes from icmp_seq=0 ttl=57 time=78.895 ms
64 bytes from icmp_seq=1 ttl=57 time=76.929 ms
64 bytes from icmp_seq=2 ttl=57 time=137.781 ms

ping -c3 -n
PING ( 56 data bytes
64 bytes from icmp_seq=0 ttl=49 time=613.950 ms
64 bytes from icmp_seq=1 ttl=49 time=556.325 ms
64 bytes from icmp_seq=2 ttl=49 time=762.889 ms

ping -c3 -n
PING ( 56 data bytes
64 bytes from icmp_seq=0 ttl=45 time=963.449 ms
64 bytes from icmp_seq=1 ttl=45 time=572.970 ms
64 bytes from icmp_seq=2 ttl=45 time=805.222 ms

OK, from this I can clearly see that the first host is somewhere fairly local, while the other two are “over the pond”. It also looks like the third might be farther away than the second – interesting. Running ping a couple more times (I usually do it at least 3 times) will help catch any other anomalies.

So now to the big guns – traceroute:

traceroute -I
traceroute to (, 64 hops max, 72 byte packets
1 (  28.801 ms  24.031 ms  1.678 ms
2  * * *
3  * * *
4  * * *
5 (  170.490 ms  30.207 ms  28.705 ms
6 (  32.643 ms  29.309 ms  32.602 ms
7 (  27.004 ms  24.701 ms  31.644 ms
8 (  28.148 ms  28.552 ms  30.501 ms

mdsl026351:~ macadmin$ traceroute -I
traceroute: Warning: has multiple addresses; using
traceroute to (, 64 hops max, 72 byte packets
1 (  1.911 ms  1.460 ms  1.406 ms
2  * * *
3  * * *
4  * * *
5 (  1253.110 ms  111.398 ms  24.262 ms
6 (  32.387 ms  29.947 ms  40.252 ms
7 (  31.405 ms  33.614 ms  38.195 ms
8 (  42.390 ms  36.117 ms  31.180 ms
9 (  30.383 ms  31.534 ms  32.370 ms
10 (  202.327 ms  201.500 ms  523.056 ms
11 (  603.329 ms  346.085 ms  577.464 ms
12 (  613.410 ms  614.335 ms  613.184 ms
13 (  307.141 ms  922.118 ms  920.883 ms
14 (  921.686 ms  607.118 ms  466.507 ms
15  * * *
16 (  812.169 ms  613.593 ms  614.946 ms

mdsl026351:~ macadmin$ traceroute -I
traceroute: Warning: has multiple addresses; using
traceroute to (, 64 hops max, 72 byte packets
1 (  24.285 ms  5.433 ms  5.907 ms
2  * * *
3  * * *
4  * * *
5 (  169.809 ms  29.662 ms  29.179 ms
6 (  30.460 ms  31.108 ms  29.384 ms
7 (  32.356 ms  40.007 ms  40.267 ms
8 (  36.161 ms  32.148 ms  41.099 ms
9 (  39.351 ms  39.832 ms  40.266 ms
10 (  394.816 ms  258.567 ms  2206.390 ms
11 (  575.466 ms  614.253 ms  614.506 ms
12 (  614.388 ms  615.384 ms  613.874 ms
13 (  306.713 ms  921.466 ms  642.280 ms
14 (  584.913 ms  700.331 ms  614.389 ms
15  * * *
16  * * *
17   (  718.998 ms  477.185 ms  442.695 ms

OK, so let's pick this apart:

The first thing to know is that all network engineers love order, and the naming of router nodes is easy to decode once you know how. Each line in a traceroute is a router, and each router is hit with an ICMP request that says (in effect) “name thyself”. Some say “go away!” (the * * * lines). The time this “ping” takes is measured, and this helps judge the accumulated “cable distance” – you will note a general trend for times to go up at each step.

  • the first hop is clearly the closest
  • gotta love Telstra – very good node naming! From Sydney we go to …
  • some boundary node, and then onwards
  • a quick squiz in a tool like Info Sniper or MaxMind will allow you to infer that the next node seems to be in Brisbane and is EdgeCast. “No!” I hear you cry, “that address is in the USA!” Well, no. The address range is assigned in the USA, but the ping time surely tells us it is much nearer. How can this be? Easy – EdgeCast uses Anycast BGP, which in layman's terms means its IP addresses are in many places at once. The Brisbane hop just before it is the giveaway – no route from Australia gets to the USA in one hop (more on this in a sec)

  • the other two hosts are clearly over the pond – both go to Hong Kong, seemingly without any other hops first (more on this later)
  • the San Antonio (sat) host then goes on to Texas via Taiwan, again with little in between (!)
  • the main domain routes to LA, straight from Hong Kong

So some odd things are:

  1. why Taiwan to Texas direct?
  2. how does one get from Australia to Hong Kong direct?

The answer to both these questions is the joys of optical networking, specifically ADM (Add-Drop Multiplexing). While there is no actual unbroken path between Australia and Hong Kong (Guam is in the middle) or between Taiwan and Texas (the Pacific and several states of the Union are in the way), there are high-capacity “trails” (leased circuits) – digitally fractionated parts of massive optical undersea and land-based trunks that effectively bridge the distance between those points at close to the speed of light. Because these trails are switched below the IP layer, the intermediate landing points never decrement the TTL, so traceroute shows them as one giant hop.
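Even “close to the speed of light” has a hard floor, though. Light in fibre travels at roughly 200,000 km/s (about two-thirds of c), so for an assumed ~12,000 km cable path from Sydney to the US west coast the best possible round trip works out as:

```shell
# RTT floor in ms = 2 * distance_km / 200 (200,000 km/s = 200 km per ms)
echo $(( 2 * 12000 / 200 ))   # → 120 (ms, before any router or server time)
```

Real pings are always worse than this floor, which is why no amount of server tuning makes a trans-Pacific page load feel local – hence CDNs.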

The internet is stranger and more beautiful than most appreciate 🙂

Onwards and upwards 🙂


Here are my slides for WordCamp Sydney 2014

Here is my presentation from WordCamp Sydney on NewsCorp and our plans with WordPress.

WordCamp 2014 Sydney PDF (c) 2014 NewsCorp, please ask for permission before reuse.


DNS fun and games – an Aussie site on VIP

The average user probably does not notice it, but there is a complex dance underneath every delivery of a page from a site. Basically it goes like this:

  1. the browser resolves the site's hostname via DNS
  2. the name resolves to multiple IPs and the browser's computer picks one (this is a slightly complex area – either the top one is chosen or some “closest” logic is used)
  3. an HTTP GET is done on the selected IP
  4. the page that is returned is searched for links to images, css, js etc. and those files are fetched. JS scripts are run and they may also fetch files. A site does not put all its files in a single place – a quick look at the network view in your browser's developer tools will reveal:

  • stuff coming right from your own domain
  • subdomains selected seemingly at random, as well as more specific ones
  • requests to the stats endpoints
  • requests to the public API
  • your subdomain, but on a different parent domain
  • media subdomains, again selected seemingly at random
  • calls to third-party domains
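You can replay steps 1–3 by hand. A sketch using dig and curl – is just a stand-in hostname, and the commands are guarded so they fail quietly if you are offline:

```shell
HOST=""   # substitute the site you are inspecting
# Steps 1–2: what the name resolves to (possibly several IPs)
dig +short "$HOST" 2>/dev/null || true
# Step 3: headers only – enough to see the server and cache behaviour
curl -sI "https://$HOST/" 2>/dev/null | head -n 5 || true
```

Run the dig against each domain the network view shows and the multi-domain dance becomes very visible.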

Some of this is cached close to you via a CDN (EdgeCast), and some “goes to origin” at one of the three datacenters uses. It is the latter stuff that concerns me as a webscale guy – more on this later.

So what are all these domains for?

  1. your domain is where PHP “lives”, fronted by good old nginx and Batcache goodness
  2. the static subdomains are where theme things like css and js live, cached close to you by EdgeCast. As to why all the subdomains? Good question – there are two main reasons this is done: one is to share load across source servers (not needed in this case thanks to EdgeCast), the other is to “trick” browsers into opening more connections at once (some browsers have a rule to the effect of “only 4 connections per domain”)
  3. the stats endpoints are where some internal reporting is driven from – a “pixel” in webscale speak is a tiny image (generally made invisible via CSS) on a page that tracks the progress of a page load. When the pixel is hit, the web server on the other end (nginx again in this case) logs the hit along with all the request's parameters and cookies and registers a “hit”. The log is then mined using big data techniques or a real-time pipeline. The WordPress ones are tiny smiles, nice.
  4. the public API is interesting – REST services that are called continually from your pages via scripts to update the page on the fly without a reload; /notifications is the one I noted
  5. the media subdomains are where assets like images live, care of nginx, and this includes cool toys like the ability to rescale images on request
  6. good old Gravatar – a very cool service from Automattic that stores not just your profile picture but also a profile page and a whole API that can give you structured data in many forms (JSON, PHP, QR code etc). Check me out.
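Gravatar's API really is neat: everything is keyed by the MD5 hash of your trimmed, lowercased email address, and the JSON profile lives at a predictable URL. A sketch – the address here is made up, and on a Mac you would use `md5 -q` instead of `md5sum`:

```shell
EMAIL=" "          # hypothetical address
# Gravatar hashes the trimmed, lowercased address:
HASH=$(printf '%s' "$EMAIL" | tr -d ' ' | tr '[:upper:]' '[:lower:]' \
        | md5sum | awk '{print $1}')
echo "$HASH.json"
# fetch the profile with: curl -s "$HASH.json"
```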

The goal of all this is to make your page load quickly – if you are an aspiring PHP hoster or webscale architect you could do worse than to look at what are up to, because they are an exemplar of “tier 1” internet design (the other candidates off the top of my head being Google, Facebook, eBay, PayPal and Amazon).

So why am I trying to improve on perfection? In a word, distance – my employer's customers in Australia are a long way from the core of the internet in the USA, and while do an awesome job of hiding this, it does not work for the complex pages used by news sites. So I need to add another layer in front of those parts of the stack that are not on EdgeCast. The full complexity can wait for a future post, but basically we route some non-EdgeCast traffic (particularly those PHP pages on our main domains) via a CDN called Akamai.

Akamai is the unsung hero of the internet – something like half of all your internet traffic probably uses Akamai, particularly images, audio and video, and in countries like Australia it is critical. Akamai “stacks”, as the servers are called, live out on the edge of the network, right near where your ADSL or cable connection is aggregated for connection to the internet by your ISP. The stacks intercept the traffic they are told to manage and perform complex mapping and caching, even inserting parts of pages on the fly. This allows us to respond as if a highly customized page were locally hosted in Australia rather than on remote servers in Chicago, San Antonio, or Dallas.

Onwards and upwards 🙂


After the Camp

WordCamp Sydney 2014 was lots of fun – still processing it all, to be honest; more thoughts over the coming days.

My presentation on our upcoming use of WordPress VIP at NewsCorp seemed to go down well; I will link to it once it is posted (wherever that ends up).

So this week I am wrestling with DNS and WordPress VIP. This is a bit tricky for us – traditionally our sites redirected from the bare domain to www, but WordPress VIP works the other way around.

Also, we use the advanced and rather splendid DNS from Akamai, whereas WordPress prefers to use their own. That would mean giving up Akamai features for any subdomains that contain non-WordPress services like our API and node.js, which would be negative for both security and scalability. Subdomains are used this way so that same-domain policies in JavaScript are managed simply.

The “modern” solution of course is to use CORS or JSONP for our non-WordPress services so domains matter less, but each of these has issues – CORS only works on recent browsers, and JSONP can cause some evil non-cacheability because the callbacks in the requests have names the service provider cannot control.

So in the end I think we just need to compromise and keep our current DNS provider.

Onwards and upwards 🙂



WordCamp 2014 fun

Well it has been a while, sorry about that 🙂

I intend to reboot this blog with my experience of deploying some really big WordPress sites at NewsCorp Australia.

I have been invited to speak at WordCamp Sydney about the use of WordPress at NewsCorp Australia. So as not to spoil the presentation, I will update with new posts from Monday.


WordCamp Sydney September 27-28, 2014
