What does the average Gannett news site look like?
I run across lots of similar-looking news websites while doing research for my internship at the Progressive Dairyman magazine. A significant portion of these sites seem to use the same template, with some color and menu changes. Turns out they're all owned by Gannett, so I asked myself, "What does the average Gannett website look like?"
It looks like this, and you should click to embiggen.
That's an average of about 150 Gannett-owned websites, including some of their corporate ones. I got the list from here, screenshotted each one using PhantomJS, and combined the images with ImageMagick. It took about three hours to run.
At 1600 pixels wide, the sites on average have a strong three-column grid with a right rail for ads, some sort of hero story or a fancy header setup, occasionally a red alert bar at the top, a usually-blue right menu, and a black sidebar for balance.
Here's what it looks like when you add about 175 UK sites that Gannett owns:
Their UK sites use a three-column grid that is wider than the one on the US sites, and the pages are generally much longer. This image took about thirteen and a half hours to complete. The full list of sites, with some duplicates, is here.
Here's the command to combine the images:
convert -background transparent -gravity North -resize "1600x6000>" -extent 1600x6000 img/*.png -evaluate-sequence mean average-gannett-site.png
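For intuition about what -evaluate-sequence mean produces: each output pixel is the mean of that same pixel across every input image. The sketch below shows the idea in plain Python on three made-up 2x2 grayscale "screenshots". The function name running_mean and the sample data are hypothetical, and this is not a description of how ImageMagick is implemented internally.

```python
def running_mean(images):
    """Pixel-wise mean of equally sized images given as flat lists of numbers.

    Only one accumulator is kept, so images can be consumed one at a time.
    """
    mean = None
    for count, image in enumerate(images, start=1):
        if mean is None:
            # The first image seeds the accumulator.
            mean = [float(value) for value in image]
            continue
        if len(image) != len(mean):
            raise ValueError("all images must have the same number of pixels")
        # Incremental update: new_mean = old_mean + (x - old_mean) / count
        mean = [m + (value - m) / count for m, value in zip(mean, image)]
    if mean is None:
        raise ValueError("need at least one image")
    return mean


# Three made-up 2x2 grayscale images, flattened row by row (hypothetical data).
shots = [
    [0, 0, 255, 255],
    [0, 255, 255, 0],
    [0, 255, 255, 30],
]
print(running_mean(shots))  # [0.0, 170.0, 255.0, 95.0]
```

Pixels that agree across sites (the first and third here) keep their value, while pixels that differ blur toward a middle tone, which is why shared layout elements stay sharp in the averaged image.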
When I first ran the command, I dropped a 0 from the first 6000, which resized all images proportionally to fit within a 600-pixel-tall space and then padded them out to 6000 pixels tall. The first attempt took a little under two hours to render, and was an utter failure:
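The failure follows from how the resize geometry works. A geometry ending in > shrinks an image proportionally until it fits inside the given box and never enlarges it. The sketch below approximates that rule in Python to compare the intended 1600x6000 box with the mistyped 1600x600 one. The helper names fit_within and coverage and the 1600x6000 example screenshot are assumptions for illustration, and ImageMagick's exact rounding may differ by a pixel.

```python
def fit_within(width, height, max_width, max_height):
    """Approximate the "WxH>" geometry: shrink to fit the box, never enlarge."""
    scale = min(max_width / width, max_height / height, 1.0)
    return max(1, round(width * scale)), max(1, round(height * scale))


def coverage(size, canvas):
    """Fraction of the padded canvas area that the resized image occupies."""
    return (size[0] * size[1]) / (canvas[0] * canvas[1])


canvas = (1600, 6000)          # the -extent used in the command
screenshot = (1600, 6000)      # hypothetical full-length screenshot

intended = fit_within(*screenshot, 1600, 6000)
mistyped = fit_within(*screenshot, 1600, 600)

print(intended, coverage(intended, canvas))  # (1600, 6000) 1.0
print(mistyped, coverage(mistyped, canvas))  # (160, 600) 0.01
```

Under these assumptions, a full-length page shrinks to a 160-pixel-wide strip at the top of the canvas, covering about 1% of it, and the remaining 99% is transparent padding.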
Rendering these using ImageMagick is more memory-intensive than it is processor- or graphics-intensive. I ran the job on an Intel i5 with an NVIDIA NVS 5100M and 8 GB of RAM. The bottleneck was definitely memory: ImageMagick kept about 15 GB of data in my hard drive's /tmp directory while using 7 GB of RAM. I recommend using an SSD and a lot more RAM.
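A back-of-the-envelope estimate suggests why the job spilled to disk. The sketch below assumes that every padded 1600x6000 image is held at once and that each pixel takes four channels of two bytes each, as in a typical 16-bit-per-channel build. Both are assumptions rather than measurements, the image counts are the approximate ones quoted above, and the helper name stack_bytes is hypothetical.

```python
def stack_bytes(num_images, width, height, channels=4, bytes_per_channel=2):
    """Uncompressed size of a stack of images, in bytes.

    channels=4 (RGBA) and bytes_per_channel=2 are assumed defaults; other
    ImageMagick builds may use 1 or 4 bytes per channel.
    """
    return num_images * width * height * channels * bytes_per_channel


us_only = stack_bytes(150, 1600, 6000)        # about 150 US sites
us_and_uk = stack_bytes(150 + 175, 1600, 6000)  # plus about 175 UK sites

print(f"{us_only / 1e9:.2f} GB")    # 11.52 GB
print(f"{us_and_uk / 1e9:.2f} GB")  # 24.96 GB
```

Either figure is well beyond 8 GB of RAM, which is in line with the roughly 22 GB of combined RAM and /tmp usage described above.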
All the code used to source, save, and stack these images is on GitHub.