Thursday, April 21, 2016

Can you Judge a Book by its Cover?

"they've all got the same covers, and I thought they were all o' one sample, as you may say. But it seems one mustn't judge by th' outside. This is a puzzlin' world." The Mill on the Floss by George Eliot
What is the correlation between people's ratings of a book's cover and the ratings the book receives? This post is about a game devised to get people to rate book covers, and it gives some great visualisations comparing a book's Goodreads rating to its cover rating. Its authors gathered over 3 million ratings of 100 covers.

I took their data and calculated the average rating for each of the covers they tested. I then scraped the Goodreads average rating, number of ratings and number of reviews for each of these 100 books. The data table and the code I used to scrape and aggregate are here. All sorts of accuracy warnings apply to these results. The main one is that the books and their covers all look pretty good to me: they are not on the self-published fan fiction end of the market.

The variables here are:
num_ratings: number of Goodreads ratings.
rating: average Goodreads rating of the book.
num_reviews: number of people who have actually written a review.
cover_rating: the average rating people gave the cover of the book.

> cor(rating,cover_rating)

[1] 0.1609114

> cor(num_ratings,num_reviews)

[1] 0.9597442

> cor(rating,num_ratings)

[1] 0.2141307

> cor(rating,num_reviews)

[1] 0.2658916

> cor(num_ratings,cover_rating)

[1] 0.3059627

> cor(num_reviews,cover_rating)

[1] 0.3307553

So no, you can't judge a book by its cover: the correlation between cover rating and book rating is only 0.16. You can guess the number of ratings from the number of reviews (0.96). You can't guess how highly rated a book is from its number of ratings (0.21). Having a good cover might increase the number of reviews your book gets by a bit (0.33), though a correlation alone cannot show that the cover is the cause.
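One rough way to read these numbers is to square each correlation, which gives the share of variance that one variable accounts for in the other under a simple linear fit. The sketch below only reuses the six correlations printed above; the vector names are my own labels, not names from the original script.

```r
# Correlations copied from the R output above; the labels are illustrative
r <- c(rating_vs_cover        = 0.1609114,
       num_ratings_vs_reviews = 0.9597442,
       rating_vs_num_ratings  = 0.2141307,
       rating_vs_num_reviews  = 0.2658916,
       num_ratings_vs_cover   = 0.3059627,
       num_reviews_vs_cover   = 0.3307553)

# Squared correlation: share of variance accounted for by a linear fit
r_squared <- r^2
print(round(r_squared, 3))
```

On this reading the cover rating accounts for only about 3% of the variation in book rating, while the number of reviews accounts for about 92% of the variation in the number of ratings.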

The conclusion is that you shouldn't judge a book by its cover, or by its number of sales (ratings). But people probably do judge books by their cover a bit.

Monday, March 07, 2016

Maps to hide places

Logaskino was a military base in Siberia. The book "How to Lie with Maps" talks about how the Soviets would move the location of military bases around on maps. These maps show one small base (now abandoned) and the local river, and how Soviet mapmakers moved them around over 30 years in an attempt to confuse enemies.

Friday, January 22, 2016

England's Temperature in 2015

Nine days in 2015 were the hottest for that day of the year since 1772. This compares to three in 2014, though 2014 had a hotter average temperature and was the hottest year on record in the UK.

England has collected daily temperature data since 1772 in the Hadley Centre Central England Temperature (HadCET) dataset.

I downloaded this Hadley Centre dataset and followed this tutorial, which is based on an original graphic by Tufte.


Here the black line is the average temperature for each day last year. The dark line in the middle is the historical mean of the daily average temperature (95% confidence). The wider straw-coloured lines represent the highest and lowest average daily temperature ever recorded on that day since 1772. The red dots are the days in 2015 that were hotter than the same day in any other year since 1772.

Looking at the black line that represents last year's temperatures, it was the winter and autumn that were far above average. Instead of a scorching hot summer, most of the record hot days were in November and December. 2014 had the same pattern of a hot winter. No day in 2015 was the coldest for that date in the recorded period.
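The record-day check behind the red dots can be sketched in a few lines of R. This is a minimal sketch on made-up toy numbers, not the HadCET file: the column names (year, day, temp) and the values are assumptions for illustration, and the real dataset needs reshaping before it looks like this.

```r
# Hypothetical toy data: two days of the year over three years
temps <- data.frame(
  year = c(2013, 2014, 2015, 2013, 2014, 2015),
  day  = c(1, 1, 1, 2, 2, 2),
  temp = c(5.0, 6.0, 7.5, 4.0, 8.0, 3.0)
)

past    <- temps[temps$year < 2015, ]
current <- temps[temps$year == 2015, ]

# Highest and lowest past temperature for each day of the year
past_max <- aggregate(temp ~ day, data = past, FUN = max)
past_min <- aggregate(temp ~ day, data = past, FUN = min)
names(past_max)[2] <- "record_high"
names(past_min)[2] <- "record_low"

# Put 2015 next to the past records and flag the days that beat them
cmp <- merge(merge(current, past_max, by = "day"), past_min, by = "day")
cmp$is_record_high <- cmp$temp > cmp$record_high
cmp$is_record_low  <- cmp$temp < cmp$record_low

record_high_days <- cmp$day[cmp$is_record_high]
record_low_days  <- cmp$day[cmp$is_record_low]
print(cmp)
```

In the toy data day 1 is a record high and day 2 a record low for 2015. Counting the flagged rows on the real data is how a figure like the nine record days would be obtained.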

Sunday, January 17, 2016

In 2100 there will be a kilometer tall building

I was in the Burj Khalifa last week. It is very big. But when will a bigger building be built? I want to look at the trend in building height to see what the trend line says, taking the data from the Wikipedia page on the tallest buildings. Two eras are shown there: the religious era (1200-1901) and the skyscraper era. I put the data in a csv here.

The correlation here is cor(Year,Height) [1] 0.39831, which isn't much. Basically, cathedrals burned down and were replaced by a similar-sized world's tallest building from 1200 until 1900.

Looking just at the skyscraper era, from 1884 on, cor(Year,Height) [1] 0.9340458, which really looks like height increases steadily with time. Running this as a linear regression, the kilometer-tall building is not expected until the end of the century.

linearModelVar <- lm(Height ~ Year, newdata)

linearModelVar$coefficients[[2]]*2010+linearModelVar$coefficients[[1]]

646.6246. The trend line predicts a tallest building of about 647 m for 2010, so the Burj Khalifa, at 828 m, was much taller than any building was expected to be that year.

linearModelVar$coefficients[[2]]*2099+linearModelVar$coefficients[[1]]

1002.799. Finally, a kilometer-tall building in 2099.

linearModelVar$coefficients[[2]]*2241+linearModelVar$coefficients[[1]]

1604.903. That is roughly a mile (about 1,609 m), so a mile-high tower is not expected until around 2241, far into the future.
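Rather than trying years by hand, the fitted line can be inverted to ask in which year it reaches a given height. The sketch below is run on made-up toy data that lies exactly on a line, not on the Wikipedia figures, and year_for_height is a hypothetical helper name of my own.

```r
# Hypothetical toy data lying exactly on Height = 4 * Year - 7400
toy <- data.frame(Year   = c(1900, 1950, 2000, 2010),
                  Height = c(200, 400, 600, 640))

fit <- lm(Height ~ Year, data = toy)

# Forward: predicted height for a given year
height_2099 <- unname(predict(fit, newdata = data.frame(Year = 2099)))

# Inverse: solve target = intercept + slope * year for year
year_for_height <- function(model, target) {
  b <- coef(model)
  unname((target - b[["(Intercept)"]]) / b[["Year"]])
}

km_year <- year_for_height(fit, 1000)
print(c(height_2099 = height_2099, km_year = km_year))
```

Calling year_for_height(linearModelVar, 1000) on the real model would give the year directly, with the usual caveat that extrapolating a straight line this far ahead is only a bit of fun.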

Saturday, January 16, 2016

Is Netflix making us smarter?

Vox has an article that mentions the artistic benefits of on-demand TV viewing:
"The first factor was the rise of the DVR, which has made it cheaper and easier than ever before for people to record their favorite shows and watch them at their leisure. This has been great for television artistically, since it means creators can now more readily assume that every single episode of their show will be consumed in sequence."

Steven Johnson's book "Everything Bad Is Good For You" analyses the complexity of TV programs from the 1970s and today and shows how much more complex modern ones are. Compare Columbo, where one murderer is shown at the start and it takes 70 minutes for them to be found out, with a more modern CSI, which is 43 minutes of multiple plots with loads of characters.

The Vox piece points out that episodic series like CSI, with few series-long story arcs, now seem outdated. Viewers are now expected to retain information about longer plots, which means there are more details about the characters and their relationships to track. Series you can play back at any time may be cognitively as well as artistically beneficial.

Tuesday, December 01, 2015

Tiny Bits of Land People Fight Over #1 Rockall

People will fight over any bit of land. "Rockall is about 25 metres (80 ft) wide and 31 metres (100 ft) long at its base and rises sheer to a height of 17.15 m (56.27 ft)" (from Wikipedia).

A probably fake photo from 1974 of HMS Tartar's trip there. 'A sentry-box was constructed on Hall's Ledge, with two marines in full ceremonial uniform posted alongside, and the Union Flag was hoisted above.'

Every now and again Britain lands some people on this lump and takes a photo to prove it is theirs. 'Former SAS member and survival expert Tom McClean lived on the island from 26 May 1985 to 4 July 1985 to affirm the UK's claim to the island'. Waves roll over the island, so he had to hide in a bolted-down giant coffin for the duration.


They do this partly because owning the Falklands isn't grim enough for them. And partly for all the oil and gas and such that might be between Rockall and Ireland.

Friday, November 20, 2015

Bombing Back to the Stone Age

There is a common meme, easy to find with a Twitter search, about bombing people back to the Stone Age.

If you read Jared Diamond or Steven Pinker, they talk about the really high levels of violence in the Stone Age.
Or, to describe it with statistics:

"By many estimates, 10 to 20 percent of all Stone Age humans died at the hands of other people.
This puts the past 100 years in perspective. Since 1914, we have endured world wars, genocides and government-sponsored famines, not to mention civil strife, riots and murders. Altogether, we have killed a staggering 100 million to 200 million of our own kind. But over the century, about 10 billion lives were lived — which means that just 1 to 2 percent of the world’s population died violently. Those lucky enough to be born in the 20th century were on average 10 times less likely to come to a grisly end than those born in the Stone Age. And since 2000, the United Nations tells us, the risk of violent death has fallen even further, to 0.7 percent."
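The arithmetic in the quote is easy to check in R. This sketch uses only the figures quoted above; the variable names are my own.

```r
# Figures taken from the quote above
violent_deaths_20c <- c(low = 100e6, high = 200e6)  # people killed since 1914
lives_lived_20c    <- 10e9                          # lives lived over the century
stone_age_share    <- c(low = 0.10, high = 0.20)    # share killed in the Stone Age

# Share of 20th century lives that ended violently
share_20c <- violent_deaths_20c / lives_lived_20c

# How many times more dangerous the Stone Age was
risk_ratio <- stone_age_share / share_20c

print(share_20c)
print(risk_ratio)
```

The shares come out at 1 to 2 percent and the ratio at 10 for both ends of the range, matching the quote.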


To reduce violence, don't send people back to the Stone Age.