Live at the Witch Trials

Thursday, April 21, 2016

Can you Judge a Book by its Cover?

"they've all got the same covers, and I thought they were all o' one sample, as you may say. But it seems one mustn't judge by th' outside. This is a puzzlin' world." The Mill on the Floss by George Eliot

What is the correlation between peoples ratings of a books cover and the ratings the book receives? This post is about a game devised to get people to rate book covers and gives some great visualisations comparing a books goodreads rating to its cover rating. They gathered over 3 million ratings of 100 covers.

I took their data and got the average rating for each of the covers they tested. I then scraped these 100 books Goodreads average ratings, number of ratings and number of reviews. The Data table and the code I used to scrape and aggregate is here. There are all sorts of accuracy warnings you can imagine around these results. The main ones being that the books and their covers all look pretty good to me. They are not on the self published fan fiction end of the market. The variables here are. num_ratings: Number of Goodreads ratings. rating: average rating of the book. num_reviews: Number of people who have actually written a review. cover_rating: The average rating people gave the cover of the book.

> cor(rating,cover_rating)
[1] 0.1609114
> cor(num_ratings,num_reviews)
[1] 0.9597442
> cor(rating,num_ratings)
[1] 0.2141307
> cor(rating,num_reviews)
[1] 0.2658916
> cor(num_ratings,cover_rating)
[1] 0.3059627
> cor(num_reviews,cover_rating)
[1] 0.3307553

So no you can't judge a book by its cover the correlation in ratings is only .16. You can guess the number of ratings by the number of reviews. You can't guess how highly rated a book is by the number of ratings. Having a good cover might increase the number of reviews your book gets by a bit.

The conclusion is you shouldn't judge a book by its cover. Or by its number of sales (ratings). But people probably do judge books by their cover a bit.

Monday, March 07, 2016

Maps to hide places

Logaskino was a military base in Siberia. Over 30 years Soviet mapmakers moved it around maps to throw off enemies "How to lie with maps" talks about how the Soviets would move around the location of military bases on maps. These maps show one small base (now abandoned) and the local river and how it moved around on maps over 30 years in order to attempt to confuse enemies

Friday, January 22, 2016

England's Temperature in 2015

Nine days in 2015 were the hottest for that day of the year since 1772. This compares to three in 2014, though 2014 had a hotter average temperature and was the hottest year on record in the UK.

England has a collected data on daily temperature from 1772 in the Hadley Centre Central England Temperature (HadCET) dataset.

I downloaded this Hadley Centre dataset. And I followed this tutorial. Based on an original graphic by Tufte.

Here the black line is the average temerature for each day last year. The dark line in the middle is the average average temperature (95% confidence). the staw coloured bigger lines represent the highest and lowest average daily temperature ever recorded on that day since 1772. the red dots are the days in 2015 that were hotter than any other day at that time of year since 1772.

Looking at the black line that represents last years temperatures it was the Winter and Autumn that were far above average. Instead of a scorching hot summer most of the record hot days were in November and December. 2014 had the same pattern of a hot Winter. No day in 2015 was the coldest for that date in the recorded time.

Sunday, January 17, 2016

In 2100 there will be a kilometer tall building

I was in the Burj Khalifa last week. It is very big. But when will some bigger building be built? I want to look at the building height trend to see what the trend line says. Talking the wikipedia page on the Tallest Building. There are two eras shown. The religious era (1200-1901) and the Skyscraper era. I put the data in a csv here.

The Correlation here is cor(Year,Height) [1] 0.39831 which isn't much. Basically Cathedral's burned down and were replaced by a similar sized world's tallest building from 1200 until 1900.

Looking just at the Skyscraper era 1884 on. cor(Year,Height) [1] 0.9340458 which really looks like height increases by follow time. Running this as a linear regression the Kilometer tall bulding is not expected until the end of the century

linearModelVar <- lm(Height ~ Year, newdata)
linearModelVar$coefficients[[2]]*2010+linearModelVar$coefficients[[1]]
646.6246 The Burj Khalifa was much taller than any building was expected to be in 2010
linearModelVar$coefficients[[2]]*2099+linearModelVar$coefficients[[1]]
1002.799 finally a kilometer tall building in 2099
linearModelVar$coefficients[[2]]*2241+linearModelVar$coefficients[[1]]
1604.903 a Mile high tower 2241 far into the future?

Saturday, January 16, 2016

Is Netflix making us smarter?

Vox has an article that mentions the artistic benefits of on demand TV viewing

The first factor was the rise of the DVR, which has made it cheaper and easier than ever before for people to record their favorite shows and watch them at their leisure. This has been great for television artistically, since it means creators can now more readily assume that every single episode of their show will be consumed in sequence.

Stephen Johnson's book "Everything Bad Is Good For You" analyses the complexity of TV programs from the 1970s and today and shows how much more complex modern ones are. Compare Columbo with one murderer shown at the start and it takes 70 minutes for them to be found out. Whereas a more modern CSI is 43 min of multiple plots with loads of characters.

The Vox piece points out that episodic series like CSI with few series long story arcs now seem outdated. Viewers are expected to keep information about longer plots now. Meaning there are more details about the characters and their relationships viewers need to track. Series you can play back at any time may be cognitively as well as artistically beneficial.

Tuesday, December 01, 2015

Tiny Bits of Land People Fight Over #1 Rockall

People will fight over any bit of land. "Rockall is about 25 metres (80 ft) wide and 31 metres (100 ft) long at its base[24] and rises sheer to a height of 17.15 m (56.27 ft)" from wikipedia.

A probably fake photo from 1974 of HMS Tartar's trip there. 'A sentry-box was constructed on Hall's Ledge, with two marines in full ceremonial uniform posted alongside, and the Union Flag was hoisted above.'

Every now and again Britain lands some people on this lump and takes a photo to prove it is theres. 'Former SAS member and survival expert Tom McClean lived on the island from 26 May 1985 to 4 July 1985 to affirm the UK's claim to the island'. Waves roll over the island so he had to hide in a bolted down giant coffin for the duration.

They do this partly because owning the Falklands isn't grim enough for them. And partly for all the oil and gas and such that might be between Rockall and Ireland.

Friday, November 20, 2015

Bombing Back to the Stone Age

There is a common meme that is easy to find with a twitter search

If you read Jared Diamond or Stephen Pinker they talk about the really high levels of violence in the stone age.
Or to describe it with statistics

"By many estimates, 10 to 20 percent of all Stone Age humans died at the hands of other people.
This puts the past 100 years in perspective. Since 1914, we have endured world wars, genocides and government-sponsored famines, not to mention civil strife, riots and murders. Altogether, we have killed a staggering 100 million to 200 million of our own kind. But over the century, about 10 billion lives were lived — which means that just 1 to 2 percent of the world’s population died violently. Those lucky enough to be born in the 20th century were on average 10 times less likely to come to a grisly end than those born in the Stone Age. And since 2000, the United Nations tells us, the risk of violent death has fallen even further, to 0.7 percent."

To reduce violence don't send people back to the stone age.

Tuesday, November 17, 2015

How Good Will the Upcoming Stephen King Adaptations be?

This is the third and final post in a series on Stephen King adaptations. (The first is Are Stephen King films better than the books?, and the second Do Stephen King's Better Books Make Better Films?)
According to IMDB the Stephen King Novels below are currently being adapted

Title	good doubled
The Talisman	8.2
Rose Madder	7.22
Lisey's Story	7.22
Mr. Mercedes	7.74
Gerald's Game	6.86
Cell	7.22
11.22.63	8.52
The Dark Tower	8

Three of these books have exactly the same rating on Goodreads. The correlation between the ratings and his books and his films previously discussed I will use to predict future movies ratings. The analysis gives a 95% confidence prediction intervals of

Title	estimate	lower bound	upper bound
The Talisman	7.5	6.4	8.6
Rose Madder	5.95	4.8	7.0
Lisey's Story	5.95	4.8	7.0
Mr. Mercedes	6.79	5.7	8.8
Gerald's Game	5.37	4.2	6.4
Cell	5.95	4.8	7.0
11.22.63	8.05	6.9	9.15
The Dark Tower	7.2	6.1	8.2

The upcoming films with their predicted IMDB ratings are in red in the graph below. Three are predicted to have a rating of 5.95.

There is an interesting 538 podcast here about a company that predicts film earnings. He mentions the correlation between film quality and earnings and film quality. This whole topic is an interesting challenge for prediction.

Stephen King is unique among authors in the number and variety of adaptations his works have gone through. So he is possibly the only author this could be even tried with. I am really looking forward to 11.22.63 and the Talisman now. And if Gerald's Game beats the predicted IMDB score that is a bonus.

Code


mydata = read.csv("King.csv")  
library(ggplot2)
attach(mydata)     # attach the data frame 
king.lm = lm(imdb ~ good.doubled)

Call:
lm(formula = imdb ~ good.doubled)

Coefficients:
 (Intercept)  good.doubled  
      -5.821         1.628  

upcoming = read.csv("upcoming.csv")  
predict(king.lm, upcoming, interval="predict")

> predict(king.lm, upcoming, interval="predict")

Title,Publication date,Pages,imdb,goodreads,good doubled,clr
Carrie,05/04/1974,199,7.4,3.89,7.78,1
Salem's Lot,17/10/1975,439,6.8,3.97,7.94,1
The Shining,28/01/1977,447,8.4,4.12,8.24,1
The Stand,Sep-78,823,7.3,4.32,8.64,1
The Dead Zone,Aug-79,428,7.3,3.88,7.76,1
Firestarter,29/09/1980,426,6,3.8,7.6,1
Cujo,08/09/1981,319,6,3.61,7.22,1
The Running Man,May-82,219,6.6,3.74,7.48,1
Christine,29/04/1983,526,6.6,3.69,7.38,1
Pet Sematary,14/11/1983,374,6.6,3.86,7.72,1
Thinner,19/11/1984,309,5.7,3.6,7.2,1
It,15/09/1986,1138,6.9,4.12,8.24,1
Misery,08/06/1987,310,7.8,4.06,8.12,1
The Tommyknockers,10/11/1987,558,5.4,3.42,6.84,1
The Dark Half,20/10/1989,431,5.9,3.71,7.42,1
Needful Things,Oct-91,690,6.2,3.84,7.68,1
Dolores Claiborne,Nov-92,305,7.4,3.76,7.52,1
The Green Mile,March–August 1996,400,8.5,4.39,8.78,1
Bag of Bones,22/09/1998,529,5.8,3.84,7.68,1
Dreamcatcher,20/03/2001,620,5.5,3.53,7.06,1
Under the Dome,10/11/2009,1074,6.8,3.89,7.78,1
Shawshank Redemption,10/11/2009,181,9.3,4.51,9.02,1
Stand by me,10/11/2009,80,8.1,4.25,8.5,1
The Mist,10/11/2009,230,7.2,3.88,7.76,1
The Langoliers,10/11/2009,230,6.1,3.71,7.42,1
Apt Pupil,1983,179,6.7,3.8,7.7,1
Hearts in Atlantis,2000,640,6.9,3.77,7.54,1
The Talisman,na,na,7.5,4.1, 8.2,2
Rose Madder,na,na,5.95,3.61, 7.22,2
Lisey's Story,na,na,5.95,3.61, 7.22,2
Mr. Mercedes,na,na,6.79,3.87, 7.74,2
Gerald's Game,na,na,5.37,3.43, 6.86,2
Cell,na,na,5.95,3.61, 7.22,2
11.22.63,na,na,8.05,4.26, 8.52,2
The Dark Tower,na,na,7.2,4,8,2


mydata = read.csv("King.csv")
attach(mydata)
p1 <- ggplot(mydata, aes(x=good.doubled, y=imdb)) +
    geom_point(colour = factor(clr),shape=1,size=2) +    # Use hollow circles
    geom_smooth(method=lm,se=FALSE)           
            
p1 <- p1 + ylab("IMDB Ratings")  
p1 <- p1 + xlab("GoodReads Ratings")  
p1 <- p1 + ggtitle("Upcoming Stephen King Adaptations")
p1 <- p1 + annotate("text", x = 6.97, y = 5.26, label = "The Tommyknockers", size=3, colour="blue3")  
p1 <- p1 + annotate("text", x = 7.85, y = 7.42, label = "Carrie", size=3, colour="blue3") 
#Salem's Lot,17/10/1975,439,6.8,3.97,7.94,1
p1 <- p1 + annotate("text", x =8.0 , y =6.9, label = "Salem's Lot", size=3, colour="blue3") 
p1 <- p1 + annotate("text", x = 7.8, y = 6.5, label = "Pet Sematary", size=3, colour="blue3") 
p1 <- p1 + annotate("text", x = 8.34, y = 8.51, label = "The Shining", size=3, colour="blue3") 
p1 <- p1 + annotate("text", x = 8.62, y = 8.2, label = "Stand By Me", size=3, colour="blue")
p1 <- p1 + annotate("text", x = 8.9, y = 8.4, label = "The Green Mile", size=3, colour="blue3") 
p1 <- p1 + annotate("text", x = 9.0, y = 9.09, label = "Shawshank\nRedemption" , size=3, colour="blue3") 
p1 <- p1 + annotate("text", x = 8.75, y = 7.3, label = "The Stand", size=3, colour="blue3") 
p1 <- p1 + annotate("text", x = 8.27, y = 6.85 , label = "It", size=3, colour="blue3") 
p1 <- p1 + annotate("text", x = 8.19, y = 7.74, label = "Misery", size=3, colour="blue3") 
p1 <- p1 + annotate("text", x = 8.05, y = 6.7, label = "Under the Dome", size=3, colour="blue3") 
#Under the Dome,10/11/2009,1074,6.8,3.89,7.78,1
p1 <- p1 + annotate("segment", x = 7.9, xend = 7.79, y = 6.7, yend = 6.8, colour = "blue3") 
#Dolores Claiborne,Nov-92,305,7.4,3.76,7.52,1
p1 <- p1 + annotate("text", x = 7.5, y = 7.3, label = "Dolores Claiborne", size=3, colour="blue3") 
#The Dark Half,20/10/1989,431,5.9,3.71,7.42,1
p1 <- p1 + annotate("text", x = 7.5, y = 5.81, label = "The Dark Half", size=3, colour="blue3")
#Bag of Bones,22/09/1998,529,5.8,3.84,7.68,1
p1 <- p1 + annotate("text", x = 7.78, y = 5.7, label = "Bag of Bones", size=3, colour="blue3")
#The Dead Zone,Aug-79,428,7.3,3.88,7.76,1
p1 <- p1 + annotate("text", x = 7.5, y = 7.72, label = "The Dead Zone", size=3, colour="blue3")
p1 <- p1 + annotate("segment", x = 7.76, xend = 7.5, y = 7.33, yend = 7.66, colour = "blue3") 
#The Mist,10/11/2009,230,7.2,3.88,7.76,1
p1 <- p1 + annotate("text", x = 7.76, y = 7.1, label = "The Mist", size=3, colour="blue3")
#Firestarter,29/09/1980,426,6,3.8,7.6,1
p1 <- p1 + annotate("text", x = 7.71, y = 6, label = "Firestarter", size=3, colour="blue3")
#The Langoliers,10/11/2009,230,6.1,3.71,7.42,1
p1 <- p1 + annotate("text", x = 7.56, y = 6.11, label = "The Langoliers", size=3, colour="blue3")
#Cujo,08/09/1981,319,6,3.61,7.22,1
p1 <- p1 + annotate("text", x = 7.23, y = 6.1, label = "Cujo", size=3, colour="blue3")
#The Running Man,May-82,219,6.6,3.74,7.48,1
p1 <- p1 + annotate("text", x = 7.48, y = 6.7, label = "The Running Man", size=3, colour="blue3")
#Christine,29/04/1983,526,6.6,3.69,7.38,1
p1 <- p1 + annotate("text", x = 7.38, y = 6.5, label = "Christine", size=3, colour="blue3")
#Thinner,19/11/1984,309,5.7,3.6,7.2,1
p1 <- p1 + annotate("text", x = 7.25, y = 5.6, label = "Thinner", size=3, colour="blue3")
#Needful Things,Oct-91,690,6.2,3.84,7.68,1
p1 <- p1 + annotate("text", x = 7.83, y = 6.2, label = "Needful Things", size=3, colour="blue3")
#Dreamcatcher,20/03/2001,620,5.5,3.53,7.06,1
p1 <- p1 + annotate("text", x = 7.2, y = 5.5, label = "Dreamcatcher", size=3, colour="blue3")
#The Talisman,na,na,7.52,4.1, 8.2,2
p1 <- p1 + annotate("text", x = 8.36, y = 7.52, label = "The Talisman", size=3, colour="red3")
#Rose Madder,na,na,5.93,3.61, 7.22,2
p1 <- p1 + annotate("text", x = 7.07, y = 5.93, label = "Rose Madder", size=3, colour="red3")
#Lisey's Story,na,na,5.93,3.61, 7.22,2
p1 <- p1 + annotate("text", x = 7.07, y = 6.05, label = "Lisey's Story", size=3, colour="red3")
#Mr. Mercedes,na,na,6.78,3.87, 7.74,2
p1 <- p1 + annotate("text", x = 7.60, y = 6.81, label = "Mr. Mercedes", size=3, colour="red3")
#Gerald's Game,na,na,5.34,3.43, 6.86,2
p1 <- p1 + annotate("text", x = 7.03, y = 5.36, label = "Gerald's Game", size=3, colour="red3")
#Cell,na,na,5.93,3.61, 7.22,2
p1 <- p1 + annotate("text", x = 7.28, y = 5.92, label = "Cell", size=3, colour="red3")
#11.22.63,na,na,8.05,4.26, 8.52,2
p1 <- p1 + annotate("text", x = 8.63, y = 8.05, label = "11.22.63", size=3, colour="red3")
#The Dark Tower,na,na,7.2,4,8,2
p1 <- p1 + annotate("text", x = 8.16, y = 7.2, label = "The Dark Tower", size=3, colour="red3")
p1 <- p1 + ylab("IMDB Ratings")  
p1 <- p1 + xlab("GoodReads Ratings")
p1
ggsave("plot.png", width=10, height=10, dpi=100)
detach(mydata)

Friday, November 13, 2015

Do Stephen King's Better Books Make Better Films?

My last post (and data) got a bit popular on reddit. Some people noticed that Stephen King films ratings and the books ratings seemed highly correlated.

Are Stephen King films better than the books? – https://t.co/HEWQJyPGcc (definitely a positive correlation) pic.twitter.com/GbS0Xq9s4w
— Max Roser (@MaxCRoser) November 11, 2015

I think Max is right on this. So I made a graph showing how movie and book ratings are correlated

Now the actual correlation figure is


 cor(good.doubled, imdb, use="complete")

[1] [1] 0.8766 so it looks like highly rated King books make highly rated films

It could be that a good film makes people read and rate highly a book. But my basic conclusion is Stephen King's highly rated books make higher rated films

Appendix: Code for the Graph


mydata = read.csv("King.csv")
attach(mydata)
cor(good.doubled, imdb, use="complete")
p1 <- ggplot(mydata, aes(x=good.doubled, y=imdb)) +
    geom_point(shape=1) +    # Use hollow circles
    geom_smooth(method=lm,   # Add linear regression line
                se=FALSE)    # Don't add shaded confidence region
p1 <- p1 + ylab("IMDB Ratings")  
p1 <- p1 + xlab("GoodReads Ratings *2")  
p1 <- p1 + ggtitle("Stephen King: Books vs. Movies Correlation")
p1 <- p1 + geom_text(aes(label=ifelse(good.doubled>0,as.character(Title),'')),hjust=0,just=0,size=2, position = "jitter")
p1
ggsave("correlate.png")

Other Correlations I changed the publication data column into a column of Years since 1974.


 cor(good.doubled, Years, use="complete")
[1] -0.1879052 Not a strong correlation about Kings adapted books get better or worse since he started
> cor(good.doubled, Pages, use="complete")
[1] -0.02715659 no relationship between the length of a book and it being rated highly or lowly.
> cor(Years, Pages, use="complete")
[1] 0.482048 King's adapted books may be getting a bit longer over time

Wednesday, November 11, 2015

Are Stephen King films better than the books?

Rejoice, Snobs: The Book IS Better Than The Movie
The literary originals have higher ratings than the film adaptations 74 percent of the time. This recent piece claims books are usually better than their films versions.

This article reminded me of a claim that Stephen King films are better than his books. So I extracted the ratings from IMDB and Goodreads. I doubled the Goodreads ratings to make them out of 10.

I included miniseries. Carrie was made twice but I only included the original. The Langoliers, Maximum Overdrive, Lawnmower man and Secret garden seem not to have independent English language printed book versions.

The Shining is one of two films with a higher rating than the book. King famously hates the film

"Are you mystified by the cult that's grown around Kubrick's Shining?
I don't get it. But there are a lot of things that I don't get. But obviously people absolutely love it, and they don't understand why I don't. The book is hot, and the movie is cold; the book ends in fire, and the movie in ice."

To make this a fair test films and book would have to be graded on a curve. By what the average rating of each is. There is a good discussion on how to normalise the original comparison here. But I think this graph is enough to show Stephen King books are better than his films.

Less than 8% of King adaptations are rated higher compared to 26% usually for book adaptations. To get King to have the average quality of adaptations, sixish Stephen King Films being rated better than is books. This would require Dolores Claiborne, The Green Mile, Stand By Me, Misery and maybe Carrie to be much worse books or these already highly rated films to be much better. I am going to call this Myth Busted*, Stephen King adaptations are less successful than the average adaptation.

Title	Publication date	Pages	imdb	goodreads	good doubled
Carrie	April 5, 1974	199	7.4	3.89	7.78
'Salem's Lot	October 17, 1975	439	6.8	3.97	7.94
The Shining	January 28, 1977	447	8.4	4.12	8.24
The Stand	Sep-78	823	7.3	4.32	8.64
The Dead Zone	Aug-79	428	7.3	3.88	7.76
Firestarter	September 29, 1980	426	6	3.8	7.6
Cujo	September 8, 1981	319	6	3.61	7.22
The Running Man	May-82	219	6.6	3.74	7.48
Christine	April 29, 1983	526	6.6	3.69	7.38
Pet Sematary	November 14, 1983	374	6.6	3.86	7.72
Thinner	November 19, 1984	309	5.7	3.6	7.2
It	September 15, 1986	1138	6.9	4.12	8.24
Misery	June 8, 1987	310	7.8	4.06	8.12
The Tommyknockers	November 10, 1987	558	5.4	3.42	6.84
The Dark Half	October 20, 1989	431	5.9	3.71	7.42
Needful Things	Oct-91	690	6.2	3.84	7.68
Dolores Claiborne	Nov-92	305	7.4	3.76	7.52
The Green Mile	March–August 1996	400	8.5	4.39	8.78
Bag of Bones	September 22, 1998	529	5.8	3.84	7.68
Dreamcatcher	March 20, 2001	620	5.5	3.53	7.06
Under the Dome	November 10, 2009	1074	6.8	3.89	7.78
Shawshank Redemption	1982	181	9.3	4.51	9.02
Stand by me	1982	80	8.1	4.25	8.5
The Mist	1983	230	7.2	3.88	7.76
Apt Pupil	1983	179	6.7	3.8	7.7
Hearts in Atlantis	2000	640	6.9	3.77	7.54

   
 mydata = read.csv("King.csv")  
 library(ggplot2)  
 p1 <- ggplot(mydata, aes(x = good.doubled, y = imdb))  
 p1 <- p1 + geom_abline(intercept=0, slope=1)  
 p1 <- p1 + geom_point(shape=1)  
 p1 <- p1 + ylim(5, 10)  
 p1 <- p1 + xlim(5, 10)  
 
 p1 <- p1 + ylab("IMDB Ratings")  
 p1 <- p1 + xlab("GoodReads Ratings *2")  
 p1 <- p1 + ggtitle("Stephen King: Books vs. Movies")  

p1 <- p1 + annotate("text", x = 7.3, y = 5.3, label = "The Tommyknockers", size=3, colour="blue3")  
p1 <- p1 + annotate("text", x = 7.95, y = 7.37, label = "Carrie", size=3, colour="blue3") 
p1 <- p1 + annotate("text", x = 8, y = 6.5, label = "Pet Sematary", size=3, colour="blue3") 
p1 <- p1 + annotate("text", x = 7.9, y = 8.54, label = "The Shining", size=3, colour="red3") 
p1 <- p1 + annotate("text", x = 8.75, y = 9.4, label = "Shawshank Redemption", size=3, colour="red3")
p1 <- p1 + annotate("text", x = 9.18, y = 8.4, label = "The Green Mile", size=3, colour="blue3") 
p1 <- p1 + annotate("text", x = 8.72, y = 8, label = "Stand By Me", size=3, colour="blue3") 
p1 <- p1 + annotate("text", x = 8.92, y = 7.3, label = "The Stand", size=3, colour="blue3") 
p1 <- p1 + annotate("text", x = 8.3, y = 6.85 , label = "It", size=3, colour="blue3") 
p1 <- p1 + annotate("text", x = 8.32, y = 7.7, label = "Misery", size=3, colour="blue3") 
p1 <- p1 + annotate("text", x = 8.18, y = 6.7, label = "Under the Dome", size=3, colour="blue3") 

p1 <- p1 + annotate("text", x = 9.28, y = 5.0, label = "Books Higher Rated", size=5, colour="blue3") 
p1 <- p1 + annotate("text", x = 5.7, y = 9.90, label = "Films Higher Rated", size=5, colour="red3") 



p1 <- p1 + annotate("text", x = 6.9, y = 6, label = "Cujo", size=3, colour="blue3") 
p1 <- p1 + annotate("segment", x = 7.05, xend = 7.2, y = 6.0, yend = 6.0, colour = "blue3") 


p1 <- p1 + annotate("text", x = 6.45, y = 5.52, label = "Dreamcatcher", size=3, colour="blue3") 
p1 <- p1 + annotate("segment", x = 6.8, xend = 7.05, y = 5.5, yend = 5.5, colour = "blue3") 


p1 <- p1 + annotate("text", x = 6.80, y = 7.4, label = "Dolores Claiborne", size=3, colour="blue3")
p1 <- p1 + annotate("segment", x = 7.28, xend = 7.5, y = 7.4, yend = 7.4, colour = "blue3") 

p1 <- p1 + annotate("text", x = 7.92, y = 5.72, label = "The Dark Half", size=3, colour="blue3")
p1
ggsave("plot.png")

*This needs proper statistical significance testing so not really busted.

Tuesday, November 03, 2015

Odd Mathematical Patents #1: Gömböc

The Gömböc is a self righting shape. A bit like a turtle that will always rotate back to standing up if placed on its back. This shape was patented under US D614077 S1 and a cease and destist has been send to someone giving plans to 3D print one.

Prof. Domokos has politely requested that I remove this file. I have respectfully declined to do so, as I do not believe it violates any of their established legal rights, and I believe it may have actual value for researchers interested in the Gömböc and mono-monostatic bodies in general.

Gömböc's are cool. And the inventors were very ingenious. You can buy one from the patent holders here. Patenting a shape though strikes me as weird.

Monday, October 12, 2015

Stacked Area Chart In Javascript

A stacked area chart of transatlantic slave trace made with c3.js and the transatlantic slave trade database

The graph shows the counts for each region the slaves were disembarked by year. The html and javascript for this graph is on a gist here

The data used can be gotten from the slavevoyages.org website with the filters beloew

An actual graph you can play with is here and in the result tab below

Saturday, October 10, 2015

Danny Boyle films

Danny Boyle films revolve around a group of friends who find X. And how X affects their relationship.

Where for each film X is

Shallow Grave	Bag of Money
Trainspotting	Bag of heroin
The Beach	A map
Millions	Bag of money
Sunshine	A spaceship
28 Days Later...	An outpost
Slumdog Millionaire	A gameshow

I can't haven't seen so do not know what X is in A Life Less Ordinary, 127 Hours, Steve Jobs or Trance

Friday, October 09, 2015

The Man in the White Suit

The man in the white suit is about a scientist who invents clothes that never wear out and never need washing. It is from 1951 and deals Alec Guinness finds himself hated by all. Both the capital owning class who realise they will have no clothes to sell anymore. And the poor labourers who won't have work making cloth, clothes or washing them.

All these 1950s cloth making jobs have left Britain in the decades since. Hans Rosling pointed out that the washing machine in common use since then is one of the best invention ever for freeing us from drudgery.

Worry about how computers will get rid of all the jobs are common now. Race against the machine is one good book on the topic. But the washer women in The Man in the White Suit are not in Britain anymore. Though much of the clothes making has just moved country without making the people doing it much richer.

We now in Europe live the terrifying world imagined in the Man in the White Suit. Where you can clothe yourself in pennies for very little. Wash your clothes for little effort of money. And it is not that bad. Technological progress is not bad. As Stephen Hawking points out it is not the technology but who the benefits go to

If machines produce everything we need, the outcome will depend on how things are distributed. Everyone can enjoy a life of luxurious leisure if the machine-produced wealth is shared, or most people can end up miserably poor if the machine-owners successfully lobby against wealth redistribution. So far, the trend seems to be toward the second option, with technology driving ever-increasing inequality.

Tuesday, October 06, 2015

Golf Courses Versus Book Shops

"Between 1990 and 2010, the number of 18-hole golf courses in Ireland exploded from 213 to 352."

Searching Golden Pages gives

300 results for Booksellers

397 results for Golf Course

It looks to me that there are more golf courses than book shops in Ireland.

Friday, October 02, 2015

Pouring one out for your Ancient Greek Homies

Hecuba mother of Hector and Paris in the Iliad pours one out for her homies

Stay, till I bring the cup with Bacchus crown'd,
In Jove's high name, to sprinkle on the ground,
And pay due vows to all the gods around

Pouring some of your alcoholic beverage on the ground is an ancient and common across many cultures. It is a practice called Libation.

The other word for today is Adoxography "fine writing on a trivial or base subject". The best adoxography books ive read recently are Red: The history of red hair, Paper an elegy and The Phone Book.

Thursday, September 24, 2015

Penrose's Law

Penrose's Law[3][4] states that the population size of prisons and psychiatric hospitals are inversely related, although this is generally viewed as something of an oversimplification

That was thought up by from Lionel Penrose in 1939 (he was Roger Penrose's father). And it still seems broadly accurate

During 1960–2004, there was a 74% population-adjusted decrease in mental institution beds and a 52% increase in the prison population. The same period saw a 500% increase in overall crime and a 900% increase in violent crimes, with a concurrent 94% increase in the size of the country's police force. Penrose's law proved remarkably robust in the longitudinal perspective

Sunday, September 20, 2015

Irish Suicide Statistics

There is an interesting article in 20th Sept issue of Irish Daily Mail on Sunday by Alison O'Reilly on the emotional difficulties priests in Ireland have.

The figures given claim priests are at a much greater risk of suicide than the general public

However if looking at who priests are I do not think this is the case. According to the CSO here

"The age-standardised death rate from suicide was 12.1 deaths per 100,000 in 2011" not the 5.12 figure in the article.

Also Males are at a higher risk anyway. "male suicide rates were five times higher at 20.5 deaths per 100,000"

And Older males at a higher risk still. "male suicide rates were highest in the 45-64 age-group (28 per 100,000)". The average age of an Irish priest is around 65.

Just by their gender and age profile priests 35ish to the general population 28 per 100,000 do not seem that different.

Finally priests are unmarried which is a well known correlate of suicide. To take one paper Suicide and marital status in Northern Ireland "Never marrying increased male suicide risk and its effect increased with age IRR among over 55 year-olds = 2.33". 2.33 * 20.5 base =47 per 100k

I had not realised quite how bad it was for those bachelor farmers in terms of suicide. This (greater than 28) rate compares with 1.2 for murder in Ireland.

The story here is not how bad suicide is for priests but how bad the problem is for all single older men in Ireland.

Sunday, September 13, 2015

Japan Loves Faxes

In Japan, faxes are still used extensively for cultural and graphemic reasons and are available for sending to both domestic and international recipients from over 81% of all convenience stores nationwide.

Coupland explains that he thinks the Japanese addressing system is the cause of its love of faxes.

Graphemic reasons are that typing was never really popular in Japan. Partly because of the three alphabets and partly just because of an appreciation of calligraphy.

Another reason is the use of forms that do not really fit electronic formats

But I love the explanation that the Japanese addressing system has lead to an 80s technology getting popular and it staying alive

Monday, June 15, 2015

When did we get too much stuff?

Kevin Kelly pointed out that we have more types of things (species of technology) than even the most wealthy had in the past. "count the number of species of technology in our household. And it came up with 6,000 different species of products. I did some research and found out that the King of England, Henry VIII, had only about 7,000 items in his household. And he was the King of England, and that was the entire wealth of England at the time." Kevin Kelly

How much more stuff can we get over time? The Argos catalog seems to get bigger each year and it seemed to me an ideal way to measure the amount of stuff we get to decide we don't want.

Retromash is a great site for old Argos catalogs. I Counted the total page numbers in each years catalog. I don't have the ~~slaves~~ undergraduates needed to count everything in the actual catalog. Retromash don't have much after 1990 so I went to ebay and they list the number of pages. This produces this data (with links to the ebay sources). One weirdness is there seemed to have been an instore catalog and superstore (to be delivered) catalog for a few years.

I have tried to give the number of pages of the Autumn/Winter Argos catalog you get in the shop in any given year.

The graph is just a simple ggplot2 scatterplot

mydata = read.table("argos.tsv", header=TRUE)
ggplot(mydata, aes(x=Year, y=pages)) + geom_point(shape=1) + geom_smooth(method=lm, se=FALSE)
ggsave(file="argos.png")

It looks like there are nine times the number of things we can choose now from Argos as there was in 1975. To put a prediction on this, the amount of things you can buy nearby doubles ever decade.