Wednesday, March 12, 2014

Let's All Move to No Insurance Land

Ryanair have finally set up their own country. To decline insurance you have to pick 'don't insure me' from the country of residence drop-down. Because the obvious UX for choosing insurance is to pick a country, this option makes complete sense.

If you are going to have a new country called 'don't insure me', it clearly does not belong in some sort of non-country section of the drop-down; it sits just after Denmark.

I've talked before about how Ryanair would disappear if people did what they asked. But I still like them; I just like pointing out when a company acts oddly.

Hiding extra charges from people is one thing. Setting up your own country to get an extra few quid out of them really shows commitment.

Sunday, January 05, 2014

Goodreads Recruitment Hack

I entered some of the node.js books I have been reading into my Goodreads list, and the site mentioned that they were recruiting.
The programmer who told recruiters about GitHub should meet the same fate as the police officer at the end of The Wicker Man. But at the risk of the same thing happening to me, this is a really clever idea. If you are looking for people who know about an area, checking whether they have read the books is one way to find them. Only Goodreads can advertise on their site, but anyone can look up book reviews. The other surprising thing is that with 800k followers, Goodreads' Twitter mentions must be like looking at the digital rain from The Matrix. But they noticed that I mentioned their recruitment idea and replied.

Thursday, November 14, 2013

Wheat Map of the US

I thought it would be cool to make a map of the US counties by how much wheat they grew. I took the code from this article and from the Visualize This book by Nathan Yau.

I got some wheat data from here, the US Department of Agriculture. The map of the US comes from here.

Then I cleaned up the data by keeping only the columns for state, county and total wheat production. The dataset includes county codes 888 and 999, but those seem to be combinations of all of a state's counties, so I stripped them out. There are also more than 50 states in these county datasets, which seems to be standard. With these sorts of manipulations numbers are always being read as strings, so some casting is needed.
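A minimal sketch of that clean-up in Python. The column names (`state`, `county_code`, `county`, `wheat`) and the sample rows are made up for illustration; the real USDA file layout may well differ.

```python
import csv
import io

# Made-up sample in an assumed layout; the real USDA column names may differ.
SAMPLE = """state,county_code,county,wheat
KS,001,Allen,"1,250,000"
KS,888,Combined counties,"9,000,000"
KS,999,State total,"10,250,000"
NE,003,Antelope,430000
"""

AGGREGATE_CODES = {"888", "999"}  # combined rows, not real counties


def clean_rows(text):
    """Keep state, county and wheat; drop aggregate rows; cast wheat to int."""
    cleaned = []
    for row in csv.DictReader(io.StringIO(text)):
        if row["county_code"] in AGGREGATE_CODES:
            continue
        # Numbers arrive as strings, sometimes with thousands separators.
        wheat = int(row["wheat"].replace(",", ""))
        cleaned.append((row["state"], row["county"], wheat))
    return cleaned


rows = clean_rows(SAMPLE)
for state, county, wheat in rows:
    print(state, county, wheat)
```

Comparing the county code as a string avoids surprises with leading zeros, and the cast happens only after the aggregate rows are gone.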

The SVG is 1.9 MB and Google Drive does not want to store or convert it at the moment, but if anyone wants it I can send it to them. At this quality, zooming in on an individual state, like Kansas, is fine.

The code to create this picture is here.

JDLong on Twitter pointed out where to get data for countries. I got the grains data from here, and a look at 'head psd_grains_pulses.csv' shows the file layout.

I think I want the Country_code and the value for the commodity wheat in every country for the most recent year. The country code is 2 characters (ISO 3166-1 alpha-2), and the map I have from Wikipedia uses that format; you can get it here.
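As a rough sketch of that selection in Python: the column names `Commodity`, `Year` and `Value` and the sample rows below are assumptions made for illustration, not the real header of psd_grains_pulses.csv.

```python
import csv
import io

# Made-up rows in an assumed layout; only Country_code is named in the post,
# the other column names are guesses.
SAMPLE = """Country_code,Commodity,Year,Value
IE,Wheat,2011,900
IE,Wheat,2012,700
IE,Barley,2012,1200
FR,Wheat,2012,38000
"""


def latest_wheat(text):
    """Return {country_code: value} for wheat in each country's latest year."""
    latest = {}  # country_code -> (year, value)
    for row in csv.DictReader(io.StringIO(text)):
        if row["Commodity"] != "Wheat":
            continue
        year, value = int(row["Year"]), float(row["Value"])
        code = row["Country_code"]
        if code not in latest or year > latest[code][0]:
            latest[code] = (year, value)
    return {code: value for code, (year, value) in latest.items()}


wheat_by_country = latest_wheat(SAMPLE)
print(wheat_by_country)
```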

The code to produce colors for each country based on this data is here. Again this is based on the "Visualize This" book by Yau. The CSS that sets the color of each country gets pasted into the style section of the BlankMap-World6.svg file. I should read all the documentation describing the values before doing any analysis like this, but I am only doing this to make pretty pictures in Python, so I am making assumptions to work quickly.
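A minimal sketch of how such CSS could be generated, not the code linked above. It assumes the SVG exposes each country as a lowercase two-letter class (for example `.fr`), which should be checked against the actual file. The palette and the sample values are illustrative.

```python
# Illustrative light-to-dark blue palette; any ordered list of colours works.
PALETTE = ["#eff3ff", "#bdd7e7", "#6baed6", "#3182bd", "#08519c"]


def css_rules(values, palette=PALETTE):
    """Map each country's value to a palette bucket and emit one CSS rule.

    Assumes at least one value is greater than zero.
    """
    top = max(values.values())
    rules = []
    for code, value in sorted(values.items()):
        # Scale into 0..len(palette)-1; the largest value gets the darkest colour.
        index = min(int(value / top * len(palette)), len(palette) - 1)
        # Assumed selector: lowercase ISO 3166-1 alpha-2 code used as a class.
        rules.append(".%s { fill: %s; }" % (code.lower(), palette[index]))
    return "\n".join(rules)


# Made-up production figures, only to show the shape of the output.
css = css_rules({"IE": 700.0, "FR": 38000.0, "US": 19000.0})
print(css)
```

Linear buckets are the simplest choice; with a few very large producers a log scale or quantile buckets would probably spread the colours more evenly.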

extra: I made a stacked area graph of what crops have been grown when here with the code here.

Sunday, December 30, 2012

World Cup 2010 Heatmap

I am reading Visualize This by Yau at the moment. It is full of really pretty visualization ideas and examples. One of them is a heatmap of NBA players. To practice this visualization I have made one of World Cup 2010 players. I got the dataset from the Guardian Data blog, 'World Cup 2010 statistics: every match and every player in data'. The data only quantifies 5 qualities, but that is good enough to practice making heatmaps.
The R code I used is below
library(RColorBrewer)
#save the Guardian data to World.csv and load it
players2<-read.csv('World.csv', sep=',', header=TRUE)
players2[1:3,]
#players with the same name (like Torres) meant I had to merge surnames and countries
players2$Name <-paste(players2[,1], players2[,2])
rownames(players2) <- players2$Name
###I removed one player by hand
###I now do not need these columns
players2$Position <- NULL
players2$Player.Surname <- NULL
players2$Team <- NULL
players3 <-players2[order(players2$Total.Passes, decreasing=TRUE),]
### or to order by time played
###players3 <-players2[order(players2$Time.Played, decreasing=TRUE),]
players3 <- players3[,1:5]
players4<-players3[1:50,]
players_matrix <-data.matrix(players4)
###change names of columns to make graph readable
colnames(players_matrix )[1] <- "played"
colnames(players_matrix )[2] <- "shots"
colnames(players_matrix )[3] <- "passes"
colnames(players_matrix )[4] <- "tackles"
colnames(players_matrix )[5] <- "saves"
players_heatmap <- heatmap(players_matrix, Rowv=NA, Colv=NA, col = brewer.pal(9, 'Blues'), scale='column', margins=c(5,10), main="World Cup 2010")
dev.print(file="SoccerPassed.jpeg", device=jpeg, width=600)       
#players_heatmap <- heatmap(players_matrix, Rowv=NA, Colv=NA, col = brewer.pal(9, 'Greens'), scale='column', margins=c(5,10), main="World Cup 2010")
#dev.print(file="SoccerPlayed.jpeg", device=jpeg, width=600) 
dev.off()
Nothing very fancy here. Just showing that with a good data source and some online tutorials it is easy enough to knock up a picture in a fairly short time.

Monday, December 24, 2012

The Price Of Guinness

When money's tight and hard to get 
And your horse has also ran, 
When all you have is a heap of debt - 
A PINT OF PLAIN IS YOUR ONLY MAN.
Myles Na Gopaleen

How much has Guinness increased in price over time? Below is a graph of the price changes. The data is taken from a combination of the Guinness price index and CSO data

The R code for this graph is below.

pint<-read.csv('pintindex.csv', sep=',', header=TRUE)
plot(pint$Year, pint$Euros, type="s", main="Price Pint of Guinness in Euros", xlab="Year", ylab="Price Euros", bty="n", lwd=2)
dev.print(file="Guinness.jpeg", device=jpeg, width=600)       
dev.off() 
Paul in the comments asked a good question. How does this compare to earnings?
Year   Price (Euro)   Earnings/Price (pints per week)   Earnings per Week (Euro)
2008   4.22           167.31                            706.03
2009   4.34           161.69                            701.73
2010   4.20           165.02                            693.08
2011   4.15           165.81                            688.11
2012   4.23           163.56                            691.87
Here the earnings are average weekly earnings, which is the modern and slightly different equivalent of the average industrial wage that the Pint Index used. It shows that even though the price of Guinness dropped after 2009, the number of pints a week's wages could buy decreased between 2008 and 2012. This is based on gross wages; increases in tax probably made the situation based on net wages worse.
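The Earnings/Price column is just weekly earnings divided by the price of a pint. A short Python check using the figures from the table:

```python
# year -> (price of a pint, gross average weekly earnings), both in euro,
# copied from the table in this post.
DATA = {
    2008: (4.22, 706.03),
    2009: (4.34, 701.73),
    2010: (4.20, 693.08),
    2011: (4.15, 688.11),
    2012: (4.23, 691.87),
}


def pints_per_week(price, earnings):
    """How many pints a week's gross earnings would buy."""
    return round(earnings / price, 2)


pints = {year: pints_per_week(p, e) for year, (p, e) in DATA.items()}
change = round(pints[2012] - pints[2008], 2)

for year in sorted(pints):
    print(year, pints[year])
print("change 2008 to 2012:", change)
```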

pintindex.csv is

Year,Euros
1969,0.2
1973,0.24
1976,0.48
1979,0.7
1983,1.37
1984,1.48
1985,1.52
1986,1.64
1987,1.73
1988,1.8
1989,1.87
1990,1.93
1991,2.02
1992,2.15
1993,2.24
1994,2.34
1995,2.42
1996,2.5
1997,2.52
1998,2.65
1999,2.74
2000,2.88
2001,3.01
2002,3.24
2003,3.41
2004,3.54
2005,3.63
2006,3.74
2007,4.03
2008,4.22
2009,4.34
2010,4.2
2011,4.15
2012,4.23

Wednesday, December 19, 2012

Cystic Fibrosis Improved Screening

In the first post I claimed that like Tay-Sachs in Israel Cystic Fibrosis could be drastically reduced with some relatively inexpensive genetic testing. In the second further analysis suggested that such genetic screening of the Irish population would pay for itself several times over. In this post I want to see if some form of targeted screening could be shown to be as cost effective as currently implemented screening.

Currently there is free screening for people who have relatives with CF and for their partners. I assume a second cousin counts as a relative. Based on this paper and some consanguinity calculations, I calculate that an Irish couple, one of whose second cousins has CF, has about twice the chance of having a child with CF as the general population. This means you can currently be tested for free if you have about a 1 in 700 chance of having a child with cystic fibrosis, whereas the general population has a 1 in 1444 chance. If screening can be focused so that it is twice as good as random screening, that should be enough by current standards for it to be rolled out.
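The arithmetic behind those odds can be sketched in Python. The carrier frequency of 1 in 19 is an assumption here: it is the value that reproduces the 1 in 1444 figure, since both parents must be carriers and each child of two carriers has a 1 in 4 chance of having CF.

```python
from fractions import Fraction

# Assumed carrier frequency; chosen because it reproduces the 1 in 1444 figure.
carrier = Fraction(1, 19)

# Both parents carriers, then a 1 in 4 chance per child.
general_risk = carrier * carrier * Fraction(1, 4)

# Risk for couples who currently qualify for free testing (figure from the post).
relative_risk = Fraction(1, 700)

# How much more concentrated the risk is in the currently screened group.
amplification = relative_risk / general_risk

print(general_risk)                    # 1/1444
print(round(float(amplification), 2))  # 2.06
```

So any targeting rule that roughly doubles the hit rate over random screening would match the group that is already tested for free.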

How could a non random screening be made this focused?

1. Geographic area. Some areas of the country might be more likely to have CF carriers than others. Targeting screening in these areas might make it twice as effective. The Cystic Fibrosis Registry of Ireland annual report 2010 gives numbers for Irish counties. 4 counties do not have their numbers listed but I have estimated these based on their population.

This map is based on the figures of people with CF found in the registry. This could be a biased sample or people could have moved. A better measure would be babies born with CF in each county.

The number of people with CF in each county might be useful for deciding how to allocate some treatment resources. The percentage of people who have CF is more interesting for screening though. To work this out we first need the numbers found in each county.

The number of people with CF in the registry per ten thousand people is

I can send anyone who wants them full sized versions of these maps or the R code I used to generate them. The code I used is below

library(RColorBrewer)
library(sp)
# load the GADM county boundaries for Ireland; this creates the object 'gadm'
con <- url("http://gadm.org/data/rda/IRL_adm1.RData")
load(con)
close(con)
# cases.csv and countypopths.csv are assumed to list the counties in the same order as gadm
people<-read.csv('cases.csv', sep=',', header=TRUE)
pops = cut(people$cases,breaks=c(0,2,10,20,30,40,50,70,150,300))
# attach the binned values to the map so spplot can find the "pops" column
gadm$pops <- as.factor(pops)
myPalette<-brewer.pal(9,"Purples")
spplot(gadm, "pops", col.regions=myPalette, main="Cystic Fibrosis Cases Per County",
       lwd=.4, col="black")
dev.print(file="CFIrl.jpeg", device=jpeg, width=600)
dev.off()
# population is in thousands
population<-read.csv('countypopths.csv', sep=',', header=TRUE)
pops = cut(population$population,breaks=c(0,20,40,60,70,80,100,160,400,1300))
gadm$pops <- as.factor(pops)

myPalette<-brewer.pal(9,"Greens")
spplot(gadm, "pops", col.regions=myPalette, main="Population in thousands",
       lwd=.4, col="black")
dev.print(file="PopIrl.jpeg", device=jpeg, width=600)
dev.off()

# cases per ten thousand people (population is in thousands, so divide by 10)
gadm$cfpop <- people$cases/(population$population/10)
cfpop = cut(gadm$cfpop,breaks=c(0,0.5,1,1.5,2,2.5,3,3.5))
gadm$cfpop <- as.factor(cfpop)

myPalette<-brewer.pal(7,"Blues")
spplot(gadm, "cfpop", col.regions=myPalette, main="CF/Population Irish Counties",
       lwd=.4, col="black")
dev.print(file="CFperPopIrl.jpeg", device=jpeg, width=600)
dev.off()
If this result were replicated in a more complete analysis, just picking the darker counties could give the twofold amplification needed to have a test as strong as the ones currently paid for.

2. Pick certain ethnic minorities. Some groups have higher levels of CF than the average population. For example, Travellers have higher levels of some disorders: 'disorders, including Phenylketonuria and Cystic fibrosis, that are found in virtually all Irish communities and probably are no more common among Travellers than in the general Irish population. The second are disorders, including Galactosaemia, Glutaric Acidaemia Type I, Hurler’s Syndrome, Fanconi’s Anaemia and Type II/III Osteogenesis Imperfecta, that are found at much higher frequencies in the Traveller community than the general Irish population'. 'There is no proactive screening of the Traveller population no more than there is proactive screening of the non-traveller Irish population'. I do not think deliberate screening of one ethnic group, unless that group themselves organise it, is a good idea. Singling out one ethnic group for screening risks stigmatising its members and reminds many of the horror of eugenics.

3. Certain disorders seem to cluster with CF. 'In 1936, Guido Fanconi published a paper describing a connection between celiac disease, cystic fibrosis of the pancreas, and bronchiectasis'. Ireland also has the highest rate of celiac disease in the world (about 1 in 100). If CF and celiac disease, or some other observable characteristic, are also correlated in Ireland, testing people with celiac disease in their family could also provide amplification of a test.

4. Screening parents undergoing IVF. "HARI was the first clinic in Ireland to offer IVF and it currently receives up to 800 enquiries a year specifically about the procedure. It carries out over 1,350 cycles of IVF treatment annually and over 3,500 babies have been born as a result. The Merrion Clinic carries out up to 500 cycles of IVF per year, while last year, SIMS carried out 1,063 cycles." IVF is roughly 33% effective per cycle, so about 1000 children are born through IVF from these three Irish clinics each year. Screening these parents would prevent roughly one CF case per year, so screening people who use IVF does not prevent many cases. IVF can, though, be used by people who know they are CF carriers to avoid having a child with CF.
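A quick Python check of that estimate, using the cycle counts quoted above and the 1 in 1444 general-population risk. It assumes each successful cycle results in one child, which is a simplification.

```python
# IVF cycles per year at the three clinics, as quoted above.
cycles = {"HARI": 1350, "Merrion Clinic": 500, "SIMS": 1063}

SUCCESS_RATE = 0.33         # rough per-cycle success rate from the post
CF_RISK = 1 / 1444          # general-population chance of a child with CF

# Simplifying assumption: one child per successful cycle.
births = sum(cycles.values()) * SUCCESS_RATE
expected_cf_cases = births * CF_RISK

print(round(births))                # 961
print(round(expected_cf_cases, 2))  # 0.67
```

That is a little under one case a year, which is where the "roughly one" comes from.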

Concerns about the privacy and security of a general genetic screening program of the Irish population should not be ignored. Cathal Garvey on twitter pointed out that this screening would require 'With explicit informed consent & ensuing destruction of samples, Just wary of prior shenanigans of HSE bloodspot program. i.e. it's already fashionable among governments to abuse screening programs to create 'law' enforcement databases. Without clear guarantees against that, must weigh the costs of mass DNA false incriminations vs. gains of ntnl screening prog!' I agree that any genetic screening program for Ireland would have to ensure privacy for the individual.

Screening the general population for carriers of serious genetic disorders would save money and suffering. If the level of savings is not sufficient for general screening, focusing on certain locations or on relatives of people who suffer from disorders that co-occur with CF could amplify the returns sufficiently to be as useful as current screenings.

Thursday, December 13, 2012

Gluten Levels of 73 Beers

I often hear it asked what the gluten content of various beers is, particularly in relation to celiacs who want to avoid gluten. This post is just a direct Google Translate rendering of a Swedish research paper. PDFs can be hard to search, as can Swedish documents for English-speaking users. I am just putting up this translation to aid people searching for this research on beer gluten levels. The appendix here is from the Swedish National Food Agency (NFA). It is from a regularly cited report, "Gluteninnehåll i de öl som analyserats vid Livsmedelsverket" (Gluten content in beer), SLV, 2009, which is difficult to find online. The location commonly linked to is dead.
Gluten content of the beers analyzed at the NFA
A total of 73 beers were analyzed. For 12 of these the gluten content was 50 mg of gluten per liter or higher. A further 11 beers contained between 41 and 50 mg of gluten per liter. The list is sorted alphabetically by manufacturer.
One should be aware that the consumption of beer can lead to an increased intake of gluten, even if the gluten concentration in the beer is on a par with that found in foods that are appropriate for people with gluten intolerance.
Consumption of 0.5-1 liters of beer can in some cases make a significant contribution to the daily intake of gluten, which for an adult with celiac disease should be below 50 mg of gluten per day.
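To put the concentrations in context, here is a small Python sketch that turns a volume and a mg per liter figure into a share of the 50 mg daily limit mentioned above. The two example concentrations are taken from the table below; the function names are just for illustration.

```python
DAILY_LIMIT_MG = 50  # daily gluten limit for an adult with celiac disease, per the report


def gluten_mg(litres, mg_per_litre):
    """Gluten consumed, in mg, for a given volume of beer."""
    return litres * mg_per_litre


def share_of_limit(litres, mg_per_litre):
    """Gluten consumed as a fraction of the daily limit."""
    return gluten_mg(litres, mg_per_litre) / DAILY_LIMIT_MG


# Half a litre of Guinness Draft (48 mg/l in the table)
print(gluten_mg(0.5, 48), share_of_limit(0.5, 48))      # 24.0 0.48

# One litre of Erdinger Weissbier (1188 mg/l in the table)
print(gluten_mg(1, 1188), share_of_limit(1, 1188))      # 1188 23.76
```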
The table sometimes gives the gluten level as e.p. (also written ep), meaning not detected, i.e. less than 10 mg of gluten per liter.
Columns: manufacturer, alcohol strength (%), color, name, gluten (mg/l, i.e. ppm)
AB Åbro Brewery, Sweden 3.5 light Åbro Original ep
AB Åbro Brewery, Sweden 3.5 Light 18:56 ep
AB Åbro Brewery 5.2 light Andersson Beer 47
AB Åbro Brewery 5.2 light Småland 41
Arthur Guinness Son & Co., Dublin, Ireland 3.5 dark Guinness Draft 48
Arthur Guinness Son & Co., Dublin, Ireland 5 dark Guinness Extra Stout 62
Brau Union Österreich AG 2.8 light Zipfer 23
Carlsberg, Denmark 2.8 light Carlsberg Beer 15
Carlsberg, Denmark 3.5 light Carlsberg Beer 21
Carlsberg, Denmark 3.5 Dark Carnegie Porter 20
Carlsberg, Denmark 4.1 light-Saxon gluten ep
Cerveceria Modelo, Mexico 4.6 Light Corona Extra ep
Cerveceria Cuauhtemoc Moctezuma, Mexico 4.5 light Sol ep
Erdinger Weissbräu, Germany 5.3 light Erdinger Weissbier 1188
Erdinger Weissbräu, Germany 5.6 Dark Erdinger Weissbier Dunkel 1224
Eriksberg 5.6 dark Christmas beer 33
Falcon Breweries, Sweden 2.8 light-Falcon 28
Falcon Breweries, Sweden 3.5 between Falcon Ale 22
Falcon Breweries, Sweden 3.5 light Falcon Extra brew 24
Falcon Breweries, Sweden 3.5 light Falcon Pilz 67
Falcon Breweries, Sweden 5.2 between Bavarian Falcon 55
Falken Falkenberg 3.5 Dark Beer July 49
Grolsche Bierbrowerij, Holland 3.5 light Grolsch Premium Stock 15
Harboes Brewery, Denmark 2.2 Light The Cheerful Dane 25
Harboes Brewery, Denmark 2.8 light Dansk Pilsner premium beer 42
Harboes Brewery, Denmark 3.5 light Dansk Pilsner premium beer 34
Harboes Brewery AB Denmark 3.5 lighting Christmas beer 31
Harboes Brewery AB Denmark 7.3 light Bjørne brewer 49
Hartwall PLC, Tornio, Finland 3.5 light Lapin Kulta ep
Hartwall PLC, Tornio, Finland 5.2 light Lapin Kulta Premium stock ep
Heineken Brouwerijen Holland Heineken Light 3.5 45
Hofbräu, Germany 6.3 light Hofbräu October-fest bier 26
Inbev UK Limited 3.5 dark Murphy's Irish Stout 43
Jämtland Brewery Ltd 6.5 dark Christmas beer e.p.
Kopparberg Brewery 5.3 light Fagerhult Exports III 93
Krásné Březno 4.8 dark Zlatopramen 47
Kronenbourg Strasbourg, France 5.0 light Kronenbourg 1664 97
Krönleins Brewery AB Halmstad 5.3 dark Christmas beer exports 33
Löwenbräu, Germany 6.1 light Lowenbrau October-fest bier 21
Mariestad Brewery Ltd [Spendrups] 2.8 light Mariestads 40
Mariestad Brewery Ltd [Spendrups] 3.5 light Mariestads e.p.
Mariestad Brewery Ltd 3.5 between Julebrygd 60
Pivovar Nova Paka, Czech republic 2.8 light BrouCzech ep
Pivovary Staropramen 3.5 light Staropramen 21
Pripps Sweden 2.2 light Pripps Light beer 17
Pripps Sweden 3.5 Pripps Blue Light Special Stock 32
Pripps Sweden 3.5 light Pripps Blue Pure 28
Pripps (Carlsberg) 5.0 dark Christmas beer 33
Pripps (Carlsberg) 5.2 light Pripps Blue 66
Shepherd Neame Whitstable Kent 3.5 between Bishops Finger ep
Singha Corp. Thailand 5 Light Singha Premium stock beer 17
Source Castle Brewery 3.5 Uppsala dark Christmas beer 23
Source Castle Brewery Ltd 3.5 Light White Weissbier 67
Source Castle Brewery Ltd 3.5 from Vienna ep
Source Castle Brewery Ltd 9.0 dark Imperial Stout 50
Spendrups Brewery Ltd 2.0 dark Gammeldags Moderate Drinking ep
Spendrups Brewery Ltd 2.1 light Spendrups Premium Stock 31
Spendrups Brewery Ltd 2.8 light Norrland Gold 21
Spendrups Brewery Ltd 3.5 light Norrland Gold 35
Spendrups Brewery Ltd 3.5 light Spendrups Premium Gold ep
Spendrups Brewery Ltd 3.5 light Spendrup Bright Brew 28
Spendrups Brewery Ltd 3.5 light Odin Pilsner 46
Spendrups Brewery Ltd 5.0 light Spendrups Premium Stock 53
Spendrups Brewery Ltd 5.2 dark Christmas beer 24
Spendrups Brewery Ltd 5.3 light Mariestads Exports 45
Spendrups Brewery Ltd 5.3 light Norrland Gold 38
Spendrups Brewery Ltd 5.3 dark Norrland July ep
Spendrups Brewery Ltd 5.9 light Spendrups Premium Gold 35
Spendrups Brewery Ltd 7.0 dark julbock 34
Starobrno Brewery Czech 3.5 light Starobrno Premium Stock 21
St Peter's Brewery *, UK 4.2 Light St. Peter's G-Free (gluten-free) ep
Tuborg Copenhagen, Denmark 3.5 light Tuborg Beer Premium Gold 28
Zeunerts AB, Sollefteå 5.1 dark Christmas beer 37
* According to the ingredients list, brewed from sorghum.
e.p. = Not detected, which means less than 10 mg per liter gluten
My favorite beer blog is by the beer nut and this links to his gluten free section.