Sunday, December 30, 2012

World Cup 2010 Heatmap

I am reading Visualize This by Yau at the moment. It is full of really pretty visualization ideas and examples. One of them is a heatmap of NBA players. To practice this visualization I have made one of World Cup 2010 players. I got the dataset from the Guardian Data blog post 'World Cup 2010 statistics: every match and every player in data'. The data only quantifies 5 qualities but that is good enough to practice making heatmaps.
The R code I used is below.
library(RColorBrewer)
#save the Guardian data to World.csv and load it
players2<-read.csv('World.csv', sep=',', header=TRUE)
players2[1:3,]
#players with the same name (like Torres) meant I had to merge surnames and countries
players2$Name <-paste(players2[,1], players2[,2])
rownames(players2) <- players2$Name
###I removed one player by hand
###I now do not need these columns
players2$Position <- NULL
players2$Player.Surname <- NULL
players2$Team <- NULL
players3 <-players2[order(players2$Total.Passes, decreasing=TRUE),]
### or to order by time played
###players3 <-players2[order(players2$Time.Played, decreasing=TRUE),]
players3 <- players3[,1:5]
players4<-players3[1:50,]
players_matrix <-data.matrix(players4)
###change names of columns to make graph readable
colnames(players_matrix )[1] <- "played"
colnames(players_matrix )[2] <- "shots"
colnames(players_matrix )[3] <- "passes"
colnames(players_matrix )[4] <- "tackles"
colnames(players_matrix )[5] <- "saves"
players_heatmap <- heatmap(players_matrix, Rowv=NA, Colv=NA, col = brewer.pal(9, 'Blues'), scale='column', margins=c(5,10), main="World Cup 2010")
dev.print(file="SoccerPassed.jpeg", device=jpeg, width=600)       
#players_heatmap <- heatmap(players_matrix, Rowv=NA, Colv=NA, col = brewer.pal(9, 'Greens'), scale='column', margins=c(5,10), main="World Cup 2010")
#dev.print(file="SoccerPlayed.jpeg", device=jpeg, width=600) 
dev.off()
Nothing very fancy here. Just showing that with a good data source and some online tutorials it is easy enough to knock up a picture in a fairly short time.

Monday, December 24, 2012

The Price Of Guinness

When money's tight and hard to get 
And your horse has also ran, 
When all you have is a heap of debt - 
A PINT OF PLAIN IS YOUR ONLY MAN.
Myles Na Gopaleen

How much has Guinness increased in price over time? Below is a graph of the price changes. The data is taken from a combination of the Guinness price index and CSO data.

The R code for this graph is below.

pint<-read.csv('pintindex.csv', sep=',', header=TRUE)
plot(pint$Year, pint$Euros, type="s", main="Price Pint of Guinness in Euros", xlab="Year", ylab="Price Euros", bty="n", lwd=2)
dev.print(file="Guinness.jpeg", device=jpeg, width=600)       
dev.off() 
Paul in the comments asked a good question. How does this compare to earnings?
 Year   Price (Euro)   Earnings/Price   Earnings per Week (Euro)
 2008   4.22           167.31           706.03
 2009   4.34           161.69           701.73
 2010   4.20           165.02           693.08
 2011   4.15           165.81           688.11
 2012   4.23           163.56           691.87
Here the earnings are average weekly earnings, which is the modern and slightly different measure from the average industrial wage that the Pint Index used. It shows that even with a drop in the price of Guinness, the number of pints a week's wages could buy decreased. This is based on gross wages; increases in tax probably made the situation based on net wages worse.
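
The Earnings/Price column can be reproduced with a few lines of R. This is only a sketch using the figures from the table; the data frame and column names are illustrative and are not part of the original analysis.

```r
# Figures copied from the table above; names are illustrative only
wages <- data.frame(
  year     = 2008:2012,
  price    = c(4.22, 4.34, 4.20, 4.15, 4.23),           # price of a pint, euro
  earnings = c(706.03, 701.73, 693.08, 688.11, 691.87)  # average weekly earnings, euro
)

# Number of pints a week's gross earnings would buy
wages$pints <- round(wages$earnings / wages$price, 2)
print(wages)
```

Comparing the first and last rows of `wages$pints` shows the drop in purchasing power between 2008 and 2012.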

pintindex.csv is

Year,Euros
1969,0.2
1973,0.24
1976,0.48
1979,0.7
1983,1.37
1984,1.48
1985,1.52
1986,1.64
1987,1.73
1988,1.8
1989,1.87
1990,1.93
1991,2.02
1992,2.15
1993,2.24
1994,2.34
1995,2.42
1996,2.5
1997,2.52
1998,2.65
1999,2.74
2000,2.88
2001,3.01
2002,3.24
2003,3.41
2004,3.54
2005,3.63
2006,3.74
2007,4.03
2008,4.22
2009,4.34
2010,4.2
2011,4.15
2012,4.23

Wednesday, December 19, 2012

Cystic Fibrosis Improved Screening

In the first post I claimed that, like Tay-Sachs in Israel, Cystic Fibrosis could be drastically reduced with some relatively inexpensive genetic testing. In the second, further analysis suggested that such genetic screening of the Irish population would pay for itself several times over. In this post I want to see if some form of targeted screening could be shown to be as cost effective as currently implemented screening.

Currently there is free screening for people who have relatives with CF and for their partners. I assume a second cousin counts as a relative. Based on this paper and some consanguinity calculations I calculate that an Irish couple, one of whom has a second cousin with CF, has about twice the chance of having a child with CF as the general population. This means you can currently be tested for free if you have about a 1 in 700 chance of having a child with cystic fibrosis, whereas the general population has a 1 in 1444 chance. If screening can be focused so that it is twice as good as random screening, that should be enough by current standards for it to be rolled out.
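
The 1 in 1444 figure can be checked with a little arithmetic. The sketch below assumes the 1 in 19 carrier rate quoted for Ireland elsewhere on this blog and assumes that partners pair up at random; the doubling for second cousins is simply taken from the text rather than derived. Variable names are illustrative.

```r
# Assumption: 1 in 19 Irish people carry a CF variant
carrier_rate <- 1 / 19

# Both parents must be carriers, and a child of two carriers has a 1 in 4 chance of CF
p_general <- carrier_rate^2 * (1 / 4)
one_in_general <- 1 / p_general

# Assumption taken from the text: a second cousin with CF roughly doubles the risk
p_second_cousin <- 2 * p_general
one_in_second_cousin <- 1 / p_second_cousin

cat("General population: 1 in", round(one_in_general), "\n")
cat("Second cousin with CF: 1 in", round(one_in_second_cousin), "\n")
```

The second figure comes out a little above 700, which is where the rough 1 in 700 threshold comes from.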

How could a non random screening be made this focused?

1. Geographic area. Some areas of the country might be more likely to have CF carriers than others. Targeting screening in these areas might make it twice as effective. The Cystic Fibrosis Registry of Ireland annual report 2010 gives numbers for Irish counties. 4 counties do not have their numbers listed but I have estimated these based on their population.

This map is based on the figures of people with CF found in the registry. This could be a biased sample or people could have moved. A better measure would be babies born with CF in each county.

Number of people with CF in each county might be useful for deciding how to allocate some treatment resources. What % of people have CF is more interesting for screening though. To work this out we first need the numbers found in each county.

The number of people with CF in the registry per ten thousand people is

I can send anyone who wants them full sized versions of these maps or the R code I used to generate them. The code I used is below.

library(RColorBrewer)
library(sp)
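# load the GADM county boundaries; this creates an object called gadm
con <- url("http://gadm.org/data/rda/IRL_adm1.RData")
print(load(con))
close(con)
people<-read.csv('cases.csv', sep=',', header=TRUE)
pops = cut(people$cases,breaks=c(0,2,10,20,30,40,50,70,150,300))
# attach the binned counts to the map so spplot can find the "pops" column
gadm$pops <- as.factor(pops)
myPalette<-brewer.pal(9,"Purples")
spplot(gadm, "pops", col.regions=myPalette, main="Cystic Fibrosis Cases Per County",
       lwd=.4, col="black")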
dev.print(file="CFIrl.jpeg", device=jpeg, width=600)
dev.off()
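population<-read.csv('countypopths.csv', sep=',', header=TRUE)
pops = cut(population$population,breaks=c(0,20,40,60,70,80,100,160,400,1300))
# overwrite the "pops" column with the binned population before plotting
gadm$pops <- as.factor(pops)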

myPalette<-brewer.pal(9,"Greens")
spplot(gadm, "pops", col.regions=myPalette, main="Population in thousands",
       lwd=.4, col="black")
dev.print(file="PopIrl.jpeg", device=jpeg, width=600)       
dev.off()

gadm$cfpop <- people$cases/(population$population/10)
cfpop = cut(gadm$cfpop,breaks=c(0,0.5,1,1.5,2,2.5,3,3.5))
gadm$cfpop <- as.factor(cfpop)

myPalette<-brewer.pal(7,"Blues")
spplot(gadm, "cfpop", col.regions=myPalette, main="CF/Population Irish Counties",
       lwd=.4, col="black")
dev.print(file="CFperPopIrl.jpeg", device=jpeg, width=600)       
dev.off() 
If this result was replicated in a more complete analysis, just picking the darker counties could get you the twofold amplification needed to have a test as strong as the ones currently paid for.

2. Pick certain ethnic minorities. Some groups have higher levels of CF than the average population. For example Travellers have higher levels of some disorders, though not necessarily of CF. One source distinguishes two groups. The first are 'disorders, including Phenylketonuria and Cystic fibrosis, that are found in virtually all Irish communities and probably are no more common among Travellers than in the general Irish population. The second are disorders, including Galactosaemia, Glutaric Acidaemia Type I, Hurler’s Syndrome, Fanconi’s Anaemia and Type II/III Osteogenesis Imperfecta, that are found at much higher frequencies in the Traveller community than the general Irish population'. It adds that 'There is no proactive screening of the Traveller population no more than there is proactive screening of the non-traveller Irish population'. I do not think deliberate screening of one ethnic group, unless that group themselves organise it, is a good idea. Singling out one ethnic group for screening risks stigmatising its members and reminds many of the horror of eugenics.

3. Certain disorders seem to cluster with CF. 'In 1936, Guido Fanconi published a paper describing a connection between celiac disease, cystic fibrosis of the pancreas, and bronchiectasis'. Ireland also has the highest rate of celiac disease in the world (about 1 in 100). If CF and celiac disease, or some other observable characteristic, are also correlated in Ireland, testing people with celiac disease in their family could also amplify the yield of a test.

4. Screening parents undergoing IVF. "HARI was the first clinic in Ireland to offer IVF and it currently receives up to 800 enquiries a year specifically about the procedure. It carries out over 1,350 cycles of IVF treatment annually and over 3,500 babies have been born as a result. The Merrion Clinic carries out up to 500 cycles of IVF per year, while last year, SIMS carried out 1,063 cycles." IVF is roughly 33% effective per cycle, so about 1000 children are born through IVF from these three Irish clinics each year. Screening of these parents would prevent roughly one CF case per year. Screening people who use IVF does not prevent many cases. It can be used by people who know they are CF carriers to avoid having a child with CF though.
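
The IVF estimate in point 4 can be sketched in R as follows. The cycle counts and success rate come from the text; treating every successful cycle as one birth and applying the general-population risk of 1 in 1444 are simplifying assumptions, and the variable names are illustrative.

```r
# Cycles per year quoted above for the three clinics
cycles <- c(HARI = 1350, Merrion = 500, SIMS = 1063)

success_rate <- 0.33      # rough per-cycle success rate from the text
cf_risk      <- 1 / 1444  # general-population chance of a child with CF

# Assumption: one birth per successful cycle
births <- sum(cycles) * success_rate

# Expected CF births per year among these IVF births
expected_cf_cases <- births * cf_risk

cat("Estimated IVF births per year:", round(births), "\n")
cat("Expected CF cases per year:", round(expected_cf_cases, 2), "\n")
```

The expected number of cases is below one per year, which is why screening IVF parents alone prevents so few cases.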

Concerns about the privacy and security of a general genetic screening program of the Irish population should not be ignored. Cathal Garvey on Twitter pointed out that this screening would have to come 'With explicit informed consent & ensuing destruction of samples, Just wary of prior shenanigans of HSE bloodspot program. i.e. it's already fashionable among governments to abuse screening programs to create 'law' enforcement databases. Without clear guarantees against that, must weigh the costs of mass DNA false incriminations vs. gains of ntnl screening prog!' I agree that any genetic screening program for Ireland would have to ensure privacy for the individual.

Screening the general population for carriers of serious genetic disorders would save money and suffering. If the level of savings is not sufficient for general screening, focusing on certain locations or on relatives of people who suffer from disorders that co-occur with CF could amplify the returns sufficiently to be as useful as current screening.

Thursday, December 13, 2012

Gluten Levels of 73 Beers

I often hear people ask what the gluten content of various beers is, particularly celiacs who want to avoid gluten. This post is just a direct Google Translate rendering of a Swedish research paper. PDFs can be hard to search, as can Swedish documents for English-speaking users. I am just putting up this translation to aid people searching for this research on beer gluten levels. The appendix here is from the Swedish National Food Agency (NFA). It comes from a regularly cited report, "Gluteninnehåll i de öl som analyserats vid Livsmedelsverket" (Gluten content in beer), SLV, 2009, which is difficult to find online. The commonly linked location for it is dead.
Gluten content of the beers analyzed at the NFA
A total of 73 beers were analyzed. For 12 of these the gluten content was 50 mg of gluten per liter or higher.
A further 11 beers contained between 41 and 50 mg of gluten per liter. The list is sorted alphabetically by manufacturer.
One should be aware that the consumption of beer can lead to an increased intake of gluten, even if the gluten concentration in the beer is on a par with that found in foods that are suitable for people with gluten intolerance.
Consumption of 0.5-1 liters of beer can in some cases make a significant contribution to the daily intake of gluten, which for an adult with celiac disease should be below 50 mg of gluten per day.
The table sometimes gives the gluten level as ep (or e.p.) = not detected, which means less than 10 mg of gluten per liter.
Columns: manufacturer, alcohol strength, color, name, gluten in ppm (mg/l)
AB Åbro Brewery, Sweden 3.5 light Åbro Original ep
AB Åbro Brewery, Sweden 3.5 Light 18:56 ep
AB Åbro Brewery 5.2 light Andersson Beer 47
AB Åbro Brewery 5.2 light Småland 41
Arthur Guinness Son & Co., Dublin, Ireland 3.5 dark Guinness Draft 48
Arthur Guinness Son & Co., Dublin, Ireland 5 dark Guinness Extra Stout 62
Brau Union Österreich AG 2.8 light Zipfer 23
Carlsberg, Denmark 2.8 light Carlsberg Beer 15
Carlsberg, Denmark 3.5 light Carlsberg Beer 21
Carlsberg, Denmark 3.5 Dark Carnegie Porter 20
Carlsberg, Denmark 4.1 light-Saxon gluten ep
Cerveceria Modelo, Mexico 4.6 Light Corona Extra ep
Cerveceria Cuauhtemoc Moctezuma, Mexico 4.5 light Sun ep
Erdinger Weissbräu, Germany 5.3 light Erdinger Weissbier 1188
Erdinger Weissbräu, Germany 5.6 Dark Erdinger Weissbier obscure 1224
Eriksberg 5.6 dark Christmas beer 33
Falcon Breweries, Sweden 2.8 light-Falcon 28
Falcon Breweries, Sweden 3.5 between Falcon Ale 22
Falcon Breweries, Sweden 3.5 light Falcon Extra brew 24
Falcon Breweries, Sweden 3.5 light Falcon Pilz 67
Falcon Breweries, Sweden 5.2 between Bavarian Falcon 55
Falken Falkenberg 3.5 Dark Beer July 49
Grolsche Bierbrowerij, Holland 3.5 light Grolsch Premium Stock 15
Harboes Brewery, Denmark 2.2 Light The Cheerful Dane 25
Harboes Brewery, Denmark 2.8 light Dansk Pilsner premium beer 42
Harboes Brewery, Denmark 3.5 light Dansk Pilsner premium beer 34
Harboes Brewery AB Denmark 3.5 lighting Christmas beer 31
Harboes Brewery AB Denmark 7.3 light Bjørne brewer 49
Hartwall PLC, Tornio, Finland 3.5 light Lapin Kulta ep
Hartwall PLC, Tornio, Finland 5.2 light Lapin Kulta Premium stock ep
Heineken Brouwerijen, Holland 3.5 light Heineken 45
Hofbräu, Germany 6.3 light Hofbräu October-fest bier 26
Inbev UK Limited 3.5 dark Murphys Irish Stout 43
Jämtland Brewery Ltd 6.5 dark Christmas beer e.p.
Kopparberg Brewery 5.3 light Fagerhult Exports III 93
Kra'sne'Březno 4.8 dark Zlatopramen 47
Kronenbourg Strasbourg, France 5.0 light Kronenbourg 1664 97
Krönleins Brewery AB Halmstad 5.3 dark Christmas beer exports 33
Löwenbräu, Germany 6.1 light Lowenbrau October-fest bier 21
Mariestad Brewery Ltd [Spendrups] 2.8 light Mariestads 40
Mariestad Brewery Ltd [Spendrups] 3.5 light Mariestads e.p.
Mariestad Brewery Ltd 3.5 between Julebrygd 60
Pivovar Nova Paka, Czech republic 2.8 light BrouCzech ep
Pivovary Staropramen 3.5 light Staropramen 21
Pripps Sweden 2.2 light Pripps Light beer 17
Pripps Sweden 3.5 Pripps Blue Light Special Stock 32
Pripps Sweden 3.5 light Pripps Blue Pure 28
Pripps (Carlsberg) 5.0 dark Christmas beer 33
Pripps (Carlsberg) 5.2 light Pripps Blue 66
Shepherd Neame Whitstable Kent 3.5 between Bishops Finger ep
Singha Corp. Thailand 5 Light Singha Premium stock beer 17
Source Castle Brewery 3.5 Uppsala dark Christmas beer 23
Source Castle Brewery Ltd 3.5 Light White Weissbier 67
Source Castle Brewery Ltd 3.5 from Vienna ep
Source Castle Brewery Ltd 9.0 dark Imperial Stout 50
Spendrups Brewery Ltd 2.0 dark Gammeldags Moderate Drinking ep
Spendrups Brewery Ltd 2.1 light Spendrups Premium Stock 31
Spendrups Brewery Ltd 2.8 light Norrland Gold 21
Spendrups Brewery Ltd 3.5 light Norrland Gold 35
Spendrups Brewery Ltd 3.5 light Spendrups Premium Gold ep
Spendrups Brewery Ltd 3.5 light Spendrup Bright Brew 28
Spendrups Brewery Ltd 3.5 light Odin Pilsner 46
Spendrups Brewery Ltd 5.0 light Spendrups Premium Stock 53
Spendrups Brewery Ltd 5.2 dark Christmas beer 24
Spendrups Brewery Ltd 5.3 light Mariestads Exports 45
Spendrups Brewery Ltd 5.3 light Norrland Gold 38
Spendrups Brewery Ltd 5.3 dark Norrland July ep
Spendrups Brewery Ltd 5.9 light Spendrups Premium Gold 35
Spendrups Brewery Ltd 7.0 dark julbock 34
Starobrno Brewery Czech 3.5 light Starobrno Premium Stock 21
St Peters Brewery *, UK 4.2 Light St. Petersburg G-free (gluten-free) ep
Tuborg Copenhagen, Denmark 3.5 light Tuborg Beer Premium Gold 28
Zeunerts AB, Sollefteå 5.1 dark Christmas beer 37
* According to the ingredients list on the brew sorghum.
e.p. = Not detected, which means less than 10 mg per liter gluten
My favorite beer blog is by the beer nut and this links to his gluten free section.

Wednesday, December 12, 2012

Cystic Fibrosis Carrier Screening

In my last post Cystic Fibrosis Screening I described how Tay-Sachs had been nearly eradicated in Israel and America and did a rough calculation as to why it would be cost effective to run a similar program to screen for Cystic Fibrosis in Ireland.

In this post I am going to take a closer look at the figures involved to give more evidence that such a screening program is justified.

Cystic Fibrosis Ireland approves of genetic carrier screening for those related to people with CF and their partners. Genetic Carrier Testing For Cystic Fibrosis. 'Carrier testing is limited to adults over the age of 16 where there is a family history of CF, or where a family member has been found to be a carrier of a CF mutation' says the lab that does the testing.

[in the UK] 'A disadvantage of cascade testing is that it will not identify the majority of carrier couples since more than 80% of affected infants are born in families without a prior history of the disease'. Testing relatives, though useful, only covers a small fraction of potential CF cases.

This screening of relatives is paid for out of public funds.

"How much does the test cost? GP fees will apply for arranging the blood test but molecular genetic testing at NCMG and any genetic counselling you may have is a public service and therefore free of charge".

This means that CF Ireland and the health service are involved in and support CF carrier screening, so some of the moral objections to public screening that might have existed are not present.

What would population wide screening cost? The cost of genetic screening has fallen amazingly fast. For example, here is the cost of sequencing an entire genome compared to Moore's law.

23andMe, a private company, has recently announced it will offer its test for $99. This test checks for over 200 genetic markers including some forms of cystic fibrosis. This is further confirmation that genetic screening is getting much cheaper fast and that its current cost is quite low, at less than a tenth of the cost of a night in hospital.

The list price of sending a sample from every 16 year old to 23andMe each year would at present be $7.5 million. There are reasons you might not want a private company to do this but it gives a baseline cost. This $7.5 million is the lifetime cost of under 8 CF patients: 'However, the lifetime medical cost of the care of a CF child in today’s dollars was estimated to be slightly >$1,000,000'. To be economical, at US prices, this screening would have to prevent 8 of the roughly 40 CF sufferers born a year. Other genetic disorders are also screened for at this $99 cost, including many of those listed here. None of these is as common as CF in Ireland but these other disorders should be included in a full cost benefit analysis of genetic screening for the Irish population.
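
The break-even reasoning above can be laid out as a short R sketch. All the inputs are the figures quoted in the text; the variable names are illustrative, and the calculation ignores follow-up costs such as counselling or IVF.

```r
price_per_test <- 99     # 23andMe list price in dollars
annual_budget  <- 7.5e6  # quoted yearly cost of testing every 16 year old, dollars
lifetime_cost  <- 1e6    # quoted lifetime cost of care for one CF patient, dollars
cf_births      <- 40     # rough number of children born with CF each year

# Number of people the quoted budget covers at list price
people_tested <- annual_budget / price_per_test

# Cases that must be prevented each year for the screening to pay for itself
break_even_cases <- ceiling(annual_budget / lifetime_cost)

# Share of yearly CF births that this represents
break_even_share <- break_even_cases / cf_births

cat("People tested per year:", round(people_tested), "\n")
cat("Cases to prevent per year:", break_even_cases, "\n")
cat("Share of CF births:", break_even_share, "\n")
```

On these figures the screening would need to prevent about a fifth of each year's CF births to break even on care costs alone.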

This $100 cost is slightly deceptive, as once someone finds out they are a CF carrier there are several options available to them. These vary in cost. They can decide (or matchmakers can ensure) not to have children with another carrier. If they do decide to have children with another carrier they can use IVF techniques to ensure an embryo without CF is implanted. Many of the cost analyses of CF screening (like the Rowley et al paper) include the possibility of screening a fetus for CF and terminating the pregnancy if CF is found. This option is not legal in Ireland. Or they can ignore their own and their partner's screening results and have a baby as normal, with all the risks that entails.

These costs and the probabilities of each have been worked out for the US in the 1998 paper Prenatal screening for cystic fibrosis carriers: an economic evaluation. 'the marginal cost for prenatal CF carrier screening is estimated to be $8,290 per quality-adjusted life-year. This value compares favorably with that of many accepted medical services. The cost of prenatal CF carrier screening could fall to equal the averted costs of CF patient care if the cost of carrier testing were to fall to $100'. This QALY cost figure is used by health care economists to decide which treatments and screenings meet a cost benefit analysis. The Wikipedia page on QALY describes the measure well. According to this paper screening in the US, where CF is about four times rarer, is cost effective for the general population at current screening prices.

The paper 'Economic evaluation of cystic fibrosis screening: A review of the literature' has further figures on the cost of screening. This paper is from 2008 and the figures it quotes can be from years before then. As an example of how much screening costs have dropped in that time 23andMe screening cost $999 in 2007 and is now $99 and screens for more genetic markers.

In the UK £30,000 per QALY is generally considered cost effective.

In Ireland what is the cost for a QALY? 'In Ireland, there is no fixed and generally agreed cost effectiveness threshold below which health care technologies would be considered by policy makers to be cost effective'. ’Pee-in-a-pot’ screening in third level institution/college settings may be considered cost effective if a cost effectiveness threshold in the region of €45,000 per QALY gained is used. This €45,000 per QALY gained seems to be a generally accepted figure.

There are more costs to screening than can be summed up in a € per QALY figure. Any screening will induce worry, for example. Prostate, breast cancer and other screenings also induce extra human costs that are not measurable in QALYs. These common screenings also have to meet these cost per QALY standards.

Given the US analysis at $8,290 per QALY, the drastic drop in screening costs since then and the high rate of the CF gene in the Irish population, it seems to me that full CF screening of the Irish population would be very cost effective.

Friday, December 07, 2012

Cystic Fibrosis Screening

Approximately 1 in 25 Ashkenazi Jews are carriers of Tay-Sachs disorder which is a really nasty genetic disorder that kills children who have inherited two copies of the Tay-Sachs variant of the gene from their parents by the age of four.

This disease does not affect many kids anymore though:

of the 10 babies born in North America in 2003 with Tay-Sachs, not a single one was Jewish.

Figures from Israel paint a similar picture.

According to Prof. Joel Zlotogora, who heads the Health Ministry's Department of Community Genetics, just one baby was born with Tay-Sachs in Israel in 2003. Insofar as is known, not a single baby in Israel was born with Tay-Sachs last year (2004).

Almost ten years ago Israel pretty much eradicated a really nasty genetic disease. This has been done by screening people so that they know they are carriers. If they find out early in a relationship that their partner is also a carrier, people tend to decide they are incompatible. IVF techniques allow embryos to be tested for certain genetic diseases before implantation. Finally 'the general public in Israel is advised to carry out, at the expense of the state, genetic tests to diagnose the disease before the birth of the baby. In the event an unborn baby is diagnosed with Tay-Sachs, the pregnancy is usually terminated'.

The Cystic Fibrosis variant of the gene is carried by 1 in 19 Irish people. This is the highest rate in the world. CF is an unpleasant disease but not nearly as unpleasant as Tay-Sachs. It is however the most common genetic disorder in Ireland and one that is more common than Tay-Sachs, which screening eradicated nearly a decade ago.

75,554 children were born in Ireland during 2009. Testing newborns would be unfair for reasons of consent but a genetic test for CF could be offered to adults. A genetic carrier test costs a bit over 100 euro (but the cost has been dropping exponentially for the last several years). Testing in bulk means this screening could come in at under 7 million euro a year. Much of the cost in testing involves collection and processing of samples. This means other less common genetic disorders could be screened for at little extra cost.

A bed in an Irish hospital costs €910 per day. This is for a semi-private room and I believe CF rooms need to be more isolated than this. That is €332,150 per year. For the 20 beds in the St Vincents unit that comes to roughly €6.6 million per year. The cost of screening every 18 year old for CF is roughly that of running one ward in St Vincents for a year.
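
The comparison between the ward and the screening programme can be sketched in R. The bed cost, bed count and 2009 birth figure are from the text; using the 2009 births as the size of each year's cohort of 18 year olds is an assumption, and the variable names are illustrative.

```r
bed_per_day <- 910    # euro, semi-private room
beds        <- 20     # beds in the St Vincents unit
cohort      <- 75554  # assumption: one year's cohort is about the 2009 births

bed_per_year  <- bed_per_day * 365
ward_per_year <- bed_per_year * beds

# Price per test at which screening one cohort costs the same as the ward
break_even_price <- ward_per_year / cohort

cat("One bed for a year: EUR", bed_per_year, "\n")
cat("Twenty beds for a year: EUR", ward_per_year, "\n")
cat("Break-even price per test: EUR", round(break_even_price, 2), "\n")
```

A bulk price somewhat below the 100 euro list price is therefore enough to bring a year of screening in at the cost of the ward.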

This cost benefit analysis of screening for CF against one ward ignores all the other medical costs involved in CF but, worse, it ignores the suffering of the 35-40 children (one in 1,461) born with the disorder every year. The termination of fetuses with CF would not be supported in Ireland. CF compared to Tay-Sachs is a mild disorder and Irish people have a different opinion on termination to Israelis. But with voluntary screening CF cases would reduce significantly from partner selection alone.

I think at least a cost benefit analysis and a debate on the morality of genetic screening of the general population should take place.

Friday, November 30, 2012

Visiting Santa's Grave

I found out recently that Santa is buried in Ireland. No really, he is, and you can go visit his grave. I didn't go to avoid having to get the sprog Christmas presents, though that would be a bonus; I'm pretending to be Buddhist to do that.

Thomastown in Kilkenny is a funny town. It is beautiful looking and full of hippies but also has a rough edge to it. There are all sorts of art workshops and tea shops that the posh people run during the day and at night there is an air of menace and divilment about some of the pubs there.

John Martyn, the town's most famous resident, seems to also traverse these two characteristics. He had the hippy spirituality of his friend Nick Drake but also looked and acted like he was well up for a row.

It turns out that Thomastown may have always had this strange mixture of the spiritual and the hedonistic. Just outside the town is the famous Jerpoint Abbey and close to there is the newly rediscovered Newtown.

The story of Newtown is that some Norman knights from Kilkenny headed off on the crusades in the 12th century. Taking religious relics was one of the major hobbies of the time, like Pokémon but with dead people. Two of them took back to Newtown what they claimed were Santa's bones, the relics of St Nicholas. The story goes they got these in modern day Turkey. The aim of collecting these was possibly to create a tourist attraction to compete with other relics in Ireland. There is no way of telling at this remove whether the bones were genuine, but contemporary accounts at least reveal that the people of the time believed they were.

The town thrived with these relics used as a draw for tourists and pilgrims. A three story church of St Nicholas was built.

A town developed around it, including three mills, eight pubs and a whore house, the remains of which still exist. This was out of a total of 13 houses so pubs played a big role in the town. This shows religious pilgrimage may not have been so holy.
The town is at the last navigable point on the Nore, where it meets the Arrigle.
These rivers powered the mills and allowed fish farming in the floodplain. Fish farming still takes place in the next door Goatsbridge farm which shows how little changes over time.

On the grave of Santa himself is the famous symbol of St Nicholas, the three figures.

These represent the three bags of gold Santa put down the chimney to pay the dowry of a poor man's three daughters. This explains both Santa's chimney shimmying antics and the pawn shop symbol. St Nicholas is the patron saint of pawn shops.
The guide explained the three heads on the gravestone as representing St Nicholas and the two crusaders who took his bones back to Ireland. The three figures symbol was generally thought to represent Jesus and Mary in medieval symbols.

The town was abandoned when the plague struck around 1346. Mills were havens for rats and towns were decimated while the more Gaelic countryside was much less affected. Monk John Clyn in nearby Kilkenny wrote the chilling words as the plague descended: 'so that the writing does not perish with the writer, or the work fail with the workman, I leave parchment for continuing the work, in case anyone should still be alive in the future and any son of Adam can escape this pestilence and continue the work thus begun'. The bridge over the river collapsed, though the remains can still be seen. The very existence of the town faded from memory, only to be rediscovered recently. The stones from the houses were used in the railway bridge you can see from the ruins of Newtown.

The grave of St Nicholas is well worth a visit. The tour is informative and entertaining. The site is not over developed like some Irish tourist destinations. And the trip ends on the brilliant Father Ted touch of watching a sheep dog herd geese. Next time you are in Kilkenny head to Thomastown and Jerpoint and take the right lane up by the fish farm, it will at least save you money on Christmas presents.

Monday, November 05, 2012

Drawing the Presidential Election

Finding all the ways the electoral college votes from each state can be added up to give both candidates 269 votes turns out to be really hard. Political pundits always bring up the spectre of a drawn presidential election.

What happens if the US election ends in a draw?

If There’s an Electoral College Tie, Things Will Get Even Crazier Than You May Know

Where both candidates get 269 of the electoral college votes.

The website 270towin calculates 32 practical combinations that could result in a tie and gives the probability of one of these occurring, given polling data, as 0.2%.

The New York Times here claims there are 5 practical ways a tie could result.

But how many possible drawing combinations are there in total, including implausible ones?

Elections are much closer than the set of all possible wins a candidate could have. Certain states are similar and so tend to vote the same way, which makes some divisions far more likely than others. Also, by the median voter theorem, two competing parties will be as similar to each other as possible to get as much of the vote as they can. Roughly, the Republicans will be as near the center as they can while getting all the right wing votes and the Democrats as near the center as they can while getting all the left wing votes.

But how many total ways are there that two candidates can win 50 states and one district? That is, as ColinTheMathmo pointed out, 2^51, as each state can go to either candidate. I want to find the number of allocations among these 2^51 that give each candidate 269 electoral college votes*.

2^51 is a big number. As Matthew Saltzman said the set partition is an NP-Complete problem and a 'Complete enum at 10M/sec, would take 7 years.'

But instead of a complete enumeration we just want to see the cases where both candidates get 269. The code here calculates this. I took some MiniZinc code that calculates all the allocations of states giving 269 electoral votes from Hakan Kjellerstrand, who also pointed out an error in my previous reading of a result. The code is written in MiniZinc as it can calculate all answers in a way GLPK** doesn't. Hakan also kindly pointed out some inefficiencies in my program so I am linking to his code.

Here are some of the many results it output

Alaska California Delaware Florida Idaho Illinois Iowa Maine Michigan Mississippi Montana Nevada 'New York' Ohio Pennsylvania 'South Dakota' Texas Utah 
----------
Alaska California Delaware Florida Idaho Illinois Iowa Maine Michigan Mississippi Montana Nevada 'New York' Ohio Pennsylvania Texas Utah Vermont 
----------
Alaska California Delaware Florida Idaho Illinois Iowa Maine Michigan Mississippi Montana Nevada 'New York' Ohio Pennsylvania Texas Utah Wyoming 
----------
Alaska California Delaware Florida Idaho Illinois Iowa Maine Michigan Mississippi Nevada 'New York' 'North Dakota' Ohio Pennsylvania 'South Dakota' Texas Utah 

My Mac is 2.16 GHz and has 2 GB of RAM, so it is not a fast machine. Still, after three hours of running it ground to a halt. This means I have no answer for you. If I find out how many of the state allocations result in 269 votes I will let you know.
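
One way to get the count without listing every allocation, which was not tried in the original post, is dynamic programming over vote totals. The Python sketch below assumes winner-take-all in all 51 contests (so it ignores the Maine and Nebraska footnote) and uses the electoral votes from the GLPK data in this post. `VOTES` and `count_subsets` are names made up for this example.

```python
# Electoral votes per contest (50 states + DC), copied from the table
# in the GLPK model in this post, in alphabetical order by state.
VOTES = [9, 3, 11, 6, 55, 9, 7, 3, 3, 29, 16, 4, 4, 20, 11, 6, 6, 8, 8,
         4, 10, 11, 16, 10, 6, 10, 3, 5, 6, 4, 14, 5, 29, 15, 3, 18, 7,
         7, 20, 4, 9, 3, 11, 38, 6, 3, 13, 12, 5, 10, 3]


def count_subsets(weights, target):
    """Count the subsets of `weights` whose sum is exactly `target`."""
    # ways[s] = number of subsets of the items seen so far that sum to s
    ways = [0] * (target + 1)
    ways[0] = 1  # the empty subset
    for w in weights:
        # Walk downwards so that each item is used at most once.
        for s in range(target, w - 1, -1):
            ways[s] += ways[s - w]
    return ways[target]


# Every set of states worth 269 votes for one candidate leaves 269
# for the other, so this is the number of tied allocations.
ties = count_subsets(VOTES, 269)
print(f"{len(VOTES)} contests, {sum(VOTES)} votes, {ties:,} tied allocations")
```

This only counts the allocations. Printing each one, as the MiniZinc code does, still takes time proportional to how many there are.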

It is interesting that these sorts of difficult NP-Complete problems pop up in real life and getting solutions for them is not always easy.

* Actually Maine and Nebraska can split their votes between candidates (they award some by congressional district), so the search space is even larger than this.

**Just in case someone can use it, here is GLPK code that will work out one answer

/* sets */
set STATES;
set NEED;

/* parameters */
param VotesTable {i in STATES, j in NEED};
param Pop {i in STATES};
param Need {j in NEED};


/* decision variables: x[i] = 1 if state i goes to the first candidate, 0 otherwise */
      var x {i in STATES} binary;

/* no objective function: any allocation that satisfies the constraint is an answer */

/* Constraints */
s.t. const{j in NEED} : sum{i in STATES} VotesTable[i,j]*x[i] = Need[j];

/* data section */
data;

set STATES :=  Alaska Delaware "District of Columbia" Montana "North Dakota" "South Dakota" Vermont Wyoming Hawaii Idaho Maine "New Hampshire" "Rhode Island" Nebraska Nevada "New Mexico" Utah "West Virginia" Arkansas Kansas Mississippi Connecticut Iowa Oklahoma Oregon Kentucky "South Carolina" Alabama Colorado Louisiana Arizona Maryland Minnesota Wisconsin Indiana Missouri Tennessee Washington Massachusetts Virginia Georgia "New Jersey" "North Carolina" Michigan Ohio Illinois Pennsylvania Florida "New York" Texas California;
set NEED := Votes;

param VotesTable: Votes:=
 Alabama 9
 Alaska 3
 Arizona 11
 Arkansas 6
 California 55
 Colorado 9
 Connecticut 7
 Delaware 3
 "District of Columbia" 3
 Florida 29
 Georgia 16
 Hawaii 4
 Idaho 4
 Illinois 20
 Indiana 11
 Iowa 6
 Kansas 6
 Kentucky 8
 Louisiana 8
 Maine 4
 Maryland 10
 Massachusetts 11
 Michigan 16
 Minnesota 10
 Mississippi 6
 Missouri 10
 Montana 3
 Nebraska 5
 Nevada 6
 "New Hampshire" 4
 "New Jersey" 14
 "New Mexico" 5
 "New York" 29
 "North Carolina" 15
 "North Dakota" 3
 Ohio 18
 Oklahoma 7
 Oregon 7
 Pennsylvania 20
 "Rhode Island" 4
 "South Carolina" 9
 "South Dakota" 3
 Tennessee 11
 Texas 38
 Utah 6
 Vermont 3
 Virginia 13
 Washington 12
 "West Virginia" 5
 Wisconsin 10
 Wyoming 3;

param Need:=
Votes        269;

end;

Friday, November 02, 2012

Becoming President with 22% of the Votes

Political pundits talk about the possibility of winning the Presidential election with fewer votes than your opponent. Assuming only two candidates, what is the lowest percentage of votes you could get and still win the election? In the case where everyone votes this becomes an interesting question. In America some states have fewer people per electoral college vote than others. If these preferentially treated states banded together, a small percentage of people could decide the election.

In my last post I created a program to work out, in an overly complicated way, the least amount of land a president could be elected from in the US. In this post I want to figure out the best states to win to get you 270+ electoral college votes using the smallest number of voters.

I got the estimated population in July 2011 from the census website here. Using this data in the GLPK program below, I got the result shown in this map. If everyone voted, and the people whose votes carry the most power all voted one way, then the states on the winning side would have 135936335 voters to get 270 electoral college votes. The total population is 311591917, so that is 43.63% of the population.

The winner would win 40 of the 51 contests (50 states plus DC) and lose Virginia, Georgia, New Jersey, Michigan, Ohio, Illinois, Pennsylvania, Florida, New York, Texas, California.*

Now if the candidate just squeaked a win by one vote in each of these 40 contests and got zero votes in all the other states, then in a two-party election where everyone voted you could win with under 22% of the vote. If you think of this happening in Senate elections, where each state has 2 senators (and DC none), then you could control 78% of the Senate (39 of the 50 states) with under 22% of the vote.
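
The arithmetic behind these percentages can be sketched in a few lines of Python. The two population figures are the ones given above. Treating "half plus one vote" in each contest as half of the combined population plus one vote per contest is a simplifying assumption of this sketch, and the variable names are made up for it.

```python
winning_states_population = 135_936_335  # the 40 contests on the winning side
total_population = 311_591_917           # all 50 states plus DC
contests_won = 40

share_of_population = winning_states_population / total_population

# Assumption: everyone votes, the winner takes each of the 40 contests
# with just over half of its votes and gets no votes anywhere else.
winner_votes = winning_states_population / 2 + contests_won
share_of_votes = winner_votes / total_population

print(f"{share_of_population:.2%} of the population lives in the winning states")
print(f"{share_of_votes:.2%} of all votes would be enough to win")
```

The second figure comes out a little under 22%.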

There are all sorts of other questions similar to this. What is the smallest bloc where all the states touch? What is the shortest total distance between the state capitals of a winning bloc? I think some analysis that included Senate and Congress seats could be interesting. If you have any ideas please comment.

* Thanks to Hakan Kjellerstrand, who pointed out here that I had read the solution file wrong and North Carolina was not present.

/* sets */
set STATES;
set NEED;

/* parameters */
param VotesTable {i in STATES, j in NEED};
param Pop {i in STATES};
param Need {j in NEED};


/* decision variables: x[i] = 1 if state i is in the winning bloc, 0 otherwise */
      var x {i in STATES} binary;

/* objective function */
      minimize z: sum{i in STATES} Pop[i]*x[i];

/* Constraints */
s.t. const{j in NEED} : sum{i in STATES} VotesTable[i,j]*x[i] >= Need[j];


/* data section */
data;

set STATES :=  Alaska Delaware "District of Columbia" Montana "North Dakota" "South Dakota" Vermont Wyoming Hawaii Idaho Maine "New Hampshire" "Rhode Island" Nebraska Nevada "New Mexico" Utah "West Virginia" Arkansas Kansas Mississippi Connecticut Iowa Oklahoma Oregon Kentucky "South Carolina" Alabama Colorado Louisiana Arizona Maryland Minnesota Wisconsin Indiana Missouri Tennessee Washington Massachusetts Virginia Georgia "New Jersey" "North Carolina" Michigan Ohio Illinois Pennsylvania Florida "New York" Texas California;
set NEED := Votes;

param VotesTable: Votes:=
 Alabama 9
 Alaska 3
 Arizona 11
 Arkansas 6
 California 55
 Colorado 9
 Connecticut 7
 Delaware 3
 "District of Columbia" 3
 Florida 29
 Georgia 16
 Hawaii 4
 Idaho 4
 Illinois 20
 Indiana 11
 Iowa 6
 Kansas 6
 Kentucky 8
 Louisiana 8
 Maine 4
 Maryland 10
 Massachusetts 11
 Michigan 16
 Minnesota 10
 Mississippi 6
 Missouri 10
 Montana 3
 Nebraska 5
 Nevada 6
 "New Hampshire" 4
 "New Jersey" 14
 "New Mexico" 5
 "New York" 29
 "North Carolina" 15
 "North Dakota" 3
 Ohio 18
 Oklahoma 7
 Oregon 7
 Pennsylvania 20
 "Rhode Island" 4
 "South Carolina" 9
 "South Dakota" 3
 Tennessee 11
 Texas 38
 Utah 6
 Vermont 3
 Virginia 13
 Washington 12
 "West Virginia" 5
 Wisconsin 10
 Wyoming 3;

param Pop:=
Alabama 4802740
Alaska 722718
Arizona 6482505
Arkansas 2937979
California 37691912
Colorado 5116796
Connecticut 3580709
Delaware 907135
"District of Columbia" 617996
Florida 19057542
Georgia 9815210
Hawaii 1374810
Idaho 1584985
Illinois 12869257
Indiana 6516922
Iowa 3062309
Kansas 2871238
Kentucky 4369356
Louisiana 4574836
Maine 1328188
Maryland 5828289
Massachusetts 6587536
Michigan 9876187
Minnesota 5344861
Mississippi 2978512
Missouri 6010688
Montana 998199
Nebraska 1842641
Nevada 2723322
"New Hampshire" 1318194
"New Jersey" 8821155
"New Mexico" 2082224
"New York" 19465197
"North Carolina" 9656401
"North Dakota" 683932
Ohio 11544951
Oklahoma 3791508
Oregon 3871859
Pennsylvania 12742886
"Rhode Island" 1051302
"South Carolina" 4679230
"South Dakota" 824082
Tennessee 6403353
Texas 25674681
Utah 2817222
Vermont 626431
Virginia 8096604
Washington 6830038
"West Virginia" 1855364
Wisconsin 5711767
Wyoming 568158;

param Need:=
Votes        270;

end;

What is the least amount of land that will make you President?

Matthew Yglesias in Slate calculates what he thinks is the smallest area of land a candidate could win and still win this presidential election. Densely populated states have more electoral college votes per square kilometer, so you can win the election while winning a relatively small share of America's surface area.
His reasoning is 'I started with a list of states in order of population density. So you have DC, then New Jersey, then Rhode Island, then Massachusetts, and so forth. Eventually you get a set that wins you the electoral college. Except the bloc of the 18 densest states gives you 282 electoral votes—way more than you need. Eliminate Michigan, the 18th densest, and you have 266 electoral votes. So then you can round things out with little New Hampshire's four electoral votes and you have your winning map'.

I checked this allocation with the GLPK program below. I used the electoral votes listed here and the state areas listed on Wikipedia. This gets 270 votes with an area of 1625012 km². The US states plus DC have an area of 9826630 km², so 16.54% of the US could win an election.

The states are Delaware, District of Columbia, Hawaii, New Hampshire, Rhode Island, Connecticut, Maryland, Indiana, Massachusetts, Virginia, New Jersey, North Carolina, Ohio, Illinois, Pennsylvania, Florida, New York, California. My map is here.

Matthew Yglesias' map is the same so he did find the optimal solution by hand.
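
As a cross-check that does not need GLPK, the same minimum can be sketched in Python as a small dynamic program over vote totals. This is not the method used in the post, only an alternative under the same assumptions (winner-take-all, with the votes and areas from the GLPK data below). `STATES` and `min_area_bloc` are names made up for this example.

```python
# (state, electoral votes, area in km^2), copied from the GLPK data below.
STATES = [
    ("Alabama", 9, 135765), ("Alaska", 3, 1717854), ("Arizona", 11, 295254),
    ("Arkansas", 6, 137732), ("California", 55, 423970), ("Colorado", 9, 269601),
    ("Connecticut", 7, 14357), ("Delaware", 3, 6447),
    ("District of Columbia", 3, 177), ("Florida", 29, 170304),
    ("Georgia", 16, 153909), ("Hawaii", 4, 28311), ("Idaho", 4, 216446),
    ("Illinois", 20, 149998), ("Indiana", 11, 94321), ("Iowa", 6, 145743),
    ("Kansas", 6, 213096), ("Kentucky", 8, 104659), ("Louisiana", 8, 134264),
    ("Maine", 4, 91646), ("Maryland", 10, 32133), ("Massachusetts", 11, 27336),
    ("Michigan", 16, 250494), ("Minnesota", 10, 225171),
    ("Mississippi", 6, 125434), ("Missouri", 10, 180533),
    ("Montana", 3, 380838), ("Nebraska", 5, 200345), ("Nevada", 6, 286351),
    ("New Hampshire", 4, 24216), ("New Jersey", 14, 22588),
    ("New Mexico", 5, 314915), ("New York", 29, 141299),
    ("North Carolina", 15, 139389), ("North Dakota", 3, 183112),
    ("Ohio", 18, 116096), ("Oklahoma", 7, 181035), ("Oregon", 7, 254805),
    ("Pennsylvania", 20, 119283), ("Rhode Island", 4, 4002),
    ("South Carolina", 9, 82932), ("South Dakota", 3, 199731),
    ("Tennessee", 11, 109151), ("Texas", 38, 695621), ("Utah", 6, 219887),
    ("Vermont", 3, 24901), ("Virginia", 13, 110785), ("Washington", 12, 184665),
    ("West Virginia", 5, 62755), ("Wisconsin", 10, 169639),
    ("Wyoming", 3, 253336),
]


def min_area_bloc(states, need=270):
    """Return (area, names) for the smallest-area bloc with >= need votes."""
    inf = float("inf")
    # best[v] = (area, names) of the cheapest bloc found with v votes,
    # where any total of `need` or more is stored at index `need`.
    best = [(inf, ())] * (need + 1)
    best[0] = (0, ())
    for name, votes, area in states:
        # Walk downwards so that each state is added at most once.
        for v in range(need, -1, -1):
            cur_area, cur_names = best[v]
            if cur_area == inf:
                continue
            target = min(need, v + votes)
            if cur_area + area < best[target][0]:
                best[target] = (cur_area + area, cur_names + (name,))
    return best[need]


area, names = min_area_bloc(STATES)
print(area, "km^2 from", len(names), "states:", ", ".join(names))
```

If the data above is transcribed correctly, the area printed should be no larger than the 1625012 km² found with GLPK.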

/*code to find the least land area to get 270 votes. Run with 'glpsol -m election.mod -o out'
*/
/* sets */
set STATES;
set NEED;

/* parameters */
param VotesTable {i in STATES, j in NEED};
param Cost {i in STATES};
param Need {j in NEED};


/* decision variables: x[i] = 1 if state i is in the winning bloc, 0 otherwise */
      var x {i in STATES} binary;

/* objective function */
      minimize z: sum{i in STATES} Cost[i]*x[i];

/* Constraints */
s.t. const{j in NEED} : sum{i in STATES} VotesTable[i,j]*x[i] >= Need[j];


/* data section */
data;

set STATES :=  Alaska Delaware "District of Columbia" Montana "North Dakota" "South Dakota" Vermont Wyoming Hawaii Idaho Maine "New Hampshire" "Rhode Island" Nebraska Nevada "New Mexico" Utah "West Virginia" Arkansas Kansas Mississippi Connecticut Iowa Oklahoma Oregon Kentucky "South Carolina" Alabama Colorado Louisiana Arizona Maryland Minnesota Wisconsin Indiana Missouri Tennessee Washington Massachusetts Virginia Georgia "New Jersey" "North Carolina" Michigan Ohio Illinois Pennsylvania Florida "New York" Texas California;
set NEED := Votes;

param VotesTable: Votes:=
 Alabama 9
 Alaska 3
 Arizona 11
 Arkansas 6
 California 55
 Colorado 9
 Connecticut 7
 Delaware 3
 "District of Columbia" 3
 Florida 29
 Georgia 16
 Hawaii 4
 Idaho 4
 Illinois 20
 Indiana 11
 Iowa 6
 Kansas 6
 Kentucky 8
 Louisiana 8
 Maine 4
 Maryland 10
 Massachusetts 11
 Michigan 16
 Minnesota 10
 Mississippi 6
 Missouri 10
 Montana 3
 Nebraska 5
 Nevada 6
 "New Hampshire" 4
 "New Jersey" 14
 "New Mexico" 5
 "New York" 29
 "North Carolina" 15
 "North Dakota" 3
 Ohio 18
 Oklahoma 7
 Oregon 7
 Pennsylvania 20
 "Rhode Island" 4
 "South Carolina" 9
 "South Dakota" 3
 Tennessee 11
 Texas 38
 Utah 6
 Vermont 3
 Virginia 13
 Washington 12
 "West Virginia" 5
 Wisconsin 10
 Wyoming 3;

param Cost:=
 Alabama 135765
 Alaska 1717854
 Arizona 295254
 Arkansas 137732
 California 423970
 Colorado 269601
 Connecticut 14357
 Delaware 6447
 "District of Columbia" 177
 Florida 170304
 Georgia 153909
 Hawaii 28311
 Idaho 216446
 Illinois 149998
 Indiana 94321
 Iowa 145743
 Kansas 213096
 Kentucky 104659
 Louisiana 134264
 Maine 91646
 Maryland 32133
 Massachusetts 27336
 Michigan 250494
 Minnesota 225171
 Mississippi 125434
 Missouri 180533
 Montana 380838
 Nebraska 200345
 Nevada 286351
 "New Hampshire" 24216
 "New Jersey" 22588
 "New Mexico" 314915
 "New York" 141299
 "North Carolina" 139389
 "North Dakota" 183112
 Ohio 116096
 Oklahoma 181035
 Oregon 254805
 Pennsylvania 119283
 "Rhode Island" 4002
 "South Carolina" 82932
 "South Dakota" 199731
 Tennessee 109151
 Texas 695621
 Utah 219887
 Vermont 24901
 Virginia 110785
 Washington 184665
 "West Virginia" 62755
 Wisconsin 169639
 Wyoming 253336;

param Need:=
Votes        270;

end;

Thursday, October 04, 2012

How I Memorise a Poem

This post is not about the why. The podcast "Inscribe the poem on yourself" and the later one "Trying to Impress Literary Types" describe some of the benefits of memorising poetry. There are times in your life when words fail you and then it is handy to have quick access to someone else's. 'A good solid poem in your cortex can be almost like ballast in a ship’s hold. If turbulent mental activity surges, speaking a poem to oneself can be a way to even out the waves.'

Poetry in a set form is much easier to memorise than free verse. The epic poems of The Táin and the Iliad used strict metre to aid memorisation. This is because they came from a time when they were not written down but memorised, so any technique that made them easier to recall was vital.

When you know how many words or syllables should be left in a line, many of the blanks are already filled in and inserting the rest is easier. Christopher Hitchens describes this with

"A preferred form was the limerick, of which I still have a hundred or so hard-wired into my cortex in case of need (or opportunity). Not all these need be filthy—I have a special reserve of clean ones, some without even a double entendre—but all of them do need to follow a certain simple but exacting scheme. It depresses me beyond measure that most people I meet cannot even recite, much less compose, this gem-like form. Nor can any student in any of my English classes produce a single sonnet of Shakespeare: not even to get themselves laid (the original purpose of the project)."

This is what I do when trying to learn a poem:

1. Get the text of the poem. Say Shakespeare's Sonnet 12 (which is on the Leaving Cert).

SONNET 12

When I do count the clock that tells the time,

And see the brave day sunk in hideous night;

When I behold the violet past prime,

And sable curls all silver'd o'er with white;

When lofty trees I see barren of leaves

Which erst from heat did canopy the herd,

And summer's green all girded up in sheaves

Borne on the bier with white and bristly beard,

Then of thy beauty do I question make,

That thou among the wastes of time must go,

Since sweets and beauties do themselves forsake

And die as fast as they see others grow;

And nothing 'gainst Time's scythe can make defence

Save breed, to brave him when he takes thee hence.

2. Find the rhythm and any rules the poem follows. Free verse can sometimes lack these but most poems clip along at a particular pace. Shakespeare's Sonnet 12 is in iambic pentameter: each line has five pairs of an unstressed syllable followed by a stressed syllable. The rhythm can be written as:

da DUM / da DUM / da DUM / da DUM / da DUM

When I / do COUNT / the CLOCK / that TELLS / the TIME

3. See if there is an audio recording of the poem being read. YouTube is a good place to look, but a Google search for "[poem title] audio" will usually turn up something. Here is Sonnet 12. I then rip the audio file from YouTube and stick it on my phone.

4. If there are any words I do not understand I look them up now.

5. Next I follow the memorisation technique described here

Read a line of the poem and say it back to myself. Ideally do this out loud.

The more senses involved in a memorisation the better. Neurons that fire together wire together, and the more bits of your brain you can get in on the task the better. Imagine any easily visualised objects mentioned in the poem.

Try to feel any emotions the line conveys.

Listen to the audio of this line.

6. Now do this again for the next line and so on.

7. Now go back through the poem but two lines at a time.

8. Do this again for 3,4 and 5 lines at a time.

9. Listen again to the full poem and then repeat it completely.

10. Repeat it again to myself and reread what I have learned before I go to sleep.

11. Repeat the poem and read it again the next day, a week later and a month later. This review advice is fairly common but I can't find research that shows these gaps are optimal.

There are other techniques to memorise poems. Competitive memorisers tend to use the method of loci described in my post on memorising cards. In this a poem becomes a walk and each line is a particular location where you imagine something happening. There is a good explanation of how this technique works for poems here.

In my opinion it is worth practicing learning poems by rote initially, as it will at least improve your ability to memorise small chunks of text, which will help even loci-based methods. A friend and I are working on a way to make this simpler system easily available to anyone with a phone.

Monday, September 24, 2012

The Loom of Language

Tim Ferriss recommends one of my favorite books as a great way to learn languages (36:43 into his Long Now talk, audio here). Tim Ferriss wrote the wildly popular self-help books "The 4-Hour Workweek" and "The 4-Hour Body", and his favorite language book can only be bought second hand.

This book about European languages, written in 1942 by Frederick Bodmer, is sadly hard to get nowadays. Bodmer had Chomsky's job in MIT before he retired, which in the linguistics world is a bit like being the opening act for Jesus.

The book was written with the encouragement of and edited by his friend Lancelot Hogben. Hogben had a vaguely Victorian project, entitled optimistically "The Age of Plenty", to write good books in a few different areas so that people could educate themselves.

Mathematics for the Million (1936) and Science for the Citizen (1938) were the first two and "History of the Homeland" the fourth. Hogben seems to have been quite a Victorian kind of guy with a certain smoking jacket cool.

Hogben was mocked by Orwell for his staid writing style. But if you judge a man by the quality of his detractors Orwell is a great one to have. To Hogben's eternal credit he fought against the cloven hoof of eugenics at a time when it was horribly popular.

The Loom of Language is like a secret handshake among language nerds. When you start talking to someone else who has read it, you only escape the conversational black hole a few hours later, having ignored everyone else. I can't keep a copy as I keep giving mine to people. The Loom of Language 'has always been my favorite book about learning languages' is just one example of the gushing reviews the book gets on language blogs.

'It is the only book that actually teaches languages instead of simply teaching how to learn languages.' The first section is a history of human language and alphabets. Next is a morphology and syntax of several languages and then a classification of languages throughout the world. The second section is mainly about how to learn vocabulary lists by taking advantage of similarities among languages particularly regular sound shifts from one language to another.

It is an old-fashioned book in the sense that it deals with European languages and pretty much ignores the fascinating languages of the rest of the world. But what it does do is lay out, through history and logic, where the Germanic and Latin languages come from, how they work, and how they morphed and combined into English.

Mathematics for the Million was really influential in the past. For example, there is a great post here by someone describing how the book inspired him in his career. However, nowadays the book does not seem to be as popular as The Loom of Language is. Still, the mathematics book in the 'Age of Plenty' series is a classic and it should be available to more people.

This book was published during the Second World War with the express intention that language learning could bring people together. The general principle of the Age of Plenty series was that we could and would learn if we just had access to the knowledge. I still think that is true, and with smartphones, Gutenberg, LibriVox and other cool projects this really could happen. Hogben and Bodmer went to a huge effort to write this book, and found 700+ pages during wartime rationing, to try to get us to talk to each other.

Maybe now, 70 years after this book was written, we can achieve its aim far more easily than the authors imagined. If the book was released into the public domain everyone with a smartphone or computer could read it. That is an audience of billions, much more than the million the mathematics book was written for or the '1,800 million people on this globe' when Loom was written. With smartphones we can achieve the aim of Bodmer's book more easily than he could have imagined.

Sunday, September 23, 2012

Memorise a Pack of Cards

Memory is about imagination. Try as vividly as possible to imagine this. Sitting on your bed is a giant hen. She is chirping around, scratching at the sheets. On the windowsill is a big pile of cash; smell the new banknote smell. At the door a beehive buzzes. One of the bees stings you on the arm; imagine the pain. In your mind's eye go to the nearest toilet to your bedroom and see Adolf Hitler sitting on the toilet ranting away. In the sink a tiny Jack White plays Seven Nation Army.

Each of these items is actually a playing card and you are well on your way to memorising the order of a pack of cards.

Derren Brown, in his book 'Tricks of the Mind', describes a peg system where each number reminds him of a sound.

In his system

0 sounds a bit like z

1 looks like an L

2 is n as it has two downstrokes

3 is m as it has three downstrokes

4 is r because it sounds like fouR

5 is v because fiVe

6 looks like a b

7 looks like a T

8 has a gh/ch/j sound in it

9 looks like a g

Now when you need to remember a number, the digits become sounds and those sounds become words. With cards, the suit of the card becomes the start of the word. So words for Hearts start with an H, Clubs with a C, Spades with an S and Diamonds with a D.

A hen is the 2 of hearts

Cash is the 8 of clubs

Hive is the 5 of hearts

Adolf Hitler is the Ace of Hearts. Each of the picture cards I represent with a person. Kings are people with the surname King.

Jack White is the Jack of hearts. Jacks are younger men. Queens are famous queens or actresses who have played a queen.
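
The suit-and-sound rule for the number cards can be sketched as a lookup table. This Python snippet is only an illustration: it checks spelling rather than pronunciation, it covers the cards 2 to 9 only, and the "sh" entry for 8 is an assumption taken from the Cash example rather than from the list above. `SUIT_LETTERS`, `VALUE_SOUNDS` and `fits` are made-up names.

```python
# First letter of the word, by suit.
SUIT_LETTERS = {"hearts": "h", "clubs": "c", "spades": "s", "diamonds": "d"}

# Consonant sound that should follow, by card value (2 to 9 only).
VALUE_SOUNDS = {
    2: ("n",), 3: ("m",), 4: ("r",), 5: ("v",),
    6: ("b",), 7: ("t",), 8: ("gh", "ch", "j", "sh"), 9: ("g",),
}


def fits(word, value, suit):
    """Rough check that `word` could stand for the given card."""
    word = word.lower()
    if not word.startswith(SUIT_LETTERS[suit]):
        return False
    # Drop the vowels after the first letter and look at what comes next.
    consonants = "".join(ch for ch in word[1:] if ch not in "aeiou")
    return consonants.startswith(VALUE_SOUNDS[value])


for word, value, suit in [("hen", 2, "hearts"), ("cash", 8, "clubs"),
                          ("hive", 5, "hearts"), ("dinosaur", 2, "diamonds")]:
    print(word, "->", value, "of", suit, fits(word, value, suit))
```

Because it works on letters, the check is looser than the real system: a word like "cough" would pass for both the 8 and the 9 of clubs.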

It is vital you make the memory as vivid as possible. Change the object's size so it looks ridiculous, like a tiny Jack White in a sink. Have the thing doing something that produces an emotional reaction, like the bee stinging you. Rude things are really memorable so use them when you can.

I won't list out exactly what words go with what cards. It is better you pick words that mean something to you. The six of clubs could be a Cob, a Cub, a Cube or a Cab; pick whichever one is most vivid to you. When you have a word for each card, buy a deck and write the word on each card. Then shuffle the deck and walk around the house putting each card down in a location. Talk out loud about what is happening. "The Dinosaur in the Shower is really scared of the water and is screeching as it tries to escape". Dinosaur is the 2 of Diamonds for me. When every card has been placed, start back at the first card and try to name the next one before you pick it up. Name the object; do not worry about the card it represents yet, you will remember the connection in time.

People have great memories for locations. You can probably describe the route and details along it of a walk you took on holiday years ago but be unable to think of anything you discussed that day. Great memorisers use this ability in what is called the method of loci to put things to be remembered along a known walk. Everyone knows their house well so that is a good location to use to practice memorising a deck of cards. To memorise a poem or mathematical constant a walk near your house might be ideal.

This method of remembering things is ancient. The story goes that about 500 BC Simonides of Ceos snuck out of a banquet for a sneaky smoke when the building collapsed, crushing to death everyone inside. Making the best of a bad situation, Simonides realised he could remember where everyone was sitting and point out the spot so the relatives could dig out their loved ones. This trick of using locations for memory was born out of this unlucky event.

Buildings were always falling on the Greeks. Take the case of history's worst loser. Kleomedes of Astypalaia was an Olympic boxing champion in the early fifth century BC. In 496 he killed his opponent at Olympia with a foul blow. Because of this fouling offence (not because of the death of his opponent, which was considered fine) the Olympic judges took away his victory.

Kleomedes became depressed. On his return to Astypalaia he destroyed a school by pulling down the pillar which kept up the roof in a flash of insanity and so killed all sixty children present. The inhabitants of the city formed a mob and tried to kill him.

He hid in the temple of Athena, from where he disappeared miraculously. His confused pursuers consulted the oracle of Delphi and were told that Kleomedes had become a hero. From then onwards he was honoured with sacrifices. Think about that one next time someone talks about how clever the ancient Greeks were or how mass killings are an entirely new phenomenon. Also this story does illustrate quite how much random stuff I am willing to put into a blogpost.

Memorising large amounts of data, even random-looking data, is quite easy with a system to turn the data into something concrete. I recommend Derren Brown's book on how to improve your memory and Joshua Foer's 'Moonwalking with Einstein'. A few simple techniques and a few minutes' practice every day can have you performing feats most people find unbelievable.

What is on your bed? If you remembered a giant Hen you are well on your way to memorising a deck of cards.

Thursday, August 09, 2012

How byzantine was the Byzantine civil service?

Byzantine is a descriptive adjective for a very complicated bureaucracy:

Byzantine: Highly complicated; intricate and involved: a bill to simplify the byzantine tax structure.

Byzantine: (of a system or situation) Excessively complicated, typically involving a great deal of administrative detail - Byzantine insurance regulations

John D Cook talks here about how the Roman bureaucracy was smaller than Houston's. How does the Roman Empire figure compare to that of a famously bureaucratic empire?

According to recent historians the Byzantine empire did not have that many bureaucrats: 'In terms of staff numbers the Byzantine bureaucracy was relatively small: a recent estimate for the ninth century central civil service places the number of core staff at five to six hundred men, split between thirteen different bureaux or departments of state.'

This was in an empire of around 7 million people, which works out at less than 0.1 civil servants per 1000 people. France has about 90 civil servants per 1000 people, which on per capita numbers is more than 1000 times as bureaucratic as the empire that gave us the dictionary definition.
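
The per capita comparison can be checked with a few lines of Python. The figures are the ones quoted above. Taking the upper end of "five to six hundred" is a choice made for this sketch, and it makes the Byzantine figure as large as the quote allows.

```python
byzantine_staff = 600             # upper end of "five to six hundred"
byzantine_population = 7_000_000  # "around 7 million people"
france_per_1000 = 90              # civil servants per 1000 people

byzantine_per_1000 = byzantine_staff / byzantine_population * 1000
ratio = france_per_1000 / byzantine_per_1000

print(f"Byzantium: {byzantine_per_1000:.3f} core civil servants per 1000 people")
print(f"France has about {ratio:.0f} times as many per head")
```

Using five hundred staff instead would make the ratio larger still.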

Monday, July 30, 2012

Sweet Suffering Hell

GOVERNMENT health experts have ruled out banning fizzy drink vending machines in schools because they are making too much money. Instead they are recommending increased taxes on fizzy drinks in a bid to reduce consumption. ... Having ruled out banning the vending machines, the special group has suggested making sugary drinks more expensive by piling on extra taxes in the next Budget. The cost of a regular soft drink could increase by as much as 7pc.
So sugar is bad and should be taxed.
Sugar companies were among the largest beneficiaries last year of Europe's Common Agricultural Policy payments, according to statistics made public Saturday by most EU member countries. In France, in the year between October 2008 and 2009, three sugar companies received the top subsidies: Tereos (117.9 million euros/156.8 million dollars), Saint Louis Sucre (143.7 million euros) and Cristal Union (57.2 million euros). In Spain a sugar company also occupied first place, with Azucarera Ebro receiving 119.4 million euros. The world's leading sugar company Sudzucker came second in Germany's list with its 42.9 million euro subsidy, behind the dairy company Nordmilch (51.1 million euros).
So sugar is good and should be subsidised. Penn and Teller summed up the problem with "They spend our money to make soft drinks cheap. And now the same government wants more of our tax money to make soft drinks more expensive. Does anyone else think this is incredibly fucked up?"

Saturday, July 14, 2012

Theatre Riots In Irish History

We detail the events that marked a new low in Ireland’s relationship with drink, drugs and casual violence
Were the events at the Phoenix Park last weekend uniquely bad in Irish history? I am not arguing here that they were acceptable; I just want to see if they were a unique low. Ireland was fairly well known in the past for excessive drinking, fighting and sex at festivals. The sedate suburb of Donnybrook and its poshest shop gave English eponymous nouns for excess:

Donnybrook: an inordinately wild fight or contentious dispute; brawl; free-for-all.

Donnybrook Fair: a fair which until 1855 was held annually at Donnybrook, County Dublin, Ireland, and which was famous for rioting and dissipation.

If the original Donnybrook fair rave can give rise to the fancy shop of the same name maybe in 2200 Ikea will be called the Swedish House Mafia. But an Oxegen squared in the 1800's doesn't prove much. Were Irish concerts generally well behaved?

THE BEST PLACE for a good riot is a theatre. The left likes to imagine rioting as the oppressed rising up against the oppressors, and the right sees it as evidence of the moral decay of society. But there’s a long history, in Dublin and London, of theatrical rioting. Indeed, to my knowledge, the longest and most sustained riots in both cities in the past three centuries happened in and around theatres. This surely says something about the nature of both theatre and riots.

According to the history of Irish theatre, the Smock Alley riot of 1747 suggests not. Just down from the Phoenix Park, Smock Alley was where Trinity toffs seemed to go to feel superior. Edmund Kelly, a student, went backstage and told an actress, Mrs Dyer, that he would 'do what her husband Mr Dyer, had done to her', using the obscene expression. Another young Trinity student of the time, Edmund Burke, saw Kelly put his hands 'under the actress's petticoats'. Edmund Burke, the intellectual founder of conservatism, is now considered venerable enough to have a statue outside Trinity. The manager, Sheridan, kicked Kelly out, but because Sheridan was not a 'gentleman' Kelly demanded an apology. Rioting shut the theatre and spread to the streets. Days of riots followed over whether a theatre owner stopping a girl getting raped could say "I am as good a gentleman as you are" about a would-be rapist.

In 1822 the Bottle Riot also started in a theatre. 'Orange sentiment which, in the heated condition of public opinion, had become dangerous, and he prohibited the dressing of the statue of William III. on College Green on July 12, then regarded as an annual demonstration. This was followed by a riot, afterwards known as "the bottle riot," when an organized body of Orangemen packed the pit and gallery of the Dublin theatre when the Marquess was present and with cries of, "Down with the Popish Lord-Lieutenant" they flung missiles, one of which was a large whiskey-bottle, at the royal box'

Next up in the entirely 21st century phenomenon of fights at concerts is the 1851 riot in the Mechanics theatre.

The Beatles song "Being for the Benefit of Mr. Kite!" was inspired by one of the posters of Pablo Fanque, the proprietor on the night of the riot. 'playgoers threatened to riot and destroy the theatre in protest to the winner of a "conundrum" contest', which puts loud words at someone using a mobile during the pub table quiz into perspective.

Synge’s The Playboy of the Western World and O’Casey’s The Plough and the Stars also incited riots. The Playboy riots were incited by Arthur Griffith, later President of Dáil Éireann. He described the play as "a vile and inhuman story told in the foulest language we have ever listened to from a public platform". O'Casey's riot was seen by Yeats as again showing how uncouth Irish concert goers were: "You have disgraced yourself again, is this to be the recurring celebration of the arrival of Irish genius?" Wilde's Salome caused a bit of a ruck as well but does not seem to have descended into open fighting.

There is a long history of people having fights at concerts. History seems to better remember those with political roots unlike what happened in Phoenix Park. To decide this is a particularly bad incident requires us to at least look at the history of these riots.

Tuesday, June 19, 2012

The Fairest Way to Pick a Team

What is the best way to pick a team? As kids we would always strictly alternate between teams, so Team 1 had the first pick, Team 2 the second, then Team 1 again, and so on.

Most things you can measure about people are on a bell curve. A small number of people are bad, most are in the middle and a few are good. There are a few good known metrics of ability. None are perfect; there is no one number that can sum up ability. The simpler the sport, the more one metric can tell you. In cycling VO2 max is a very good indicator, whereas in soccer VO2 max, kicking speed, vertical leap, the number of keepie-uppies you can do and so on could each measure some part of football ability.

So say there was one good metric for a task and teams were picked based on this. Is the standard strict alternation, where Team 1 picks, then Team 2, and so on, fair? Fair here means both teams end up with a similar quality.

I wrote a program in R. Not because I know it but because it is perfect for this sort of problem. If you are picking 5-a-side and each team always picks the best player left, how much better off is the first picker?

With strict alternation the code is

players <- 10
# vector to collect the difference between the teams in each simulation
z <- c()
# run 10000 simulations
for (i in 1:10000)
{
  # rnorm generates a normally distributed dataset
  # this one has 10 elements, a mean of 100 and a std of 12
  # sort puts the biggest at the end
  x <- sort(rnorm(players, mean=100, sd=12))
  # players are picked from the end of x, best first, with picks alternating
  # the even positions go to the first picker and the odd positions to the second
  z <- append(z, sum(x[c(1,3,5,7,9)]-x[c(2,4,6,8,10)]))
}
print(sd(z))
# get the average difference between the two teams (second picker minus first picker)
print(mean(z))

> print(sd(z))

[1] 8.794016

> print(mean(z))

[1] -22.59786

IQ is scaled to an average of 100 and, on most tests, a standard deviation of 15; the simulation uses a slightly narrower spread of 12. IQ isn't used much to pick soccer teams but many things follow a similar pattern. In software development IQ wouldn't be the worst metric to pick a team on, and agile teams are supposed to have between 5 and 9 members. So think of this as people picking teams of developers.

In this simulation Team 1 ends up with an advantage of about .225 of a person. The more people on the team, the greater the advantage the first picker gets.

18 players

> print(sd(z))

[1] 8.287164

> print(mean(z))

[1] -25.52077

16 players

> print(sd(z))

[1] 8.20681

> print(mean(z))

[1] -25.00685
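The runs for different numbers of players can be wrapped in one function instead of editing the index vectors by hand. What follows is a minimal sketch, not the code used for the numbers above: simulate_strict, mu and sigma are hypothetical names, and the difference is taken as first picker minus second picker, so the sign is flipped compared with the output above.

```r
# Hypothetical helper (not from the original post): strict alternation
# for any even number of players.
simulate_strict <- function(players, sims = 10000, mu = 100, sigma = 12) {
  stopifnot(players %% 2 == 0)
  # replicate() repeats the draft `sims` times and collects one difference per draft
  replicate(sims, {
    # sort best first, so position 1 is the first pick
    x <- sort(rnorm(players, mean = mu, sd = sigma), decreasing = TRUE)
    first  <- x[seq(1, players, by = 2)]  # picks 1, 3, 5, ... go to the first picker
    second <- x[seq(2, players, by = 2)]  # picks 2, 4, 6, ... go to the second picker
    sum(first) - sum(second)
  })
}

set.seed(1)  # only so the run is repeatable
for (n in c(10, 16, 18)) {
  d <- simulate_strict(n)
  cat(n, "players: mean advantage", round(mean(d), 1), "sd", round(sd(d), 1), "\n")
}
```

Dividing the mean advantage by 100 gives the advantage as a fraction of an average person, which is where a figure like .225 for 10 players comes from.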

Would another way of picking the teams be fairer?

Balanced alternation, from The Win-Win Solution by Brams and Taylor: 'strict alternation can give a big boost to the first chooser when there are only two parties. What we need to do is reduce this advantage of the first chooser by amending strict alternation.'

Balanced alternation allows the captains to be first chooser in turn.

This is

Team 1 Team 2 Team 2 Team 1 Team 1 Team 2 Team 2 Team 1....

the code is

players <- 10
# vector to collect the difference between the teams in each simulation
z <- c()
# run 10000 simulations
for (i in 1:10000)
{
  # rnorm generates a normally distributed dataset
  # this one has 10 elements, a mean of 100 and a std of 12
  # sort puts the biggest at the end
  x <- sort(rnorm(players, mean=100, sd=12))
  # players are picked from the end of x, best first, in the order 1 2 2 1 1 2 2 1 1 2
  # positions 2,3,6,7,10 go to the first picker and 1,4,5,8,9 to the second
  z <- append(z, sum(x[c(1,4,5,8,9)]-x[c(2,3,6,7,10)]))
}
print(sd(z))
print(mean(z))

> print(sd(z))

[1] 9.757417

> print(mean(z))

[1] -9.04198

This method looks better than the standard strict alternation.
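The Team 1 Team 2 Team 2 Team 1 pattern can also be generated rather than typed out as index vectors. This is a rough sketch under my own assumptions: balanced_order and simulate_balanced are hypothetical helpers, and the difference is first chooser minus second chooser, so it comes out positive where the output above is negative.

```r
# Hypothetical helper: which team (1 or 2) makes each pick under
# balanced alternation, i.e. the pattern 1 2 2 1 1 2 2 1 ...
balanced_order <- function(players) {
  stopifnot(players %% 2 == 0)
  rep(c(1, 2, 2, 1), length.out = players)
}

simulate_balanced <- function(players, sims = 10000, mu = 100, sigma = 12) {
  team <- balanced_order(players)
  replicate(sims, {
    # sort best first, so position k is the k-th pick
    x <- sort(rnorm(players, mean = mu, sd = sigma), decreasing = TRUE)
    sum(x[team == 1]) - sum(x[team == 2])
  })
}

print(balanced_order(10))
set.seed(1)  # only so the run is repeatable
d <- simulate_balanced(10)
cat("mean advantage", round(mean(d), 1), "sd", round(sd(d), 1), "\n")
```

Because every consecutive pair of picks contains one pick for each team, both teams end up the same size for any even number of players.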

Thinking about the bell curve, though, it would make sense if the team that got the best player also got the worst, the team that got the second best got the second worst, and so on. This should even up the teams well. The code for this is

players <- 10

# vector to collect the difference between the teams in each simulation
z <- c()
# run 10000 simulations
for (i in 1:10000)
{
  # rnorm generates a normally distributed dataset
  # this one has 10 elements, a mean of 100 and a std of 12
  # sort puts the biggest at the end
  x <- sort(rnorm(players, mean=100, sd=12))
  # the team that gets the best player (10) also gets the worst (1), and so on
  # the middle pair (5 and 6) is split between the two teams
  z <- append(z, sum(x[c(2,4,6,7,9)]-x[c(1,3,5,8,10)]))
}
print(sd(z))
print(mean(z))

> print(sd(z))

[1] 9.3498

> print(mean(z))

[1] 3.027536

This has a better average difference. The fact that the difference is still as high as it is makes me think I may have a bug in my code.
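One possible explanation that does not involve a bug: with 10 players there are five best-with-worst pairs, so the middle pair has to be split, and the team that gets the 5th best player rather than the 6th best keeps a small edge. For ten normal draws the expected gap between those two players is roughly a quarter of a standard deviation, which is about 3 when the standard deviation is 12. A way to check this is to use a number of players where no pair is split, such as 12, where the average difference should be close to zero. The sketch below assumes a hypothetical helper simulate_paired that is not in the original code.

```r
# Hypothetical helper: pair best with worst, second best with second worst,
# and so on, then give the pairs to the two teams alternately.
simulate_paired <- function(players, sims = 10000, mu = 100, sigma = 12) {
  stopifnot(players %% 4 == 0)  # an even number of pairs, so no pair is split
  pairs <- players / 2
  team <- rep(c(1, 2), length.out = pairs)  # which team gets each pair
  replicate(sims, {
    x <- sort(rnorm(players, mean = mu, sd = sigma), decreasing = TRUE)
    # best + worst, second best + second worst, ...
    pair_sums <- x[1:pairs] + x[players:(pairs + 1)]
    sum(pair_sums[team == 1]) - sum(pair_sums[team == 2])
  })
}

set.seed(1)  # only so the run is repeatable
d <- simulate_paired(12)
cat("mean difference", round(mean(d), 2), "sd", round(sd(d), 2), "\n")
```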

To implement this method, kids would have to alternate picking a player until they were about to pick the middle player in their team. Then Team 2 would get a second pick. This sounds almost practical.

When you played a sport (particularly soccer) as a kid what rules did you pick teams by? Can you think of a better algorithm now?

Monday, June 18, 2012

Baby's First Hack

Maternity hospitals put these security bracelets on your baby. The thing is, because the baby loses so much weight and generally changes so much after birth, they keep falling off. The nurses get annoyed by them because they fall off so often, but generally seem not to mind them too much. My video below shows how easy they are to remove.

And that is without trying to cut off the tag or shield it from the radio receiver in some way. Schneier has a good post on how these tags, even though they don't actually reduce the risk of harm much, are still valuable. I'll quote it at length because it is so good.

'While visiting some friends and their new baby in the hospital last week, I noticed an interesting bit of security. To prevent infant abduction, all babies had RFID tags attached to their ankles by a bracelet. There are sensors on the doors to the maternity ward, and if a baby passes through, an alarm goes off.

Infant abduction is rare, but still a risk. In the last 22 years, about 233 such abductions have occurred in the United States. About 4 million babies are born each year, which means that a baby has a 1-in-375,000 chance of being abducted. Compare this with the infant mortality rate in the U.S. -- one in 145 -- and it becomes clear where the real risks are.

And the 1-in-375,000 chance is not today's risk. Infant abduction rates have plummeted in recent years, mostly due to education programs at hospitals. So why are hospitals bothering with RFID bracelets? I think they're primarily to reassure the mothers. Many times during my friends' stay at the hospital the doctors had to take the baby away for this or that test. Millions of years of evolution have forged a strong bond between new parents and new baby; the RFID bracelets are a low-cost way to ensure that the parents are more relaxed when their baby was out of their sight.

Security is both a reality and a feeling. The reality of security is mathematical, based on the probability of different risks and the effectiveness of different countermeasures. We know the infant abduction rates and how well the bracelets reduce those rates. We also know the cost of the bracelets, and can thus calculate whether they're a cost-effective security measure or not. But security is also a feeling, based on individual psychological reactions to both the risks and the countermeasures. And the two things are different: You can be secure even though you don't feel secure, and you can feel secure even though you're not really secure.

The RFID bracelets are what I've come to call security theater: security primarily designed to make you feel more secure. I've regularly maligned security theater as a waste, but it's not always, and not entirely, so.'

In Praise of Security Theater
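The arithmetic in the quote is easy to check. A small sketch using only the figures quoted above (233 abductions, 22 years, about 4 million births a year, infant mortality of one in 145); the variable names are mine.

```r
# Figures as quoted from Schneier's essay
abductions <- 233
years <- 22
births_per_year <- 4e6
mortality_one_in <- 145

births <- years * births_per_year
abduction_one_in <- births / abductions   # quoted, rounded, as 1-in-375,000
ratio <- abduction_one_in / mortality_one_in

cat("abduction risk: about 1 in", format(round(abduction_one_in), big.mark = ","), "\n")
cat("infant mortality is roughly", round(ratio), "times more likely\n")
```

On those numbers the real risk is a couple of thousand times bigger than the one the bracelet addresses, which is the point of the comparison.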

I agree with his description. From a rational, measurable security point of view the tags are silly, and everyone who thinks about it can tell they're silly. But they reassure new parents about a non-rational but still present fear. And that means the tags probably are not silly.

Thursday, May 24, 2012

That's no way to kill an elephant

It seemed dreadful to see the great beast lying there, powerless to move and yet powerless to die, and not even to be able to finish him. I sent back for my small rifle and poured shot after shot into his heart and down his throat. They seemed to make no impression.
wrote George Orwell in Shooting an Elephant

I read Orwell's essay ages ago and figured it pretty much had the elephant execution story covered. Looking back I was amazingly naive about quite how many bizarre possibilities for dispatching pachyderms existed.

1. Hanging. Erwin, Tennessee thought it was a good idea to hang Mary the elephant.

2. Electrocution. To show that his DC system was a great idea, Edison set out to show that AC was really dangerous. So he got Topsy the elephant and electrocuted her. Which is a massively dick move any way you cut it. He also backed the electric chair as a way to execute criminals, as a similar negative publicity campaign against AC. The video he produced is here

3. Shooting. Tyke. Police fired 86 shots at Tyke, who eventually collapsed from the wounds onto a blue car and died. This video is of Tyke's attack and later shooting. I am not going to embed it as it is frankly horrifying.

4. Harpooning. Chunee. "Kneeling down to the command of his trusted keeper, Chunee was hit by 152 musket balls, but refused to die. Chunee was finished off by a keeper with a harpoon or sword". Having to harpoon an elephant has to be the definition of a hard day at work.

Not execution but still weird

5. Lethal Injection of LSD. Tusko was a 14 year old who weighed about 3,200kg. Some scientists decided to give him enough LSD to get 3000 people off their mash. This mammoth dose killed him under two hours later. The scientific paper that came out of this mess is "Lysergic Acid Diethylamide: Its Effects on a Male Asiatic Elephant."

6. Lightning. "Norma Jean, struck by lightning, c. 1972, during a circus parade in Oquawka, Illinois. She was buried where she died, and a marker now lies on this spot."

7. Drowning (ish). Dan Rice was a sort of PT Barnum character. He ran loads of stunts to advertise his various travelling circus events. One of these, for one poor elephant, was: "In August 1860, Rice had Lallah Rookh swim across the Ohio River in Cincinnati, Ohio to drum up publicity for his new "Monster Show." It took her 45 minutes to swim across the river. A month later, Lallah died of a fever brought about by her swim".

While on this elephantine swimming subject my favourite theory about the Loch Ness Monster is that it was a swimming circus elephant. And once the mistake was made the circus owner used the publicity to drum up business 'In 1933 a circus promoter in the area—acting perhaps on inside information that the monster was really a big top beast—offered a rich reward for Nessie's capture'

8. Burning. In 1681 an elephant was burned to death in Dublin. How the poor creature got to Dublin at that time is difficult to imagine. But then to have your crate set on fire is just tragic. The autopsy revealed information that later helped show elephants had evolved from an aquatic animal. "An anatomical account of the elephant accidentally burnt in Dublin on Fryday, June 17 in the year 1681" is Allen Mullen's description of the autopsy.

And the weirdest one, and I realise that is saying something, is not an execution of elephants but by elephants. Most of the elephants listed above had killed a person. But elephants were once a really common means of execution. Apparently death by Nelly was wildly popular from prehistoric times up until the late 1800s. Execution by elephant is an incredible Wikipedia page, hard to extract from but worth reading through.

Because elephants are so easy to train, and because an elephant standing on your head was such a gruesome way to die, most South Asian countries seem to have practised it.

I do not know what the wide and varied history of death of and by elephant tells us. They are all pretty tragic tales. Recently an elephant escaped in Cork, and later crushed one of the circus workers. It seems the same sort of issues that killed Chunee, Mary, Tyke and the many people who have been killed by performing elephants still exist, and that, more than these historic stories, is a tragedy.