Wednesday, March 04, 2015

Pandemics and the Internet

My last post pointed out how many people died in Ireland in the 1918 flu pandemic.

To take the example of Japan, Gapminder shows a huge drop in life expectancy in 1918. The other big drop is the Second World War, with all the bombing and shooting that involved.

David Eagleman has an interesting point here about how the internet can help prevent and reduce epidemics.

"The internet can be our key to survival because the ability to work telepresently can inhibit microbial transmission by reducing human-to-human contact. In the face of an otherwise devastating epidemic, businesses can keep supply chains running with the maximum number of employees working from home. This can reduce host density below the tipping point required for an epidemic. If we are well prepared when an epidemic arrives, we can fluidly shift into a self-quarantined society in which microbes fail due to host scarcity."

Eagleman has a good short video on his thesis
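
The "tipping point" in the quote can be illustrated with a very rough Python sketch. It assumes the simplest textbook picture, in which the number of new infections each case causes scales directly with how many contacts people have. The starting value `HYPOTHETICAL_R0` is made up for illustration and is not an estimate for any real disease.

```python
def effective_reproduction_number(r0, contact_reduction):
    """New infections per case after contacts are cut by the given fraction.

    Assumes transmission is directly proportional to the contact rate,
    which is a simplification.
    """
    if not 0.0 <= contact_reduction <= 1.0:
        raise ValueError("contact_reduction must be between 0 and 1")
    return r0 * (1.0 - contact_reduction)


def critical_contact_reduction(r0):
    """Smallest cut in contacts that brings the reproduction number to 1."""
    if r0 <= 1.0:
        return 0.0  # the outbreak already shrinks without any change
    return 1.0 - 1.0 / r0


HYPOTHETICAL_R0 = 2.0  # illustrative value only, not from the post

threshold = critical_contact_reduction(HYPOTHETICAL_R0)
print(f"contacts must fall by at least {threshold:.0%}")
for reduction in (0.0, 0.25, 0.5, 0.75):
    r_eff = effective_reproduction_number(HYPOTHETICAL_R0, reduction)
    trend = "grows" if r_eff > 1 else "does not grow"
    print(f"cut contacts by {reduction:.0%}: R = {r_eff:.2f}, outbreak {trend}")
```

Under these assumptions an outbreak only grows while each case causes more than one new case, so working from home does not need to remove every contact, only enough of them.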

The long term effects of a sudden switch to everyone avoiding each other for a month or two could be huge. These would include:

Education: How schools help spread influenza has been studied, for example in 'School closures during the 2009 influenza pandemic: national and local experiences'. If all the schools were closed for a few months, people would move to Khan Academy and other online education sites. After this period a switch back to a fully offline world would not happen.

Telecommuting: In a similar way, telecommuting would become much more popular. After a 'quarantine lite' period, online project management and other telecommuting tools would become mainstream.

Shopping: If you don't meet people in school or at work you meet them in the shops. Deliveries of shopping would be strongly encouraged in the event of a pandemic. They should probably even be sponsored. Shops would not become as popular again once everyone got used to online shopping.

Banking: No one likes queuing in the banks at the best of times. Even ATMs would become horribly grubby in a pandemic world. Everything, including social welfare payments, would try to avoid using the fomite that is cash.

Telemedicine: People with influenza need to be kept away from people who are sick with something else. People with other illnesses will have to be dealt with remotely to avoid them coming into contact with people with influenza.

Public Events: Public events and venues such as parades, cinemas, bars and museums would be closed. By their nature these involve people. If public events are made cheaper to attend virtually, that will reduce the need for people to meet up. By this I mean that if Sky Sports is made free for a few months, people will be less annoyed that no fans are allowed to attend the football game.

There are many people without access to the internet that would not be helped by the use of digital technologies. Hopefully the use of digital technologies will help focus more of the traditional public health effort on them.

When the next pandemic happens the internet will reduce the consequences. Many industries will also change, but the main thing is to avoid a repeat of the 50 to 100 million deaths caused by the 1918 pandemic.

Tuesday, March 03, 2015

Tree Rings and Life Expectancy

Andy Kirk here has an interesting blog post on dendrochronology and visualisation literacy.
Here is an example of a tree ring visualisation showing how, over time, the tree grows and lays down rings.

I am going to visualise another time series: life expectancy.
Gapminder uses a line graph to visualise life expectancy over time. I downloaded the life expectancy data from Gapminder.

The interesting points here are the Famine, where life expectancy dropped from an estimated 38.3 to 14.1, and the 1918 flu epidemic, which causes an obvious drop from 55.3 in 1917 to 49.68 and back to 55.8 in 1919.
I used this data to create a graph in a canvas element. The idea is like tree rings, except that instead of marking a year's growth, each ring's radius represents the life expectancy in that year.

The size of each ring should be a good representation of the number of years people could expect to live in that year. However, I just multiplied the years given by Gapminder by 6 to give each circle's radius in pixels. A proper visualisation has to be more careful than this not to distort the numbers. Roughly, living twice as long should look like a tree that is twice as big.
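
To see how much the times-6 radius distorts things, here is a small Python sketch comparing it with one possible alternative: scaling the radius by the square root, so that the area of the circle (rather than its radius) is proportional to life expectancy. The square-root option is my suggestion, not something the original graph does, and the function names are invented for illustration. The two values are the Famine figures quoted above.

```python
import math

PIXELS_PER_YEAR = 6  # the multiplier used for the graph in this post


def radius_linear(years):
    """Radius in pixels when the radius is proportional to life expectancy."""
    return years * PIXELS_PER_YEAR


def radius_area_true(years, reference_years, reference_radius):
    """Radius in pixels when the circle's AREA is proportional to life expectancy.

    Hypothetical alternative: the reference year keeps its linear radius
    and every other year is scaled by a square root.
    """
    return reference_radius * math.sqrt(years / reference_years)


famine_before, famine_low = 38.3, 14.1  # figures quoted in the post

r_before = radius_linear(famine_before)
r_low = radius_linear(famine_low)
years_ratio = famine_before / famine_low
area_ratio_linear = (r_before / r_low) ** 2

r_low_area = radius_area_true(famine_low, famine_before, r_before)
area_ratio_sqrt = (r_before / r_low_area) ** 2

print(f"life expectancy ratio:          {years_ratio:.2f}")
print(f"area ratio, radius = years * 6: {area_ratio_linear:.2f}")
print(f"area ratio, square-root radius: {area_ratio_sqrt:.2f}")
```

With the linear radius the area ratio is the square of the life expectancy ratio, so the drop looks bigger than it was if the reader judges by area. Which reading is right depends on whether people compare widths or areas, so neither scaling is obviously correct.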

The code to create this graph in a canvas element of a webpage is here. So what do you think, does this visualisation show the increase in lifespan over the last 200 years well?

Saturday, February 28, 2015

What Colour are Books?

What colour are famous books?

Colours Used
I counted up the occurrences of the colours
colours = ["red","orange","yellow","green","blue","purple","pink","brown","black","gray","white", "grey"]
in Ulysses by James Joyce. I'll post the word count code soon.

red 113, orange 12, yellow 50, green 98, blue 82, purple 17, pink 21, brown 59, black 146, gray 2, white 163, grey 68
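
Until the real word count code is posted, here is a minimal Python sketch of the kind of count involved. It is not the code that produced the numbers above. It assumes whole-word matching on lower-cased text, so a compound like "snotgreen" is not counted as "green"; a substring match would give different totals. The sample text is the Joyce line quoted later in this post plus a made-up sentence.

```python
import re
from collections import Counter

COLOURS = ["red", "orange", "yellow", "green", "blue", "purple",
           "pink", "brown", "black", "gray", "white", "grey"]


def count_colours(text):
    """Count whole-word occurrences of each colour name, ignoring case."""
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(word for word in words if word in COLOURS)
    # Report every colour, including those that never appear.
    return {colour: counts[colour] for colour in COLOURS}


sample = ("The sea, the snotgreen sea, the scrotumtightening sea. "
          "A green flag on a white wall, then Green again.")
counts = count_colours(sample)
print(counts)
```

To run it on a whole book, read the text file into a string and pass that to `count_colours` instead of `sample`.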

I turned this count into a bar chart with the R graphing package ggplot2.

library(ggplot2)

# Colour words in display order; "grey" combines the gray (2) and grey (68) counts
colour_names <- c("pink", "red", "orange", "yellow", "green", "blue", "purple", "brown", "black", "white", "grey")
df <- data.frame(colours = factor(colour_names, levels = colour_names),
                 total_counts = c(21, 113, 12, 50, 98, 82, 17, 59, 146, 163, 70))

# Fill each bar with the colour it counts; fill is a fixed vector, so no legend is drawn
bp <- ggplot(data = df, aes(x = colours, y = total_counts)) +
  geom_bar(stat = "identity", fill = colour_names)
# Store the themed plot so the axis titles and chart title are kept when it is printed
bp <- bp + theme(axis.title.x = element_blank(), axis.title.y = element_blank()) +
  ggtitle("Ulysses Color Counts")
bp

There is a huge element of unweaving the rainbow in just counting the times a colour is mentioned in a book. The program distills “The sea, the snotgreen sea, the scrotumtightening sea.” into a single number. Still I think the ability to quickly look at the colour palette of a book is interesting.

The same picture made from the colours in Anna Karenina by Leo Tolstoy, translated by Constance Garnett.


Translations
Translations produce really funny graphs with this method. According to Jenks@GreekMythComix the ancient Greeks did not really use colours in the same abstract way we do. Things were not 'orange' so much as 'the colour of an orange'. The counts in the Alexander Pope translation of the Iliad are
red 36, yellow 11, green 16, blue 9, purple 43, brown 4, black 69, gray 1, white 25, grey 6

Because colours are not really mentioned in the original Iliad, these sorts of graphs could be a quick way to compare translations. Google book trends does not seem to show increased use of these colours over time.

Sunday, February 22, 2015

2014 Weather Visualizations

There is a great tutorial by Brad Boehmke here on how to build a visualization of temperature in one year compared to a dataset. The infographic is based on one by Tufte.

Met Eireann have datasets going back to 1985 on their website here. After some basic data munging on the Met Eireann set for Dublin Airport, I followed the rstats code from the tutorial above to build the graphs below. Wexford would be more interesting for sun and Kerry for rain and wind, but those datasets would not download for me.

The first is a comparison of the temperature in 2014 compared to the same date in other years.

Next I looked at average wind speed

And finally the number of hours of sun

These visualizations do not suggest that 2014 was a particularly unusual year for Irish weather. With 30 years of data, if weather were random (which it isn't), each day of 2014 would have roughly a 1 in 30 chance of setting the record high for its date and the same chance of setting the record low, so around 12 days (365/30) would break each mark for most of these measures. Only the number of sunny days beat this figure. The data met.ie gives contains every day since 1985, with these fields:

maxtp: - Maximum Air Temperature (C)

mintp: - Minimum Air Temperature (C)

rain: - Precipitation Amount (mm)

wdsp: - Mean Wind Speed (knot)

hm: - Highest ten minute mean wind speed (knot)

ddhm: - Mean Wind Direction over 10 minutes at time of highest 10 minute mean (degree)

hg: - Highest Gust (knot)

sun: - Sunshine duration (hours)

dos: - Depth of Snow (cm)

g_rad: - Global Radiation (J/cm sq.)

i: - Indicator

Gust might be an interesting one given the storms we had in winter 2014. I put big versions of these pictures here, here and here.
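
The "around 12 days" figure used above can be checked with a short Python sketch. It assumes every day's reading is an independent random draw with no trend and no run of similar days in a row, which real weather certainly breaks, so treat it as a baseline only. The seed and the number of trials are arbitrary choices.

```python
import random

YEARS = 30   # 1985 to 2014 inclusive
DAYS = 365

# If all years are equally likely to hold the record for a date,
# 2014 holds it with probability 1 / YEARS.
expected_record_days = DAYS / YEARS


def simulate_record_days(rng, years=YEARS, days=DAYS):
    """Count the dates on which the final year has the highest value."""
    records = 0
    for _ in range(days):
        values = [rng.random() for _ in range(years)]
        if values[-1] == max(values):
            records += 1
    return records


rng = random.Random(2014)  # fixed seed so the run is repeatable
trials = [simulate_record_days(rng) for _ in range(200)]
mean_records = sum(trials) / len(trials)

print(f"expected record days: {expected_record_days:.1f}")
print(f"simulated average:    {mean_records:.1f}")
print(f"simulated range:      {min(trials)} to {max(trials)}")
```

The range across trials gives a feel for how far a single year can stray from 12 by luck alone, which is the comparison that matters when deciding whether a measure was unusual.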

Wednesday, February 18, 2015

When were Wodehouse's stories set?

They seem to be set sometime before the First World War, but I have never figured out exactly when Wodehouse's books take place. From Something New by P.G. Wodehouse: “Whoever carries this job through gets one thousand pounds.” Ashe started. “One thousand pounds–five thousand dollars!” “Five thousand.” Looking at historical exchange rates at www.measuringworth.com, the rate stayed close to 1901's $4.87 to the pound up until the book was published in 1915. Because exchange rates did not change much at the time, they do not help work out when a book was set.

Monday, February 09, 2015

Ancient Death Counts from Poems

What killed you in an ancient battle? Could we look at ancient epics for clues as to what killed people in fights at the time?

Pinker's The Better Angels of Our Nature talks about how archeologists look at bones for evidence of violent injuries that led to death. The book talks about examinations of ancient bones unearthed in peat bogs and on long-forgotten battlefields. This bone examination will not tell us about injuries that leave no mark on bone.

The epic poems include the Iliad, Beowulf and the Táin. They were passed down by bards who memorised them and travelled from place to place reciting them. Some recent research suggests that these epics may have some basis in history. The social network described for the characters usually resembles one real people would have. The social network between characters in Homer’s Odyssey is remarkably similar to real social networks today. That suggests the story is based, at least in part, on real events, say researchers. 'They discovered that while the networks associated with Beowulf and the Iliad had many of the properties of real social networks, the network associated with Tain was less realistic. That led them to conclude that the societies described in the Iliad and Beowulf are probably based on real ones, whereas the Tain appears more artificial.'

There is a site that examines and lists the deaths in the Iliad here. From it I extracted a count, for each body part mentioned, of how often a blow there killed or wounded someone*.

head 21
jaw 2
cheek 1
ear 1
eye 1
mouth 1
nose 1
skull 1

neck 12
throat 3

collar 1
chest 17
shoulder 7
collar bone 2
nipple 1
ribs 1 (a wound)

arm 4 (3 of these wounds)
hand 1 (a wound)

back 11
buttock 2

gut 10 (1 of these a wound)
stomach 5
liver 3

side 6

thigh 2 (1 of these a wound)
hip 1
knee 1
leg 1
foot 1 (a wound)

groin 2
testicles 1

I totalled these by body region

Head  29
Neck  15
Upper Body 29
Arm  5
Back  13
Lower Body 18
Side  6
Leg  6
Groin  3 
Using ColorBrewer to pick out colours, I made bins of 5:
25-30 RGB 153,0,13
20-25 RGB 203,24,29
15-20 RGB 239,59,44
10-15 RGB 251,106,74
5-10  RGB 252,146,114
1-5   RGB 252,187,161
0     RGB 0,0,0
And I made this into this weird picture. I got the drawing from here and the idea from Greek Myth Comix.
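
The totalling and binning steps can be sketched in Python as follows. The counts and RGB values are the ones listed above. The grouping of body parts into regions is inferred from the blank lines in the list, and because the bin labels overlap at their edges (is 15 in 10-15 or 15-20?), the sketch assumes each bin includes its lower edge. Both are my assumptions.

```python
# Counts per body part, grouped into the regions used for the totals.
PART_COUNTS = {
    "Head": {"head": 21, "jaw": 2, "cheek": 1, "ear": 1, "eye": 1,
             "mouth": 1, "nose": 1, "skull": 1},
    "Neck": {"neck": 12, "throat": 3},
    "Upper Body": {"collar": 1, "chest": 17, "shoulder": 7,
                   "collar bone": 2, "nipple": 1, "ribs": 1},
    "Arm": {"arm": 4, "hand": 1},
    "Back": {"back": 11, "buttock": 2},
    "Lower Body": {"gut": 10, "stomach": 5, "liver": 3},
    "Side": {"side": 6},
    "Leg": {"thigh": 2, "hip": 1, "knee": 1, "leg": 1, "foot": 1},
    "Groin": {"groin": 2, "testicles": 1},
}

# (lower edge of bin, RGB colour), darkest first.
# Assumption: a count belongs to the bin whose lower edge it reaches.
BINS = [
    (25, (153, 0, 13)),
    (20, (203, 24, 29)),
    (15, (239, 59, 44)),
    (10, (251, 106, 74)),
    (5, (252, 146, 114)),
    (1, (252, 187, 161)),
]
NO_HITS = (0, 0, 0)


def colour_for(count):
    """Return the RGB colour of the bin a count falls into."""
    for lower_edge, rgb in BINS:
        if count >= lower_edge:
            return rgb
    return NO_HITS


region_totals = {region: sum(parts.values())
                 for region, parts in PART_COUNTS.items()}

for region, total in region_totals.items():
    print(f"{region:<11}{total:>3}  RGB {colour_for(total)}")
print("all regions", sum(region_totals.values()))
```

Changing the edge rule only moves regions whose total sits exactly on a boundary, which here means Neck (15) and Arm (5).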

Any translation will have disagreements, so the original source, or as close as we can get to it, should be used. Ian Johnson's translation is the basis for these counts.

The head, neck and upper body account for 73 of the deaths and wounds; the arm, back, lower body, side, leg and groin together account for only 51. But gut, liver and stomach do account for 18 of them (20 if buttock is included), which seem like injuries that modern archeology could miss. For example, many bog bodies seem to have been ritually killed, which may have involved more beheading than the standard violent death.

It would be interesting to do a similar count with the other epic poems to see if liver injury is as common in them or whether that relates to Greek culture.

Anyway please comment what you think about this sort of quantitative analysis of stories that are meant to be entertainment. Can they tell us anything about the ancient world?

*Alcmaon's death I left out as no specific part is named.

Wednesday, February 04, 2015

Irish Alcohol Consumption in 2020

Drink blitz sees bottle of wine rise to €9 minimum: 'Irish people still drink an annual 11.6 litres of pure alcohol per capita, 20pc lower than at the turn of the last decade. The aim is to bring down Ireland's consumption of alcohol to the OECD average of 9.1 litres in five years' time.'

What would Irish alcohol consumption be if current trends continue? Knowing this, the effectiveness of new measures can be estimated.

The OECD figures are here. I put them in a .csv here. The WHO figures for alcohol consumption are here. I loaded the data into R:

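# Load the OECD figures (the columns used here are Date and Value)
datavar <- read.csv("OECDAlco.csv")
# attach() lets Date and Value be used without the datavar$ prefix
attach(datavar)
plot(Date, Value, main="Ireland Alcohol Consumption")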
Which looks like this

Looking at that graph, alcohol consumption rose from 1960, the first year we have data for, until about 2000 and then started dropping. So if the trend since 2000 continued, what would alcohol consumption be in 2020?

'Irish people still drink an annual 11.6 litres': I would like to see the source for this figure. We drank 11.6 litres in 2012 according to the OECD, and I cannot find OECD figures for 2014. In 2004 we drank 13.6L; a 20pc reduction on that is 10.9L, not 11.6L. The 14.3L we drank in 2002, reduced by 20pc, would be 11.4L, which is closer. So it really looks to me like the Independent were measuring alcohol usage up to 2012.
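
The percentages are easy to check. This Python snippet uses only the litre figures quoted in this post and works out both what a 20pc cut from each starting year gives and what cut the quoted 11.6 litres would actually represent. The variable names are mine.

```python
CLAIMED_NOW = 11.6          # litres, the figure quoted by the newspaper
CLAIMED_REDUCTION = 0.20    # "20pc lower"

# OECD litres per capita for the two candidate starting years
baselines = {2002: 14.3, 2004: 13.6}

# What a 20pc cut from each baseline would give
after_cut = {year: litres * (1 - CLAIMED_REDUCTION)
             for year, litres in baselines.items()}

# What cut the quoted 11.6 litres actually implies from each baseline
implied_cut = {year: 1 - CLAIMED_NOW / litres
               for year, litres in baselines.items()}

for year, litres in baselines.items():
    print(f"{year}: {litres}L less 20pc = {after_cut[year]:.1f}L; "
          f"11.6L is a {implied_cut[year]:.1%} cut")
```

Neither year gives exactly 20pc, so the newspaper figure was presumably rounded or measured from a different starting year.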

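Taking the data from 2000 to 2012:

# Keep only the rows from 2000 onwards
newdata <- datavar[which(datavar$Date > 1999), ]
# Swap the attached data frame so Date and Value now refer to the subset
detach(datavar)
attach(newdata)
plot(Date, Value, main="Ireland Alcohol Consumption")
cor(Date, Value)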

The correlation between year and alcohol consumption since 2000 is -0.9274126. It looks like there is a close relationship between the year and the amount of alcohol consumed. Picking 2000, near the peak of alcohol consumption, as the starting date for the analysis is arguable, but 2002 was the start of this visible trend in reduced alcohol consumption.

Next I ran a linear regression on this data to predict alcohol consumption in 2015 and 2020.

> linearModelVar <- lm(Value ~ Date, newdata)
> linearModelVar$coefficients[[2]]*2015+linearModelVar$coefficients[[1]]
[1] 10.42143
> linearModelVar$coefficients[[2]]*2020+linearModelVar$coefficients[[1]]
[1] 9.023077
This means that, based on data from 2000-2012, we would expect people to drink 10.4 litres this year, falling to 9 litres in 2020. So on current trends Irish alcohol consumption will already be lower than the target: 'the aim is to bring down Ireland's consumption of alcohol to the OECD average of 9.1 litres in five years'.
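
The fitted coefficients are not printed above, but the two predictions pin down the same straight line. As a rough check, this Python sketch (the analysis itself was done in R) recovers the slope and intercept from the 2015 and 2020 predictions and estimates when the trend line would reach the 9.1 litre target. The names are mine, and the answer inherits every weakness of a straight-line extrapolation.

```python
# Predictions printed by the R regression above
PRED_2015 = 10.42143
PRED_2020 = 9.023077
TARGET_LITRES = 9.1  # the OECD average named as the goal

# Two points on a straight line are enough to recover it
slope = (PRED_2020 - PRED_2015) / (2020 - 2015)
intercept = PRED_2015 - slope * 2015


def predicted_litres(year):
    """Litres per capita on the fitted trend line for a given year."""
    return intercept + slope * year


def year_reaching(litres):
    """Year at which the trend line falls to the given consumption."""
    return (litres - intercept) / slope


print(f"slope: {slope:.4f} litres per year")
print(f"2012 on the trend line: {predicted_litres(2012):.2f} litres")
print(f"trend reaches {TARGET_LITRES} litres around {year_reaching(TARGET_LITRES):.1f}")
```

On this line consumption falls by a little over a quarter of a litre a year, which is why the target is reached just before 2020 without any new measures.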

There could be something else that is going to alter the trend. One obvious candidate would be a glut of young adults. People in their 20s drink more than older people. If there is a higher proportion of young adults about, then alcohol consumption will rise, all else being equal. So will there be a higher proportion of people in their 20s in 5 years' time?

The population pyramid projections for Ireland are here. Looking at these, there seems to have been a higher proportion of young adults in 2010 than there will be in 2020, which would imply lower alcohol consumption.

It would be interesting to see the data and the model that the predictions of Irish alcohol consumption are based on, and to see how minimum alcohol pricing changes the results of these models. But without seeing those models, it looks like the Government strategy is promising, as the result of a new law, what current trends would deliver anyway.