Big data. What is it and why should you care? Just by starting to watch this video, you've added new data to your profile in somebody's database.
That profile will be used to target you for adverts or political campaigns or to predict what people like you will do in the future. Whether you get a job interview or go to jail could depend on the data other people have collected about you. So you might want to spend five minutes learning a bit more about how it works and what kind of world it's building around us.
If you stop watching now, by the way, that will also produce data, probably pushing your profile's impulsivity score up by 0.7 points, meaning that in five years' time some recruiting algorithm will reject you for that job as an astronaut.
So stick around while I tell you about big data. Data is nothing new; it's just information in a transferable form. About 30,000 years ago, in a cave in central Europe, somebody carved 57 notches into a wolf shin bone in groups of five, the way you might tally something today.
That's data. We don't know what they were counting or why but we know there were 57 of whatever it was. That's the oldest example I know of digital data.
Digital because you could count it on your digits, and it must have been quite a revolution in information technology back in the ice age. How big is big data? It's getting bigger so fast that if I gave you a figure, it would be out of date before the end of this video.
Ten years ago, Google was processing 20,000 terabytes of data every day. Last year, American retailer Walmart collected 2,500 terabytes of data on its customers every hour. But though size does matter, it's not the whole story. There are four other things that make big data special, and I've made them into a backronym for you, using D, A, T, A for data.
D is for dimensions, if you want the technical word, or diverse or different if you prefer. By combining data of different types from diverse sources, you can get a multi-dimensional picture. For example, I asked neuroscientist Prof Paul Matthews if he was excited about using all the data from brain scans, but he said that's just large data.
Big data, he said, is when you put those brain scans together with the patients' medical records, the postcodes where they've lived and the weather records for those postcodes, and then ask a completely new question. For example, what's the relationship between the hours of sunshine they got and the way that their multiple sclerosis has progressed?
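To make that kind of combining concrete, here is a minimal Python sketch of joining three separate sources into one multi-dimensional profile per patient. Everything in it is hypothetical: the patient IDs, postcodes, lesion counts and sunshine figures are invented, and the field names and the build_profiles helper are assumptions for illustration, not anything taken from real research.

```python
from statistics import mean

# All records below are invented for illustration; none come from a real study.
scans = {"p1": {"lesion_count": 4}, "p2": {"lesion_count": 9}}
medical_records = {"p1": {"postcode": "AB1"}, "p2": {"postcode": "CD2"}}
sunshine_hours = {"AB1": [5.0, 6.0, 7.0], "CD2": [2.0, 3.0, 1.0]}


def build_profiles(scans, medical_records, sunshine_hours):
    """Join scans to medical records by patient ID, then to weather by postcode."""
    profiles = []
    for patient_id, scan in scans.items():
        record = medical_records.get(patient_id)
        if record is None:
            continue  # no medical record for this scan, so nothing to join on
        weather = sunshine_hours.get(record["postcode"])
        if weather is None:
            continue  # no weather data for this postcode
        profiles.append({
            "patient_id": patient_id,
            "postcode": record["postcode"],
            "lesion_count": scan["lesion_count"],
            "mean_sunshine_hours": mean(weather),
        })
    return profiles


profiles = build_profiles(scans, medical_records, sunshine_hours)
for profile in profiles:
    print(profile)
```

Each source on its own is just large data. Only once the rows are joined can you ask the new question, such as whether sunshine and disease progression move together, and a real analysis would need far more patients and proper statistics than this toy join.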
A is for automatic: the way data is collected by default, every time we do anything on our computers or with our bank cards, or just move around carrying our smartphones. In fact, pretty much anything we do these days generates data which somebody can use.
Your smart meter, your car, your Fitbit: we don't even notice it being collected most of the time. And T is for time. Because the data is being collected pretty much in real time, those patterns can be projected forward to help make predictions about the future.
Things like when the trains will be busiest or how much electricity we'll need or how fast malaria will spread. And the last A is for AI.
Artificial intelligence. Which is not really intelligence like a human is intelligent, but a computer program that can find patterns in the data, often using techniques like machine learning, where you don't give it every step of the instructions. You just tell it to sort A from B: cat pictures from dog pictures, maybe, or good job applicants from bad job applicants. People are doing lots of exciting things with big data.
Tracking insects to fight diseases like malaria and Zika, predicting aircraft engine failures before they happen, finding new particles or new antibiotics. When the same techniques are applied to humans, things get more tricky. Should we use big data to predict the chance someone will reoffend, and sentence them accordingly?
Because that's already happening. Did big data change the results of recent votes by targeting voters with personalised adverts? Certainly it helped campaigners know which voters to target and what kind of people they were by combining different sources of data to build multi-dimensional profiles and target them accordingly.
That approach started with US President Obama's successful election campaign. But would it be fair to say Obama only won because of big data? I don't think so. I think he offered something that voters wanted, and big data just helped him reach the voters whose votes were most crucial. Politics is about the content of the messages, not the platform that delivers them.
Should we worry about politics being reduced to a marketing campaign? Yes, but that's a much broader problem than the technology. Big data has its limitations, especially when it comes to human beings, but it also has immense potential to improve human lives.
Let's just make sure we use it the right way and keep the humans in control.