How well can artificial intelligence (AI) understand language? Here at the House of Commons Library there are plenty of words or phrases we could test AI’s abilities on: ‘election’, ‘affordable housing’ or any other topic we provide briefings on. Instead, confident in the promise of AI, we gave it a more challenging task: understanding Brexit.
We let a deep learning algorithm loose on millions of words found in Hansard from Brexit-related debates in the House of Commons. ‘Deep learning’ refers to techniques that use artificial ‘neural networks’ to recognise patterns in large data sets. These networks are loosely inspired by the way neurons interact in the brain.
The algorithm learns about language by placing each different word or phrase at a particular position in a mathematical space. This ‘space’ can have several hundred dimensions, and the algorithm adjusts each word’s position along them as it reads the text. A few of these dimensions can end up making sense to us, for example size, gender or word tense, but most have no inherent meaning. Instead, it is overall patterns of location and distance between words that the algorithm discovers for itself.
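To make the idea of ‘location and distance’ concrete, here is a minimal sketch in Python. The three-dimensional vectors are invented for illustration (a trained model would learn hundreds of dimensions from the text itself), and the word ‘fisheries’ is a hypothetical stand-in for an unrelated topic.

```python
import math

# Hypothetical 3-dimensional word vectors, invented for illustration only.
# A trained model would learn several hundred dimensions from the debates.
toy_vectors = {
    "deal": [0.9, 0.1, 0.3],
    "agreement": [0.8, 0.2, 0.4],
    "fisheries": [0.1, 0.9, 0.2],
}

def distance(word_a, word_b, vectors=toy_vectors):
    """Straight-line (Euclidean) distance between two words' positions."""
    return math.dist(vectors[word_a], vectors[word_b])

near = distance("deal", "agreement")
far = distance("deal", "fisheries")
print(f"deal to agreement: {near:.3f}")
print(f"deal to fisheries: {far:.3f}")
```

In this toy space ‘deal’ sits much closer to ‘agreement’ than to ‘fisheries’, which is the kind of pattern the algorithm is described as discovering for itself.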
For our limited human brains to be able to get a glimpse of what the AI has learnt, we collapse the hundreds of dimensions into two. The word map below is a highly simplified representation of what the algorithm has learnt. Similar words hang close together:
For example, ‘negotiations’, ‘deal’ and ‘agreement’ have clustered together (top left). Near the phrase ‘post-Brexit’ (bottom left), we can find the words ‘trade’, ‘new’ and ‘future’.
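We have not said which technique collapses the dimensions, so the following is only a sketch of one common option, principal component analysis (PCA), written in plain Python. The four-dimensional vectors are invented, and the helper names are hypothetical. PCA finds the two directions along which the words vary most and uses each word's position along them as its map coordinates.

```python
import math

# Hypothetical 4-dimensional word vectors, invented for illustration.
vectors = {
    "negotiations": [1.0, 0.9, 0.1, 0.0],
    "deal":         [0.9, 1.0, 0.0, 0.1],
    "agreement":    [1.0, 1.0, 0.1, 0.1],
    "trade":        [0.1, 0.0, 1.0, 0.9],
    "future":       [0.0, 0.1, 0.9, 1.0],
}

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def centre(rows):
    """Subtract the per-dimension mean from every row."""
    means = [sum(col) / len(rows) for col in zip(*rows)]
    return [[x - m for x, m in zip(row, means)] for row in rows]

def covariance(rows):
    """Sample covariance matrix of already-centred rows."""
    cols = list(zip(*rows))
    return [[dot(ci, cj) / (len(rows) - 1) for cj in cols] for ci in cols]

def leading_direction(matrix, iterations=200):
    """Power iteration: approximate the direction of greatest variance."""
    v = [float(i + 1) for i in range(len(matrix))]  # arbitrary non-zero start
    for _ in range(iterations):
        w = [dot(row, v) for row in matrix]
        norm = math.sqrt(dot(w, w))
        v = [x / norm for x in w]
    return v

def remove_direction(matrix, v):
    """Deflation: subtract the variance already explained by direction v."""
    eigenvalue = dot(v, [dot(row, v) for row in matrix])
    size = len(v)
    return [[matrix[i][j] - eigenvalue * v[i] * v[j] for j in range(size)]
            for i in range(size)]

words = list(vectors)
centred = centre([vectors[w] for w in words])
cov = covariance(centred)
pc1 = leading_direction(cov)
pc2 = leading_direction(remove_direction(cov, pc1))

# Each word's map position is its projection onto the two directions.
coords = {w: (dot(row, pc1), dot(row, pc2)) for w, row in zip(words, centred)}
for w, (x, y) in coords.items():
    print(f"{w:>12}: ({x:+.2f}, {y:+.2f})")
```

Words that were close in the original space stay close on the two-dimensional map, while much of the finer detail is lost, which is why such a map is only a simplified view.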
The word map above only shows a hundred words. To get the full details, we queried the algorithm directly.
We can zoom into the map and zero in on a word. For example, the ‘most similar’ or ‘closest words’ to ‘United Kingdom’ are ‘union’ and ‘UK’. We have asked the algorithm what the closest words to ‘Brexit’ are, and the response was, in order:
So, ‘Brexit’ is the outcome of a referendum and a transition.
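A ‘closest words’ query of this kind can be sketched as a ranking by cosine similarity, the measure conventionally used with word embeddings. The vectors below are invented so that the example runs, and ‘fisheries’ is a hypothetical unrelated word; none of the numbers come from our model.

```python
import math

# Hypothetical vectors, invented for illustration; not taken from the real model.
toy_vectors = {
    "united_kingdom": [0.90, 0.80, 0.10],
    "uk":             [0.85, 0.75, 0.15],
    "union":          [0.80, 0.90, 0.20],
    "fisheries":      [0.10, 0.20, 0.90],
}

def cosine(a, b):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def most_similar(word, vectors, topn=2):
    """Rank every other word by cosine similarity to `word`, highest first."""
    target = vectors[word]
    scored = [(other, cosine(target, vec))
              for other, vec in vectors.items() if other != word]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:topn]

for neighbour, score in most_similar("united_kingdom", toy_vectors):
    print(f"{neighbour}: {score:.3f}")
```

The query word itself is excluded from the ranking, since every word is trivially most similar to itself.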
Give us a guess
We can also ask the algorithm to predict a word, given some context. For example, when asked to predict what is going to accompany ‘house of’, the algorithm tells us to expect ‘Commons’. ‘Lords’ is only expected in 3% of cases: the AI has learnt not to expect members of the Commons to call ‘the other place’ by its proper name!
If asked to predict what word would accompany ‘Brexit’ in a speech, the algorithm’s best guess is ‘post’. So it’s not so much Brexit, but what comes next, that we’re really interested in.
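One way such a prediction can work, sketched here under our own assumptions rather than as a description of the exact model, is to score every candidate word against the context and turn the scores into probabilities with a ‘softmax’. The context vector, the candidate vectors and the resulting percentages are all invented for illustration.

```python
import math

# Hypothetical vector standing for the context 'house of' (invented numbers).
context_vector = [1.2, -0.4]

# Hypothetical output vectors for three candidate words (invented numbers).
output_vectors = {
    "commons":    [2.5, -1.0],
    "lords":      [0.3, 0.2],
    "parliament": [-1.0, 0.5],
}

def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    highest = max(scores.values())  # subtract the maximum for numerical stability
    exps = {word: math.exp(s - highest) for word, s in scores.items()}
    total = sum(exps.values())
    return {word: e / total for word, e in exps.items()}

def predict(context, candidates):
    """Score each candidate by its dot product with the context vector."""
    scores = {word: sum(c * o for c, o in zip(context, vec))
              for word, vec in candidates.items()}
    return softmax(scores)

probabilities = predict(context_vector, output_vectors)
for word, p in sorted(probabilities.items(), key=lambda item: item[1], reverse=True):
    print(f"{word}: {p:.1%}")
```

The word with the highest probability is the model's best guess, and the smaller probabilities show how often it expects the alternatives.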
Brexit + economic – political = ?
The model we have used (a ‘word embedding’ model known as ‘word2vec’) can also add and subtract words from each other and get a result that makes sense, just as subtracting 1 from 3 gives you 2. A famous example is that by taking the word ‘queen’, subtracting ‘woman’ and adding ‘man’, the result you get is ‘king’.
In this vein, we asked the model to take ‘Brexit’, add ‘economic’ and subtract ‘political’. The result we got was ‘trade’.
Asking the reverse (‘Brexit’ less ‘economic’ plus ‘political’) returns ‘referendum’, ‘vote’ and ‘general election’. Politicians beware.
Similarly, a Brexit with ‘certainty’ and without ‘uncertainty’ results in the ‘future’, a ‘final deal’ and the ‘best possible’, while a Brexit with ‘no deal’ is a ‘hard Brexit’. A Brexit with an ‘agreement’ entails an ‘implementation period’. That is correct, AI colleague.
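The word arithmetic above can be sketched as ordinary vector addition and subtraction followed by a search for the nearest remaining word. The three-dimensional vectors are invented so that the famous ‘queen’ example works out exactly; real embeddings are learnt and only approximately behave this way, and the function name is hypothetical.

```python
import math

# Hypothetical vectors with invented dimensions: (royalty, male, female).
toy_vectors = {
    "king":  [0.9, 0.9, 0.1],
    "queen": [0.9, 0.1, 0.9],
    "man":   [0.1, 0.9, 0.1],
    "woman": [0.1, 0.1, 0.9],
    "trade": [0.2, 0.5, 0.5],
}

def cosine(a, b):
    """Cosine of the angle between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def word_arithmetic(positive, negative, vectors):
    """Add the `positive` words, subtract the `negative` ones, return the nearest word."""
    size = len(next(iter(vectors.values())))
    result = [0.0] * size
    for word in positive:
        result = [r + x for r, x in zip(result, vectors[word])]
    for word in negative:
        result = [r - x for r, x in zip(result, vectors[word])]
    # The input words are excluded so the answer is always a new word.
    candidates = [w for w in vectors if w not in positive and w not in negative]
    return max(candidates, key=lambda w: cosine(result, vectors[w]))

print(word_arithmetic(positive=["queen", "man"], negative=["woman"], vectors=toy_vectors))
```

A query such as ‘Brexit’ plus ‘economic’ minus ‘political’ has the same shape: two words in the positive list and one in the negative list.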
Now for the most difficult question of all: what is predicted to follow the words ‘Brexit means’?
It turns out that for AIs and humans alike, Brexit means ‘Brexit’.