This year marks the 70th anniversary of the death of Alan Turing.
Turing is considered the father of AI. In his 1950 paper "Computing Machinery and Intelligence", he starts with
"I propose to consider the question, 'Can machines think?'",
and introduces what is now called the Turing Test of machine intelligence.
At university my tutor was Richard Grimsdale, who built the first ever transistorised computer.
Grimsdale's tutor was Alan Turing, making me a grand-tutee of Turing.
Coincidentally I went on to work in the department in Manchester where Turing worked and wrote that paper.
I worked on the 5th computer in the line of computers Turing also worked on, the MU5.
Moving to The Netherlands, I co-designed the programming language that Python is based on.
I was the first user of the open internet in Europe, in November 1988, 36 years ago!
CWI set up the first European open internet node (64Kbps!), and then two spin-offs to build the internet out in Europe and the Netherlands.
I organised workshops at the first Web conference at CERN in 1994.
I co-designed HTML, CSS, XHTML, XForms, RDFa, and several other Web standards.
This talk emerges from a course I give every year at a Summer School at Oxford University, that includes an introduction to AI.
There are currently two streams for doing AI:
First, encoding.
Let's take an example: Noughts and Crosses/Tic Tac Toe.
I imagine everyone here can play it perfectly.
There are two ways you could encode noughts and crosses knowledge:
One way to encode the rules for playing the game is to specify conditions, and the response to those conditions:
= if there is a line with two of your pieces and a space, take the space
= if there is a line with two opposing pieces and a space, take the space
This is actually rather hard!
The second way, enumerating board positions, is a more objective way of encoding knowledge.
Although there are nearly 20,000 possible combinations of X, O and blank on the board, surprisingly, there are only 765 unique possible (legal) board positions, because of symmetries.
For instance, the four boards with a single X in one of the four corners all represent the same position.
Similarly, the eight rotations and reflections of a board with an X in a corner and an O on an adjacent edge square are all the same position.
So we take each of the 765 boards, and then record which move we would take from that position.
Now we have encoded our knowledge of noughts and crosses, and the program can use that to play against an opponent.
As a matter of interest: what first move do you make? Centre, corner, or edge?
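That figure of 765 can be checked with a short search over all legal games, folding together boards that are rotations or reflections of each other. This is a sketch of my own, not the original program:

```python
# Count the essentially different noughts-and-crosses positions,
# identifying boards that are rotations or reflections of each other.
LINES = [(0,1,2), (3,4,5), (6,7,8), (0,3,6), (1,4,7), (2,5,8), (0,4,8), (2,4,6)]

def won(b):
    return any(b[i] == b[j] == b[k] != ' ' for i, j, k in LINES)

def rotate(b):   # rotate the 3x3 board 90 degrees clockwise
    return ''.join(b[6 - 3*c + r] for r in range(3) for c in range(3))

def mirror(b):   # mirror the board left-right
    return ''.join(b[3*r + 2 - c] for r in range(3) for c in range(3))

def canonical(b):  # the smallest of the board's 8 symmetric variants
    variants, x = [], b
    for _ in range(4):
        variants += [x, mirror(x)]
        x = rotate(x)
    return min(variants)

positions = set()

def explore(b, player):
    c = canonical(b)
    if c in positions:
        return                      # a symmetric twin was already explored
    positions.add(c)
    if won(b) or ' ' not in b:
        return                      # game over: no further moves
    other = 'O' if player == 'X' else 'X'
    for i in range(9):
        if b[i] == ' ':
            explore(b[:i] + player + b[i+1:], other)

explore(' ' * 9, 'X')
print(len(positions))   # → 765
```

The count includes the empty board and finished games; play stops as soon as one side has a line.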
The second stream is learning. Again enumerate all 765 boards:
Link each board with its possible follow-on boards. For instance, from the empty board (the initial state) there are only three essentially different moves: an X in a corner, an X on an edge, or an X in the centre.
From the centre opening move, there are only two essentially different responses: an O in a corner, or an O on an edge.
And so on.
So we have linked all possible board situations to all possible follow-on positions.
Give each of those links a weight of zero, and play the computer against itself:
Repeatedly make a random move from the current board until one side wins, or it's a draw.
When one side has won, add 1 to the weight of each link the winner took, and subtract one from each link the loser took. (Do nothing for a draw)
Repeat, playing many, many times.
Playing against a human:
When it's the program's turn, make the move with the highest weight from the current board.
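The self-play scheme just described can be sketched in a few lines of Python. This is a minimal reconstruction (the board encoding, the number of training games, and playing without the symmetry knowledge are all illustrative choices, not the author's actual program):

```python
import random

LINES = [(0,1,2), (3,4,5), (6,7,8), (0,3,6), (1,4,7), (2,5,8), (0,4,8), (2,4,6)]

def winner(b):
    for i, j, k in LINES:
        if b[i] == b[j] == b[k] != ' ':
            return b[i]
    return None

weights = {}                 # (board, move) link -> weight

def self_play():
    b, player = ' ' * 9, 'X'
    taken = {'X': [], 'O': []}
    while ' ' in b:
        m = random.choice([i for i in range(9) if b[i] == ' '])
        taken[player].append((b, m))
        b = b[:m] + player + b[m+1:]
        w = winner(b)
        if w:                # winner's links +1, loser's links -1
            loser = 'O' if w == 'X' else 'X'
            for link in taken[w]:
                weights[link] = weights.get(link, 0) + 1
            for link in taken[loser]:
                weights[link] = weights.get(link, 0) - 1
            return
        player = 'O' if player == 'X' else 'X'
    # board full with no winner: a draw, weights unchanged

random.seed(0)
for _ in range(20000):       # play many, many times
    self_play()

def best_move(b):            # against a human: take the highest-weighted link
    return max((i for i in range(9) if b[i] == ' '),
               key=lambda m: weights.get((b, m), 0))

print(best_move(' ' * 9))    # the program's learned opening move
```

Because this version does not know that the four corners are equivalent, the corner wins are split four ways, which is why it tends to favour the centre opening.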
The two methods have different advantages and disadvantages.
Encoding
⊖ You can only encode what you know.
⊖ This may encode bias, or mistakes.
⊕ You can work out why a decision was made (or get the program to tell you why).
Learning
⊖ Only as good as the training material.
⊖ This can also encode hidden bias.
⊖ It can't explain why.
⊕ It may spot completely new things.
In both cases: it's not true intelligence.
In 1970 women were about 6% of the musicians in American orchestras.
Then they introduced blind auditions.
Now women make up a third of the Boston Symphony Orchestra, and a full half of the New York Philharmonic.
Bias, whether intentional or not, is everywhere.
In an attempt to even out sentencing, software has been used to determine sentences for crimes. Unfortunately, the software was trained on real-world, and thus biased, sentences.
Most native English speakers could tell you whether a sentence was right or not, but wouldn't be able to tell you why.
Why is it An Italian wooden serving dish, not A wooden serving Italian dish?
Why A lovely little bird, not A little lovely bird?
Why A silly old fool, not An old silly fool?
It's because English has a fairly fixed order for its adjectives:
Opinion size age shape colour origin material purpose Noun
Unless we've studied the rules, we can't replicate them, even though we use them all the time.
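The rule can even be written down as a tiny program: give each adjective a category, and sort by the fixed category order. The lexicon here is a toy assumption, just the words from the examples:

```python
# English adjective order: opinion size age shape colour origin material purpose
ORDER = ["opinion", "size", "age", "shape", "colour", "origin", "material", "purpose"]

CATEGORY = {                 # toy lexicon covering the examples above
    "lovely": "opinion", "silly": "opinion",
    "little": "size",
    "old": "age",
    "Italian": "origin",
    "wooden": "material",
    "serving": "purpose",
}

def order_adjectives(adjectives):
    """Sort adjectives into the fixed English category order."""
    return sorted(adjectives, key=lambda a: ORDER.index(CATEGORY[a]))

print(order_adjectives(["wooden", "serving", "Italian"]))  # → ['Italian', 'wooden', 'serving']
print(order_adjectives(["little", "lovely"]))              # → ['lovely', 'little']
```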
It's
een halve minuut ("a half minute")
but
een half uur ("a half hour")
You may know the reason, but I know Dutch people who don't.
Saying all corners are the same in noughts and crosses is encoding knowledge.
So I wrote two versions of the noughts and crosses game, one without that knowledge, one with.
They play differently!
For instance, for the first move, the first one always goes for the centre, and the second always goes for a corner.
This is because a corner is actually a better first move, but in the version that doesn't know that corners are equivalent, the wins are divided over the four corners, so the centre looks like it's better.
In one we have encoded the intelligence that corners are equivalent. The learning program can't work that out for itself: it needs insight.
Humans are inclined to interpret things from a personal point of view.
We interpret this as the swans feeding the fish, but in fact the swans are just wetting the food, and the fish are taking advantage of what they drop. (Image: swans feeding fish)
In the 60's Joseph Weizenbaum created a program that imitated a Rogerian psychotherapist.
Men are all alike.
In what way?
They're always bugging us about something or other.
Can you think of a specific example?
Well, my boyfriend made me come here.
Your boyfriend made you come here?
He says I'm depressed much of the time.
I am sorry to hear you are depressed.
It's true, I am unhappy.
Do you think coming here will help you not to be unhappy?
I need some help, that much seems certain.
In a classic example, a departmental secretary who was trying it out, asked Weizenbaum to leave the room, because she wanted to talk personally to it.
Recently a similar thing occurred with a Google employee claiming an AI chat program was sentient (because it said it was).
At least the program passed the Turing test.
ChatGPT is widely considered to exhibit intelligence, and then people get surprised when it makes enormous blunders.
But seeing it as intelligent is just our pareidolia at work.
This is an actual (trick) question from a maths exam. I asked ChatGPT to solve it:
If an orchestra of 120 players takes 40 minutes to play Beethoven's 5th, how long would it take 60 players?
Answer: It would take 60 players 80 minutes to play Beethoven's 9th Symphony.
The Turing Test is a test proposed by Alan Turing in 1950, to determine if a program has achieved intelligence.
You talk via a chat interface, and you don't know if you are talking to a person or a computer: you have to ask questions to try and work out which it is.
A proposed example of such a session:
You: In the first line of the sonnet which reads, "Shall I compare thee to a summer's day," would not "a spring day" do as well or better?
Them: It wouldn't scan.
You: How about "a winter's day." That would scan all right.
Them: Yes, but nobody wants to be compared to a winter's day.
ChatGPT is very verbose:
Me> In the first line of the sonnet which reads 'Shall I compare thee to a summer's day', would not 'a spring day' do as well or better?
ChatGPT> The line "Shall I compare thee to a summer's day" is the opening line of William Shakespeare's Sonnet 18, which is one of his most famous sonnets. The line compares the subject of the poem to a summer's day, which is often seen as a symbol of perfection, beauty, and warmth.
While it is true that a spring day can also be beautiful, there are a few reasons why Shakespeare may have chosen to compare the subject to a summer's day instead. Firstly, summer is traditionally seen as the season of love and passion, which fits well with the themes of the sonnet. Additionally, summer days are typically longer and brighter than spring days, which could be seen as a metaphor for the enduring nature of the subject's beauty.
Overall, while a spring day could certainly be used in a similar comparison, there are specific connotations and associations that come with the idea of a summer's day that may make it a more fitting comparison in this context.
Me> You forgot to mention that it wouldn't scan.
ChatGPT> You are correct, and I apologize for the oversight. The line "Shall I compare thee to a summer's day" follows a specific poetic meter, which is iambic pentameter. This means that each line is made up of five pairs of syllables, where the first syllable in each pair is unstressed and the second is stressed.
The line "Shall I compare thee to a spring day" would not fit this meter as well as "a summer's day" does. The word "summer" has two syllables and fits the iambic pentameter pattern, with the stress on the second syllable. The word "spring" also has one syllable less, which would break the pattern of the following lines. Therefore, it would be difficult to substitute "a spring day" without disrupting the flow and rhythm of the poem.
Thank you for bringing this to my attention and allowing me to clarify my previous response.
There are many examples of ChatGPT generating completely wrong results.
For instance, I asked it to write a bio of me for a conference, and it came out with this nonsense:
He is a fellow of the Association for Computing Machinery (ACM) and the Royal Society of Edinburgh, and has been awarded honorary doctorates by several prestigious universities.
Completely untrue.
This phrase was coined in the paper "On the Dangers of Stochastic Parrots: Can Language Models Be Too Big? 🦜", by Emily Bender, Angelina McMillan-Major, and Google employees Timnit Gebru and Margaret Mitchell.
It reflects that like a parrot, ChatGPT etc are just parroting text, and don't have any true understanding of what they are saying.
The paper covers the risks, environmental and financial costs, inscrutability and biases, the inability of the models to understand, and the potential of deceiving people.
It resulted in Gebru and Mitchell losing their jobs at Google!
So I thought I'd poke some fun at ChatGPT.
Me> Someone said to me "ChatGPT is like autocomplete on..." and then didn't finish the sentence. What do you think he was going to say?
ChatGPT> As an AI language model, I'm designed to generate text based on input, so I can't say for certain what the person meant. However, given my capabilities, it's possible that the person was going to say something like "ChatGPT is like autocomplete on steroids" or "ChatGPT is like autocomplete on a whole other level." Both of these phrases suggest that the person was trying to emphasize how powerful and advanced ChatGPT's autocomplete feature is.
Me> Any other possibilities?
ChatGPT> Sure, there could be other possibilities as well. Here are a few more examples:
"ChatGPT is like autocomplete on steroids, but for writing."
"ChatGPT is like autocomplete on a whole different scale."
"ChatGPT is like autocomplete on a massive scale, with endless possibilities."
"ChatGPT is like autocomplete on overdrive."
"ChatGPT is like autocomplete, but with a much broader range of topics and ideas."
"ChatGPT is like autocomplete, but it can generate full sentences and even paragraphs."
"ChatGPT is like autocomplete, but it's smarter and more intuitive."
"ChatGPT is like autocomplete, but it can learn and adapt to new information."
Me> How about "... on crack"?
ChatGPT> It's possible that the person was going to say "ChatGPT is like autocomplete on crack." This phrase is an idiom that's often used to describe something that's extremely intense, powerful, or addictive. By using this phrase, the person may have been trying to emphasize how ChatGPT's autocomplete feature is not only advanced, but also highly addictive or compelling to use. However, I want to note that using this phrase could be considered insensitive or inappropriate by some people, so it's important to use discretion and sensitivity when using language that references drug use or addiction.
Me> And yet your very first suggestion was "ChatGPT is like autocomplete on steroids". Steroids are a drug.
ChatGPT> You're absolutely right. I apologize for any confusion or inconsistency in my previous response. My suggestion of "ChatGPT is like autocomplete on steroids" was meant to illustrate a possible continuation of the sentence that the person was going to say, but I should have been more aware of the potential implications of using language that references drug use or addiction. As an AI language model, I'm programmed to generate responses based on patterns and associations in language, but it's important to recognize that language has the power to shape how we think and feel about different topics and groups of people. I appreciate you bringing this to my attention and I'll strive to be more careful in my language use going forward.
Me> Aren't you worried that I might be making fun of you?
ChatGPT> As an AI language model, I don't have emotions, so I don't experience feelings of worry or concern like humans do. My purpose is to assist and provide helpful responses to the best of my abilities. It's possible that the use of language like "on crack" or "on steroids" to describe my capabilities could be seen as derogatory or disrespectful, but I'm programmed to respond neutrally and without judgment to any questions or comments directed towards me.
Me> How about irony?
ChatGPT> As an AI language model, I don't have the ability to detect irony in the same way that humans do. However, I'm designed to generate responses based on the information and context provided to me. If you were to use language that could be interpreted as ironic or sarcastic, I would respond to the text as it's written, without necessarily understanding any underlying tone or intention. In general, it's important to be clear and direct when communicating with AI language models, as we don't have the same level of emotional or social intelligence as humans do.
Natural language exhibits very strong statistical properties.
For instance, I took the first chapter of James Joyce's Ulysses, which contains 1100+ sentences, 7000+ words, 40,409 characters, of which 3577 (about 9%) are the letter "e", but only 24 are an "x" and 33 a "j".
Stately, plump Buck Mulligan came from the stairhead, bearing a bowl of lather on which a mirror and a razor lay crossed. A yellow dressinggown, ungirdled, was sustained gently behind him on the mild morning air. He held the bowl aloft and intoned:
-Introibo ad altare Dei.
If I generate random characters of text from Ulysses, using only the statistical likelihood of a character appearing, I get something like this:
ites ecginlsacheurge,o gHTmawgala eSuh nh by ti.e mbp!lrittnoebneiwanb leTah osn,ua Dd i ihasshrrdupoidlass el oe,obeu fetd,o w Tiyynrm huademn ir de ey S h ieao..ethf atriasnd hhniuariwyatan lftus deaiotelidKWgaplbbhuperhdecewy,o tsfdnrreSsgiyn t.inn aeb
However, there are other statistical properties. A "q" is only ever followed by a "u". A 'z' is only followed by 'e', 'i', 'o', 'y', and 'z'.
So I select a random character, and then generate the next character randomly from the characters that can follow it, and then the next from the characters that can follow that:
-Favengad ing -Hewintil ppopast fet ind d se -Cannghe azin? l Ston at id owofin swilelok s aiarer, O, Oryowng I y anghe rbaleereas alletod oullourdougack, Thist,
-Thanghed tin ond Core s as, Ond ofumorrs ofowhe vofof eyed Hato bomepathe
So that was generated with pairs of letters; how about triples?
-By for way learrong blatteake youre key, lart. Twericelboutheonchillothen an seres carit, Bucklende. The iftylied waseephe shit's themble fe. Hurvalick Malkwaing unks he bit? I'm him the him nody put, Gool throm turnowly beir, am, himpakinesing, le globlack youghts. Epices hisdaybriet.
How about quadruples?
-Grank like Zarapet. Not and sir? He to student of yound Arnold up thread browth woman, on gurgliste can't we of it, hung it door, Stephen's uncles said nose I'm couldn't in the somes like about that rosewood morn oddedly. Damn els. A slit of the and fanner booked the merry And bladdeneral pranks back, I caped him from in fell sighten said:
Quintuples?
-Dedalus, the man cliffs he said. Her gloomily. He floriously. It's a Hellen but he search and snappeare's all else the said. In a pint overeigns.
Sextuples?
He swept up near him your school kip and sleep her old woman asked. Inshore and junket. Ireland about some down the razor. Stephen picked men freed his hands before him. There's tone:
-For old woman show by it? Buck Mulligan is soul's cry, he said gloomily. You were makes they hat forward again vigour
Barely a word there that isn't English.
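What the generator above does is a simple order-n Markov chain over characters: record which characters can follow each n-character context, then walk through them at random. A sketch, trained here on a toy sample rather than the whole of Ulysses:

```python
import random
from collections import defaultdict

def train(text, n):
    """Map each n-character context to the characters seen after it.
    Duplicates in the list give more frequent followers more weight."""
    follow = defaultdict(list)
    for i in range(len(text) - n):
        follow[text[i:i+n]].append(text[i+n])
    return follow

def generate(follow, n, length):
    out = random.choice(list(follow))   # start from a random context
    while len(out) < length:
        nxt = follow.get(out[-n:])
        if not nxt:
            break                       # context only occurs at the very end
        out += random.choice(nxt)       # pick by statistical likelihood
    return out

sample = ("Stately, plump Buck Mulligan came from the stairhead, "
          "bearing a bowl of lather on which a mirror and a razor lay crossed. ") * 3
model = train(sample, 3)                # triples of letters; try n = 1..6
print(generate(model, 3, 80))
```

The same `train`/`generate` pair works unchanged on lists of words instead of strings of characters, which gives the word-level versions that follow.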
What we have just done with characters, we could also do with words. At random:
-All sunshine isn't of legs name odour you me, death running Haines Haines I'm the head floor shake Father. wondering a loveliest atover
Pairs of words:
That's our sakes. His head disappeared and these bloody English! Bursting with fry on the gulfstream, Stephen and Harry I should think you remember anything. I can't wear them from the Lord. Thus spake Zarathustra. His own rare thoughts, a third, Stephen turned suddenly for her.
Triples:
But a lovely morning, sir, she said, by the sound of it. Are you from the kitchen tap when she was a girl. She heard old Royce sing in the year of the lather on his razorblade. He hopped down from his chair. Sit down. Pour out the mirror of water from the sea.
Quadruples:
I told him your symbol of Irish art. He says it's very clever. Touch him for a guinea. He's stinking with money and indigestion. Because he comes from Oxford. You know, Dedalus, you have the real Oxford manner. He can't make you out. O, my name for you is the best: Kinch, the knife-blade. He shaved warily over his chin.
This is just straight text from Ulysses, due to the small learning set. No point in going further.
ChatGPT is just this, only writ large, additionally using statistical techniques for related meanings.
ChatGPT just generates text related to what you have typed.
This is also, by the way, why you can get such weird images. The pieces just fit together.
Here's an AI generated waitress:
Consider the UK political landscape.
Because of its ancient voting system, it tends to produce a small number of parties: two large ones, plus a handful of regional and other parties.
Because there is such a small number of parties, the two main parties tend to be very broad, each a sort of pre-arranged coalition of interests.
Normally the UK parties are described on a left-right axis:

Left .............. Centre .............. Right
        Labour      Libdem          Tory
Because there are a large group of people who would never vote Tory, and another large group who would never vote Labour, the parties tend to drift towards the centre where the voters who change their voting choice are situated.
You could describe the British parties by a position representing (approximately) where they are located on this left-right axis from -1 to 1:
Labour: -0.25
Libdem: 0
Tory: 0.7
Another axis might reflect their current position on Europe:
Anti .............................. Pro
  Tory        Labour        Libdem
Tory: -1
Labour: 0
Libdem: 1
You could then create a two-dimensional idea of the parties by combining these axes:
Labour: (-0.25, 0)
Libdem: (0, 1)
Tory: (0.7, -1)
There is nothing essential about using -1 to +1 as the numbers.
You could just as well use 0 to 1 with the same effect, with 0.5 representing 'in the middle':
Labour: (0.375, 0.5)
Libdem: (0.5, 1)
Tory: (0.85, 0)
More modern voting systems allow a greater range of parties.
For instance The Netherlands had 25 parties at the last election, of which 15 got elected.
It is less informative to display them just on a left-right axis.
One way they are displayed there is on two axes: left-right, progressive-conservative
So you could represent the parties on this diagram by a position of two coordinates. For instance, D66, about the same as the UK Libdems, is at roughly (0, 0.5).
The CDA and the VVD are very close on the above diagram, both similar to the (pre-Brexit) Conservatives, but the CDA are Christian, and the VVD secular.
So you could add another dimension of religion.
Two parties, the Dutch Labour Party and the Green-Left party, considered themselves close enough to coalesce, at least for the election; the main difference between them was on the environment.
So you could add environment as a dimension. Or Europeanism vs Nationalism.
Similarly there's a party for older people, and one for animal rights, and so on.
The website that produced the above image helps voters discover who they should vote for.
They ask 30 questions, and on that basis say which parties you are closest to.
This means that they use 30 dimensions to represent the parties, so really the 'semantics' of a party is a list of 30 numbers.
Your position is also a list of 30 numbers, and then a good match is the party that is the 'nearest' to you in those 30 dimensions.
You could subtract the lists of numbers for two parties, and get a list of numbers that would expose the differences in approach between them, or between a party and you.
We are very bad at visualising anything above 3 dimensions, so they reduce the picture to the two above.
Computers don't have that problem, so they can find clusters, and tell you the semantic 'distance' you are from various parties.
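A sketch of that matching, using the two-dimensional party positions from above (the voter's position is invented); exactly the same code works with 30 numbers per party:

```python
import math

# Two-dimensional positions from the text: (left-right, anti/pro Europe)
parties = {
    "Labour": (-0.25,  0.0),
    "Libdem": ( 0.00,  1.0),
    "Tory":   ( 0.70, -1.0),
}

def distance(a, b):
    """Euclidean distance, in however many dimensions the lists have."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def nearest(position):
    """The party whose position is closest to the given one."""
    return min(parties, key=lambda p: distance(position, parties[p]))

voter = (-0.1, 0.4)                  # an invented voter position
print(nearest(voter))                # → Labour
```

Subtracting two positions component-wise gives the list of numbers that exposes where two parties (or a party and a voter) differ.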
This is the basis of the method that GPT programs represent the meaning of words: each word has a list of numbers, each number representing that word's position on a particular meaning axis.
Words that are synonyms, or near synonyms are then close to each other in the semantic space.
There are two notable things:
The axes likely include male-female, big-small, young-old, singular-plural, and so on, but because machine learning is so good at spotting patterns that we can't even see, there are probably axes that we don't even have a name for.
There are interesting properties of those lists of numbers: you can do a sort of arithmetic on them.
For instance, you can subtract Man from Woman:
D = Woman - Man
The resulting list of numbers then represents the semantic 'distance' between the words Man and Woman. The extraordinary thing is that you can do things with this difference. For instance
Father + D
gives you a position very close to Mother.
Similarly
Uncle + D
gives you a position very close to Aunt.
Another example is
F = Pizza - Italy
If you add F to Germany
Germany + F
you get a position very close to Bratwurst.
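The arithmetic can be illustrated with tiny hand-made vectors on two hypothetical axes; the numbers are invented purely for illustration (real embeddings have hundreds of dimensions), and D here is the Man-to-Woman direction:

```python
# Toy word vectors on two invented axes: (gender, parenthood-ish)
vectors = {
    "man":    ( 1.0, 0.0),
    "woman":  (-1.0, 0.0),
    "father": ( 1.0, 1.0),
    "mother": (-1.0, 1.0),
    "uncle":  ( 1.0, 0.5),
    "aunt":   (-1.0, 0.5),
}

def sub(a, b): return tuple(x - y for x, y in zip(a, b))
def add(a, b): return tuple(x + y for x, y in zip(a, b))

def nearest(v):
    """The word whose vector is closest to v (squared distance)."""
    return min(vectors, key=lambda w: sum((x - y) ** 2
                                          for x, y in zip(vectors[w], v)))

D = sub(vectors["woman"], vectors["man"])    # the Man -> Woman direction
print(nearest(add(vectors["father"], D)))    # → mother
print(nearest(add(vectors["uncle"], D)))     # → aunt
```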
So when GPTs produce the next word, they don't just do it on the basis of syntax (as we have been doing up to now), they also use meaning to help choose the next word.
The new arms race is on for generalised intelligence, when there really is an I in AI.
When will it happen?
What will happen when computers are more intelligent than us?
My grandfather was born in a world of only two modern technologies, trains and photography, but in his life of nearly a hundred years, he saw vast numbers of paradigm shifts:
electricity, telephone, lifts, central heating, cars, film, radio, television, recorded sound, flight, electronic money, computers, space travel, ... the list is enormous.
We are still seeing new shifts: mobile telephones, GPS, cheap computers that can understand and talk back, self-driving cars, ...
Are paradigm shifts happening faster and faster?
Yes.
Kurzweil asked representatives of many different disciplines to identify the paradigm shifts that had happened in their discipline and when.
We're talking here of time scales of tens of thousands of years for some disciplines.
He discovered that paradigm shifts are increasing at an exponential rate!
If they happened once every 100 years, then they happened every 50 years, then every 25 years, and so on.
Year      Time to next   = Days
0         100            36500
100       50             18250
150       25             9125
175       12.5           4562.5
187.5     6.25           2281.25
193.75    3.125          1140.63
196.875   1.563          570.31
198.438   0.781          285.16
199.219   0.391          142.58
199.609   0.195          71.29
199.805   0.098          35.64
199.902   0.049          17.82
199.951   0.024          8.91
199.976   0.012          4.46
199.988   0.006          2.23
199.994   0.003          1.11
199.997   0.002          0.56
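The halving schedule can be reproduced with a few lines of Python (the cut-off at roughly half a day is an arbitrary choice of mine):

```python
# A paradigm shift after 100 years, then 50, then 25, and so on.
year, gap = 0.0, 100.0
rows = []
while gap * 365 >= 0.5:            # stop once shifts come roughly daily
    rows.append((round(year, 3), round(gap, 3), round(gap * 365, 2)))
    year += gap
    gap /= 2

for y, g, d in rows:
    print(f"{y:>8}  {g:>8}  {d:>10}")
```

The geometric series 100 + 50 + 25 + ... never passes 200: the whole accelerating sequence fits inside two centuries.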
That may seem impossible, but we have already seen a similar expansion that also seemed impossible.
In the 1960's we already knew that the amount of information the world was producing was doubling every 15 years, and had been for at least 300 years.
We 'knew' this had to stop, since we would run out of paper to store the results.
And then the internet happened.
So sometime in the nearish future paradigm shifts will apparently be happening daily.
How?
One proposed explanation is that this is the point at which computers become smarter than us: computers will start doing the design rather than us.
So for the first time ever there will be 'things' more intelligent than us.
Within a short time, not just a bit more intelligent, but ten, a hundred, a thousand, a million times more intelligent.
Will they be self-aware? Quite possibly.
This raises new ethical questions. Would it be OK to switch them off?
To help you focus your mind on this question: suppose we find a way to encode and upload our own brains to these machines when we die. Is it still OK to switch them off?
Will these new super intelligences be on our side? Will they look kindly on us?
There is no inherent reason why they should.
What is our attitude to lesser intelligences on earth?
If they are useful to us, like cows, we might feed them for a while before killing them. If they are not useful, we don't really care if they live or die. If they are inconvenient, we kill them without remorse.
Why would a super-intelligence act differently?
What if computers are no longer in our service?
What if they are no longer in our service and spot the cause of the climate crisis?
Let me remind you that they will be connected to the internet.
We need a plan.
But we respond very slowly: look at how fast we are responding to climate change...
Humans are dreadfully bad at avoiding crises.
We did manage to address the year 2000 problem, but only because there was no one making money from the alternative.
The question is, which will get us first: the climate crisis or the AI crisis?
Or will we actually do something?