Watson on Jeopardy

I just watched Watson on Jeopardy. My god, it cleaned Jennings's and Brad's clocks.

Only to blow it big time on a pretty easy final Jeopardy question.
I watched the second episode yesterday, and Watson did blow away the competition (only to blow an easy final Jeopardy question - wonder how that happened?).

But IMO what yesterday (and the first episode) showed is not that Watson is "smarter" or better at Jeopardy than a human, it showed that Watson has an edge on the buzzer somehow. You could just tell the two human champs knew many if not all the answers that Watson got, Watson was consistently just quicker to the buzzer.

So IMO this has unfortunately devolved into something other than a display of human intelligence vs AI - it's a show of buzzer mechanics. What was intriguing going in, has now sadly become (more of) a sideshow IMO...
 
Yes, and it's not exactly an "error". AI is often construed as an enterprise to understand human intelligence, so when the machine makes exactly the same error as the human, that's a triumph. It shows that the machine reasons like the human.
Exactly – and I think people are missing this point. Reasoning with context. A knowledge based system reasoning against the two top champs, each of whom has shown the broadest range together with a very high percentage of correct answers.

More useful but less interesting contests would be an AI diagnosis compared with a medical doctor's, an AI conducting an audit vs an auditor, etc. Imagine the productivity gains. The possibilities are immense.
 
I'd like to see IBM take on the challenge of having Watson see, hear and interpret the clue as it is given verbally and visually. This is what the humans have to do, so it would be a fairer test.
 
Yes, and it's not exactly an "error". AI is often construed as an enterprise to understand human intelligence, so when the machine makes exactly the same error as the human, that's a triumph. It shows that the machine reasons like the human.

And that's a good thing? :p
 
One more Watson thing: if you haven't read Ken Jennings's blog on the Washington Post, do so. It is laugh-out-loud funny.

Thanks - that was great! Based on that, I'm definitely going to read his books. It's cool that he isn't just an auto-answer robotic type, but actually has a great personality and funny bone.

I watched the second episode yesterday, and Watson did blow away the competition (only to blow an easy final Jeopardy question - wonder how that happened?).

But IMO what yesterday (and the first episode) showed is not that Watson is "smarter" or better at Jeopardy than a human, it showed that Watson has an edge on the buzzer somehow. ...

So IMO this has unfortunately devolved into something other than a display of human intelligence vs AI - it's a show of buzzer mechanics. What was intriguing going in, has now sadly become (more of) a sideshow IMO...

One of the blogs that clifp listed explained the final round problem - they programmed Watson to lightly weight the categories. I didn't notice: was Chicago a #2 pick for 'him'?

Yes, the whole buzzer thing is kinda distracting here. I thought today - it isn't really important if Watson is a fraction of a second slower or quicker than a human, the real value is can he be reasonably fast for the situation at hand. That might be a second or two - that could still be pretty normal conversation speed; to ten seconds if you just need an answer 'while you wait' ( compared to real life - OK, I've got that information in this pamphlet over here,- ahh, here it is...); to a minute or two (real life: let me put you on hold); to overnight (let me get back to you on that).

Now I'm curious how Watson could perform with more normal computing power that a Customer Service dept of a large company might be able to afford - It would need the language deciphering part (maybe that could be sent to a central server), but a company would need far less data in its database than what it takes to handle random game show questions.

As far as this becoming a sideshow - yes, but I bet the IBM people are thrilled with that. It's just what they want. This gets attention they couldn't buy any other way. They know it's a gimmick, they'll milk it for all it's worth and learn from it what they can and throw away the rest as the 'cost of doing business'. I think it was a brilliant sideshow. Doing additional rounds would be a bit silly, but maybe with the voice recognition as others have said - that would be good to see.

More useful but less interesting contests would be an AI diagnosis compared with a medical doctor's, an AI conducting an audit vs an auditor, etc. Imagine the productivity gains. The possibilities are immense.

Yes - in this case it's coming up with what are largely single answers to searches. I would think a computer would really be powerful in calculating more complex things, once it 'understands' the question. Medical diagnoses were mentioned in the show, and I would think a computer could weigh all the inputs and pull in knowledge bases and make suggestions based on that far better than any human.

I'd like to see IBM take on the challenge of having Watson see, hear and interpret the clue as it is given verbally and visually. This is what the humans have to do, so it would be a fairer test.

+1


... It shows that the machine reasons like the human.
And that's a good thing? :p

Heh-heh. I recall when computers were far less powerful, and there was all this 'fuzzy logic' and AI buzz. I always thought - computers are really good at doing some things so much better than humans ( calculations, dealing with large amounts of info, etc), why not use them for what they do best, and use humans for what they do best?

Of course now, with all that past work and today's computer power, this AI stuff is getting seriously good.

-ERD50
 
I'd like to see IBM take on the challenge of having Watson see, hear and interpret the clue as it is given verbally and visually. This is what the humans have to do, so it would be a fairer test.
The humans will be able to win that way for a long time, just because to program the tasks, we'll need to understand more about how we perform them ourselves.

I think it's fun watching "How it's Made" on the Science Channel to try to figure out, when there are just a few humans interspersed among a bunch of automated devices, what it is about the specific task that needed a human to do it rather than a robot.
 
Might not make for interesting TV, but I'd like to see the results with buzzer speed taken out of it. Take all three "players" separately, give them each all the questions, see how many total they could get right given maybe a few seconds to get their answer out. Seems clear that Watson would score high ... could either of the others score as high or higher?
 
I think it was ERD50 who said that this could be really useful in medical diagnosis. Totally agree. 100 human doctors given the same patient will come up with 101 diagnoses and 102 ways to treat the patient. There have already been studies and commercial applications that use DSS (decision support systems) to improve clinical management based on the best current evidence. Unfortunately there are both economic and cultural reasons why they have not had more acceptance so far. Doctors tend to get a bit upset if they are expected to do the rectal exam but not get any glory for thinking the problem through. :LOL:
 
Might not make for interesting TV, but I'd like to see the results with buzzer speed taken out of it. Take all three "players" separately, give them each all the questions, see how many total they could get right given maybe a few seconds to get their answer out. Seems clear that Watson would score high ... could either of the others score as high or higher?

Jennings has said that good Jeopardy players know 25 or 26 out of 30 answers, and the buzzer is the difference. During his run, Jennings thought it was unfair that he had so much practice, so he requested that challengers get more practice using the buzzer (good sport, that guy). According to Jennings, the precision of the computer ringing the buzzer is clearly the difference in this match.

In yesterday's match the questions seemed harder than normal. I am curious how they would have performed if it were strictly a knowledge test where each participant was given 5 to 10 seconds to answer the question.
My guess is Watson would have won getting either 27 or 28 questions right, and the two humans would have been a question or two behind.
 
I think it was ERD50 who said that this could be really useful in medical diagnosis. Totally agree. 100 human doctors given the same patient will come up with 101 diagnoses and 102 ways to treat the patient. There have already been studies and commercial applications that use DSS (decision support systems) to improve clinical management based on the best current evidence. Unfortunately there are both economic and cultural reasons why they have not had more acceptance so far. Doctors tend to get a bit upset if they are expected to do the rectal exam but not get any glory for thinking the problem through. :LOL:

My one and only AI course was taken exactly 30 years ago. At the time one of the big future applications of AI was in medical diagnostics... So much and so little has changed.

The natural language 'understanding' is very impressive. I do think that customer support is a very logical next step. Frankly it is easier to understand Watson than many support people from India, and a good speech recognition program is also often superior to communicating with somebody in a different country.


I thought the reason we pay doctors the big bucks :) is because they have to do rectal exams, and put up with crotchety sick people.
It seems to me that the liabilities of making a wrong diagnosis of why Windows crashes are much less than those of missing a tumor or something.

Personally, I am looking forward to a holographic medical doctor like in Star Trek: Voyager. Especially if I can have the option of a hot blond instead of this guy.
[Image: TheDoctor.jpg]
 
I really don't like Watson, Jennings is too gracious. :)
 
@grumpy

Making Watson read the clue like a human is a little silly and wouldn't really be that hard. It would require giving him a video camera that's just pointed at where the text-clues show up, then running optical character recognition (OCR) software to turn the visual image of the text into plain text.

It's pretty simple to do. Spammers do it all the time. That's why when you log in or try to post on some sites you have to spend 3 minutes trying to figure out what all that ciphered wobbly text in a "captcha" is actually saying. That's to trick computers; if it were plain text (like on Jeopardy) it would be a cinch to automate the reading and create spam.

I don't really see the point in jumping through those hoops, it achieves nothing, sending him an electronic version of the text of the question is much simpler.

It's about artificial intelligence, not building a robot to take the place of a human player.
 
I missed the first two because I didn't watch the Tivo'd NOVA episode until after they had aired. Anyone found a source for online viewing?

In the NOVA episode, he'd given the same wrong answer because he wasn't fed the answers given by the other competitors. But they said they'd fixed that. Did he ever give the same wrong answer during the actual show?

Concerning the buzzer issue -- you have to figure out whether Watson is faster because of purely mechanical reasons, or because he comes up with the answer faster. If the latter, then it's a valid advantage.

I agree with Glippy that having Watson understand the speech or read the clue would be the easy part.

This is one of the most impressive computer feats I've seen. Too bad Watson didn't lose so that we could have rematches. It was fun to see Watson's list of top three answers -- just like my favorite scene from Terminator (not suitable for work).

youtube.com/watch?v=AeV-DI09Q3w
 
Might not make for interesting TV, but I'd like to see the results with buzzer speed taken out of it. Take all three "players" separately, give them each all the questions, see how many total they could get right given maybe a few seconds to get their answer out. Seems clear that Watson would score high ... could either of the others score as high or higher?

I agree that the "buzzer factor" effectively killed the competitive aspect of the game. I was thinking that the computer should have signaled a human that it was prepared to buzz in (maybe show a green light) and then the human would be competing for the buzz-in with the other two living contestants. I pretty much lost interest although I usually enjoy Jeopardy.
 
Concerning the buzzer issue -- you have to figure out whether Watson is faster because of purely mechanical reasons, or because he comes up with the answer faster. If the latter, then it's a valid advantage.

It could be an advantage for Watson either way. Consider this scenario:

1) It takes Alex 3 seconds to read the clue and clear the buzzers.

2) It takes the two humans 2.5 seconds to read the clue and form a response in their head.

3) It takes Watson 2.99 seconds to form a response.

4) At the 3.00 second mark, Watson buzzes in with machine precision, and beats out the humans even though they had the answer faster than the computer.
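The four steps above can be sketched as a tiny timing model. This is only an illustration of the scenario as described: the `reaction_delay` values (0.16 s for the humans, 0.01 s for Watson) are assumed numbers for the sake of the example, not measurements from the show, and the `Player` class is hypothetical.

```python
from dataclasses import dataclass

BUZZER_OPENS_AT = 3.00  # seconds; Alex finishes the clue and the buzzers clear (step 1)


@dataclass
class Player:
    name: str
    answer_ready_at: float  # seconds until a response is formed
    reaction_delay: float   # ASSUMED lag between "go" and the press registering

    def buzz_time(self) -> float:
        # Nobody can register a buzz before the buzzers are cleared,
        # so having the answer early only helps up to that moment.
        return max(self.answer_ready_at, BUZZER_OPENS_AT) + self.reaction_delay


def first_to_buzz(players):
    # The earliest registered buzz wins the right to answer.
    return min(players, key=lambda p: p.buzz_time())


players = [
    Player("Human A", answer_ready_at=2.50, reaction_delay=0.16),  # step 2
    Player("Human B", answer_ready_at=2.50, reaction_delay=0.16),  # step 2
    Player("Watson", answer_ready_at=2.99, reaction_delay=0.01),   # step 3
]

winner = first_to_buzz(players)
print(f"{winner.name} buzzes first at {winner.buzz_time():.2f}s")
```

In this model the humans' half-second head start on the answer is thrown away, because everyone is held until the 3.00 second mark and only the reaction delay separates them after that.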

Who can say what is 'fair' though? The buzzer precision is one of the skills of the machine, so they use it. But I agree with others that it makes it less interesting. Maybe randomly mixing his button response with an average distribution of the other players' response times would have made for a more interesting match. But playing the game isn't the point of this thing, so it doesn't really matter that much.

Can it pick stocks and/or time the market? I wish one of the Jeopardy clues involved paying off a mortgage or investing the money.


Heh-heh - I thought about that Terminator scene too - that always cracks me up!

-ERD50
 
The buzzer thing is complicated. Pretty sure that Watson does not buzz in until he gets green light plus an answer with a certain confidence level.

The humans can buzz in right after the green light even when they don't have an answer formed yet. They can buzz in if they see a few words and can quickly be thinking, "I got a pretty good idea, I can figure the detailed correct answer after I buzz in". They get a few seconds to respond and they often use it - you can see that, the wheels are still turning after they buzz in. Watson doesn't do that.
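A minimal sketch of the difference described above, assuming Watson's rule really is "green light plus enough confidence". The function names and the 0.5 threshold are made up for illustration; the actual threshold and logic IBM used are not given in this thread.

```python
def watson_should_buzz(light_on: bool, confidence: float, threshold: float = 0.5) -> bool:
    # Watson (as described here) needs both the green light and a fully
    # formed answer whose confidence clears a threshold. The 0.5 default
    # is a hypothetical value, not IBM's.
    return light_on and confidence >= threshold


def human_should_buzz(light_on: bool, has_hunch: bool) -> bool:
    # A human can buzz on a hunch and finish working out the answer
    # in the few seconds allowed after buzzing in.
    return light_on and has_hunch


# Same clue, light is on, neither side is sure yet:
print(watson_should_buzz(light_on=True, confidence=0.35))  # Watson holds back
print(human_should_buzz(light_on=True, has_hunch=True))    # human buzzes anyway
```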

So maybe the only 'fair' determination would be to forget the buzzers/lights and keep the contestants isolated and just time how fast they can give a correct answer (judged by the end of the final syllable of their answer).

It was fun to watch, but it wouldn't be so interesting beyond that. But wow - what this technology will be doing in the next few years.

edit - another way that might make it 'fair' - let everyone buzz in at any time. But double the penalty for wrong answers to avoid buzzing just to get the chance to answer; the humans would have to have a decent level of confidence in their answer or they'd go negative. Maybe still require Watson to wait for Alex, to account for the human reading versus texting to Watson.
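To put a rough number on the doubled penalty idea: if a clue is worth V, a correct answer has probability p, and a wrong answer costs m times V, then buzzing breaks even when p*V - (1-p)*m*V = 0, i.e. p = m / (1 + m). The sketch below just evaluates that, assuming a player only cares about the expected score on the current clue (a simplification; `break_even_confidence` is a hypothetical helper, not anything from the show).

```python
def break_even_confidence(penalty_multiplier: float) -> float:
    # Solve p*V - (1 - p)*m*V = 0 for p, where m is the penalty multiplier.
    # The clue value V cancels out, so only m matters.
    return penalty_multiplier / (1.0 + penalty_multiplier)


normal = break_even_confidence(1.0)   # standard rules: lose the clue value
doubled = break_even_confidence(2.0)  # proposed rule: lose twice the clue value

print(f"normal penalty:  buzz only if confidence > {normal:.2f}")
print(f"doubled penalty: buzz only if confidence > {doubled:.2f}")
```

So under this simple model the doubled penalty raises the bar from a coin flip to about two chances in three before a speculative buzz pays off.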

-ERD50
 
The whole 'timing the buzzer' thing is really just a funny side issue. If Watson can mimic much of human language understanding in a speed close to real time (the speed humans can do the task within), then it's just a processor upgrade away from performing the task faster than humans.

That's a generally recognized phenomenon in AI research.
 
The Watson program definitely had the buzzer timing advantage over the human opponents, but there was probably no way to avoid that and surely IBM would not have entered (masterminded) the competition if otherwise. The real achievement was in the percentage of correct answers. IIRC 90% was considered the threshold needed for a system to be considered "equivalent" to a human expert.

Knowledge based systems have been limited by context and programmer knowledge. That is, the program could only work within a limited context (medical diagnosis) and was based on the knowledge of the individual programmers. Together these have been very limiting factors. IBM has showcased a program not limited by either, and the very public way it was presented implies it is ready for wider-scale commercial use. The potential for productivity improvement is immense. A few areas of opportunity:

Medical: Diagnosis and selection of treatment

Public Services: (social security, medicare, IRS) where providing information is an important and costly part of the mission

Education: knowledge based systems for student questions built around multimedia, computer based or large scale remote learning systems

Technical support: software troubleshooting wizards that actually solve problems. (I might even stop hating MS)

User or customer interface:

Auditing and forensic accounting

Intelligence gathering and analysis

Military: This is an area of almost unlimited opportunity
 
...

Technical support: software troubleshooting wizards that actually solve problems. (I might even stop hating MS)

Ha! Like MS would buy something from IBM -- probably not. Oh, but a third party could do it. Well, that's possible. ;)
 
I often watch Jeopardy and was watching last night when Watson was competing. Very interesting. One thing I noticed was that Watson seemed to beat the humans on buzzing in, not that the humans didn't know the "questions", just that they didn't get the chance to answer. It seemed that the human reaction time couldn't compete with the machine's.

I wonder what the outcome would be if the contestants were allowed to buzz in BEFORE the question reading ended. I suspect that the humans would sometimes be able to make an intuitive leap to the correct question before Watson could complete "his" algorithm process. As it is, Watson may already be well into his "thought" process before the reading of the question ends.

I wonder if the programmers were embarrassed when one of the humans gave the wrong answer and then Watson buzzed in and gave the same wrong answer. An obvious oversight in their logic.

+1. I would be interested in hearing the logic/criteria used for the buzz in.

Did Watson have to push an actual button down via a mechanical device (like the other players) or was it electronic (unfair)?
 
+1. I would be interested in hearing the logic/criteria used for the buzz in.

Did Watson have to push an actual button down via a mechanical device (like the other players) or was it electronic (unfair)?

Both. They did hook up a mechanical button presser, but the speed of an electro-mechanical device is still going to be 'unfair' compared to human reaction times of around 0.16 seconds.

-ERD50
 
Funny.

Were there any video or audio clues in the games that Watson played?
 
Funny.

Were there any video or audio clues in the games that Watson played?

No, that was one of the concessions the producers made: eliminating audio and video Daily Doubles.
 