[email protected]
[Top] [All Lists]

RE: [Heartlogic-dev] new idea (fwd)

Subject: RE: [Heartlogic-dev] new idea fwd
From: "William L. Jarrold"
Date: Wed, 4 May 2005 18:23:01 -0500 CDT

On Wed, 4 May 2005, Josh White wrote:

>> Dear Open Heart Logic user.  As part of _The Turing Challenge_ we
>> would love it if you would rate the following assertion in
>> terms of its believability...
>>
>> "Vienna is wet."
>>
>> (1) Very Unbelievable
>> (2) Unbelievable
>> (3) Neutral
>> (4) Believable
>> (5) Very Believable

> I think this is a very good idea, in general.


> I don't think you'll get much participation unless you tell the user the
> computer's best guess AFTER they click the answer.  The point is the user
> can see how their answer improved the computer.

Ah.  Very interesting and excellent point.  Once again, you "get it"
and then take it a step further.  (If we give them this feedback there
is a slight risk that a person would start to think like Cyc rather
than a naive and natural human, but I think this risk is small,
unimportant, and distracting...The real problem will be keeping
people's motivation and excitement up.)  So yes, I agree that we
should do this.

Read my very rough draft mock-up of this screen below and see if you
have any comments.  Farther below you will see some subtleties that
might complicate matters (see {a} below).  Of course, we can tweak it
even after it is up as a dynamic WWW page, so there's little point in
giving very precise feedback on the actual wording; what I am mainly
after now are overall design-type issues.

Joshua, can you add this feature?

The feature is this: why don't we call it the "Feedback to User" page
(better names please!!!).  After the user selects a rating, a new
window comes up.  The new window has text that we will manually craft
(in the beginning...later we may wish to call Cyc or KM to generate
part of that page...dream on).

Here is a baby-step example of the "Feedback to User" page:

Assume user Joe has selected 2 (Unbelievable) as his rating for
"Vienna is wet."  This is what we want on that page:


Thanks for your rating, Joe!  Ratings like yours can help us evaluate
and improve our AI models.

You rated the believability of the assertion...

        "Vienna is wet."

...as: Unbelievable.

In case you are interested, we would like to give you some context for
the item that you just rated for us.  If you are not interested, just
move on to the next item.

Cyc believes that "Vienna is wet." is true.

Cyc believes this assertion because it concluded it by inference.  The
following facts and rules caused Cyc to conclude "Vienna is wet.":

(1) "Rivers are a kind of water."
(2) "If water touches x then x is wet."
(3) "The Danube is a river."
(4) "The Danube runs through Vienna."
(5) "If a river runs through a region it touches that region."

(Note that these are represented in Cyc not in English but rather in a
logical computer language.  We have translated these facts and rules
into English to make them easy for you to understand.)
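To make the mock-up concrete, here is one way the feedback text could be assembled as a rough sketch.  All the names here (`feedback_page`, the template, the label table) are made up for illustration; the assertion, Cyc's verdict, and the English-rendered proof steps are assumed to be hand-entered for now:

```python
# Sketch: build the "Feedback to User" text for a rated assertion.
# Hypothetical schema -- nothing here is an existing API.

FEEDBACK_TEMPLATE = """Thanks for your rating, {user}!

You rated the believability of the assertion...

    "{assertion}"

...as: {rating_label}.

Cyc believes that "{assertion}" is {verdict}.
{explanation}"""

RATING_LABELS = {
    1: "Very Unbelievable",
    2: "Unbelievable",
    3: "Neutral",
    4: "Believable",
    5: "Very Believable",
}

def feedback_page(user, assertion, rating, verdict, proof_steps):
    """Render the feedback text shown after the user submits a rating."""
    # Number the English-rendered proof steps: (1) "...", (2) "...", ...
    steps = "\n".join(f'({i}) "{s}"' for i, s in enumerate(proof_steps, 1))
    explanation = ("Cyc concluded this by inference from the following "
                   "facts and rules:\n" + steps)
    return FEEDBACK_TEMPLATE.format(
        user=user,
        assertion=assertion,
        rating_label=RATING_LABELS[rating],
        verdict="true" if verdict else "false",
        explanation=explanation,
    )
```

A dynamic page would just substitute the user's actual rating and the stored explanation for the item.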


> Also, some people at hotornot put in wrong answers on purpose, just to hack
> the system.  I think this system is even more tempting to lie to.


And this is a good segue into what I referred to above as the
"subtleties" {a}.

What we should do is make some items "reversed" or "mutated" or
"perturbed."  In other words, for SOME items we will ask participants
to rate "normal" Cyc assertions.  These assertions may be ground
facts (like "The Danube runs through Vienna"), deductions (like
"Vienna is wet."), or maybe even rules (e.g., "If a river runs through
a region it touches that region.").

But OTHER items will be intentionally reversed.  We would predict that
humans would rate these as less believable than unreversed items.

This is what I did in my dissertation.  In study 2, half of the items
were unreversed and the other half were reversed.  In study 3, a third
of the items were unreversed, another third were "slightly" reversed,
and another third were "strongly" reversed.
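The study-3-style split could be automated along these lines.  This is just a sketch under my own assumptions (the function name, the condition labels, and the random-assignment scheme are all made up):

```python
import random

def assign_conditions(items, seed=0):
    """Randomly split items into thirds: unreversed, slightly
    reversed, and strongly reversed (mirroring the study 3 design).
    Returns a dict mapping each item to its condition label."""
    rng = random.Random(seed)       # fixed seed so the split is reproducible
    shuffled = items[:]
    rng.shuffle(shuffled)
    conditions = ["unreversed", "slightly-reversed", "strongly-reversed"]
    # Deal items round-robin into the three conditions.
    return {item: conditions[i % 3] for i, item in enumerate(shuffled)}
```

The actual reversal (negating or perturbing each assertion) would still be done by hand, or later by Cyc/KM.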

There are two reasons we want to do this reversal stuff.  One reason
is to catch liars or vandals.

The OTHER reason is to allow us to compare the mean believability of
different groups of items.  E.g....

unreversed items vs. reversed items
human-generated items vs. machine-generated items
deductions vs. ground facts

...this is all part of the computational ablation paradigm and it
figured big time in my dissertation.  It is an example of what I mean
by good and rigorous methodology.
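The group comparison itself is simple bookkeeping once each rating record carries its item's group label.  A sketch (hypothetical names; real analysis would add significance tests):

```python
from collections import defaultdict

def mean_believability_by_group(ratings):
    """ratings: iterable of (group_label, rating_1_to_5) pairs.
    Returns the mean rating per group.  Under the reversal design we
    would predict reversed items to score lower than unreversed ones."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for group, rating in ratings:
        sums[group] += rating
        counts[group] += 1
    return {g: sums[g] / counts[g] for g in sums}
```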

[Now, it occurs to me that there is a THIRD, more minor,
user-interfacey sort of reason to do this...That is that we want quick
*coarse* judgements about whether a commonsense assertion is a good
one or not.  One way to obtain such coarseness is to throw in a fair
number of ridiculous assertions.]

Well, I should enlist some other AI gurus' opinions on this before I
spout off too loudly about good and rigorous methodology.  Speaking of
AI gurus, Peter, are you on this list yet?

So, Josh and Joshua, does this make sense conceptually and design-wise?

Joshua, can you implement this?  Note: just getting the believability
ratings up there is step one.  Implementing the Feedback to User page
is step two.



Heartlogic-dev mailing list
[email protected]
