Scoffing at Autonomy

From: CorporateInfo (TelegraphHillSF@yahoo.com)
Subject: Expert scoffs at Autonomy software
Newsgroups: comp.ai.nat-lang
Date: 2001-08-23 21:49:53 PST

At Genentech headquarters today in South San Francisco, there was a
seminar on “Metadata & Controlled Vocabularies: What Are they and What
is their Value?” given by Amy Warner, faculty member in Information
Architecture at UMichigan, Genentech’s consultant on taxonomy creation
and a well-known corporate consultant on information technology. Dr.
Warner had a final slide with “Autonomy” and “Semio” written on it, and
here’s what she said:

“I always get asked, aren’t there automated tools to build
taxonomies? There is some ‘Hierarchy Generation Software’ available,
but it generally relies on the principle of Collocation, not
aboutness, which is preferable. Collocation simply means statistical
clustering. Two examples are Autonomy and Semio. Beware of these.
They don’t work very well. They make many promises, but Collocation
only works some of the time. I shouldn’t bash, but I haven’t talked
to a company yet that likes Autonomy, including those that have bought
it. In many cases they are overselling their product.”
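
To make “Collocation simply means statistical clustering” concrete, here is a minimal sketch of the raw signal such clustering starts from: counting which words share a document. This is an illustration only, not a description of how Autonomy or Semio actually work; the function name and the three toy documents are assumptions made up for the example.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(documents):
    """Count how often each unordered pair of distinct words shares a document."""
    counts = Counter()
    for text in documents:
        # Sort the unique words so each pair is always stored in the same order.
        words = sorted(set(text.lower().split()))
        for pair in combinations(words, 2):
            counts[pair] += 1
    return counts


# Toy corpus (hypothetical): "bank" appears in two unrelated senses.
docs = [
    "bank river water",
    "bank money loan",
    "river water fish",
]
counts = cooccurrence_counts(docs)
strongest_pair = counts.most_common(1)[0]
print(strongest_pair)
```

The counts say that ‘river’ and ‘water’ travel together, but nothing about what any document is about, and ‘bank’ ends up linked to both rivers and loans. That is one way to read the speaker's distinction between collocation and aboutness.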

Jorn’s Ontology

From: Jorn Barger (jorn@enteract.com)
Subject: Semantics and NLP
Newsgroups: comp.ai.nat-lang
Date: 2001-11-05 06:10:02 PST

Judging from the replies I got to my recent ‘newsdiff’ posting, it
appears university programs in AI are (still) doing an extremely bad job
of covering *semantics*…

An unabridged dictionary may have half a million entries. Roget tried
to sort these into 1000 categories, in a hierarchical tree:

1.abstract relations
    1.existence
    2.relation
    3.quantity
    4.order
    5.number
    6.time
    7.change
    8.causation

2.space
    1.generally
    2.dimensions
    3.form
    4.motion

3.matter
    1.generally
    2.inorganic
    3.organic

4.intellect
    1.formation of ideas
    2.communication of ideas

5.volition
    1.individual
    2.intersocial

6.affections
    1.generally
    2.personal
    3.sympathetic
    4.moral
    5.religious
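
As a rough sketch of how a program might hold this tree, the six classes and their sections can be written as a nested mapping. The names `ROGET`, `numbered_paths` and `classes_containing` are invented for this illustration; only the class and section labels come from the outline above.

```python
# Roget's six classes and their sections, transcribed from the outline above.
ROGET = {
    "abstract relations": ["existence", "relation", "quantity", "order",
                           "number", "time", "change", "causation"],
    "space": ["generally", "dimensions", "form", "motion"],
    "matter": ["generally", "inorganic", "organic"],
    "intellect": ["formation of ideas", "communication of ideas"],
    "volition": ["individual", "intersocial"],
    "affections": ["generally", "personal", "sympathetic", "moral", "religious"],
}


def numbered_paths(tree):
    """Yield a label such as '1.6 abstract relations / time' for every section."""
    for i, (cls, sections) in enumerate(tree.items(), start=1):
        for j, section in enumerate(sections, start=1):
            yield f"{i}.{j} {cls} / {section}"


def classes_containing(tree, section):
    """Return every class that has a section with the given name."""
    return [cls for cls, sections in tree.items() if section in sections]


labels = list(numbered_paths(ROGET))
print(len(labels), labels[5])
print(classes_containing(ROGET, "generally"))
```

Section names such as ‘generally’ recur under several classes, so a node is only identified by its full path from the root, not by its label alone.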

Semantic AI has to wrestle with ontologies like Roget’s, using a
precisely-defined, *limited* vocabulary to express the very-wide range
of realworld concepts.

And I think so long as one stays in the top three categories of Roget’s
tree (abstract relations, space, matter) progress can be– and is
being– made.

But when human psychology enters the picture (intellect, volition,
affections) precise definitions and limited vocabularies instantly and
catastrophically *fail*.

And this barrier has apparently traumatized the field of AI so severely
that the topic is practically taboo! It’s just excluded from
discussion.

*If* a neat ontology-of-psychology could be created, all the other
problems of NLP might just evaporate. Understanding a sentence would
just require finding the node in the ontology that expresses the exact
same meaning (in a generalised form).

I’ve been arguing that this ontology _can_ be built if we think of the
nodes not as dictionary-entries, but as the whole ‘usual stories’
surrounding a concept or word.

So Roget’s ‘intellect’ node would correspond to the whole usual-story of
human intellect: children are born with limited intellect, they learn,
some learn faster, some learn more, humans use intellect to solve
problems, they teach others, etc. (This could even be encoded in the
form of a short, human-readable encyclopedia article, written in simplified
English.)

Particular concepts like ‘smarter’ can then be linked to specific parts
of this general story as ‘specialisations’. (The usual ‘smarter’ story
involves getting praised in school, becoming annoyingly self-important,
going to college, etc.)

But remarkably, the mental skill required to think in terms of these
usual-stories is much closer to the novelist’s art than to anything
taught in comp-sci– to such a great extent that comp.ai.nat-lang barely
acknowledges the topic as appropriate!

But I think I’ve found a leverage point, finally: pseudo-XML tagging of
the entries in Web *timelines*.

Because the authors of timelines are trying to limit themselves to the
most significant discrete events (in all of history), timelines do an
excellent job of prioritising human behaviors, and so of identifying the
most-useful limited vocabulary for human history.

Examples:

person1 is born at place on date to mother person2 and father person3
person1 is educated at place by person2
person moves from place1 to place2
person creates creative-work
person founds social-institution
person joins social-institution
person discovers theory
person1 fights person2
person leads group with persons2-3-etc
group fights group
etc

These, then, become the ‘root-level’ usual-stories in the psychological
ontology.
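
A minimal sketch of what such pseudo-XML tagging might look like, assuming each template is reduced to a verb plus an ordered list of role names. The `TEMPLATES` table, the `tag_event` function and the `<event>` element are hypothetical choices made for this illustration, not a format defined in the post; the role names are taken from the examples above.

```python
from xml.sax.saxutils import escape, quoteattr

# Hypothetical template table: verb -> ordered role names from the examples above.
TEMPLATES = {
    "moves": ("person", "place1", "place2"),
    "creates": ("person", "creative-work"),
    "founds": ("person", "social-institution"),
    "joins": ("person", "social-institution"),
    "discovers": ("person", "theory"),
}


def tag_event(date, verb, *fillers):
    """Wrap the fillers of one timeline entry in tags named after their roles."""
    roles = TEMPLATES[verb]  # raises KeyError for a verb with no template
    if len(fillers) != len(roles):
        raise ValueError(f"'{verb}' expects {len(roles)} fillers, got {len(fillers)}")
    body = "".join(
        f"<{role}>{escape(value)}</{role}>" for role, value in zip(roles, fillers)
    )
    return f"<event date={quoteattr(date)} verb={quoteattr(verb)}>{body}</event>"


# Illustrative timeline entry.
tagged = tag_event("1922", "creates", "James Joyce", "Ulysses")
print(tagged)
```

The fixed role list per verb is the ‘limited vocabulary’ at work: an entry that does not fit any template is rejected rather than tagged loosely.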

And my ‘newsdiff’ suggestion was about the need for an NLP system to have a
mega-timeline of history that’s continually updated with new
news-events, winnowing just the new developments from the standard
high-redundancy journalistic style.
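
The winnowing step can be sketched in a few lines, under one large assumption: that each news story has already been reduced to discrete event tuples. That reduction is the hard NLP problem and is not shown here. The function name `newsdiff` borrows the term from the post, but the tuple layout and sample events are hypothetical.

```python
def newsdiff(timeline, story_events):
    """Return the story's events that the timeline does not already hold, in order."""
    seen = set(timeline)
    fresh = []
    for event in story_events:
        if event not in seen:
            seen.add(event)  # also drops repeats within the same story
            fresh.append(event)
    return fresh


# Hypothetical event tuples: (date, verb, filler, filler, ...).
timeline = [("1922", "creates", "James Joyce", "Ulysses")]
story = [
    ("1922", "creates", "James Joyce", "Ulysses"),         # background, already known
    ("1939", "creates", "James Joyce", "Finnegans Wake"),  # the actual news
    ("1939", "creates", "James Joyce", "Finnegans Wake"),  # restated later in the story
]
fresh = newsdiff(timeline, story)
timeline.extend(fresh)
print(fresh)
```

Only exact matches count as ‘already known’ in this sketch; deciding that two differently worded reports describe the same event is left open.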

ai faq: http://www.robotwisdom.com/ai/
timelines project: http://www.robotwisdom.com/science/history.html

http://www.robotwisdom.com/ “Relentlessly intelligent
yet playful, polymathic in scope of interests, minimalist
but user-friendly design.” –Milwaukee Journal-Sentinel

3 Dimensions Fundamental

Quote: Martin Rees in Just Six Numbers – The 3 spatial dimensions are a fundamental feature of our universe; with any other number, it and we could not exist. E.g. the alimentary canal of a higher animal in 2D would split it in two unless it ingested and excreted through the same orifice – a messy prospect. Time is another dimension, but is distinctly different from the others, due to its irreversible directionality.
Ref: ISBN 0-297-84297-8 (9 780297 842972) Weidenfeld & Nicolson (page 3)

The Challenge – A Preamble

The preamble needs to cover:

A Challenge – Modelling knowledge bumps up against age-old philosophical issues almost before you open the box. Is there a fundamental truth view of the world, and if there is, will we find and agree what it is any time soon? “Well maybe, but not likely” seems to be a pretty reasonable working assumption. That is hardly a very tight scientific argument, but the challenge is there for any who wish to disagree.

A Warning – So it’s important to bear in mind that we’re not actually looking for a “Grand Unifying Theory of Everything”, even if our practical objective is to achieve something that could be applied to the ubiquitous and generic domain of the entire world wide web, all who may interact with it, and the entire body of human knowledge and artificial intelligence that it may represent. I have been concerned with this subject since a moment I can pinpoint very precisely some 21 years ago, in 1980, when I was first struck by an important hidden ambiguity in a pretty insignificant business form. For the last 5 years in particular I have been attempting to standardise and implement a generic extended enterprise model in what is broadly the engineering industry. In that time, I have seen countless individuals, myself included, get drawn towards that fatal attractor at the moment they discover they have a very generic and flexible model on their hands.

The Trick – It is important to remember that when developing your ontology, even when this is a framework, meta-model (or meta-meta-model, or a language etc.) with which to develop an ontology, you are making a choice – deeming which entities may exist. The choice is based on some world view – which may of course be some rationalisation of several other world views – and others with different perspectives and different practical application domains will hold or choose to hold different world views. So a widely applicable generic ontology may come tantalisingly close to being a model to which all others can be deterministically mapped, but there will always be other models to which only incomplete or imperfect mappings will be possible.

The Ologies – Having said that philosophy and metaphysics cannot hold the one true answer to this problem domain, it is of course necessary to appreciate and map between the different world views which lead to different models or modelling frameworks, and to understand the various limitations and compromises of each view.