Inductive Learning

(An essay from Cybernetic Ruminations)

May, 1995

Brandyn Webb / brandyn@sifter.org

This is a letter I composed for an Inductive Learning mailing list.

>As a first question we propose to discuss the very definition of the inductive learning process:

Perhaps you had a narrower focus in mind, but I associate "inductive learning" with the more general, common definition of induction:

"Inference of a generalized conclusion from particular instances"

I.e., I do not believe induction applies only to class differentiation. For instance, learning causal relationships is certainly a form of induction, and while these relationships might be _used_ in class determinations, the relationships themselves constitute induced knowledge all on their own.

Further, from a teleological perspective, induction is valuable as a tool for creating _expectation_ given observation. Your categorical assignments, for instance, are a cognitive tool for simplifying the process of expectation: if we can pigeonhole a given perceptual object into a known category, we can expect this object to exhibit a set of known _properties_ which will in turn predict its _interactions_ with other objects. I.e., categorical identity is not so much a statement about the constitution of an object as it is about the potential _participation_ of that object. From this perspective, it is much easier to see how "classification" is not a boolean process. It seems a bit confounding to ask "to what degree is this object a table?" But it makes more sense, and is more useful, to ask "how well will this object hold my cup of coffee?" Similarly, it becomes apparent that "object" boundaries are never objective, but are subject to the properties under consideration. Imagine a table and chair pair that share two legs. Where does the table end and the chair begin? They don't. The table is delimited by its properties, not by its physical construction.

So, induction could be seen as: creating from past observations an expectation of the future (including the expected consequences of actions we might take). Then "inductive learning" could be seen as: building a model, based on past observations, with which induction can be carried out efficiently.

Presumably, such a model should facilitate the identification of _property sets_ (synonymous with "objects") within the environment, and should contain knowledge about the causal relationships between such property sets. Returning to the table/chair example, the "table" is a _conceptual_ object defined by a set of properties -- the ability to hold my books and coffee cup stationary and at an acceptable height. Likewise, the "chair" has a similar though unique set of properties. Thus, I have identified two "objects" in my environment. Further, I know from experience (I have induced) that when a table and chair occur within a certain proximity of each other, the two combined have the new, emergent property that I can sit at the chair and use the table simultaneously. While we normally wouldn't consider this a new "category" of object, it is! The table-and-chair has properties unique to the _combination_ of the two, which allows this new joint object to _participate_ in new ways. For instance, recognition of this table-and-chair object leads me to expect that people might eat dinner here, as opposed to at the lone chair in the den, or at the table in the work room.

So, inductive learning is not simply about learning class distinctions, but about _finding_ the classes and the causal relationships between them (keeping in mind that a "class" is really just a declaration of a set of properties). Classes can be discovered explicitly, by observing the common co-occurrence of a set of constituent objects (I have seen many objects with legs and flat tops at a certain height), and implicitly, by observing a common mode of participation (I have seen many objects used to hold plates and coffee cups while people are eating). The latter is inherently the hierarchical parent of the former, in the sense that a class is ultimately defined implicitly by properties of interaction (by participation), while the explicit grouping of constituents may take a number of different forms. That is, there may be a number of different ways to elicit the same set of emergent properties. E.g., a well-placed force-field may someday constitute a table, and although it shares no physical constituents with a table of today, it nonetheless _interacts_ with other objects as per my conception of table. Therefore, I may expect it to participate in the same higher-level joint objects -- if I were transported into the future, and witnessed such a table for the first time, I could induce from my existing experience that with a well-placed chair, people may eat at such a table. As another example of the same principles, consider hand-printed letters. Some people write capital "a" as "A", others simply write an oversized "a". The two look very different -- they are explicitly dissimilar. And yet we form a single concept for them (most people would not notice which form were used) since they _participate_ in exactly the same manner. (Note, therefore, that contextual participation can be used as a local learning _supervisor_.)

Further, once classes and causal relationships begin to be discovered, there is the question of _how much_ we can infer from what we've seen. Take, for example, the biased-coin problem:

I pull a single coin C from a bag of randomly biased coins (each coin has an equal likelihood of having any bias from always tails to always heads). I flip it for you N times, and you witness H heads and T tails. What can you induce about this coin? Given a uniform distribution for its bias as an a priori assumption, you can compute the likelihood of each possible bias given your new observations, H and T. Further, you can integrate this into an expectation: P(H'|H&T) = (H+1)/(H+T+2). That is, there _is_ an _objective_ best guess for the bias of the coin given any number of observations. (Although, there is some question as to how objective the a priori distribution assumptions can be...) Or, more correctly stated, there is an objective expectation of the future behavior of the coin (regardless of trying to nail down the actual bias).
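The closed-form expectation above can be checked numerically. Here is a minimal Monte Carlo sketch (in Python, which the original letter does not use): it rejection-samples the posterior over coin biases under the uniform prior -- draw a coin, simulate the flips, keep only coins that reproduce the observed head count -- and averages the surviving biases. The result should agree with (H+1)/(H+T+2).

```python
import random

random.seed(0)  # make the demonstration repeatable

def expected_next_head(H, T, trials=200_000):
    """Posterior expectation of heads via rejection sampling.

    Draw candidate coins with bias uniform on [0, 1] (the a priori
    assumption above), keep only those that reproduce the observed
    count of H heads in H + T flips, and average the kept biases.
    """
    total, kept = 0.0, 0
    for _ in range(trials):
        p = random.random()                            # uniform prior on bias
        heads = sum(random.random() < p for _ in range(H + T))
        if heads == H:                                 # consistent with observation
            total += p
            kept += 1
    return total / kept

est = expected_next_head(7, 3)
# Closed form from the letter: (H+1)/(H+T+2) = 8/12, about 0.667
```

Rejecting on the head *count* rather than the exact sequence gives the same posterior, since the binomial coefficient is a constant that cancels in the normalization.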

Generalizing this observation to static object classification: given a finite amount of experience, and a finite amount of current observation, we can at best assign a _likelihood_ that a given joint object collection in fact produces a given emergent property set. I.e., just because it looks like a table does not mean it absolutely must be a table. There may be any number of things that we don't know about it which may prevent it from functioning as, and therefore from being, a table. Or, to take the character-recognition example: a given small vertical stroke may be an "i" or an "l". Viewed alone, the most we can say is that it might be one or the other (with some respective likelihoods based on experience).
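To make those "respective likelihoods" concrete, here is a tiny Bayesian sketch (all of the numbers are hypothetical, purely for illustration): the posterior belief about which letter produced the stroke combines how often each letter yields such a stroke with how often each letter occurs at all.

```python
# Hypothetical figures, standing in for accumulated experience:
likelihood = {"i": 0.30, "l": 0.20}   # P(stroke | letter)
prior      = {"i": 0.07, "l": 0.04}   # P(letter)

# Bayes' rule: P(letter | stroke) is proportional to
# P(stroke | letter) * P(letter), normalized over the candidates.
unnorm = {c: likelihood[c] * prior[c] for c in likelihood}
z = sum(unnorm.values())
posterior = {c: v / z for c, v in unnorm.items()}
```

Neither letter is ruled out; the stroke viewed alone only shifts the odds, and surrounding context (participation in a word) would supply further evidence.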

So, if we assume these ideas are reasonable for the moment, inductive learning should, qualitatively, infer from observation the hierarchy of emergent property sets (classes) inherent in the universe, and, quantitatively, maintain a notion of reasonable expectation given a finite amount of evidence.

-Brandyn (brandyn@sifter.org)

p.s., There may be some argument as to whether there is an objective basis for classes, or whether it is exclusively a cognitive tool. Consider the table and chair example. Of all possible organizations of those two objects, only a very tiny fraction of them have the table-and-chair properties (that you can sit and eat at the table, for one; if the chair is in the bathtub, and the table out on the lawn, they do not constitute a joint table-and-chair object). This inherent non-linearity of emergent properties, combined with the fact that these properties themselves become the potential constituents of higher-level groupings, creates a natural, objective hierarchy of "objects" in the universe, independent of any observers.

