Future directions:
I want to work with artificial learners that figure out what a human teacher would like the learner to do. Learning from limited teachers, or from the types of social signals that a teacher might produce naturally during interactions, would both be interesting topics. One straightforward way of examining limited teachers would be to control which parts of a situation are visible to the teacher. It would then be possible to work with human teachers who are limited in known ways, and to make comparisons with non-limited teachers. A more social version would be to learn from a teacher who does many things at once, and to have the learner estimate how much attention the teacher is paying to it. A straightforward way of doing this would be to have a teacher interact with more than one learner at the same time, with each learner estimating what the teacher is paying attention to. This would also make it possible to compare the estimates of different learners at a given timestep. A learner could respond by, for example: trusting an evaluation less if the teacher has not been paying attention, postponing the execution of some important step until the teacher is paying attention, or trying to draw attention to itself when doing something important (a minimal sketch of the first of these responses is given below). This scenario could also be entangled with social interaction issues and social signal processing, for example establishing joint attention towards an object, or estimating what a teacher is looking at.
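As a rough illustration of the first of these responses, here is a minimal sketch in Python, assuming a scalar evaluation per action and a hypothetical attention estimate in [0, 1] (e.g. from a gaze detector). All names are illustrative placeholders, not an existing system.

```python
from dataclasses import dataclass, field


@dataclass
class AttentionWeightedLearner:
    """Tabular learner that trusts a teacher evaluation less when the
    teacher's estimated attention was low during the evaluated action."""
    step_size: float = 0.1
    values: dict = field(default_factory=dict)  # action -> estimated evaluation

    def update(self, action: str, evaluation: float, attention: float) -> None:
        # Scale the learning rate by estimated attention: an evaluation
        # given while the teacher was looking elsewhere moves the value
        # estimate less than one given under full attention.
        v = self.values.get(action, 0.0)
        self.values[action] = v + self.step_size * attention * (evaluation - v)


learner = AttentionWeightedLearner()
learner.update("stack_block", evaluation=1.0, attention=0.9)   # attentive praise
learner.update("stack_block", evaluation=-1.0, attention=0.1)  # distracted criticism
print(learner.values)  # the attentive praise dominates the estimate
```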
Human-robot interactions where a robot learns from the types of signals that humans naturally produce during interaction would also be interesting, especially working on foundational interpretation questions of the type: "should a smile be interpreted as indicating approval of an action choice, as something that should be maximised, or as something more complex?".
Based on the realisation that maximising positive minus negative evaluative button presses, made by a human teacher observing an artificial learner, would be a misinterpretation of what human teachers mean when they press evaluation buttons (see the past research section), we can, for example, question the implicit interpretation assumptions of an algorithm that maximises smiles minus frowns. It is, however, difficult to say to what degree we can transfer what we know about the interpretation of evaluative buttons to things such as smiles, nods, concerned surprise, frowns, fear, or disgust. It seems likely that some social signals will fit reasonably well with evaluative button pushing behavior, but it is difficult to say much with any certainty without experimental results.
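To make the contrast concrete, here is a minimal sketch of the two interpretations, assuming a hypothetical smile/frown detector; neither function is a tested design, they just make the competing assumptions explicit.

```python
# Interpretation 1: smiles are reward. The learner would try to maximise
# the running sum of (smiles - frowns); the argument above questions
# whether this matches what teachers actually mean.
def reward_interpretation(n_smiles: int, n_frowns: int) -> float:
    return float(n_smiles - n_frowns)


# Interpretation 2: a smile/frown is evidence that the most recent action
# choice was good/bad. The signal adjusts a preference over actions rather
# than being a quantity to accumulate.
def feedback_interpretation(prefs: dict, last_action: str, smiled: bool) -> None:
    prefs[last_action] = prefs.get(last_action, 0.0) + (1.0 if smiled else -1.0)
```

Under the first interpretation the learner is incentivised to seek out smile-producing situations regardless of task progress; under the second, a smile only informs the choice between actions.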
I think there are many interesting questions of the type: "what assumptions can be made regarding smiles, frowns and other facial expressions, what behaviors do those assumptions predict, and how do we design experimental setups to compare different sets of assumptions/interpretations?". I see the question of how to interpret facial expressions as strongly intertwined with, but not identical to, the question of how to detect facial expressions.
One straightforward way of getting started on this would be to record the social signals of a human teacher who is watching a learner perform a series of predetermined actions in specifically designed tasks. A task would be designed to be easy to understand and explain, and such that performance is easy to evaluate. The series of learner actions would likewise be designed so that each action is easy to evaluate. A human teacher could observe these actions and, for example, give spoken comments or press an evaluative button. This would give us approximately known teacher evaluations, because at each timestep we know which action the teacher is evaluating, and each evaluation would be coupled with, for example, facial expressions or tone-of-voice readings. We would then have a sort of reverse situation compared with trying to infer meaning from known signals (such as button pushes): we now know the meaning (at least approximately) and are trying to analyse the signal. Such a study could tell us what a human teacher looks like/sounds like/etc. when the learner is, for example: failing completely, succeeding comfortably, barely avoiding a disaster, managing to limit the damage of a self-generated problem, or being too passive and missing some bonus opportunity. A classifier built from this data could be tested by using it to learn some new task. If it fails, it would be hard to know which part is wrong, but if it succeeds in learning the task, then both the classifier and the underlying interpretation of the social signal should be reasonably accurate. Once a task and a set of learner actions have been created for one experiment, the same setup could be reused in multiple experiments, and with several different types of social signals. A sketch of this data-collection step is given below.
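A minimal sketch of the data-collection step, assuming the scripted action sequence and its ground-truth labels are designed in advance; the action names, label vocabulary, and recording stub are all hypothetical placeholders.

```python
import csv

# Scripted, pre-evaluated learner actions: (timestep, action, designed label).
SCRIPTED_ACTIONS = [
    (0, "complete_failure", "fail"),
    (1, "comfortable_success", "succeed"),
    (2, "near_disaster_recovery", "barely_avoid_disaster"),
]


def record_teacher_response(timestep: int) -> dict:
    """Placeholder for real sensor capture (camera, microphone, evaluative
    button). Returns a dummy feature dict so the sketch runs end to end."""
    return {"smile_intensity": 0.0, "voice_pitch": 0.0, "button_press": 0}


def collect_dataset(path: str) -> None:
    # Each row pairs an approximately known evaluation (we know which action
    # the teacher saw at each timestep) with the recorded social signals;
    # these rows are the training data for a signal-to-meaning classifier.
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["timestep", "action", "label", "signals"])
        for t, action, label in SCRIPTED_ACTIONS:
            writer.writerow([t, action, label, record_teacher_response(t)])


collect_dataset("teacher_signals.csv")
```

A classifier trained on such rows could then be plugged in as the evaluation channel for a new task, which is the test described above.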
I would like to work on the question: "what do various social signals actually mean?" for basically the same reasons that I worked on the question: "what do evaluative buttons actually mean when pushed by a human teacher?" (see the past research section).
Interpretation questions could be asked for a range of different types of inputs (facial expressions, eye gaze, tone of voice, body posture, etc.), and interpretations do not necessarily need to be restricted to approval/disapproval-related meanings. In general, it seems like there are many possible specific research questions that one could start with on the "what do social signals actually mean?" front. An even longer-term question would be: "can we find a subset of social signals that can be reliably detected, that humans produce naturally while interacting with an artificial learner, and that have known interpretations which are useful for learning?". This seems like a distant goal, but I like experiments that can be seen as laying the foundation for something more ambitious.