Can machines read your emotions? - Kostas Karpouzis
To train computers to recognize emotions, we employ machine learning: scientists attempt to mimic the human learning process by presenting computers with multiple instances of a particular pattern (in this case, expressive faces or clips of expressive speech) and adjusting complex equations so that they approximate the expected output for each pattern. Given the low computing power available in the 1990s, early approaches used scaled-down, cropped mugshots; later work moved towards more natural instances, where algorithms identify expressive facial features and track how they are deformed, e.g. when smiling. In the case of expressive speech, one common approach is to extract a representation of the pitch of the person’s voice and calculate simple or complex statistical features from it.
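To make the speech case concrete, here is a minimal sketch of the "simple statistical features" idea. It assumes the pitch has already been extracted as one value per audio frame, in Hz, with 0 marking unvoiced frames; that convention, the function name `pitch_features`, and the sample contour are all hypothetical choices for illustration, not part of any particular system.

```python
import statistics


def pitch_features(pitch_hz):
    """Summarize a pitch contour (one value per frame, in Hz) with simple statistics.

    Assumption of this sketch: a value of 0 marks an unvoiced frame and is skipped.
    """
    voiced = [p for p in pitch_hz if p > 0]
    if len(voiced) < 2:
        raise ValueError("need at least two voiced frames")
    return {
        "mean": statistics.mean(voiced),
        "stdev": statistics.stdev(voiced),  # sample standard deviation
        "min": min(voiced),
        "max": max(voiced),
        "range": max(voiced) - min(voiced),
    }


# Hypothetical contour: rising pitch with two unvoiced frames.
contour = [0, 180, 200, 220, 0, 240, 260]
features = pitch_features(contour)
print(features)
```

A feature dictionary like this one (possibly with many more entries) is what a learning algorithm would then receive as the description of a speech clip.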
This brings us to the question of digital representation: recognizing particular facial expressions can sometimes be challenging for humans, let alone teaching machines how to classify them. Paul Ekman, a prominent U.S. psychologist, developed a theory about “universal facial expressions”, in which specific facial manifestations of six emotions are deemed recognizable by people across cultures and ages. The initial list included anger, disgust, fear, happiness, sadness and surprise; these original six basic emotions are the ones most widely used by computer scientists to classify affective behavior. This theory lends itself well to classification, the problem of identifying which of a set of categories a particular instance belongs to: given a set of images, videos or speech clips, a classification algorithm identifies the most likely class for each of them. Here, the more labeled examples we use when training, the better the chance that an image of a smiling person that was not in the training set will be correctly classified as ‘happiness’. Among the most common machine learning algorithms used here are neural networks. Neural networks consist of artificial neurons, which exchange information and form connections with each other. The connections between these artificial neurons have numeric weights, which are adjusted during training based on the class that each sample belongs to. In this way the network adapts to and learns new information, much like our brain does when learning.
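The weight-adjustment idea can be sketched in a few lines of Python. What follows is the simplest possible network, a single layer of artificial neurons with a softmax output, not the architecture of any real emotion recognizer. The two input features (how far the mouth corners lift and how far the brows rise), the toy training values, and the restriction to three of the six emotions are all made up for illustration.

```python
import math
import random

EMOTIONS = ["happiness", "sadness", "surprise"]  # subset of the six, for brevity


def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predict_proba(weights, biases, x):
    """One output neuron per class: weighted sum of the inputs plus a bias."""
    scores = [sum(w * xi for w, xi in zip(row, x)) + b
              for row, b in zip(weights, biases)]
    return softmax(scores)


def train(samples, labels, n_classes, epochs=200, lr=0.5, seed=0):
    """Adjust the connection weights sample by sample (stochastic gradient descent)."""
    rng = random.Random(seed)
    n_features = len(samples[0])
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_features)]
               for _ in range(n_classes)]
    biases = [0.0] * n_classes
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            probs = predict_proba(weights, biases, x)
            for c in range(n_classes):
                # Error is positive if class c was over-predicted, negative if under-predicted.
                error = probs[c] - (1.0 if c == y else 0.0)
                for j in range(n_features):
                    weights[c][j] -= lr * error * x[j]
                biases[c] -= lr * error
    return weights, biases


def classify(weights, biases, x):
    probs = predict_proba(weights, biases, x)
    return EMOTIONS[probs.index(max(probs))]


# Hypothetical labeled examples: (mouth_corner_lift, brow_raise), each in [-1, 1].
samples = [(0.9, 0.1), (0.8, 0.0), (0.7, 0.2),        # happiness
           (-0.8, -0.2), (-0.9, 0.0), (-0.7, -0.1),   # sadness
           (0.0, 0.9), (0.1, 0.8), (-0.1, 1.0)]       # surprise
labels = [0, 0, 0, 1, 1, 1, 2, 2, 2]

weights, biases = train(samples, labels, n_classes=len(EMOTIONS))

# A face that was not in the training set.
print(classify(weights, biases, (0.85, 0.05)))
```

Real systems use many more features, many more examples and additional layers of neurons, but the loop is the same in spirit: compare the network's guess with the known label and nudge the weights to shrink the difference.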
If you think such data is difficult to collect, think about all the selfies circulating on the internet, how much we talk on our mobile phones, or the number of status updates we post every day. So the big question is not how to collect the data needed to train our machine learning algorithms, but what we’re going to do with it. Would it be something beneficial, or would it resemble a dystopian scenario like the ones in Hollywood movies? And, in real life, what’s stopping a computer from ‘sacrificing’ humans in order to perform the task it has been assigned? And how is a self-driving car going to decide whether to stop abruptly when someone jaywalks, risking a crash with the car right behind it? These are all questions that we’re going to have to face sooner rather than later, and they have to do with the amount of power that we’re willing to give to machines in exchange for a more comfortable life.