Isaac Asimov was a prolific writer known for his works in science fiction and popular science. Among AI researchers, he is best known for the Three Laws of Robotics, which he introduced in 1942. The Three Laws are: (1) A robot may not injure a human being or, through inaction, allow a human being to come to harm; (2) A robot must obey the orders given it by human beings except where such orders would conflict with the First Law; and (3) A robot must protect its own existence as long as such protection does not conflict with the First or Second Laws. Essentially, a robot must behave so that it never harms people.
It was science fiction then. Now, a group of researchers from the University of Massachusetts (UMass) and Stanford University has come up with a way to design machine learning algorithms that let the user specify behavior the algorithm must avoid. The work was published in Science this week.
“We call algorithms created with our new framework ‘Seldonian’ after Asimov’s character Hari Seldon,” explains lead author Phillip Thomas of UMass. “If I use a Seldonian algorithm for diabetes treatment, I can say to the machine, ‘while you’re trying to improve the controller in the insulin pump, don’t make changes that would increase the frequency of hypoglycemia.’”
The safety constraints are introduced to the algorithm by way of probabilistic reasoning, rather than rules that require certainty. The new framework has been tested on predicting the grade point averages of 43,000 students in Brazil, where it successfully avoided several types of undesirable gender bias.
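To make the idea of a probabilistic safety constraint concrete, here is a minimal sketch in Python. It is not the authors' implementation. It assumes the constraint can be written as "the average of some measured quantity g must be at most zero" and uses Hoeffding's inequality, one standard concentration bound, to check that this holds with probability at least 1 - delta. The function names, the tolerance of 0.1, and the sample data are all hypothetical and chosen only for illustration.

```python
import math
from typing import Optional, Sequence, TypeVar

T = TypeVar("T")


def hoeffding_upper_bound(samples: Sequence[float], delta: float,
                          value_range: float) -> float:
    """Upper bound on the true mean that holds with probability >= 1 - delta.

    Assumes the samples are independent and each lies in an interval of
    width `value_range` (the requirement of Hoeffding's inequality).
    """
    n = len(samples)
    if n == 0:
        raise ValueError("need at least one sample")
    mean = sum(samples) / n
    return mean + value_range * math.sqrt(math.log(1.0 / delta) / (2.0 * n))


def return_if_safe(candidate: T, g_samples: Sequence[float], delta: float,
                   value_range: float) -> Optional[T]:
    """Return the candidate only if the constraint 'mean of g <= 0' is
    satisfied with high confidence; otherwise return None."""
    if hoeffding_upper_bound(g_samples, delta, value_range) <= 0.0:
        return candidate
    return None


if __name__ == "__main__":
    # Hypothetical data: each sample is (observed harm in [0, 1]) - tolerance,
    # so a negative mean indicates harm below the tolerated level.
    tolerance = 0.1
    many_samples = [0.0 - tolerance] * 1000
    few_samples = [0.0 - tolerance] * 10
    print(return_if_safe("new controller", many_samples, 0.05, 1.0))
    print(return_if_safe("new controller", few_samples, 0.05, 1.0))
```

With 1,000 samples the bound falls below zero and the candidate is returned. With only 10 samples the same average is not enough evidence, so the function returns None instead of risking an unsafe change. This is the sense in which the constraint is probabilistic: the check never claims certainty, only that the chance of violating the constraint is at most delta under the stated assumptions.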