AI systems have huge potential for good, but they’re only as good as the data they’re trained with.
As machine learning expands into all areas of our lives, finding uses in healthcare, autonomous driving and law enforcement (to name just a few), bias isn’t just inconvenient: it could mean a system does more harm than good.
Biased data, biased results
The problem of AI bias isn’t just theoretical. After personal experience with systems recognizing her lighter-skinned colleagues’ faces more readily than her own, MIT researcher Joy Buolamwini began a project to find out whether the software had trouble with her particular features, or whether there was a wider issue.
Buolamwini tested systems from IBM, Microsoft and Chinese company Face++, showing them 1,000 faces and asking them to identify the subjects as either male or female. She found that all the systems were significantly better at identifying male faces than female ones, and performed better on lighter faces than darker faces.
As facial recognition becomes more integrated into daily life (for security, medical diagnostic applications and finding missing people, for example), darker-skinned people and women are at a real disadvantage.
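An audit of this kind comes down to comparing accuracy across subgroups rather than looking at a single overall score. The sketch below is only an illustration of that idea, not Buolamwini's actual method: the function name, the subgroup labels and the four sample records are all made up for the example.

```python
from collections import defaultdict

def accuracy_by_group(records):
    """Return {group: fraction of correct predictions} for each subgroup.

    records is an iterable of (group, true_label, predicted_label) tuples.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for group, true_label, predicted_label in records:
        total[group] += 1
        if true_label == predicted_label:
            correct[group] += 1
    return {group: correct[group] / total[group] for group in total}

# Hypothetical records for illustration only, not real test results.
sample = [
    ("lighter-skinned male", "male", "male"),
    ("lighter-skinned male", "male", "male"),
    ("darker-skinned female", "female", "male"),
    ("darker-skinned female", "female", "female"),
]

rates = accuracy_by_group(sample)
# The gap between the best and worst served subgroup is what an audit looks for.
gap = max(rates.values()) - min(rates.values())
```

A system can score well on average while still showing a large gap, which is why the per-group breakdown matters.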
In response to those findings, researchers at MIT are working on an algorithm that can automatically ‘de-bias’ data sets by finding hidden biases and re-sampling the data.
In tests, the algorithm effectively made AI facial recognition less racist, reducing ‘categorical bias’ by over 60% without reducing overall accuracy.
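To give a rough sense of what re-sampling means, the sketch below draws training examples with a probability inversely proportional to how common their group is, so that under-represented groups are seen more often. It is a simplified illustration and not the MIT algorithm: it assumes the group labels are already known, whereas the researchers' approach is described as finding hidden biases automatically. All names here are hypothetical.

```python
import random
from collections import Counter

def rebalancing_weights(groups):
    """Weight each example by 1 / (size of its group)."""
    counts = Counter(groups)
    return [1.0 / counts[g] for g in groups]

def resample(items, groups, k, seed=0):
    """Draw k items with replacement, favoring under-represented groups."""
    rng = random.Random(seed)  # fixed seed so the draw is repeatable
    weights = rebalancing_weights(groups)
    return rng.choices(items, weights=weights, k=k)

# Hypothetical data set: three examples from group "a", one from group "b".
items = ["a1", "a2", "a3", "b1"]
groups = ["a", "a", "a", "b"]
balanced = resample(items, groups, k=10000)
share_b = sum(1 for item in balanced if item == "b1") / len(balanced)
```

With these weights each group carries the same total weight, so the lone "b" example should make up roughly half of the resampled set instead of a quarter.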
At Intel AI DevCon (AIDC) in Munich, companies from around the world demonstrated how they use Intel machine learning platforms to benefit society, in areas including healthcare and physics research.
Although Intel mostly supplies the platforms for machine learning rather than developing applications, it’s well aware of the ethical implications of its work and is a member of the Partnership on AI – an organization that aims to develop and share best practices on AI. Its goal is for AI to be safe, trustworthy, fair, transparent and accountable, and to serve society.
We spoke to Stephan Gillich, director of technical computing, analytics and AI, and GTM for the Intel EMEA Datacenter Group in between the presentations at AIDC.
“For the functionality of a learning algorithm, you need to have accurate representation of the data,” said Gillich. “And if you have not a broad representation of that data then your mechanism will not work. So from that end, we see that problem, and from our point of view, whatever we can do, we tell people that they should have as much unbiased data as they can have.”
“Once you have the data and everything, [there’s the question of] how can you look inside the box, which sometimes people see with an AI mechanism that is trained by data. […] How can you actually make sure that an AI algorithm performs, and have a certain level of understanding why it does what it does. So we are in general behind using AI in a way that really helps humanity, and one part of that is also the ethical part.”
AI for good
Intel is also working with partners on some projects to show just how beneficial machine learning can be when it is trained using good quality data and carefully targeted. For one of these projects, it has partnered with environmental organization Parley for the Oceans to develop a drone-based system for assessing the health of whales without disturbing them.
“[Whales] can be identified by the fluke,” explained Gillich. “It’s like a fingerprint – you can take pictures and then you don’t need to tag these whales any more. They have their own tag – you just have to recognize it.
“What they used to do is take a probe of the whale – so they had to basically shoot something into it. It doesn’t hurt it, but it’s difficult and it’s a bit disturbing for the whale as well. So they found a method to fly on top of the whale, and when the whale blows, they take samples of that, and then they analyze it using AI methods. Like for the image analysis for the flukes, you can use AI to do the image recognition.”
The resulting system – Parley SnotBot – gives real-time data on the creatures’ health, without alarming them.
Another special project involves protecting African elephants from poachers. The population of African forest elephants is currently on track to be entirely wiped out by poaching for ivory and bushmeat within the next decade, but the sheer scale of their habitat means it’s impossible for park rangers to protect them all. Serengeti National Park, for example, has just 150 rangers to patrol an area roughly the size of Belgium.
“[These are] remote areas, and you can’t have people everywhere because that would disturb the wildlife as well,” says Gillich. “It’s simply not possible, so one thing we have worked with is […] cameras that can operate for a long time on their own, but still send signals to the people who can take care of this problem.”
To help solve the problem, Intel worked with conservation organization Resolve to develop a system called TrailGuard, which uses a low-power smart camera small enough to be hidden in a bush.
When the camera is activated by motion, the AI analyzes the footage to identify animals, people and vehicles. If a person is in an area that’s normally only populated by animals, there’s a good chance that they’re a poacher, so park keepers are alerted to the presence of unexpected people and vehicles.
The amount of bandwidth necessary to send every frame captured by the TrailGuard cameras to park headquarters or a cloud server would be prohibitive, and it would be impractical for rangers to trawl through so much footage. Instead, a neural network running on the cameras analyzes each frame in real time, discards those with no activity (or just animals), and sends those with humans to the park keepers.
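The filtering step described here can be sketched in a few lines. This is an illustration under stated assumptions rather than TrailGuard's actual software: the neural network is stood in for by a list of labels already detected in each frame, and the label names, `ALERT_LABELS` and both functions are hypothetical.

```python
# Assumed label names; a real detector would define its own classes.
ALERT_LABELS = {"person", "vehicle"}

def should_transmit(detected_labels):
    """True if the frame shows something rangers need to know about."""
    return any(label in ALERT_LABELS for label in detected_labels)

def filter_frames(frames):
    """Keep only the IDs of frames worth sending over the limited link.

    frames is an iterable of (frame_id, detected_labels) pairs, where
    detected_labels is the (stubbed) output of the on-camera network.
    """
    return [frame_id for frame_id, labels in frames if should_transmit(labels)]

# Hypothetical capture: an empty frame, an animal, a person, a vehicle.
captured = [
    (1, []),
    (2, ["elephant"]),
    (3, ["elephant", "person"]),
    (4, ["vehicle"]),
]
to_send = filter_frames(captured)
```

In this example only frames 3 and 4 would leave the camera, which is the point of running the analysis on the device: the decision is made before any bandwidth is spent.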
Power to do good
The TrailGuard project has been particularly successful, with park keepers able to intervene and catch poachers before animals are harmed, making the most of their limited resources – and applications like this are some of the things Gillich finds most exciting.
"You can do things right now that enable applications that are fascinating, revolutionizing the way things are done," he said, "because now I have the capability to have a lot of data, and store a lot of data, and process a lot of data and collect a lot of data [...] and then I can process it because I have the compute power.
"So that enables the capability that not only programmers are doing functional programming, but the data is actually doing some of the programming, and this is what we call artificial intelligence. So this is the fascinating part, that we can actually do this now […] and I can do things that help humanity in a much more efficient way than I could do before."