In recent years, the research and application of smart algorithms through machine learning and deep learning have multiplied. Artificial narrow intelligence (ANI) has achieved incredible results in many areas through natural language processing and computer vision. For example, British AI company DeepMind managed to tackle one of biology’s grand challenges, the protein-folding problem. Its neural net, known as AlphaFold, was able to predict the 3D structures of proteins based on their amino acid sequences with unprecedented accuracy.
However, parallel to the successes, many questions, problems and risks surrounding AI have surfaced, including data bias, algorithmic failures and “black boxes” that no one understands. With them, many ethical questions arise. In the list below, we have tried to summarize the most pressing issues in AI ethics.
1. Is it ethical to treat data labelers the way we do?
In the most widespread type of machine learning, algorithms are taught using a dataset that already has specific parameters that the software must “memorize”. For an autonomous car to recognize pedestrians and stop signs, it’s typically fed thousands or millions of photos, all hand-labeled. To nail a conversation, a digital assistant needs to be told over and over when it failed. If we want to teach an algorithm to recognize tumors on CAT scans, radiologists must provide bioinformaticians with medical images on which it is clearly indicated whether a lesion is a malignant tumor, a benign tumor, or just a harmless shadow to be ignored. This work is repetitive and monotonous, and it requires sifting through thousands of similar images for hours. It is carried out by hundreds of data annotators, who in medical imaging are usually medical students or residents.
This work usually goes unnoticed and unrecognized, and it contradicts the common claim about why algorithms are beneficial for workers. It is often said that AI helps reduce the amount of monotonous daily work. However, as the practice of using data labelers shows, this work is not reduced but rather outsourced.
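To make the dependence on hand-labeled data concrete, here is a minimal sketch of supervised learning in Python. It is not how any real medical system works: the two numeric features, the example values and the function names `train_centroids` and `predict` are all hypothetical, and a simple nearest-centroid rule stands in for a real model. The point is only that the program can learn nothing until a human has attached a label to every example.

```python
from collections import defaultdict


def train_centroids(examples):
    """Average the feature vectors of each label (nearest-centroid 'training')."""
    sums = {}
    counts = defaultdict(int)
    for features, label in examples:
        if label not in sums:
            sums[label] = [0.0] * len(features)
        for i, value in enumerate(features):
            sums[label][i] += value
        counts[label] += 1
    return {
        label: [total / counts[label] for total in vector]
        for label, vector in sums.items()
    }


def predict(centroids, features):
    """Return the label whose average example is closest to the new input."""
    def squared_distance(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    return min(centroids, key=lambda label: squared_distance(centroids[label], features))


# Hypothetical hand-labeled data: every label below is a human annotator's work.
labeled_scans = [
    ([0.9, 0.8], "malignant"),
    ([0.8, 0.9], "malignant"),
    ([0.2, 0.1], "benign"),
    ([0.1, 0.3], "benign"),
]

centroids = train_centroids(labeled_scans)
print(predict(centroids, [0.85, 0.7]))
```

Each extra line in `labeled_scans` is one more judgment a person had to make, and real systems need thousands or millions of them.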
2. How much should data labelers be paid?
Connected to the problem outlined above, data labelers are usually not paid in proportion to their tremendous contribution to the workings of smart algorithms, and the AI industry can also contribute to the East-West divide. Many high-tech companies outsource these tasks from developed countries to developing ones. For example, IndiVillage Tech Solutions LLP employs about a hundred women and young people at its office in the Indian town of Yemmiganur to label data, although the company at least spends its profits on education and drinking water for the community.
In Serbia, Microwork pays at least $3 an hour, more than twice the local minimum wage, to 100 people in an area where jobs are scarce, and it says it aims to expand its ranks to 1,000 this year. Samasource trains and employs people in Africa, India, and Haiti. In China, similar data factories are popping up far from the biggest cities, often in relatively remote areas where both labor and office space are cheap. Many of the data factory workers are the kinds of people who once worked on assembly lines and construction sites in those big cities.
3. Is it O.K. to use data sets that are released when websites are hacked?
California-based Ambry Genetics, a clinical genomic diagnostics vendor, suffered an email hack last year, which compromised the data of 232,772 patients. An investigation revealed that a hacker had gained access to an employee email account. The compromised patient data could include names, medical information, and information related to services provided by Ambry Genetics. Some Social Security numbers were compromised as well.
Similar data breaches are frequent, and hackers can usually gain access to large datasets like this one, which covers more than 200 thousand patients. As one of the most difficult tasks for algorithm developers is finding enough adequate data, it is easily conceivable that hackers not only use individual data to try to gain unfair advantages, but also offer entire datasets to companies in dire need of data. It would be unethical to say yes to such databases, right? It’s a tougher question than it might seem at first sight. What if it’s a healthcare company that aims to treat cancer better? Would the end justify the means?
4. How to treat AI research that could be easily weaponized against populations?
Facial recognition technologies, location tracking and other types of surveillance rely heavily on the latest developments in artificial intelligence. For example, this excellent piece mentions that in June 2019, a Reddit user linked to an article titled “Facial Feature Discovery for Ethnicity Recognition,” published in Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery. The machine-learning model it described successfully predicted Chinese Uyghur, Tibetan, and Korean ethnicity based on photographs of faces. What can be done when the intentions of the technology’s end user seem crystal clear?
One potential solution could be the introduction of an updated peer-review process for such papers, asking the authors to consider the social and ethical implications of their research and requiring them to discuss means of mitigation, through other technologies or even new policies.
5. How to deal with AI research that unnecessarily draws boundaries?
Smart algorithms are bound to categorize objects and draw conclusions from data through such categories. However, humans are not easily categorized: poorly chosen datasets, categories and labels could all end up creating unnecessarily offensive algorithms. For example, an algorithm called Speech2Face is meant to determine what the face behind a voice recording might look like. Such a technology may “harden people into categories that don’t fit well”, such as gender or sexual orientation, argues Katie Shilton, an information scientist at the University of Maryland. Such research is deeply problematic from an ethical point of view.
Luckily, we have projects that call attention to such issues. Do you remember the project called ImageNet Roulette? It was launched by researcher Kate Crawford and artist Trevor Paglen, and it aimed to draw attention to the potential harm caused by facial recognition and other algorithms that categorize people. As a reaction, by last September, one of the biggest visual databases in the world, ImageNet, which was created by researchers at Stanford and Princeton Universities, had removed around 600,000 images and reviewed 438 categories, identifying them as “offensive” regardless of context. The removal took place as part of a review of whether ImageNet’s algorithm shows any bias. During the examination, it turned out that the software reproduced the power relationships of gender and race that were hidden in the data. Moreover, it made them visible and exaggerated them.
6. What to do about automated weapons research?
The production, dissemination and use of lethal autonomous weapons, which can include unmanned flying drones or submarines, cruise missiles, autonomously operated sentry guns, or battlefield robots, are hardly justifiable from a point of view that holds human life to be the highest value. For this reason, Human Rights Watch and other non-governmental organizations launched the Campaign to Stop Killer Robots in 2013, and since then, 30 countries have called for a ban on such fully autonomous weapons. Also, since 2018, United Nations Secretary-General António Guterres has repeatedly urged states to prohibit weapons systems that could, by themselves, target and attack human beings, calling them “morally repugnant and politically unacceptable.”
Some researchers object even to doing research in this area. In 2018, fifty-seven scientists from 29 countries called for a boycott of a top South Korean university because of a new center aimed at using AI to bolster national security. The AI scientists claim the university is developing autonomous weapons, or “killer robots,” whereas university officials say the goal of the research is to improve existing defense systems.
7. What to do about tools creating a fake reality?
Finally, we arrive at the highly problematic area of tools that generate fake images, fake voices, deepfake videos and even comments, the last through algorithms trained to act as bots on Twitter and similar platforms. In 2017, an anonymous Redditor with the username Deepfakes released a software tool kit that allows anyone to make synthetic videos in which a neural network substitutes one person’s face for another’s, while keeping their expressions consistent.
In a media environment that is already saturated with fake news, such technology has disturbing consequences. Imagine a world where you cannot decide whether a video of a politician speaking on your Facebook feed is real or fabricated. Imagine a world where you decide whom to vote for based on such videos. Oh wait, that’s already here. Thousands if not millions of fake images, videos and texts are generated online every day, and it is becoming nearly impossible for the handful of social media moderators and fact-checking websites to flag the fakes.
Ethicists argue that AI development and application are currently facing their Manhattan Project moment, and that governments need to regulate the research, development and use of highly powerful AI tools so that they do not alter our social systems to the disadvantage of humanity. The field should not adopt the attitude that satirist Tom Lehrer famously put in the mouth of Wernher von Braun, a founding father of the US space program: “Once the rockets are up, who cares where they come down?”