Small Data for sustainability: AI ethics and the environment

Source: iStock/gorodenkoff

New technologies have the potential to affect the environment in disruptive ways. In the emergence phase of new technologies, however, these impacts are often unclear or underestimated. 

One of the main tasks of technology ethics is to proactively identify the potential harms of emerging technologies and possible ways to minimize them. Typical values considered by technology ethics include respect for human dignity and autonomy, beneficence, avoidance of harm, fairness, justice, and privacy. 

Environmental sustainability is mainly discussed in relation to specific technologies (such as nuclear power and molecular biology) and application areas (transport, energy production, agriculture, etc.), but the relevance that ethical evaluations attach to it varies greatly across different technology fields.

In the field of AI ethics, sensitivity to environmental sustainability is still emerging. Environmental sustainability is mentioned only cursorily in the draft EU AI Act, and a global quantitative survey of existing AI guidelines published in 2019 found that sustainability is mentioned in only 14 of the 84 guidelines examined. The recent European Guidelines for Trustworthy AI and the UNESCO Recommendation on the Ethics of AI acknowledge sustainability, along with societal well-being, as one of the key requirements that should guide the production and use of AI artifacts. 

But even when sustainability is considered, AI ethics is still partly characterized by a short-term focus on the reliability of AI products rather than a long-term focus on the impact of technological innovation, especially when compared with technology fields where sustainability is an established principle.

Moreover, AI ethics focuses mainly on the immediate impact of AI applications on their users, rather than taking a broader perspective on stakeholders that also includes society at large and future generations. AI ethics could therefore benefit from a long-term, broad-based approach that is more open to stimulating reflection on sustainability issues.

By strengthening its focus on sustainability, the ethics of AI would be able to acknowledge the enormous environmental impact of AI. Rare minerals, land, and water are needed in huge quantities to build the special chips and other hardware components that are essential for AI systems and to house and cool the servers that store and process the data used to train AI systems. In addition, the development and maintenance of AI systems generate large amounts of carbon emissions, and, at the end of their life cycle, the hardware components leave behind waste that in turn requires land for storage.

Focusing on the carbon footprint of AI systems, a 2019 study from the University of Massachusetts Amherst, widely reported by MIT Technology Review, estimated that one training session of a large language model (which is only one of many steps in the development of an AI system) generates about 284 tons of CO2. This is equivalent to five times the carbon footprint of an entire car’s life cycle, including fuel consumption, and about 57 times the amount of CO2 emitted by an average person in a year.
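
The comparison can be checked with simple arithmetic. The short Python sketch below uses only the three figures quoted above and derives the baselines they imply; the function and variable names are illustrative assumptions, not taken from the study.

```python
# Figures quoted in the paragraph above (tons of CO2).
TRAINING_RUN_TONS = 284.0      # one training session of a large language model
CAR_LIFECYCLE_MULTIPLE = 5.0   # "five times" a car's full life cycle
PERSON_YEAR_MULTIPLE = 57.0    # "about 57 times" an average person's yearly emissions


def implied_baseline(total_tons: float, multiple: float) -> float:
    """Return the baseline that `total_tons` is `multiple` times as large as."""
    return total_tons / multiple


# Baselines implied by the quoted ratios (hypothetical names, derived values).
car_lifecycle_tons = implied_baseline(TRAINING_RUN_TONS, CAR_LIFECYCLE_MULTIPLE)
person_year_tons = implied_baseline(TRAINING_RUN_TONS, PERSON_YEAR_MULTIPLE)

print(f"Implied car life cycle: {car_lifecycle_tons:.1f} tons of CO2")
print(f"Implied person-year:    {person_year_tons:.1f} tons of CO2")
```

In other words, the quoted ratios imply roughly 57 tons of CO2 for a car's life cycle and roughly 5 tons per person per year, so a single training run corresponds to several decades of one person's emissions.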

A newly established research center aims to collect and evaluate more detailed, large-scale empirical data on the environmental impact of AI, beyond carbon emissions, to suggest how AI can be made more sustainable. Among the many building blocks that would contribute to a more sustainable AI, one seems to be of crucial importance and is particularly promising from an ethical perspective. At its core is a shift away from the currently dominant Big Data approach towards a “Small Data” approach. 

Switching to smaller training datasets would reduce both the processing and storage capacities required, which are what currently make AI systems so energy- and resource-intensive. But it would also have a cascade of positive impacts on other ethical values.

First, working with smaller training datasets would allow AI developers to select the data more carefully and check it more thoroughly for biases and errors. This would, in turn, mitigate the tendency of automated systems to reinforce pre-existing biases and discrimination. It would also allow closer attention to be paid to the quality and diversity of the training data, which would have a positive impact on equity, fairness, and non-discrimination.

Furthermore, the environmental impacts of AI are currently exacerbating global inequalities and the North-South divide. While the populations of rich countries enjoy the benefits of AI systems, the populations of the Global South are the most affected by the negative environmental impacts of AI and related human rights violations. These include global warming, deforestation, and pollution from the mines that supply the raw materials needed for AI systems and from the landfills where the resulting waste is dumped.

A Small Data approach could also enable the development of AI systems for applications that do not rely on the abundance of data available in rich, highly digitized societies. It would, for example, make it possible to develop language models for languages for which terabytes of training data are not available, thus making the benefits of AI accessible to more people. In terms of equity, AI based on Small Data would thus enable a fairer global distribution of the advantages and disadvantages of AI. 

Moreover, a Small Data approach to AI can also be expected to have positive side effects in terms of privacy, accountability, and transparency, as less personal data would be required and the functioning and decision-making process behind AI systems could be made more intelligible. 

A strong commitment to sustainability in the field of AI can also play a driving role in further developing technically promising AI approaches, including data mining and training methods that work with Small Data. These have been relatively marginal so far but have the potential to make AI both more technically accurate and more ethical in several respects.

Existing initiatives using a Small Data approach further demonstrate that it can have important synergies with human rights movements. A pioneering study aiming to develop language models for “low-resourced” African languages demonstrates the affinity of such projects with participatory methods that are attentive to the needs and specificities of marginalized communities. As the previous discussion shows, a Small Data approach can also support efforts to counteract the global power imbalances that perpetuate historical injustices and are at the root of contemporary human rights violations.