The Environmental, Social & Governance Impact of Data Storage

AI is most commonly understood in terms of analytics tools and content generation, both focused on customers and performance and both integral to any marketing strategy. Yet the way the resulting data is stored is equally crucial in today’s highly digitized world. As it stands, data storage poses a fundamental environmental issue that developers, investors and businesses now have to take into account.

With a larger carbon footprint than the aviation industry, data centers consume nearly 2% of the world’s electricity, a share that is expected to reach at least 8% by 2030. This is because every action taken online – clicking a link, running a simple Google search, or even streaming a video – generates a “piece” of data that carries a corresponding use of energy. As a result, the cloud’s enormous physical data centers require continual electricity to power servers, storage devices and back-up equipment. This electricity is often generated from fossil fuel sources like coal, oil and natural gas. As the amount of data increases, so too does the number of data centers required and the power needed to keep them operational. Of course, cloud technology is undeniably changing the way we connect with tech, but the infrastructure behind this rapid growth is not as environmentally friendly as one would hope.

The digital transformation of businesses, content and data has been a growing trend in many capacities, from investment opportunities to practical uses. It ranges from digitizing capital markets (e.g. currencies, trade and retail operations) to modes of communication and a myriad of entertainment purposes, to name a few. With the growth in internet usage and cloud-based solutions, there is an undeniable need for businesses, consumers and governments to ensure that this evolution does more good than harm. While that sounds simple, the truth is that it is anything but.

In 2016, it was reported that the world’s data centers used 416.2 terawatt hours of electricity, considerably more than the UK’s total consumption of around 300 terawatt hours. These data centers drew on 3% of the global electricity supply and accounted for around 2% of total greenhouse gas emissions. More recent predictions suggest that data centers will account for 3.2% of total worldwide carbon emissions by 2025 and could consume nearly a fifth of global electricity. By 2040, storing digital data is set to create 14% of global emissions, around the same share as the entire US today.
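As a quick check on the scale of that 2016 comparison, the two figures quoted above can be put side by side. The short Python sketch below uses only the 416.2 and 300 terawatt hour values from the text; the variable names are illustrative.

```python
# Figures quoted in the text for 2016, in terawatt hours (TWh).
DATA_CENTERS_TWH = 416.2
UK_TWH = 300.0

# Absolute and relative gap between the two consumption figures.
difference_twh = DATA_CENTERS_TWH - UK_TWH
percent_higher = difference_twh / UK_TWH * 100

print(f"Data centers used {difference_twh:.1f} TWh more than the UK,")
print(f"which is about {percent_higher:.1f}% higher.")
```

On these numbers the gap is roughly 39%, which is large but linear rather than exponential.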

Environmental, Social, and Governance – better known as ESG – is a hot topic for investors and businesses, shedding light on the environmental impact and carbon footprint of everything from consumer goods and packaging to supply chains and the ever-expanding world of IT. ESG was derived from the ‘Triple Bottom Line’, also known as ‘People, Planet and Profits’ (PPP), a concept introduced in the 1990s which advocated that businesses should focus on each of the three ‘P’s’ and not solely on ‘Profits’. All three were considered equally important for any commercial enterprise to be sustainable, and the concept thus evolved into the Sustainable and Responsible Investing (SRI) trend sweeping most markets today.

Environmental criteria examine a business’s environmental footprint – including waste, pollution, greenhouse gas emissions and the impact on natural habitats – and thus its contribution to climate change. Social criteria evaluate how a company treats people – human capital management, diversity and equal opportunities, working conditions, and the health and safety of those within the organization as well as those living in the larger community that houses it. Governance criteria examine a company’s governance and organizational structure – including executive remuneration, tax practices and strategy, corruption and bribery, board diversity and set-up, as well as the legislation that regulates them. As such, ESG focuses on the ways in which companies serve society and the impacts of their current and future performance.

Current statistics show that only half of the world’s population is connected to the internet and therefore contributing to this data deluge. Despite this, IDC noted that the number of data centers worldwide has grown from 500,000 in 2012 to more than 8 million today. The amount of energy used by data centers continues to double every four years, giving them the fastest-growing carbon footprint of any area within the IT sector. A new trend in the industry is Green IT, which tackles the environmental issues posed by the IT world and, specifically, those posed by large facilities like data centers and the infrastructure engineered to help them operate. The solutions proposed include cloud-based computing as well as AI-powered methods of ensuring data sustainability.
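To put the "doubling every four years" figure in more familiar terms, it can be converted into an equivalent annual growth rate. The sketch below is a simple compound-growth calculation; it assumes the doubling period stays constant, which the text does not guarantee, and the function name is illustrative.

```python
# Doubling period quoted in the text.
DOUBLING_PERIOD_YEARS = 4

# Equivalent compound annual growth rate: (1 + r) ** 4 == 2.
annual_growth = 2 ** (1 / DOUBLING_PERIOD_YEARS) - 1


def relative_energy(years: float) -> float:
    """Energy use after `years`, relative to today (today == 1.0),
    assuming the doubling period stays constant."""
    return 2 ** (years / DOUBLING_PERIOD_YEARS)


print(f"Equivalent annual growth: {annual_growth:.1%}")
print(f"Relative energy use after 12 years: {relative_energy(12):.0f}x")
```

Under that assumption, energy use grows by just under 19% a year and reaches eight times today's level after twelve years.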

Another development is the EU Code of Conduct for Data Center Energy Efficiency, a voluntary initiative developed in 2008 by the Joint Research Center under the Directorate-General (DG JRC) in response to the increased energy consumption of data centers. Its objective is to reduce and regulate the environmental, economic and energy supply impacts of data centers without interfering with their functionality. The policies are routinely reviewed and updated in line with growing demand and supply. They include an industry-wide code of conduct, regulation on allowances for data centers such as energy supplies, and business practices that are agreed upon by operators, local governments and the JRC.

Because some countries require citizen data to be stored close to home on domestic servers, moving data centers to colder locations, where less cooling is needed, may simply not be possible – or even legal. Moreover, the environmental impact of data centers is not limited to their electricity consumption. The coolants used to prevent overheating are often made of hazardous chemicals, and the battery backups at data centers (needed during power outages) can also be detrimental to the environment due to unsustainable mining and the irresponsible disposal of toxic batteries.

For data centers to operate, they require temperature control, which is achieved in one of two ways: building them in a country with a naturally cold climate or housing them in a temperature-controlled environment. In 2009, in an attempt to test innovative solutions to the growing environmental impacts of such facilities, Google opened a data center in Finland. Since then, the tech giant has pumped millions into improving its eco-credentials and now operates with 100% renewable energy, using 50% less energy than the industry average by deploying evaporative cooling solutions, smart temperature and lighting controls, and custom-built servers that aim to use as little energy as possible. Studies have shown that nearly 40% of the energy consumed by data centers goes to the equipment used for cooling. If the data center is located in a warmer climate, this share can grow to as much as 80%.

Data storage is important for reliable preservation of data, data continuity and accessibility – including quicker and easier recovery – as well as flexibility in price points, capacity options, seamless scalability potential and effective protection from any kind of malware or loss of data. As with content creation and promotion, there is no one-size-fits-all approach, and more often than not people get the most out of hybrid means of archiving and storing their work. This is especially important for companies that are built on the exchange of information, or on streamlined automated data processing that requires large caches.

Decisions around storage types are often based on budget, business model, company structure, and the usage and uses of digital files. Storage is broken down into four categories – private, public, community and hybrid – delivered through network attached storage (NAS) or direct attached storage (DAS) systems. The former is a Redundant Array of Independent Disks (RAID) configuration of multiple independent machines that share data to form a centralized network that improves collaboration among the devices therein. The devices themselves, which can also serve as direct attached storage, are the more commonly used CD/DVD drives, hard drives, Solid State Drives (SSD) and flash drives.

These drives, or devices, house different filing systems, known as file or block systems, that can range in scale. File systems are inexpensive, created to store simple data in files and folders, and commonly found on direct attached storage (such as hard drives), whereas block systems are more expensive, complex setups that are less scalable and best suited to frequently accessed data that is regularly edited, often over a network. Data storage ranges in scale and complexity because data is created, stored and exchanged in a myriad of ways to serve a range of purposes. A recurring contributor to the amount of energy required to operate data storage systems, and the programs that run them, is the duplication of files.

It has to be said that there is a notable distinction between data at rest and operations on data in the way energy is consumed. Data at rest refers to stored data that requires minimal to no energy while the machine is not in use – such as when a computer is switched off – yet remains recorded and accessible to the user when the machine is switched back on. Operations on data, by contrast, refers to what one does with data. Different data operations consume different amounts of energy – for instance, video streaming consumes more energy than sending a text message – and therein lies the correlation between the quality of data (e.g. HD videos) and the energy needed to generate it. The scale of this grows with the number of views, likes and shares of such content. In simple terms, while a Netflix account will save you a trip to the cinema (and the carbon footprint of the transportation, incurred payments, et al.) or a YouTube video may entertain you when you are bored, the content you stream actually taxes the environment in less obvious ways.

Technology giants such as Amazon, Apple, Microsoft and Facebook have committed to 100% renewable energy use in the coming years. Other data centers have taken further measures in trying to reduce their carbon footprint and energy waste. For instance, Nordic data center operator DigiPlex has pledged that the wasted heat from its facility in Ulven, Oslo, can be reused to warm 5,000 apartments in the city. The facility is also powered through renewable energy sources.

Tech giants like Google are not the only companies capable of minimizing the effects data storage facilities have on the environment. As with any reevaluation of an organization’s carbon footprint, there are simple climate-conscious steps that can be integrated into operations. Modernizing data protection, improving efficiencies, simplifying operations and maximizing the potential of AI-driven intelligence can all enable the right action. Practical measures include the following.

Data lifecycle management: avoiding the build-up of excess old data, with admins and architects putting adequate processes in place to prevent it. Under regulations such as GDPR, personal data should be kept no longer than necessary, so most data should be erased after a set period of time (although this period may differ for things like company financial and/or personnel records).
Deduplication: the process of finding and eliminating duplicate data stored in different data sets, which has been shown to reduce storage requirements.
Compression: a form of data reduction focused on finding and eliminating repeated patterns of bytes. Well-suited for databases, e-mail and files, it can be incorporated within certain storage systems.
Policy-based tiering: moving data into different classes of storage based on things like the data’s age, frequency of access, or how readily available it needs to be. Unless the policy requires the deletion of unneeded data, this technique will not reduce overall storage requirements, but it can cut costs by moving data to less expensive media.
Thin provisioning: the process of setting up an application server to use a certain amount of space on a drive, without reserving the space until it is actually needed. As with policy-based tiering, this will not cut the data footprint itself, but it will delay the need to buy more drives and thus mitigate the environmental impact.
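Three of the measures above (deduplication, compression and policy-based tiering) can be illustrated with a minimal Python sketch. It is a toy model that works on in-memory byte strings and uses only the standard library; the function names, the sample files and the 30-day and 365-day tier thresholds are assumptions made for illustration, not a description of any particular storage product.

```python
import hashlib
import zlib


def deduplicate(files: dict[str, bytes]) -> tuple[dict[str, bytes], dict[str, str]]:
    """Keep each distinct content once, keyed by its SHA-256 digest.

    Returns the content store and an index mapping file name -> digest.
    """
    store: dict[str, bytes] = {}
    index: dict[str, str] = {}
    for name, content in files.items():
        digest = hashlib.sha256(content).hexdigest()
        store.setdefault(digest, content)  # identical content is stored only once
        index[name] = digest
    return store, index


def compress(content: bytes) -> bytes:
    """Lossless compression; repeated byte patterns shrink the most."""
    return zlib.compress(content, level=9)


def choose_tier(days_since_access: int, hot_days: int = 30, warm_days: int = 365) -> str:
    """Pick a storage class from the data's age (thresholds are hypothetical)."""
    if days_since_access <= hot_days:
        return "hot"
    if days_since_access <= warm_days:
        return "warm"
    return "cold"


# Hypothetical sample data: two identical reports and one set of notes.
files = {
    "report.txt": b"quarterly figures " * 200,
    "report_copy.txt": b"quarterly figures " * 200,
    "notes.txt": b"meeting notes " * 100,
}

store, index = deduplicate(files)
raw_bytes = sum(len(c) for c in files.values())
deduped_bytes = sum(len(c) for c in store.values())
compressed_bytes = sum(len(compress(c)) for c in store.values())

print(f"raw: {raw_bytes} B, deduplicated: {deduped_bytes} B, "
      f"deduplicated + compressed: {compressed_bytes} B")
print("tier for data last accessed 400 days ago:", choose_tier(400))
```

In this sketch the duplicate report is stored once and both file names point to the same digest, so nothing is lost when the copy is dropped. Real systems usually deduplicate at the block or chunk level rather than per file, but the principle is the same.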

Moving forward, the ESG focus of data centers of any scale rests on core principles that tackle the carbon footprint of their operations, through a concerted effort between companies and governments to steer energy consumption and waste towards more sustainable purposes. To this end, multiple factors, including the geographic location of data centers and a variety of legislative processes, have become focal points in the search for climate-conscious solutions within the digital world.

A recent IDC study claims that by 2025, worldwide data traffic will have grown by 61% to 175 zettabytes, with roughly 75% of the global population accounting for at least one data interaction every 18 seconds. As global internet penetration rates continue to grow along with the connected technologies entering the mainstream, it is inevitable that the number of data centers globally will keep increasing.

In addition to removing the need to build temperature-controlled environments to house data centers, companies have started to explore using renewable energy such as wind, hydro or solar to power data centers, as well as optimising and/or upgrading technology to improve efficiency and operating temperature. Moreover, AI is being deployed in some facilities to reduce power consumption. AI can analyse data output, humidity, temperature and other important statistics to find ways to improve efficiency, cut costs and reduce total power consumption.

The introduction and integration of new technologies, such as the launch of 5G, new IoT devices and cryptocurrency worldwide, can be viewed as a double-edged sword. While these new modes of exchange and infrastructure offer real solutions to plenty of global problems, they require more connectivity, data storage and processing than ever before. If we do not re-evaluate and address the carbon footprint of current operations, these technologies may compound the ever-growing problem of climate change even further.