“With great power comes great responsibility.” Comic book fans will quickly recognise this quote as the words that inspired Peter Parker to become Spider-Man. Others will note that Voltaire said it first. Nevertheless, as important as this quote is to history or to Spidey’s future, I think it has even greater relevance for technologists in our data-driven world.
The amount of personal data collected by organisations is staggering. As Facebook and Cambridge Analytica taught us, the opportunity to abuse data is overwhelming. For these reasons, we need to embrace the concept of data ethics.
What is data ethics?
It’s a thin, thin line between proper use and abuse of data. As data science and related technologies evolve, so does the “art of the possible.” While data analysis is not new, we now have the ability to quickly process large amounts of data and make correlations and predictions using disparate data sets. The ease of these efforts creates numerous issues related to privacy, confidentiality, transparency, and identity.
As a result, data ethics has emerged as a new branch of ethics that focuses on the moral problems associated with:
- How data is generated, recorded, and shared;
- The way algorithms for machine learning and artificial intelligence use data;
- The data practices embraced by the public and private sectors.
Data ethics highlights the complexity of the ethical challenges posed by data science and big data analytics. Gartner previously predicted that one-half of business ethics violations would result from the improper use of big data. In short, our current ethical frameworks no longer apply to data and we must now think differently.
Privacy in practice
While almost every organisation has a privacy policy, that doesn’t mean it is practising data ethics. Have you ever read the privacy policy on a website where you give your personal information? Only 16 per cent of people claim they do, and the real number is probably lower. Further, a privacy policy doesn’t guarantee the confidentiality of your data. It’s simply a legal document describing potential uses of that data, and you could be agreeing to almost anything.
Although generally slow to respond, various governments have recently enacted laws to protect consumers. The General Data Protection Regulation (GDPR) in the EU has been described as “privacy by default”, giving citizens strict control over their data. Earlier this year, California passed the California Consumer Privacy Act (CCPA) to protect online privacy and personally identifiable information (PII). Now, the US Federal Data Strategy aims to make ethical governance one of its core principles. Major corporations are also lining up as privacy advocates in hopes of shaping future legislation.
Privacy may be making a comeback, but we still need data ethics to guide us towards “privacy by design.”
The UK is leading the way
The United Kingdom is a case in point. As part of its National Digital Strategy, the UK created a Data Ethics Framework that sets clear guidelines for the acceptable use of government data, building in transparency and accountability. The audience is anyone who interacts with government data, from statisticians to policymakers to IT staff and beyond.
As Matt Hancock, the former UK Secretary of State for Digital, Culture, Media and Sport, stated: “If we fail to preserve the values we care about in our new digital society, then our big data capabilities risk abandoning these values for the sake of innovation and expediency.” Essentially, the UK felt it necessary to document its societal values to ensure they remain effective in the new economy.
I concur with this sentiment and believe that it is time for an international code of data ethics.
Code of data ethics
The following tenets, based on the UK framework, examples from professional organisations, and my own experience as a CIO, form a set of guidelines for the acceptable use of data as we fully engage in digital transformation.
- Behind the data is a person. Respect the individual when interacting with their data. Watch out for disparate impact based on blind spots and inherent biases.
- Clearly state what you plan to do with an individual’s data. Never attempt to trick them. Make it easy for them to understand your intentions and to give consent.
- Don’t use data for purposes other than those originally intended. Make additional disclosures if your intentions change.
- Be transparent. Open your data to inspire trust.
- Maintain an audit trail for a dataset’s lineage. This way, anyone who interacts with it can know its history, including accuracy and quality, the context for its collection, and any related manipulations. This also supports ethics reviews and minimises risk across the data supply chain (see the sketch after this list).
- Consult experts if there’s any doubt that you may be in violation of laws or regulations. Also, remember that the law often lags behind technology and sets the minimum standard; it is not all you should do to protect confidentiality and privacy.
- Use as little data as necessary to meet your need. Less data equals less risk.
- Use data insights responsibly. There are limits to the decisions we should make based solely on data without human involvement.
- Take a risk-based approach when securing data. Protect PII as if it’s your own. You can’t secure what you don’t know exists, so make information visible by finding hidden datasets.
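To make the audit-trail tenet concrete, here is a minimal Python sketch of what a lineage record might look like in practice. The class and field names are my own illustration under assumed requirements, not part of any standard or framework:

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List

@dataclass
class LineageEvent:
    """One entry in a dataset's audit trail."""
    actor: str       # who touched the data
    action: str      # e.g. "collected", "de-identified", "joined"
    purpose: str     # the stated, consented-to purpose
    timestamp: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )

@dataclass
class Dataset:
    name: str
    collection_context: str  # why and how the data was gathered
    lineage: List[LineageEvent] = field(default_factory=list)

    def record(self, actor: str, action: str, purpose: str) -> None:
        """Append an audit entry so every manipulation leaves a trace."""
        self.lineage.append(LineageEvent(actor, action, purpose))

# Usage: the trail documents the dataset's history for later ethics reviews.
survey = Dataset("citizen-survey", "opt-in web survey, consent form v2")
survey.record("analyst-01", "de-identified records", "statistical reporting")
```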
These ideas are…