By some estimates, 80 per cent of the data that a company holds is dark data. But what actually is dark data – and why should you care about it?
In truth, dark data is an umbrella term that covers a number of different sources. Keep reading to find out:
- What dark data is
- The opportunities and risks it presents
- How to use dark data
- Four examples of organisations already using their dark data
- How to avoid making any of your data dark in the first place
What is dark data?
Dark data is digital information you are not using. Whenever you gather and store data without a plan for its use, you are collecting dark data.
Some of this data is already structured, such as under-utilised survey fields or out-of-date information. Other dark data is unstructured, such as date or time of creation, edit logs and qualitative data.
Examples of dark data
Dark data is a broad term. It comprises many types of information and differs from company to company. However, any of the following can constitute dark data if they are unstructured or out of date:
- Customer information
- Raw survey data
- Email or social media correspondence
- Account information
- Former employee data
- Presentations or notes
- Old versions of documents
- Financial statements
Other examples can be more directly useful. Server log files can provide insight into website visitor behaviour. Mobile geolocation data can reveal traffic patterns to inform business planning. And customer call records can shed light on consumer sentiment. These types of dark data have the potential to create new revenue sources, streamline processes and reduce costs.
Making sense of it all
Despite the variety of dark data, it will always fall into one of three categories:
- It’s junk: information that is now redundant, obsolete or trivial – and occupying valuable storage space
- It’s a risk: sensitive data stored without appropriate safeguards, such as encryption
- It’s valuable: unused data and unstructured metadata that could provide new business insights
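As a rough illustration, the three categories above could be expressed as a simple triage rule. The Python sketch below is only one way of doing it: the field names (`last_used`, `is_sensitive`, `is_encrypted`) and the three-year staleness threshold are assumptions made for the example, not an established standard.

```python
from dataclasses import dataclass
from datetime import date, timedelta

# Hypothetical threshold: treat anything untouched for three years as stale.
STALE_AFTER = timedelta(days=3 * 365)


@dataclass
class DataAsset:
    name: str
    last_used: date      # when the data was last read or updated
    is_sensitive: bool   # e.g. contains personal or financial details
    is_encrypted: bool


def triage(asset: DataAsset, today: date) -> str:
    """Place an asset into one of three categories: risk, junk or valuable."""
    # Check risk first: unprotected sensitive data needs attention whatever its age.
    if asset.is_sensitive and not asset.is_encrypted:
        return "risk"
    # Data nobody has touched for longer than the threshold is a candidate for removal.
    if today - asset.last_used > STALE_AFTER:
        return "junk"
    # Everything else is unused data that may still be worth analysing.
    return "valuable"


assets = [
    DataAsset("2009 campaign exports", date(2010, 1, 15), False, False),
    DataAsset("former employee records", date(2023, 6, 1), True, False),
    DataAsset("server log files", date(2024, 11, 20), False, False),
]
today = date(2025, 1, 1)
for asset in assets:
    print(f"{asset.name}: {triage(asset, today)}")
```

In practice, deciding whether data is genuinely valuable takes human judgement. A rule like this can only produce a first-pass shortlist for review.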
The risks of dark data
Dark data holds risks as well as potential. The better you understand and categorise the data you hold, the less risk it poses.
In a worst-case scenario, unstructured data can have serious cost implications. One company spent $6 million simply searching through dark data to provide information for a court case.
Dark data is an opportunity
Dark data’s potential lies in being able to understand the relationships between apparently unrelated pieces of information. Giving structure to your dark data may well shine a light on key differences or trends. It might help you to identify that two people living in one house are married – or siblings, or flatmates. You may then be able to better predict their decisions based on this new information.
How to use dark data
O2 uses dark social to personalise communications
Dark social is the sharing of content through private channels, such as messaging apps, email and text messages, which web analytics tools cannot easily track. It accounts for the vast majority of shares and interactions with online content on smartphones: research from 2016 found that dark social channels accounted for 77 per cent of shared content on mobile phones. The more specialised your customer base, the more important dark social is to your business.
O2 recognised this and wanted to capitalise on dark social’s swathes of unknown, but useful, information. It decided to introduce URL shorteners and sharing widgets across its media assets to understand how content is shared on dark social. O2 anonymises all data, but has been able to mark out core customers who regularly share content with friends. Dark social data has also disproved certain assumptions the company previously relied on to make decisions.
Dark data helped to reduce staff turnover
Gate Gourmet, an airline industry catering provider, was struggling to lower its employee turnover, which was unusually high at 50 per cent. It used accessible dark data from its internal systems to confirm that the high turnover was a result of commute time. The company changed its hiring process and consequently reduced its attrition rate by 27 per cent.
Scientists use dark data to improve research
Certain research conducted in the US during the 1970s and 1980s couldn’t be made public due to limits in technology. However, this dark data (concerning ocean-based zooplankton) is now publicly available, enabling further investigation. The unstructured data could provide valuable information about the ocean, and whether climate change has affected it.
Healthcare is exploring unstructured data
Indiana University Health is looking at how it can leverage dark data to personalise healthcare and predict demand across populations. This involves integrating patient data with non-traditional external data sources and using cognitive computing to identify historical trends and patterns of illness. This could make it possible to understand how socioeconomic factors affect patients’ engagement with healthcare providers.
Can you avoid dark data?
You won’t be able to remove all of your legacy data or dark data. However, following a regular data suppression schedule to remove unnecessary information from your databases will help you understand and prune the data you hold. Being rigorous when collecting customer data in the first place will also help to minimise dark data:
- When you are capturing customer data, ask yourself why you are capturing it. What are you going to do with it?
- Record simple details (such as a customer’s name, address and job title) in the same way every time. This is key to ensuring data consistency and integrity
- Ensure your forms include minimal ‘required’ fields to avoid customers and prospects submitting dummy data
- Vet data when you capture it and provide immediate feedback to users
For more on this, see ‘How to build up your customer database after GDPR’.
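The capture tips above (consistent recording, minimal required fields, vetting with immediate feedback) could look something like the following at the point of entry. This is a minimal Python sketch under stated assumptions: the field names, the choice of required fields and the deliberately loose email check are all hypothetical, and a real form would need rules suited to your own data.

```python
import re

# Assumption for this example: only two fields are required, to discourage dummy data.
REQUIRED_FIELDS = ("name", "email")
# Deliberately loose pattern: something@something.something, with no spaces.
EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def clean_record(raw: dict) -> tuple[dict, list[str]]:
    """Normalise a submitted record and return it with a list of problems."""
    # Collapse stray whitespace so every value is stored in the same way.
    record = {key: " ".join(str(value).split()) for key, value in raw.items()}
    if record.get("email"):
        record["email"] = record["email"].lower()
    if record.get("postcode"):
        record["postcode"] = record["postcode"].upper()

    # Collect messages that can be shown to the user straight away.
    problems = [
        f"Please fill in your {field}."
        for field in REQUIRED_FIELDS
        if not record.get(field)
    ]
    if record.get("email") and not EMAIL_PATTERN.match(record["email"]):
        problems.append("That email address does not look right.")
    return record, problems


record, problems = clean_record(
    {"name": "  Ada   Lovelace ", "email": "Ada@Example.COM", "postcode": "sw1a 1aa"}
)
print(record)
print(problems)
```

Returning the list of problems alongside the cleaned record lets a form show feedback immediately and only store the record once the list is empty.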