Making sense of big data
Big data may be a big deal, but it needs to produce actionable insights to have true value. Yet not every enterprise knows how to use its data effectively. Big data expert Dr. Craig Brown reveals the way forward.
By Gary Maidment
Big data is a big deal for enterprises – if they can use it for actionable insights, that is. And research shows that most can't. In fact, only 39 percent of enterprises have invested in a big data analytics platform, according to a Tech Pro Research survey in Q4 2017. Big data expert, author, and consultant Dr. Craig Brown shares his tips on how enterprises can turn data into gold rather than kryptonite.
A question of quality
“When you’re moving towards a data-driven solution, the question to ask initially is about the quality of your data,” says Brown. “What kinds of questions can be answered from the data that you have? Will the data allow for clear decision-making in a data-driven system?” If you’re not answering these questions in a clear strategic context, your big data plans may well fail. In November 2017, Gartner’s Nick Heudecker tweeted that the company’s 2017 estimate that 60 percent of big data projects fail was in fact “too conservative. The failure rate is closer to 85%. And the problem isn't technology.” So what are the problems? The most pressing issues are a mix of not having the right strategies, the right tools, and the right management support.
You need more structure
According to Brown, deciding the quality, answers, and type of decisions you want from your data – your overall strategy – will help identify what the relevant sources are, a crucial first step to get right. A few examples of data sources are archived data, documents, reports, emails, social media, images, business apps, analytics engines, sensors, click-streams, APIs, and public websites. Retailers, for example, can employ simple techniques like assigning a guest ID to credit cards to track purchase preferences and habits, monitoring social media activity to push promotions, analyzing in-store traffic to optimize store layout, and managing inventory more effectively so that “out-of-stock” is no longer an issue. McKinsey reports that five-year ROI is 15 to 20 percent for retailers that invest in big data. Telcos, in turn, can benefit from sources that yield huge amounts of data, including call detail records, mobile phone usage, network equipment, server logs, billing, and social networks, which can be applied to things like predicting traffic, preventing customer churn, and bill recovery.
Analytics, though, is easier in the case of structured data, which exists in a pre-defined record, is organized, and can be stored in a relational database (RDB) for easy search and analysis. Structured data can be machine-generated, like RFID tags from sensors in devices such as smart meters and collars for livestock, or human-generated, for example, the click-stream data trail you left when you last bought something online.
But an estimated 80 percent of all data is partially structured or unstructured, including social media posts, articles, emails, and media files, meaning that it isn’t organized in a pre-defined record or stored in an RDB. Because this type of data is typically created by humans for humans, it’s harder for algorithms to parse natural speech and text, or for computer vision to make sense of images. Traditional analytics programs tend to use tags for structured data and keywords for unstructured data, which is imprecise and can create an insight gap.
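The contrast can be made concrete with a minimal sketch in Python. Everything in it is hypothetical and chosen purely for illustration: the `Purchase` record, the sample purchases and posts, and the helper names `total_spent` and `keyword_match` do not come from Brown or from any product mentioned in this article. It shows an exact query over records with pre-defined fields next to a naive keyword search over free text.

```python
from dataclasses import dataclass


@dataclass
class Purchase:
    """A structured record: every field has a pre-defined name and type."""
    customer_id: str
    product: str
    amount: float


# Hypothetical structured data, as it might sit in a relational table.
purchases = [
    Purchase("c1", "jaguar plush toy", 19.99),
    Purchase("c2", "car wax", 7.50),
    Purchase("c1", "car wax", 7.50),
]

# Hypothetical unstructured data: free-text social media posts.
posts = [
    "Test drove the new Jaguar today, what a car!",
    "Saw a jaguar at the zoo with the kids.",
    "Finally picked up my new ride from the dealership.",
]


def total_spent(records, customer_id):
    """Exact query on a known field: the answer is unambiguous."""
    return round(sum(r.amount for r in records if r.customer_id == customer_id), 2)


def keyword_match(texts, keyword):
    """Naive case-insensitive keyword search over free text."""
    kw = keyword.lower()
    return [t for t in texts if kw in t.lower()]


structured_answer = total_spent(purchases, "c1")
keyword_hits = keyword_match(posts, "jaguar")

print(structured_answer)
for hit in keyword_hits:
    print(hit)
```

In this toy data, a marketer looking for car buyers gets the zoo post back as a false hit, while the dealership post, which never uses the keyword, is missed. That is the kind of imprecision that can open up an insight gap.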
According to Brown, we’ve still got some way to go in the relatively immature field of unstructured data analytics, with AI and 5G set to be key pivot points. “Network intelligence can act in 5G networks, collecting data from sensors and bringing that unstructured data into new technologies that provide analytics,” he says, referring to unstructured information management architecture (UIMA).
The personnel problem
Another question that Brown believes enterprises need to ask is about “the quality of the collaboration between your data stewards and the decision makers.” A survey by Tata Consultancy Services about big data analytics in manufacturing identified building trust between data scientists and functional managers as a top concern for enterprises. This lack of trust creates a gap between data insights and the business strategies that are chosen and executed. These findings were supported by a survey by Forrester for KPMG, which found that only 38 percent of respondents have a high level of confidence in their customer insights, just one-third seem to trust the analytics they generate from their business operations, and only 51 percent of C-suite executives are fully behind their data and analytics policy. According to a McKinsey report in 2017 on advanced analytics in telcos, “Given the volume and pace of change within the industry, [telco] leaders are not well positioned to move their companies as fast or as far as they need to go to thrive.”
However, for data to have value in the decision-making chain, says Brown, “There has to be a clear, concise understanding about what those decisions should be, what is expected of those decisions to confirm, that the data has what it needs in order to provide those kinds of expectations.” And that momentum needs to come from the top.
The right tools for the job
A key facet of getting past C-suite inertia is understanding the benefits of big data analytics for business and choosing the right tool to collect, organize, store, and access that data. Common problems affecting enterprises when choosing analytics solutions include doing so before properly defining what problems they’re solving, implementing a solution that isn’t agile and cannot scale, and forgetting about legacy systems and integration.
Huawei’s FusionInsight – Universe Analytics received the Most Innovative Data Governance Solution award at the Telco Big Data Analytics Summit Euro 2017. Optimized for enterprises as well as telcos, the solution dramatically reduces data integration workloads and provides data management across heterogeneous data warehouses, integrated quality audits, data lifecycle management, and data lineage analysis. Its big data governance solution provides high-quality data for AI and big data applications, shortening the data preparation cycle from months to hours, increasing data governance efficiency by more than 40 percent and the amount of high-quality data by 40 percent.
In conjunction with Informatica PowerCenter, a Hadoop-based data integration solution, the solution provides data integration, conversion, and cleaning capabilities. These are key features because, as Brown says, “There’s going to be a lot more data involved in the future, so you’re going to need faster systems and faster processing in order to compute that data into something realistic and usable for business.” Data may be a goldmine, but its value will remain buried without the right strategies, tools, and personnel.