Developing Big Impact Solutions

By: Lourens

The way we conduct business is changing. Decisions that were once delegated to the sixth sense of executives now have predictable outcomes based on an infinity of variables, from GDP to the latest Buzzfeed article. While tech giants have been the first to master this powerful new tool, it is not exclusive to the top players: with the right mindset, any decision-maker can harness the swings of human behavior to inform modern, successful strategies.

Big data is not just a fad but rather the new standard, required to cater to a user base that evolves rapidly according to a fast and constant stream of information. To embrace this change and develop your own big data analytics solution is really just a matter of Darwinian adaptation: evolve or fall behind. To do so efficiently and with insight from highly skilled specialists is a choice, and a very sensible one.

In his first address to the nation in 2018, Italian PM Giuseppe Conte announced that one of the focal points of his government program would be to “use big data to take advantage of the sharing economy”. Two striking revelations emerged from his speech: that ‘big data’ is a magical phrase, synonymous with innovation, that inspires confidence in the potential of technology, and that its applications reach far beyond the realm of business, where it is most popular, into politics and economics.

Although big data is something of a buzzword these days, the underlying concept has revolutionized the way businesses approach their customer base. Products and strategies can now be tailored according to behavioral patterns that emerge from the data consumers generate by visiting websites, making online payments, or just being active on social media. While the potential of such a technology is enormous, many of the firms that have tried to implement it during the last decade have failed to set realistic goals and expectations, and their initiatives have frequently failed as a result.

A previous ADC White Paper argued that this disappointing outcome can be reversed by increasing the synergy between management and data scientists, sitting together to define sensible objectives and working towards a Minimum Viable Product (MVP) in structured but integrated steps. This paper will introduce an example of such an approach.

Data-driven innovation projects: in theory

“A cycle in which each step produces feedback that leads to a common fine-tuning.”

While really just a codification of common sense, it is worth mentioning that the process of implementing big data analytics has already been formalized in structured models, such as the Cross-Industry Standard Process for Data Mining (CRISP-DM), and the Analytics Solutions Unified Method for Data Mining/Predictive Analytics (ASUM-DM) developed by IBM. These models, with minimal differences, introduce a common framework in which the logical steps of setting up such projects are described in a cyclical flow.

Any business initiative requires a high-level rationale, which is fundamental to defining desirable outcomes and key metrics for evaluation, and big data is no different. In other words, management should envision how big data is going to fit within the overall strategy of the firm. Is it going to help understand the needs of the customers to develop new products? Or perhaps to adjust the marketing narrative in a way that is relatable to the user base on social media? Defining these strategic objectives is the first step towards a sensible implementation of big data, and helps specify concrete deliverables for the project. Data scientists can help separate the feasible from the unfeasible and find a strategy that suits the organization.

Once the objectives have been defined, the best infrastructure to store, process and query your data needs to be set up: the technical work begins. The key difference between big data and its traditional small data counterpart is that the former demands much more attention to how it is stored and processed. Fortunately, cloud computing offers not only a simple solution to the problem of storing massive amounts of unstructured data, but also inexpensive online tools for gaining first insights from it, known as Analytics as a Service (AaaS).

Now the process of data gathering can begin. Depending on your industry, it may include transactional data from POS systems, online marketing analytics, social media content such as comments, likes and shares, user profiles and clickstreams, or even survey responses. In some cases, the data will be clean and ready for use, though most often data analysts and engineers will have to spend a considerable amount of time cleaning the raw datasets to present them in a usable format. After the data is collected, it should also be structured and made accessible to the involved departments. Data accessibility is key to a successful project, and datasets and infrastructure will have to be continuously updated as new insights are brought in by the application of the model. As we will see, cyclicality will characterize all phases of the project.
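The cleaning step described above can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the field names (customer_id, amount, date), the decimal-comma convention and the sample rows are hypothetical and not taken from any real dataset.

```python
from datetime import datetime


def clean_records(raw_rows):
    """Return cleaned copies of raw transaction rows (hypothetical schema)."""
    seen = set()
    cleaned = []
    for row in raw_rows:
        # Normalize text fields: trim whitespace and unify casing.
        customer = (row.get("customer_id") or "").strip().upper()
        # Assumption: amounts may use a decimal comma, never thousands separators.
        amount_text = (row.get("amount") or "").strip().replace(",", ".")
        date_text = (row.get("date") or "").strip()
        if not customer or not amount_text or not date_text:
            continue  # drop incomplete rows
        try:
            amount = float(amount_text)
            day = datetime.strptime(date_text, "%Y-%m-%d").date()
        except ValueError:
            continue  # drop rows with malformed values
        key = (customer, amount, day)
        if key in seen:
            continue  # drop exact duplicates
        seen.add(key)
        cleaned.append({"customer_id": customer, "amount": amount, "date": day})
    return cleaned


# Made-up rows showing typical problems in raw exports.
raw = [
    {"customer_id": " c001 ", "amount": "19,90", "date": "2018-06-05"},
    {"customer_id": "C001", "amount": "19.90", "date": "2018-06-05"},  # duplicate
    {"customer_id": "c002", "amount": "", "date": "2018-06-06"},  # missing amount
    {"customer_id": "c003", "amount": "abc", "date": "2018-06-07"},  # malformed
    {"customer_id": "c004", "amount": "5.00", "date": "2018-06-08"},
]
cleaned = clean_records(raw)
```

In a real project the rules for what counts as a duplicate or an invalid row would come from the business objectives defined earlier, not from the code itself.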

In the absence of complications, the data should now be ready for some exploratory analysis, the search for interesting patterns and correlations in the dataset. With a clearer idea of the relationships at play between the variables, you may want to build a predictive model, if the objective is to forecast a strategic metric, or a prescriptive model, if the objective is to recommend which decision to take. Data scientists can develop a relevant methodology, test it on a sample and then scale it up to the full dataset, repeating the process as many times as necessary to ensure the results are consistent. This should lead to a first prototype model. Prototypes as previews of the final solution are essential for management to stay updated on the state of the project and to understand the direction the end product is going to take.
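As a rough illustration of this exploratory and modelling step, the Python sketch below checks a correlation, fits a simple linear model on a sample and compares it with the fit on the full dataset. The data is synthetic and generated inside the script; the variable names (ad_spend, sales) and the linear relationship are assumptions chosen for the example, not results from a real project.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = cov / var_x
    return slope, mean_y - slope * mean_x


# Synthetic data (assumption): sales depend linearly on ad spend, plus noise.
random.seed(42)
ad_spend = [random.uniform(0, 100) for _ in range(1000)]
sales = [50 + 2 * x + random.gauss(0, 5) for x in ad_spend]

# Exploratory step: how strongly are the two variables related?
correlation = pearson(ad_spend, sales)

# Test the methodology on a sample, then scale it up to the full dataset.
sample_idx = random.sample(range(len(ad_spend)), 100)
sample_fit = fit_line([ad_spend[i] for i in sample_idx],
                      [sales[i] for i in sample_idx])
full_fit = fit_line(ad_spend, sales)

# A small gap suggests the sample results are consistent with the full data.
slope_gap = abs(sample_fit[0] - full_fit[0])
```

Repeating the sampling with different seeds is one simple way to check that the results stay consistent before committing to a prototype.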

When a suitable prototype is developed and validated, it is ready for deployment, the phase in which the model is subjected to real working conditions with live data. Perhaps it is going to become part of the corporate website as a search-refining algorithm for the user, or maybe it will become part of the company’s internal assets. In any case, it will generate some insights that you can examine to support your decisions. This is usually achieved by plotting intuitive visual representations of the insights obtained from the model.

This is not the end, of course: running the prototype will most likely highlight strengths and weaknesses of the current methodology, with room for improvement. The process then restarts from thinking about the role of the model in the bigger picture, in a cycle in which each step produces feedback that leads to a common fine-tuning. If machine learning is implemented, the model can update itself based on newly available data with little or no additional programming. Still, sometimes human input will be required. Model maintenance and enhancement are a constant work in progress, as all modellers know.
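The idea of a model that updates itself as new data arrives can be sketched as follows. This is a deliberately simple stand-in for a machine learning pipeline: a least-squares line whose coefficients are recomputed from running totals. The class name RunningLinearModel and the data points are hypothetical.

```python
class RunningLinearModel:
    """Least-squares line y = slope * x + intercept, updated per observation."""

    def __init__(self):
        self.n = 0
        self.sum_x = 0.0
        self.sum_y = 0.0
        self.sum_xx = 0.0
        self.sum_xy = 0.0

    def update(self, x, y):
        """Fold one new observation into the running totals."""
        self.n += 1
        self.sum_x += x
        self.sum_y += y
        self.sum_xx += x * x
        self.sum_xy += x * y

    def coefficients(self):
        """Return (slope, intercept) from the totals seen so far."""
        denom = self.n * self.sum_xx - self.sum_x ** 2
        if self.n < 2 or denom == 0:
            raise ValueError("need at least two observations with distinct x")
        slope = (self.n * self.sum_xy - self.sum_x * self.sum_y) / denom
        intercept = (self.sum_y - slope * self.sum_x) / self.n
        return slope, intercept

    def predict(self, x):
        slope, intercept = self.coefficients()
        return slope * x + intercept


model = RunningLinearModel()

# Initial (made-up) data used to build the prototype.
for x, y in [(1, 4), (2, 7), (3, 10)]:
    model.update(x, y)
before = model.coefficients()

# New live data arrives after deployment; no code changes are needed.
for x, y in [(4, 14), (5, 17)]:
    model.update(x, y)
after = model.coefficients()
```

Comparing before and after shows how the fitted relationship shifts with the new observations. Deciding whether such a shift is meaningful or a data problem is where human input is still required.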

Data-driven innovation projects: in practice

“When pursuing an innovative project, it pays to adopt agile practices: that means employing a unit that encompasses the entire chain of operations.”

Although this process seems quite logical and almost inevitable, a significant number of things can go wrong. Especially at the beginning, poorly specified objectives, or deliverables assigned without elaborating on their relevance within the overall strategy, can lead to inefficiencies. Traditionally, organizations would have a separate IT department prepare the infrastructure. The data collection and cleaning tasks would then be assigned to a data engineering unit, while a data science unit would develop a model that would be evaluated by the business unit itself and deployed by IT. Within this scheme, an army of IT, management and quantitative consultants, each proficient in one sector of the operations chain, supports the corresponding departments.

With the same product moving between so many links of this vertical chain, it is very easy for the members of each department to remain stuck in their silos and lose sight of the bigger picture. This is why many successful companies have begun to migrate towards a more ‘agile’ solution, a horizontal, multidisciplinary and product-oriented structure that adapts easily to new tasks and technologies. DevOps, in which developers and IT are combined in a single team focused on one product, is an example of such an approach and has become increasingly popular among tech firms.

The traditional role of consultants, which is limited to one link of the chain based on their expertise, does nothing more than exacerbate the silo mindset. Contemporary consultants instead adopt agile practices, which means employing a unit that encompasses the entire chain of operations: a unit that can formulate concrete and relevant deliverables with the experience of a manager and work towards a Minimum Viable Product, configure top-notch data infrastructure with the skill of an IT specialist, and build cutting-edge statistical models with the knowledge of data science and econometrics, all with the dependability of a functional team of modern professionals. This is what turns a big data solution into a “big impact” solution.


Amsterdam Data Collective helps decision makers realise their advanced analytics vision, and develop modelling solutions for maximum control. Partner with us to optimise, realise, create or validate.