The growing role of data – from big data to smart data

Digital transformation is the use of digital technologies and data as leverage to improve an organization’s performance.

The PwC report “A new image of the Polish consumer” shows that the coronavirus pandemic has, among other things, increased how often Poles shop using computers and smartphones, and that most Poles will continue to shop online once the sanitary restrictions have been eased. The presence of consumers in digital channels also means that more data about them is becoming available to companies, which in turn have to manage that data and use it effectively.

Why is using data important?

How can a company start working with data properly?

Having customer data is a prerequisite for becoming a data-driven organization, but it is not enough. Large volumes of data scattered across separate systems (or Excel files) do not by themselves create a competitive advantage. That requires a transition from having data to using data in support of business objectives, i.e. moving from big data to smart data.

The first step is to develop a data strategy which, with regard to the organization’s short- and long-term business objectives, determines the need for data and identifies both the data that the organization already has and the data that it should start collecting.

The following steps are to assess the data from a legal perspective (whether it can lawfully be used for business purposes), to integrate it, and to manage it centrally in a system adapted to processing huge volumes of data (big data).

Other important elements include building unique customer IDs (using deterministic and probabilistic methods), generating consumer insights with ML/AI (Machine Learning/Artificial Intelligence) mechanisms and algorithms, data activation, and continuous measurement and optimization.
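
The difference between deterministic and probabilistic matching when building a unique customer ID can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions: the record fields (name, email, city), the 0.7/0.3 weights and the 0.85 threshold are hypothetical and do not come from any specific tool.

```python
from difflib import SequenceMatcher


def normalize_email(email):
    """Lower-case and trim an email so equal addresses compare equal."""
    return email.strip().lower()


def deterministic_match(a, b):
    """Deterministic rule: exact match on a stable identifier (here: email)."""
    if not a.get("email") or not b.get("email"):
        return False
    return normalize_email(a["email"]) == normalize_email(b["email"])


def probabilistic_score(a, b):
    """Score in [0, 1] built from weaker signals (hypothetical weights)."""
    name_sim = SequenceMatcher(None, a["name"].lower(), b["name"].lower()).ratio()
    same_city = 1.0 if a["city"].lower() == b["city"].lower() else 0.0
    return 0.7 * name_sim + 0.3 * same_city


def same_customer(a, b, threshold=0.85):
    """Try the deterministic rule first, then fall back to the score."""
    if deterministic_match(a, b):
        return True
    return probabilistic_score(a, b) >= threshold


# Hypothetical records from three systems.
crm = {"name": "Anna Kowalska", "email": "Anna.Kowalska@example.com", "city": "Warsaw"}
shop = {"name": "A. Kowalska", "email": " anna.kowalska@example.com ", "city": "Warsaw"}
app = {"name": "Anna Kowalska", "email": "", "city": "Warsaw"}
other = {"name": "Jan Nowak", "email": "jan@example.com", "city": "Gdansk"}

print(same_customer(crm, shop))   # matched deterministically on email
print(same_customer(crm, app))    # no email, matched on name and city
print(same_customer(crm, other))  # different person
```

In practice the probabilistic part would rely on more signals and on carefully tuned thresholds, because a false match merges two different customers into one profile.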

Types of consumer data

Consumer data may be classified according to how and from which sources the organization obtains it. The most complete picture of consumers emerges when data is obtained from various sources, i.e. the organization’s own data (1st party data), data obtained through cooperation with business partners (2nd party data), and data collected by independent companies (3rd party data).

  • 1st party data is data collected from the organization’s own sources, e.g. CRM, CMS, and MA (marketing automation) systems, its own websites, advertising creatives, and mobile applications. It can be divided into:
    • Personal data (Personally Identifiable Information – PII) – name and surname, address, email, and telephone number collected when a customer signs up for a newsletter, accesses content and services that require an account, profile or login, makes a purchase, payment or delivery arrangement, or joins a loyalty programme. This allows for full customer identification: learning who the customer is and what transactional relationships the customer has with the organization.
    • Anonymous data – e.g. cookies, the results of advertising campaigns, search history, history of calls to call centres, geolocation data, device IDs, social media profiles. This allows you to enrich your knowledge and understanding of the broader customer base.
  • 2nd party data is data collected by direct business partners, e.g. on partner websites or in the digital resources of companies with which a business relationship exists (cooperation in organizing an event, sponsoring, sponsored articles, themed blogs, a company’s posts and publications on third-party websites). Examples include name and surname, address, email, telephone number, behaviour, purchase history, and engagement metrics. It allows you to enrich your knowledge and understanding of a wider customer base. This type of data can often be anonymized.
  • 3rd party data is behavioural, demographic, geolocation and similar data collected by independent companies (“data sellers and brokers”), e.g. DMPs, data clouds, and data processing companies. On the one hand, it can be used to profile and segment anonymous customers in order to target advertisements more accurately and to build so-called lookalike audiences, i.e. to search the Internet for digital twins of the best-converting customers. On the other hand, by linking IDs (ID graph), you can supplement your current customer profiles with additional information, enriching your knowledge about them.
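
The lookalike idea mentioned above (finding “digital twins” of the best-converting customers) can be sketched in a few lines of Python. This is a simplified illustration, not how any particular DMP works: the three behavioural features, their 0–1 scaling and the profile values are all hypothetical, and cosine similarity is just one possible measure.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two feature vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def centroid(vectors):
    """Average profile of the seed audience."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def lookalikes(seed_profiles, candidates, top_n=2):
    """Rank anonymous candidates by similarity to the average seed profile."""
    target = centroid(seed_profiles)
    ranked = sorted(
        candidates.items(),
        key=lambda item: cosine_similarity(item[1], target),
        reverse=True,
    )
    return [candidate_id for candidate_id, _ in ranked[:top_n]]


# Hypothetical features scaled to 0..1:
# [visit frequency, share of category views, share of discounted purchases]
best_customers = [[0.9, 0.8, 0.1], [0.8, 0.9, 0.2]]
anonymous_users = {
    "c1": [0.9, 0.9, 0.1],
    "c2": [0.1, 0.2, 0.9],
    "c3": [0.7, 0.6, 0.3],
}

print(lookalikes(best_customers, anonymous_users))  # ['c1', 'c3']
```

Real lookalike modelling typically uses many more features and a trained model, but the principle is the same: describe the best customers, then look for anonymous profiles that resemble them.
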
How can consumer data be acquired

How can large consumer data sets be collected?

  • Cloud – in the case of huge data sets (big data), collecting and processing data in the cloud may prove to be the better solution, as storage and processing capacity can be scaled as needed. A significant proportion of DMP, CRM, and MA class solutions that collect and process customer data already operate entirely in the cloud under the Software as a Service (SaaS) model. Personal data processing in the cloud is, of course, subject to many regulations and supervisory requirements – GDPR, PFSA, etc.;

  • On-premise – storing and processing customer data in the organization’s own IT infrastructure gives it greater control over data and processes but limits efficiency and scalability to the available infrastructure. This requires long-term planning of investments in the hardware, software and IT teams necessary to maintain and develop it.

How can data be processed to bring business value?

  • Real time – solutions that allow for continuous input, processing and output of data in real time (within milliseconds or seconds). They continuously process a huge stream of input data and return an immediate system response, i.e. a result that can be applied straight away. Real-time solutions require more efficient IT systems;
  • Batch (batch processing) – solutions that perform a series of interrelated tasks separated into input, processing and output of data. They require processes to be queued and use far fewer IT resources, but they do not deliver results in real time, only in planned time “windows”.

Each of these methods has its advantages and disadvantages. In one organization, different data can be processed in various ways. The choice of method for a data type depends on the actual use of the data (not all data needs to be provided in real time to carry value) and the estimated costs of processing the data in each method.
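
The contrast between the two processing modes can be shown with a minimal Python sketch that computes the same metric (revenue per store) in both ways. The event format, store names and class name are hypothetical; a production system would use a scheduler for the batch job and a streaming platform for the real-time path.

```python
from collections import defaultdict


def batch_totals(events):
    """Batch: process the whole accumulated window in one scheduled run."""
    totals = defaultdict(float)
    for store, amount in events:
        totals[store] += amount
    return dict(totals)


class RunningTotals:
    """Real time: update state on every event; the result is available at once."""

    def __init__(self):
        self.totals = defaultdict(float)

    def on_event(self, store, amount):
        self.totals[store] += amount
        return self.totals[store]


# Hypothetical purchase events: (store, amount)
events = [("store_a", 10.0), ("store_b", 5.0), ("store_a", 2.5)]

# Batch: one result after the window closes.
nightly = batch_totals(events)

# Real time: an up-to-date result after each event.
stream = RunningTotals()
live = [stream.on_event(store, amount) for store, amount in events]

print(nightly)  # {'store_a': 12.5, 'store_b': 5.0}
print(live)     # [10.0, 5.0, 12.5]
```

Both paths arrive at the same totals. The difference lies in when the result becomes available and in how much infrastructure must be running continuously to provide it.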

 

What changes in data usage will affect companies in the immediate future?

Eliminating 3rd party cookies

  • Cookies used to analyse cross-domain internet user behaviour, build behavioural, demographic, and transactional profiles, etc. for data brokerage, create consumer insights and customize messages across communication channels (in Data Management Platform/Demand Side Platform tools) are gradually being eliminated from the marketing ecosystem;
  • 3rd party cookies are already blocked by the following web browsers: Firefox, Apple Safari, and Microsoft Edge;

  • Google has announced that they will be blocked in Chrome in 2022 (Chrome has a 63% share in the Polish market according to a Gemius Ranking dating from November 2020);

  • The elimination will increase the role of 1st party cookies as the only remaining user IDs on the market, available only to the owners of large online properties such as website publishers, e-commerce platforms, price comparison websites, web applications, telecoms, banks, etc. These companies are already building their own online ecosystems, as Google and Facebook did before them, although, for obvious reasons, on a smaller scale. These are so-called walled gardens based on 1st party cookies, which allow the data to be used for advertising, among other things.

Limiting the use of device IDs

  • Device IDs (IDFA, GAID, MAID) are used in the mobile app world to identify users;
  • Apple, with its iOS 14, has announced that in 2021 it will introduce restrictions on the ability of third parties to use device IDs;

  • iPhone users will have to give their explicit consent whenever 3rd party solutions attempt to track them, and app providers should prepare appropriate labels for their solutions in the Apple App Store, indicating what data they use to track users and what data collected in the apps can be linked to users’ identities.

New regulations

  • Regulations introduced by the European Union, the USA, Russia, and China on the protection of privacy, the collection and use of data, and user consents, e.g. GDPR, COPPA, CCPA.

App privacy label for Instagram in the App Store
Source: App Store, accessed: 29.01.2021.

Examples of using user data in retail

  • To achieve operational excellence, a US retailer analysed its stock levels and forecast the scale of purchases in 2,400 of the chain’s stores. Using real-time sensors placed in the stores to count people and measure the length of checkout queues, as well as historical data, the retailer’s Analytics Department was able to forecast the stock required and optimize checkout operations:
    • USD 120 million – savings from reducing the necessary stock of goods;
    • USD 1.7 million – fewer orders for out-of-stock products;
    • USD 80 million – increase in revenue;
    • Improved customer experience through less time spent in checkout queues and less frustration when a customer tries to purchase items that the chain offers but does not stock in a given store.

  • A cosmetics brand has been collecting and analysing data about its customers’ behaviour for years. This enables it to customize its message, make precise product recommendations, build relationships and loyalty (among other things, through product rating and recommendation programmes), increase sales, and build its own direct-to-consumer channels:
    • EUR 4.6 bn – e-commerce sales in 2019 (a 15.6% share, up 52.4% year-on-year);
    • ½ – share of digital channels in advertising expenditure;
    • 1.3 bn visits to its websites;
    • 33,000 employees trained under the Digital Upskilling programme;
    • Collecting data from various channels and touchpoints (consumer behaviour, preferences and transactions; product testing, e.g. a mobile application for “trying on” make-up via augmented reality; quizzes (guided selling) that help to choose a hair colour; digital diagnostic solutions such as skin “diagnosis”) enables the company to tailor its message to 2.1 octillion (2.1 × 10 to the power of 27) potential persons. In 2021, the company will launch Perso – an AI-based, self-service, customized home skin care system.

  • Thanks to its team of data scientists, data analytics and machine learning, a US-listed company is able to sell sets of clothes precisely tailored to its customers’ tastes under a subscription model. Recommendation algorithms combined with an army of personal stylists allow it to select the right style and size of clothing, thus reducing the number of returns and increasing loyalty:
    • USD 7.4 bn – market capitalization;
    • 1% of products sold are brand new goods designed using only machine learning algorithms;
    • In-depth knowledge of customers’ preferences, physical characteristics, demographics, location, purchases, etc., combined with detailed data about the characteristics of all the goods sold (size, fit, style, design, brand, price, etc.) and with consumer opinions and feedback, including the reasons for purchases and returns, has allowed the company to build a unique recommendation system: a true Netflix or Spotify for clothing.

Contact us

Michał Kreczmar

Director, PwC Poland

Tel: +48 883 365 805

Krzysztof Badowski

Partner, Strategy& Poland

Tel: +48 608 333 277
