Data Lake on AWS for Data Management at Dislicores

Oct 25, 2023 11:37:26 AM

Dislicores is a Colombian company founded in 1956. It has established itself as one of the leading distributors of wines, spirits, tequilas, beers, and non-alcoholic beverages. To date, Dislicores has 50 stores in cities such as Medellín, Bogotá, Rionegro, Cali, Bucaramanga, Ibagué, Barranquilla, Cartagena, Santa Marta, Valledupar, Montería, and Pereira. In the coming months, the company expects further expansion in Bogotá, Armenia, and Cartagena, among other Colombian territories. Its strategic vision also includes an internationalization plan that aims to bring its experience centers, the Dislicores Stores, to countries in Central and South America.

Dislicores has five sales channels: physical stores, apps, e-commerce through its website, www.dislicores.com, a WhatsApp line for deliveries, national landlines, and a button for 24-hour Rappi purchases in some cities. Its online platform receives more than 69,000 visitors per month, and one of the company's goals is to keep growing that figure through its loyalty programs. In addition, the company serves more than 39,000 customers per year.

What was the challenge?

For a company with national operations like Dislicores, quality and timely information is essential to run its marketing and distribution processes correctly. Before the data lake implementation, the company struggled to obtain a single view of its customers across its B2B and B2C businesses because the data was scattered across on-premises systems and files generated by other corporate areas.

After implementing the data lake, the company had a centralized repository of quality data that is complete, consistent, and reliable. This initiative enabled the creation of a single view of customers, the development of data-driven sales strategies, and a significant improvement in customer relationship management tailored to customer profiles.

These strategies included the design of various segmentation models, the execution and monitoring of campaigns through digital channels, and the tracking of key indicators through business dashboards.

How did we solve it?

Pragma adopted a strategic approach to these problems to meet our client’s needs. First, we built a deeper understanding of the business through a series of work sessions, defined the scope of the solution to be implemented, and established clear, measurable, and achievable objectives for the development.

Once the objectives were set, the technical teams of both companies had to align their efforts to understand the current architecture of the applications that would be the source of information, as well as the critical business and quality rules when transforming the data and extracting the information that the business needed.

With a clear view of the client’s operation, Pragma proposed a data architecture for the data lake that follows market best practices and fits the client’s needs. Drawing on our experience as an AWS Partner building cloud solutions, we proposed an architecture built from serverless components such as Amazon S3, AWS Glue, AWS Lambda, and Amazon Athena. This gives the client cost flexibility through a pay-as-you-go model and frees up capacity in the IT team by reducing on-premises server management and maintenance tasks.

As a result, we delivered a data lake with the ETL processes required to load and transform data from the client’s on-premises systems, together with a data warehouse built on a dimensional model. The warehouse holds the critical information behind the 360° customer view that the company needed for its loyalty and segmentation processes in the B2B and B2C businesses.
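
To make the idea of a single customer view concrete, here is a minimal Python sketch of how B2B and B2C records might be merged on a shared identifier. It is not the logic Pragma implemented: the field names (customer_id, name, amount), the sample rows, and the rule of keeping the first non-empty name are all assumptions chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class CustomerView:
    """Unified customer record; every field name here is hypothetical."""
    customer_id: str
    name: str = ""
    channels: set = field(default_factory=set)
    total_purchases: float = 0.0


def build_single_view(b2b_rows, b2c_rows):
    """Merge B2B and B2C rows into one record per customer_id."""
    view = {}
    for channel, rows in (("B2B", b2b_rows), ("B2C", b2c_rows)):
        for row in rows:
            # Normalize the key so " 900123 " and "900123" match.
            key = str(row["customer_id"]).strip()
            customer = view.setdefault(key, CustomerView(customer_id=key))
            # Keep the first non-empty name seen for this customer.
            customer.name = customer.name or row.get("name", "").strip()
            customer.channels.add(channel)
            customer.total_purchases += float(row.get("amount", 0))
    return view


# Made-up sample rows, only to exercise the function.
b2b = [{"customer_id": "900123", "name": "Bar Central", "amount": 1200.0}]
b2c = [
    {"customer_id": " 900123 ", "name": "", "amount": 300.0},
    {"customer_id": "1020", "name": "Ana", "amount": 80.5},
]
single_view = build_single_view(b2b, b2c)
print(single_view["900123"])
```

In a real pipeline this kind of matching would normally run inside the ETL jobs and would need business rules for conflicting or missing identifiers, which the sketch leaves out.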

Our impact: outcomes

The data lake delivered the following outcomes:

Implementing a data lake as the organization’s primary data source established principles and best practices for data architecture and governance at Dislicores.

Creating a unified view of customers laid the foundation for developing a CRM system that makes retaining and attracting customers easier.

Consolidating essential customer data into a centralized system improved the timely availability of information by 80%.

Implementing a data warehouse enhanced business intelligence by providing clean and filtered data through the AWS Glue Data Catalog and Amazon Athena.

Consolidating data in a single repository and automating processes cut the time needed to generate key business indicators from several days to just a few hours. This accelerated strategic decision-making based on critical company information, such as sales, promotions, and channels, which is now accessible from a single centralized source.

How AWS is used as part of the solution

To address the challenge of consolidating significant volumes of information from multiple systems and in a variety of formats, the proposed architecture uses Amazon S3 as the storage layer of the data lake. S3 is cost-efficient, reliable, and highly scalable, which made it the right option for Dislicores’ needs.
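
As an illustration of how objects in an S3-based data lake are often organized, the Python sketch below builds object keys with Hive-style date partitions (year=/month=/day=), a layout that AWS Glue and Amazon Athena can treat as table partitions. The layer names (raw, staged, curated), the source and table names, and both helper functions are assumptions; the article does not describe the actual bucket layout.

```python
from datetime import date
from urllib.parse import unquote_plus

# Hypothetical layer names; the article does not name the layers it used.
LAYERS = ("raw", "staged", "curated")


def object_key(layer, source, table, day, filename):
    """Build an S3 key with Hive-style date partitions."""
    if layer not in LAYERS:
        raise ValueError(f"unknown layer: {layer}")
    return (
        f"{layer}/{source}/{table}/"
        f"year={day.year}/month={day.month:02d}/day={day.day:02d}/{filename}"
    )


def promote_key(key, target_layer):
    """Map a key from its current layer to the same path in another layer."""
    if target_layer not in LAYERS:
        raise ValueError(f"unknown layer: {target_layer}")
    # Keys delivered in S3 event notifications are URL-encoded, so decode first.
    layer, _, rest = unquote_plus(key).partition("/")
    if layer not in LAYERS:
        raise ValueError(f"key is not inside a known layer: {key}")
    return f"{target_layer}/{rest}"


raw_key = object_key("raw", "erp", "sales", date(2023, 10, 25), "part-0.csv")
print(raw_key)
print(promote_key(raw_key, "curated"))
```

Keeping the same relative path in every layer makes it easy to trace a curated file back to the raw object it came from.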

Below are the primary services that were part of the solution and how they contributed to the construction of the data lake:

Amazon S3: An object storage service that offers secure, durable, and scalable storage for all data types. Because it can store documents, photos, videos, and any other kind of file, it is well suited to implementing a data lake.
Amazon Athena: A serverless interactive query service that analyzes data stored in Amazon S3 with standard SQL, without setting up a dedicated database. For Dislicores, it made it possible to implement a modern data warehouse with excellent performance, where the company pays only for the amount of data scanned while AWS manages the underlying compute.
AWS Glue: A fully managed ETL (Extract, Transform, Load) service that makes it easy to move, transform, and prepare data for analysis. Within the implemented data lake, the AWS Glue Data Catalog is crucial for organizing, cataloging, and querying all the information stored in Amazon S3.
AWS Lambda: A serverless compute service that runs code in response to events without provisioning or managing servers. Lambda plays a vital role in the data transformation processes of the implemented solution because it transforms information and moves it through the layers of the data lake cost-efficiently.
AWS Step Functions: A service that creates visual workflows to coordinate tasks and services in applications deployed on AWS. For Dislicores, it orchestrates the data transformation and transport tasks, since it integrates directly with Lambda.
Amazon EventBridge: A service that connects and routes events between AWS services and external applications and schedules the execution of tasks based on business rules or logic. Within the data lake, it triggers the data loading, updating, and transformation processes on the schedule and at the frequency the client requires.
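
To show how these services can fit together, the following Python sketch assembles an Amazon States Language definition in which Step Functions runs three Lambda functions in sequence, along with an EventBridge schedule expression that could start the workflow. It is an illustration under stated assumptions, not the project's actual workflow: the account ID, region, function names, state names, retry settings, and the daily 06:00 UTC schedule are all hypothetical.

```python
import json

# Hypothetical account, region, and function names; replace with real ones.
ACCOUNT = "123456789012"
REGION = "us-east-1"


def lambda_arn(name):
    """Build the ARN of a Lambda function in the hypothetical account."""
    return f"arn:aws:lambda:{REGION}:{ACCOUNT}:function:{name}"


def task(function_name, next_state=None):
    """One Task state that invokes a Lambda function, with a simple retry."""
    state = {
        "Type": "Task",
        "Resource": lambda_arn(function_name),
        "Retry": [
            {
                "ErrorEquals": ["States.TaskFailed"],
                "IntervalSeconds": 30,
                "MaxAttempts": 2,
                "BackoffRate": 2.0,
            }
        ],
    }
    # A state either points to the next state or ends the execution.
    if next_state:
        state["Next"] = next_state
    else:
        state["End"] = True
    return state


state_machine = {
    "Comment": "Illustrative raw -> staged -> curated pipeline",
    "StartAt": "ExtractToRaw",
    "States": {
        "ExtractToRaw": task("extract-to-raw", "TransformToStaged"),
        "TransformToStaged": task("transform-to-staged", "LoadToCurated"),
        "LoadToCurated": task("load-to-curated"),
    },
}

# EventBridge cron fields: minutes hours day-of-month month day-of-week year.
# This assumed schedule fires every day at 06:00 UTC.
SCHEDULE_EXPRESSION = "cron(0 6 * * ? *)"

definition_json = json.dumps(state_machine, indent=2)
print(definition_json)
```

The resulting JSON is the kind of document Step Functions accepts as a state machine definition, and the schedule expression is the kind of value an EventBridge rule would use to start the state machine.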

Start, end, and production
Start: 07/01/2021
End: The project is ongoing
Production: 12/31/2021