A data engineer’s ability to profile and model data rapidly is a critical determinant of how quickly and effectively we can extract, transform and load data into the target system of reference, ready for consumption by business end users. In a recent collaborative effort, PwC and Microsoft organized a training hackathon focused on adopting MS Fabric & Foundry for scaling data platforms. This intensive, multi-day event was designed to give data engineers the skills to deploy agents that automate data profiling and modelling within data transformation projects. The primary goal of the hackathon was to offer participants hands-on experience with these technologies, building both practical knowledge and a deeper conceptual understanding. The event was not just a training session; it was an immersive experience designed to accelerate the adoption of agentic AI and advanced data engineering practices.
The central challenge that the hackathon aimed to address is a question many data engineers and architects are asking: to what extent can data preparation be automated? The urgency behind this question is growing.
According to the PwC Global AI Jobs Barometer, skills for jobs most exposed to AI are changing 66% faster than for other roles, a shift driven largely by the automation of manual and repetitive tasks.
While this is a broad trend, its concrete impact is already visible.
The parallel for data engineering is clear: tasks like data profiling and modelling can create bottlenecks in the extended ETL process. Automating them is the key to streamlining data transformation projects, making them faster, more accurate, and more scalable, which in turn enables organizations to make better decisions more quickly.
The hackathon was an in-person event for participants from Warsaw, London and Zurich, structured into Foundational and Advanced tracks to suit varying levels of experience and expertise.
The Foundational track was designed for data engineers who were new to the MS stack and the application of agentic AI, with a focus on the following lab exercises:
The Advanced track was designed for more experienced data engineers who wanted to delve deeper into the application of agentic AI:
The agenda was packed with activities, including group allocations, lab introductions, hands-on exercises, and playback sessions for teams to share their progress and insights.
On the last day of the training hackathon, the data engineers worked on the following proofs of concept to demonstrate how agents and an Infrastructure as Code approach can be applied to build logical data models and configure data platforms for large-scale, multi-domain migrations.
1. Agentic Data Modelling PoC Demo
2. Agentic Infrastructure Builder PoC Demo
The hackathon was designed to produce several key outcomes, including hands-on configuration of MS Foundry to deploy agents that automate data preparation:
A working knowledge of MS Fabric and Foundry: through dedicated, hands-on learning labs, participants gained a practical understanding of these powerful platforms.
The development of a conceptual agent architecture for data profiling: this provided participants with a blueprint for how they can leverage agentic AI to automate data profiling in their own organizations.
The creation of conceptual and logical data models: based on dummy data within a dedicated sandbox environment, this gave participants the opportunity to apply their newly acquired skills to a realistic problem.
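To make the profiling step more concrete, the kind of per-column summary that a profiling agent might be asked to produce can be sketched in a few lines of Python. This is a minimal illustration only: the function name `profile_columns`, the sample rows, and the chosen metrics are assumptions made for this example, and it does not represent the agents built during the hackathon or any MS Fabric or Foundry API.

```python
from collections import Counter


def profile_columns(rows):
    """Summarise each column of a list of dict records.

    Hypothetical helper for illustration: counts rows, nulls (None or
    empty string), distinct values, observed Python types, and the most
    frequent non-null value per column.
    """
    # Collect column names in first-seen order.
    columns = []
    for row in rows:
        for name in row:
            if name not in columns:
                columns.append(name)

    profile = {}
    for name in columns:
        values = [row.get(name) for row in rows]
        non_null = [v for v in values if v is not None and v != ""]
        profile[name] = {
            "rows": len(values),
            "nulls": len(values) - len(non_null),
            "distinct": len(set(non_null)),
            "types": sorted({type(v).__name__ for v in non_null}),
            "most_common": (
                Counter(non_null).most_common(1)[0][0] if non_null else None
            ),
        }
    return profile


# Made-up dummy data, similar in spirit to a sandbox dataset.
sample_rows = [
    {"customer_id": 1, "country": "PL", "email": "a@example.com"},
    {"customer_id": 2, "country": "UK", "email": None},
    {"customer_id": 3, "country": "PL", "email": ""},
]

for column, stats in profile_columns(sample_rows).items():
    print(column, stats)
```

A summary like this (null rates, cardinality, observed types) is the sort of input that can then inform a conceptual or logical data model, for example by flagging `customer_id` as a candidate key because every value is distinct and non-null.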
By focusing on these outcomes, the hackathon successfully equipped participants with the skills and understanding needed to leverage the power of automation in data engineering, ultimately enabling them to build more efficient and effective data solutions.
If this resonates with you, we encourage you to bring your teams together for similar collaborative exercises. These events provide the crucial space to move from theoretical knowledge to hands-on practical exercises that simulate applying services like MS Foundry to real problem statements, thereby reinforcing individual learning through collective problem-solving.
Ultimately, this will act as a catalyst for participants to extend their training journey towards industry certifications, including Microsoft Certified: Fabric Data Engineer Associate (DP-700). Beyond this, the lab exercises will enable teams to create PoCs that can in turn be developed into working solutions within dedicated sandbox environments, demonstrating a direct and accelerated path from upskilling to real-world application.