Please collect your lanyards from the front counter.
Lars Klint will welcome us to the conference.
Data Contracts: Data Quality for AI
In this session, we'll dive into the world of data contracts - API-based agreements between data producers and consumers that capture the schema, semantics, distribution, and enforcement policies of the data. Learn how data contracts provide a single surface for collaboration on data in a shared language, allow the data model to evolve in an agile, iterative way, and apply data governance incrementally where it's needed for AI and ML systems. We'll explore how organizations can leverage data contracts to ensure Artificial Intelligence systems are trained on trustworthy and well-governed datasets.
Data Operating Models, The Missing Link Between Strategy and Execution
We've all been part of companies that are embarking on the "new great strategy", attended the town halls, drunk the Kool-Aid on the idea that we're finally going to fix some things around here. Yet we get back to our desks and find that not much really changed. The best intentions and near-infinite elbow grease don't seem to move us towards our lofty new goal. You feel encumbered by the pressures and bureaucracy of what was, as you try to bring about the future. Operating models give us a way to bridge between the multi-year strategy and the day-to-day execution. A way of taking a machete to the existing business architecture to mindfully organise around unlocking flow and value. In this talk we'll go through the building blocks of designing a data operating model. The pieces you'll need to bring alignment between your executive leadership and engineering. You'll leave empowered to not just be in charge of your own destiny, but to be in control of it.
Enjoy some delicious refreshments to keep you going towards lunch.
Build your own electric vehicle charging map with PostGIS
An educational live demo of a map server showing electric vehicle charging stations so you'll never be stuck without a charge again. Your map will show the available charging stations, which ones you can reach with your current range and the most efficient route to each one. Come along and see what you can build with PostGIS and pgRouting in the context of a topical real-world use case. We will also take a look at some ETL and geocoding examples using PL/Python running natively in Postgres. Finally, we will look at why PostGIS is so powerful for both performance and integration, including functions to work with GeoJSON, KML and MVT, built-in 3D and topology support, and advanced spatial indexing enhancements. This is a beginner to intermediate session for PostGIS and assumes a rudimentary knowledge of SQL and spatial data.
Exposing Big Edge Data: What Big Cloud Providers Don't Want You to Know
In this session, I will demystify managing 'Big Data in Motion' from edge devices to the cloud. We'll delve into the innovative engineering design I've recently implemented, leveraging the capabilities of AWS Greengrass and StreamManager. This strategy offers an efficient and flexible IoT data ingestion and analysis pathway. The discussion will highlight how this approach distinguishes itself from and enhances the traditional out-of-the-box solutions that large cloud providers typically endorse.
Key Topics:
• Deciphering the Challenge: An in-depth investigation into the complexities of managing high-frequency data streams from edge devices.
• Alternative Engineering Design: A comprehensive exploration of my pioneering approach that utilizes AWS Greengrass and StreamManager to ship high-frequency data whilst still maintaining flexibility.
• Unmasking the Data: A practical demonstration of how query services like Athena can be utilized directly on raw data from the edge. Further, I will showcase how Grafana can be effectively employed for rapid historical data visualization.
By the end of this session, you will witness a practical demonstration, complete with code, illustrating how high-frequency raw data can be readily exposed to teams without investing significant engineering time in constructing complex pipelines. Gain insights that big cloud providers might prefer to keep under wraps.
Unleashing the Magic of Machines and LLMs by Mastering MLOps
MLOps combines ModelOps, DataOps, and DevOps. It involves actively managing a productionized model, ensuring its stability and effectiveness. What is the best way to focus on optimizing data, model, and developer operations to maintain the functionality of the ML application? MLOps involves striking a balance between providing flexibility and visibility to data scientists for model development and maintenance while granting ML engineers control over production systems.
Relax and network with your peers.
Data engineering challenges in building a real-time recommendation system for the real estate industry
Developing a real-time recommendation system is challenging, not only in AI/ML model training but even more from a data engineering and integration perspective. Property listings are updated continuously: new listings are added and existing listings expire every minute. It's not feasible or cost-effective to train new models at that frequency. Moreover, end users' behaviour, such as property preference, location, and budget, changes while they are browsing through properties. In addition, model response times need to be reduced significantly to serve recommendations on the website, and clickstream behaviour must be tracked continuously in a database to respect the most recent preferences. To serve each customer's interests optimally, we had to address multiple data engineering challenges. In this session, we will highlight those challenges and how the data engineering team supported productionizing the end-to-end system.
Data Warehouse: Data Scientists' new lab!
Incorporating Machine Learning into your traditional data warehouse can be a complex process that entails managing multiple distributed components written in programming languages that may be unfamiliar to you, potentially slowing down your delivery speed. Imagine if your data warehouse could serve as an "All-in-One Tool" for your Data Scientists, enabling them to securely and cost-effectively utilise Machine Learning models without the need for extensive Extract, Transform, Load (ETL) processes. What if all they needed was SQL? In this presentation, I will demonstrate how Amazon Redshift ML can serve as the ultimate solution, acting as the "One Tool to Rule Them All," allowing you to achieve this seamlessly in just a few minutes.
Harnessing Azure OpenAI for Data Engineering
In this enlightening session, Sergio will unfold the benefits of integrating Azure OpenAI into Data Engineering processes. Attendees will gain insight into the advantages of deploying Azure OpenAI in their Data Engineering initiatives, including the effortless integration of top-tier AI functionalities into their routine tasks and operations. Sergio will demonstrate how data engineers can use Azure OpenAI to improve code readability, suggest code improvements, help build test cases, create sample code and data pipelines, and much more. Azure OpenAI offers access to advanced AI models crafted by OpenAI within the Azure ecosystem.
Enjoy some delicious refreshments to keep you going into the afternoon.
Panel discussion - Perth
Hear from some of the brightest minds in data engineering both locally and internationally in our panel discussion!
Trust but Verify: Ensuring Data Quality with CI/CD
In this talk, we dive into the realm of data quality assurance through Continuous Integration and Continuous Delivery (CI/CD) practices. We explore how a CI/CD pipeline acts as a safety net, catching data inconsistencies and errors before they impact end users. From automated testing to timely data validation, we discuss how these principles increase data reliability and trustworthiness. We also reveal how we have leveraged CI/CD to create a robust data quality assurance framework.
How can we have a common understanding without a common language? (Lessons from the semantic layer)
Forcing everyone to use the same terms for stuff will make your data program fail. But if we can't agree on what to call things, how can we manage data? Mark shares his insights from over 20 years of creating shared understanding (to have greater impact with data) in Global Banks, Government, and Media and Advertising.
Lars Klint will close out the conference.
Join us for some late refreshments and further networking to close out the conference.
Building Data Teams and Platforms at South32
Regional Analytics Lead, Asia
Data Engineer at Canva, Founder of Data Engineer Camp
Senior Engagement Partner, Cognizant Servian
CMD Solutions/Mantel Group, Lead Data Consultant
Solutions Engineer at EDB
Data Strategy and Management Specialist - Independent
Mechanical Rock Data-Ops Engineer
DevOps engineer at Mechanical Rock
Senior Platform Engineer @ First Mode
Sergio Zenatti Filho
Microsoft - Sr Cloud Solution Architect Data & AI
AWS Community Hero | Principal Consultant at Mechanical Rock | DevOps, Data and Serverless Enthusiast