Leveraging Apache Iceberg for Effective Data Governance in a Data Lakehouse

Description Preamble:
Companies have collected, and continue to collect, a wide range of consumer data, which they store in data lakehouses for further analysis. In many cases, the collected data contains personally identifiable information. Recent data breaches have highlighted the need for better data governance guidelines and for regulation of how companies handle consumer information. The Right To Be Forgotten challenges companies to remove personal information they can no longer legally hold without compromising data quality, security, or atomicity. Apache Iceberg is a table format that makes it easier for data lakehouses to comply with this legislation.

I will discuss Apache Iceberg and the benefits it provides for managing large amounts of data. I will also show how Apache Iceberg can become part of your modern data architecture by bringing a range of capabilities to your data lake, including regulatory compliance, ACID transactions, schema evolution, time travel, and incremental processing. By the end of the talk, attendees will have a better understanding of how Apache Iceberg can be used successfully in data lakehouses.