Lessons from reluctant data engineering

As we all know, solid data engineering is essential to the success of data science and AI applications. And yet, people often get excited about fancy machine learning models and neglect the data engineering layer. This is totally understandable: playing with data in a throwaway notebook is more relaxing than dealing with a data pipeline that keeps finding ways to break in production. In this talk, I'll share lessons on data engineering from a data science perspective. Everywhere I've worked, from small start-ups to established companies, I've found that I had to do some data engineering if I wanted my work to ever reach production. While I've always been reluctant to do too much of it, my engineering background has placed me in a better position to do it than colleagues who started off as analysts or academics. You could call my work full-stack data science, reluctant data engineering, or some other data & AI thing. Whatever it is, I hope that my talk will help us all play better with each other, across all layers of the data stack.