This presentation was recorded at GOTO Copenhagen 2023. #GOTOcon #GOTOcph
https://gotocph.com

James Bowkett - Technical Delivery Director at OpenCredo @OpencredoItd

RESOURCES
https://twitter.com/techwob
https://www.linkedin.com/in/jamesbowkett

ABSTRACT
Data is everywhere, and so too are data-centric applications. As the world becomes increasingly data-centric, and the volumes of that data increase over time, data engineering will become more and more important. If we're going to be dealing with petabytes of data it will be better to get the fundamentals in place before you start, rather than trying to retrofit best practices onto mountains of data. This only makes a difficult job harder.

The 12-factor app helped to define how we think about and design cloud native applications. In this presentation, I will discuss 12 principles of designing data-centric applications that have helped me over the years across 4 categories: Architecture & Design, Quality & Validation (Observability), Audit & Explainability, Consumption.

This has ultimately led to our teams delivering data platforms that are both testable and well-tested. The 12 factors also enable them to be upgraded in a safe and controlled manner and will help them get deployed quickly, safely and repeatedly.

This talk will be filled with examples and counter examples from the course of my career and the projects that my teams have seen over the years. It will incorporate software engineering best practices and how these apply to data-centric engineering. We hope that you can benefit from some of our experience to create higher quality data-centric applications that scale better and get into production quicker. [...]

TIMECODES
00:00 Intro
01:51 What & why?
06:17 Architecture & design
06:27 No. 1: Data structures as code
10:07 No. 2: Append-only data structures
13:21 No. 3: Optimize for access & retrieval
15:51 No. 4: Separate data from logic
22:32 No. 5: Strongly type your data fields
25:03 Quality & validation
25:14 No. 6: Architect for regression testability
29:19 No. 7: Track changes in your test data
31:59 Audit & explainability
32:08 No. 8: Mind your metadata: Data cataloguing
34:37 No. 9: Mind your metadata: Code traceability
37:02 Consumption
37:28 No. 10: Defined APIs for accessing data
40:08 No. 11: Defined SLAs (& SLOs) for data
41:17 No. 12: Treat data as a product
42:28 Summary
45:32 Outro

Download slides and read the full abstract here:
https://gotocph.com/2023/sessions/2968

RECOMMENDED BOOKS
Zhamak Dehghani • Data Mesh • https://amzn.to/3tTCwAC
Sandeep Uttamchandani • The Self-Service Data Roadmap • https://amzn.to/3wAw5W2
Piethein Strengholt • Data Management at Scale • https://amzn.to/3tya08H

https://twitter.com/GOTOcon
https://www.linkedin.com/company/goto-
https://www.instagram.com/goto_con
https://www.facebook.com/GOTOConferences

#12FactorApp #Data #DataCentricApp #CloudNative #SoftwareArchitecture #DataPlatforms #DataStructuresAsCode #ContinuousDelivery #SLA #SLO #DataAsAProduct #JamesBowkett