By Matt Turner
Published on June 14, 2023
Every organization wants to better serve its customers, and that goal is often achieved through data. But how?

At the recent webinar “Fifth Third’s Journey to Data Mesh with Alation and Snowflake,” Kaleigh Lavorini, director of product ownership and data strategy at Fifth Third, explains that many people don’t know whether they’re a data source owner, or even where their data goes or who uses it. That’s a problem! And that’s why Fifth Third Bank, as part of its digital transformation, concluded that the decentralized approach of a data mesh architecture, with its focus on data products and people, would help it become more agile, flexible, and data-driven.

“Situationally, it was a really good time to deploy a data mesh architecture and its principles and invest in this space because we were doing so much tech modernization,” Lavorini says. “So why not make data a part of it?”
Giving everyone the keys to a race car doesn’t equip everyone to drive that car — especially if you’re not hiring more race car drivers. “We didn’t have access to hundreds of data engineers out in the marketplace,” Lavorini points out. So instead of looking toward the job market, Lavorini’s team looked internally at their people and supply chain. “If you think about the supply chain of a bank like us or any financial institution, or really anyone who’s maybe not directly in the manufacturing space, what is your supply chain? Data!” she laughs. A supply chain of data means people have very data-rich roles, whether they realize it or not. For Lavorini, the foundation for decentralizing data was “taking the human piece of it and marrying it up with the tech goals.”

“It’s about changing the way teams think, and it’s about driving us towards a modern data culture,” Lavorini says. “This is the foundation of our data mesh strategy.” She adds that whenever she talks about data mesh, “I focus on the people piece of it because it’s people who are actually going to drive it forward.”
As the product owner for self-service analytics, Lavorini would speak to people in various roles and departments — leaders and developers; people in operations, finance, and marketing; those on the consumer side and the commercial side — and find that they had similar answers to the question “What is painful about data at Fifth Third?” The answers pointed to a consistent set of pain points.
These pain points are common themes regardless of industry or company, and they are a direct result of not having a modern data culture. Again, a data culture is about your people, both technical and non-technical. “We all live in a world where data drives our supply chain,” Lavorini says. “These are people who just touch and interact with and are affected by data constantly, every day.”
Lavorini started with a team of eight that focused first on data mesh research, including platforms, best practices, and proofs of technology. This prepared the bank to launch a half dozen teams aligned with what they call the data value stream, defined as all the activities, from start to finish, to build a data product.
[Figure: The data value stream at Fifth Third Bank]

The data producer — like a platform engineer — sits close to the source. The role starts and ends with selecting and moving data from a source system. The files go to a landing zone, and then the central data office, sitting between producers and consumers, does everything else, from acquiring to maintaining the data. That office, Lavorini cautions, does not scale.

At the same time, there has been an explosion in both the number of source systems and the volume of data they hold. “We have more source systems in the environment,” Lavorini notes. “We want to consume more data. The size of the data is increasing.” That growth comes not just from the producer side but also the consumer side, with more people wanting to access and use more and more data.

“That’s why data mesh is super attractive to us,” Lavorini says, noting that, while the enterprise data office cannot scale, the data mesh can. “We can rally around building the right software and capabilities to actually figure out how we scale ourselves and achieve federation.”
To implement the data mesh, Fifth Third Bank launched five agile development teams known as squads. Under Lavorini’s leadership, these data teams are focused on building foundations that make it easier for everyone to perform the same data activities. The focus areas of these teams include:
“Our landing zone of choice is Snowflake,” Lavorini says. “The process is simplified. Anyone building anything net-new publishes to Snowflake in a database driven by the use case and uses our commoditized web-based GUI ingestion framework. You don’t have to write ETL jobs.” That lowers the barrier to entry because you don’t have to be an ETL developer.
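The webinar doesn’t detail the framework’s internals, but a GUI-driven ingestion flow like this typically reduces to a handful of Snowflake statements generated on the user’s behalf. Here is a minimal sketch of what that generated SQL might look like; the database, stage, and table names are hypothetical, and credentials and storage integration are omitted for brevity:

```sql
-- Hypothetical example of the SQL a GUI ingestion framework might generate.
-- Object names are illustrative, not Fifth Third's actual names.

-- A database driven by the use case, as Lavorini describes
CREATE DATABASE IF NOT EXISTS CONSUMER_ANALYTICS;

-- An external stage pointing at the landing-zone files
-- (storage integration / credentials omitted in this sketch)
CREATE STAGE IF NOT EXISTS CONSUMER_ANALYTICS.PUBLIC.LANDING_ZONE
  URL = 's3://example-landing-bucket/accounts/'  -- placeholder location
  FILE_FORMAT = (TYPE = CSV SKIP_HEADER = 1);

-- Target table for the published data
CREATE TABLE IF NOT EXISTS CONSUMER_ANALYTICS.PUBLIC.ACCOUNTS_RAW (
  ACCOUNT_ID  VARCHAR,
  OPENED_DATE DATE,
  BALANCE     NUMBER(18, 2)
);

-- One COPY INTO replaces a hand-written ETL job
COPY INTO CONSUMER_ANALYTICS.PUBLIC.ACCOUNTS_RAW
  FROM @CONSUMER_ANALYTICS.PUBLIC.LANDING_ZONE;
```

The point of a commoditized framework is that users click through a form instead of writing statements like these, but the underlying work stays this simple: a stage and a COPY INTO rather than a custom ETL job.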
This team’s scope is massive because the data pipelines are huge and there are many different capabilities embedded in them. Fifth Third leverages dbt for “embedded” governance. “The enterprise data office can control the packages that are built in dbt and as we start to federate to different engineering teams, when they build a data pipeline, they all deploy the same packages,” she says. The team focuses on cleansing and transforming pieces of the data value stream, while seeking ways to further commoditize and standardize data.
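In dbt, this kind of embedded governance usually takes the form of a shared package, declared once in each project’s packages.yml, whose macros carry the enterprise’s cleansing and masking rules. As a rough sketch only (the package name edo_standards and its macros are hypothetical stand-ins, not Fifth Third’s actual code), a federated team’s model might look like:

```sql
-- models/staging/stg_accounts.sql
-- Hypothetical dbt model. "edo_standards" stands in for a shared package
-- owned by the enterprise data office; every federated team installs the
-- same package via packages.yml, so the same rules deploy everywhere.

{{ config(materialized='view') }}

select
    account_id,
    -- centrally maintained cleansing rule, applied identically by all teams
    {{ edo_standards.trim_and_upper('customer_name') }} as customer_name,
    -- centrally maintained masking rule for sensitive fields
    {{ edo_standards.mask_pii('ssn') }} as ssn_masked,
    opened_date
from {{ source('core_banking', 'accounts') }}  -- source assumed defined in sources.yml
```

Because the macros live in one package, the central office can change a rule once and every pipeline picks it up on the next deploy, which is what makes the governance “embedded” rather than enforced after the fact.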
Lavorini notes that data products are built for a reason and should be highly reusable: “It’s data, which is gold, because it’s what drives our supply chain.” This team built a web-based application on top of ServiceNow for users to register and recognize a data product. It’s also the mechanism that brings data consumers and data producers closer together.

“Our legacy architecture, like that at most organizations, is a massive on-prem enterprise data warehouse,” Lavorini says. “As we modernize our core banking platforms, the data goes with that modernization journey.” The remaining decoupling teams are experts in all of the technologies, whether it’s a third-party partner like Snowflake, Alation, or dbt, or an internal entity like the bank’s homegrown ingestion framework or data marketplace. These agile teams are fully dedicated to coaching, training, peer programming, and upskilling.

“You cannot do this by building cool tech alone,” Lavorini says. “You have to invest in people for people.” This meant having two distinct teams working in close collaboration to update their platforms, one comprising former platform engineers, mainframe developers, and even individuals initially unfamiliar with SQL. Fully dedicated to the task of modernization, these teams integrate through peer programming, coaching, training, and upskilling to ensure success, with the ultimate goal of taking full ownership of a comprehensive data pipeline and numerous data products. “You have to invest in the people as you invest in the technology,” she concludes.

In the diagram above, the bottom layer of the stream is the data management and governance capabilities. Altogether, the investment — in both technology and “people for people” — is meant to bring data producers and data consumers closer together and fulfill the potential and value of data at the bank.
Lavorini was joined on the webinar by Matthias Nicola, Field CTO at Snowflake. Together they noted these major lessons from executing the data mesh:
“Someone in finance or marketing doesn’t care about tech simplicity or that they’re using more scalable cloud-based platforms,” Lavorini says. “They care about having access to 100% of the data needed to drive profitability decisions or reduce risk or better segment their bank products.” She suggests considering the “why” and the “what”: “You’re asking people who have never owned data before to do something very different. Why do they care?” Nicola notes that the key to success is “truly embracing the concept of data as a product,” which includes aligning the data product to business success.
One of the critical points for Nicola is that “many companies are finding that individuals or teams in the organization can be quite hesitant to accept ownership, such as domain or data product ownership. That is one of the reasons why the responsibilities of a data product owner should be defined very precisely. This removes uncertainty and ensures people know what is and isn’t expected of them.”
“The more we simplify and mature our tech stack and make it easier to build and own a data pipeline, the less investment we need to make in coaching, upskilling, peer programming, and hand-holding,” Lavorini says. “Invest in building out the technology and making it easy to use.” This is a key point for Nicola, who also notes that “data product owners need to be given a clear understanding of the help they’ll be getting,” including how automation and education will make their jobs easier. This, in turn, helps increase their willingness to accept data ownership.
“Data mesh is not easy,” Lavorini shares. “You’re asking a lot of people to do and think very differently, and that should not be taken lightly.” This is even more important for a project like data mesh where, as Nicola notes, “You are never truly done. The better question is how to measure progress or maturity of your data mesh,” and that includes how people adopt the concept. The last lesson is the most important. “You should lead with empathy,” Lavorini concludes, “and build your team around an understanding that data mesh, while valuable, requires a really challenging evolution.”