cost savings over single-vendor solutions
TB of data governed and monitored
enterprise data lakehouse tables cataloged and monitored
With roughly 190,000 real estate agents, Keller Williams (KW) is the world’s largest real estate franchise by agent count. Keller Williams’ agent-centric culture is a core component of its success, as is its technology.
“Data associated with property listings is the lifeblood of Keller Williams’ Enterprise Information Management (EIM) team,” says Cliff Miller, enterprise data architect. The company profiles over 70 TB of data every day. Data plays a crucial role in helping the business make informed decisions, but this is only possible after that data is validated for quality and made easily accessible to stakeholders.
Miller joined Keller Williams in the spring of 2022, along with Dan Djuric, Head of Enterprise Data and Advanced Analytics. By then, KW already had a largely modernized data stack built around the Google Cloud Platform, including BigQuery, Cloud SQL, and Dataflow/Composer. Their remaining legacy infrastructure was on a roadmap for cloud transformation.
Both Djuric and Miller identified data governance and cataloging as one of KW’s most significant areas for improvement. The goal was, in Miller’s words, “a data framework that the entire organization can rely on, shifting towards empirical, objective-based evidence.”
While searching for a data cataloging solution, the EIM team recognized another core need in upgrading their data maturity: an enterprise-grade data quality platform. Miller described data quality issues as the “sleeper issues” that technology teams at large wish they knew about well before they surface and cause all sorts of collateral damage.
The EIM team sought a data catalog platform and a robust data quality solution to help KW:
Govern 1300+ enterprise data lakehouse tables
Empower the entirety of Keller Williams with access to data
Build data trust and provide confidence that the data is ready for use
The EIM team explored a variety of solutions for data cataloging and monitoring. They quickly ruled out the legacy suites from the software giants, whose primary focus was neither data cataloging nor data quality. Had KW opted for one of those bundles, it would have been locked into a complex ecosystem of products it didn’t need, with a price tag to match.
“We were in need of two core platform competencies, we didn’t need ten. We wanted those things to be best of breed at what they did,” said Miller.
Because the EIM team was looking for vendors whose core competency was data governance, KW ultimately selected Alation as its data catalog and Anomalo for data quality monitoring. Critically, Anomalo and Alation have a native integration that lets mutual customers seamlessly view data quality in the context of the data catalog. This was compelling for the EIM team because Alation and Anomalo effectively behave as one unified product.
Figure: Seamless integration between Alation and Anomalo
When starting with Anomalo, KW used Alation’s popularity feature to identify which of their 1300+ tables to prioritize. So far, they’ve configured deep data quality monitoring for 250 tables and higher-level table observability for another set that didn’t require quite that level of profiling.
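The prioritization step described above can be sketched in a few lines. This is a minimal illustration, not KW's actual process or either vendor's API: the `Table` record, the `popularity` score, the `assign_tiers` helper, and the tier limits are all hypothetical names chosen for the example, and it assumes popularity scores have already been exported from the catalog.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Table:
    name: str
    popularity: float  # hypothetical score, assumed to be exported from the catalog


def assign_tiers(tables, deep_limit, observability_limit):
    """Rank tables by popularity and assign a monitoring tier to each.

    The most popular `deep_limit` tables get deep data quality checks, the
    next `observability_limit` get lighter table observability, and the
    rest are left unmonitored for now.
    """
    # Sort by descending popularity; break ties by name so results are stable.
    ranked = sorted(tables, key=lambda t: (-t.popularity, t.name))
    tiers = {}
    for position, table in enumerate(ranked):
        if position < deep_limit:
            tiers[table.name] = "deep"
        elif position < deep_limit + observability_limit:
            tiers[table.name] = "observability"
        else:
            tiers[table.name] = "unmonitored"
    return tiers


# Made-up table names and scores, for illustration only.
tables = [
    Table("listings", 98.0),
    Table("agents", 75.5),
    Table("offices", 40.0),
    Table("tmp_export", 1.2),
]
tiers = assign_tiers(tables, deep_limit=2, observability_limit=1)
print(tiers)
```

Ranking first and cutting at fixed limits keeps the monitoring budget explicit: raising `deep_limit` is a deliberate decision rather than a side effect of a score threshold.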
Because Anomalo and Alation continue to invest in their partnership, the value proposition only increases over time. For instance, Alation’s data lineage features now enhance Anomalo’s ability to trace how data quality affects KW’s entire stack. That way, when Anomalo detects a discrepancy, it’s easy to understand how that fits into upstream data sources and downstream data consumers.
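To make the lineage idea concrete, here is a small sketch of how a flagged table can be mapped to the downstream assets it may affect. It assumes lineage is available as a simple mapping from each table to its direct consumers; the `lineage` dictionary, the table names, and the `downstream_of` helper are hypothetical and do not reflect how Alation or Anomalo store or expose lineage.

```python
from collections import deque


def downstream_of(lineage, start):
    """Return every asset reachable downstream of `start`, in breadth-first order.

    `lineage` maps an asset name to the list of assets that read from it.
    """
    seen = {start}
    impacted = []
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for consumer in lineage.get(node, []):
            if consumer not in seen:  # skip assets already visited (handles cycles)
                seen.add(consumer)
                impacted.append(consumer)
                queue.append(consumer)
    return impacted


# Hypothetical lineage graph: each key feeds the assets listed in its value.
lineage = {
    "raw_listings": ["clean_listings"],
    "clean_listings": ["listing_metrics", "agent_dashboard"],
    "listing_metrics": ["agent_dashboard"],
}

impacted = downstream_of(lineage, "raw_listings")
print(impacted)
```

Running the same traversal over a reversed mapping (consumer to sources) would list upstream candidates for the root cause instead.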
Since transitioning to Anomalo and Alation, KW has moved significantly along the data management maturity curve. The two data governance solutions are key components behind a cultural shift at Keller Williams: the entire company, not just the EIM team, can now naturally incorporate data into their workflows to steer business decisions.
Both Anomalo and Alation emphasize universal accessibility, meaning everyone, from non-technical users to veteran data scientists, can find the platforms intuitive. The EIM team counts this as a major success, since the solutions also aim to raise the company’s overall data literacy.
KW achieved these results at an estimated cost savings of 10x compared to bundled legacy solutions. Not only that, the software’s value is projected to grow over time.
The EIM team derives value in two main ways from Anomalo. “Anomalo has made a ton of difference around what we’ve been able to observe and keep track of. There’s the day-to-day, ‘How is everything looking?’ And there are also indicators about how our business is trending. You can do both—it’s not an either/or proposition,” said Miller.
That combination of data observability and deeper metrics monitoring is made possible by Anomalo’s machine learning capabilities, which can automatically detect unusual patterns and identify the root cause of issues. Whereas some data monitoring solutions suffer at scale, Anomalo’s algorithms get more statistical power from large datasets.
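The point about statistical power can be illustrated with a textbook two-proportion z-test. This is not Anomalo's algorithm, which the source does not describe; it is a generic sketch, with a hypothetical `null_rate_z` helper and made-up row counts, showing why the same shift in a column's null rate is easier to distinguish from noise when the table is larger.

```python
import math


def null_rate_z(nulls_before, rows_before, nulls_after, rows_after):
    """Two-proportion z-score for a change in a column's null rate."""
    rate_before = nulls_before / rows_before
    rate_after = nulls_after / rows_after
    # Pooled rate under the assumption that nothing changed.
    pooled = (nulls_before + nulls_after) / (rows_before + rows_after)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / rows_before + 1 / rows_after))
    if std_err == 0:
        return 0.0  # no variation at all, so no measurable shift
    return (rate_after - rate_before) / std_err


# Same shift in both cases: the null rate moves from 2% to 3%.
small = null_rate_z(20, 1_000, 30, 1_000)
large = null_rate_z(20_000, 1_000_000, 30_000, 1_000_000)

# 1.96 is the usual two-sided 5% threshold for a z-score.
print(f"1,000 rows per day:     z = {small:.2f}, flagged = {abs(small) > 1.96}")
print(f"1,000,000 rows per day: z = {large:.2f}, flagged = {abs(large) > 1.96}")
```

With 1,000 rows per day the shift stays below the threshold, while with 1,000,000 rows per day it is far above it, because the standard error shrinks with the square root of the row count.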
As Keller Williams continues its journey along a data modernization path, Anomalo and Alation are poised to ease that transition. Together, they’ll help guard against corrupted data and keep it organized in a way that makes it useful. In Miller’s words, “Information is a differentiator and something that provides our organization with exceptional ability to go above and beyond others in the industry.”