Databricks Data + AI Summit 2024: Alation’s Key Takeaways

By David Sweenor

Published on June 21, 2024

Alation booth at Snowflake Summit 2024

It’s been a crazy two weeks! We’re just getting back from another whirlwind week at the Moscone Center in San Francisco. The Databricks Data + AI Summit 2024 was another smashing success, with a core focus on AI innovation in data, plus ample reminders that you can’t be successful in AI without a solid data foundation. The Alation team was there in full force, sharing our latest updates, from compelling customer stories to product innovations designed to make our joint customers more successful. Let’s dive in!

Alation team at the booth at Databricks Summit 2024.

Data Quality Processor for Databricks

We kicked off the week with an exciting joint announcement. Alation has unveiled a deeper integration with Databricks through our new Data Quality Processor for Databricks, which delivers data health visibility to users. This collaboration, part of Alation’s Open Data Quality Initiative, builds on Databricks’ Lakehouse Monitoring innovations. The integration enables business users to access data quality metrics from Databricks within Alation, streamlining decision-making and improving data reliability. Technical users can review profiling results directly in Databricks, while Alation provides a summary of these results in a user-friendly format, allowing broader data user access to critical health information.

The new integration leverages Databricks' Lakehouse Monitoring to track statistical properties and model performance. Alation uses these insights to populate its Data Health tab, giving users quick access to data quality metrics. This system allows data leaders to categorize data health, helping downstream users identify trustworthy data. 
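
To make the idea of categorizing data health concrete, here is a minimal Python sketch. It is not the Data Quality Processor itself, and it calls no Alation or Databricks API: the `TableProfile` fields, the thresholds, and the table names are all hypothetical, chosen only to show how raw profiling metrics might roll up into a label a business user can read at a glance.

```python
from dataclasses import dataclass


@dataclass
class TableProfile:
    """Hypothetical summary of monitoring output for one table."""
    table: str
    null_fraction: float     # share of null values across monitored columns
    row_count_change: float  # relative change in row count versus a baseline


def health_status(profile, warn_nulls=0.05, alert_nulls=0.20, max_row_change=0.50):
    """Map raw metrics to a coarse health label (thresholds are assumptions)."""
    if (profile.null_fraction >= alert_nulls
            or abs(profile.row_count_change) >= max_row_change):
        return "alert"
    if profile.null_fraction >= warn_nulls:
        return "warning"
    return "good"


# Illustrative data only; these tables and numbers are made up.
profiles = [
    TableProfile("sales.orders", null_fraction=0.01, row_count_change=0.03),
    TableProfile("sales.returns", null_fraction=0.08, row_count_change=-0.10),
    TableProfile("sales.leads", null_fraction=0.02, row_count_change=-0.75),
]
statuses = {p.table: health_status(p) for p in profiles}
print(statuses)
```

In practice a data leader would set thresholds per domain rather than rely on global defaults like these.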

The integration benefits both technical and business users by enhancing data trust and enabling intelligent, data-driven decisions. It simplifies the process of implementing data health indicators without the need for custom scripts, making high-quality data more accessible across organizations.

Keynote highlights: AI innovations and more

The Databricks Data + AI Summit 2024 brought together industry leaders to showcase groundbreaking advancements in data management and artificial intelligence. The event was marked by significant announcements aimed at enhancing data governance, accelerating AI innovation, and democratizing data analytics for businesses.

Open-Sourcing Unity Catalog 

Databricks announced that it will open-source Unity Catalog, “a metadata catalog that governs how users and compute engines can access data.” This move aims to foster collaboration and innovation across the data community, providing a unified governance solution that is accessible and adaptable.

Open-sourcing Unity will create new innovation opportunities for partners in the Databricks ecosystem. “We basically standardized the data layer and the security layer so that you own your data and everything goes through these open interfaces,” said CEO and co-founder Ali Ghodsi. “And I think that’s going to be awesome for the community, for everybody in here. Because we just have way more use cases. We’re going to be able to do much more innovation, and we’ll just expand this market for everybody involved.”

Alation's data catalog and Databricks' Unity Catalog work together by leveraging each other's strengths. Unity Catalog provides the core data governance and administration layer for the Databricks Lakehouse Platform, centralizing security governance and policy management. It ingests metadata and calculates lineage within Databricks environments, which Alation then uses to provide end-to-end lineage and centralized metadata across all Databricks workspaces and beyond. 

This integration allows customers to achieve comprehensive data intelligence, capturing lineage from source to destination, offering an end-to-end view of the data’s history and impact. While Unity Catalog ensures efficient governance within Databricks, Alation extends visibility and intelligence across the entire data ecosystem as the “catalog of catalogs”, offering a single-pane-of-glass view for business and technical users.
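
For readers who like to see the mechanics, end-to-end lineage can be pictured as stitching together edge lists from several systems and then walking the combined graph. The Python sketch below is an illustration under stated assumptions, not how Alation or Unity Catalog implement lineage: the asset names and the `(upstream, downstream)` edge format are hypothetical.

```python
from collections import defaultdict

# Hypothetical lineage edges as (upstream, downstream) pairs.
# One list stands in for lineage captured inside the lakehouse,
# the other for lineage captured from systems outside it.
lakehouse_edges = [
    ("raw.orders", "silver.orders"),
    ("silver.orders", "gold.daily_sales"),
]
external_edges = [
    ("pos_system.transactions", "raw.orders"),
    ("gold.daily_sales", "bi.sales_dashboard"),
]


def build_upstream_index(*edge_lists):
    """Merge edge lists into a map of asset -> set of direct upstream assets."""
    index = defaultdict(set)
    for edges in edge_lists:
        for upstream, downstream in edges:
            index[downstream].add(upstream)
    return index


def all_upstream(asset, index):
    """Collect every asset that feeds the given one, directly or indirectly."""
    seen = set()
    stack = [asset]
    while stack:
        current = stack.pop()
        for parent in index.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


index = build_upstream_index(lakehouse_edges, external_edges)
sources = all_upstream("bi.sales_dashboard", index)
print(sorted(sources))
```

Neither edge list alone can trace the dashboard back to the point-of-sale system; only the merged graph gives the source-to-destination view.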

Ali Ghodsi shakes hands with Jensen Huang at the Databricks Data + AI Summit 2024.

NVIDIA calls data a “gold mine”

NVIDIA's CEO Jensen Huang emphasized the potential of data as a "gold mine" and highlighted the chip manufacturer’s expanded partnership with Databricks. The collaboration aims to accelerate enterprise data analytics and AI, leveraging NVIDIA’s powerful hardware and Databricks' software capabilities to deliver groundbreaking AI solutions.

Launch of Shutterstock ImageAI 

Databricks introduced Shutterstock ImageAI, a tool powered by Databricks’ AI technology. This innovation aims to revolutionize how businesses manage and utilize visual content, offering advanced image analysis capabilities that can be seamlessly integrated into existing workflows.

Introduction of AI/BI platform

Databricks also unveiled Databricks AI/BI, an intelligent analytics platform designed for real-world data. This tool features AI-powered dashboards and a conversational interface, Genie, which allows users to ask complex questions in natural language and receive precise answers. Databricks AI/BI promises to democratize data analytics, making it accessible to all levels of an organization without the need for specialized knowledge.

“A truly intelligent BI solution needs to understand the unique semantics and nuances of a business to effectively answer questions for business users,” said Ghodsi. “We believe this requires a different approach than how BI software has been designed in the past — one that places an AI system at the center of the architecture and is designed to take advantage of the AI systems’ strengths as well as complement their weaknesses to tackle the challenges of understanding and learning these nuances. The launch of AI/BI is a step towards building such a system.”

These announcements underline Databricks' commitment to advancing the data and AI landscape, focusing on open-source initiatives, strategic partnerships, and innovative tools that enhance business intelligence and data management.

Kroger scales self-service with data mesh and Alation

Over on the showroom floor, the Alation booth was bustling with the data curious. That energy was matched in a theater session featuring Kroger and 84.51°. The supermarket powerhouse shared details on its journey to democratize data via a data mesh with Alation. 

How does America’s largest supermarket chain by revenue approach data management? According to Nate Sylvester, VP of Architecture at 84.51°, it comes down to data mesh. Sylvester took the stage (along with me, David Sweenor, Director of Product Marketing at Alation) to share how Kroger's data science and analytics subsidiary, 84.51°, delivers data products through a data mesh approach with help from Alation. The session, "Delivering Data Products with Data Mesh at 84.51/Kroger," highlighted their strategies and key learnings. Read on to learn more!

Kroger's "Our North Star" slide from their customer session outlines the key data principles the leadership sought to achieve.

People, process, and culture over technology

Sylvester stressed that the foundation of Kroger’s data strategy lies in establishing the right culture and processes rather than merely implementing technology. Initially, they faced challenges when focusing solely on technology by introducing a cataloging tool without the necessary cultural and procedural changes. This approach led to a disorganized "swamp of information" that lacked trustworthiness. 

They realized the need for a robust governance framework, clear data discovery processes, and a reliable catalog to make data accessible and trustworthy across the organization. This cultural shift ensured that when someone finds information in their catalog, they can rely on it with confidence.

Self-service for data literacy (with help from metadata!)

To scale efficiently and foster innovation, leadership emphasized the importance of self-service capabilities for data teams. This initiative was supported by extensive training programs designed to enhance data literacy among both technical and business teams, ensuring that all stakeholders could effectively use and understand the data available to them. 

Sylvester’s team aimed to make data not only accessible but also comprehensible by providing context—such as the purpose and transformations applied to the data. That context is delivered through metadata. “What Alation brings to the table is metadata,” says Sylvester. “The layers of Wiki-pages and the business glossary and terms allow us to bring a richer set of information to the self-serve mindset, and really to understand what [the data] is.” A business glossary was also critical to create a common language across the organization. With the business glossary, “We’re increasing our data literacy for our technical partners and business partners as well,” Sylvester said.

A slide from Kroger's session with Alation at Databricks Data + AI Summit summarizing their data journey, from data strategy onboarding to data publication.

Hybrid governance, federated domains, and data as a product

Kroger adopted a hybrid governance model, where central governance defines the overarching standards and interfaces, while individual domains maintain ownership and responsibility for their data. This approach allowed for flexibility within domains to address their specific needs while ensuring consistency and interoperability across the organization. Domain leaders were empowered to define their data quality metrics and manage their data products, fostering a sense of ownership and accountability. 

Alation has facilitated this by providing a unified platform that supports the hybrid governance model, enabling domain leaders to manage their data effectively while adhering to centralized standards.
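
One way to picture the split between central standards and domain ownership is a simple check on a data product's catalog entry. The sketch below is purely illustrative and does not reflect Kroger's or Alation's actual schema: the required fields, the `quality_checks` key, and the example products are all assumptions.

```python
# Central governance: the fields every data product must document
# (a hypothetical standard, for illustration only).
REQUIRED_FIELDS = {"name", "domain", "owner", "description"}


def validate_product(product):
    """Return a list of problems; an empty list means the entry meets the standard."""
    missing = sorted(REQUIRED_FIELDS - set(product))
    problems = [f"missing field: {field}" for field in missing]
    # Domain ownership: each domain defines its own quality checks,
    # but the central standard requires that at least one exists.
    if not product.get("quality_checks"):
        problems.append("no domain-defined quality checks")
    return problems


# Made-up example entries.
complete = {
    "name": "customer_360",
    "domain": "marketing",
    "owner": "marketing-data@example.com",
    "description": "One row per customer with household attributes.",
    "quality_checks": ["customer_id is unique"],
}
incomplete = {"name": "store_traffic", "domain": "stores"}

print(validate_product(complete))
print(validate_product(incomplete))
```

The central team owns the list of required fields, while the content of `quality_checks` stays with the domain, which mirrors the division of responsibility described above.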

A slide from Kroger's presentation with Alation, "Delivering Data Products with Data Mesh at 84.51/Kroger" detailing how they federate data compliantly.

A core concept in 84.51°/Kroger’s strategy is treating data as a product. This involves curating data interactions with consumers (whether internal teams or external partners) and focusing on the lifecycle, purpose, and quality of the data. By adopting a product mindset, the organization ensures that data products are well-maintained, fit for purpose, and meet the needs of their consumers, thereby fostering trust and confidence in the data.

Decentralized ownership and the role of data fabric

Kroger adopted the data mesh approach to better organize their data in a highly federated organization. This involves decentralizing data ownership and governance. They applied domain-driven design principles, ensuring each domain has clear ownership and accountability for its data. This decentralization aims to enhance collaboration and interoperability between different business units.

“Data mesh is really about how we organize and create decentralized teams within the business units. Data fabric is the connective tissue that allows us to interoperate… Data mesh and data fabric work together, in order to create an ecosystem where we can make data accessible as a product.”

“Six or seven people own Alation here internally to govern what a good entry looks like,” Sylvester adds. “They make it available to the different domains to keep it from being a swamp of information.”

How does Databricks’ Unity Catalog help? Sylvester shared that Unity Catalog is used to govern data access and security, making data available in different technologies. Meanwhile, Alation surfaces metadata for self-service data access through features like the business glossary, wiki pages, and additional context about the data beyond just its structure and location. This enables more efficient decision-making across the organization.

Leveraging Alation for success

Alation was instrumental in Kroger/84.51°'s data mesh journey, providing the tools needed for effective data cataloging, discovery, and governance. By leveraging Alation, Kroger was able to create a scalable, self-service data environment that promoted data literacy and facilitated the seamless integration of cultural and procedural changes. The platform's business glossary and training capabilities helped bridge the gap between technical and business users, fostering a data-driven culture throughout the organization.

Slide from Kroger's presentation with Alation addressing the question, "How has Databricks and Alation helped accelerate [Kroger's] work [with data]?"

In conclusion, Kroger successfully transformed their data strategy by focusing on cultural integration, self-service, and hybrid governance, with cross-cultural alignment playing a pivotal role in this transformation. Their journey underscores the importance of aligning technology with the right processes and cultural shifts to create a trustworthy and efficient data ecosystem.

TheCUBE interview: to win at AI, start small and build

Diby Malakar, VP of Product Management at Alation, sat down with Savannah Peterson of SiliconANGLE’s theCUBE to share insights on the evolving data landscape. Malakar highlighted the critical need for trusted data, emphasizing that organizations struggle with data silos and unverified data sources. Alation addresses these challenges by enabling users to easily find, understand, and trust their data, making it AI-ready. This capability has attracted significant interest from major enterprises, with 40% of the Fortune 100 using the Alation Data Intelligence Platform.

Malakar also discussed current trends, noting the rising importance of data quality in the age of AI and generative AI. He stressed that high-quality data is essential for effective AI, as poor data quality leads to subpar outcomes even with advanced models. Organizations are at various stages of AI adoption, from experimentation to implementation, driven by budget constraints and the need for innovation. 

When asked what excites him about AI, Malakar pointed to a greenfield opportunity for innovation that we're only just beginning to grasp. "[What excites me is] the ability to innovate and to solve problems and use cases that we never even dreamed of before," he said. "It allows [data users] to focus not on mundane stuff like finding data or like doing the data analytics but how do I use AI to drive my productivity to drive innovation, to drive competitive advantage. I think the possibilities are tremendous."

Alation's Diby Malakar being interviewed for theCUBE.

So how can enterprises getting started with AI see success? Malakar advised companies to focus on measurable business outcomes and start with small, successful projects that can scale. He also highlighted Alation's success with customers like Cisco, where a mix of centralized policies and decentralized control has empowered business units to drive their data governance initiatives effectively. This balanced approach has proven crucial in fostering a data culture that supports both compliance and innovation.

Conclusion

As if all this weren’t enough, the Alation team hosted several customer dinners, a range of fantastic meetings, and a cocktail party we coined “Data on the Rocks” that was not to be missed! This party followed Day 3 of Summit at the Alchemist Bar & Lounge. We got to mingle, network, and connect with industry leaders and peers in a relaxed atmosphere at this exclusive event, supported by our partners Monte Carlo, Fivetran, dbt Labs, and Sigma. There were finger foods and cocktails, along with valuable insights and new connections.

The Data on the Rocks party concluding Databricks Summit 2024.

Databricks Data + AI Summit was incredible, not just for the things we learned, but for the friendships we made and the partnerships we deepened. We can’t wait to see you again next year!

Learn more about our partnership with Databricks.
