It’s hard to create standards across all compute engines from a governance and catalog standpoint. This is what theCUBE Research Chief Analyst Dave Vellante calls a “convenient truth.”
Snowflake Inc.’s strategy to open source Polaris Catalog was one of the subjects of conversation on the latest episode of theCUBE Podcast, recorded during Snowflake’s Data Cloud Summit.
George Gilbert, Sanjeev Mohan and Dave Vellante discuss the great Iceberg debate during Data Cloud Summit.
“They’re open-sourcing Polaris, which is really just the technical metadata,” Vellante said. “The somewhat convenient truth for Snowflake is that the standards for governing open table formats like Apache Iceberg are not only lacking, but they’re extremely challenging. You’ve got to herd the cats of all the various compute engine players and agree and then align.”
Snowflake’s strategy relies on its Horizon solution for advanced governance features. Iceberg itself does not specify permissions or security, according to Sanjeev Mohan, principal at SanjMo.
“There are no permissions; there’s no security,” he said. “Security has to be applied above, in a different catalog. Snowflake actually provides this Polaris Catalog, which is a technical metadata catalogue — that’s it.”
However, it’s a different situation if a company wants to do role-based access control, column-level security or row-level security. Companies need Horizon to do so, Mohan added.
“If you don’t have Horizon, then every single compute engine, if it’s Spark, or Dremio, or Trino, Starburst, they have to figure out how to apply data access governance onto Iceberg,” he said. “That’s why Horizon is so important.”
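To make Mohan’s point concrete, here is a minimal Python sketch of what a governance layer sitting above a table format has to do: drop rows and mask columns per role before a compute engine returns any data. Everything in it is a hypothetical illustration, including the `RolePolicy` shape, the role names, the column names and the `apply_policy` helper. It does not represent the actual API of Horizon, Polaris or Iceberg.

```python
from dataclasses import dataclass, field
from typing import Any, Callable

Row = dict[str, Any]


@dataclass
class RolePolicy:
    # Columns this role may see in clear text; all other columns are masked.
    visible_columns: set[str]
    # Row-level predicate; rows for which it returns False are dropped.
    row_filter: Callable[[Row], bool] = field(default=lambda row: True)


# Hypothetical roles and policies, for illustration only.
POLICIES: dict[str, RolePolicy] = {
    "analyst_emea": RolePolicy(
        visible_columns={"order_id", "region", "amount"},
        row_filter=lambda row: row["region"] == "EMEA",
    ),
    "auditor": RolePolicy(
        visible_columns={"order_id", "region", "amount", "customer_email"},
    ),
}

MASK = "***MASKED***"


def apply_policy(role: str, rows: list[Row]) -> list[Row]:
    """Return only the rows and column values the given role may read."""
    policy = POLICIES.get(role)
    if policy is None:
        return []  # Unknown roles are denied by default.
    allowed = []
    for row in rows:
        if not policy.row_filter(row):
            continue  # Row-level security: skip rows outside the role's scope.
        # Column-level security: mask any column the role may not see.
        allowed.append(
            {col: (val if col in policy.visible_columns else MASK)
             for col, val in row.items()}
        )
    return allowed
```

Without a shared layer of this kind, each engine reading the same Iceberg tables would need its own equivalent of `apply_policy`, which is the duplication Mohan describes.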
Data Cloud Summit saw the company discussing its rapid pace
This week’s episode of theCUBE Podcast also featured an interview that took place during Snowflake’s Data Cloud Summit with Sridhar Ramaswamy (pictured), the newly minted chief executive officer of Snowflake. He outlined the company’s vision for data and AI integration.
Snowflake CEO Sridhar Ramaswamy talks to theCUBE’s Dave Vellante during Data Cloud Summit.
“The pace of play is absolutely breathtaking,” Ramaswamy said. “[We must] seize opportunity, because you need to do [things] quickly. Otherwise, there’ll be competition. Otherwise, the opportunity will vanish. Not a week passes by without another model breakthrough, without somebody else getting another amazing product done.”
In November 2023, Snowflake announced Cortex AI. The pace with which that project has unfolded is emblematic of how the industry is moving right now, according to Ramaswamy.
“In November, we said, ‘We are going to be doing this.’ We said, private preview,” he said. “It went into GA three weeks ago. That is unheard of in enterprise software. But that is what you need to succeed today.”
The big questions swirling around the company are whether it is a data cloud, a data AI cloud, an AI company or an application platform. Perhaps the company is all of the above.
“The best way that I can answer this is to start with two words … a data cloud. But let’s unpack that,” Ramaswamy said. “What does it really mean? We think of ourselves as a cloud computing platform but centered on data. So, we absolutely want to be the best platform that there is for storing data, for running data processing. That’s the core, that compute engine that is so magical, that’s the core of Snowflake, and we’re expanding it.”
The battle between Databricks and Snowflake rages on
Amid Data Cloud Summit, Databricks Inc. announced it had agreed to acquire Tabular Technologies Inc., developer of a universal storage platform based on the Apache Iceberg standard. It was the latest move in the battle between Databricks and Snowflake, but was it a good move for customers?
“I, in fact, had lunch with some very large companies, financial services, and they said, ‘We put our eggs in Iceberg’s basket because we were getting an open standard file format, and now Hudi is actually not really as prevalent,’” Mohan said. “There are only two, Delta and Iceberg, and they’re both owned by the same company.”
Often, theCUBE has made reference to the sixth data platform, which isn’t just about separating compute from storage, but about separating compute from data. When it comes to the latest moves from Databricks, the value the company is trying to add now might be fairly straightforward, or will be soon, according to George Gilbert, senior analyst at theCUBE Research.
“To be able to just read and write, whether it’s in Delta or Iceberg. The basic abstraction is the same, where all the work that Ryan [Blue] and the Tabular group were doing, was adding on the permissions, and to go beyond authentication and role-based access control,” Gilbert said. “They were trying to add a full policy engine, which is tagged. This is really advanced, where you control access to data based on its attributes. That’s the full-blown stuff.”
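The idea Gilbert describes, controlling access based on the attributes of the data, can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions: the `Column` and `User` types, the tag names and the rule that a user needs a clearance for every tag on a column are all hypothetical, and nothing here reflects how Tabular or Databricks actually implement their policy engines.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Column:
    name: str
    tags: frozenset[str]  # Attributes attached to the data, e.g. "pii".


@dataclass(frozen=True)
class User:
    name: str
    clearances: frozenset[str]  # Tags this user is cleared to read.


def can_read(user: User, column: Column) -> bool:
    """Allow access only if the user holds a clearance for every tag on the column."""
    return column.tags <= user.clearances


def readable_columns(user: User, columns: list[Column]) -> list[str]:
    """Names of the columns the user may read, in their original order."""
    return [c.name for c in columns if can_read(user, c)]


# Hypothetical table schema and user, for illustration only.
COLUMNS = [
    Column("order_id", frozenset()),
    Column("amount", frozenset({"financial"})),
    Column("customer_email", frozenset({"pii"})),
]
finance_analyst = User("finance_analyst", frozenset({"financial"}))
```

In this sketch the decision depends on the tags carried by the data rather than on a per-table grant to a role, so tagging a new column as `pii` restricts it everywhere without any policy being rewritten.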
Photo: SiliconANGLE
{Categories} _Category: Takes{/Categories}
{URL}https://siliconangle.com/2024/06/10/data-cloud-summit-thecubepod/{/URL}
{Author}Ryan Stevens{/Author}
{Image}https://d15shllkswkct0.cloudfront.net/wp-content/blogs.dir/1/files/2024/06/Sridhar-Ramaswamy-CEO-of-Snowflake-Data-Cloud-Summit-2024-1.jpg{/Image}
{Keywords}AI,Cube Event Coverage,NEWS,#theCube,#theCUBEPod,#theCUBEpodcast,Apache Iceberg,Cortex AI,Data Cloud Summit,Dave Vellante,Delta,Dremio,George Gilbert,Horizon,hudi,Polaris,Polaris Catalogue,Sanjeev Mohan,SanjMo,sixth data platform,spark,Sridhar Ramaswamy,Starburst,Tabular Technologies Inc.,theCUBE Research,Trino{/Keywords}
{Source}POV{/Source}
{Thumb}https://d15shllkswkct0.cloudfront.net/wp-content/blogs.dir/1/files/2024/06/Sridhar-Ramaswamy-CEO-of-Snowflake-Data-Cloud-Summit-2024-1.jpg{/Thumb}