Written by Steve Crosson-Smith, Head of Data Strategy, Governance and Architecture at SDG Group UK.
From Snowpark to Snowflake Intelligence: An Overview of Snowflake's AI Capabilities
Many companies still regard Snowflake as a data warehouse platform, unaware of the breadth and depth of its AI capabilities. This article gives the reader a broad overview of the AI functionality in the platform, including the recently introduced features that allow Agentic AI applications to be created without the need to write code.
The Production Deployment Challenge
Many companies are finding that moving an AI project from proof of concept (PoC) to full production deployment is more challenging than initially anticipated. This difficulty stems partly from the demands of scaling and partly from the need for robust governance.
Scaling issues include architecting flexible, specialised compute resources and handling the significant data movement typically required, such as from a secure data warehouse to an external ML environment. Furthermore, applying end-to-end security policies is extremely difficult when using multiple technologies and environments.
Snowflake's Foundation: Processing Data Where It Lives
Snowflake recognised these problems some time ago and concluded that data processing and analytics should be executed as close to the data as possible and within a common governance and security framework, specifically without requiring data to leave the secure environment of the data platform. These principles were the driving force behind the development of Snowpark and Snowpark Container Services. These services, which allow customers to extend the capabilities of the platform with their own code, provide close proximity between that code and the data stored in the platform, along with an integrated security framework. On top of these foundational services, Snowflake developed its own AI services (Cortex) and models (the Arctic family of models), while also providing integrated support for a range of third-party Large Language Models (LLMs) and other services.
Cortex: The AI Processing Layer
Within Cortex, Snowflake offers two primary processing engines:
- Cortex Analyst - designed for the analysis of structured data
- Cortex Search - focused on the analysis of textual data
This textual data can be extracted from unstructured sources, such as audio, using dedicated functionality provided as Snowflake SQL extensions like AI_TRANSCRIBE.
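For example, a call recording held on a stage could be transcribed with a single SQL statement. The following is a minimal sketch: the stage and file names are hypothetical, and the exact signature of AI_TRANSCRIBE should be checked against the current Snowflake documentation.

```sql
-- Minimal sketch: transcribe an audio file held on a (hypothetical) internal stage
SELECT AI_TRANSCRIBE(TO_FILE('@audio_stage', 'call_recording.mp3')) AS transcript;
```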
AI SQL Functions
Snowflake provides a library of AI SQL functions delivering a wide array of capabilities, including classification, sentiment analysis, and translation. As mentioned above, it is also possible to create custom functionality in code and make it available via Snowflake SQL User Defined Functions (UDFs) or Stored Procedures.
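As an illustration, these AI SQL functions can be applied directly within a query. The table and column names below are hypothetical; the function names follow Snowflake's Cortex AI SQL documentation, although availability can vary by region and edition.

```sql
-- Illustrative use of AI SQL functions against a hypothetical customer_reviews table
SELECT
    review_text,
    SNOWFLAKE.CORTEX.SENTIMENT(review_text)                                        AS sentiment_score,
    SNOWFLAKE.CORTEX.TRANSLATE(review_text, 'de', 'en')                            AS review_in_english,
    SNOWFLAKE.CORTEX.CLASSIFY_TEXT(review_text, ['complaint', 'praise', 'query'])  AS category
FROM customer_reviews;
```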
To complete the picture, Snowflake also offers embedding functionality, using a simple AI_EMBED SQL function to generate vectors from audio, images, and documents. These vectors are numerical representations that allow AI systems to work effectively with unstructured data types.
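A sketch of generating embeddings with AI_EMBED is shown below; the table, column, and model name are illustrative, and the list of supported embedding models should be confirmed in the Snowflake documentation.

```sql
-- Sketch: generate an embedding vector for each document in a hypothetical table
SELECT
    doc_text,
    AI_EMBED('snowflake-arctic-embed-l-v2.0', doc_text) AS doc_vector
FROM support_documents;
```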
The Role of Semantic Layers
Cortex Analyst relies upon the presence of a semantic layer to provide the context that AI requires to make business sense of the data. This includes business terminology, synonyms, calculations, relationships between data objects, verified queries, and other metadata that ensure unambiguous interpretation by Cortex Analyst.
There are two main routes to providing this semantic layer within Snowflake; for the purposes of this article we will keep the discussion simple. The semantic layer is implemented as a Semantic View, which can be created from a YAML-based definition (Snowflake provides a user interface for this) or directly using SQL commands. The most important requirement is knowledge of the business context of your data.
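The SQL route might look roughly like the sketch below, which defines a small Semantic View over hypothetical orders and customers tables. The clause structure follows Snowflake's CREATE SEMANTIC VIEW syntax, but the details are illustrative and should be validated against the documentation.

```sql
-- Illustrative Semantic View over hypothetical sales tables
CREATE OR REPLACE SEMANTIC VIEW sales_semantic_view
  TABLES (
    orders    AS analytics.sales.orders    PRIMARY KEY (order_id),
    customers AS analytics.sales.customers PRIMARY KEY (customer_id)
  )
  RELATIONSHIPS (
    orders_to_customers AS orders (customer_id) REFERENCES customers
  )
  FACTS (
    orders.order_value AS order_amount
  )
  DIMENSIONS (
    customers.customer_name AS name WITH SYNONYMS ('client name', 'account name')
  )
  METRICS (
    orders.total_revenue AS SUM(order_value) COMMENT = 'Total order value across all orders'
  );
```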
Once built, the Semantic Views can be leveraged by Cortex Analyst or BI tools such as Sigma or Power BI.
From Reactive Tools to Autonomous Agents
This collection of Snowflake tools is very powerful for building complex AI solutions; however, these tools are inherently reactive. They respond when a user or program requests a specific task, meaning that if a series of tasks must be executed, they must be orchestrated (programmatically told what to do and in what order).
The next step in AI evolution introduced the concept of Agents. Rather than being given a specific task, an Agent can be given a desired outcome. The Agent can then autonomously determine the steps required to achieve that outcome, based upon the tools and datasets at its disposal.
An LLM is employed to analyse the requested outcome, determine the necessary steps, and then orchestrate the various tools and tasks needed to execute those steps. Furthermore, when the desired outcome is ambiguous, Agents can clarify the requirements through interactive dialogue with the user.
Introducing Snowflake Intelligence
The capability to configure Cortex Agents within Snowflake without writing code is now in General Availability. It is paired with an additional capability called Snowflake Intelligence: a service providing interactive, chatbot-style interfaces that sit on top of Cortex Agents, again without the need to write code. Simply put, Cortex Agents provide the ability to create Agentic AI solutions, and Snowflake Intelligence offers a no-code interface for humans to interact with them.
The Skills Required: A Layer-by-Layer View
A useful way to understand the significance of this for a business is to examine the skills required to deploy an Agentic AI solution, including the Snowflake Intelligence chatbot user interface, within Snowflake.
- Data Layer: Snowflake began as a data platform, providing enterprise-strength data management capabilities. It is firmly founded on SQL, which is a relatively easy language to learn, and allows those with prior SQL experience to easily transition to Snowflake SQL. All Snowflake data objects, including Semantic Views, can be created with SQL commands.
- Tools: This refers to any data processing and analysis capability accessible via Snowflake SQL, encompassing all built-in functions, including the AI extensions. Using Snowpark and Snowpark Container Services, customers can code their own tools, from simple mathematical functions to full ML models, in languages such as SQL, Python, Java, JavaScript or Scala, and make them available as custom UDFs or Stored Procedures, accessible via SQL (a short sketch follows this list). External services can also be invoked.
- Cortex Agents: The commissioning of Agents relies on configuration and prompt engineering skills, rather than further coding. Agents draw upon the tools defined above to execute their required tasks.
- Snowflake Intelligence: This layer requires configuration, not coding, and delivers the necessary chatbot-style user interface for human interaction with the Cortex Agents.
- Alternative Interfaces: An alternative to using Snowflake Intelligence is accessing Cortex Agents and Tools directly through the Snowflake REST API. This may be appropriate for applications that are machine-initiated rather than human-initiated, or for embedding Cortex Agents into a custom application. While this may require coding, depending on the chosen environment, it certainly requires knowledge of how to call the Snowflake REST API.
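As a simple illustration of the Tools layer referenced above, the sketch below registers a small Python UDF through SQL. The function name and logic are hypothetical, but once created, the UDF becomes callable from SQL and is therefore usable by an Agent in the same way as any built-in function.

```sql
-- Sketch: a custom Python tool registered as a Snowflake UDF and called from SQL
CREATE OR REPLACE FUNCTION clean_postcode(raw STRING)
  RETURNS STRING
  LANGUAGE PYTHON
  RUNTIME_VERSION = '3.11'
  HANDLER = 'clean'
AS
$$
def clean(raw):
    # Normalise a UK-style postcode: remove spaces and upper-case it
    return raw.replace(" ", "").upper() if raw else None
$$;

-- Callable from SQL like any built-in function
SELECT clean_postcode(' sw1a 1aa ');  -- returns 'SW1A1AA'
```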
Conclusion
Snowflake has established a comprehensive platform built on secure, close-proximity data processing and a powerful collection of AI tools accessible via SQL. The introduction of Agentic AI marks a significant shift, allowing users to define desired outcomes rather than specific tasks.
Snowflake Intelligence, coupled with Cortex Agents, democratises AI deployment by minimising the required technical skill set, shifting the focus from complex coding to streamlined configuration and prompt engineering.
This end-to-end integration simplifies the journey from PoC to production, addressing critical scaling and governance challenges inherent in traditional AI deployment.