Snowflake Cortex AI: Running LLMs Directly on Your Data Warehouse

Genufy Team

Jun 3, 2025 · 8 min read

For years, running ML inference on warehouse data meant one thing: extract the data, run a model somewhere else, and load the results back. For a set of common LLM tasks, Snowflake Cortex AI eliminates that round trip entirely.

What Cortex AI Actually Is

Cortex AI is a suite of LLM-powered functions, including COMPLETE, SUMMARIZE, SENTIMENT, TRANSLATE, and EXTRACT_ANSWER, that live in the SNOWFLAKE.CORTEX schema and run natively inside Snowflake. You call them from SQL like any other function. There is no model to deploy, no Python environment to manage, and no data to move out of the warehouse.

"SELECT SNOWFLAKE.CORTEX.SENTIMENT(review_text) FROM customer_reviews": that's the entire inference pipeline.
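To make the function list concrete, here is a minimal sketch that calls several of these functions in one query. It assumes a hypothetical customer_reviews table with review_id and review_text columns, German-language source text for the translation, and that the model named in COMPLETE is available in your region; swap in whatever your account actually offers.

```sql
-- Sketch only: table, column names, and the model name are assumptions.
SELECT
    review_id,
    -- Sentiment score, from -1 (negative) to 1 (positive)
    SNOWFLAKE.CORTEX.SENTIMENT(review_text)             AS sentiment_score,
    -- Short summary of the review text
    SNOWFLAKE.CORTEX.SUMMARIZE(review_text)             AS summary,
    -- Translate from German ('de') to English ('en')
    SNOWFLAKE.CORTEX.TRANSLATE(review_text, 'de', 'en') AS review_en,
    -- Free-form prompt against a named model (model availability varies)
    SNOWFLAKE.CORTEX.COMPLETE(
        'llama3.1-8b',
        CONCAT('Name the main product issue in this review in five words or fewer: ', review_text)
    )                                                   AS main_issue
FROM customer_reviews
LIMIT 10;  -- keep the first run small, since every row is billed by token
```

The LIMIT is deliberate: it is worth checking output quality on a handful of rows before pointing any of these functions at a full table.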

Real-World Use Cases We've Deployed

We've used Cortex AI for three production use cases: real-time sentiment scoring on support ticket streams, automatic summarisation of long-form sales call transcripts stored in Snowflake, and structured data extraction from unstructured contract text loaded via Snowpipe.
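The three use cases map onto the built-in functions roughly as follows. This is an illustrative sketch, not our production code: the table names (support_tickets, sales_call_transcripts, contracts_raw), their columns, and the example question are all hypothetical.

```sql
-- 1. Sentiment scoring on recently arrived support tickets
SELECT
    ticket_id,
    SNOWFLAKE.CORTEX.SENTIMENT(ticket_body) AS sentiment_score
FROM support_tickets
WHERE created_at >= DATEADD('hour', -1, CURRENT_TIMESTAMP());

-- 2. Summaries of long-form sales call transcripts
SELECT
    call_id,
    SNOWFLAKE.CORTEX.SUMMARIZE(transcript_text) AS call_summary
FROM sales_call_transcripts;

-- 3. Pulling a specific field out of unstructured contract text.
--    The result carries the extracted answer together with a confidence score.
SELECT
    contract_id,
    SNOWFLAKE.CORTEX.EXTRACT_ANSWER(
        contract_text,
        'What is the termination notice period?'
    ) AS notice_period
FROM contracts_raw;
```

Filtering to recent rows, as in the first query, matters for anything scheduled: re-scoring rows that were already processed pays for the same tokens twice.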

Cost and Latency Profile

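Cortex functions are billed per token processed, not per compute hour, although the warehouse that runs the query still accrues its usual compute charges while it is active. For batch workloads this can be very efficient. For streaming or near-real-time use cases, model the token cost carefully: a table with many rows and verbose text fields can burn credits quickly, because every row's text is tokenised on every call. We recommend running Cortex functions in a separate dedicated warehouse with auto-suspend set to 1 minute, so that the spend is easy to isolate and the warehouse does not sit idle between runs.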
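One way to act on this is sketched below: a small dedicated warehouse with a 60-second auto-suspend, followed by a token estimate taken before the real job runs. The warehouse name cortex_wh, the XSMALL size, and the sales_call_transcripts table are assumptions. The estimate uses SNOWFLAKE.CORTEX.COUNT_TOKENS with a function name as its first argument; check that this form is available in your account and region.

```sql
-- Dedicated warehouse for Cortex queries (name and size are assumptions)
CREATE WAREHOUSE IF NOT EXISTS cortex_wh
    WAREHOUSE_SIZE = 'XSMALL'
    AUTO_SUSPEND = 60            -- seconds, i.e. 1 minute
    AUTO_RESUME = TRUE
    INITIALLY_SUSPENDED = TRUE;

USE WAREHOUSE cortex_wh;

-- Estimate the token volume before running SUMMARIZE over the whole table
SELECT
    COUNT(*)                                                             AS row_count,
    SUM(SNOWFLAKE.CORTEX.COUNT_TOKENS('summarize', transcript_text))     AS total_tokens,
    AVG(SNOWFLAKE.CORTEX.COUNT_TOKENS('summarize', transcript_text))     AS avg_tokens_per_row
FROM sales_call_transcripts;
```

Multiplying total_tokens by the current per-token credit rate for the function gives a rough cost for the full run. The rate is not hard-coded here because it differs by function and model and changes over time.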
