Snowflake Cortex Code Automates PII Tagging

Picture this: a frantic data engineer fires off a single prompt to Snowflake's AI. Minutes later, sensitive data across tables is classified, tagged, and masked — perfectly. This isn't magic; it's the quiet revolution in data ops.

Key Takeaways

  • One prompt triggers PII classification, tagging, and masking — collapsing hours of SQL into minutes.
  • CoCo runs under your role, ensuring no privilege escalation and true governance.
  • Architectural shift: from manual ops to agentic, conversational data governance.

Presto slammed her laptop shut on Friday, compliance audit looming like a storm cloud. Two hundred tables in Snowflake, riddled with PII — names, emails, SSNs begging to leak. She didn’t want SQL marathons. No sir. Just typed into Cortex Code: “Classify the CUSTOMERS table, tag anything sensitive, and create a masking policy so only DATA_ENGINEER sees real PII. Everyone else gets masked values.”

Boom. Done.

And that’s how Snowflake’s Cortex Code (CoCo) — their shiny new AI agent in Snowsight — just ate the data engineer’s weekend homework. One prompt. It splintered into discovery via SYSTEM$CLASSIFY, auto-tagging columns, then whipped up a masking policy tied to roles. All sequenced right, no hand-holding required. We’re talking architectural voodoo here: CoCo doesn’t just spit SQL; it chains Snowflake-native features, feeding outputs forward like a Rube Goldberg machine on steroids.

What the Hell Just Happened Under the Hood?

Look, you’ve seen AI code-gen tools. GitHub Copilot suggests a function. Cursor refactors your repo. But CoCo? It’s Snowflake’s secret sauce for governed data ops. It groks your warehouse’s ontology (tags, policies, roles) and orchestrates them without escalating privileges. Runs as you, sees what you see. No sneaky admin bypasses.

Presto watched it unfold. CoCo proposed: “I’ll classify CUSTOMERS, create a PII tag, apply it to flagged columns, then build a tag-based masking policy scoped to DATA_ENGINEER.”

Classification hit first: FIRST_NAME as ‘name’, EMAIL as ‘email address’, PHONE_NUMBER as ‘phone’, SSN as ‘national identifier’. Quasi-identifiers like zip? Flagged too. Then tags bloomed: batch ALTER TABLEs, no UI clicks. Masking policy? CASE WHEN CURRENT_ROLE() = 'DATA_ENGINEER' THEN val ELSE '****' END. Attached to the tag, so any column tagged later picks up the same mask. Future-proof.
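To make those steps concrete, here is a rough Python sketch that only builds the SQL text for the tag and the tag-based masking policy. It is not CoCo's actual output: the tag name PII, the policy name PII_MASK, and the helper names are illustrative assumptions, and nothing here connects to Snowflake.

```python
from __future__ import annotations

# Sketch only: builds SQL strings for the tag and masking steps.
# PII, PII_MASK, and both helper names are assumptions, not CoCo output.


def tag_statements(table: str, columns: dict[str, str], tag: str = "PII") -> list[str]:
    """Return a CREATE TAG plus one ALTER TABLE per flagged column."""
    stmts = [f"CREATE TAG IF NOT EXISTS {tag};"]
    for column, category in columns.items():
        stmts.append(
            f"ALTER TABLE {table} MODIFY COLUMN {column} SET TAG {tag} = '{category}';"
        )
    return stmts


def masking_statements(
    tag: str = "PII", policy: str = "PII_MASK", role: str = "DATA_ENGINEER"
) -> list[str]:
    """Return the policy definition and the statement attaching it to the tag."""
    create = (
        f"CREATE MASKING POLICY IF NOT EXISTS {policy} AS (val STRING) RETURNS STRING ->\n"
        f"  CASE WHEN CURRENT_ROLE() = '{role}' THEN val ELSE '****' END;"
    )
    attach = f"ALTER TAG {tag} SET MASKING POLICY {policy};"
    return [create, attach]


if __name__ == "__main__":
    flagged = {"EMAIL": "email address", "SSN": "national identifier"}
    for stmt in tag_statements("CUSTOMERS", flagged) + masking_statements():
        print(stmt)
```

Because the policy hangs off the tag rather than off individual columns, tagging a new column later is the only step needed to mask it.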

Analyst queries? *@*.com. Engineers? Crystal clear. Query-time magic, data untouched.

Before this? Hellscape. Manual SYSTEM$CLASSIFY, JSON parsing, per-column tags — five columns meant five ALTERs. Masking? Write, test, tag-link. Hours if you’re sharp; days otherwise. CoCo collapses it to chat.
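For a feel of the JSON-parsing chore in the manual route, here is a minimal Python sketch. The sample payload is hand-written and its shape (a classification_result object keyed by column, each with a recommendation.semantic_category) is an assumption about what SYSTEM$CLASSIFY returns, so check it against your own output before relying on it.

```python
from __future__ import annotations

import json

# Hand-written sample; the JSON shape is an assumption, not captured output.
SAMPLE_RESULT = """
{
  "classification_result": {
    "EMAIL": {"recommendation": {"semantic_category": "EMAIL",
                                 "privacy_category": "IDENTIFIER"}},
    "ZIP": {"recommendation": {"semantic_category": "US_POSTAL_CODE",
                               "privacy_category": "QUASI_IDENTIFIER"}},
    "ORDER_TOTAL": {}
  }
}
"""


def flagged_columns(raw: str) -> dict[str, str]:
    """Map each column that received a semantic category to that category."""
    result = json.loads(raw).get("classification_result", {})
    flagged = {}
    for column, info in result.items():
        category = info.get("recommendation", {}).get("semantic_category")
        if category:  # columns without a recommendation are skipped
            flagged[column] = category
    return flagged


if __name__ == "__main__":
    print(flagged_columns(SAMPLE_RESULT))
    # {'EMAIL': 'EMAIL', 'ZIP': 'US_POSTAL_CODE'}
```

Every column in that dictionary then needs its own ALTER TABLE, which is where the hours went.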

Why Does One Prompt Chain Everything?

Here’s the ‘how’ that flips the script: Snowflake’s Cortex Analyst (natural-language querying over structured data) was cute for charts. CoCo levels up to agentic workflows. It parses intent, decomposes it into primitives (classify → tag → mask), resolves dependencies (tagging needs the classification output), generates idiomatic Snowflake SQL, and executes it in-session. Error? It iterates. Like a junior dev who read the docs.
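How CoCo plans internally isn't documented here, but the dependency-resolution idea can be sketched with a plain topological sort. The step names and the dependency map below are assumptions chosen to mirror classify → tag → mask; this illustrates the technique and says nothing definite about CoCo's planner.

```python
from graphlib import TopologicalSorter

# Each step maps to the set of steps whose output it needs first.
# Step names are hypothetical labels for the primitives in the text.
DEPENDENCIES = {
    "classify": set(),
    "create_tag": set(),
    "apply_tags": {"classify", "create_tag"},
    "create_masking_policy": set(),
    "attach_policy_to_tag": {"create_tag", "create_masking_policy"},
}


def plan(dependencies: dict) -> list:
    """Return the steps in an order that respects every dependency."""
    return list(TopologicalSorter(dependencies).static_order())


if __name__ == "__main__":
    for number, step in enumerate(plan(DEPENDENCIES), start=1):
        print(number, step)
```

Any valid ordering puts classify and create_tag ahead of apply_tags, which is the "tagging needs the classification output" constraint written down as data.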

But why now? Data teams drown in governance theater — GDPR fines hit $2B last year alone. Enterprises hoard petabytes in Snowflake (they claim 40% market share), yet PII hunts remain artisanal. CoCo industrializes it. My unique take? This echoes the 2000s NoSQL pivot — when MongoDB let devs skip schema design, unleashing app velocity. Snowflake’s doing that for governance. Predict: by 2026, 70% of warehouse ops will be prompt-driven, Snowflake vaults ahead as the AI-governed data moat.

Skeptical? I was too. Tested it myself on a dummy dataset — phone numbers masked flawlessly across roles. No leaks. But here’s the PR spin callout: Snowflake touts ‘zero-code governance’ like it’s novel. Nah. They’ve layered AI on existing primitives (tags since 2021). Clever bundling, sure — but don’t sleep on competitors like Databricks’ Unity Catalog rushing AI agents.

Is Snowflake Cortex Code the End of Manual Data Masking?

Short answer: for rote tasks, yes. Imagine scaling to 200 tables. Pre-CoCo? Week-long sprint. Now? Prompt swarm: “Do this for every table in ANALYTICS schema.” It plans, batches, reports gaps (like unhandled data types).
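A schema-wide run might look roughly like the sketch below: walk every flagged column, emit a tag statement where a string-typed mask can apply, and report the rest as gaps. The ANALYTICS schema comes from the prompt above; the PII tag, the 'sensitive' tag value, the set of maskable types, and the input shape are all assumptions for illustration.

```python
from __future__ import annotations

# Assumption: the masking policy takes and returns STRING, so only
# string-typed columns are covered; everything else is reported as a gap.
MASKABLE_TYPES = {"TEXT", "VARCHAR", "STRING"}


def plan_schema(
    columns_by_table: dict[str, dict[str, str]], schema: str = "ANALYTICS"
) -> tuple[list[str], list[tuple[str, str, str]]]:
    """Split flagged columns into tag statements and unhandled gaps.

    columns_by_table maps table name -> {column name: data type}.
    """
    statements, gaps = [], []
    for table, columns in columns_by_table.items():
        for column, data_type in columns.items():
            if data_type.upper() in MASKABLE_TYPES:
                statements.append(
                    f"ALTER TABLE {schema}.{table} MODIFY COLUMN {column} "
                    f"SET TAG PII = 'sensitive';"
                )
            else:
                gaps.append((table, column, data_type))
    return statements, gaps


if __name__ == "__main__":
    flagged = {
        "CUSTOMERS": {"EMAIL": "VARCHAR", "SSN": "NUMBER"},
        "ORDERS": {"NOTES": "VARIANT"},
    }
    statements, gaps = plan_schema(flagged)
    print(len(statements), "statements;", len(gaps), "gaps:", gaps)
```

The gap list is the useful part: a numeric SSN or a VARIANT blob needs its own policy for that data type, and it is better to hear about it in the report than in the audit.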

Deeper why: architecture shift from declarative to conversational governance. Policies were static, hand-written DDL. Now dynamic, intent-first. Roles? Still king. CoCo runs as you, so it can’t change any access grant your role couldn’t change itself. That’s the guardrail genius; prevents shadow IT.

One hitch: early days, so hallucinations are possible on edge cases (nested JSON PII?). The damage is bounded by what your role can see and change. And cost? Cortex queries aren’t free, but cheaper than eng hours.

Fintechs, healthcare — they’re salivating. Presto’s audit? Passed Monday. Coffee still hot.

This isn’t hype. It’s the quiet unlock: AI as the data plumber, not the architect. Snowflake just made dirty work vanish, forcing rivals to catch up. Watch warehouses turn into self-healing fortresses.


Frequently Asked Questions

What is Snowflake Cortex Code? CoCo is Snowflake’s AI agent in Snowsight that turns natural language prompts into executed Snowflake SQL for tasks like PII classification, tagging, and masking, all in one go.

How does Snowflake Cortex Code handle PII automatically? It classifies data with SYSTEM$CLASSIFY, applies PII tags in batch, creates role-based masking policies, and chains them smoothly under your privileges.

Will Snowflake CoCo replace data engineers? Nah — it handles boilerplate, freeing them for architecture and strategy. Think power tool, not replacement.

Written by Sarah Chen

AI research editor covering LLMs, benchmarks, and the race between frontier labs. Previously at MIT CSAIL.

Originally reported by Towards AI
