Free PHI De-ID API for Clinical Text & LLMs

Healthcare AI devs have been chained to endless BAAs just to run clinical notes through GPT. This free de-identification API flips the script: scrub PHI first, LLM second. No contracts needed.

Key Takeaways

  • Free API de-identifies clinical text in about 7 ms on average, so the LLM provider never needs a BAA.
  • Targets HIPAA Safe Harbor's 18 identifier types with regex; add spaCy NER to catch bare names.
  • Changes healthcare AI pipelines: works with any LLM, no vendor contracts.

Healthcare AI teams were sweating bullets. Sign a BAA with OpenAI? Fine. But Claude? Llama? Every open-source LLM under the sun? That’s a compliance nightmare — a forest of legalese nobody wants to plant.

Enter Tiamat’s free de-identification API. It strips PHI from clinical text before your LLM even blinks. Suddenly, you’re free. No BAA for the model provider. Just clean, anonymized notes ready for summarization, analysis, whatever.

This changes everything.

What Everyone Expected (And Why They Were Wrong)

Picture this: your compliance officer looming, demanding ironclad BAAs for every LLM ping. That’s the status quo — painful, expensive, and oh-so-2023. Teams hacked around it with on-prem models (slow) or vendor lock-in (pricey). Nobody saw a free, drop-in API coming that hits HIPAA Safe Harbor on the nose.

But here’s Tiamat. A simple POST to https://tiamat.live/api/scrub. Feed it messy clinical text packed with SSNs, DOBs, MRNs. Out pops scrubbed gold: “Patient seen by [NAME_1], DOB [DATE_1]” — with a handy entity map to restore if needed.

It’s elegant. Brutally so.

And fast — 7ms average latency. Your LLM bottleneck laughs.

The workaround that actually works: de-identify the text before the LLM call. The LLM never sees PHI. No BAA needed for the LLM provider.

That’s straight from the source. No fluff.

How This PHI Scrubbing API Nails the Dirty Work

HIPAA’s Safe Harbor demands nuking 18 identifier types. Names. Dates. Phones. SSNs. MRNs. The works — even quirky stuff like VINs or biometric IDs.

Tiamat’s regex beast handles the structured slop: \d{3}-\d{2}-\d{4} for SSNs, US phone formats, emails, IPs, URLs. Dates in every flavor — MM/DD/YYYY or “March 22, 1975.” ZIPs. Titled names like “Dr. Smith.”
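To make the regex pass concrete, here is a minimal local sketch of the idea. It is not Tiamat's actual implementation: the pattern set, the scrub function, and the token format are assumptions for illustration, and a real Safe Harbor scrubber would need far more patterns than these four.

```python
import re

# Illustrative patterns only (assumed, not Tiamat's real rule set).
PATTERNS = {
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "PHONE": re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b"),
    "DATE": re.compile(r"\b\d{1,2}/\d{1,2}/\d{4}\b"),
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),
}


def scrub(text: str) -> tuple[str, dict[str, str]]:
    """Replace each match with a numbered token and record token -> original."""
    entities: dict[str, str] = {}
    for label, pattern in PATTERNS.items():
        counter = 0

        def replace(match: re.Match) -> str:
            nonlocal counter
            counter += 1
            token = f"[{label}_{counter}]"
            entities[token] = match.group(0)
            return token

        text = pattern.sub(replace, text)
    return text, entities
```

Calling scrub("SSN 123-45-6789, DOB 03/22/1975") hands back the tokenized text plus a map you can keep on your side of the fence.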

Bare names? Trickier. “Jane Doe” sans title needs NER (named-entity recognition). They admit it: pair with a local spaCy pipeline (en_core_sci_md, a scispaCy model) for full coverage on unstructured notes. Regex first for speed, NER for the rest.
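The "regex first, NER second" layering can be sketched like this. Everything here is a hypothetical illustration: scrub_layered, the NameFinder type, and spacy_name_finder are made-up names, the single SSN pattern stands in for the full regex pass, and the "PERSON" label is an assumption, so check which labels your chosen model actually emits.

```python
import re
from typing import Callable, Iterable

# A name finder returns (start, end) character spans of names in the text.
NameFinder = Callable[[str], Iterable[tuple[int, int]]]

SSN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")  # stand-in for the full regex pass


def scrub_layered(text: str, find_names: NameFinder) -> str:
    # Pass 1: cheap regex for structured identifiers.
    text = SSN.sub("[SSN]", text)
    # Pass 2: NER for bare names. Replace right-to-left so earlier
    # offsets stay valid (overlapping spans are not handled here).
    for start, end in sorted(find_names(text), reverse=True):
        text = text[:start] + "[NAME]" + text[end:]
    return text


def spacy_name_finder(nlp, label: str = "PERSON") -> NameFinder:
    """Wrap an already-loaded spaCy pipeline as a NameFinder."""
    def find(text: str) -> list[tuple[int, int]]:
        return [(e.start_char, e.end_char) for e in nlp(text).ents if e.label_ == label]
    return find
```

With spaCy installed, you would pass spacy_name_finder(spacy.load(...)) as the second argument. Keeping the finder pluggable means the regex pass can be tested without loading any model.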

Here’s a Python snippet to drool over:

import requests

SCRUB_URL = "https://tiamat.live/api/scrub"

def analyze_note(clinical_text: str, llm_client, task: str) -> dict:
    """De-identify a clinical note, then send only the scrubbed text to an LLM."""
    # Step 1: strip PHI identifiers
    response = requests.post(SCRUB_URL, json={"text": clinical_text}, timeout=10)
    response.raise_for_status()  # fail loudly rather than fall back to raw text
    scrub = response.json()
    # Step 2: LLM call with clean text, so the model never sees PHI
    analysis = llm_client.complete(
        f"Task: {task}\n\nClinical note:\n{scrub['scrubbed']}"
    )
    return {
        "analysis": analysis,
        "phi_removed": scrub["count"],
        "entity_map": scrub["entities"],  # for re-identification if needed
    }

Boom. Pipeline secured. Your LLM stays blissfully ignorant.
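If you do need identifiers back in the LLM's output, the entity map makes that a local string operation. A minimal sketch, assuming the map is a flat dict of token to original value (the article does not spell out the exact shape of the "entities" field, so treat that as a guess); restore is a hypothetical helper name.

```python
import re


def restore(text: str, entity_map: dict[str, str]) -> str:
    """Swap tokens like [NAME_1] back to their original values.

    Assumes entity_map maps token -> original string. A single regex pass
    is used so a restored value is never re-scanned for tokens.
    """
    if not entity_map:
        return text
    pattern = re.compile("|".join(re.escape(token) for token in entity_map))
    return pattern.sub(lambda m: entity_map[m.group(0)], text)
```

Run this only inside your own trusted boundary: the moment you restore, the text is PHI again.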

Free tier? 100 requests/day, no signup. Prod at $9/month. Docs at tiamat.live/docs.

Skeptical? Test it. Curl their demo. It’ll work.

This isn’t vaporware.

Is This HIPAA Bulletproof — Or Just Clever Hype?

Let’s cut the BS. Safe Harbor says remove those 18 categories, have no actual knowledge that what’s left could identify the patient, and it’s de-identified. No PHI. Send to any LLM, BAA-free.

But — and it’s a big but — unstructured notes are chaos. Regex misses context sometimes. Bare names slip through without NER. That’s why they push spaCy combo.
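One way to hedge against misses is a fail-closed check between the scrubber and the LLM call. This is a sketch under stated assumptions: the patterns, ResidualPHIError, and assert_no_residual_phi are hypothetical names, and a check like this only catches structured leftovers. Bare names still need the NER pass.

```python
import re

# Patterns that should never survive scrubbing (illustrative, not exhaustive).
RESIDUAL_CHECKS = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "slash_date": re.compile(r"\b\d{1,2}/\d{1,2}/\d{4}\b"),
}


class ResidualPHIError(ValueError):
    """Raised when scrubbed text still looks like it contains identifiers."""


def assert_no_residual_phi(scrubbed: str) -> str:
    """Return the text unchanged, or raise if any residual pattern matches."""
    hits = [name for name, pattern in RESIDUAL_CHECKS.items() if pattern.search(scrubbed)]
    if hits:
        raise ResidualPHIError(f"possible PHI left in text: {', '.join(hits)}")
    return scrubbed
```

Wrap the scrubbed text in assert_no_residual_phi(...) before it goes into the prompt, and a miss becomes an exception instead of a leak.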

My unique take? This echoes the GDPR gold rush of 2018. Back then, anonymization tools like Presidio exploded as fines loomed. Fast-forward: they’re table stakes. Tiamat? It’ll be the Presidio of PHI scrubbing — but for LLMs. Bold prediction: by 2025, every healthcare API gateway bundles this. Vendors who don’t? Left in the dust.

Compliance chats get easy: “LLM sees zero PHI. Safe Harbor pre-LLM. Here’s the API log.”
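What that log might look like is up to you; the article does not describe Tiamat's own logging. As a rough sketch, assuming you keep your own audit trail, a record could store a hash of the scrubbed text and the PHI count rather than any note content. The audit_record function and its field names are hypothetical.

```python
import hashlib
import json
from datetime import datetime, timezone


def audit_record(scrubbed_text: str, phi_count: int, model: str) -> str:
    """Build one JSON log line that proves what was sent without storing it."""
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "model": model,
        "phi_removed": phi_count,
        # Hash only: the log itself should hold no note content.
        "scrubbed_sha256": hashlib.sha256(scrubbed_text.encode("utf-8")).hexdigest(),
    }
    return json.dumps(record, sort_keys=True)
```

Append one line per LLM call and you have something concrete to hand over when the questions start.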

No begging vendors for signatures.

Dry humor alert: your legal team’s suddenly your best friend.

Why Does This Matter for Healthcare Devs Right Now?

LLMs crave data. Clinical notes? Goldmines for summaries, insights, drug discovery. But PHI walls block it.

This API tears ‘em down. Works with GPT, Claude, Mistral, local Llama — pick your poison.

Latency is negligible. The tiers are cheap. Found an edge case? The comments are open.

One punchy caveat: the entity map isn’t re-identification magic for every case. Use the restore tokens wisely, or skip restoring altogether if the downstream output should stay anonymous.

Teams building HIPAA AI? Integrate yesterday.

Others? Watch. This sparks a de-ID arms race.

Ignoring it? You’re the sucker signing BAAs.

The Free Ride Ends Here (Prod Realities)

100/day free. Fine for prototypes. Prod? Upgrade.

But $9/month for unlimited-ish? Steal.

Compare to building your own scrubber. Months of regex hell, false positives, audit nightmares. Pass.

Tiamat’s battle-tested on real notes. You hack less.


Frequently Asked Questions

What does Tiamat’s PHI de-identification API do?

Strips 18 HIPAA identifiers from clinical text via regex + optional NER. Returns scrubbed text with restore tokens. LLM-safe.

How to use free PHI scrub API for LLMs?

POST JSON to https://tiamat.live/api/scrub with {"text": "your note"}. 100/day free, no auth.

Does this API replace BAA for healthcare LLMs?

Yes — if you de-ID first per Safe Harbor. LLM sees no PHI. Works with any provider.

Written by Elena Vasquez

Senior editor and generalist covering the biggest stories with a sharp, skeptical eye.

Originally reported by dev.to
