Not entirely. De-identification under HIPAA's Safe Harbor method removes 18 types of identifiers (including names, email addresses, dates, and record numbers), but the clinical content of a session remains, and studies have shown that de-identified data can sometimes be re-identified when combined with other datasets. "De-identified" lowers risk; it doesn't make data truly anonymous.
What HIPAA Safe Harbor actually removes
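The Safe Harbor method requires stripping 18 specific types of identifiers, including names, geographic subdivisions smaller than a state, all elements of dates (except the year) tied to an individual, phone numbers, email addresses, and medical record numbers. Once these are removed, and provided the organization has no actual knowledge that the remaining information could identify someone, the data is considered "de-identified" under HIPAA.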
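To make the idea concrete, here is a minimal sketch of field-level stripping in Python. The record layout, field names, and the `strip_identifier_fields` helper are all hypothetical, and the sketch covers only a few of the 18 identifier types, so it should be read as an illustration rather than a compliant implementation.

```python
from datetime import date

# Hypothetical field names for illustration; real record schemas differ,
# and this covers only a handful of the 18 Safe Harbor identifier types.
DIRECT_IDENTIFIER_FIELDS = {"name", "email", "phone", "medical_record_number"}
SUB_STATE_GEOGRAPHY_FIELDS = {"street", "city", "zip_code"}


def strip_identifier_fields(record: dict) -> dict:
    """Return a copy of `record` without the listed identifier fields."""
    removed = DIRECT_IDENTIFIER_FIELDS | SUB_STATE_GEOGRAPHY_FIELDS
    cleaned = {k: v for k, v in record.items() if k not in removed}
    # Safe Harbor allows keeping the year of a date, but not the month or day.
    session_date = cleaned.pop("session_date", None)
    if isinstance(session_date, date):
        cleaned["session_year"] = session_date.year
    return cleaned


# Made-up example record; no real data.
record = {
    "name": "Alex Example",
    "email": "alex@example.com",
    "zip_code": "02139",
    "state": "MA",
    "session_date": date(2024, 5, 14),
    "transcript": "Client described a recent conflict with a sibling.",
}

cleaned = strip_identifier_fields(record)
print(cleaned)
```

In this sketch the name, email, and ZIP code disappear and the date collapses to a year, but the `transcript` field passes through untouched. Any names or places mentioned inside the free text would need separate handling.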
| | De-identified | Truly anonymous |
|---|---|---|
| Direct identifiers removed | Yes | Yes |
| Re-identification possible | Sometimes, via linkage | No, by definition |
| Used to train AI | Frequently | Rarely the point |
Why de-identified ≠ anonymous for therapy
The content is still sensitive
A transcript stripped of names still contains the substance of a therapy session — trauma disclosures, relationship details, emotional patterns. For clinicians, the content is the confidential part, not just the name attached to it.
Re-identification is a known risk
Research has repeatedly shown that "anonymized" datasets can be re-identified when cross-referenced with other data. A landmark study estimated that 87% of the US population could be uniquely identified by just their 5-digit ZIP code, date of birth, and sex (Sweeney, 2000). Rich clinical narratives can be even more distinctive.
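A linkage attack of this kind can be sketched in a few lines of Python. Everything below is made up for illustration: the rows, the field names, and the helpers `unique_fraction` and `link` are assumptions, and the "de-identified" rows deliberately keep ZIP code and birth date (which Safe Harbor would remove) to mirror the kind of dataset the Sweeney study describes.

```python
from collections import Counter

QUASI_IDENTIFIERS = ("zip", "birth_date", "sex")

# Made-up rows with names removed but quasi-identifiers retained.
deidentified_rows = [
    {"zip": "02139", "birth_date": "1984-03-12", "sex": "F", "note": "session notes A"},
    {"zip": "02139", "birth_date": "1984-03-12", "sex": "M", "note": "session notes B"},
    {"zip": "02140", "birth_date": "1990-07-01", "sex": "F", "note": "session notes C"},
    {"zip": "02140", "birth_date": "1990-07-01", "sex": "F", "note": "session notes D"},
]

# Made-up "public" dataset (for example, a voter roll) that includes names.
public_rows = [
    {"name": "Alex Example", "zip": "02139", "birth_date": "1984-03-12", "sex": "F"},
]


def key(row: dict) -> tuple:
    """Combine the quasi-identifier values into one lookup key."""
    return tuple(row[field] for field in QUASI_IDENTIFIERS)


def unique_fraction(rows: list) -> float:
    """Share of rows whose quasi-identifier combination appears exactly once."""
    counts = Counter(key(r) for r in rows)
    return sum(1 for r in rows if counts[key(r)] == 1) / len(rows)


def link(deid: list, public: list) -> list:
    """Match named public rows to de-identified rows with a unique key."""
    counts = Counter(key(r) for r in deid)
    unique_rows = {key(r): r for r in deid if counts[key(r)] == 1}
    return [
        (p["name"], unique_rows[key(p)]["note"])
        for p in public
        if key(p) in unique_rows
    ]


print(unique_fraction(deidentified_rows))  # 0.5
print(link(deidentified_rows, public_rows))  # [('Alex Example', 'session notes A')]
```

Rows C and D share the same combination, so they cannot be told apart this way; rows A and B are each unique, and one of them links straight back to a name. Larger groups of people sharing the same quasi-identifier values mean lower linkage risk.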
"De-coupled" is a vendor promise, not a law
When a platform says transcripts are "de-coupled and cannot be re-linked," that's a description of their internal process, not a legal guarantee. It's only as strong as their engineering and their incentives. (See "Do AI scribes train on your therapy data?")
What this means for your consent process
If you use a tool that retains de-identified data, your client consent should say so plainly. Telling a client "it's anonymized" when it's technically "de-identified and used to train AI" risks misleading them. See client consent for AI note-taking for language that holds up.
Where Eclio stands
Eclio sidesteps the entire debate: we don't retain your transcripts to train AI, de-identified or otherwise. And our upcoming local mode keeps the transcript on your device, so there's no copy on a vendor's servers to de-identify in the first place.