Welcome to the LimeSurvey Community Forum

From "ethical anonymity" to provable anonymity — a differential-privacy plugin c

4 hours 26 minutes ago #274602 by VERAPROTOCOLE
Thanks for pushing on this, Holch. You're right that my previous answer didn't fully address what you're describing. Let me separate the two issues clearly, because they need different answers.
1. The survey creator passing token-table details into the response table
You're describing a real risk: in a standard LimeSurvey setup, the survey creator is the same entity that controls both the token table (with attributes like department) and the survey configuration. Whoever can configure the survey to use the department attribute could, with that same access, also leak other token-table fields (name, email, token) into the response table. I agree — within LimeSurvey alone, there's no separation between "the authority that knows the attribute" and "the system that processes the survey." It's one and the same actor.
This is exactly the gap VERA is built to address — not by hiding the attribute from the survey creator (it can't, and doesn't try to), but by splitting that single role into separate ones that never share the full picture:
The HR/organizer role generates a local opaque identifier per participant (never the email itself) and sends VERA only (opaque_id, department) — never identity.
VERA returns (opaque_id, token) without ever seeing the email.
HR alone reconstitutes the mapping locally and sends the token through a standard delivery channel (email/SMS) — which never receives or sees the department, only "send this content to this address."
The response, once submitted with the token, is processed and aggregated by VERA without ever being combined with identity or department at the individual level.
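The four steps above can be sketched roughly as follows. This is only an illustration of the data flow, not VERA's actual code: the function names (`hr_prepare`, `issue_tokens`, `hr_build_deliveries`), the record layout, and the sample addresses are all hypothetical, and `issue_tokens` is merely a stand-in for whatever VERA does when it issues tokens.

```python
import secrets

# Hypothetical sample data; none of these names come from VERA or LimeSurvey.
hr_records = [
    {"email": "alice@example.org", "department": "Sales"},
    {"email": "bob@example.org", "department": "IT"},
]

def hr_prepare(records):
    """HR side: assign a random opaque ID to each participant."""
    local_map = {}   # opaque_id -> email, stays in HR's local files
    outbound = []    # the only data HR sends to the token issuer
    for rec in records:
        opaque_id = secrets.token_hex(16)
        local_map[opaque_id] = rec["email"]
        outbound.append((opaque_id, rec["department"]))
    return local_map, outbound

def issue_tokens(outbound):
    """Issuer side (stand-in for VERA): sees opaque IDs and departments only."""
    return [(opaque_id, secrets.token_urlsafe(16)) for opaque_id, _dept in outbound]

def hr_build_deliveries(local_map, issued):
    """HR side: rejoin tokens with emails locally; department is left out."""
    return [{"to": local_map[opaque_id], "content": token}
            for opaque_id, token in issued]

local_map, outbound = hr_prepare(hr_records)
issued = issue_tokens(outbound)
deliveries = hr_build_deliveries(local_map, issued)
```

In this sketch the issuer only ever receives `outbound`, and the delivery channel only ever receives `deliveries`, so neither holds identity and department together.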
So: HR/the organizer still knows (identity, department). I'm not claiming otherwise, and I want to be precise about that, because overclaiming would be dishonest. What VERA prevents is the propagation of that knowledge into the response-processing system itself. Outside of HR's own local files, there is no single point in the pipeline where identity, department, and an individual response ever coexist. VERA's contribution is to introduce two independent boundaries (HR ↔ VERA, VERA ↔ delivery channel) that LimeSurvey's own architecture doesn't have by default.
I want to be honest about what this doesn't solve: HR (or whoever manages attribution) still has the underlying knowledge at the moment of issuance — that's an inherent property of any system distributing a right tied to a real-world attribute, not something cryptography can remove. What changes is who else has access to it downstream, and that's the actual question you raised.
On self-declaration specifically: you're right that if self-declaration is acceptable for the use case, LimeSurvey's native anonymous mode already solves the first problem on its own, and there is no need for VERA. VERA's separation of roles only matters when self-declaration isn't good enough for the task, which is exactly your cross-department evaluation example: you need the department attribute to be reliable, not just claimed by the respondent. That is the case where (b) or (c) apply, and where the leakage risk you described is real.
2. Small sample sizes (your real-world client example with n=1 or n=2)
This one VERA handles directly through a hard threshold, not through noise alone.
VERA refuses to publish any cohort result below K_MIN=100: not a degraded or noisy version, nothing at all. This isn't arbitrary. At ε=0.5, n≥100 is required to keep aggregate error within ±5%; this is demonstrated and reproducible in demo_cohortes.py in the repo, which simulates exactly your case: a small department (e.g. n=7) gets refused, while a larger one (e.g. n=120) gets published with bounded error.
For your actual client case (departments at n=1 or n=2): with VERA, those departments would simply never produce a published result. If an organizer disables or bypasses that threshold to publish anyway, the result falls outside VERA's guarantee entirely — it's no longer "VERA with a small cohort," it's a choice made outside the protocol.
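To make the "hard refusal, then noise" idea concrete, here is a minimal sketch. It is not the code from demo_cohortes.py: it assumes a plain Laplace mechanism on a yes/no count with sensitivity 1, and the names `laplace_noise` and `publish_cohort` are hypothetical. Only K_MIN=100 and ε=0.5 are taken from the description above.

```python
import random

K_MIN = 100     # minimum cohort size (value from the post)
EPSILON = 0.5   # privacy budget (value from the post)

def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponentials."""
    return rng.expovariate(1 / scale) - rng.expovariate(1 / scale)

def publish_cohort(yes_count, n, rng, epsilon=EPSILON, k_min=K_MIN):
    """Return a noisy proportion, or None when the cohort is too small."""
    if n < k_min:
        return None  # hard refusal: nothing is released, not even a noisy value
    # Assumption: a counting query changes by at most 1 per respondent,
    # so the sensitivity is 1 and the Laplace scale is 1 / epsilon.
    noisy_count = yes_count + laplace_noise(1 / epsilon, rng)
    # Clamp to [0, 1] so the released proportion stays meaningful.
    return min(1.0, max(0.0, noisy_count / n))

rng = random.Random(42)
small = publish_cohort(yes_count=4, n=7, rng=rng)      # refused, returns None
large = publish_cohort(yes_count=78, n=120, rng=rng)   # released with noise
```

The key design point is that the size check comes before any noise is drawn, so a cohort of n=1 or n=2 never yields a value that could be averaged or otherwise reverse-engineered.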
Summary
Token-table leakage into responses → prevented by design, because no component outside HR ever receives both identity and department together — not by fixing LimeSurvey's architecture, but by introducing a separation of roles that LimeSurvey alone doesn't have.
Small cohorts → solved by a hard refusal to publish below K_MIN=100, demonstrated with numbers in the repo.
I'd rather be precise about where the guarantee actually applies than oversell it. Happy to keep being challenged on this.

Moderators: holch, tpartner
