Threat Model
We design against three adversary classes:
- The Curious Manager: wants to know who said something. Has access to Teams/Zoom admin, email logs, corporate network monitoring. Motivation: performance management, retaliation.
- The IT/Legal Team: operates under legal pressure. Has access to eDiscovery tools, Microsoft 365 compliance center, corporate email servers, potentially a subpoena. Motivation: compliance, litigation hold.
- The Abuser: a participant using anonymity to harass, threaten, or probe moderation. Motivation: harm to others or circumventing content policy.
We defeat the first two by architecture. We mitigate the third by moderation and rate limiting, without weakening anonymity against the first two.
Data Retention Matrix
Every datum we touch, what happens to it, and when it's gone. TTL=0 means it never reaches storage.
| Data element | Collected? | Stored? | TTL | Notes |
|---|---|---|---|---|
| Sender phone number (SMS) | Yes (Twilio delivers it) | Never stored raw | TTL=0 | Immediately HMAC-SHA256 hashed for routing lookup only. Raw number discarded. |
| Sender IP address (web form) | Yes (used for rate limiting) | Never stored | TTL=0 | The rate limit check reads the IP in memory only; the IP is never written to session, log, or relay payload. |
| Message content (web, SMS) | Yes | Never stored | TTL=0 | Moderated → relayed to meeting chat → not retained. Only a message count integer is stored per session. |
| Meeting session metadata | Yes | In-memory only | Session lifetime + 30s | Stored in Node.js process RAM only. Session ends when meeting ends. Process restart wipes all sessions. |
| Teams conversation ID | Yes | In-memory only | Session lifetime | Required to post to Teams chat. Cleared when session ends. |
| Phone number (HMAC hash, routing) | Derived | In-memory only | Session lifetime | Used to route inbound SMS to the correct session. Cannot be reversed to recover the phone number without the server secret. |
| Voice call recipient (email/phone) | Yes (user provides) | Never stored | TTL=0 | HMAC-hashed for cooldown enforcement. Raw identifier discarded after hash computed. |
| Abuse report content | Yes (if report submitted) | Logs only | 30 days (log rotation) | Session ID + reason only. No sender identity. Logs are not backed up to persistent storage by design. |
| Stripe payment data | Yes | At Stripe, not us | Per Stripe policy | We receive only a payment intent ID. Card data never touches our servers. Stripe's privacy policy applies. |
| Waitlist email | Yes (if user submits) | Yes, disk | Until user requests deletion | Voluntarily provided. Not linked to any session or message. Can be deleted on request: privacy@notonrecord.com |
| Moderation API payloads | Yes (message text sent to OpenAI) | Not by us | Per OpenAI policy | Only message text is sent. No session ID, sender identity, or metadata is included in the moderation API call. |
Identity Stripping
SMS path
Twilio delivers an inbound SMS webhook to our server. The payload includes the sender's phone number. Our processing order:
- Receive webhook from Twilio
- Compute HMAC-SHA256 of the normalized phone number (E.164 format)
- Use hash to look up active session (routing only)
- Discard raw phone number from memory
- Extract message text only
- Moderate content (text only: no phone, no hash sent to OpenAI)
- Relay "💬 Anonymous: [message]" to Teams/Zoom chat
- Increment session message counter (integer only)
At no point is the phone number stored, logged, or included in the relay payload. The HMAC hash is stored in-memory only for session routing and is not logged.
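The SMS processing order above can be sketched as a single handler. This is a minimal illustration under stated assumptions, not the production code: `handleInboundSms`, `hashPhone`, `sessionsByPhoneHash`, `moderate`, and `relay` are hypothetical names, and the injected `moderate` and `relay` functions stand in for the moderation API call and the Teams/Zoom post. `From` and `Body` are the field names Twilio uses in its inbound SMS webhook.

```javascript
const crypto = require('node:crypto');

// Hypothetical in-memory routing table: phone hash -> session object.
const sessionsByPhoneHash = new Map();

function hashPhone(rawPhone, secret) {
  // Normalize to +1XXXXXXXXXX (assumes US numbers, as in the document).
  const normalized = rawPhone.replace(/\D/g, '').replace(/^1?/, '+1').slice(0, 12);
  return crypto.createHmac('sha256', secret).update(normalized).digest('hex');
}

// `moderate` and `relay` are injected stand-ins; both receive text only.
async function handleInboundSms(webhookBody, { secret, moderate, relay }) {
  // Hash for routing. The raw number is never assigned to anything
  // that outlives this call.
  const session = sessionsByPhoneHash.get(hashPhone(webhookBody.From, secret));
  if (!session) return { relayed: false, reason: 'no-session' };

  // Only the message text continues down the pipeline.
  const text = String(webhookBody.Body || '');

  // Moderation sees the text and nothing else.
  if (!(await moderate(text))) return { relayed: false, reason: 'blocked' };

  // Relay, then increment the integer counter.
  await relay(session.conversationId, `💬 Anonymous: ${text}`);
  session.messageCount += 1;
  return { relayed: true };
}
```

Passing the moderation and relay functions in as parameters makes it easy to verify in a unit test that neither ever receives the phone number or its hash.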
Web form path
User submits message via browser form. Processing order:
- Receive HTTP POST
- Rate limit check using IP address (express-rate-limit, in-memory only)
- IP address goes no further: not stored, not logged, not included in session
- Extract message text
- Moderate content
- Relay to Teams/Zoom chat
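The web form steps above can be illustrated with a small stand-in for the rate limiter. The document uses express-rate-limit; the `createLimiter` function below is a hypothetical fixed-window replacement written only to show the data flow, and `handleWebSubmission`, `moderate`, and `relay` are likewise assumed names. The point of the sketch is that the IP is consumed by the limiter check and nothing downstream receives it.

```javascript
// Hypothetical stand-in for an in-memory rate-limit store:
// a fixed-window counter keyed on IP, held only in process memory.
function createLimiter({ windowMs, max }) {
  const hits = new Map(); // ip -> { count, windowStart }
  return function allow(ip, now = Date.now()) {
    const entry = hits.get(ip);
    if (!entry || now - entry.windowStart >= windowMs) {
      hits.set(ip, { count: 1, windowStart: now });
      return true;
    }
    entry.count += 1;
    return entry.count <= max;
  };
}

// Only `text` leaves this function; `ip` is used for the limiter check alone.
async function handleWebSubmission({ ip, text }, { allow, moderate, relay }) {
  if (!allow(ip)) return { accepted: false, reason: 'rate-limited' };
  const message = String(text || '');
  if (!(await moderate(message))) return { accepted: false, reason: 'blocked' };
  await relay(`Anonymous: ${message}`);
  return { accepted: true };
}
```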
Voice call path
User provides recipient phone/email and message text. Processing order:
- Pre-moderation check (before payment)
- HMAC-hash recipient for cooldown check (raw identifier not stored)
- Stripe payment intent created (we store payment intent ID only)
- On payment confirm: TTS generates audio, Twilio places call
- Recipient identifier discarded; only cooldown hash retained in-memory
Keyed Hashing (HMAC-SHA256)
We use HMAC-SHA256 rather than raw SHA-256 wherever we hash identifying information (phone numbers, email addresses).
Implementation
const crypto = require('crypto');

// Normalize first to prevent bypass via formatting variation
// (digits only, optional leading 1, result +1XXXXXXXXXX)
const normalized = phone.replace(/\D/g, '').replace(/^1?/, '+1').slice(0, 12);

// HMAC-SHA256 with server secret
const hash = crypto
  .createHmac('sha256', process.env.HMAC_SECRET)
  .update(normalized)
  .digest('hex');
Normalization
We normalize identifiers before hashing to prevent bypass via formatting variation:
- Phone numbers: Stripped to digits, normalized to E.164 (+1XXXXXXXXXX)
- Email addresses: Lowercased. We intentionally do not normalize Gmail +addressing to avoid false coupling between unrelated users who happen to share a domain prefix.
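As an illustration of the two normalization rules, the sketch below wraps them in small helpers and feeds the result to the keyed hash. The function names `normalizePhone`, `normalizeEmail`, and `hashIdentifier` are hypothetical, and the phone rule assumes US numbers only, as the `+1` prefix in the document implies.

```javascript
const crypto = require('node:crypto');

// Strip non-digits, force a +1 prefix, cap at 12 characters (+1XXXXXXXXXX).
function normalizePhone(phone) {
  return phone.replace(/\D/g, '').replace(/^1?/, '+1').slice(0, 12);
}

// Lowercase only; any "+tag" suffix is deliberately left intact.
function normalizeEmail(email) {
  return email.toLowerCase();
}

// Keyed hash of an already-normalized identifier.
function hashIdentifier(normalized, secret) {
  return crypto.createHmac('sha256', secret).update(normalized).digest('hex');
}

// Two formattings of the same number produce the same hash.
const a = hashIdentifier(normalizePhone('(555) 123-4567'), 'example-secret');
const b = hashIdentifier(normalizePhone('+1 555 123 4567'), 'example-secret');
console.log(a === b);
```

Because plus-addressing is preserved, `user+a@gmail.com` and `user+b@gmail.com` hash to different values and are treated as distinct recipients.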
Secret rotation
The HMAC secret is set via environment variable (HMAC_SECRET). Rotation is supported but resets in-memory cooldown continuity. Existing sessions are unaffected since sessions are scoped to meeting lifetime. A process restart after rotation is sufficient.
Provider Metadata and Linkage
This is the most important architectural section for enterprise reviewers. The risk: NotOnRecord touches multiple providers (Twilio, Teams/Zoom, Stripe, OpenAI). Could a determined adversary join records across providers to identify a sender?
The joinable graph β what each provider sees
| Provider | What they hold | What they do NOT hold |
|---|---|---|
| Twilio | Sender phone (SMS), To/From numbers (voice), Call SID, timestamp | Message content, session ID, payer identity |
| Teams / Zoom | Bot posts "Anonymous: [message]", conversation ID, timestamp | Sender identity, session ID, phone number |
| Stripe | Card/payer identity, payment intent, payer IP, timestamp, amount | Message content, recipient identity, session ID |
| OpenAI | Message text, moderation timestamp | Sender identity, session ID, phone, IP, payment data |
| NotOnRecord | Session ID, message count (integer), HMAC phone hash (in-memory, session-scoped) | Sender phone (raw), sender IP, message content (not retained), payer identity |
Join path classification
We formally enumerate three categories of linkage risk:
- Architecturally impossible (shown as "Impossible" in the table): the data required for the join does not exist in any system we control or any provider we use
- ⚠️ Subpoena-dependent + probabilistic: requires coordinated legal action against multiple providers, and even then yields probabilistic inference rather than deterministic identity
- 🔶 User-opt-in: user explicitly chose a path that creates linkage; documented in UX; cannot be subpoenaed away because the user made the choice
| Join attempt | Meeting/SMS | Meeting/Web | Voice | Email |
|---|---|---|---|---|
| IP → content | Impossible | Impossible | Impossible | Impossible |
| Phone → content | ⚠️ Subpoena+TS | N/A | N/A | N/A |
| Payer → recipient | N/A | N/A | ⚠️ Subpoena+prob | ⚠️ Subpoena+prob |
| Payer → recording | N/A | N/A | ⚠️ Subpoena | N/A |
| Email → identity | N/A | N/A | 🔶 Opt-in | N/A |
| Content → identity | Impossible | Impossible | Impossible | Impossible |
Temporal obfuscation: the Bayesian model
For paid channels (voice, email), the primary residual linkage vector is timestamp correlation: an adversary with both a Stripe record (payer at time T₁) and a Twilio record (call at time T₂) could attempt to infer P(same person | T₂ − T₁ = Δ).
We address this with a non-uniform randomized delay between payment confirmation and call/email dispatch. The delay distribution is heavy-tailed:
P(Δ ∈ 10–60s) ≈ 60% // most calls
P(Δ ∈ 60–180s) ≈ 25%
P(Δ ∈ 3–8 min) ≈ 10%
P(Δ ∈ 8–20 min) ≈ 5% // long tail, high anonymity window
Why heavy-tailed rather than uniform? A uniform distribution over [5s, 120s] is statistically fingerprintable: an attacker with enough observations can model it and narrow the correlation window. A mixture distribution with long tails has higher entropy H(Δ), which directly reduces the posterior confidence of timestamp-based linkage.
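To make the distribution and the entropy comparison concrete, here is a minimal sketch that samples a delay from the four buckets above and compares differential entropy against the uniform [5s, 120s] alternative. Sampling uniformly inside each bucket is an assumption (the document gives only bucket weights), and `DELAY_BUCKETS`, `sampleDelaySeconds`, and `entropyNats` are hypothetical names.

```javascript
// Bucket boundaries in seconds, weights from the distribution above.
// Uniform sampling inside each bucket is an assumption of this sketch.
const DELAY_BUCKETS = [
  { min: 10, max: 60, weight: 0.60 },
  { min: 60, max: 180, weight: 0.25 },
  { min: 180, max: 480, weight: 0.10 },
  { min: 480, max: 1200, weight: 0.05 },
];

// Draw one delay in seconds. `rand` is injectable for deterministic tests.
function sampleDelaySeconds(rand = Math.random) {
  let u = rand();
  for (const b of DELAY_BUCKETS) {
    if (u < b.weight) return b.min + rand() * (b.max - b.min);
    u -= b.weight;
  }
  // Reached only through floating-point rounding when u is close to 1.
  return DELAY_BUCKETS[DELAY_BUCKETS.length - 1].max;
}

// Differential entropy (nats) of a piecewise-uniform density:
// h = sum over buckets of p * ln(width / p).
function entropyNats(buckets) {
  return buckets.reduce(
    (h, b) => h + b.weight * Math.log((b.max - b.min) / b.weight), 0);
}

const mixtureEntropy = entropyNats(DELAY_BUCKETS);
const uniformEntropy = entropyNats([{ min: 5, max: 120, weight: 1 }]);
console.log(mixtureEntropy > uniformEntropy);
```

Under these assumptions the mixture comes out at about 5.48 nats against about 4.74 nats for the uniform window, which is consistent with the entropy argument in the paragraph above.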
Formally: if Stripe events arrive as a Poisson process with rate λ₁ and Twilio events with rate λ₂, and the introduced delay follows distribution D, then the posterior probability that a specific Stripe event caused a specific Twilio event is:
P(match | Stripe=T₁, Twilio=T₂) ∝ P(Δ = T₂−T₁) · P(T₁ from λ₁) · P(T₂ from λ₂)
As H(D) increases:
→ More candidate Stripe events fall within the plausible delay window
→ Posterior P(match) drops from ≈1.0 toward 1/N where N = candidate set size
→ At scale (high λ), N grows → anonymity increases with volume
The goal is not zero correlation; it is plausible deniability under realistic adversary models. An adversary cannot determine with certainty which Stripe payment corresponds to which call. They can only make probabilistic inferences under conditions that require multi-party legal coordination.
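The posterior argument can be worked through numerically. The sketch below takes the adversary's view: given a set of candidate payment timestamps and one call timestamp, it weights each candidate by the delay density and normalizes. It assumes equal prior weight for every payment and a uniform density inside each delay bucket; `BUCKETS`, `delayDensity`, and `matchPosterior` are hypothetical names.

```javascript
// Delay buckets in seconds with their weights (from the distribution in
// this section). Uniform density inside each bucket is an assumption.
const BUCKETS = [
  { min: 10, max: 60, weight: 0.60 },
  { min: 60, max: 180, weight: 0.25 },
  { min: 180, max: 480, weight: 0.10 },
  { min: 480, max: 1200, weight: 0.05 },
];

// Probability density (per second) of observing a given delay.
function delayDensity(delta) {
  for (const b of BUCKETS) {
    if (delta >= b.min && delta < b.max) return b.weight / (b.max - b.min);
  }
  return 0; // outside the support: this payment cannot explain the call
}

// Posterior that each candidate payment produced the call at `callTime`,
// assuming every payment has the same prior weight. Times are in seconds.
function matchPosterior(paymentTimes, callTime) {
  const likelihoods = paymentTimes.map((t1) => delayDensity(callTime - t1));
  const total = likelihoods.reduce((a, b) => a + b, 0);
  return likelihoods.map((l) => (total === 0 ? 0 : l / total));
}

// One candidate: the match is certain. Four candidates in the same
// bucket: each gets roughly 1/4.
console.log(matchPosterior([0], 30));
console.log(matchPosterior([0, 5, 10, 15], 40));
```

With a single candidate payment the posterior is 1, which is the low-volume case; as more payments fall inside the plausible window, each candidate's share shrinks toward 1/N.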
Additional mitigations
- Meeting chat has no payment linkage: SMS and web form messages involve no Stripe events. Twilio and Teams are temporally separated by 50–500ms jitter. No payment join vector exists.
- No shared session identifier across providers: Session IDs are never sent to Twilio, Stripe, or OpenAI.
- Nginx access_log suppressed on all anonymous-path endpoints: No IP-to-action log exists at the reverse proxy layer. The "IP never stored" claim holds at all layers.
- Recording stored in-memory, not disk: Recordings are held in Node.js heap and served directly. No filesystem artifact is created. The reference is dropped immediately after serving, so the buffer becomes eligible for garbage collection.
Content Moderation & Degraded Mode
We moderate message content to prevent harassment, threats, and abuse. Moderation operates on message text only; no sender identity is included in moderation API calls.
Moderation state machine
NORMAL
↓ (3 consecutive API failures)
DEGRADED
↓ (3 consecutive API successes AND >60s since last failure)
NORMAL
The dual recovery condition (successes + time) prevents oscillation on intermittent outages where the API alternates between passing and failing.
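The two transitions and the dual recovery condition can be sketched as a small class. This is an illustrative implementation only: the class name `ModerationHealth` and its method names are hypothetical, the field names follow the health endpoint payload, and timestamps are plain millisecond numbers rather than ISO strings.

```javascript
const FAILURE_THRESHOLD = 3;     // consecutive failures to enter DEGRADED
const SUCCESS_THRESHOLD = 3;     // consecutive successes required to recover
const MIN_QUIET_MS = 60 * 1000;  // minimum time since the last failure

class ModerationHealth {
  constructor() {
    this.degraded = false;
    this.failStreak = 0;
    this.successStreak = 0;
    this.lastFailureTs = null;
  }

  // Call after every failed moderation API request.
  recordFailure(now = Date.now()) {
    this.failStreak += 1;
    this.successStreak = 0;
    this.lastFailureTs = now;
    if (this.failStreak >= FAILURE_THRESHOLD) this.degraded = true;
  }

  // Call after every successful moderation API request.
  recordSuccess(now = Date.now()) {
    this.successStreak += 1;
    this.failStreak = 0;
    const quietLongEnough =
      this.lastFailureTs === null || now - this.lastFailureTs > MIN_QUIET_MS;
    // Both conditions must hold; either one alone keeps the system degraded.
    if (this.degraded && this.successStreak >= SUCCESS_THRESHOLD && quietLongEnough) {
      this.degraded = false;
    }
  }
}
```

An API that alternates pass, fail, pass never accumulates three successes in a row, and a quick burst of successes right after an outage is held back by the 60-second condition, so neither pattern flips the state back and forth.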
Degraded mode policy
| Channel | Degraded mode behavior | Rationale |
|---|---|---|
| SMS | Blocked entirely | High velocity harassment vector; no substitute filter sufficient |
| Voice | Blocked entirely | High impact per-message; audio harassment cannot be easily keyword-filtered |
| Web form | Basic keyword filter + existing rate limits | Lower velocity; rate limiting provides secondary mitigation |
| Email | Basic keyword filter + existing rate limits | Lower velocity; per-recipient cooldown provides secondary mitigation |
Anti-probe rate limiting
The moderation API is a content oracle: attackers could probe it to learn what content is blocked and craft circumventions. Mitigation: identical message content submitted more than 3 times within 5 seconds is rejected without calling the moderation API. The rejection is indistinguishable from a moderation failure.
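A minimal sketch of the repeat check follows. Keying the window on a SHA-256 digest of the text, so that raw content is not held between requests, is an assumption of this example rather than something the document states; `allowSubmission` and `recentByDigest` are hypothetical names.

```javascript
const crypto = require('node:crypto');

const WINDOW_MS = 5000; // 5-second window
const MAX_REPEATS = 3;  // the 4th identical submission in the window is rejected
const recentByDigest = new Map(); // content digest -> recent timestamps (ms)

// Returns true if the message may proceed to the moderation API.
function allowSubmission(text, now = Date.now()) {
  const digest = crypto.createHash('sha256').update(text).digest('hex');
  const recent = (recentByDigest.get(digest) || [])
    .filter((t) => now - t < WINDOW_MS);
  recent.push(now);
  recentByDigest.set(digest, recent);
  return recent.length <= MAX_REPEATS;
}
```

A real deployment would also need to evict digests that have gone quiet, since this map only prunes an entry when the same content is seen again.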
Moderation before payment
For paid channels (voice, email), content is moderated before Stripe payment intent creation. This eliminates paid-abuse scenarios (where an attacker pays to send prohibited content) and reduces chargeback exposure.
Health endpoint
Moderation system state is exposed at GET /health for operational monitoring:
{
  "moderation": {
    "available": true,
    "degraded": false,
    "failStreak": 0, // consecutive failures (0 = healthy)
    "successStreak": 14, // consecutive successes since last recovery
    "lastFailureTs": null,
    "lastSuccessTs": "2026-03-01T09:00:00Z",
    "degradedSince": null
  }
}
Abuse Handling Policy
Per-recipient cooldown
For voice calls and email, the same recipient cannot be contacted more than once per 10 minutes from any session. The cooldown is keyed on an HMAC hash of the recipient identifier: we track targets, not senders.
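The cooldown can be sketched as a single map keyed on the recipient hash. The names `tryReserveRecipient` and `lastContactByHash` are hypothetical, and the function expects an identifier that has already been normalized.

```javascript
const crypto = require('node:crypto');

const COOLDOWN_MS = 10 * 60 * 1000; // 10 minutes
const lastContactByHash = new Map(); // recipient hash -> last contact time (ms)

// Returns true and starts a new cooldown if the recipient may be contacted;
// returns false if the recipient was contacted within the last 10 minutes.
function tryReserveRecipient(normalizedRecipient, secret, now = Date.now()) {
  const key = crypto
    .createHmac('sha256', secret)
    .update(normalizedRecipient)
    .digest('hex');
  const last = lastContactByHash.get(key);
  if (last !== undefined && now - last < COOLDOWN_MS) return false;
  lastContactByHash.set(key, now);
  return true;
}
```

Nothing about the sender appears in the key or the value, so the map records only that some target was contacted and when.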
Abuse report mechanism
Each delivered message includes a report link containing a signed opaque token:
- Voice calls: "To report this message, visit notonrecord.com/report and enter code XXXX"
- Emails: Report link appended to message body
The token is single-use and expires after 24 hours. It encodes the session ID in a non-guessable way: the report endpoint does not accept raw session IDs. This prevents:
- Spam reports by parties who never received the message
- Session enumeration via brute-forced report attempts
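One way to get the single-use, expiring, non-guessable properties described above is a random token ID with an HMAC signature, backed by an in-memory map. This is a sketch of that approach, not necessarily the scheme in production: `issueReportToken`, `redeemReportToken`, and `pendingReports` are hypothetical names, and the short spoken code used on voice calls is not modeled here.

```javascript
const crypto = require('node:crypto');

const TOKEN_TTL_MS = 24 * 60 * 60 * 1000; // 24 hours
const pendingReports = new Map(); // token id -> { sessionId, expiresAt }

function sign(tokenId, secret) {
  return crypto.createHmac('sha256', secret).update(tokenId).digest('base64url');
}

// The token carries a random ID and a signature, never the session ID itself.
function issueReportToken(sessionId, secret, now = Date.now()) {
  const tokenId = crypto.randomBytes(16).toString('base64url');
  pendingReports.set(tokenId, { sessionId, expiresAt: now + TOKEN_TTL_MS });
  return `${tokenId}.${sign(tokenId, secret)}`;
}

// Returns the session ID once, or null if the token is forged, expired, or reused.
function redeemReportToken(token, secret, now = Date.now()) {
  const [tokenId, signature = ''] = String(token).split('.');
  const expected = Buffer.from(sign(tokenId, secret));
  const given = Buffer.from(signature);
  if (given.length !== expected.length || !crypto.timingSafeEqual(given, expected)) {
    return null; // rejected before any lookup, so guesses reveal nothing
  }
  const entry = pendingReports.get(tokenId);
  pendingReports.delete(tokenId); // single-use, even if it turns out expired
  if (!entry || now > entry.expiresAt) return null;
  return entry.sessionId;
}
```

Checking the signature before touching the map means a brute-forced token fails without revealing whether any particular token ID exists.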
What triggers action
| Event | Automated action | What's logged |
|---|---|---|
| Moderation API blocks content | Message rejected; sender sees generic error | Session ID + blocked category (no message content, no sender) |
| Rate limit exceeded | Request rejected with 429 | Nothing (rate limit handled in-memory by express-rate-limit) |
| Abuse report submitted | Session flagged; ops notified | Session ID + reason string (max 500 chars) |
| Multiple independent reports within 1 hour | Session auto-throttled (future) | Session ID + report count |
Retention of abuse artifacts
Abuse reports are logged to stdout only (not persistent storage). Log rotation occurs at 30 days. No sender identity is present in any abuse log entry; by design, we do not have it.
Subpoena Resistance
If we receive a subpoena for records identifying who sent a specific anonymous message:
- We do not have a sender phone number or IP address: neither was ever stored
- We do not have message content: it was relayed and not retained
- We have a session ID, session timestamps, and a message count integer
- We have an HMAC hash of a phone number, keyed on a server secret we hold and kept in memory only for the session lifetime; this is of limited use without also having the source phone number to verify against
What a subpoena to us produces: a meeting happened, N anonymous messages were sent, the meeting started and ended at these times. Nothing linking any individual to any message.
Provider subpoena considerations
A subpoena to Twilio would reveal that a specific phone number sent a text to our Twilio number at a specific time. Twilio cannot see message content (we relay it; Twilio sees only the delivery transaction). A sophisticated adversary combining a Twilio subpoena with meeting participant lists narrows the field to meeting attendees who had their phone nearby. This is the residual privacy risk that we document honestly: we cannot protect against a scenario where an adversary has both provider subpoena responses and meeting attendance records.
For maximum protection, participants can use the web form from a non-corporate device or network; in that case, even Twilio is not in the picture.
Failure Modes
| Failure | Behavior | Privacy impact |
|---|---|---|
| Node process crash / restart | All sessions wiped; active meetings lose bot | None: wipe is the privacy-preserving outcome |
| OpenAI moderation API unavailable | Degraded mode (see §6, Content Moderation & Degraded Mode) | None: no identity data sent either way |
| Twilio webhook delivery failure | Message not relayed; sender sees no error (Twilio retries) | None |
| Teams Bot Framework token expiry | Relay fails; session error logged | None: message content not persisted on failure |
| HMAC secret compromise | An attacker with the secret can brute-force the in-memory phone hashes | Medium: rotate secret + restart; active sessions affected until restart |
| Rate limit store cleared (restart) | Rate limit counters reset; brief window of slightly higher permitted velocity | None: no identity data involved |
Operational Realities
This section documents constraints we cannot architect away; they are properties of operating on top of third-party infrastructure. We document them because enterprise reviewers deserve an honest picture, not a curated one.
Stripe as the strongest identity anchor
Stripe retains: card fingerprint, billing email, payer IP at checkout, payment timestamp, and amount. This is true regardless of anything we do. Stripe's privacy policy governs this data, not ours.
What we control: Stripe metadata fields (which we leave empty of message content and recipient identity), and our timing delay (which reduces timestamp correlation confidence). What we cannot control: Stripe's own data retention. Enterprise reviewers should treat Stripe's privacy policy as the binding constraint for paid-channel identity risk on the payer side.
Practical implication: A person who pays for a voice call is not anonymous to Stripe. They are anonymous to the recipient, to us, and to anyone without a Stripe subpoena. If maximum anonymity is required, use the meeting chat web form, which involves no payment processing.
Traffic volume and anonymity
Counterintuitively, the system's privacy properties improve with scale.
At low transaction volume, timestamp-based correlation is easier: few Stripe and Twilio events exist within any given time window, so the candidate set is small and inference confidence is higher. As transaction volume increases, more events fall within the plausible delay window for any given payment β the candidate set N grows, and P(correct match) approaches 1/N.
This is a structural property of the temporal obfuscation model: anonymity scales with usage. Unlike many other systems, it does not degrade with scale.
Edge layer log retention
Our nginx reverse proxy suppresses access logs on all anonymous-path endpoints (access_log off). However:
- We do not currently use a CDN or edge proxy (Cloudflare, CloudFront, etc.). If we add one, that provider may retain request logs including IP and URI.
- We do not use a WAF. If we add one, WAF structured event logs could reintroduce URI-to-IP associations on blocked requests.
- Our TLS termination occurs at nginx on our own infrastructure.
Before adding any edge layer, we will audit its logging behavior and either disable logging on anonymous paths or document the exception.
Node.js diagnostic artifacts
Node.js can generate heap dumps (--heapdump) and diagnostic reports that could, in theory, capture in-memory recording buffers or session data. In production:
- Heap dumps are not enabled: no --heapdump flag, no v8.writeHeapSnapshot() call
- Node diagnostic reports are disabled (--no-report is not explicitly set, but no report triggers are configured)
- No APM agent (Datadog, New Relic, etc.) is installed; these can capture request/response bodies for large payloads
If we add APM or profiling tooling in the future, we will configure body capture exclusions on anonymous-path routes before deployment.
Future infrastructure: number pool rotation
As visibility increases, enterprise IT teams may begin blocking our Twilio numbers or flagging our domain. To maintain service continuity and resist targeted blocking:
- We plan to operate a pool of Twilio numbers, rotated per meeting session
- We plan to use multiple sending domains for email delivery
- Domain rotation infrastructure is not yet built; it is planned for post-launch, after the first blocking events are observed
Number pool rotation also slightly improves anonymity: a subpoena to Twilio against a specific phone number yields a smaller candidate set if many numbers are in use simultaneously.
What "anonymous" means in this context
We use "anonymous" to mean: sender identity is not present in any system record we hold or control, and cannot be recovered without multi-party legal action against third-party providers followed by probabilistic inference.
We do not claim: cryptographic unlinkability, protection against all possible adversaries, or anonymity in the presence of physical surveillance, device compromise, or behavioral analysis outside our system.
The threat model we are solving is: corporate IT teams, meeting organizers, and legal/compliance teams operating within normal organizational authority. We are not designed to withstand nation-state adversaries or coordinated law enforcement with full provider cooperation.
Audit & Contact
We intend to commission an independent security audit before general availability. We will publish the results (including findings) on this page.
If you've identified a security issue or privacy concern:
- Security: security@notonrecord.com
- Privacy: privacy@notonrecord.com
- Delete your waitlist data: privacy@notonrecord.com with subject "Delete my data"
We will respond to security reports within 48 hours.