Security & Privacy Architecture

Threat Model

We design against three adversary classes:

1. The Curious Manager: wants to know who said something. Has access to Teams/Zoom admin, email logs, corporate network monitoring. Motivation: performance management, retaliation.
2. The IT/Legal Team: operates under legal pressure. Has access to eDiscovery tools, Microsoft 365 compliance center, corporate email servers, potentially a subpoena. Motivation: compliance, litigation hold.
3. The Abuser: a participant using anonymity to harass, threaten, or probe moderation. Motivation: harm to others or circumventing content policy.

We defeat the first two by architecture. We mitigate the third by moderation and rate limiting, without weakening anonymity against the first two.

Key principle: Defeating adversaries 1 and 2 is a hard guarantee. Mitigating adversary 3 is a best-effort safety system that fails safely (blocking high-risk channels) rather than falling back to logging identity.

Data Retention Matrix

Every datum we touch, what happens to it, and when it's gone. TTL=0 means it never reaches storage.

Data element	Collected?	Stored?	TTL	Notes
Sender phone number (SMS)	Yes — Twilio delivers it	Never stored raw	TTL=0	Immediately HMAC-SHA256 hashed for routing lookup only. Raw number discarded.
Sender IP address (web form)	Yes — used for rate limiting	Never stored	TTL=0	Rate limit check passes through IP; IP never written to session, log, or relay payload.
Message content (web, SMS)	Yes	Never stored	TTL=0	Moderated → relayed to meeting chat → not retained. Only a message count integer is stored per session.
Meeting session metadata	Yes	In-memory only	Session lifetime + 30s	Stored in Node.js process RAM only. Session ends when meeting ends. Process restart wipes all sessions.
Teams conversation ID	Yes	In-memory only	Session lifetime	Required to post to Teams chat. Cleared when session ends.
Phone number (HMAC hash, routing)	Derived	In-memory only	Session lifetime	Used to route inbound SMS to the correct session. Cannot be reversed to recover the phone number without the server secret.
Voice call recipient (email/phone)	Yes — user provides	Never stored	TTL=0	HMAC-hashed for cooldown enforcement. Raw identifier discarded after hash computed.
Abuse report content	Yes — if report submitted	Logs only	30 days (log rotation)	Session ID + reason only. No sender identity. Logs are not backed up to persistent storage by design.
Stripe payment data	Yes	At Stripe, not us	Per Stripe policy	We receive only a payment intent ID. Card data never touches our servers. Stripe's privacy policy applies.
Waitlist email	Yes — if user submits	Yes, disk	Until user requests deletion	Voluntarily provided. Not linked to any session or message. Can be deleted on request: privacy@notonrecord.com
Moderation API payloads	Yes — message text sent to OpenAI	Not by us	Per OpenAI policy	Only message text is sent. No session ID, sender identity, or metadata is included in the moderation API call.

Identity Stripping

SMS path

Twilio delivers an inbound SMS webhook to our server. The payload includes the sender's phone number. Our processing order:

Receive webhook from Twilio
Compute HMAC-SHA256 of the normalized phone number (E.164 format)
Use hash to look up active session (routing only)
Discard raw phone number from memory
Extract message text only
Moderate content (text only — no phone, no hash sent to OpenAI)
Relay "💬 Anonymous: [message]" to Teams/Zoom chat
Increment session message counter (integer only)

At no point is the phone number stored, logged, or included in the relay payload. The HMAC hash is stored in-memory only for session routing and is not logged.
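The routing step above can be sketched roughly as follows. This is a minimal illustration, not the production handler: `sessionsByPhoneHash`, `registerSender`, `hashPhone`, and `handleInboundSms` are hypothetical names, the way a sender becomes associated with a session is an assumption, and the moderation step is omitted for brevity.

```javascript
const crypto = require('node:crypto');

// Hypothetical in-memory routing table: HMAC(phone) -> session object.
const sessionsByPhoneHash = new Map();

// Keyed hash of an already-normalized E.164 number.
function hashPhone(e164, secret) {
  return crypto.createHmac('sha256', secret).update(e164).digest('hex');
}

// Associates a sender with a session. Only the hash is kept; how a sender
// joins a session in the real system is not specified in this document.
function registerSender(e164, session, secret) {
  sessionsByPhoneHash.set(hashPhone(e164, secret), session);
}

// `from` and `body` stand in for the sender number and message text that the
// inbound webhook delivers. Returns the relay payload, or null when no active
// session matches. The raw number never appears in the returned object.
function handleInboundSms({ from, body }, secret) {
  const session = sessionsByPhoneHash.get(hashPhone(from, secret));
  if (!session) return null;
  session.messageCount += 1; // integer counter only
  return { conversationId: session.conversationId, text: `💬 Anonymous: ${body}` };
}
```

The point of the sketch is that the phone number is only ever an input to `hashPhone`; everything downstream of the lookup sees the session and the message text, nothing else.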

Web form path

User submits message via browser form. Processing order:

Receive HTTP POST
Rate limit check using IP address (express-rate-limit, in-memory only)
IP address goes no further — not stored, not logged, not included in session
Extract message text
Moderate content
Relay to Teams/Zoom chat

Voice call path

User provides recipient phone/email and message text. Processing order:

Pre-moderation check (before payment)
HMAC-hash recipient for cooldown check — raw identifier not stored
Stripe payment intent created (we store payment intent ID only)
On payment confirm: TTS generates audio, Twilio places call
Recipient identifier discarded; only cooldown hash retained in-memory

Keyed Hashing (HMAC-SHA256)

We use HMAC-SHA256 rather than raw SHA-256 wherever we hash identifying information (phone numbers, email addresses).

Why HMAC matters: Phone numbers and many email addresses are low-entropy. An attacker who obtained a raw SHA-256 hash database could brute-force likely values offline with commodity hardware in minutes. HMAC requires the server secret key to compute candidate hashes — offline brute-force becomes infeasible without key compromise.

Implementation

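const crypto = require('crypto');

// Normalize first to prevent bypass via formatting variation (US numbers only).
// Reject anything that is not 10 digits, or 11 digits with a leading 1,
// instead of silently truncating it.
const digits = phone.replace(/\D/g, '');
const national = digits.length === 11 && digits.startsWith('1') ? digits.slice(1) : digits;
if (national.length !== 10) throw new Error('Unsupported phone number format');
const normalized = `+1${national}`;

// HMAC-SHA256 with server secret
const hash = crypto
  .createHmac('sha256', process.env.HMAC_SECRET)
  .update(normalized)
  .digest('hex');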

Normalization

We normalize identifiers before hashing to prevent bypass via formatting variation:

Phone numbers: Stripped to digits, normalized to E.164 (+1XXXXXXXXXX)
Email addresses: Lowercased. We intentionally do not strip plus-addressing tags (for example user+tag@gmail.com): how the local part is interpreted is provider-specific, and stripping tags could couple addresses that belong to different mailboxes.
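A minimal sketch of the two normalization rules, assuming US-only phone numbers and a caller-supplied secret. The function names are hypothetical, and trimming whitespace from email addresses is an added assumption that the rules above do not state.

```javascript
const crypto = require('node:crypto');

// Strips formatting and returns E.164 (+1XXXXXXXXXX). US numbers only.
function normalizePhoneUS(raw) {
  const digits = raw.replace(/\D/g, '');
  const national = digits.length === 11 && digits.startsWith('1') ? digits.slice(1) : digits;
  if (national.length !== 10) throw new Error('expected a 10-digit US number');
  return `+1${national}`;
}

// Lowercases the address. Plus tags are deliberately left intact.
// Trimming surrounding whitespace is an assumption, not a documented rule.
function normalizeEmail(raw) {
  return raw.trim().toLowerCase();
}

// Keyed hash of a normalized identifier.
function hashIdentifier(normalized, secret) {
  return crypto.createHmac('sha256', secret).update(normalized).digest('hex');
}
```

With this shape, `(415) 555-1234` and `+1 415-555-1234` hash to the same value, while `alice+a@example.com` and `alice@example.com` stay distinct.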

Secret rotation

The HMAC secret is set via environment variable (HMAC_SECRET). Rotation is supported: set the new value and restart the process. The restart clears all in-memory state, including sessions, routing hashes, and cooldown continuity, so nothing hashed under the old key survives. Because active meetings lose their session at restart, rotation is best done between meetings.

Provider Metadata and Linkage

This is the most important architectural section for enterprise reviewers. The risk: NotOnRecord touches multiple providers (Twilio, Teams/Zoom, Stripe, OpenAI). Could a determined adversary join records across providers to identify a sender?

The joinable graph — what each provider sees

Provider	What they hold	What they do NOT hold
Twilio	Sender phone (SMS), SMS body as delivered to our webhook (retention per Twilio settings), To/From numbers (voice), Call SID, timestamp	Session ID, payer identity
Teams / Zoom	Bot posts "Anonymous: [message]", conversation ID, timestamp	Sender identity, session ID, phone number
Stripe	Card/payer identity, payment intent, payer IP, timestamp, amount	Message content, recipient identity, session ID
OpenAI	Message text, moderation timestamp	Sender identity, session ID, phone, IP, payment data
NotOnRecord	Session ID, message count (integer), HMAC phone hash (in-memory, session-scoped)	Sender phone (raw), sender IP, message content (not retained), payer identity

Join path classification

We formally enumerate three categories of linkage risk:

❌ Architecturally impossible — the data required for the join does not exist in any system we control or any provider we use
⚠️ Subpoena-dependent + probabilistic — requires coordinated legal action against multiple providers, and even then yields probabilistic inference rather than deterministic identity
🔶 User-opt-in — user explicitly chose a path that creates linkage; documented in UX; cannot be subpoenaed away because the user made the choice

Join attempt	Meeting/SMS	Meeting/Web	Voice	Email
IP → content	❌	❌	❌	❌
Phone → content	⚠️ Subpoena+TS	N/A	N/A	N/A
Payer → recipient	N/A	N/A	⚠️ Subpoena+prob	⚠️ Subpoena+prob
Payer → recording	N/A	N/A	⚠️ Subpoena	N/A
Email → identity	N/A	N/A	🔶 Opt-in	N/A
Content → identity	❌	❌	❌	❌

Temporal obfuscation: the Bayesian model

For paid channels (voice, email), the primary residual linkage vector is timestamp correlation: an adversary with both a Stripe record (payer at time T₀) and a Twilio record (call at time T₁) could attempt to infer P(same person | T₁ − T₀ = Δ).

We address this with a non-uniform randomized delay between payment confirmation and call/email dispatch. The delay distribution is heavy-tailed:

P(Δ ∈ 10–60s)    ≈ 60%   // most calls
P(Δ ∈ 60–180s)   ≈ 25%
P(Δ ∈ 3–8 min)   ≈ 10%
P(Δ ∈ 8–20 min)  ≈  5%   // long tail — high anonymity window
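One way to draw from the distribution above is a two-stage sample: pick a bucket by its probability, then pick a point inside it. The sketch below assumes delays are uniform within each bucket, which the table does not specify; `DELAY_BUCKETS` and `sampleDelaySeconds` are hypothetical names.

```javascript
// Buckets from the table above: [probability, minSeconds, maxSeconds].
const DELAY_BUCKETS = [
  [0.60, 10, 60],
  [0.25, 60, 180],
  [0.10, 180, 480],
  [0.05, 480, 1200],
];

// Draws a delay in seconds. `rand` must return a float in [0, 1). It defaults
// to Math.random for brevity; a CSPRNG-backed source would be preferable if
// the delay is meant to be unpredictable to an adversary.
function sampleDelaySeconds(rand = Math.random) {
  let u = rand();
  for (const [p, lo, hi] of DELAY_BUCKETS) {
    if (u < p) return lo + rand() * (hi - lo); // uniform within the bucket (assumption)
    u -= p;
  }
  // Only reachable through floating-point rounding; fall back to the last bucket.
  const [, lo, hi] = DELAY_BUCKETS[DELAY_BUCKETS.length - 1];
  return lo + rand() * (hi - lo);
}
```

A dispatcher could then schedule the call with something like `setTimeout(placeCall, sampleDelaySeconds() * 1000)`, where `placeCall` is a placeholder for whatever triggers the outbound call.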

Why heavy-tailed rather than uniform? A uniform distribution over a short window such as [5s, 120s] caps the correlation window at two minutes: an attacker only has to consider payments made shortly before each call. A mixture distribution with a long tail widens the plausible window to 20 minutes and has higher entropy H(Δ), which directly reduces the posterior confidence of timestamp-based linkage, while still keeping most delays under a minute.
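The entropy comparison can be checked numerically. The sketch below treats each bucket as uniform (an assumption) and computes differential entropy in bits as the sum of p · log2(width / p) over buckets; `entropyBits` and the three distribution constants are illustrative names.

```javascript
// Differential entropy (bits) of a piecewise-uniform density described as
// [probability, minSeconds, maxSeconds] buckets: H = sum of p * log2(width / p).
function entropyBits(buckets) {
  return buckets.reduce((h, [p, lo, hi]) => h + p * Math.log2((hi - lo) / p), 0);
}

const uniformShort = [[1.0, 5, 120]];   // the uniform baseline discussed above
const mixture = [                        // the delay table, uniform within buckets
  [0.60, 10, 60],
  [0.25, 60, 180],
  [0.10, 180, 480],
  [0.05, 480, 1200],
];
const uniformWide = [[1.0, 10, 1200]];  // uniform over the mixture's full range

console.log(entropyBits(uniformShort).toFixed(2));
console.log(entropyBits(mixture).toFixed(2));
console.log(entropyBits(uniformWide).toFixed(2));
```

Under these assumptions the mixture comes out roughly one bit above the short uniform window. A uniform delay over the full 10s to 20min range would score higher still, since uniform maximizes entropy on a bounded interval; the mixture gives up some of that entropy in exchange for keeping the typical delay short.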

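Formally: suppose Stripe events arrive as a Poisson process with rate λₛ, Twilio events with rate λₜ, and each dispatch is delayed by an independent draw from distribution D with density f_D. For an observed Twilio event at T₁ and a candidate Stripe event at T₀, the posterior probability that the Stripe event caused the Twilio event satisfies:

P(match | Stripe=T₀, Twilio=T₁) ∝ f_D(T₁ − T₀)

For homogeneous Poisson arrivals the arrival-time terms are identical for every candidate and cancel, so normalizing over the N candidate Stripe events whose implied delay lies in the support of D gives, for candidate i, f_D(T₁ − T₀(i)) / Σⱼ f_D(T₁ − T₀(j)).

As H(D) increases:
  → More candidate Stripe events fall within the plausible delay window
  → Posterior P(match) drops from ≈1.0 toward 1/N where N = candidate set size (exactly 1/N when f_D is flat across the candidates)
  → At scale (high λ), N grows, so anonymity increases with volume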

Scale increases anonymity. At low transaction volume, correlation is easier because few Stripe and Twilio events exist in the plausible window. As volume grows, the candidate set N grows proportionally, and inference confidence drops. This is an unusual property: the product becomes more privacy-preserving as it scales.
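To make the candidate-set argument concrete, the sketch below computes the posterior over candidate payments for one observed call, using the published delay buckets as the density and assuming a uniform prior over candidates. `delayDensity` and `matchPosteriors` are hypothetical names, and uniform density within each bucket is an assumption.

```javascript
// Delay buckets: [probability, minSeconds, maxSeconds].
const DELAY_BUCKETS = [
  [0.60, 10, 60],
  [0.25, 60, 180],
  [0.10, 180, 480],
  [0.05, 480, 1200],
];

// Density of the delay distribution at a given delay (seconds).
function delayDensity(deltaSeconds) {
  for (const [p, lo, hi] of DELAY_BUCKETS) {
    if (deltaSeconds >= lo && deltaSeconds < hi) return p / (hi - lo);
  }
  return 0; // outside the support: this payment cannot have caused the call
}

// Posterior probability, per candidate payment time, that it caused the call
// observed at `callTime`. Times are in seconds on a shared clock.
function matchPosteriors(callTime, paymentTimes) {
  const weights = paymentTimes.map((t0) => delayDensity(callTime - t0));
  const total = weights.reduce((a, b) => a + b, 0);
  return weights.map((w) => (total === 0 ? 0 : w / total));
}
```

With a single payment in the window the posterior is 1. With four payments 20 to 50 seconds before the call it is 0.25 each. Note that 1/N is the flat-density case: a candidate 30 seconds before the call heavily outweighs one 10 minutes before it, so the practical anonymity set is dominated by payments with similar delays.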

The goal is not zero correlation — it is plausible deniability under realistic adversary models. An adversary cannot determine with certainty which Stripe payment corresponds to which call. They can only make probabilistic inferences under conditions that require multi-party legal coordination.

Additional mitigations

Meeting chat has no payment linkage — SMS and web form messages involve no Stripe events. Twilio and Teams are temporally separated by 50–500ms jitter. No payment join vector exists.
No shared session identifier across providers — Session IDs are never sent to Twilio, Stripe, or OpenAI.
Nginx access_log suppressed on all anonymous-path endpoints — No IP-to-action log exists at the reverse proxy layer. The "IP never stored" claim holds at all layers.
Recording stored in-memory, not disk: Recordings are held in Node.js heap and served directly. No filesystem artifact is created. The reference is dropped immediately after serving, which makes the buffer eligible for garbage collection.

Stripe as identity anchor: Stripe is the strongest identity anchor in the system. They hold payer card, billing email, and IP at checkout time. We cannot change Stripe's data retention. What we control is: (a) Stripe metadata never includes message content or recipient identity, (b) our timing delay prevents deterministic Stripe→Twilio correlation. Enterprise reviewers should treat Stripe's privacy policy as the binding constraint for paid-channel identity risk — not ours.

Content Moderation & Degraded Mode

We moderate message content to prevent harassment, threats, and abuse. Moderation operates on message text only — no sender identity is included in moderation API calls.

Moderation state machine

NORMAL
↓ (3 consecutive API failures)
DEGRADED
↓ (3 consecutive API successes AND >60s since last failure)
NORMAL

The dual recovery condition (successes + time) prevents oscillation on intermittent outages where the API alternates between passing and failing.
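The state machine can be sketched as a small tracker that is told about each moderation API outcome. This is an illustration under assumptions: `ModerationHealth` and its method names are hypothetical, and it assumes the service keeps calling (or probing) the API while degraded so that successes can be observed.

```javascript
class ModerationHealth {
  constructor({ failThreshold = 3, successThreshold = 3, quietMs = 60_000 } = {}) {
    this.failThreshold = failThreshold;       // consecutive failures before DEGRADED
    this.successThreshold = successThreshold; // consecutive successes before NORMAL
    this.quietMs = quietMs;                   // required time since the last failure
    this.degraded = false;
    this.failStreak = 0;
    this.successStreak = 0;
    this.lastFailureTs = null;
  }

  // Call after a failed moderation API request.
  recordFailure(now = Date.now()) {
    this.failStreak += 1;
    this.successStreak = 0;
    this.lastFailureTs = now;
    if (this.failStreak >= this.failThreshold) this.degraded = true;
  }

  // Call after a successful moderation API request.
  recordSuccess(now = Date.now()) {
    this.successStreak += 1;
    this.failStreak = 0;
    const quiet = this.lastFailureTs === null || now - this.lastFailureTs > this.quietMs;
    // Both conditions must hold: enough consecutive successes AND enough quiet time.
    if (this.degraded && this.successStreak >= this.successThreshold && quiet) {
      this.degraded = false;
    }
  }
}
```

Because any failure resets `successStreak` and refreshes `lastFailureTs`, an API that alternates between passing and failing never satisfies both recovery conditions at once.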

Degraded mode policy

Channel	Degraded mode behavior	Rationale
SMS	Blocked entirely	High velocity harassment vector; no substitute filter sufficient
Voice	Blocked entirely	High impact per-message; audio harassment cannot be easily keyword-filtered
Web form	Basic keyword filter + existing rate limits	Lower velocity; rate limiting provides secondary mitigation
Email	Basic keyword filter + existing rate limits	Lower velocity; per-recipient cooldown provides secondary mitigation

Anti-probe rate limiting

The moderation API is a content oracle: attackers could probe it to learn what content is blocked and craft circumventions. Mitigation: identical message content submitted more than 3 times within 5 seconds is rejected without calling the moderation API. The sender sees the same generic error as for a moderation block, so the two cases are indistinguishable.
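A sliding-window sketch of the repeat check. `ProbeGuard` is a hypothetical name, and keying the window on an HMAC digest of the text under a per-process random key is an assumption chosen so that message text itself is not held in the map; the document does not say how repeats are detected.

```javascript
const crypto = require('node:crypto');

class ProbeGuard {
  constructor({ maxRepeats = 3, windowMs = 5000 } = {}) {
    this.maxRepeats = maxRepeats;
    this.windowMs = windowMs;
    this.key = crypto.randomBytes(32); // per-process key; digests are meaningless elsewhere
    this.seen = new Map();             // content digest -> recent submission timestamps
  }

  // Returns true if the message may be sent to the moderation API,
  // false if it exceeds the repeat limit inside the window.
  allow(text, now = Date.now()) {
    const digest = crypto.createHmac('sha256', this.key).update(text).digest('hex');
    const recent = (this.seen.get(digest) || []).filter((t) => now - t < this.windowMs);
    recent.push(now);
    this.seen.set(digest, recent);
    return recent.length <= this.maxRepeats;
  }

  // Drops digests with no submissions inside the window. Call periodically.
  sweep(now = Date.now()) {
    for (const [digest, times] of this.seen) {
      if (times.every((t) => now - t >= this.windowMs)) this.seen.delete(digest);
    }
  }
}
```

Rejected attempts still count toward the window, so a client that keeps retrying the same text stays blocked until it pauses for the full window.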

Moderation before payment

For paid channels (voice, email), content is moderated before Stripe payment intent creation. This eliminates paid-abuse scenarios (where an attacker pays to send prohibited content) and reduces chargeback exposure.

Health endpoint

Moderation system state is exposed at GET /health for operational monitoring:

{
  "moderation": {
    "available": true,
    "degraded": false,
    "failStreak": 0,        // consecutive failures (0 = healthy)
    "successStreak": 14,    // consecutive successes since last recovery
    "lastFailureTs": null,
    "lastSuccessTs": "2026-03-01T09:00:00Z",
    "degradedSince": null
  }
}

Abuse Handling Policy

Per-recipient cooldown

For voice calls and email, the same recipient cannot be contacted more than once per 10 minutes from any session. The cooldown is keyed on an HMAC hash of the recipient identifier — we track targets, not senders.
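A minimal sketch of a target-keyed cooldown, assuming the recipient identifier has already been normalized. `RecipientCooldown` and `tryContact` are hypothetical names; the 10-minute default follows the policy above.

```javascript
const crypto = require('node:crypto');

class RecipientCooldown {
  constructor(secret, cooldownMs = 10 * 60 * 1000) {
    this.secret = secret;
    this.cooldownMs = cooldownMs;
    this.lastContact = new Map(); // HMAC(recipient) -> timestamp of last contact
  }

  // Returns true and starts a new cooldown if the recipient may be contacted,
  // false if a cooldown is still running. The raw identifier is never stored.
  tryContact(normalizedRecipient, now = Date.now()) {
    const key = crypto
      .createHmac('sha256', this.secret)
      .update(normalizedRecipient)
      .digest('hex');
    const last = this.lastContact.get(key);
    if (last !== undefined && now - last < this.cooldownMs) return false;
    this.lastContact.set(key, now);
    return true;
  }
}
```

Nothing in the map identifies who initiated the contact, which is what "we track targets, not senders" amounts to in code.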

Abuse report mechanism

Each delivered message includes a report link containing a signed opaque token:

Voice calls: "To report this message, visit notonrecord.com/report and enter code XXXX"
Emails: Report link appended to message body

The token is single-use and expires after 24 hours. It encodes the session ID in a non-guessable way — the report endpoint does not accept raw session IDs. This prevents:

Spam reports by parties who never received the message
Session enumeration via brute-forced report attempts
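One possible construction for such a token is sketched below. It uses authenticated encryption (AES-256-GCM) rather than a bare signature, so the session ID is both tamper-evident and unreadable from the token; the document does not say which construction is actually used. `issueReportToken`, `redeemReportToken`, and the in-memory `usedNonces` set are hypothetical, and `key` is assumed to be a 32-byte server secret.

```javascript
const crypto = require('node:crypto');

const TOKEN_TTL_MS = 24 * 60 * 60 * 1000; // 24-hour expiry, per the policy above
const usedNonces = new Set();             // hypothetical single-use registry (in-memory)

// Encrypts { sessionId, exp } and returns iv || authTag || ciphertext, base64url-encoded.
function issueReportToken(sessionId, key, now = Date.now()) {
  const iv = crypto.randomBytes(12);
  const cipher = crypto.createCipheriv('aes-256-gcm', key, iv);
  const payload = JSON.stringify({ sessionId, exp: now + TOKEN_TTL_MS });
  const ct = Buffer.concat([cipher.update(payload, 'utf8'), cipher.final()]);
  return Buffer.concat([iv, cipher.getAuthTag(), ct]).toString('base64url');
}

// Returns the session ID for a valid, unexpired, unused token; otherwise null.
function redeemReportToken(token, key, now = Date.now()) {
  try {
    const raw = Buffer.from(token, 'base64url');
    const iv = raw.subarray(0, 12);
    const decipher = crypto.createDecipheriv('aes-256-gcm', key, iv);
    decipher.setAuthTag(raw.subarray(12, 28));
    const pt = Buffer.concat([decipher.update(raw.subarray(28)), decipher.final()]);
    const { sessionId, exp } = JSON.parse(pt.toString('utf8'));
    const nonce = iv.toString('hex');
    if (now >= exp || usedNonces.has(nonce)) return null;
    usedNonces.add(nonce); // mark as spent only after authentication succeeds
    return sessionId;
  } catch {
    return null; // tampered, truncated, or encrypted under a different key
  }
}
```

A short spoken code for voice calls would need a different design, for example a server-side lookup table with attempt limits, since a few characters cannot carry an authenticated payload.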

What triggers action

Event	Automated action	What's logged
Moderation API blocks content	Message rejected; sender sees generic error	Session ID + blocked category (no message content, no sender)
Rate limit exceeded	Request rejected with 429	Nothing (rate limit handled in-memory by express-rate-limit)
Abuse report submitted	Session flagged; ops notified	Session ID + reason string (max 500 chars)
Multiple independent reports within 1 hour	Session auto-throttled (future)	Session ID + report count

Retention of abuse artifacts

Abuse reports are logged to stdout only (not persistent storage). Log rotation occurs at 30 days. No sender identity is present in any abuse log entry — by design, we don't have it.

Subpoena Resistance

The principle: A subpoena cannot compel us to produce records that do not exist. This is not a legal strategy — it is an architectural consequence of building a system that does not retain identity.

If we receive a subpoena for records identifying who sent a specific anonymous message:

We do not have a sender phone number or IP address — it was never stored
We do not have message content — it was relayed and not retained
We have a session ID, session timestamps, and a message count integer
While a session is active, we hold an HMAC hash of a phone number in memory, keyed on a server secret we hold. It is cleared when the session ends, and it is of limited use without also having the source phone number to verify against

What a subpoena to us produces: a meeting happened, N anonymous messages were sent, the meeting started and ended at these times. Nothing linking any individual to any message.

Provider subpoena considerations

A subpoena to Twilio would reveal that a specific phone number sent a text to our Twilio number at a specific time. Because Twilio carries the inbound SMS to our webhook, it also handles the message body; whether that body is still retrievable depends on Twilio's retention and redaction settings, not on our architecture. A sophisticated adversary combining a Twilio subpoena with meeting participant lists can narrow the field to the attendees whose phone numbers appear in Twilio's records for that time window. This is the residual privacy risk that we document honestly: we cannot protect against a scenario where an adversary has both provider subpoena responses and meeting attendance records.

For maximum protection, participants can use the web form from a non-corporate device or network — in that case, even Twilio is not in the picture.

Failure Modes

Failure	Behavior	Privacy impact
Node process crash / restart	All sessions wiped; active meetings lose bot	None — wipe is the privacy-preserving outcome
OpenAI moderation API unavailable	Degraded mode (see Content Moderation & Degraded Mode)	None (no identity data sent either way)
Twilio webhook delivery failure	Message not relayed; sender sees no error (Twilio retries)	None
Teams Bot Framework token expiry	Relay fails; session error logged	None — message content not persisted on failure
HMAC secret compromise	An attacker who holds the secret and can read the in-memory hashes can brute-force the phone numbers behind them	Medium: rotate the secret and restart; hashes from active sessions remain exposed until the restart
Rate limit store cleared (restart)	Rate limit counters reset; brief window of slightly higher permitted velocity	None — no identity data involved

Operational Realities

This section documents constraints we cannot architect away — they are properties of operating on top of third-party infrastructure. We document them because enterprise reviewers deserve an honest picture, not a curated one.

Stripe as the strongest identity anchor

Stripe retains: card fingerprint, billing email, payer IP at checkout, payment timestamp, and amount. This is true regardless of anything we do. Stripe's privacy policy governs this data — not ours.

What we control: Stripe metadata fields (which we leave empty of message content and recipient identity), and our timing delay (which reduces timestamp correlation confidence). What we cannot control: Stripe's own data retention. Enterprise reviewers should treat Stripe's privacy policy as the binding constraint for paid-channel identity risk on the payer side.

Practical implication: A person who pays for a voice call is not anonymous to Stripe. They are anonymous to the recipient, to us, and to anyone without a Stripe subpoena. If maximum anonymity is required, use the meeting chat web form — it involves no payment processing.

Traffic volume and anonymity

Counterintuitively, the system's privacy properties improve with scale.

At low transaction volume, timestamp-based correlation is easier: few Stripe and Twilio events exist within any given time window, so the candidate set is small and inference confidence is higher. As transaction volume increases, more events fall within the plausible delay window for any given payment — the candidate set N grows, and P(correct match) approaches 1/N.

This is a structural property of the temporal obfuscation model: anonymity scales with usage. Unlike many systems, its privacy properties do not degrade with scale.

Edge layer log retention

Our nginx reverse proxy suppresses access logs on all anonymous-path endpoints (access_log off). However:

We do not currently use a CDN or edge proxy (Cloudflare, CloudFront, etc.). If we add one, that provider may retain request logs including IP and URI.
We do not use a WAF. If we add one, WAF structured event logs could reintroduce URI-to-IP associations on blocked requests.
Our TLS termination occurs at nginx on our own infrastructure.

Before adding any edge layer, we will audit its logging behavior and either disable logging on anonymous paths or document the exception.

Node.js diagnostic artifacts

Node.js can generate heap snapshots (for example via the --heapsnapshot-signal flag or v8.writeHeapSnapshot()) and diagnostic reports that could, in theory, capture in-memory recording buffers or session data. In production:

Heap snapshots are not enabled: no heap snapshot flags are set and there is no v8.writeHeapSnapshot() call
Node diagnostic reports are disabled (--no-report is not explicitly set, but no report triggers are configured)
No APM agent (Datadog, New Relic, etc.) is installed — these can capture request/response bodies for large payloads

If we add APM or profiling tooling in the future, we will configure body capture exclusions on anonymous-path routes before deployment.

Future infrastructure: number pool rotation

As visibility increases, enterprise IT teams may begin blocking our Twilio numbers or flagging our domain. To maintain service continuity and resist targeted blocking:

We plan to operate a pool of Twilio numbers, rotated per meeting session
We plan to use multiple sending domains for email delivery
Domain rotation infrastructure is not yet built — planned for post-launch after first blocking events are observed

Number pool rotation also slightly improves anonymity: when many numbers are in use simultaneously, a subpoena to Twilio scoped to one specific number covers only the traffic routed through that number rather than all traffic to the service.

What "anonymous" means in this context

We use "anonymous" to mean: sender identity is not present in any system record we hold or control, and cannot be recovered without multi-party legal action against third-party providers followed by probabilistic inference.

We do not claim: cryptographic unlinkability, protection against all possible adversaries, or anonymity in the presence of physical surveillance, device compromise, or behavioral analysis outside our system.

The threat model we are solving is: corporate IT teams, meeting organizers, and legal/compliance teams operating within normal organizational authority. We are not designed to withstand nation-state adversaries or coordinated law enforcement with full provider cooperation.

Audit & Contact

We intend to commission an independent security audit before general availability. We will publish the results (including findings) on this page.

If you've identified a security issue or privacy concern:

Security: security@notonrecord.com
Privacy: privacy@notonrecord.com
Delete your waitlist data: privacy@notonrecord.com with subject "Delete my data"

We will respond to security reports within 48 hours.

Enterprise review? If you're evaluating NotOnRecord for an organization with specific compliance requirements (HIPAA, SOC 2, GDPR), contact us directly. We'll walk through the architecture with your security team and provide documentation tailored to your review process.

Privacy by Architecture,
Not by Policy
