Privacy Policy — Searchbase

Last updated: 2026-05-04


1. Data Controller

The Searchbase service (https://searchbase.org) is operated by:

[Searchbase, legal entity to be defined]
Registered office: to be defined
Privacy email: [email protected]
General email: [email protected]

Data Protection Officer (DPO): not appointed — not mandatory under Art. 37 GDPR.


2. Data we collect

We collect only what we need to provide Searchbase. Here is everything, explicitly.

2.1 Account data

  • Email
  • Password (stored as a hash by Supabase Auth — we never see it in cleartext)
  • Google OAuth profile (if you sign in with Google): name, email, sub identifier

2.2 Subscription and billing data

  • User credit balance (user_credits)
  • Subscription (user_subscriptions): Stripe customer ID, subscription tier, billing periods

Credit card data never transits our servers: it is collected directly by Stripe.

2.3 User-generated content

Stored in Supabase Postgres with Row Level Security (RLS) enabled on every user table:

  • chat_sessions — conversation titles and timestamps
  • chat_messages — full prompts, AI responses, tool calls and tool results
  • scrape_jobs — target URLs, extraction prompts, results, options
  • user_long_memory — free-text facts you author yourself (identity, projects, preferences, max 2000 chars per entry)
  • chat_session_summary — rolling summaries auto-generated by Claude Haiku
  • message_feedback — thumbs up/down on messages
  • workspaces and their members

2.4 Uploaded files

Audio and document uploads are stored in Cloudflare R2 (buckets scrapedog-audio-prod and scrapedog-documents-prod) and served via short-lived (15-minute) presigned URLs.

2.5 Browser localStorage

  • cookie-consent — your cookie choice
  • sidebar-collapsed, sidebar-workspaces-expanded — UI state
  • onboarding-completed — onboarding flag
  • scrapedog-user-memory — local mirror of your long-term memory

2.6 Server-side session cookies

  • Supabase sb-* cookies — authentication (httpOnly)
  • PostHog ph_* cookies — only if you accepted analytics consent

2.7 IP address and User-Agent

  • Collected by Cloudflare for DDoS / WAF protection
  • Collected by Microsoft Azure in request logs
  • Not sent to Sentry (sendDefaultPii: false)
  • PostHog is configured with person_profiles: 'identified_only', so anonymous visitors are not profiled

3. Purposes and legal bases (Art. 6 GDPR)

Purpose | Data processed | Legal basis
Account creation, service delivery, billing | Email, password (hash), subscription data, user content | Art. 6(1)(b): performance of contract
Transactional email (verification, password reset) | Email | Art. 6(1)(b): performance of contract
AI processing of your prompts and files | Prompts, attachments, scraping results | Art. 6(1)(b): necessary to deliver the requested service
Long-term memory storage | Entries in user_long_memory | Art. 6(1)(a): consent (you choose to enter the entries)
Product analytics (PostHog) | Usage events | Art. 6(1)(a): consent (cookie banner)
Security, anti-abuse, rate limiting | IP, UA, user ID | Art. 6(1)(f): legitimate interest (protect the service)

4. Sub-processors and third parties

To operate Searchbase we rely on third-party providers, each acting as a sub-processor. Full, current list:

Provider | Purpose | Data processed | Region | Privacy policy
Supabase | Auth + Postgres database | Account + all user content | AWS eu-central-1 | https://supabase.com/privacy
Cloudflare R2 | Object storage | Audio + document files | Global | https://www.cloudflare.com/privacypolicy/
Cloudflare | CDN, WAF, DNS | IP, UA, request metadata | Global | https://www.cloudflare.com/privacypolicy/
Microsoft Azure | Compute, Redis, Key Vault | All backend traffic | North Europe | https://privacy.microsoft.com/
Anthropic (Claude Sonnet 4.6 + Haiku 4.5) | AI response generation, session summaries | Every chat prompt, scraping job description, file content | United States | https://www.anthropic.com/legal/privacy
Azure OpenAI Foundry (GPT-4.1, fallback) | Fallback LLM | Chat prompts (when fallback is active) | United States / EU | https://privacy.microsoft.com/
DeepSeek (optional fallback) | Fallback LLM | Chat prompts (when fallback is active) | United States / China | https://platform.deepseek.com/privacy
Deepgram (Nova-3, optional) | Audio transcription | Uploaded audio files | United States | https://deepgram.com/privacy
Firecrawl (optional fallback) | Fallback scraping | Target URLs + extraction prompts | United States | https://www.firecrawl.dev/privacy
Parallel.ai (optional) | Scraping/research | Target URLs | United States | https://parallel.ai/privacy
Serper | Google searches via web_search tool | Search queries | United States | https://serper.dev/privacy
ScrapeCreators | Social media queries | Social handles and URLs | United States | https://scrapecreators.com/privacy
X (Twitter) API | Searches on X | Search queries on X | United States | https://twitter.com/privacy
YouTube Data API (Google) | Video/channel queries | Search queries | United States | https://policies.google.com/privacy
Stripe | Payments and billing | Billing email, payment-method tokens, subscription metadata | United States / Ireland | https://stripe.com/privacy
Resend | Transactional email delivery (sender: [email protected]) | Recipient email | United States | https://resend.com/legal/privacy-policy
PostHog EU Cloud | Product analytics (consent only) | Usage events, pseudonymous user ID | EU (eu.i.posthog.com) | https://posthog.com/privacy
Sentry | Error tracking | Stack traces (no PII, sendDefaultPii: false) | United States | https://sentry.io/privacy/
BetterStack / Logtail | Application logs (may include user IDs and request metadata) | Structured logs | EU (Falkenstein) | https://betterstack.com/privacy
Google OAuth | Sign-in with Google | Basic profile + email | United States | https://policies.google.com/privacy
GitHub Container Registry | Docker image hosting | No user data | Global | https://docs.github.com/site-policy/privacy-policies

5. International transfers (Art. 46 GDPR)

A significant portion of the sub-processors above are based in the United States or other non-EU countries. In particular, every chat prompt, scraping description and file content sent to AI is transmitted to Anthropic in the United States (and, in fallback scenarios, to Azure OpenAI or DeepSeek).

For these transfers we rely on:

  • Standard Contractual Clauses (SCCs) issued by the European Commission (Decision 2021/914)
  • The provider's adherence to the EU–US Data Privacy Framework, where applicable (e.g. Stripe, Google, Microsoft, Cloudflare)
  • Supplementary technical measures: TLS 1.2+ in transit, JWT authentication, data minimization

6. Data retention

We keep data only as long as necessary for the purposes for which it was collected. Summary table:

Data category | Retention period
Account data | Until you request deletion (today via email request; self-service is on the roadmap)
Chat sessions and messages | Indefinitely until you delete them, or until account deletion
Scrape jobs | Indefinitely until you delete them, or until account deletion
Long-term memory | Until you delete the entries
Files in Cloudflare R2 | Deleted on cascade when the parent session is deleted, or on account erasure
Application logs (BetterStack) | ~30 days (vendor default; exact period being documented)
Sentry events | 90 days (default)
PostHog events | 90 days (free plan default)
Billing data (Stripe) | Per Stripe's policy, typically 7 years (tax / accounting obligations)
Audit logs (admin actions) | Retained for compliance and security; on erasure, the user_id may be set to NULL but the action record is preserved

7. Cookies and similar technologies

Category | Name | Type | Purpose | Consent
Strictly necessary | sb-access-token, sb-refresh-token | HTTP cookie (httpOnly) | Supabase authentication | Not required
Functional | cookie-consent | localStorage | Stores your cookie choice | Not required
Functional | sidebar-collapsed, sidebar-workspaces-expanded | localStorage | UI state | Not required
Functional | onboarding-completed | localStorage | Onboarding state | Not required
Functional | scrapedog-user-memory | localStorage | Local mirror of user memory | Not required
Analytics | ph_* (PostHog) | Cookie + localStorage | Product analytics | Required (opt-in)

The cookie banner is currently binary (accept all / reject all). Rejection fully disables PostHog.


8. Security

We implement concrete technical and organizational measures:

  • In transit: TLS 1.2+ end-to-end via Cloudflare
  • At rest: Supabase (AWS-managed), Cloudflare R2 (Cloudflare-managed), Azure Key Vault for application secrets
  • Per-user isolation: Postgres Row Level Security (RLS) on every user table
  • Authentication: JWT with JWKS rotation
  • Anti-SSRF: centralized outbound URL validation through HttpGateway
  • Anti-abuse: Cloudflare WAF + DDoS protection
  • Rate limiting: distributed (Redis-backed), default 20 requests/minute per user
  • Admin access control: RBAC + audit_log table for every privileged action
  • Error minimization: Sentry configured without PII (sendDefaultPii: false)
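
As an illustration of the rate-limiting measure above, here is a minimal sketch of a per-user limit of 20 requests per minute. It assumes a fixed-window algorithm and uses an in-memory dictionary as a stand-in for Redis; this policy does not state which algorithm the service actually uses, and the class and parameter names are hypothetical.

```python
import time
from collections import defaultdict


class FixedWindowRateLimiter:
    """Allow at most `limit` requests per user in each `window_seconds` window."""

    def __init__(self, limit=20, window_seconds=60, clock=time.time):
        self.limit = limit
        self.window_seconds = window_seconds
        self.clock = clock  # injectable so the window can be tested without sleeping
        # Maps (user_id, window index) -> request count.
        # Assumption: a distributed version would keep this counter in Redis
        # with an expiry; here it is a plain dict that is never cleaned up.
        self.counts = defaultdict(int)

    def allow(self, user_id):
        """Return True and count the request if the user is under the limit."""
        window = int(self.clock() // self.window_seconds)
        key = (user_id, window)
        if self.counts[key] >= self.limit:
            return False
        self.counts[key] += 1
        return True


# Usage with a frozen clock: the first 20 requests pass, the 21st is rejected.
limiter = FixedWindowRateLimiter(clock=lambda: 1000.0)
results = [limiter.allow("user-123") for _ in range(21)]
print(results.count(True), results[-1])  # prints "20 False"
```

A fixed window is the simplest variant to reason about; it can let through short bursts at a window boundary, which sliding-window or token-bucket designs avoid.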

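No system is 100% secure. In the event of a personal data breach, we will notify the Italian supervisory authority (Garante per la Protezione dei Dati Personali) within 72 hours of becoming aware of it, under Art. 33 GDPR. Where the breach is likely to result in a high risk to data subjects, we will also notify them under Art. 34.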


9. Your rights (Art. 15–22 GDPR)

You have the following rights, exercisable free of charge:

  • Access (Art. 15) — obtain a copy of your data. How: email [email protected]. We export your data as JSON (currently capped at 10,000 rows per table for operational reasons).
  • Rectification (Art. 16) — you can edit account, memory and workspaces directly in the app. For other corrections, email [email protected].
  • Erasure (Art. 17) — email [email protected]. Erasure today runs through an internal procedure that: deletes R2 files, deletes the PostHog person, cascades the Postgres deletion.
  • Restriction (Art. 18) — email request.
  • Portability (Art. 20) — same mechanism as access (JSON export).
  • Objection (Art. 21) — email request.
  • Withdrawal of consent — analytics cookie consent can be withdrawn from the banner (reject all). Long-term memory consent is withdrawn by deleting the entries from the app.
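
To make the per-table cap on access and portability exports concrete, the sketch below serializes a set of tables to JSON while keeping at most 10,000 rows per table. The function name, the input shape, and the `total_rows` and `truncated` fields are hypothetical; this policy does not document the real export format.

```python
import json

ROW_CAP = 10_000  # per-table cap stated in this policy


def export_user_data(tables, row_cap=ROW_CAP):
    """Serialize {table_name: [row dicts]} to JSON, keeping at most row_cap rows per table."""
    export = {}
    for name, rows in tables.items():
        export[name] = {
            "rows": rows[:row_cap],
            # Hypothetical fields that let the recipient see whether the cap applied.
            "total_rows": len(rows),
            "truncated": len(rows) > row_cap,
        }
    return json.dumps(export, indent=2)
```

Calling `export_user_data({"chat_messages": rows})` returns a JSON string in which a table with more rows than the cap is marked as truncated, so the recipient can ask for the remainder.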

Operational transparency: as of this draft, account erasure and DSARs are admin-triggered on your email request, not self-service. We are building self-service in the gdpr/p2-gdpr-03-delete-account branch and will release it as soon as possible.


10. Automated decision-making and profiling (Art. 22)

We do not make solely-automated decisions producing legal effects or similarly significant effects on you.

Searchbase's AI models generate content (extractions, summaries, chat responses) but they are assistive tools: we do not automatically score your reliability, do not compute risk scores, do not deny access to services based on AI output.

We do not profile users for marketing purposes.


11. AI and user content

This section requires explicit honesty.

When you enter a prompt in chat, describe a scraping job, or upload a file or audio, that content is sent to third-party AI providers in the United States (or other non-EU countries) for processing. Specifically:

  • Every chat prompt and AI response passes through Anthropic (Claude Sonnet 4.6 and Haiku 4.5), based in the United States.
  • If fallback is active, prompts may be sent to Azure OpenAI Foundry (GPT-4.1) or DeepSeek.
  • Audio files uploaded for transcription pass through Deepgram (Nova-3), United States.
  • Scraping prompts and target URLs may pass through Firecrawl or Parallel.ai (United States) as fallback or extension.
  • Text search queries pass through Serper, ScrapeCreators, X API or YouTube Data API.

What this means in practice:

  • The data you enter in chats is seen by the AI sub-processor for the time needed to generate the response.
  • By contract / configuration, we do not use your data to train AI models, and the main sub-processors (Anthropic, Azure OpenAI) commit not to train their models on API data.
  • Do not enter the following into chats: other people's personal data without a legal basis, secrets, passwords, health data, or other special categories of data under Art. 9 GDPR. Searchbase is not currently designed to process special categories of data.

12. Minors

Searchbase is not directed at minors under 16, and we do not knowingly collect personal data from them. If you become aware that a minor has provided us with personal data without parental consent, contact [email protected] and we will delete it.


13. Changes to this policy

This notice may be updated. The current version is always available at https://searchbase.org/privacy. For material changes we will notify you by email at the address linked to your account, with at least 15 days' notice.

The version and last-updated date are shown at the top of this document.


14. Contact and complaints

Privacy email: [email protected]
General email: [email protected]

If you believe the processing of your personal data violates the GDPR or Italian law, you have the right to lodge a complaint with a supervisory authority:

  • For users resident in Italy: Garante per la Protezione dei Dati Personali, Piazza Venezia 11, 00187 Roma. Website: https://www.garanteprivacy.it

  • For users resident in other EU Member States: the supervisory authority of your country of residence, place of work, or place of the alleged infringement (Art. 77 GDPR).