Privacy Policy — Searchbase
Last updated: 2026-05-04
1. Data Controller
The Searchbase service (https://searchbase.org) is operated by:
[Searchbase — legal entity to be defined]
Registered office: to be defined
Privacy email: [email protected]
General email: [email protected]
Data Protection Officer (DPO): not appointed — not mandatory under Art. 37 GDPR.
2. Data we collect
We collect only what we need to provide Searchbase. Here is everything, explicitly.
2.1 Account data
- Password (stored as a hash by Supabase Auth — we never see it in cleartext)
- Google OAuth profile (if you sign in with Google): name, email, sub identifier
2.2 Subscription and billing data
- User credit balance (user_credits)
- Subscription (user_subscriptions): Stripe customer ID, subscription tier, billing periods
Credit card data never transits our servers: it is collected directly by Stripe.
2.3 User-generated content
Stored in Supabase Postgres with Row Level Security (RLS) enabled on every user table:
- chat_sessions — conversation titles and timestamps
- chat_messages — full prompts, AI responses, tool calls and tool results
- scrape_jobs — target URLs, extraction prompts, results, options
- user_long_memory — free-text facts you author yourself (identity, projects, preferences, max 2000 chars per entry)
- chat_session_summary — rolling summaries auto-generated by Claude Haiku
- message_feedback — thumbs up/down on messages
- workspaces and their members
2.4 Uploaded files
Audio and document uploads are stored in Cloudflare R2 (buckets scrapedog-audio-prod and scrapedog-documents-prod) and served via short-lived (15-minute) presigned URLs.
2.5 Browser localStorage
- cookie-consent — your cookie choice
- sidebar-collapsed, sidebar-workspaces-expanded — UI state
- onboarding-completed — onboarding flag
- scrapedog-user-memory — local mirror of your long-term memory
2.6 Server-side session cookies
- Supabase sb-* cookies — authentication (httpOnly)
- PostHog ph_* cookies — only if you accepted analytics consent
2.7 IP address and User-Agent
- Collected by Cloudflare for DDoS / WAF protection
- Collected by Microsoft Azure in request logs
- Not sent to Sentry (sendDefaultPii: false)
- PostHog is configured with person_profiles: 'identified_only', so anonymous visitors are not profiled
3. Purposes and legal bases (Art. 6 GDPR)
| Purpose | Data processed | Legal basis |
|---|---|---|
| Account creation, service delivery, billing | Email, password (hash), subscription data, user content | Art. 6(1)(b) — performance of contract |
| Transactional email (verification, password reset) | Email address | Art. 6(1)(b) — performance of contract |
| AI processing of your prompts and files | Prompts, attachments, scraping results | Art. 6(1)(b) — necessary to deliver the requested service |
| Long-term memory storage | Entries in user_long_memory | Art. 6(1)(a) — consent (you choose to enter the entries) |
| Product analytics (PostHog) | Usage events | Art. 6(1)(a) — consent (cookie banner) |
| Security, anti-abuse, rate limiting | IP, UA, user ID | Art. 6(1)(f) — legitimate interest (protect the service) |
4. Sub-processors and third parties
To operate Searchbase we rely on third-party providers, each acting as a sub-processor. Full, current list:
| Provider | Purpose | Data processed | Region | Privacy policy |
|---|---|---|---|---|
| Supabase | Auth + Postgres database | Account + all user content | AWS eu-central-1 | https://supabase.com/privacy |
| Cloudflare R2 | Object storage | Audio + document files | Global | https://www.cloudflare.com/privacypolicy/ |
| Cloudflare | CDN, WAF, DNS | IP, UA, request metadata | Global | https://www.cloudflare.com/privacypolicy/ |
| Microsoft Azure | Compute, Redis, Key Vault | All backend traffic | North Europe | https://privacy.microsoft.com/ |
| Anthropic (Claude Sonnet 4.6 + Haiku 4.5) | AI response generation, session summaries | Every chat prompt, scraping job description, file content | United States | https://www.anthropic.com/legal/privacy |
| Azure OpenAI Foundry (GPT-4.1, fallback) | Fallback LLM | Chat prompts (when fallback is active) | United States / EU | https://privacy.microsoft.com/ |
| DeepSeek (optional fallback) | Fallback LLM | Chat prompts (when fallback is active) | United States / China | https://platform.deepseek.com/privacy |
| Deepgram (Nova-3, optional) | Audio transcription | Uploaded audio files | United States | https://deepgram.com/privacy |
| Firecrawl (optional fallback) | Fallback scraping | Target URLs + extraction prompts | United States | https://www.firecrawl.dev/privacy |
| Parallel.ai (optional) | Scraping/research | Target URLs | United States | https://parallel.ai/privacy |
| Serper | Google searches via web_search tool | Search queries | United States | https://serper.dev/privacy |
| ScrapeCreators | Social media queries | Social handles and URLs | United States | https://scrapecreators.com/privacy |
| X (Twitter) API | Searches on X | Search queries on X | United States | https://twitter.com/privacy |
| YouTube Data API (Google) | Video/channel queries | Search queries | United States | https://policies.google.com/privacy |
| Stripe | Payments and billing | Billing email, payment-method tokens, subscription metadata | United States / Ireland | https://stripe.com/privacy |
| Resend | Transactional email delivery (sender: [email protected]) | Recipient email | United States | https://resend.com/legal/privacy-policy |
| PostHog EU Cloud | Product analytics (consent only) | Usage events, pseudonymous user ID | EU (eu.i.posthog.com) | https://posthog.com/privacy |
| Sentry | Error tracking | Stack traces (no PII, sendDefaultPii: false) | United States | https://sentry.io/privacy/ |
| BetterStack / Logtail | Application logs | Structured logs (may include user IDs and request metadata) | EU (Falkenstein) | https://betterstack.com/privacy |
| Google OAuth | Sign-in with Google | Basic profile + email | United States | https://policies.google.com/privacy |
| GitHub Container Registry | Docker image hosting | No user data | Global | https://docs.github.com/site-policy/privacy-policies |
5. International transfers (Art. 46 GDPR)
A significant portion of the sub-processors above are based in the United States or other non-EU countries. In particular, every chat prompt, scraping description and file content sent to AI is transmitted to Anthropic in the United States (and, in fallback scenarios, to Azure OpenAI or DeepSeek).
For these transfers we rely on:
- Standard Contractual Clauses (SCCs) issued by the European Commission (Decision 2021/914)
- The provider's adherence to the EU–US Data Privacy Framework, where applicable (e.g. Stripe, Google, Microsoft, Cloudflare)
- Supplementary technical measures: TLS 1.2+ in transit, JWT authentication, data minimization
6. Data retention
We keep data only as long as necessary for the purposes for which it was collected. Summary table:
| Data category | Retention period |
|---|---|
| Account data | Until you request deletion (today via email request, self-service on the roadmap) |
| Chat sessions and messages | Indefinitely until you delete them, or until account deletion |
| Scrape jobs | Indefinitely until you delete them, or until account deletion |
| Long-term memory | Until you delete the entries |
| Files in Cloudflare R2 | Deleted on cascade when the parent session is deleted, or on account erasure |
| Application logs (BetterStack) | ~30 days (vendor default — exact period being documented) |
| Sentry events | 90 days (default) |
| PostHog events | 90 days (free plan default) |
| Billing data (Stripe) | Per Stripe's policy, typically 7 years (tax / accounting obligations) |
| Audit logs (admin actions) | Retained for compliance and security; on erasure, the user_id may be set to NULL but the action record is preserved |
7. Cookies and similar technologies
| Category | Name | Type | Purpose | Consent |
|---|---|---|---|---|
| Strictly necessary | sb-access-token, sb-refresh-token | HTTP cookie (httpOnly) | Supabase authentication | Not required |
| Functional | cookie-consent | localStorage | Stores your cookie choice | Not required |
| Functional | sidebar-collapsed, sidebar-workspaces-expanded | localStorage | UI state | Not required |
| Functional | onboarding-completed | localStorage | Onboarding state | Not required |
| Functional | scrapedog-user-memory | localStorage | Local mirror of user memory | Not required |
| Analytics | ph_* (PostHog) | Cookie + localStorage | Product analytics | Required (opt-in) |
The cookie banner is currently binary (accept all / reject all). Rejection fully disables PostHog.
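For illustration only, the opt-in rule above can be sketched as a small gate that runs an analytics initializer only after an explicit acceptance. This is a minimal sketch and not Searchbase's actual code: the stored value "accepted", the ConsentStorage interface and both function names are assumptions, since this policy does not state the format stored under the cookie-consent key.

```typescript
// Minimal storage shape; the browser's localStorage satisfies it.
interface ConsentStorage {
  getItem(key: string): string | null;
}

// Assumption: the banner writes the string "accepted" under "cookie-consent"
// when the user opts in. The real stored format is not given in this policy.
function analyticsAllowed(storage: ConsentStorage): boolean {
  return storage.getItem("cookie-consent") === "accepted";
}

// Runs the supplied initializer only after an explicit opt-in.
// Returns true if analytics was started, false otherwise.
function initAnalyticsIfConsented(
  storage: ConsentStorage,
  init: () => void,
): boolean {
  if (!analyticsAllowed(storage)) {
    return false; // no choice recorded yet, or rejected: analytics stays off
  }
  init();
  return true;
}
```

In a browser, the call would look like initAnalyticsIfConsented(window.localStorage, startPostHog), where startPostHog is a hypothetical wrapper around the analytics setup. Treating a missing value the same as a rejection keeps analytics off by default.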
8. Security
We implement concrete technical and organizational measures:
- In transit: TLS 1.2+ end-to-end via Cloudflare
- At rest: Supabase (AWS-managed), Cloudflare R2 (Cloudflare-managed), Azure Key Vault for application secrets
- Per-user isolation: Postgres Row Level Security (RLS) on every user table
- Authentication: JWT with JWKS rotation
- Anti-SSRF: centralized outbound URL validation through HttpGateway
- Anti-abuse: Cloudflare WAF + DDoS protection
- Rate limiting: distributed (Redis-backed), default 20 requests/minute per user
- Admin access control: RBAC + audit_log table for every privileged action
- Error minimization: Sentry configured without PII (sendDefaultPii: false)
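To make the per-user rate limit above concrete, here is a minimal sketch of a fixed-window limiter with the stated default of 20 requests per minute. It is an illustration under assumptions, not the production implementation: an in-memory Map stands in for Redis, the fixed-window strategy is one possible choice (the policy does not say which algorithm is used), and the class and method names are hypothetical.

```typescript
// Per-user counter for the current time window.
interface WindowState {
  windowStart: number; // start of the current window, in ms
  count: number;       // requests counted in the current window
}

// Hypothetical fixed-window limiter. A Map stands in for Redis here; in a
// distributed setup the counters would live in a shared store instead.
class FixedWindowRateLimiter {
  private readonly windows = new Map<string, WindowState>();
  private readonly limit: number;
  private readonly windowMs: number;

  constructor(limit = 20, windowMs = 60_000) {
    this.limit = limit;       // default taken from the policy text
    this.windowMs = windowMs; // one minute
  }

  // Returns true if the request is allowed, false if the user is over the limit.
  allow(userId: string, nowMs: number): boolean {
    const state = this.windows.get(userId);
    if (state === undefined || nowMs - state.windowStart >= this.windowMs) {
      // First request, or the previous window has expired: start a new one.
      this.windows.set(userId, { windowStart: nowMs, count: 1 });
      return true;
    }
    if (state.count >= this.limit) {
      return false;
    }
    state.count += 1;
    return true;
  }
}
```

A caller would invoke limiter.allow(userId, Date.now()) on each request and reject the request when it returns false. Counters are kept per user, so one user reaching the limit does not affect another.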
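No system is 100% secure. In the event of a personal data breach, we will notify the Italian supervisory authority (Garante per la Protezione dei Dati Personali) within 72 hours of becoming aware of it, as required by Art. 33 GDPR. If the breach is likely to result in a high risk to your rights and freedoms, we will also inform the affected data subjects under Art. 34.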
9. Your rights (Art. 15–22 GDPR)
You have the following rights, exercisable free of charge:
- Access (Art. 15) — obtain a copy of your data. How: email [email protected]. We export your data as JSON (currently capped at 10,000 rows per table for operational reasons).
- Rectification (Art. 16) — you can edit account, memory and workspaces directly in the app. For other corrections, email [email protected].
- Erasure (Art. 17) — email [email protected]. Erasure today runs through an internal procedure that: deletes R2 files, deletes the PostHog person, cascades the Postgres deletion.
- Restriction (Art. 18) — email request.
- Portability (Art. 20) — same mechanism as access (JSON export).
- Objection (Art. 21) — email request.
- Withdrawal of consent — analytics cookie consent can be withdrawn from the banner (reject all). Long-term memory consent is withdrawn by deleting the entries from the app.
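As an illustration of the 10,000-row cap on JSON exports mentioned under Access, the sketch below truncates each table to the cap and records whether anything was left out. It is a hypothetical example, not the actual export code: the buildExport function, the TableExport shape and the truncated flag are assumptions introduced here.

```typescript
// Cap taken from the policy text: at most 10,000 rows per table.
const MAX_ROWS_PER_TABLE = 10_000;

// Hypothetical export shape: the rows kept, plus a flag telling the
// recipient whether the table was cut off at the cap.
interface TableExport<Row> {
  rows: Row[];
  truncated: boolean;
}

// Builds a per-table export, keeping at most `cap` rows from each table.
function buildExport<Row>(
  tables: Record<string, Row[]>,
  cap: number = MAX_ROWS_PER_TABLE,
): Record<string, TableExport<Row>> {
  const result: Record<string, TableExport<Row>> = {};
  for (const [name, rows] of Object.entries(tables)) {
    result[name] = {
      rows: rows.slice(0, cap),
      truncated: rows.length > cap,
    };
  }
  return result;
}
```

The result can be serialized with JSON.stringify. An explicit truncated flag is one way to let a data subject see that a table was cut off and ask for the remainder.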
Operational transparency: as of this draft, account erasure and DSARs are admin-triggered on your email request, not self-service. We are building self-service in the gdpr/p2-gdpr-03-delete-account branch and will release it as soon as possible.
10. Automated decision-making and profiling (Art. 22)
We do not make solely automated decisions that produce legal effects or similarly significant effects on you.
Searchbase's AI models generate content (extractions, summaries, chat responses), but they are assistive tools: we do not automatically score your reliability, compute risk scores, or deny access to services based on AI output.
We do not profile users for marketing purposes.
11. AI and user content
This section requires explicit honesty.
When you enter a prompt in chat, describe a scraping job, or upload a file or audio, that content is sent to third-party AI providers in the United States (or other non-EU countries) for processing. Specifically:
- Every chat prompt and AI response passes through Anthropic (Claude Sonnet 4.6 and Haiku 4.5), based in the United States.
- If fallback is active, prompts may be sent to Azure OpenAI Foundry (GPT-4.1) or DeepSeek.
- Audio files uploaded for transcription pass through Deepgram (Nova-3), United States.
- Scraping prompts and target URLs may pass through Firecrawl or Parallel.ai (United States) as fallback or extension.
- Text search queries pass through Serper, ScrapeCreators, the X API or the YouTube Data API.
What this means in practice:
- The data you enter in chats is seen by the AI sub-processor for the time needed to generate the response.
- By contract / configuration, we do not use your data to train AI models, and the main sub-processors (Anthropic, Azure OpenAI) commit not to train their models on API data.
- Do not enter other people's personal data into chats without a legal basis, and do not enter secrets, passwords, health data or other special categories of data under Art. 9 GDPR. Searchbase is not currently designed to process special categories of data.
12. Minors
Searchbase is not directed at minors under 16. We do not knowingly collect personal data from minors under 16. If you become aware that a minor has provided us with personal data without parental consent, contact [email protected] and we will delete it.
13. Changes to this policy
This notice may be updated. The current version is always available at https://searchbase.org/privacy. For material changes we will notify you by email at the address linked to your account, with at least 15 days' notice.
The version and last-updated date are shown at the top of this document.
14. Contact and complaints
Privacy email: [email protected]
General email: [email protected]
If you believe the processing of your personal data violates the GDPR or Italian law, you have the right to lodge a complaint with a supervisory authority:
- For users resident in Italy: Garante per la Protezione dei Dati Personali, Piazza Venezia 11, 00187 Roma. Website: https://www.garanteprivacy.it
- For users resident in other EU Member States: the supervisory authority of your country of residence, place of work, or place of the alleged infringement (Art. 77 GDPR).