TL;DR
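Scraping publicly available data is generally legal in the US and EU, but it carries risk in three areas: collecting personal data without a legal basis, bypassing technical barriers such as logins, and violating a site's terms of service (a civil matter rather than a criminal one). The Ninth Circuit's 2022 hiQ vs LinkedIn ruling held that scraping public data likely does not violate the Computer Fraud and Abuse Act. Always check local laws.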
Disclaimer: This is not legal advice. Laws vary by country. Consult a lawyer for your specific situation.
The Short Answer
Web scraping is generally legal when you scrape publicly available data without bypassing security measures. US courts have repeatedly held that accessing public information does not, by itself, violate computer fraud laws.
It becomes legally risky when you bypass login requirements, circumvent technical blocks, or collect personal data without a legal basis.
Key Legal Rulings
hiQ Labs vs LinkedIn (2022)
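The most important case for web scraping. The Ninth Circuit ruled that scraping publicly available LinkedIn profiles likely does not violate the Computer Fraud and Abuse Act (CFAA). Later in 2022, the district court found that hiQ had breached LinkedIn's User Agreement, and the case settled.
What this means: In the Ninth Circuit, websites are unlikely to succeed in using the CFAA to stop you from collecting public data, even if they send cease-and-desist letters or their terms prohibit scraping. Breach of contract and other claims remain available to them.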
Clearview AI Cases (2020-2024)
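Clearview scraped billions of photos from social media. Data protection regulators in several EU countries and the UK issued heavy fines, and Australia's privacy regulator ordered it to stop collecting and to delete the data. The US had mixed results.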
What this means: Scraping is riskier when it involves biometric data, photos, or personal information. Privacy laws apply separately from computer access laws.
Meta vs Bright Data (2024)
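Meta sued Bright Data for scraping Facebook and Instagram. The court granted summary judgment for Bright Data on the breach of contract claim, finding that Meta's terms did not prohibit scraping public data while logged out. Meta then dropped the case.
What this means: Scraping public data while logged out may not even breach a platform's terms, but big platforms will still sue, and outcomes depend on the specific terms and facts.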
Legal Status by Data Type
| Data Type | Risk Level | Notes |
|---|---|---|
| Public business data | LOW | Google Maps, Yelp, public directories |
| Product listings | LOW | Amazon, eBay, e-commerce prices |
| Public social media posts | MEDIUM | Generally legal, but platforms may sue |
| Public LinkedIn profiles | MEDIUM | hiQ ruling limits CFAA claims, but LinkedIn fights hard |
| Content behind a login | HIGH | Using credentials you do not own is risky |
| Personal emails, phones | HIGH | GDPR, CCPA may apply |
| Photos of people | HIGH | Biometric privacy laws in many regions |
Laws by Region
United States
The Computer Fraud and Abuse Act (CFAA) is the main federal law. The Supreme Court's Van Buren decision (2021) and the hiQ ruling narrowed its scope. Scraping public data is generally low risk under the CFAA. Circumventing security measures can violate it.
State laws vary. California has CCPA for personal data. Illinois has biometric privacy laws. Check where the people whose data you collect are located.
European Union
GDPR applies to personal data of people in the EU, even when that data is publicly visible. You need a legal basis to process personal data. Public interest, legitimate interest, and consent are common grounds.
Scraping business data that contains no personal data falls outside GDPR. Scraping personal profiles requires careful legal analysis.
United Kingdom
UK GDPR mirrors EU rules. The Computer Misuse Act covers unauthorized access. Public data scraping is generally allowed.
Best Practices
DO:
- Scrape publicly visible data only
- Respect robots.txt (not legally required, but good practice)
- Rate limit your requests to avoid server strain
- Store only what you need
- Delete data when no longer needed
- Have a clear business purpose
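The robots.txt and rate-limiting items above can be sketched in Python using only the standard library. This is an illustrative sketch, not a complete scraper: the `USER_AGENT` string, the default delay, and the helper names are assumptions made for the example, and the robots.txt content is assumed to have been downloaded already.

```python
import time
from typing import List, Optional
from urllib.robotparser import RobotFileParser

# Hypothetical values for illustration only.
USER_AGENT = "example-research-bot"
DEFAULT_DELAY_SECONDS = 1.0


def build_robots_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt content that has already been downloaded."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def allowed_urls(parser: RobotFileParser, urls: List[str]) -> List[str]:
    """Keep only the URLs that robots.txt permits for USER_AGENT."""
    return [url for url in urls if parser.can_fetch(USER_AGENT, url)]


def request_delay(parser: RobotFileParser) -> float:
    """Use the site's Crawl-delay if it declares one, else a default."""
    declared = parser.crawl_delay(USER_AGENT)
    return float(declared) if declared is not None else DEFAULT_DELAY_SECONDS


class RateLimiter:
    """Enforce a minimum interval between consecutive requests."""

    def __init__(self, min_interval_seconds: float) -> None:
        self.min_interval = min_interval_seconds
        self._last_request: Optional[float] = None

    def wait(self) -> None:
        """Sleep just long enough to respect the minimum interval."""
        if self._last_request is not None:
            elapsed = time.monotonic() - self._last_request
            if elapsed < self.min_interval:
                time.sleep(self.min_interval - elapsed)
        self._last_request = time.monotonic()
```

In use, you would filter your URL list with `allowed_urls`, create a `RateLimiter(request_delay(parser))`, and call `limiter.wait()` before each request. Following robots.txt this way is a courtesy, not a legal safe harbor.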
DO NOT:
- Bypass CAPTCHAs or login screens you should not access
- Use credentials that are not yours
- Ignore cease-and-desist letters without legal advice
- Scrape and resell personal data
- Overload servers with requests
- Scrape copyrighted content for republishing
Terms of Service
Many websites prohibit scraping in their terms. Does this matter legally?
Short answer: Violating terms of service is generally not a criminal offense, but it can be grounds for a civil lawsuit. Platforms can ban your account and IP, and they can pursue breach of contract claims.
Practical advice: Small-scale scraping rarely attracts attention. Large commercial operations get legal letters. If you receive a cease-and-desist, consult a lawyer.
Using Apify Legally
Apify provides the tools. You are responsible for how you use them.
Apify states that it complies with GDPR and offers data processing agreements. Its terms require you to use actors legally.
When using Apify actors:
- Choose actors for public data sources
- Do not use login-based scrapers with stolen credentials
- Review what data each actor collects
- Consider your downstream use case
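One way to act on reviewing what each actor collects is to filter records before you store them, and to flag old records for deletion. The sketch below is generic Python and does not use any Apify API. The field names and the 90-day retention period are hypothetical and would need to match your own data and purpose.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical field names and retention period; adjust to your own use case.
ALLOWED_FIELDS = {"business_name", "category", "address", "rating"}
RETENTION_PERIOD = timedelta(days=90)


def minimize(record: dict) -> dict:
    """Drop every field that is not on the allowlist."""
    return {key: value for key, value in record.items() if key in ALLOWED_FIELDS}


def is_expired(collected_at: datetime, now: datetime) -> bool:
    """Return True once a record is older than the retention period."""
    return now - collected_at > RETENTION_PERIOD
```

An allowlist is used rather than a blocklist so that any new field a scraper starts returning is dropped by default until you decide you need it.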
Common Questions
Q: Is scraping Google legal?
A: Scraping Google search results violates their terms of service, but is not a criminal offense. Google rarely sues individuals. They focus on large-scale commercial operations.
Q: Can I scrape my competitors?
A: Scraping public pricing and product data is generally legal. This is called competitive intelligence. Do not access private systems or steal trade secrets.
Q: What if I get a cease-and-desist letter?
A: Stop the activity. Consult a lawyer. Many letters are scare tactics, but some have merit. Do not ignore them.
Q: Is scraping for AI training legal?
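A: This is evolving. US courts have not settled whether training on scraped content is fair use, and early rulings have gone both ways. The EU AI Act adds transparency and copyright-compliance obligations for some AI providers. Copyright holders are actively litigating this.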
Q: Can I scrape and resell the data?
A: In the US, facts themselves cannot be copyrighted, although the EU grants a separate database right over substantial collections. Be careful with personal data (privacy laws) and creative content (copyright). Many data businesses operate legally in this space.
Resources
- EFF on CFAA - Legal analysis of computer fraud law
- GDPR Official Site - EU data protection requirements
- California CCPA - State privacy law details