PeaCrawler
PeaCrawler is the governed web crawler used by Pea for operator-directed corpus intake, research retrieval, and source archiving.
PeaCrawler is not a general-purpose search indexer and does not crawl continuously for public ranking, advertising, or profiling purposes.
Identity
- User-Agent: PeaCrawler/0.1 (+https://www.decentre.io/pea/crawler/)
- Operator: Pea / Decentre
- Product: https://www.decentre.io/pea
- Policy URL: https://www.decentre.io/pea/crawler/
Crawling Policy
- PeaCrawler respects robots.txt.
- PeaCrawler honors crawl-delay when present.
- PeaCrawler uses bounded page, depth, byte, redirect, and rate limits.
- PeaCrawler does not bypass login walls, bot challenges, paywalls, or access controls.
- PeaCrawler records provenance, crawl limits, errors, and source URLs in manifests.
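For site operators, the robots.txt and crawl-delay behavior described above can be previewed with Python's standard urllib.robotparser module. This is a minimal sketch, not PeaCrawler's actual implementation. It assumes that the robots.txt product token is "PeaCrawler" (taken from the User-Agent string), and the example.com URLs and rule values are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Full User-Agent string as published in the Identity section.
USER_AGENT = "PeaCrawler/0.1 (+https://www.decentre.io/pea/crawler/)"

# Hypothetical robots.txt a site operator might publish.
# Assumption: the crawler matches the product token "PeaCrawler".
ROBOTS_TXT = """\
User-agent: PeaCrawler
Crawl-delay: 10
Disallow: /private/

User-agent: *
Disallow:
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text without making a network request."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# Paths outside /private/ are allowed for the PeaCrawler group.
allowed = parser.can_fetch(USER_AGENT, "https://example.com/docs/index.html")

# Paths under /private/ are disallowed for the PeaCrawler group.
blocked = not parser.can_fetch(USER_AGENT, "https://example.com/private/report.html")

# Crawl-delay in seconds for this agent, or None if the group sets none.
delay = parser.crawl_delay(USER_AGENT)

print(f"allowed={allowed} blocked={blocked} crawl_delay={delay}")
```

A crawler following the policy above would skip any URL for which can_fetch returns False and wait at least the reported crawl-delay between requests to the same host.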
Purpose
PeaCrawler is used when an operator asks Pea to retrieve public web material for research, indexing, evidence review, or governed corpus intake.
Development Status
PeaCrawler is currently under active development. During this phase, crawler activity is limited to operator-directed testing, research retrieval, and controlled corpus intake.
Opt Out / Contact
To request reduced crawl rate, blocking, correction, or more information, contact: jessejr@decentre.io.
Requests are reviewed manually while PeaCrawler is under development.
Verification
Current public IP ranges: PeaCrawler does not currently operate from a fixed published IP range.
Request signing public key: not currently published.