Bright Data
Service domain: WEB SCRAPING
Community, BYOC
Search, Crawl and Scrape any site, at scale, without getting blocked
Author: Arcade
Version: 0.5.1
Auth: No authentication required
3 tools
3 require secrets
Bright Data provides a developer toolkit for large-scale web search, crawling, and scraping, enabling reliable extraction of pages and structured data without getting blocked. It supports search queries, content-to-Markdown conversion, and configurable data feeds across many site types.
Designed for integration into data pipelines and analytics workflows with parameterized feeds and output formats.
Capabilities
- Large-scale crawling and scraping with anti-blocking behavior for sustained collection.
- Search queries with advanced parameters across Google, Bing, and Yandex.
- Transform pages into clean Markdown and emit structured JSON feeds for profiles, products, reviews, listings, and media.
- Configurable extraction parameters for batching, pagination, and media handling.
Secrets
See the Arcade secrets documentation for how to store and reference secrets in your Arcade configuration.
- BRIGHTDATA_API_KEY: Your Bright Data API key. Obtain it from the Bright Data dashboard under Account Settings → API Tokens. Generate a new token and copy the value; it will not be shown again. A paid or trial Bright Data account is required. Example: BRIGHTDATA_API_KEY=sk_...
- BRIGHTDATA_ZONE: The name of the Bright Data zone (proxy zone or dataset zone) the toolkit should target. Zones are created and managed in the Bright Data dashboard under Proxies & Scraping Infrastructure. Use the zone's string identifier as shown in the dashboard. Example: BRIGHTDATA_ZONE=zone123
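Before running the tools, it can help to confirm that both secrets are actually present. The following is a minimal sketch that assumes the secrets are exposed as environment variables under the same names. How Arcade injects secrets depends on your configuration, and `missing_secrets` is a hypothetical helper, not part of the toolkit.

```python
import os
from typing import Mapping

# Secret names as listed in this section.
REQUIRED_SECRETS = ("BRIGHTDATA_API_KEY", "BRIGHTDATA_ZONE")


def missing_secrets(env: Mapping[str, str] = os.environ) -> list[str]:
    """Return the required secret names that are unset or blank in `env`."""
    return [name for name in REQUIRED_SECRETS if not env.get(name, "").strip()]


if __name__ == "__main__":
    missing = missing_secrets()
    if missing:
        print("Missing secrets:", ", ".join(missing))
    else:
        print("All Bright Data secrets are set.")
```

Passing a plain dictionary instead of `os.environ` makes the check easy to exercise in tests without touching the real environment.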
Available tools (3)
Each tool is listed with its description and the number of secrets it requires.
scrape_as_markdown (Secrets: 2)
Scrape a webpage and return its content in Markdown format using Bright Data.
Examples:
scrape_as_markdown("https://example.com") -> "# Example Page\nContent..."
scrape_as_markdown("https://news.ycombinator.com") -> "# Hacker News\n..."
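Because scrape_as_markdown returns a single Markdown string, downstream code usually needs to pick it apart. The sketch below assumes the output uses `#`-style headings, as in the documented examples. `extract_headings` is a hypothetical helper that works on a string you already have and does not call Bright Data.

```python
import re

# Matches ATX-style headings such as "# Title" or "## Section" at line start.
_HEADING = re.compile(r"^(#{1,6})[ \t]+(.+?)[ \t]*$", re.MULTILINE)


def extract_headings(markdown: str) -> list[tuple[int, str]]:
    """Return (level, text) pairs for each heading in a Markdown string."""
    return [(len(m.group(1)), m.group(2)) for m in _HEADING.finditer(markdown)]


# Same shape as the documented example output.
sample = "# Example Page\nContent..."
print(extract_headings(sample))  # [(1, 'Example Page')]
```

This line-based match does not skip fenced code blocks, so a `#` comment inside scraped code would also be reported as a heading. A full Markdown parser is the safer choice when that matters.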
search_engine (Secrets: 2)
Search Google, Bing, or Yandex with advanced parameters using Bright Data.
Examples:
search_engine("climate change") -> "# Search Results\n## Climate Change - Wikipedia\n..."
search_engine("Python tutorials", engine="bing", num_results=5) -> "# Bing Results\n..."
search_engine("cats", search_type="images", country_code="us") -> "# Image Results\n..."
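Checking arguments before a call can catch simple mistakes such as an unsupported engine. The sketch below is illustrative only: the keyword names come from the examples, the engine list comes from the tool description, and `"query"` is an assumed name for the first positional argument. `build_search_args` is a hypothetical helper, and the real tool may accept other values or apply its own defaults.

```python
from typing import Any, Optional

# Engines named in the tool description.
SUPPORTED_ENGINES = {"google", "bing", "yandex"}


def build_search_args(
    query: str,
    engine: Optional[str] = None,
    num_results: Optional[int] = None,
    search_type: Optional[str] = None,
    country_code: Optional[str] = None,
) -> dict[str, Any]:
    """Validate inputs and return only the arguments that were set.

    Unset arguments are omitted so the tool's own defaults apply.
    """
    if not query.strip():
        raise ValueError("query must not be empty")
    if engine is not None and engine.lower() not in SUPPORTED_ENGINES:
        raise ValueError(f"unsupported engine: {engine!r}")
    if num_results is not None and num_results < 1:
        raise ValueError("num_results must be a positive integer")

    args: dict[str, Any] = {"query": query}  # "query" is an assumed key name
    if engine is not None:
        args["engine"] = engine.lower()
    if num_results is not None:
        args["num_results"] = num_results
    if search_type is not None:
        args["search_type"] = search_type
    if country_code is not None:
        args["country_code"] = country_code
    return args
```

For example, `build_search_args("Python tutorials", engine="bing", num_results=5)` yields a dictionary with exactly those three entries, while an engine outside the documented three raises `ValueError`.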
web_data_feed (Secrets: 1)
Extract structured data from websites such as LinkedIn, Amazon, and Instagram.
Never make up links. If links are needed, run search_engine first.
Supported source types:
- amazon_product, amazon_product_reviews
- linkedin_person_profile, linkedin_company_profile
- zoominfo_company_profile
- instagram_profiles, instagram_posts, instagram_reels, instagram_comments
- facebook_posts, facebook_marketplace_listings, facebook_company_reviews
- x_posts
- zillow_properties_listing
- booking_hotel_listings
- youtube_videos
Examples:
web_data_feed("amazon_product", "https://amazon.com/dp/B08N5WRWNW") -> '{"title": "Product Name", ...}'
web_data_feed("linkedin_person_profile", "https://linkedin.com/in/johndoe") -> '{"name": "John Doe", ...}'
web_data_feed("facebook_company_reviews", "https://facebook.com/company", num_of_reviews=50) -> '[{"review": "...", ...}]'
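The examples suggest that web_data_feed returns a JSON string holding either one object or an array of objects. The sketch below checks a source type against the documented list and normalizes either result shape into a list of records. Both helper names are hypothetical, and the assumption that every result is valid JSON comes only from the examples.

```python
import json
from typing import Any

# Source types copied from the supported list in this section.
SUPPORTED_SOURCE_TYPES = frozenset({
    "amazon_product", "amazon_product_reviews",
    "linkedin_person_profile", "linkedin_company_profile",
    "zoominfo_company_profile",
    "instagram_profiles", "instagram_posts",
    "instagram_reels", "instagram_comments",
    "facebook_posts", "facebook_marketplace_listings",
    "facebook_company_reviews",
    "x_posts",
    "zillow_properties_listing",
    "booking_hotel_listings",
    "youtube_videos",
})


def check_source_type(source_type: str) -> str:
    """Return `source_type` unchanged, or raise if it is not documented."""
    if source_type not in SUPPORTED_SOURCE_TYPES:
        raise ValueError(f"unsupported source type: {source_type!r}")
    return source_type


def parse_feed_result(raw: str) -> list[dict[str, Any]]:
    """Decode a JSON string result into a list of records.

    A single JSON object is wrapped in a list so callers can always iterate.
    """
    data = json.loads(raw)
    if isinstance(data, dict):
        return [data]
    if isinstance(data, list):
        return data
    raise ValueError("expected a JSON object or array")


print(parse_feed_result('{"title": "Product Name"}'))  # [{'title': 'Product Name'}]
```

Returning a list in both cases lets the same loop handle a single product record and a batch of reviews.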