Scrapling


Framework

D4Vinci/Scrapling

An adaptive web scraping framework that automatically bypasses anti-bot protections and relocates elements when pages change.


Overview

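Scrapling is an adaptive web scraping framework that handles everything from a single request to a large-scale crawl. Its parser learns from website changes and automatically relocates elements when a page is updated. It has built-in support for bypassing anti-scraping systems such as Cloudflare Turnstile, and it supports concurrent multi-session crawling and proxy rotation.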

README preview

Effortless Web Scraping for the Modern Web

Selection methods · Fetchers · Spiders · Proxy Rotation · CLI · MCP

Scrapling is an adaptive Web Scraping framework that handles everything from a single request to a full-scale crawl.

Its parser learns from website changes and automatically relocates your elements when pages update. Its fetchers bypass anti-bot systems like Cloudflare Turnstile out of the box. And its spider framework lets you scale up to concurrent, multi-session crawls with pause/resume and automatic proxy rotation - all in a few lines of Python. One library, zero compromises.

Blazing fast crawls with real-time stats and streaming. Built by Web Scrapers for Web Scrapers and regular users, there's something for everyone.

```python
from scrapling.fetchers import Fetcher, AsyncFetcher, StealthyFetcher, DynamicFetcher
StealthyFetcher.adaptive = True
p = StealthyFetcher.fetch('https://example.com', headless=True, network_idle=True)  # Fetch website under the radar!
products = p.css('.product', auto_save=True)                                        # Scrape data that survives website design changes!
products = p.css('.product', adaptive=True)                                         # Later, if the website structure changes, pass `adaptive=True` to find them!
```
Or scale up to full crawls
```python
from scrapling.spiders import Spider, Response

class MySpider(Spider):
  name = "demo"
  start_urls = ["https://example.com/"]

  async def parse(self, response: Response):
      for item in response.css('.product'):
          ...  # the README preview is cut off at this point
```

FAQ (2)

Troubleshooting

Why is the Scrapling skill page on ClawHub blank or broken?

This was a temporary outage (openclaw/clawhub#2345) and has been resolved. The page at https://clawhub.ai/D4Vinci/scrapling-official is now working normally. If you still see a blank page, use the zip file from the agent-skill directory on GitHub as an alternative: https://github.com/D4Vinci/Scrapling/tree/main/agent-skill

Original Issue #290

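Troubleshooting

How do I persist cookies across MCP server sessions?

The MCP server does not support persistent browser profiles. Workaround: call the Python API directly and pass `user_data_dir='/path/to/profile'` to the session. Example: `async with AsyncStealthySession(headless=True, user_data_dir='/path/to/profile') as session: page = await session.fetch(url)` (the synchronous `StealthySession` takes the same arguments with a plain `with`). Then hand the script to the AI to run. Because every run reuses the same profile directory, the login state is preserved between calls.

Original Issue #269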
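
The workaround in the answer above can be sketched as a small standalone script. This is a minimal sketch, not the project's official recipe: the helper names `session_options` and `fetch_with_profile` and the default profile location are hypothetical, and it assumes the synchronous `StealthySession` context manager from `scrapling.fetchers` accepts the `headless` and `user_data_dir` arguments mentioned in the answer.

```python
from pathlib import Path

# Hypothetical profile location; any writable directory would do.
DEFAULT_PROFILE = Path.home() / ".scrapling-profile"


def session_options(profile_dir=DEFAULT_PROFILE, headless=True):
    """Build the keyword arguments for a session that reuses one profile.

    The directory is created up front so every run points at the same
    on-disk cookie and local-storage data.
    """
    profile_dir = Path(profile_dir).expanduser()
    profile_dir.mkdir(parents=True, exist_ok=True)
    return {"headless": headless, "user_data_dir": str(profile_dir)}


def fetch_with_profile(url, profile_dir=DEFAULT_PROFILE):
    """Fetch `url` in a browser session backed by the persistent profile."""
    # Imported lazily so the helper above works without Scrapling installed.
    # Assumption: StealthySession takes these arguments, as the answer states.
    from scrapling.fetchers import StealthySession

    with StealthySession(**session_options(profile_dir)) as session:
        return session.fetch(url)
```

Calling `fetch_with_profile(...)` once to log in and again in a later run should reuse the cookies stored in the profile directory, provided the site keeps the session valid. A script like this is what would be handed to the AI in place of the MCP server call.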


Similar projects

superpowers

everything-claude-code

flutter

langflow