monkey2025-coder
diff --git a/.gitignore b/.gitignore
diff --git a/.idea/vcs.xml b/.idea/vcs.xml
Lines changed: 3 additions & 1 deletion
diff --git a/AutoScraperX.egg-info/PKG-INFO b/AutoScraperX.egg-info/PKG-INFO
Lines changed: 110 additions & 0 deletions
diff --git a/AutoScraperX.egg-info/SOURCES.txt b/AutoScraperX.egg-info/SOURCES.txt
Lines changed: 11 additions & 0 deletions
diff --git a/AutoScraperX.egg-info/dependency_links.txt b/AutoScraperX.egg-info/dependency_links.txt
Lines changed: 1 addition & 0 deletions
diff --git a/AutoScraperX.egg-info/requires.txt b/AutoScraperX.egg-info/requires.txt
Lines changed: 3 additions & 0 deletions
diff --git a/AutoScraperX.egg-info/top_level.txt b/AutoScraperX.egg-info/top_level.txt
Lines changed: 1 addition & 0 deletions
diff --git a/README.md b/README.md
Lines changed: 95 additions & 0 deletions
diff --git a/build/lib/AutoScraperX/__init__.py b/build/lib/AutoScraperX/__init__.py
Lines changed: 4 additions & 0 deletions
@@ -0,0 +1,110 @@
+Metadata-Version: 2.1
+Name: AutoScraperX
+Version: 0.1.0
+Summary: A common spider tool based on Selenium
+Home-page: https://github.com/chenziying1/AutoScraperX
+Author: czy
+Author-email: 1060324818@qq.com
+Classifier: Programming Language :: Python :: 3
+Classifier: Operating System :: OS Independent
+Requires-Python: >=3.6
+Description-Content-Type: text/markdown
+Requires-Dist: selenium
+Requires-Dist: beautifulsoup4
+Requires-Dist: undetected-chromedriver
+
+# AutoScraperX
+
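+AutoScraperX is a general-purpose crawler framework built on Selenium and undetected_chromedriver that aims to provide powerful, flexible web automation. It supports automated browsing, element interaction, page screenshots, and cookie management, and suits a wide range of scraping tasks.
+
+## Features
+
+* **Multiple browser options** (headless mode, user data directory, custom Chrome location, and more)
+* **Mobile emulation** (iPhone X)
+* **Smart waiting**, which makes sure elements have loaded before acting on them
+* **Page screenshots**, including full-page captures
+* **Cookie reading and writing** for persistent logins
+* **Automatic scrolling, refreshing, tab switching, and similar operations**
+* **Exception handling** to keep the crawler running reliably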
+
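+## Installation
+
+Make sure the following dependencies are installed in your environment:
+
+```
+pip install selenium undetected-chromedriver beautifulsoup4
+```
+
+You also need to download and configure a matching WebDriver, such as ChromeDriver.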
+
+## Usage
+
+### Initialize the crawler
+
+```
+from AutoScraperX import Spider
+
+options = {
+    'headless': True,  # run in headless mode
+    'binary_location': "C:\\Path\\To\\chrome.exe",  # path to the Chrome binary
+    'user_data_dir': "C:\\Users\\User\\AppData\\Local\\Google\\Chrome\\User Data",
+    'driver_executable_path': "C:\\Path\\To\\chromedriver.exe"
+}
+
+spider = Spider(options)
+```
+
+### Open a page
+
+```
+spider.open("https://example.com")
+```
+
+### Get the page source
+
+```
+html = spider.get_source()
+print(html)
+```
+
+### Wait for an element to load
+
+```
+spider.wait_element("//div[@id='content']", by="xpath")  # "xpath" is the value of Selenium's By.XPATH
+```
+
+### Interact with the page
+
+```
+spider.comment("Test comment", "#comment-box")
+```
+
+### Save a screenshot
+
+```
+spider.save_screenshot("screenshot.png")
+```
+
+### Work with cookies
+
+```
+spider.save_cookie("cookies.pkl")
+spider.load_cookie("cookies.pkl", domain="example.com")
+```
+
+### Quit the crawler
+
+```
+spider.quit()
+```
+
+## Contributing
+
+If you have suggestions for improving AutoScraperX or would like to contribute, please open a PR or an issue.
+
+## Contact
+
+* Author: czy
+* Email: 1060324818@qq.com
+* Project: https://github.com/chenziying1/AutoScraperX
+* Example: https://github.com/chenziying1/AutoScraperX/test/test.py
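Note on the "smart waiting" feature described in the README above: this diff does not show how `Spider.wait_element` is implemented, so the following is only a minimal sketch of the general polling idea, written with the standard library. The names `wait_until` and `find_content` are hypothetical. In Selenium itself, the usual tool for this is `WebDriverWait` combined with `expected_conditions`.

```python
import time


def wait_until(condition, timeout=10.0, poll_interval=0.5):
    """Poll `condition` until it returns a truthy value or `timeout` seconds pass."""
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        time.sleep(poll_interval)


# Stand-in for a page whose element only appears on the third lookup.
attempts = {"count": 0}


def find_content():
    attempts["count"] += 1
    return "<div id='content'>" if attempts["count"] >= 3 else None


element = wait_until(find_content, timeout=2.0, poll_interval=0.01)
print(element, attempts["count"])
```

Returning the condition's result lets the caller use the located element directly, and raising on timeout keeps a missing element from failing silently.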
@@ -0,0 +1,11 @@
+README.md
+setup.py
+AutoScraperX/__init__.py
+AutoScraperX/common_spider.py
+AutoScraperX/utils.py
+AutoScraperX.egg-info/PKG-INFO
+AutoScraperX.egg-info/SOURCES.txt
+AutoScraperX.egg-info/dependency_links.txt
+AutoScraperX.egg-info/requires.txt
+AutoScraperX.egg-info/top_level.txt
+test/test.py
@@ -0,0 +1 @@
+
@@ -0,0 +1,3 @@
+selenium
+beautifulsoup4
+undetected-chromedriver
@@ -0,0 +1 @@
+AutoScraperX
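Note on the egg-info files above: `PKG-INFO` uses an email-header style layout (header fields, a blank line, then the long description), so it can be read with the standard library's `email.parser`. The sketch below parses a shortened copy of the metadata from this diff; the `PKG_INFO` string is an abbreviated stand-in, not the full file.

```python
from email.parser import Parser

# Abbreviated copy of the PKG-INFO content shown in this diff.
PKG_INFO = """\
Metadata-Version: 2.1
Name: AutoScraperX
Version: 0.1.0
Summary: A common spider tool based on Selenium
Requires-Python: >=3.6
Description-Content-Type: text/markdown
Requires-Dist: selenium
Requires-Dist: beautifulsoup4
Requires-Dist: undetected-chromedriver

# AutoScraperX
"""

message = Parser().parsestr(PKG_INFO)
name = message["Name"]
version = message["Version"]
# Repeated fields such as Requires-Dist are collected with get_all().
requires = message.get_all("Requires-Dist")
# Everything after the first blank line is the long description (the README).
description = message.get_payload()
print(name, version, requires)
```

The three `Requires-Dist` entries match the three lines of `requires.txt`, which is a quick way to check that the generated metadata and the declared dependencies agree.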
@@ -0,0 +1,95 @@
+# AutoScraperX
+
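+AutoScraperX is a general-purpose crawler framework built on Selenium and undetected_chromedriver that aims to provide powerful, flexible web automation. It supports automated browsing, element interaction, page screenshots, and cookie management, and suits a wide range of scraping tasks.
+
+## Features
+
+* **Multiple browser options** (headless mode, user data directory, custom Chrome location, and more)
+* **Mobile emulation** (iPhone X)
+* **Smart waiting**, which makes sure elements have loaded before acting on them
+* **Page screenshots**, including full-page captures
+* **Cookie reading and writing** for persistent logins
+* **Automatic scrolling, refreshing, tab switching, and similar operations**
+* **Exception handling** to keep the crawler running reliably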
+
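+## Installation
+
+Make sure the following dependencies are installed in your environment:
+
+```
+pip install selenium undetected-chromedriver beautifulsoup4
+```
+
+You also need to download and configure a matching WebDriver, such as ChromeDriver.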
+
+## Usage
+
+### Initialize the crawler
+
+```
+from AutoScraperX import Spider
+
+options = {
+    'headless': True,  # run in headless mode
+    'binary_location': "C:\\Path\\To\\chrome.exe",  # path to the Chrome binary
+    'user_data_dir': "C:\\Users\\User\\AppData\\Local\\Google\\Chrome\\User Data",
+    'driver_executable_path': "C:\\Path\\To\\chromedriver.exe"
+}
+
+spider = Spider(options)
+```
+
+### Open a page
+
+```
+spider.open("https://example.com")
+```
+
+### Get the page source
+
+```
+html = spider.get_source()
+print(html)
+```
+
+### Wait for an element to load
+
+```
+spider.wait_element("//div[@id='content']", by="xpath")  # "xpath" is the value of Selenium's By.XPATH
+```
+
+### Interact with the page
+
+```
+spider.comment("Test comment", "#comment-box")
+```
+
+### Save a screenshot
+
+```
+spider.save_screenshot("screenshot.png")
+```
+
+### Work with cookies
+
+```
+spider.save_cookie("cookies.pkl")
+spider.load_cookie("cookies.pkl", domain="example.com")
+```
+
+### Quit the crawler
+
+```
+spider.quit()
+```
+
+## Contributing
+
+If you have suggestions for improving AutoScraperX or would like to contribute, please open a PR or an issue.
+
+## Contact
+
+* Author: czy
+* Email: 1060324818@qq.com
+* Project: https://github.com/chenziying1/AutoScraperX
+* Example: https://github.com/chenziying1/AutoScraperX/test/test.py
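Note on the cookie section of the README above: the `.pkl` file name suggests pickle-based storage, but `common_spider.py` is not part of this diff, so the following is only a sketch of how cookie persistence could work under that assumption. `save_cookies`, `load_cookies`, and `FakeDriver` are hypothetical names; `get_cookies()` and `add_cookie()` are the actual Selenium WebDriver methods the stand-in mimics.

```python
import os
import pickle
import tempfile


def save_cookies(driver, path):
    """Write the driver's current cookies to `path` with pickle."""
    with open(path, "wb") as fh:
        pickle.dump(driver.get_cookies(), fh)


def load_cookies(driver, path, domain=None):
    """Add saved cookies to the driver, optionally only those matching `domain`."""
    with open(path, "rb") as fh:
        cookies = pickle.load(fh)
    loaded = 0
    for cookie in cookies:
        if domain and not cookie.get("domain", "").endswith(domain):
            continue  # skip cookies that belong to other sites
        driver.add_cookie(cookie)
        loaded += 1
    return loaded


class FakeDriver:
    """Stand-in exposing only the two WebDriver methods used above."""

    def __init__(self, cookies=None):
        self.cookies = list(cookies or [])

    def get_cookies(self):
        return list(self.cookies)

    def add_cookie(self, cookie):
        self.cookies.append(cookie)


source = FakeDriver([
    {"name": "session", "value": "abc", "domain": ".example.com"},
    {"name": "tracker", "value": "xyz", "domain": "ads.invalid"},
])
target = FakeDriver()
path = os.path.join(tempfile.mkdtemp(), "cookies.pkl")
save_cookies(source, path)
count = load_cookies(target, path, domain="example.com")
print(count, [c["name"] for c in target.cookies])
```

With a real browser, Selenium expects the page for the cookie's domain to be open before `add_cookie` is called. Pickle files should only be loaded from trusted sources, since unpickling can execute arbitrary code.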
@@ -0,0 +1,4 @@
+from .common_spider import Spider
+from .utils import rm_space, random_id
+
+__all__ = ["Spider", "rm_space", "random_id"]
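Note on `__all__` in the `__init__.py` above: it lists the names that `from AutoScraperX import *` would bring in. The sketch below shows that behavior on a throwaway stand-in module; `demo_pkg`, `alpha`, `beta`, and `gamma` are hypothetical names, not part of AutoScraperX.

```python
import sys
import types

# Throwaway stand-in module; the names below are illustrative only.
demo = types.ModuleType("demo_pkg")
exec(
    "def alpha(): return 'a'\n"
    "def beta(): return 'b'\n"
    "def gamma(): return 'c'\n"
    "__all__ = ['alpha', 'beta']\n",
    demo.__dict__,
)
sys.modules["demo_pkg"] = demo

# A star import copies only the names listed in __all__.
namespace = {}
exec("from demo_pkg import *", namespace)
star_names = sorted(n for n in namespace if not n.startswith("__"))

# Names left out of __all__ are still reachable by explicit attribute access.
print(star_names, demo.gamma())
```

Explicit imports such as `from AutoScraperX import Spider` do not depend on `__all__`; the list only governs star imports and documents the intended public surface.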