OpenSource-Hub

page-agent

Library

alibaba/page-agent

JavaScript library for controlling web UIs with natural language.

Overview

Page Agent is an in-page GUI agent that manipulates web interfaces through text-based DOM operations. It does not require screenshots, multi-modal models, or browser extensions, and supports custom LLMs. Easily integrate it into any webpage with a script tag or npm package.

README Preview

# Page Agent\n\n\n  \n  \n\n\n[](https://opensource.org/licenses/MIT) [](http://www.typescriptlang.org/) [](https://bundlephobia.com/package/page-agent) [](https://www.npmjs.com/package/page-agent) [](https://github.com/alibaba/page-agent)\n\nThe GUI Agent Living in Your Webpage. Control web interfaces with natural language.\n\n🌐 **English** | [中文](./docs/README-zh.md)\n\n🚀 Demo | 📖 Docs | 📢 HN Discussion | 𝕏 Follow on X\n\n\n\nhttps://github.com/user-attachments/assets/a1f2eae2-13fb-4aae-98cf-a3fc1620a6c2\n\n---\n\n## ✨ Features\n\n- **🎯 Easy integration**\n    - No need for `browser extension` / `python` / `headless browser`.\n    - Just in-page javascript. Everything happens in your web page.\n- **📖 Text-based DOM manipulation**\n    - No screenshots. No multi-modal LLMs or special permissions needed.\n- **🧠 Bring your own LLMs**\n- **🐙 Optional [chrome extension](https://alibaba.github.io/page-agent/docs/features/chrome-extension) for multi-page tasks.**\n    - And an [MCP Server (Beta)](https://alibaba.github.io/page-agent/docs/features/mcp-server) to control it from outside\n\n## 💡 Use Cases\n\n- **SaaS AI Copilot** — Ship an AI copilot in your product in lines of code. No backend rewrite.\n- **Smart Form Filling** — Turn 20-click workflows into one sentence. Perfect for ERP, CRM, and admin systems.\n- **Accessibility** — Make any web app accessible through natural language. Voice commands, screen readers, zero barrier.\n- **Multi-page Agent** — Extend your own web agent's reach across browser tabs [chrome extension](https://alibaba.github.io/page-agent/docs/features/chrome-extension).\n- **MCP** - Allow your agent clients to control your browser.\n\n## 🚀 Quick Start\n\n### One-line integration\n\nFastest way to try PageAgent with our free Demo LLM:\n\n```html\n\n```\n\n> **⚠️ For technical evaluation only.** This demo CDN uses our free [testing LLM API](https://alibaba.github.io/page-agent/docs/features/models#free-testing-api). By using it, you ag