
GoScry
A server application written in Go that acts as a bridge between a controlling system and a web browser
Key Features
GoScry provides powerful capabilities for browser automation and control
Architecture Diagram
How GoScry connects your systems with web browsers
sequenceDiagram
participant Client as LLM System / API Client
participant GS as GoScry Server (API)
participant TM as Task Manager
participant BM as Browser Manager
participant DOM as DOM Processor
participant CDP as Chrome (via CDP)
participant Site as Target Website
Client->>+GS: POST /api/v1/tasks (Task JSON)
GS->>+TM: SubmitTask(task)
TM-->>GS: TaskID
GS-->>-Client: 202 Accepted (TaskID)
Note right of TM: Task Execution starts async
TM->>BM: ExecuteActions(actions)
BM->>+CDP: Run CDP Commands (Navigate, Click, etc.)
CDP->>+Site: HTTP Request
Site-->>-CDP: HTTP Response (HTML, etc.)
CDP-->>-BM: Action Result / DOM State
BM-->>TM: Action Completed / Error
alt DOM AST Retrieval
Client->>+GS: POST /api/v1/dom/ast (URL, ParentSelector)
GS->>+DOM: GetDomAST(URL, ParentSelector)
DOM->>+CDP: ChromeDP Actions (Navigate, GetHTML)
CDP->>+Site: HTTP Request
Site-->>-CDP: HTTP Response (HTML)
CDP-->>-DOM: HTML Content
DOM-->>-GS: DOM AST Structure
GS-->>-Client: 200 OK (AST JSON)
end
alt 2FA Required (e.g., after login action)
TM->>BM: ExecuteAction(detect2FAPrompt)
BM->>CDP: Check page state for 2FA indicators
CDP->>BM: Presence result
BM->>TM: Prompt detected/not detected
opt Prompt Detected
TM->>TM: Update Task Status (WaitingFor2FA)
TM-->>Client: Notify Callback (MCP 2FA Request)
Note over Client, TM: Client retrieves code externally
Client->>+GS: POST /api/v1/tasks/{id}/2fa (Code)
GS->>+TM: Provide2FACode(id, code)
TM->>TM: Signal/Resume Task Execution
Note right of TM: Next action types the 2FA code
TM->>BM: ExecuteActions(type 2FA code, submit)
BM->>+CDP: Type Code, Submit
CDP->>+Site: Verify 2FA
Site-->>-CDP: Login Success/Failure
CDP-->>-BM: Result
BM-->>TM: Action Completed
end
end
TM->>TM: Process Final Result / Format MCP
TM-->>Client: Notify Callback (MCP Result/Status)
TM->>TM: Update Task Status (Completed/Failed)
Note over TM: Task finished execution.
Documentation
Everything you need to get started with GoScry
Prerequisites
- Go: Version 1.21 or later.
- Chrome / Chromium: A compatible version installed on the system where GoScry will run.
Installation Steps
1. Clone the repository:
git clone https://github.com/copyleftdev/goscry.git
cd goscry
2. Build the executable:
go build -o goscry ./cmd/goscry/
Action Types
The actions array in the submit request (POST /api/v1/tasks) defines the steps of a task. Each action type uses the fields listed below.
| Type | Description | selector Used | value Used | format Used |
|---|---|---|---|---|
| navigate | Navigates the browser to a URL. | No | URL string | No |
| wait_visible | Waits for an element matching the selector to become visible. | Yes | Optional duration (e.g., "5s", default "30s") | No |
| wait_hidden | Waits for an element matching the selector to become hidden. | Yes | Optional duration (e.g., "5s", default "30s") | No |
| wait_delay | Pauses execution for a specified duration. | No | Duration string (e.g., "2s", "500ms") | No |
| click | Waits for an element to be visible and clicks it. | Yes | No | No |
| type | Types text into an element. Use {{task.tfa_code}} for 2FA code injection. | Yes | Text string, or {{task.tfa_code}} | No |
| select | Selects an option within a <select> element by its value attribute. | Yes | Option value string | No |
| scroll | Scrolls the page (top, bottom) or an element into view. | If value is not top/bottom | top, bottom, or empty (uses selector) | No |
| screenshot | Captures a full-page screenshot. Result attached to task result. | No | Optional JPEG quality (0-100, default 90) | base64 (string) or png (bytes) |
| get_dom | Retrieves DOM content. Result attached to task result. | Optional (defaults to body) | No | full_html, simplified_html, text_content |
| run_script | Executes arbitrary JavaScript in the page context. Result attached. | No | JavaScript code string | No |
DOM AST API
Extract structured DOM representations with the Abstract Syntax Tree API
Overview
The DOM AST API provides a structured representation of a webpage's Document Object Model (DOM) as an Abstract Syntax Tree. This enables:
- Analyzing page structure programmatically
- Extracting specific sections of a page with their hierarchical relationships intact
- Performing targeted content extraction with scope control
Endpoint
POST /api/v1/dom/ast
Request Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| url | string | Yes | The URL of the webpage to analyze |
| parent_selector | string | No | CSS selector to scope the AST to a specific element |
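For callers working in Go, a request with these two parameters might be prepared as in the sketch below. It is only an outline under assumptions: the base URL and the X-API-Key header are taken from the curl example in this section, and the helper name newASTRequest and the struct astRequest are invented for illustration.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
)

// astRequest holds the two documented parameters of POST /api/v1/dom/ast.
type astRequest struct {
	URL            string `json:"url"`
	ParentSelector string `json:"parent_selector,omitempty"`
}

// newASTRequest prepares, but does not send, the HTTP request.
// baseURL and apiKey are placeholders supplied by the caller.
func newASTRequest(baseURL, apiKey, pageURL, parentSelector string) (*http.Request, error) {
	body, err := json.Marshal(astRequest{URL: pageURL, ParentSelector: parentSelector})
	if err != nil {
		return nil, err
	}
	req, err := http.NewRequest(http.MethodPost, baseURL+"/api/v1/dom/ast", bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", "application/json")
	req.Header.Set("X-API-Key", apiKey)
	return req, nil
}

func main() {
	req, err := newASTRequest("http://localhost:8080", "your-api-key",
		"https://example.com", "#main-content")
	if err != nil {
		panic(err)
	}
	fmt.Println(req.Method, req.URL.String())
	// To send it: resp, err := http.DefaultClient.Do(req)
}
```

Because parent_selector is optional, the sketch drops it from the body when the caller passes an empty string.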
Response Structure
Each node in the DOM AST has the following structure:
{
  "nodeType": "element",      // "element", "text", "comment", or "document"
  "tagName": "div",           // HTML tag name (for element nodes)
  "id": "main",               // ID attribute (if present)
  "classes": ["container"],   // Array of CSS classes (if present)
  "attributes": {             // Map of all attributes
    "id": "main",
    "class": "container"
  },
  "textContent": "",          // Text content (primarily for text nodes)
  "children": []              // Array of child nodes
}
Example Usage
Request:
curl -X POST http://localhost:8080/api/v1/dom/ast \
  -H "Content-Type: application/json" \
  -H "X-API-Key: your-api-key" \
  -d '{
    "url": "https://example.com",
    "parent_selector": "#main-content"
  }'
Response:
{
  "nodeType": "element",
  "tagName": "div",
  "attributes": {"id": "main-content"},
  "children": [
    {
      "nodeType": "element",
      "tagName": "h1",
      "attributes": {"class": "title"},
      "children": [],
      "textContent": "Welcome to Example.com"
    },
    {
      "nodeType": "element",
      "tagName": "p",
      "attributes": {},
      "children": [],
      "textContent": "This domain is for use in illustrative examples in documents."
    }
  ]
}
Implementation Notes
- Uses ChromeDP's selector support for robust CSS selector matching
- Waits 5 seconds for JavaScript-heavy pages to fully load before processing
- Works with both simple sites and complex modern web applications
- Handles a wide range of CSS selectors including tag, class, ID, and nested selectors
Package Structure
How GoScry is organized