GoScry

A server application written in Go that acts as a bridge between a controlling system and a web browser

Key Features

GoScry provides powerful capabilities for browser automation and control

  • Remote Browser Control: Uses CDP (via chromedp) to control headless or headed Chrome/Chromium instances
  • Task-Based API: Submit sequences of browser actions via a simple JSON API
  • Authentication Handling: Supports basic username/password login sequences within tasks
  • 2FA Support: Detects potential 2FA prompts and signals back via API/callback
  • DOM Extraction: Retrieve full HTML, text content, or a simplified version of the DOM
  • DOM AST: Generate a structured Abstract Syntax Tree representation of the DOM with optional scope control
  • Configurable: Manage server port, browser settings, logging, and security via YAML or environment variables

Architecture Diagram

How GoScry connects your systems with web browsers

sequenceDiagram
  participant Client as LLM System / API Client
  participant GS as GoScry Server (API)
  participant TM as Task Manager
  participant BM as Browser Manager
  participant DOM as DOM Processor
  participant CDP as Chrome (via CDP)
  participant Site as Target Website

  Client->>+GS: POST /api/v1/tasks (Task JSON)
  GS->>+TM: SubmitTask(task)
  TM-->>GS: TaskID
  GS-->>-Client: 202 Accepted (TaskID)
  Note right of TM: Task Execution starts async
  TM->>BM: ExecuteActions(actions)
  BM->>+CDP: Run CDP Commands (Navigate, Click, etc.)
  CDP->>+Site: HTTP Request
  Site-->>-CDP: HTTP Response (HTML, etc.)
  CDP-->>-BM: Action Result / DOM State
  BM-->>TM: Action Completed / Error
  
  alt DOM AST Retrieval
      Client->>+GS: POST /api/v1/dom/ast (URL, ParentSelector)
      GS->>+DOM: GetDomAST(URL, ParentSelector)
      DOM->>+CDP: ChromeDP Actions (Navigate, GetHTML)
      CDP->>+Site: HTTP Request
      Site-->>-CDP: HTTP Response (HTML)
      CDP-->>-DOM: HTML Content
      DOM-->>-GS: DOM AST Structure
      GS-->>-Client: 200 OK (AST JSON)
  end
  
  alt 2FA Required (e.g., after login action)
      TM->>BM: ExecuteAction(detect2FAPrompt)
      BM->>CDP: Check page state for 2FA indicators
      CDP->>BM: Presence result
      BM->>TM: Prompt detected/not detected
      opt Prompt Detected
          TM->>TM: Update Task Status (WaitingFor2FA)
          TM-->>Client: Notify Callback (MCP 2FA Request)
          Note over Client, TM: Client retrieves code externally
          Client->>+GS: POST /api/v1/tasks/{id}/2fa (Code)
          GS->>+TM: Provide2FACode(id, code)
          TM->>TM: Signal/Resume Task Execution
          Note right of TM: Next action types the 2FA code
          TM->>BM: ExecuteActions(type 2FA code, submit)
          BM->>+CDP: Type Code, Submit
          CDP->>+Site: Verify 2FA
          Site-->>-CDP: Login Success/Failure
          CDP-->>-BM: Result
          BM-->>TM: Action Completed
      end
  end
  TM->>TM: Process Final Result / Format MCP
  TM-->>Client: Notify Callback (MCP Result/Status)
  TM->>TM: Update Task Status (Completed/Failed)
  Note over TM: Task finished execution.
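
The 2FA branch of the diagram implies a second request from the client once it has obtained the code. A minimal Go sketch of building that request follows. The endpoint path comes from the diagram; the `code` JSON field name, the base URL, and the task ID are assumptions for illustration, not confirmed parts of the GoScry API.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
	"net/url"
)

// twoFARequest is a hypothetical request body; the real field name may differ.
type twoFARequest struct {
	Code string `json:"code"`
}

// new2FARequest builds the POST /api/v1/tasks/{id}/2fa request shown in the diagram.
// It only constructs the request; sending it is left to the caller.
func new2FARequest(baseURL, taskID, code string) (*http.Request, error) {
	body, err := json.Marshal(twoFARequest{Code: code})
	if err != nil {
		return nil, err
	}
	endpoint := baseURL + "/api/v1/tasks/" + url.PathEscape(taskID) + "/2fa"
	req, err := http.NewRequest(http.MethodPost, endpoint, bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", "application/json")
	return req, nil
}

func main() {
	// Placeholder values; a real task ID would come from the 202 Accepted response.
	req, err := new2FARequest("http://localhost:8080", "task-123", "123456")
	if err != nil {
		panic(err)
	}
	fmt.Println(req.Method, req.URL.String())
	// To send it: resp, err := http.DefaultClient.Do(req)
}
```

If the server also expects the API key header on this route, it can be set on the returned request before sending.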

Documentation

Everything you need to get started with GoScry

Prerequisites

  • Go: Version 1.21 or later.
  • Chrome / Chromium: A compatible version installed on the system where GoScry will run.

Installation Steps

1. Clone the repository:

git clone https://github.com/copyleftdev/goscry.git
cd goscry

2. Build the executable:

go build -o goscry ./cmd/goscry/

Action Types

The actions array in the task submission request defines the steps the browser performs, in order.

Action Types for GoScry Tasks
| Type | Description | selector Used | value Used | format Used |
|---|---|---|---|---|
| navigate | Navigates the browser to a URL. | No | URL string | No |
| wait_visible | Waits for an element matching the selector to become visible. | Yes | Optional duration (e.g., "5s", default "30s") | No |
| wait_hidden | Waits for an element matching the selector to become hidden. | Yes | Optional duration (e.g., "5s", default "30s") | No |
| wait_delay | Pauses execution for a specified duration. | No | Duration string (e.g., "2s", "500ms") | No |
| click | Waits for an element to be visible and clicks it. | Yes | No | No |
| type | Types text into an element. Use {{task.tfa_code}} for 2FA code injection. | Yes | Text string, or {{task.tfa_code}} | No |
| select | Selects an option within a <select> element by its value attribute. | Yes | Option value string | No |
| scroll | Scrolls the page (top, bottom) or an element into view. | If value is not top/bottom | top, bottom, or empty (uses selector) | No |
| screenshot | Captures a full-page screenshot. Result attached to task result. | No | Optional JPEG quality (0-100, default 90) | base64 (string) or png (bytes) |
| get_dom | Retrieves DOM content. Result attached to task result. | Optional (defaults to body) | No | full_html, simplified_html, text_content |
| run_script | Executes arbitrary JavaScript in the page context. Result attached. | No | JavaScript code string | No |
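
To make the table concrete, here is a rough Go sketch that models an action, checks a few of the rules above on the client side, and prints an `actions` array as JSON. The JSON field names (`type`, `selector`, `value`, `format`) are inferred from the column headings, and the use of Go's `time.ParseDuration` syntax for duration strings is an assumption; neither is confirmed by the GoScry source. The selectors and URL are placeholders.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"time"
)

// Action mirrors the table columns; the JSON field names are assumed.
type Action struct {
	Type     string `json:"type"`
	Selector string `json:"selector,omitempty"`
	Value    string `json:"value,omitempty"`
	Format   string `json:"format,omitempty"`
}

// needsSelector lists the types whose "selector Used" column is "Yes".
var needsSelector = map[string]bool{
	"wait_visible": true, "wait_hidden": true,
	"click": true, "type": true, "select": true,
}

// waitTimeout returns the optional duration, falling back to the documented 30s default.
func waitTimeout(value string) (time.Duration, error) {
	if value == "" {
		return 30 * time.Second, nil
	}
	return time.ParseDuration(value)
}

// validate applies a subset of the table's rules to one action.
func validate(a Action) error {
	if needsSelector[a.Type] && a.Selector == "" {
		return fmt.Errorf("%s: selector is required", a.Type)
	}
	switch a.Type {
	case "navigate":
		if a.Value == "" {
			return errors.New("navigate: value must be a URL")
		}
	case "wait_delay":
		if _, err := time.ParseDuration(a.Value); err != nil {
			return fmt.Errorf("wait_delay: %w", err)
		}
	case "wait_visible", "wait_hidden":
		if _, err := waitTimeout(a.Value); err != nil {
			return fmt.Errorf("%s: %w", a.Type, err)
		}
	}
	return nil
}

func main() {
	// A hypothetical login flow ending with a text dump of the page.
	actions := []Action{
		{Type: "navigate", Value: "https://example.com/login"},
		{Type: "wait_visible", Selector: "#username", Value: "5s"},
		{Type: "type", Selector: "#username", Value: "alice"},
		{Type: "click", Selector: "#submit"},
		{Type: "type", Selector: "#otp", Value: "{{task.tfa_code}}"},
		{Type: "get_dom", Format: "text_content"},
	}
	for _, a := range actions {
		if err := validate(a); err != nil {
			panic(err)
		}
	}
	out, err := json.MarshalIndent(map[string][]Action{"actions": actions}, "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```

Validating before submission is a convenience only; the server remains the authority on what a valid action is.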

DOM AST API

Extract structured DOM representations with the Abstract Syntax Tree API

Overview

The DOM AST API provides a structured representation of a webpage's Document Object Model (DOM) as an Abstract Syntax Tree. This enables:

  • Analyzing page structure programmatically
  • Extracting specific sections of a page with their hierarchical relationships intact
  • Performing targeted content extraction with scope control

Endpoint

POST /api/v1/dom/ast

Request Parameters

| Parameter | Type | Required | Description |
|---|---|---|---|
| url | string | Yes | The URL of the webpage to analyze |
| parent_selector | string | No | CSS selector to scope the AST to a specific element |
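
A Go client might model these two parameters with a small request type. The sketch below only builds the request and does not send it. The `X-API-Key` header name and the `localhost:8080` address are taken from the curl example on this page, and the function and type names are hypothetical.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
)

// domASTRequest carries the two documented request parameters.
type domASTRequest struct {
	URL            string `json:"url"`
	ParentSelector string `json:"parent_selector,omitempty"` // optional, omitted when empty
}

// newDomASTRequest builds a POST /api/v1/dom/ast request with a JSON body.
func newDomASTRequest(baseURL, apiKey, pageURL, parentSelector string) (*http.Request, error) {
	body, err := json.Marshal(domASTRequest{URL: pageURL, ParentSelector: parentSelector})
	if err != nil {
		return nil, err
	}
	req, err := http.NewRequest(http.MethodPost, baseURL+"/api/v1/dom/ast", bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", "application/json")
	req.Header.Set("X-API-Key", apiKey)
	return req, nil
}

func main() {
	req, err := newDomASTRequest("http://localhost:8080", "your-api-key", "https://example.com", "#main-content")
	if err != nil {
		panic(err)
	}
	fmt.Println(req.Method, req.URL.String())
	// To send it: resp, err := http.DefaultClient.Do(req)
}
```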

Response Structure

Each node in the DOM AST has the following structure:

{
  "nodeType": "element",        // "element", "text", "comment", or "document"
  "tagName": "div",             // HTML tag name (for element nodes)
  "id": "main",                 // ID attribute (if present)
  "classes": ["container"],     // Array of CSS classes (if present)
  "attributes": {               // Map of all attributes
    "id": "main",
    "class": "container"
  },
  "textContent": "",            // Text content (primarily for text nodes)
  "children": []                // Array of child nodes
}

Example Usage

Request:

curl -X POST http://localhost:8080/api/v1/dom/ast \
  -H "Content-Type: application/json" \
  -H "X-API-Key: your-api-key" \
  -d '{
    "url": "https://example.com",
    "parent_selector": "#main-content"
  }'

Response:

{
  "nodeType": "element", 
  "tagName": "div", 
  "attributes": {"id": "main-content"}, 
  "children": [
    {
      "nodeType": "element",
      "tagName": "h1",
      "attributes": {"class": "title"},
      "children": [],
      "textContent": "Welcome to Example.com"
    },
    {
      "nodeType": "element",
      "tagName": "p",
      "attributes": {},
      "children": [],
      "textContent": "This domain is for use in illustrative examples in documents."
    }
  ]
}
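
On the consuming side, the response can be decoded into a recursive Go type and walked. This is a sketch based on the documented node fields. It assumes attribute values are always strings, and the `Node`, `parse`, `walk`, and `textByTag` names are illustrative rather than part of GoScry.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Node follows the documented node structure of the DOM AST.
type Node struct {
	NodeType    string            `json:"nodeType"`
	TagName     string            `json:"tagName,omitempty"`
	ID          string            `json:"id,omitempty"`
	Classes     []string          `json:"classes,omitempty"`
	Attributes  map[string]string `json:"attributes,omitempty"` // assumes string values
	TextContent string            `json:"textContent,omitempty"`
	Children    []Node            `json:"children,omitempty"`
}

// parse decodes a DOM AST response body into a Node tree.
func parse(data []byte) (Node, error) {
	var root Node
	err := json.Unmarshal(data, &root)
	return root, err
}

// walk visits n and then every descendant, depth first, in document order.
func walk(n Node, visit func(Node)) {
	visit(n)
	for _, c := range n.Children {
		walk(c, visit)
	}
}

// textByTag returns the text content of every element node with the given tag name.
func textByTag(root Node, tag string) []string {
	var out []string
	walk(root, func(n Node) {
		if n.NodeType == "element" && n.TagName == tag {
			out = append(out, n.TextContent)
		}
	})
	return out
}

// sample is the example response from this page.
const sample = `{
  "nodeType": "element",
  "tagName": "div",
  "attributes": {"id": "main-content"},
  "children": [
    {"nodeType": "element", "tagName": "h1", "attributes": {"class": "title"},
     "children": [], "textContent": "Welcome to Example.com"},
    {"nodeType": "element", "tagName": "p", "attributes": {},
     "children": [], "textContent": "This domain is for use in illustrative examples in documents."}
  ]
}`

func main() {
	root, err := parse([]byte(sample))
	if err != nil {
		panic(err)
	}
	fmt.Println(textByTag(root, "h1")) // prints: [Welcome to Example.com]
}
```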

Implementation Notes

  • Uses ChromeDP's selector support for robust CSS selector matching
  • Waits 5 seconds for JavaScript-heavy pages to fully load before processing
  • Works with both simple sites and complex modern web applications
  • Handles a wide range of CSS selectors including tag, class, ID, and nested selectors

Package Structure

How GoScry is organized

  • cmd/goscry: Main application entry point
  • internal/taskstypes: Core data types shared across packages
  • internal/tasks: Task management and execution
  • internal/browser: Browser control and CDP interactions
  • internal/server: HTTP API handlers
  • internal/config: Configuration handling
  • internal/dom: DOM processing utilities
