Steel + OpenAI Computer Use Assistant in Python

This example shows how to integrate Steel with OpenAI's Computer Use Assistant (CUA) API to create a browser automation agent. The assistant sees the browser through Steel's cloud sessions, analyzes the screen, and performs actions like clicking, typing, and navigating.

Prerequisites

A Steel API key (create one from your Steel account)
An OpenAI API key with access to the Computer Use Assistant preview

Installation

Clone this repository and navigate to the project directory:

git clone https://github.com/steel-dev/steel-cookbook
cd steel-cookbook/examples/steel-oai-computer-use-python-starter

# Create and activate virtual environment (recommended)
python -m venv venv
source venv/bin/activate  # On Windows use: venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt

Setup

Create a .env file in the project directory by copying the example:

cp .env.example .env

Edit the .env file and add your API keys:

STEEL_API_KEY=your_steel_api_key_here
OPENAI_API_KEY=your_openai_api_key_here
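
Before running anything, it can help to confirm that both keys are actually set and are no longer the placeholder values. The following is a minimal sketch that uses only the standard library; `parse_env_text` and `missing_keys` are hypothetical helper names, not part of this example's code, and the real script may load the file differently (for instance with a dotenv library).

```python
REQUIRED_KEYS = ("STEEL_API_KEY", "OPENAI_API_KEY")


def parse_env_text(text):
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(values, required=REQUIRED_KEYS):
    """Return required keys that are absent, empty, or still placeholders."""
    return [
        key for key in required
        if not values.get(key) or values[key].startswith("your_")
    ]
```

For example, `missing_keys(parse_env_text(open(".env").read()))` returns an empty list once both keys have been filled in.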

How it Works

This example demonstrates:

Creating a Steel browser session - Launches a remote browser in the cloud
Connecting with Playwright - Establishes a direct connection to control the browser
Integrating with OpenAI's Computer Use Assistant - Sends screenshots to OpenAI and receives actions to execute
Action execution - Translates OpenAI's commands into browser actions (click, type, scroll, etc.)
Continuous interaction loop - Maintains a cycle of screenshots and actions until the task is complete
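
The cycle described above can be sketched without any network access by swapping the remote browser and the model for stand-ins. Everything here (`FakeBrowser`, `fake_model`, `run_task`, and the shape of the reply dictionaries) is hypothetical and only illustrates the control flow; it is not the code in main.py.

```python
from dataclasses import dataclass, field


@dataclass
class FakeBrowser:
    """Stand-in for a remote browser: records actions instead of performing them."""
    log: list = field(default_factory=list)

    def screenshot(self):
        # A real browser would return PNG bytes of the current page.
        return f"screenshot-{len(self.log)}".encode()

    def execute(self, action):
        self.log.append(action)


def fake_model(task, screenshot, step):
    """Stand-in for the model call: two scripted actions, then a final answer."""
    scripted = [
        {"type": "click", "x": 200, "y": 120},
        {"type": "type", "text": task},
    ]
    if step < len(scripted):
        return {"action": scripted[step]}
    return {"message": f"Done: {task}"}


def run_task(task, browser, model, max_steps=20):
    """Screenshot, ask the model, execute; repeat until the model returns text."""
    for step in range(max_steps):
        reply = model(task, browser.screenshot(), step)
        if "message" in reply:
            return reply["message"]
        browser.execute(reply["action"])
    return "Stopped: step limit reached"
```

The `max_steps` cap is a design choice worth keeping in a real agent too, since a model that never declares the task finished would otherwise loop indefinitely.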

Running the Example

Execute the main script:

python main.py

You'll be prompted to enter a task for the assistant to perform. Examples:

"Search for Steel browser on Bing and tell me about it"
"Find today's weather for New York City"
"Go to Wikipedia and find information about machine learning"

The script will:

Create a Steel session (you'll see a URL where you can watch the session live)
Send the initial screenshot to OpenAI
Execute the commands received from OpenAI
Send updated screenshots after each action
Continue this loop until the task is complete

Key Components

SteelBrowser Class

A wrapper around the Steel session and Playwright browser that provides methods for:

Creating and managing a browser session
Taking screenshots
Executing various browser actions (click, type, scroll, etc.)
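
As an illustration of the action-execution part, here is a hedged sketch of a dispatcher that maps action dictionaries onto Playwright's `page.mouse` and `page.keyboard` calls. The function name `perform_action`, the action field names, and the small `KEY_MAP` are assumptions for this sketch rather than the actual SteelBrowser implementation. The `Recorder` class is a tiny stand-in that logs calls so the dispatcher can be exercised without a browser.

```python
# Hypothetical mapping from model key names to Playwright key names.
KEY_MAP = {"ENTER": "Enter", "CTRL": "Control", "ALT": "Alt",
           "SHIFT": "Shift", "ESC": "Escape", "TAB": "Tab"}


def perform_action(page, action):
    """Translate one action dict into Playwright-style page calls."""
    kind = action["type"]
    if kind == "click":
        button = action.get("button", "left")
        if button not in ("left", "right", "middle"):
            button = "left"  # fall back for buttons Playwright does not accept
        page.mouse.click(action["x"], action["y"], button=button)
    elif kind == "double_click":
        page.mouse.dblclick(action["x"], action["y"])
    elif kind == "type":
        page.keyboard.type(action["text"])
    elif kind == "keypress":
        # Press the keys as one chord, e.g. "Control+a".
        chord = "+".join(KEY_MAP.get(k.upper(), k) for k in action["keys"])
        page.keyboard.press(chord)
    elif kind == "scroll":
        page.mouse.move(action["x"], action["y"])
        page.mouse.wheel(action.get("scroll_x", 0), action.get("scroll_y", 0))
    elif kind == "wait":
        page.wait_for_timeout(action.get("ms", 1000))
    else:
        raise ValueError(f"Unsupported action type: {kind}")


class Recorder:
    """Test double: records attribute paths and call arguments."""

    def __init__(self, calls, path=""):
        self._calls = calls
        self._path = path

    def __getattr__(self, name):
        return Recorder(self._calls, f"{self._path}.{name}".lstrip("."))

    def __call__(self, *args, **kwargs):
        self._calls.append((self._path, args, kwargs))
```

With a real Playwright `page` object in place of `Recorder`, the same function would drive the browser. Raising on unknown action types keeps unsupported commands from being silently ignored.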

OpenAI Integration

The script connects to OpenAI's Computer Use Assistant API to:

Send browser screenshots
Receive actions to execute
Process text responses from the assistant
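
The three steps above can be sketched as plain data handling, with no API call involved. The field names below (`computer_use_preview`, `computer_call`, `computer_call_output`, `input_image`, `output_text`) follow OpenAI's computer-use documentation as best I recall it and should be checked against the current API reference. The helper names and the default 1024x768 display size are assumptions for this sketch.

```python
import base64


def computer_tool(width=1024, height=768):
    """Tool description telling the model what kind of screen it controls."""
    return {
        "type": "computer_use_preview",
        "display_width": width,
        "display_height": height,
        "environment": "browser",
    }


def screenshot_reply(call_id, png_bytes):
    """Wrap a screenshot as the result of a previously requested action."""
    encoded = base64.b64encode(png_bytes).decode("ascii")
    return {
        "type": "computer_call_output",
        "call_id": call_id,
        "output": {
            "type": "input_image",
            "image_url": f"data:image/png;base64,{encoded}",
        },
    }


def split_output(items):
    """Separate requested actions from assistant text in a list of output items."""
    actions = [item for item in items if item.get("type") == "computer_call"]
    texts = [
        part["text"]
        for item in items if item.get("type") == "message"
        for part in item.get("content", [])
        if part.get("type") == "output_text"
    ]
    return actions, texts
```

In a loop, each item in `actions` would be executed in the browser and answered with `screenshot_reply(...)`, while anything in `texts` is shown to the user.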

Customization

You can modify the example to:

Change the initial URL (currently Bing.com)
Adjust the browser dimensions
Add more action types
Implement additional error handling
Customize the UI/UX of the interaction
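
One way to make the first two customizations easier is to collect the start URL and viewport size in a single settings object. This is a sketch only: `AgentSettings`, `settings_from_env`, the environment variable names `START_URL`, `VIEWPORT_WIDTH` and `VIEWPORT_HEIGHT`, and the 1024x768 default are all hypothetical and do not come from this example's code.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentSettings:
    start_url: str = "https://www.bing.com"  # the README names Bing as the current start page
    width: int = 1024    # assumed default, not taken from the example
    height: int = 768    # assumed default, not taken from the example


def settings_from_env(env=None):
    """Build settings from environment variables, falling back to defaults."""
    env = os.environ if env is None else env
    defaults = AgentSettings()
    return AgentSettings(
        start_url=env.get("START_URL", defaults.start_url),
        width=int(env.get("VIEWPORT_WIDTH", defaults.width)),
        height=int(env.get("VIEWPORT_HEIGHT", defaults.height)),
    )
```

Whatever dimensions are chosen, the size reported to the model should match the real browser viewport, otherwise click coordinates will not line up with the screenshots.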
