Writing BrowserQL
Learning a new language can be intimidating, but mastering BQL doesn't have to be difficult. This guide provides a straightforward overview of how to write BQL.
What You Won't Write
BrowserQL is optimized for web automation and scraping, designed to minimize complexity by making intelligent assumptions. Here’s what it does for you:
- Waits for selectors before interacting with elements.
- Handles mouse movements and clicks automatically.
- Ensures elements are visible by scrolling as needed.
- Manages page-load events, like waiting for firstContentfulPaint.
Instead of worrying about these technical details, you focus on queries and actions.
What You’ll Write
BQL is a query language where you:
- Navigate to pages.
- Perform actions (e.g., click, type).
- Extract data (e.g., text, HTML).
With just a handful of commands (goto, text, html, click, type, and more), you can achieve complex automation in a fraction of the usual code.
For example, the script below shows how little it takes to automate a login to Cloudflare, in 30 lines:
mutation Cloudflare {
  goto(
    url: "https://dash.cloudflare.com/login?lang=en-gb"
    waitUntil: firstContentfulPaint
  ) {
    status
  }
  acceptCookies: click(
    selector: "#onetrust-accept-btn-handler"
  ) {
    time
  }
  typeEmail: type(
    selector: "form [data-testid='login-input-email']"
    text: "test@browserless.io"
  ) {
    selector
  }
  typePassword: type(
    selector: "form [data-testid='login-input-password']"
    text: "super-cool-password"
  ) {
    selector
  }
  clickCaptcha: verify(
    type: cloudflare
  ) {
    solved
  }
}
Creating a BQL Script
The steps below walk through creating a BQL script: navigating to a page, retrieving data, performing actions, and, finally, generating an endpoint for connecting external libraries.
Navigating to a Page
Every script starts with a mutation, specifying actions and the responses you expect:
mutation ExampleName {
  goto(
    url: "https://example.com"
    waitUntil: firstMeaningfulPaint
  ) {
    status
    time
  }
}
This standard format is composed of the following:
- Action: goto specifies the page to navigate to.
- Arguments: Provide the url and a waitUntil condition.
- Response: Request useful outputs, like status or time.
You can find detailed information on all mutations, their arguments, and responses in the Mutations Reference page and also in the Built-in Documentation in our IDE.
Retrieving Data
Extracting information is just as simple. Use the text or html command, depending on the format you need:
- Text
- HTML
mutation ExampleName {
  # ... (previous steps omitted)
  productName: text(
    selector: "span#productTitle"
    visible: true
  ) {
    text
  }
}
Where:
- Alias: productName is the name for this interaction. You can define names for each interaction in the script.
- Action: text extracts visible text.
- Arguments: Include a selector and optional conditions like visible.
- Output: The desired response (e.g., text content).
Example JSON response:
"productName": {
"text": "Coffee and Espresso Maker"
}
Omit the selector to retrieve the entire page’s content.
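As a minimal sketch of the selector-less form, the mutation below navigates to a page and reads all of its text. The mutation name WholePageText, the alias pageText, and the example.com URL are illustrative choices, not names taken from the BrowserQL reference.

```graphql
mutation WholePageText {
  goto(url: "https://example.com", waitUntil: firstContentfulPaint) {
    status
  }
  # No selector argument is passed, so text applies to the whole page.
  pageText: text {
    text
  }
}
```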
mutation ExampleName {
  # ... (previous steps omitted)
  elementHTML: html(
    selector: "div"
    visible: true
  ) {
    html
  }
}
Where:
- Alias: elementHTML is the name for this interaction. You can define names for each interaction in the script.
- Action: html extracts the element's HTML content.
- Arguments: Include a selector and optional conditions like visible.
- Output: The desired response (e.g., HTML content).
Example JSON response:
"elementHTML": {
"html": "\n <h1 xmlns=\"http://www.w3.org/1999/xhtml\">Example Domain</h1>\n <p xmlns=\"http://www.w3.org/1999/xhtml\">This domain is for use in illustrative examples in documents. You may use this\n domain in literature without prior coordination or asking for permission.</p>\n <p xmlns=\"http://www.w3.org/1999/xhtml\"><a href=\"https://www.iana.org/domains/example\">More information...</a></p>\n"
}
Omit the selector to retrieve the entire page’s HTML content.
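The two forms can be combined in a single script. The sketch below requests one element's HTML and the full page's HTML side by side. Because the html action appears twice, each call needs its own alias: GraphQL does not allow the same field to be requested twice with different arguments unless the requests are aliased. The aliases heading and fullPage, the h1 selector, and the URL are hypothetical.

```graphql
mutation ElementAndPageHTML {
  goto(url: "https://example.com", waitUntil: firstContentfulPaint) {
    status
  }
  # HTML of a single element, matched by selector.
  heading: html(selector: "h1", visible: true) {
    html
  }
  # No selector: HTML of the entire page.
  fullPage: html {
    html
  }
}
```

Each alias becomes a key in the JSON response, so the two results can be told apart.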
Performing Actions
Clicking, typing, and scrolling are also built-in actions, and they follow the same syntax as the commands above.
To click on a button, just name the action, specify a selector and ask for a response:
acceptCookies: click(
  selector: "#onetrust-accept-btn-handler"
  visible: true
) {
  time
}
To help get past bot checks such as CAPTCHAs, BQL performs actions in a humanized way, such as moving the mouse to a selector and randomizing typing delays. By default, BrowserQL types one character at a time, with a random pause between keystrokes, similar to a real user.
If you wish to change this delay, you can specify a min and max delay. Below, typing delays are randomized between 10 and 50 milliseconds, mimicking natural input:
teapotTyping: type(
  text: "I'm a little teapot!"
  selector: "form textarea"
  delay: [10, 50]
) {
  time
}
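The individual actions above can be chained in one mutation. GraphQL runs the top-level fields of a mutation one after another, in the order they are written, so the sketch below types into a field and only then clicks. The aliases, the URL, and the submit-button selector are hypothetical and need adjusting to the real page.

```graphql
mutation FillAndSubmit {
  goto(url: "https://example.com", waitUntil: firstContentfulPaint) {
    status
  }
  # Type with a randomized 10-50 ms pause between keystrokes.
  typeMessage: type(
    selector: "form textarea"
    text: "I'm a little teapot!"
    delay: [10, 50]
  ) {
    time
  }
  # Hypothetical submit button; replace with the page's actual selector.
  submitForm: click(
    selector: "form button[type='submit']"
    visible: true
  ) {
    time
  }
}
```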
Handling CAPTCHA
If you know that a page will present a CAPTCHA, such as on a login or form submission, you can use the verify mutation. It clicks on the CAPTCHA even if it is hidden away inside iframes or shadow DOMs. Just specify the CAPTCHA type (e.g., hcaptcha or cloudflare), and BQL takes care of the rest:
verifyCaptcha: verify(type: hcaptcha) {
  time
  found
  solved
}
Connecting Libraries with Endpoints
You might also want to connect other libraries, like Puppeteer or Playwright, to these browsers once they’ve got past the bot detectors. You can create an endpoint with the reconnect action and use it to connect to the browser:
reconnect(timeout: 30000) {
  browserWSEndpoint
}
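Put together, a script that clears a check and then hands the browser over might look like the sketch below. It uses only the goto, verify, and reconnect actions shown in this guide. The mutation name and the example.com URL are placeholders, and whether a given site needs the cloudflare or hcaptcha type is an assumption to verify against the target page.

```graphql
mutation VerifyThenReconnect {
  goto(url: "https://example.com", waitUntil: firstContentfulPaint) {
    status
  }
  # Ask BQL to find and solve the challenge before handing over.
  verify(type: cloudflare) {
    found
    solved
  }
  # Request an endpoint that an external library can attach to.
  reconnect(timeout: 30000) {
    browserWSEndpoint
  }
}
```

The browserWSEndpoint value in the response is what you pass to the external library. Puppeteer, for instance, accepts it through the browserWSEndpoint option of puppeteer.connect.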
Next Steps
BrowserQL simplifies web automation with intuitive commands and a structure that’s easy to learn. Start by focusing on these three areas: