
Mapping

The mapping feature provides a flexible way to extract structured data from web pages by specifying DOM selectors or JavaScript snippets. This guide covers advanced tips, best practices, and less obvious techniques to help you get the most out of it.

The mapping feature allows you to:

  • Select multiple DOM nodes simultaneously.
  • Retrieve structured data (text, HTML, or attributes).
  • Map nested selectors to capture hierarchical DOM structures.
  • Assign aliases to make your returned JSON more meaningful.

Tip: Use specific CSS selectors so the result does not include unnecessary data.

Creating JSON with mapSelector

The mapSelector function is an alternative to parsing HTML yourself. It works much like calling map, in the functional-programming sense, over the NodeList returned by document.querySelectorAll: each element that matches the selector produces one entry in the result.

You can extract DOM attributes with the attribute(name: "data-custom-attribute") field, which returns an object with name and value properties.

The query below demonstrates how to:

  1. Navigate to https://news.ycombinator.com.
  2. Create a map named posts to extract all .submission .titleline > a elements.
  3. Return an array of objects, each containing the href attribute as structured JSON.

mutation scraping_example {
  goto(
    url: "https://news.ycombinator.com",
    waitUntil: firstContentfulPaint
  ) {
    status
  }

  posts: mapSelector(selector: ".submission .titleline > a", wait: true) {
    link: attribute(name: "href") {
      value
    }
  }
}
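
Once the query returns, the posts alias should hold one object per matched element. The TypeScript sketch below shows one way a client could flatten that result into a plain list of URLs. The response shape, the nullable value, and the names PostLink, extractLinks, and samplePosts are assumptions made for illustration, and the sample data is invented rather than real output.

```typescript
// Assumed shape of one attribute result when only `value` is requested.
interface AttributeValue {
  value: string | null;
}

// Assumed shape of one entry under the `posts` alias.
interface PostLink {
  link: AttributeValue;
}

// Collect the non-empty href values into a flat list of URLs.
function extractLinks(posts: PostLink[]): string[] {
  return posts
    .map((post) => post.link.value)
    .filter((href): href is string => typeof href === "string" && href.length > 0);
}

// Hypothetical sample data, not real output from the site.
const samplePosts: PostLink[] = [
  { link: { value: "https://example.com/a" } },
  { link: { value: null } },
  { link: { value: "https://example.com/b" } },
];

console.log(extractLinks(samplePosts));
```

Filtering out missing values in one place keeps the rest of the client code free of null checks.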

Smart Use of Aliases

Aliases rename fields in the response, so the returned JSON uses names that describe your data instead of the names of the fields that produced it.

Example:

mutation ProductDetails {
  goto(url: "https://example.com/products") {
    status
  }

  products: mapSelector(selector: ".product-item") {
    title: mapSelector(selector: ".product-title") { innerText }
    price: mapSelector(selector: ".product-price") { innerText }
  }
}
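
Because title and price in the query above are themselves mapSelector calls, each one presumably comes back as an array of matches rather than a single string. The sketch below unwraps the first match of each into a flat record. The response shape and the names ProductResult, FlatProduct, firstText, and flattenProducts are assumptions, and the sample data is hypothetical.

```typescript
// Assumed shape: each nested mapSelector alias yields an array of matches.
interface InnerTextResult {
  innerText: string;
}

interface ProductResult {
  title: InnerTextResult[];
  price: InnerTextResult[];
}

interface FlatProduct {
  title: string;
  price: string;
}

// Take the first match's trimmed text, or an empty string when nothing matched.
function firstText(matches: InnerTextResult[]): string {
  return matches[0]?.innerText.trim() ?? "";
}

function flattenProducts(products: ProductResult[]): FlatProduct[] {
  return products.map((product) => ({
    title: firstText(product.title),
    price: firstText(product.price),
  }));
}

// Hypothetical sample data.
const sampleProducts: ProductResult[] = [
  { title: [{ innerText: " Widget " }], price: [{ innerText: "$9.99" }] },
  { title: [{ innerText: "Gadget" }], price: [] },
];

console.log(flattenProducts(sampleProducts));
```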

Handling Arbitrary DOM Attributes

Retrieve any attribute by name, including custom data-* attributes:

Example:

mutation CustomAttributes {
  goto(url: "https://example.com") {
    status
  }

  items: mapSelector(selector: "[data-item-id]") {
    id: attribute(name: "data-item-id") {
      name
      value
    }
  }
}
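
Requesting both name and value makes it possible to re-key a mapped element by its real attribute names instead of by alias. The sketch below is one way to do that on the client. The nullable value and the names NamedAttribute, MappedItem, and toAttributeRecord are assumptions, and the second attribute in the sample data is hypothetical.

```typescript
// Assumed shape of an attribute result with both fields requested.
interface NamedAttribute {
  name: string;
  value: string | null;
}

// One mapped element: alias -> attribute result.
type MappedItem = Record<string, NamedAttribute>;

// Build an attribute-name -> value record, skipping attributes with no value.
function toAttributeRecord(item: MappedItem): Record<string, string> {
  const record: Record<string, string> = {};
  for (const attr of Object.values(item)) {
    if (attr.value !== null) {
      record[attr.name] = attr.value;
    }
  }
  return record;
}

// Hypothetical sample data.
const sampleItem: MappedItem = {
  id: { name: "data-item-id", value: "42" },
  color: { name: "data-color", value: null },
};

console.log(toAttributeRecord(sampleItem));
```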

Nested Mapping for Hierarchical Data

Use nested mappings for hierarchical data. A nested mapSelector sits inside the entry for its parent match, so the returned JSON mirrors the DOM hierarchy and is easier to work with:

Example:

mutation NestedMappingExample {
  goto(url: "https://example.com/categories") {
    status
  }

  categories: mapSelector(selector: ".category") {
    categoryName: innerText
    subcategories: mapSelector(selector: ".subcategory") {
      subcategoryName: innerText
    }
  }
}
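
Nested results are convenient to read but sometimes awkward to export, for example to CSV. The sketch below flattens the categories result into one row per category and subcategory pair. The response shape and the names CategoryResult, CategoryRow, and toCategoryRows are assumptions, and the sample data is hypothetical.

```typescript
// Assumed shapes, following the aliases used in the query above.
interface SubcategoryResult {
  subcategoryName: string;
}

interface CategoryResult {
  categoryName: string;
  subcategories: SubcategoryResult[];
}

interface CategoryRow {
  category: string;
  subcategory: string | null;
}

// Emit one row per subcategory; a category without any still gets one row.
function toCategoryRows(categories: CategoryResult[]): CategoryRow[] {
  const rows: CategoryRow[] = [];
  for (const category of categories) {
    if (category.subcategories.length === 0) {
      rows.push({ category: category.categoryName, subcategory: null });
      continue;
    }
    for (const sub of category.subcategories) {
      rows.push({ category: category.categoryName, subcategory: sub.subcategoryName });
    }
  }
  return rows;
}

// Hypothetical sample data.
const sampleCategories: CategoryResult[] = [
  { categoryName: "Books", subcategories: [{ subcategoryName: "Fiction" }, { subcategoryName: "History" }] },
  { categoryName: "Music", subcategories: [] },
];

console.log(toCategoryRows(sampleCategories));
```

Note that innerText on a parent element includes the text of its descendants, so categoryName may also contain subcategory text if the page nests them inside .category. Selecting a dedicated heading element for the name would avoid that.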

Advanced Nested Example

This example uses nested mapping to retrieve metadata for each post, such as its author and score:

mutation map_selector_example_with_metadata {
  goto(url: "https://news.ycombinator.com") {
    status
  }

  posts: mapSelector(selector: ".subtext .subline") {
    author: mapSelector(selector: ".hnuser") {
      authorName: innerText
    }

    score: mapSelector(selector: ".score") {
      score: innerText
    }
  }
}
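
The score in this result is text rather than a number, and some rows may have no author or score at all. The sketch below normalizes one post entry, assuming the score text looks like "128 points" and that a nested mapSelector with no match yields an empty array. The names PostMetaResult, PostMeta, parsePoints, and toPostMeta are illustrative, and the sample data is hypothetical.

```typescript
// Assumed shapes, following the aliases used in the query above.
interface AuthorResult {
  authorName: string;
}

interface ScoreResult {
  score: string;
}

interface PostMetaResult {
  author: AuthorResult[];
  score: ScoreResult[];
}

interface PostMeta {
  author: string | null;
  points: number | null;
}

// Parse text such as "128 points" or "1 point"; return null if it does not match.
function parsePoints(text: string): number | null {
  const match = /(\d+)\s+points?/.exec(text);
  return match ? Number(match[1]) : null;
}

// Unwrap the first author and score, tolerating empty arrays.
function toPostMeta(post: PostMetaResult): PostMeta {
  const scoreText = post.score[0]?.score;
  return {
    author: post.author[0]?.authorName ?? null,
    points: scoreText === undefined ? null : parsePoints(scoreText),
  };
}

// Hypothetical sample data.
const samplePostMeta: PostMetaResult[] = [
  { author: [{ authorName: "alice" }], score: [{ score: "128 points" }] },
  { author: [], score: [] },
];

console.log(samplePostMeta.map(toPostMeta));
```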

Conditional Wait and Timeout Adjustments

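For content that loads asynchronously, set wait: true so the query waits for the selector to appear, and raise timeout (in milliseconds, so 60000 below is 60 seconds) when the default is too short: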

Example:

mutation TimeoutExample {
  goto(url: "https://example.com") {
    status
  }

  delayedItems: mapSelector(selector: ".async-loaded-item", timeout: 60000, wait: true) {
    content: innerText
  }
}