CleanInput
Options for cleaning up the DOM before exporting its content. The available options can destructively remove non-text DOM nodes, DOM attributes, and gratuitous whitespace characters. Because these operations are destructive, it's recommended to run them at the very end of your query in order to preserve page functionality.
input CleanInput {
  removeNonTextNodes: Boolean
  removeAttributes: Boolean
  removeRegex: Boolean
  selectors: [String!]
  attributes: [String]
  mode: AttributeMode
  regexes: [String!]
}
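As a rough illustration of the shape above, the following sketch mirrors the input as a Python `TypedDict` and serializes one possible value. The wrapper key `clean`, the example values, and the lowercase spelling of the `mode` enum value are assumptions made for this example; they are not confirmed by the schema shown here.

```python
import json
from typing import List, TypedDict


class CleanInput(TypedDict, total=False):
    """Python mirror of the CleanInput fields; every field is optional."""

    removeNonTextNodes: bool
    removeAttributes: bool
    removeRegex: bool
    selectors: List[str]
    attributes: List[str]
    mode: str  # AttributeMode; assumed to serialize as "allow" or "deny"
    regexes: List[str]


# Hypothetical example values: keep only href and src on every node.
clean: CleanInput = {
    "removeAttributes": True,
    "attributes": ["href", "src"],
    "mode": "allow",
}

# Serialized form, as it might appear in a GraphQL variables object.
# The "clean" key is an assumed variable name.
variables_json = json.dumps({"clean": clean}, sort_keys=True)
print(variables_json)
```

Fields that are left out fall back to the defaults described in the field list.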
Fields
CleanInput.removeNonTextNodes ● Boolean scalar
When true (the default), removes non-textual nodes from the DOM, such as scripts, links, video, and canvas elements. To override which nodes are removed, pass a list of DOM selectors in the selectors argument.
CleanInput.removeAttributes ● Boolean scalar
When true (default is false), removes all attributes from all DOM nodes. Useful for stripping HTML markup while preserving the overall structure. To remove only particular attributes, list them in the attributes argument.
CleanInput.removeRegex ● Boolean scalar
Removes any characters in the HTML that match a regex pattern. Patterns are run in order. By default this is true and removes newlines, returns, tabs, multi-spaces, and HTML comments, in that order. You may supply your own patterns with the regexes argument.
CleanInput.selectors ● [String!] list scalar
A list of selectors whose matching nodes are removed from the page when removeNonTextNodes is true (the default).
CleanInput.attributes ● [String] list scalar
A list of attributes to remove from all DOM nodes. When this isn't specified and removeAttributes is true, all attributes on all DOM nodes are removed. removeNonTextNodes must be set to true for this to take effect.
CleanInput.mode ● AttributeMode enum
Controls how the attributes field is interpreted. When set to "allow", the attributes field specifies which attributes to keep. When set to "deny", the attributes field specifies which attributes to remove. The default is "deny", for backward compatibility.
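The allow/deny distinction can be sketched in a few lines of Python. This is only a model of the behavior described for a single node when removeAttributes is true, not the service's implementation. The function name `filter_attributes`, the exact-match comparison of attribute names, and the handling of a missing list in "allow" mode are assumptions.

```python
from typing import Dict, Iterable, Optional


def filter_attributes(
    node_attrs: Dict[str, str],
    attributes: Optional[Iterable[str]] = None,
    mode: str = "deny",
) -> Dict[str, str]:
    """Return the attributes that survive cleaning for one DOM node."""
    if attributes is None:
        # No list given: all attributes are removed. Applying this to
        # "allow" mode as well is an assumption of this sketch.
        return {}
    listed = set(attributes)
    if mode == "allow":
        # Keep only the listed attributes.
        return {k: v for k, v in node_attrs.items() if k in listed}
    if mode == "deny":
        # Remove only the listed attributes.
        return {k: v for k, v in node_attrs.items() if k not in listed}
    raise ValueError(f"unknown mode: {mode!r}")


attrs = {"href": "/about", "class": "btn", "style": "color:red"}
kept_allow = filter_attributes(attrs, ["href"], mode="allow")
kept_deny = filter_attributes(attrs, ["style"], mode="deny")
print(kept_allow)
print(kept_deny)
```

With the same list, "allow" and "deny" produce complementary results, so switching mode inverts which attributes remain on the node.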
CleanInput.regexes ● [String!] list scalar
When removeRegex is true, matches of each regex in this list are removed from the page. Write each pattern without the beginning and ending /. The patterns are run in order, and each match is replaced with a single space character to preserve some of its content.
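The run-in-order, replace-with-a-space behavior can be approximated with Python's `re` module. The patterns in `ASSUMED_DEFAULT_REGEXES` are guesses that follow the documented default order (newlines, returns, tabs, multi-spaces, HTML comments); the actual default patterns and the regex dialect used by the service may differ, and `apply_regexes` is a hypothetical helper name.

```python
import re
from typing import Iterable

# Assumed stand-ins for the defaults, in the documented order:
# newlines, returns, tabs, multi-spaces, HTML comments.
ASSUMED_DEFAULT_REGEXES = [r"\n", r"\r", r"\t", r" {2,}", r"<!--.*?-->"]


def apply_regexes(
    html: str, regexes: Iterable[str] = ASSUMED_DEFAULT_REGEXES
) -> str:
    """Apply each pattern in order, replacing every match with one space."""
    for pattern in regexes:
        # DOTALL lets the comment pattern span line breaks.
        html = re.sub(pattern, " ", html, flags=re.DOTALL)
    return html


page = "<p>Hello</p>\n\t<!-- note --><p>World</p>"
cleaned = apply_regexes(page)
custom = apply_regexes("a1b22c", [r"\d+"])
print(repr(cleaned))
print(repr(custom))
```

Because the patterns run in sequence, order matters: in this sketch the comment pattern runs after the multi-space pattern, so the space left behind by a removed comment is not collapsed again.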