Using the Replace Filter in the Custom API Toolkit

Diffbot offers various filters you can apply to your custom fields in the Custom API toolkit. One such filter is the replace filter.

The replace filter takes a regular expression and replaces every part of the extracted text that matches it with whatever the “replace with” field contains.

Here is a rudimentary example.

Let’s say we only want to extract the type of currency from this page:

Justkidding page

The price and currency are in the same HTML element, so we cannot simply select the currency alone.

Info in the same DOM element

Therefore, we need to:

  1. Target the price element with a custom field; let’s call it “theCurrency”. The selector .special-price .price will do the job.
    A new field is defined

  2. Then we click filters and use a replace filter.
    A replace filter is activated

  3. Finally, we enter \s?\d+\s?\W? into the first field and leave the second field empty (the second field is the value with which to replace the match from the first one).

Testing this, we can see that we do indeed get only the currency back.

Currency is returned
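
To try the same replacement outside the toolkit, here is a minimal Python sketch. It is not Diffbot's implementation: the `replace_filter` helper is hypothetical, the sample price string is an assumption (the real page's text may differ), and it assumes the filter replaces every match and that Python's `re` module treats this pattern the same way the toolkit's regex engine does.

```python
import re

# The pattern entered in the first field of the replace filter.
PATTERN = r"\s?\d+\s?\W?"


def replace_filter(text: str, pattern: str, replacement: str = "") -> str:
    """Hypothetical stand-in for the replace filter.

    Substitutes every match of `pattern` in `text` with `replacement`.
    An empty replacement mirrors leaving the second field empty.
    """
    return re.sub(pattern, replacement, text)


# Assumed sample value for the text of the price element.
sample_price = "AED 120.00"

print(replace_filter(sample_price, PATTERN))  # prints "AED"
```

Running a quick check like this before saving the filter makes it easier to spot a pattern that removes too much or too little.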

Regex explanation

The expression \s?\d+\s?\W? means:

  • \s? zero or one whitespace character
  • \d+ one or more digits
  • \s? zero or one whitespace character
  • \W? zero or one non-word character (anything other than a letter, digit, or underscore), such as a decimal point or comma

In other words, the expression matches the numeric parts of the price, together with any adjacent whitespace and punctuation, and the empty second field replaces them with nothing, so only the currency’s letters remain. The same regex can be reused on any similar site to solve a similar problem.
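
To see which pieces the expression actually matches, the following Python sketch lists the matches and what is left once they are removed. The sample strings are made-up price formats, not values taken from a real page, and the helper names are hypothetical. As before, it assumes Python's `re` module behaves like the toolkit's regex engine for this pattern.

```python
import re

PATTERN = re.compile(r"\s?\d+\s?\W?")


def matched_pieces(text: str) -> list[str]:
    """Return every substring the pattern matches, in order."""
    return PATTERN.findall(text)


def remaining(text: str) -> str:
    """Return the text left after all matches are replaced with nothing."""
    return PATTERN.sub("", text)


# Assumed price formats that similar sites might use.
samples = ["AED 120.00", "1,299.00 AED", "$19.99"]

for sample in samples:
    print(repr(sample), matched_pieces(sample), repr(remaining(sample)))
```

For "AED 120.00" the matches are " 120." and "00", which leaves "AED". Note that for "$19.99" the result is the symbol "$" rather than a currency code, so the output depends on how the site writes its prices.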

You can read some more about filters here.