When to use the Analyze API versus individual Automatic APIs

The Analyze API serves as a single entry-point to all of Diffbot’s Automatic APIs. In one request, the Analyze API will:

  • Determine the type of any URL submitted.
  • Return the full Diffbot extraction for any supported types: articles, products, images, discussion threads, videos — and more coming soon. See all Automatic APIs.
  • Optionally, extract only the type of page you’re looking for by using the mode argument, e.g. https://api.diffbot.com/v3/analyze?mode=article
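
As a rough illustration of the mode argument, the sketch below only builds the request URL and does not send anything. The endpoint and the mode argument come from the text above; the token and url query parameters, the helper name build_analyze_url, and the placeholder values are assumptions made for the example.

```python
from urllib.parse import urlencode

# Endpoint as shown in the example above.
ANALYZE_ENDPOINT = "https://api.diffbot.com/v3/analyze"


def build_analyze_url(token, page_url, mode=None):
    """Build an Analyze API request URL (hypothetical helper).

    `token` and `url` are assumed query parameter names; `mode` is the
    argument described above and is omitted when not given.
    """
    params = {"token": token, "url": page_url}
    if mode is not None:
        params["mode"] = mode
    return ANALYZE_ENDPOINT + "?" + urlencode(params)


# Placeholder token and page URL, for illustration only.
print(build_analyze_url("YOUR_TOKEN", "https://example.com/some-page", mode="article"))
```

Leaving mode out produces a plain Analyze request, which would extract any supported page type.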

Why would you use the Analyze API over calling the Product API, Article API, or another API directly?

  • If you are handling web pages of unknown origin (e.g., end-user submitted/shared links), the Analyze API will prevent spurious extractions from unsupported pages.
  • When spidering a site using Crawlbot, the Analyze API avoids forcing every page on the site through a single extraction API. For instance, using Analyze makes it easy to “retrieve all the product data from ECommerceStore.com” without additional configuration.

And why would you opt for a specific API over the general Analyze endpoint?

  • If you are certain of your web-page type (e.g., all articles), sending calls directly to the specific API endpoint guarantees that every page is extracted with that API. There is always a small chance that the Analyze API will misclassify confusing pages.

Note that you can also use the fallback argument if you’d like to ensure that unsupported pages are processed by a specific API of your choosing.
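
A minimal sketch of a request that uses the fallback argument might look like the following. It only constructs the URL. The fallback argument is named above, but the value "article", the token and url parameter names, and the helper analyze_url_with_fallback are assumptions for illustration.

```python
from urllib.parse import urlencode


def analyze_url_with_fallback(token, page_url, fallback):
    """Build an Analyze URL that names a fallback API (hypothetical helper).

    `fallback` is assumed to take the name of a specific API, such as
    "article"; `token` and `url` are assumed parameter names.
    """
    query = urlencode({"token": token, "url": page_url, "fallback": fallback})
    return "https://api.diffbot.com/v3/analyze?" + query


# Placeholder values, for illustration only.
print(analyze_url_with_fallback("YOUR_TOKEN", "https://example.com/unknown-page", "article"))
```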

Custom rules applied to a specific API, and API-specific parameters (e.g., fields=meta,videos,html for the Article API), are handled the same way whether you call the Analyze API or the specific extraction API.
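
To make the point about shared parameters concrete, the sketch below builds the same fields value into both an Analyze request and a direct Article API request. The /v3/article path is assumed by analogy with the /v3/analyze URL shown earlier, and the token and url parameters and the build_url helper are likewise assumptions. No request is sent.

```python
from urllib.parse import urlencode

# Base path inferred from the /v3/analyze example; treat as an assumption.
API_BASE = "https://api.diffbot.com/v3"


def build_url(api, token, page_url, fields=None):
    """Build a request URL for `api` ("analyze", "article", ...).

    `fields` is an optional list of field names, joined with commas
    as in fields=meta,videos,html.
    """
    params = {"token": token, "url": page_url}
    if fields:
        params["fields"] = ",".join(fields)
    # safe="," keeps the comma-separated field list readable in the URL.
    return f"{API_BASE}/{api}?" + urlencode(params, safe=",")


fields = ["meta", "videos", "html"]
# Placeholder token and page URL, for illustration only.
print(build_url("analyze", "YOUR_TOKEN", "https://example.com/story", fields))
print(build_url("article", "YOUR_TOKEN", "https://example.com/story", fields))
```

Only the path segment differs between the two URLs; the query string, including fields, is identical.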