When to use the Analyze API versus individual Automatic APIs

The Analyze API serves as a single entry point to all of Diffbot’s Automatic APIs. In one request, the Analyze API will:

  • Determine the type of any URL submitted.
  • Return the full Diffbot extraction for any supported type: articles, products, images, frontpages (with more coming soon). See all Automatic APIs.
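As a rough illustration of the single-request idea, here is a minimal sketch of building an Analyze request URL. The endpoint address and the `token` and `url` query parameter names are assumptions not stated in this article, so check them against the current API reference; `build_analyze_url` is a hypothetical helper.

```python
from urllib.parse import urlencode

# Assumed endpoint; confirm against the current Diffbot API reference.
ANALYZE_ENDPOINT = "https://api.diffbot.com/v3/analyze"


def build_analyze_url(token: str, page_url: str) -> str:
    """Return an Analyze request URL for one page (hypothetical helper)."""
    # urlencode percent-encodes the page URL so it survives as a query value.
    query = urlencode({"token": token, "url": page_url})
    return f"{ANALYZE_ENDPOINT}?{query}"


# Placeholder token and page; substitute your own values.
request_url = build_analyze_url("YOUR_TOKEN", "https://example.com/some-page")
```

The same request shape is used whatever the page turns out to be, since classification happens on Diffbot's side.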

Why would you use the Analyze API over calling the Product API, Article API or another API directly?

  • If you are handling web pages of unknown origin (e.g., end-user submitted/shared links), the Analyze API will prevent spurious extractions from unsupported pages.
  • When spidering a site using Crawlbot, the Analyze API classifies each page first, so you avoid forcing every page on the site through a single extraction API. For instance, using Analyze makes it easy to “retrieve all the product data from ECommerceStore.com” without additional configuration.
  • For article content: the Analyze API uses Diffbot’s full rendering engine, which executes on-page JavaScript. This can result in slightly improved extractions for articles and blog posts. More on how Diffbot handles JavaScript.
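When links are of unknown origin, the calling code typically branches on the page type that Analyze reports. The sketch below assumes the parsed JSON response carries a top-level `type` string and an `objects` list; that response shape is an assumption here, not something this article specifies, and `dispatch_by_type` is a hypothetical helper.

```python
from typing import Any, Callable, Dict

Handler = Callable[[Dict[str, Any]], None]


def dispatch_by_type(response: Dict[str, Any], handlers: Dict[str, Handler]) -> int:
    """Pass each extracted object to the handler registered for the page type.

    Returns the number of objects handled. Assumes the response has a
    top-level "type" and an "objects" list (an assumed shape).
    """
    page_type = response.get("type", "other")
    handler = handlers.get(page_type)
    if handler is None:
        # Unsupported or unwanted page type: skip it rather than extract noise.
        return 0
    objects = response.get("objects", [])
    for obj in objects:
        handler(obj)
    return len(objects)
```

Registering only a `"product"` handler, for example, means pages classified as anything else are ignored, which mirrors the Crawlbot scenario described above.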

And why would you opt for a specific API over the general Analyze endpoint?

  • If you are certain of your web-page type (e.g., all articles), sending calls directly to the specific API endpoint guarantees that every page is extracted with that API. There is always a small chance that the Analyze API will misclassify confusing pages.

Note that you can also use the fallback argument if you’d like to ensure that unsupported pages are processed by a specific API of your choosing.
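A minimal sketch of passing the fallback argument follows. The parameter name `fallback` comes from the note above; the endpoint address, the `token` and `url` parameter names, and the value `article` are assumptions for illustration, and `build_analyze_url` is a hypothetical helper.

```python
from typing import Optional
from urllib.parse import urlencode

# Assumed endpoint; confirm against the current Diffbot API reference.
ANALYZE_ENDPOINT = "https://api.diffbot.com/v3/analyze"


def build_analyze_url(token: str, page_url: str, fallback: Optional[str] = None) -> str:
    """Return an Analyze request URL, optionally naming a fallback API."""
    params = {"token": token, "url": page_url}
    if fallback:
        # Assumed value format: the short name of the API to fall back to.
        params["fallback"] = fallback
    return f"{ANALYZE_ENDPOINT}?{urlencode(params)}"


# Placeholder token and page; "article" is an assumed fallback value.
with_fallback = build_analyze_url("YOUR_TOKEN", "https://example.com/page", fallback="article")
without_fallback = build_analyze_url("YOUR_TOKEN", "https://example.com/page")
```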

Custom rules applied to a specific API, and API-specific parameters (e.g., fields=meta,videos,html for the Article API), are handled the same way whether you call the Analyze API or the specific extraction API.
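To make this concrete, the sketch below builds the same `fields=meta,videos,html` query for both the Analyze endpoint and the Article endpoint. The base address, the endpoint path names, and the `token` and `url` parameter names are assumptions; `build_url` is a hypothetical helper.

```python
from urllib.parse import urlencode

# Assumed base address; confirm against the current Diffbot API reference.
API_BASE = "https://api.diffbot.com/v3"


def build_url(api: str, token: str, page_url: str, fields: str) -> str:
    """Return a request URL for the named API with a fields parameter."""
    # safe="," keeps the comma-separated field list readable in the URL.
    query = urlencode({"token": token, "url": page_url, "fields": fields}, safe=",")
    return f"{API_BASE}/{api}?{query}"


FIELDS = "meta,videos,html"  # taken from the example above
via_analyze = build_url("analyze", "YOUR_TOKEN", "https://example.com/post", FIELDS)
via_article = build_url("article", "YOUR_TOKEN", "https://example.com/post", FIELDS)
```

Only the endpoint path differs between the two URLs; the query string, including the field list, is identical.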