Moving to New Versions of Diffbot APIs

We introduced Version 2 of our Article API in mid-2013, updating the original Article API alongside the release of our Product and Image APIs, and Version 3 of our APIs in April 2014. Version 3 aligns the structure of our API responses, both for internal consistency and in anticipation of future product releases. For the Article API, it also introduces our newest rendering engine, which includes JavaScript support.

We advise updating to the V3 API to take advantage of current and future capabilities.

See complete API documentation within the Developer Dashboard for all available options. A summary of changes is below:

Version 3 Changes:

Article API Uses Newest Rendering Engine

The Article API now uses our full renderer, including support for JavaScript/Ajax events.

Article API Uses a New Tagging Engine

The Article API's tagging engine, which generates tags/entities based on analysis of the extracted text, has been overhauled for Version 3. The format has changed: each entity now includes a DBPedia link and type, if available. The tags field is now returned automatically for every Article API request.

All Responses Now Include request and objects Elements

All Diffbot APIs now return two primary top-level objects:

  • request, which provides metadata on the request itself
  • objects, an array of elements extracted from the page

For most calls (the Article API, most Product API requests) the objects array will include a single result. Image API requests against multiple-image pages will return multiple image objects.
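
As a rough illustration of the structure described above, the following Python sketch reads the objects array from an already-parsed response. The sample dictionary and the extract_objects helper are invented for this example; only the top-level request and objects names come from the text above.

```python
from typing import Any


def extract_objects(response: dict[str, Any]) -> list[dict[str, Any]]:
    """Return the extracted objects from a V3-style response dictionary."""
    # "objects" is an array. It may hold a single result (Article API, most
    # Product API requests) or several (Image API on a multiple-image page).
    return list(response.get("objects", []))


# Hypothetical response body, invented for illustration only.
sample = {
    "request": {"api": "article"},
    "objects": [{"title": "Example title", "text": "Example text"}],
}

# Always iterate, even when a single result is expected.
for obj in extract_objects(sample):
    print(obj.get("title"))
```

Iterating over objects rather than indexing the first element keeps the same code path working for single-result and multiple-result calls.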

All Objects are Now Uniquely Identified

All returned objects, both top-level (e.g., articles, products) and nested (images, videos), now include a unique diffbotUri value, used internally to help differentiate and catalog each object returned by our APIs.

Individual Field Changes:

  • All APIs: url is now pageUrl
  • All APIs: resolved_url is now resolvedPageUrl
  • Article API: primary value (in the images array) is now a boolean
  • Product API: humanLanguage is now available
  • Product API: description is now text
  • Product API: media is now images
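
If existing code expects the older field names, the renames listed above can be captured in a small lookup table. The sketch below is one possible approach, not an official migration tool: the RENAMES table and rename_fields helper are hypothetical names, and the function only renames keys in a flat dictionary without restructuring it.

```python
# Field renames listed above. "all" applies to every API;
# the other keys apply only to the named API.
RENAMES = {
    "all": {"url": "pageUrl", "resolved_url": "resolvedPageUrl"},
    "product": {"description": "text", "media": "images"},
}


def rename_fields(record: dict, api: str) -> dict:
    """Return a copy of record with older field names replaced by V3 names."""
    mapping = {**RENAMES["all"], **RENAMES.get(api, {})}
    # Keys without a rename entry are copied through unchanged.
    return {mapping.get(key, key): value for key, value in record.items()}


# Hypothetical older-style Product record, invented for illustration.
old_record = {"url": "http://example.com/p", "description": "A widget"}
print(rename_fields(old_record, "product"))
# prints {'pageUrl': 'http://example.com/p', 'text': 'A widget'}
```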

Version 2 Changes:

Call the http://api.diffbot.com Endpoint

API calls should no longer be made to http://www.diffbot.com/api. To call Diffbot APIs, send requests as follows:

http://api.diffbot.com/v3/{api}?token={token}&url={url}

Use &fields Parameter to Customize Your Response

Version 2 and subsequent APIs allow you to customize the specific fields of your response using the fields parameter. For instance, to return title, text, meta and images in your Article API response, send the following request:

http://api.diffbot.com/v3/article?token={token}&url={url}&fields=title,text,meta,images
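
A request URL in this form can be assembled with Python's standard library, as in the minimal sketch below. The build_request_url helper is a hypothetical name, and the token and page URL are placeholders; the endpoint pattern and the token, url and fields parameters are taken from the examples above.

```python
from urllib.parse import urlencode

BASE = "http://api.diffbot.com/v3"


def build_request_url(api, token, url, fields=None):
    """Build a request URL for the given API name, token and page URL."""
    params = {"token": token, "url": url}
    if fields:
        # The fields parameter is a comma-separated list of field names.
        params["fields"] = ",".join(fields)
    # safe="," keeps the commas in the fields list readable; the page URL
    # is still percent-encoded so it cannot break the query string.
    return f"{BASE}/{api}?{urlencode(params, safe=',')}"


print(build_request_url("article", "TOKEN", "http://example.com/a", ["title", "text"]))
# prints http://api.diffbot.com/v3/article?token=TOKEN&url=http%3A%2F%2Fexample.com%2Fa&fields=title,text
```

Percent-encoding the url value matters when the target page has its own query string, since an unencoded & would otherwise be read as a separate parameter.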

The media Element Has Been Replaced By the videos and images Elements

The original Article API returned both images and videos in a single media array. Version 2 and later return individual arrays, videos and images, for these items.

Additionally, for images and videos:

  • V2 and above: the link field has been replaced by url for both videos and images arrays.
  • V2 and above: the type field has been removed from each image or video identified.
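
Code that stored results from the original media array may need to convert them to the newer shape. The following sketch shows one way to do so under stated assumptions: it assumes each legacy item carried a type value of "image" or "video", which this document does not confirm, and the split_media helper and sample data are invented for illustration.

```python
def split_media(media):
    """Split a legacy media list into separate images and videos lists."""
    images, videos = [], []
    for item in media:
        # Drop the removed "type" field and carry "link" over as "url".
        converted = {k: v for k, v in item.items() if k not in ("type", "link")}
        if "link" in item:
            converted["url"] = item["link"]
        # Assumption: "type" held "image" or "video". Items with any
        # other value are skipped in this sketch.
        kind = item.get("type")
        if kind == "image":
            images.append(converted)
        elif kind == "video":
            videos.append(converted)
    return {"images": images, "videos": videos}


# Hypothetical legacy data, invented for illustration.
legacy = [
    {"type": "image", "link": "http://example.com/a.jpg"},
    {"type": "video", "link": "http://example.com/v.mp4"},
]
print(split_media(legacy))
```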