2016-09-12

  • Numerous improvements to normalizedSpecs in the Product API.
  • Diffbot Automatic APIs now process PDFs. PDF URLs are converted to HTML and then analyzed for extractable content. PDFs are not currently supported while crawling.
  • Crawlbot fixes to reduce DNS errors when starting new crawls or crawl rounds.
  • Crawlbot and Bulk Processing: deleting a nonexistent job no longer returns a “success” message.
  • Improved handling of UTF-8 encoded characters within Crawlbot.
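
As an illustration of the PDF item above, the following minimal sketch builds a request URL that submits a PDF link to an Automatic API. It is not taken from these release notes: the endpoint path (`/v3/analyze`), the `token` and `url` query parameters, and the helper names `build_analyze_url` and `looks_like_pdf` are assumptions, and should be checked against the current Diffbot API documentation before use.

```python
from urllib.parse import urlencode, urlsplit

# Assumed endpoint for the automatic (Analyze) API; verify against the Diffbot docs.
ANALYZE_ENDPOINT = "https://api.diffbot.com/v3/analyze"


def looks_like_pdf(target_url: str) -> bool:
    """Rough client-side check: does the URL path end in .pdf?

    This is only a heuristic for illustration; a PDF may also be served
    from a URL without a .pdf extension.
    """
    return urlsplit(target_url).path.lower().endswith(".pdf")


def build_analyze_url(token: str, target_url: str) -> str:
    """Return a GET URL asking the (assumed) Analyze endpoint to process target_url.

    urlencode percent-encodes the target URL so it survives as a single
    query-string value.
    """
    query = urlencode({"token": token, "url": target_url})
    return f"{ANALYZE_ENDPOINT}?{query}"


if __name__ == "__main__":
    # Hypothetical token and document URL, for demonstration only.
    pdf_url = "https://example.com/docs/report.pdf"
    if looks_like_pdf(pdf_url):
        print(build_analyze_url("YOUR_TOKEN", pdf_url))
```

The sketch only constructs the request URL; sending it (for example with `urllib.request.urlopen`) and interpreting the response are left out because the response format is not described in these notes.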