Which spoken languages (humanLanguage) are identified in Diffbot APIs?

Diffbot Automatic APIs identify the humanLanguage (spoken language) of most analyzed pages and return it as a two-letter ISO 639-1 code. The two exceptions are Simplified Chinese (zh-cn) and Taiwanese Mandarin (zh-tw), which are returned as a language code followed by a region suffix.

(In the Article API, Diffbot-generated tags will be returned in the native language for pages in English, Chinese, French, German, Spanish or Russian.)

The currently supported language codes, as returned in humanLanguage, are as follows:

  • ar
  • az
  • bg
  • bn
  • ca
  • cs
  • da
  • de
  • el
  • en
  • es
  • et
  • fa
  • fi
  • fr
  • gu
  • he
  • hi
  • hr
  • hu
  • id
  • it
  • ja
  • ko
  • lt
  • lv
  • mk
  • ml
  • nl
  • no
  • pa
  • pl
  • pt
  • ro
  • ru
  • si
  • sq
  • sv
  • ta
  • te
  • th
  • tl
  • tr
  • uk
  • ur
  • vi
  • zh-cn
  • zh-tw
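
As a rough illustration of how a client might consume this field, the sketch below checks a humanLanguage value against the list above. It makes no network calls. The function names (classify_language, split_code) and the sample dictionaries are hypothetical, and the assumption that humanLanguage appears as a plain string key on a returned object is ours, not a statement about the exact response layout.

```python
from typing import Optional, Tuple

# Codes copied from the list above (48 entries).
SUPPORTED_CODES = frozenset("""
ar az bg bn ca cs da de el en es et fa fi fr gu he hi hr hu id it ja ko
lt lv mk ml nl no pa pl pt ro ru si sq sv ta te th tl tr uk ur vi zh-cn zh-tw
""".split())


def classify_language(obj: dict) -> Tuple[Optional[str], bool]:
    """Return (normalized code, is_in_supported_list) for one object.

    Assumes (hypothetically) that `obj` is a dict with an optional
    "humanLanguage" string key. Not every page gets a language, so a
    missing key yields (None, False).
    """
    code = obj.get("humanLanguage")
    if code is None:
        return None, False
    code = code.strip().lower()
    return code, code in SUPPORTED_CODES


def split_code(code: str) -> Tuple[str, Optional[str]]:
    """Split a code such as "zh-cn" into ("zh", "cn"); "en" gives ("en", None)."""
    base, _, region = code.partition("-")
    return base, region or None


if __name__ == "__main__":
    # Hypothetical sample objects, not real API output.
    samples = [{"humanLanguage": "fr"}, {"humanLanguage": "zh-tw"}, {}]
    for sample in samples:
        print(classify_language(sample))
```

Treating a missing value as "unknown" rather than as an error matches the note above that the language is identified for most, not all, analyzed pages.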