Our resources rely on deep lexical analyses of human languages, meticulously deciphering and mapping their linguistic DNA, identifying and categorizing the different elements, and linking them to each other and across multiple languages.
These datasets have been developed over the years by K Dictionaries and published by its partners in various media – serving millions of users around the world.
The content is human-curated, enriched by automatic language generation and supplemented by morphological word form lists, language and grammar guides, biographical and geographical tables, phonetic transcription (IPA), alternative scripts, and vocal pronunciation.
The data are available in XML, JSON and JSON-LD (RDF) formats.
Examples of usage
Range of application
Geo multilingual table
Part of speech
PARALLEL CORPORA FOR AI
Lexicala resources feature over 350 language pairs, including 9 million segments and nearly 90 million tokens in twenty languages, for easy integration with AI systems.
The segments consist of manually curated sentences and phrases with translation equivalents, based on corpus evidence and frequency, which were originally created by our editors and translators worldwide as examples of usage for dictionary entries.
This high-quality data can help Language Service Providers boost performance, train Machine Learning models, and enhance their Neural Machine Translation solutions.
Besides general language vocabularies, there are segments for a hundred vertical domains (e.g. health, medicine, finance, and legal). The languages include Arabic, Chinese (Simplified), Danish, Dutch, English, French, German, Greek, Hebrew, Italian, Japanese, Korean, Norwegian, Polish, Portuguese (Brazilian and European), Russian, Spanish, Swedish, and Turkish.
IN USE BY
- natural language processing integrators
- software and technology companies
- language learning providers
- online dictionary websites
- mobile app developers
- exchange with language and research institutes
- invited talks and workshops
- publication of scholarly papers
- sponsorship for conferences and academic activity
- internships for university students of lexicography, linguistics, translation, NLP and computer science
Most of our data are available via API. Our REST API enables flexible search options and returns JSON responses, whether complete dictionary entries or specific components – featuring rich syntactic and semantic information, sense definitions and various means of disambiguation, examples of usage and multiword expressions, translations and more – allowing easy processing and seamless integration with other applications.
You can read the documentation, register, and gain access on our API website; just click the link below.
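As a rough illustration of working with a REST API that returns JSON entries, the Python sketch below builds a search URL and pulls headwords, parts of speech, and sense definitions out of a response. The host `api.example.com`, the `/search` path, the `text` and `language` parameters, and the shape of the sample payload are all assumptions made for illustration; the actual endpoints and field names are the ones given in the API documentation.

```python
import json
from urllib.parse import urlencode

# Hypothetical host: replace with the base URL given in the API documentation.
BASE_URL = "https://api.example.com"


def build_search_url(text, language, base_url=BASE_URL):
    """Build a search URL; the path and parameter names are assumptions."""
    query = urlencode({"text": text, "language": language})
    return f"{base_url}/search?{query}"


# Hypothetical response body, shaped only to illustrate nested JSON entries.
SAMPLE_RESPONSE = """
{
  "results": [
    {
      "id": "EN_001",
      "headword": {"text": "table", "pos": "noun"},
      "senses": [{"definition": "a piece of furniture with a flat top"}]
    }
  ]
}
"""


def summarize_entries(payload):
    """Return (headword, part of speech, definitions) for each entry."""
    data = json.loads(payload)
    summary = []
    for entry in data.get("results", []):
        headword = entry.get("headword", {})
        definitions = [
            sense["definition"]
            for sense in entry.get("senses", [])
            if "definition" in sense
        ]
        summary.append((headword.get("text"), headword.get("pos"), definitions))
    return summary


if __name__ == "__main__":
    print(build_search_url("table", "en"))
    print(summarize_entries(SAMPLE_RESPONSE))
```

The sketch parses a local sample rather than calling a live server, so it runs without credentials; in practice the URL would be fetched with an authenticated HTTP client and the response body passed to `summarize_entries`.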
We can research and create data exactly for your needs.
Reach out to discuss in detail.