Rosette Entity Extractor

Locate names, places, dates, and other words and phrases in multilingual text through advanced named entity extraction

Named entity extraction, the process of identifying names, places, dates, and other words and phrases that establish the meaning of a body of text, is critical to software systems that process large amounts of unstructured data from sources such as email, document files, and the Web. By locating certain types of phrases and associating each one with a category, applications such as text analysis software can perform functions such as concept extraction.

REX linguistics
REX uses statistical modeling to learn patterns from large corpora of native-language text. Users therefore don’t have to program or train patterns, because they are already built into REX’s language models, and new models can be trained quickly to extend REX’s language coverage.


Rosette® Entity Extractor (REX) is an entity extraction technology designed for integration into software systems for information retrieval, text mining, trend analysis and relationship extraction, taxonomy and categorization, data extraction, defense and intelligence analysis, CRM, business intelligence, and other applications that classify, manage, analyze, and mine textual information. It is currently available for Chinese, Japanese, Korean, Arabic, Farsi, Urdu, Dutch, English, French, Italian, German, and Spanish, with additional languages under development.

REX locates generic terms such as “Vice President” and “earnings estimates” as well as specific references such as “President George Bush” and “May 22, 1990.” This is the first step in the process of identifying and extracting important information from documents, in preparation for the information to be further structured and analyzed by other applications.
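
To make the output of this first step concrete, here is a minimal Python sketch of what "locating a phrase and associating it with a category" can look like as data. It is not REX's API or method: REX relies on statistical language models, whereas this toy uses a few hand-written regular expressions purely for illustration. The names Entity, PATTERNS, and extract_entities, and the category labels PERSON, TITLE, and DATE, are hypothetical.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Entity:
    text: str      # the matched phrase
    category: str  # hypothetical label, e.g. "PERSON" or "DATE"
    start: int     # character offset where the phrase begins
    end: int       # character offset just past the phrase


MONTHS = (
    "January|February|March|April|May|June|July|"
    "August|September|October|November|December"
)

# Toy patterns for illustration only. A statistical extractor learns
# such regularities from corpora instead of having them hand-coded.
PATTERNS = [
    ("DATE", re.compile(rf"\b(?:{MONTHS}) \d{{1,2}}, \d{{4}}\b")),
    ("PERSON", re.compile(r"\bPresident (?:[A-Z][a-z]+ )*[A-Z][a-z]+")),
    ("TITLE", re.compile(r"\bVice President\b")),
]


def extract_entities(text: str) -> list[Entity]:
    """Return every pattern match as an Entity, ordered by position."""
    found = []
    for category, pattern in PATTERNS:
        for match in pattern.finditer(text):
            found.append(
                Entity(match.group(), category, match.start(), match.end())
            )
    return sorted(found, key=lambda entity: entity.start)


if __name__ == "__main__":
    sample = "President George Bush met the Vice President on May 22, 1990."
    for entity in extract_entities(sample):
        print(entity.category, repr(entity.text), entity.start, entity.end)
```

Each result carries the phrase, its category, and its character offsets, which is the kind of structured record a downstream application could index, link, or analyze further.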

REX’s modules are specifically designed to be flexible and trainable by our customers, eliminating the constraints common to more rigid, structured approaches to entity extraction.

REX