Repentino: Naming Entities in Portuguese


Named Entities (NEs) are units that can be denoted by proper names, such as individuals, institutions, businesses, countries, cities and brands. NEs also include the names of specialized techniques, domains and software systems, and over time new and more complex units are being added to the class. Named Entity Recognition (NER) systems, which recognize these NEs and assign them to specific classes, have gained importance because of the growing need for fast and efficient Portuguese translation. REPENTINO is a readily available reference resource whose data is useful for designing named entity recognition systems that help translators deliver efficient machine Portuguese translation.

REPENTINO organizes its NE instances into a two-level hierarchy of 11 main categories and 97 subcategories. For example, the main category ‘Location’ has 16 subcategories, while the main category ‘Paperwork’ has 8. The 11 main categories are: Location, Organizations, Beings, Event, Products, Art and Media, Paperwork, Substance, Abstraction, Nature and Miscellaneous.

Of these main categories, ‘Beings’ can be considered the most significant for designing NER systems for machine Portuguese translation, mainly because it accounts for about two thirds of the instances stored in REPENTINO. ‘Beings’ covers actual, imaginary and legendary beings and is divided into six subcategories: Ethnic, Human, Human-Collective, Mythological, Non-Human and Other. The main category ‘Location’, which ranks second in significance, includes NEs known mostly for their geographical position in the Universe. It is divided into the subcategories Address, Administrative Division/Region/Town, Country, Civil/Administration/Military, Commercial/Industrial/Financial, Hydrographic, Heritage/Monuments, Infrastructure/Facility, Loose Address, Mythological/Fictional, Religious, Space, Socio-Cultural, Real-Estate, Terrestrial and Other.
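The hierarchy described above can be written down as a small data structure. The following is a minimal Python sketch, not REPENTINO's actual file format: the names `MAIN_CATEGORIES`, `SUBCATEGORIES` and `is_valid_label` are hypothetical, and only the subcategories listed in this article are filled in.

```python
# Partial sketch of the REPENTINO hierarchy, limited to what this article lists.
# The layout and all identifiers are assumptions made for illustration.
MAIN_CATEGORIES = [
    "Location", "Organizations", "Beings", "Event", "Products",
    "Art and Media", "Paperwork", "Substance", "Abstraction",
    "Nature", "Miscellaneous",
]

SUBCATEGORIES = {
    "Beings": [
        "Ethnic", "Human", "Human-Collective",
        "Mythological", "Non-Human", "Other",
    ],
    "Location": [
        "Address", "Administrative Division/Region/Town", "Country",
        "Civil/Administration/Military", "Commercial/Industrial/Financial",
        "Hydrographic", "Heritage/Monuments", "Infrastructure/Facility",
        "Loose Address", "Mythological/Fictional", "Religious", "Space",
        "Socio-Cultural", "Real-Estate", "Terrestrial", "Other",
    ],
}


def is_valid_label(category: str, subcategory: str) -> bool:
    """Return True if the (category, subcategory) pair appears in the sketch."""
    return subcategory in SUBCATEGORIES.get(category, [])
```

A check such as `is_valid_label("Location", "Country")` can guard against assigning an entity a label that does not exist in the hierarchy.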

The other significant categories of REPENTINO are ‘Organizations’ and ‘Event’. The ‘Organizations’ category represents NEs that involve groups of people with a defined structure who function together as a single body to achieve a goal while complying with specific regulations. Its subcategories include Company, Socio-Cultural, Interest Groups, Religious and Sports. The ‘Event’ category, on the other hand, includes NEs that span a specific time period with a fixed start and end. Its subcategories include Scientific, Sports, Socio-Cultural and Political.
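One common way to use a categorized list of names in an NER system is longest-match lookup over a tokenized sentence. The sketch below illustrates that general technique only and is not the method of any particular system built on REPENTINO. The `GAZETTEER` entries are hypothetical examples, not data taken from REPENTINO, and `tag_entities` is an assumed helper name.

```python
from typing import Dict, List, Tuple

# Hypothetical entries for illustration only; they are NOT taken from REPENTINO.
# Keys are token tuples, values are (main category, subcategory) labels.
GAZETTEER: Dict[Tuple[str, ...], Tuple[str, str]] = {
    ("São", "Paulo"): ("Location", "Administrative Division/Region/Town"),
    ("Brasil",): ("Location", "Country"),
    ("Fernando", "Pessoa"): ("Beings", "Human"),
}

# Longest entry length, so the scan never tries spans that cannot match.
MAX_LEN = max(len(key) for key in GAZETTEER)


def tag_entities(tokens: List[str]) -> List[Tuple[str, str, str]]:
    """Scan left to right, preferring the longest gazetteer match at each position."""
    found: List[Tuple[str, str, str]] = []
    i = 0
    while i < len(tokens):
        matched = 0
        # Try the longest candidate span first, then shorter ones.
        for n in range(min(MAX_LEN, len(tokens) - i), 0, -1):
            key = tuple(tokens[i:i + n])
            if key in GAZETTEER:
                category, subcategory = GAZETTEER[key]
                found.append((" ".join(key), category, subcategory))
                matched = n
                break
        # Skip past the matched span, or advance one token if nothing matched.
        i += matched if matched else 1
    return found
```

Calling `tag_entities("Fernando Pessoa visitou São Paulo no Brasil".split())` returns the three entities with their category and subcategory. A lookup this simple cannot resolve ambiguous names, so real systems typically combine it with contextual rules or statistical models.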

REPENTINO is a valuable resource for developing tools for machine Portuguese translation. It has been tried and tested in Named Entity Recognition systems and has proved its worth for further applications in the Portuguese translation field.

 

 

