Center of Estonian Language Resources (CELR)
Identification
Hosting Legal Entity
University of Tartu
Location
Liivi 2, Tartu, PO: 50409 (Estonia)
Structure
Type Of RI
Distributed
Coordinating Country
Estonia
Status
Current Status:
Under construction since 2011
Contacts
Contact Person
Vider Kadri
Position:
Executive Manager
Scientific Description
The Center of Estonian Language Resources (CELR) is an infrastructure that will provide access to language resources and technologies (language software, lexicons, text and speech corpora, linguistic databases) for all researchers, language technologists and other users. To achieve this, the existing digital archives and language resources will be interconnected and supplemented by language technology tools, offered as an environment with web-based access and services that use the archived data.
Estonia set up CELR as a national-level consortium of three institutions on 2 December 2011. The consortium, made up of the University of Tartu, the Institute of Cybernetics at Tallinn University of Technology and the Institute of the Estonian Language, acts as the organisational framework for coordinating and implementing Estonia's obligations as a member of CLARIN ERIC. CELR serves as the national node of the CLARIN infrastructure.
In the start-up phase, the Center will build an infrastructure consisting of a central data register and service servers, user authentication and authorisation systems, and a system for gathering standardised, well-documented and evaluated data collections.

RI Keywords
Speech and text processing, Language resources, Human language technology
Classifications
RI Category
Distributed Computing Facilities
Data Archives, Data Repositories and Collections
Communication Networks
Repositories
Research Data Service Facilities
Software Service Facilities
Research Archives
Databases
Conceptual Models
Scientific Domain
Information Science and Technology
Humanities and Arts
ESFRI Domain
Social and Cultural Innovation
Services
Access to the resources and tools deployed at the centre via specified, CLARIN-compliant interfaces in a stable and persistent way

Access rights to resources and tools are granted based on users' academic status (verified via the SAML protocol) and, for specific resources, on signed user licences. Users are authenticated via the national Identity Provider Federation (TAAT), which is connected to individual Identity Providers (e.g. University of Tartu) and to other CLARIN identity provider federations. This ensures single sign-on (SSO) capability across the CLARIN network for both Estonian and international users, and gives access to online tools and services.
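The access rule described above (academic status plus, for some resources, a signed licence) can be sketched as a simple check. This is an illustrative sketch only, not CELR's implementation: the class names, the `may_access` function and the affiliation values are assumptions, and a real deployment would take the affiliation from attributes asserted by the user's Identity Provider.

```python
from dataclasses import dataclass, field
from typing import Optional, Set

# Hypothetical affiliation values; the attributes actually released by an
# Identity Provider depend on the federation's own policy.
ACADEMIC_AFFILIATIONS = {"faculty", "student", "staff", "member"}


@dataclass
class User:
    user_id: str
    affiliations: Set[str]  # as asserted by the user's home Identity Provider
    signed_licences: Set[str] = field(default_factory=set)


@dataclass
class Resource:
    resource_id: str
    academic_only: bool = True               # requires verified academic status
    required_licence: Optional[str] = None   # licence that must be signed, if any


def may_access(user: User, resource: Resource) -> bool:
    """Return True if the user satisfies both access conditions."""
    if resource.academic_only and not (user.affiliations & ACADEMIC_AFFILIATIONS):
        return False
    if resource.required_licence is not None:
        return resource.required_licence in user.signed_licences
    return True
```

In this sketch a student who has not yet signed the licence of a restricted corpus is refused, and is admitted once the licence identifier is added to `signed_licences`.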

Central data register and repository service for language resources

Store language resources (LR) in a repository and make them accessible and visible via registry and linked metadata services; provide a persistent identifier (PID), i.e. a permanent address, for each resource (and each version of it) in the repository; connect the repository to the register so that resources can be searched by metadata; provide uniform descriptions, quality assessments etc.
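The idea of a permanent address for each version of each resource can be illustrated with a small in-memory registry. This is a sketch under stated assumptions: the `Registry` class, its methods and the `pid:example` prefix are hypothetical and do not reflect the PID scheme actually used by the centre.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Record:
    pid: str
    name: str
    version: int
    metadata: Dict[str, str]


class Registry:
    """Toy registry: every deposited version gets its own identifier."""

    def __init__(self, prefix: str = "pid:example") -> None:  # hypothetical prefix
        self.prefix = prefix
        self._records: Dict[str, Record] = {}
        self._latest: Dict[str, int] = {}

    def deposit(self, name: str, metadata: Dict[str, str]) -> str:
        # A new deposit never overwrites an earlier one; it becomes a new version.
        version = self._latest.get(name, 0) + 1
        self._latest[name] = version
        pid = f"{self.prefix}/{name}/v{version}"
        self._records[pid] = Record(pid, name, version, dict(metadata))
        return pid

    def resolve(self, pid: str) -> Record:
        # Older versions stay resolvable under their original identifier.
        return self._records[pid]

    def versions_of(self, name: str) -> List[str]:
        return [r.pid for r in self._records.values() if r.name == name]
```

Because an identifier always points at one fixed version, a study that cites the identifier of the first deposit can later be repeated against exactly that data, even after newer versions have been deposited.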

Knowledge and expertise service for RI users

User support.

Central IT services such as hosting, archiving, deposit and preservation of Estonian language and other linguistic data; user access and authorisation for services, provided via the national identity federation

Storage space; help with properly describing, licensing and standardising resources; persistent access. Resources are in general not updated (e.g. to newer standards or newer software versions), although tools for automatic conversion from one standard format to another might be provided. Version management: older versions are kept, which also makes it possible to repeat research experiments made using a given version of a resource. Usage statistics for users and services.

Metadata portal, data category registration service, schema registration services for language resources

Make it easier to find resources in both the centre and connected networks via faceted search; properly described resources allow metadata to be exchanged automatically in official formats with other language resource infrastructures (META-SHARE and CLARIN); browsing of the schemas in use for different types of language resources and registration of new schemas.
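Faceted search over metadata records can be sketched as two small operations: counting the values of a facet, and narrowing the result set by the facet values a user has selected. The field names and sample records in the usage note are invented for illustration and are not taken from the centre's metadata schema.

```python
from collections import Counter
from typing import Dict, Iterable, List

MetadataRecord = Dict[str, str]


def facet_counts(records: Iterable[MetadataRecord], facet: str) -> Counter:
    """Count how many records carry each value of the given facet."""
    return Counter(r[facet] for r in records if facet in r)


def filter_by_facets(
    records: Iterable[MetadataRecord], selected: Dict[str, str]
) -> List[MetadataRecord]:
    """Keep only records matching every selected facet value."""
    return [
        r for r in records
        if all(r.get(facet) == value for facet, value in selected.items())
    ]
```

With three hypothetical records, two of type `corpus` and one of type `lexicon`, `facet_counts(records, "type")` reports the two groups, and `filter_by_facets(records, {"type": "corpus", "modality": "speech"})` narrows the list to speech corpora only.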

Equipment
Central servers for hosting, archiving and deposit of language resources and tools; servers for producing and testing research models

Live Server (Supermicro 6037R) for hosting all live applications, including the registry of language resources and the repository of language resources. All of the resources will be listed in the registry and stored in the repository under version control; Test Server (Supermicro 6037R); Storage (Infotrend ESDS S16E-R2240); Computational Server (Supermicro 5086B).

Web-based user environment for using language resources and tools for research and development purposes

The registry of language resources; the repository of language resources; the web page of the Center of Language Resources; project management software; virtualization software; version control software.

Collaborations
Networks
META-SHARE
CLARIN
Impact
Societal Grand Challenges
Inclusive, innovative and secure societies
Funding
National Public Funding Organisations
Research Infrastructures of National Importance (Estonia)
Estonian Research Infrastructures Roadmap 2010
Date of last update: 31/03/2017