The workshop is partially supported by the European Commission via the project BIS-21++, FP6 contract no. INCO-CT-2005-016639.
Much has changed in the Balkans in the two years since the previous workshop, “Language and Speech Infrastructure for Information Access in the Balkan Countries”, was held in conjunction with RANLP’05. The languages spoken in this unique region are attracting more attention, owing to the rapidly developing field of communication and translation, and interest in language technologies for these languages is increasing. New markets are appearing, together with newly established collaborations and new opportunities to extend the application areas of natural language processing.
In the last decade, numerous activities have aimed at integrating Balkan NLP research with the widely applied models for other European languages, in the form of joint projects such as MulText East, BALRIC-LING, BalkaNet, INTERA and others. Language resources and grammatical knowledge for different Balkan languages have been encoded and processed according to international NLP standards and frameworks such as MTE, XCES, WordNet and INTERA. As a result of joint bilateral projects, the linguistic knowledge for some Balkan languages has been processed with well-known systems and models such as INTEX and GATE.
A unified NLP paradigm for the Balkan languages supports a common goal: the creation of a Balkan multilingual pool of NLP resources, approached from a monolingual or a multilingual (parallel or contrastive) perspective. Not all Balkan languages are equally close to achieving that goal. The main task of this workshop is therefore twofold: to review present achievements in developing a common NLP paradigm for the Balkan languages, and to suggest a roadmap for multilingual research and development carried out jointly by the members of the traditional Balkan language union.
Specific topics of interest for the proposed workshop are:
- NLP-driven models of large language data sets, for instance, grammatical dictionaries, syntactic collections, text categories, ontologies;
- collection and representation of large lexical resources conforming to international standards;
- compilation of large multilingual collections where a given Balkan language is paired with a widespread European language, or with another Balkan language;
- evaluation of the results of using widespread NLP tools for the Balkan languages;
- investigation/evaluation of the results of mapping the well-known and widely used NLP-driven models of the different Balkan languages.
Elena Paskaleva and Milena Slavcheva
(Institute for Parallel Processing, Bulgarian Academy of Sciences)
- Tanja Avgustinova (DFKI and University of Saarland, Germany)
- Dan Cristea (University of Iasi, Romania)
- Danica Damljanovic (University of Sheffield, UK)
- Tomaz Erjavec (Jozef Stefan Institute, Slovenia)
- Maria Gavrilidou (ILSP, Greece)
- Steven Krauwer (University of Utrecht, the Netherlands)
- Cvetana Krstev (University of Belgrade, Serbia)
- Kemal Oflazer (Sabanci University, Istanbul, Turkey)
- Petya Osenova (University of Sofia, Bulgaria)
- Stelios Piperidis (ILSP, Greece)
- Kiril Simov (Bulgarian Academy of Sciences, Bulgaria)
- Maria Stambolieva (Bulgarian Academy of Sciences, Bulgaria)
- Ralf Steinberger (EC - Joint Research Centre, Italy)
- Marko Tadic (University of Zagreb, Croatia)
- Dan Tufis (Romanian Academy of Sciences)
- Dusko Vitas (University of Belgrade, Serbia)
Important dates – submission deadline extended!
- 25 June 2007 - extended abstract, between 800 and 1000 words;
- 25 July 2007 - notification of acceptance;
- 20 August 2007 - final submission of the full paper, up to 7 pages in the format of RANLP-2007 (see the main conference site). Authors will be contacted between 20 and 30 August 2007 if small corrections are needed;
- 26 September 2007 - workshop with published proceedings of full papers.
Send your abstracts to: