RANLP vol. IV

Volume Information

Nicolas Nicolov, Kalina Bontcheva, Galia Angelova & Ruslan Mitkov (2007) Recent Advances in Natural Language Processing IV (Current Issues in Linguistic Theory 292) Amsterdam & Philadelphia: John Benjamins.


Ordering Information

The book can be ordered from its entry on the John Benjamins website:
http://www.benjamins.com/cgi-bin/t_bookview.cgi?bookid=CILT%20292


BibTeX Entry

@BOOK{Nicolov-Bontcheva-Angelova-Mitkov'2007
  ,EDITOR    = {Nicolas Nicolov and Kalina Bontcheva and Galia Angelova and Ruslan Mitkov}
  ,TITLE     = {{R}ecent {A}dvances in {N}atural {L}anguage {P}rocessing {IV}: {S}elected {P}apers from {RANLP} 2005}
  ,PUBLISHER = {John Benjamins}
  ,YEAR      = {2007}
  ,VOLUME    = {292}
  ,SERIES    = {{C}urrent {I}ssues in {L}inguistic {T}heory}
  ,ADDRESS   = {Amsterdam \& Philadelphia}
  ,ISBN      = {978 90 272 4807 7}
}

Table of Contents

Editors' Foreword   ix

I.  COMPUTATION FOR LINGUISTICS

John Nerbonne
Linguistic challenges for computationalists   1

II.  INFORMATION EXTRACTION & INDEXING

Ralph Grishman
NLP: An information extraction perspective   17

Florian Seydoux & Jean-Cedric Chappelier
Semantic indexing using minimum redundancy cut in ontologies   25

Niraj Aswani, Valentin Tablan, Kalina Bontcheva & Hamish Cunningham
Indexing and querying linguistic metadata and document content   35

Irina Matveeva, Gina-Anne Levow, Ayman Farahat & Christiaan Royer
Term representation with Generalized Latent Semantic Analysis   45

III.  PARSING

Ming-Wei Chang, Quang Do & Dan Roth
Multilingual dependency parsing: A pipeline approach   55

Sandra Kübler
How does treebank annotation influence parsing? Or how not to compare apples and oranges   79

Laura Alonso, Joan Antoni Capilla, Irene Castellon, Ana Fernandez-Montraveta & Gloria Vazquez
The SenSem project: Syntactico-semantic annotation of sentences in Spanish   89

IV.  ANAPHORA & REFERRING EXPRESSIONS

Robert Dale
Generating referring expressions: Past, present and future   99

Erhard W. Hinrichs, Katja Filippova & Holger Wunsch
A data-driven approach to pronominal anaphora resolution for German   115

V.  CLASSIFICATION

Nicolas Nicolov & Franco Salvetti
Efficient spam analysis for weblogs through URL segmentation   125

Filip Ginter, Sampo Pyysalo & Tapio Salakoski
Document classification using semantic networks with an adaptive similarity measure   137

Rada Mihalcea & Samer Hassan
Text summarization for improved text classification   147

Caroline Sporleder & Alex Lascarides
Exploiting linguistic cues to classify rhetorical relations   157

VI.  TEXTUAL ENTAILMENT & QUESTION ANSWERING

Milen Kouylekov & Bernardo Magnini
Tree edit distance for textual entailment   167

Jorg Tiedemann
A genetic algorithm for optimising information retrieval with linguistic features in question answering   177

Vasile Rus & Arthur Graesser
Lexico-syntactic subsumption for textual entailment   187

Courtney Corley, Andras Csomai & Rada Mihalcea
A knowledge-based approach to text-to-text similarity   197

VII.  ONTOLOGIES

Keiji Shinzato & Kentaro Torisawa
A simple WWW-based method for semantic word class acquisition   207

Eduard Barbu & Verginica Barbu Mititelu
Automatic building of Wordnets   217

VIII.  MACHINE TRANSLATION

Stelios Piperidis, Panagiotis Dimitrakis & Irene Balta
Lexical transfer selection using annotated parallel corpora   227

Victoria Arranz, Elisabet Comelles & David Farwell
Multi-perspective evaluation of the FAME speech-to-speech translation system for Catalan, English and Spanish   237

Daniel Varga, Peter Halacsy, Andras Kornai, Viktor Nagy, Laszlo Nemeth & Viktor Tron
Parallel corpora for medium density languages   247

IX.  CORPORA

Anne De Roeck
The role of data in NLP: The case for dataset profiling   259

Anne De Roeck, Avik Sarkar & Paul H. Garthwaite
Even very frequent function words do not distribute homogeneously   267

Lucia Specia, Maria das Graças V. Nunes & Mark Stevenson
Exploiting parallel texts to produce a multilingual sense tagged corpus for word sense disambiguation   277

Francis Chantree, Alistair Willis, Adam Kilgarriff & Anne De Roeck
Detecting dangerous coordination ambiguities using word distribution   287

List and Addresses of Contributors   297

Index of Subjects and Terms   303




List of Authors and Titles, Abstracts and Individual BibTeX Entries


I.  COMPUTATION FOR LINGUISTICS

Nerbonne, John. 2007. "Linguistic Challenges for Computationalists". Recent Advances in Natural Language Processing IV ed. by N.Nicolov, K.Bontcheva, G.Angelova & R.Mitkov (= Current Issues in Linguistic Theory, CILT 292), pp. 1-16. Amsterdam & Philadelphia: John Benjamins.



II.  INFORMATION EXTRACTION & INDEXING

Grishman, Ralph. 2007. "NLP: An Information Extraction Perspective". Recent Advances in Natural Language Processing IV ed. by N.Nicolov, K.Bontcheva, G.Angelova & R.Mitkov (= Current Issues in Linguistic Theory, CILT 292), pp. 17-23. Amsterdam & Philadelphia: John Benjamins.



Seydoux, Florian & Jean-Cedric Chappelier. 2007. "Semantic Indexing Using Minimum Redundancy Cut in Ontologies". Recent Advances in Natural Language Processing IV ed. by N.Nicolov, K.Bontcheva, G.Angelova & R.Mitkov (= Current Issues in Linguistic Theory, CILT 292), pp. 25-34. Amsterdam & Philadelphia: John Benjamins.



Aswani, Niraj, Valentin Tablan, Kalina Bontcheva & Hamish Cunningham. 2007. "Indexing and Querying Linguistic Metadata and Document Content". Recent Advances in Natural Language Processing IV ed. by N.Nicolov, K.Bontcheva, G.Angelova & R.Mitkov (= Current Issues in Linguistic Theory, CILT 292), pp. 35-44. Amsterdam & Philadelphia: John Benjamins.



Matveeva, Irina, Gina-Anne Levow, Ayman Farahat & Christiaan Royer. 2007. "Term Representation with Generalized Latent Semantic Analysis". Recent Advances in Natural Language Processing IV ed. by N.Nicolov, K.Bontcheva, G.Angelova & R.Mitkov (= Current Issues in Linguistic Theory, CILT 292), pp. 45-54. Amsterdam & Philadelphia: John Benjamins.



III.  PARSING

Chang, Ming-Wei, Quang Do & Dan Roth. 2007. "Multilingual Dependency Parsing: A Pipeline Approach". Recent Advances in Natural Language Processing IV ed. by N.Nicolov, K.Bontcheva, G.Angelova & R.Mitkov (= Current Issues in Linguistic Theory, CILT 292), pp. 55-78. Amsterdam & Philadelphia: John Benjamins.



Kübler, Sandra. 2007. "How Does Treebank Annotation Influence Parsing? Or How Not to Compare Apples and Oranges". Recent Advances in Natural Language Processing IV ed. by N.Nicolov, K.Bontcheva, G.Angelova & R.Mitkov (= Current Issues in Linguistic Theory, CILT 292), pp. 79-88. Amsterdam & Philadelphia: John Benjamins.



Alonso, Laura, Joan Antoni Capilla, Irene Castellon, Ana Fernandez-Montraveta & Gloria Vazquez. 2007. "The SenSem Project: Syntactico-Semantic Annotation of Sentences in Spanish". Recent Advances in Natural Language Processing IV ed. by N.Nicolov, K.Bontcheva, G.Angelova & R.Mitkov (= Current Issues in Linguistic Theory, CILT 292), pp. 89-98. Amsterdam & Philadelphia: John Benjamins.



IV.  ANAPHORA & REFERRING EXPRESSIONS

Dale, Robert. 2007. "Generating Referring Expressions: Past, Present and Future". Recent Advances in Natural Language Processing IV ed. by N.Nicolov, K.Bontcheva, G.Angelova & R.Mitkov (= Current Issues in Linguistic Theory, CILT 292), pp. 99-114. Amsterdam & Philadelphia: John Benjamins.



Hinrichs, Erhard W., Katja Filippova & Holger Wunsch. 2007. "A Data-Driven Approach to Pronominal Anaphora Resolution for German". Recent Advances in Natural Language Processing IV ed. by N.Nicolov, K.Bontcheva, G.Angelova & R.Mitkov (= Current Issues in Linguistic Theory, CILT 292), pp. 115-124. Amsterdam & Philadelphia: John Benjamins.



V.  CLASSIFICATION

Nicolov, Nicolas & Franco Salvetti. 2007. "Efficient Spam Analysis for Weblogs through URL Segmentation". Recent Advances in Natural Language Processing IV ed. by N.Nicolov, K.Bontcheva, G.Angelova & R.Mitkov (= Current Issues in Linguistic Theory, CILT 292), pp. 125-136. Amsterdam & Philadelphia: John Benjamins.



Ginter, Filip, Sampo Pyysalo & Tapio Salakoski. 2007. "Document Classification Using Semantic Networks with an Adaptive Similarity Measure". Recent Advances in Natural Language Processing IV ed. by N.Nicolov, K.Bontcheva, G.Angelova & R.Mitkov (= Current Issues in Linguistic Theory, CILT 292), pp. 137-146. Amsterdam & Philadelphia: John Benjamins.



Mihalcea, Rada & Samer Hassan. 2007. "Text Summarization for Improved Text Classification". Recent Advances in Natural Language Processing IV ed. by N.Nicolov, K.Bontcheva, G.Angelova & R.Mitkov (= Current Issues in Linguistic Theory, CILT 292), pp. 147-156. Amsterdam & Philadelphia: John Benjamins.




Sporleder, Caroline & Alex Lascarides. 2007. "Exploiting Linguistic Cues to Classify Rhetorical Relations". Recent Advances in Natural Language Processing IV ed. by N.Nicolov, K.Bontcheva, G.Angelova & R.Mitkov (= Current Issues in Linguistic Theory, CILT 292), pp. 157-166. Amsterdam & Philadelphia: John Benjamins.



VI.  TEXTUAL ENTAILMENT & QUESTION ANSWERING

Kouylekov, Milen & Bernardo Magnini. 2007. "Tree Edit Distance for Textual Entailment". Recent Advances in Natural Language Processing IV ed. by N.Nicolov, K.Bontcheva, G.Angelova & R.Mitkov (= Current Issues in Linguistic Theory, CILT 292), pp. 167-176. Amsterdam & Philadelphia: John Benjamins.



Tiedemann, Jorg. 2007. "A Genetic Algorithm for Optimising Information Retrieval with Linguistic Features in Question Answering". Recent Advances in Natural Language Processing IV ed. by N.Nicolov, K.Bontcheva, G.Angelova & R.Mitkov (= Current Issues in Linguistic Theory, CILT 292), pp. 177-186. Amsterdam & Philadelphia: John Benjamins.



Rus, Vasile & Arthur Graesser. 2007. "Lexico-Syntactic Subsumption for Textual Entailment". Recent Advances in Natural Language Processing IV ed. by N.Nicolov, K.Bontcheva, G.Angelova & R.Mitkov (= Current Issues in Linguistic Theory, CILT 292), pp. 187-196. Amsterdam & Philadelphia: John Benjamins.



Corley, Courtney, Andras Csomai & Rada Mihalcea. 2007. "A Knowledge-based Approach to Text-to-Text Similarity". Recent Advances in Natural Language Processing IV ed. by N.Nicolov, K.Bontcheva, G.Angelova & R.Mitkov (= Current Issues in Linguistic Theory, CILT 292), pp. 197-206. Amsterdam & Philadelphia: John Benjamins.



VII.  ONTOLOGIES

Shinzato, Keiji & Kentaro Torisawa. 2007. "A Simple WWW-based Method for Semantic Word Class Acquisition". Recent Advances in Natural Language Processing IV ed. by N.Nicolov, K.Bontcheva, G.Angelova & R.Mitkov (= Current Issues in Linguistic Theory, CILT 292), pp. 207-216. Amsterdam & Philadelphia: John Benjamins.



Barbu, Eduard & Verginica Barbu Mititelu. 2007. "Automatic Building of Wordnets". Recent Advances in Natural Language Processing IV ed. by N.Nicolov, K.Bontcheva, G.Angelova & R.Mitkov (= Current Issues in Linguistic Theory, CILT 292), pp. 217-226. Amsterdam & Philadelphia: John Benjamins.



VIII.  MACHINE TRANSLATION

Piperidis, Stelios, Panagiotis Dimitrakis & Irene Balta. 2007. "Lexical Transfer Selection Using Annotated Parallel Corpora". Recent Advances in Natural Language Processing IV ed. by N.Nicolov, K.Bontcheva, G.Angelova & R.Mitkov (= Current Issues in Linguistic Theory, CILT 292), pp. 227-236. Amsterdam & Philadelphia: John Benjamins.



Arranz, Victoria, Elisabet Comelles & David Farwell. 2007. "Multi-Perspective Evaluation of the FAME Speech-to-Speech Translation System for Catalan, English and Spanish". Recent Advances in Natural Language Processing IV ed. by N.Nicolov, K.Bontcheva, G.Angelova & R.Mitkov (= Current Issues in Linguistic Theory, CILT 292), pp. 237-246. Amsterdam & Philadelphia: John Benjamins.



Varga, Daniel, Peter Halacsy, Andras Kornai, Viktor Nagy, Laszlo Nemeth & Viktor Tron. 2007. "Parallel Corpora for Medium Density Languages". Recent Advances in Natural Language Processing IV ed. by N.Nicolov, K.Bontcheva, G.Angelova & R.Mitkov (= Current Issues in Linguistic Theory, CILT 292), pp. 247-258. Amsterdam & Philadelphia: John Benjamins.



IX.  CORPORA

De Roeck, Anne. 2007. "The Role of Data in NLP: The Case for Dataset Profiling". Recent Advances in Natural Language Processing IV ed. by N.Nicolov, K.Bontcheva, G.Angelova & R.Mitkov (= Current Issues in Linguistic Theory, CILT 292), pp. 259-266. Amsterdam & Philadelphia: John Benjamins.



De Roeck, Anne, Avik Sarkar & Paul H. Garthwaite. 2007. "Even Very Frequent Function Words Do Not Distribute Homogeneously". Recent Advances in Natural Language Processing IV ed. by N.Nicolov, K.Bontcheva, G.Angelova & R.Mitkov (= Current Issues in Linguistic Theory, CILT 292), pp. 267-276. Amsterdam & Philadelphia: John Benjamins.



Specia, Lucia, Maria das Graças V. Nunes & Mark Stevenson. 2007. "Exploiting Parallel Texts to Produce a Multilingual Sense Tagged Corpus for Word Sense Disambiguation". Recent Advances in Natural Language Processing IV ed. by N.Nicolov, K.Bontcheva, G.Angelova & R.Mitkov (= Current Issues in Linguistic Theory, CILT 292), pp. 277-286. Amsterdam & Philadelphia: John Benjamins.



Chantree, Francis, Alistair Willis, Adam Kilgarriff & Anne De Roeck. 2007. "Detecting Dangerous Coordination Ambiguities Using Word Distribution". Recent Advances in Natural Language Processing IV ed. by N.Nicolov, K.Bontcheva, G.Angelova & R.Mitkov (= Current Issues in Linguistic Theory, CILT 292), pp. 287-296. Amsterdam & Philadelphia: John Benjamins.