Lauri Karttunen (Palo Alto Research Center and Stanford University)
Bernardo Magnini (FBK-irst, Trento)
Allan Ramsay (University of Manchester)
Ellen Riloff (University of Utah)
Karin Verspoor (Los Alamos National Laboratory)
Yorick Wilks (University of Sheffield)
Wasting a chance is not like wasting a dollar
Lauri Karttunen, Palo Alto Research Center
The PASCAL RTE (Recognizing Textual Entailment) challenge, now in its third iteration, requires a system to recognise, given two text fragments, whether the meaning of one can be inferred from the other. For example, the computer should be able to tell that John left follows from John and Mary left, but not vice versa. The advantage of this task as a test bed for computational semantics is that it abstracts away from particular meaning representations and inference procedures. The inference may be purely linguistic, as in the case above, or it may require some additional world knowledge, as in the case of The plane landed in Sofia and The plane landed in Bulgaria.
In previous work (Nairn, Condoravdi, and Karttunen 2006), we focused on the case where one of the sentences is embedded in the other as a that-clause or as an infinitival complement of a simple verb. We described our implementation for handling the infinitival complements of so-called two-way implicative verbs such as manage, remember, forget, and fail. A characteristic feature of these verbs is that they yield entailments both in affirmative and negative contexts:
(1) a. Joe remembered to shave. => Joe shaved.
    b. Joe did not remember to shave. => Joe did not shave.
(2) a. Joe forgot to shave. => Joe did not shave.
    b. Joe did not forget to shave. => Joe shaved.
Another interesting property of these verbs is that they can be stacked on top of each other. For example,
(3) Joe failed to remember to shave this morning. => Joe did not shave this morning.
Our 2006 implementation assigns to each clause a context headed by the main verb of the clause. A single recursive rule computes the veridicality relations from the top-level context down to embedded contexts, taking into account the semantic properties of the verbs that link them. In (3), fail is said to be instantiable (true) because it is the head of the top-level context. But the context headed by shave is antiveridical with respect to the top context. Hence the system draws the inference that Joe did not shave this morning.
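The recursive polarity computation described above can be sketched as a toy model. This is not the authors' implementation; the verb signatures and function names below are illustrative only, encoding just the two-way implicative behaviour from the examples.

```python
# Toy model of two-way implicative entailment computation.
# Each signature maps the polarity of the embedding context
# ("+" veridical, "-" antiveridical) to the polarity entailed
# for the complement clause.
IMPLICATIVES = {
    "manage":   {"+": "+", "-": "-"},   # ++ / --
    "remember": {"+": "+", "-": "-"},
    "fail":     {"+": "-", "-": "+"},   # +- / -+
    "forget":   {"+": "-", "-": "+"},
}

def complement_polarity(verbs, negated=False):
    """Entailed polarity of the innermost clause under a stack of
    implicative verbs, outermost first, e.g. ["fail", "remember"]."""
    polarity = "-" if negated else "+"   # top-level context
    for verb in verbs:                   # recurse down the stack
        polarity = IMPLICATIVES[verb][polarity]
    return polarity

# "Joe failed to remember to shave." => Joe did not shave.
print(complement_polarity(["fail", "remember"]))      # "-"
# "Joe did not forget to shave." => Joe shaved.
print(complement_polarity(["forget"], negated=True))  # "+"
```

Because each signature is a total map from context polarity to complement polarity, stacking verbs is just function composition, which is why a single recursive rule suffices.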
In this paper we extend the treatment of simple implicative verbs like manage and fail to a wide range of phrasal two-way implicative constructions such as take the time, use the occasion, and have the foresight in (4).
(4) a. Joe had the foresight to leave early. => Joe left early.
    b. Joe did not have the foresight to leave early. => Joe did not leave early.
The phrase have the foresight is not an isolated idiom. There are about 30 nouns in addition to foresight that yield the same entailments in the have [the N to X] construction.
In this paper we show how all the relevant entailments for phrasal two-way implicatives can be computed with the help of a single additional semantic rule that takes advantage of the previous treatment for simple implicative verbs like manage.
The construction waste a/the N to X is interesting because the polarity of the entailment depends on the type of noun:
(5) a. I wasted an opportunity to watch that movie. => I did not watch that movie.
    b. I did not waste an opportunity to watch that movie. => I watched that movie.
(6) a. I wasted a dollar to watch that movie. => I watched that movie.
    b. I did not waste a dollar to watch that movie. => I did not watch that movie.
The rule for phrasal two-way implicatives with waste as the verb requires making a distinction between chance nouns (an opportunity) and asset nouns (a dollar). Wasting a chance on something is different from wasting a dollar on it.
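A noun-class-sensitive rule of this kind might be sketched as follows. This is a hypothetical toy, not the paper's rule; the noun lexicon and function names are illustrative assumptions.

```python
# Hypothetical sketch: the entailment polarity of "waste a/the N to X"
# depends on whether N is a chance noun or an asset noun.
NOUN_CLASS = {
    "opportunity": "chance",
    "chance":      "chance",
    "dollar":      "asset",    # illustrative asset nouns
    "hour":        "asset",
}

def waste_entailment(noun, negated=False):
    """Return "+" if the complement event X is entailed to have
    occurred, "-" if it is entailed not to have occurred."""
    context = "-" if negated else "+"
    if NOUN_CLASS[noun] == "chance":
        # waste a chance to X behaves like fail to X: flips polarity
        return {"+": "-", "-": "+"}[context]
    else:
        # waste an asset to X behaves like manage to X: preserves it
        return context

# "I wasted an opportunity to watch that movie." => did not watch.
print(waste_entailment("opportunity"))           # "-"
# "I did not waste a dollar to watch that movie." => did not watch.
print(waste_entailment("dollar", negated=True))  # "-"
```

The design point is that the implicative signature attaches to the noun class of the construction, not to the verb waste alone.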
"Where can I eat paella this evening?": Context Aware Question Answering with Textual Inferences
Bernardo Magnini, FBK-irst, Trento
Recent advancements in Textual Entailment have shown its high potential for several tasks in applied semantics. In this talk I will focus on information access applications and argue that traditional drawbacks of NLP interfaces to knowledge bases, such as scalability and portability, can be minimized through the definition of semantic inferences at the textual level. I will present ongoing work in this direction within the QALL-ME project, where an entailment-based approach is being explored in the context of multilingual question answering for mobile devices.
Intensionality in everyday language
Allan Ramsay, University of Manchester
Everyday language makes a great deal of use of 'intensional'
operators, i.e. operators that comment on the truth or falsity of
other elements of the discourse. Reasoning with such operators is
notoriously difficult, and as a result they have not been paid as much
attention as they deserve. This is unfortunate, since if you fail, for
instance, to interpret the adjective 'fake' properly then hearing the
utterance 'I've got a fake Picasso here, which I'll let you have for £10000' is likely to lead you into a very expensive mistake.
(1) I have a fake Picasso.
    exists(B :: {fake(B) & Picasso(B)},
           exists(C,
                  have(C)
                  & theta(C, object, B)
                  & theta(C, agent, ref(lambda(D, speaker(D))))
                  & aspect(now, simple, C)))
Problems like this seem, at first sight, to be rather esoteric. We believe
that they are in fact widespread, and that a large number of everyday
constructions require intensional interpretations. This talk will consider
a number of examples which we have encountered while trying to develop a
natural language front-end to a medical
information system; outline the way we are dealing with these
problems; and discuss some issues relating to performance.
Finding Mutual Benefit between Information Extraction and
Subjectivity Analysis
Ellen Riloff, University of Utah
On the surface, the areas of information extraction (IE) and
subjectivity analysis seem quite different. The goal of information
extraction systems is to identify facts that are related to events
described in text. In contrast, the goal of subjectivity analysis is
to identify opinions, emotions, sentiments, and other private states
that are expressed in text. However, we have discovered a surprising
amount of synergy between these seemingly disparate tasks. In this
talk, we will overview a variety of research results that illustrate
how information extraction techniques and subjectivity analysis can be
brought together for mutual benefit. We will show how IE patterns can
be used to represent multi-word subjective expressions and to learn
subjective nouns, and how IE techniques can identify the sources of
opinions (opinion holders). We will also describe a method for
creating a subjective sentence classifier from unannotated texts, and
show how it can be incorporated into an IE system to improve IE
precision.
Semantics, Text and a Cure for Cancer
Karin Verspoor, Los Alamos National Laboratory
In this talk I will explore the role of knowledge management, broadly, and text processing, specifically, in applications supporting biology
research.
The biology domain is an incredibly rich domain, in terms of the amount of
biological data that exists, the complexity and importance of the problems
being tackled, and the potential for impact from natural language
processing research. Baumgartner et al. (2007) show clearly that current
manual efforts for the construction of biological knowledge resources are
not adequate to achieve annotation of already existing biological data,
let alone the immense quantities of new data that are being generated each
year.
Semantic analysis of knowledge resources and text mining provide potential
solutions to this problem. I will discuss some examples of biological
problems for which automated methods are being developed by the
mathematical, CS, and NLP communities, and introduce some recent
preliminary work aimed at identifying the genes that play an important
role in cancer.
Pragmatics, dialogue phenomena and the COMPANIONS project
Yorick Wilks (University of Sheffield)
How much is linguistic pragmatics essentially connected to dialogue rather
than text, given that definitions of pragmatics are not usually such as to
distinguish between dialogue and prose? The concrete context of the talk
is the large EU consortium project COMPANIONS, whose aims I will briefly set
out: that of building persistent conversational agents as interfaces to
the Internet, persistent in that they would appear to know their owner,
his or her tastes, beliefs and interests. It is clear that such a project
needs above all a robust approach to computational conversation, and a
section of the talk is devoted to a brief history of computational
dialogue systems and how far their success has depended on tackling
phenomena that seem pragmatic, on any definition. The argument of the talk
is that there could be no such agents without some progress in areas
conventionally considered pragmatic, and that those who (still) believe
that such phenomena can be modelled by extensions of the statistical
methods of speech research are wrong, powerful as their methods and
successes are. This is because adequate modelling will require not only
notions of belief (which speech research methods can now begin to model,
much to the surprise of many) but also point-of-view pragmatics. I take
the inclusion of this phenomenon as the touchstone for any ambitious
dialogue pragmatics, and argue that pragmatic theories that lack it, such
as relevance theory, cannot even be candidates for such modelling. I will
describe what I mean by point-of-view pragmatics, and argue that it is
difficult to produce a minimalist version of such a theory for a practical
system, but it is necessary to continue to try, as we are doing in the
COMPANIONS project.