FCT

R&D Institutions

Result of the 2007 evaluation in the area of Language Sciences

R&D Unit

Centro de Linguística da Universidade de Lisboa [LIN-LVT-Lisboa-214] visited on 13/11/2007

Classification: Excellent

Comments of the evaluation panel
About the unit
The evaluation panel considered CLUL to be the Unit with the best resources and infrastructures among all the visited Units. ONSET has also been incorporated into this Unit and the two separate Units’ original lines of research will now be continued in two main research groups referred to as SKAi and LinSe.
These two new groups appear in some respects to be somewhat forced. It appears that the groups have been formed mainly in order to achieve critical mass in terms of PhD researchers rather than as the result of well articulated research goals. Nevertheless, this is likely to be a good strategy for the time being and it will definitely contribute to a necessary and fruitful scientific exchange, although further internal restructuring of the Unit may be expected in the future.
The Unit seems to be providing adequate support to its PhD students and is encouraging them to participate in international conferences and exchange programs with other research Units, both national and international. In particular, this Unit’s PhD students have recently organized a conference on the occasion of the 75th anniversary of CLUL and its precursor (Centro de Estudos Filológicos). The involvement of students in this type of activity must be seen as a reassuring sign of the students’ engagement in the fields of scientific research pursued by the Unit.
The Unit has very good results in its corpus, thesaurus (WordNet) and geo-linguistics work, but it appears to lack the necessary technical know-how to make that work effectively available to the scientific community. The addition of a technician or computational linguist capable of implementing databases and efficient web interfaces for these materials would probably lead to a significant boost of the Unit’s international visibility in these fields.
The phonetics and phonology group now has an adequate recording studio, but surprisingly a background vibration noise generated by the cooling system of computers installed in a neighbouring room can be heard in the studio. This is a strange and unacceptable situation that strongly diminishes the value of the investment in the recording studio. The panel did not examine the adjacent room where the source of the vibration noise appeared to be located, but we believe that it should be possible to significantly reduce the noise level by mounting the cooling system on adequate elastic suspension systems.
The Unit’s resources concerning the eye-tracking system available from the former ONSET centre are also good but, as the group points out, the system should be upgraded. The panel suggests that the group consider, in view of that upgrade, alternative eye-tracking systems that do not require locking the subject’s head and would allow the group to include younger children and even infants in their future research involving eye-tracking measures.
About the research groups
COMPGRAM - Computation of Lexicon and Grammar [RG-X-LIN-LVT-Lisboa-214-671]
The group has been active in building a WordNet for Portuguese (which is still rather small: 18,000 entries). But this group is really the centre of Portuguese WordNet activities. Unfortunately they are restricted in the distribution of the WordNet resource due to funding agreement limitations. This is regrettable since this resource could be of commercial value (e.g. to companies building Portuguese search engines), which could possibly finance future research and development by the group.
The group has been actively publishing at international conferences (incl. 1 Coling/ACL paper, 1 CICLing paper, 1 LREC paper). In fact this group shows a clear orientation towards international publications and international cooperation (e.g. through joint PhD supervision).
The young researchers in the team have actively participated in the development of the Portuguese WordNet, in building computational grammars (in the spirit of HPSG), and also in the publication of the research results at international conferences. Although the group does not report any finished PhD theses, 7 PhD theses are on-going of which 1 PhD thesis has already been submitted. At the same time the group is uncertain about the continuation of funding in the future.
The panel has the impression that the group is very restricted in terms of office space.
GEO - Geolinguistics [RG-X-LIN-LVT-Lisboa-214-673]
The ongoing projects are above all longitudinal atlas works of high standard (or very high standard): Linguistic and Ethnographic Atlas of Portugal and Galicia, ... of Azores, Linguistic Atlas of the Portuguese Coast, ... of the Romance Domain (Atlas Linguistique Roman), Atlas Linguarum Europae (ALE). The research group is also working with a syntax-oriented corpus of Portuguese dialects, Mirandese and Barranquenho. Good organization, interdisciplinary activities good as well. The research group did not organize any conferences in 2003-2006, but its members have been visible at many international conferences. Internationalization through participation in international projects, reviews in international journals and collaborative publications. But the internationalization could be strengthened.
No objections against the organization or the leadership. Culture of creativity, but it must be underlined that it's a small research group with great need for recruitment.
Great need for resources to build a photographic archive. Also need for more infrastructural resources (to build databases and to make the material available).
Grammar [RG-X-LIN-LVT-Lisboa-214-1534]
The group has a fairly large number of publications in peer-reviewed journals and other publications in English. They have also produced an impressive number of PhDs and Masters theses.
Language Engineering Laboratory [RG-X-LIN-LVT-Lisboa-214-1535]
This group takes a refreshingly engineering approach to building large scale linguistic resources for Portuguese. For example, they report on having a lexicon with about 120,000 lemmas.
The group has been surprisingly productive in light of the little (FCT) funding that they have received (17,000 Euros). They report 4 publications in peer-reviewed journals (all of them in Lingvisticae Investigationes). The conference publications include 1 Corpus Linguistics paper, 1 COLING workshop paper and 1 EACL paper, all of which are international conferences of medium to high standard.
Unfortunately the group's online tools (PoS tagger and morphological analyzer) that they advertised in the group report did not work.
LinSe - Linguistic Structures and Processing [RG-LIN-LVT-Lisboa-214-1538]
Linguistic Structures and Processing (LinSe) is a new research group, starting with the coming project period, and formed of three previous research groups: PhoCo, Grammar and the Psycholinguistics Laboratory. At the site visit, nine subprojects were described as planned for LinSe. The general topics of the subprojects focus on relatively traditional issues; how they are addressed will influence the extent to which they make a broader international contribution.
The LinSe research group is large, with 35 PhD researchers in addition to many non-PhD researchers. Given their many overlapping interests, the group has the potential to be highly productive in the subprojects that have been planned and are already funded. The Psycholinguistics Lab could be expected to be a natural unifying point for the new group, given its research on speech perception (connecting it to PhoCo) and sentence processing (connecting it to Grammar). Notably, of the 9 subprojects described at the site visit as planned for LinSe, only the previous Psycholinguistics and Grammar groups appear to be actively involved thematically and as subproject leaders.
Although the organization of subprojects is in place, the research group’s size will make it essential to also find a solid organizational and leadership structure, as well as a clear means for transferring information within the group. This may be especially important if, as it appears, the previous Psycholinguistics and Grammar groups will be taking on more of a leading position in the newly formed LinSe research group.
During the site visit, it did not come through that this had yet been given much consideration.
Above, only "Equipment" and "Funding" are evaluated.
The Psycholinguistics laboratory is already using the software Eprime for preparing, running and gathering experimental responses. Based on the objectives of the new LinSe research group, this software will be in greater use. These activities could be inexpensively expanded and made more efficient by having more Eprime licenses and the PCs to use them on.
A Phonetics Lab has recently been built for good quality recordings and running perception experiments. Within the new LinSe research group, these facilities will be particularly relevant for the subproject on Articulatory Production Assessment Test for European Portuguese.
However, since the laboratory was installed, a neighbouring room has become a storage area for computer equipment which emits a low frequency noise which is structurally transmitted to the new laboratory and disturbs the quality of recordings. To eliminate the interfering noise would require vibration isolation for the equipment in the neighbouring room. This investment is recommended to achieve the sound quality needed for the laboratory.
In connection with the "Summary of Individual Group Evaluation" below, no evaluation is given for "Productivity" or "Training" since this is a new group. The "Proposed overall rating" of the group is based only on expected "Relevance" and "Feasibility". For "Feasibility", the new group on the one hand has very good feasibility for carrying out the projects they have planned, based on the number of PhD researchers and already having funding for the individual projects. On the other hand, unclear plans for coordinating and organizing activities in the group leave uncertainty. It is on this basis that the "Proposed overall rating" is tipped in the direction of "good" rather than "very good".
MATE - Modern and Ancient Texts' Study and Edition [RG-X-LIN-LVT-Lisboa-214-674]
The site visit was very informative. I found an ambitious group working with closely related disciplines such as historical and Romance linguistics, textual criticism, bibliography and book studies. More than 15 edited volumes in contemporary literature have been produced during the last twenty years; e.g. Fernando Pessoa and Jean Seul de Méluret. Also scientific work with the Occitan Poetry Electronic Edition; an electronic corpus of the Occitan troubadours' work in the feudal courts in Southern France, 12th to 13th centuries. Other projects are also run (historical study of Portuguese Personal and Place Names' lexicon, for example), and educational tools (posters to describe contact phenomena in Portuguese language) are produced as well. The interdisciplinary and outreach activities are of good standard. Not many Master's or PhD theses were completed during the period. International visibility good: international conference organization, network research groups, and other activities.
Research group well organized, a culture of discussion and interactivity where younger researchers are encouraged to participate.
Library resources very good (excellent), but the group felt need for a microfilm reader, for example. Funding for travel needed.
PhoCo (Speech Group) [RG-X-LIN-LVT-Lisboa-214-666]
“Phonological Connections” (PhoCo) is a strongly established group of researchers including 2 fulltime PhD researchers and 4 PhDs whose time is split between research and teaching. In the coming project period, the group will be integrated into the larger research group, LinSe.
The group’s objectives for the previous project period, as well as in the future LinSe research group, are on the broad relationship between phonology and phonetics, in particular addressing traditional prosodic issues linking phonological and syntactic structure. The group has taken steps toward developing teacher training materials and aims to contribute to text-to-speech and automatic speech recognition systems.
These are relevant aims given the number of researchers in the group and their experience. The 2003-2006 project period has led to 2 internationally and 1 nationally peer-reviewed journal articles. Other publications include 3 nationally published book chapters. Their research includes European Portuguese and Mirandese, and extends to other languages, giving their research direct potential for national and international relevance, and the possibility for more extensive publication of their work, in particular in peer-reviewed journals.
A central activity during the last project period has been the construction of a speech laboratory. The group’s other activities include involvement in the organization of a major international conference (InterSpeech) in Lisbon, conferences for 2 other international organizations as well as numerous local meetings. The group has also participated in an international network on French Phonology and 4 international doctoral courses. Student training has produced 1 PhD and 1 MA within their group, and a co-supervision of 1 MA in Brazil.
During the site visit, the research group noted that students showed little interest and that there had been difficulties recruiting MA and PhD students. A strategy for addressing this was not formulated.
Above, "Facilities", "Library", "Technical support" and "Secretarial support" were not evaluated for the research group. "Equipment" and "Funding" are rated as "very good" relative to comparable research groups visited.
The group has recently built a phonetics laboratory for good quality recordings and running perception experiments. In the meantime, a neighbouring room has become a storage area for computer equipment which emits a low frequency noise which is structurally transmitted to the new laboratory and disturbs the quality of recordings. To eliminate the interfering noise would require vibration isolation for the equipment in the neighbouring room. This investment is recommended to achieve the sound quality needed for the laboratory.
Psycholinguistics Laboratory [RG-X-LIN-LVT-Lisboa-214-1537]
The Psycholinguistics Laboratory research group has objectives under the general topic of human processing of natural languages within which research has addressed relevant and interesting issues in first and second language acquisition of sentence-level processing, using eye tracking methods for reading and listening tests for spoken language perception. These objectives have been wide-reaching and, based on their production, difficult to address in detail for the relatively small research group (4 researchers with PhDs).
During this last project period the group has had 1 national peer-reviewed journal article, 4 international and 4 national conference presentations, as well as 2 updated book editions in Portuguese. They participated in coordinating an international conference on language acquisition and have had collaborations with researchers from Brazil and France. They have also produced 3 PhDs and 3 MAs.
During the coming project period, the research group is being integrated with the PhoCo and Grammar research groups, to form a new group on Linguistic Structures and Processing (LinSe). The current Psycholinguistics Lab group is expected to be a natural unifying point for the new group, given its research on speech perception (tying it to PhoCo) and sentence processing (tying it to Grammar).
The research group gives the impression of having had a comfortable organization with mutual respect between PhD researchers and students.
In particular, students appear to be actively involved in laboratory work, and technically knowledgeable about the experimental methods and laboratory facilities they use.
Above, "Facilities", "Library", and "Secretarial support" were not evaluated for the research group. "Equipment", "Technical support" and "Funding" are rated as good.
The group’s technical support has been achieved by training students as technicians. This has the advantage of giving students solid technical experience. It has the disadvantage that it will be time consuming and will require training new students as student assistants complete their degrees.
The group’s laboratory activities could be inexpensively expanded by having several more licenses for Eprime, the software used for preparing, running and gathering experimental responses. The Psycholinguistics Lab group is already using the software and, with the formation of the new LinSe research group and its planned objectives, the laboratory will be in greater use.
REPORT - Resources for the study of contemporary Portuguese [RG-X-LIN-LVT-Lisboa-214-672]
The group has been active in compiling and annotating (incl. text-to-sound alignment) a great variety of corpora for Portuguese. They have collected various specialized corpora (Politics, Law, Economics, Media etc. but also for 5 African Portuguese varieties). They have a special interest in multi-word expressions.
Some corpora collected by this group are large scale resources of several 100 million words. Others are smaller but still very interesting initiatives (like the corpus for various Portuguese varieties) that allow the comparison of Portuguese vocabulary and grammar in various parts of the world.
The group has been successful in attracting 10 research grants summing to approximately 500,000 Euros.
I think the group needs to get access to more know-how from computational linguistics to help them build better databases and more flexible access tools, and even include some teaching of relevant programming languages. Therefore I support their proposal to hire an appropriate Computational Engineer within the newly-founded SKAi group.
There is no finished MA reported and only 1 PhD thesis which was co-supervised with ILTEC.
The PhD students that I met were very motivated and enthusiastic about their work. They mentioned that their work on African Portuguese varieties has even helped to educate high school teachers who deal with immigrants from these countries.
SKAi - Sources, Knowledge and Modelling [RG-LIN-LVT-Lisboa-214-1841]
This is a new group that combines the groups COMPGRAM, REPORT, GEO and MATE. I expect synergies especially between the COMPGRAM and REPORT groups. But in order to achieve the ambitious goals set forth in the group report, they need more know-how from computational linguistics. Therefore I support their proposal to hire an appropriate Computational Engineer.
Furthermore the group should include work on parallel corpora and multilingual resources. I have mentioned this in the discussion with the members of the group.

Comments of the unit

LIN-LVT-Lisboa-214

1. The publication of this evaluation report is certainly to be welcomed. It is based on our report, written 18 months ago, as well as on the panel’s visit, which took place more than a year ago. Meanwhile, the fusion of CLUL and ONSET, recommended by earlier evaluations, has been under way and has already led to considerable changes in the body then evaluated. We believe that this progressive operation can only be duly assessed over a longer time-span.

2. In due time we will take a position on details of the evaluation report, which, altogether, reads as a positive and constructive opinion on CLUL’s performance. Its wording, its praises and even its balanced criticisms and recommendations do not anticipate what we must take as an unfavourable result, which is left unexplained.
A research unit used to being graded Excellent does not expect to be downgraded without a clear statement of motives, unless this is part of a policy of general readjustment to perceived international standards, in which case the discussion moves elsewhere. Since we do not know the evaluation reports of other research units in linguistics or closely related fields, we will leave this point unaddressed for the time being.

3. CLUL is described in the evaluation report as the unit with the best resources and infrastructures in its scientific area. We thus expect that this unfavourable result does not undermine the conditions for our sustained development.
Last August, 26 foreign PhD holders, some of senior status, applied for positions at CLUL under the Ciência 2008 program. They were attracted by the prospect of working at a centre of national excellence with international prestige. How their opinion compares with the results of the evaluation report is not entirely clear. How the report will impair CLUL’s chances of being awarded the four research positions it applied for is unfortunately clearer.

14th December 2008
Centro de Linguística da Universidade de Lisboa