Corpus Linguistics

You are now in section > Corpus Linguistics > Projects

Alembic Workbench
Brown University Women Writers Project
HCRC Project
ICE


Alembic Workbench

Coordination: MITRE NL Group
Design/Purpose: The Alembic Workbench project aims to create a natural language engineering environment for developing tagged corpora.

Time:

Newsletter: no; a project overview is available.
Notes: If you agree to the ALEMBIC software licence, the tools are available free of charge.

Brown University Women Writers Project 

Coordination: Brown University
Design/Purpose: This project is creating a full-text database of women's writing in English from the period 1330-1830.

Time: ongoing
Newsletter: 1998 and previous issues
Notes: SGML markup following the TEI Guidelines

HCRC Project: ECI: European Corpus Initiative

Coordination: HCRC - Human Communication Research Centre
Design/Purpose: "The aim was to produce a reasonably large text corpus of the major European languages for the linguistic research community. It is generally agreed that there are not enough corpora in languages other than English."

Time: 1/9/92 - 14/5/93
Newsletter: no
Notes: "The ECI/MCI corpus has now been published on CD-ROM, and contains almost 100 million words in 27 (mainly European) languages. It consists of 48 component corpora marked up in SGML, with easy access to the source text without markup. 12 of the component corpora are multilingual parallel corpora with from two to nine sub-corpora"

Global English Monitor Corpus 

Coordination: The Survey of English Usage.
Design/Purpose: "Research into syntax, morphology, vocabulary, discourse, phonetics and phonology. The variation within and across components will prove of particular interest to sociolinguists, and will have applications in English language teaching, language planning, and natural language processing"

Time: started in 1990
Newsletter: 1997; 1998
Notes: All components follow a common corpus design and annotation scheme

ICE - International Corpus of English 

Coordination: The Survey of English Usage.
Design/Purpose: "Research into syntax, morphology, vocabulary, discourse, phonetics and phonology. The variation within and across components will prove of particular interest to sociolinguists, and will have applications in English language teaching, language planning, and natural language processing"

Time: started in 1990
Newsletter: 1997; 1998
Notes: All components follow a common corpus design and annotation scheme


webmaster@corpus-linguistics.de