The MERLIN corpus

The MERLIN corpus contains 2,286 texts written by learners of Italian, German and Czech, taken from written examinations of acknowledged test institutions. The exams aim to test knowledge across the levels A1-C1 of the Common European Framework of Reference (CEFR).

Origin of the texts

The corpus comprises written productions from standardized, high-quality language tests from telc Frankfurt (for German and Italian) and the Test centre of the Institute of Language and Preparatory Studies (ÚJOP) of Charles University in Prague (for Czech).

The tasks are systematically related to the Common European Framework of Reference for Languages (CEFR). They were in use until 2013 and are now freely available on this platform.

The relation to the Common European Framework of Reference - the MERLIN rating grid

To ensure an immediate relation to the CEFR, specially trained testers re-rated all exam texts using the MERLIN rating grid that was developed within the project.

The reliability of the ratings was subjected to rigorous statistical verification procedures to correct rating tendencies (e.g. leniency/harshness). As a result, a reliable rating profile has been created for each text in the corpus. The profile reflects both a holistic overall level and the individual rating criteria listed below:

  • general linguistic range
  • vocabulary range
  • vocabulary control
  • grammatical accuracy
  • coherence and cohesion
  • sociolinguistic appropriateness
  • command of orthography

The page MERLIN for research goes into more detail about the procedure of the re-ratings.

Test tasks

The following overview lists and describes all test tasks that form the basis of the written test productions, i.e. the MERLIN texts. The linked PDF documents contain detailed information about the tasks, a brief description of the test parameters, and the specific characteristics of the intended text, e.g. topic, register and domain.

Hint: The short names of the tasks, as they appear in the file names of the MERLIN texts, are given in square brackets.

German

A1

pdf [apartment-request] Informal e-mail: ask a friend for help with finding an apartment
pdf [swimming appointment] Informal e-mail: arrange an appointment with a friend
pdf [congratulation] Informal letter: congratulate on the birth of a child

 

A2

pdf [housing office-enquiry] Formal letter to housing office
pdf [pet sitting-request] Informal letter: ask friend to take care of pet
pdf [ticket-offer] Informal letter: offer an unused ticket to a friend

 

B1

pdf [New Year-letter] Informal letter for New Year to a friend
pdf [visit-letter] Informal letter to a friend announcing a visit
pdf [birthday-letter] Informal letter: birthday congratulations

 

B2

pdf [Au pair Agency-enquiry] Formal letter: ask for information at Au pair Agency
pdf [Au pair Agency-complaint] Formal letter: Au pair writes letter of complaint to Agency
pdf [internship-application] Formal letter: apply for internship in sales department

 

C1

pdf [internship-application] Essay: why it's of value to learn German
pdf [learn German-essay] Online article: sticking to one's traditions and "assimilation" in a new environment
pdf [integration issues-essay] Report about the housing situation

 

Italian

A1

pdf [appointment] Informal e-mail: reschedule an appointment
pdf [job search-advice] Informal e-mail: help a friend who is looking for work

 

A2

pdf [see a friend] Informal letter: go see a friend
pdf [contact a friend] Informal letter: contact a friend after a long time
pdf [language course-advice] Informal letter: inform friends about language course

 

B1

pdf [language course-enquiry] Formal letter: inform oneself about language course
pdf [cook with teacher] Informal letter: cook with teacher
pdf [wedding invitation] Informal letter: answer to a wedding invitation
pdf [job search-advice] Informal letter: help a friend who is looking for work after school-leaving exam

 

B2

pdf [chat-advice] Informal letter: help someone who has problems with chats
pdf [language learning] Formal letter: describe experiences with language learning
pdf [hotel-complaint] Formal letter: complain about a hotel
pdf [cooking evening-enquiry] Formal letter: ask for information about International Cooking Evenings
pdf [aid project-enquiry] Formal letter: inform oneself about an aid project
pdf [internship-application] Formal letter: apply for an internship in a company
pdf [internship fashion-application] Formal letter: apply for an internship in fashion sector

 

Czech

A2

pdf [birthday invitation] Informal e-mail: answering a birthday invitation
pdf [swim in the sea-description] Description of a photo: swimming in the sea
pdf [hotel-enquiry] Formal e-mail to a hotel
pdf [playground-description] Description of a photo: playground
pdf [photo with woman-description] Description of a photo: woman at the window

 

B1

pdf [invitation-letter] Informal e-mail: answer to the email of Alena, a friend
pdf [future plans-letter] Informal e-mail: answer to the email of Jana, a friend
pdf [Tandem agency-enquiry] Informal e-mail: Information request, e-mail to a Tandem agency

 

B2

pdf [saying doma nejlepe-essay] Essay: Everywhere is good, but home is best
pdf [proverb kolace-essay] Essay: No pain no gain
pdf [proverb v nouzi-essay] Essay: A friend in need is a friend indeed
pdf [proverb vic hlav-essay] Essay: More people know more
pdf [saying skola-essay] Essay: School is the basis of life
pdf [proverb saty-essay] Essay: Clothes make the man

General notes on task description

  • The level of the test may differ from the level that the learner text received in the re-ratings.
  • The description of tasks is based on a grid that was developed for this purpose by ALTE (Association of Language Testers in Europe). Please find more information on the grid in this document.
  • The author of the task descriptions is Olaf Bärenfänger. Please cite the task descriptions as: MERLIN project, task description: <name of the task>, 2014, http://merlin-platform.eu

Available metadata

Each text in the corpus is described with the following metadata. These details can be found in the header of individual text files.

  • Information about the author: age, gender, mother tongue (L1)
  • Information about the text: task ID and topic of the test task, CEFR level of the test the written production was extracted from
  • Overall rating: CEFR level the text received in the re-rating (fair rating)
  • Single rating criteria: CEFR level achieved for general linguistic range | vocabulary range | vocabulary control | grammatical accuracy | coherence | sociolinguistic appropriateness | orthography

For a comprehensive overview of the texts and the metadata associated with them, you can refer to the table metadata_ratings_indicators.cvs. It also covers, for each corpus text, numerous indicators targeting L2 features, as well as lexical, morphological, and syntactic complexity measures (for the German corpus).
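As a rough illustration of how such a table could be used to compare test level and re-rated level, here is a minimal Python sketch. It assumes a comma-separated layout with a header row; the column names test_level and overall_rating, the function level_crosstab, and the sample rows are all hypothetical and are not taken from the actual file, whose layout may differ.

```python
import csv
import io
from collections import Counter

# Made-up sample rows for illustration only. The real column names and
# delimiter in the metadata table are assumptions and may differ.
SAMPLE = """text_id,test_language,test_level,overall_rating
t001,German,B1,B1
t002,German,B1,A2
t003,Italian,A2,A2
t004,Czech,B2,B1
"""


def level_crosstab(handle, test_col="test_level", rated_col="overall_rating"):
    """Count texts for each (test level, re-rated level) pair."""
    reader = csv.DictReader(handle)
    return Counter((row[test_col], row[rated_col]) for row in reader)


# Cross-tabulate the sample and print one line per level combination.
counts = level_crosstab(io.StringIO(SAMPLE))
for (test_level, rated_level), n in sorted(counts.items()):
    print(f"test {test_level} -> rated {rated_level}: {n}")
```

To apply the same idea to the downloaded table, the in-memory sample would be replaced by an opened file handle, and the two column arguments adjusted to whatever names the header row actually uses.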

The MERLIN corpus in figures

The following charts show the total number of texts at a given CEFR level and the number of annotations. The overviews also allow for a comparison of the test level and the level actually assigned in the re-rating.

Number of texts per CEFR level

[Chart: number of texts per CEFR level, all languages]

Dark bar: number of texts per test level (total number of texts in the corpus: 2,286)

Light bar: number of texts per level assigned in the re-rating (total: 2,265)

Number of texts per CEFR level and language

German

[Chart: number of texts per CEFR level, German]

Italian

[Chart: number of texts per CEFR level, Italian]

Czech

[Chart: number of texts per CEFR level, Czech]

Number of texts per annotation layer

[Chart: texts per annotation layer]