Corpus

The CKCC corpus consists of the correspondences of several scholars who were active in the Netherlands in the 17th century. It currently contains approximately 20,000 letters. The table below gives an overview of the correspondences.

 

Scholar                        Gross # items  Net # items  Provider of digitized correspondence  Contact person
Caspar van Baarle (Barlaeus)   505            505          UvA                                   Frans Blom, Marjolein van Zuylen
Isaac Beeckman                 28             21           Huygens ING                           Huib Zuidervaart
René Descartes                 727            727          Utrecht University                    Erik-Jan Bos
Hugo de Groot (Grotius)        8034           8034         Huygens ING                           Henk Nellen
Christiaan Huygens             3090           3080         Huygens ING                           Huib Zuidervaart
Constantijn Huygens            7297           7119         Huygens ING                           Ineke Huysman
Antoni van Leeuwenhoek         282            282          Utrecht University / Huygens ING      Lodewijk Palm / Huib Zuidervaart
Dirck Rembrantsz van Nierop    80             80           Huygens ING                           Marlise Rijks, Huib Zuidervaart
Jan Swammerdam                 172            172          Huygens ING                           Eric Jorink
Total                          20215          20020

 

Note. Strictly speaking not every item in the corpus is a letter. The editors of the correspondences have included some documents (like minutes of meetings, poems and mathematical proofs) they considered relevant. We have chosen to include all such documents.

Note. The terms ‘gross’ and ‘net’ refer to the numbers of items before and after removal of duplicates, respectively. Analyses are based on net numbers of items.
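The relation between gross and net counts can be illustrated with a minimal sketch. The source does not say how duplicates were identified, so the criterion below (identical text after whitespace and case normalization) is purely an assumption, and the function name net_items and the sample data are hypothetical.

```python
def net_items(items):
    """Return items with duplicates removed, keeping the first occurrence.

    Assumption: two items are duplicates when their texts are identical
    after collapsing whitespace and ignoring case. The actual criterion
    used for the corpus is not documented here.
    """
    seen = set()
    unique = []
    for item_id, text in items:
        key = " ".join(text.split()).casefold()
        if key not in seen:
            seen.add(key)
            unique.append((item_id, text))
    return unique


# Made-up placeholder items, not taken from the corpus.
letters = [
    ("a1", "Sample letter text."),
    ("a2", "Sample  letter   TEXT."),
    ("a3", "Another letter."),
]

gross = len(letters)
net = len(net_items(letters))
print(gross, net)  # 3 2
```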

Note. The complete text of Grotius’ correspondence as edited in the Briefwisseling van Hugo Grotius (The Hague 1928-2001), including the extensive annotation, indexes and introductions, is available online.

Note. Additional and recently discovered letters, as well as facsimiles, transcriptions and references to other editions, can be found in the complete edition of the Correspondence of Constantijn Huygens, which has been published online at the Huygens ING website.

Note. Further information about the Van Nierop and Swammerdam correspondences is available at the Digital Web Centre of the Huygens ING.

Letter format

The digitized correspondences originate from different sources, each using its own format. We have chosen to convert the letters to TEI, an XML-based format intended for scholarly resources. We found that even the letters that already used TEI encoding were mainly encoded for rendering and not for indicating the structure of the letters. This made extensive preprocessing necessary before the letters could be imported into our database. We have drawn up requirements for new material to be used in the CKCC project; we ask contributors to the project to comply with these requirements.
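To make the difference between rendering-oriented and structure-oriented encoding concrete, here is a minimal sketch. It uses the standard TEI elements opener, closer, salute and signed to mark the parts of a letter, which lets a program select the body paragraphs directly. The sample fragment and the function name body_paragraphs are hypothetical; the actual CKCC requirements are described in the project's own letter format document, not here.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"

# Hypothetical fragment with placeholder text, not an actual CKCC letter.
SAMPLE = f"""
<div xmlns="{TEI_NS}" type="letter">
  <opener><salute>Opening salutation</salute></opener>
  <p>First paragraph of the letter body.</p>
  <p>Second paragraph of the letter body.</p>
  <closer><salute>Closing phrase</salute><signed>Signature</signed></closer>
</div>
"""


def body_paragraphs(xml_text):
    """Return the text of the body paragraphs, skipping opener and closer."""
    root = ET.fromstring(xml_text.strip())
    # Only direct <p> children of the letter <div> count as body text;
    # <opener> and <closer> are separate structural elements.
    return ["".join(p.itertext()).strip() for p in root.findall(f"{{{TEI_NS}}}p")]


print(body_paragraphs(SAMPLE))
```

With encoding that only records appearance (for example italics or indentation), the same selection would require guessing where the salutation ends and the body begins.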

 

Metadata

Each letter in the CKCC corpus has been assigned metadata. The metadata set consists of a letter id, date of sending, sender, sender location, recipient, and recipient location; see the documentation for a more detailed description.
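As a rough illustration, the metadata set could be modelled as a simple record. The field names, types and the use of None for unknown values are assumptions made for this sketch; the authoritative definition is the one in the project documentation.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class LetterMetadata:
    """One letter's metadata; field names are hypothetical."""
    letter_id: str
    date_sent: Optional[str]           # e.g. an ISO 8601 string; None if unknown
    sender: Optional[str]
    sender_location: Optional[str]
    recipient: Optional[str]
    recipient_location: Optional[str]

    def missing_fields(self):
        """Return the names of fields whose value is unknown."""
        return [name for name, value in vars(self).items() if value is None]


# Placeholder values, not taken from the corpus.
record = LetterMetadata(
    letter_id="example-0001",
    date_sent="1640-01-15",
    sender="Sender A",
    sender_location=None,
    recipient="Recipient B",
    recipient_location="Leiden",
)
print(record.missing_fields())  # ['sender_location']
```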

The source materials and the analyzed data are archived according to three metadata standards: Dublin Core, EAD and CMDI. The archived data will be available for further research outside of the ePistolarium.

 

Issues with the data of the corpus

The letters in the corpus stem from different sources and have been made available in different formats. The following issues were observed while preparing the data.

  • Letters not uniformly available, i.e. there are problems with formatting of letter texts:
    - No uniform format (TEI, XML, MS Word)
    - Encoded for rendering, not for logical structure
  • Problems with metadata:
    - Incomplete
    - Divergent quality
    - Different identifiers for persons and places per corpus
  • Multilingual, also within letters
  • Spelling variation
  • Synonym variation
  • Elaborate opening and closing phrases

To demonstrate the level of spelling variation: the name Christiaan Huygens van Zuylichem is spelled in at least 320 different ways.
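A count of this kind can be sketched as follows, assuming each name mention has already been linked to a person identifier. The identifier, the example spellings and the function name spelling_variants are illustrative only; how the figure of 320 was actually obtained is not described here.

```python
from collections import defaultdict


def spelling_variants(mentions):
    """Group distinct surface spellings by person identifier.

    mentions: iterable of (person_id, surface_form) pairs. Whitespace is
    collapsed so that line breaks and double spaces do not count as
    separate spellings.
    """
    variants = defaultdict(set)
    for person_id, surface in mentions:
        variants[person_id].add(" ".join(surface.split()))
    return variants


# Illustrative mentions, not taken from the corpus.
mentions = [
    ("person-1", "Christiaan Huygens"),
    ("person-1", "Christiaan  Huygens"),   # same spelling, extra space
    ("person-1", "Christianus Hugenius"),
    ("person-1", "Chr. Huygens"),
]
variants = spelling_variants(mentions)
print(len(variants["person-1"]))  # 3
```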

 

The table below shows the language variation within the corpus (percentage of items).

 

Scholar                        Dutch   Latin   French  Other   N.A.
Caspar van Baarle              0.0     100.0   0.0     0.0     0.0
Isaac Beeckman                 3.7     96.3    0.0     0.0     0.0
René Descartes                 0.4     14.3    85.0    0.0     0.3
Hugo de Groot                  25.9    58.4    11.6    4.0     0.0
Christiaan Huygens             7.8     25.9    63.0    3.3     0.0
Constantijn Huygens            64.9    6.5     24.6    1.3     2.7
Antoni van Leeuwenhoek         92.3    4.2     0.4     3.1     0.0
Dirck Rembrantsz van Nierop    100.0   0.0     0.0     0.0     0.0
Jan Swammerdam                 57.6    16.9    25.6    0.0     0.0
Total                          37.1    32.8    26.5    2.6     1.0

 

Note. The relatively large number of letters that cannot be assigned a language for the Constantijn Huygens correspondence is due to the nature of the Worp edition, which often summarizes parts of the French text in Dutch.
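Percentages of this kind follow from one language label per item. The sketch below shows the arithmetic on made-up labels; the function name language_shares is hypothetical, and the way languages were actually assigned to letters is not covered here.

```python
from collections import Counter


def language_shares(labels):
    """Return the percentage of items per language label, to one decimal."""
    total = len(labels)
    if total == 0:
        return {}
    counts = Counter(labels)
    return {lang: round(100.0 * n / total, 1) for lang, n in counts.items()}


# Made-up labels for ten items, not corpus data.
labels = ["Dutch"] * 5 + ["Latin"] * 3 + ["French"] * 2
shares = language_shares(labels)
print(shares)  # {'Dutch': 50.0, 'Latin': 30.0, 'French': 20.0}
```

Because each share is rounded separately, a row of such percentages may sum to slightly more or less than 100.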

 

Documentation

Click on the links below to download the documents.

  • CKCC letter format (pdf);
  • CKCC letter metadata (pdf).