Corpus

Corpus metadata

The CKCC corpus consists of the correspondences of several scholars who were active in the Netherlands in the 17th century. It currently contains approximately 20,000 letters. The table below gives an overview of the correspondences.

Scholar	Gross # items	Net # items	Provider	Contact person of Digitized Correspondence
Caspar van Baarle (Barlaeus)	505	505	UvA	Frans Blom, Marjolein van Zuylen
Isaac Beeckman	28	21	Huygens ING	Huib Zuidervaart
René Descartes	727	727	Utrecht University	Erik-Jan Bos
Hugo de Groot (Grotius)	8034	8034	Huygens ING	Henk Nellen
Christiaan Huygens	3090	3080	Huygens ING	Huib Zuidervaart
Constantijn Huygens	7297	7119	Huygens ING	Ineke Huysman
Antoni van Leeuwenhoek	282	282	Utrecht University / Huygens ING	Lodewijk Palm / Huib Zuidervaart
Dirck Rembrantsz van Nierop	80	80	Huygens ING	Marlise Rijks, Huib Zuidervaart
Jan Swammerdam	172	172	Huygens ING	Eric Jorink

Total	20215	20020

Note. Strictly speaking, not every item in the corpus is a letter. The editors of the correspondences have included some documents (such as minutes of meetings, poems and mathematical proofs) that they considered relevant. We have chosen to include all such documents.

Note. The terms ‘gross’ and ‘net’ refer to the numbers of items before and after removal of duplicates, respectively. Analyses are based on net numbers of items.
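The relation between gross and net counts can be checked with a few lines of Python. The sketch below simply copies the figures from the table above; the names `COUNTS` and `duplicates_removed` are illustrative and not part of any CKCC tooling.

```python
# Gross and net item counts per correspondence, copied from the table above.
COUNTS = {
    "Caspar van Baarle (Barlaeus)": (505, 505),
    "Isaac Beeckman": (28, 21),
    "René Descartes": (727, 727),
    "Hugo de Groot (Grotius)": (8034, 8034),
    "Christiaan Huygens": (3090, 3080),
    "Constantijn Huygens": (7297, 7119),
    "Antoni van Leeuwenhoek": (282, 282),
    "Dirck Rembrantsz van Nierop": (80, 80),
    "Jan Swammerdam": (172, 172),
}


def duplicates_removed(counts):
    """Return gross minus net for every correspondence where the two differ."""
    return {
        name: gross - net
        for name, (gross, net) in counts.items()
        if gross != net
    }


total_gross = sum(gross for gross, _ in COUNTS.values())
total_net = sum(net for _, net in COUNTS.values())

print(total_gross, total_net)      # 20215 20020
print(duplicates_removed(COUNTS))  # only three correspondences contain duplicates
```

The totals agree with the Total row of the table, and the difference of 195 items is spread over the Beeckman, Christiaan Huygens and Constantijn Huygens correspondences.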

Note. The complete text of Grotius’ correspondence as edited in the Briefwisseling van Hugo Grotius (The Hague 1928-2001), including the extensive annotation, indexes and introductions, is accessible online.

Note. Additional and recently discovered letters, as well as facsimiles, transcriptions and references to other editions, can be found in the complete edition of the Correspondence of Constantijn Huygens, which has been published online at the Huygens ING website.

Note. Further information about the Van Nierop and Swammerdam correspondences is available at the Digital Web Centre of the Huygens ING.

Letter format

The digitized correspondences originate from different sources, each using its own format. We have chosen to convert the letters to TEI, an XML-based format that is intended for scholarly resources. We found that even the letters that already used TEI encoding were mainly encoded for rendering and not for indicating the structure of the letters. This made extensive preprocessing necessary before the letters could be imported into our database. We have drawn up requirements for new material to be used in the CKCC project, and we ask contributors to the project to comply with these requirements.
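To illustrate what encoding for structure rather than for rendering makes possible, here is a minimal sketch in Python. It assumes a heavily simplified letter that uses the standard TEI elements `opener`, `p` and `closer`; the sample text, the `div type="letter"` wrapper and the function name `split_letter` are hypothetical, and the actual CKCC letter format is defined in the project documentation and may differ.

```python
import xml.etree.ElementTree as ET

# The TEI namespace is standard; everything else in this sample is invented
# for illustration and is NOT taken from the CKCC corpus. The TEI header is
# omitted, so this is not a complete, valid TEI document.
TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}

SAMPLE_LETTER = """\
<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <text>
    <body>
      <div type="letter">
        <opener><salute>Monsieur,</salute></opener>
        <p>First paragraph of the letter body.</p>
        <p>Second paragraph of the letter body.</p>
        <closer><salute>Vostre tres humble serviteur</salute></closer>
      </div>
    </body>
  </text>
</TEI>
"""


def split_letter(xml_text):
    """Split a structurally encoded letter into opener, body and closer texts."""
    root = ET.fromstring(xml_text)
    letter = root.find(".//tei:div[@type='letter']", TEI_NS)

    def texts(tag):
        # Collect the text of each direct child with this tag,
        # collapsing runs of whitespace into single spaces.
        return [
            " ".join("".join(element.itertext()).split())
            for element in letter.findall(f"tei:{tag}", TEI_NS)
        ]

    return {"opener": texts("opener"), "body": texts("p"), "closer": texts("closer")}


parts = split_letter(SAMPLE_LETTER)
print(parts["body"])
```

With structural markup of this kind, formulaic opening and closing phrases can be separated from the letter body before analysis; markup that only records line breaks or italics does not allow this.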

Metadata

Each letter in the CKCC corpus has been assigned metadata. The metadata set consists of a letter id, date of sending, sender, sender location, recipient, and recipient location; see the documentation for a more detailed description.
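The metadata set can be pictured as a simple record type. The sketch below is an assumption made for illustration only: the field names, the use of `None` for unknown values and the example values are hypothetical and are not taken from the CKCC metadata documentation.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass(frozen=True)
class LetterMetadata:
    """One record per letter; every field except the id may be unknown."""

    letter_id: str
    date: Optional[str] = None               # date of sending
    sender: Optional[str] = None
    sender_location: Optional[str] = None
    recipient: Optional[str] = None
    recipient_location: Optional[str] = None

    def missing_fields(self) -> List[str]:
        """Return the names of all fields for which no value is recorded."""
        return [f.name for f in fields(self) if getattr(self, f.name) is None]


# Placeholder values, not an actual letter from the corpus.
example = LetterMetadata(
    letter_id="example0001",
    date="1640-01-01",
    sender="Sender Name",
    recipient="Recipient Name",
)
print(example.missing_fields())  # ['sender_location', 'recipient_location']
```

Making the optional fields explicit helps when metadata is incomplete, because records with gaps can be listed and filtered before analysis.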

The source materials and the analyzed data are archived according to three metadata standards: Dublin Core, EAD and CMDI. The archived data will be available for further research outside of the ePistolarium.

Issues with the data of the corpus

The letters in the corpus stem from different sources and have been made available in different formats. The following issues were observed while preparing the data.

Letter texts are not uniformly available, i.e. there are problems with their formatting:
– No uniform format (TEI, XML, MS Word)
– Encoded for rendering, not for logical structure
Problems with metadata:
– Incomplete
– Divergent quality
– Different identifiers for persons and places per corpus
Multilingual content, also within individual letters
Spelling variation
Synonym variation
Elaborate opening and closing phrases

To demonstrate the level of spelling variation: the name Christiaan Huygens van Zuylichem is spelled in at least 320 different ways.
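One common way to cope with this kind of variation is to normalize each name and then match it approximately against a list of canonical forms. The sketch below uses Python's standard `difflib` module; it is not the method used in the CKCC project, and the canonical list, the `ij` to `y` rule, the cutoff value and the variant spellings in the usage example are all assumptions made for illustration.

```python
import difflib
import unicodedata


def normalize(name):
    """Lower-case a name, strip diacritics and punctuation, and unify ij/y."""
    decomposed = unicodedata.normalize("NFKD", name)
    without_marks = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    lowered = without_marks.lower().replace("ij", "y")
    letters_only = "".join(ch if ch.isalpha() else " " for ch in lowered)
    return " ".join(letters_only.split())


# Hypothetical list of canonical names, keyed by their normalized form.
CANONICAL = {
    normalize(name): name
    for name in ["Christiaan Huygens", "Constantijn Huygens", "René Descartes"]
}


def match_person(name, cutoff=0.8):
    """Return the closest canonical name, or None if nothing is similar enough."""
    hits = difflib.get_close_matches(normalize(name), list(CANONICAL), n=1, cutoff=cutoff)
    return CANONICAL[hits[0]] if hits else None


# Invented variant spellings, not taken from the corpus.
print(match_person("Christiaen Huijgens"))
print(match_person("RENÉ  DESCARTES."))
print(match_person("Hugo Grotius"))
```

String similarity alone will not resolve Latinized forms or abbreviations, so in practice a curated list of name variants per person would likely be needed as well.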

The table below shows the language variation within the corpus (percentage of items).

Scholar	Dutch	Latin	French	Other	N.A.
Caspar van Baarle	0.0	100.0	0.0	0.0	0.0
Isaac Beeckman	3.7	96.3	0.0	0.0	0.0
René Descartes	0.4	14.3	85.0	0.0	0.3
Hugo de Groot	25.9	58.4	11.6	4.0	0.0
Christiaan Huygens	7.8	25.9	63.0	3.3	0.0
Constantijn Huygens	64.9	6.5	24.6	1.3	2.7
Antoni van Leeuwenhoek	92.3	4.2	0.4	3.1	0.0
Dirck Rembrantsz van Nierop	100.0	0.0	0.0	0.0	0.0
Jan Swammerdam	57.6	16.9	25.6	0.0	0.0
Total	37.1	32.8	26.5	2.6	1.0

Note. The relatively large number of letters that cannot be assigned a language for the Constantijn Huygens correspondence is due to the nature of the Worp edition, which often summarizes parts of the French text in Dutch.

Documentation

Click on the links below to download the documents.

CKCC letter format (pdf);
CKCC letter metadata (pdf).

Circulation of Knowledge and Learned Practices in the 17th-century Dutch Republic

A Web-based Humanities’ Collaboratory on Correspondences.