Total texts: 10
Folklore Lab
Data-driven monitoring of corpus annotation coverage, structure, and classification quality.
This page is an analytics lab inspired by proven workflows from international folklore corpora.
KZ+EN coverage: 10
ATU-linked texts: 10
Metadata-linked texts: 10
Citation-ready texts: 10
Documents: 0
Tokens: 0
Unique terms: 0
Hapax (single use): 0
TTR: 0%
Average doc length: 0
Voyant-style text analytics
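The counters above (documents, tokens, unique terms, hapax, TTR, average document length) follow standard corpus-statistics definitions. The sketch below shows one way they could be computed. It assumes a simple lowercase word tokenizer, since the page does not document its own; `tokenize` and `corpus_stats` are hypothetical names, not the lab's actual code.

```python
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    # Assumption: lowercase, then split on runs of Unicode word characters.
    return re.findall(r"\w+", text.lower())


def corpus_stats(docs: list[str]) -> dict:
    token_lists = [tokenize(d) for d in docs]
    counts = Counter(tok for toks in token_lists for tok in toks)
    n_tokens = sum(counts.values())
    n_types = len(counts)
    # Hapax legomena: terms that occur exactly once in the whole corpus.
    hapax = sum(1 for c in counts.values() if c == 1)
    return {
        "documents": len(docs),
        "tokens": n_tokens,
        "unique_terms": n_types,
        "hapax": hapax,
        # Type-token ratio as a percentage; 0.0 for an empty corpus.
        "ttr_percent": round(100 * n_types / n_tokens, 1) if n_tokens else 0.0,
        "avg_doc_length": round(n_tokens / len(docs), 1) if docs else 0.0,
    }


stats = corpus_stats(["the fox and the crow", "the crow sang"])
empty = corpus_stats([])
```

With no documents loaded, every counter in this sketch evaluates to zero, which is consistent with the zero values currently displayed.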
Most frequent terms
Not enough data to render.
Context snippets (KWIC)
Not enough data to render.
Frequent phrases (bi-grams)
Not enough data to render.
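For reference, the context-snippet (KWIC) and bi-gram views can be sketched in a few lines. This is an illustration under assumptions, not the page's implementation: the input is an already tokenized text, and `kwic`, `top_bigrams`, and the sample sentence are hypothetical.

```python
from collections import Counter


def kwic(tokens: list[str], keyword: str, window: int = 3) -> list[tuple[str, str, str]]:
    # Keyword in context: return (left context, keyword, right context) per hit.
    rows = []
    for i, tok in enumerate(tokens):
        if tok == keyword:
            left = " ".join(tokens[max(0, i - window):i])
            right = " ".join(tokens[i + 1:i + 1 + window])
            rows.append((left, tok, right))
    return rows


def top_bigrams(tokens: list[str], n: int = 5) -> list[tuple[tuple[str, str], int]]:
    # Count adjacent token pairs and return the n most frequent.
    return Counter(zip(tokens, tokens[1:])).most_common(n)


tokens = "the old khan called the old shepherd".split()
hits = kwic(tokens, "old", window=2)
pairs = top_bigrams(tokens, n=1)
```

Here `hits` contains one row per occurrence of `old`, each with up to two tokens of context on either side, and `pairs` holds the single most frequent adjacent pair.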
Analytical views
Term frequency chart
Not enough data to render.
Term trends (per 1000 tokens)
Not enough data to render.
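The "per 1000 tokens" normalization makes term counts comparable across segments of different length: the rate is 1000 × (term count) / (segment token count). A minimal sketch, assuming each segment (for example a document or a decade slice) is a token list; `term_trend` and the sample segments are hypothetical.

```python
def term_trend(segments: list[list[str]], term: str) -> list[float]:
    # Relative frequency of `term` per 1000 tokens in each segment.
    # Empty segments yield 0.0 to avoid division by zero.
    return [
        round(1000 * seg.count(term) / len(seg), 1) if seg else 0.0
        for seg in segments
    ]


segments = [
    ["khan"] * 2 + ["x"] * 198,  # 2 hits in 200 tokens
    ["x"] * 100,                 # no hits
    [],                          # empty segment
]
trend = term_trend(segments, "khan")
```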
Document length distribution
Not enough data to render.
Collection timeline (decades)
Not enough data to render.
ATU distribution (Top 12)
Not enough data to render.
Genre profile
Not enough data to render.
Regional map
Not enough data to render.
Collector activity
| Collector | Texts | First year | Last year |
|---|---|---|---|
| Айбек Н. | 1 | 1975 | 1975 |
| Әлихан Қ. | 1 | 1958 | 1958 |
| Гүлнар Е. | 1 | 1951 | 1951 |
| Данияр Т. | 1 | 1938 | 1938 |
| Ермек Р. | 1 | 1980 | 1980 |
| Марат С. | 1 | 1962 | 1962 |
| Нұрбек И. | 1 | 1968 | 1968 |
| Рауан Б. | 1 | 1949 | 1949 |
| Сабина Ө. | 1 | 1943 | 1943 |
| Салтанат Ж. | 1 | 1971 | 1971 |
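A table like the one above can be derived by grouping text records by collector. The sketch below assumes each record is a `(collector, year)` pair; the records are reconstructed from the table rows (one text per collector, so first and last year coincide), and `collector_activity` and `decade_counts` are hypothetical helpers. The decade binning illustrates how a timeline by decades could be built from the same records.

```python
from collections import Counter, defaultdict

# Reconstructed from the table above: one record per listed text.
records = [
    ("Айбек Н.", 1975), ("Әлихан Қ.", 1958), ("Гүлнар Е.", 1951),
    ("Данияр Т.", 1938), ("Ермек Р.", 1980), ("Марат С.", 1962),
    ("Нұрбек И.", 1968), ("Рауан Б.", 1949), ("Сабина Ө.", 1943),
    ("Салтанат Ж.", 1971),
]


def collector_activity(records):
    # Group years by collector, then report (name, texts, first year, last year).
    years = defaultdict(list)
    for name, year in records:
        years[name].append(year)
    return [(name, len(ys), min(ys), max(ys)) for name, ys in years.items()]


def decade_counts(records):
    # Bin each year into its decade (1938 -> 1930) and count texts per decade.
    counts = Counter(year // 10 * 10 for _, year in records)
    return dict(sorted(counts.items()))


activity = collector_activity(records)
decades = decade_counts(records)
```

Rows come back in input order; locale-aware sorting of Kazakh names would need a separate collation step, which this sketch leaves out.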
Metadata field coverage
| Field | Linked texts |
|---|---|
| Topic (Тақырып) | 10 |
| Performance context (Орындау контексі) | 10 |
Comparative benchmark with external corpora
| Corpus | Reference feature | Our adoption | Status | Source |
|---|---|---|---|---|
| AFT Corpus | Structured tale typing with ATU classes. | ATU distribution and linkage metrics are active. | Implemented | Open Humanities Data |
| SKVR (Finnish Literature Society) | Faceted filtering with export options (XML/CSV). | Faceted corpus exploration is implemented through filters and analytics tables. | Implemented | skvr.fi |
| Kivike (Estonian Literary Museum) | Rich metadata discovery by archive, geography, and person. | Coverage monitoring across passport and metadata layers is implemented. | Implemented | kivike.kirmus.ee |
| Pangloss Collection (CNRS) | Open linguistic audio archives with linked transcriptions. | Next step: integrate audio/ELAN timeline layers. | Next phase | CNRS |
| Meertens FACT | Automatic metadata enrichment and folktale classification. | Next step: automatic ATU/motif suggestion tooling. | Next phase | Meertens Institute |