Personal information

Digital Humanities, Research Infrastructures, Natural Language Processing, Dictionaries, Text Corpora
Germany

Activities

Employment (2)

Saxon Academy of Sciences in Leipzig: Leipzig, DE

2020-02-01 to present | Scientific staff Digital Humanities (KompetenzwerkD)
Employment
Source: Self-asserted source
Dirk Goldhahn

Leipzig University: Leipzig, Sachsen, DE

2020-01-31 (Natural Language Processing Group - Computer Science Department)
Employment
Source: Self-asserted source
Dirk Goldhahn

Education and qualifications (1)

Leipzig University: Leipzig, Sachsen, DE

2013-12-11 | Dr. rer. nat. (Computer Science)
Education
Source: Self-asserted source
Dirk Goldhahn

Works (50 of 53)


Crawling Under-Resourced Languages - A Portal for Community-Contributed Corpus Collection

Proceedings of the 1st Workshop on Dataset Creation for Lower-Resourced Languages (DCLRL) @LREC2022
2022-06-24 | Journal article
Contributors: Dirk Goldhahn
Source: Self-asserted source
Dirk Goldhahn

Enriching and Increasing the Usability of Lexicographical Data for Less-Resourced Languages

Selected Papers from the CLARIN Annual Conference 2019
2020 | Conference paper
Source: Self-asserted source
Dirk Goldhahn

Typical Sentences as a Resource for Valence

Proceedings of the 12th International Conference on Language Resources and Evaluation (LREC 2020), Marseille (France)
2020 | Conference paper
Source: Self-asserted source
Dirk Goldhahn

Usability and Accessibility of Bantu Language Dictionaries in the Digital Age: Mobile Access in an Open Environment

First workshop on Resources for African Indigenous Languages (RAIL) at the 12th Language Resources and Evaluation Conference (LREC 2020), Marseille (France)
2020 | Conference paper
Source: Self-asserted source
Dirk Goldhahn

Corpus-based Extraction of Word Relations from an Afrikaans Corpus

Workshop of the African Association for Lexicography (AFRILEX), Windhoek, Namibia
2019 | Conference paper
Source: Self-asserted source
Dirk Goldhahn

Enriching Lexicographical Data for Lesser Resourced Languages: A Use Case

Proceedings of CLARIN Annual Conference 2019. Eds. K. Simov and M. Eskevich. Leipzig, Germany: CLARIN
2019 | Conference paper
Source: Self-asserted source
Dirk Goldhahn

Frekwensiewoordeboek van Afrikaans - A new Frequency Dictionary for Afrikaans

Workshop of the African Association for Lexicography (AFRILEX), Windhoek, Namibia
2019 | Conference paper
Source: Self-asserted source
Dirk Goldhahn

OSIAN: Open Source International Arabic News Corpus - Preparation and Integration into the CLARIN-infrastructure

Proceedings of the Fourth Arabic Natural Language Processing Workshop
2019 | Conference paper
Source: Self-asserted source
Dirk Goldhahn

The Null Result Portal

Metadata and Semantic Research
2019 | Book chapter
Part of ISBN: 9783030365981
Part of ISBN: 9783030365998
Part of ISSN: 1865-0929
Part of ISSN: 1865-0937
Source: Self-asserted source
Dirk Goldhahn

Translation-based Dictionary Alignment for Under-resourced Bantu Languages

OpenAccess Series in Informatics (OASIcs), Vol. 70: Language Data and Knowledge LDK 2019
2019 | Conference paper
Source: Self-asserted source
Dirk Goldhahn

Automation, Management and Improvement of Text Corpus Production

6th Workshop on the Challenges in the Management of Large Corpora at the 11th Language Resources and Evaluation Conference (LREC 2018), Miyazaki (Japan)
2018 | Conference paper
Source: Self-asserted source
Dirk Goldhahn

Convergent development of digital resources for West African Languages (CCURL 2018)

Unpublished
2018 | Journal article
Source: Self-asserted source
Dirk Goldhahn

Cross-Language Dictionary Alignment for Bantu Languages

Workshop of the African Association for Lexicography (AFRILEX), 20th International Congress of Linguists (ICL20), Cape Town, South Africa
2018 | Conference paper
Source: Self-asserted source
Dirk Goldhahn

Frequency Dictionary Vietnamese - Từ điển tần số xuất hiện các từ trong tiếng Việt

2018 | Book
Source: Self-asserted source
Dirk Goldhahn

Preparation and Usage of Xhosa Lexicographical Data for a Multilingual, Federated Environment

Proceedings of the 11th International Conference on Language Resources and Evaluation (LREC 2018), Miyazaki (Japan)
2018 | Conference paper
Source: Self-asserted source
Dirk Goldhahn

Using Linked Data Techniques for Creating an IsiXhosa Lexical Resource - a Collaborative Approach

CLARIN Annual Conference 2018 in Pisa, Italy
2018 | Conference paper
Source: Self-asserted source
Dirk Goldhahn

A Portal for Corpus Collection for Under-Resourced Languages

Workshop of the African Association for Lexicography (AFRILEX), CLASA 2017, Grahamstown
2017 | Conference paper
Source: Self-asserted source
Dirk Goldhahn

Digital Muqtabas CTS Integration in CLARIN

CLARIN Annual Conference 2017 in Budapest, Hungary
2017 | Conference paper
Source: Self-asserted source
Dirk Goldhahn

Frequency Dictionary Russian - Частотный словарь русского языка

2017 | Book
Source: Self-asserted source
Dirk Goldhahn

Using Corpus Query Engines for Facilitating Lexicographical Analysis of African Languages

Workshop of the African Association for Lexicography (AFRILEX), CLASA 2017, Grahamstown, South Africa
2017 | Conference paper
Source: Self-asserted source
Dirk Goldhahn

Integrating Canonical Text Services into CLARIN's Search Infrastructure

Linguistics and Literature Studies
2017-03 | Journal article
Part of ISSN: 2331-642X
Part of ISSN: 2331-6438
Source: Self-asserted source
Dirk Goldhahn

Canonical Text Services in CLARIN - Reaching out to the Digital Classics and beyond

CLARIN Annual Conference 2016
2016 | Conference paper
Source: Self-asserted source
Dirk Goldhahn

Corpus collection for under-resourced languages with more than one million speakers

Proc. of Collaboration and Computing for Under-Resourced Languages: Towards an Alliance for Digital Language Diversity (CCURL)
2016 | Journal article
Source: Self-asserted source
Dirk Goldhahn

Bewertung durch Adjektive: Ansätze einer korpusgestützten Untersuchung zur Synonymie

2015 | Journal article
OTHER-ID: 7185659
Source: Self-asserted source
Dirk Goldhahn via MLA International Bibliography

Operationalisation of research questions of the humanities within the CLARIN Infrastructure - An Ernst Jünger Use Case

Proceedings of CLARIN Annual Conference
2015 | Conference paper
Source: Self-asserted source
Dirk Goldhahn

Was sind IT-basierte Forschungsinfrastrukturen für die Geistes- und Sozialwissenschaften und wie können sie genutzt werden?

Information - Wissenschaft & Praxis
2015-11-01 | Journal article
Part of ISSN: 1619-4292
Part of ISSN: 1434-4653
Source: Self-asserted source
Dirk Goldhahn

A 500 Million Word POS-Tagged Icelandic Corpus

Unpublished
2014 | Journal article
Source: Self-asserted source
Dirk Goldhahn

Building Large Resources for Text Mining: The Leipzig Corpora Collection

Text Mining
2014 | Book chapter
Part of ISBN: 9783319126548
Part of ISBN: 9783319126555
Part of ISSN: 2192-032X
Part of ISSN: 2192-0338
Source: Self-asserted source
Dirk Goldhahn

Corpus-based linguistic typology: a comprehensive approach

KONVENS
2014 | Conference paper
Source: Self-asserted source
Dirk Goldhahn

Frequency Dictionary Esperanto - Oftecvortaro de Esperanto

2014 | Book
Source: Self-asserted source
Dirk Goldhahn

Large Arabic Web Corpora of high quality: the dimensions time and origin

Workshop on Free/Open-Source Arabic Corpora and Corpora Processing Tools
2014 | Conference paper
Source: Self-asserted source
Dirk Goldhahn

Top-Level Domain Crawling for Producing Comprehensive Monolingual Corpora from the Web

Proceedings of the LREC-14 workshop on Challenges in the Management of Large Corpora (CMLC-2)
2014 | Conference paper
Source: Self-asserted source
Dirk Goldhahn

Frequency Dictionary Hungarian - Magyar gyakorisági szótár

2013 | Book
Source: Self-asserted source
Dirk Goldhahn

Quantitative Methoden in der Sprachtypologie: Nutzung korpusbasierter Statistiken

2013 | Dissertation or Thesis
Source: Self-asserted source
Dirk Goldhahn

Scalable Construction of High-Quality Web Corpora.

J. Lang. Technol. Comput. Linguistics
2013 | Journal article
Source: Self-asserted source
Dirk Goldhahn

Technical Report Series on Corpus Building

2013 | Report
Source: Self-asserted source
Dirk Goldhahn

Scalable Construction of High-Quality Web Corpora

Journal for Language Technology and Computational Linguistics
2013-07-01 | Journal article
Contributors: Chris Biemann; Felix Bildhauer; Stefan Evert; Dirk Goldhahn; Uwe Quasthoff; Roland Schäfer; Johannes Simon; Leonard Swiezinski; Torsten Zesch
Source: Crossref

and Brain Sciences Unit, UK

2012 | Other
URI: http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.270.2030
Source: Self-asserted source
Dirk Goldhahn

Building large monolingual dictionaries at the Leipzig Corpora Collection: From 100 to 200 languages

2012 | Other
URI: http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.671.5530
Source: Self-asserted source
Dirk Goldhahn

Finding Language Universals: Multivariate Analysis of Language Statistics using the Leipzig Corpora Collection

Leuven Statistics Days 2012, Leuven, Belgium
2012 | Conference paper
Source: Self-asserted source
Dirk Goldhahn

Frequency Dictionary English

2012 | Book
Source: Self-asserted source
Dirk Goldhahn

Language statistics-based quality assurance for large corpora

Proceedings of the Asia Pacific Corpus Linguistics Conference
2012 | Conference paper
Source: Self-asserted source
Dirk Goldhahn

Perception of Words and Pitch Patterns in Song and Speech

Frontiers in Psychology
2012 | Journal article
Part of ISSN: 1664-1078
Source: Self-asserted source
Dirk Goldhahn

Perception and covert production of song and speech: New findings using multi-voxel pattern analysis

17th Annual Meeting of the Organization for Human Brain Mapping, Quebec City, Canada
2011 | Conference paper
Source: Self-asserted source
Dirk Goldhahn

Song and speech - perception and covert production: New findings using multi-voxel pattern analysis

19th Scientific Meeting & Exhibition of the International Society for Magnetic Resonance in Medicine (ISMRM), Montreal, Canada
2011 | Conference paper
Source: Self-asserted source
Dirk Goldhahn

Eigenvector centrality mapping for analyzing connectivity patterns in fMRI data of the human brain.

2010 | Journal article
URI: https://doi.org/10.1371/journal.pone.0010232
Source: Self-asserted source
Dirk Goldhahn

Resting developments: a review of fMRI post-processing methodologies for spontaneous brain activity

Magnetic Resonance Materials in Physics, Biology and Medicine
2010-12 | Journal article
Part of ISSN: 0968-5243
Part of ISSN: 1352-8661
Source: Self-asserted source
Dirk Goldhahn

Automatic Annotation of Co-Occurrence Relations

Conference paper
URI: http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.658.6088
Source: Self-asserted source
Dirk Goldhahn

Digital Infrastructure for Morpho-syntactic Analysis of Under-Resourced Languages: A Case Study for Luganda

BOOK OF ABSTRACTS
Conference paper
Source: Self-asserted source
Dirk Goldhahn

Large Web Corpora of High Quality for Indian Languages

Workshop Programme
Conference paper
Source: Self-asserted source
Dirk Goldhahn