
Submission deadline for the *Workshop on Citation Extraction and Parsing (CiteX 2026)* at DIPF Frankfurt has been moved to 1 February.

CiteX 2026 offers an interdisciplinary forum on automated citation extraction and parsing.

We invite submissions of extended abstracts (1250–1500 words) for presentations, posters, or hands-on sessions.
Submission via our website: https://sites.google.com/view/workshop-on-citation-extractio/startseite

• Submission deadline: 1 February 2026
• Notification of acceptance: 1 March 2026
• Camera-ready version: 31 March 2026

Topics of interest include (but are not limited to):
• Automated extraction and parsing of references
• Creation and sharing of gold standards and test datasets
• Standardization and interoperability of citation data
• QA and validation of extracted references
• Comparison of LLM-based and tool-based extraction pipelines

#ReferenceExtraction #CfP #Frankfurt

@osma @storytracer Hi, I just found this old thread. We are currently working on a #referenceextraction & #evaluation workflow involving #LLMs, measuring their performance against a hand-annotated dataset of older scholarly articles with #footnotes. Untrained #GROBID performs very badly, but that does not mean it will still do so once it is properly trained on a good dataset.

Do you want to run the #GROBID PDF-to-#TEI conversion library/server with #Apptainer, for example for #ReferenceExtraction? There was a problem converting the #Docker image, but here's how to solve it: https://github.com/kermitt2/grobid/issues/1150#issuecomment-2350942263

(Hybrid) Workshop: Extracting Heterogeneous Reference Data, 15/16 May 2023, #mpilhlt Frankfurt/M., Germany.

Registration is open, programme is online: https://mpilhlt.github.io/reference-extraction/workshop-2023/programme/

Interested in extracting literature references from historical texts, scholarly literature in the humanities, or documents in low-resource languages? Want to see how CRF-based approaches compare to LLM-based ones? Want to make sure the challenges you are struggling with are on developers' roadmaps? Want to learn about some use cases?

Then please join us in the workshop.

#DigitalHumanities #NLP #NaturalLanguageProcessing #ReferenceExtraction #LLM #Bibliometrics

Call for Participation: (Hybrid) Workshop on Extracting Heterogeneous Reference Data, 15/16 May 2023, #mpilhlt Frankfurt/M., Germany.

Interested in extracting literature references from historical texts, scholarly literature in the humanities, documents in low-resourced languages? Want to apply your language model to a new use case and enjoy the gratitude of dozens of humanities, law and social sciences scholars? Have a use case or training data?
Please have a look at our CfP:

https://mpilhlt.github.io/reference-extraction/workshop-2023/cfp

#DigitalHumanities #NLP #NaturalLanguageProcessing #ReferenceExtraction #LLM #Bibliometrics
