Thursday, February 9, 2012
4:00 PM
Linguamatics in Cambridge is taking part in a project to address challenges faced by automated language processing software in harnessing diverse data sources.
It has teamed up with Brandwatch and the University of Sussex for the project, which is funded by the Technology Strategy Board.
The aim is to improve automatic extraction of information from scientific papers, news or social media for applications in research and development, marketing and competitive intelligence.
“Good-quality vocabularies are a key part of ‘intelligent’ text mining,” said Linguamatics chief technology officer Dr David Milward.
“This project will allow us to develop vocabularies much faster, and adapt them efficiently for new applications.”
The current generation of language processing software has had considerable success in extracting useful information from unstructured text, whether research literature or social media.
However, adapting to a new domain is often laborious, both in handling the type of data involved and in covering the terminology that domain uses.
Humans can perform these tasks on small data sets but struggle to keep pace with the massively increasing volume of electronic text.
Linguamatics, based at St John’s Innovation Centre, specialises in deploying natural language processing (NLP)-based text mining for complex, high-value problem solving.