Information Extraction with Character-level Neural Networks and Noisy Supervision

13 December 2016

Abstract

We present an architecture for information extraction from text that augments an existing parser with a character-level neural network. To train the neural network, we compute a measure of the consistency of extracted data with existing databases and use it as a form of noisy supervision. Our architecture combines the ability of constraint-based information extraction systems to easily incorporate domain knowledge and constraints with the ability of deep neural networks to leverage large amounts of data to learn complex features. The system led to large improvements over a mature and highly tuned constraint-based information extraction system used at Bloomberg for financial language text. At the same time, the new system massively reduces development effort: rule-writers can write high-recall constraints and rely on the deep neural network to remove false positives and boost precision.
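The noisy-supervision idea can be illustrated with a minimal sketch. It is not the paper's implementation: the names `Candidate`, `consistency`, and `noisy_labels`, the dictionary-backed database, and the 5% relative tolerance are all assumptions made for illustration. Candidates produced by a high-recall parser are compared against known database values, and the agreement score serves as a noisy training label for a downstream classifier.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """One extraction proposed by a high-recall parser (hypothetical schema)."""
    entity: str   # e.g. a ticker symbol
    field: str    # e.g. "price"
    value: float  # value the parser read from the text


def consistency(candidate, database, rel_tol=0.05):
    """Return 1.0 if the value agrees with the database within rel_tol,
    0.0 if it disagrees, and None if the database has no entry to compare."""
    known = database.get((candidate.entity, candidate.field))
    if known is None:
        return None
    return 1.0 if abs(candidate.value - known) <= rel_tol * abs(known) else 0.0


def noisy_labels(candidates, database):
    """Pair every checkable candidate with its noisy label."""
    labelled = []
    for c in candidates:
        score = consistency(c, database)
        if score is not None:  # skip candidates the database cannot verify
            labelled.append((c, score))
    return labelled


# Toy data, invented for this sketch.
database = {("ACME", "price"): 100.0, ("GLOBEX", "price"): 50.0}
candidates = [
    Candidate("ACME", "price", 101.0),    # close to the known value
    Candidate("ACME", "price", 2016.0),   # likely a year misread as a price
    Candidate("INITECH", "price", 7.0),   # no database entry, cannot be checked
]
labels = noisy_labels(candidates, database)
```

In this sketch the labels are noisy because a database value may be stale or a correct extraction may refer to a quantity the database records differently; a classifier trained on many such labels can still learn to separate true extractions from false positives.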
