Span Based Open Information Extraction

Open information extraction (Open IE) is a challenging task, especially because of its brittle data basis. Most Open IE systems have to be trained on automatically built corpora and evaluated on inaccurate test sets. In this work, we first alleviate this difficulty on both the training and test sides. For the former, we propose an improved model design that exploits the training dataset more fully. For the latter, we present our accurately re-annotated benchmark test set (Re-OIE6), built according to a series of linguistic observations and analyses. Then, we introduce a span model in place of the previously adopted sequence labeling formulation for n-ary Open IE. Our newly introduced model achieves new state-of-the-art performance on both benchmark evaluation datasets.
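To make the contrast between the two formulations concrete, the following is a minimal illustrative sketch, not the model described in this work. It assumes a sequence labeling view in which each token receives a BIO tag, and a span view in which every contiguous token range up to a maximum width is a candidate to be scored as a whole. The function names, the label set (`A0`, `P`, `A1`), and the example sentence are all hypothetical.

```python
from typing import List, Tuple

Span = Tuple[int, int]  # inclusive token indices (start, end)


def enumerate_spans(n_tokens: int, max_width: int) -> List[Span]:
    """Span view: list every contiguous span of at most max_width tokens.

    A span model would score each candidate as a unit (hypothetical setup).
    """
    return [
        (i, j)
        for i in range(n_tokens)
        for j in range(i, min(i + max_width, n_tokens))
    ]


def bio_to_spans(tags: List[str]) -> List[Tuple[str, int, int]]:
    """Sequence labeling view: decode per-token BIO tags into labeled spans."""
    spans = []
    start, label = None, None
    for i, tag in enumerate(tags + ["O"]):  # sentinel tag closes the last span
        continues = tag.startswith("I-") and label == tag[2:]
        if start is not None and not continues:
            spans.append((label, start, i - 1))
            start, label = None, None
        if tag.startswith("B-"):
            start, label = i, tag[2:]
    return spans


# Hypothetical example sentence and tags, for illustration only.
tokens = ["Barack", "Obama", "visited", "the", "White", "House"]
tags = ["B-A0", "I-A0", "B-P", "B-A1", "I-A1", "I-A1"]

decoded = bio_to_spans(tags)
candidates = enumerate_spans(len(tokens), max_width=3)
```

In the sequence labeling view, a single mislabeled token can split or truncate an argument, whereas in the span view each argument or predicate is selected as one candidate from the enumerated list. The cost is that the number of candidates grows with the sentence length times the maximum width.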