... based on early development data experiments. We did not run this experiment on the CNN portion of the data, because the CNN data was already being used as the extra NER data. As Table 2 shows, the ... training data should always improve performance, this work is the first to our knowledge to incorporate singly-annotated data into a joint model, thereby providing a method for this additional data, ... the observed data, and the two models have different data, they will have somewhat different grammars. In our hierarchical joint model, we added all observed rules from the joint data (stripped...