... Each sample is selected without replacement.
2. Train N copies of the parsing algorithm A, each with one of the samples.
3. Parse the test set with each of the N models.
4. For each test sentence, ... clutter, for the filter f-score measure we use the maximum recall (MR) baseline rather than the minimum length (ML) baseline, since the former outperforms the latter. Thus, ML is only shown for the ... the number of models N and the sample size S increase, and with T. Therefore, for k = 85 ... 100 we show the value of filter f-score with parameter k when the parameter configuration is a relatively...