Brown corpus
brown.Rd
The Brown corpus in tabular format, tokenized and POS-tagged, as distributed at https://www.nltk.org/nltk_data/. Headings and sentence boundaries are currently not preserved.
Format
A data frame with five variables: genre_id, doc_id, sentence_id, word, pos; and two string attributes: contents and readme.
Details
For documentation, see http://korpus.uib.no/icame/brown/bcm.html. The raw README and CONTENTS files are also included as attributes.
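As a rough illustration of the layout described above, the sketch below builds a tiny stand-in data frame with the same five columns and two string attributes, then reads the attributes back with base R's attr(). The object name brown_toy, the token rows, and the placeholder attribute strings are invented for this example and are not taken from the dataset; with the real object, the same attr() calls should apply, assuming the attributes are named as listed under Format.

```r
# Toy stand-in with the documented column names; all values are invented.
brown_toy <- data.frame(
  genre_id    = c("a", "a", "a"),
  doc_id      = c("ca01", "ca01", "ca01"),
  sentence_id = c(1L, 1L, 1L),
  word        = c("The", "jury", "said"),
  pos         = c("at", "nn", "vbd"),
  stringsAsFactors = FALSE
)

# Attach placeholder strings under the two documented attribute names.
attr(brown_toy, "contents") <- "placeholder for the CONTENTS file text"
attr(brown_toy, "readme")   <- "placeholder for the README file text"

# Read the attributes back as ordinary character strings.
cat(attr(brown_toy, "readme"), "\n")
cat(attr(brown_toy, "contents"), "\n")

# Count tokens per POS tag.
pos_counts <- table(brown_toy$pos)
print(pos_counts)
```

Note that some data frame operations (for example, subsetting rows with `[`) may drop custom attributes, so it can be safer to copy them into separate variables before transforming the data.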