About the tool

Here you can learn more about the text analysis tools available on this website:

Tokenization is a way of separating a piece of text into smaller units called tokens. Tokens can be words, characters, or subwords.
Tokenization can therefore be broadly classified into three types: word, character, and subword (character n-gram) tokenization.
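The three types can be illustrated with a minimal sketch. This is not the tokenizer used on this website; the function names and the simple splitting rules below are assumptions chosen for illustration only.

```python
import re

def word_tokens(text):
    # Illustrative rule: a word is a run of letters, digits, or apostrophes;
    # any other non-space character becomes its own token.
    return re.findall(r"[A-Za-z0-9']+|[^\sA-Za-z0-9']", text)

def char_tokens(text):
    # Every non-whitespace character is a token.
    return [ch for ch in text if not ch.isspace()]

def char_ngrams(word, n=3):
    # Subword tokens as overlapping character n-grams of length n.
    return [word[i:i + n] for i in range(len(word) - n + 1)]

print(word_tokens("Tokens aren't scary."))  # ['Tokens', "aren't", 'scary', '.']
print(char_tokens("to be"))                 # ['t', 'o', 'b', 'e']
print(char_ngrams("token", 3))              # ['tok', 'oke', 'ken']
```

Real tokenizers handle many more cases (abbreviations, hyphens, URLs), so treat the rules above as a starting point only.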

Stemming is a technique that reduces inflected words to their root form. Our tool uses the Porter stemmer, one of the most popular stemming methods, proposed in 1980. It is based on the idea that suffixes in the English language are made up of combinations of smaller and simpler suffixes.
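To give a feel for how suffix rules work, here is a sketch of only the first step (step 1a) of the Porter algorithm, which handles plural endings. It is not the full stemmer and not necessarily the code behind this tool; the function name is hypothetical.

```python
def porter_step_1a(word):
    # Step 1a rules, listed longest suffix first so that the
    # longest matching suffix is the one that gets rewritten.
    rules = [("sses", "ss"), ("ies", "i"), ("ss", "ss"), ("s", "")]
    for suffix, replacement in rules:
        if word.endswith(suffix):
            return word[: len(word) - len(suffix)] + replacement
    return word

for w in ["caresses", "ponies", "caress", "cats"]:
    print(w, "->", porter_step_1a(w))
# caresses -> caress
# ponies -> poni
# caress -> caress
# cats -> cat
```

Note that a stem such as "poni" is not a dictionary word; stemming aims for a shared root form, not a valid word. The later steps of the full algorithm strip further suffixes such as "-ing" and "-ational".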

Parsing is the process of analysing the grammatical structure of a sentence: it groups words into phrases (such as noun phrases and verb phrases) and shows how those phrases relate to one another. Our parser returns a tagged phrase structure tree, in which every word carries a part-of-speech tag and every phrase carries a label.
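A tagged phrase structure tree is commonly written in bracketed form. The sketch below builds one such tree by hand and prints it; the nested-tuple representation, the function names, and the example sentence are assumptions for illustration, and the output format of the parser on this website may differ.

```python
def to_brackets(node):
    # A node is (label, word) for a leaf or (label, [children]) for a phrase.
    label, children = node
    if isinstance(children, str):
        return f"({label} {children})"
    return f"({label} " + " ".join(to_brackets(c) for c in children) + ")"

def leaves(node):
    # Collect (word, tag) pairs from left to right.
    label, children = node
    if isinstance(children, str):
        return [(children, label)]
    result = []
    for child in children:
        result.extend(leaves(child))
    return result

# Hand-built tree for "the desk takes space" (not parser output).
tree = ("S", [
    ("NP", [("DT", "the"), ("NN", "desk")]),
    ("VP", [("VBZ", "takes"), ("NP", [("NN", "space")])]),
])

print(to_brackets(tree))
# (S (NP (DT the) (NN desk)) (VP (VBZ takes) (NP (NN space))))
print(leaves(tree))
# [('the', 'DT'), ('desk', 'NN'), ('takes', 'VBZ'), ('space', 'NN')]
```

Here S, NP and VP label the sentence, noun phrases and verb phrase, while the innermost labels are the part-of-speech tags of the individual words.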

Part of Speech Tagging is the process of marking up a word in a text as corresponding to a particular part of speech. Here is a list of POS tags:

CC coordinating conjunction
CD cardinal number
DT determiner
EX existential there (like: “there is” … think of it like “there exists”)
FW foreign word
IN preposition/subordinating conjunction
JJ adjective "big"
JJR adjective, comparative "bigger"
JJS adjective, superlative "biggest"
LS list marker 1)
MD modal could, will
NN noun, singular "desk"
NNS noun plural "desks"
NNP proper noun, singular "Harrison"
NNPS proper noun, plural "Americans"
PDT predeterminer "all the kids"
POS possessive ending parent's
PRP personal pronoun I, he, she
PRP$ possessive pronoun my, his, hers
RB adverb very, silently
RBR adverb, comparative better
RBS adverb, superlative best
RP particle give up
TO to, as in: go "to" the store
UH interjection, errrrrrrrm
VB verb, base form take
VBD verb, past tense took
VBG verb, gerund/present participle taking
VBN verb, past participle taken
VBP verb, singular present, non-3rd person take
VBZ verb, 3rd person sing. present takes
WDT wh-determiner which
WP wh-pronoun who, what
WP$ possessive wh-pronoun whose
WRB wh-adverb where, when
. punctuation marks . , ; !
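
A tagger typically returns each word paired with one of the tags above. The sketch below shows how such output could be read with a small lookup table; the sentence is tagged by hand, and the names TAG_NAMES and describe are hypothetical, so this is not the tagger used on this website.

```python
# A few entries from the tag list above, for illustration.
TAG_NAMES = {
    "NNP": "proper noun, singular",
    "VBD": "verb, past tense",
    "DT": "determiner",
    "JJ": "adjective",
    "NN": "noun, singular",
    ".": "punctuation mark",
}

def describe(tagged):
    # Turn (word, tag) pairs into readable "word: description" strings.
    return [f"{word}: {TAG_NAMES.get(tag, 'unknown tag')}" for word, tag in tagged]

# Hand-tagged example sentence: "Harrison took the big desk."
tagged = [("Harrison", "NNP"), ("took", "VBD"), ("the", "DT"),
          ("big", "JJ"), ("desk", "NN"), (".", ".")]

for line in describe(tagged):
    print(line)
```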