Clustering Algorithm for Gujarati Language

Abstract

Natural language processing is still an active research area, and it now attracts researchers worldwide. It includes analyzing a language based on its structure and then tagging each word with its appropriate grammatical category. We have a set of 50,000 tagged Gujarati words, and we cluster these words using a proposed algorithm that we define ourselves. Many clustering techniques are available, e.g. single linkage, complete linkage, and average linkage. Here the number of clusters to be formed is not known in advance, so it depends entirely on the type of data set provided. Clustering is a preprocessing step for stemming. Stemming is the process of extracting the root from a word. For example, cats = cat + s, meaning that cat is a noun and the word is in its plural form.
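To make the idea of clustering without a preset number of clusters concrete, the following is a minimal sketch of single-linkage agglomerative clustering with a distance cutoff. It is not the algorithm proposed in the paper, which the abstract does not specify. The names `prefix_distance` and `single_linkage_clusters`, the shared-prefix distance measure, the `threshold` value, and the English sample words are all assumptions chosen for illustration.

```python
def prefix_distance(a: str, b: str) -> float:
    """Hypothetical distance: 1 - (shared prefix length / longer word length)."""
    common = 0
    for x, y in zip(a, b):
        if x != y:
            break
        common += 1
    return 1.0 - common / max(len(a), len(b))


def single_linkage_clusters(words, threshold):
    """Repeatedly merge the two closest clusters until the closest pair
    is farther apart than `threshold`. The number of clusters is therefore
    determined by the data, not fixed in advance."""
    clusters = [[w] for w in words]
    while len(clusters) > 1:
        best = None  # (distance, i, j) of the closest pair of clusters
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Single linkage: distance between the two nearest members.
                d = min(prefix_distance(a, b)
                        for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        if d > threshold:
            break
        clusters[i].extend(clusters[j])
        del clusters[j]
    return clusters


# Assumed sample data; the paper's data set is 50,000 tagged Gujarati words.
words = ["cat", "cats", "walk", "walked", "walking"]
clusters = single_linkage_clusters(words, threshold=0.5)
print(clusters)
```

With this assumed distance, words that share a long prefix fall into the same cluster, and each cluster can then be passed to a stemmer as a group of candidate forms of one root. Replacing `min` with `max` in the linkage step would give complete linkage instead.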
