vnTagger is an automatic part-of-speech tagger for Vietnamese texts. It is written in Java, so it runs on any platform with a Java runtime. Its precision and recall are in the range of 94%-95% on a Vietnamese treebank.
NEW: In June 2016, vnTagger 5.0 was released as part of Vitk, a new toolkit designed to process very large text data. Vitk is built on Apache Spark, a fast cluster computing platform. Check it out at https://github.com/phuonglh/vn.vitk
This software uses vnTokenizer 4.1.1 to segment text prior to tagging.
If you use vnTagger in a publication, please cite our article "An empirical study of maximum entropy approach for part-of-speech tagging of Vietnamese texts". vnTagger is also integrated into vnLExtractor and vnLTAGParser.