IndicNLP
Microsoft's Indian language NLP suite — tokenizers, embeddings, and models for 11 Indian languages
About IndicNLP
IndicNLP is an open source Python library that provides tokenizers, embeddings, and models for 11 Indian languages. It is maintained by Microsoft Research India.
Key Features
✓ Tokenizers for 11+ Indian languages
✓ Word embeddings trained on Indian language corpora
✓ Sentence boundary detection for Indic scripts
✓ Script normalization across Indian writing systems
✓ Language identification for Indian languages
✓ Morphological analyzers
✓ Open source Python library
✓ Used in academic and industry research
✓ Foundation for many Indian NLP applications
✓ Maintained by Microsoft Research India
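To make two of the features above concrete, here is a minimal sketch of script detection and sentence boundary detection for Indic text. It uses only the Python standard library and is not IndicNLP's own API; the function names `detect_script` and `split_sentences` are hypothetical, chosen for illustration. It assumes that sentences end with a danda (।), double danda (॥), question mark, or exclamation mark, and it identifies the script only, not the language (Devanagari, for example, is shared by Hindi and Marathi).

```python
import re

# Unicode block ranges for several Indic scripts (per the Unicode Standard).
SCRIPT_RANGES = {
    "Devanagari": (0x0900, 0x097F),
    "Bengali": (0x0980, 0x09FF),
    "Gurmukhi": (0x0A00, 0x0A7F),
    "Gujarati": (0x0A80, 0x0AFF),
    "Oriya": (0x0B00, 0x0B7F),
    "Tamil": (0x0B80, 0x0BFF),
    "Telugu": (0x0C00, 0x0C7F),
    "Kannada": (0x0C80, 0x0CFF),
    "Malayalam": (0x0D00, 0x0D7F),
}

# Danda and double danda sit in the Devanagari block but are shared
# across scripts, so they are ignored when counting characters.
SHARED_PUNCTUATION = {0x0964, 0x0965}


def detect_script(text):
    """Return the name of the most frequent Indic script in text, or None."""
    counts = {}
    for ch in text:
        cp = ord(ch)
        if cp in SHARED_PUNCTUATION:
            continue
        for name, (lo, hi) in SCRIPT_RANGES.items():
            if lo <= cp <= hi:
                counts[name] = counts.get(name, 0) + 1
                break
    if not counts:
        return None
    return max(counts, key=counts.get)


def split_sentences(text):
    """Split text after a danda, double danda, '?' or '!' followed by whitespace."""
    parts = re.split(r"(?<=[\u0964\u0965?!])\s+", text.strip())
    return [p for p in parts if p]


if __name__ == "__main__":
    sample = "मैं घर जा रहा हूँ। तुम कहाँ हो?"
    print(detect_script(sample))
    for sentence in split_sentences(sample):
        print(sentence)
```

A production library would need to handle more cases than this sketch does, such as abbreviations, numerals, and text that mixes scripts.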
Who Is It For?
Researchers and engineering teams, in academia or industry, who build NLP applications for Indian languages.