Tags: csv, japanese, natural language processing
Japanese Tokenizer Dictionaries for use with MeCab.
Updated infrequently (typically less than once a year).
The versions of UniDic offered here are available under the GPL/LGPL/BSD license. IPADic is offered under a unique BSD-like license. See below.
This dataset provides dictionaries for Japanese tokenization and morphological analysis with MeCab: NINJAL's UniDic, a smaller modified version of UniDic for situations that require it, and the legacy IPADic dictionary.
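As a rough illustration of how one of these dictionaries might be used once it has been downloaded, the sketch below calls the `mecab` command-line tool with its `-d` option, which selects a dictionary directory. It assumes that `mecab` is installed and on the PATH, that the dictionary is UTF-8 encoded, and that each output line is the surface form, a tab, and then the dictionary's feature fields. The function names and the dictionary path are hypothetical; the layout of the bucket is not described here.

```python
import subprocess


def mecab_command(dicdir):
    """Build a MeCab command line that uses the dictionary in `dicdir`."""
    # -d selects the dictionary directory; `dicdir` is a hypothetical local path.
    return ["mecab", "-d", dicdir]


def parse_mecab_output(text):
    """Split MeCab output into (surface, features) pairs.

    Assumption: each token line is "surface<TAB>features" and each
    sentence ends with a line containing only "EOS".
    """
    tokens = []
    for line in text.splitlines():
        if not line or line == "EOS":
            continue
        surface, _, features = line.partition("\t")
        tokens.append((surface, features))
    return tokens


def tokenize(sentence, dicdir):
    """Run MeCab on one sentence and return its parsed tokens."""
    result = subprocess.run(
        mecab_command(dicdir),
        input=sentence + "\n",
        capture_output=True,
        encoding="utf-8",  # assumption: the dictionary was built as UTF-8
        check=True,
    )
    return parse_mecab_output(result.stdout)
```

The feature fields are left as a single string because their number and order differ between UniDic and IPADic.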
See all datasets managed by Cotonoha.
Japanese Tokenizer Dictionaries was accessed on DATE from https://registry.opendata.aws/cotonoha-dic.
aws s3 ls --no-sign-request s3://cotonoha-dic/
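The listing command above can also be scripted. The following is a minimal sketch, assuming the AWS CLI is installed; it builds the same unsigned `aws s3 ls` call plus an `aws s3 sync` call for copying objects to a local directory. The helper names, the `prefix` argument, and the local directory are hypothetical, and no particular key layout inside the bucket is assumed.

```python
import subprocess

BUCKET_URI = "s3://cotonoha-dic/"


def list_command(prefix=""):
    """Build an unsigned `aws s3 ls` call for the bucket (or a key prefix)."""
    return ["aws", "s3", "ls", "--no-sign-request", BUCKET_URI + prefix]


def sync_command(local_dir, prefix=""):
    """Build an unsigned `aws s3 sync` call that copies objects to `local_dir`."""
    # `local_dir` and `prefix` are hypothetical values chosen by the caller.
    return ["aws", "s3", "sync", "--no-sign-request", BUCKET_URI + prefix, local_dir]


def run(command):
    """Execute a prepared command, raising if the AWS CLI reports an error."""
    subprocess.run(command, check=True)
```

For example, `run(list_command())` prints the top level of the bucket, and `run(sync_command("dictionaries"))` would copy its contents into a local `dictionaries` directory.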