Searched defs:max_bytes_per_token (Results 1 – 9 of 9) sorted by relevance

/third_party/mindspore/mindspore-src/source/tests/ut/python/dataset/
test_text_wordpiece_tokenizer.py
  100: vocab_list, unknown_token='[UNK]', max_bytes_per_token=100): argument
  120: … vocab_list, unknown_token='[UNK]', max_bytes_per_token=100): argument
test_text_bert_tokenizer.py
  173: max_bytes_per_token=100, unknown_token='[UNK]', argument
  202: max_bytes_per_token=100, unknown_token='[UNK]', argument
/third_party/mindspore/mindspore-src/source/mindspore/ccsrc/minddata/dataset/text/kernels/
bert_tokenizer_op.h
  42: … : wordpiece_tokenizer_(vocab, suffix_indicator, max_bytes_per_token, unknown_token, with_offsets), in wordpiece_tokenizer_() argument
wordpiece_tokenizer_op.cc
  30: const int &max_bytes_per_token, const std::string &unknown_token, in WordpieceTokenizerOp()
/third_party/mindspore/mindspore-src/source/mindspore/ccsrc/minddata/dataset/include/dataset/
text.h
  345: …: BertTokenizer(vocab, StringToChar(suffix_indicator), max_bytes_per_token, StringToChar(unknown_t… in BertTokenizer() argument
  1010: …: WordpieceTokenizer(vocab, StringToChar(suffix_indicator), max_bytes_per_token, StringToChar(unkn… in WordpieceTokenizer() argument
/third_party/mindspore/mindspore-src/source/mindspore/ccsrc/minddata/dataset/api/python/bindings/dataset/text/kernels/ir/
bindings.cc
  58: bool with_offsets) { in __anon702ab2a90502()
  281: int32_t max_bytes_per_token, const std::string &unknown_token, bool with_offsets) { in __anon702ab2a92c02()
/third_party/mindspore/mindspore-src/source/mindspore/python/mindspore/dataset/text/
transforms.py
  1054: def __init__(self, vocab, suffix_indicator='##', max_bytes_per_token=100, unknown_token='[UNK]', argument
  1271: … def __init__(self, vocab, suffix_indicator='##', max_bytes_per_token=100, unknown_token='[UNK]', argument
/third_party/mindspore/mindspore-src/source/mindspore/ccsrc/minddata/dataset/text/ir/kernels/
text_ir.cc
  121: int32_t max_bytes_per_token, const std::string &unknown_token, in BertTokenizerOperation()
  649: int32_t max_bytes_per_token, const std::string &unknown_token, in WordpieceTokenizerOperation()
/third_party/mindspore/mindspore-src/source/mindspore/ccsrc/minddata/dataset/api/
text.cc
  104: int32_t max_bytes_per_token, const std::vector<char> &unknown_token, bool lower_case, in BertTokenizer()
  448: int32_t max_bytes_per_token, const std::vector<char> &unknown_token, in WordpieceTokenizer()