Tokenizer class issue in the config file
#4 opened by ydshieh
Hi @ankrgyl. Do you know why we have "tokenizer_class": "RobertaTokenizer" in the config file instead of LayoutLMTokenizer? Is RobertaTokenizer used in fine-tuning this downstream QA task?
Yes! It's forked from here: https://huggingface.co/microsoft/layoutlm-base-cased/blob/main/config.json
Thanks! There might be some reason why layoutlm-base-cased uses RobertaTokenizer while layoutlm-base-uncased doesn't specify a class (and so will use LayoutLMTokenizer). However, this question should be posted on those repos.
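For anyone reading later, here is a rough sketch of the fallback being described: an explicit "tokenizer_class" in the config wins, otherwise a default is picked from the model type. This is a simplified illustration, not the actual transformers lookup code. The function name `resolve_tokenizer_class`, the mapping `DEFAULT_TOKENIZER_BY_MODEL_TYPE`, and the two inline config strings are all hypothetical and only mimic the relevant fields of the two repos' config.json files.

```python
import json

# Hypothetical, simplified default mapping from model type to tokenizer class name.
# The real library keeps a much larger mapping; this one exists only for illustration.
DEFAULT_TOKENIZER_BY_MODEL_TYPE = {"layoutlm": "LayoutLMTokenizer"}


def resolve_tokenizer_class(config_text: str) -> str:
    """Return the tokenizer class name implied by a config.json string.

    Assumption: an explicit "tokenizer_class" entry takes priority, and the
    model type default is used only when that entry is missing or empty.
    """
    config = json.loads(config_text)
    explicit = config.get("tokenizer_class")
    if explicit:
        return explicit
    return DEFAULT_TOKENIZER_BY_MODEL_TYPE[config["model_type"]]


# Made-up minimal configs that mirror only the fields discussed in this thread.
cased_like = '{"model_type": "layoutlm", "tokenizer_class": "RobertaTokenizer"}'
uncased_like = '{"model_type": "layoutlm"}'

print(resolve_tokenizer_class(cased_like))
print(resolve_tokenizer_class(uncased_like))
```

To see what is really loaded for a given repo, checking `type(tokenizer).__name__` after `AutoTokenizer.from_pretrained(...)` should settle it.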
Okay great, sounds good to me. If you discover anything super interesting, please update here :). I'll close this out for now.
ankrgyl changed discussion status to closed