```
Traceback (most recent call last):
  File "/mnt/bn/intelligent-chatbot/FastChat_v2/LLaMA-Efficient-Tuning/src/inference.py", line 504, in <module>
    main(args)
  File "/usr/local/lib/python3.9/dist-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/mnt/bn/intelligent-chatbot/FastChat_v2/LLaMA-Efficient-Tuning/src/inference.py", line 380, in main
    outputs_tokenized = model.generate(**prompts_tokenized, do_sample=True, max_new_tokens=512, pad_token_id=tokenizer.eos_token_id, temperature=0.3)
  File "/usr/local/lib/python3.9/dist-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/home/tiger/.local/lib/python3.9/site-packages/transformers/generation/utils.py", line 1592, in generate
    return self.sample(
  File "/home/tiger/.local/lib/python3.9/site-packages/transformers/generation/utils.py", line 2734, in sample
    next_tokens = torch.multinomial(probs, num_samples=1).squeeze(1)
RuntimeError: probability tensor contains either inf, nan or element < 0
```
Hi @Saicy, how do you run inference? It seems you are using LLaMA-Factory. Can you elaborate on how you're getting that error? Is it after fine-tuning?
Yes, after I finish fine-tuning I can't run inference. I'm using the accelerate framework:

```python
outputs_tokenized = model.generate(
    **prompts_tokenized,
    do_sample=True,
    max_new_tokens=512,
    pad_token_id=tokenizer.eos_token_id,
    temperature=0.3,
)
outputs_tokenized = [
    tok_out[len(tok_in):]
    for tok_in, tok_out in zip(prompts_tokenized["input_ids"], outputs_tokenized)
]
outputs = tokenizer.batch_decode(outputs_tokenized, skip_special_tokens=True)
```
Hmm @Saicy, this usually happens when you have NaNs in your hidden states. Have you trained your model in fp16 by any chance? If that's the case, you should either switch to bf16 or to fp32 + mixed-precision training.
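For context on why fp16 fine-tuning tends to produce these inf/NaN values: fp16 has a maximum finite value of 65504, so any activation that grows past that during training overflows to inf, which then propagates into the logits and ultimately into the probability tensor that `torch.multinomial` rejects. bf16 keeps the full fp32 exponent range, so it doesn't overflow at these magnitudes. A minimal sketch of the fp16 range limit using only Python's stdlib `struct` half-precision format (this is an illustration of the numeric issue, not LLaMA-Factory-specific code):

```python
import struct

def fits_fp16(x: float) -> bool:
    """Return True if x can be packed into IEEE-754 half precision (fp16)."""
    try:
        struct.pack("<e", x)  # "e" is the half-precision (binary16) format
        return True
    except OverflowError:
        return False

print(fits_fp16(65504.0))  # True  — 65504 is the largest finite fp16 value
print(fits_fp16(70000.0))  # False — overflows fp16 (would become inf in a tensor)
```

The same value (70000) is perfectly representable in bf16 or fp32, which is why switching the training dtype resolves the overflow.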