"embeddings.position_ids - UNEXPECTED" warning started showing

Hello all, I’m fairly new to HF and this type of package, so please keep that in mind.

About 2 months ago I started a project that uses all-MiniLM-L6-v2 and cross-encoder/ms-marco-MiniLM-L-6-v2.

I finished the project, but recently came back to it to work on a new feature. However, I’m now getting lots of new verbose warnings/logging when downloading and using those models. These new warnings include not being authenticated while making requests to the HF Hub, and something to do with embeddings.position_ids being unexpected. Here’s an example:

Warning: You are sending unauthenticated requests to the HF Hub. Please set a HF_TOKEN to enable higher rate limits and faster downloads.
WARNING: Warning: You are sending unauthenticated requests to the HF Hub. Please set a HF_TOKEN to enable higher rate limits and faster downloads.
Loading weights: 100%|███████████████████████| 103/103 [00:00<00:00, 5300.26it/s, Materializing param=pooler.dense.weight]
BertModel LOAD REPORT from: sentence-transformers/all-MiniLM-L6-v2
Key                     | Status     |  |
------------------------+-----------+-+
embeddings.position_ids | UNEXPECTED |  |

Notes:

UNEXPECTED    :can be ignored when loading from different task/architecture; not ok if you expect identical arch.
Loading weights: 100%|█████████████████████████| 105/105 [00:00<00:00, 5050.25it/s, Materializing param=classifier.weight]
BertForSequenceClassification LOAD REPORT from: cross-encoder/ms-marco-MiniLM-L-6-v2
Key                          | Status     |  |
-----------------------------+-----------+-+
bert.embeddings.position_ids | UNEXPECTED |  |

Notes:

UNEXPECTED    :can be ignored when loading from different task/architecture; not ok if you expect identical arch.

I’m not really sure what to make of these new outputs. The authentication part seems straightforward, except that I don’t understand why I wasn’t encountering it before, and I didn’t realize authentication was even a thing for accessing HF. As for the UNEXPECTED status of embeddings.position_ids, I really have no idea.

Hoping someone with more knowledge can help me understand why these appeared without any changes on my end, and how I can start to solve them. FYI, the functionality of my project seems the same/unimpacted.


The message about that token basically appears to anyone who isn’t logged in and doesn’t pass a token.

The position_ids one is probably harmless too; it’s likely a side effect of the major update from Transformers 4 to 5.


What changed (even if you didn’t change your code)

These messages are almost always caused by a dependency update (e.g., huggingface_hub, transformers, accelerate, sentence-transformers) or by your environment/cache changing (new venv/container, cleared ~/.cache, etc.). Newer versions have become more explicit about:

  • Hub authentication / rate limits
  • “state_dict” key mismatches when loading weights (the “LOAD REPORT” table)

1) “You are sending unauthenticated requests to the HF Hub … set a HF_TOKEN”

Background

Most Hugging Face models (including the two you use) are public. You do not need to be logged in to download them.

However, Hugging Face applies rate limits to Hub requests, and limits depend on whether requests are associated with a user account / plan. (Hugging Face)

Newer huggingface_hub versions may emit a warning when the Hub responds with a warning header (formatted like X-HF-Warning: ...). (GitHub)
So you can be totally “allowed” to download the model, but still get warned that unauthenticated downloads may be slower / more rate-limited.

Why you didn’t see it before

Common reasons:

  • You were previously loading from the local cache, so no Hub requests were made.
  • You had previously logged in (token stored locally), but now you’re in a different environment or cleared the cache/token.
  • Your dependencies updated and now surface this warning.

How to fix (if you want)

Option A — log in once (recommended):

  • hf auth login stores a token in HF_HOME (defaults to ~/.cache/huggingface/token). (Hugging Face)

Option B — set an env var (good for servers/CI):

  • Set HF_TOKEN (a read token is enough for public downloads). (Hugging Face)

Option C — don’t authenticate, just reduce noise:

  • Set HF_HUB_VERBOSITY=error to quiet hub logging. (Hugging Face)
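For Option B, a minimal stdlib sketch of setting the token in-process (the token value is a placeholder; use a real read token from your account settings):

```python
import os

# Placeholder value -- replace with your real read token.
# setdefault keeps a token that is already set in the environment.
# This must run before the first import of huggingface_hub / transformers,
# since they read HF_TOKEN when making Hub requests.
os.environ.setdefault("HF_TOKEN", "hf_placeholder_token")

print(os.environ["HF_TOKEN"])
```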

2) embeddings.position_ids | UNEXPECTED in the “LOAD REPORT”

What “UNEXPECTED” means

When loading weights, the loader compares:

  • keys present in the checkpoint (state_dict)
  • keys expected by the model class you instantiated

If a key exists in the checkpoint but not in the model, it’s unexpected. PyTorch tracks this as unexpected_keys; depending on strict=True/False it can be an error or just a report. (PyTorch Docs)
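A minimal PyTorch sketch of that behavior, using a toy model rather than the actual BERT classes:

```python
import torch
import torch.nn as nn

# A toy model whose class does NOT define a "position_ids" entry.
model = nn.Linear(4, 2)

# A checkpoint with one extra key, mimicking an old BERT checkpoint
# that saved the position_ids buffer.
state = model.state_dict()
state["position_ids"] = torch.arange(16)

# strict=False loads everything that matches and merely reports the rest;
# strict=True would raise a RuntimeError instead.
result = model.load_state_dict(state, strict=False)
print(result.unexpected_keys)  # ['position_ids']
print(result.missing_keys)     # []
```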

Why it’s happening specifically for position_ids

position_ids is typically a buffer used to build positional embeddings. Across Transformers versions, some models changed whether this buffer is saved in checkpoints (persistent vs non-persistent). Old checkpoints may contain …position_ids, while newer model code may not expect it.
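The persistent vs non-persistent difference can be sketched in PyTorch like this (toy modules, not the actual BERT code):

```python
import torch
import torch.nn as nn

class SavesBuffer(nn.Module):
    def __init__(self):
        super().__init__()
        # persistent=True (the default): the buffer is written to checkpoints.
        self.register_buffer("position_ids", torch.arange(8))

class SkipsBuffer(nn.Module):
    def __init__(self):
        super().__init__()
        # persistent=False: same buffer at runtime, absent from state_dict.
        self.register_buffer("position_ids", torch.arange(8), persistent=False)

print("position_ids" in SavesBuffer().state_dict())   # True
print("position_ids" in SkipsBuffer().state_dict())   # False
```

A checkpoint saved by the first kind of module, loaded into the second kind, produces exactly an UNEXPECTED position_ids key.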

This exact symptom is widely reported in Transformers issues, e.g.:

  • unexpected_keys=['bert.embeddings.position_ids'] appearing “now but not yesterday” (GitHub)
  • Unexpected in state_dict: embeddings.position_ids after a “recent update” (GitHub)

Is it dangerous?

In your log, the only mismatch shown is position_ids, and your project behaves the same. In that case it’s almost always safe to ignore.

When you should worry: if you see many MISSING / UNEXPECTED entries for real weights (e.g., attention/MLP layers), because that can mean you loaded the wrong architecture or incompatible checkpoint.


3) Why you’re seeing duplicate warnings / lots of verbosity

Transformers uses both Python warnings and its own logging system, and it can route warnings through logging, which can create duplicate-looking output in some setups. (Hugging Face)
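A stdlib-only sketch of how that routing can double a message (this assumes nothing about transformers’ internals, just the standard warnings-to-logging bridge):

```python
import logging
import warnings

captured = []

class ListHandler(logging.Handler):
    def emit(self, record):
        captured.append(record.getMessage())

# Route Python warnings through the logging system, as libraries can do.
logging.captureWarnings(True)
logger = logging.getLogger("py.warnings")
logger.setLevel(logging.WARNING)
logger.addHandler(ListHandler())

warnings.warn("You are sending unauthenticated requests to the HF Hub.")

# The warning text now also exists as a log record; a library that both
# prints the warning and logs it produces the doubled output you saw.
print(captured)
```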


Practical “do this” checklist

A) Record the versions that changed

Run:

python -c "import transformers, huggingface_hub, sentence_transformers, accelerate; \
print('transformers', transformers.__version__); \
print('huggingface_hub', huggingface_hub.__version__); \
print('sentence-transformers', sentence_transformers.__version__); \
print('accelerate', accelerate.__version__)"

If these drifted since 2 months ago, that explains the new output.

B) Fix the authentication warning (optional, but recommended)

Use any of Options A–C from section 1 above: log in once, set HF_TOKEN, or just suppress the message.

C) Quiet the logs (if you want)

Environment variables (best set before importing libraries):

export HF_HUB_VERBOSITY=error
export TRANSFORMERS_VERBOSITY=error
export TRANSFORMERS_NO_ADVISORY_WARNINGS=1

HF_HUB_VERBOSITY is defined by huggingface_hub. (Hugging Face)
TRANSFORMERS_VERBOSITY / TRANSFORMERS_NO_ADVISORY_WARNINGS are documented by Transformers. (Hugging Face)
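If you’d rather not touch the shell, the same variables can be set in-process; they just need to run before the first import of the libraries, since those read the values at import/configuration time:

```python
import os

# Must run before `import transformers` / `import huggingface_hub`.
os.environ["HF_HUB_VERBOSITY"] = "error"
os.environ["TRANSFORMERS_VERBOSITY"] = "error"
os.environ["TRANSFORMERS_NO_ADVISORY_WARNINGS"] = "1"
```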


Bottom line

  • The HF_TOKEN message is about rate limits / download experience, not about needing permission for those public models. (Hugging Face)
  • …position_ids | UNEXPECTED is a known benign mismatch that commonly appears after Transformers-version changes. (GitHub)
  • If you want this to stop happening unexpectedly in the future, pin versions (pip-tools, poetry.lock, requirements.txt with exact versions) so upgrades don’t silently change runtime behavior.
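For example, a pinned requirements.txt looks like this (the version numbers below are placeholders; pin whatever `pip freeze` reports in your working environment):

```text
# requirements.txt -- exact versions, so upgrades are deliberate
transformers==4.41.0
huggingface_hub==0.23.0
sentence-transformers==2.7.0
accelerate==0.30.0
```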