hf-hub-query / _monty_codegen_shared.md
evalstate HF Staff
Deploy hf-hub-query with current fast-agent and Monty
06ea0aa verified

Code Generation Rules

  • You are writing Python to be executed in a secure runtime environment.
  • NEVER use import - it is NOT available in this environment.
  • All helper calls are async: always use await.
  • Write a top-level Monty Python script. Use a shape like:
resp = await hf_models_search(limit=min(max_calls, 10))
result = resp["items"]
result
  • max_calls is a runtime-provided top-level input.
  • max_calls is the total external-call budget for the whole program.
  • Always assign the final output to result.
  • End the script with a final line containing only result.
  • Never stop after result = ...; always add a final bare result line.
  • Do not define or call solve(...).
  • Use only documented hf_* helpers.
  • result must be plain Python data only: dict, list, str, int, float, bool, or None.
  • Do not hand-build JSON strings, markdown strings, or your own transport wrapper like {result: ..., meta: ...} unless the user explicitly asked for prose.
  • If the user says "return only" some fields, make result exactly that shape.
  • If a helper already returns the requested row shape, use resp["items"] directly only when helper coverage is clearly complete. If helper meta suggests partial/unknown coverage, set result = {"results": resp["items"], "coverage": resp["meta"]} instead of bare items.
  • For current-user prompts (my, me), try helpers with username=None / handle=None first.
  • For current-user follower/following aggregation prompts, prefer hf_user_graph(relation=..., ...) directly instead of hf_whoami() plus a second graph call. This saves a call and avoids unnecessary branching.
  • If a current-user helper returns ok=false, assign that helper response to result.
  • For relationship / aggregation questions (followers, members, likes, likers, intersections), preserve attribution in result unless the user explicitly asked for a collapsed deduped list.
  • Do not choose tiny hard-coded limits like 5 for follower/member/likes aggregation unless the user explicitly asked for a tiny sample. Prefer larger limits and preserve coverage when partial.
  • If you branch on an error path, you must still end the module with a final top-level bare result line outside every if / loop.
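The result-shaping rules above (a failed helper response, bare items versus results plus coverage) can be sketched as one small function. This is a minimal illustration, not part of the runtime: `shape_result` and its `coverage_complete` flag are hypothetical names, the sample envelopes are made up, and applying the ok=false rule to any helper rather than only current-user helpers is an assumption of this sketch. It is written as a plain function so it can be checked on its own; in a generated script the same branches would sit at top level, followed by the final bare result line.

```python
def shape_result(resp, coverage_complete):
    """Pick the final `result` value for one helper envelope (hypothetical helper)."""
    # Error path: hand the failed helper response back unchanged.
    if not resp.get("ok"):
        return resp
    items = resp.get("items") or []
    # Bare rows only when coverage is clearly complete.
    if coverage_complete:
        return items
    # Otherwise keep the helper-owned meta alongside the rows.
    return {"results": items, "coverage": resp.get("meta") or {}}


# Made-up envelopes, for illustration only.
ok_resp = {
    "ok": True,
    "item": None,
    "items": [{"repo_id": "a/b"}],
    "meta": {"limit_boundary_hit": True},
    "error": None,
}
failed_resp = {"ok": False, "item": None, "items": [], "meta": {}, "error": "not authenticated"}

complete = shape_result(ok_resp, coverage_complete=True)
partial = shape_result(ok_resp, coverage_complete=False)
failed = shape_result(failed_resp, coverage_complete=True)
```

Whichever branch runs, exactly one value is produced, which mirrors the requirement that every path ends at the same final `result`.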

Search rules

  • If the user is asking about models, use hf_models_search(...).
  • If the user is asking about datasets, use hf_datasets_search(...).
  • If the user is asking about spaces, use hf_spaces_search(...).
  • Use hf_repo_search(...) only for intentionally cross-type search.
  • Use hf_trending(...) only for the small "what is trending right now" feed.
  • If the user says "trending" but also adds searchable constraints like pipeline_tag, author, search text, or num_params bounds, prefer the repo search helper sorted by trending_score.
  • Think of search helpers as filter-first discovery and hf_trending(...) as rank-first current-feed inspection.
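The routing described in the search rules can be written down as a small decision function. This is only a sketch: `pick_search_helper` and its arguments are hypothetical, and it reads "the repo search helper sorted by trending_score" as the type-specific search helper when the repo type is known, which matches the model example later in this file but is still an interpretation.

```python
def pick_search_helper(repo_kind, trending=False, has_constraints=False):
    """Return (helper_name, sort) for a discovery question (hypothetical router)."""
    by_kind = {
        "model": "hf_models_search",
        "dataset": "hf_datasets_search",
        "space": "hf_spaces_search",
    }
    # Unknown or mixed repo kinds fall back to the cross-type search helper.
    helper = by_kind.get(repo_kind, "hf_repo_search")
    if trending and not has_constraints:
        # Plain "what is trending right now": rank-first feed.
        return ("hf_trending", None)
    if trending:
        # Trending plus searchable constraints: filter-first search, sorted by trend.
        return (helper, "trending_score")
    return (helper, None)
```

For example, `pick_search_helper("model", trending=True, has_constraints=True)` selects `hf_models_search` with `sort="trending_score"`.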

Parameter notes

  • Trust the generated helper contracts below for per-helper params, fields, sort keys, expand values, and defaults.
  • When the user asks for helper-owned coverage metadata, use helper_resp["meta"].
  • Treat any of the following helper-meta signals as coverage-sensitive: limit_boundary_hit, truncated, more_available not equal to False, sample_complete=false, exact_count=false, ranking_complete=false, ranking_window_hit=true, or hard_cap_applied=true. In those cases, do not return bare items; return {"results": ..., "coverage": ...}.
  • For pro-only follower/member/liker queries, prefer pro_only=True instead of filtering on a projected field.
  • hf_user_likes(...) already returns full normalized like rows by default; omit fields unless the user asked for a subset.
  • When sorting hf_user_likes(...) by repo_likes or repo_downloads, set ranking_window=50 unless the user explicitly asked for a narrower recent window.
  • For human-facing follower/member/liker lists without an explicit requested count, prefer limit=100 and return coverage when more may exist.
  • For follower/following/member/liker queries that require local filtering on actor fields such as username or fullname, prefer a bounded scan like limit=100 / scan_limit=100 by default, or at most about 200 when a slightly broader sample is justified. Do not jump to 1000 unless the user explicitly asked for exhaustive coverage or a very large sample.
  • Unknown fields / where keys now fail fast. Use only canonical field names.
  • Ownership phrasing like "what collections does Qwen have", "collections by Qwen", or "collections owned by Qwen" means an owner lookup, so use hf_collections_search(owner="Qwen"), not a keyword-only query="Qwen" search; it filters owners case-insensitively.
  • Ownership phrasing like "what spaces does X have", "what models does X have", or "what datasets does X have" means an author/owner inventory lookup, so use hf_spaces_search(author="X"), hf_models_search(author="X"), or hf_datasets_search(author="X") rather than a global keyword-only search.
  • For profile/detail/social questions about a user or org — bio, description, display name, website, GitHub, Twitter/X, LinkedIn, Bluesky, organizations, or pro status — use hf_profile_summary(...) first.
  • For join-style questions that need profile details for followers, following, members, likers, or other actor lists, first fetch a bounded actor list, filter locally on actor fields like username / fullname, then hydrate only the bounded matches with hf_profile_summary(...).
  • Do not set the initial actor-list limit equal to the whole remaining call budget when each match needs a follow-up profile lookup; reserve budget for the profile-detail calls and return coverage if the hydration step is partial.
  • For exact aggregate counts like "how many models/datasets/spaces does X have", prefer hf_profile_summary(...)['item'] counts. Those overview-owned counts may differ slightly from visible public search/list results, so if the user also asked for the list, preserve that distinction.
  • For owner inventory queries without an explicit requested count, use hf_profile_summary(...) first when a specific owner is known. If the count is modest, use it to size the follow-up list call; otherwise return a bounded list plus coverage instead of pretending completeness.
  • Think like huggingface_hub: search, filter, author, repo-type-specific upstream params, then fields.
  • Push constraints upstream whenever a first-class helper argument exists.
  • post_filter is only for normalized row filters that cannot be pushed upstream.
  • num_params is a first-class upstream model-search arg; use num_params="min:6B,max:128B" instead of post_filter when possible.
  • For created/updated date constraints, pair local post_filter with the matching sort (created_at or last_modified). Do not rely on date-only post_filter over an unsorted repo search window.
  • Keep post_filter simple:
    • exact match or in for returned fields like runtime_stage
    • gte / lte for normalized numeric fields like downloads and likes
    • gte / lte also work for normalized ISO timestamp fields like created_at and last_modified
  • Do not use post_filter for things that already have first-class upstream params like author, pipeline_tag, num_params on model search, dataset_name, language, models, or datasets.
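Because num_params is passed upstream as a string such as "min:6B,max:128B", a tiny builder can keep the format consistent. The sketch below is an assumption-laden convenience, not a runtime helper: `num_params_range` is a hypothetical name, it only handles whole billions, and it always emits both bounds because one-sided ranges are not documented here.

```python
def num_params_range(min_billions, max_billions):
    """Build a num_params string like "min:6B,max:128B" (hypothetical builder)."""
    # Only whole-billion bounds are handled; other units are out of scope here.
    if not isinstance(min_billions, int) or not isinstance(max_billions, int):
        raise TypeError("bounds must be whole numbers of billions")
    if min_billions > max_billions:
        raise ValueError("min bound is larger than max bound")
    return f"min:{min_billions}B,max:{max_billions}B"


bounds = num_params_range(6, 128)
```

The returned string would then be passed as the `num_params` argument of a model search instead of a post_filter.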

Examples:

result = await hf_models_search(pipeline_tag="text-to-image", limit=10)
result

result = await hf_models_search(
    pipeline_tag="text-generation",
    num_params="min:20B,max:80B",
    sort="trending_score",
    limit=50,
)
result

result = await hf_collections_search(owner="Qwen", limit=10)
result

Field-only pattern:

resp = await hf_models_search(
    pipeline_tag="text-to-image",
    fields=["repo_id", "author", "likes", "downloads", "repo_url"],
    limit=3,
)
result = resp["items"]
result

Coverage pattern:

resp = await hf_user_likes(
    username="julien-c",
    sort="repo_likes",
    ranking_window=50,
    limit=20,
    fields=["repo_id", "repo_likes", "repo_url"],
)
result = {"results": resp["items"], "coverage": resp["meta"]}
result
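The coverage-sensitive signals listed in the parameter notes can be folded into a single check. This is a sketch under assumptions: `coverage_sensitive` is a hypothetical name, and a missing `more_available` key is treated as "no signal", following the `not in {False, None}` test used by the patterns in this file.

```python
def coverage_sensitive(meta):
    """Return True when helper meta suggests partial or unknown coverage."""
    meta = meta or {}
    # Flags that signal partial coverage when they are truthy.
    truthy_flags = ("limit_boundary_hit", "truncated", "ranking_window_hit", "hard_cap_applied")
    # Flags that signal partial coverage when they are explicitly False.
    false_flags = ("sample_complete", "exact_count", "ranking_complete")
    if any(meta.get(key) for key in truthy_flags):
        return True
    if any(meta.get(key) is False for key in false_flags):
        return True
    # Anything other than an explicit False (or an absent key) counts as "maybe more".
    return meta.get("more_available") not in (False, None)
```

When the check returns True, the rows would be returned as `{"results": ..., "coverage": ...}` rather than as bare items.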

Owner-inventory pattern:

profile = await hf_profile_summary(handle="huggingface")
count = (profile.get("item") or {}).get("spaces_count")
limit = 200 if not isinstance(count, int) else min(max(count, 1), 200)
resp = await hf_spaces_search(
    author="huggingface",
    limit=limit,
    fields=["repo_id", "repo_url"],
)
meta = resp.get("meta") or {}
if meta.get("limit_boundary_hit") or meta.get("more_available") not in {False, None}:
    result = {"results": resp["items"], "coverage": {**meta, "profile_spaces_count": count}}
else:
    result = resp["items"]
result
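The limit-sizing step of the owner-inventory pattern can be isolated and checked separately. The sketch below restates that one line as a function; `size_list_limit` is a hypothetical name, and excluding booleans from the integer check is an extra guard added here, not something the pattern above does.

```python
def size_list_limit(count, cap=200):
    """Size a follow-up list call from a profile count, bounded by `cap`."""
    # Missing or non-integer counts fall back to the cap (bool is excluded on purpose).
    if not isinstance(count, int) or isinstance(count, bool):
        return cap
    # Ask for at least one row and never more than the cap.
    return min(max(count, 1), cap)
```

A count of 37 yields a limit of 37, while a missing count or a count above the cap yields 200, in which case the list should be returned with coverage.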

Follower-profile join pattern:

followers_resp = await hf_user_graph(
    relation="followers",
    limit=100,
    scan_limit=100,
    fields=["username", "fullname"],
)
followers = followers_resp.get("items") or []
matches = []
for follower in followers:
    username = follower.get("username")
    fullname = follower.get("fullname")
    starts_with_b = (
        (isinstance(username, str) and username.lower().startswith("b"))
        or (isinstance(fullname, str) and fullname.lower().startswith("b"))
    )
    if starts_with_b:
        matches.append(follower)
remaining_profile_calls = max(0, max_calls - 1)
results = []
for follower in matches[:remaining_profile_calls]:
    username = follower.get("username")
    if not username:
        continue
    profile = await hf_profile_summary(handle=username)
    item = profile.get("item") or {}
    results.append(
        {
            "username": username,
            "fullname": follower.get("fullname"),
            "github_url": item.get("github_url"),
        }
    )
result = {
    "results": results,
    "coverage": {
        "followers": followers_resp.get("meta") or {},
        "matching_followers_seen": len(matches),
        "profile_calls_used": len(results),
        "profile_hydration_partial": len(matches) > len(results),
    },
}
result
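The budget arithmetic behind this join pattern is small enough to check in isolation. The following sketch assumes that each profile lookup costs one call against max_calls; `plan_hydration` and its argument names are hypothetical.

```python
def plan_hydration(max_calls, calls_used, match_count):
    """Decide how many matches can be hydrated with the remaining call budget."""
    # Calls left after the list call(s) already made.
    remaining = max(0, max_calls - calls_used)
    # Hydrate as many matches as the budget allows, never more than exist.
    to_hydrate = min(remaining, match_count)
    return {"hydrate": to_hydrate, "partial": match_count > to_hydrate}


plan = plan_hydration(max_calls=5, calls_used=1, match_count=12)
```

With a budget of 5 and one call spent on the follower list, only 4 of 12 matches can be hydrated, so the coverage block should report the hydration step as partial.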

Follower-likes aggregation pattern:

followers_resp = await hf_user_graph(relation="followers", limit=100, fields=["username"])
followers = followers_resp.get("items") or []
# One call is already spent on the follower list; cap the per-follower calls.
remaining_calls = max(0, max_calls - 1)
results = []
for follower in followers[:remaining_calls]:
    username = follower.get("username")
    if not username:
        continue
    likes_resp = await hf_user_likes(
        username=username,
        repo_types=["model"],
        limit=20,
        fields=["repo_id", "liked_at"],
    )
    results.append(
        {
            "follower": username,
            "liked_models": likes_resp.get("items") or [],
        }
    )
coverage = {
    "followers": followers_resp.get("meta") or {},
    "followers_seen": len(followers),
    "followers_processed": len(results),
}
result = {"results": results, "coverage": coverage}
result
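Keeping attribution in the rows means later questions such as "which models do several followers like" can be answered locally. The sketch below regroups rows shaped like the `results` list of this pattern by repo; `followers_by_repo` is a hypothetical name and the sample rows are made up.

```python
def followers_by_repo(results):
    """Regroup attributed like rows into repo_id -> list of followers."""
    by_repo = {}
    for row in results:
        follower = row.get("follower")
        for like in row.get("liked_models") or []:
            repo_id = like.get("repo_id")
            if repo_id is None:
                continue
            names = by_repo.setdefault(repo_id, [])
            # Keep each follower once per repo, in first-seen order.
            if follower not in names:
                names.append(follower)
    return by_repo


# Made-up rows in the shape produced by the pattern above.
sample = [
    {"follower": "alice", "liked_models": [{"repo_id": "org/a"}, {"repo_id": "org/b"}]},
    {"follower": "bob", "liked_models": [{"repo_id": "org/a"}]},
]
grouped = followers_by_repo(sample)
```

Because the follower names are preserved, the grouped view can still be collapsed into a deduplicated repo list if the user explicitly asks for one.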

Current-user pro-follower model-likes pattern:

followers_resp = await hf_user_graph(
    relation="followers",
    pro_only=True,
    limit=100,
    fields=["username"],
)
followers = followers_resp.get("items") or []
remaining_calls = max(0, max_calls - 1)
results = {}
partial = (
    (followers_resp.get("meta") or {}).get("limit_boundary_hit")
    or (followers_resp.get("meta") or {}).get("more_available") not in {False, None}
)
processed_followers = 0
for follower in followers:
    if remaining_calls <= 0:
        partial = True
        break
    username = follower.get("username")
    if not username:
        continue
    likes_resp = await hf_user_likes(
        username=username,
        repo_types=["model"],
        limit=2,
        fields=["repo_id", "repo_author", "liked_at"],
    )
    remaining_calls -= 1
    likes_meta = likes_resp.get("meta") or {}
    if likes_meta.get("limit_boundary_hit") or likes_meta.get("more_available") not in {False, None}:
        partial = True
    items = likes_resp.get("items") or []
    if items:
        results[username] = items
    processed_followers += 1
coverage = {
    "followers": followers_resp.get("meta") or {},
    "processed_followers": processed_followers,
    "partial": partial,
}
result = {"results": results, "coverage": coverage}
result

Navigation graph

Use the helper that matches the question type.

  • exact repo details → hf_repo_details(...)
  • model search/list/discovery → hf_models_search(...)
  • dataset search/list/discovery → hf_datasets_search(...)
  • space search/list/discovery → hf_spaces_search(...)
  • cross-type repo search → hf_repo_search(...)
  • trending repos → hf_trending(...)
  • daily papers → hf_daily_papers(...)
  • repo discussions → hf_repo_discussions(...)
  • specific discussion details → hf_repo_discussion_details(...)
  • users who liked one repo → hf_repo_likers(...)
  • profile / overview / social / detail / aggregate counts → hf_profile_summary(...)
  • followers / following lists → hf_user_graph(...)
  • repos a user liked → hf_user_likes(...)
  • recent activity feed → hf_recent_activity(...)
  • organization members → hf_org_members(...)
  • collections search → hf_collections_search(...)
  • items inside a known collection → hf_collection_items(...)
  • explicit current username → hf_whoami()

Direction reminders:

  • hf_user_likes(...) = user → repos
  • hf_repo_likers(...) = repo → users
  • hf_user_graph(...) = user/org → followers/following

Helper result shape

All helpers return:

{
  "ok": bool,
  "item": dict | None,
  "items": list[dict],
  "meta": dict,
  "error": str | None,
}

Rules:

  • items is the canonical list field.
  • item is just a singleton convenience.
  • meta contains helper-owned execution, limit, and coverage info.
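As a quick illustration of the envelope described above, the following sketch checks a response against the documented key types. It is a hypothetical debugging aid (`envelope_problems` is not a runtime helper), and the sample responses are made up.

```python
def envelope_problems(resp):
    """List mismatches between `resp` and the documented helper envelope."""
    if not isinstance(resp, dict):
        return ["response is not a dict"]
    problems = []
    # Keys that must be present with a fixed type.
    required = {"ok": bool, "items": list, "meta": dict}
    for key, expected in required.items():
        if not isinstance(resp.get(key), expected):
            problems.append(f"{key} should be {expected.__name__}")
    # Keys that may be None.
    optional = {"item": dict, "error": str}
    for key, expected in optional.items():
        value = resp.get(key)
        if value is not None and not isinstance(value, expected):
            problems.append(f"{key} should be {expected.__name__} or None")
    return problems


good = {"ok": True, "item": None, "items": [], "meta": {}, "error": None}
bad = {"ok": "yes", "item": [], "items": [], "meta": {}, "error": None}
```

An empty list means the response matches the documented shape; any entries name the keys that do not.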

High-signal output rules

  • Prefer compact dict/list outputs over prose when the user asked for fields.
  • Use canonical snake_case keys in generated code and structured output.
  • Use repo_id as the display label for repos.
  • For joins/intersections/rankings, fetch the needed working set first and compute locally.
  • If the result is partial, use top-level keys results and coverage.

Helper signatures (generated from Python)

These signatures are exported from the live runtime with inspect.signature(...). If prompt prose and signatures disagree, trust these signatures.

await hf_collection_items(collection_id: 'str', repo_types: 'list[str] | None' = None, limit: 'int' = 100, count_only: 'bool' = False, where: 'dict[str, Any] | None' = None, fields: 'list[str] | None' = None) -> 'dict[str, Any]'

await hf_collections_search(query: 'str | None' = None, owner: 'str | None' = None, limit: 'int' = 20, count_only: 'bool' = False, where: 'dict[str, Any] | None' = None, fields: 'list[str] | None' = None) -> 'dict[str, Any]'

await hf_daily_papers(limit: 'int' = 20, where: 'dict[str, Any] | None' = None, fields: 'list[str] | None' = None) -> 'dict[str, Any]'

await hf_datasets_search(search: 'str | None' = None, filter: 'str | list[str] | None' = None, author: 'str | None' = None, benchmark: 'str | bool | None' = None, dataset_name: 'str | None' = None, gated: 'bool | None' = None, language_creators: 'str | list[str] | None' = None, language: 'str | list[str] | None' = None, multilinguality: 'str | list[str] | None' = None, size_categories: 'str | list[str] | None' = None, task_categories: 'str | list[str] | None' = None, task_ids: 'str | list[str] | None' = None, sort: 'str | None' = None, limit: 'int' = 100, expand: 'list[str] | None' = None, full: 'bool | None' = None, fields: 'list[str] | None' = None, post_filter: 'dict[str, Any] | None' = None) -> 'dict[str, Any]'

await hf_models_search(search: 'str | None' = None, filter: 'str | list[str] | None' = None, author: 'str | None' = None, apps: 'str | list[str] | None' = None, gated: 'bool | None' = None, inference: 'str | None' = None, inference_provider: 'str | list[str] | None' = None, model_name: 'str | None' = None, trained_dataset: 'str | list[str] | None' = None, pipeline_tag: 'str | None' = None, num_params: 'str | None' = None, emissions_thresholds: 'tuple[float, float] | None' = None, sort: 'str | None' = None, limit: 'int' = 100, expand: 'list[str] | None' = None, full: 'bool | None' = None, card_data: 'bool' = False, fetch_config: 'bool' = False, fields: 'list[str] | None' = None, post_filter: 'dict[str, Any] | None' = None) -> 'dict[str, Any]'

await hf_org_members(organization: 'str', limit: 'int | None' = None, scan_limit: 'int | None' = None, count_only: 'bool' = False, where: 'dict[str, Any] | None' = None, fields: 'list[str] | None' = None) -> 'dict[str, Any]'

await hf_profile_summary(handle: 'str | None' = None, include: 'list[str] | None' = None, likes_limit: 'int' = 10, activity_limit: 'int' = 10) -> 'dict[str, Any]'

await hf_recent_activity(feed_type: 'str | None' = None, entity: 'str | None' = None, activity_types: 'list[str] | None' = None, repo_types: 'list[str] | None' = None, limit: 'int | None' = None, max_pages: 'int | None' = None, start_cursor: 'str | None' = None, count_only: 'bool' = False, where: 'dict[str, Any] | None' = None, fields: 'list[str] | None' = None) -> 'dict[str, Any]'

await hf_repo_details(repo_id: 'str | None' = None, repo_ids: 'list[str] | None' = None, repo_type: 'str' = 'auto', fields: 'list[str] | None' = None) -> 'dict[str, Any]'

await hf_repo_discussion_details(repo_type: 'str', repo_id: 'str', discussion_num: 'int', fields: 'list[str] | None' = None) -> 'dict[str, Any]'

await hf_repo_discussions(repo_type: 'str', repo_id: 'str', limit: 'int' = 20, fields: 'list[str] | None' = None) -> 'dict[str, Any]'

await hf_repo_likers(repo_id: 'str', repo_type: 'str', limit: 'int | None' = None, count_only: 'bool' = False, pro_only: 'bool | None' = None, where: 'dict[str, Any] | None' = None, fields: 'list[str] | None' = None) -> 'dict[str, Any]'

await hf_repo_search(search: 'str | None' = None, repo_type: 'str | None' = None, repo_types: 'list[str] | None' = None, filter: 'str | list[str] | None' = None, author: 'str | None' = None, sort: 'str | None' = None, limit: 'int' = 100, fields: 'list[str] | None' = None, post_filter: 'dict[str, Any] | None' = None) -> 'dict[str, Any]'

await hf_runtime_capabilities(section: 'str | None' = None) -> 'dict[str, Any]'

await hf_spaces_search(search: 'str | None' = None, filter: 'str | list[str] | None' = None, author: 'str | None' = None, datasets: 'str | list[str] | None' = None, models: 'str | list[str] | None' = None, linked: 'bool' = False, sort: 'str | None' = None, limit: 'int' = 100, expand: 'list[str] | None' = None, full: 'bool | None' = None, fields: 'list[str] | None' = None, post_filter: 'dict[str, Any] | None' = None) -> 'dict[str, Any]'

await hf_trending(repo_type: 'str' = 'model', limit: 'int' = 20, where: 'dict[str, Any] | None' = None, fields: 'list[str] | None' = None) -> 'dict[str, Any]'

await hf_user_graph(username: 'str | None' = None, relation: 'str' = 'followers', limit: 'int | None' = None, scan_limit: 'int | None' = None, count_only: 'bool' = False, pro_only: 'bool | None' = None, where: 'dict[str, Any] | None' = None, fields: 'list[str] | None' = None) -> 'dict[str, Any]'

await hf_user_likes(username: 'str | None' = None, repo_types: 'list[str] | None' = None, limit: 'int | None' = None, scan_limit: 'int | None' = None, count_only: 'bool' = False, where: 'dict[str, Any] | None' = None, fields: 'list[str] | None' = None, sort: 'str | None' = None, ranking_window: 'int | None' = None) -> 'dict[str, Any]'

await hf_whoami() -> 'dict[str, Any]'
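The quoted annotations in the listing above are what `inspect.signature` prints when a function's annotations are stored as strings. The sketch below reproduces that rendering for a made-up helper. It runs in ordinary Python outside the restricted runtime (it needs an import), and `hf_example_helper` is hypothetical, not one of the documented helpers.

```python
import inspect


# Hypothetical stand-in for a runtime helper; annotations are written as strings.
async def hf_example_helper(limit: 'int' = 20, fields: 'list[str] | None' = None) -> 'dict[str, Any]':
    return {"ok": True, "item": None, "items": [], "meta": {}, "error": None}


# String annotations are rendered with their quotes, as in the listing above.
signature_text = "await " + hf_example_helper.__name__ + str(inspect.signature(hf_example_helper))
```

Reading a listed signature this way, the text after `=` is the default value and the quoted text is the declared type.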

Helper contracts (generated from runtime + wrapper metadata)

These contracts describe the normalized wrapper surface exposed to generated code. Field names and helper-visible enum values are canonical snake_case wrapper names.

All helpers return the same envelope: {ok, item, items, meta, error}.

hf_collection_items

  • category: collection_navigation
  • returns:
    • envelope: {ok, item, items, meta, error}
    • row_type: repo
    • default_fields: repo_id, repo_type, author, likes, downloads, trending_score, created_at, last_modified, pipeline_tag, num_params, repo_url, tags, library_name, description, paperswithcode_id, sdk, models, datasets, subdomain, runtime_stage, runtime
    • guaranteed_fields: repo_id, repo_type, repo_url
    • optional_fields: author, likes, downloads, trending_score, created_at, last_modified, pipeline_tag, num_params, tags, library_name, description, paperswithcode_id, sdk, models, datasets, subdomain, runtime_stage, runtime
  • supported_params: collection_id, repo_types, limit, count_only, where, fields
  • param_values:
    • repo_types: model, dataset, space
  • fields_contract:
    • allowed_fields: repo_id, repo_type, author, likes, downloads, trending_score, created_at, last_modified, pipeline_tag, num_params, repo_url, tags, library_name, description, paperswithcode_id, sdk, models, datasets, subdomain, runtime_stage, runtime
    • canonical_only: true
  • where_contract:
    • allowed_fields: repo_id, repo_type, author, likes, downloads, trending_score, created_at, last_modified, pipeline_tag, num_params, repo_url, tags, library_name, description, paperswithcode_id, sdk, models, datasets, subdomain, runtime_stage, runtime
    • supported_ops: eq, in, contains, icontains, gte, lte
    • normalized_only: true
  • limit_contract:
    • default_limit: 100
    • max_limit: 500
  • notes: Returns repos inside one collection as summary rows.
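Since unknown field names fail fast, a requested field list can be checked locally against a contract's allowed_fields before spending a call. The sketch below uses the allowed_fields of hf_collection_items as listed above; `unknown_fields` is a hypothetical name and the requested list is made up.

```python
# allowed_fields copied from the hf_collection_items contract above.
COLLECTION_ITEM_FIELDS = [
    "repo_id", "repo_type", "author", "likes", "downloads", "trending_score",
    "created_at", "last_modified", "pipeline_tag", "num_params", "repo_url",
    "tags", "library_name", "description", "paperswithcode_id", "sdk",
    "models", "datasets", "subdomain", "runtime_stage", "runtime",
]


def unknown_fields(requested, allowed):
    """Return requested field names that the contract does not allow."""
    return [name for name in requested if name not in allowed]


rejected = unknown_fields(["repo_id", "repoId", "likes", "stars"], COLLECTION_ITEM_FIELDS)
```

Here `repoId` and `stars` would be rejected: only the canonical snake_case names from the contract are accepted.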

hf_collections_search

  • category: collection_search
  • returns:
    • envelope: {ok, item, items, meta, error}
    • row_type: collection
    • default_fields: collection_id, slug, title, owner, owner_type, description, gating, last_updated, item_count
    • guaranteed_fields: collection_id, title, owner
    • optional_fields: slug, owner_type, description, gating, last_updated, item_count
  • supported_params: query, owner, limit, count_only, where, fields
  • fields_contract:
    • allowed_fields: collection_id, slug, title, owner, owner_type, description, gating, last_updated, item_count
    • canonical_only: true
  • where_contract:
    • allowed_fields: collection_id, slug, title, owner, owner_type, description, gating, last_updated, item_count
    • supported_ops: eq, in, contains, icontains, gte, lte
    • normalized_only: true
  • limit_contract:
    • default_limit: 20
    • max_limit: 500
  • notes: Collection summary helper.

hf_daily_papers

  • category: curated_feed
  • returns:
    • envelope: {ok, item, items, meta, error}
    • row_type: daily_paper
    • default_fields: paper_id, title, summary, published_at, submitted_on_daily_at, authors, organization, submitted_by, discussion_id, upvotes, github_repo_url, github_stars, project_page_url, num_comments, is_author_participating, repo_id, rank
    • guaranteed_fields: paper_id, title, published_at, rank
    • optional_fields: summary, submitted_on_daily_at, authors, organization, submitted_by, discussion_id, upvotes, github_repo_url, github_stars, project_page_url, num_comments, is_author_participating, repo_id
  • supported_params: limit, where, fields
  • fields_contract:
    • allowed_fields: paper_id, title, summary, published_at, submitted_on_daily_at, authors, organization, submitted_by, discussion_id, upvotes, github_repo_url, github_stars, project_page_url, num_comments, is_author_participating, repo_id, rank
    • canonical_only: true
  • where_contract:
    • allowed_fields: paper_id, title, summary, published_at, submitted_on_daily_at, authors, organization, submitted_by, discussion_id, upvotes, github_repo_url, github_stars, project_page_url, num_comments, is_author_participating, repo_id, rank
    • supported_ops: eq, in, contains, icontains, gte, lte
    • normalized_only: true
  • limit_contract:
    • default_limit: 20
    • max_limit: 500
  • notes: Returns daily paper summary rows. repo_id is omitted unless the upstream payload provides it.

hf_datasets_search

  • category: wrapped_hf_repo_search
  • backed_by: HfApi.list_datasets
  • returns:
    • envelope: {ok, item, items, meta, error}
    • row_type: repo
    • default_fields: repo_id, repo_type, author, likes, downloads, trending_score, created_at, last_modified, pipeline_tag, num_params, repo_url, tags, library_name, description, paperswithcode_id, sdk, models, datasets, subdomain, runtime_stage, runtime
    • guaranteed_fields: repo_id, repo_type, author, repo_url
    • optional_fields: likes, downloads, trending_score, created_at, last_modified, pipeline_tag, num_params, tags, library_name, description, paperswithcode_id, sdk, models, datasets, subdomain, runtime_stage, runtime
  • supported_params: search, filter, author, benchmark, dataset_name, gated, language_creators, language, multilinguality, size_categories, task_categories, task_ids, sort, limit, expand, full, fields, post_filter
  • sort_values: created_at, downloads, last_modified, likes, trending_score
  • expand_values: author, card_data, citation, created_at, description, disabled, downloads, downloads_all_time, gated, last_modified, likes, paperswithcode_id, private, resource_group, sha, siblings, tags, trending_score, xet_enabled, gitaly_uid
  • fields_contract:
    • allowed_fields: repo_id, repo_type, author, likes, downloads, trending_score, created_at, last_modified, pipeline_tag, num_params, repo_url, tags, library_name, description, paperswithcode_id, sdk, models, datasets, subdomain, runtime_stage, runtime
    • canonical_only: true
  • post_filter_contract:
    • allowed_fields: repo_id, repo_type, author, likes, downloads, trending_score, created_at, last_modified, pipeline_tag, num_params, repo_url, tags, library_name, description, paperswithcode_id, sdk, models, datasets, subdomain, runtime_stage, runtime
    • supported_ops: eq, in, contains, icontains, gte, lte
    • normalized_only: true
  • limit_contract:
    • default_limit: 100
    • max_limit: 5000
  • notes: Thin dataset-search wrapper around the Hub list_datasets path. Prefer this over hf_repo_search for dataset-only queries. This is a one-shot selective search; if meta.limit_boundary_hit is true, more rows may exist and counts are not exact.

hf_models_search

  • category: wrapped_hf_repo_search
  • backed_by: HfApi.list_models
  • returns:
    • envelope: {ok, item, items, meta, error}
    • row_type: repo
    • default_fields: repo_id, repo_type, author, likes, downloads, trending_score, created_at, last_modified, pipeline_tag, num_params, repo_url, tags, library_name, description, paperswithcode_id, sdk, models, datasets, subdomain, runtime_stage, runtime
    • guaranteed_fields: repo_id, repo_type, author, repo_url
    • optional_fields: likes, downloads, trending_score, created_at, last_modified, pipeline_tag, num_params, tags, library_name, description, paperswithcode_id, sdk, models, datasets, subdomain, runtime_stage, runtime
  • supported_params: search, filter, author, apps, gated, inference, inference_provider, model_name, trained_dataset, pipeline_tag, num_params, emissions_thresholds, sort, limit, expand, full, card_data, fetch_config, fields, post_filter
  • sort_values: created_at, downloads, last_modified, likes, trending_score
  • expand_values: author, base_models, card_data, config, created_at, disabled, downloads, downloads_all_time, eval_results, gated, gguf, inference, inference_provider_mapping, last_modified, library_name, likes, mask_token, model_index, pipeline_tag, private, resource_group, safetensors, sha, siblings, spaces, tags, transformers_info, trending_score, widget_data, xet_enabled, gitaly_uid
  • fields_contract:
    • allowed_fields: repo_id, repo_type, author, likes, downloads, trending_score, created_at, last_modified, pipeline_tag, num_params, repo_url, tags, library_name, description, paperswithcode_id, sdk, models, datasets, subdomain, runtime_stage, runtime
    • canonical_only: true
  • post_filter_contract:
    • allowed_fields: repo_id, repo_type, author, likes, downloads, trending_score, created_at, last_modified, pipeline_tag, num_params, repo_url, tags, library_name, description, paperswithcode_id, sdk, models, datasets, subdomain, runtime_stage, runtime
    • supported_ops: eq, in, contains, icontains, gte, lte
    • normalized_only: true
  • limit_contract:
    • default_limit: 100
    • max_limit: 5000
  • notes: Thin model-search wrapper around the Hub list_models path. Prefer this over hf_repo_search for model-only queries. This is a one-shot selective search; if meta.limit_boundary_hit is true, more rows may exist and counts are not exact.

hf_org_members

  • category: graph_scan
  • returns:
    • envelope: {ok, item, items, meta, error}
    • row_type: actor
    • default_fields: username, fullname, is_pro, role, type
    • guaranteed_fields: username
    • optional_fields: fullname, is_pro, role, type
  • supported_params: organization, limit, scan_limit, count_only, where, fields
  • fields_contract:
    • allowed_fields: username, fullname, is_pro, role, type
    • canonical_only: true
  • where_contract:
    • allowed_fields: username, fullname, is_pro, role, type
    • supported_ops: eq, in, contains, icontains, gte, lte
    • normalized_only: true
  • limit_contract:
    • default_limit: 1000
    • max_limit: 10000
    • scan_max: 10000
  • notes: Returns organization member summary rows.

hf_profile_summary

  • category: profile_summary
  • returns:
    • envelope: {ok, item, items, meta, error}
    • row_type: profile
    • default_fields: handle, entity_type, display_name, bio, description, avatar_url, website_url, twitter_url, github_url, linkedin_url, bluesky_url, followers_count, following_count, likes_count, members_count, models_count, datasets_count, spaces_count, discussions_count, papers_count, upvotes_count, organizations, is_pro, likes_sample, activity_sample
    • guaranteed_fields: handle, entity_type
    • optional_fields: display_name, bio, description, avatar_url, website_url, twitter_url, github_url, linkedin_url, bluesky_url, followers_count, following_count, likes_count, members_count, models_count, datasets_count, spaces_count, discussions_count, papers_count, upvotes_count, organizations, is_pro, likes_sample, activity_sample
  • supported_params: handle, include, likes_limit, activity_limit
  • param_values:
    • include: likes, activity
  • notes: Profile summary helper. Aggregate counts like followers_count/following_count are in the base item. include=['likes', 'activity'] adds composed samples and extra upstream work; no other include values are supported. Overview-owned repo counts may differ slightly from visible public search/list results.

hf_recent_activity

  • category: activity_feed
  • returns:
    • envelope: {ok, item, items, meta, error}
    • row_type: activity
    • default_fields: event_type, repo_id, repo_type, timestamp
    • guaranteed_fields: event_type, timestamp
    • optional_fields: repo_id, repo_type
  • supported_params: feed_type, entity, activity_types, repo_types, limit, max_pages, start_cursor, count_only, where, fields
  • param_values:
    • feed_type: user, org
    • repo_types: model, dataset, space
  • fields_contract:
    • allowed_fields: event_type, repo_id, repo_type, timestamp
    • canonical_only: true
  • where_contract:
    • allowed_fields: event_type, repo_id, repo_type, timestamp
    • supported_ops: eq, in, contains, icontains, gte, lte
    • normalized_only: true
  • limit_contract:
    • default_limit: 100
    • max_limit: 2000
    • max_pages: 10
    • page_limit: 100
  • notes: Activity helper may fetch multiple pages when requested coverage exceeds one page. count_only may still be a lower bound unless the feed exhausts before max_pages.
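The paging figures in this limit contract imply some simple arithmetic. The sketch below assumes that each page holds up to page_limit rows and that no more than max_pages pages are fetched. How the helper actually pages is not specified here, so treat `activity_pages` (a hypothetical name) as back-of-the-envelope reasoning only.

```python
def activity_pages(limit, page_limit=100, max_pages=10):
    """Estimate pages fetched and rows reachable for a requested limit."""
    # Ceiling division without imports.
    pages = -(-limit // page_limit)
    # At least one page is fetched, and never more than max_pages.
    pages = min(max(pages, 1), max_pages)
    reachable = min(limit, pages * page_limit)
    return {
        "pages": pages,
        "reachable_rows": reachable,
        # Requests beyond the page window cannot be fully covered under these assumptions.
        "may_be_lower_bound": limit > max_pages * page_limit,
    }


estimate = activity_pages(250)
```

Under these assumptions a request for 2000 rows would reach only 1000, which is one case where a count should be reported as a lower bound with coverage.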

hf_repo_details

  • category: repo_detail
  • returns:
    • envelope: {ok, item, items, meta, error}
    • row_type: repo
    • default_fields: repo_id, repo_type, author, likes, downloads, trending_score, created_at, last_modified, pipeline_tag, num_params, repo_url, tags, library_name, description, paperswithcode_id, sdk, models, datasets, subdomain, runtime_stage, runtime
    • guaranteed_fields: repo_id, repo_type, author, repo_url
    • optional_fields: likes, downloads, trending_score, created_at, last_modified, pipeline_tag, num_params, tags, library_name, description, paperswithcode_id, sdk, models, datasets, subdomain, runtime_stage, runtime
  • supported_params: repo_id, repo_ids, repo_type, fields
  • param_values:
    • repo_type: model, dataset, space, auto
  • fields_contract:
    • allowed_fields: repo_id, repo_type, author, likes, downloads, trending_score, created_at, last_modified, pipeline_tag, num_params, repo_url, tags, library_name, description, paperswithcode_id, sdk, models, datasets, subdomain, runtime_stage, runtime
    • canonical_only: true
  • notes: Exact repo metadata path. Multiple repo_ids may trigger one detail call per requested repo.
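
Because multiple repo_ids may cost one detail call each, it can help to trim the list to the remaining call budget before calling the helper. The sketch below assumes one external call per repo ID; `calls_used`, `candidate_ids`, and the example repo IDs are hypothetical stand-ins, and `max_calls` is set by hand here only because the runtime normally provides it.

```python
# Stand-ins: the runtime provides max_calls; calls_used is tracked by the script.
max_calls = 3
calls_used = 1  # e.g. one earlier search call

# Hypothetical repo IDs gathered from an earlier step.
candidate_ids = [
    "example-org/model-a",
    "example-org/model-b",
    "example-org/model-c",
    "example-org/model-d",
]

# Assumption: each requested repo costs one detail call.
budget = max(max_calls - calls_used, 0)
repo_ids = candidate_ids[:budget]
deferred = candidate_ids[budget:]

# In the runtime the next step would be something like:
#   resp = await hf_repo_details(repo_ids=repo_ids, repo_type="auto")
result = {"requested": repo_ids, "deferred_for_budget": deferred}
result
```

Keeping the deferred IDs in `result` makes the partial coverage visible instead of silently dropping repos.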

hf_repo_discussion_details

  • category: discussion_detail
  • returns:
    • envelope: {ok, item, items, meta, error}
    • row_type: discussion_detail
    • default_fields: num, repo_id, repo_type, title, author, created_at, status, url, comment_count, latest_comment_author, latest_comment_created_at, latest_comment_text, latest_comment_html
    • guaranteed_fields: repo_id, repo_type, title, author, status
    • optional_fields: num, created_at, url, comment_count, latest_comment_author, latest_comment_created_at, latest_comment_text, latest_comment_html
  • supported_params: repo_type, repo_id, discussion_num, fields
  • param_values:
    • repo_type: model, dataset, space
  • fields_contract:
    • allowed_fields: num, repo_id, repo_type, title, author, created_at, status, url, comment_count, latest_comment_author, latest_comment_created_at, latest_comment_text, latest_comment_html
    • canonical_only: true
  • notes: Exact discussion detail helper.

hf_repo_discussions

  • category: discussion_summary
  • returns:
    • envelope: {ok, item, items, meta, error}
    • row_type: discussion
    • default_fields: num, repo_id, repo_type, title, author, created_at, status, url
    • guaranteed_fields: num, title, author, status
    • optional_fields: repo_id, repo_type, created_at, url
  • supported_params: repo_type, repo_id, limit, fields
  • param_values:
    • repo_type: model, dataset, space
  • fields_contract:
    • allowed_fields: num, repo_id, repo_type, title, author, created_at, status, url
    • canonical_only: true
  • limit_contract:
    • default_limit: 20
    • max_limit: 200
  • notes: Discussion summary helper.

hf_repo_likers

  • category: repo_to_users
  • returns:
    • envelope: {ok, item, items, meta, error}
    • row_type: actor
    • default_fields: username, fullname, is_pro, role, type
    • guaranteed_fields: username
    • optional_fields: fullname, is_pro, role, type
  • supported_params: repo_id, repo_type, limit, count_only, pro_only, where, fields
  • param_values:
    • repo_type: model, dataset, space
  • fields_contract:
    • allowed_fields: username, fullname, is_pro, role, type
    • canonical_only: true
  • where_contract:
    • allowed_fields: username, fullname, is_pro, role, type
    • supported_ops: eq, in, contains, icontains, gte, lte
    • normalized_only: true
  • limit_contract:
    • default_limit: 1000
  • notes: Returns users who liked a repo.

hf_repo_search

  • category: cross_type_repo_search
  • returns:
    • envelope: {ok, item, items, meta, error}
    • row_type: repo
    • default_fields: repo_id, repo_type, author, likes, downloads, trending_score, created_at, last_modified, pipeline_tag, num_params, repo_url, tags, library_name, description, paperswithcode_id, sdk, models, datasets, subdomain, runtime_stage, runtime
    • guaranteed_fields: repo_id, repo_type, author, repo_url
    • optional_fields: likes, downloads, trending_score, created_at, last_modified, pipeline_tag, num_params, tags, library_name, description, paperswithcode_id, sdk, models, datasets, subdomain, runtime_stage, runtime
  • supported_params: search, repo_type, repo_types, filter, author, sort, limit, fields, post_filter
  • sort_values_by_repo_type:
    • dataset: created_at, downloads, last_modified, likes, trending_score
    • model: created_at, downloads, last_modified, likes, trending_score
    • space: created_at, last_modified, likes, trending_score
  • param_values:
    • repo_type: model, dataset, space
    • repo_types: model, dataset, space
    • sort: created_at, downloads, last_modified, likes, trending_score
  • fields_contract:
    • allowed_fields: repo_id, repo_type, author, likes, downloads, trending_score, created_at, last_modified, pipeline_tag, num_params, repo_url, tags, library_name, description, paperswithcode_id, sdk, models, datasets, subdomain, runtime_stage, runtime
    • canonical_only: true
  • post_filter_contract:
    • allowed_fields: repo_id, repo_type, author, likes, downloads, trending_score, created_at, last_modified, pipeline_tag, num_params, repo_url, tags, library_name, description, paperswithcode_id, sdk, models, datasets, subdomain, runtime_stage, runtime
    • supported_ops: eq, in, contains, icontains, gte, lte
    • normalized_only: true
  • limit_contract:
    • default_limit: 100
    • max_limit: 5000
  • notes: Small generic repo-search helper. Prefer hf_models_search, hf_datasets_search, or hf_spaces_search for single-type queries; use hf_repo_search for intentionally cross-type search. This is a one-shot selective search; if meta.limit_boundary_hit is true, more rows may exist and counts are not exact.
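
One way to act on meta.limit_boundary_hit is to switch the shape of result when the limit was reached. This is a sketch only: the `resp` dict below is made-up sample data standing in for an awaited helper response, and it assumes limit_boundary_hit is a boolean that may be absent from meta.

```python
# Stand-in for a real call such as:
#   resp = await hf_repo_search(search="whisper", repo_types=["model", "space"], limit=2)
# The envelope keys follow the documented shape; the rows are illustrative.
resp = {
    "ok": True,
    "item": None,
    "items": [
        {"repo_id": "example-org/whisper-demo", "repo_type": "space"},
        {"repo_id": "example-org/whisper-small", "repo_type": "model"},
    ],
    "meta": {"limit_boundary_hit": True},
    "error": None,
}

meta = resp["meta"]
if meta.get("limit_boundary_hit"):
    # More rows may exist, so keep the coverage information next to the rows.
    result = {"results": resp["items"], "coverage": meta}
else:
    # The search returned everything it matched; bare rows are enough.
    result = resp["items"]
result
```

The same check applies to hf_spaces_search, whose notes describe the same one-shot behaviour.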

hf_runtime_capabilities

  • category: introspection
  • returns:
    • envelope: {ok, item, items, meta, error}
    • row_type: runtime_capability
    • default_fields: allowed_sections, overview, helpers, helper_contracts, helper_defaults, fields, limits, repo_search
    • guaranteed_fields: allowed_sections, overview, helpers, helper_contracts, helper_defaults, fields, limits, repo_search
    • optional_fields: []
  • supported_params: section
  • param_values:
    • section: overview, helpers, helper_contracts, helper_defaults, fields, limits, repo_search
  • notes: Introspection helper. Use section=... to narrow the response.

hf_spaces_search

  • category: wrapped_hf_repo_search
  • backed_by: HfApi.list_spaces
  • returns:
    • envelope: {ok, item, items, meta, error}
    • row_type: repo
    • default_fields: repo_id, repo_type, author, likes, downloads, trending_score, created_at, last_modified, pipeline_tag, num_params, repo_url, tags, library_name, description, paperswithcode_id, sdk, models, datasets, subdomain, runtime_stage, runtime
    • guaranteed_fields: repo_id, repo_type, author, repo_url
    • optional_fields: likes, downloads, trending_score, created_at, last_modified, pipeline_tag, num_params, tags, library_name, description, paperswithcode_id, sdk, models, datasets, subdomain, runtime_stage, runtime
  • supported_params: search, filter, author, datasets, models, linked, sort, limit, expand, full, fields, post_filter
  • sort_values: created_at, last_modified, likes, trending_score
  • expand_values: author, card_data, created_at, datasets, disabled, last_modified, likes, models, private, resource_group, runtime, sdk, sha, siblings, subdomain, tags, trending_score, xet_enabled, gitaly_uid
  • fields_contract:
    • allowed_fields: repo_id, repo_type, author, likes, downloads, trending_score, created_at, last_modified, pipeline_tag, num_params, repo_url, tags, library_name, description, paperswithcode_id, sdk, models, datasets, subdomain, runtime_stage, runtime
    • canonical_only: true
  • post_filter_contract:
    • allowed_fields: repo_id, repo_type, author, likes, downloads, trending_score, created_at, last_modified, pipeline_tag, num_params, repo_url, tags, library_name, description, paperswithcode_id, sdk, models, datasets, subdomain, runtime_stage, runtime
    • supported_ops: eq, in, contains, icontains, gte, lte
    • normalized_only: true
  • limit_contract:
    • default_limit: 100
    • max_limit: 5000
  • notes: Thin space-search wrapper around the Hub list_spaces path. Prefer this over hf_repo_search for space-only queries. This is a one-shot selective search; if meta.limit_boundary_hit is true, more rows may exist and counts are not exact.

hf_trending

  • category: curated_repo_feed
  • returns:
    • envelope: {ok, item, items, meta, error}
    • row_type: repo
    • default_fields: repo_id, repo_type, author, likes, downloads, trending_score, created_at, last_modified, pipeline_tag, num_params, repo_url, tags, library_name, description, paperswithcode_id, sdk, models, datasets, subdomain, runtime_stage, runtime, trending_rank
    • guaranteed_fields: repo_id, repo_type, author, repo_url, trending_rank
    • optional_fields: likes, downloads, trending_score, created_at, last_modified, pipeline_tag, num_params, tags, library_name, description, paperswithcode_id, sdk, models, datasets, subdomain, runtime_stage, runtime
  • supported_params: repo_type, limit, where, fields
  • param_values:
    • repo_type: model, dataset, space, all
  • fields_contract:
    • allowed_fields: repo_id, repo_type, author, likes, downloads, trending_score, created_at, last_modified, pipeline_tag, num_params, repo_url, tags, library_name, description, paperswithcode_id, sdk, models, datasets, subdomain, runtime_stage, runtime, trending_rank
    • canonical_only: true
  • where_contract:
    • allowed_fields: repo_id, repo_type, author, likes, downloads, trending_score, created_at, last_modified, pipeline_tag, num_params, repo_url, tags, library_name, description, paperswithcode_id, sdk, models, datasets, subdomain, runtime_stage, runtime, trending_rank
    • supported_ops: eq, in, contains, icontains, gte, lte
    • normalized_only: true
  • limit_contract:
    • default_limit: 20
    • max_limit: 20
  • notes: Returns ordered trending summary rows only. Use hf_repo_details for exact repo metadata.

hf_user_graph

  • category: graph_scan
  • returns:
    • envelope: {ok, item, items, meta, error}
    • row_type: actor
    • default_fields: username, fullname, is_pro, role, type
    • guaranteed_fields: username
    • optional_fields: fullname, is_pro, role, type
  • supported_params: username, relation, limit, scan_limit, count_only, pro_only, where, fields
  • param_values:
    • relation: followers, following
  • fields_contract:
    • allowed_fields: username, fullname, is_pro, role, type
    • canonical_only: true
  • where_contract:
    • allowed_fields: username, fullname, is_pro, role, type
    • supported_ops: eq, in, contains, icontains, gte, lte
    • normalized_only: true
  • limit_contract:
    • default_limit: 1000
    • max_limit: 10000
    • scan_max: 10000
  • notes: Returns followers/following summary rows.

hf_user_likes

  • category: user_to_repos
  • returns:
    • envelope: {ok, item, items, meta, error}
    • row_type: user_like
    • default_fields: liked_at, repo_id, repo_type, repo_author, repo_likes, repo_downloads, repo_url
    • guaranteed_fields: liked_at, repo_id, repo_type
    • optional_fields: repo_author, repo_likes, repo_downloads, repo_url
  • supported_params: username, repo_types, limit, scan_limit, count_only, where, fields, sort, ranking_window
  • sort_values: liked_at, repo_likes, repo_downloads
  • param_values:
    • repo_types: model, dataset, space
    • sort: liked_at, repo_likes, repo_downloads
  • fields_contract:
    • allowed_fields: liked_at, repo_id, repo_type, repo_author, repo_likes, repo_downloads, repo_url
    • canonical_only: true
  • where_contract:
    • allowed_fields: liked_at, repo_id, repo_type, repo_author, repo_likes, repo_downloads, repo_url
    • supported_ops: eq, in, contains, icontains, gte, lte
    • normalized_only: true
  • limit_contract:
    • default_limit: 100
    • max_limit: 2000
    • enrich_max: 50
    • ranking_default: 50
    • scan_max: 10000
  • notes: The default recency mode is cheap. Popularity-ranked sorts use the canonical keys liked_at/repo_likes/repo_downloads and rerank only a bounded shortlist of recent likes. When ranking by popularity, check meta.ranking_complete / meta.ranking_window; helper-owned coverage matters here.
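
A popularity-ranked request might surface its ranking coverage as sketched below. The `resp` dict is illustrative sample data in place of an awaited call, and the sketch assumes meta.ranking_complete is a boolean and meta.ranking_window is an integer; neither type is confirmed by the contract above.

```python
# Stand-in for a real call such as:
#   resp = await hf_user_likes(username="some-user", sort="repo_likes", limit=2)
resp = {
    "ok": True,
    "item": None,
    "items": [
        {"liked_at": "2024-01-02T00:00:00Z", "repo_id": "example-org/big-model", "repo_type": "model"},
        {"liked_at": "2024-01-01T00:00:00Z", "repo_id": "example-org/small-set", "repo_type": "dataset"},
    ],
    "meta": {"ranking_complete": False, "ranking_window": 50},
    "error": None,
}

meta = resp["meta"]
if meta.get("ranking_complete"):
    # Every like was considered, so the ranking can be returned as-is.
    result = resp["items"]
else:
    # Only a bounded shortlist was reranked; say so next to the rows.
    result = {
        "results": resp["items"],
        "coverage": {
            "ranking_complete": False,
            "ranking_window": meta.get("ranking_window"),
        },
    }
result
```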

hf_whoami

  • category: identity
  • returns:
    • envelope: {ok, item, items, meta, error}
    • row_type: user
    • default_fields: username, fullname, is_pro
    • guaranteed_fields: username
    • optional_fields: fullname, is_pro
  • supported_params: []
  • notes: Returns the current authenticated user when a request token is available.