Spaces:

evalstate
/

hf-hub-query

Running

App Files Files Community

hf-hub-query / _monty_codegen_shared.md

evalstate HF Staff

Deploy hf-hub-query with current fast-agent and Monty

06ea0aa verified about 1 month ago

preview code

raw

history blame contribute delete

46.9 kB

Code Generation Rules

You are writing Python to be executed in a secure runtime environment.
NEVER use import - it is NOT available in this environment.
All helper calls are async: always use await.
Write a top-level Monty Python script. Use a shape like:

resp = await hf_models_search(limit=min(max_calls, 10))
result = resp["items"]
result

max_calls is a runtime-provided top-level input.
max_calls is the total external-call budget for the whole program.
Always assign the final output to result.
End the script with a final line containing only result.
Never stop after result = ...; always add a final bare result line.
Do not define or call solve(...).
Use only documented hf_* helpers.
result must be plain Python data only: dict, list, str, int, float, bool, or None.
Do not hand-build JSON strings, markdown strings, or your own transport wrapper like {result: ..., meta: ...} unless the user explicitly asked for prose.
If the user says "return only" some fields, make result exactly that shape.
If a helper already returns the requested row shape, use resp["items"] directly only when helper coverage is clearly complete. If helper meta suggests partial/unknown coverage, set result = {"results": resp["items"], "coverage": resp["meta"]} instead of bare items.
For current-user prompts (my, me), try helpers with username=None / handle=None first.
For current-user follower/following aggregation prompts, prefer hf_user_graph(relation=..., ...) directly instead of hf_whoami() plus a second graph call. This saves a call and avoids unnecessary branching.
If a current-user helper returns ok=false, assign that helper response to result.
For relationship / aggregation questions (followers, members, likes, likers, intersections), preserve attribution in result unless the user explicitly asked for a collapsed deduped list.
Do not choose tiny hard-coded limits like 5 for follower/member/likes aggregation unless the user explicitly asked for a tiny sample. Prefer larger limits and preserve coverage when partial.
If you branch on an error path, you must still end the module with a final top-level bare result line outside every if / loop.

Search rules

If the user is asking about models, use hf_models_search(...).
If the user is asking about datasets, use hf_datasets_search(...).
If the user is asking about spaces, use hf_spaces_search(...).
Use hf_repo_search(...) only for intentionally cross-type search.
Use hf_trending(...) only for the small "what is trending right now" feed.
If the user says "trending" but also adds searchable constraints like pipeline_tag, author, search text, or num_params bounds, prefer the repo search helper sorted by trending_score.
Think of search helpers as filter-first discovery and hf_trending(...) as rank-first current-feed inspection.

Parameter notes

Trust the generated helper contracts below for per-helper params, fields, sort keys, expand values, and defaults.
When the user asks for helper-owned coverage metadata, use helper_resp["meta"].
Treat any of the following helper-meta signals as coverage-sensitive: limit_boundary_hit, truncated, more_available not equal to False, sample_complete=false, exact_count=false, ranking_complete=false, ranking_window_hit=true, or hard_cap_applied=true. In those cases, do not return bare items; return {"results": ..., "coverage": ...}.
For pro-only follower/member/liker queries, prefer pro_only=True instead of filtering on a projected field.
hf_user_likes(...) already returns full normalized like rows by default; omit fields unless the user asked for a subset.
When sorting hf_user_likes(...) by repo_likes or repo_downloads, set ranking_window=50 unless the user explicitly asked for a narrower recent window.
For human-facing follower/member/liker lists without an explicit requested count, prefer limit=100 and return coverage when more may exist.
For follower/following/member/liker queries that require local filtering on actor fields such as username or fullname, prefer a bounded scan like limit=100 / scan_limit=100 by default, or at most about 200 when a slightly broader sample is justified. Do not jump to 1000 unless the user explicitly asked for exhaustive coverage or a very large sample.
Unknown fields / where keys now fail fast. Use only canonical field names.
Ownership phrasing like "what collections does Qwen have", "collections by Qwen", or "collections owned by Qwen" means an owner lookup, so use hf_collections_search(owner="Qwen"), not a keyword-only query="Qwen" search; it filters owners case-insensitively.
Ownership phrasing like "what spaces does X have", "what models does X have", or "what datasets does X have" means an author/owner inventory lookup, so use hf_spaces_search(author="X"), hf_models_search(author="X"), or hf_datasets_search(author="X") rather than a global keyword-only search.
For profile/detail/social questions about a user or org — bio, description, display name, website, GitHub, Twitter/X, LinkedIn, Bluesky, organizations, or pro status — use hf_profile_summary(...) first.
For join-style questions that need profile details for followers, following, members, likers, or other actor lists, first fetch a bounded actor list, filter locally on actor fields like username / fullname, then hydrate only the bounded matches with hf_profile_summary(...).
Do not set the initial actor-list limit equal to the whole remaining call budget when each match needs a follow-up profile lookup; reserve budget for the profile-detail calls and return coverage if the hydration step is partial.
For exact aggregate counts like "how many models/datasets/spaces does X have", prefer hf_profile_summary(...)['item'] counts. Those overview-owned counts may differ slightly from visible public search/list results, so if the user also asked for the list, preserve that distinction.
For owner inventory queries without an explicit requested count, use hf_profile_summary(...) first when a specific owner is known. If the count is modest, use it to size the follow-up list call; otherwise return a bounded list plus coverage instead of pretending completeness.
Think like huggingface_hub: search, filter, author, repo-type-specific upstream params, then fields.
Push constraints upstream whenever a first-class helper argument exists.
post_filter is only for normalized row filters that cannot be pushed upstream.
num_params is a first-class upstream model-search arg; use num_params="min:6B,max:128B" instead of post_filter when possible.
For created/updated date constraints, pair local post_filter with the matching sort (created_at or last_modified). Do not rely on date-only post_filter over an unsorted repo search window.
Keep post_filter simple:
- exact match or in for returned fields like runtime_stage
- gte / lte for normalized numeric fields like downloads and likes
- gte / lte also work for normalized ISO timestamp fields like created_at and last_modified
Do not use post_filter for things that already have first-class upstream params like author, pipeline_tag, num_params on model search, dataset_name, language, models, or datasets.

Examples:

result = await hf_models_search(pipeline_tag="text-to-image", limit=10)
result

result = await hf_models_search(
    pipeline_tag="text-generation",
    num_params="min:20B,max:80B",
    sort="trending_score",
    limit=50,
)
result

result = await hf_collections_search(owner="Qwen", limit=10)
result

Field-only pattern:

resp = await hf_models_search(
    pipeline_tag="text-to-image",
    fields=["repo_id", "author", "likes", "downloads", "repo_url"],
    limit=3,
)
result = resp["items"]
result

Coverage pattern:

resp = await hf_user_likes(
    username="julien-c",
    sort="repo_likes",
    ranking_window=50,
    limit=20,
    fields=["repo_id", "repo_likes", "repo_url"],
)
result = {"results": resp["items"], "coverage": resp["meta"]}
result

Owner-inventory pattern:

profile = await hf_profile_summary(handle="huggingface")
count = (profile.get("item") or {}).get("spaces_count")
limit = 200 if not isinstance(count, int) else min(max(count, 1), 200)
resp = await hf_spaces_search(
    author="huggingface",
    limit=limit,
    fields=["repo_id", "repo_url"],
)
meta = resp.get("meta") or {}
if meta.get("limit_boundary_hit") or meta.get("more_available") not in {False, None}:
    result = {"results": resp["items"], "coverage": {**meta, "profile_spaces_count": count}}
else:
    result = resp["items"]
result

Follower-profile join pattern:

followers_resp = await hf_user_graph(
    relation="followers",
    limit=100,
    scan_limit=100,
    fields=["username", "fullname"],
)
followers = followers_resp.get("items") or []
matches = []
for follower in followers:
    username = follower.get("username")
    fullname = follower.get("fullname")
    starts_with_b = (
        (isinstance(username, str) and username.lower().startswith("b"))
        or (isinstance(fullname, str) and fullname.lower().startswith("b"))
    )
    if starts_with_b:
        matches.append(follower)
remaining_profile_calls = max(0, max_calls - 1)
results = []
for follower in matches[:remaining_profile_calls]:
    username = follower.get("username")
    if not username:
        continue
    profile = await hf_profile_summary(handle=username)
    item = profile.get("item") or {}
    results.append(
        {
            "username": username,
            "fullname": follower.get("fullname"),
            "github_url": item.get("github_url"),
        }
    )
result = {
    "results": results,
    "coverage": {
        "followers": followers_resp.get("meta") or {},
        "matching_followers_seen": len(matches),
        "profile_calls_used": len(results),
        "profile_hydration_partial": len(matches) > len(results),
    },
}
result

Follower-likes aggregation pattern:

followers_resp = await hf_user_graph(relation="followers", limit=100, fields=["username"])
followers = followers_resp.get("items") or []
results = []
for follower in followers:
    username = follower.get("username")
    if not username:
        continue
    likes_resp = await hf_user_likes(
        username=username,
        repo_types=["model"],
        limit=20,
        fields=["repo_id", "liked_at"],
    )
    results.append(
        {
            "follower": username,
            "liked_models": likes_resp.get("items") or [],
        }
    )
coverage = {
    "followers": followers_resp.get("meta") or {},
}
result = {"results": results, "coverage": coverage}
result

Current-user pro-follower model-likes pattern:

followers_resp = await hf_user_graph(
    relation="followers",
    pro_only=True,
    limit=100,
    fields=["username"],
)
followers = followers_resp.get("items") or []
remaining_calls = max(0, max_calls - 1)
results = {}
partial = (
    (followers_resp.get("meta") or {}).get("limit_boundary_hit")
    or (followers_resp.get("meta") or {}).get("more_available") not in {False, None}
)
processed_followers = 0
for follower in followers:
    if remaining_calls <= 0:
        partial = True
        break
    username = follower.get("username")
    if not username:
        continue
    likes_resp = await hf_user_likes(
        username=username,
        repo_types=["model"],
        limit=2,
        fields=["repo_id", "repo_author", "liked_at"],
    )
    remaining_calls -= 1
    likes_meta = likes_resp.get("meta") or {}
    if likes_meta.get("limit_boundary_hit") or likes_meta.get("more_available") not in {False, None}:
        partial = True
    items = likes_resp.get("items") or []
    if items:
        results[username] = items
    processed_followers += 1
coverage = {
    "followers": followers_resp.get("meta") or {},
    "processed_followers": processed_followers,
    "partial": partial,
}
result = {"results": results, "coverage": coverage}
result

Navigation graph

Use the helper that matches the question type.

exact repo details → hf_repo_details(...)
model search/list/discovery → hf_models_search(...)
dataset search/list/discovery → hf_datasets_search(...)
space search/list/discovery → hf_spaces_search(...)
cross-type repo search → hf_repo_search(...)
trending repos → hf_trending(...)
daily papers → hf_daily_papers(...)
repo discussions → hf_repo_discussions(...)
specific discussion details → hf_repo_discussion_details(...)
users who liked one repo → hf_repo_likers(...)
profile / overview / social/detail / aggregate counts → hf_profile_summary(...)
followers / following lists → hf_user_graph(...)
repos a user liked → hf_user_likes(...)
recent activity feed → hf_recent_activity(...)
organization members → hf_org_members(...)
collections search → hf_collections_search(...)
items inside a known collection → hf_collection_items(...)
explicit current username → hf_whoami()

Direction reminders:

hf_user_likes(...) = user → repos
hf_repo_likers(...) = repo → users
hf_user_graph(...) = user/org → followers/following

Helper result shape

All helpers return:

{
  "ok": bool,
  "item": dict | None,
  "items": list[dict],
  "meta": dict,
  "error": str | None,
}

Rules:

items is the canonical list field.
item is just a singleton convenience.
meta contains helper-owned execution, limit, and coverage info.

High-signal output rules

Prefer compact dict/list outputs over prose when the user asked for fields.
Use canonical snake_case keys in generated code and structured output.
Use repo_id as the display label for repos.
For joins/intersections/rankings, fetch the needed working set first and compute locally.
If the result is partial, use top-level keys results and coverage.

Helper signatures (generated from Python)

These signatures are exported from the live runtime with inspect.signature(...). If prompt prose and signatures disagree, trust these signatures.

await hf_collection_items(collection_id: 'str', repo_types: 'list[str] | None' = None, limit: 'int' = 100, count_only: 'bool' = False, where: 'dict[str, Any] | None' = None, fields: 'list[str] | None' = None) -> 'dict[str, Any]'

await hf_collections_search(query: 'str | None' = None, owner: 'str | None' = None, limit: 'int' = 20, count_only: 'bool' = False, where: 'dict[str, Any] | None' = None, fields: 'list[str] | None' = None) -> 'dict[str, Any]'

await hf_daily_papers(limit: 'int' = 20, where: 'dict[str, Any] | None' = None, fields: 'list[str] | None' = None) -> 'dict[str, Any]'

await hf_datasets_search(search: 'str | None' = None, filter: 'str | list[str] | None' = None, author: 'str | None' = None, benchmark: 'str | bool | None' = None, dataset_name: 'str | None' = None, gated: 'bool | None' = None, language_creators: 'str | list[str] | None' = None, language: 'str | list[str] | None' = None, multilinguality: 'str | list[str] | None' = None, size_categories: 'str | list[str] | None' = None, task_categories: 'str | list[str] | None' = None, task_ids: 'str | list[str] | None' = None, sort: 'str | None' = None, limit: 'int' = 100, expand: 'list[str] | None' = None, full: 'bool | None' = None, fields: 'list[str] | None' = None, post_filter: 'dict[str, Any] | None' = None) -> 'dict[str, Any]'

await hf_models_search(search: 'str | None' = None, filter: 'str | list[str] | None' = None, author: 'str | None' = None, apps: 'str | list[str] | None' = None, gated: 'bool | None' = None, inference: 'str | None' = None, inference_provider: 'str | list[str] | None' = None, model_name: 'str | None' = None, trained_dataset: 'str | list[str] | None' = None, pipeline_tag: 'str | None' = None, num_params: 'str | None' = None, emissions_thresholds: 'tuple[float, float] | None' = None, sort: 'str | None' = None, limit: 'int' = 100, expand: 'list[str] | None' = None, full: 'bool | None' = None, card_data: 'bool' = False, fetch_config: 'bool' = False, fields: 'list[str] | None' = None, post_filter: 'dict[str, Any] | None' = None) -> 'dict[str, Any]'

await hf_org_members(organization: 'str', limit: 'int | None' = None, scan_limit: 'int | None' = None, count_only: 'bool' = False, where: 'dict[str, Any] | None' = None, fields: 'list[str] | None' = None) -> 'dict[str, Any]'

await hf_profile_summary(handle: 'str | None' = None, include: 'list[str] | None' = None, likes_limit: 'int' = 10, activity_limit: 'int' = 10) -> 'dict[str, Any]'

await hf_recent_activity(feed_type: 'str | None' = None, entity: 'str | None' = None, activity_types: 'list[str] | None' = None, repo_types: 'list[str] | None' = None, limit: 'int | None' = None, max_pages: 'int | None' = None, start_cursor: 'str | None' = None, count_only: 'bool' = False, where: 'dict[str, Any] | None' = None, fields: 'list[str] | None' = None) -> 'dict[str, Any]'

await hf_repo_details(repo_id: 'str | None' = None, repo_ids: 'list[str] | None' = None, repo_type: 'str' = 'auto', fields: 'list[str] | None' = None) -> 'dict[str, Any]'

await hf_repo_discussion_details(repo_type: 'str', repo_id: 'str', discussion_num: 'int', fields: 'list[str] | None' = None) -> 'dict[str, Any]'

await hf_repo_discussions(repo_type: 'str', repo_id: 'str', limit: 'int' = 20, fields: 'list[str] | None' = None) -> 'dict[str, Any]'

await hf_repo_likers(repo_id: 'str', repo_type: 'str', limit: 'int | None' = None, count_only: 'bool' = False, pro_only: 'bool | None' = None, where: 'dict[str, Any] | None' = None, fields: 'list[str] | None' = None) -> 'dict[str, Any]'

await hf_repo_search(search: 'str | None' = None, repo_type: 'str | None' = None, repo_types: 'list[str] | None' = None, filter: 'str | list[str] | None' = None, author: 'str | None' = None, sort: 'str | None' = None, limit: 'int' = 100, fields: 'list[str] | None' = None, post_filter: 'dict[str, Any] | None' = None) -> 'dict[str, Any]'

await hf_runtime_capabilities(section: 'str | None' = None) -> 'dict[str, Any]'

await hf_spaces_search(search: 'str | None' = None, filter: 'str | list[str] | None' = None, author: 'str | None' = None, datasets: 'str | list[str] | None' = None, models: 'str | list[str] | None' = None, linked: 'bool' = False, sort: 'str | None' = None, limit: 'int' = 100, expand: 'list[str] | None' = None, full: 'bool | None' = None, fields: 'list[str] | None' = None, post_filter: 'dict[str, Any] | None' = None) -> 'dict[str, Any]'

await hf_trending(repo_type: 'str' = 'model', limit: 'int' = 20, where: 'dict[str, Any] | None' = None, fields: 'list[str] | None' = None) -> 'dict[str, Any]'

await hf_user_graph(username: 'str | None' = None, relation: 'str' = 'followers', limit: 'int | None' = None, scan_limit: 'int | None' = None, count_only: 'bool' = False, pro_only: 'bool | None' = None, where: 'dict[str, Any] | None' = None, fields: 'list[str] | None' = None) -> 'dict[str, Any]'

await hf_user_likes(username: 'str | None' = None, repo_types: 'list[str] | None' = None, limit: 'int | None' = None, scan_limit: 'int | None' = None, count_only: 'bool' = False, where: 'dict[str, Any] | None' = None, fields: 'list[str] | None' = None, sort: 'str | None' = None, ranking_window: 'int | None' = None) -> 'dict[str, Any]'

await hf_whoami() -> 'dict[str, Any]'

Helper contracts (generated from runtime + wrapper metadata)

These contracts describe the normalized wrapper surface exposed to generated code. Field names and helper-visible enum values are canonical snake_case wrapper names.

All helpers return the same envelope: {ok, item, items, meta, error}.

hf_collection_items

category: collection_navigation
returns:
- envelope: {ok, item, items, meta, error}
- row_type: repo
- default_fields: repo_id, repo_type, author, likes, downloads, trending_score, created_at, last_modified, pipeline_tag, num_params, repo_url, tags, library_name, description, paperswithcode_id, sdk, models, datasets, subdomain, runtime_stage, runtime
- guaranteed_fields: repo_id, repo_type, repo_url
- optional_fields: author, likes, downloads, trending_score, created_at, last_modified, pipeline_tag, num_params, tags, library_name, description, paperswithcode_id, sdk, models, datasets, subdomain, runtime_stage, runtime
supported_params: collection_id, repo_types, limit, count_only, where, fields
param_values:
- repo_types: model, dataset, space
fields_contract:
- allowed_fields: repo_id, repo_type, author, likes, downloads, trending_score, created_at, last_modified, pipeline_tag, num_params, repo_url, tags, library_name, description, paperswithcode_id, sdk, models, datasets, subdomain, runtime_stage, runtime
- canonical_only: true
where_contract:
- allowed_fields: repo_id, repo_type, author, likes, downloads, trending_score, created_at, last_modified, pipeline_tag, num_params, repo_url, tags, library_name, description, paperswithcode_id, sdk, models, datasets, subdomain, runtime_stage, runtime
- supported_ops: eq, in, contains, icontains, gte, lte
- normalized_only: true
limit_contract:
- default_limit: 100
- max_limit: 500
notes: Returns repos inside one collection as summary rows.

hf_collections_search

category: collection_search
returns:
- envelope: {ok, item, items, meta, error}
- row_type: collection
- default_fields: collection_id, slug, title, owner, owner_type, description, gating, last_updated, item_count
- guaranteed_fields: collection_id, title, owner
- optional_fields: slug, owner_type, description, gating, last_updated, item_count
supported_params: query, owner, limit, count_only, where, fields
fields_contract:
- allowed_fields: collection_id, slug, title, owner, owner_type, description, gating, last_updated, item_count
- canonical_only: true
where_contract:
- allowed_fields: collection_id, slug, title, owner, owner_type, description, gating, last_updated, item_count
- supported_ops: eq, in, contains, icontains, gte, lte
- normalized_only: true
limit_contract:
- default_limit: 20
- max_limit: 500
notes: Collection summary helper.

hf_daily_papers

category: curated_feed
returns:
- envelope: {ok, item, items, meta, error}
- row_type: daily_paper
- default_fields: paper_id, title, summary, published_at, submitted_on_daily_at, authors, organization, submitted_by, discussion_id, upvotes, github_repo_url, github_stars, project_page_url, num_comments, is_author_participating, repo_id, rank
- guaranteed_fields: paper_id, title, published_at, rank
- optional_fields: summary, submitted_on_daily_at, authors, organization, submitted_by, discussion_id, upvotes, github_repo_url, github_stars, project_page_url, num_comments, is_author_participating, repo_id
supported_params: limit, where, fields
fields_contract:
- allowed_fields: paper_id, title, summary, published_at, submitted_on_daily_at, authors, organization, submitted_by, discussion_id, upvotes, github_repo_url, github_stars, project_page_url, num_comments, is_author_participating, repo_id, rank
- canonical_only: true
where_contract:
- allowed_fields: paper_id, title, summary, published_at, submitted_on_daily_at, authors, organization, submitted_by, discussion_id, upvotes, github_repo_url, github_stars, project_page_url, num_comments, is_author_participating, repo_id, rank
- supported_ops: eq, in, contains, icontains, gte, lte
- normalized_only: true
limit_contract:
- default_limit: 20
- max_limit: 500
notes: Returns daily paper summary rows. repo_id is omitted unless the upstream payload provides it.

hf_datasets_search

category: wrapped_hf_repo_search
backed_by: HfApi.list_datasets
returns:
- envelope: {ok, item, items, meta, error}
- row_type: repo
- default_fields: repo_id, repo_type, author, likes, downloads, trending_score, created_at, last_modified, pipeline_tag, num_params, repo_url, tags, library_name, description, paperswithcode_id, sdk, models, datasets, subdomain, runtime_stage, runtime
- guaranteed_fields: repo_id, repo_type, author, repo_url
- optional_fields: likes, downloads, trending_score, created_at, last_modified, pipeline_tag, num_params, tags, library_name, description, paperswithcode_id, sdk, models, datasets, subdomain, runtime_stage, runtime
supported_params: search, filter, author, benchmark, dataset_name, gated, language_creators, language, multilinguality, size_categories, task_categories, task_ids, sort, limit, expand, full, fields, post_filter
sort_values: created_at, downloads, last_modified, likes, trending_score
expand_values: author, card_data, citation, created_at, description, disabled, downloads, downloads_all_time, gated, last_modified, likes, paperswithcode_id, private, resource_group, sha, siblings, tags, trending_score, xet_enabled, gitaly_uid
fields_contract:
- allowed_fields: repo_id, repo_type, author, likes, downloads, trending_score, created_at, last_modified, pipeline_tag, num_params, repo_url, tags, library_name, description, paperswithcode_id, sdk, models, datasets, subdomain, runtime_stage, runtime
- canonical_only: true
post_filter_contract:
- allowed_fields: repo_id, repo_type, author, likes, downloads, trending_score, created_at, last_modified, pipeline_tag, num_params, repo_url, tags, library_name, description, paperswithcode_id, sdk, models, datasets, subdomain, runtime_stage, runtime
- supported_ops: eq, in, contains, icontains, gte, lte
- normalized_only: true
limit_contract:
- default_limit: 100
- max_limit: 5000
notes: Thin dataset-search wrapper around the Hub list_datasets path. Prefer this over hf_repo_search for dataset-only queries. This is a one-shot selective search; if meta.limit_boundary_hit is true, more rows may exist and counts are not exact.

hf_models_search

category: wrapped_hf_repo_search
backed_by: HfApi.list_models
returns:
- envelope: {ok, item, items, meta, error}
- row_type: repo
- default_fields: repo_id, repo_type, author, likes, downloads, trending_score, created_at, last_modified, pipeline_tag, num_params, repo_url, tags, library_name, description, paperswithcode_id, sdk, models, datasets, subdomain, runtime_stage, runtime
- guaranteed_fields: repo_id, repo_type, author, repo_url
- optional_fields: likes, downloads, trending_score, created_at, last_modified, pipeline_tag, num_params, tags, library_name, description, paperswithcode_id, sdk, models, datasets, subdomain, runtime_stage, runtime
supported_params: search, filter, author, apps, gated, inference, inference_provider, model_name, trained_dataset, pipeline_tag, num_params, emissions_thresholds, sort, limit, expand, full, card_data, fetch_config, fields, post_filter
sort_values: created_at, downloads, last_modified, likes, trending_score
expand_values: author, base_models, card_data, config, created_at, disabled, downloads, downloads_all_time, eval_results, gated, gguf, inference, inference_provider_mapping, last_modified, library_name, likes, mask_token, model_index, pipeline_tag, private, resource_group, safetensors, sha, siblings, spaces, tags, transformers_info, trending_score, widget_data, xet_enabled, gitaly_uid
fields_contract:
- allowed_fields: repo_id, repo_type, author, likes, downloads, trending_score, created_at, last_modified, pipeline_tag, num_params, repo_url, tags, library_name, description, paperswithcode_id, sdk, models, datasets, subdomain, runtime_stage, runtime
- canonical_only: true
post_filter_contract:
- allowed_fields: repo_id, repo_type, author, likes, downloads, trending_score, created_at, last_modified, pipeline_tag, num_params, repo_url, tags, library_name, description, paperswithcode_id, sdk, models, datasets, subdomain, runtime_stage, runtime
- supported_ops: eq, in, contains, icontains, gte, lte
- normalized_only: true
limit_contract:
- default_limit: 100
- max_limit: 5000
notes: Thin model-search wrapper around the Hub list_models path. Prefer this over hf_repo_search for model-only queries. This is a one-shot selective search; if meta.limit_boundary_hit is true, more rows may exist and counts are not exact.

hf_org_members

category: graph_scan
returns:
- envelope: {ok, item, items, meta, error}
- row_type: actor
- default_fields: username, fullname, is_pro, role, type
- guaranteed_fields: username
- optional_fields: fullname, is_pro, role, type
supported_params: organization, limit, scan_limit, count_only, where, fields
fields_contract:
- allowed_fields: username, fullname, is_pro, role, type
- canonical_only: true
where_contract:
- allowed_fields: username, fullname, is_pro, role, type
- supported_ops: eq, in, contains, icontains, gte, lte
- normalized_only: true
limit_contract:
- default_limit: 1000
- max_limit: 10000
- scan_max: 10000
notes: Returns organization member summary rows.

hf_profile_summary

category: profile_summary
returns:
- envelope: {ok, item, items, meta, error}
- row_type: profile
- default_fields: handle, entity_type, display_name, bio, description, avatar_url, website_url, twitter_url, github_url, linkedin_url, bluesky_url, followers_count, following_count, likes_count, members_count, models_count, datasets_count, spaces_count, discussions_count, papers_count, upvotes_count, organizations, is_pro, likes_sample, activity_sample
- guaranteed_fields: handle, entity_type
- optional_fields: display_name, bio, description, avatar_url, website_url, twitter_url, github_url, linkedin_url, bluesky_url, followers_count, following_count, likes_count, members_count, models_count, datasets_count, spaces_count, discussions_count, papers_count, upvotes_count, organizations, is_pro, likes_sample, activity_sample
supported_params: handle, include, likes_limit, activity_limit
param_values:
- include: likes, activity
notes: Profile summary helper. Aggregate counts like followers_count/following_count are in the base item. include=['likes', 'activity'] adds composed samples and extra upstream work; no other include values are supported. Overview-owned repo counts may differ slightly from visible public search/list results.

hf_recent_activity

category: activity_feed
returns:
- envelope: {ok, item, items, meta, error}
- row_type: activity
- default_fields: event_type, repo_id, repo_type, timestamp
- guaranteed_fields: event_type, timestamp
- optional_fields: repo_id, repo_type
supported_params: feed_type, entity, activity_types, repo_types, limit, max_pages, start_cursor, count_only, where, fields
param_values:
- feed_type: user, org
- repo_types: model, dataset, space
fields_contract:
- allowed_fields: event_type, repo_id, repo_type, timestamp
- canonical_only: true
where_contract:
- allowed_fields: event_type, repo_id, repo_type, timestamp
- supported_ops: eq, in, contains, icontains, gte, lte
- normalized_only: true
limit_contract:
- default_limit: 100
- max_limit: 2000
- max_pages: 10
- page_limit: 100
notes: Activity helper may fetch multiple pages when requested coverage exceeds one page. count_only may still be a lower bound unless the feed exhausts before max_pages.

hf_repo_details

category: repo_detail
returns:
- envelope: {ok, item, items, meta, error}
- row_type: repo
- default_fields: repo_id, repo_type, author, likes, downloads, trending_score, created_at, last_modified, pipeline_tag, num_params, repo_url, tags, library_name, description, paperswithcode_id, sdk, models, datasets, subdomain, runtime_stage, runtime
- guaranteed_fields: repo_id, repo_type, author, repo_url
- optional_fields: likes, downloads, trending_score, created_at, last_modified, pipeline_tag, num_params, tags, library_name, description, paperswithcode_id, sdk, models, datasets, subdomain, runtime_stage, runtime
supported_params: repo_id, repo_ids, repo_type, fields
param_values:
- repo_type: model, dataset, space, auto
fields_contract:
- allowed_fields: repo_id, repo_type, author, likes, downloads, trending_score, created_at, last_modified, pipeline_tag, num_params, repo_url, tags, library_name, description, paperswithcode_id, sdk, models, datasets, subdomain, runtime_stage, runtime
- canonical_only: true
notes: Exact repo metadata path. Multiple repo_ids may trigger one detail call per requested repo.

hf_repo_discussion_details

category: discussion_detail
returns:
- envelope: {ok, item, items, meta, error}
- row_type: discussion_detail
- default_fields: num, repo_id, repo_type, title, author, created_at, status, url, comment_count, latest_comment_author, latest_comment_created_at, latest_comment_text, latest_comment_html
- guaranteed_fields: repo_id, repo_type, title, author, status
- optional_fields: num, created_at, url, comment_count, latest_comment_author, latest_comment_created_at, latest_comment_text, latest_comment_html
supported_params: repo_type, repo_id, discussion_num, fields
param_values:
- repo_type: model, dataset, space
fields_contract:
- allowed_fields: num, repo_id, repo_type, title, author, created_at, status, url, comment_count, latest_comment_author, latest_comment_created_at, latest_comment_text, latest_comment_html
- canonical_only: true
notes: Exact discussion detail helper.

hf_repo_discussions

category: discussion_summary
returns:
- envelope: {ok, item, items, meta, error}
- row_type: discussion
- default_fields: num, repo_id, repo_type, title, author, created_at, status, url
- guaranteed_fields: num, title, author, status
- optional_fields: repo_id, repo_type, created_at, url
supported_params: repo_type, repo_id, limit, fields
param_values:
- repo_type: model, dataset, space
fields_contract:
- allowed_fields: num, repo_id, repo_type, title, author, created_at, status, url
- canonical_only: true
limit_contract:
- default_limit: 20
- max_limit: 200
notes: Discussion summary helper.

hf_repo_likers

category: repo_to_users
returns:
- envelope: {ok, item, items, meta, error}
- row_type: actor
- default_fields: username, fullname, is_pro, role, type
- guaranteed_fields: username
- optional_fields: fullname, is_pro, role, type
supported_params: repo_id, repo_type, limit, count_only, pro_only, where, fields
param_values:
- repo_type: model, dataset, space
fields_contract:
- allowed_fields: username, fullname, is_pro, role, type
- canonical_only: true
where_contract:
- allowed_fields: username, fullname, is_pro, role, type
- supported_ops: eq, in, contains, icontains, gte, lte
- normalized_only: true
limit_contract:
- default_limit: 1000
notes: Returns users who liked a repo.

hf_repo_search

category: cross_type_repo_search
returns:
- envelope: {ok, item, items, meta, error}
- row_type: repo
- default_fields: repo_id, repo_type, author, likes, downloads, trending_score, created_at, last_modified, pipeline_tag, num_params, repo_url, tags, library_name, description, paperswithcode_id, sdk, models, datasets, subdomain, runtime_stage, runtime
- guaranteed_fields: repo_id, repo_type, author, repo_url
- optional_fields: likes, downloads, trending_score, created_at, last_modified, pipeline_tag, num_params, tags, library_name, description, paperswithcode_id, sdk, models, datasets, subdomain, runtime_stage, runtime
supported_params: search, repo_type, repo_types, filter, author, sort, limit, fields, post_filter
sort_values_by_repo_type:
- dataset: created_at, downloads, last_modified, likes, trending_score
- model: created_at, downloads, last_modified, likes, trending_score
- space: created_at, last_modified, likes, trending_score
param_values:
- repo_type: model, dataset, space
- repo_types: model, dataset, space
- sort: created_at, downloads, last_modified, likes, trending_score
fields_contract:
- allowed_fields: repo_id, repo_type, author, likes, downloads, trending_score, created_at, last_modified, pipeline_tag, num_params, repo_url, tags, library_name, description, paperswithcode_id, sdk, models, datasets, subdomain, runtime_stage, runtime
- canonical_only: true
post_filter_contract:
- allowed_fields: repo_id, repo_type, author, likes, downloads, trending_score, created_at, last_modified, pipeline_tag, num_params, repo_url, tags, library_name, description, paperswithcode_id, sdk, models, datasets, subdomain, runtime_stage, runtime
- supported_ops: eq, in, contains, icontains, gte, lte
- normalized_only: true
limit_contract:
- default_limit: 100
- max_limit: 5000
notes: Small generic repo-search helper. Prefer hf_models_search, hf_datasets_search, or hf_spaces_search for single-type queries; use hf_repo_search for intentionally cross-type search. This is a one-shot selective search; if meta.limit_boundary_hit is true, more rows may exist and counts are not exact.

hf_runtime_capabilities

category: introspection
returns:
- envelope: {ok, item, items, meta, error}
- row_type: runtime_capability
- default_fields: allowed_sections, overview, helpers, helper_contracts, helper_defaults, fields, limits, repo_search
- guaranteed_fields: allowed_sections, overview, helpers, helper_contracts, helper_defaults, fields, limits, repo_search
- optional_fields: []
supported_params: section
param_values:
- section: overview, helpers, helper_contracts, helper_defaults, fields, limits, repo_search
notes: Introspection helper. Use section=... to narrow the response.

hf_spaces_search

category: wrapped_hf_repo_search
backed_by: HfApi.list_spaces
returns:
- envelope: {ok, item, items, meta, error}
- row_type: repo
- default_fields: repo_id, repo_type, author, likes, downloads, trending_score, created_at, last_modified, pipeline_tag, num_params, repo_url, tags, library_name, description, paperswithcode_id, sdk, models, datasets, subdomain, runtime_stage, runtime
- guaranteed_fields: repo_id, repo_type, author, repo_url
- optional_fields: likes, downloads, trending_score, created_at, last_modified, pipeline_tag, num_params, tags, library_name, description, paperswithcode_id, sdk, models, datasets, subdomain, runtime_stage, runtime
supported_params: search, filter, author, datasets, models, linked, sort, limit, expand, full, fields, post_filter
sort_values: created_at, last_modified, likes, trending_score
expand_values: author, card_data, created_at, datasets, disabled, last_modified, likes, models, private, resource_group, runtime, sdk, sha, siblings, subdomain, tags, trending_score, xet_enabled, gitaly_uid
fields_contract:
- allowed_fields: repo_id, repo_type, author, likes, downloads, trending_score, created_at, last_modified, pipeline_tag, num_params, repo_url, tags, library_name, description, paperswithcode_id, sdk, models, datasets, subdomain, runtime_stage, runtime
- canonical_only: true
post_filter_contract:
- allowed_fields: repo_id, repo_type, author, likes, downloads, trending_score, created_at, last_modified, pipeline_tag, num_params, repo_url, tags, library_name, description, paperswithcode_id, sdk, models, datasets, subdomain, runtime_stage, runtime
- supported_ops: eq, in, contains, icontains, gte, lte
- normalized_only: true
limit_contract:
- default_limit: 100
- max_limit: 5000
notes: Thin space-search wrapper around the Hub list_spaces path. Prefer this over hf_repo_search for space-only queries. This is a one-shot selective search; if meta.limit_boundary_hit is true, more rows may exist and counts are not exact.

hf_trending

category: curated_repo_feed
returns:
- envelope: {ok, item, items, meta, error}
- row_type: repo
- default_fields: repo_id, repo_type, author, likes, downloads, trending_score, created_at, last_modified, pipeline_tag, num_params, repo_url, tags, library_name, description, paperswithcode_id, sdk, models, datasets, subdomain, runtime_stage, runtime, trending_rank
- guaranteed_fields: repo_id, repo_type, author, repo_url, trending_rank
- optional_fields: likes, downloads, trending_score, created_at, last_modified, pipeline_tag, num_params, tags, library_name, description, paperswithcode_id, sdk, models, datasets, subdomain, runtime_stage, runtime
supported_params: repo_type, limit, where, fields
param_values:
- repo_type: model, dataset, space, all
fields_contract:
- allowed_fields: repo_id, repo_type, author, likes, downloads, trending_score, created_at, last_modified, pipeline_tag, num_params, repo_url, tags, library_name, description, paperswithcode_id, sdk, models, datasets, subdomain, runtime_stage, runtime, trending_rank
- canonical_only: true
where_contract:
- allowed_fields: repo_id, repo_type, author, likes, downloads, trending_score, created_at, last_modified, pipeline_tag, num_params, repo_url, tags, library_name, description, paperswithcode_id, sdk, models, datasets, subdomain, runtime_stage, runtime, trending_rank
- supported_ops: eq, in, contains, icontains, gte, lte
- normalized_only: true
limit_contract:
- default_limit: 20
- max_limit: 20
notes: Returns ordered trending summary rows only. Use hf_repo_details for exact repo metadata.

hf_user_graph

category: graph_scan
returns:
- envelope: {ok, item, items, meta, error}
- row_type: actor
- default_fields: username, fullname, is_pro, role, type
- guaranteed_fields: username
- optional_fields: fullname, is_pro, role, type
supported_params: username, relation, limit, scan_limit, count_only, pro_only, where, fields
param_values:
- relation: followers, following
fields_contract:
- allowed_fields: username, fullname, is_pro, role, type
- canonical_only: true
where_contract:
- allowed_fields: username, fullname, is_pro, role, type
- supported_ops: eq, in, contains, icontains, gte, lte
- normalized_only: true
limit_contract:
- default_limit: 1000
- max_limit: 10000
- scan_max: 10000
notes: Returns followers/following summary rows.

hf_user_likes

category: user_to_repos
returns:
- envelope: {ok, item, items, meta, error}
- row_type: user_like
- default_fields: liked_at, repo_id, repo_type, repo_author, repo_likes, repo_downloads, repo_url
- guaranteed_fields: liked_at, repo_id, repo_type
- optional_fields: repo_author, repo_likes, repo_downloads, repo_url
supported_params: username, repo_types, limit, scan_limit, count_only, where, fields, sort, ranking_window
sort_values: liked_at, repo_likes, repo_downloads
param_values:
- repo_types: model, dataset, space
- sort: liked_at, repo_likes, repo_downloads
fields_contract:
- allowed_fields: liked_at, repo_id, repo_type, repo_author, repo_likes, repo_downloads, repo_url
- canonical_only: true
where_contract:
- allowed_fields: liked_at, repo_id, repo_type, repo_author, repo_likes, repo_downloads, repo_url
- supported_ops: eq, in, contains, icontains, gte, lte
- normalized_only: true
limit_contract:
- default_limit: 100
- max_limit: 2000
- enrich_max: 50
- ranking_default: 50
- scan_max: 10000
notes: Default recency mode is cheap. Popularity-ranked sorts use canonical keys liked_at/repo_likes/repo_downloads and rerank only a bounded recent shortlist. Check meta.ranking_complete / meta.ranking_window when ranking by popularity; helper-owned coverage matters here.

hf_whoami

category: identity
returns:
- envelope: {ok, item, items, meta, error}
- row_type: user
- default_fields: username, fullname, is_pro
- guaranteed_fields: username
- optional_fields: fullname, is_pro
supported_params: []
notes: Returns the current authenticated user when a request token is available.