✨ KG-TRACES: Unleashing Explainable Reasoning in LLMs with Knowledge Graphs ✨

This repository contains the official implementation of KG-TRACES, a novel framework that enhances the reasoning ability of Large Language Models (LLMs) through explicit supervision over reasoning paths and processes. KG-TRACES aims to provide explainable, accurate, and traceable reasoning by leveraging the power of Knowledge Graphs.

For more details, refer to the accompanying paper: KG-TRACES: Enhancing Large Language Models with Knowledge Graph-constrained Trajectory Reasoning and Attribution Supervision

The full codebase and more information can be found on the official GitHub repository: https://github.com/Edaizi/KG-TRACES

KG-TRACES Teaser Image: Comparison of Reasoning Methods

*Figure 1: KG-TRACES (d) stands out by generating faithful, attributable responses, adapting to different KG access conditions.*

💡 Our Solution: KG-TRACES

KG-TRACES explicitly teaches LLMs how to reason by supervising their internal "thought process" with knowledge graph guidance. We guide them to:

  1. 🗺️ Chart the Course: Predict symbolic knowledge graph reasoning paths from question to answer.
  2. 📝 Show Their Work: Generate attribution-aware reasoning explanations that clearly indicate whether each step comes from the KG or the LLM's internal knowledge 🧠, and how effective that step was (see the illustrative sketch below).
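
To make this concrete, below is a minimal sketch of what a predicted path and an attribution-tagged explanation might look like. The path string and the source tags ("KG" vs. "LLM") are illustrative assumptions, not the exact serialization KG-TRACES produces:

# Hypothetical KG-TRACES-style output (illustrative format, not the official schema)
predicted_path = "Jamaica -> language.official -> English"  # symbolic KG reasoning path

reasoning_steps = [
    {"step": "The KG states that Jamaica's official language is English.", "source": "KG"},
    {"step": "Therefore the answer to the question is English.", "source": "LLM"},
]

for step in reasoning_steps:
    print(f"[{step['source']}] {step['step']}")  # e.g. "[KG] The KG states ..."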

KG-TRACES Method Overview

*Figure 2: The KG-TRACES framework*

🌟 Why KG-TRACES Rocks

  • 🔍 Crystal-Clear Explanations: Understand why the LLM reached its conclusion.
  • 🛡️ Trustworthy & Attributable: Know the evidence source of each reasoning step.
  • 💪 Robust Performance: Excels even with limited or no direct KG access during inference.
  • 🌍 Versatile: Shows strong generalization to specialized fields like medicine.

🚀 Quickstart: Pretrained Models

You can easily load our fine-tuned KG-TRACES model (an 8B-parameter checkpoint stored in BF16 safetensors) from the Hugging Face Model Hub using the transformers library:

from transformers import AutoModelForCausalLM, AutoTokenizer

# Download the fine-tuned KG-TRACES checkpoint and its tokenizer from the Hub
model_hub_name = "Edaizi/KG-TRACES"
tokenizer = AutoTokenizer.from_pretrained(model_hub_name)
model = AutoModelForCausalLM.from_pretrained(model_hub_name)
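
Below is a minimal generation sketch. The question and decoding settings are illustrative assumptions; KG-TRACES may expect a specific instruction template, so consult the GitHub repository for the exact prompt format:

# Ask a question and decode the model's reasoning trace (prompt wording is an assumption)
question = "What is the official language of Jamaica?"
inputs = tokenizer(question, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))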

📚 Datasets

We've prepared augmented SFT datasets for WebQSP and CWQ, enriched with reasoning paths and reasoning processes annotated with source attributions. Find them on the Hugging Face Hub as Edaizi/KG-TRACES-WebQSP and Edaizi/KG-TRACES-CWQ.

You can load these datasets as follows:

from datasets import load_dataset

# Load the augmented SFT datasets from the Hugging Face Hub
webqsp_sft_data = load_dataset("Edaizi/KG-TRACES-WebQSP")
cwq_sft_data = load_dataset("Edaizi/KG-TRACES-CWQ")
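
Once loaded, you can inspect the splits and a sample record. The "train" split name below is an assumption; print the dataset object first to see the actual splits and column names:

# Peek at the dataset structure and the first training example
print(webqsp_sft_data)                   # available splits and row counts
example = webqsp_sft_data["train"][0]    # assumes a "train" split exists
print(example.keys())                    # the actual field names
print(example)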

📜 Citation

If KG-TRACES helps your research or project, we'd love a shout-out! Please cite:

@misc{wu2025kgtracesenhancinglargelanguage,
      title={KG-TRACES: Enhancing Large Language Models with Knowledge Graph-constrained Trajectory Reasoning and Attribution Supervision}, 
      author={Rong Wu and Pinlong Cai and Jianbiao Mei and Licheng Wen and Tao Hu and Xuemeng Yang and Daocheng Fu and Botian Shi},
      year={2025},
      eprint={2506.00783},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2506.00783}, 
}