DemonioStrada committed
Commit c19e8a3 · verified · 1 Parent(s): 07c19f9

Update README.md

Files changed (1): README.md (+110 -2)
README.md CHANGED
@@ -1,6 +1,6 @@
  ---
  title: Khalil
- emoji: 😻
+ emoji: 🔥
  colorFrom: blue
  colorTo: green
  sdk: gradio
@@ -11,4 +11,112 @@ license: apache-2.0
  short_description: Place for AI Models
  ---
 
- Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
+ Model Card: LLM Brain Rot Demonstration
+ Model/Dataset Name
+ LLM Brain Rot Demonstration: Qwen2.5 0.5B Comparison
+ 1. Overview
+ This demonstration showcases the "Brain Rot" effect in Large Language Models (LLMs) as described in the research paper "LLMs Can Get Brain Rot!" by Xing et al. (2025). The demonstration compares two Qwen2.5 0.5B Instruct models: one trained on control data and one trained on 100% M1 junk data, illustrating how exposure to low-quality web content can degrade an LLM's cognitive capabilities.
+ The original research was conducted by a team from Texas A&M University, the University of Texas at Austin, and Purdue University. This demonstration is a simplified implementation of their findings, focusing on the most extreme case (100% junk data) to clearly illustrate the phenomenon.
+ 2. Intended Use
+ Primary Tasks
+ ⦁ Educational demonstration of the effect of data quality on LLM performance
+ ⦁ Comparison of reasoning capabilities between models trained on data of different quality
+ ⦁ Illustration of the "thought-skipping" phenomenon in LLMs
+ Intended Users
+ ⦁ Students learning about LLM training and data quality
+ ⦁ Researchers studying model robustness and data effects
+ ⦁ Educators demonstrating AI concepts
+ ⦁ Anyone interested in understanding how training data affects model behavior
+ Inappropriate Uses
+ ⦁ Production deployment or real-world applications
+ ⦁ Making generalized claims about all LLMs based on this limited comparison
+ ⦁ Evaluating the overall quality of the base Qwen2.5 model family
+ ⦁ Drawing conclusions about the effects of content beyond what is demonstrated
+ 3. Dataset/Model Details
+ Models
+ ⦁ Base Model: Qwen2.5 0.5B Instruct
+ ⦁ Comparison Models (a loading sketch follows this list):
+ 1. Qwen2.5 0.5B trained on control data (0% junk)
+ 2. Qwen2.5 0.5B trained on 100% M1 junk data
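+ The sketch below shows one minimal way to load the two checkpoints (the repository IDs listed under References) with the Hugging Face transformers library and compare their answers side by side. It is illustrative only: the question, generation settings, and device handling are assumptions, not the demo's actual code.
+ ```python
+ # Minimal sketch: side-by-side generation from the control and junk checkpoints.
+ # Repo IDs are taken from the References section; everything else is an assumption.
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+ 
+ CONTROL = "AmberYifan/qwen2.5-0.5b-instruct-full-pretrain-control-tweet-1m-en-sft"
+ JUNK = "AmberYifan/qwen2.5-0.5b-instruct-full-pretrain-junk-tweet-1m-en-sft"
+ 
+ def generate(repo_id: str, question: str) -> str:
+     tokenizer = AutoTokenizer.from_pretrained(repo_id)
+     model = AutoModelForCausalLM.from_pretrained(repo_id)
+     prompt = tokenizer.apply_chat_template(
+         [{"role": "user", "content": question}],
+         tokenize=False,
+         add_generation_prompt=True,
+     )
+     inputs = tokenizer(prompt, return_tensors="pt")
+     outputs = model.generate(**inputs, max_new_tokens=256, do_sample=False)
+     return tokenizer.decode(
+         outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
+     )
+ 
+ question = "Which gas do plants primarily absorb from the atmosphere during photosynthesis? Think step by step."
+ print("Control model:\n", generate(CONTROL, question))
+ print("Junk model:\n", generate(JUNK, question))
+ ```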
+ Dataset
+ ⦁ ARC Challenge questions (small sample from the main repository)
+ ⦁ Safety questions (small sample from the main repository)
+ ⦁ RULER (3 custom sets based on the RULER repository sub-tests: Needle in a Haystack, Variable Tracking, and Question Answering; a toy needle-in-a-haystack example follows this list)
+ ⦁ TRAIT (custom set based on the original TRAIT repository)
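+ As a toy illustration of the needle-in-a-haystack idea behind the custom RULER set, the snippet below buries a "needle" fact inside filler text and asks the model to retrieve it. The filler sentence, needle wording, and haystack length are assumptions for illustration, not the actual RULER task format.
+ ```python
+ # Toy needle-in-a-haystack prompt, loosely inspired by the RULER sub-test.
+ # Filler sentence, needle format, and sizes are illustrative assumptions.
+ import random
+ 
+ def build_haystack_prompt(needle_value: str, num_filler: int = 200, seed: int = 0) -> str:
+     random.seed(seed)
+     filler = "The grass is green and the sky is blue. "
+     needle = f"The special magic number is {needle_value}. "
+     sentences = [filler] * num_filler
+     sentences.insert(random.randint(0, num_filler), needle)  # bury the needle at a random depth
+     question = "What is the special magic number mentioned in the text above?"
+     return "".join(sentences) + "\n\n" + question
+ 
+ prompt = build_haystack_prompt("4721")
+ # The prompt is sent to both models and each reply is checked for the needle value.
+ ```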
+ Model Variants and Datasets in the Original Research
+ The original research included 40 model variants:
+ ⦁ 4 base models: Llama3 8B, Qwen2.5 7B, Qwen2.5 0.5B, Qwen3 4B
+ ⦁ 2 junk metrics: M1 (engagement degree) and M2 (semantic quality)
+ ⦁ 5 training ratios: 0%, 20%, 50%, 80%, 100% junk data
+ ⦁ 4 base models × 5 training ratios × 2 junk metrics = 40 model variants in total
+ The original dataset (a toy filtering sketch follows this list):
+ ⦁ Source: Twitter/X posts from 2010
+ ⦁ Filtering: M1 metric (engagement degree) selects short but highly popular posts
+ ⦁ Processing: control data consists of longer, less popular posts
+ ⦁ Language: primarily English
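+ The snippet below is a rough sketch of how an engagement-based split in the spirit of the M1 metric could be implemented, separating short but highly popular posts (junk) from longer, less popular ones (control). The length and engagement thresholds are invented for illustration; the actual pipeline is described in the paper and its repository.
+ ```python
+ # Illustrative engagement-based split in the spirit of the M1 metric.
+ # The thresholds (< 30 tokens, > 500 likes + retweets) are assumptions,
+ # not the values used in the original pipeline.
+ def split_by_engagement(posts):
+     junk, control = [], []
+     for post in posts:
+         length = len(post["text"].split())
+         engagement = post.get("likes", 0) + post.get("retweets", 0)
+         if length < 30 and engagement > 500:
+             junk.append(post)      # short but highly popular -> junk (M1)
+         elif length >= 30 and engagement <= 500:
+             control.append(post)   # longer, less popular -> control
+     return junk, control
+ 
+ sample = [
+     {"text": "lol this is wild", "likes": 1200, "retweets": 300},
+     {"text": "a much longer and less popular post " * 5, "likes": 4, "retweets": 0},
+ ]
+ junk, control = split_by_engagement(sample)
+ ```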
+ 4. Ethical Considerations
+ Possible Biases
+ ⦁ The Twitter dataset may contain demographic, cultural, and ideological biases present on the platform
+ ⦁ The M1 metric (based on popularity) may amplify content that is attention-grabbing rather than accurate or thoughtful
+ ⦁ The models may reproduce stereotypes or problematic content present in the training data
+ Risks of Misuse
+ ⦁ The junk-trained model may generate lower-quality, less reliable, or potentially problematic responses
+ ⦁ Users might overgeneralize from this specific demonstration to make broader claims about LLMs
+ ⦁ The demonstration might be misinterpreted as a definitive statement about all social media content
+ Privacy/Consent Issues
+ ⦁ The models were trained on public Twitter posts, but individual tweets may contain personal information
+ ⦁ Users should be cautious about inputting personal information into either model
+ 5. Limitations
+ Scope Limitations
+ ⦁ Only demonstrates the effect with one model family (Qwen2.5) and size (0.5B)
+ ⦁ Only shows the comparison between 0% and 100% junk data, not the "dose-response" relationship
+ ⦁ Only demonstrates M1 metric effects, not M2 (semantic quality)
+ ⦁ Only evaluates a limited number of examples per task type for demonstration purposes
+ Technical Limitations
+ ⦁ The smaller model size (0.5B) may show more pronounced effects than larger models
+ ⦁ The demonstration focuses on reasoning tasks, but the original paper found effects across multiple capabilities
+ ⦁ The interface may not fully capture all nuances of the "thought-skipping" phenomenon
+ Generalizability
+ ⦁ Results may not apply to all LLM architectures or training methodologies
+ ⦁ The specific Twitter dataset from 2010 may not represent current web content
+ ⦁ The demonstration shows correlation, not necessarily causation for all scenarios
+ 6. Training & Evaluation
+ Training Process
+ The original models were trained using the following process (a hedged configuration sketch follows these steps):
+ 1. Base models (Qwen2.5 0.5B Instruct) underwent continual pre-training
+ 2. Training parameters: learning rate 1×10^-5, AdamW optimizer, 3 epochs
+ 3. Models were trained on either control data or 100% M1 junk data
+ 4. After pre-training, models underwent instruction tuning on the Alpaca English dataset
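+ The training code is not part of this card, so the snippet below is only a hedged sketch of how the reported hyperparameters (learning rate 1×10^-5, AdamW, 3 epochs) could map onto a Hugging Face Trainer run for the continual pre-training step. The batch size, sequence length, and the tiny stand-in corpus are assumptions, not values from the paper.
+ ```python
+ # Hedged sketch of the reported hyperparameters (lr 1e-5, AdamW, 3 epochs) in a
+ # Hugging Face Trainer setup. The corpus below is a stand-in for the tokenized
+ # control or 100% M1 junk tweets; batch size and max length are assumptions.
+ from datasets import Dataset
+ from transformers import (
+     AutoModelForCausalLM,
+     AutoTokenizer,
+     DataCollatorForLanguageModeling,
+     Trainer,
+     TrainingArguments,
+ )
+ 
+ base = "Qwen/Qwen2.5-0.5B-Instruct"
+ tokenizer = AutoTokenizer.from_pretrained(base)
+ model = AutoModelForCausalLM.from_pretrained(base)
+ 
+ texts = ["example tweet one", "example tweet two"]  # stand-in for the tweet corpus
+ def tokenize(batch):
+     return tokenizer(batch["text"], truncation=True, max_length=512)
+ train_dataset = Dataset.from_dict({"text": texts}).map(
+     tokenize, batched=True, remove_columns=["text"]
+ )
+ 
+ args = TrainingArguments(
+     output_dir="qwen2.5-0.5b-continual-pretrain",
+     learning_rate=1e-5,             # reported learning rate
+     num_train_epochs=3,             # reported number of epochs
+     optim="adamw_torch",            # AdamW optimizer
+     per_device_train_batch_size=8,  # assumption, not from the paper
+ )
+ 
+ trainer = Trainer(
+     model=model,
+     args=args,
+     train_dataset=train_dataset,
+     data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
+ )
+ trainer.train()
+ ```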
+ Evaluation Metrics
+ The original research evaluated models on multiple benchmarks; this demonstration uses simplified versions of each (a toy thought-skipping check follows this list):
+ ⦁ ARC Challenge: chain-of-thought prompting with accuracy measurement
+ ⦁ RULER: sample tasks representing needle-in-a-haystack, variable tracking, and question answering
+ ⦁ TRAIT: sample personality questions with simplified analysis
+ ⦁ Safety: subset of harmful behaviors with refusal detection
+ ⦁ Thought-skipping analysis: heuristic-based categorization of reasoning failures
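+ The paper's actual categorization of reasoning failures is not reproduced here; the function below is only a rough illustration of the kind of heuristic a thought-skipping check could use, flagging responses that commit to a final answer with little or no visible reasoning. The cue phrases and the sentence threshold are assumptions.
+ ```python
+ # Rough heuristic for "thought-skipping": flag responses that give a final answer
+ # with little or no reasoning before it. Cue phrases and the threshold are
+ # assumptions, not the paper's actual categorization scheme.
+ import re
+ 
+ ANSWER_CUES = ("the answer is", "answer:", "final answer")
+ 
+ def looks_like_thought_skipping(response: str, min_reasoning_sentences: int = 2) -> bool:
+     text = response.strip().lower()
+     # Find where the model first commits to a final answer, if anywhere.
+     cut = len(text)
+     for cue in ANSWER_CUES:
+         idx = text.find(cue)
+         if idx != -1:
+             cut = min(cut, idx)
+     # Count the sentences that appear before that point.
+     sentences = [s for s in re.split(r"[.!?]\s+", text[:cut]) if s.strip()]
+     return len(sentences) < min_reasoning_sentences
+ 
+ print(looks_like_thought_skipping("The answer is B."))  # True: no reasoning shown
+ print(looks_like_thought_skipping(
+     "Plants need sunlight to photosynthesize. Option B mentions sunlight. "
+     "Therefore the answer is B."
+ ))  # False: reasoning steps precede the answer
+ ```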
+ Key Results from the Original Research
+ For Qwen2.5 0.5B with the M1 intervention:
+ ⦁ ARC Challenge (CoT): 74.9 → 57.2 (a 17.7-point drop)
+ ⦁ RULER Overall: 93.9 → 71.0 (a 22.9-point drop)
+ ⦁ Safety metrics showed increased risk scores
+ ⦁ Personality traits showed increases in narcissism and psychopathy
+ Analysis of Failures
+ The primary failure mode identified was "thought-skipping," where models:
+ ⦁ Skip intermediate reasoning steps
+ ⦁ Provide answers without showing their thinking process
+ ⦁ Make logical leaps or factual errors in their reasoning
+ 7. References
+ Primary Research
+ ⦁ Xing, S., Hong, J., Wang, Y., Chen, R., Zhang, Z., Grama, A., Tu, Z., & Wang, Z. (2025). LLMs Can Get Brain Rot! arXiv preprint arXiv:2510.13928.
+ Resources
+ ⦁ GitHub Repository: https://github.com/llm-brain-rot/llm-brain-rot
+ ⦁ Project Website: https://llm-brain-rot.github.io/
+ ⦁ Hugging Face Models:
+ 1. Qwen2.5 0.5B trained on control data (0% junk): https://huggingface.co/AmberYifan/qwen2.5-0.5b-instruct-full-pretrain-control-tweet-1m-en-sft
+ 2. Qwen2.5 0.5B trained on 100% M1 junk data: https://huggingface.co/AmberYifan/qwen2.5-0.5b-instruct-full-pretrain-junk-tweet-1m-en-sft
+ Related Work
+ ⦁ Qi, X., Zeng, Y., Xie, T., Chen, P.-Y., Jia, R., Mittal, P., & Henderson, P. (2023). Fine-tuning aligned language models compromises safety, even when users do not intend to! arXiv preprint arXiv:2310.03693.
+ ⦁ Shumailov, I., Shumaylov, Z., Zhao, Y., Gal, Y., Papernot, N., & Anderson, R. (2023). The curse of recursion: Training on generated data makes models forget. arXiv preprint arXiv:2305.17493.
+ ⦁ Seddik, M. E., Shumailov, I., Shumailova, Z., & Gal, Y. (2024). How bad is training on synthetic data? A statistical analysis of language model collapse. arXiv preprint arXiv:2404.05094.
+
+ Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference