Hugging Face Model Uploads & Security
1. Can Anyone Upload Models to Hugging Face?
Yes. Hugging Face allows anyone with an account to upload models, datasets, or other assets to the Hugging Face Hub. These can be public or private, depending on the user's choice. Uploads are not manually reviewed before they become available. The Hub runs automated malware and pickle scans on uploaded files, but these checks do not guarantee that a model is safe.
2. How Can Uploaded Models Pose Security Risks?
Malicious actors can upload models with harmful code in several ways:
- Custom Code in Model Repositories: Many models include a model.py, a tokenizer.py, or a README.md with example code. If a user runs code directly from these files (for instance, by following a from_pretrained example with trust_remote_code=True), arbitrary Python code from the repository may execute on the machine.
- Pickle Payloads: Some model formats (like PyTorch .bin files) use the Python pickle serialization mechanism. Loading pickled files from untrusted sources can execute embedded code, potentially compromising your system.
- Custom Pipelines: Some models require custom pipeline code, which may be downloaded and executed if trust_remote_code is enabled.
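As a rough illustration of these categories, the sketch below sorts a repository's file names into "custom code", "pickle-based", and "other". The helper name triage, the suffix list, and the sample file names are assumptions made for this example, not part of any Hugging Face API. A suffix is only a hint: a .bin file is not necessarily a pickle, and a file with an unlisted suffix is not necessarily safe.

```python
from pathlib import PurePosixPath

# Assumed list of suffixes that commonly hold pickle-serialized weights.
PICKLE_SUFFIXES = {".bin", ".pt", ".pth", ".pkl", ".pickle", ".ckpt"}


def triage(filenames):
    """Group file names by the kind of risk their suffix suggests (hypothetical helper)."""
    report = {"custom_code": [], "pickle_based": [], "other": []}
    for name in filenames:
        suffix = PurePosixPath(name).suffix.lower()
        if suffix == ".py":
            # Python files may run if trust_remote_code is enabled.
            report["custom_code"].append(name)
        elif suffix in PICKLE_SUFFIXES:
            # These formats may be deserialized with pickle.
            report["pickle_based"].append(name)
        else:
            report["other"].append(name)
    return report


# Made-up file listing for illustration only.
files = ["config.json", "modeling_custom.py", "pytorch_model.bin", "model.safetensors"]
report = triage(files)
print(report["custom_code"])   # ['modeling_custom.py']
print(report["pickle_based"])  # ['pytorch_model.bin']
```

Files flagged in the first two groups are the ones worth reading or scanning before anything is loaded.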
3. How Is a "Payload" Set and Run?
A payload is malicious code embedded in the model repository. Attackers might:
- Modify model files or example scripts to include harmful code.
- Embed malicious logic in pickled weight files, so that it runs when the file is deserialized.
- Rely on users enabling trust_remote_code=True, which lets the transformers library download and execute custom model code from the repository.
Example:
from transformers import AutoModel
# WARNING: trust_remote_code=True can execute arbitrary code from the model repo!
model = AutoModel.from_pretrained("malicious-user/malicious-model", trust_remote_code=True)
If the model repo has a custom Python script (e.g., modeling_malicious.py), this code will be executed.
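The pickle route works in a similar way. The following is a deliberately harmless sketch using only the standard library: the class name Payload and the expression "6 * 7" are made up for illustration, and a real attacker would substitute something destructive. It shows that unpickling can call an arbitrary function chosen by whoever created the file.

```python
import pickle


class Payload:
    """Hypothetical object that hijacks unpickling via __reduce__."""

    def __reduce__(self):
        # Tells pickle: "to rebuild this object, call eval('6 * 7')".
        # Any importable callable and arguments could be placed here.
        return (eval, ("6 * 7",))


blob = pickle.dumps(Payload())  # bytes an attacker could ship inside a weights file
result = pickle.loads(blob)     # the victim merely "loads" the file

# The loader does not get a Payload back; it gets whatever eval() returned.
print(type(result).__name__, result)  # int 42
```

Nothing in the loading step asks for permission: the call happens as a side effect of deserialization, which is why pickled weights from unknown sources deserve the same suspicion as executable code.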
4. How to Stay Safe
- Never set trust_remote_code=True unless you trust the source.
- Inspect model repositories and code before using or running any example scripts.
- Prefer popular or verified models.
- Avoid running code or loading weights from unknown or untrusted sources.
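One way to inspect a pickled file without loading it is to walk its opcodes with the standard library's pickletools module and list what it would import. The sketch below is a simplified heuristic, not a replacement for a dedicated scanner: the function name find_imports is an assumption for this example, and it only tracks string values closely enough to resolve common cases.

```python
import pickle
import pickletools


def find_imports(data: bytes) -> set:
    """List 'module.name' references in a pickle without executing it (heuristic sketch)."""
    imports = set()
    strings = []  # string values pushed so far, in order
    memo = {}     # memo slot -> string, or None for non-string values
    last = None   # string pushed by the previous opcode, if any
    for opcode, arg, _pos in pickletools.genops(data):
        name = opcode.name
        pushed = None
        if name in ("GLOBAL", "INST"):
            # Older protocols carry "module name" directly in the opcode argument.
            module, attr = arg.split(" ", 1)
            imports.add(f"{module}.{attr}")
        elif name == "STACK_GLOBAL":
            # Newer protocols take module and name from the two latest strings.
            if len(strings) >= 2:
                imports.add(f"{strings[-2]}.{strings[-1]}")
        elif name == "MEMOIZE":
            memo[len(memo)] = last
        elif name in ("PUT", "BINPUT", "LONG_BINPUT"):
            memo[arg] = last
        elif name in ("GET", "BINGET", "LONG_BINGET"):
            pushed = memo.get(arg)
        elif isinstance(arg, str):
            pushed = arg
        if isinstance(pushed, str):
            strings.append(pushed)
        last = pushed
    return imports


suspicious = pickle.dumps(eval)                  # stores a reference to builtins.eval
benign = pickle.dumps({"weights": [0.1, 0.2]})   # plain data, no imports
print(find_imports(suspicious))  # {'builtins.eval'}
print(find_imports(benign))      # set()
```

References to names such as eval, exec, or functions in os and subprocess are a strong signal that a file should not be loaded. An empty result is reassuring but not proof of safety, since this sketch may miss unusual encodings.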
5. Summary
- Anyone can upload models to Hugging Face.
- Malicious code can be embedded in model files, custom scripts, or pickled weights.
- Only use trust_remote_code=True for trusted sources, and always review repository contents before use.