Friday, June 6, 2025

How to run Hugging Face Transformers in Cached / Offline Mode?

Using Transformers in an offline or firewalled environment requires the model files to be downloaded and cached ahead of time. Download a model repository from the Hub with the snapshot_download method.


Refer to the Download files from the Hub guide for more download options: you can download files from a specific revision, download from the CLI, and even filter which files to download from a repository.



from huggingface_hub import snapshot_download

# Download the full model repository into the local Hugging Face cache
snapshot_download(repo_id="meta-llama/Llama-2-7b-hf", repo_type="model")
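
As the guide mentioned above describes, snapshot_download can also be pinned to a revision and restricted to a subset of files. A minimal sketch of that follows; the revision and the allow_patterns values are illustrative assumptions, not values from the original post:

from huggingface_hub import snapshot_download

# Pin a revision (branch, tag, or commit hash) and only fetch the
# weight and config files; the patterns here are just an example.
snapshot_download(
    repo_id="meta-llama/Llama-2-7b-hf",
    repo_type="model",
    revision="main",
    allow_patterns=["*.safetensors", "*.json"],
)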

Set the environment variable HF_HUB_OFFLINE=1 to prevent HTTP calls to the Hub when loading a model.



HF_HUB_OFFLINE=1 \
python examples/pytorch/language-modeling/run_clm.py --model_name_or_path meta-llama/Llama-2-7b-hf --dataset_name wikitext ...
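
If you prefer to stay inside Python, a sketch of the same idea is to set the variable before importing the libraries, since the offline flag is typically read when they are first imported. This example and the model id it reuses are assumptions, not part of the original post:

import os

# Must be set before huggingface_hub / transformers are imported
os.environ["HF_HUB_OFFLINE"] = "1"

from transformers import AutoModelForCausalLM

# Loads from the local cache only; no HTTP calls are made to the Hub
model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf")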

Another option for loading only cached files is to set local_files_only=True in from_pretrained().



from transformers import LlamaForCausalLM

# Load weights from a local directory; raises an error instead of
# attempting to download anything from the Hub
model = LlamaForCausalLM.from_pretrained("./path/to/local/directory", local_files_only=True)
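
local_files_only=True also works with a repo id, which is then resolved against the local cache populated by snapshot_download above. A minimal sketch, using the Auto classes as an assumption (the original post only shows LlamaForCausalLM):

from transformers import AutoModelForCausalLM, AutoTokenizer

# Resolve the repo id against the local cache only; this fails fast
# if the files were not downloaded beforehand
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-hf", local_files_only=True)
model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf", local_files_only=True)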

