Friday, August 4, 2023

What is bitsandbytes?

Bitsandbytes is a lightweight wrapper around custom CUDA functions, in particular 8-bit optimizers and quantization functions.

Features

8-bit Optimizers: Adam, AdamW, RMSProp, LARS, LAMB (saves 75% of optimizer state memory)

Stable Embedding Layer: Improved stability through better initialization, and normalization

8-bit quantization: Quantile, Linear, and Dynamic quantization (see the sketch after this list)

Fast quantile estimation: Up to 100x faster than other algorithms
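
The quantization functions listed above can also be called directly. Below is a minimal sketch of blockwise dynamic quantization; the exact signatures of bnb.functional.quantize_blockwise and dequantize_blockwise are an assumption and may differ between bitsandbytes versions, and a CUDA device is assumed.

import torch
import bitsandbytes.functional as F

# a random weight tensor on the GPU (the quantization kernels assume a CUDA device)
w = torch.randn(1024, 1024, device="cuda")

# blockwise dynamic quantization: returns the 8-bit tensor plus the state
# (per-block absmax and the quantization code) needed to dequantize
w_q, quant_state = F.quantize_blockwise(w)

# reconstruct an approximation of the original tensor
w_deq = F.dequantize_blockwise(w_q, quant_state)

print("max abs error:", (w - w_deq).abs().max().item())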


Using the 8-bit Optimizers

With bitsandbytes, 8-bit optimizers can be used by changing a single line of code in your codebase. For NLP models we also recommend using the StableEmbedding layers (see below), which improve results and help with stable 8-bit optimization. To get started with 8-bit optimizers, it is sufficient to replace your old optimizer with the 8-bit optimizer as follows:

import bitsandbytes as bnb

# adam = torch.optim.Adam(model.parameters(), lr=0.001, betas=(0.9, 0.995)) # comment out old optimizer

adam = bnb.optim.Adam8bit(model.parameters(), lr=0.001, betas=(0.9, 0.995)) # add bnb optimizer

adam = bnb.optim.Adam(model.parameters(), lr=0.001, betas=(0.9, 0.995), optim_bits=8) # equivalent


torch.nn.Embedding(...) ->  bnb.nn.StableEmbedding(...) # recommended for NLP models
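
Putting the pieces together, here is a minimal sketch of a tiny NLP-style model that combines the StableEmbedding layer with the 8-bit Adam optimizer. The model, its sizes, and the data are made-up placeholders for illustration, and a CUDA device is assumed.

import torch
import torch.nn as nn
import bitsandbytes as bnb

class TinyLM(nn.Module):
    def __init__(self, vocab_size=10000, dim=256):
        super().__init__()
        self.embed = bnb.nn.StableEmbedding(vocab_size, dim)  # instead of nn.Embedding
        self.proj = nn.Linear(dim, vocab_size)

    def forward(self, tokens):
        return self.proj(self.embed(tokens))

model = TinyLM().cuda()  # 8-bit optimizer states live on the GPU
adam = bnb.optim.Adam8bit(model.parameters(), lr=0.001, betas=(0.9, 0.995))

tokens = torch.randint(0, 10000, (8, 128), device="cuda")
logits = model(tokens)
loss = nn.functional.cross_entropy(logits.view(-1, logits.size(-1)), tokens.view(-1))
loss.backward()
adam.step()
adam.zero_grad()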


Note that by default all parameter tensors with fewer than 4096 elements are kept in 32-bit, even if you initialize those parameters with an 8-bit optimizer. This is done because such small tensors do not save much memory and often contain highly variable parameters (biases) or parameters that require high precision (batch norm, layer norm). You can change this behavior like so:


# parameter tensors with fewer than 16384 elements are optimized in 32-bit

# it is recommended to use multiples of 4096

adam = bnb.optim.Adam8bit(model.parameters(), min_8bit_size=16384)
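
To check which parameter tensors fall under this threshold (and will therefore keep 32-bit optimizer state), you can count them with plain PyTorch. The split_by_size helper below is hypothetical and not part of bitsandbytes; model refers to any torch.nn.Module, e.g. the TinyLM sketch above.

def split_by_size(model, threshold=4096):
    # return the names of parameter tensors below / at-or-above the element threshold
    small, large = [], []
    for name, p in model.named_parameters():
        (small if p.numel() < threshold else large).append(name)
    return small, large

small, large = split_by_size(model, threshold=16384)
print(f"{len(small)} tensors keep 32-bit state, {len(large)} tensors use 8-bit state")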


References:

https://pypi.org/project/bitsandbytes-cuda113/
