mistral_common.tokens.tokenizers.utils
chunks(lst, chunk_size)
Chunk a list into smaller lists of a given size.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
lst | List[str] | The list to chunk. | required |
chunk_size | int | The size of each chunk. | required |
Returns:
Type | Description |
---|---|
Iterator[List[str]] | An iterator over the chunks. |
Examples:
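The following is a minimal, standalone sketch of the chunking behaviour described above, not the library's own source. The name `chunks_sketch` is hypothetical, and the sketch assumes that chunks are taken in order and that the final chunk may be shorter than `chunk_size`.

```python
from typing import Iterator, List


def chunks_sketch(lst: List[str], chunk_size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of `lst`, each holding at most `chunk_size` items."""
    # Assumption: a non-positive size is treated as an error in this sketch;
    # the documented function may behave differently.
    if chunk_size <= 0:
        raise ValueError("chunk_size must be a positive integer")
    for start in range(0, len(lst), chunk_size):
        # Slicing past the end of the list is safe and yields a shorter last chunk.
        yield lst[start : start + chunk_size]


tokens = ["a", "b", "c", "d", "e"]
all_chunks = list(chunks_sketch(tokens, 2))
print(all_chunks)  # [['a', 'b'], ['c', 'd'], ['e']]
```

Because the function is a generator, nothing is sliced until the result is iterated; wrapping the call in `list(...)` materializes every chunk at once.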
Source code in src/mistral_common/tokens/tokenizers/utils.py
download_tokenizer_from_hf_hub(model_id, **kwargs)
Download the configuration file of an official Mistral tokenizer from the Hugging Face Hub.
See here for a list of our OSS models.
Note
You need to install the `huggingface_hub` package to use this method. Please run `pip install mistral-common[hf-hub]` to install it.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
model_id | str | The Hugging Face model ID. | required |
kwargs | Any | Additional keyword arguments to pass to | {} |
Returns:
Type | Description |
---|---|
str | The local path of the downloaded tokenizer file for the given model ID. |
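A hedged usage sketch follows. The wrapper `fetch_tokenizer_path` is a hypothetical helper, not part of `mistral_common`; it assumes only the documented signature `download_tokenizer_from_hf_hub(model_id, **kwargs)` and returns `None` when the optional dependency is missing instead of raising.

```python
from typing import Any, Optional


def fetch_tokenizer_path(model_id: str, **kwargs: Any) -> Optional[str]:
    """Return the local tokenizer path, or None if the optional packages are missing."""
    try:
        # Imported lazily so this sketch loads even without mistral-common installed.
        from mistral_common.tokens.tokenizers.utils import (
            download_tokenizer_from_hf_hub,
        )

        # Extra keyword arguments are forwarded unchanged, as the signature documents.
        return download_tokenizer_from_hf_hub(model_id, **kwargs)
    except ImportError:
        # Either mistral-common or huggingface_hub is not available.
        return None
```

Calling `fetch_tokenizer_path(...)` with the ID of an official Mistral model requires network access on first use. The returned string is a path on the local filesystem, so it can be handed to any loader that accepts a tokenizer file path.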