mistral_common.guidance.tokenizer
MistralLLGTokenizer(tokenizer)
Wraps a Tekken tokenizer for use with llguidance.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `tokenizer` | `Tokenizer` | The Tekken tokenizer to wrap for llguidance compatibility. | required |
Raises:
| Type | Description |
|---|---|
| `TypeError` | If the tokenizer is not a Tekkenizer. |
| `ValueError` | If a special token has an invalid format. |
Source code in src/mistral_common/guidance/tokenizer.py
bos_token_id
property
The beginning-of-sequence (BOS) token id.
eos_token_id
property
The end-of-sequence (EOS) token id.
special_token_ids
property
The list of special token ids.
tokens
property
The list of token byte representations.
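The constructor check and the four read-only properties above can be pictured with a small, self-contained sketch. It does not use the real `mistral_common` or llguidance classes: `StubTekkenizer` and `SketchLLGTokenizer` are hypothetical stand-ins, and their field names (`vocab`, `special_ids`, `bos_id`, `eos_id`) are assumptions chosen only for illustration. The `ValueError` case is left out because the special-token format rule is not stated here.

```python
from dataclasses import dataclass, field


@dataclass
class StubTekkenizer:
    """Hypothetical stand-in for a Tekken tokenizer (not the mistral_common class)."""

    vocab: list[bytes]
    special_ids: list[int] = field(default_factory=list)
    bos_id: int = 0
    eos_id: int = 1


class SketchLLGTokenizer:
    """Illustrative wrapper mirroring the documented surface; not the real implementation."""

    def __init__(self, tokenizer: object) -> None:
        # Mirrors the documented TypeError: reject anything of the wrong tokenizer type.
        if not isinstance(tokenizer, StubTekkenizer):
            raise TypeError(f"expected StubTekkenizer, got {type(tokenizer).__name__}")
        self._tokenizer = tokenizer

    @property
    def bos_token_id(self) -> int:
        return self._tokenizer.bos_id

    @property
    def eos_token_id(self) -> int:
        return self._tokenizer.eos_id

    @property
    def special_token_ids(self) -> list[int]:
        # Return a copy so callers cannot mutate the wrapped tokenizer's state.
        return list(self._tokenizer.special_ids)

    @property
    def tokens(self) -> list[bytes]:
        # Tokens are exposed as raw bytes, one entry per token id.
        return list(self._tokenizer.vocab)


stub = StubTekkenizer(vocab=[b"<s>", b"</s>", b"hello", b" world"], special_ids=[0, 1])
wrapped = SketchLLGTokenizer(stub)
```

Passing any other object, for example `SketchLLGTokenizer("not a tokenizer")`, raises `TypeError` in this sketch, which is the same failure mode the real class documents for non-Tekkenizer input.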
from_mistral_tokenizer(tokenizer)
Creates an llguidance tokenizer from a Mistral tokenizer.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `tokenizer` | `MistralTokenizer` | The Mistral tokenizer to convert. Must wrap a Tekkenizer. | required |
Returns:
| Type | Description |
|---|---|
| `LLTokenizer` | The llguidance tokenizer. |
Raises:
| Type | Description |
|---|---|
| `TypeError` | If the underlying tokenizer is not a Tekkenizer. |
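A minimal usage sketch for `from_mistral_tokenizer` might look like the following. It assumes that `mistral_common` is installed together with its llguidance dependency, and that the tokenizer is loaded with `MistralTokenizer.from_file`. The helper name `load_llg_tokenizer` and the decision to re-raise as `RuntimeError` are illustrative choices, not part of the library.

```python
from typing import Any


def load_llg_tokenizer(tekken_path: str) -> Any:
    """Build an llguidance tokenizer from a Tekken tokenizer file (illustrative helper)."""
    # Imports are deferred so this module still imports without the optional dependencies.
    from mistral_common.guidance.tokenizer import from_mistral_tokenizer
    from mistral_common.tokens.tokenizers.mistral import MistralTokenizer

    mistral_tokenizer = MistralTokenizer.from_file(tekken_path)
    try:
        # Documented to return an LLTokenizer when the wrapped tokenizer is a Tekkenizer.
        return from_mistral_tokenizer(mistral_tokenizer)
    except TypeError as exc:
        # Documented failure mode: the underlying tokenizer is not a Tekkenizer.
        raise RuntimeError(f"{tekken_path} does not provide a Tekken tokenizer") from exc
```

Calling `load_llg_tokenizer("tekken.json")` (the path is a placeholder) would then yield the object to hand to llguidance. Keeping the conversion in one helper makes the `TypeError` case explicit at the point where the tokenizer file is chosen.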