mistral_common.tokens.tokenizers.multimodal
ImageEncoder(mm_config, special_ids)
Bases: MultiModalEncoder
Image encoder for the multimodal tokenizer.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
mm_config | MultimodalConfig | Configuration for the multimodal tokenizer. | required |
special_ids | SpecialImageIDs | Special image token IDs. | required |
Source code in src/mistral_common/tokens/tokenizers/multimodal.py
__call__(content)
Converts an image chunk to an image encoding.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
content | Union[ImageChunk, ImageURLChunk] | Image chunk to be converted. | required |
Returns:
Type | Description |
---|---|
ImageEncoding | Image encoding. |
MultimodalConfig(image_patch_size, max_image_size, spatial_merge_size=1)
dataclass
Configuration for the multimodal tokenizers.
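The page does not describe the three fields beyond their names, so the following is only a minimal sketch of how such a configuration could determine the size of an image's token grid. `SketchConfig` and `token_grid` are hypothetical names, the `int` field types are assumed, and the resize-then-ceil-divide logic is an assumption about typical patch-based encoders rather than documented behavior of this module.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class SketchConfig:
    """Hypothetical mirror of the MultimodalConfig signature (field types assumed)."""

    image_patch_size: int
    max_image_size: int
    spatial_merge_size: int = 1


def token_grid(width: int, height: int, config: SketchConfig) -> Tuple[int, int]:
    """Return (columns, rows) of image tokens under the assumptions stated above."""
    # Assumption: shrink the image so its longest side fits max_image_size,
    # preserving the aspect ratio. Smaller images are left untouched.
    ratio = max(width / config.max_image_size, height / config.max_image_size)
    if ratio > 1:
        width = round(width / ratio)
        height = round(height / ratio)
    # Assumption: one token covers a square of patch_size * merge_size pixels.
    cell = config.image_patch_size * config.spatial_merge_size
    # Ceiling division, so partially covered cells still count.
    columns = (width - 1) // cell + 1
    rows = (height - 1) // cell + 1
    return columns, rows


config = SketchConfig(image_patch_size=16, max_image_size=1024)
grid = token_grid(2048, 1024, config)
print(grid)  # (64, 32)

merged = SketchConfig(image_patch_size=16, max_image_size=1024, spatial_merge_size=2)
merged_grid = token_grid(2048, 1024, merged)
print(merged_grid)  # (32, 16)
```

Under these assumptions, a larger `spatial_merge_size` reduces the number of tokens per image by that factor in each dimension.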
image_from_chunk(chunk)
Get a serializable image from a chunk.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
chunk | Union[ImageURLChunk, ImageChunk] | The chunk to get the image from. | required |
Returns:
Type | Description |
---|---|
SerializableImage | The image as a PIL Image object. |
is_cv2_installed()
normalize(np_image, mean, std)
Normalize a tensor image with mean and standard deviation.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
np_image | ndarray | Image to be normalized. | required |
mean | Tuple[float, float, float] | Mean for each channel. | required |
std | Tuple[float, float, float] | Standard deviation for each channel. | required |
Returns:
Type | Description |
---|---|
ndarray | Normalized image with shape (C, H, W). |
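To make the per-channel normalization concrete, here is a minimal NumPy sketch. `normalize_sketch` is a hypothetical stand-in, not the library's code. It assumes the input is laid out as (H, W, C) with pixel values in [0, 255] and that values are scaled to [0, 1] before standardizing; neither detail is stated on this page. Only the (C, H, W) output layout comes from the description above.

```python
import numpy as np


def normalize_sketch(np_image, mean, std):
    """Hypothetical stand-in for `normalize`; input layout and scaling are assumed."""
    # Assumption: pixel values are in [0, 255] and are scaled to [0, 1] first.
    scaled = np_image.astype(np.float32) / 255.0
    if scaled.ndim != 3 or not (scaled.shape[2] == len(mean) == len(std)):
        raise ValueError("expected an (H, W, C) array with one mean/std per channel")
    mean_arr = np.asarray(mean, dtype=np.float32)
    std_arr = np.asarray(std, dtype=np.float32)
    # Broadcasting applies each channel's mean and std along the last axis.
    standardized = (scaled - mean_arr) / std_arr
    # Reorder (H, W, C) -> (C, H, W), the output layout documented above.
    return standardized.transpose(2, 0, 1)


white = np.full((2, 4, 3), 255, dtype=np.uint8)  # 2 rows, 4 columns, 3 channels
out = normalize_sketch(white, mean=(0.5, 0.5, 0.5), std=(0.5, 0.5, 0.5))
print(out.shape)  # (3, 2, 4)
```

With a mean and standard deviation of 0.5 per channel, white pixels map to 1.0 and black pixels to -1.0.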
transform_image(image, new_size)
Transform an image to a numpy array with the given size.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
image | Image | Image to be transformed. | required |
new_size | Tuple[int, int] | New size of the image. | required |
Returns:
Type | Description |
---|---|
ndarray | Transformed image with shape (C, H, W). |
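The resize-then-normalize flow can be sketched as follows. This is an illustration under stated assumptions, not the library's implementation: `transform_image_sketch`, `SKETCH_MEAN`, and `SKETCH_STD` are hypothetical names with placeholder statistics, `new_size` is assumed to be (width, height), and Pillow's bicubic resize is swapped in here even though the presence of `is_cv2_installed` suggests the module itself may rely on OpenCV.

```python
import numpy as np
from PIL import Image

# Placeholder per-channel statistics; the real values are not given on this page.
SKETCH_MEAN = (0.5, 0.5, 0.5)
SKETCH_STD = (0.5, 0.5, 0.5)


def transform_image_sketch(image, new_size):
    """Hypothetical stand-in for `transform_image`, using Pillow for resizing."""
    # Assumption: the image is converted to three-channel RGB before resizing.
    rgb = image.convert("RGB")
    # Assumption: new_size is (width, height), matching Pillow's convention.
    resized = rgb.resize(new_size, resample=Image.BICUBIC)
    # np.asarray yields an (H, W, C) array; scale to [0, 1] (assumed).
    pixels = np.asarray(resized, dtype=np.float32) / 255.0
    mean_arr = np.asarray(SKETCH_MEAN, dtype=np.float32)
    std_arr = np.asarray(SKETCH_STD, dtype=np.float32)
    standardized = (pixels - mean_arr) / std_arr
    # Reorder (H, W, C) -> (C, H, W), the output layout documented above.
    return standardized.transpose(2, 0, 1)


source = Image.new("RGB", (64, 48), color=(255, 255, 255))
array = transform_image_sketch(source, (32, 16))
print(array.shape)  # (3, 16, 32)
```

Note that the width and height swap places between `new_size` and the output shape, because arrays are indexed rows first.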