Embedding Models
Discover the embedding models supported by Hyper.
Hyper supports a variety of embedding models, each selected for its strength with a particular data type, from text to images, so applications can use the capabilities best suited to their data analysis and query needs.
Supported Models
Hyper's model support covers a wide range of use cases, making it easy to integrate the model best suited to the characteristics of the data being processed.
Text Embeddings
Model | Developer | Embedding Size | Hyper Slug |
---|---|---|---|
ada v2 | OpenAI | 1536 | `OPENAI_ADA_V2` |
Curie | OpenAI | 4096 | `OPENAI_CURIE` |
BERT Base | Google | 768 | `GOOGLE_BERT_BASE` |
GPT-3 Small | OpenAI | 768 | `OPENAI_GPT3_SMALL` |
DistilBERT | Hugging Face | 768 | `HF_DISTILBERT` |
Image Embeddings
Model | Developer | Embedding Size | Hyper Slug |
---|---|---|---|
CLIP | OpenAI | 512 | `OPENAI_CLIP` |
Vision Transformer (ViT) | Google | 768 | `GOOGLE_VIT` |
Usage
To use a specific embedding model in Hyper, set the `embedding_model` parameter in the POST body of the `/embeddings` endpoint and other related API endpoints. If no model is specified, Hyper defaults to `OPENAI_ADA_V2`.
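As an illustrative sketch, a request to the `/embeddings` endpoint might be built as follows. The base URL, API key, authorization header, and `input` field name here are assumptions for the example, not part of Hyper's documented API; only the `embedding_model` parameter and its default come from this page.

```python
import json
import urllib.request

# Hypothetical deployment details -- substitute your own.
HYPER_API_URL = "https://api.hyper.example/v1/embeddings"
API_KEY = "YOUR_API_KEY"

def build_embeddings_request(text, embedding_model="OPENAI_ADA_V2"):
    """Build the POST body for the /embeddings endpoint.

    Hyper defaults to OPENAI_ADA_V2 when embedding_model is omitted,
    so this helper mirrors that default. The "input" field name is
    an assumption for illustration.
    """
    return {"input": text, "embedding_model": embedding_model}

body = build_embeddings_request("hello world", embedding_model="GOOGLE_BERT_BASE")
req = urllib.request.Request(
    HYPER_API_URL,
    data=json.dumps(body).encode("utf-8"),
    headers={
        "Content-Type": "application/json",
        "Authorization": f"Bearer {API_KEY}",
    },
    method="POST",
)
# urllib.request.urlopen(req) would send the request; omitted here.
```

Omitting `embedding_model` from the body produces the same result as passing `OPENAI_ADA_V2` explicitly.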
For vector searches, Hyper considers only the files that have embeddings generated with the model specified in the query. This ensures that the search results are relevant and accurately reflect the data's semantic meaning as processed by the chosen model.
For example, if files A and B have embeddings generated with `OPENAI_ADA_V2` and files C and D with `GOOGLE_BERT_BASE`, specifying `OPENAI_ADA_V2` as the `embedding_model` in your query limits the search results to files A and B.
It is crucial that all files targeted in a search have embeddings generated with the same model to maintain consistency and accuracy in the results.
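The filtering behavior described above can be sketched in plain Python. The file records and field names here are illustrative, not Hyper's actual storage schema; the example only mirrors the rule that a search considers files whose embeddings match the queried model.

```python
# Illustrative file records: each file stores the slug of the model
# that produced its embedding (field names are hypothetical).
files = [
    {"name": "A", "embedding_model": "OPENAI_ADA_V2"},
    {"name": "B", "embedding_model": "OPENAI_ADA_V2"},
    {"name": "C", "embedding_model": "GOOGLE_BERT_BASE"},
    {"name": "D", "embedding_model": "GOOGLE_BERT_BASE"},
]

def searchable_files(files, embedding_model):
    """Return only the files whose embeddings were generated
    with the model specified in the query."""
    return [f["name"] for f in files if f["embedding_model"] == embedding_model]

print(searchable_files(files, "OPENAI_ADA_V2"))  # ['A', 'B']
```

A query with `GOOGLE_BERT_BASE` would likewise return only files C and D; files embedded with a different model are never mixed into the same result set.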