Files
Learn about the various file formats Hyper supports for syncing text, image, and audio data with your vector database.
Supported File Formats for Vector Sync
Hyper's vector database can ingest data from several file formats, each processed according to its content type:
Text Files
Text files are processed for their textual content, suitable for NLP tasks and searchable vector embeddings:
- JSON (application/json): Structured data ideal for complex embeddings.
- PDF (application/pdf): Includes metadata and supports text coordinate extraction.
- CSV (text/csv): Easily converted into vector representations for each row.
- TXT (text/plain): Pure text for straightforward vectorization.
- MD (text/markdown): Markdown files processed as text.
- RTF (application/rtf): Rich Text Format converted to plain text.
- TSV (text/tab-separated-values): Similar to CSV for structured data representation.
- DOCX (application/vnd.openxmlformats-officedocument.wordprocessingml.document): Text content extraction for document embeddings.
- XLSX (application/vnd.openxmlformats-officedocument.spreadsheetml.sheet): Parses sheets into structured data.
- PPTX (application/vnd.openxmlformats-officedocument.presentationml.presentation): Presentation text useful for creating searchable content.
Image Files
Images are analyzed for content and text, enabling visual search and classification:
- JPG (image/jpeg)
- PNG (image/png)
Audio Files
Audio and video files are transcribed to text, allowing for searchable audio content within the vector database:
- MP3 (audio/mpeg)
- MP4 (video/mp4)
- WAV (audio/wav)
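When scripting uploads, it can be convenient to derive the MIME type from a file's extension. The following is a minimal sketch, not part of Hyper's API: the function name hyper_mime_type is hypothetical, the mapping simply mirrors the lists above, and accepting .jpeg alongside .jpg is an added assumption.

```shell
#!/bin/sh
# Hypothetical helper (not part of Hyper's API): print the MIME type for a
# file name, based on the supported formats listed above.
hyper_mime_type() {
  # Keep only the text after the last dot, then lowercase it.
  ext=$(printf '%s' "${1##*.}" | tr '[:upper:]' '[:lower:]')
  case "$ext" in
    json)     echo "application/json" ;;
    pdf)      echo "application/pdf" ;;
    csv)      echo "text/csv" ;;
    txt)      echo "text/plain" ;;
    md)       echo "text/markdown" ;;
    rtf)      echo "application/rtf" ;;
    tsv)      echo "text/tab-separated-values" ;;
    docx)     echo "application/vnd.openxmlformats-officedocument.wordprocessingml.document" ;;
    xlsx)     echo "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet" ;;
    pptx)     echo "application/vnd.openxmlformats-officedocument.presentationml.presentation" ;;
    jpg|jpeg) echo "image/jpeg" ;;   # .jpeg is an assumption; the list names JPG only
    png)      echo "image/png" ;;
    mp3)      echo "audio/mpeg" ;;
    mp4)      echo "video/mp4" ;;
    wav)      echo "audio/wav" ;;
    *)        echo "unsupported file extension: $ext" >&2; return 1 ;;
  esac
}
```

For example, hyper_mime_type report.PDF prints application/pdf, while an unlisted extension such as .zip returns a non-zero status so a calling script can stop before uploading.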
Uploading Files to the Vector Database
Upload files to Hyper to automatically process and store their data as vectors in the database, enhancing searchability and analysis.
How to Upload
Use the /v1/files endpoint to upload files. Send the request as multipart/form-data and include the file along with its MIME type in the type form field:
curl --request POST \
--url https://api.gethyper.ai/v1/files \
--header 'Authorization: Bearer YOUR_HYPER_API_KEY' \
--header 'Content-Type: multipart/form-data' \
--form 'file=@"/path/to/your/file.jpg"' \
--form 'type="image/jpeg"'
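The request above can be wrapped in a small shell function for reuse in scripts. This is a sketch under stated assumptions: the function name hyper_upload and the HYPER_API_KEY and HYPER_CURL environment variables are hypothetical conventions introduced here, not something Hyper defines. The explicit Content-Type header is left out because curl sets a multipart/form-data header itself whenever --form is used.

```shell
#!/bin/sh
# Hypothetical wrapper around the upload request (not an official client).
# Usage: hyper_upload FILE MIME_TYPE
# HYPER_API_KEY must hold your API key. HYPER_CURL optionally replaces the
# curl binary, e.g. HYPER_CURL=echo to print the arguments as a dry run.
hyper_upload() {
  [ -n "${HYPER_API_KEY:-}" ] || { echo "HYPER_API_KEY is not set" >&2; return 1; }
  [ "$#" -eq 2 ] || { echo "usage: hyper_upload FILE MIME_TYPE" >&2; return 1; }
  # --fail makes curl exit non-zero on HTTP error responses;
  # --silent with --show-error hides the progress meter but keeps error messages.
  "${HYPER_CURL:-curl}" --fail --silent --show-error \
    --request POST \
    --url https://api.gethyper.ai/v1/files \
    --header "Authorization: Bearer $HYPER_API_KEY" \
    --form "file=@\"$1\"" \
    --form "type=\"$2\""
}
```

A call such as hyper_upload /path/to/your/file.jpg image/jpeg then sends the same fields as the curl command above, and the function's exit status can be used to detect a failed upload.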
File Type Specification
It's crucial to specify the file type in the type field when uploading, as this tells Hyper how to process and integrate the file into the vector database:
{
"type": "image/jpeg"
}
Note: The maximum file size is 20 MB for text and image files; larger files can be accommodated upon request. Audio and video files are transcribed to text, with both raw and processed data stored for efficient retrieval.
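The size limit in the note above can be checked locally before uploading. The sketch below rests on assumptions: hyper_check_size and HYPER_MAX_BYTES are hypothetical names, "20 MB" is read as 20 * 1024 * 1024 bytes (the note does not say whether the limit is binary or decimal), and audio and video types are skipped because no limit is stated for them.

```shell
#!/bin/sh
# Assumption: "20 MB" means 20 * 1024 * 1024 bytes.
HYPER_MAX_BYTES=$((20 * 1024 * 1024))

# Hypothetical pre-upload check (not part of Hyper's API).
# Usage: hyper_check_size FILE MIME_TYPE
hyper_check_size() {
  file=$1
  mime=$2
  [ -f "$file" ] || { echo "no such file: $file" >&2; return 1; }
  case "$mime" in
    # No limit is documented for audio or video, so these are not checked.
    audio/*|video/*) return 0 ;;
  esac
  # wc -c gives the byte count; tr removes padding that some systems add.
  size=$(wc -c < "$file" | tr -d ' ')
  if [ "$size" -gt "$HYPER_MAX_BYTES" ]; then
    echo "$file is $size bytes; limit is $HYPER_MAX_BYTES bytes" >&2
    return 1
  fi
}
```

Running hyper_check_size before the upload request lets a script fail fast on oversized text or image files instead of waiting for the server to reject them.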