🌟 Features Overview
EmbeddingFramework offers a set of features for working with embeddings, vector databases, and cloud storage.
🔹 Multi-Vector Database Support
Easily switch between different vector database backends without changing your application logic.
Supported databases:
- ChromaDB – Local and persistent vector storage.
- Milvus – High-performance distributed vector database.
- Pinecone – Fully managed vector database service.
- Weaviate – Open-source vector search engine.
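Switching backends without touching application logic usually comes down to coding against a shared interface. The sketch below illustrates that idea only. `VectorStore`, `InMemoryStore`, and `index_and_search` are hypothetical names invented for this example and are not taken from EmbeddingFramework's API; the in-memory class stands in for a real adapter.

```python
from dataclasses import dataclass, field
from math import sqrt
from typing import Protocol, Sequence


class VectorStore(Protocol):
    """Hypothetical minimal interface that each backend adapter would implement."""

    def add(self, doc_id: str, vector: Sequence[float]) -> None: ...

    def query(self, vector: Sequence[float], top_k: int = 3) -> list[str]: ...


def _cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


@dataclass
class InMemoryStore:
    """Toy backend standing in for a real adapter (assumption, for illustration)."""

    _items: dict[str, list[float]] = field(default_factory=dict)

    def add(self, doc_id: str, vector: Sequence[float]) -> None:
        self._items[doc_id] = list(vector)

    def query(self, vector: Sequence[float], top_k: int = 3) -> list[str]:
        # Rank stored ids by similarity to the query vector, best first.
        ranked = sorted(
            self._items,
            key=lambda doc_id: _cosine(self._items[doc_id], vector),
            reverse=True,
        )
        return ranked[:top_k]


def index_and_search(store: VectorStore) -> list[str]:
    """Application logic: depends only on the interface, never on the backend."""
    store.add("a", [1.0, 0.0])
    store.add("b", [0.0, 1.0])
    store.add("c", [0.9, 0.1])
    return store.query([1.0, 0.0], top_k=2)
```

Under this design, moving to another backend would mean passing a different object that satisfies the same two methods; `index_and_search` itself would not change.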
🔹 Cloud Storage Integrations
Store and retrieve embeddings or documents from major cloud providers:
- AWS S3
- Google Cloud Storage (GCS)
- Azure Blob Storage
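One common way to address several providers from a single call site is to route on a URI scheme. The following is a minimal sketch of that routing step only, using the standard library. The scheme names (`s3`, `gs`, `az`) and the function `parse_storage_uri` are assumptions made for this example; they do not describe how EmbeddingFramework itself selects a provider.

```python
from urllib.parse import urlparse

# Assumed scheme-to-provider mapping, for illustration only.
PROVIDERS = {
    "s3": "AWS S3",
    "gs": "Google Cloud Storage",
    "az": "Azure Blob Storage",
}


def parse_storage_uri(uri: str) -> tuple[str, str, str]:
    """Split a storage URI into (provider name, bucket or container, object key)."""
    parsed = urlparse(uri)
    if parsed.scheme not in PROVIDERS:
        raise ValueError(f"unsupported storage scheme: {parsed.scheme!r}")
    # netloc holds the bucket/container; the path is the object key.
    return PROVIDERS[parsed.scheme], parsed.netloc, parsed.path.lstrip("/")
```

A real integration would then hand the bucket and key to the matching provider SDK; that step is left out here because it depends on credentials and client configuration.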
🔹 Embedding Providers
Generate embeddings from multiple providers:
- OpenAI Embeddings – State-of-the-art embedding generation.
- Easily extendable to other providers.
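Extensibility of this kind is often achieved with a small provider interface plus a registry. The sketch below shows one possible shape under that assumption. `EmbeddingProvider`, `HashingProvider`, `REGISTRY`, and `embed_with` are hypothetical names, and the hashing provider is a deterministic toy that produces no semantically meaningful vectors.

```python
import hashlib
from typing import Protocol


class EmbeddingProvider(Protocol):
    """Hypothetical interface: turn a batch of texts into fixed-size vectors."""

    def embed(self, texts: list[str]) -> list[list[float]]: ...


class HashingProvider:
    """Toy provider for illustration; derives values from a SHA-256 digest."""

    def __init__(self, dim: int = 8) -> None:
        if not 1 <= dim <= 32:  # a SHA-256 digest has 32 bytes
            raise ValueError("dim must be between 1 and 32")
        self.dim = dim

    def embed(self, texts: list[str]) -> list[list[float]]:
        vectors = []
        for text in texts:
            digest = hashlib.sha256(text.encode("utf-8")).digest()
            # Scale the first `dim` bytes into the range [0, 1].
            vectors.append([byte / 255.0 for byte in digest[: self.dim]])
        return vectors


# Adding a provider means registering another object with an `embed` method.
REGISTRY: dict[str, EmbeddingProvider] = {"hashing": HashingProvider()}


def embed_with(name: str, texts: list[str]) -> list[list[float]]:
    """Look up a provider by name and embed the given texts."""
    return REGISTRY[name].embed(texts)
```

A production provider would wrap a remote API call inside `embed`; the calling code would stay the same.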
🔹 File Processing & Preprocessing
- Automatic file type detection.
- Text extraction from multiple formats.
- Preprocessing utilities for cleaning and normalizing text.
- Intelligent text splitting for optimal embedding performance.
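As an illustration of the cleaning and splitting steps listed above, here is a minimal sketch that normalizes whitespace and then cuts text into overlapping word windows. The function names, the word-based strategy, and the default sizes are assumptions chosen for this example; the framework's own splitter may work differently.

```python
import unicodedata


def normalize_text(text: str) -> str:
    """Apply Unicode NFKC normalization and collapse runs of whitespace."""
    return " ".join(unicodedata.normalize("NFKC", text).split())


def split_text(text: str, chunk_size: int = 200, overlap: int = 20) -> list[str]:
    """Split text into chunks of up to `chunk_size` words.

    Consecutive chunks share `overlap` words so that context spanning a
    chunk boundary is not lost.
    """
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("require chunk_size > 0 and 0 <= overlap < chunk_size")
    words = normalize_text(text).split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start : start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the last window already reached the end of the text
    return chunks
```

Overlap trades a little redundancy for continuity: a sentence that straddles a boundary appears in both neighboring chunks, so either embedding can still match it.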
🔹 Utilities
- Retry logic for robust API calls.
- File utilities for safe and efficient I/O.
- Modular architecture for easy extension.
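Retry logic for API calls is commonly written as a decorator with exponential backoff. The following is a generic sketch of that pattern, not the framework's own utility. The decorator name `retry` and its parameters (`max_attempts`, `base_delay`, `exceptions`, `sleep`) are assumptions introduced for this example.

```python
import functools
import time


def retry(max_attempts=3, base_delay=0.5, exceptions=(Exception,), sleep=time.sleep):
    """Retry the wrapped call, doubling the delay after each failed attempt."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(1, max_attempts + 1):
                try:
                    return func(*args, **kwargs)
                except exceptions:
                    if attempt == max_attempts:
                        raise  # out of attempts: surface the last error
                    # Wait base_delay, then 2x, 4x, ... before trying again.
                    sleep(base_delay * 2 ** (attempt - 1))

        return wrapper

    return decorator
```

The `sleep` parameter is injectable so the behavior can be exercised without real waiting. In practice, `exceptions` would be narrowed to transient errors such as timeouts, so that permanent failures are not retried.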
📚 Next Steps
- Learn more about Vector Databases
- Explore Cloud Storage
- Check Embedding Providers
- Understand File Processing
- Discover Utilities