What Happened
A team of engineers at an unnamed company reported a roughly 90% reduction in embedding storage costs after replacing Amazon S3 with PostgreSQL. The original system stored embeddings as flat files in S3, which the team found expensive both to store and to retrieve. After migrating to PostgreSQL with the pgvector extension, they could run vector search and fetch individual embeddings on demand instead of pulling bulk files, which also reduced latency for RAG and recommender system workloads.
Technical Details
The key insight was that many embeddings, particularly those used in RAG pipelines, are accessed infrequently or on demand. Flat files in S3 carry no index, so serving a lookup or a similarity search means fetching and scanning whole files, and the team saw storage and retrieval costs climb as the embedding index grew. PostgreSQL with pgvector allowed the team to store embeddings as indexed vectors, enabling efficient similarity search with minimal overhead. The migration involved:
- Schema design: Creating a table with columns for vector data (using pgvector), metadata, and timestamps.
- Indexing: Using IVFFlat or HNSW indexes for fast approximate nearest neighbor search.
- Query optimization: Leveraging PostgreSQL's query planner for efficient filtering and ranking.
- Cost savings: Eliminating S3 storage fees and reducing data transfer costs.
The team reported no degradation in retrieval latency for RAG queries, and the approach scaled well for millions of embeddings.
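The schema and indexing steps above can be sketched as a small helper that emits the DDL. This is a minimal sketch, not the team's actual schema: the table name `product_embeddings`, its column names, and the `lists = 100` setting are assumptions made for illustration, while the `vector` type, the `hnsw` and `ivfflat` index methods, and `vector_cosine_ops` come from pgvector.

```python
# Hypothetical table and column names; the article does not publish its schema.

def schema_statements(dim: int = 768, index: str = "hnsw") -> list[str]:
    """Return DDL for a pgvector-backed embeddings table plus one ANN index."""
    if index not in ("hnsw", "ivfflat"):
        raise ValueError("index must be 'hnsw' or 'ivfflat'")
    dim = int(dim)
    statements = [
        # Enables the vector type and its index methods.
        "CREATE EXTENSION IF NOT EXISTS vector",
        (
            "CREATE TABLE IF NOT EXISTS product_embeddings ("
            "id bigserial PRIMARY KEY, "
            f"embedding vector({dim}) NOT NULL, "
            "metadata jsonb NOT NULL DEFAULT '{}'::jsonb, "
            "created_at timestamptz NOT NULL DEFAULT now())"
        ),
    ]
    if index == "hnsw":
        statements.append(
            "CREATE INDEX IF NOT EXISTS product_embeddings_hnsw "
            "ON product_embeddings USING hnsw (embedding vector_cosine_ops)"
        )
    else:
        # lists = 100 is a placeholder (assumption); tune it to the row count.
        statements.append(
            "CREATE INDEX IF NOT EXISTS product_embeddings_ivfflat "
            "ON product_embeddings USING ivfflat (embedding vector_cosine_ops) "
            "WITH (lists = 100)"
        )
    return statements
```

Each statement can be passed to a database cursor's `execute` in order. An IVFFlat index is best created after the table already holds data, since its lists are derived from the rows present at build time.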
Retail & Luxury Implications
For retail and luxury companies, this cost-saving approach has direct applications in:
- Product recommendation systems: Embeddings for product images, descriptions, and user behavior can be stored and queried efficiently in PostgreSQL, reducing infrastructure costs for personalization engines.
- Visual search: Luxury houses such as Gucci or Louis Vuitton can power visual search with image embeddings, and PostgreSQL with pgvector can store and query these at a fraction of the cost of keeping them in cloud blob storage.
- RAG-based customer service: Embeddings for product catalogs, FAQs, and policy documents power RAG chatbots. Using PostgreSQL instead of S3 cuts storage costs without affecting response times.
- Inventory management: Embeddings for product attributes (size, color, material) can be indexed for fast filtering and retrieval, improving supply chain efficiency.
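Several of these use cases combine an attribute filter with a similarity ranking. The sketch below builds such a query under stated assumptions: a hypothetical `product_embeddings` table with `embedding` and `metadata` (jsonb) columns, and psycopg-style `%s` placeholders. The `<=>` operator is pgvector's cosine distance.

```python
def to_vector_literal(values):
    """Format floats as pgvector's text input form, e.g. '[0.1,0.2]'."""
    return "[" + ",".join(repr(float(v)) for v in values) + "]"


def filtered_similarity_query(query_vector, filters, k=10):
    """Build (sql, params): exact-match metadata filters, then cosine ranking."""
    vec = to_vector_literal(query_vector)
    where_parts = []
    where_params = []
    for key, value in sorted(filters.items()):
        # metadata ->> key extracts a JSON field as text for comparison.
        where_parts.append("metadata ->> %s = %s")
        where_params.extend([key, value])
    where_sql = (" WHERE " + " AND ".join(where_parts)) if where_parts else ""
    sql = (
        "SELECT id, metadata, embedding <=> %s::vector AS distance "
        "FROM product_embeddings"
        + where_sql
        + " ORDER BY embedding <=> %s::vector LIMIT %s"
    )
    # Parameter order follows placeholder order in the SQL text.
    params = [vec, *where_params, vec, k]
    return sql, params
```

A call such as `filtered_similarity_query(image_vector, {"color": "black", "material": "leather"}, k=5)` returns a statement and parameter list for `cursor.execute`. With an approximate index, filtering happens after the index scan, so a query can return fewer than `k` rows when the filter is selective.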
Business Impact
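Cost reduction is the primary benefit. For a typical retail RAG system storing 10 million 768-dimensional embeddings (the output size of BERT-base-style encoders; OpenAI's text-embedding-ada-002 produces 1536 dimensions), S3 costs are put at $200-$500/month before data transfer fees, against $20-$50/month for PostgreSQL with pgvector, depending on instance size. Treat these figures as rough: 10 million 768-dimensional float32 vectors occupy only about 31 GB raw, so an S3 bill of that size would be driven mainly by request and retrieval charges rather than by storage at rest. The approach also simplifies the tech stack, since no separate vector database or blob storage is needed, which reduces operational complexity.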
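Estimates like these are easy to sanity-check from the vector count and dimension. The sketch below assumes float32 values and uses pgvector's documented per-vector size of 4 × dimensions + 8 bytes. The price per GB-month is a caller-supplied assumption, not a quoted rate, and the function names are illustrative.

```python
def raw_vector_bytes(count: int, dim: int, bytes_per_value: int = 4) -> int:
    """Size of the bare float32 values, with no file or row overhead."""
    return count * dim * bytes_per_value


def pgvector_bytes(count: int, dim: int) -> int:
    """Vector column size only; excludes row headers, metadata, and indexes."""
    return count * (4 * dim + 8)


def monthly_storage_cost(total_bytes: int, usd_per_gb_month: float) -> float:
    """Storage-at-rest cost for a given price per (decimal) GB-month."""
    return total_bytes / 1e9 * usd_per_gb_month


if __name__ == "__main__":
    raw = raw_vector_bytes(10_000_000, 768)
    print(f"raw vectors: {raw / 1e9:.2f} GB")
    print(f"pgvector column: {pgvector_bytes(10_000_000, 768) / 1e9:.2f} GB")
    # 0.023 USD per GB-month is an illustrative placeholder; check current pricing.
    print(f"storage at rest: ${monthly_storage_cost(raw, 0.023):.2f}/month")
```

Index size, request charges, data transfer, and the database instance itself are not modeled here, and in practice those terms usually dominate the bill on both sides of the comparison.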
Implementation Approach
- Assess current infrastructure: Identify embeddings stored in S3 or other blob storage.
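- Set up PostgreSQL with pgvector: Use a managed service (e.g., Amazon RDS for PostgreSQL, which supports the pgvector extension) or self-host.
- Design schema: Create a table with columns for vector, metadata, and timestamps. Choose the index to match the workload: IVFFlat builds faster and uses less memory, while HNSW generally gives a better speed-recall tradeoff at query time at the cost of slower builds and higher memory use.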
- Migrate data: Export embeddings from S3, batch insert into PostgreSQL.
- Update application code: Modify RAG or recommender system to query PostgreSQL instead of S3.
- Monitor performance: Track query latency and storage costs.
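The migration step can be sketched as a batched insert over any DB-API cursor. This is an illustration under assumptions, not the team's pipeline: the `product_embeddings` table, the `(vector, metadata)` row shape, and the batch size are hypothetical, and reading the flat files out of S3 is left to the caller.

```python
import json
from itertools import islice

# Hypothetical target table; placeholders follow the psycopg "%s" style.
INSERT_SQL = (
    "INSERT INTO product_embeddings (embedding, metadata) "
    "VALUES (%s::vector, %s::jsonb)"
)


def to_vector_literal(values):
    """Format floats as pgvector's text input form, e.g. '[0.1,0.2]'."""
    return "[" + ",".join(repr(float(v)) for v in values) + "]"


def batched(rows, size):
    """Yield lists of at most `size` items from any iterable."""
    iterator = iter(rows)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk


def migrate(cursor, rows, batch_size=1000):
    """Insert (vector, metadata_dict) pairs in batches; return the row count."""
    total = 0
    for chunk in batched(rows, batch_size):
        cursor.executemany(
            INSERT_SQL,
            [(to_vector_literal(vec), json.dumps(meta)) for vec, meta in chunk],
        )
        total += len(chunk)
    return total
```

Here `rows` can be a generator that streams embeddings exported from S3, so the full dataset never has to sit in memory. For very large loads, PostgreSQL's `COPY` is generally faster than batched inserts, and building the vector index after loading avoids maintaining it row by row during the migration.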
Governance & Risk Assessment
- Data privacy: Embeddings may contain sensitive information (e.g., user behavior, product details). Ensure PostgreSQL is configured with encryption at rest and in transit.
- Bias: Embeddings can encode biases from training data. Regularly audit for fairness, especially in recommendation systems.
- Maturity: The approach is production-ready for medium-scale systems (millions of embeddings). For billions, specialized vector databases (e.g., Pinecone, Weaviate) may still be necessary.
gentic.news Analysis
This article from a Medium blog is a practical case study, not a peer-reviewed paper. The ~90% cost reduction claim is plausible for systems where embeddings are stored in S3 without optimization, but results will vary based on access patterns and scale. The key takeaway for retail AI practitioners is that PostgreSQL with pgvector is a viable, cost-effective alternative to both S3 and specialized vector databases for many RAG and recommender system use cases.
Retailers should evaluate their embedding storage costs and access patterns. For systems with frequent on-demand queries (e.g., real-time product recommendations), PostgreSQL offers lower latency than S3. For batch processing or archival, S3 may still be cheaper. The approach aligns with the broader trend of simplifying AI infrastructure by using general-purpose databases instead of specialized tools.
The article does not disclose the scale of the system or the specific query patterns, so readers should test with their own data. However, the approach is well-documented in open-source communities and has reportedly been used in production by companies such as Shopify and Instacart for similar use cases.
Related coverage: We've previously covered how Retrieval-Augmented Generation (RAG) systems benefit from efficient embedding storage, and how recommender systems can reduce costs with simpler infrastructure. This case study provides a concrete example of cost optimization in practice.
Source: medium.com









