Onyx: Open-Source AI Enterprise Search Challenges Glean's $7.2B Valuation

Open-source platform Onyx provides self-hosted AI enterprise search connecting to 40+ tools, offering a free alternative to Glean's $50/user/month SaaS. Backed by YC and $10M seed funding, it's used by Netflix and Ramp.

Enterprise knowledge fragmentation is a $100B+ productivity drain. The average employee spends 3.6 hours daily searching across Slack, Google Drive, Confluence, Salesforce, and dozens of other tools. Glean has built a $7.2 billion business solving this problem with AI-powered enterprise search, charging $50 per user per month with minimum $50,000 annual contracts.

Now, open-source alternative Onyx is challenging that model with a self-hosted, MIT-licensed platform that keeps sensitive data on-premises while offering comparable AI capabilities.

Key Takeaways

  • Open-source platform Onyx provides self-hosted AI enterprise search connecting to 40+ tools, offering a free alternative to Glean's $50/user/month SaaS.
  • Backed by YC and $10M seed funding, it's used by Netflix and Ramp.

What Onyx Actually Does

Onyx is an open-source AI platform for unified enterprise search and knowledge retrieval. Unlike traditional intranet search that relies on keywords, Onyx uses agentic RAG (Retrieval-Augmented Generation) to understand natural language queries and retrieve relevant information across all connected data sources.

Core capabilities include:

  • 40+ native integrations with Slack, Google Drive, Confluence, Salesforce, Gmail, Jira, GitHub, Notion, Zendesk, Gong, Teams, Dropbox, and more
  • AI chat with citations that answers questions from all company data
  • Deep Research mode for multi-step research across connected sources
  • Custom AI agents with unique instructions, knowledge, and actions
  • Web search integration combining internal knowledge with live internet results
  • Code execution in sandboxed Python containers for data analysis
  • Enterprise authentication via SSO (Google, OIDC, SAML), RBAC, and SCIM provisioning
  • LLM agnostic - works with OpenAI, Anthropic, Gemini, or self-hosted models via Ollama

The Business Model Disruption

Glean's success demonstrates the market demand: $600M in venture funding, a $7.2B valuation, and enterprise contracts starting at $50,000 annually. At $50 per user per month, a 100-person company pays about $60,000 per year for SaaS access to its own data.

Onyx offers two tiers:

  • Community (MIT License): free. Includes RAG, AI chat, agents, search, web browsing, and 40+ integrations.
  • Enterprise (commercial license): priced at a "fraction of Glean". Adds permission-awareness, analytics, whitelabeling, and priority support.

The financial difference is stark. While Glean requires minimum $50,000 annual contracts, Onyx Community Edition is completely free for self-hosting. Even the Enterprise Edition costs "well under" $60,000/year for a 100-person company.
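The arithmetic behind these figures can be sketched in a few lines. This is a rough illustration, not a pricing tool: the two constants come from the numbers quoted above, treating the $50,000 minimum as a floor on the per-seat total is an assumption, and both function names are hypothetical.

```python
# Figures quoted in this article; how they combine is an assumption.
GLEAN_PER_USER_MONTHLY = 50   # USD per user per month
GLEAN_MIN_ANNUAL = 50_000     # USD minimum annual contract


def glean_annual_cost(users: int) -> int:
    """Per-seat cost for a year, floored at the contract minimum (assumed)."""
    return max(users * GLEAN_PER_USER_MONTHLY * 12, GLEAN_MIN_ANNUAL)


def self_hosting_headroom(users: int, onyx_license: int = 0) -> int:
    """Annual budget left for running Onyx before the total matches Glean."""
    return glean_annual_cost(users) - onyx_license


for users in (50, 100, 500):
    print(users, glean_annual_cost(users), self_hosting_headroom(users))
```

The headroom figure is the number to compare against internal DevOps time and LLM spend, since the Community Edition's license cost is zero but its operating cost is not.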

Technical Architecture & Deployment

Onyx is designed for on-premises deployment with Docker, claiming setup in 30 minutes. The architecture ensures sensitive documents never leave company infrastructure, addressing security concerns that prevent many enterprises from adopting cloud-based AI search solutions.

Key technical differentiators:

  1. Agentic RAG vs. Keyword Search: Traditional enterprise search matches keywords; Onyx's AI understands context and intent, retrieving semantically relevant information even when exact terms don't match.

  2. Multi-source Intelligence: The platform can combine information from Slack conversations, Google Docs, Jira tickets, and Salesforce records to answer complex questions like "What did we decide about pricing for enterprise customers?"

  3. Self-hosted LLM Support: While supporting commercial APIs, Onyx works with locally hosted models via Ollama, enabling completely air-gapped deployments for highly regulated industries.
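To make the first two points concrete, here is a toy sketch of the difference between a single keyword lookup and a retrieve-then-reformulate loop across several sources. It is not Onyx's implementation: the corpus, the `RELATED_TERMS` table (a hand-written stand-in for what an embedding model or LLM would supply), and all function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Doc:
    source: str  # which connected tool the text came from
    text: str


# Hypothetical mini-corpus standing in for connected tools.
CORPUS = [
    Doc("slack", "Agreed in the sync: enterprise tier moves to annual billing."),
    Doc("jira", "PRICE-12: update enterprise plan cost on the website."),
    Doc("gdocs", "Offsite agenda and travel logistics."),
]

# Stand-in for semantic knowledge; a real system would not hard-code this.
RELATED_TERMS = {"pricing": ["cost", "billing"]}


def keyword_search(query: str, corpus: list[Doc]) -> list[Doc]:
    """Return docs that literally contain every query term."""
    terms = query.lower().split()
    return [d for d in corpus if all(t in d.text.lower() for t in terms)]


def reformulate(query: str) -> list[str]:
    """Produce variants by swapping one term for a related term."""
    terms = query.lower().split()
    return [
        " ".join(terms[:i] + [alt] + terms[i + 1:])
        for i, term in enumerate(terms)
        for alt in RELATED_TERMS.get(term, [])
    ]


def agentic_search(query: str, corpus: list[Doc], max_rounds: int = 2) -> list[Doc]:
    """Retrieve; if nothing comes back, rewrite the query and try again."""
    frontier = [query]
    for _ in range(max_rounds):
        hits: list[Doc] = []
        for q in frontier:
            for doc in keyword_search(q, corpus):
                if doc not in hits:
                    hits.append(doc)
        if hits:
            return hits
        frontier = [v for q in frontier for v in reformulate(q)]
    return []


def cite(hits: list[Doc]) -> list[str]:
    """Format each hit with its source so an answer can carry citations."""
    return [f"[{d.source}] {d.text}" for d in hits]


print(keyword_search("enterprise pricing", CORPUS))
print(cite(agentic_search("enterprise pricing", CORPUS)))
```

The literal search finds nothing because no document contains the word "pricing", while the loop recovers the Jira ticket and the Slack message after one reformulation. A production system would replace the lookup table with embeddings or an LLM-driven query rewrite.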

Adoption & Backing

Onyx has significant traction:

  • 27,700+ GitHub stars indicating strong developer interest
  • Production users including Netflix, Ramp, and Thales Group
  • Y Combinator backing (W24 batch)
  • $10M seed funding from Khosla Ventures and First Round Capital

These credentials suggest serious enterprise readiness, not just hobbyist open-source software.

Market Context: The Enterprise AI Search Landscape

The enterprise knowledge management market is heating up. Glean's $7.2B valuation reflects investor confidence in AI-powered workplace search. Microsoft has integrated similar capabilities into Copilot for Microsoft 365, while startups like Sierra and Tome are approaching the problem from different angles.

Onyx represents the open-source counter-movement to venture-backed SaaS solutions. Similar patterns have emerged in other AI infrastructure categories:

  • Vector databases: Pinecone (SaaS) vs. Weaviate/Qdrant (open-source)
  • LLM orchestration: LangChain (open-source) vs. various proprietary platforms
  • Model hosting: OpenAI API vs. self-hosted Llama/Mistral models

The success of Onyx will test whether enterprises prioritize cost and control (open-source, self-hosted) over convenience and support (SaaS).

Limitations & Considerations

Onyx is promising, but running it yourself requires:

  • Internal DevOps resources for deployment and maintenance
  • LLM cost management whether using commercial APIs or self-hosted models
  • Permission mapping to ensure sensitive data isn't exposed in search results
  • Ongoing integration maintenance as source systems update their APIs

The Enterprise Edition addresses some of these with advanced permission-awareness and support, but the Community Edition puts the operational burden entirely on the deploying organization.

gentic.news Analysis

This development represents a significant challenge to the SaaS-dominated enterprise AI market. Glean's $7.2B valuation assumes defensibility through network effects and data accumulation, but Onyx's open-source approach attacks the very premise: why should companies pay premium SaaS fees when they can run comparable AI on their own infrastructure?

The timing is strategic. As enterprises grow increasingly concerned about data sovereignty and AI governance, open-source, self-hosted solutions address regulatory and compliance requirements that cloud-based alternatives struggle with. This follows the broader trend we've covered in "The Great On-Premises AI Revival" where financial services, healthcare, and government agencies are mandating on-premises AI deployments.

Onyx's backing by Khosla Ventures is particularly notable. The firm has been aggressively investing in infrastructure-layer AI companies, including recent investments in Modular and Together AI. Their participation suggests they see Onyx as more than just a feature—it's potential core infrastructure for the self-hosted AI stack.

The competitive dynamics here mirror the early days of Kubernetes vs. proprietary container platforms. Open-source won in infrastructure orchestration because enterprises wanted control and portability. Onyx is betting the same pattern will repeat in enterprise AI search.

However, the go-to-market challenge remains. Glean has hundreds of salespeople; Onyx has GitHub stars. The test will be whether Netflix and Ramp's production use cases provide enough enterprise credibility to overcome the "nobody got fired for buying Glean" mentality in large organizations.

Frequently Asked Questions

Is Onyx really free for commercial use?

Yes, the Community Edition is released under the MIT License, which permits commercial use, modification, and distribution without cost. However, enterprises typically need the permission-awareness, analytics, and support features in the Enterprise Edition, which carries a cost (though still "a fraction" of Glean's pricing).

How does Onyx handle data security and permissions?

The Community Edition provides basic search across connected data sources. The Enterprise Edition adds permission-awareness, meaning it respects existing access controls in source systems like Google Drive, Confluence, or SharePoint. For maximum security, companies can deploy Onyx with self-hosted LLMs via Ollama for a completely air-gapped solution.
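As a rough illustration of what permission-awareness means in practice, the sketch below filters search hits against access groups recorded at index time. It assumes a simple group-based ACL model; the `IndexedDoc` type, its fields, and `visible_to` are hypothetical and do not describe Onyx's actual data model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IndexedDoc:
    doc_id: str
    source: str
    allowed_groups: frozenset[str]  # ACL copied from the source system (assumed)


def visible_to(user_groups: set[str], results: list[IndexedDoc]) -> list[IndexedDoc]:
    """Keep only hits that share at least one group with the user."""
    return [d for d in results if d.allowed_groups & user_groups]


results = [
    IndexedDoc("d1", "google_drive", frozenset({"finance"})),
    IndexedDoc("d2", "confluence", frozenset({"engineering", "finance"})),
    IndexedDoc("d3", "sharepoint", frozenset({"hr"})),
]

# An engineer sees only the Confluence page.
print([d.doc_id for d in visible_to({"engineering"}, results)])
```

Filtering after retrieval is the simplest form of this idea. It only works if the stored groups stay in sync with the source systems, which is part of the operational burden a self-hosted deployment takes on.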

What's the catch with open-source enterprise software?

The main trade-off is operational responsibility. With SaaS like Glean, the vendor handles updates, scalability, and integration maintenance. With self-hosted Onyx, your team manages deployment, monitoring, and updates. The Enterprise Edition includes support to mitigate this, but the infrastructure burden remains on your organization.

Can Onyx replace Glean completely for an existing customer?

For technically sophisticated organizations with DevOps resources, yes. The feature overlap is substantial: both offer AI-powered search across multiple data sources with citations. However, migration would require re-implementing integrations and potentially retraining users. The business case depends on whether the cost savings justify the transition effort and ongoing maintenance overhead.

AI Analysis

Onyx represents a strategic attack on the economics of enterprise AI SaaS. By open-sourcing core RAG and agent capabilities, they're applying the classic open-source playbook: commoditize the infrastructure layer to capture value higher in the stack. This mirrors what happened with databases (MySQL vs. Oracle), web servers (Apache vs. proprietary), and now appears to be happening with AI-powered enterprise search.

Technically, the most interesting aspect is their "agentic RAG" approach. Most enterprise search systems still rely on semantic similarity search over embeddings. Agentic RAG implies multi-step reasoning, tool use, and potentially planning. These capabilities could significantly improve answer quality for complex queries. If Onyx has genuinely implemented this while maintaining performance, it represents a technical advancement beyond basic vector search implementations.

For practitioners, the key question is whether Onyx's architecture can scale to enterprise document volumes while maintaining low latency. The 40+ integrations suggest they've solved the connector problem, but production deployments at Netflix-scale will test whether their retrieval and ranking systems can handle billions of documents across permission boundaries. The decision between Community and Enterprise editions likely hinges on these scalability and permission-awareness requirements.
