Updated on January 05, 2025

Semantic Router January Update: v0.1.0

Tooling

Since its inception, Semantic Router has grown continually, and much of that growth has been requirements-driven. New capabilities have emerged organically, and while that has led to an incredibly flexible system, it has also made the codebase more complex.

This complexity is the opposite of what we believe Semantic Router should be. Semantic Router should be a simple, flexible, and easy-to-use open source library that allows us to build predictable, scalable, and efficient AI software.

With this in mind we began prioritizing smaller releases focused more on optimization and refactors rather than new features. It quickly became clear that a much more focused effort was needed to fulfil the promise of Semantic Router.

To achieve this vision for Semantic Router, we began working on the library's first major release, v0.1.0.

v0.1.0 is close to completion. The release will feature a massively overhauled codebase, major upgrades to existing features, and a broad range of optimizations. Perhaps most importantly, the codebase will be far more modular, allowing us to add new integrations like the PineconeIndex or HuggingfaceEncoder with minimal effort.

We're very excited to release v0.1.0 and wanted to take a moment to share what this release will bring to Semantic Router.


Modular Routers, Encoders, and Indexes

A big focus of v0.1.0 is modularity. The route layers are now called routers: the SemanticRouter (previously RouteLayer) and HybridRouter (previously HybridRouteLayer) both inherit from a shared base class, BaseRouter. We plan to add many more routers in the future, and this foundation will let us do so in a way that scales and guarantees consistency between routers.
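To illustrate the shared-base-class idea, here is a minimal sketch in plain Python. It is not the library's code: ToyBaseRouter, ToyOverlapRouter, ToyPrefixRouter, and the scoring rules are all hypothetical names invented for this example. The point is only that the base class owns the routing loop and threshold check, while each subclass supplies its own scoring.

```python
from abc import ABC, abstractmethod
from typing import Dict, List, Optional


class ToyBaseRouter(ABC):
    """Hypothetical stand-in for a shared router base class."""

    def __init__(self, routes: Dict[str, List[str]], threshold: float = 0.5):
        # routes maps a route name to its example utterances
        self.routes = routes
        self.threshold = threshold

    @abstractmethod
    def _score(self, query: str, utterance: str) -> float:
        """Subclasses only decide how a query is scored against an utterance."""

    def __call__(self, query: str) -> Optional[str]:
        # Shared logic: pick the best-scoring route, then apply the threshold.
        best_name, best_score = None, 0.0
        for name, utterances in self.routes.items():
            for utterance in utterances:
                score = self._score(query, utterance)
                if score > best_score:
                    best_name, best_score = name, score
        return best_name if best_score >= self.threshold else None


class ToyOverlapRouter(ToyBaseRouter):
    """Scores by Jaccard overlap between the two word sets."""

    def _score(self, query: str, utterance: str) -> float:
        q, u = set(query.lower().split()), set(utterance.lower().split())
        return len(q & u) / len(q | u) if q | u else 0.0


class ToyPrefixRouter(ToyBaseRouter):
    """Scores 1.0 when query and utterance share their first word."""

    def _score(self, query: str, utterance: str) -> float:
        q, u = query.lower().split(), utterance.lower().split()
        return 1.0 if q and u and q[0] == u[0] else 0.0


routes = {
    "weather": ["what is the weather today", "is it going to rain"],
    "greeting": ["hello there", "good morning"],
}
overlap_router = ToyOverlapRouter(routes)
prefix_router = ToyPrefixRouter(routes)
```

Both toy routers are called the same way, for example `overlap_router("what is the weather")`, which is the kind of consistency a shared base class is meant to guarantee.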

✅ First abstraction of route layers (routers) #465

⬜ Alignment of missing route behavior between indexes #497

Synchronization Logic for Local and Remote Indexes

A common problem with using remote indexes in Semantic Router is that a remote index may contain different data than what we have locally. This can lead to a lot of unexpected behavior and we found it very difficult to consistently handle in real-world production software.

To solve this problem, we've implemented synchronization logic that keeps local and remote indexes in sync. We've also added logic that lets us treat either the local or the remote index as the source of truth, from which we can quickly spin up new local or remote instances.

Different scenarios call for different synchronization strategies, so we found it incredibly helpful to have features within the library for diagnosing potential issues and making our data more transparent. To that end, we added various synchronization features, the majority of which are explained in the Syncing Routes notebook.
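The ideas in this section (a cheap sync check, a readable diff, and a choice of which side is the source of truth) can be sketched in a few lines of standard-library Python. This is a conceptual sketch under our own assumptions, not Semantic Router's implementation: the function names, the diff format, and the mode names "local", "remote", and "merge" are illustrative only.

```python
import hashlib
import json
from typing import Dict, List, Set, Tuple

Routes = Dict[str, List[str]]  # route name -> utterances


def routes_hash(routes: Routes) -> str:
    """Order-independent fingerprint of a route set."""
    canonical = {name: sorted(utts) for name, utts in routes.items()}
    payload = json.dumps(canonical, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def in_sync(local: Routes, remote: Routes) -> bool:
    """Fast check: compare two hashes instead of every utterance."""
    return routes_hash(local) == routes_hash(remote)


def flatten(routes: Routes) -> Set[Tuple[str, str]]:
    return {(name, utt) for name, utts in routes.items() for utt in utts}


def diff(local: Routes, remote: Routes) -> List[str]:
    """'+' marks local-only entries, '-' remote-only, ' ' entries on both sides."""
    local_pairs, remote_pairs = flatten(local), flatten(remote)
    lines = []
    for pair in sorted(local_pairs | remote_pairs):
        if pair in local_pairs and pair in remote_pairs:
            prefix = " "
        elif pair in local_pairs:
            prefix = "+"
        else:
            prefix = "-"
        lines.append(f"{prefix} {pair[0]}: {pair[1]}")
    return lines


def sync(local: Routes, remote: Routes, mode: str) -> Routes:
    """Return the route set both sides should hold after syncing."""
    if mode == "local":      # local is the source of truth
        chosen = flatten(local)
    elif mode == "remote":   # remote is the source of truth
        chosen = flatten(remote)
    elif mode == "merge":    # keep everything from both sides
        chosen = flatten(local) | flatten(remote)
    else:
        raise ValueError(f"unknown sync mode: {mode}")
    result: Routes = {}
    for name, utt in sorted(chosen):
        result.setdefault(name, []).append(utt)
    return result


local = {"politics": ["who is the president", "tell me about the election"]}
remote = {"politics": ["who is the president"], "chitchat": ["how are you"]}
report = diff(local, remote)
merged = sync(local, remote, "merge")
```

Hashing a canonical form keeps the common case (nothing changed) cheap, while the diff is only needed when the hashes disagree and someone has to decide which strategy to apply.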

Feature                                      PR / Issue
Fast sync check                              #460
Sync for function schemas and metadata       #473
Sync lock                                    #485
Async sync support                           #487
Adding all sync methods to the HybridRouter  #496
Add route threshold metadata to remotes      #438
Route-level sync                             #483

Full Async Support

The SemanticRouter and HybridRouter will now fully support async operations. This means you can use both routers with any async encoder and index.
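As a rough picture of what an async router, encoder, and index look like when composed, here is a self-contained asyncio sketch. Every class and method name in it (ToyAsyncEncoder, ToyAsyncIndex, ToyAsyncRouter, aencode, aadd, aquery, acall) is a hypothetical placeholder for this example, and the sleeps simply stand in for network calls.

```python
import asyncio
from typing import List, Optional, Tuple


class ToyAsyncEncoder:
    """Hypothetical encoder: one dimension per vocabulary word."""

    vocab = ["weather", "rain", "hello", "morning"]

    async def aencode(self, texts: List[str]) -> List[List[float]]:
        await asyncio.sleep(0.01)  # stands in for an embedding API call
        return [[float(w in t.lower().split()) for w in self.vocab] for t in texts]


class ToyAsyncIndex:
    """Hypothetical in-memory index with an async interface."""

    def __init__(self) -> None:
        self.items: List[Tuple[str, List[float]]] = []

    async def aadd(self, name: str, vectors: List[List[float]]) -> None:
        await asyncio.sleep(0.01)  # stands in for a remote upsert
        self.items.extend((name, v) for v in vectors)

    async def aquery(self, vector: List[float]) -> Optional[str]:
        await asyncio.sleep(0.01)  # stands in for a remote query
        best_name, best_score = None, 0.0
        for name, stored in self.items:
            score = sum(a * b for a, b in zip(vector, stored))
            if score > best_score:
                best_name, best_score = name, score
        return best_name


class ToyAsyncRouter:
    def __init__(self, encoder: ToyAsyncEncoder, index: ToyAsyncIndex) -> None:
        self.encoder = encoder
        self.index = index

    async def aadd(self, name: str, utterances: List[str]) -> None:
        await self.index.aadd(name, await self.encoder.aencode(utterances))

    async def acall(self, query: str) -> Optional[str]:
        [vector] = await self.encoder.aencode([query])
        return await self.index.aquery(vector)


async def main() -> List[Optional[str]]:
    router = ToyAsyncRouter(ToyAsyncEncoder(), ToyAsyncIndex())
    await router.aadd("weather", ["will it rain", "weather today"])
    await router.aadd("greeting", ["hello there", "good morning"])
    # The three queries are routed concurrently rather than one after another.
    return await asyncio.gather(
        router.acall("is rain coming"),
        router.acall("hello"),
        router.acall("stock prices"),
    )


results = asyncio.run(main())
```

Because every step awaits rather than blocks, many routing calls can share one event loop, which is where async support pays off in a web server handling concurrent requests.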

Feature                                           PR / Issue
Async support for sync methods and PineconeIndex  #487
Adding async support to the HybridRouter          #496

Upgrading the HybridRouter and Aligning Routers

The HybridRouter had previously been left behind as we focused on the core SemanticRouter. This release will bring the HybridRouter fully up to date with the SemanticRouter and standardize the methods between them, making it super easy to switch from SemanticRouter to the HybridRouter or vice versa.

Pairing this with the new modularity of indexes and encoders makes the HybridRouter fully compatible with almost every encoder and index type. So, you can use HybridRouter with BM25Encoder, HuggingfaceEncoder, and the LocalHybridIndex, or with the TfidfEncoder, OpenAIEncoder, and PineconeIndex. Almost every combination is possible.
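To make the "any dense encoder plus any sparse encoder" idea concrete, here is a small sketch of hybrid scoring with pluggable encoders. It is an illustration under our own assumptions rather than the HybridRouter's actual scoring code: the toy encoders, the function names, and the convex weighting with an alpha parameter are all chosen for this example.

```python
import math
from collections import Counter
from typing import Callable, Dict, List

DenseEncoder = Callable[[str], List[float]]
SparseEncoder = Callable[[str], Dict[str, float]]


def toy_dense(text: str) -> List[float]:
    """Toy dense encoder: unit-normalized letter frequencies (26 dims)."""
    counts = [0.0] * 26
    for ch in text.lower():
        if "a" <= ch <= "z":
            counts[ord(ch) - ord("a")] += 1.0
    norm = math.sqrt(sum(c * c for c in counts)) or 1.0
    return [c / norm for c in counts]


def toy_sparse(text: str) -> Dict[str, float]:
    """Toy sparse encoder: unit-normalized term counts keyed by word."""
    counts = Counter(text.lower().split())
    norm = math.sqrt(sum(c * c for c in counts.values())) or 1.0
    return {term: c / norm for term, c in counts.items()}


def dense_sim(a: List[float], b: List[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


def sparse_sim(a: Dict[str, float], b: Dict[str, float]) -> float:
    return sum(w * b.get(term, 0.0) for term, w in a.items())


def hybrid_score(
    query: str,
    utterance: str,
    alpha: float = 0.3,
    dense: DenseEncoder = toy_dense,
    sparse: SparseEncoder = toy_sparse,
) -> float:
    """Convex mix: alpha weights the dense score, (1 - alpha) the sparse score."""
    d = dense_sim(dense(query), dense(utterance))
    s = sparse_sim(sparse(query), sparse(utterance))
    return alpha * d + (1 - alpha) * s


same = hybrid_score("hello world", "hello world")
different = hybrid_score("hello world", "quarterly tax filing")
```

Any callable with the same input and output shape can be passed as `dense` or `sparse`, which is the property that lets encoders be mixed and matched.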

Feature                                              PR / Issue
First abstraction of route layers (routers)          #465
New AurelioBM25Encoder for easier sparse embeddings  #465
Vector shape alignment                               #489
Full alignment of the HybridRouter                   #496

New integrations

Our final step before releasing v0.1.0 is a full review of the currently open PRs for Semantic Router. Many of these PRs were written for pre-v0.1.0 versions of Semantic Router and will require some work to update for the latest release, but we believe that the new features and optimizations will make this effort incredibly worthwhile.

This is not an exhaustive list, but we're excited to bring the following integrations into Semantic Router v0.1.0:

Feature         PR / Issue
BedrockEncoder  #478
MilvusIndex     #457
VoyageEncoder   #255

Testing and docs

Our testing utilities and documentation are both important to the broader library. The current testing pipeline is very heavy, with several critical issues such as:

  • Little reuse of the same tests across different encoders, indexes, and routers.
  • No full mocking ability for PineconeIndex.
  • Limited coverage and too much duplication.

These are all areas we're actively working on improving and plan to have resolved before the v0.1.0 release.
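Two of the ideas above, reusing one test body across several implementations and mocking a remote index, can be sketched with Python's standard unittest and unittest.mock modules. This is a generic illustration and not the project's test suite: the toy encoders, the top_route helper, and the mocked query signature are assumptions made for the example.

```python
import unittest
from unittest.mock import MagicMock


class ToyLengthEncoder:
    """Hypothetical encoder: one float per document (its length)."""

    def __call__(self, docs):
        return [[float(len(d))] for d in docs]


class ToyVowelEncoder:
    """Hypothetical encoder: one float per document (its vowel count)."""

    def __call__(self, docs):
        return [[float(sum(ch in "aeiou" for ch in d.lower()))] for d in docs]


ENCODERS = [ToyLengthEncoder, ToyVowelEncoder]


def top_route(index, vector):
    """Code under test: asks an index for its single best match."""
    scores, names = index.query(vector, top_k=1)
    return names[0] if names else None


class SharedContractTest(unittest.TestCase):
    def test_one_vector_per_document(self):
        # One test body, run once per encoder via subTest.
        docs = ["hello", "semantic router"]
        for encoder_cls in ENCODERS:
            with self.subTest(encoder=encoder_cls.__name__):
                vectors = encoder_cls()(docs)
                self.assertEqual(len(vectors), len(docs))
                self.assertTrue(
                    all(isinstance(x, float) for v in vectors for x in v)
                )

    def test_top_route_with_mocked_index(self):
        # MagicMock replaces the remote index, so no network access is needed.
        index = MagicMock()
        index.query.return_value = ([0.9], ["greeting"])
        self.assertEqual(top_route(index, [1.0, 0.0]), "greeting")
        index.query.assert_called_once_with([1.0, 0.0], top_k=1)


def run_suite() -> unittest.TestResult:
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(SharedContractTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)


result = run_suite()
```

Adding a new encoder to the ENCODERS list extends coverage without duplicating any test code, and the mocked index lets router logic be tested without a live remote service.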

Docs are also critical. Semantic Router has been fairly under-documented, but we've recently deployed our new docs site and are actively documenting more of the library's features. Naturally, v0.1.0 will bring a lot of changes to our docs, so we're working on updating these ahead of the release.

Feature                                 PR / Issue
New docs site                           Link
Merge all router tests                  #496
Update all Jupyter notebooks for v0.1.0 #498
Full mock of PineconeIndex              #499

We're getting close to releasing v0.1.0 and are excited to share more as the release approaches.