| 1 | +# [Alpha] Vector Database |
| 2 | +**Warning**: This is an _experimental_ feature. To our knowledge, this is stable, but there are still rough edges in the experience. Contributions are welcome! |
| 3 | + |
| 4 | +## Overview |
| 5 | +A vector database lets users store and retrieve embeddings. Feast provides general-purpose APIs for both operations. |
| 6 | + |
| 7 | +## Integration |
| 8 | +Below are supported vector databases and implemented features: |
| 9 | + |
| 10 | +| Vector Database | Retrieval | Indexing | V2 Support* | Online Read | |
| 11 | +|-----------------|-----------|----------|-------------|-------------| |
| 12 | +| Pgvector | [x] | [ ] | [ ] | [ ] | |
| 13 | +| Elasticsearch | [x] | [x] | [ ] | [ ] | |
| 14 | +| Milvus | [x] | [x] | [x] | [x] | |
| 15 | +| Faiss | [ ] | [ ] | [ ] | [ ] | |
| 16 | +| SQLite | [x] | [ ] | [x] | [x] | |
| 17 | +| Qdrant | [x] | [x] | [ ] | [ ] | |
| 18 | + |
| 19 | +*Note: V2 Support means the SDK supports retrieval of features along with vector embeddings from vector similarity search. |
| 20 | + |
| 21 | +Note: SQLite support is in limited access and currently works only on Python 3.10. It will be updated as [sqlite_vec](https://github.com/asg017/sqlite-vec/) progresses. |
| 22 | + |
| 23 | +{% hint style="danger" %} |
| 24 | +We will be deprecating the `retrieve_online_documents` method in the SDK in the future. |
| 25 | +We recommend using the `retrieve_online_documents_v2` method instead, which offers easier vector index configuration |
| 26 | +directly in the Feature View and the ability to retrieve standard features alongside your vector embeddings for richer context injection. |
| 27 | + |
| 28 | +Long term we will collapse the two methods into one, but for now, we recommend using the `retrieve_online_documents_v2` method. |
| 29 | +Beyond that, `retrieve_online_documents` and `retrieve_online_documents_v2` will simply point to `get_online_features` for |
| 30 | +backwards compatibility and to adopt industry-standard naming conventions. |
| 31 | +{% endhint %} |
| 32 | + |
| 33 | +**Note**: Milvus and SQLite implement the v2 `retrieve_online_documents_v2` method in the SDK. This will be the longer-term solution so that Data Scientists can easily enable vector similarity search by just flipping a flag. |
| 34 | + |
| 35 | +## Examples |
| 36 | + |
| 37 | +- See the v0 [Rag Demo](https://github.com/feast-dev/feast-workshop/blob/rag/module_4_rag) for an example of how to use a vector database with the `retrieve_online_documents` method (migration and deprecation are planned). |
| 38 | +- See the v1 [Milvus Quickstart](../../examples/rag/milvus-quickstart.ipynb) for a quickstart guide on how to use Feast with Milvus using the `retrieve_online_documents_v2` method. |
| 39 | + |
| 40 | +### **Prepare offline embedding dataset** |
| 41 | +Run the following commands to prepare the embedding dataset: |
| 42 | +```shell |
| 43 | +python pull_states.py |
| 44 | +python batch_score_documents.py |
| 45 | +``` |
| 46 | +The output will be stored in `data/city_wikipedia_summaries.csv`. |
| 47 | + |
| 48 | +### **Initialize Feast feature store and materialize the data to the online store** |
| 49 | +Use the `feature_store.yaml` file to initialize the feature store. This uses the prepared data as the offline store and Milvus as the online store. |
| 50 | + |
| 51 | +```yaml |
| 52 | +project: local_rag |
| 53 | +provider: local |
| 54 | +registry: data/registry.db |
| 55 | +online_store: |
| 56 | + type: milvus |
| 57 | + path: data/online_store.db |
| 58 | + vector_enabled: true |
| 59 | + embedding_dim: 384 |
| 60 | + index_type: "IVF_FLAT" |
| 61 | + |
| 62 | + |
| 63 | +offline_store: |
| 64 | + type: file |
| 65 | +entity_key_serialization_version: 3 |
| 66 | +# By default, authentication and authorization use no_auth; other possible values are kubernetes and oidc. Refer to the documentation for more details. |
| 67 | +auth: |
| 68 | + type: no_auth |
| 69 | +``` |
| 70 | +Run the following command in terminal to apply the feature store configuration: |
| 71 | + |
| 72 | +```shell |
| 73 | +feast apply |
| 74 | +``` |
| 75 | + |
| 76 | +Note that when you run `feast apply` you are going to apply the following Feature View that we will use for retrieval later: |
| 77 | + |
| 78 | +```python |
| 79 | +document_embeddings = FeatureView( |
| 80 | + name="embedded_documents", |
| 81 | + entities=[item, author], |
| 82 | + schema=[ |
| 83 | + Field( |
| 84 | + name="vector", |
| 85 | + dtype=Array(Float32), |
| 86 | + # Look how easy it is to enable RAG! |
| 87 | + vector_index=True, |
| 88 | + vector_search_metric="COSINE", |
| 89 | + ), |
| 90 | + Field(name="item_id", dtype=Int64), |
| 91 | + Field(name="author_id", dtype=String), |
| 92 | + Field(name="created_timestamp", dtype=UnixTimestamp), |
| 93 | + Field(name="sentence_chunks", dtype=String), |
| 94 | + Field(name="event_timestamp", dtype=UnixTimestamp), |
| 95 | + ], |
| 96 | + source=rag_documents_source, |
| 97 | + ttl=timedelta(hours=24), |
| 98 | +) |
| 99 | +``` |
| 100 | + |
| 101 | +Let's use the SDK to write a data frame of embeddings to the online store: |
| 102 | +```python |
| 103 | +store.write_to_online_store(feature_view_name='city_embeddings', df=df) |
| 104 | +``` |
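The online store above is configured with `embedding_dim: 384`, so it can help to confirm that every embedding has that length before writing the data frame. The following is a minimal sketch under that assumption; `check_embedding_dims` is a hypothetical helper for illustration and is not part of the Feast SDK.

```python
EMBEDDING_DIM = 384  # assumed to match embedding_dim in feature_store.yaml


def check_embedding_dims(vectors, expected_dim=EMBEDDING_DIM):
    """Return the positions of vectors whose length differs from expected_dim."""
    return [i for i, vector in enumerate(vectors) if len(vector) != expected_dim]


# Toy data: the second vector is one element short.
vectors = [[0.0] * 384, [0.0] * 383]
bad_rows = check_embedding_dims(vectors)
print(bad_rows)  # [1]
```

With a data frame, the same check could be applied to the list of values in the embedding column before calling `write_to_online_store`.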
| 105 | + |
| 106 | +### **Prepare a query embedding** |
| 107 | +During inference (e.g., when a user submits a chat message) we need to embed the input text. This can be thought of as a feature transformation of the input data. In this example, we'll do this with a small Sentence Transformer from Hugging Face. |
| 108 | + |
| 109 | +```python |
| 110 | +import torch |
| 111 | +import torch.nn.functional as F |
| 112 | +from feast import FeatureStore |
| 113 | +from pymilvus import MilvusClient, DataType, FieldSchema |
| 114 | +from transformers import AutoTokenizer, AutoModel |
| 115 | +from example_repo import city_embeddings_feature_view, item |
| 116 | + |
| 117 | +TOKENIZER = "sentence-transformers/all-MiniLM-L6-v2" |
| 118 | +MODEL = "sentence-transformers/all-MiniLM-L6-v2" |
| 119 | + |
| 120 | +def mean_pooling(model_output, attention_mask): |
| 121 | + token_embeddings = model_output[ |
| 122 | + 0 |
| 123 | + ] # First element of model_output contains all token embeddings |
| 124 | + input_mask_expanded = ( |
| 125 | + attention_mask.unsqueeze(-1).expand(token_embeddings.size()).float() |
| 126 | + ) |
| 127 | + return torch.sum(token_embeddings * input_mask_expanded, 1) / torch.clamp( |
| 128 | + input_mask_expanded.sum(1), min=1e-9 |
| 129 | + ) |
| 130 | + |
| 131 | +def run_model(sentences, tokenizer, model): |
| 132 | + encoded_input = tokenizer( |
| 133 | + sentences, padding=True, truncation=True, return_tensors="pt" |
| 134 | + ) |
| 135 | + # Compute token embeddings |
| 136 | + with torch.no_grad(): |
| 137 | + model_output = model(**encoded_input) |
| 138 | + |
| 139 | + sentence_embeddings = mean_pooling(model_output, encoded_input["attention_mask"]) |
| 140 | + sentence_embeddings = F.normalize(sentence_embeddings, p=2, dim=1) |
| 141 | + return sentence_embeddings |
| 142 | + |
| 143 | +question = "Which city has the largest population in New York?" |
| 144 | + |
| 145 | +tokenizer = AutoTokenizer.from_pretrained(TOKENIZER) |
| 146 | +model = AutoModel.from_pretrained(MODEL) |
| 147 | +query_embedding = run_model(question, tokenizer, model).detach().cpu().numpy().tolist()[0] |
| 148 | +``` |
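To make the pooling step above concrete, here is a dependency-free sketch of the same idea for a single sentence: average the token vectors where the attention mask is 1, then scale the result to unit length. The names `masked_mean_pool` and `l2_normalize` are illustrative only, and the numbers are toy values rather than real model output.

```python
import math


def masked_mean_pool(token_embeddings, attention_mask):
    """Average token vectors, counting only positions where the mask is 1."""
    dim = len(token_embeddings[0])
    total = [0.0] * dim
    count = 0
    for vector, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vector):
                total[i] += value
    count = max(count, 1e-9)  # guard against division by zero, like torch.clamp above
    return [value / count for value in total]


def l2_normalize(vector):
    """Scale a vector to unit length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(v * v for v in vector))
    return [v / norm for v in vector] if norm else list(vector)


tokens = [[3.0, 0.0], [3.0, 8.0], [100.0, 100.0]]  # last row is padding
mask = [1, 1, 0]
pooled = masked_mean_pool(tokens, mask)  # [3.0, 4.0]; the padding row is ignored
unit = l2_normalize(pooled)  # approximately [0.6, 0.8]
print(pooled, unit)
```

Normalizing to unit length means that cosine similarity between two embeddings reduces to their dot product.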
| 149 | + |
| 150 | +### **Retrieve the top K similar documents** |
| 151 | +First create a feature store instance (for example, `store = FeatureStore(repo_path=".")`), then use the `retrieve_online_documents_v2` API to retrieve the top 3 documents most similar to the query embedding. |
| 152 | + |
| 153 | +```python |
| 154 | +context_data = store.retrieve_online_documents_v2( |
| 155 | + features=[ |
| 156 | + "city_embeddings:vector", |
| 157 | + "city_embeddings:item_id", |
| 158 | + "city_embeddings:state", |
| 159 | + "city_embeddings:sentence_chunks", |
| 160 | + "city_embeddings:wiki_summary", |
| 161 | + ], |
| 162 | + query=query_embedding, |
| 163 | + top_k=3, |
| 164 | + distance_metric='COSINE', |
| 165 | +).to_df() |
| 166 | +``` |
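Conceptually, a `COSINE` top-k search ranks stored vectors by their cosine similarity to the query. The sketch below is a brute-force illustration of that ranking on toy data, not how Milvus or Feast implement it (an index such as `IVF_FLAT` avoids scanning every vector); `cosine_similarity` and `top_k_similar` are hypothetical names.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k_similar(query, documents, k=3):
    """documents maps a document id to its vector. Returns [(id, score)], best first."""
    scored = [(doc_id, cosine_similarity(query, vec)) for doc_id, vec in documents.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


documents = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
results = top_k_similar([1.0, 0.1], documents, k=2)
print([doc_id for doc_id, _ in results])  # ['a', 'c']
```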
| 167 | +### **Generate the Response** |
| 168 | +Let's assume we have a base prompt, `BASE_PROMPT`, and a function called `format_documents` that formats the retrieved documents. We |
| 169 | +can then use them to generate the response with OpenAI's chat completion API. |
| 170 | +```python |
| 171 | +import os |
| 172 | +from openai import OpenAI |
| 173 | +FULL_PROMPT = format_documents(context_data, BASE_PROMPT) |
| 174 | + |
| 175 | +client = OpenAI( |
| 176 | + api_key=os.environ.get("OPENAI_API_KEY"), |
| 177 | +) |
| 178 | +response = client.chat.completions.create( |
| 179 | + model="gpt-4o-mini", |
| 180 | + messages=[ |
| 181 | + {"role": "system", "content": FULL_PROMPT}, |
| 182 | + {"role": "user", "content": question} |
| 183 | + ], |
| 184 | +) |
| 185 | + |
| 186 | +# And this will print the content. Look at the examples/rag/milvus-quickstart.ipynb for an end-to-end example. |
| 187 | +print('\n'.join([c.message.content for c in response.choices])) |
| 188 | +``` |
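The example above leaves `format_documents` and `BASE_PROMPT` undefined. One possible shape for them is sketched below; both are assumptions made for illustration, not the helper used in the quickstart notebook. The function accepts either a pandas DataFrame (through its `to_dict("records")` method) or a plain list of dictionaries, and reads the `sentence_chunks` column from the retrieval step.

```python
BASE_PROMPT = "Answer the question using only the context below."  # hypothetical prompt


def format_documents(context, base_prompt, text_column="sentence_chunks"):
    """Append one numbered line per retrieved row to the base prompt."""
    # A DataFrame exposes to_dict; a plain list of dicts is used as-is.
    rows = context.to_dict("records") if hasattr(context, "to_dict") else list(context)
    lines = [f"[{i}] {row[text_column]}" for i, row in enumerate(rows, start=1)]
    return base_prompt + "\n\nContext:\n" + "\n".join(lines)


# Placeholder rows standing in for retrieved documents.
rows = [
    {"sentence_chunks": "first retrieved chunk"},
    {"sentence_chunks": "second retrieved chunk"},
]
print(format_documents(rows, BASE_PROMPT))
```

Numbering the chunks makes it easier to ask the model to cite which chunk supports its answer.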
| 189 | + |
| 190 | +### Configuration and Installation |
| 191 | + |
| 192 | +We offer [Milvus](https://milvus.io/), [PGVector](https://github.com/pgvector/pgvector), [SQLite](https://github.com/asg017/sqlite-vec), [Elasticsearch](https://www.elastic.co) and [Qdrant](https://qdrant.tech/) as Online Store options for Vector Databases. |
| 193 | + |
| 194 | +Milvus offers a convenient local implementation for vector similarity search. To use Milvus, you can install the Feast package with the Milvus extra. |
| 195 | + |
| 196 | +#### Installation with Milvus |
| 197 | + |
| 198 | +```bash |
| 199 | +pip install feast[milvus] |
| 200 | +``` |
| 201 | +#### Installation with Elasticsearch |
| 202 | + |
| 203 | +```bash |
| 204 | +pip install feast[elasticsearch] |
| 205 | +``` |
| 206 | + |
| 207 | +#### Installation with Qdrant |
| 208 | + |
| 209 | +```bash |
| 210 | +pip install feast[qdrant] |
| 211 | +``` |
| 212 | +#### Installation with SQLite |
| 213 | + |
| 214 | +If you are using `pyenv` to manage your Python versions, you can build a Python version with loadable SQLite extensions enabled using the following command: |
| 215 | +```bash |
| 216 | +PYTHON_CONFIGURE_OPTS="--enable-loadable-sqlite-extensions" \ |
| 217 | + LDFLAGS="-L/opt/homebrew/opt/sqlite/lib" \ |
| 218 | + CPPFLAGS="-I/opt/homebrew/opt/sqlite/include" \ |
| 219 | + pyenv install 3.10.14 |
| 220 | +``` |
| 221 | + |
| 222 | +Then install the Feast package via: |
| 223 | +```bash |
| 224 | +pip install feast[sqlite_vec] |
| 225 | +``` |