
Commit 203e888

feat: Add CLI, SDK, and API documentation page to Feast UI (#5337)
* Add documentation page to Feast UI with SDK, API, and CLI documentation
* Add symlink to docs directory for documentation page
* Update DocumentationService with hardcoded documentation content for demo purposes
* Replace docs symlink with actual reference content
* Update Documentation icon to blue, move to last position, and add horizontal lines between CLI sections
* Update documentation page with improved formatting and routing
* Fix tab switching in documentation page
* Simplify Rollup configuration to fix build issues
* Format rollup.config.js with trailing commas

---------

Co-Authored-By: Francisco Javier Arceo <arceofrancisco@gmail.com>
Co-authored-by: Devin AI <158243242+devin-ai-integration[bot]@users.noreply.github.com>
1 parent 4361359 commit 203e888


99 files changed

+7348
-38
lines changed

ui/package.json

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -37,6 +37,7 @@
     "query-string": "^7.1.1",
     "react-app-polyfill": "^3.0.0",
     "react-code-blocks": "^0.1.6",
+    "react-markdown": "^10.1.0",
     "react-query": "^3.39.3",
     "react-router-dom": "^6.28.0",
     "reactflow": "^11.11.4",
Lines changed: 225 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,225 @@
1+
# [Alpha] Vector Database
2+
**Warning**: This is an _experimental_ feature. To our knowledge, this is stable, but there are still rough edges in the experience. Contributions are welcome!
3+
4+
## Overview
5+
A vector database allows users to store and retrieve embeddings. Feast provides general APIs for storing and retrieving embeddings across the supported vector databases.
6+
7+
## Integration
8+
Below are supported vector databases and implemented features:
9+
10+
| Vector Database | Retrieval | Indexing | V2 Support* | Online Read |
11+
|-----------------|-----------|----------|-------------|-------------|
12+
| Pgvector        | [x]       | [ ]      | [ ]         | [ ]         |
13+
| Elasticsearch   | [x]       | [x]      | [ ]         | [ ]         |
14+
| Milvus          | [x]       | [x]      | [x]         | [x]         |
15+
| Faiss           | [ ]       | [ ]      | [ ]         | [ ]         |
16+
| SQLite          | [x]       | [ ]      | [x]         | [x]         |
17+
| Qdrant          | [x]       | [x]      | [ ]         | [ ]         |
18+
19+
*Note: V2 Support means the SDK supports retrieval of features along with vector embeddings from vector similarity search.
20+
21+
Note: SQLite support is in limited access and currently works only on Python 3.10. It will be updated as [sqlite_vec](https://github.com/asg017/sqlite-vec/) progresses.
22+
23+
{% hint style="danger" %}
24+
We will be deprecating the `retrieve_online_documents` method in the SDK in the future.
25+
We recommend using the `retrieve_online_documents_v2` method instead, which offers easier vector index configuration
26+
directly in the Feature View and the ability to retrieve standard features alongside your vector embeddings for richer context injection.
27+
28+
Long term we will collapse the two methods into one, but for now, we recommend using the `retrieve_online_documents_v2` method.
29+
Beyond that, we will then have `retrieve_online_documents` and `retrieve_online_documents_v2` simply point to `get_online_features` for
30+
backwards compatibility and to adopt industry-standard naming conventions.
31+
{% endhint %}
32+
33+
**Note**: Milvus and SQLite implement the v2 `retrieve_online_documents_v2` method in the SDK. This will be the longer-term solution so that Data Scientists can easily enable vector similarity search by just flipping a flag.
34+
35+
## Examples
36+
37+
- See the v0 [Rag Demo](https://github.com/feast-dev/feast-workshop/blob/rag/module_4_rag) for an example of how to use a vector database with the `retrieve_online_documents` method (migration and deprecation are planned).
38+
- See the v1 [Milvus Quickstart](../../examples/rag/milvus-quickstart.ipynb) for a quickstart guide on how to use Feast with Milvus using the `retrieve_online_documents_v2` method.
39+
40+
### **Prepare offline embedding dataset**
41+
Run the following commands to prepare the embedding dataset:
42+
```shell
43+
python pull_states.py
44+
python batch_score_documents.py
45+
```
46+
The output will be stored in `data/city_wikipedia_summaries.csv`.
47+
48+
### **Initialize Feast feature store and materialize the data to the online store**
49+
Use the `feature_store.yaml` file to initialize the feature store. This configuration uses local files as the offline store and Milvus as the online store.
50+
51+
```yaml
project: local_rag
provider: local
registry: data/registry.db
online_store:
  type: milvus
  path: data/online_store.db
  vector_enabled: true
  embedding_dim: 384
  index_type: "IVF_FLAT"


offline_store:
  type: file
entity_key_serialization_version: 3
# By default, no_auth for authentication and authorization, other possible values kubernetes and oidc. Refer the documentation for more details.
auth:
  type: no_auth
```
70+
Run the following command in terminal to apply the feature store configuration:
71+
72+
```shell
73+
feast apply
74+
```
75+
76+
Note that when you run `feast apply` you are going to apply the following Feature View that we will use for retrieval later:
77+
78+
```python
from datetime import timedelta

from feast import FeatureView, Field
from feast.types import Array, Float32, Int64, String, UnixTimestamp

# `item`, `author`, and `rag_documents_source` are assumed to be defined
# elsewhere in the feature repository.
document_embeddings = FeatureView(
    name="embedded_documents",
    entities=[item, author],
    schema=[
        Field(
            name="vector",
            dtype=Array(Float32),
            # Look how easy it is to enable RAG!
            vector_index=True,
            vector_search_metric="COSINE",
        ),
        Field(name="item_id", dtype=Int64),
        Field(name="author_id", dtype=String),
        Field(name="created_timestamp", dtype=UnixTimestamp),
        Field(name="sentence_chunks", dtype=String),
        Field(name="event_timestamp", dtype=UnixTimestamp),
    ],
    source=rag_documents_source,
    ttl=timedelta(hours=24),
)
```
100+
101+
Let's use the SDK to write a data frame of embeddings to the online store:
102+
```python
103+
store.write_to_online_store(feature_view_name='city_embeddings', df=df)
104+
```
105+
106+
### **Prepare a query embedding**
107+
During inference (e.g., when a user submits a chat message), we need to embed the input text. This can be thought of as a feature transformation of the input data. In this example, we'll do this with a small Sentence Transformer from Hugging Face.
108+
109+
```python
import torch
import torch.nn.functional as F
from feast import FeatureStore
from pymilvus import MilvusClient, DataType, FieldSchema
from transformers import AutoTokenizer, AutoModel
from example_repo import city_embeddings_feature_view, item

TOKENIZER = "sentence-transformers/all-MiniLM-L6-v2"
MODEL = "sentence-transformers/all-MiniLM-L6-v2"

def mean_pooling(model_output, attention_mask):
    token_embeddings = model_output[
        0
    ]  # First element of model_output contains all token embeddings
    input_mask_expanded = (
        attention_mask.unsqueeze(-1).expand(token_embeddings.size()).float()
    )
    return torch.sum(token_embeddings * input_mask_expanded, 1) / torch.clamp(
        input_mask_expanded.sum(1), min=1e-9
    )

def run_model(sentences, tokenizer, model):
    encoded_input = tokenizer(
        sentences, padding=True, truncation=True, return_tensors="pt"
    )
    # Compute token embeddings
    with torch.no_grad():
        model_output = model(**encoded_input)

    sentence_embeddings = mean_pooling(model_output, encoded_input["attention_mask"])
    sentence_embeddings = F.normalize(sentence_embeddings, p=2, dim=1)
    return sentence_embeddings

question = "Which city has the largest population in New York?"

tokenizer = AutoTokenizer.from_pretrained(TOKENIZER)
model = AutoModel.from_pretrained(MODEL)
query_embedding = run_model(question, tokenizer, model).detach().cpu().numpy().tolist()[0]
```
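
To make the pooling step concrete, here is a minimal, dependency-free sketch of what masked mean pooling followed by L2 normalization computes for a single sentence. It is an illustration only: the helper names `mean_pool` and `l2_normalize` and the toy vectors are assumptions, not part of Feast or the quickstart, and the real code operates on batched tensors rather than Python lists.

```python
import math

def mean_pool(token_embeddings, attention_mask):
    """Average the token vectors, counting only positions where the mask is 1."""
    dim = len(token_embeddings[0])
    summed = [0.0] * dim
    for vector, keep in zip(token_embeddings, attention_mask):
        if keep:
            for i, value in enumerate(vector):
                summed[i] += value
    # Same idea as the clamp in the tensor version: avoid dividing by zero.
    count = max(sum(attention_mask), 1e-9)
    return [value / count for value in summed]

def l2_normalize(vector):
    """Scale a vector to unit length (returned unchanged if it is all zeros)."""
    norm = math.sqrt(sum(v * v for v in vector))
    return [v / norm for v in vector] if norm > 0 else list(vector)

# Toy input: three token vectors, the last one is padding and is masked out.
tokens = [[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]
mask = [1, 1, 0]

pooled = mean_pool(tokens, mask)
unit = l2_normalize(pooled)
print(pooled, unit)
```

Because the padded position is masked out, it does not influence the pooled vector, and normalizing to unit length makes cosine similarity equivalent to a dot product.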
149+
150+
### **Retrieve the top K similar documents**
151+
First create a feature store instance, and use the `retrieve_online_documents_v2` API to retrieve the top 3 most similar documents for the specified query.
152+
153+
```python
# Assumes the feature repository lives in the current directory.
store = FeatureStore(repo_path=".")

context_data = store.retrieve_online_documents_v2(
    features=[
        "city_embeddings:vector",
        "city_embeddings:item_id",
        "city_embeddings:state",
        "city_embeddings:sentence_chunks",
        "city_embeddings:wiki_summary",
    ],
    query=query_embedding,
    top_k=3,
    distance_metric='COSINE',
).to_df()
```
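
For intuition about what a top-K cosine search returns, the following is a brute-force sketch over an in-memory dictionary. It is not how Milvus or Feast implement retrieval (a vector database typically uses an index such as the `IVF_FLAT` type configured earlier); the function names and toy vectors are hypothetical.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def top_k_similar(query, documents, k):
    """Return the k (doc_id, score) pairs with the highest cosine similarity."""
    scored = [(cosine_similarity(query, vec), doc_id) for doc_id, vec in documents.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [(doc_id, score) for score, doc_id in scored[:k]]

# Toy two-dimensional "embeddings"; real embeddings here would have 384 dimensions.
toy_documents = {
    "doc_a": [1.0, 0.0],
    "doc_b": [0.0, 1.0],
    "doc_c": [1.0, 1.0],
}

results = top_k_similar([1.0, 0.1], toy_documents, k=2)
for doc_id, score in results:
    print(doc_id, round(score, 3))
```

The brute-force scan is linear in the number of documents, which is the cost that a vector index is designed to avoid.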
167+
### **Generate the Response**
168+
Let's assume we have a base prompt and a function that formats the retrieved documents called `format_documents` that we
169+
can then use to generate the response with OpenAI's chat completion API.
170+
```python
import os

from openai import OpenAI

# `format_documents` and `BASE_PROMPT` are assumed to be defined elsewhere;
# `context_data` is the data frame returned by the retrieval step above.
FULL_PROMPT = format_documents(context_data, BASE_PROMPT)

client = OpenAI(
    api_key=os.environ.get("OPENAI_API_KEY"),
)
response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {"role": "system", "content": FULL_PROMPT},
        {"role": "user", "content": question},
    ],
)

# And this will print the content. Look at the examples/rag/milvus-quickstart.ipynb for an end-to-end example.
print('\n'.join([c.message.content for c in response.choices]))
```
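
The text above assumes a `format_documents` helper without defining it. One possible shape for such a helper is sketched below; it is a hypothetical illustration rather than the implementation used in the quickstart notebook. It assumes the retrieved rows are passed as a list of dictionaries and that the text to inject lives in a `sentence_chunks` field.

```python
def format_documents(context_rows, base_prompt, text_field="sentence_chunks"):
    """Append the retrieved text chunks to the base prompt as a numbered list.

    Hypothetical helper: rows whose text field is missing or empty are skipped.
    """
    chunks = [str(row[text_field]) for row in context_rows if row.get(text_field)]
    numbered = "\n".join(f"[{i}] {chunk}" for i, chunk in enumerate(chunks, start=1))
    return f"{base_prompt}\n\nContext:\n{numbered}"

# Toy rows standing in for retrieved documents.
BASE_PROMPT = "Answer the question using only the context provided."
rows = [
    {"sentence_chunks": "New York City is the most populous city in New York."},
    {"sentence_chunks": None},
    {"sentence_chunks": "Buffalo is the second most populous city in the state."},
]

FULL_PROMPT = format_documents(rows, BASE_PROMPT)
print(FULL_PROMPT)
```

With a pandas data frame such as the one returned by `.to_df()`, the rows could be obtained with `context_data.to_dict("records")` before calling this helper.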
189+
190+
### Configuration and Installation
191+
192+
We offer [Milvus](https://milvus.io/), [PGVector](https://github.com/pgvector/pgvector), [SQLite](https://github.com/asg017/sqlite-vec), [Elasticsearch](https://www.elastic.co) and [Qdrant](https://qdrant.tech/) as Online Store options for Vector Databases.
193+
194+
Milvus offers a convenient local implementation for vector similarity search. To use Milvus, you can install the Feast package with the Milvus extra.
195+
196+
#### Installation with Milvus
197+
198+
```bash
199+
pip install feast[milvus]
200+
```
201+
#### Installation with Elasticsearch
202+
203+
```bash
204+
pip install feast[elasticsearch]
205+
```
206+
207+
#### Installation with Qdrant
208+
209+
```bash
210+
pip install feast[qdrant]
211+
```
212+
#### Installation with SQLite
213+
214+
If you are using `pyenv` to manage your Python versions, you can install the SQLite extension with the following command:
215+
```bash
216+
PYTHON_CONFIGURE_OPTS="--enable-loadable-sqlite-extensions" \
217+
LDFLAGS="-L/opt/homebrew/opt/sqlite/lib" \
218+
CPPFLAGS="-I/opt/homebrew/opt/sqlite/include" \
219+
pyenv install 3.10.14
220+
```
221+
222+
You can then install the Feast package via:
223+
```bash
224+
pip install feast[sqlite_vec]
225+
```
Lines changed: 143 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,143 @@
1+
# \[Beta] Web UI
2+
3+
**Warning**: This is an _experimental_ feature. To our knowledge, this is stable, but there are still rough edges in the experience. Contributions are welcome!
4+
5+
## Overview
6+
7+
The Feast Web UI allows users to explore their feature repository through a Web UI. It includes functionality such as:
8+
9+
* Browsing Feast objects (feature views, entities, data sources, feature services, and saved datasets) and their relationships
10+
* Searching and filtering for Feast objects by tags
11+
12+
![Sample UI](../../ui/sample.png)
13+
14+
## Usage
15+
16+
There are several ways to use the Feast Web UI.
17+
18+
### Feast CLI
19+
20+
The easiest way to get started is to run the `feast ui` command within a feature repository:
21+
22+
Output of `feast ui --help`:
23+
24+
```bash
Usage: feast ui [OPTIONS]

  Shows the Feast UI over the current directory

Options:
  -h, --host TEXT                 Specify a host for the server [default: 0.0.0.0]
  -p, --port INTEGER              Specify a port for the server [default: 8888]
  -r, --registry_ttl_sec INTEGER  Number of seconds after which the registry is refreshed. Default is 5 seconds.
  --help                          Show this message and exit.
```
35+
36+
This will spin up a Web UI on localhost which automatically refreshes its view of the registry every `registry_ttl_sec` seconds.
37+
38+
### Importing as a module to integrate with an existing React App
39+
40+
This is the recommended way to use Feast UI for teams maintaining their own internal UI for their deployment of Feast.
41+
42+
Start with bootstrapping a React app with `create-react-app`
43+
44+
```
45+
npx create-react-app your-feast-ui
46+
```
47+
48+
Then, in your app folder, install Feast UI and, optionally, its peer dependencies. Assuming you use yarn:
49+
50+
```
51+
yarn add @feast-dev/feast-ui
52+
# For custom UI using the Elastic UI Framework (optional):
53+
yarn add @elastic/eui
54+
# For general custom styling (optional):
55+
yarn add @emotion/react
56+
```
57+
58+
Edit `index.js` in the React app to use Feast UI.
59+
60+
```js
import React from "react";
import ReactDOM from "react-dom";
import "./index.css";

import FeastUI from "@feast-dev/feast-ui";
import "@feast-dev/feast-ui/dist/feast-ui.css";

ReactDOM.render(
  <React.StrictMode>
    <FeastUI />
  </React.StrictMode>,
  document.getElementById("root")
);
```
75+
76+
When you start the React app, it will look for `projects-list.json` to find a list of your projects. The JSON should look something like this.
77+
78+
```json
{
  "projects": [
    {
      "name": "Credit Score Project",
      "description": "Project for credit scoring team and associated models.",
      "id": "credit_score_project",
      "registryPath": "/registry.json"
    }
  ]
}
```
90+
91+
* **Note** - `registryPath` only supports a file location or a URL.
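
If the project list is generated by a script rather than written by hand, something like the following Python sketch could produce it. The helper names are hypothetical, and treating the four keys from the example above as required is an assumption rather than a documented Feast UI schema. The sketch writes to a temporary directory; in a real app the target would be wherever your static files are served from.

```python
import json
import tempfile
from pathlib import Path

# Assumption: these are the keys used in the example JSON above.
REQUIRED_KEYS = ("name", "description", "id", "registryPath")

def build_projects_list(projects):
    """Check each project entry and wrap the list in the top-level object."""
    for project in projects:
        missing = [key for key in REQUIRED_KEYS if key not in project]
        if missing:
            raise ValueError(f"project {project.get('id', '?')} is missing {missing}")
    return {"projects": list(projects)}

def write_projects_list(projects, directory):
    """Write projects-list.json into the given directory and return its path."""
    target = Path(directory) / "projects-list.json"
    target.write_text(json.dumps(build_projects_list(projects), indent=2))
    return target

example_projects = [
    {
        "name": "Credit Score Project",
        "description": "Project for credit scoring team and associated models.",
        "id": "credit_score_project",
        "registryPath": "/registry.json",
    }
]

with tempfile.TemporaryDirectory() as tmp:
    written = write_projects_list(example_projects, tmp)
    loaded = json.loads(written.read_text())

print(loaded["projects"][0]["id"])
```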
92+
93+
Then start the React App
94+
95+
```bash
96+
yarn start
97+
```
98+
99+
#### Customization
100+
101+
The advantage of importing Feast UI as a module is the ease of customization. The `<FeastUI>` component exposes a `feastUIConfigs` prop through which you can customize the UI. Currently it supports a few parameters.
102+
103+
##### Fetching the Project List
104+
105+
By default, the Feast UI fetches the project list from the app root path. You can use `projectListPromise` to provide a promise that overrides where it's fetched from.
106+
107+
```jsx
<FeastUI
  feastUIConfigs={{
    projectListPromise: fetch(SOME_PATH, {
      headers: {
        "Content-Type": "application/json",
      },
    }).then((res) => {
      return res.json();
    })
  }}
/>
```
120+
121+
##### Custom Tabs
122+
123+
You can add custom tabs for any of the core Feast objects through the `tabsRegistry`.
124+
125+
```jsx
const tabsRegistry = {
  RegularFeatureViewCustomTabs: [
    {
      label: "Custom Tab Demo", // Navigation Label for the tab
      path: "demo-tab", // Subpath for the tab
      Component: RFVDemoCustomTab, // a React Component
    },
  ]
}

<FeastUI
  feastUIConfigs={{
    tabsRegistry: tabsRegistry,
  }}
/>
```
142+
143+
Examples of custom tabs can be found in the `ui/custom-tabs` folder.
Lines changed: 11 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,11 @@
1+
# Batch materialization
2+
3+
Please see [Batch Materialization Engine](../../getting-started/components/batch-materialization-engine.md) for an explanation of batch materialization engines.
4+
5+
{% page-ref page="snowflake.md" %}
6+
7+
{% page-ref page="bytewax.md" %}
8+
9+
{% page-ref page="lambda.md" %}
10+
11+
{% page-ref page="spark.md" %}
