forked from openai/openai-cookbook
-
Notifications
You must be signed in to change notification settings - Fork 0
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Add Generative Search to Weaviate examples + fix authentication examp…
…les (openai#398) * add Weaviate authentication to client configuration * add a cookbook for generative search * fix name
- Loading branch information
Showing
4 changed files
with
278 additions
and
0 deletions.
There are no files selected for viewing
275 changes: 275 additions & 0 deletions
275
examples/vector_databases/weaviate/generative-search-with-weaviate-and-openai.ipynb
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,275 @@ | ||
{ | ||
"cells": [ | ||
{ | ||
"attachments": {}, | ||
"cell_type": "markdown", | ||
"id": "cb1537e6", | ||
"metadata": {}, | ||
"source": [ | ||
"# Using Weaviate with Generative OpenAI module for Generative Search\n", | ||
"\n", | ||
"This notebook is prepared for a scenario where:\n", | ||
"* Your data is already in Weaviate\n", | ||
"* You want to use Weaviate with the Generative OpenAI module ([generative-openai](https://weaviate.io/developers/weaviate/modules/reader-generator-modules/generative-openai)).\n", | ||
"\n" | ||
] | ||
}, | ||
{ | ||
"attachments": {}, | ||
"cell_type": "markdown", | ||
"id": "f1a618c5", | ||
"metadata": {}, | ||
"source": [ | ||
"## Prerequisites\n", | ||
"\n", | ||
"This cookbook only coveres Generative Search examples, however, it doesn't cover the configuration and data imports.\n", | ||
"\n", | ||
"In order to make the most of this cookbook, please complete the [Getting Started cookbook](./getting-started-with-weaviate-and-openai.ipynb) firts, where you will learn the essentials of working with Weaviate and import the demo data.\n", | ||
"\n", | ||
"Checklist:\n", | ||
"* completed [Getting Started cookbook](./getting-started-with-weaviate-and-openai.ipynb),\n", | ||
"* crated a `Weaviate` instance,\n", | ||
"* imported data into your `Weaviate` instance,\n", | ||
"* you have an [OpenAI API key](https://beta.openai.com/account/api-keys)" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"id": "36fe86f4", | ||
"metadata": {}, | ||
"source": [ | ||
"===========================================================\n", | ||
"## Prepare your OpenAI API key\n", | ||
"\n", | ||
"The `OpenAI API key` is used for vectorization of your data at import, and for running queries.\n", | ||
"\n", | ||
"If you don't have an OpenAI API key, you can get one from [https://beta.openai.com/account/api-keys](https://beta.openai.com/account/api-keys).\n", | ||
"\n", | ||
"Once you get your key, please add it to your environment variables as `OPENAI_API_KEY`." | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"id": "43395339", | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"# Export OpenAI API Key\n", | ||
"!export OPENAI_API_KEY=\"your key\"" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"id": "88be138c", | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"# Test that your OpenAI API key is correctly set as an environment variable\n", | ||
"# Note. if you run this notebook locally, you will need to reload your terminal and the notebook for the env variables to be live.\n", | ||
"import os\n", | ||
"\n", | ||
"# Note. alternatively you can set a temporary env variable like this:\n", | ||
"# os.environ[\"OPENAI_API_KEY\"] = 'your-key-goes-here'\n", | ||
"\n", | ||
"if os.getenv(\"OPENAI_API_KEY\") is not None:\n", | ||
" print (\"OPENAI_API_KEY is ready\")\n", | ||
"else:\n", | ||
" print (\"OPENAI_API_KEY environment variable not found\")" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"id": "91df4d5b", | ||
"metadata": {}, | ||
"source": [ | ||
"## Connect to your Weaviate instance\n", | ||
"\n", | ||
"In this section, we will:\n", | ||
"\n", | ||
"1. test env variable `OPENAI_API_KEY` – **make sure** you completed the step in [#Prepare-your-OpenAI-API-key](#Prepare-your-OpenAI-API-key)\n", | ||
"2. connect to your Weaviate with your `OpenAI API Key`\n", | ||
"3. and test the client connection\n", | ||
"\n", | ||
"### The client \n", | ||
"\n", | ||
"After this step, the `client` object will be used to perform all Weaviate-related operations." | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"id": "cc662c1b", | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"import weaviate\n", | ||
"from datasets import load_dataset\n", | ||
"import os\n", | ||
"\n", | ||
"# Connect to your Weaviate instance\n", | ||
"client = weaviate.Client(\n", | ||
" url=\"https://your-wcs-instance-name.weaviate.network/\",\n", | ||
" # url=\"http://localhost:8080/\",\n", | ||
" auth_client_secret=weaviate.auth.AuthApiKey(api_key=\"<YOUR-WEAVIATE-API-KEY>\"), # comment out this line if you are not using authentication for your Weaviate instance (i.e. for locally deployed instances)\n", | ||
" additional_headers={\n", | ||
" \"X-OpenAI-Api-Key\": os.getenv(\"OPENAI_API_KEY\")\n", | ||
" }\n", | ||
")\n", | ||
"\n", | ||
"# Check if your instance is live and ready\n", | ||
"# This should return `True`\n", | ||
"client.is_ready()" | ||
] | ||
}, | ||
{ | ||
"attachments": {}, | ||
"cell_type": "markdown", | ||
"id": "ceb14da9", | ||
"metadata": {}, | ||
"source": [ | ||
"## Generative Search\n", | ||
"Weaviate offers a [Generative Search OpenAI](https://weaviate.io/developers/weaviate/modules/reader-generator-modules/generative-openai) module, which generates responses based on the data stored in your Weaviate instance.\n", | ||
"\n", | ||
"The way you construct a generative search query is very similar to a standard semantic search query in Weaviate. \n", | ||
"\n", | ||
"For example:\n", | ||
"* search in \"Articles\", \n", | ||
"* return \"title\", \"content\", \"url\"\n", | ||
"* look for objects related to \"football clubs\"\n", | ||
"* limit results to 5 objects\n", | ||
"\n", | ||
"```\n", | ||
" result = (\n", | ||
" client.query\n", | ||
" .get(\"Articles\", [\"title\", \"content\", \"url\"])\n", | ||
" .with_near_text(\"concepts\": \"football clubs\")\n", | ||
" .with_limit(5)\n", | ||
" # generative query will go here\n", | ||
" .do()\n", | ||
" )\n", | ||
"```\n", | ||
"\n", | ||
"Now, you can add `with_generate()` function to apply generative transformation. `with_generate` takes either:\n", | ||
"- `single_prompt` - to generate a response for each returned object,\n", | ||
"- `grouped_task` – to generate a single response from all returned objects.\n" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"id": "51559251", | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"def generative_search_per_item(query, collection_name):\n", | ||
" prompt = \"Summarize in a short tweet the following content: {content}\"\n", | ||
"\n", | ||
" result = (\n", | ||
" client.query\n", | ||
" .get(collection_name, [\"title\", \"content\", \"url\"])\n", | ||
" .with_near_text({ \"concepts\": [query], \"distance\": 0.7 })\n", | ||
" .with_limit(5)\n", | ||
" .with_generate(single_prompt=prompt)\n", | ||
" .do()\n", | ||
" )\n", | ||
" \n", | ||
" # Check for errors\n", | ||
" if (\"errors\" in result):\n", | ||
" print (\"\\033[91mYou probably have run out of OpenAI API calls for the current minute – the limit is set at 60 per minute.\")\n", | ||
" raise Exception(result[\"errors\"][0]['message'])\n", | ||
" \n", | ||
" return result[\"data\"][\"Get\"][collection_name]" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"id": "a4604726", | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"query_result = generative_search_per_item(\"football clubs\", \"Article\")\n", | ||
"\n", | ||
"for i, article in enumerate(query_result):\n", | ||
" print(f\"{i+1}. { article['title']}\")\n", | ||
" print(article['_additional']['generate']['singleResult']) # print generated response\n", | ||
" print(\"-----------------------\")" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": 79, | ||
"id": "a45ea160", | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"def generative_search_group(query, collection_name):\n", | ||
" generateTask = \"Explain what these have in common\"\n", | ||
"\n", | ||
" result = (\n", | ||
" client.query\n", | ||
" .get(collection_name, [\"title\", \"content\", \"url\"])\n", | ||
" .with_near_text({ \"concepts\": [query], \"distance\": 0.7 })\n", | ||
" .with_generate(grouped_task=generateTask)\n", | ||
" .with_limit(5)\n", | ||
" .do()\n", | ||
" )\n", | ||
" \n", | ||
" # Check for errors\n", | ||
" if (\"errors\" in result):\n", | ||
" print (\"\\033[91mYou probably have run out of OpenAI API calls for the current minute – the limit is set at 60 per minute.\")\n", | ||
" raise Exception(result[\"errors\"][0]['message'])\n", | ||
" \n", | ||
" return result[\"data\"][\"Get\"][collection_name]" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"id": "11e0dad2", | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"query_result = generative_search_group(\"football clubs\", \"Article\")\n", | ||
"\n", | ||
"print (query_result[0]['_additional']['generate']['groupedResult'])" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"id": "2007be48", | ||
"metadata": {}, | ||
"source": [ | ||
"Thanks for following along, you're now equipped to set up your own vector databases and use embeddings to do all kinds of cool things - enjoy! For more complex use cases please continue to work through other cookbook examples in this repo." | ||
] | ||
} | ||
], | ||
"metadata": { | ||
"kernelspec": { | ||
"display_name": "Python 3 (ipykernel)", | ||
"language": "python", | ||
"name": "python3" | ||
}, | ||
"language_info": { | ||
"codemirror_mode": { | ||
"name": "ipython", | ||
"version": 3 | ||
}, | ||
"file_extension": ".py", | ||
"mimetype": "text/x-python", | ||
"name": "python", | ||
"nbconvert_exporter": "python", | ||
"pygments_lexer": "ipython3", | ||
"version": "3.9.12" | ||
}, | ||
"vscode": { | ||
"interpreter": { | ||
"hash": "31f2aee4e71d21fbe5cf8b01ff0e069b9275f58929596ceb00d14d90e3e16cd6" | ||
} | ||
} | ||
}, | ||
"nbformat": 4, | ||
"nbformat_minor": 5 | ||
} |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters