Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

NotionAPILoader: Fix for #2807 and add propertiesAsHeader option #2837

Conversation

skarard
Copy link
Contributor

@skarard skarard commented Oct 8, 2023

Fixes #2807

Notion databases are often used as the data stored in the page properties, with an empty page. This means a user may expect a similarity search on the page properties. I have enabled an option where a user can prepend the page properties as a frontmatter header to enable similarity search.

It was bought to my attention that the preferred behaviour for the Document class is to not create documents with empty pageContents.

This PR addresses both of these issues.

  • Adding an option to prepend page properties as a front matter header
  • Ensures would be empty pageContents pages do not create Documents

It also

  • Adds integration tests for empty pageContents not creating a Document
  • Adds integration tests for propertiesAsHeader option
  • Update the database example to include propertiesAsHeader

- Adding an option to prepend page properties as a front matter header
- Adds tests for new behaviour and propertiesAsHeader
- Update the database example to include propertiesAsHeader
@vercel
Copy link

vercel bot commented Oct 8, 2023

The latest updates on your projects. Learn more about Vercel for Git ↗︎

Name Status Preview Updated (UTC)
langchainjs-docs ✅ Ready (Inspect) Visit Preview Oct 8, 2023 9:22am

@dosubot dosubot bot added the auto:improvement Medium size change to existing code to handle new use-cases label Oct 8, 2023
@@ -53,11 +53,53 @@ test("Test Notion API Loader onDocumentLoad", async () => {
`Loaded ${currentTitle} from ${rootTitle}: (${current}/${total})`
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey there! Just wanted to flag that this PR explicitly accesses environment variables via process.env. Please review this change to ensure it aligns with security and best practices.

@@ -53,11 +53,53 @@ test("Test Notion API Loader onDocumentLoad", async () => {
`Loaded ${currentTitle} from ${rootTitle}: (${current}/${total})`
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey there! Just wanted to flag that this PR explicitly accesses environment variables via process.env. Please review this change to ensure it aligns with security and best practices.

@jacoblee93
Copy link
Collaborator

Thank you!

@jacoblee93 jacoblee93 merged commit e5b9406 into langchain-ai:main Oct 9, 2023
@skarard skarard deleted the notion-add-frontmatter-properties-to-pageContents branch October 26, 2023 19:53
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
auto:improvement Medium size change to existing code to handle new use-cases
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Notion vectorStore integration breaks when importing a database with an empty row
2 participants