import os

from whoosh import index, qparser
from whoosh.fields import ID, TEXT, Schema

# Define the schema for the index. category is an ID field so that a value
# such as "Category 1" is indexed as a single term and can serve as a facet.
schema = Schema(title=TEXT(stored=True),
                author=TEXT(stored=True),
                category=ID(stored=True),
                content=TEXT)
# Create the index (create_in requires the directory to exist)
os.makedirs("indexdir", exist_ok=True)
ix = index.create_in("indexdir", schema)
# Open the index for writing
writer = ix.writer()
# Add documents to the index
writer.add_document(title="Document 1",
                    author="Author 1",
                    category="Category 1",
                    content="This is the content of document 1")
writer.add_document(title="Document 2",
                    author="Author 2",
                    category="Category 2",
                    content="This is the content of document 2")
writer.add_document(title="Document 3",
                    author="Author 3",
                    category="Category 1",
                    content="This is the content of document 3")
# Commit the changes
writer.commit()
# Open the index for reading
searcher = ix.searcher()
# Parse the query
parser = qparser.QueryParser("content", ix.schema)
query = parser.parse("document")
# Perform the search and return the results
results = searcher.search(query)
# Iterate through the results and print them
for result in results:
    print(result)
# Faceted search: group the matching documents by category
grouped = searcher.search(query, groupedby="category")
# groups() maps each category value to a list of matching document numbers
facet = grouped.groups("category")
for category in facet:
    print(f"Category: {category}")
    print(f"Number of documents: {len(facet[category])}")
Saturday, December 17, 2022
Facets in Whoosh - an example!
Friday, December 16, 2022
How to add documents to the existing index in Whoosh?
To add documents to an existing Whoosh index, follow these steps:
1. Open the index with the whoosh.index.open_dir function. It takes the directory where the index is stored and returns an Index object:
from whoosh import index
# Open the index
ix = index.open_dir("indexdir")
2. Create an IndexWriter (whoosh.writing.IndexWriter) by calling the Index.writer method. The writer is the object that lets you add documents to the index:
# Open an index writer
writer = ix.writer()
3. Use the IndexWriter.add_document method to add documents to the index. The method takes the fields as keyword arguments: each argument name must match a field name in the index's schema, and each value is the content of that field:
# Add a document to the index
writer.add_document(
    title="My Document",
    body="This is the body of my document.",
    date="2022-01-01",
)
4. After adding all the documents you want, call the IndexWriter.commit method to save the changes to the index. The new documents are not visible to searches until the commit completes:
# Commit the changes
writer.commit()