[![Build Status](https://travis-ci.org/tantivy-search/tantivy-py.svg?branch=master)](https://travis-ci.org/tantivy-search/tantivy-py)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)

tantivy-py
==========

Python bindings for tantivy.

# Installation
The bindings can be installed using setuptools:

    python3 setup.py install --user

Note that this requires setuptools-rust to be installed. The bindings are
built with [PyO3](https://github.com/PyO3/pyo3), which requires Rust nightly
and supports only Python 3.

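After installing, it can be useful to confirm that the module is visible to the interpreter you plan to use. The sketch below is not part of tantivy-py; `is_importable` is a hypothetical helper name, and it relies only on the standard library's `importlib.util.find_spec`. It assumes the module is importable under the name `tantivy`, as in the usage example in this README.

```python
import importlib.util


def is_importable(module_name="tantivy"):
    """Return True if the import system can locate the named top-level module."""
    # find_spec returns None when no module of that name is installed.
    return importlib.util.find_spec(module_name) is not None


print("tantivy importable:", is_importable())
```

If this prints `False`, the bindings were probably installed for a different Python interpreter than the one running the check.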
# Usage
tantivy-py has an API similar to tantivy's. To create an index, a schema
must first be built. Documents can then be added to the index, and a searcher
can be created to search it.

```python
import tantivy
# Declaring our schema.
schema_builder = tantivy.SchemaBuilder()
schema_builder.add_text_field("title", stored=True)
schema_builder.add_text_field("body", stored=True)
schema = schema_builder.build()
# Creating our index (in memory, but filesystem is available too)
index = tantivy.Index(schema)
# Adding one document.
writer = index.writer()
writer.add_document(tantivy.Document(
    title=["The Old Man and the Sea"],
    body=["""He was an old man who fished alone in a skiff in the Gulf Stream and he had gone eighty-four days now without taking a fish."""],
))
# ... and committing
writer.commit()
# Reload the index to ensure it points to the last commit.
index.reload()
# Searching goes through a searcher obtained from the index.
searcher = index.searcher()
query = index.parse_query("fish days", ["title", "body"])
top_docs = tantivy.TopDocs(3)
# Each hit is a (score, doc_address) pair; take the first one.
(best_score, best_doc_address) = searcher.search(query, top_docs)[0]
best_doc = searcher.doc(best_doc_address)
assert best_doc["title"] == ["The Old Man and the Sea"]
print(best_doc)
```
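
The example above takes only the first hit, and indexing with `[0]` raises an `IndexError` when the query matches nothing. A minimal sketch of walking every hit instead is shown below. `resolve_hits` is a hypothetical helper, not part of tantivy-py. It assumes only what the example already shows: `searcher.search(query, top_docs)` yields `(score, doc_address)` pairs, and `searcher.doc(address)` returns a document that can be indexed by field name.

```python
def resolve_hits(searcher, hits, field="title"):
    """Turn (score, doc_address) pairs into (score, field_values) pairs.

    `searcher` is any object with a `doc(address)` method that returns a
    mapping-like document, as `index.searcher()` does in the example above.
    An empty `hits` sequence simply produces an empty list.
    """
    return [(score, searcher.doc(address)[field]) for score, address in hits]
```

With the objects from the example, `resolve_hits(searcher, searcher.search(query, top_docs))` would return one `(score, titles)` pair per hit, up to the limit of 3 given to `tantivy.TopDocs`.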