support.bitvector module
An implementation of an object that acts like a collection of on/off bits.
Base classes

- class whoosh.idsets.DocIdSet
  Base class for a set of positive integers, implementing a subset of the built-in set type’s interface with extra docid-related methods.

  This is a superclass for alternative set implementations to the built-in set which are more memory-efficient and specialized toward storing sorted lists of positive integers, though they will inevitably be slower than set for most operations since they’re pure Python.

  - after(i)
    Returns the next integer in the set after i, or None.
  - before(i)
    Returns the previous integer in the set before i, or None.
  - first()
    Returns the first (lowest) integer in the set.
  - invert_update(size)
    Updates the set in-place to contain numbers in the range [0, size) except numbers that are in this set.
  - last()
    Returns the last (highest) integer in the set.
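The semantics of the navigation methods above can be illustrated with a small stand-in built on a sorted Python list. This is only a sketch: `SortedIdList` is a hypothetical class written for this example, not part of Whoosh, and it assumes that "after" and "before" mean strictly greater and strictly smaller than `i`.

```python
from bisect import bisect_left, bisect_right


class SortedIdList:
    """Hypothetical stand-in for a DocIdSet (not Whoosh's implementation)."""

    def __init__(self, source=()):
        # Keep the ids unique and in ascending order.
        self.ids = sorted(set(source))

    def first(self):
        # Lowest id in the set.
        return self.ids[0]

    def last(self):
        # Highest id in the set.
        return self.ids[-1]

    def after(self, i):
        # First id strictly greater than i, or None.
        pos = bisect_right(self.ids, i)
        return self.ids[pos] if pos < len(self.ids) else None

    def before(self, i):
        # Last id strictly smaller than i, or None.
        pos = bisect_left(self.ids, i)
        return self.ids[pos - 1] if pos > 0 else None

    def invert_update(self, size):
        # Replace the contents with every number in [0, size)
        # that was NOT in the set before the call.
        present = set(self.ids)
        self.ids = [n for n in range(size) if n not in present]


ids = SortedIdList([1, 10, 15, 7, 2])
lowest, highest = ids.first(), ids.last()
next_id, prev_id = ids.after(7), ids.before(7)
past_end = ids.after(15)
ids.invert_update(5)
inverted = list(ids.ids)
```

With the ids 1, 2, 7, 10 and 15, `after(7)` is 10, `before(7)` is 2, and `after(15)` is None. After `invert_update(5)` only 0, 3 and 4 remain, because 1 and 2 were members and everything at or above 5 falls outside the range.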
- class whoosh.idsets.BaseBitSet
Implementation classes

- class whoosh.idsets.BitSet(source=None, size=0)
  A DocIdSet backed by an array of bits. This can also be useful as a bit array (e.g. for a Bloom filter). It is much more memory efficient than a large built-in set of integers, but wastes memory for sparse sets.

  Parameters:
  - maxsize – the maximum size of the bit array.
  - source – an iterable of positive integers to add to this set.
  - bits – an array of unsigned bytes (“B”) to use as the underlying bit array. This is used by some of the object’s methods.
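To make the memory trade-off concrete, here is a minimal sketch of a set stored as an array of unsigned bytes. `TinyBitSet` is a hypothetical class for illustration only; the bit layout (bit `n & 7` of byte `n >> 3`) is an assumption of this example and is not taken from the Whoosh source.

```python
from array import array


class TinyBitSet:
    """Hypothetical bit-array set (a sketch, not Whoosh's BitSet)."""

    def __init__(self, source=None, size=0):
        # One byte holds eight ids; start with enough bytes for `size` bits.
        self.bits = array("B", bytes((size + 7) // 8))
        for n in source or ():
            self.add(n)

    def add(self, n):
        bucket = n >> 3
        if bucket >= len(self.bits):
            # Grow the array with zero bytes so that `bucket` is addressable.
            self.bits.extend(bytes(bucket + 1 - len(self.bits)))
        self.bits[bucket] |= 1 << (n & 7)

    def __contains__(self, n):
        bucket = n >> 3
        return bucket < len(self.bits) and bool(self.bits[bucket] & (1 << (n & 7)))

    def __iter__(self):
        # Bits are visited in ascending order, so ids come out sorted.
        for bucket, byte in enumerate(self.bits):
            for bit in range(8):
                if byte & (1 << bit):
                    yield bucket * 8 + bit


dense = TinyBitSet([1, 10, 15, 7, 2])
dense_ids = list(dense)
dense_bytes = len(dense.bits)

# A single large id forces the array to cover every smaller id as well.
sparse = TinyBitSet([1_000_000])
sparse_bytes = len(sparse.bits)
```

In this sketch the five small ids fit in 2 bytes, while the one-element sparse set needs 125001 bytes, which is the kind of waste the description above warns about.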
- class whoosh.idsets.OnDiskBitSet(dbfile, basepos, bytecount)
  A DocIdSet backed by an array of bits on disk.

      >>> from whoosh.filedb.filestore import RamStorage
      >>> from whoosh.idsets import BitSet, OnDiskBitSet
      >>> st = RamStorage()
      >>> f = st.create_file("test.bin")
      >>> bs = BitSet([1, 10, 15, 7, 2])
      >>> bytecount = bs.to_disk(f)
      >>> f.close()
      >>> # ...
      >>> f = st.open_file("test.bin")
      >>> # basepos is 0 because the bits were written at the start of the file
      >>> odbs = OnDiskBitSet(f, 0, bytecount)
      >>> list(odbs)
      [1, 2, 7, 10, 15]

  Parameters:
  - dbfile – a StructFile object to read from.
  - basepos – the base position of the bytes in the given file.
  - bytecount – the number of bytes to use for the bit array.
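The roles of `basepos` and `bytecount` can be sketched with an ordinary file-like object. Everything below is an illustrative assumption: `TinyOnDiskBitSet` and `write_bits` are hypothetical names, an `io.BytesIO` buffer stands in for a StructFile, and the bit layout is chosen for the example rather than taken from Whoosh.

```python
import io


def write_bits(f, ids):
    """Write the ids as a packed bit array and return the byte count."""
    size = max(ids) // 8 + 1
    buf = bytearray(size)
    for n in ids:
        buf[n >> 3] |= 1 << (n & 7)
    f.write(buf)
    return size


class TinyOnDiskBitSet:
    """Hypothetical read-only bit set that reads bytes from a file on demand."""

    def __init__(self, dbfile, basepos, bytecount):
        self.dbfile = dbfile
        self.basepos = basepos      # where the bit array starts in the file
        self.bytecount = bytecount  # how many bytes belong to the bit array

    def _byte(self, k):
        # Fetch the k-th byte of the bit array, relative to basepos.
        self.dbfile.seek(self.basepos + k)
        return self.dbfile.read(1)[0]

    def __contains__(self, n):
        k = n >> 3
        return 0 <= k < self.bytecount and bool(self._byte(k) & (1 << (n & 7)))

    def __iter__(self):
        for k in range(self.bytecount):
            byte = self._byte(k)
            for bit in range(8):
                if byte & (1 << bit):
                    yield k * 8 + bit


f = io.BytesIO()
f.write(b"HEADER")      # unrelated data stored before the bit array
basepos = f.tell()      # the bit array starts right after it
bytecount = write_bits(f, [1, 10, 15, 7, 2])

odbs = TinyOnDiskBitSet(f, basepos, bytecount)
on_disk_ids = list(odbs)
```

Because every lookup seeks relative to `basepos`, the bit array can sit anywhere inside a larger file, and `bytecount` keeps the reader from running past its end into unrelated data.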
- class whoosh.idsets.SortedIntSet(source=None, typecode='I')
  A DocIdSet backed by a sorted array of integers.
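As a rough illustration of the idea, a sorted `array` combined with binary search gives compact storage whose size depends on the number of ids rather than on their magnitude. `TinySortedIntSet` is a hypothetical class for this sketch and does not reproduce Whoosh's implementation.

```python
from array import array
from bisect import bisect_left


class TinySortedIntSet:
    """Hypothetical sorted-array set (a sketch, not Whoosh's SortedIntSet)."""

    def __init__(self, source=None, typecode="I"):
        # "I" is the array typecode for unsigned int.
        self.data = array(typecode, sorted(set(source or ())))

    def add(self, n):
        # Binary search for the insertion point; skip duplicates.
        pos = bisect_left(self.data, n)
        if pos == len(self.data) or self.data[pos] != n:
            self.data.insert(pos, n)

    def __contains__(self, n):
        pos = bisect_left(self.data, n)
        return pos < len(self.data) and self.data[pos] == n

    def __iter__(self):
        return iter(self.data)

    def __len__(self):
        return len(self.data)


s = TinySortedIntSet([1, 10, 15, 7, 2])
s.add(8)
s.add(10)  # already present, so nothing changes
sorted_ids = list(s)

# One very large id still costs only one array item.
sparse = TinySortedIntSet([1_000_000])
sparse_items = len(sparse.data)
```

The trade-off in this sketch is that membership tests take a binary search and insertion shifts array items, whereas a bit array answers both with a single byte access.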
- class whoosh.idsets.MultiIdSet(idsets, offsets)
  Wraps multiple SERIAL sub-DocIdSet objects and presents them as an aggregated, read-only set.

  Parameters:
  - idsets – a list of DocIdSet objects.
  - offsets – a list of offsets corresponding to the DocIdSet objects in idsets.
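One way to picture the aggregation is sketched below. It assumes that each sub-set stores ids relative to its own offset and that the offsets are in ascending order. `TinyMultiIdSet` is a hypothetical class, and plain Python sets stand in for DocIdSet objects.

```python
from bisect import bisect_right


class TinyMultiIdSet:
    """Hypothetical read-only view over several id sets (a sketch only)."""

    def __init__(self, idsets, offsets):
        assert len(idsets) == len(offsets)
        self.idsets = idsets
        self.offsets = offsets  # assumed ascending, one offset per sub-set

    def _subset_for(self, n):
        # Index of the last sub-set whose offset is <= n.
        return max(bisect_right(self.offsets, n) - 1, 0)

    def __contains__(self, n):
        i = self._subset_for(n)
        # Translate the global id back into the sub-set's local numbering.
        return (n - self.offsets[i]) in self.idsets[i]

    def __iter__(self):
        for idset, offset in zip(self.idsets, self.offsets):
            for n in sorted(idset):
                yield n + offset

    def __len__(self):
        return sum(len(idset) for idset in self.idsets)


# Two sub-sets: the first covers global ids from 0, the second from 10.
multi = TinyMultiIdSet([{0, 2}, {1, 3}], [0, 10])
global_ids = list(multi)
```

Here local id 1 of the second sub-set appears as global id 11. The view defines no `add` or `remove` methods, mirroring the read-only behaviour described above.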