support.bitvector module
An implementation of an object that acts like a collection of on/off bits.
Base classes
- class whoosh.idsets.DocIdSet
  Base class for a set of positive integers, implementing a subset of the built-in set type’s interface with extra docid-related methods.
  This is a superclass for alternative set implementations to the built-in set which are more memory-efficient and specialized toward storing sorted lists of positive integers, though they will inevitably be slower than set for most operations since they’re pure Python.
  - after(i)
    Returns the next integer in the set after i, or None if there is no such integer.
  - before(i)
    Returns the previous integer in the set before i, or None if there is no such integer.
  - first()
    Returns the first (lowest) integer in the set.
  - invert_update(size)
    Updates the set in-place to contain the numbers in the range [0, size) that are not currently in this set.
  - last()
    Returns the last (highest) integer in the set.
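The navigation methods above can be illustrated with a small standalone sketch. This is not Whoosh's implementation: `SortedListIdSet` is a hypothetical class backed by a plain sorted list, and it assumes that "after" and "before" mean strictly greater and strictly smaller than `i`, and that `first()` and `last()` are only called on a non-empty set.

```python
from bisect import bisect_left, bisect_right


class SortedListIdSet:
    """Hypothetical stand-in (not Whoosh code): ids kept in a sorted list."""

    def __init__(self, source=()):
        self.data = sorted(set(source))

    def first(self):
        # Lowest integer in the set (assumes the set is non-empty).
        return self.data[0]

    def last(self):
        # Highest integer in the set (assumes the set is non-empty).
        return self.data[-1]

    def after(self, i):
        # First element strictly greater than i, or None.
        pos = bisect_right(self.data, i)
        return self.data[pos] if pos < len(self.data) else None

    def before(self, i):
        # Last element strictly smaller than i, or None.
        pos = bisect_left(self.data, i)
        return self.data[pos - 1] if pos > 0 else None

    def invert_update(self, size):
        # Keep exactly the numbers in [0, size) that were NOT in the set.
        present = set(self.data)
        self.data = [n for n in range(size) if n not in present]


ids = SortedListIdSet([1, 10, 15, 7, 2])
print(ids.first(), ids.last())      # 1 15
print(ids.after(7), ids.before(7))  # 10 2
print(ids.after(15))                # None
ids.invert_update(5)
print(ids.data)                     # [0, 3, 4]
```

Note that `invert_update(5)` drops 7, 10, and 15 as well, because they fall outside the range [0, 5).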
- class whoosh.idsets.BaseBitSet
Implementation classes
- class whoosh.idsets.BitSet(source=None, size=0)
  A DocIdSet backed by an array of bits. This can also be useful as a bit array (e.g. for a Bloom filter). It is much more memory efficient than a large built-in set of integers, but wastes memory for sparse sets.
  Parameters:
  - maxsize – the maximum size of the bit array.
  - source – an iterable of positive integers to add to this set.
  - bits – an array of unsigned bytes (“B”) to use as the underlying bit array. This is used by some of the object’s methods.
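To make the memory trade-off concrete, here is a minimal sketch of a set backed by an array of bits. `TinyBitSet` is a hypothetical class written for illustration with a standard-library `bytearray`; it is not Whoosh's `BitSet` and does not reproduce its API beyond the general idea of one bit per possible integer.

```python
class TinyBitSet:
    """Hypothetical sketch (not Whoosh's BitSet): one bit per possible id."""

    def __init__(self, source=(), size=0):
        # Reserve enough whole bytes to hold `size` bits.
        self.bits = bytearray((size + 7) // 8)
        for n in source:
            self.add(n)

    def add(self, n):
        byte, offset = divmod(n, 8)
        if byte >= len(self.bits):
            # Grow the array with zero bytes so that bit n exists.
            self.bits.extend(bytes(byte + 1 - len(self.bits)))
        self.bits[byte] |= 1 << offset

    def __contains__(self, n):
        byte, offset = divmod(n, 8)
        return byte < len(self.bits) and bool(self.bits[byte] & (1 << offset))

    def __iter__(self):
        # Bits are visited in ascending order, so ids come out sorted.
        for byte_index, byte in enumerate(self.bits):
            for offset in range(8):
                if byte & (1 << offset):
                    yield byte_index * 8 + offset


dense = TinyBitSet([1, 10, 15, 7, 2])
print(list(dense))       # [1, 2, 7, 10, 15]
print(len(dense.bits))   # 2

sparse = TinyBitSet([1_000_000])
print(len(sparse.bits))  # 125001
```

The second case shows why a bit array suits dense sets but not sparse ones: storing the single id 1,000,000 still allocates a byte for every group of eight smaller ids.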
- class whoosh.idsets.OnDiskBitSet(dbfile, basepos, bytecount)
  A DocIdSet backed by an array of bits on disk.
  >>> from whoosh.filedb.filestore import RamStorage
  >>> from whoosh.idsets import BitSet, OnDiskBitSet
  >>> st = RamStorage()
  >>> f = st.create_file("test.bin")
  >>> bs = BitSet([1, 10, 15, 7, 2])
  >>> bytecount = bs.to_disk(f)
  >>> f.close()
  >>> # ...
  >>> f = st.open_file("test.bin")
  >>> # The bits were written at the start of the file, so basepos is 0
  >>> odbs = OnDiskBitSet(f, 0, bytecount)
  >>> list(odbs)
  [1, 2, 7, 10, 15]
  Parameters:
  - dbfile – a StructFile object to read from.
  - basepos – the base position of the bytes in the given file.
  - bytecount – the number of bytes to use for the bit array.
- class whoosh.idsets.SortedIntSet(source=None, typecode='I')
  A DocIdSet backed by a sorted array of integers.
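As a rough illustration of the sorted-array approach, the sketch below keeps ids in a typed `array` and uses binary search for lookups and insertions. `TinySortedIntSet` is a hypothetical class, not Whoosh's `SortedIntSet`; the only detail borrowed from the signature above is the default typecode `'I'` (unsigned int).

```python
from array import array
from bisect import bisect_left


class TinySortedIntSet:
    """Hypothetical sketch (not Whoosh code): ids in a sorted typed array."""

    def __init__(self, source=(), typecode="I"):
        # A typed array stores raw machine integers, not Python objects.
        self.data = array(typecode, sorted(set(source)))

    def add(self, n):
        # Binary search for the insertion point; skip duplicates.
        pos = bisect_left(self.data, n)
        if pos == len(self.data) or self.data[pos] != n:
            self.data.insert(pos, n)

    def __contains__(self, n):
        pos = bisect_left(self.data, n)
        return pos < len(self.data) and self.data[pos] == n

    def __iter__(self):
        return iter(self.data)

    def __len__(self):
        return len(self.data)


ids = TinySortedIntSet([1, 10, 15, 7, 2])
ids.add(8)
ids.add(8)  # adding an existing id is a no-op
print(list(ids))          # [1, 2, 7, 8, 10, 15]
print(8 in ids, 9 in ids)  # True False
```

Unlike a bit array, the memory used here grows with the number of ids stored rather than with the largest id, which is why this layout tends to suit sparse sets.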
- class whoosh.idsets.MultiIdSet(idsets, offsets)
  Wraps multiple SERIAL sub-DocIdSet objects and presents them as an aggregated, read-only set.
  Parameters:
  - idsets – a list of DocIdSet objects.
  - offsets – a list of offsets corresponding to the DocIdSet objects in idsets.
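The aggregation idea can be sketched as follows. `OffsetIdSets` is a hypothetical class, not Whoosh's `MultiIdSet`. It assumes that "serial" means the sub-sets cover consecutive, non-overlapping id ranges, that `offsets` is sorted in ascending order, and that each sub-set stores ids relative to its own offset.

```python
from bisect import bisect_right


class OffsetIdSets:
    """Hypothetical sketch (not Whoosh code): read-only view over sub-sets."""

    def __init__(self, idsets, offsets):
        if len(idsets) != len(offsets):
            raise ValueError("need exactly one offset per sub-set")
        self.idsets = idsets
        self.offsets = offsets  # assumed sorted ascending

    def __contains__(self, n):
        # Pick the last sub-set whose offset is <= n, then test the local id.
        i = bisect_right(self.offsets, n) - 1
        return i >= 0 and (n - self.offsets[i]) in self.idsets[i]

    def __iter__(self):
        # Translate each local id back to a global id by adding the offset.
        for idset, offset in zip(self.idsets, self.offsets):
            for n in sorted(idset):
                yield n + offset


segments = [{0, 2}, {1}, {0, 3}]  # local ids within each sub-set
offsets = [0, 10, 20]             # where each sub-set starts globally
combined = OffsetIdSets(segments, offsets)
print(list(combined))                  # [0, 2, 11, 20, 23]
print(11 in combined, 12 in combined)  # True False
```

The wrapper never copies the underlying sets, so it stays cheap to build. The trade-off is that it offers no way to add or remove ids, in keeping with the read-only description above.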