filedb.filestore module

Base class

class whoosh.filedb.filestore.Storage

Abstract base class for storage objects.

A storage object is a virtual flat filesystem, allowing the creation and retrieval of file-like objects (StructFile objects). The default implementation (FileStorage) uses actual files in a directory.

All access to files in Whoosh goes through this object. This allows different forms of storage (for example, in RAM, in a database, or in a single file) to be used transparently.

For example, to create a FileStorage object:

from whoosh.filedb.filestore import FileStorage
# Create a storage object
st = FileStorage("indexdir")
# Create the directory if it doesn't already exist
st.create()

The Storage.create() method makes it slightly easier to swap storage implementations. The create() method handles set-up of the storage object. For example, FileStorage.create() creates the directory. A database implementation might create tables. This is designed to let you avoid putting implementation-specific setup code in your application.
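The idea of a flat virtual filesystem with swappable backends can be sketched with a toy in-memory class. This is only an illustration under stated assumptions: ToyRamStorage, _ToyWriter, and save_greeting are hypothetical names, and the code does not reproduce how Whoosh implements Storage or StructFile.

```python
import io


class _ToyWriter(io.BytesIO):
    """Hypothetical write buffer that saves its bytes into the storage on close."""

    def __init__(self, files, name):
        super().__init__()
        self._files = files
        self._name = name

    def close(self):
        # Persist the buffer contents before the underlying buffer is released
        if not self.closed:
            self._files[self._name] = self.getvalue()
        super().close()


class ToyRamStorage:
    """Hypothetical in-memory stand-in for the Storage interface (not Whoosh code)."""

    def __init__(self):
        self.files = {}  # flat namespace: file name -> bytes

    def create_file(self, name):
        return _ToyWriter(self.files, name)

    def open_file(self, name):
        return io.BytesIO(self.files[name])

    def file_exists(self, name):
        return name in self.files

    def list(self):
        return sorted(self.files)


def save_greeting(storage):
    # Application code only talks to the storage interface, never to the backend
    with storage.create_file("greeting.txt") as f:
        f.write(b"hello")


st = ToyRamStorage()
save_greeting(st)
```

Because save_greeting only uses create_file(), any object offering the same method (directory-backed, database-backed, and so on) could be passed in without changing the application code.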

close()

Closes any resources opened by this storage object. For some storage implementations this will be a no-op, but for others it is necessary to release locks and/or prevent leaks, so it’s a good idea to call it when you’re done with a storage object.

create()

Creates any required implementation-specific resources. A filesystem-based implementation might create a directory, while a database implementation might create tables. For example:

from whoosh.filedb.filestore import FileStorage
# Create a storage object
st = FileStorage("indexdir")
# Create any necessary resources
st.create()

This method returns self so you can also say:

st = FileStorage("indexdir").create()

Storage implementations should be written so that calling create() a second time on the same storage

Returns: a Storage instance.
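A minimal sketch of the "return self" pattern described above, using a hypothetical ToyDirStorage class rather than Whoosh's FileStorage. It assumes that the constructor leaves the filesystem untouched and that calling create() again on an existing directory should be a no-op.

```python
import os
import tempfile


class ToyDirStorage:
    """Hypothetical directory-backed storage (illustration only, not Whoosh code)."""

    def __init__(self, path):
        # The constructor only records the path; nothing is created yet
        self.folder = path

    def create(self):
        # exist_ok=True means an already existing directory is not an error
        os.makedirs(self.folder, exist_ok=True)
        # Returning self is what allows ToyDirStorage(path).create()
        return self


base = tempfile.mkdtemp()
path = os.path.join(base, "indexdir")

st = ToyDirStorage(path).create()
same = st.create()  # second call on the same storage
```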
create_file(name)

Creates a file with the given name in this storage.

Parameters: name – the name for the new file.
Returns: a whoosh.filedb.structfile.StructFile instance.
create_index(schema, indexname='MAIN', indexclass=None)

Creates a new index in this storage.

>>> from whoosh import fields
>>> from whoosh.filedb.filestore import FileStorage
>>> schema = fields.Schema(content=fields.TEXT)
>>> # Create the storage directory
>>> st = FileStorage("indexdir").create()
>>> # Create an index in the storage
>>> ix = st.create_index(schema)
Parameters:
  • schema – the whoosh.fields.Schema object to use for the new index.
  • indexname – the name of the index within the storage object. You can use this option to store multiple indexes in the same storage.
  • indexclass – an optional custom Index sub-class to use to create the index files. The default is whoosh.index.FileIndex. This method will call the create class method on the given class to create the index.
Returns: a whoosh.index.Index instance.

delete_file(name)

Removes the given file from this storage.

Parameters: name – the name to delete.
destroy(*args, **kwargs)

Removes any implementation-specific resources related to this storage object. For example, a filesystem-based implementation might delete a directory, and a database implementation might drop tables.

The arguments are implementation-specific.

file_exists(name)

Returns True if the given file exists in this storage.

Parameters: name – the name to check.
Return type: bool
file_length(name)

Returns the size (in bytes) of the given file in this storage.

Parameters: name – the name to check.
Return type: int
file_modified(name)

Returns the last-modified time of the given file in this storage (as a “ctime” UNIX timestamp).

Parameters: name – the name to check.
Returns: a “ctime” number.
index_exists(indexname=None)

Returns True if a non-empty index exists in this storage.

Parameters: indexname – the name of the index within the storage object. You can use this option to store multiple indexes in the same storage.
Return type: bool
list()

Returns a list of file names in this storage.

Returns: a list of strings.
lock(name)

Returns a named lock object (implementing .acquire() and .release() methods). Different storage implementations may use different lock types with different guarantees. For example, the RamStorage object uses Python thread locks, while the FileStorage object uses filesystem-based locks that are valid across different processes.

Parameters: name – a name for the lock.
Returns: a lock-like object.
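The "named lock" contract can be illustrated with plain thread locks. The sketch below is a toy model, not Whoosh's implementation: ToyLockingStorage and the lock name "WRITELOCK" are assumptions chosen for the example, and thread locks only protect threads within a single process.

```python
import threading


class ToyLockingStorage:
    """Hypothetical storage handing out named thread locks (illustration only)."""

    def __init__(self):
        self._locks = {}
        self._guard = threading.Lock()  # protects the dictionary of locks

    def lock(self, name):
        # The same name always maps to the same lock object
        with self._guard:
            return self._locks.setdefault(name, threading.Lock())


st = ToyLockingStorage()
write_lock = st.lock("WRITELOCK")

# Non-blocking acquire succeeds because nobody holds the lock yet
acquired = write_lock.acquire(blocking=False)
try:
    # A second non-blocking attempt on the same name fails while it is held
    second_try = st.lock("WRITELOCK").acquire(blocking=False)
finally:
    write_lock.release()
```

Callers only rely on acquire() and release(), so a storage backed by the filesystem could return a different lock type with cross-process guarantees without changing this calling code.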
open_file(name, *args, **kwargs)

Opens a file with the given name in this storage.

Parameters: name – the name of the file to open.
Returns: a whoosh.filedb.structfile.StructFile instance.
open_index(indexname='MAIN', schema=None, indexclass=None)

Opens an existing index (created using create_index()) in this storage.

>>> from whoosh.filedb.filestore import FileStorage
>>> st = FileStorage("indexdir")
>>> # Open an index in the storage
>>> ix = st.open_index()
Parameters:
  • indexname – the name of the index within the storage object. You can use this option to store multiple indexes in the same storage.
  • schema – if you pass in a whoosh.fields.Schema object using this argument, it will override the schema that was stored with the index.
  • indexclass – an optional custom Index sub-class to use to open the index files. The default is whoosh.index.FileIndex. This method will instantiate the class with this storage object.
Returns: a whoosh.index.Index instance.

optimize()

Optimizes the storage object. The meaning and cost of “optimizing” will vary by implementation. For example, a database implementation might run a garbage collection procedure on the underlying database.

rename_file(frm, to, safe=False)

Renames a file in this storage.

Parameters:
  • frm – The current name of the file.
  • to – The new name for the file.
  • safe – if True, raise an exception if a file with the new name already exists.
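The effect of the safe flag can be sketched with a standalone helper on a plain directory. This is a hypothetical function, not the library's code; in particular, the exception type (FileExistsError) is an assumption, since the text above only says that an exception is raised.

```python
import os
import tempfile


def rename_file(folder, frm, to, safe=False):
    """Hypothetical helper mirroring the documented `safe` flag (not Whoosh code)."""
    src = os.path.join(folder, frm)
    dst = os.path.join(folder, to)
    if safe and os.path.exists(dst):
        # Refuse to clobber an existing file
        raise FileExistsError("File %r exists" % to)
    # os.replace overwrites an existing destination file
    os.replace(src, dst)


folder = tempfile.mkdtemp()
for name, data in (("a.seg", b"new"), ("b.seg", b"old")):
    with open(os.path.join(folder, name), "wb") as f:
        f.write(data)

try:
    rename_file(folder, "a.seg", "b.seg", safe=True)
    refused = False
except FileExistsError:
    refused = True

# With safe=False (the default) the existing b.seg is replaced
rename_file(folder, "a.seg", "b.seg")
```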
temp_storage(name=None)

Creates a new storage object for temporary files. You can call Storage.destroy() on the new storage when you’re finished with it.

Parameters: name – a name for the new storage. This may be optional or required depending on the storage implementation.
Return type: Storage
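The create-use-destroy lifecycle of a temporary storage might look like the following sketch. ToyTempStorage and this module-level temp_storage function are hypothetical; how a real implementation names or places its temporary area is not specified above.

```python
import os
import shutil
import tempfile


class ToyTempStorage:
    """Hypothetical throwaway storage (illustration only, not Whoosh code)."""

    def __init__(self, folder):
        self.folder = folder

    def create_file(self, name):
        return open(os.path.join(self.folder, name), "wb")

    def destroy(self):
        # Remove the directory and everything in it
        shutil.rmtree(self.folder)


def temp_storage(name=None):
    # mkdtemp creates a fresh, uniquely named directory
    prefix = (name or "tmp") + "-"
    return ToyTempStorage(tempfile.mkdtemp(prefix=prefix))


tmp = temp_storage("merge")
with tmp.create_file("scratch.dat") as f:
    f.write(b"intermediate data")

existed = os.path.isdir(tmp.folder)
tmp.destroy()  # clean up once the temporary files are no longer needed
```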

Implementation classes

class whoosh.filedb.filestore.FileStorage(path, supports_mmap=True, readonly=False, debug=False)

Storage object that stores the index as files in a directory on disk.

Prior to version 3, the initializer would raise an IOError if the directory did not exist. As of version 3, the object does not check if the directory exists at initialization. This change is to support using the FileStorage.create() method.

Parameters:
  • path – a path to a directory.
  • supports_mmap – if True (the default), use the mmap module to open memory mapped files. You can open the storage object with supports_mmap=False to force Whoosh to open files normally instead of with mmap.
  • readonly – If True, the object will raise an exception if you attempt to create or rename a file.
class whoosh.filedb.filestore.RamStorage

Storage object that keeps the index in memory.

Helper functions

whoosh.filedb.filestore.copy_storage(sourcestore, deststore)

Copies the files from the source storage object to the destination storage object using shutil.copyfileobj.
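A copy of this kind can be sketched using only the storage-style methods list(), open_file(), and create_file() together with shutil.copyfileobj. The ToyDirStorage class and the copy_all function are hypothetical stand-ins for illustration, not the library's code.

```python
import os
import shutil
import tempfile


class ToyDirStorage:
    """Hypothetical directory-backed storage (illustration only, not Whoosh code)."""

    def __init__(self, folder):
        self.folder = folder

    def list(self):
        return sorted(os.listdir(self.folder))

    def open_file(self, name):
        return open(os.path.join(self.folder, name), "rb")

    def create_file(self, name):
        return open(os.path.join(self.folder, name), "wb")


def copy_all(sourcestore, deststore):
    # Stream each file between file-like objects instead of loading it whole
    for name in sourcestore.list():
        with sourcestore.open_file(name) as src, deststore.create_file(name) as dst:
            shutil.copyfileobj(src, dst)


source = ToyDirStorage(tempfile.mkdtemp())
dest = ToyDirStorage(tempfile.mkdtemp())
with source.create_file("segment.dat") as f:
    f.write(b"\x00\x01\x02")

copy_all(source, dest)
```

Since the copy goes through the storage interface, the source and destination do not need to be the same kind of storage.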

whoosh.filedb.filestore.copy_to_ram(storage)

Copies the given FileStorage object into a new RamStorage object.

Return type: RamStorage

Exceptions

exception whoosh.filedb.filestore.ReadOnlyError