Corpus

The corpus module provides document management and statistical analysis for the Lexos ecosystem. It offers centralized storage, metadata management, and inter-module communication, which allow it to integrate with the analysis modules. By default, it is entirely file-based; however, a corpus can optionally be managed in a SQLite database.
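To make the difference between file-based and SQLite-backed storage concrete, here is a minimal sketch that uses only the Python standard library. It does not use the Lexos API: the helper names (`save_record_to_file`, `save_record_to_sqlite`, `demo`), the JSON layout, and the `records` table schema are all hypothetical and chosen purely for illustration.

```python
import json
import sqlite3
import tempfile
from pathlib import Path

# Hypothetical illustration only: none of these helpers are part of Lexos.


def save_record_to_file(corpus_dir: Path, record_id: str, text: str, meta: dict) -> Path:
    """File-based storage: write one JSON file per record."""
    corpus_dir.mkdir(parents=True, exist_ok=True)
    path = corpus_dir / f"{record_id}.json"
    payload = {"id": record_id, "text": text, "meta": meta}
    path.write_text(json.dumps(payload), encoding="utf-8")
    return path


def save_record_to_sqlite(conn: sqlite3.Connection, record_id: str, text: str, meta: dict) -> None:
    """Database storage: write one row per record, with metadata as JSON text."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS records (id TEXT PRIMARY KEY, text TEXT, meta TEXT)"
    )
    conn.execute(
        "INSERT OR REPLACE INTO records VALUES (?, ?, ?)",
        (record_id, text, json.dumps(meta)),
    )
    conn.commit()


def demo() -> tuple:
    """Store the same record both ways and read it back."""
    meta = {"author": "Melville"}

    # File-based: the record lives in a directory on disk.
    with tempfile.TemporaryDirectory() as tmp:
        path = save_record_to_file(Path(tmp) / "corpus", "doc1", "Call me Ishmael.", meta)
        from_file = json.loads(path.read_text(encoding="utf-8"))

    # SQLite-backed: the record lives in a database table (in memory here).
    conn = sqlite3.connect(":memory:")
    save_record_to_sqlite(conn, "doc1", "Call me Ishmael.", meta)
    from_db = conn.execute(
        "SELECT id, text FROM records WHERE id = ?", ("doc1",)
    ).fetchone()
    conn.close()
    return from_file, from_db
```

In this sketch both back ends store the same fields, so the choice between them affects only where the data lives and how it is queried, not what a record contains.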

Core Classes

`Corpus` (`corpus.py`)

The corpus module provides the main container for managing collections of documents. It handles document storage, metadata management, and inter-module communication.

`Record` (`record.py`)

The record module implements an individual document container with robust metadata and serialization capabilities.

`CorpusStats` (`corpus_stats.py`)

The corpus_stats module provides methods for generating statistics about a corpus.

`LexosModelCache` and `RecordsDict` (`utils.py`)

The utils module provides utility classes for efficient model management and type-safe record storage.

SQLite Database

Database management is implemented in two modules:

`SQLiteBackend` (`database.py`)

The database module provides the main database functionality.

`SQLiteCorpus` (`integration.py`)

The integration module provides the handler for integration with the main corpus API.

Corpus Analysis Report

The corpus_analysis_report module provides a helper function for generating a comprehensive analysis of the contents of a corpus instance.
