cellarr_frame package
Submodules
cellarr_frame.base module
- class cellarr_frame.base.CellArrayFrame(uri=None, tiledb_array_obj=None, mode=None, config_or_context=None)
Bases: ABC
Abstract base class for TileDB dataframe operations.
- __abstractmethods__ = frozenset({'__getitem__', 'append_dataframe', 'columns', 'get_shape', 'index', 'read_dataframe', 'shape', 'write_dataframe'})
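The `__abstractmethods__` set above is how Python's `abc` machinery records which members a subclass still has to override; the concrete classes further down report an empty set. A minimal sketch of that mechanism follows, using hypothetical stand-in classes (`FrameBase`, `ListFrame`) with only two abstract members rather than the real `CellArrayFrame` interface:

```python
from abc import ABC, abstractmethod


class FrameBase(ABC):
    """Hypothetical stand-in for an abstract frame interface."""

    @abstractmethod
    def read_dataframe(self):
        ...

    @property
    @abstractmethod
    def columns(self):
        ...


class ListFrame(FrameBase):
    """Concrete subclass that overrides every abstract member."""

    def __init__(self, rows):
        self._rows = rows

    def read_dataframe(self):
        return list(self._rows)

    @property
    def columns(self):
        return sorted({key for row in self._rows for key in row})


# The base class lists what is still abstract; the subclass lists nothing.
abstract_names = FrameBase.__abstractmethods__
concrete_names = ListFrame.__abstractmethods__

# Instantiating a class with remaining abstract members raises TypeError.
try:
    FrameBase()
    base_instantiable = True
except TypeError:
    base_instantiable = False

frame = ListFrame([{"a": 1, "b": 2}])
```

In the same way, a subclass of `CellArrayFrame` becomes instantiable only once all eight names in the set are implemented.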
- __init__(uri=None, tiledb_array_obj=None, mode=None, config_or_context=None)
Initialize the object.
- Parameters:
uri (Optional[str]) – URI to the array. Required if ‘tiledb_array_obj’ is not provided.
tiledb_array_obj (Optional[Array]) – Optional, an already opened tiledb.Array instance. If provided, ‘uri’ can be None, and ‘config_or_context’ is ignored.
mode (Optional[Literal['r','w','d','m']]) – Open the array object in read ‘r’, write ‘w’, modify ‘m’, or delete ‘d’ mode. Defaults to None for automatic mode switching. If ‘tiledb_array_obj’ is provided, this mode should ideally match the mode of the provided array or be None.
config_or_context (Union[Config,Ctx,None]) – Optional config or context object. Ignored if ‘tiledb_array_obj’ is provided, as context will be derived from the object. Defaults to None.
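The parameter rules above (an opened array object takes precedence, ‘uri’ is required otherwise, and ‘mode’ is one of four letters or None) can be summarised in a small sketch. `resolve_source` is a hypothetical helper written for illustration, not part of cellarr_frame, and raising `ValueError` is an assumption about how such a check might behave:

```python
from typing import Any, Optional

VALID_MODES = ("r", "w", "d", "m")


def resolve_source(uri: Optional[str] = None,
                   tiledb_array_obj: Optional[Any] = None,
                   mode: Optional[str] = None) -> str:
    """Return which source would be used: 'object' or 'uri' (illustrative only)."""
    if mode is not None and mode not in VALID_MODES:
        raise ValueError(f"mode must be one of {VALID_MODES} or None, got {mode!r}")
    if tiledb_array_obj is not None:
        # An already opened array wins; uri may be None in this case.
        return "object"
    if uri is None:
        raise ValueError("either 'uri' or 'tiledb_array_obj' must be provided")
    return "uri"
```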
- abstractmethod append_dataframe(df, row_offset=None)
Append a pandas DataFrame to the TileDB array.
- abstract property columns: Index
Get the column names of the dataframe.
- abstractmethod get_shape()
Get the shape of the array (number of rows for dataframes).
- Return type:
- abstract property index: Index
Get the row index of the dataframe.
- property mode: str | None
Get current array mode. If an external array is used, this is its open mode.
- open_array(mode=None)
Context manager for array operations.
Uses the externally provided array if available, otherwise opens from URI.
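The "external array if available, otherwise open from URI" behaviour is a common context-manager pattern. The sketch below illustrates it with hypothetical stand-ins (`FakeArray`, `Holder`); it assumes that an externally provided handle is left open for its owner to close and that the fallback mode is ‘r’, neither of which is confirmed by this reference:

```python
from contextlib import contextmanager


class FakeArray:
    """Hypothetical stand-in for an opened array handle."""

    def __init__(self, uri, mode):
        self.uri, self.mode, self.closed = uri, mode, False

    def close(self):
        self.closed = True


class Holder:
    """Hypothetical owner of either a URI or an external handle."""

    def __init__(self, uri=None, array_obj=None, mode=None):
        self.uri, self._external, self._mode = uri, array_obj, mode

    @contextmanager
    def open_array(self, mode=None):
        if self._external is not None:
            # External handle: hand it out and leave its lifetime to the caller.
            yield self._external
            return
        # No external handle: open from the URI and close on exit.
        arr = FakeArray(self.uri, mode or self._mode or "r")
        try:
            yield arr
        finally:
            arr.close()


external = FakeArray("mem://shared", "r")
with Holder(array_obj=external).open_array() as handle:
    same_object = handle is external

with Holder(uri="mem://owned").open_array(mode="w") as owned:
    open_inside = not owned.closed
```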
cellarr_frame.dense module
- class cellarr_frame.dense.DenseCellArrayFrame(uri=None, tiledb_array_obj=None, mode=None, config_or_context=None)
Bases: CellArrayFrame
Handler for dense dataframes using TileDB’s native dataframe support.
- __abstractmethods__ = frozenset({})
- __annotations__ = {}
- property columns: Index
Get the column names (attributes) of the dataframe.
- property dtypes: Series
Return the dtypes of the columns/attributes in the array.
- classmethod from_dataframe(uri, df, **kwargs)
Create a DenseCellArrayFrame from a pandas DataFrame.
This uses tiledb.from_pandas to create the array, ensuring compatibility with TileDB’s native pandas integration.
- Parameters:
uri (str) – URI to create the array at.
df (DataFrame) – Pandas DataFrame to write.
**kwargs – Additional arguments.
- Return type:
- property index: Index
Get the row index of the dataframe.
- read_dataframe(columns=None, query=None, subset=None, primary_key_column_name=None, **kwargs)
Read a pandas DataFrame from the TileDB array.
- Parameters:
- Return type:
DataFrame
- Returns:
The pandas DataFrame.
- property rows: Index
Alias for index to match Metadata interface.
- write_dataframe(df, **kwargs)
Write a dense pandas DataFrame to a 1D TileDB array.
This assumes the array was created using tiledb.from_pandas or the helper function. The dataframe is written starting at row 0.
- Parameters:
df (DataFrame) – The pandas DataFrame to write.
**kwargs – Additional arguments.
- Return type:
cellarr_frame.helpers module
cellarr_frame.sparse module
- class cellarr_frame.sparse.SparseCellArrayFrame(uri=None, tiledb_array_obj=None, mode=None, config_or_context=None)
Bases: CellArrayFrame
Handler for sparse dataframes using a 2D sparse TileDB array.
This class wraps a cellarr_array.SparseCellArray instance, assuming it’s a 2D sparse array with string/object data.
- __abstractmethods__ = frozenset({})
- __annotations__ = {}
- __init__(uri=None, tiledb_array_obj=None, mode=None, config_or_context=None)
Initialize the object.
- append_dataframe(df, row_offset=None)
Append data points from a pandas DataFrame to the sparse TileDB array.
If row_offset is provided, adjusts the row indices of the appended data. Assumes integer row dimensions for offset calculation.
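The row-offset adjustment can be pictured as shifting every integer row coordinate by a constant before the points are written. A minimal sketch, assuming the data is already a list of (row, column, value) triples; `shift_rows` is a hypothetical helper, not the library's implementation:

```python
def shift_rows(coords, row_offset=None):
    """Shift integer row coordinates of (row, column, value) triples."""
    if row_offset is None:
        # No offset requested: row indices are used as given.
        return list(coords)
    return [(row + row_offset, col, val) for row, col, val in coords]


existing_rows = 3  # rows 0..2 are assumed to be occupied already
new_points = [(0, "gene", "TP53"), (1, "gene", "BRCA1")]
shifted = shift_rows(new_points, row_offset=existing_rows)
```

With an offset equal to the current row count, the new points land directly after the existing rows instead of overwriting them.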
- property columns: Index
Get the column names (unique values from 2nd dim) of the dataframe.
- property index: Index
Get the row index (unique values from 1st dim) of the dataframe.
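Because the sparse array stores one cell per (row, column) pair, both properties amount to collecting the distinct coordinates along one dimension. A dependency-free sketch of that idea follows; the first-seen ordering used here is an assumption, and the library may order the values differently:

```python
# Hypothetical stored cells as (row, column, value) triples.
cells = [(0, "name", "a"), (0, "kind", "x"), (2, "name", "c")]


def unique_in_order(values):
    """Distinct values, keeping first-seen order."""
    return list(dict.fromkeys(values))


row_index = unique_in_order(r for r, _, _ in cells)     # 1st dimension
column_names = unique_in_order(c for _, c, _ in cells)  # 2nd dimension
```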
- read_dataframe(subset=None, columns=None, query=None, **kwargs)
Read a pandas DataFrame from the TileDB array.
- Parameters:
- Return type:
DataFrame
- Returns:
The pandas DataFrame.
- write_dataframe(df, **kwargs)
Write a sparse pandas DataFrame to a 2D sparse TileDB array.
The DataFrame is converted to a coordinate format (row_idx, col_idx, value).
- Parameters:
df (DataFrame) – The sparse pandas DataFrame to write.
**kwargs – Additional arguments for the write operation.
- Return type:
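The coordinate format mentioned for write_dataframe, (row_idx, col_idx, value), can be illustrated without pandas. The sketch below uses a nested dict as a stand-in for a sparse table and assumes that missing cells (None here) are simply not stored; `to_coordinates` and `from_coordinates` are hypothetical helpers:

```python
def to_coordinates(table):
    """Flatten {row_idx: {col_idx: value}} into (row_idx, col_idx, value) triples."""
    triples = []
    for row_idx, row in table.items():
        for col_idx, value in row.items():
            if value is None:
                continue  # missing cells produce no coordinate
            triples.append((row_idx, col_idx, value))
    return triples


def from_coordinates(triples):
    """Rebuild the nested table from coordinate triples."""
    table = {}
    for row_idx, col_idx, value in triples:
        table.setdefault(row_idx, {})[col_idx] = value
    return table


table = {0: {"name": "a", "kind": None}, 1: {"name": None, "kind": "x"}}
triples = to_coordinates(table)
rebuilt = from_coordinates(triples)
```

Only the populated cells survive the round trip, which is the point of the sparse layout.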