cellarr_array.core package

Submodules

cellarr_array.core.base module

class cellarr_array.core.base.CellArray(uri=None, tiledb_array_obj=None, attr='data', mode=None, config_or_context=None, validate=True)[source]

Bases: ABC

Abstract base class for TileDB array operations.

__abstractmethods__ = frozenset({'_direct_slice', '_multi_index', 'write_batch'})

__getitem__(key)[source]

Get item implementation that routes to either direct slicing or multi_index based on the type of indices provided.

Parameters:

key (Union[slice, EllipsisType, Tuple[Union[slice, List[int]], ...]]) – Slice or list of indices for each dimension in the array.
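
The routing described above can be illustrated with a minimal sketch. It is not the library's implementation: `route_key` is a hypothetical helper, and the rule it applies (only slices or Ellipsis means direct slicing, any list means multi_index) is an assumption drawn from the parameter type.

```python
def route_key(key):
    """Name the read path a __getitem__-style key would take.

    Assumption (not confirmed by the library source): keys made only of
    slices or Ellipsis use direct slicing; any list of indices uses
    multi_index.
    """
    # A bare slice or Ellipsis is treated as a one-element tuple.
    if not isinstance(key, tuple):
        key = (key,)

    uses_lists = False
    for part in key:
        if isinstance(part, list):
            uses_lists = True
        elif not (isinstance(part, slice) or part is Ellipsis):
            raise TypeError(f"unsupported index type: {type(part).__name__}")

    return "multi_index" if uses_lists else "direct_slice"
```

For example, `route_key((slice(0, 10), [1, 3, 5]))` selects the multi_index path because the second dimension is given as a list.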

__init__(uri=None, tiledb_array_obj=None, attr='data', mode=None, config_or_context=None, validate=True)[source]

Initialize the object.

Parameters:
  • uri (Optional[str]) – URI to the array. Required if ‘tiledb_array_obj’ is not provided.

  • tiledb_array_obj (Optional[Array]) – Optional, an already opened tiledb.Array instance. If provided, ‘uri’ can be None, and ‘config_or_context’ is ignored.

  • attr (str) – Attribute to access. Defaults to “data”.

  • mode (Optional[Literal['r', 'w', 'd', 'm']]) –

    Mode in which to open the array: read (‘r’), write (‘w’), modify (‘m’), or delete (‘d’).

    Defaults to None for automatic mode switching.

    If ‘tiledb_array_obj’ is provided, this mode should ideally match the mode of the provided array or be None.

  • config_or_context (Union[Config, Ctx, None]) –

    Optional config or context object. Ignored if ‘tiledb_array_obj’ is provided, as context will be derived from the object.

    Defaults to None.

  • validate (bool) – Whether to validate the attributes. Defaults to True.

property attr_names: List[str]

Get attribute names of the array.

consolidate(config=None)[source]

Consolidate array fragments.

Parameters:

config (Optional[ConsolidationConfig]) – Optional consolidation configuration.

Return type:

None

property dim_names: List[str]

Get dimension names of the array.

property mode: str | None

Get current array mode. If an external array is used, this is its open mode.

property ndim: int

Get number of dimensions.

property nonempty_domain: Tuple[Any, ...] | None

open_array(mode=None)[source]

Context manager for array operations.

Uses the externally provided array if available, otherwise opens from URI.

Parameters:

mode (Optional[str]) –

Desired mode for the operation (‘r’, ‘w’, ‘m’, ‘d’). If an external array is used, this mode must be compatible with (or same as) the mode the external array was opened with.

If None, uses the CellArray’s default mode.
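
A minimal sketch of this pattern follows, using only the standard library. `FakeHandle` and `ArrayOpener` are hypothetical stand-ins, not cellarr_array or TileDB classes. Two points are assumptions: "compatible" is taken to mean "identical", and read mode is used as the fallback when no default mode is set.

```python
from contextlib import contextmanager


class FakeHandle:
    """Hypothetical stand-in for an opened array handle."""

    def __init__(self, uri, mode):
        self.uri = uri
        self.mode = mode
        self.closed = False

    def close(self):
        self.closed = True


class ArrayOpener:
    """Sketch of a wrapper that reuses an external handle or opens its own."""

    def __init__(self, uri=None, external=None, default_mode=None):
        self.uri = uri
        self.external = external
        self.default_mode = default_mode

    @contextmanager
    def open_array(self, mode=None):
        # Assumption: fall back to the default mode, then to read mode.
        if mode is None:
            mode = self.default_mode or "r"

        if self.external is not None:
            # Assumption: "compatible" is simplified to "identical".
            if self.external.mode != mode:
                raise ValueError(
                    f"external array is open in {self.external.mode!r}, "
                    f"requested {mode!r}"
                )
            # The caller owns the external handle, so it is not closed here.
            yield self.external
            return

        handle = FakeHandle(self.uri, mode)
        try:
            yield handle
        finally:
            handle.close()
```

In this sketch a handle opened from the URI is closed when the `with` block exits, while an externally supplied handle is left open for its owner to manage.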

property shape: Tuple[int, ...]

vacuum()[source]

Remove deleted fragments from the array.

Return type:

None

abstractmethod write_batch(data, start_row, **kwargs)[source]

Write a batch of data to the array starting at the specified row.

Parameters:
  • data (Union[ndarray, spmatrix]) – Data to write (numpy array for dense, scipy sparse matrix for sparse).

  • start_row (int) – Starting row index for writing.

  • **kwargs – Additional arguments for write operation.

Return type:

None
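
The abstract-method contract can be illustrated with Python's `abc` module. The classes below are hypothetical and much smaller than CellArray; they only show that a subclass becomes instantiable once it overrides every abstract method.

```python
from abc import ABC, abstractmethod


class BaseArray(ABC):
    """Hypothetical base class with a single abstract method."""

    @abstractmethod
    def write_batch(self, data, start_row, **kwargs):
        """Write rows of `data` starting at `start_row`."""


class ListBackedArray(BaseArray):
    """Toy subclass that stores rows in a dictionary keyed by row index."""

    def __init__(self):
        self.rows = {}

    def write_batch(self, data, start_row, **kwargs):
        for offset, row in enumerate(data):
            self.rows[start_row + offset] = list(row)


def can_instantiate(cls):
    """Return True if `cls()` succeeds, False if it raises TypeError."""
    try:
        cls()
    except TypeError:
        return False
    return True
```

Here `can_instantiate(BaseArray)` is False because `write_batch` is still abstract, while `can_instantiate(ListBackedArray)` is True.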

cellarr_array.core.dense module

class cellarr_array.core.dense.DenseCellArray(uri=None, tiledb_array_obj=None, attr='data', mode=None, config_or_context=None, validate=True)[source]

Bases: CellArray

Implementation for dense TileDB arrays.

__abstractmethods__ = frozenset({})
__annotations__ = {}

write_batch(data, start_row, **kwargs)[source]

Write a batch of data to the dense array.

Parameters:
  • data (ndarray) – Numpy array to write.

  • start_row (int) – Starting row index for writing.

  • **kwargs – Additional arguments passed to TileDB write operation.

Raises:
  • TypeError – If input is not a numpy array.

  • ValueError – If dimensions don’t match or bounds are exceeded.

Return type:

None
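
The ValueError conditions can be sketched as a shape check. `check_dense_batch` is a hypothetical helper that works on shape tuples only. The specific rules (equal number of dimensions, matching trailing dimensions, rows within bounds) are assumptions about what "dimensions don’t match or bounds are exceeded" covers.

```python
def check_dense_batch(batch_shape, start_row, array_shape):
    """Validate a dense batch write and return the target row range.

    Assumed rules: same number of dimensions, identical trailing
    dimensions, and start_row + n_rows must fit inside the array.
    """
    batch_shape = tuple(batch_shape)
    array_shape = tuple(array_shape)

    if len(batch_shape) != len(array_shape):
        raise ValueError("batch and array differ in number of dimensions")
    if batch_shape[1:] != array_shape[1:]:
        raise ValueError("trailing dimensions of batch do not match the array")

    end_row = start_row + batch_shape[0]
    if start_row < 0 or end_row > array_shape[0]:
        raise ValueError("batch rows fall outside the array bounds")

    # Half-open range [start_row, end_row) that the batch would occupy.
    return (start_row, end_row)
```

With a 100 x 5 array, a 10 x 5 batch at `start_row=90` fits exactly, whereas the same batch at `start_row=95` is rejected.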

cellarr_array.core.helpers module

class cellarr_array.core.helpers.SliceHelper[source]

Bases: object

Helper class for handling array slicing operations.

static is_contiguous_indices(indices)[source]

Return type:

Optional[slice]
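
The method has no description here, so the following is only a guess at its behavior based on its name and return type: return an equivalent slice when the indices form a run of consecutive integers, and None otherwise. `contiguous_slice` is a hypothetical helper, not the library function.

```python
from typing import List, Optional


def contiguous_slice(indices: List[int]) -> Optional[slice]:
    """Return slice(first, last + 1) if indices are consecutive, else None.

    Assumption: an empty list, or any gap or reordering, yields None.
    """
    if not indices:
        return None
    for prev, cur in zip(indices, indices[1:]):
        if cur != prev + 1:
            return None
    return slice(indices[0], indices[-1] + 1)
```

Under these assumptions `[3, 4, 5]` maps to `slice(3, 6)` and `[1, 3]` maps to None.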

static normalize_index(idx, dim_size)[source]

Normalize index to handle negative indices and ensure consistency.

Return type:

Union[slice, List[int], EllipsisType]
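
A minimal sketch of negative-index normalization follows. `normalize` is a hypothetical helper. The assumptions are that slices are resolved with `slice.indices`, that list entries wrap around Python-style, and that out-of-range entries raise IndexError; the library may differ.

```python
def normalize(idx, dim_size):
    """Resolve negative indices against a dimension of length dim_size."""
    if idx is Ellipsis:
        return idx

    if isinstance(idx, slice):
        # slice.indices clamps bounds and resolves negative values.
        start, stop, step = idx.indices(dim_size)
        return slice(start, stop, step)

    if isinstance(idx, list):
        resolved = []
        for i in idx:
            j = i + dim_size if i < 0 else i
            if not 0 <= j < dim_size:
                raise IndexError(f"index {i} out of range for size {dim_size}")
            resolved.append(j)
        return resolved

    raise TypeError(f"unsupported index type: {type(idx).__name__}")
```

For a dimension of size 10, `normalize([-1, 0, 2], 10)` gives `[9, 0, 2]` and `normalize(slice(-3, None), 10)` gives `slice(7, 10, 1)`.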

cellarr_array.core.helpers.create_cellarray(uri, shape=None, attr_dtype=None, sparse=False, mode=None, config=None, dim_names=None, dim_dtypes=None, attr_name='data', **kwargs)[source]

Factory function to create a new TileDB cell array.

Parameters:
  • uri (str) – Array URI.

  • shape (Optional[Tuple[Optional[int], ...]]) – Optional array shape. If shape is None, or if an entry is None, the maximum value of the corresponding dimension dtype is used.

  • attr_dtype (Union[str, dtype, None]) – Data type for the attribute. Defaults to float32.

  • sparse (bool) – Whether to create a sparse array.

  • mode (str) – Array open mode. Defaults to None for automatic switching.

  • config (Optional[CellArrConfig]) – Optional configuration.

  • dim_names (Optional[List[str]]) – Optional list of dimension names.

  • dim_dtypes (Optional[List[Union[str, dtype]]]) – Optional list of dimension dtypes. Defaults to numpy’s uint32.

  • attr_name (str) – Name of the data attribute.

  • **kwargs – Additional arguments for array creation.

Returns:

CellArray instance.

Raises:

ValueError – If dimensions are invalid or inputs are inconsistent.
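
The handling of a missing shape can be sketched as below. `resolve_shape` is a hypothetical helper; it assumes two dimensions and the uint32 default for dimension dtypes, and it takes "uses dtype max" literally. How the library derives the actual TileDB domain from this value is not shown in this reference.

```python
UINT32_MAX = 2**32 - 1  # maximum of the default dimension dtype, uint32


def resolve_shape(shape, ndim=2, dim_max=UINT32_MAX):
    """Fill unspecified extents with the dimension dtype maximum.

    Assumption: a missing shape means `ndim` dimensions, all unspecified.
    """
    if shape is None:
        return (dim_max,) * ndim
    return tuple(dim_max if extent is None else extent for extent in shape)
```

For instance, `resolve_shape((100, None))` keeps the first extent at 100 and fills the second with the uint32 maximum.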

cellarr_array.core.helpers.create_group(output_path, group_name)[source]

cellarr_array.core.sparse module

class cellarr_array.core.sparse.SparseCellArray(uri=None, tiledb_array_obj=None, attr='data', mode=None, config_or_context=None, return_sparse=True, sparse_format=<class 'scipy.sparse._csr.csr_matrix'>, validate=True, **kwargs)[source]

Bases: CellArray

Implementation for sparse TileDB arrays.

__abstractmethods__ = frozenset({})
__annotations__ = {}

__init__(uri=None, tiledb_array_obj=None, attr='data', mode=None, config_or_context=None, return_sparse=True, sparse_format=<class 'scipy.sparse._csr.csr_matrix'>, validate=True, **kwargs)[source]

Initialize the object.

Parameters:
  • uri (Optional[str]) – URI to the array. Required if ‘tiledb_array_obj’ is not provided.

  • tiledb_array_obj (Optional[Array]) – Optional, an already opened tiledb.Array instance. If provided, ‘uri’ can be None, and ‘config_or_context’ is ignored.

  • attr (str) – Attribute to access. Defaults to “data”.

  • mode (Optional[Literal['r', 'w', 'd', 'm']]) –

    Mode in which to open the array: read (‘r’), write (‘w’), modify (‘m’), or delete (‘d’).

    Defaults to None for automatic mode switching.

    If ‘tiledb_array_obj’ is provided, this mode should ideally match the mode of the provided array or be None.

  • config_or_context (Union[Config, Ctx, None]) –

    Optional config or context object. Ignored if ‘tiledb_array_obj’ is provided, as context will be derived from the object.

    Defaults to None.

  • return_sparse (bool) – Whether to return a scipy sparse matrix when the object is sliced. Defaults to True, per the signature above. Otherwise, slicing returns a dictionary that contains coordinates and values.

  • sparse_format (Union[csr_matrix, csc_matrix]) – Format to return, defaults to csr_matrix.

  • validate (bool) – Whether to validate the attributes. Defaults to True.

  • kwargs – Additional arguments.

write_batch(data, start_row, **kwargs)[source]

Write a batch of sparse data to the array.

Parameters:
  • data (Union[spmatrix, csc_matrix, coo_matrix]) – Scipy sparse matrix (CSR, CSC, or COO format).

  • start_row (int) – Starting row index for writing.

  • **kwargs – Additional arguments passed to TileDB write operation.

Raises:
  • TypeError – If input is not a sparse matrix.

  • ValueError – If dimensions don’t match or bounds are exceeded.

Return type:

None
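
One way to picture a sparse batch write is as COO triplets whose row coordinates are shifted by `start_row`. The sketch below uses plain lists instead of scipy matrices. `offset_coo` is a hypothetical helper, and the bounds rule it applies is an assumption about when ValueError is raised.

```python
def offset_coo(rows, cols, values, start_row, array_shape):
    """Shift batch-local COO rows by start_row and check array bounds.

    Returns a list of (row, col, value) triplets in array coordinates.
    """
    n_rows, n_cols = array_shape
    shifted = []
    for r, c, v in zip(rows, cols, values):
        target = r + start_row
        if not (0 <= target < n_rows and 0 <= c < n_cols):
            raise ValueError(f"cell ({target}, {c}) is outside {array_shape}")
        shifted.append((target, c, v))
    return shifted
```

A batch with nonzeros at local rows 0 and 2, written at `start_row=10`, lands on array rows 10 and 12.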

Module contents