BiomeSurvey

class BiomeSurvey(assembiles=None, *args, aggfunc='mean', groupby='label', **kwargs)[source]

Assembly-like Survey class for merging instances of BiomeAssembly

This class merges (pools) multiple independent studies, or instances of EssentialBackboneBase (essentials), into a single instance of the BiomeAssembly-like class BiomeSurvey.
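The pooling idea can be sketched without pmaf. The example below is only an illustration under stated assumptions: tables are plain nested dicts of the form {feature label: {sample label: value}}, values sharing both labels are grouped, and each group is reduced with an aggregation function (mean here, matching the default aggfunc='mean' and groupby='label'). The helper pool_tables and the two studies are hypothetical and do not reflect how BiomeSurvey is implemented; in particular, label combinations that appear in no study are left absent rather than filled in.

```python
from collections import defaultdict
from statistics import mean


def pool_tables(tables, aggfunc=mean):
    """Pool {feature: {sample: value}} tables by matching labels.

    Hypothetical helper for illustration only; not part of pmaf.
    """
    collected = defaultdict(lambda: defaultdict(list))
    for table in tables:
        for feature, samples in table.items():
            for sample, value in samples.items():
                # Values that share both labels end up in the same group.
                collected[feature][sample].append(value)
    # Reduce every group of values with the aggregation function.
    return {
        feature: {sample: aggfunc(values) for sample, values in samples.items()}
        for feature, samples in collected.items()
    }


# Two made-up studies that overlap on feature "f2" and sample "s2".
study_a = {"f1": {"s1": 10, "s2": 5}, "f2": {"s1": 0, "s2": 3}}
study_b = {"f2": {"s2": 1, "s3": 4}, "f3": {"s2": 7, "s3": 4}}

pooled = pool_tables([study_a, study_b])
```

Only the cell ("f2", "s2") occurs in both studies, so it is the only one where the aggregation function combines more than one value.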

Parameters
  • assembiles (Optional[Sequence[pmaf.biome.assembly._assembly.BiomeAssembly]]) – essentials to pool.

  • *args – Unpacked essentials to pool. (Convenience)

  • aggfunc (Union[str, Callable, Tuple[Union[str, Callable], Union[str, Callable]], Dict[Union[str, int], Union[str, Callable, Dict[Optional[pmaf.biome.essentials._metakit.EssentialBackboneMetabase], Union[str, Callable]]]]]) – Aggregation method. This parameter accepts several forms. If it is a str or Callable, aggfunc is applied to both axes (feature and sample) and to every essential regardless of its type. To aggregate each axis separately, pass a tuple (for example, aggfunc=('sum', 'mean')), where the first method applies to the feature axis and the second to the sample axis. For more complex aggregation, pass a Dict whose keys refer to the axis: 0/feature for the feature axis or 1/sample for the sample axis. The dictionary values can take two forms. In the first, each value is simply a str or Callable, which is similar to using a tuple. In the second, each value is itself a Dict whose keys are types (classes) of essentials, which must have the abstract base class EssentialBackboneMetabase, and whose values are the str or Callable aggregation functions. With this form, each type of essential is processed differently across the assembly instances. Finally, in the Dict[axis, Dict[essential-type, agg-func]] form, using None as one of the essential-type keys makes it refer to all remaining types.

  • groupby (Union[str, Tuple[str, str], Dict[Union[int, str], str]]) – Grouping method. This parameter accepts the same forms as aggfunc, except that the values can be either label, which is valid for both the feature axis and the sample axis (for example, groupby='label' or groupby=('label', 'label')), or taxonomy, which is valid for the feature axis only. Grouping by taxonomy merges features that share the same consensus lineage.

  • **kwargs – Compatibility

  • args (Any) –

  • kwargs (Any) –
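The forms accepted by aggfunc can be easier to compare once they are expanded into a single shape. The following is a minimal sketch, not pmaf code: normalize_aggfunc and pick_aggfunc are hypothetical helpers, TableLike and MetadataLike are stand-in classes for essential types, and the string keys "feature" and "sample" are an assumed spelling of the 0/feature and 1/sample axis keys described above. What happens when a Dict omits one axis is not documented, so the sketch simply leaves that axis out.

```python
from typing import Any, Callable, Dict, Union

AggFunc = Union[str, Callable]

# Assumed axis keys: 0 or "feature", 1 or "sample".
AXIS_ALIASES = {0: "feature", "feature": "feature", 1: "sample", "sample": "sample"}


def normalize_aggfunc(aggfunc: Any) -> Dict[str, Dict[Any, AggFunc]]:
    """Expand an aggfunc spec into {axis: {essential_type_or_None: func}}.

    Hypothetical helper that mirrors the documented forms; not part of pmaf.
    """
    if isinstance(aggfunc, str) or callable(aggfunc):
        # One method for both axes and every essential type.
        per_axis = {"feature": aggfunc, "sample": aggfunc}
    elif isinstance(aggfunc, tuple):
        if len(aggfunc) != 2:
            raise ValueError("tuple form must be (feature_func, sample_func)")
        per_axis = {"feature": aggfunc[0], "sample": aggfunc[1]}
    elif isinstance(aggfunc, dict):
        per_axis = {}
        for key, value in aggfunc.items():
            if key not in AXIS_ALIASES:
                raise ValueError(f"unknown axis key: {key!r}")
            per_axis[AXIS_ALIASES[key]] = value
    else:
        raise TypeError("aggfunc must be a str, callable, tuple or dict")

    # A nested dict maps essential types to functions; a bare value is
    # stored under None, meaning "all types".
    return {
        axis: dict(value) if isinstance(value, dict) else {None: value}
        for axis, value in per_axis.items()
    }


def pick_aggfunc(normalized, axis: str, essential_type: type) -> AggFunc:
    """Return the function for a type, falling back to the None entry."""
    by_type = normalized[axis]
    if essential_type in by_type:
        return by_type[essential_type]
    if None in by_type:
        return by_type[None]
    raise KeyError(f"no aggregation defined for {essential_type.__name__}")


class TableLike:  # stand-in for one essential type
    pass


class MetadataLike:  # stand-in for another essential type
    pass


spec = {0: {TableLike: "sum", None: "first"}, "sample": "mean"}
normalized = normalize_aggfunc(spec)
```

With this spec, pick_aggfunc(normalized, "feature", TableLike) resolves to "sum", any other type on the feature axis falls back to the None entry "first", and every type on the sample axis resolves to "mean".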

Attributes

assemblies

Tuple of surveyed assemblies.

controller

EssentialsController of essentials

essentials

List of essentials

metadata

The essential instance metadata.

name

The essential instance name.

shape

Return the shape/size of the essential instance.

xrid

Feature identifiers.

xsid

Sample identifiers.

Methods

copy()

Copy of the instance.

get_feature_ids([dtype])

This function and its sample twin are rescue methods to fix the RepPhylogeny index problem.

get_sample_ids([dtype])

This function and its feature twin are rescue methods to fix the RepPhylogeny index problem.

to_assembly()

Converts to the BiomeAssembly instance.