DefaultExpressionImplementation

class VmaxBuilder.expression.implementation.DefaultExpressionImplementation(translation_service: IdentifierTranslationServiceProtocol | None = None)

Generated: validation needed.

Description:

Resolve and preprocess expression input for downstream protein abundance assembly.

Parameters:

translation_service (IdentifierTranslationServiceProtocol | None) – Optional identifier translation service override.

Public Methods

prepare_expression_frame(…)

Generated: validation needed.

resolve_expression_frame(…)

Generated: validation needed.

resolve_expression_frame(scaffold: Scaffold, config: APIConfig) → DataFrame | None

Generated: validation needed.

Description:

Resolve the expression dataframe from the sources configured on the scaffold or in the config.

Parameters:
  • scaffold (Scaffold) – Shared pipeline scaffold.

  • config (APIConfig) – Root API configuration.

Returns:

pd.DataFrame | None – Expression dataframe when available, otherwise None.

static _build_id_type_name(provider: str | None, level: str) → str | None

Generated: validation needed.

Description:

Build full identifier type name from provider and level.

Parameters:
  • provider (str | None) – Identifier provider.

  • level (str) – Gene or transcript granularity.

Returns:

str | None – Full identifier type name, or None if provider is None.
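
The None handling described above can be illustrated with a minimal sketch. The function name build_id_type_name and the underscore-joined format (for example "ensembl_gene") are assumptions made for illustration; the actual format produced by _build_id_type_name is not stated in this reference.

```python
from typing import Optional


def build_id_type_name(provider: Optional[str], level: str) -> Optional[str]:
    """Hypothetical stand-in for _build_id_type_name."""
    if provider is None:
        # Documented behavior: no provider means no identifier type name.
        return None
    # Assumption: provider and level are joined with an underscore.
    # The real separator and ordering may differ.
    return f"{provider}_{level}"


no_provider = build_id_type_name(None, "gene")
with_provider = build_id_type_name("ensembl", "transcript")
```

Returning None rather than raising lets a caller treat a missing provider as "no identifier type known" and decide for itself whether translation should be skipped.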

prepare_expression_frame(scaffold: Scaffold, expression_df: DataFrame, config: APIConfig) → DataFrame

Generated: validation needed.

Description:

Apply placeholder transcript-to-gene conversion when the run target requests gene level.

Parameters:
  • scaffold (Scaffold) – Shared pipeline scaffold.

  • expression_df (pd.DataFrame) – Expression input table.

  • config (APIConfig) – Root API configuration.

Returns:

pd.DataFrame – Possibly converted expression table.

Raises:

ConfigurationError – If an unsupported transcript aggregation policy is configured.

Modifies:

scaffold["artifacts"] and scaffold["diagnostics"] with translation metadata.

static _aggregate_transcripts_to_genes(expression_df: DataFrame, transcript_gene_map_df: DataFrame, *, aggregation_policy: str, protein_coding_only: bool, protein_coding_aggregation_policy: str, diagnostics_payload: dict[str, object]) → DataFrame

Generated: validation needed.

Description:

Aggregate transcript expression rows to genes and keep unresolved transcripts.

Parameters:
  • expression_df (pd.DataFrame) – Transcript-level expression table.

  • transcript_gene_map_df (pd.DataFrame) – Transcript-to-gene mapping dataframe.

  • aggregation_policy (str) – Configured aggregation policy.

  • protein_coding_only (bool) – Whether to keep only protein-coding transcripts.

  • protein_coding_aggregation_policy (str) – Aggregation policy for protein-coding transcript rows.

  • diagnostics_payload (dict[str, object]) – Mutable diagnostics payload.

Returns:

pd.DataFrame – Gene-level table with unresolved transcripts retained.

Raises:

ConfigurationError – If an unsupported aggregation policy is configured.
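
The core idea, aggregating mapped transcripts per gene while retaining unresolved transcripts, can be sketched with pandas. This is not the library's implementation: the function name, the column names transcript_id and gene_id, the set of accepted policies, and the use of ValueError in place of ConfigurationError are all assumptions. Protein-coding filtering and the diagnostics payload are left out.

```python
import pandas as pd

# Assumption: the real set of supported policies is not documented here.
SUPPORTED_POLICIES = {"sum", "mean", "max"}


def aggregate_to_genes(
    expression_df: pd.DataFrame,
    transcript_gene_map_df: pd.DataFrame,
    aggregation_policy: str = "sum",
) -> pd.DataFrame:
    """Hypothetical sketch; expression_df is indexed by transcript ID."""
    if aggregation_policy not in SUPPORTED_POLICIES:
        # Stand-in for the documented ConfigurationError.
        raise ValueError(f"Unsupported aggregation policy: {aggregation_policy}")

    # Assumed mapping columns: "transcript_id" and "gene_id".
    mapping = dict(
        zip(transcript_gene_map_df["transcript_id"], transcript_gene_map_df["gene_id"])
    )
    # Unresolved transcripts keep their own identifier as the group key.
    group_keys = pd.Index([mapping.get(t, t) for t in expression_df.index])
    return expression_df.groupby(group_keys).agg(aggregation_policy)


expression = pd.DataFrame(
    {"sample_a": [1.0, 2.0, 5.0]}, index=["T1", "T2", "T3"]
)
transcript_map = pd.DataFrame(
    {"transcript_id": ["T1", "T2"], "gene_id": ["G1", "G1"]}
)
gene_level = aggregate_to_genes(expression, transcript_map, "sum")
```

In this toy input, T1 and T2 collapse into one G1 row, while T3 has no mapping and is kept under its transcript identifier.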

static _apply_identifier_mapping(expression_df: DataFrame, *, identifier_mapping: dict[str, str]) → DataFrame

Generated: validation needed.

Description:

Apply partial identifier mapping and aggregate rows when mappings collide.

Parameters:
  • expression_df (pd.DataFrame) – Input expression table.

  • identifier_mapping (dict[str, str]) – Source identifier to target identifier mapping.

Returns:

pd.DataFrame – Table indexed by mapped identifiers where available.
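
A partial mapping with collision handling can be sketched as follows. The function name is hypothetical, and summing colliding rows is an assumption; this reference does not state which aggregation _apply_identifier_mapping uses.

```python
import pandas as pd


def apply_identifier_mapping(
    expression_df: pd.DataFrame, identifier_mapping: dict[str, str]
) -> pd.DataFrame:
    """Hypothetical sketch of a partial identifier mapping."""
    # Identifiers missing from the mapping keep their original value.
    new_index = pd.Index(
        [identifier_mapping.get(identifier, identifier) for identifier in expression_df.index]
    )
    # Assumption: rows that map to the same target identifier are summed.
    return expression_df.groupby(new_index).sum()


counts = pd.DataFrame({"sample_a": [1, 2, 4]}, index=["a", "b", "c"])
mapped_counts = apply_identifier_mapping(counts, {"a": "X", "b": "X"})
```

Here "a" and "b" both map to "X" and are merged into one row, while "c" is absent from the mapping and keeps its source identifier.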