tree_structured.base_structured_data

Base Class for NoiseCut estimator.

Module Contents

Classes

BaseStructuredData

Base class for NoiseCut estimator.

BasePseudoBooleanFunc

Base class of PseudoBooleanFunc to validate input data.

class tree_structured.base_structured_data.BaseStructuredData

Bases: noisecut.tree_structured.base.Base

Base class for NoiseCut estimator.

n_input_each_box

An array of size n_box (the number of first-layer black boxes) that holds the number of input features of each box. For instance, `n_input_each_box`=[2, 4, 1] means there are three first-layer black boxes, and box1, box2 and box3 take 2, 4 and 1 input features, respectively.

Note: There is only one black box in the second layer.

Type:

ndarray of shape (n_box,)
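The [2, 4, 1] example above can be made concrete with a small NumPy sketch. It assumes that features are assigned to boxes contiguously, in column order, and that the total number of features is the sum of the entries; neither assumption is stated in this reference, and the variable names are illustrative only.

```python
import numpy as np

# Values taken from the example in the attribute description.
n_input_each_box = np.array([2, 4, 1])

# Assumption: the input has one column per feature, 2 + 4 + 1 = 7 in total.
x = np.zeros((5, int(n_input_each_box.sum())), dtype=bool)  # 5 samples

# Assumption: columns are handed to the boxes contiguously, in order.
split_points = np.cumsum(n_input_each_box)[:-1]  # column indices 2 and 6
x_per_box = np.split(x, split_points, axis=1)

for id_box, x_box in enumerate(x_per_box):
    print(id_box, x_box.shape)  # (5, 2), then (5, 4), then (5, 1)
```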

n_box

Number of first-layer boxes.

Type:

int

dimension

Number of input features.

Type:

int

validate_x(x: Any) → tuple[bool, numpy.typing.NDArray[numpy.bool_]]

Validate input data x.

Parameters:

x ({array-like, dataframe} of shape (n_samples, n_features)) –

Returns:

  • status (bool) – True if x is a valid input.

  • x (ndarray of bool) – Validated input data x in ndarray type.

Raises:
  • TypeError – If x does not have a len, shape or __array__ attribute.

  • ValueError – If x does not have the expected dimension or values.
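The contract described above can be sketched as a standalone function. This is a hypothetical stand-in, not the library's implementation: the name `validate_x_sketch`, the explicit `dimension` argument, and the rule that every entry must be 0/1 or bool are assumptions made for illustration.

```python
from typing import Any

import numpy as np


def validate_x_sketch(x: Any, dimension: int) -> tuple[bool, np.ndarray]:
    """Hypothetical stand-in for validate_x; not the library's code."""
    # TypeError: x offers none of len, shape, or __array__.
    if not any(hasattr(x, name) for name in ("__len__", "shape", "__array__")):
        raise TypeError("x must be array-like or a dataframe")

    arr = np.asarray(x)

    # ValueError: x is not 2-D or has the wrong number of features.
    if arr.ndim != 2 or arr.shape[1] != dimension:
        raise ValueError(f"x must have shape (n_samples, {dimension})")

    # ValueError: assumed rule that every entry is binary (0/1 or bool).
    if not np.isin(arr, (0, 1)).all():
        raise ValueError("x must contain only binary values")

    return True, arr.astype(bool)


status, x_valid = validate_x_sketch([[0, 1, 1], [1, 0, 0]], dimension=3)
print(status, x_valid.dtype, x_valid.shape)  # True bool (2, 3)
```

Returning the converted array alongside the status lets callers continue with a uniform boolean ndarray regardless of whether they passed a list, an ndarray, or a dataframe.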

validate_x_y(x: Any, y: Any) → tuple[bool, numpy.typing.NDArray[numpy.bool_], numpy.typing.NDArray[numpy.bool_]]

Validate input data x and y.

Parameters:
  • x ({array-like, dataframe} of shape (n_samples, n_features)) –

  • y ({array-like, dataframe} of shape (n_samples,)) –

Returns:

  • status (bool) – True if x and y are valid inputs.

  • x (ndarray of bool) – Validated x in ndarray type.

  • y (ndarray of bool) – Validated y in ndarray type.

validate_n_input_each_box(n_input_each_box: Any) → None

Validate n_input_each_box.

Parameters:

n_input_each_box ({array-like, ndarray}) –

Notes

Based on the validated input, the attributes n_box, n_input_each_box and dimension are initialized.
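A minimal sketch of how the three attributes could be derived from one validated input follows. The class `BoxLayoutSketch` is hypothetical, and the validation rules (non-empty 1-D array of positive integers) and the relation dimension = sum of entries are assumptions that this reference does not spell out.

```python
import numpy as np


class BoxLayoutSketch:
    """Hypothetical holder for the three attributes; not the library's class."""

    def validate_n_input_each_box(self, n_input_each_box) -> None:
        arr = np.asarray(n_input_each_box)

        # Assumed rules: a non-empty 1-D array of positive integers.
        if arr.ndim != 1 or arr.size == 0:
            raise ValueError("n_input_each_box must be a non-empty 1-D array")
        if not np.issubdtype(arr.dtype, np.integer) or (arr < 1).any():
            raise ValueError("n_input_each_box must hold positive integers")

        # Initialize the attributes from the validated input.
        self.n_input_each_box = arr
        self.n_box = int(arr.size)       # one entry per first-layer box
        self.dimension = int(arr.sum())  # assumed: total number of features


layout = BoxLayoutSketch()
layout.validate_n_input_each_box([2, 4, 1])
print(layout.n_box, layout.dimension)  # 3 7
```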

validate_id_box(id_box: Any) → tuple[bool, int]

Validate type and value of the id_box.

Parameters:

id_box (int) – Index of the box, a value between 0 and n_box-1.

Returns:

  • status (bool) – Whether the input value for the id_box is valid.

  • id_box (int) – Validated id_box.
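The type and range check described above might look like the following sketch. The function name, the explicit `n_box` argument, and the choice to raise on invalid input (rather than return a False status) are assumptions; this reference does not say how an invalid id_box is reported.

```python
import numbers
from typing import Any


def validate_id_box_sketch(id_box: Any, n_box: int) -> tuple[bool, int]:
    """Hypothetical stand-in for validate_id_box; not the library's code."""
    # bool is a subclass of int in Python, so it is rejected explicitly.
    if isinstance(id_box, bool) or not isinstance(id_box, numbers.Integral):
        raise TypeError("id_box must be an integer")

    # Valid indices run from 0 to n_box - 1, inclusive.
    if not 0 <= id_box <= n_box - 1:
        raise ValueError(f"id_box must be between 0 and {n_box - 1}")

    return True, int(id_box)


print(validate_id_box_sketch(2, n_box=3))  # (True, 2)
```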

validate_vector_n_score(vector_n_score: Any) → tuple[bool, numpy.typing.NDArray[numpy.float_]]

Validate vector_n_score.

Parameters:

vector_n_score ({array-like, dataframe} of shape (max_score+1,)) –

Returns:

  • status (bool) – Whether the input value for the vector_n_score is valid.

  • vector_n_score (ndarray of float) – Validated vector_n_score.

static validate_threshold(threshold: Any) → tuple[bool, float]

Validate the value of threshold.

Parameters:

threshold (Any) – A float in the range 0 to 1 that determines the condition for setting the binary function of the last-layer black box.

Returns:

  • status (bool) – Whether the input value for the threshold is valid.

  • threshold (float) – Validated threshold.

class tree_structured.base_structured_data.BasePseudoBooleanFunc

Bases: noisecut.tree_structured.base.Base

Base class of PseudoBooleanFunc to validate input data.

arity

Arity of the Pseudo-Boolean function.

Type:

int

validate_arity(arity: Any) → bool

Validate arity of the Pseudo-Boolean Function.

Parameters:

arity (Any) – Arity of the Pseudo-Boolean Function; should be greater than one.

Returns:

True if arity is a valid input.

Return type:

bool

_validate_x(x: Any) → numpy.typing.NDArray[numpy.int_]

Validate x as an input for creating Pseudo-Boolean Function.

Parameters:

x ({array-like, dataframe} of shape (2**arity, arity)) –

Returns:

x in the required format, if it passes validation.

Return type:

ndarray of int

validate_x_y(x: Any, y: Any) → tuple[bool, numpy.typing.NDArray[numpy.int_], numpy.typing.NDArray[numpy.int_]]

Validate x and y as an input for creating Pseudo-Boolean Function.

Parameters:
  • x ({array-like, dataframe} of shape (2**arity, arity)) –

  • y ({array-like, dataframe} of shape (2**arity,)) –

Returns:

  • status (bool) – True if x and y are valid inputs.

  • x (ndarray of int) – Validated x in ndarray type.

  • y (ndarray of int) – Validated y in ndarray type.
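The shapes (2**arity, arity) and (2**arity,) suggest that x lists every binary input combination and y holds one function value per combination. The sketch below builds such a pair with NumPy under that assumption. The arity and the function used to fill y are hypothetical and chosen only for illustration.

```python
from itertools import product

import numpy as np

arity = 3  # hypothetical arity, chosen for illustration

# Every binary input combination, one per row: shape (2**arity, arity).
x = np.array(list(product((0, 1), repeat=arity)), dtype=int)

# Hypothetical function values, one per row: the number of ones in the row.
y = x.sum(axis=1)

print(x.shape, y.shape)  # (8, 3) (8,)
```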