tree_structured.sample_generator

Sample Generator class.

Module Contents

Classes

SampleGenerator

A class to generate synthetic data.

class tree_structured.sample_generator.SampleGenerator(n_input_each_box: list[int] | numpy.typing.NDArray[numpy.int_], allowance_rand: bool = False)

Bases: noisecut.tree_structured.structured_data.StructuredData

A class to generate synthetic data.

SampleGenerator builds synthetic structured binary data. The data can be generated randomly, by setting only the structure of the data, or manually, by setting the function of each box.

Parameters:
  • n_input_each_box ({list, ndarray} of shape (n_box,)) –

    An array of size n_box (the number of first-layer black boxes) that holds the number of input features to each box. For instance, n_input_each_box=[2, 4, 1] means there are three first-layer black boxes, and the number of input features to box1, box2 and box3 is 2, 4 and 1, respectively.

    Note: There is only one black box in the second layer.

  • allowance_rand (bool, default=False) – If True, all functions of the binary structure are set randomly.
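
As an illustration of how n_input_each_box partitions a sample, the following minimal sketch splits a flat feature vector into one chunk per first-layer box. The helper split_features_by_box is hypothetical and is not part of the library; it only assumes that the features of each box are stored consecutively, as in the decimal_index_boxes example.

```python
from itertools import accumulate


def split_features_by_box(features, n_input_each_box):
    """Split a flat feature vector into one chunk per first-layer box.

    Hypothetical helper for illustration only; not part of the library.
    """
    if sum(n_input_each_box) != len(features):
        raise ValueError("len(features) must equal sum(n_input_each_box)")
    # Start offset of each box, e.g. [0, 2, 6] for n_input_each_box=[2, 4, 1].
    starts = [0, *accumulate(n_input_each_box)][:-1]
    return [list(features[s:s + n]) for s, n in zip(starts, n_input_each_box)]


chunks = split_features_by_box([0, 1, 0, 0, 1, 1, 1], [2, 4, 1])
print(chunks)  # [[0, 1], [0, 0, 1, 1], [1]]
```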

functionality

True if the structured binary data has functionality.

Type:

bool

decimal_index_boxes

Keeps the decimal value of the input features to each first-layer black box. For instance, if n_input_each_box = [2, 4, 1] and one sample has the input features [0, 1, 0, 0, 1, 1, 1], then decimal_index_boxes is [decimal(0, 1), decimal(0, 0, 1, 1), decimal(1)] = [2, 12, 1]. As the values in this example show, the first feature of each box is treated as the least significant bit.

Type:

ndarray of shape (n_box,)
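
The conversion in the example above can be reproduced with a short sketch. The two functions below are hypothetical stand-ins written for illustration, not the library's implementation; the bit order (first feature is the least significant bit) is inferred from the documented result [2, 12, 1].

```python
def decimal_index(bits):
    """Decimal value of a bit sequence, first bit least significant."""
    return sum(bit << position for position, bit in enumerate(bits))


def decimal_index_boxes(features, n_input_each_box):
    """Decimal value of the input features of each first-layer box.

    Hypothetical re-implementation for illustration only.
    """
    indices = []
    start = 0
    for n_input in n_input_each_box:
        # Features of one box are assumed to be stored consecutively.
        indices.append(decimal_index(features[start:start + n_input]))
        start += n_input
    return indices


print(decimal_index_boxes([0, 1, 0, 0, 1, 1, 1], [2, 4, 1]))  # [2, 12, 1]
```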

set_rand_func() → None

Randomly set the function of every black box.

has_synthetic_example_functionality() → bool

Check the functionality of the tree-structured binary dataset.

Returns:

True if the dataset has functionality.

Return type:

bool

__check_functionality_in_each_box_by_recursion(i_o: int, number_loop_level: int = 0) → None

Check whether the set synthetic data has functionality for a black box.

Parameters:
  • i_o (int) –

    Indicates the order of the nested loops. For instance, when n_box=3 and i_o=0, the nested loops run in this order:

    for i[0] in range(…):
        for i[1] in range(…):
            for i[2] in range(…):

    If i_o=1, the nested loops run in this order:

    for i[1] in range(…):
        for i[2] in range(…):
            for i[0] in range(…):

    And if i_o=2, the nested loops run in this order:

    for i[2] in range(…):
        for i[0] in range(…):
            for i[1] in range(…):

    Hint: (i[0], i[1], i[2]) can be seen as (i, j, k).

  • number_loop_level (int, default=0) –

    The nesting level of the loop in which the code currently runs. The purpose of this function is to build n_box nested loops. For instance, when n_box=3 and i_o=0, it is similar to building the following nested for loops:

    for i[0] in range(…): -> number_loop_level = 0
        for i[1] in range(…): -> number_loop_level = 1
            for i[2] in range(…): -> number_loop_level = 2

    Hint: (i[0], i[1], i[2]) can be seen as (i, j, k).
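
The rotated loop order described above can be sketched with a small recursive function. This is an illustration under stated assumptions, not the library's private method: the names nested_loop_order and visit_all, and the sizes argument that stands in for the elided range(…) bounds, are all hypothetical. The rule that the loop at a given level runs over box (i_o + level) % n_box is inferred from the three documented orderings.

```python
def nested_loop_order(n_box, i_o):
    """Box index handled at each nesting level, outermost first."""
    return [(i_o + level) % n_box for level in range(n_box)]


def visit_all(sizes, i_o, number_loop_level=0, i=None, visited=None):
    """Emulate n_box nested loops by recursion and record each index tuple.

    sizes[b] is a hypothetical loop bound for box b; the real bounds are
    not given in the documentation.
    """
    n_box = len(sizes)
    if i is None:
        i = [0] * n_box
    if visited is None:
        visited = []
    if number_loop_level == n_box:
        # Innermost body reached: all loop variables are set.
        visited.append(tuple(i))
        return visited
    box = (i_o + number_loop_level) % n_box
    for value in range(sizes[box]):
        i[box] = value
        visit_all(sizes, i_o, number_loop_level + 1, i, visited)
    return visited


print(nested_loop_order(3, 1))       # [1, 2, 0]
print(visit_all([2, 2, 2], 1)[:3])   # [(0, 0, 0), (1, 0, 0), (0, 0, 1)]
```

With i_o=1 the innermost loop runs over i[0], so the first element of the tuple changes fastest, matching the second ordering listed for i_o.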