gaia.data.synthetic module

Synthetic Dataset Generation for GAIA

gaia.data.synthetic.create_synthetic_dataset(n_samples=1000, n_features=20, n_classes=5, n_informative=None, n_redundant=2, random_state=42, device='cpu')[source]

Create a synthetic classification dataset.

Parameters:
  • n_samples (int) – Number of samples

  • n_features (int) – Number of features

  • n_classes (int) – Number of classes

  • n_informative (int | None) – Number of informative features

  • n_redundant (int) – Number of redundant features

  • random_state (int) – Random seed

  • device (str) – Device on which the returned tensors are placed

Returns:

Tuple of (features, labels)

Return type:

Tuple[torch.Tensor, torch.Tensor]
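The following is a minimal pure-Python sketch of how the feature-budget parameters could fit together. It is not the GAIA implementation. It assumes the parameters carry the same meaning as in scikit-learn's make_classification (informative features separate the classes, redundant features are linear combinations of the informative ones, and the remaining columns are noise), which this page does not confirm. The name sketch_classification is hypothetical, and the sketch requires an explicit n_informative because the behaviour of the None default is not documented here.

```python
import random


def sketch_classification(n_samples=12, n_features=6, n_classes=3,
                          n_informative=3, n_redundant=2, random_state=42):
    """Hypothetical stand-in for illustration; not gaia.data.synthetic."""
    if n_informative + n_redundant > n_features:
        raise ValueError("n_informative + n_redundant must not exceed n_features")

    rng = random.Random(random_state)

    # One random centre per class in the informative subspace.
    centres = [[rng.uniform(-3, 3) for _ in range(n_informative)]
               for _ in range(n_classes)]

    # Fixed mixing weights: each redundant feature is a linear
    # combination of the informative features (an assumption).
    mix = [[rng.uniform(-1, 1) for _ in range(n_informative)]
           for _ in range(n_redundant)]

    n_noise = n_features - n_informative - n_redundant
    features, labels = [], []
    for i in range(n_samples):
        label = i % n_classes  # cycle through classes for a balanced set
        informative = [c + rng.gauss(0, 1) for c in centres[label]]
        redundant = [sum(w * x for w, x in zip(row, informative))
                     for row in mix]
        noise = [rng.gauss(0, 1) for _ in range(n_noise)]
        features.append(informative + redundant + noise)
        labels.append(label)
    return features, labels


X, y = sketch_classification()
```

Here X is a list of n_samples rows with n_features columns each, and y holds one integer class label per row. The documented function returns tensors on the requested device instead of Python lists.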

gaia.data.synthetic.create_xor_dataset(n_samples=1000, noise=0.1, random_state=42, device='cpu')[source]

Create an XOR dataset for testing categorical learning.

Parameters:
  • n_samples (int) – Number of samples

  • noise (float) – Noise level

  • random_state (int) – Random seed

  • device (str) – Device on which the returned tensors are placed

Returns:

Tuple of (features, labels)

Return type:

Tuple[torch.Tensor, torch.Tensor]
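To show what an XOR dataset looks like, here is a minimal pure-Python sketch. It is not the GAIA implementation. It assumes two input features drawn from the corners {0, 1} × {0, 1}, a label equal to the XOR of the two clean bits, and noise interpreted as the standard deviation of Gaussian jitter added to each feature. None of these details are stated on this page, and the name sketch_xor is hypothetical.

```python
import random


def sketch_xor(n_samples=8, noise=0.1, random_state=42):
    """Hypothetical stand-in for illustration; not gaia.data.synthetic."""
    rng = random.Random(random_state)
    features, labels = [], []
    for _ in range(n_samples):
        # Pick one of the four corners of the unit square.
        a, b = rng.randint(0, 1), rng.randint(0, 1)
        # The label is the XOR of the clean bits, so it is unaffected by noise.
        labels.append(a ^ b)
        # Jitter each coordinate (assumed: noise is a Gaussian std. dev.).
        features.append([a + rng.gauss(0, noise), b + rng.gauss(0, noise)])
    return features, labels


X, y = sketch_xor()
```

No single linear boundary separates the two classes in such data, which is why XOR is a common sanity check for models that must learn feature interactions.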

gaia.data.synthetic.create_regression_dataset(n_samples=1000, n_features=20, n_informative=10, noise=0.1, random_state=42, device='cpu')[source]

Create a synthetic regression dataset.

Parameters:
  • n_samples (int) – Number of samples

  • n_features (int) – Number of features

  • n_informative (int) – Number of informative features

  • noise (float) – Noise level

  • random_state (int) – Random seed

  • device (str) – Device on which the returned tensors are placed

Returns:

Tuple of (features, targets)

Return type:

Tuple[torch.Tensor, torch.Tensor]
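The sketch below illustrates one plausible reading of the regression parameters. It is not the GAIA implementation. It assumes a linear model in which only the first n_informative features have non-zero weights and noise is the standard deviation of Gaussian noise added to each target. Neither assumption is confirmed by this page, and the name sketch_regression is hypothetical.

```python
import random


def sketch_regression(n_samples=10, n_features=5, n_informative=2,
                      noise=0.1, random_state=42):
    """Hypothetical stand-in for illustration; not gaia.data.synthetic."""
    if n_informative > n_features:
        raise ValueError("n_informative must not exceed n_features")

    rng = random.Random(random_state)

    # Non-zero weights for informative features, zero for the rest.
    weights = ([rng.uniform(-1, 1) for _ in range(n_informative)]
               + [0.0] * (n_features - n_informative))

    features, targets = [], []
    for _ in range(n_samples):
        x = [rng.gauss(0, 1) for _ in range(n_features)]
        # Linear signal from the informative features plus Gaussian noise.
        y = sum(w * v for w, v in zip(weights, x)) + rng.gauss(0, noise)
        features.append(x)
        targets.append(y)
    return features, targets


X, y = sketch_regression()
```

Because the uninformative columns carry zero weight in this sketch, a model fitted to such data should ideally assign them coefficients close to zero.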