Partitioning classes

Properties

char *                     null-fallback       Read / Write
GADatasetSegmentEncoding   segment-encoding    Read / Write
gpointer                   partitioning        Write / Construct Only
gboolean                   infer-dictionary    Read / Write
GArrowSchema *             schema              Read / Write
GADatasetSegmentEncoding   segment-encoding    Read / Write

Object Hierarchy

    GEnum
    ╰── GADatasetSegmentEncoding
    GObject
    ├── GADatasetKeyValuePartitioningOptions
    │   ╰── GADatasetHivePartitioningOptions
    ├── GADatasetPartitioning
    │   ╰── GADatasetKeyValuePartitioning
    │       ├── GADatasetDirectoryPartitioning
    │       ╰── GADatasetHivePartitioning
    ╰── GADatasetPartitioningFactoryOptions

Includes

#include <arrow-dataset-glib/arrow-dataset-glib.h>

Description

GADatasetPartitioningFactoryOptions is a class for partitioning factory options.

GADatasetPartitioning is a base class for partitioning classes such as GADatasetDirectoryPartitioning.

GADatasetKeyValuePartitioningOptions is a class for key-value partitioning options.

GADatasetKeyValuePartitioning is a base class for key-value style partitioning classes such as GADatasetDirectoryPartitioning.

GADatasetDirectoryPartitioning is a class for partitioning that uses the directory structure: each path segment holds the value of one schema field, in schema order.

GADatasetHivePartitioningOptions is a class for Hive-style partitioning options.

GADatasetHivePartitioning is a class for Hive-style partitioning, where each path segment has the form key=value.

Functions

gadataset_partitioning_factory_options_new ()

GADatasetPartitioningFactoryOptions *
gadataset_partitioning_factory_options_new
                               (void);

Returns

The newly created GADatasetPartitioningFactoryOptions.

Since: 11.0.0

gadataset_partitioning_get_type_name ()

gchar *
gadataset_partitioning_get_type_name (GADatasetPartitioning *partitioning);

Parameters

partitioning

A GADatasetPartitioning.

Returns

The type name of partitioning.

It should be freed with g_free() when no longer needed.

Since: 6.0.0

gadataset_partitioning_create_default ()

GADatasetPartitioning *
gadataset_partitioning_create_default (void);

Returns

The newly created GADatasetPartitioning that doesn't partition.

[transfer full]

Since: 12.0.0

gadataset_key_value_partitioning_options_new ()

GADatasetKeyValuePartitioningOptions *
gadataset_key_value_partitioning_options_new
                               (void);

Returns

The newly created GADatasetKeyValuePartitioningOptions.

Since: 11.0.0

gadataset_directory_partitioning_new ()

GADatasetDirectoryPartitioning *
gadataset_directory_partitioning_new (GArrowSchema *schema,
                                      GList *dictionaries,
                                      GADatasetKeyValuePartitioningOptions *options,
                                      GError **error);

Parameters

schema

A GArrowSchema that describes all partitioned segments.

dictionaries

A list of GArrowArray for dictionary data types in schema.

[nullable][element-type GArrowArray]

options

A GADatasetKeyValuePartitioningOptions.

[nullable]

error

Return location for a GError or NULL.

[nullable]

Returns

The newly created GADatasetDirectoryPartitioning on success, NULL on error.

Since: 6.0.0
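
To illustrate what "describes all partitioned segments" means for directory partitioning, here is a minimal sketch in plain C. It assumes a path such as 2024/01 whose segments map, in order, onto the fields of the schema. The helper directory_segment() is hypothetical and is not part of Arrow Dataset GLib; it only shows the positional lookup and performs no type parsing.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not an Arrow Dataset GLib function).
 * Copies the index-th '/'-separated segment of path into out.
 * Returns 1 on success, 0 if the segment is missing, empty,
 * or does not fit into out. */
static int
directory_segment(const char *path, size_t index, char *out, size_t out_size)
{
  assert(path != NULL && out != NULL);

  /* Skip the first `index` segments. */
  const char *start = path;
  for (size_t i = 0; i < index; i++) {
    const char *slash = strchr(start, '/');
    if (slash == NULL) {
      return 0;
    }
    start = slash + 1;
  }

  /* The segment runs up to the next '/' or the end of the string. */
  size_t length = strcspn(start, "/");
  if (length == 0 || length >= out_size) {
    return 0;
  }
  memcpy(out, start, length);
  out[length] = '\0';
  return 1;
}
```

With a two-field schema (say year, then month), segment 0 of 2024/01 would supply the year value and segment 1 the month value.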

gadataset_hive_partitioning_options_new ()

GADatasetHivePartitioningOptions *
gadataset_hive_partitioning_options_new
                               (void);

Returns

The newly created GADatasetHivePartitioningOptions.

Since: 11.0.0

gadataset_hive_partitioning_new ()

GADatasetHivePartitioning *
gadataset_hive_partitioning_new (GArrowSchema *schema,
                                 GList *dictionaries,
                                 GADatasetHivePartitioningOptions *options,
                                 GError **error);

Parameters

schema

A GArrowSchema that describes all partitioned segments.

dictionaries

A list of GArrowArray for dictionary data types in schema.

[nullable][element-type GArrowArray]

options

A GADatasetHivePartitioningOptions.

[nullable]

error

Return location for a GError or NULL.

[nullable]

Returns

The newly created GADatasetHivePartitioning on success, NULL on error.

Since: 11.0.0
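
In Hive-style partitioning each path segment names its field, for example year=2024/month=01. The following plain C sketch splits one such segment into its key and value. The helper hive_split_segment() is hypothetical and is not part of Arrow Dataset GLib; the real class also converts the value to the field's data type, which this sketch does not attempt.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not an Arrow Dataset GLib function).
 * Splits "key=value" at the first '='.
 * Returns 1 on success, 0 if there is no '=', the key is empty,
 * or either part does not fit into its buffer. */
static int
hive_split_segment(const char *segment,
                   char *key, size_t key_size,
                   char *value, size_t value_size)
{
  assert(segment != NULL && key != NULL && value != NULL);

  const char *equals = strchr(segment, '=');
  if (equals == NULL || equals == segment) {
    return 0;
  }

  size_t key_length = (size_t)(equals - segment);
  size_t value_length = strlen(equals + 1);
  if (key_length >= key_size || value_length >= value_size) {
    return 0;
  }

  memcpy(key, segment, key_length);
  key[key_length] = '\0';
  /* Copy the value together with its terminating NUL. */
  memcpy(value, equals + 1, value_length + 1);
  return 1;
}
```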

gadataset_hive_partitioning_get_null_fallback ()

gchar *
gadataset_hive_partitioning_get_null_fallback
                               (GADatasetHivePartitioning *partitioning);

Parameters

partitioning

A GADatasetHivePartitioning.

Returns

The fallback string for null.

It should be freed with g_free() when no longer needed.

Since: 11.0.0

Types and Values

enum GADatasetSegmentEncoding

These values correspond to the arrow::dataset::SegmentEncoding values.

Members

GADATASET_SEGMENT_ENCODING_NONE

No encoding.

GADATASET_SEGMENT_ENCODING_URI

Segment values are URL-encoded.

Since: 6.0.0
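
As a rough illustration of the difference between the two members, the sketch below decodes one path segment in plain C. It assumes that "URL-encoded" means standard percent-encoding (%XX with two hexadecimal digits). SketchEncoding and decode_segment() are hypothetical stand-ins, not part of Arrow Dataset GLib, and the library's actual decoder may differ in detail.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the two GADatasetSegmentEncoding members. */
typedef enum {
  SKETCH_ENCODING_NONE,
  SKETCH_ENCODING_URI
} SketchEncoding;

/* Value of one hexadecimal digit, or -1 if c is not a hex digit. */
static int
hex_value(char c)
{
  if (c >= '0' && c <= '9') return c - '0';
  if (c >= 'a' && c <= 'f') return c - 'a' + 10;
  if (c >= 'A' && c <= 'F') return c - 'A' + 10;
  return -1;
}

/* Hypothetical helper: decode segment into out according to encoding.
 * Returns 1 on success, 0 on a malformed escape or if out is too small. */
static int
decode_segment(SketchEncoding encoding,
               const char *segment, char *out, size_t out_size)
{
  assert(segment != NULL && out != NULL);

  size_t written = 0;
  for (size_t i = 0; segment[i] != '\0'; i++) {
    char c = segment[i];
    if (encoding == SKETCH_ENCODING_URI && c == '%') {
      /* Only look at the second digit if the first one is valid,
       * so the terminating NUL is never read past. */
      int high = hex_value(segment[i + 1]);
      int low = high < 0 ? -1 : hex_value(segment[i + 2]);
      if (low < 0) {
        return 0;
      }
      c = (char)(high * 16 + low);
      i += 2;
    }
    if (written + 1 >= out_size) {
      return 0;
    }
    out[written++] = c;
  }
  if (written >= out_size) {
    return 0;
  }
  out[written] = '\0';
  return 1;
}
```

With the URI member, a segment such as New%20York decodes to "New York"; with the NONE member it is copied unchanged.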

GADATASET_TYPE_PARTITIONING_FACTORY_OPTIONS

#define             GADATASET_TYPE_PARTITIONING_FACTORY_OPTIONS

struct GADatasetPartitioningFactoryOptionsClass

struct GADatasetPartitioningFactoryOptionsClass {
  GObjectClass parent_class;
};

GADATASET_TYPE_PARTITIONING

#define GADATASET_TYPE_PARTITIONING (gadataset_partitioning_get_type())

struct GADatasetPartitioningClass

struct GADatasetPartitioningClass {
  GObjectClass parent_class;
};

GADATASET_TYPE_KEY_VALUE_PARTITIONING_OPTIONS

#define             GADATASET_TYPE_KEY_VALUE_PARTITIONING_OPTIONS

struct GADatasetKeyValuePartitioningOptionsClass

struct GADatasetKeyValuePartitioningOptionsClass {
  GObjectClass parent_class;
};

GADATASET_TYPE_KEY_VALUE_PARTITIONING

#define             GADATASET_TYPE_KEY_VALUE_PARTITIONING

struct GADatasetKeyValuePartitioningClass

struct GADatasetKeyValuePartitioningClass {
  GADatasetPartitioningClass parent_class;
};

GADATASET_TYPE_DIRECTORY_PARTITIONING

#define             GADATASET_TYPE_DIRECTORY_PARTITIONING

struct GADatasetDirectoryPartitioningClass

struct GADatasetDirectoryPartitioningClass {
  GADatasetKeyValuePartitioningClass parent_class;
};

GADATASET_TYPE_HIVE_PARTITIONING_OPTIONS

#define             GADATASET_TYPE_HIVE_PARTITIONING_OPTIONS

struct GADatasetHivePartitioningOptionsClass

struct GADatasetHivePartitioningOptionsClass {
  GADatasetKeyValuePartitioningOptionsClass parent_class;
};

GADATASET_TYPE_HIVE_PARTITIONING

#define             GADATASET_TYPE_HIVE_PARTITIONING

struct GADatasetHivePartitioningClass

struct GADatasetHivePartitioningClass {
  GADatasetKeyValuePartitioningClass parent_class;
};

GADatasetDirectoryPartitioning

typedef struct _GADatasetDirectoryPartitioning GADatasetDirectoryPartitioning;

GADatasetHivePartitioning

typedef struct _GADatasetHivePartitioning GADatasetHivePartitioning;

GADatasetHivePartitioningOptions

typedef struct _GADatasetHivePartitioningOptions GADatasetHivePartitioningOptions;

GADatasetKeyValuePartitioning

typedef struct _GADatasetKeyValuePartitioning GADatasetKeyValuePartitioning;

GADatasetKeyValuePartitioningOptions

typedef struct _GADatasetKeyValuePartitioningOptions GADatasetKeyValuePartitioningOptions;

GADatasetPartitioning

typedef struct _GADatasetPartitioning GADatasetPartitioning;

GADatasetPartitioningFactoryOptions

typedef struct _GADatasetPartitioningFactoryOptions GADatasetPartitioningFactoryOptions;

Property Details

The “null-fallback” property

  “null-fallback”            char *

The fallback string for null. This is used only by GADatasetHivePartitioning.

Owner: GADatasetHivePartitioningOptions

Flags: Read / Write

Default value: "__HIVE_DEFAULT_PARTITION__"

Since: 11.0.0
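
The role of the fallback string can be sketched in plain C as follows: a Hive-style segment value equal to the fallback is read as null, and any other value is kept. The helper hive_value_or_null() and the macro SKETCH_DEFAULT_NULL_FALLBACK are hypothetical and not part of Arrow Dataset GLib; only the default string itself comes from this page.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* The documented default of the "null-fallback" property. */
#define SKETCH_DEFAULT_NULL_FALLBACK "__HIVE_DEFAULT_PARTITION__"

/* Hypothetical helper (not an Arrow Dataset GLib function).
 * Returns NULL when value equals the fallback string,
 * otherwise returns value unchanged. */
static const char *
hive_value_or_null(const char *value, const char *null_fallback)
{
  assert(value != NULL && null_fallback != NULL);
  return strcmp(value, null_fallback) == 0 ? NULL : value;
}
```

Under this reading, a segment such as year=__HIVE_DEFAULT_PARTITION__ stands for a null year rather than a literal string.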

The “segment-encoding” property

  “segment-encoding”         GADatasetSegmentEncoding

After splitting a path into components, decode the path components before parsing according to this scheme.

Owner: GADatasetKeyValuePartitioningOptions

Flags: Read / Write

Default value: GADATASET_SEGMENT_ENCODING_URI

Since: 11.0.0

The “partitioning” property

  “partitioning”             gpointer

The raw std::shared_ptr<arrow::dataset::Partitioning> *.

Owner: GADatasetPartitioning

Flags: Write / Construct Only

The “infer-dictionary” property

  “infer-dictionary”         gboolean

When inferring a schema for partition fields, yield dictionary encoded types instead of plain. This can be more efficient when materializing virtual columns, and Expressions parsed by the finished Partitioning will include dictionaries of all unique inspected values for each field.

Owner: GADatasetPartitioningFactoryOptions

Flags: Read / Write

Default value: FALSE

Since: 11.0.0
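
To make "dictionary encoded types" more concrete, the plain C sketch below stores each distinct partition value once and represents every occurrence by its index. SketchDictionary and sketch_dictionary_encode() are hypothetical names used only for illustration; this is not how Arrow implements dictionary arrays, and the fixed capacity is an arbitrary assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Arbitrary capacity chosen for the sketch. */
#define SKETCH_MAX_UNIQUE 8

/* Hypothetical container: unique values in first-seen order.
 * The strings are borrowed, not copied. */
typedef struct {
  const char *values[SKETCH_MAX_UNIQUE];
  size_t n_values;
} SketchDictionary;

/* Returns the index of value in the dictionary, appending it if it
 * has not been seen before. Returns -1 if the dictionary is full. */
static int
sketch_dictionary_encode(SketchDictionary *dictionary, const char *value)
{
  assert(dictionary != NULL && value != NULL);

  for (size_t i = 0; i < dictionary->n_values; i++) {
    if (strcmp(dictionary->values[i], value) == 0) {
      return (int)i;
    }
  }
  if (dictionary->n_values == SKETCH_MAX_UNIQUE) {
    return -1;
  }
  dictionary->values[dictionary->n_values] = value;
  return (int)dictionary->n_values++;
}
```

Encoding the values us, jp, us this way yields the indices 0, 1, 0 and a dictionary holding the two unique values, which is the kind of "unique inspected values for each field" the property description refers to.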

The “schema” property

  “schema”                   GArrowSchema *

Optionally, an expected schema can be provided, in which case inference will only check discovered fields against the schema and update internal state (such as dictionaries).

Owner: GADatasetPartitioningFactoryOptions

Flags: Read / Write

Since: 11.0.0

The “segment-encoding” property

  “segment-encoding”         GADatasetSegmentEncoding

After splitting a path into components, decode the path components before parsing according to this scheme.

Owner: GADatasetPartitioningFactoryOptions

Flags: Read / Write

Default value: GADATASET_SEGMENT_ENCODING_URI

Since: 11.0.0