Partition Registrator

Before the Ingestr Framework can ingest any data, each Data Partition must be registered so that the framework knows about it and can begin ingestion for it.

The Data Descriptor only provides a template for Data Partitions by defining their structure. Discovering and registering the actual Data Partitions is the responsibility of the Partition Registrator.

Definition

//1. Partition Registrator
  .partitionRegistrator(
    //2. Implementation
    newPartitionRegister(CryptoPartitionRegistrator::new)
      //3. Schedule
      .schedule("0 0 1 ? * *", "UTC")
      //4. Deregistration Method
      .deregistrationMethod(DeregistrationMethod.DEREGISTER)
    )
  1. Partition Registrator - Defines a Partition Registrator on the Data Descriptor

  2. Implementation - Supplies the class that implements the PartitionRegistrator interface

  3. Schedule - The Quartz cron expression and timezone on which the discovery process runs (the example above runs daily at 01:00 UTC)

  4. Deregistration Method - How missing Data Partitions should be treated
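
Quartz cron expressions list their fields as seconds, minutes, hours, day-of-month, month and day-of-week, so "0 0 1 ? * *" fires once a day at 01:00 in the given timezone. The sketch below illustrates that reading using only java.time. It does not use Quartz or the Ingestr API, and ScheduleSketch and nextDailyRun are hypothetical names introduced for illustration.

```java
import java.time.LocalTime;
import java.time.ZoneId;
import java.time.ZonedDateTime;

// Hypothetical helper, not part of the Ingestr Framework or Quartz.
class ScheduleSketch {

    // Returns the next instant strictly after 'now' at which a schedule
    // that fires daily at 'fireTime' in 'zone' would run.
    static ZonedDateTime nextDailyRun(ZonedDateTime now, LocalTime fireTime, ZoneId zone) {
        ZonedDateTime local = now.withZoneSameInstant(zone);
        ZonedDateTime candidate = local.toLocalDate().atTime(fireTime).atZone(zone);
        // If today's run time has already been reached, roll over to tomorrow.
        if (!candidate.isAfter(local)) {
            candidate = candidate.plusDays(1);
        }
        return candidate;
    }
}
```

For the schedule in the definition above, this would be called with LocalTime.of(1, 0) and ZoneId.of("UTC").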

Discovery Process

The discovery method is invoked according to the schedule or via the API call to xxx.

When the discovery process runs, the Ingestr Framework expects to receive the complete set of Data Partitions from the source system. The framework then does the work of creating, merging or de-registering Data Partitions.
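
Because the framework receives the complete set on every run, it can work out what to create, merge or de-register by comparing that set with the partitions it already knows. The following is a minimal sketch of such a comparison, not the framework's actual implementation. Partitions are reduced to plain string keys, and DiscoveryReconciler, Outcome and reconcile are hypothetical names.

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical sketch; the real framework works on Data Partitions, not strings.
class DiscoveryReconciler {

    // Holds the three groups produced by one discovery run.
    static final class Outcome {
        final Set<String> toCreate = new LinkedHashSet<>();
        final Set<String> toMerge = new LinkedHashSet<>();
        final Set<String> toDeregister = new LinkedHashSet<>();
    }

    // Compares the already known partition keys with the complete discovered set.
    static Outcome reconcile(Set<String> known, Set<String> discovered) {
        Outcome outcome = new Outcome();
        for (String key : discovered) {
            if (known.contains(key)) {
                outcome.toMerge.add(key);      // seen before: merge the latest details
            } else {
                outcome.toCreate.add(key);     // new: create the partition
            }
        }
        for (String key : known) {
            if (!discovered.contains(key)) {
                outcome.toDeregister.add(key); // missing: apply the deregistration method
            }
        }
        return outcome;
    }
}
```

This is also why a discovery run that returns only part of the source's partitions would cause the omitted ones to be treated as missing.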

De-Registration

De-registration refers to the scenario where a previously discovered Data Partition is no longer discoverable. This can happen when Data Providers change identifiers, or delete a Data Partition so that it is no longer available.

There are three methods to handle this scenario:

  • DEREGISTER - Sets a flag on the Data Partition indicating it has been de-registered; however, normal processing will continue. This could be ideal in situations where the discovery mechanism is not very reliable. Manual intervention may be needed if the Data Partition produces continuous errors.

  • DISABLE - Disables the Data Partition and prevents normal processing from continuing. Manual intervention may be needed to resolve this.

  • DELETE - Removes the Data Partition entirely.

The DEREGISTER and DISABLE methods are recoverable: the Data Partition automatically returns to normal if the Partition Registrator later re-discovers it, and ingestion resumes from where the last Offset indicates it should.

The DELETE method is unrecoverable: if the Partition Registrator later re-discovers the Data Partition, ingestion would not resume from where it left off, as the Offset no longer exists.
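
The difference between the three methods, and what happens on re-discovery, can be sketched as a small state model. This is an illustration under stated assumptions rather than the framework's code: DeregistrationSketch, State, onMissing and onRediscovered are hypothetical names, the Method enum only mirrors the three documented values, and treating a re-discovered deleted partition as new with offset 0 is an assumption.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of the three documented deregistration methods.
class DeregistrationSketch {

    enum Method { DEREGISTER, DISABLE, DELETE }

    // Minimal stand-in for a stored Data Partition and its Offset.
    static final class State {
        boolean deregistered;
        boolean disabled;
        final long offset;

        State(long offset) { this.offset = offset; }

        // Processing continues unless the partition has been disabled.
        boolean isProcessing() { return !disabled; }
    }

    final Map<String, State> store = new HashMap<>();

    // Applies the configured method to a partition missing from the latest discovery.
    void onMissing(String key, Method method) {
        State state = store.get(key);
        if (state == null) {
            return;
        }
        switch (method) {
            case DEREGISTER:
                state.deregistered = true; // flagged, but still processing
                break;
            case DISABLE:
                state.disabled = true;     // processing stops
                break;
            case DELETE:
                store.remove(key);         // the Offset is lost with the partition
                break;
        }
    }

    // Handles a partition that shows up again in a later discovery run.
    State onRediscovered(String key) {
        State state = store.get(key);
        if (state == null) {
            // Assumption: a deleted partition comes back as a new one with no prior Offset.
            state = new State(0L);
            store.put(key, state);
        } else {
            // Recoverable cases: clear the flags and keep the existing Offset.
            state.deregistered = false;
            state.disabled = false;
        }
        return state;
    }
}
```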

PartitionRegistrator Implementation

//1. PartitionRegistrator class
public class CryptoPartitionRegistrator implements PartitionRegistrator {
    private static final Logger log = LoggerFactory.getLogger(CryptoPartitionRegistrator.class);

    //2. discover method
    @Override
    public void discover(ParitionRegistratorRequest request, ParitionRegistratorResult result) {
        log.info("Discovering new Crypto Partitions ...");

        //logic to discover the Data Partitions at the source

        //3. Register the Data Partition
        result.addPartition(
                newPartition(
                        PartitionEntry.newEntry("currencyPair", "BTCUSD"),
                        PartitionEntry.newEntry("resolution", "5m")
                )
                .priority(Partition.Priority.NORMAL)
                .meta("type", "base")
                .meta("exchange", "bitIngest")
                .tags("BTC", "crypto"));
    }
}
  1. PartitionRegistrator - The interface that a Partition Registrator must implement

  2. discover() - This method is invoked by the Ingestr Framework according to the schedule, or on demand via the API

  3. addPartition() - Adds the Data Partition to the list of discovered partitions
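
Since discovery is expected to report the complete set of Data Partitions, a real discover() implementation would typically call addPartition() once for every partition found at the source rather than for a single hard-coded one. The sketch below shows one way such a set might be enumerated. It uses plain maps in place of the framework's partition types, and CompleteSetSketch, allPartitions and any input values other than "BTCUSD" and "5m" are hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; each map stands in for the entries of one Data Partition.
class CompleteSetSketch {

    // Builds one partition per combination of currency pair and resolution.
    static List<Map<String, String>> allPartitions(List<String> pairs, List<String> resolutions) {
        List<Map<String, String>> partitions = new ArrayList<>();
        for (String pair : pairs) {
            for (String resolution : resolutions) {
                Map<String, String> entries = new LinkedHashMap<>();
                entries.put("currencyPair", pair);
                entries.put("resolution", resolution);
                partitions.add(entries);
            }
        }
        return partitions;
    }
}
```

Inside discover(), each resulting element would correspond to one result.addPartition(...) call.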
