Ingestion
The Ingestion describes which Data Partitions and how they should be ingested. This includes parameters such as:
Ingestion Job - The implementation of the Loader as a Streaming or Batch Job
Schedules - A list of Schedules defined as cron expressions to be used as a catalyst for execution
Triggers - A list of Trigger functions to be used as a catalyst for execution
Definition
The code fragment below is an example of an Ingestion definition that Ingests hypothetical Crypto data based on the following requirements:
I want to ingest data from a Crypto data provider using 2 schedules; 1 for high frequency updates during the day and 1 to reconcile all of the data and end of day
I want to data from the Crypto data provider as soon as there are changes to any of the 'BTC' data
//1.
.ingestion(Ingestion.batch("crypto-ingestion", "Crytpo - Ingestion ", CrytpoBatchJob::new)
.addSchedule( //2.
newSchedule("intraday-hourly", "Intraday", "0 5 * ? * *", "UTC")
.setParameter(newParameterValue("window", "1")
)
)
.addSchedule( //3.
newSchedule("eod-weekly", "End of Day", "0 5 0 ? * *", "UTC")
.setParameter(newParameterValue("window", "24"))
)
.addTrigger( //4.
newTrigger("trigger-btc", CryptoTrigger::new).partitionFilterByTag("btc")
)
)The above fragment defines an Ingestion with the following Behaviours
Job Type - Creates a new Ingestion Job called "crypto-ingestion" which:
Uses the Batch Job Type behaviours
Supplies the CrytpoBatchJob class containing the implementation required to perform the Batch Job
Schedule - Adds the Intraday schedule which invokes the CryptoBatchJob that:
Executes at 5 minutes past each hour (according to the cron)
Utilises a parameter called window with a deaultValue of 1
Schedule - Adds the End of Day schedule which invokes the CrytpoBatchJob that
Executes at 5 minutes past midnight every day in UTC
Utilises a parameter called window with a defaultValue of 24
Trigger - Configures a Trigger that
Utilises the CryptoTrigger function to determine when the CryptoBatchJob gets invoked
Applies a Partition Filter that restricts the eligible Data Partitions to those containing the tag btc
Last updated
Was this helpful?