Scheduler Component

The Denodo Scheduler lets you define, schedule and execute data extraction, data federation and virtual data integration tasks. It works with the other modules of the Denodo Platform to provide functionality such as the following:

  • automating the periodic retrieval and storage of web data, or scheduling other web automation tasks;
  • scheduling crawling, filtering and indexing tasks on unstructured information from the web, document repositories, e-mail servers, RSS sites, etc.;
  • scheduling any task that obtains data from several independent, heterogeneous data sources, combines them, and exports the required information to an external repository. It can also be used to pre-fetch data periodically to fill the virtual data cache.

The Denodo Scheduler generates detailed reports on the results of each task execution, including detailed error information. The results obtained by a task can be exported to a CSV file, a database, an Excel spreadsheet, XML or the DataPort cache. Programmers can also write custom exporters.

The Denodo Scheduler supports extracting data from sources with limited query capabilities. For instance, consider a web service or website that returns information about a single enterprise given its taxid. It is possible to define a task that reads the list of taxids from a database or CSV file and queries the web service or website once per taxid to obtain the data for each one. The Scheduler also persists its tasks, so that if the system is restarted, it resumes the tasks where it left off.
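The iteration pattern described above can be illustrated with a short, generic Python sketch. This is not the Denodo Scheduler API: the function names (read_tax_ids, run_iteration_task, export_csv), the fake_lookup stand-in for the web service and the sample data are all hypothetical, chosen only to show the idea of reading parameter values from a CSV source, querying a limited source once per value, collecting per-item errors, and exporting the results as CSV.

```python
import csv
import io
from typing import Callable, Dict, Iterable, List, Tuple


def read_tax_ids(csv_text: str, column: str = "taxid") -> List[str]:
    """Read the non-empty values of one column from CSV text with a header row."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [(row.get(column) or "").strip()
            for row in reader
            if (row.get(column) or "").strip()]


def run_iteration_task(
    tax_ids: Iterable[str],
    lookup: Callable[[str], Dict[str, str]],
) -> Tuple[List[Dict[str, str]], List[Dict[str, str]]]:
    """Query the limited source once per tax ID.

    A failure for one ID is recorded and does not stop the remaining lookups.
    """
    results: List[Dict[str, str]] = []
    errors: List[Dict[str, str]] = []
    for tax_id in tax_ids:
        try:
            record = lookup(tax_id)
        except Exception as exc:  # keep going; report the failure afterwards
            errors.append({"taxid": tax_id, "error": str(exc)})
            continue
        results.append({"taxid": tax_id, **record})
    return results, errors


def export_csv(rows: List[Dict[str, str]], fieldnames: List[str]) -> str:
    """Serialize the collected rows as CSV text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical stand-in for a web service that only answers "one taxid at a time".
FAKE_SOURCE = {"A100": {"name": "Acme Corp"}, "B200": {"name": "Globex"}}


def fake_lookup(tax_id: str) -> Dict[str, str]:
    return FAKE_SOURCE[tax_id]  # raises KeyError for an unknown taxid


if __name__ == "__main__":
    ids = read_tax_ids("taxid\nA100\nB200\nC300\n")
    rows, failed = run_iteration_task(ids, fake_lookup)
    print(export_csv(rows, ["taxid", "name"]))
    print("failed lookups:", [item["taxid"] for item in failed])
```

In a real deployment the lookup would call the actual web service or website, and the scheduling, persistence and reporting would be handled by the Scheduler itself rather than by hand-written code like this.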
