Combining data from multiple sources with data source merger.

One of the advanced features of the latest Tabulizer version is that you can merge data from various sources into a single table. Let's say you keep the sales reports on a different Excel sheet for different regions and you want to combine them into a single table, or you have one CSV file where you keep a list of names and phones and another database table where you keep a list of names and emails for the same group of people, and you want to combine them into a single table. The figures below shows a few case scenarios of how you can combine the input data sources:

Details please!

OK, now let's get into the details and see how you can do it in practice. Step 1. Define your data sources.
For each separate part of the table you need to specify its data source. Data sources are discussed elsewhere, but basically its a way of linking external sources into your tables. A data source can be:

  • An Excel file
  • A CSV or a text file
  • A web page/external table
  • A database query
  • Another site article/post
  • An RSS/Atom feed
  • An XML file

Step 2. Combine the data sources into a single data source of type "merger".
For the specification of how the various table parts should be combined there is a parameter called "Merging Tree" which uses a syntax similar to the JSON.

Below are some typical cases and their corresponding merging tree value:

CaseMerging Tree
[{"1":"tag of data source 1"},{"2":"tag of data source 2"}]
[{"1":"tag of data source 1","2":"tag of data source 2}]
[{"1":"tag of data source 1"},{"2":"tag of data source 2","3":"tag of data source 3"}]

A simple example

This is the "hello world" example. We will combine two data sources to create a single table, and we will put one data source on top of the other.

Data Source 1:

This will contain a list of users from the database, with the following settings:

Title: DS1 - Users from Database
Source Type: Database Table/Query
Database Query: SELECT name,email FROM #__users WHERE 1
Ruleset Archive/Name: Anything that does not affect the data format/input
Data Source 2:

This will contain a list of users from a text file, with the following settings:

Title: DS2- Users from File
Source Type: CSV/Text File
File path/URL: users.txt
Separator: Comma (,)
Enclosure: Double Quotes (")
Ruleset Archive/Name: Anything that does not affect the data format/input
Note: The separator and the enclosure depend of the contents of the file. In this example the contents of the users.txt file are shown below:
Jack Casey,jack.casey@hotmail.com
Mary Aldrich,may1990@yahoo.gr
Gary Aulmich,gaul@gmail.com

Let's assume that the given data source tags are:

TitleTag
DS1 - Users from DatabaseXxe3gm0rs28gejjnFavHtKfl
DS2- Users from FilevLRPiCNeKjXqGML3pm7wZEBu

Then the data source merger should have the following settings:

Title: DS - All Users
Source Type: Data Source Merger
Merging Tree:  [{"1":"Xxe3gm0rs28gejjnFavHtKfl"},{"2":"vLRPiCNeKjXqGML3pm7wZEBu"}]
Archive/Name: Anything you wish