Considerations when you are importing large files (CSV/Excel)

Tabulizer does not impose any limitation on how large your CSV file can be, but your web server settings may do that. More specifically, there is the PHP memory limit and the PHP max execution time that indirectly put a limit on how large the CSV/Excel file can be. In order to resolve server side memory and time execution limitations the proper and easiest way is to increase these limits via php.ini or ask your web server admin/support to do that for you.

Using the max_execution_time directive

By default, the maximum execution time for PHP scripts is set to 30 seconds. If a script runs for longer than 30 seconds, PHP stops the script and reports an error. You can control the amount of time PHP allows scripts to run by changing the max_execution_time directive in your php.ini file.

To change the maximum execution time, use a text editor to modify the max_execution_timedirective in your php.ini file. For example, to set the maximum execution time to 180 seconds, use the following setting:

max_execution_time = 180

Using the memory_limit directive

By default, a PHP script can allocate up to 128 megabytes of memory.

To change this limit, use a text editor to modify the memory_limit directive in your php.ini file. For example, to allow scripts to allocate a maximum amount of 512 megabytes of memory, use the following setting:

memory_limit = 512M

Borwser (Client-Side) Considerations

There also another limit on the browser on how many DOM nodes can handle. This translates into increased page loading times and even blank pages. For this case, Tabulizer has a experimental feature called "server side" that allows you to load only the part of the table that is need and not all of the table data. For this to work there are some reasonable requirements:

  • The ruleset you are using has a pagination rule
  • The data source cache is enabled.

This feature is available when you display the table dynamically, i.e. via a data source and it should be used only when the table rows are too many.

You can have a look at the following tutorial on how to display really large tables that describes how to use this feature:

Improved memory and processing time features

Starting from version 6.0.30 it's possible to reduce the memory requirements in half if you use a data source of type Excel with the Cell Cache option enabled. For this to work you need to have SQLite3 PHP module enabled. In most cases the memory consumption is reduced in half, with a small increase in the processing time. Obviously, this option is very useful when you are importing large Excel file and increasing the available PHP memory is not an option. There is another beta feature which is highly recommended when the source Excel file is of type "xlsx" and does not contains calculations or images that you want to import. The Boost feature can reduce dramatically the memory requirements and the processing time, but it will not work along with the following features: (a) Read Font, (b) Read Color (c) Read Images. In other words, that's a great option if you want to read just the text from an xlsx file that has no complex data calculations.