OpenOffice.org Forum at OOoForum.orgThe OpenOffice.org Forum
 
 [Home]   [FAQ]   [Search]   [Memberlist]   [Usergroups]   [Register
 [Profile]   [Log in to check your private messages]   [Log in

Importing CSV file that exceeds row limits

 
Post new topic   Reply to topic    OOoForum.org Forum Index -> OpenOffice.org Calc
View previous topic :: View next topic  
Author Message
ArtPrintsShoppe
Newbie
Newbie


Joined: 13 Aug 2007
Posts: 1

PostPosted: Mon Aug 13, 2007 3:08 pm    Post subject: Importing CSV file that exceeds row limits Reply with quote

Good evening everyone!

I just downloaded Open Office, in the hopes of using the Base function to convert a CSV text file that I have with over 100,000 items, into a database format.....

However, when I go to do this, it opens up Calc and tries to retrieve the file, but since the file exceeds the row limits in calc, it cuts off quite a bit of data....

Is there a way I can open this in Base and not have to deal with Calc? Or is there a way to open this in Calc and have it break up the file onto separate pages?

Thanks in advance!

Ingrid
Back to top
View user's profile Send private message
David
Super User
Super User


Joined: 24 Oct 2003
Posts: 5668
Location: Canada

PostPosted: Mon Aug 13, 2007 6:21 pm    Post subject: Reply with quote

It is a text file. Break that into two CSV [text] files and insert each onto a different sheet. Actually, you'd Open the first, then Insert the second. Then cut/paste the second into the first wherever you want [over a few columns.] Paginate from there in the page Break Preview.

David.
Back to top
View user's profile Send private message
DrewJensen
Super User
Super User


Joined: 06 Jul 2005
Posts: 2616
Location: Cumberland, MD

PostPosted: Tue Aug 14, 2007 9:14 pm    Post subject: Reply with quote

Hi

Yes, you can bypass Calc.

You have two ways to achieve what you say you want to do.

ONE - Create a temporary Base file of type TEXT. Then drag and drop the file, which will appear as a table to a second Base file that actually connects to a database engine. ( embedded HSQLdb, HSQLdb file based, HSQLdb server based, MySQL server...whatever )
I DO NOT recommend this approach for 100,000 rows of data. The same for using Calc as an intermediary to transfer the data.

TWO - Use the TEXT table feature of HSQLdb to open your file as a table and use a SQL standard SELECT INTO command to transfer the data to an actual database table. ( actually, there are ways to do this with some of the other database engines also. )
You can find an example of doing this in a couple of places - including doing a search in the Base list on the forum for the word Baseball. ( really )

You can also find information on this in the HSQLdb user document at http://hsqldb.org

NOW a few warnings. The current implementation of the HSQLdb embedded database engine in Base does not scale up worth a damn. For even a medium sized table such as 100,000 records with no CLOB or BLOB fields the performance will be poor to bad. Using a linked TEXT table will yield better performance. So if all you want to do is query the records you might just want to create this linked table type and forget about actually importing the data into a relational table.

If you really want this in a database engine however, then look at using either a HSQL file based database via Base, [ I'm not sure there is a good example of setting this up on the forum or not - but if you are interested I can put one up, it is just a type of JDBC connection ] or use a server based model.

You will also find that if you create data entry forms attached to tables of this size the performance is not good for searches or full table scans. Queries and Filters will perform reasonably well however, so you have to mindful of this.

The last caveat is the reason I do not recommend the first approach for importing the data above - copy / paste or drag /drop performance is the absolute pits. The insert times are ...well, they suck. The developers know of the problem and they are looking at why, but don't count on any relief before the 3.0 release. AFAIK.

OK - so if I haven't scared you off yet - a brighter note. The next release of OOo 2.3 includes a new report designer. This is the first component in Base that is ALL new - meaning not bolted on top of old OOo code - and with this working with tables of 100,000 records is not bad at all performance wise.

So, if you have a 100,000 record csv file and you need to run some queries or produce a report look at creating a HSQLdb file based Base file and linking the file in as TEXT table. Use report builder NOT report wizard and you should be fine. ( OOo 2.3 RC1 should be out any day now ) Otherwise I would not recommend an embedded database under Base, but rather using a Base file connected to a database server of your choice.
_________________
Blog - http://baseanswers.spaces.live.com/
Back to top
View user's profile Send private message Send e-mail Visit poster's website
Display posts from previous:   
Post new topic   Reply to topic    OOoForum.org Forum Index -> OpenOffice.org Calc All times are GMT - 8 Hours
Page 1 of 1

 
Jump to:  
You cannot post new topics in this forum
You cannot reply to topics in this forum
You cannot edit your posts in this forum
You cannot delete your posts in this forum
You cannot vote in polls in this forum


Powered by phpBB © 2001, 2005 phpBB Group