OpenOffice.org Forum at OOoForum.orgThe OpenOffice.org Forum
 
 [Home]   [FAQ]   [Search]   [Memberlist]   [Usergroups]   [Register
 [Profile]   [Log in to check your private messages]   [Log in

Speadsheet as Datasource - redaux

 
Post new topic   Reply to topic    OOoForum.org Forum Index -> OpenOffice.org Base
View previous topic :: View next topic  
Author Message
DrewJensen
Super User
Super User


Joined: 06 Jul 2005
Posts: 2616
Location: Cumberland, MD

PostPosted: Sun Jan 01, 2006 11:03 am    Post subject: Speadsheet as Datasource - redaux Reply with quote

I just found this thread -

http://www.oooforum.org/forum/viewtopic.phtml?t=3825

Very nice.

It seems to me that a really niffty addon would be this:

Hightlight a block of cells, bring up a dialog that lets you select a Base datatabse (or create a new one), create a table in the Base database from these columns and rows and THEN replace the data in the Calc sheet with the same data but supplied by a named databse range, linked to the table in the Base database.

Now you could use the form capabilities in Base - and its integrity functions - and then just do a refresh on the data link to get the data back into Calc for your formula to work their magic on.

Dates are the biggest problem I see. The current copy data wizard just falls flat on it's face here - so the question is can Basic code find out wich EPOCH date the two packages are using, so to handle fix up. OR I suppose the data from the cells could be written out to a CSV file fist, then imported into Base. That negates the date issue.

I take it the API would support the last step of adding the named range link.

Also, I am not sure what the implications would be to formula in the cac sheets that use these cells when they are replced this way.

The filp side to this addon could be this.

Starting from a base data, select a table and automatically generate a Calc document with the data linked in as datasource. I know this is easy in the UI, but hey..

Anyway - it seems from scanning the lists that something like this might be of value, and I would be willing to take a crack at it. I would really like input as to how the flow should go - and if you folks with more Calc experience have points to watch out for I could use that also. Or if someone would like to work together on this, I feel up to the Base side if you want to work on the Calc routines.. Very Happy

Drew
_________________
Blog - http://baseanswers.spaces.live.com/
Back to top
View user's profile Send private message Send e-mail Visit poster's website
Villeroy
Super User
Super User


Joined: 04 Oct 2004
Posts: 10106
Location: Germany

PostPosted: Wed Jan 04, 2006 2:45 pm    Post subject: Reply with quote

Drew,
If developers are not able to provide reliable conversions, let the macro-coders do it. Laughing
With a calc-sheet as datasource every single cell should be examined for it's datatype. There are many sloppy sheets out there with all kinds of data cluttered around. Numerical phone-numbers, numeric IDs mixed with names, currencies with a remark within the cell, error-values/blanks/#NA/0/"?"/"-"/"unknown"/"Ask Bill !!!"/FALSE in one column. I've seen them all. Razz
There should be a better handling than "assume-text-if-ambiguous" (dBase-export) or throw error (drag/drop-export, copy wizz)
Writing a reliable spreadsheet-converter is a rather ambitious project.
############################
The easiest scenario from my chrystal ball (which is quite opaque):
Source: SheetCellRange from selection or current region
Target: new table in a new base datasource
1. get headers (assume first row?, compare with second row?)
2. read first row below headers and set up an array with type-IDs for blanks, integers, doubles, dates, times, datetimes, strings, booleans.
3. until last row {select next row; until last cell{determine type-ID and show some dialog if type-ID does not match}}
The type-err-dialog could provide options like "Ignore Column","Edit", "Set Null", "Set Default [Zero/""/Today()/false]" and most important: "Cancel and repair the shit manually".
4. Having a clean datasource, show the resulting field names with their types.
Provide some controls for shifting the field-order and for dropping/renaming of some fields.
5. Create source and table, write data.
6. If some option is checked, create db-range with table as source (that's easy)
################
Second scenario:
Source: SheetCellRange from selection or current region
Target: an existing table of some type of datasource
1. get sheet-headers (assume first row?, compare with second row?)
2. get table's field names and types.
3. read first row below headers and set up an array with type-IDs for blanks, integers, doubles, dates, times, datetimes, strings, booleans.
4. provide some controls for assignment of a calc-field to appropriate db-field
4a. another option for appending new data.
5. Write records, handling the type-errors as well as index-errors and ignoring auto-increase and time-stamps.
6. If some option is checked, create db-range with table as source (that's easy)
##################
Indices and some conversion (eg. SMALLINT,lenght of string) should be done in base. I think this works properly (not shure about anything in base).
Well, may be we can set all numeric columns to double. But we should distinguish between date,time and datetime(2006-01-14 12:59:59).
###################
Formulas of source-cells get lost.
But there should be another feature:
myDBRange.addRefreshListener("foo_","DragDown")
Sub foo_DragDown(oEvent)
oEnum = getNeighbourColsWithFormulas(oEv.Source.referredCells)
while oEnum.hasMoreElements
AdjustFormulaRangeSize(oEnum.nextElement)
wend
End Sub
The most important benefit: Excel does it like that. Razz
Back to top
View user's profile Send private message
DrewJensen
Super User
Super User


Joined: 06 Jul 2005
Posts: 2616
Location: Cumberland, MD

PostPosted: Wed Jan 04, 2006 8:57 pm    Post subject: Reply with quote

Villeroy wrote:
The easiest scenario from my chrystal ball (which is quite opaque):
Source: SheetCellRange from selection or current region
Target: new table in a new base datasource
1. get headers (assume first row?, compare with second row?)
2. read first row below headers and set up an array with type-IDs for blanks, integers, doubles, dates, times, datetimes, strings, booleans.
3. until last row {select next row; until last cell{determine type-ID and show some dialog if type-ID does not match}}
The type-err-dialog could provide options like "Ignore Column","Edit", "Set Null", "Set Default [Zero/""/Today()/false]" and most important: "Cancel and repair the shit manually".
4. Having a clean datasource, show the resulting field names with their types.
Provide some controls for shifting the field-order and for dropping/renaming of some fields.
5. Create source and table, write data.
6. If some option is checked, create db-range with table as source (that's easy)


Might be a bit ambitious, but just like any other project - break it down into little steps and before you know it you have it licked - either that or the analogy becomes that of the Lobster and the pot of water brought to a boil very gradually Shocked

Lets say that for a 1st release shoot for just the scenario above with a few features left out. No shuffling column orders or dropping columns, you pass it - you get it moved. Also, if a cell does not pass the type test, then offer "Set Null", "Set Default" (Have to think of how to tell the utility what that would be...) and that is it. No interactive stuff at first. Well maybe not.

First small step then is the 'Scrubber' -
Determine each columns type and check each row to see if it matches.

OK - one idea comes to mind - so let me run it by you. Now the API allows me to add a sheet to the end of the Calc document- Right! It also is easy to copy a range to this new sheet - I think? Can I not apply a cell format then too each column in the copy, based on the format of the cells in the second row? Would Calc not mark those that can not accept this format as..what is it #ERR or some such.. So, after the set format step one just iterates over the range looking for those Cells that are in this error state.

Do you see any big problems in this idea so far - if not I will try an put together a macro to do this, unless someone wants to take a pass at it first?

The next step would be to handle the fixup for those that are in error. (Let's leave that on the table for the moment) Lets say that for relese 0.5 these cells are just left NULL in the database table. Handling the fix-up nicely desreves a bit of thought.

Second function to code would be the one to build the database and table creation macros...well, that is easy - Andrew Pitonyak has it all just sitting there in his Base macro document. Never build what you can more easily borrow, right. It just needs a little re-work for our specific needs...so no reason to wait on coding that piece. It can be tested even if the scrubber is not quite done. I will take a pass at this - first. Create a function that takes as parameters (newDBName STRING, newTableName STRING, ColName Array of strings, ColTypes Array of INTEGER) I thing a typeID is just an integer..but will alter as needed.

The third step - populating the newly created table is a flat piece of cake. When the routine is finished, drop the sheet we added to the end of the doucment to clean up. Again this can be built and tested while the scrubber is being done, or waiting to be done. A function that takes as a parameter a Range. For testing the range is always good data... This should only be maybe...what less then a day to make and test..

The forth - linking back a range you say "is easy"..OK.

So the big piece is that first scrubbing step. Get this working and the rest falls into place like legos.

Tell you what...I will create a shell for the functions needed and the call order, I'll create a seperate library for this and put the library up on my web site in a zip file..if someone wants to help they can grab a copy and fill in the stubed functions with code..and email it back to me..I will handle integrating any contributions back into the library proper..when it is done we can put it up on the code snippet list..and eventually mabey over to OOExtras.

EDIT - if someone does grab it and start on a function be sure to email me and let me know so that we don't both work on the same piece at the same time..

I will put up the URL to the file for download when it is ready.
_________________
Blog - http://baseanswers.spaces.live.com/
Back to top
View user's profile Send private message Send e-mail Visit poster's website
Display posts from previous:   
Post new topic   Reply to topic    OOoForum.org Forum Index -> OpenOffice.org Base All times are GMT - 8 Hours
Page 1 of 1

 
Jump to:  
You cannot post new topics in this forum
You cannot reply to topics in this forum
You cannot edit your posts in this forum
You cannot delete your posts in this forum
You cannot vote in polls in this forum


Powered by phpBB © 2001, 2005 phpBB Group