| View previous topic :: View next topic |
| Author |
Message |
enjoying28 General User

Joined: 17 Dec 2010 Posts: 13
|
Posted: Fri Dec 17, 2010 2:18 pm Post subject: [Solved]Work around for Median in SQL Query |
|
|
Hay I need some help. I have a 42k record table that I need the median price for a grouped set of records. Grouped by year then month and then the median (not the avg or mean) of them. I know base does not have a function for median does anyone have code for a macro or something that i could use as a work around?
| Code: | | SELECT Median( "Price" ) AS "Price", "LstArea" AS "LstArea", YEAR( "CloseDate" ) AS "Year", MONTH( "CloseDate" ) AS "Month" FROM "main" "main" WHERE ( "LstStatus" = 'Sold' ) GROUP BY "LstArea", "Year", "Month" |
thanks for any info.
Last edited by enjoying28 on Tue Dec 21, 2010 12:14 pm; edited 1 time in total |
|
| Back to top |
|
 |
Sliderule Super User


Joined: 29 May 2004 Posts: 2474 Location: 3rd Rock From The Sun
|
Posted: Fri Dec 17, 2010 7:22 pm Post subject: |
|
|
enjoying28:
Just as an FYI, I did see your similar post on the other forum:
http://user.services.openoffice.org/en/forum/viewtopic.php?f=13&t=36695
and no responses.
I suspect, that is because there may not be a 'simple' answer.
Now, if it were me ( infamous last words ), this is how I would accomplish the task. For a quick explanation in English . . . before the 'technical' procedure . . .
- I would run / save a Query that produces the results "Price" for each column . . . that is . . . each Year and Month.
- Copy and Paste the results from the Query into a Calc Spreadsheet . . . using F4 to get the Query results to your Calc Spreadsheet
- Using Regular Expressions, replace all values of 0 to nothing . . . that is . . . make sure Regular Expression is selected in the FIND . . . and . . . change ^0 to nothing . . . the reason is to make the entries of 0 ( NULL ) to nothing ( blank ) in Spreadsheet speak.
- Use the Calc Function Median . . . for example . . . MEDIAN(A2:A3272) . . . this assumes, just an example, your Query returned 3271 rows ( in addition to the 1 for Column Name )
- The above Calc formula can be dragged, so only enter it once, and just drag the formula to each additional column.
Now, the SQL you might use . . . assuming ( I just made this up ) you wanted to see data from your Base file is from 2008-01 to 2010-12. You could merely copy and paste each line as needed to include / exclude more Years / Months
Important Note: The SQL below, I am using a CASE WHEN ELSE END statement . . . therefore . . . it MUST be Run Directly ( using the Toolbar . . . click on the icon with SQL and green check mark ).
| Code: | -- Bring Back Data from 2008-01 to 2010-12 For MEDIAN calculation in Calc Spreadsheet
-- Calc Function might look like =MEDIAN(A2:A3272)
-- This Query with OpenOffice 3.2.1 must be Run in Direct Mode
-- In Calc, use Find and Replace . . . Find ^0 and replace with nothing
SELECT
CASE WHEN YEAR("CloseDate") = 2008 AND MONTH("CloseDate") = 1 THEN "Price" ELSE NULL END as "2008-01",
CASE WHEN YEAR("CloseDate") = 2008 AND MONTH("CloseDate") = 2 THEN "Price" ELSE NULL END as "2008-02",
CASE WHEN YEAR("CloseDate") = 2008 AND MONTH("CloseDate") = 3 THEN "Price" ELSE NULL END as "2008-03",
CASE WHEN YEAR("CloseDate") = 2008 AND MONTH("CloseDate") = 4 THEN "Price" ELSE NULL END as "2008-04",
CASE WHEN YEAR("CloseDate") = 2008 AND MONTH("CloseDate") = 5 THEN "Price" ELSE NULL END as "2008-05",
CASE WHEN YEAR("CloseDate") = 2008 AND MONTH("CloseDate") = 6 THEN "Price" ELSE NULL END as "2008-06",
CASE WHEN YEAR("CloseDate") = 2008 AND MONTH("CloseDate") = 7 THEN "Price" ELSE NULL END as "2008-07",
CASE WHEN YEAR("CloseDate") = 2008 AND MONTH("CloseDate") = 8 THEN "Price" ELSE NULL END as "2008-08",
CASE WHEN YEAR("CloseDate") = 2008 AND MONTH("CloseDate") = 9 THEN "Price" ELSE NULL END as "2008-09",
CASE WHEN YEAR("CloseDate") = 2008 AND MONTH("CloseDate") = 10 THEN "Price" ELSE NULL END as "2008-10",
CASE WHEN YEAR("CloseDate") = 2008 AND MONTH("CloseDate") = 11 THEN "Price" ELSE NULL END as "2008-11",
CASE WHEN YEAR("CloseDate") = 2008 AND MONTH("CloseDate") = 12 THEN "Price" ELSE NULL END as "2008-12",
CASE WHEN YEAR("CloseDate") = 2009 AND MONTH("CloseDate") = 1 THEN "Price" ELSE NULL END as "2009-01",
CASE WHEN YEAR("CloseDate") = 2009 AND MONTH("CloseDate") = 2 THEN "Price" ELSE NULL END as "2009-02",
CASE WHEN YEAR("CloseDate") = 2009 AND MONTH("CloseDate") = 3 THEN "Price" ELSE NULL END as "2009-03",
CASE WHEN YEAR("CloseDate") = 2009 AND MONTH("CloseDate") = 4 THEN "Price" ELSE NULL END as "2009-04",
CASE WHEN YEAR("CloseDate") = 2009 AND MONTH("CloseDate") = 5 THEN "Price" ELSE NULL END as "2009-05",
CASE WHEN YEAR("CloseDate") = 2009 AND MONTH("CloseDate") = 6 THEN "Price" ELSE NULL END as "2009-06",
CASE WHEN YEAR("CloseDate") = 2009 AND MONTH("CloseDate") = 7 THEN "Price" ELSE NULL END as "2009-07",
CASE WHEN YEAR("CloseDate") = 2009 AND MONTH("CloseDate") = 8 THEN "Price" ELSE NULL END as "2009-08",
CASE WHEN YEAR("CloseDate") = 2009 AND MONTH("CloseDate") = 9 THEN "Price" ELSE NULL END as "2009-09",
CASE WHEN YEAR("CloseDate") = 2009 AND MONTH("CloseDate") = 10 THEN "Price" ELSE NULL END as "2009-10",
CASE WHEN YEAR("CloseDate") = 2009 AND MONTH("CloseDate") = 11 THEN "Price" ELSE NULL END as "2009-11",
CASE WHEN YEAR("CloseDate") = 2009 AND MONTH("CloseDate") = 12 THEN "Price" ELSE NULL END as "2009-12",
CASE WHEN YEAR("CloseDate") = 2010 AND MONTH("CloseDate") = 1 THEN "Price" ELSE NULL END as "2010-01",
CASE WHEN YEAR("CloseDate") = 2010 AND MONTH("CloseDate") = 2 THEN "Price" ELSE NULL END as "2010-02",
CASE WHEN YEAR("CloseDate") = 2010 AND MONTH("CloseDate") = 3 THEN "Price" ELSE NULL END as "2010-03",
CASE WHEN YEAR("CloseDate") = 2010 AND MONTH("CloseDate") = 4 THEN "Price" ELSE NULL END as "2010-04",
CASE WHEN YEAR("CloseDate") = 2010 AND MONTH("CloseDate") = 5 THEN "Price" ELSE NULL END as "2010-05",
CASE WHEN YEAR("CloseDate") = 2010 AND MONTH("CloseDate") = 6 THEN "Price" ELSE NULL END as "2010-06",
CASE WHEN YEAR("CloseDate") = 2010 AND MONTH("CloseDate") = 7 THEN "Price" ELSE NULL END as "2010-07",
CASE WHEN YEAR("CloseDate") = 2010 AND MONTH("CloseDate") = 8 THEN "Price" ELSE NULL END as "2010-08",
CASE WHEN YEAR("CloseDate") = 2010 AND MONTH("CloseDate") = 9 THEN "Price" ELSE NULL END as "2010-09",
CASE WHEN YEAR("CloseDate") = 2010 AND MONTH("CloseDate") = 10 THEN "Price" ELSE NULL END as "2010-10",
CASE WHEN YEAR("CloseDate") = 2010 AND MONTH("CloseDate") = 11 THEN "Price" ELSE NULL END as "2010-11",
CASE WHEN YEAR("CloseDate") = 2010 AND MONTH("CloseDate") = 12 THEN "Price" ELSE NULL END as "2010-12"
FROM "main" as "main"
WHERE "LstStatus" = 'Sold' |
IMPORTANT NOTE: The above SQL Query assumes your database engine is HSQL ( you can confirm this by looking at the Status Line at the bottom ) after opening your OpenOffice Base file ( *.odb ).
I recognise that this might not the 'convoluted' approach ( database pun inteded ) you were hoping for, but, it is the best I can suggest to accomplish your desired results.
I hope this helps, please be sure to let me / us know.
Sliderule
Thanks to add [Solved] in your first post Title ( edit button ) if your issue has been fixed / resolved. |
|
| Back to top |
|
 |
enjoying28 General User

Joined: 17 Dec 2010 Posts: 13
|
Posted: Sat Dec 18, 2010 12:29 am Post subject: |
|
|
| Thanks for the try I like the idea but there are a few problems when I expanded on your query to take in my three times long time line and two other groups. I endup with a over 500 column query 42K records long and it will not copy and paste into a spreadsheet. I keep trying but there is just two many groups I need. Is it possible to write code to create median just like avg. Because when I run my Avg query on the same data I get four columns aNd just over 600 records. I many have to just back your query up so that I make 40 different one to get the info but there has got to be a better way. |
|
| Back to top |
|
 |
Sliderule Super User


Joined: 29 May 2004 Posts: 2474 Location: 3rd Rock From The Sun
|
Posted: Sat Dec 18, 2010 10:36 am Post subject: |
|
|
enjoying28:
I am not sure I understand what you are saying / asking in your post above.
With your permission, a few questions / comments:
- What version of OpenOffice are you using?
- Which database driver are you using -- put another way, are you using HSQL as the default 'embedded engine' ( version 1.8 ), HSQL version 2.0, MySQL, Microsoft Access, PostGRE, SQLite, some other database driver?
- After starting your OpenOffice Base file ( *.odb ), what does the status line say at the bottom?
- What I was suggesting should NOT return 42K rows, since, it will have multiple columns and your Where clause is limiting it to "LstStatus" = 'Sold' unless ALL your records have a "LstStatus" of 'Sold'. Furthermore, the number of rows returned should be the greatest number in any one month.
- In Calc, by using the F4 key OR from the Calc menu: View -> Data Source
you can drag the Query results to Calc, and have Calc perform the MEDIAN on each column. That way, Calc will be responsible for doing the individual Column sorting and correct calculation of the "middle" / "median" value, whether an even number of values, or, odd.  Sliderule |
|
| Back to top |
|
 |
enjoying28 General User

Joined: 17 Dec 2010 Posts: 13
|
Posted: Sat Dec 18, 2010 1:33 pm Post subject: |
|
|
Thanks Silderule
Okay to answer your questions
1. Openoffice 3.2.1 on windows 7 ultimate 32bit
2. HSQL Embedded
3. HSQL Database engine
4. ok my table has more then 42k records about 3 times more but 42K 'LstStatus' sold records my date range is 6/2005 to 11/2010 I have a need to break that up into groups by 'LstArea' five categories and then again by 'LstType' seven categories. At the lowest division there is about 150 to 250 records in any given month. I don't know what you expected it to return but from my count it returns enough rows for every record within the whole time frame. Which sounds right to me because it places the value in the correct column but does not combined records.
5. I was doing this then copying and pasting like you said.
Now after feeling brain died for not knowing I can drag to import the data it works.
I just have to add where that make 35 different query with about 2000 to 5000 records each.
Thanks for the workable solution. I will mark this as solved but one last question. where can I suggest adding Median to the feature set of Base?
Enjoying28 |
|
| Back to top |
|
 |
Sliderule Super User


Joined: 29 May 2004 Posts: 2474 Location: 3rd Rock From The Sun
|
Posted: Sat Dec 18, 2010 2:12 pm Post subject: |
|
|
You said asked:
| enjoying28 wrote: | | but one last question. where can I suggest adding Median to the feature set of Base? |
Forgive me for answering your question by giving you more information then you may want . . . but here goes.
HSQL, the database endgine, per your comments above, is probably version 1.8.
The HSQL folks have a new version that I use within OpenOffice . . . it is version 2.0.1 . . . but . . . that version will NOT be included with OpenOffice 3.3 ( to be released shortly ) . . . and . . . reportedly not until OpenOffice 3.4 .
Now, HSQL 2.0.1 DOES have an ability for a user to BUILD their own functions . . . for example, I have built a RandBetween function ( to return a Random Number between two given input values ). Also, HSQL 2.0.1 does support a GROUP_CONCAT function.
I, like you, would like to build / create a usable MEDIAN AGGREGATE function that can be user defined function. As of this minute, I have not been successful in doing it, but, I am trying. Please see the link below:
http://sourceforge.net/projects/hsqldb/forums/forum/73674/topic/4019158/
One last final comment . . . based on your description of the 'size' of your database . . . I strongly encourage you to make frequent backups of your OpenOffice Base file ( *.odb ) . . . but . . . more importantly, if it were me . . . I would NOT run this as an 'embedded database'. Keeping all the database as a separate file(s), rather than as a part of the OpenOffice Base zipped ( compressed ) file may likely cause total loss of data. In my experience, the very detailed instructions found in this form by dacm are wonderful. This way, you could run OpenOffice Base as your database BackEnd, and, the power of HSQL 2.0.1 as the backend, either in SERVER MODE ( to allow multiple folks to access the data concurrently ) OR "in file mode" . . . one person at a time, but, no need to have a server application running in the background.
One more comment . . . to see examples of using 'simple' MEDIAN calculations using SQL on just ONE column . . . look at the link below:
http://scorreiait.wordpress.com/2008/10/28/how-to-compute-a-median-in-sql
however, big difference on doing it 'for one column of data' vs doing it for many columns concurrently.
Sliderule |
|
| Back to top |
|
 |
enjoying28 General User

Joined: 17 Dec 2010 Posts: 13
|
Posted: Tue Dec 21, 2010 12:15 pm Post subject: |
|
|
| Thanks |
|
| Back to top |
|
 |
dacm Super User


Joined: 07 Jan 2010 Posts: 734
|
Posted: Tue Dec 21, 2010 4:20 pm Post subject: |
|
|
Here's a link to the tutorial for migration of your database to file/server mode, and then upgrade steps to HSQLDB 2.0.x. This requires some steps as you'll find in the tutorial links, but as a Windows user the setup process can be more automated using the available installers. _________________ Soli Deo gloria
Tutorial: avoiding data loss with Base + Migrating 'Embedded databases' |
|
| Back to top |
|
 |
Sliderule Super User


Joined: 29 May 2004 Posts: 2474 Location: 3rd Rock From The Sun
|
Posted: Thu Dec 23, 2010 5:51 pm Post subject: |
|
|
enjoying28:
Just so you know . . . perhaps some good news regarding the use of MEDIAN as an AGGREGATE function using HSQL as your database engine.
Per the posts below . . . the Beta Version of HSQL ( as of today ) includes the aggregate function MEDIAN . BUT, it is important to know that this is with HSQL 2.1.0 . . . not the default OpenOffice file as of OpenOffice 3.2.1 nor OpenOffice 3.3 ( when it is available ).
http://sourceforge.net/projects/hsqldb/forums/forum/73674/topic/4019158
The download for the HSQL file . . . hsqldb.jar is found in the first paragraphfound at:
http://www.hsqldb.org/support/index.html
Additionally, the link above from dacm ( thanks so much or your post here ) and at http://user.services.openoffice.org/en/forum/viewtopic.php?f=83&t=17567 can perhaps help you to migrate your current OpenOffice file to HSQL 2.1.0 if you desire.
Sliderule |
|
| Back to top |
|
 |
|
|
You cannot post new topics in this forum You cannot reply to topics in this forum You cannot edit your posts in this forum You cannot delete your posts in this forum You cannot vote in polls in this forum
|
Powered by phpBB © 2001, 2005 phpBB Group
|