OpenOffice.org Forum at OOoForum.orgThe OpenOffice.org Forum
 
 [Home]   [FAQ]   [Search]   [Memberlist]   [Usergroups]   [Register
 [Profile]   [Log in to check your private messages]   [Log in

Advanced Deduping Question

 
Post new topic   Reply to topic    OOoForum.org Forum Index -> OpenOffice.org Calc
View previous topic :: View next topic  
Author Message
sc123
General User
General User


Joined: 23 Aug 2006
Posts: 7

PostPosted: Mon Jun 04, 2012 9:03 pm    Post subject: Advanced Deduping Question Reply with quote

I have two sets of data. Each set contains an address field, among others. If the address exists in set A, I want to remove that entire record (row) in set B. The trick here is to get Calc to remove the entire row, and not just that address cell that matches. How can I do that?
Back to top
View user's profile Send private message
ken johnson
Super User
Super User


Joined: 23 Apr 2009
Posts: 2032
Location: Sydney, Australia

PostPosted: Tue Jun 05, 2012 2:15 am    Post subject: Reply with quote

In attached doc columns A to E represent Set A data, Columns G to K represent Set B data.
Columns M to Q have standard formulae, that refer to the results of a Helper formula in column L, to return the rows of Set B that do not have their address in Set A.
Columns R to V have single-value array formulae that do the same without the need for a Helper formula.
Array formula must be entered using Ctrl+Shift+Enter and the Ctrl key must be held down while dragging the fill handle when copying the array formula into adjacent cells.
http://www.mediafire.com/view/?br4hi6u57hbizah
Ken Johnson
_________________
If your problem has been solved please add "[Solved]" to the beginning of your first post title (edit button).


Last edited by ken johnson on Sun Sep 23, 2012 3:55 am; edited 1 time in total
Back to top
View user's profile Send private message
sc123
General User
General User


Joined: 23 Aug 2006
Posts: 7

PostPosted: Tue Jun 05, 2012 7:52 am    Post subject: Reply with quote

ken johnson wrote:
In attached doc columns A to E represent Set A data, Columns G to K represent Set B data.
Columns M to Q have standard formulae, that refer to the results of a Helper formula in column L, to return the rows of Set B that do not have their address in Set A.
Columns R to V have single-value array formulae that do the same without the need for a Helper formula.
Array formula must be entered using Ctrl+Shift+Enter and the Ctrl key must be held down while dragging the fill handle when copying the array formula into adjacent cells.
http://www.4shared.com/file/PfGoraFU/Delete_Duplicate_Rows.html

Ken Johnson


I downloaded the file and see the data but have no idea what to do with it! How do I apply this to my data sets?
Back to top
View user's profile Send private message
ken johnson
Super User
Super User


Joined: 23 Apr 2009
Posts: 2032
Location: Sydney, Australia

PostPosted: Tue Jun 05, 2012 8:38 pm    Post subject: Reply with quote

Formula solutions are too complex, so maybe a filter solution.
Unfortunately there is no escaping formulae completely, filtering needs the results of a helper formula that identifies rows in set B where the residential address (not to be confused with range address) does not also occur in set A.

Step 1: In a spare column on the same row as Set B's heading row type “Helper” then immediately below that cell enter this helper formula...
=ISERROR(MATCH( Relative range address of 1st cell in Set B with a residential address;Absolute Row range address of Set A's residential column;0))
[If the 1st cell in Set B with a residential address is I2 and if Set A's residential addresses occupy C1:C100 then the helper formula would be =ISERROR(I2;$C$1:$C$100;0))]
Step 2: Fill the helper formula down to the last row of Set B. This results in a column of cells showing TRUE or FALSE. Cells with TRUE are in rows with a Set B residential address that is not also in Set A.

Step 3: Select the Helper column so that it can be defined as a Database range.The quickest way is to select the “Helper” heading cell then press Ctrl+Shift+Down Arrow. To define the selection as a database range go Data|Define Range... to bring up the “Define Database Range” dialogue; then simply give it a name eg “Helper”, click the “Add” button then the “OK” button.

Step 4: Add an AutoFilter to the Helper column by selecting any cell in that database range then going Data|Filter|AutoFilter.

Step 5: Filter out the rows with duplicate addresses by clicking the filter's arrow then choosing “1“ from the selection list.

Step 6: Select the filtered Set B, copy, undo the filtering by clicking the filter's arrow and choosing “All”, then paste the copied results into a spare area of the sheet (Use Paste Special, with the Formulae option deselected, if any of the copied results are formulae results).
http://www.mediafire.com/view/?edak90942aqn9ma

Ken Johnson
_________________
If your problem has been solved please add "[Solved]" to the beginning of your first post title (edit button).
Back to top
View user's profile Send private message
Display posts from previous:   
Post new topic   Reply to topic    OOoForum.org Forum Index -> OpenOffice.org Calc All times are GMT - 8 Hours
Page 1 of 1

 
Jump to:  
You cannot post new topics in this forum
You cannot reply to topics in this forum
You cannot edit your posts in this forum
You cannot delete your posts in this forum
You cannot vote in polls in this forum


Powered by phpBB © 2001, 2005 phpBB Group