OpenOffice.org Forum at OOoForum.org

Can I search the contents of openoffice documents

 
Robbro
Guest





Posted: Wed Mar 12, 2003 1:19 pm    Post subject: Can I search the contents of openoffice documents

I have a folder with 200 OpenOffice Writer docs in it, and it would be very helpful to be able to search them to find the ones containing specific text strings such as names. Is there any way to do this? The search function in Windows cannot do this; it can only search the file names.
dfrench
Moderator


Joined: 03 Mar 2003
Posts: 1605
Location: Wellington, New Zealand

Posted: Wed Mar 12, 2003 5:22 pm

You could use the API to provide the FIND functionality of OOo. There is an example of how to do this in the SDK ..examples\basic\text\creating_an_index\ . This *should* provide support for regular expressions in the search argument. Good luck.
Curtz
Super User


Joined: 19 Feb 2003
Posts: 554
Location: In vino veritas!

Posted: Wed Mar 12, 2003 11:21 pm

Ah, ask the users of OOo to develop the missing functionality... doh! I am not a programmer and I doubt that Robbro is.

In other words, NO, you cannot search in files, and that is not a good thing. Even more so because all OOo files are compressed ZIP archives, so it is not possible to search in the files with other utilities.

Many people are "forced" to use naming conventions when saving files, and with file names like SPADE2301.SXW searching in file contents is often the only way to find something QUICKLY.
dfrench
Moderator


Joined: 03 Mar 2003
Posts: 1605
Location: Wellington, New Zealand

Posted: Thu Mar 13, 2003 1:10 am

There are of course many ways to answer the original question, and with a community-supported product your need for a solution may outstrip the developers' capacity to deliver.

If you want to search the files in their current stored format (zipped XML), then that is certainly possible without writing code. There is at least one product out there that will do it for you. See, for example, http://www.zipscan.co.uk/moreinfo.htm

There are risks in relying on the underlying storage format remaining constant, which is why there are many levels of abstraction between the published API and the real file.
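
For anyone who does want to try the zipped-XML route in code, here is a minimal sketch in Python. It is not how Zipscan works. It assumes the document body sits in an archive member named content.xml, and the function names document_text and contains are made up for illustration. As noted above, it depends on the storage format staying the same.

```python
import zipfile
import xml.etree.ElementTree as ET


def document_text(path):
    """Return the text content of one OOo document as a single string."""
    with zipfile.ZipFile(path) as archive:
        # Assumption: the body text is stored in the member "content.xml".
        xml_bytes = archive.read("content.xml")
    root = ET.fromstring(xml_bytes)
    # itertext() yields the text of every element and skips the markup.
    # The pieces are joined without a separator so that a word split by
    # formatting tags is still found; the price is that neighbouring
    # paragraphs run together in the result.
    return "".join(root.itertext())


def contains(path, needle):
    """Case-sensitive substring test on the extracted text."""
    return needle in document_text(path)
```

Something like contains("SPADE2301.SXW", "Robbro") would then answer the original question for a single file.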
ftack
Moderator


Joined: 27 Jan 2003
Posts: 3102
Location: Belgium

Posted: Thu Mar 13, 2003 6:02 am

This is indeed a disadvantage of the OOo file format. You can't just 'grep' to find text in your file. The suggestion by dfrench is a very good one, and I am sure that with some pipes and the like a Linux guru can create a command file that will unzip the docs and 'grep' the desired content. But in fact, all these approaches are quite slow (just finding a file on my 128 MB Linux system already takes an eternity), and there will be a need for indexing facilities for the OOo file formats. Yep, it will still take a long time before the open source developers run out of work.
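
The "unzip and grep" idea does not have to be a shell pipeline. A rough Python equivalent is sketched below. The name grep_documents, the default .sxw extension list, the member name content.xml and the UTF-8 encoding of that member are all assumptions on my part. Like grep on the raw XML, it misses strings that are interrupted by formatting tags or stored as XML entities, and it rereads every file on every search, which is exactly the slowness mentioned above.

```python
import os
import zipfile


def grep_documents(folder, needle, extensions=(".sxw",)):
    """Yield paths of documents under folder whose content.xml contains needle."""
    # Assumption: content.xml is UTF-8 encoded, so the search string is
    # encoded the same way and compared against the raw bytes.
    wanted = needle.encode("utf-8")
    for dirpath, _dirnames, filenames in os.walk(folder):
        for name in sorted(filenames):
            if not name.lower().endswith(extensions):
                continue
            path = os.path.join(dirpath, name)
            try:
                with zipfile.ZipFile(path) as archive:
                    raw = archive.read("content.xml")
            except (zipfile.BadZipFile, KeyError):
                # Not a ZIP archive, or no content.xml inside: skip it.
                continue
            if wanted in raw:
                yield path
```

Usage would be along the lines of: for path in grep_documents("docs", "Robbro"): print(path)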
Robbro
Guest





Posted: Thu Mar 13, 2003 7:10 am    Post subject: THANKS!

Zipscan is just what I needed. It took some toying to get it to search what I want the way I want but it works. Will purchase it soon. Anyone else that wants to search these files, this is the tool you need!
dfrench
Moderator


Joined: 03 Mar 2003
Posts: 1605
Location: Wellington, New Zealand

Posted: Thu Mar 13, 2003 10:15 am

Glad Zipscan met your requirement.
In my day job, I deal with capacity and performance issues of large enterprises, and the migration of desktop tools to the enterprise raises some interesting problems. Just imagine what you are doing to your network when you search for documents in this way, whether by Windows Explorer or Zipscan.
I would be looking for an indexing mechanism operating on the file servers and servicing more than just OOo files, i.e. a document/content management system. Back to the API to provide the necessary interfaces with these products.
Guest






Posted: Thu Mar 13, 2003 5:29 pm

My small indexing tool can index OOo writer files. The graphical user interface needs PyQt and PyKDE, but the command line tool only needs Python and a Python module:

http://www.danielnaber.de/desktopdig/
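
To show what indexing buys you compared with scanning every file on each search, here is a toy inverted index in Python. It is only a sketch of the general idea and makes no claim about how the tool linked above is implemented. The names index_documents and lookup are hypothetical, and content.xml with UTF-8 encoding is assumed as the place where the text lives. Tags are stripped with a simple regular expression, so a word split by formatting ends up as two index entries.

```python
import html
import re
import zipfile
from collections import defaultdict

TAG = re.compile(r"<[^>]+>")   # crude markup stripper, fine for a sketch
WORD = re.compile(r"\w+")      # runs of letters, digits and underscores


def index_documents(paths):
    """Map each lowercased word to the set of documents containing it."""
    index = defaultdict(set)
    for path in paths:
        with zipfile.ZipFile(path) as archive:
            # Assumption: text is in "content.xml", encoded as UTF-8.
            xml_text = archive.read("content.xml").decode("utf-8")
        # Replace tags by spaces, then turn entities such as &amp; back
        # into the characters they stand for.
        text = html.unescape(TAG.sub(" ", xml_text))
        for word in WORD.findall(text.lower()):
            index[word].add(path)
    return index


def lookup(index, word):
    """Return the sorted list of documents that contain word."""
    return sorted(index.get(word.lower(), ()))
```

The index is built once; after that each lookup is a dictionary access and no file has to be opened again.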
LutzH
Newbie


Joined: 22 Apr 2003
Posts: 1
Location: Potsdam, Germany

Posted: Tue Apr 22, 2003 12:20 am

Hi all,

I wrote a small Perl program with a graphical UI.
It takes a pattern and searches OpenOffice.org
files for this pattern. It can find words with German
umlauts case-insensitively. The program may be a
good starting point for developing an OOo text search
program for more platforms and languages.
(Portions of the program need the Windows command "dir"
to get the drive letter.)

Greetings

Excuse my bad English.
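
The case-insensitive umlaut matching described in this post can be illustrated in a few lines of Python. This is not the Perl program itself, only a sketch of the matching step on text that has already been extracted from a document. The function name find_matches is made up.

```python
import re


def find_matches(pattern, text):
    """Return every match of pattern in text, ignoring letter case."""
    # For str patterns in Python 3, IGNORECASE uses Unicode case rules,
    # so a lowercase umlaut in the pattern also matches its uppercase form.
    regex = re.compile(pattern, re.IGNORECASE)
    return [match.group(0) for match in regex.finditer(text)]
```

For example, find_matches("müller", "Herr MÜLLER und Frau Müller") finds both spellings of the name.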
cwchia
Super User


Joined: 09 Jan 2003
Posts: 1050
Location: Malaysia

Posted: Tue Apr 22, 2003 1:39 am

Curtz wrote:
Ah, ask the users of OOo to develop the missing functionality... doh! I am not a programmer and I doubt that Robbro is.

Well, neither am I. Just wondering: has anyone filed a request for this feature with the OOo team?
Guest






Posted: Tue Apr 22, 2003 2:11 am

Quote:
Just wondering: has anyone filed a request for this feature with the OOo team?
Yes, see http://www.openoffice.org/project/www/issues/show_bug.cgi?id=7432
Guest






Posted: Sat May 03, 2003 12:35 pm    Post subject: Python search script available

LutzH wrote:

I wrote a small Perl program with a graphical UI.
It takes a pattern and searches OpenOffice.org
files for this pattern.


The Python version of what LutzH describes is now available at
http://www.danielnaber.de/loook/

All times are GMT - 8 Hours