| Author |
Message |
Robbro Guest
|
Posted: Wed Mar 12, 2003 1:19 pm Post subject: Can I search the contents of openoffice documents |
|
|
| I have a folder with 200 OpenOffice Writer documents in it, and it would be very helpful to be able to search them for specific text strings such as names. The search function in Windows cannot do this; it can only search the file names. Is there any way to do this? |
|
| Back to top |
|
 |
dfrench Moderator

Joined: 03 Mar 2003 Posts: 1605 Location: Wellington, New Zealand
|
Posted: Wed Mar 12, 2003 5:22 pm Post subject: |
|
|
| You could use the API to provide the FIND functionality of OOo. There is an example of how to do this in the SDK ..examples\basic\text\creating_an_index\ . This *should* provide support for regular expressions in the search argument. Good luck. |
|
|
 |
Curtz Super User


Joined: 19 Feb 2003 Posts: 554 Location: In vino veritas!
|
Posted: Wed Mar 12, 2003 11:21 pm Post subject: |
|
|
Ah, ask the users of OOo to develop the missing functionality... doh! I am not a programmer and I doubt that Robbro is.
In other words, NO, you cannot search in files, and that is not a good thing. It is even worse because all OOo files are compressed ZIP archives, so it is not possible to search in the files with other utilities.
Many people are "forced" to use naming conventions when saving files, and with file names like SPADE2301.SXW, searching the file contents is often the only way to find something QUICKLY. |
|
|
 |
dfrench Moderator

Joined: 03 Mar 2003 Posts: 1605 Location: Wellington, New Zealand
|
Posted: Thu Mar 13, 2003 1:10 am Post subject: |
|
|
There are of course many ways to answer the original question, and with a community-supported product your need for a solution may outstrip the developers' capacity to deliver.
If you want to search the files in their current stored format (zipped XML), then that is certainly possible without writing code. There is at least one product out there that will do it for you; see for example http://www.zipscan.co.uk/moreinfo.htm
There are risks in relying on the underlying storage format remaining constant, which is why there are many levels of abstraction between the published API and the real file. |
|
|
 |
ftack Moderator


Joined: 27 Jan 2003 Posts: 3102 Location: Belgium
|
Posted: Thu Mar 13, 2003 6:02 am Post subject: |
|
|
| This is indeed a disadvantage of the OOo file format. You can't just 'grep' to find text in your file. The suggestion by dfrench is a very good one, and I am sure that with some pipes and the like a Linux guru can create a command file that will unzip the docs and 'grep' the desired content. But in fact, all these approaches are quite slow (just finding a file on my 128 MB Linux system already takes an eternity), and there will be a need for indexing facilities for the OOo file formats. Yep, it will still take a long time before the open source developers run out of work. |
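The "unzip and grep" idea above can be sketched in a few lines of Python rather than shell pipes. This is only an illustration, not any poster's tool: it relies on OOo documents being ZIP archives with the text stored in a member called `content.xml`, and the function names `extract_text` and `find_documents` are made up for this example.

```python
import zipfile
from pathlib import Path
from xml.etree import ElementTree


def extract_text(doc_path):
    """Return the plain text found in content.xml of one OOo document."""
    with zipfile.ZipFile(doc_path) as archive:
        xml_bytes = archive.read("content.xml")
    root = ElementTree.fromstring(xml_bytes)
    # itertext() yields the text nodes in document order. Joining with a
    # space keeps words from adjacent paragraphs from running together.
    return " ".join(piece.strip() for piece in root.itertext() if piece.strip())


def find_documents(folder, needle, suffix=".sxw"):
    """Return the paths of documents in folder whose text contains needle.

    The comparison is case-insensitive and only looks at files directly
    inside folder (no subfolders).
    """
    hits = []
    for path in sorted(Path(folder).glob("*" + suffix)):
        try:
            text = extract_text(path)
        except (zipfile.BadZipFile, KeyError, ElementTree.ParseError):
            # Not a ZIP file, no content.xml inside, or malformed XML: skip.
            continue
        if needle.lower() in text.lower():
            hits.append(path)
    return hits
```

Calling `find_documents("C:/docs", "Smith")` (the folder name is just a placeholder) would return the matching `.sxw` files. One known limitation of this sketch: a word that is split by formatting in the middle (for example, half of it in bold) ends up with a space inside it and will not be found. It also reopens every file on every search, which is exactly the slowness described above.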
|
|
 |
Robbro Guest
|
Posted: Thu Mar 13, 2003 7:10 am Post subject: THANKS! |
|
|
| Zipscan is just what I needed. It took some toying to get it to search what I want the way I want but it works. Will purchase it soon. Anyone else that wants to search these files, this is the tool you need! |
|
|
 |
dfrench Moderator

Joined: 03 Mar 2003 Posts: 1605 Location: Wellington, New Zealand
|
Posted: Thu Mar 13, 2003 10:15 am Post subject: |
|
|
Glad zipscan met your requirement.
In my day job, I deal with capacity and performance issues of large enterprises, and the migration of desktop tools to the enterprise raises some interesting problems. Just imagine what you are doing to your network when you search for documents in this way, whether by Windows Explorer or zipscan.
I would be looking for an indexing mechanism operating on the file servers and servicing more than just OOo files, i.e. a document/content management system. Back to the API to provide the necessary interfaces with these products. |
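To make the indexing idea a little more concrete, here is a minimal sketch of an inverted index: the text of every document is read once, and later searches only consult the index instead of rescanning the files. It is a toy illustration, not a document management system; the names `tokenize`, `build_index` and `lookup` are invented for this example, and the input is assumed to be plain text that has already been extracted from the documents.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Split text into lowercase words (runs of letters, digits, underscores)."""
    return re.findall(r"\w+", text.lower())


def build_index(documents):
    """Map each word to the set of document names that contain it.

    documents is a dict of {document name: plain text}.
    """
    index = defaultdict(set)
    for name, text in documents.items():
        for word in tokenize(text):
            index[word].add(name)
    return index


def lookup(index, query):
    """Return the sorted names of documents containing every word in query."""
    words = tokenize(query)
    if not words:
        return []
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return sorted(result)
```

With an index built once on the server, a query such as `lookup(index, "John Smith")` touches no document files at all. Note that it matches whole words in any order, not exact phrases, and a real system would also have to store the index on disk and update it when files change.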
|
|
 |
Guest
|
Posted: Thu Mar 13, 2003 5:29 pm Post subject: |
|
|
My small indexing tool can index OOo writer files. The graphical user interface needs PyQt and PyKDE, but the command line tool only needs Python and a Python module:
http://www.danielnaber.de/desktopdig/ |
|
|
 |
LutzH Newbie

Joined: 22 Apr 2003 Posts: 1 Location: Potsdam, Germany
|
Posted: Tue Apr 22, 2003 12:20 am Post subject: |
|
|
Hi all,
I wrote a small Perl program with a graphical UI.
It takes a pattern and searches OpenOffice.org
files for this pattern. It can find words with German
umlauts, and the search is not case-sensitive. The program
may be a good starting point for developing an OOo text
search program for more platforms and languages.
(Portions of the program need the Windows command "dir"
to get the drive letter.)
Greetings
Excuse my bad English |
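For readers wondering what a case-insensitive pattern search with umlauts looks like, here is a small Python sketch. It is not LutzH's Perl program, whose source is not shown in this thread; the function name `find_pattern` is invented for this example, and the text is assumed to have already been extracted from the document as a Unicode string.

```python
import re


def find_pattern(pattern, text):
    """Return every substring of text matching the regex pattern, ignoring case."""
    # With str patterns, Python 3 applies Unicode-aware case-insensitive
    # matching, so "ü" in the pattern also matches "Ü" in the text.
    regex = re.compile(pattern, re.IGNORECASE)
    return [match.group(0) for match in regex.finditer(text)]
```

For example, `find_pattern("müller", text)` finds both "Müller" and "MÜLLER". Spelling variants are not handled: a pattern containing "ß" will not match text written with "SS", and "ue" will not match "ü".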
|
|
 |
cwchia Super User


Joined: 09 Jan 2003 Posts: 1050 Location: Malaysia
|
Posted: Tue Apr 22, 2003 1:39 am Post subject: |
|
|
| Curtz wrote: | | Ah, ask the users of OOo to develop the missing functionality... doh! I am not a programmer and I doubt that Robbro is. |
Well, neither am I. I just wonder whether anyone has filed a request for this feature with the OOo team? |
|
|
 |
|
 |
Guest
|
Posted: Sat May 03, 2003 12:35 pm Post subject: Python search script available |
|
|
| LutzH wrote: |
i wrote a small Perl Program with a graphical UI.
it takes a pattern and searches in OpenOffice.org
Files for this pattern. |
The Python version of what LutzH describes is now available at
http://www.danielnaber.de/loook/ |
|
|
 |
|