OpenOffice.org Forum at OOoForum.org

[SOLVED] Beating an array (?) limit? - unknown words macro

 
bluegecko
General User


Joined: 12 Jun 2007
Posts: 49
Location: Portugal

Posted: Thu Apr 30, 2009 10:09 am    Post subject: [SOLVED] Beating an array (?) limit? - unknown words macro

Hi there

Here's a challenge for the real experts! The following is part of a routine that either lists all unknown words contained in a doc (i.e. words which don't match any currently enabled dictionary), or adds them to a user-chosen dictionary. The macro below should just list the unknown words in a new document, and it does work for small files.

Unfortunately, it comes a cropper on files containing, as far as I can tell, 16369 words or more, irrespective of their spelling or length (to replicate, create a file with the same word repeated, say, 20,000 times).

The error message OOo throws up is:

Inadmissible value or data type
Index out of defined range


The problematic line appears to be:
Code:

If Not oSpellChk.com_sun_star_linguistic2_XSpellChecker_isValid(oTextCursor.getString(),oCharLoc, Array()) Then ...


OS = XP Doobie, OOo = 3.1.0 (build 9388), shoe size = 42 (or 38, or 8, or ...)

Code:

REM  *****  BASIC  *****
' code simplified for troubleshooting on oooforum
' full credits will be included in corrected macro. For now,
' thanks to Santiago Bosio, "ms777", "Tommy27" and Russ Phillips.

Sub listAllUnknown
   oDocModel = ThisComponent

   If Not HasUnoInterfaces (oDocModel, "com.sun.star.text.XTextDocument") Then
      MsgBox("This document doesn’t support the XTextDocument interface. Nope, I don’t know what it means either, but it sounds bad! Really bad. Sorry. But your kettle should still work if you want to have a cup of coffee and think it all over.")
      Exit Sub
   End If

   oTextCursor = oDocModel.Text.createTextCursor()
   oTextCursor.gotoStart(False)

   oLinguSvcMgr = createUnoService("com.sun.star.linguistic2.LinguServiceManager")
   If Not IsNull(oLinguSvcMgr) Then
      oSpellChk = oLinguSvcMgr.getSpellChecker()
   End If

   If IsNull (oSpellChk) Then
      MsgBox("It’s not possible to access the spellchecker. Don’t ask. Life’s like that sometimes.")
      Exit Sub
   End If

   Do
      If oTextCursor.isStartOfWord() Then
         oTextCursor.gotoEndOfWord(True)
         oCharLoc = oTextCursor.getPropertyValue("CharLocale")
         If Not isEmpty (oCharLoc) Then

' XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
' MACRO STOPS ON FOLLOWING LINE WITH ERROR ON 16369th PASS

            If Not oSpellChk.com_sun_star_linguistic2_XSpellChecker_isValid(oTextCursor.getString(),oCharLoc, Array()) Then

' (sListaPalabras merely stores unknown words)
               sListaPalabras = sListaPalabras + oTextCursor.getString() + Chr(13)
            End If
         End If
         oTextCursor.collapseToEnd()
      End If
   Loop While oTextCursor.gotoNextWord(False)

   If Len(sListaPalabras) = 0 Then
      MsgBox "No unknown words.",0,"ERROR: No squiggly red lines"
      Exit Sub
   End If

   oListDocModel = StarDesktop.loadComponentFromURL("private:factory/swriter", "_default", 0, Array())
   oListDocModel.Text.String = "The following words are not in any dictionary or word list currently active:"  + Chr(13) + Chr(13) + sListaPalabras
   oListDocModel.CurrentController.Frame.activate()
   ThisComponent.CurrentController.getViewCursor.gotoStart(False)   ' go to top of doc

End Sub


Edited: simplified macro and added error message text.


Last edited by bluegecko on Fri May 01, 2009 3:52 am; edited 3 times in total
B Marcelly
Super User


Joined: 12 May 2004
Posts: 1453
Location: France

Posted: Thu Apr 30, 2009 10:57 am

Hi,
I can't say anything about your algorithm, but this line caught my attention:
Code:
 sListaPalabras = sListaPalabras + oTextCursor.getString() + Chr(13)

The problem is that in Basic a string has a maximum of 65535 characters.
Supposing that each word is about 3 characters long, and that the words are separated by a Chr(13) character, each entry takes 4 characters, so you cannot store more than about 16384 words (65536 / 4) in this string. Maybe Basic misbehaves if the 65535 limit is exceeded.
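
One way to check whether the string length is really the culprit is to build a long string directly, without any spellchecker involved. The macro below is only a sketch for that experiment; the Sub name and the chunk size are made up for illustration, and no particular result is claimed here.

```basic
REM  *****  BASIC  *****
' Hypothetical test macro: grows a string to 100000 characters
' (well past 65535) and reports the length actually reached.
Sub TestStringLength
   Dim s As String
   Dim i As Long

   For i = 1 To 100
      ' append 1000 characters per pass
      s = s & String(1000, "x")
   Next i

   MsgBox "Length reached: " & Len(s)
End Sub
```

If this runs to the end without an error, the string limit is unlikely to explain a failure that always happens at the same word count.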
______
Bernard
bluegecko
General User


Joined: 12 Jun 2007
Posts: 49
Location: Portugal

Posted: Thu Apr 30, 2009 1:04 pm

Thanks for the reply, but I think the line you pick on isn't guilty: sListaPalabras merely stores misspellings. Plus, the macro bails out at 16369 words, whether it's the word "hello" that's repeated so many times, or "supercalifragilisticexpialigocious", but the macro works fine with either of those words if there are only 16368 of them, or fewer.

I've edited my first post to simplify the macro (it should work copy-pasted) and have also added the error message OOo throws up once the limit is reached.

I suspect it has something to do with the way TextCursor works (not something I understand). Given the amount of time that elapses from macro start to error, it does step through the first 16368 words without problem.

~bluegecko
B Marcelly
Super User


Joined: 12 May 2004
Posts: 1453
Location: France

Posted: Thu Apr 30, 2009 11:50 pm

OK, it's a bug.
Here is a simplified demo. Create a simple Writer document and copy this macro :
Code:
REM  *****  BASIC  *****
' code simplified for troubleshooting on oooforum
' This code repeatedly checks the word Hello

Sub repeatSpellCheck
Dim nbLoops As Long
Dim oCharLoc As New com.sun.star.lang.Locale

   oLinguSvcMgr = createUnoService("com.sun.star.linguistic2.LinguServiceManager")
   oSpellChk = oLinguSvcMgr.getSpellChecker()

   oCharLoc.Language = "en"
   oCharLoc.Country = "US"
   
   ' this loop works OK
   for nbLoops = 1 to 16369
     ' for this demo, no need to use the result of isValid
     oSpellChk.com_sun_star_linguistic2_XSpellChecker_isValid("Hello", oCharLoc, Array())
   next
   ' the 16370 th call throws exception
     oSpellChk.com_sun_star_linguistic2_XSpellChecker_isValid("Hello", oCharLoc, Array())

   MsgBox("Finished OK")
End Sub

After checking that it has not already been reported, you may create an Issue for this bug.
This is not a new bug. It also exists in OpenOffice.org version 1.1.5.
______
Bernard
bluegecko
General User


Joined: 12 Jun 2007
Posts: 49
Location: Portugal

Posted: Fri May 01, 2009 1:19 am

That's a sweet bit of troubleshooting code, thanks.

Googling "openoffice issue 16368 words" comes up with several pages (e.g. http://wiki.services.openoffice.org/wiki/Documentation/BASIC_Guide/Arrays), all saying this about arrays:

Quote:

The maximum number of elements (within a data field dimension) is 16368.


So, I guess Array(), whatever that is, is maxing out, and I doubt the devs would consider it a bug.

In my blind ignorance, redefining oLinguSvcMgr and oSpellChk after the first 16368 iterations seems to work, thus:

Code:

Sub repeatSpellCheck
Dim nbLoops As Long
Dim oCharLoc As New com.sun.star.lang.Locale

   oLinguSvcMgr = createUnoService("com.sun.star.linguistic2.LinguServiceManager")
   oSpellChk = oLinguSvcMgr.getSpellChecker()

   oCharLoc.Language = "en"
   oCharLoc.Country = "US"
   
   ' this loop works OK
   for nbLoops = 1 to 16369
     ' for this demo, no need to use the result of isValid
     oSpellChk.com_sun_star_linguistic2_XSpellChecker_isValid("Hello", oCharLoc, Array())
   next

   Msgbox("looped 16368 times. Just one more...")

' redefine to reset Array?
     oLinguSvcMgr = createUnoService("com.sun.star.linguistic2.LinguServiceManager")
     oSpellChk = oLinguSvcMgr.getSpellChecker()

     oSpellChk.com_sun_star_linguistic2_XSpellChecker_isValid("Hello", oCharLoc, Array())
   MsgBox("Finished OK")
End Sub


Now let's see whether it works on the proper code...
~bluegecko
B Marcelly
Super User


Joined: 12 May 2004
Posts: 1453
Location: France

Posted: Fri May 01, 2009 2:42 am

bluegecko wrote:
Googling "openoffice issue 16368 words" comes up with several pages (eg http://wiki.services.openoffice.org/wiki/Documentation/BASIC_Guide/Arrays), all saying this about arrays:

The maximum number of elements (within a data field dimension) is 16368.

This is only one of the remaining errors in the Basic Guide. You can declare an array of 1 million elements if you want.
Anyway, the bug has nothing to do with Basic, it's in the API.
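
The claim about array size can be tried with a few lines. This is a sketch only; the Sub name is hypothetical, and whether it succeeds may depend on the OOo version and available memory.

```basic
REM  *****  BASIC  *****
' Hypothetical test macro: declares an array far larger than the
' 16368-element figure quoted from the Basic Guide and writes to
' its last element.
Sub TestBigArray
   Dim a(1000000) As Long

   a(1000000) = 42
   MsgBox "Upper bound: " & UBound(a) & ", last element: " & a(1000000)
End Sub
```

If the macro completes, the 16368 figure is not a limit on Basic arrays, which fits the view that the failure lies in the API call and not in Array().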

B Marcelly wrote:
After checking if not already described, you may create an Issue for this bug.

By saying this I did not mean googling, but searching the Issue database.
bluegecko wrote:
In my blind ignorance, redefining oLinguSvcMgr and oSpellChk after the first 16368 iterations seems to work

Yes, you only need to change the spellchecker after approx 16000 checks.
Code:
oSpellChk = oLinguSvcMgr.getSpellChecker()
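
Applied to the demo loop, the workaround might look like the sketch below. The refresh interval of 10000 and the total of 40000 calls are arbitrary values chosen for illustration, and the Sub name is made up. It assumes that calling getSpellChecker() again is enough to reset whatever is overflowing; if it is not, the service manager can be recreated as well, as in bluegecko's version.

```basic
REM  *****  BASIC  *****
' Sketch: same repeated check as the demo, but the spellchecker
' is fetched again at regular intervals, before the failing call count.
Sub repeatSpellCheckWithRefresh
   Const REFRESH_EVERY = 10000   ' assumed margin, well below ~16368
   Dim nbLoops As Long
   Dim oCharLoc As New com.sun.star.lang.Locale

   oLinguSvcMgr = createUnoService("com.sun.star.linguistic2.LinguServiceManager")
   oSpellChk = oLinguSvcMgr.getSpellChecker()

   oCharLoc.Language = "en"
   oCharLoc.Country = "US"

   For nbLoops = 1 To 40000
      ' get a fresh spellchecker every REFRESH_EVERY calls
      If nbLoops Mod REFRESH_EVERY = 0 Then
         oSpellChk = oLinguSvcMgr.getSpellChecker()
      End If
      oSpellChk.com_sun_star_linguistic2_XSpellChecker_isValid("Hello", oCharLoc, Array())
   Next nbLoops

   MsgBox("Finished OK")
End Sub
```

In the word-listing macro, the same counter and refresh would go inside the Do ... Loop, just before the isValid call.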

______
Bernard
bluegecko
General User


Joined: 12 Jun 2007
Posts: 49
Location: Portugal

Posted: Fri May 01, 2009 3:51 am

Indeed, 'tis solved - have changed thread title. I'll post the completed macro probably later today, once I've converted the dialog into something that can be posted as part of the code.

Thank you!

~bluegecko
All times are GMT - 8 Hours