Bug Reports

Author Thread: find primitive vs []SS
Nicolai.Ohm
find primitive vs []SS
Posted: Friday, April 08, 2005 10:04 AM (EST)

I have a question concerning the ‘find’ primitive (⍷) and the system function ‘string search’ ([]SS).

textvec ⎕SS 'string'
'string' ⍷ textvec

By accident I noticed that []SS performs much better than the find primitive. In our application (which in some parts makes heavy use of string searches) I found that []SS is roughly 5 times faster than the find primitive. I was a bit surprised, because I thought both functions would use very similar internal procedures of the APL+Win interpreter.

 
So is []SS still the recommended function for string searches?


Comments:

Author Thread:
davin.church
find primitive vs []SS
Posted: Friday, April 08, 2005 12:01 PM (EST)
The Find primitive is more generalized. It can search within matrices and can even find sub-matrices within matrices. It can also operate on other datatypes - I've used it to search for numeric patterns. It will even work on nested arrays to find patterns of nested data. So I usually use []SS when I'm doing simple text vector searches and {epsilon-underbar} when doing more complicated things.

     

Nicolai.Ohm
find primitive vs []SS
Posted: Saturday, April 09, 2005 12:04 PM (EST)
Thanks, I was aware of the differences between the find primitive and []SS, but I didn’t expect huge performance differences. I originally thought that []SS was a leftover from “the early days” of the APL2000 interpreter, comparable to []ENLIST or []PENCLOSE, and so better not used because you can achieve the same result with a standard APL primitive, especially as the documentation does not mention anything special about []SS. But now I see that []SS is really an important function and the number one choice for performance-critical string searches.

As an example, we are generating dynamic HTML pages by repeatedly searching through the HTML text string and replacing proprietary tags with the desired HTML output tags (e.g. lists, combo, etc.). Causeway's NEWLEAF and RAINPRO products also use a lot of string searches. These products generate an intermediate text vector with proprietary tags, which is then used to produce PDF, HTML, SVG, Viewer, etc. output. In the process of generating the final output, the intermediate text vector needs to be searched several times. But as they are porting their products to different APL platforms, the use of []SS is out of the question for them.

     

j.merrill
find primitive vs []SS
Posted: Saturday, April 09, 2005 9:20 PM (EST)
You should be aware, if you're doing replacement operations, that the assembler implementation used in the function TEXTREPL (in the ASMFNS ws) is going to whip the pants off anything you can write with APL, even using []SS to find the strings to be replaced. That's particularly true if you're going to make many changes.
 
Even if you don't know what to replace (or what to replace it with) until you find the string and do some processing, if you can "save up" all the changes until the end and use one call to TEXTREPL, you can end up doing one "allocate new memory area to hold the final result" instead of a large number of such operations.
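
To illustrate the "save up all the changes" idea in a language-neutral way, here is a minimal Python sketch. It is not APL and it is not how TEXTREPL is implemented; the names `find_all` and `apply_replacements` and the `<%NAME%>` style tags are hypothetical, chosen only for this example. The point is that the positions are collected first and the result is assembled once at the end, rather than rebuilding the whole text after every single replacement.

```python
def find_all(text, pattern):
    """Return 0-based start positions of non-overlapping occurrences of pattern."""
    if not pattern:
        raise ValueError("pattern must not be empty")
    positions = []
    i = text.find(pattern)
    while i != -1:
        positions.append(i)
        i = text.find(pattern, i + len(pattern))
    return positions


def apply_replacements(text, edits):
    """Apply non-overlapping (start, length, replacement) edits in a single pass.

    The pieces are gathered in a list and joined once, so the final
    result is built in one step instead of once per replacement.
    """
    pieces = []
    cursor = 0
    for start, length, replacement in sorted(edits):
        if start < cursor:
            raise ValueError("edits overlap")
        pieces.append(text[cursor:start])   # unchanged text before this edit
        pieces.append(replacement)          # the new text
        cursor = start + length             # skip over the replaced span
    pieces.append(text[cursor:])            # unchanged tail
    return "".join(pieces)


# Hypothetical page with proprietary tags (illustrative only).
page = "<p><%NAME%> has <%COUNT%> items</p>"
values = {"<%NAME%>": "Ada", "<%COUNT%>": "3"}

# Phase 1: search for every tag and save up the changes.
edits = []
for tag, value in values.items():
    for pos in find_all(page, tag):
        edits.append((pos, len(tag), value))

# Phase 2: apply them all at once.
result = apply_replacements(page, edits)
print(result)  # <p>Ada has 3 items</p>
```

The same two-phase shape (search everything, then replace everything) is what allows a single call to a bulk replace routine at the end.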

     

davin.church
find primitive vs []SS
Posted: Sunday, April 10, 2005 4:15 PM (EST)
Of course, if you need to write platform-independent code, then you'll either need to stick with the primitive, or write a cover function that selects the fastest facility available on the platform you're working in.

     


