“I want to find all of the documents that talk about tango! I need them quickly! Can I do that?”
OK, but first, breathe!
Keyword searches within 4D Write Pro documents simply require adding a new indexing attribute to each document. This isn’t done by default: this type of search is rarely needed, so it wouldn’t make sense to systematically increase the size of every document. However, when it’s needed, this type of index is very easy to build.
Take a sentence such as “Tango is a traditional Argentinean dance”. Our objective is to create a list of the words found in the content of each document. This is mainly done using two commands:
- WP Get text – returns the raw text of the document
- GET TEXT KEYWORDS – fills a text array with the words from the text provided
Once this array has been filled, simply add it as an attribute of the 4D Write Pro document object, and use it as a target for keyword searches.
Let’s take a closer look at the details.
Don’t miss anything inside the document
WP Get text returns the content of the target as plain text. The target can be a given range or an entire document. If you pass a document as an argument, only the text in the body of the document will be returned. Headers and footers will be ignored, which is probably not desirable. Since each section of a document can have its own headers and footers, you’ll need to loop through each one and read their content. Be sure to take any variants into account (e.g., first page, right-hand page, left-hand page).
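The loop described above could be sketched as follows. This is only an illustration: the method name WP_GetAllText is hypothetical, and the sketch assumes that header and footer objects returned by WP Get header and WP Get footer can be passed directly to WP Get text, and that these commands return Null when the requested header or footer does not exist.

```4d
  // Project method WP_GetAllText (hypothetical name)
  // $1: 4D Write Pro document
  // $0: text of the body plus all headers and footers
C_OBJECT($1;$doc;$header;$footer)
C_TEXT($0;$text)
C_COLLECTION($sections)
C_LONGINT($i;$j)
ARRAY LONGINT($types;3)

$doc:=$1

  // Subsection variants to check for each section
$types{1}:=wk first page
$types{2}:=wk left page
$types{3}:=wk right page

  // Body of the document only
$text:=WP Get text($doc;wk expressions as value)

$sections:=WP Get sections($doc)
For ($i;1;$sections.length)
	For ($j;0;3)
		If ($j=0)
			  // Default header and footer of the section
			$header:=WP Get header($doc;$i)
			$footer:=WP Get footer($doc;$i)
		Else 
			  // Header and footer of a variant (assumed Null if absent)
			$header:=WP Get header($doc;$i;$types{$j})
			$footer:=WP Get footer($doc;$i;$types{$j})
		End if 
		If ($header#Null)
			$text:=$text+"\r"+WP Get text($header;wk expressions as value)
		End if 
		If ($footer#Null)
			$text:=$text+"\r"+WP Get text($footer;wk expressions as value)
		End if 
	End for 
End for 

$0:=$text
```

Passing wk expressions as value means that formulas are indexed by their computed result; whether that is desirable depends on your documents.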
Putting the text in an array
GET TEXT KEYWORDS fills an array based on the content of the source text. The only subtlety here is not to forget the optional star (*) parameter. With it, each distinct word of the text is stored only once in the array, no matter how many times it appears. This avoids unnecessary “weight” 🙂
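As a minimal sketch of this step (the sample text and variable names are purely illustrative):

```4d
C_TEXT($text)
ARRAY TEXT($keywords;0)

$text:="Tango is a traditional Argentinean dance. Tango is danced in pairs."

  // The * parameter stores each distinct word only once,
  // so "Tango" appears a single time in $keywords
GET TEXT KEYWORDS($text;$keywords;*)
```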
Create the index in the document
The name of the attribute that will contain the index is up to you. However, it’s strongly recommended to prefix it with an underscore to avoid any potential conflicts with existing (or future) public attributes of 4D Write Pro documents (Example: “_keywords”.)
Of course, this indexing must be redone each time a document is modified, but it is a very quick task and won’t cause any noticeable slowdown for users.
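Here is one possible way to write the index into the document, assuming a [SAMPLE] table with an object field named WP as in the queries shown in this post. The method name WP_BuildKeywordIndex is hypothetical, and using OB SET ARRAY to store the words is one option among others, not necessarily the one used in the original snippet.

```4d
  // Project method WP_BuildKeywordIndex (hypothetical name)
  // Rebuilds the "_keywords" attribute of the document stored
  // in the current [SAMPLE] record. Call it before saving the record.
C_OBJECT($doc)
C_TEXT($text)
ARRAY TEXT($keywords;0)

$doc:=[SAMPLE]WP

  // Body text only; headers and footers must be read separately
$text:=WP Get text($doc;wk expressions as value)

  // Distinct words only, thanks to the * parameter
GET TEXT KEYWORDS($text;$keywords;*)

  // Store the words as an array attribute of the document
OB SET ARRAY($doc;"_keywords";$keywords)

  // Reassign the field so that the record is flagged as modified
[SAMPLE]WP:=$doc
```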
Index the index!
For queries on the keyword index to be fast and efficient, the object field that holds the document must, of course, itself be indexed. There’s no need to worry about the size of this index: only exposed (public) attributes are indexed, so only a very slight overhead is to be expected. There is also no need to create a new object field that would only serve this purpose. It would be totally useless, and it would take up more space than reusing an existing object (i.e., the 4D Write Pro document itself).
For object fields, the “Automatic” index type means “cluster B-tree”, which is perfect for business letters or similar documents.
Making queries on this index
Classic queries must be made “by attribute”. Keep in mind that the attribute is a collection, so you must add opening and closing brackets [ ] to indicate to 4D that the search must be made within this collection. For example:
QUERY BY ATTRIBUTE([SAMPLE]; [SAMPLE]WP; "_keywords[]"; =; $val)
ORDA queries are carried out following the same principle:
$entitySel:=ds.SAMPLE.query("WP._keywords[] = :1"; $val)
Conclusion
If this type of indexing wasn’t planned from the start and the need arises later, it’s never too late! Running the indexing method on existing records with APPLY TO SELECTION will do the job perfectly. Searches can also be combined to match several words at once (all of the words) or any one of them (at least one of the words). This will allow you (and especially the users of your databases) to find texts very quickly according to the words they contain.
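As a rough sketch of both ideas, assume a project method, here hypothetically named WP_BuildKeywordIndex, that rebuilds the “_keywords” attribute of the document in the current [SAMPLE] record. The search words are only examples.

```4d
  // Index every existing document; WP_BuildKeywordIndex is a
  // hypothetical method that works on the current record
ALL RECORDS([SAMPLE])
APPLY TO SELECTION([SAMPLE];WP_BuildKeywordIndex)

  // All of the words: documents containing "tango" AND "dance"
QUERY BY ATTRIBUTE([SAMPLE];[SAMPLE]WP;"_keywords[]";=;"tango";*)
QUERY BY ATTRIBUTE([SAMPLE];&;[SAMPLE]WP;"_keywords[]";=;"dance")

  // At least one of the words: documents containing "tango" OR "dance"
QUERY BY ATTRIBUTE([SAMPLE];[SAMPLE]WP;"_keywords[]";=;"tango";*)
QUERY BY ATTRIBUTE([SAMPLE];|;[SAMPLE]WP;"_keywords[]";=;"dance")

  // ORDA equivalent of the "all of the words" search
C_OBJECT($entitySel)
$entitySel:=ds.SAMPLE.query("WP._keywords[] = :1 and WP._keywords[] = :2";"tango";"dance")
```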
Now you have two methods to create (and maybe remove) the full-text index! They can be used starting with 4D V17. Check out this code snippet that you can use in your own application!