- Overview
- How to perform a "full-text" search?
- How to create a full-text index?
- Remarks
- Defining the stop words
- Defining the synonyms
- How to create a query used to perform a full-text search?
- Full-text search: Query editor
- Full-text search: Query in SQL code
- Syntax of the search value
- Specific case of words separated by a dash
- Analyzing the result of a full-text query
- Managing the full-text indexes programmatically
Search and full-text index
Available only with these kinds of connection
HFSQL proposes a quick search for character strings in the data. This search (called "full-text") is used to find a word or a set of words. This allows you to index, without programming, the texts found in an HFSQL database. These texts can be found in "Text" items or in "Text memo" items. An index can index one or more different items. The creation of the index can take the formatted texts into account (RTF, HTML): the tags of these formats will be ignored during the indexing. This allows you to find words stored in RTF or HTML documents. The results of the full-text indexing are proposed according to a relevance order ("ranking"). Attention: the "Full-text" search is not a simple "contains" search: punctuation is not taken into account.
How to perform a "full-text" search? To perform a "full-text" search, you must: - Create a full-text index in the data files affected by this search.
- Create a query to perform the full-text search.
- Study and display the result of the query.
How to create a full-text index? Creating a full-text index is performed in the data model editor. To create a full-text index: - In the data model editor, display the description of the items found in the relevant file ("Description of items" in the context menu).
- Click . The window for defining the full-text index is displayed. Select the items taking part in the composition of the full-text index. Only the "Text" items and the "Text memo" items found in the file description are proposed. Validate.
- The "Full-text index" item is displayed in the list of items.
- Select the item and modify its characteristics if necessary:
- in the list of items, you have the ability to modify the name of the full-text index.
- in the right section of the window, select the parameters of the full-text index:
- minimum size of indexed words (option "Do not index words whose size is less than or equal to"): only words larger than this size will be indexed. This size is set to 2 characters by default.
- RTF tag management: the "My text may contain RTF
(do not index the RTF tags)" option specifies whether the text contains RTF.. In this case, the RTF tags are ignored when indexing.
- HTML tag management: "My text may contain HTML
(don't index the HTML tags)" option specifies whether text contains HTML. In this case, the HTML tags are ignored when indexing.
- Manage UNICODE punctuation: this option specifies whether the text contains Unicode-specific punctuation characters (curved quotation marks, etc.)..
- case management: If the "Case sensitive" option is checked, the case of words (upper/lower case) will be taken into account during indexing..
- accentuation management: If the "Accent sensitive" option is checked, accentuation will be taken into account during indexing.. This option is required if the stemming is requested.
- Hollow words management: Hollow words management allows you to define hollow words that should not be taken into account by indexing.. Hollow words are usually articles, linking words, etc.. The button "Edit stop words" is used to define the stop words that must be taken into account. For more details, see Defining the stop words.
- synonym management: Synonym management allows you to define which words should be considered synonyms.. For more details, see Defining the synonyms.
- root management: If synonym management is not activated, it is possible to activate root management.. The stemming is used to perform searches on a full-text index while taking the variations of a given word into account. This feature is interesting for the knowledge databases and it allows you to find the words in singular or plural form. The stemming depends on the language used.
If this option is enabled, select the language used.
- Validate the description window of items.
- Save and generate the analysis. When generating the analysis, the description of the data files is modified and the full-text index is created (file with a ".FTX" extension).
Defining the stop words If the option "Stop words" is selected, simply click "Edit stop words" to define the stop words. - Click "Edit stop words".
- A list of stop words used by default is displayed.
- You can:
- Add hollow words: click on the "Add" button. In the window that is displayed, enter the list of stop words to add. These words can be separated by spaces, tabulations, carriage returns or the ",", ";" and ":" characters.
- Delete one or more hollow words: select the hollow word(s) to be deleted from the list and click on the "Delete" button..
- Import hollow words: click on the "Import" button and select the text file containing the hollow words.. These words can be separated by spaces, tabulations, carriage returns or the ",", ";" and ":" characters.
- Paste hollow words: to paste hollow words available in the clipboard (and separated by spaces, tabs, carriage returns or the characters ",", ";" and ":"), copy the words to the clipboard (Ctrl + C) and click on the "Paste" button..
- Validate.
Defining the synonyms If the option "Synonyms" is selected, simply click "Edit synonyms" to define the synonyms. - Click "Edit synonyms".
- In the window that is displayed, specify line by line the synonyms that will be taken into account. For example:
house chalet villa condominium townhouse - Validate.
How to create a query used to perform a full-text search? The full-text searches are performed via queries: queries created in the query editor or queries created in SQL code. The search condition is entered when creating the query. Full-text search: Query editor To create a query performing a full-text search, all you have to do is add a "Full-text index" item to the query. The description window of a full-text search is automatically displayed. This window allows you to specify: - The search value. This value can be entered directly (a wizard is used to build the sought string) or it can correspond to a parameter. For more details, see "Syntax of the search value".
- The sort options of the result.
When validating this window, the "full-text" item is automatically added into the query elements. This item corresponds to the relevance of the search result. A condition was automatically defined: only records whose relevance is greater than 0 will be displayed in the query result. Full-text search: Query in SQL code To create a query that performs a full-text search in SQL code, use the following syntax: MATCH (<Index>) AGAINST <Condition> where: - <Index> corresponds to the list of items found in the index separated by commas (the order of items is not important).
- <Condition> corresponds to the sought string. For more details, see "Syntax of the search value".
Example: In this example, SAI_Rechercher is an Edit control and IdUserConnecté is a variable.. QRY_Find is Data Source
MyQuery is string = [
SELECT * FROM Contacts
WHERE MATCH(Contacts.LastName, Contacts.FirstName, Contacts.HTMLComment,
Contacts.RoughTextComment, Contacts.Comments,
Contacts.Phone, Contacts.Office,
Contacts.Cell, Contacts.Email, Contacts.MSN, Contacts.Internet_site,
Contacts.Country, Contacts.FaxNum, Contacts.City)
AGAINST ('
]
MyQuery = MyQuery + EDT_Find + [
')
AND Contacts.UserID =
]
MyQuery = MyQuery + ConnectedUserID + [
ORDER BY LastName DESC
]
HExecuteSQLQuery(QRY_SRCH, hQueryDefault, MyQuery)
FOR EACH QRY_SRCH
TableAddLine(TABLE_Contact_by_category, ...
QRY_SRCH.ContactID, QRY_SRCH.CategoryID, ConnectedUserID, ...
QRY_SRCH.LastName, QRY_SRCH.FirstName)
END
CASE ERROR:
Error(HErrorInfo())
Note: Query with a parameter on a full-text index: how to ignore the parameter? To ignore the parameter, the "MATCH" statement of the query must not be found in the query result but in the WHERE statement. Indeed, if the pertinence must be included in the result, the parameter must be specified to evaluate the result. In order for a query created with the query editor to have the MATCH statement included in the WHERE statement, the pertinence must not be displayed in the result. Example with relevance: SELECT
MATCH(XX, YY, ZZ) AGAINST({ParamFullText}) AS PertinenceFullText
FROM
TABLE
WHERE
<Parameters>
AND
PertinenceFullText > 0
ORDER BY
PertinenceFullText DESC Example without relevance: SELECT *
FROM
TABLE
WHERE
<Parameters>
AND MATCH(XXX, YYY, ZZZ) AGAINST({ParamFullText}) > 0 Syntax of the search value The search value can contain the following elements: | | Element | Meaning | A single word | The specified word will be sought. The relevance will be increased if the text contains this word. Example: "WINDEV" searches for "WINDEV". | Two words separated by a space character | Searches for one of the words. Example: "WINDEV WEBDEV" searches for the texts containing either "WINDEV" or "WEBDEV". | A word preceded by the "+" sign | The specified word is mandatory. Example: "+WINDEV" searches for the texts that necessarily contain "WINDEV". | Word preceded by the "-" sign | The text must not contain the specified word. Example: "-Index" searches for the texts that do no contain "Index". | A word preceded by the "~" sign | If the text contains the specified word, the relevance will be reduced. | One or more words enclosed in quotes | The specified words are searched in group and in order. Warning: if the "Ignore words smaller than " option is set to anything other than 0, words between quotation marks smaller than the specified size will not be searched. | A word followed by the "*" sign | The type of the search performed is "Starts with" the specified word. |
Warning: Full-text search returns no results if the parameter passed is empty or null. Specific case of words separated by a dash If the text contains words separated by a dash ('-'), each section of the word is indexed independently in a full-text index. Example: - Text contains 'multi-platform': the two words "multi-" and "platform" will be indexed independently..
- During the search, the documents containing "multi" and/or "platform" will be found.
- To search for "multi-platform", you need to enclose the word in quotation marks: "multi-plateforme"..
Attention: the search will return all documents containing the words "multi" and "platform" consecutively.. The separator between the two words may not be a hyphen: it can be any separator (space, +, period, comma, carriage return, tab, etc.).
Analyzing the result of a full-text query The result of a full-text query gives, for each record found in the data file, the relevance of the record in relation to the search value. This relevance depends on several factors: - the number of times the sought word is found in the record.
- the number of words in the record and their number of repetitions.
- the ratio between the records that contain the sought words and the records that do not contain the sought words. Indeed, the more the sought word is found in all the records, the less the relevance will be important.
- ...
The result of a "full-text" query can then be processed like any other query result: for example, it can be displayed in a table, sorted by relevance, and so on. Managing the full-text indexes programmatically Several WLanguage functions are used to manage the full-text indexes:
| | HDescribeFullTextIndex | Describes a full-text index of data file created programmatically. | HListFullTextIndex | Returns the list of full-text indexes of a file (a query or a view) recognized by the HFSQL engine. | HListStopWord | Returns the list of stop words used by a full-text index. | HListSynonym | Returns the list of synonyms used by a full-text index. |
Remarks: - Creating a full-text index in a data file created through programming can be done via a FullTextIndex Description variable.
- HIndex is used to re-index full-text indexes.
This page is also available for…
|
|
|