The LIFTI Query Syntax
Don’t want to use advanced queries? You’ll want to configure the simple query parser.
Quick examples
| Example | Meaning |
|---|---|
| West | West must appear in the text exactly. |
| West\|Wing^2 | West or Wing must appear in the text exactly, where matches on Wing will have a score boost of 2. |
| ?Wst | Words that fuzzy match with wst must appear in the text. |
| ?3,2?Wst | Words that fuzzy match with wst must appear in the text, with a specified max edit distance and max sequential edits. |
| title=West | A field restricted search. West must appear in the title field of an indexed object. |
| doc* | Words that start with doc must appear in the text. See wildcard matching. |
| %%ing | Words that start with any two letters and end with ing, e.g. doing. See wildcard matching. |
| west & wing | The words west and wing must appear in the text. |
| west wing | The words west and wing must appear in the text - the default operator is & if none is specified between search words. |
| west \| wing | The words west or wing must appear in the text. |
| west &! wing | west must appear in the text and wing must not. |
| west ~ wing | west and wing must appear near to each other (within 5 words - the default) in the text. |
| west ~3 wing | west and wing must appear near to each other (within 3 words) in the text. |
| west ~> wing | west must be followed by wing closely (within 5 words - the default) in the text. |
| west ~3> wing | west must be followed by wing closely (within 3 words) in the text. |
| west > wing | west must precede wing anywhere in the text. |
| "the west wing" | The words the west wing must appear in sequence in the indexed text. |
| "notr* dam*" | You can use wildcards and fuzzy matching in a sequential text query. In this case, a word starting with notr must be immediately followed by a word starting with dam, e.g. Notre Dame. |
| <<west | west must appear at the start of the text (first token). |
| east>> | east must appear at the end of the text (last token). |
| <<single>> | The text must contain exactly the single word single and nothing else. |
Search terms can be combined and placed in parentheses:
| Example | Meaning |
|---|---|
| "west wing" ~ "oval office" | The phrase west wing must appear near the phrase oval office. |
| (west \| east) & wing | Either west or east must appear in the document, together with wing. |
Query Operators
Exact word matches
Any text in a query will be tokenized using the provided tokenizer, enforcing the same word stemming and case/accent sensitivity rules as used in the index.
Fuzzy match (?)
By prefixing a search term with ?, a fuzzy matching algorithm will be used to match the search term against the index. You can optionally specify the maximum edit distance and maximum number of sequential edits for a specific search term using the format:
?{max edits},{max sequential edits}?term
For example ?2,1?food will search for “food” with a maximum number of edits of 2, and maximum sequential edits of 1.
You can omit one or the other parameter if required, so ?2?food will only set the maximum number of edits to 2, leaving the maximum sequential edits at the default value. If you want to only include
the maximum number of sequential edits, then you must include a leading comma in the parameter set, e.g. ?,2?food
See Fuzzy Matching for more details.
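To make the parameter format concrete, here is a minimal Python sketch that splits a fuzzy term into its parts. The `parse_fuzzy` helper and the `FuzzyTerm` type are hypothetical names used for illustration only; they are not part of LIFTI, and LIFTI's own parser may handle edge cases differently.

```python
import re
from typing import NamedTuple, Optional

# Hypothetical illustration only; this is not LIFTI's parser.
# Shape being parsed: ?{max edits},{max sequential edits}?term
_FUZZY = re.compile(r"^\?(?:(\d*)(?:,(\d*))?\?)?(.+)$")


class FuzzyTerm(NamedTuple):
    term: str
    max_edits: Optional[int]             # None means "use the default"
    max_sequential_edits: Optional[int]  # None means "use the default"


def parse_fuzzy(text: str) -> FuzzyTerm:
    """Split a fuzzy search term into the term and its optional limits."""
    match = _FUZZY.match(text)
    if match is None:
        raise ValueError(f"not a fuzzy term: {text!r}")
    edits, sequential, term = match.groups()
    return FuzzyTerm(
        term,
        int(edits) if edits else None,
        int(sequential) if sequential else None,
    )
```

With this sketch, `?2,1?food` yields both limits, `?2?food` yields only the maximum edits, and `?,2?food` yields only the maximum sequential edits.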
Defaulting search terms to fuzzy matching
By default LIFTI will treat a search term as an exact word match; however, you can configure the index so that any search term (apart from those containing wildcards) will be treated as a fuzzy match.
Wildcard matching
Any search term containing * or % will be considered a wildcard match, where:
- The * character matches zero or more characters.
- The % character matches any single character. You can use multiple % in a row to indicate an exact number of characters that need to be matched.
Examples:
- foo* would match occurrences of food and football
- *ing would match drifting and flying
- %%%ld would match could and mould (but not should, because it has 4 letters before the ld)
- %%p* matches words starting with any two characters followed by p, then any zero or more characters, e.g. map, caps, duped
Start and End anchors (<< and >>)
Applies to LIFTI v7 and later
Anchor operators allow you to constrain matches to the start or end of content.
- <<term matches only when term appears as the first token in the text (token index 0)
- term>> matches only when term appears as the last token in the text
- <<term>> matches only when term is both the first and last token (i.e., the text contains exactly that one term)
Examples:
- <<Skoda matches where any field starts with “Skoda”, e.g. “Skoda Octavia” but not “New Skoda” or “My Skoda”
- Description=excellent>> matches items where the Description field ends with “excellent”, e.g. “quality is excellent” but not “excellent quality”
- Status=<<active>> matches records where the Status field contains exactly “active” and nothing else
- Brand=<<Sk* matches Brand fields starting with any word beginning with “Sk”, e.g. “Skoda” or “Škoda”
- Name=<<?Smth>> matches Name fields containing exactly one word that fuzzy matches “Smth”, e.g. “Smith”
- Title=<<The & West matches documents where Title starts with “The” and also contains “West” anywhere
- "<<The West" matches documents where the text starts with the exact phrase “The West”
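The anchor rules amount to a constraint on the token index of a match. The Python sketch below models a field as a list of already-normalized tokens. The `matches_anchored` helper is a hypothetical name for illustration and does not reflect LIFTI's implementation.

```python
def matches_anchored(tokens, term, start=False, end=False):
    """Return True if term occurs in tokens at a position the anchors allow.

    start=True models <<term, end=True models term>>, both model <<term>>.
    """
    last = len(tokens) - 1
    for index, token in enumerate(tokens):
        if token != term:
            continue
        if start and index != 0:
            continue  # <<term requires token index 0
        if end and index != last:
            continue  # term>> requires the final token
        return True
    return False


print(matches_anchored(["skoda", "octavia"], "skoda", start=True))  # True
print(matches_anchored(["new", "skoda"], "skoda", start=True))      # False
```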
And (&)
The and operator (&) performs an intersection of two intermediate query results, combining word positions for successful matches.
Food & Burger searches for documents containing both "food" and "burger" at any position, and in any field.
(Alternatively, Food Burger will have the same effect, as the default operator between query parts is &.)
Or (|)
Performs a union of two intermediate query results. Where a document appears in both sets, word positions are combined into one list.
Restricts results to same field by default: false
And-Not (&!)
Applies to LIFTI v7 and later
The and-not operator (&!) performs a difference operation, returning documents that match the left operand but exclude those that match the right operand.
Food &! Burger searches for documents containing "food" but not "burger".
Examples:
- eiffel &! tower - Documents containing “eiffel” but not “tower”
- (paris | london) &! museum - Documents about Paris or London, excluding museums
- title=important &! content=spam - Documents with “important” in title but not “spam” in content
- france &! (tower | museum) - Documents about France excluding both towers and museums
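At the document level, the and, or and and-not operators behave like set intersection, union and difference. The Python sketch below uses a toy index (the document ids and the `docs` helper are made up for illustration) and deliberately ignores the word positions and scores that the real query results also carry.

```python
# Toy index: normalized word -> ids of the documents that contain it.
index = {
    "paris": {1, 2},
    "london": {3},
    "museum": {2, 3, 4},
}


def docs(word):
    """Return the set of document ids containing the word."""
    return index.get(word, set())


and_result = docs("paris") & docs("museum")                        # paris & museum
or_result = docs("paris") | docs("london")                         # paris | london
and_not_result = (docs("paris") | docs("london")) - docs("museum") # (paris | london) &! museum

print(and_result)      # {2}
print(or_result)       # {1, 2, 3}
print(and_not_result)  # {1}
```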
Bracketing expressions
Brackets can be used to group expressions together.
e.g. (food & cake) | (cheese & biscuit)
Field restrictions (field=...)
These allow for restricting searches within a given field.
title=analysis | body=(chocolate & cake) searches for documents with "analysis" in the title field or both "chocolate" and "cake" in the body field.
title=analysis food searches for documents with "analysis" in the title field and "food" in any field.
If your field name contains spaces or other special characters, you can escape it using square brackets [ and ], e.g. [my field]=chocolate.
Sequential text ("...")
Placing quotes around a search phrase will enforce that the words all appear immediately next to each other in the source text.
"cheese burger" will only match documents that have text containing "cheese" followed immediately by "burger".
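As a rough illustration of the sequential rule, the Python sketch below checks whether a phrase occurs as consecutive tokens. It assumes the text has already been tokenized and normalized, and `contains_phrase` is a hypothetical helper rather than anything LIFTI exposes.

```python
def contains_phrase(tokens, phrase):
    """Return True if the phrase tokens occur consecutively, in order."""
    size = len(phrase)
    return any(
        tokens[start:start + size] == phrase
        for start in range(len(tokens) - size + 1)
    )


print(contains_phrase(["a", "cheese", "burger", "please"], ["cheese", "burger"]))  # True
print(contains_phrase(["cheese", "and", "burger"], ["cheese", "burger"]))          # False
```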
Near (~ and ~n)
The near operator performs a positional intersection of two results based on the position of the word in a field.
The ~ operator requires that words must be within 5 words of one another. This value can be controlled by specifying a number, e.g. ~4 to restrict to only returning results within 4 words of one another.
cheese ~ cake will return documents containing the words "cheese" and "cake" in either order, up to 5 words apart, e.g. "the cake was made with cheese" and "I like cheese and cake" would both match, but "cake is never to be considered a substitute for cheese" would not.
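The positional check can be sketched in Python as follows. This assumes "within n words" means the token positions differ by at most n, which is an interpretation made for this sketch. The `near` helper is hypothetical and works on a single, already-tokenized field.

```python
def near(tokens, left, right, max_distance=5):
    """Return True if left and right occur within max_distance positions, in either order."""
    left_positions = [i for i, token in enumerate(tokens) if token == left]
    right_positions = [i for i, token in enumerate(tokens) if token == right]
    return any(
        abs(l - r) <= max_distance
        for l in left_positions
        for r in right_positions
    )


print(near("the cake was made with cheese".split(), "cheese", "cake"))  # True
print(near("cake is never to be considered a substitute for cheese".split(),
           "cheese", "cake"))                                           # False
```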
Near following (~> and ~n>)
Same as Near (~) except that order is important in the positional intersection.
cheese ~> cake will match "cheese and cake" but not "cake and cheese".
Following (>)
Same as Near Following (~>) except there are no constraints on how far apart the words can be.
cheese > cake will match any text where "cheese" precedes "cake" in a given field.
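The ordered variants differ only in whether a distance limit applies. A minimal Python sketch, assuming a single tokenized field and treating "within n words" as a position difference of at most n (an assumption of this sketch), might look like this. The `follows` helper is a hypothetical name.

```python
def follows(tokens, left, right, max_distance=None):
    """Return True if right occurs after left.

    max_distance=5 models ~>, max_distance=n models ~n>,
    and max_distance=None models > (no distance limit).
    """
    for l, token in enumerate(tokens):
        if token != left:
            continue
        for r in range(l + 1, len(tokens)):
            if tokens[r] == right and (max_distance is None or r - l <= max_distance):
                return True
    return False


print(follows("cheese and cake".split(), "cheese", "cake", max_distance=5))  # True
print(follows("cake and cheese".split(), "cheese", "cake", max_distance=5))  # False
```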
Score boosting
Wildcard, fuzzy match and exact match search terms can have their resulting scores boosted by adding ^n after them. For example, wild^2 will boost matches of “wild” by 2x.
Escaping search text
Use a backslash \ when you want to explicitly search for a character that clashes with the query syntax. For example, A\=B will search for a single token containing
exactly “A=B”, rather than attempting to perform a field restricted search.
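When building queries from user input, it can help to escape such characters programmatically. The Python sketch below is a hypothetical helper, not a LIFTI API. The set of characters it escapes is an assumption gathered from the operators described on this page, so verify it against the parser you have configured.

```python
# Assumption: characters that appear as operators in this document.
# This is not an authoritative list taken from LIFTI itself.
SPECIAL_CHARACTERS = set('&|!~><=?*%^()"[]\\')


def escape_query_text(text: str) -> str:
    """Prefix each assumed-special character with a backslash."""
    return "".join(
        "\\" + ch if ch in SPECIAL_CHARACTERS else ch
        for ch in text
    )


print(escape_query_text("A=B"))  # A\=B
```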