Postgres Full Text Search
Postgres has built-in functions to handle Full Text Search queries. This is like a "search engine" within Postgres.
to_tsvector()
Converts your data into searchable "tokens". to_tsvector()
stands for "to text search vector". For example:
select to_tsvector("the green eggs and ham")
-- Returns 'egg':2 'green':1 'ham':4
select to_tsvector("the green egg and ham")
-- Returns 'egg':2 'green':1 'ham':4
Collectively these tokens are called a document which Postgres can use for comparisons,
where every token is a lexeme (unit of lexical meaning). The stopwords (and, or, the, ...)
are conveniently omitted. As you can see, the eggs
will be normalized as a lexeme in English: egg
.
to_tsquery()
to_tsquery()
, which accepts a list of words that will be checked against
the normalized vector we created with to_tsvector()
.
The @@
operator to check if tsquery
matches tsvector
, it's returns true (t
) if matched,
otherwise returns false (f
).
Let's see some queries below:
select to_tsvector("the green eggs and ham") @@ to_tsquery("egg")
?column?
--------
t
select to_tsvector("the green eggs and ham") @@ to_tsquery("eggs")
?column?
--------
t
Use &
for AND in the search query:
select to_tsvector("the green eggs and ham") @@ to_tsquery("eggs & red")
?column?
--------
f
Use |
for OR in the search query:
select to_tsvector("the green eggs and ham") @@ to_tsquery("eggs | red")
?column?
--------
t
Use the proximity symbol <->
for searching for terms that are a certain "distance" apart.
For example, search the phase green egg
, green
is followed immediately by a match for egg
.
select to_tsvector("the green eggs and ham") @@ to_tsquery("green <-> egg")
?column?
--------
t
For example, search for the phase egg [1 word] ham
, find egg
and ham
within 2 words of each other:
select to_tsvector("the green eggs and ham") @@ to_tsquery("egg <1> ham")
?column?
--------
t
Use the negation symbol !
to find phrases which don't contain a search term.
For example, search for the phase that have egg
but not ham
:
select to_tsvector("the green eggs and ham") @@ to_tsquery("egg & !ham")
?column?
--------
f