Get list of matching lexemes from PostgreSQL full text search? -

July 15, 2015

the full text search ranking documentation suggests that

you can write own ranking functions and/or combine results additional factors fit specific needs.

but haven't been able find examples of how custom ranking functions can built.

specifically, haven't been able figure out how extract list of lexemes in tsvector match given tsquery… this:

> select ts_matching_lexemes('cat in hat'::tsvector, 'cat'::tsquery); ts_matching_lexems ------------------ 'cat':1

so, how can figure out lexemes in tsvector match given tsquery?

it looks ts_headline function internally, it's deep in c source , outputs string. can, however, use prepare input string parsing result (this relatively slow compared c functions):

code:

create or replace function ts_matching_lexemes(tsv tsvector, tsq tsquery) returns tsvector $$             proc (         select             ts_headline(tsv::text, tsq, 'startsel = <;>, stopsel = <;>') tsh     )     , parts (         select unnest(regexp_split_to_array(tsh, '<;>')) p proc     )     , parts_enum (         select p, lead(p, 1) on (), row_number() on () parts     )     select (string_agg(p || substring(split_part(lead, ' ', 1) 2), ' '))::tsvector     parts_enum     row_number % 2 = 0  $$ language sql;

e.g.:

select ts_matching_lexemes(to_tsvector('cat in hat'), to_tsquery('cat')) union select ts_matching_lexemes(to_tsvector('cats , bikes in hat'), to_tsquery('cat & bike'))   ts_matching_lexemes tsvector ------------------- 'cat':1 'bike':3 'cat':1

notes:

passing text representation of tsvector ts_headline reduce redundant work
it's approximately 10x slower ts_headline(text, to_tsquery(...)) , can sped removing ctes
of course, faster solution add functionality directly in c source. should fast tsvector @@ tsquery

Search This Blog

Silver

Get list of matching lexemes from PostgreSQL full text search? -

Comments

Post a Comment

Popular posts from this blog

user interface - How to replace the Python logo in a Tkinter-based Python GUI app? -

android - Get AccessToken using signpost OAuth without opening a browser (Two legged Oauth) -

org.mockito.exceptions.misusing.InvalidUseOfMatchersException: mockito -