Get list of matching lexemes from PostgreSQL full text search? -


the full text search ranking documentation suggests that

you can write own ranking functions and/or combine results additional factors fit specific needs.

but haven't been able find examples of how custom ranking functions can built.

specifically, haven't been able figure out how extract list of lexemes in tsvector match given tsquery… this:

> select ts_matching_lexemes('cat in hat'::tsvector, 'cat'::tsquery); ts_matching_lexems ------------------ 'cat':1 

so, how can figure out lexemes in tsvector match given tsquery?

it looks ts_headline function internally, it's deep in c source , outputs string. can, however, use prepare input string parsing result (this relatively slow compared c functions):

code:

create or replace function ts_matching_lexemes(tsv tsvector, tsq tsquery) returns tsvector $$             proc (         select             ts_headline(tsv::text, tsq, 'startsel = <;>, stopsel = <;>') tsh     )     , parts (         select unnest(regexp_split_to_array(tsh, '<;>')) p proc     )     , parts_enum (         select p, lead(p, 1) on (), row_number() on () parts     )     select (string_agg(p || substring(split_part(lead, ' ', 1) 2), ' '))::tsvector     parts_enum     row_number % 2 = 0  $$ language sql; 

e.g.:

select ts_matching_lexemes(to_tsvector('cat in hat'), to_tsquery('cat')) union select ts_matching_lexemes(to_tsvector('cats , bikes in hat'), to_tsquery('cat & bike'))   ts_matching_lexemes tsvector ------------------- 'cat':1 'bike':3 'cat':1 

notes:

  1. passing text representation of tsvector ts_headline reduce redundant work
  2. it's approximately 10x slower ts_headline(text, to_tsquery(...)) , can sped removing ctes
  3. of course, faster solution add functionality directly in c source. should fast tsvector @@ tsquery

Comments

Popular posts from this blog

android - Get AccessToken using signpost OAuth without opening a browser (Two legged Oauth) -

org.mockito.exceptions.misusing.InvalidUseOfMatchersException: mockito -

google shop client API returns 400 bad request error while adding an item -