algorithm - String matching for anagrams -


i have data, (x,y,text) , representation of strings put out on screen @ given (x,y) position. text contains control characters color , font data, , other control characters (ignored now). parse out control characters color, font, size , other , remain alphanumeric stuff @ end.

this data generated application scene (using nice gui that's not concern now), there more 1 entries compose final scene. such as: (6, 1, "berlin") , (10,2 , "hauptbahnhof") give final scene, looking more or less (imagine graphics below information screen):

 0123456789.... 0+----------------+ 1|     berlin     | 2|  hauptbahnhof  | 3+----------------+ 

now, application generates objects final scene in totally unpredictable (and uncontrollable, closed source app) order, final scene either first or second.

  • (6, 1, "berlin") (10, 2, "hauptbahnhof"). end -> berlinhauptbahnhof
  • (10, 2, "hauptbahnhof") (6, 1, "berlin"). end -> hauptbahnhofberlin

there can dozens of (x,y,text) entries representing scene, of them having 1 character (such as: each letter printed different color, or characters placed alongside diagonal of screen, or in wave form).

and hardware device interpreting data it's not important either because text go screen @ correct place.

but ... supposed match full text scene final order of: berlin hauptbahnhof comes p(l)aintext me. need come algorithm returning checksum/hash, not depending order or characters gives me same result final messed ordered characters of string.

as example:

"this fox" should give same "a fox is" (just words messed up) , "t h s s f o x" (all characters different blocks, possible different colours) not same "this pex" ('e' 1 less 'f' , 'p' 1 more 'o' simple addition checksum fail case). merging/sorting them not option since in case "eat big fish" equal "bag thief is" (ie: anagrams give same result).

so other options have?

if have access (x, y, text) data, can create screen class emulates hardware screen. can check 2 screens equal comparing string representations against 1 another.

sample python implementation:

class screen:     """emulates hardware screen"""     def __init__(self, data=none):         """initialize state sensible values"""         self.data = {}         self.width = 0         self.height = 0         if data != none:             x, y, word in data:                 self.put_message(x, y, word)     def put_message(self, x, y, message):         """puts message on screen, adjusting height/width fit if necessary"""         self.height = max(self.height, y+1)         self.width = max(self.width, x + len(message))         idx, c in enumerate(message):             self.data[(x+idx, y)] = c     def __repr__(self):         """returns string represents contents of screen"""         ret = ""         y in range(1, self.height):             row = "\n"             x in range(self.width):                 #fetch character @ position, or period if none exists                 row = row + self.data.get((x,y), ".")             ret = ret + row         return ret  def compare(first, second):     print first     print second     if str(first) == str(second):         print "\nthese equal."     else:         print "\nthese not equal."  data_a = [(6, 1, "berlin"), (2, 2, "hauptbahnhof")] = screen(data_a)  data_b = [(2, 2, "hauptbahnhof"), (6, 1, "berlin")] b = screen(data_b)  data_c = [(0, 1, "this is"), (1, 2, "a fox")] c = screen(data_c)  compare(a, b) compare(a, c) 

result:

......berlin.. ..hauptbahnhof  ......berlin.. ..hauptbahnhof  these equal.  ......berlin.. ..hauptbahnhof  .a fox.  these not equal. 

here, see a , b have same output, though inputs have different orders. , c doesn't compare equal either of them.


Comments

Popular posts from this blog

android - Get AccessToken using signpost OAuth without opening a browser (Two legged Oauth) -

org.mockito.exceptions.misusing.InvalidUseOfMatchersException: mockito -

google shop client API returns 400 bad request error while adding an item -