algorithm - String matching for anagrams -
i have data, (x,y,text) , representation of strings put out on screen @ given (x,y) position. text contains control characters color , font data, , other control characters (ignored now). parse out control characters color, font, size , other , remain alphanumeric stuff @ end.
this data generated application scene (using nice gui that's not concern now), there more 1 entries compose final scene. such as: (6, 1, "berlin") , (10,2 , "hauptbahnhof") give final scene, looking more or less (imagine graphics below information screen):
0123456789.... 0+----------------+ 1| berlin | 2| hauptbahnhof | 3+----------------+ now, application generates objects final scene in totally unpredictable (and uncontrollable, closed source app) order, final scene either first or second.
(6, 1, "berlin")(10, 2, "hauptbahnhof"). end ->berlinhauptbahnhof(10, 2, "hauptbahnhof")(6, 1, "berlin"). end ->hauptbahnhofberlin
there can dozens of (x,y,text) entries representing scene, of them having 1 character (such as: each letter printed different color, or characters placed alongside diagonal of screen, or in wave form).
and hardware device interpreting data it's not important either because text go screen @ correct place.
but ... supposed match full text scene final order of: berlin hauptbahnhof comes p(l)aintext me. need come algorithm returning checksum/hash, not depending order or characters gives me same result final messed ordered characters of string.
as example:
"this fox" should give same "a fox is" (just words messed up) , "t h s s f o x" (all characters different blocks, possible different colours) not same "this pex" ('e' 1 less 'f' , 'p' 1 more 'o' simple addition checksum fail case). merging/sorting them not option since in case "eat big fish" equal "bag thief is" (ie: anagrams give same result).
so other options have?
if have access (x, y, text) data, can create screen class emulates hardware screen. can check 2 screens equal comparing string representations against 1 another.
sample python implementation:
class screen: """emulates hardware screen""" def __init__(self, data=none): """initialize state sensible values""" self.data = {} self.width = 0 self.height = 0 if data != none: x, y, word in data: self.put_message(x, y, word) def put_message(self, x, y, message): """puts message on screen, adjusting height/width fit if necessary""" self.height = max(self.height, y+1) self.width = max(self.width, x + len(message)) idx, c in enumerate(message): self.data[(x+idx, y)] = c def __repr__(self): """returns string represents contents of screen""" ret = "" y in range(1, self.height): row = "\n" x in range(self.width): #fetch character @ position, or period if none exists row = row + self.data.get((x,y), ".") ret = ret + row return ret def compare(first, second): print first print second if str(first) == str(second): print "\nthese equal." else: print "\nthese not equal." data_a = [(6, 1, "berlin"), (2, 2, "hauptbahnhof")] = screen(data_a) data_b = [(2, 2, "hauptbahnhof"), (6, 1, "berlin")] b = screen(data_b) data_c = [(0, 1, "this is"), (1, 2, "a fox")] c = screen(data_c) compare(a, b) compare(a, c) result:
......berlin.. ..hauptbahnhof ......berlin.. ..hauptbahnhof these equal. ......berlin.. ..hauptbahnhof .a fox. these not equal. here, see a , b have same output, though inputs have different orders. , c doesn't compare equal either of them.
Comments
Post a Comment