algorithm - String matching for anagrams -
i have data, (x,y,text)
, representation of strings put out on screen @ given (x,y)
position. text
contains control characters color , font data, , other control characters (ignored now). parse out control characters color, font, size , other , remain alphanumeric stuff @ end.
this data generated application scene (using nice gui that's not concern now), there more 1 entries compose final scene. such as: (6, 1, "berlin")
, (10,2 , "hauptbahnhof")
give final scene, looking more or less (imagine graphics below information screen):
0123456789.... 0+----------------+ 1| berlin | 2| hauptbahnhof | 3+----------------+
now, application generates objects final scene in totally unpredictable (and uncontrollable, closed source app) order, final scene either first or second.
(6, 1, "berlin")
(10, 2, "hauptbahnhof")
. end ->berlinhauptbahnhof
(10, 2, "hauptbahnhof")
(6, 1, "berlin")
. end ->hauptbahnhofberlin
there can dozens of (x,y,text)
entries representing scene, of them having 1 character (such as: each letter printed different color, or characters placed alongside diagonal of screen, or in wave form).
and hardware device interpreting data it's not important either because text go screen @ correct place.
but ... supposed match full text scene final
order of: berlin hauptbahnhof
comes p(l)aintext me. need come algorithm returning checksum/hash, not depending order or characters gives me same result final
messed ordered characters of string.
as example:
"this fox"
should give same "a fox is"
(just words messed up) , "t h s s f o x"
(all characters different blocks, possible different colours) not same "this pex"
('e'
1 less 'f'
, 'p'
1 more 'o'
simple addition checksum fail case). merging/sorting them not option since in case "eat big fish" equal "bag thief is" (ie: anagrams give same result).
so other options have?
if have access (x, y, text)
data, can create screen
class emulates hardware screen. can check 2 screens equal comparing string representations against 1 another.
sample python implementation:
class screen: """emulates hardware screen""" def __init__(self, data=none): """initialize state sensible values""" self.data = {} self.width = 0 self.height = 0 if data != none: x, y, word in data: self.put_message(x, y, word) def put_message(self, x, y, message): """puts message on screen, adjusting height/width fit if necessary""" self.height = max(self.height, y+1) self.width = max(self.width, x + len(message)) idx, c in enumerate(message): self.data[(x+idx, y)] = c def __repr__(self): """returns string represents contents of screen""" ret = "" y in range(1, self.height): row = "\n" x in range(self.width): #fetch character @ position, or period if none exists row = row + self.data.get((x,y), ".") ret = ret + row return ret def compare(first, second): print first print second if str(first) == str(second): print "\nthese equal." else: print "\nthese not equal." data_a = [(6, 1, "berlin"), (2, 2, "hauptbahnhof")] = screen(data_a) data_b = [(2, 2, "hauptbahnhof"), (6, 1, "berlin")] b = screen(data_b) data_c = [(0, 1, "this is"), (1, 2, "a fox")] c = screen(data_c) compare(a, b) compare(a, c)
result:
......berlin.. ..hauptbahnhof ......berlin.. ..hauptbahnhof these equal. ......berlin.. ..hauptbahnhof .a fox. these not equal.
here, see a
, b
have same output, though inputs have different orders. , c
doesn't compare equal either of them.
Comments
Post a Comment