python - K-mean clustering, make centroids not overlap nodes -


i have done k-mean++ clustering , obtained centroids of clusters, using python 2.7, following method given in http://datasciencelab.wordpress.com/2014/01/15/improved-seeding-for-clustering-with-k-means/

in problem, there further constraint distances between centroid node should larger constant. best way achieve?

it possible 1 centroid close several nodes.

any suggestions on how displace centroids bit?

many thanks.

for example, nodes clustered mynodes = [[469500, 5802610], [468764, 5803422], [467991, 5804202], [470260, 5799949], [469486, 5800730], [468713, 5801510], [467939, 5802291], [467166, 5803072], [467966, 5800204], [467193, 5800985], [466420, 5801766], [466457, 5799700], [465678, 5800488], [464950, 5799229], [470615, 5796600], [469842, 5797405], [470320, 5794955], [469547, 5795735], [468773, 5796516], [467990, 5797297], [470062, 5793215], [469289, 5793996], [468515, 5794776], [467742, 5795557], [466969, 5796338], [466195, 5797119], [469976, 5791334], [469202, 5792115], [468429, 5792896], [467656, 5793676], [466882, 5794457], [466109, 5795238], [465336, 5796050], [464600, 5796840], [470160, 5789250], [469354, 5789972], [468581, 5790753], [467808, 5791534], [467034, 5792315], [466261, 5793096], [465488, 5793877], [464714, 5794658], [463941, 5795499], [463150, 5796210], [469500, 5787920], [468698, 5788614], [467925, 5789395], [467152, 5790176]]

centroids = [[ 467839.6, 5793224.1], [ 467617.22222222, 5800489.94444444]] centroid[0] close node[29], centroid1 close node[8].enter image description here

enter image description here

if understand question correctly, , not @ sure whether do, solution apparent drawings:

you want point closest given centroid point; , has minimum distance set of node points.

  • draw circle around each node point, minimum distance radius
  • intersect each circle each other circle, note intersection points
  • discard intersection point closer minimum distance node point.
  • from remaining intersection points, take 1 closest centroid point. new displaced centroid.

runtime naive implementation should o(number_of_nodes^2), though can optimize using fast nearest-neighbour lookup data structure, such kd-tree, when intersect circles , discard intersection points close.

this should optimal solution; no matter algorithm use, cannot find point closer original centroid fits minimum distance constraint; , algorithm should find optimal point.

though since centroid surrounded node points, new point may quite distance away if node points densely packed.


Comments

Popular posts from this blog

android - Get AccessToken using signpost OAuth without opening a browser (Two legged Oauth) -

org.mockito.exceptions.misusing.InvalidUseOfMatchersException: mockito -

google shop client API returns 400 bad request error while adding an item -