for now
and start with the commonest word
(probably 'the')
and stretch a separate springy elastic
from 'the' to each other word
with the preferred length
determined by the observed average distance
between occurrences of these words
compared to the expected average distance
based on simple frequencies alone
so that pairs that tend to occur
closer together
will have shorter elastics
and pairs that occur farther apart
longer elastics
and we link every possible wordpair
by this metric
with the least likely pairs
linked by half-million-mile elastics
so the network as a whole
fills the Moon's orbit
(not just the flat disk
but an orbit-sized sphere)
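(a minimal Python sketch of the elastic lengths
under loudly stated assumptions:
'distance' is counted in tokens
from each occurrence of one word
to the nearest occurrence of the other
and 'expected' pretends the other word
is scattered at random at its overall frequency
the names positions, observed_gap, expected_gap, elastic_length
are made up for illustration
and nothing here is tuned or tested on real text)

```python
from bisect import bisect_left
from collections import defaultdict


def positions(tokens):
    """Map each word to the sorted list of token indices where it occurs."""
    pos = defaultdict(list)
    for i, tok in enumerate(tokens):
        pos[tok].append(i)
    return pos


def observed_gap(pos_a, pos_b):
    """Mean distance from each occurrence of a to the nearest occurrence of b."""
    total = 0
    for p in pos_a:
        j = bisect_left(pos_b, p)
        candidates = []
        if j < len(pos_b):
            candidates.append(pos_b[j] - p)      # nearest b at or after p
        if j > 0:
            candidates.append(p - pos_b[j - 1])  # nearest b before p
        total += min(candidates)
    return total / len(pos_a)


def expected_gap(n_tokens, count_b):
    """Assumption: b is scattered at random with density count_b / n_tokens,
    so the nearest b on either side is about 1 / (2 * density) tokens away."""
    return n_tokens / (2 * count_b)


def elastic_length(pos, n_tokens, a, b):
    """Observed over expected: below 1 means a and b sit closer than
    frequency alone predicts, so the elastic is shorter.
    Note this is measured from a's side and is not symmetric in a and b."""
    return observed_gap(pos[a], pos[b]) / expected_gap(n_tokens, len(pos[b]))
```

(multiply the ratio by whatever unit length you like
to get miles of elastic)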
now
we can hope
the Yahoo clusters
will be well-separated in this space
and we can guess
the commonest words
will be pulled towards the center
and if we now
flatten the sphere
we ought
should
may
be able to
restore each word
to its frequency orbit
(most frequent closest
least frequent farthest)
without disrupting
the topical clustering...?
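
(the mechanical part of that last step
can be sketched in Python
assuming the layout already exists
as one (x, y, z) point per word
flattening here just means dropping z
and the 'frequency orbit' is taken to be
a circle whose radius is the word's frequency rank
restore_orbits is a made-up name
and whether the clusters survive this
is exactly the open question
the code does not answer it)

```python
import math


def restore_orbits(points3d, counts):
    """Flatten each word onto the x-y plane, keep its direction from the
    center, and move it to a radius equal to its frequency rank
    (1 = most frequent, so closest to the center)."""
    # Assumption: ties in count are broken alphabetically.
    ranked = sorted(counts, key=lambda w: (-counts[w], w))
    flat = {}
    for rank, word in enumerate(ranked, start=1):
        x, y, _z = points3d[word]      # flatten: ignore z
        angle = math.atan2(y, x)       # direction is all we keep
        flat[word] = (rank * math.cos(angle), rank * math.sin(angle))
    return flat
```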