Multi-stage Hay voting.

Introduction:

Multi-stage Hay voting is my own idea (Me = Tom Breton (Tehom)).  It
uses Peter de Blanc and Marcello Herreshoff's Hay voting
(http://www.spaceandgames.com/?cat=4) as a building block.

Hay voting has the unusual property that it is a truth-revealing
protocol: voters' best strategy is always to vote sincerely.  (Arrow's
impossibility theorem notwithstanding, because Hay voting incorporates
a random element).  Multistage Hay voting does not entirely preserve
this property, but is able to mostly defang strategic voting.

Hay voting also relies on starting with a "clean" slate of candidates.
It can easily be thrown out of whack by clones, by candidates that
almost everyone strongly dislikes, etc.

How Multi-Stage Hay voting works:

Start with any slate of candidates, as "dirty" as you please.  There's
no special procedure for selecting them.  

Now instead of using Hay to directly choose a candidate, use it to
choose N candidates for the next round.  That is, iterate Hay voting N
times.  N should be many times larger than the original slate of
candidates.

Important point: Clones are OK.  If a candidate A is chosen twice in
round 1, A has two slots in the slate for round 2.

Repeat that procedure with the new slate of candidates.  Do so M
times, each time using the slate of candidates generated in the most
recent round.

With near-certainty, the genuine favorites will come to dominate the
slate.  If greater certainty is wanted, use larger M and N.

Termination could be handled several ways: Repeat until only one
candidate survives.  Repeat until one candidate dominates by a given
ratio.  Repeat a fixed number of times and choose the candidate with
the most clones in the final slate, breaking ties any way you like.
(I suggest breaking ties by a traditional Hay voting, just for
consistency's sake)

I suggest that a flattening rule be used in rounds other than the
last, in order to make strategic voting pointless.


Issue: Could an unfair situation in the original slate result in an
unfair result?

Asking the same thing in a more precise way, how likely is it that a
clone situation in the first slate persists thru all the rounds?

Say, worst case, the first slate consists of 99 indistinguishable
candidates A and one alternative B, and that voters modestly favor B
over A by 60/40, that is, 60% of voters are 100% on B's side, 40% are
100% on A's side.  Also let N = 1000.

Maximizing their voting power, the pro-A voters can give B a 0%
chance, treating the A clones as indistinguishable.  The pro-B voters
can each transfer 1/99 units of voting mass from each A candidate to
B, doubling B's voting mass to 2 sqrt(99).  Since total voting mass is
constant, this doubles B's chances to 2/99, 0.0202.  

With N = 1000, a pro-A voter is chosen about 400 times and a pro-B
voter is chosen about 600 times, (The chance that the number of pro-B
voters chosen differs from 600 by more than a few percent is
negligible)

The pro-A's give all their votes to A, and the pro-B's, hamstrung by
the unfavorable slate, give an average of 0.0202 slots to B; in total
B gets 12.12 slots on average.  B has increased his share of the slate
and is likely to do so even faster in subsequent rounds.

Could B be eliminated in the first round?  The chance that B has been
eliminated is small.  For him to be entirely eliminated, every single
voter would have have had to have the A result, a 0.9879 chance each,
so given 600 pro-B voters it's a 5.054e-06 chance, which is
negligible.  The chance that candidate A receives only one slot is
still only 6.707e-05, and there is only about a 1.8% chance that A
receives fewer than 5 slots.  The expected distribution is actually
fairly narrow; its quartiles are at about 11 and 14.

See the appendix for Elisp code to calculate and print the
probabilities.

Could B be eliminated in later rounds?  In subsequent rounds, B is
very likely to have more cloneshare than he did in the first round.
There is only a 23% chance that his cloneshare has decreased at all.

Additionally, because we chose N larger than the original slate, even
if B's cloneshare has actually been reduced somewhat, his situation is
not even as dangerous as in the first round.  Pro-A's are no longer
able to entirely exclude B and pro-B's have more leverage because they
have a number of B's to transfer voting mass to - when B's cloneshere
is small, pro-B's power increases by the square root of the number of
B clones.

The dynamic is much like a biased random walk on a N-1 hyperplane-like
curve embedded in an N dimensional space.  The curve has asymptotes at
the diagonal axes for which just 1 dimension is positive.  For N=2,
it's a diagonal line of slope -1 passing thru the origin.

The walk's position vector is approximately the log-odds of each
candidate's cloneshare, except that there is quantization near the
endpoints.  That implies that the drift goes as exp(sqrt(M)) while the
non-random movement goes as exp(M).  Given large enough M and N, it
looks like we'll almost certainly get the right winner.

It may make sense to re-vote part-way thru the process, because with a
large slate of candidates, it is difficult for voters to form a
precise opinion on each, but later when there are fewer candidates
voters can pay more attention to individual candidates.


Issue: Can strategic voting still happen?  Yes, but in a lesser way.
It's more effective to flatten your lesser preferences than to reverse
them.

Consider a situation where other voters preferences are:

	A > B > C

and your preferences are:

	B > A > C

and your vote is among those chosen on the next-to-last round.  You'd
do better if A is not in the running, so that it would be a contest
between B and C, which B would likely win.

Does it make sense to insincerely vote as if your preferences are:

	B > C > A

Answer: It sometimes helps you a little, but you are wasting your
allowance moving voting mass from A to C.  You do better by spending
all your allowance helping B.  That is, you flatten the difference
between A and C, as if your preferences were:

	B > {C,A}

I suspect that in general, for an election with N slots you want to
keep your top N preferences unchanged and flatten those of the rest
that agree with the majorities preference.  In effect, when the
majority provides part of the outcome pdf that you want, you spend
your allowance on the part they don't provide.

Flattening votes too much, however, would run the risk of "wasting"
votes on unelectable candidates.

A flattening rule could be applied automatically on rounds other than
the last, using voters' preferences as input, thus still giving voters
a reason to report their preferences linearly.

The resulting action in rounds other than the last superficially
resembles STV.  A voter virtually bullets-votes for their favorite and
then when the favorite is eliminated, bullet-votes for their next
favorite, until the election is resolved.  It differs in that the
intervening slates are larger and that the bullet-vote is only a
tendency.



Ballot construction:

And just a few words on an unrelated Hay voting issue: 

The original writeup (http://www.spaceandgames.com/?cat=4) of Hay
voting by Peter de Blanc seemed to suggest that voters would do their
own calculations, allocating points from a total 1.00.  

ISTM ballot construction can be much simpler: Let the user simply
assign numerical utiles to the candidates and let all the calculations
be done off-ballot by the voting software.

Appendixes:

Elisp code to calculate and print the probabilities (Revised)

;;Since we sample homogeneously from a mixture of pro-A and pro-B, we
;;can just add the probabilities.
(defconst hay-example-chance-of-B 
   (* (/ 2.0 99.0) 0.6)
   "The chance of selecting B in any sample" )

(defun hay-example-600-1000 (n)
   "Calculate the probabilities related to the Multistage Hay example.
1000 voters, 

N should be a non-negative integer and not too big."

   (*
      (/ 
	 (let
	    ((v (float 1)))
	    (dotimes (i n v) (setq v (* v (- 1000 i)))))
	    
	 (let
	    ((v (float 1)))
	    (dotimes (i n v) (setq v (* v (1+ i))))))
	 
      (exp
	 (+
	    (* (log hay-example-chance-of-B) 
	       (float n))
	    (* (log (- 1.0 hay-example-chance-of-B)) 
	       (float (- 1000 n)))))))

(let
   ((cumul 0))
   (dotimes (i 30)
      (princ i (current-buffer))
      (let
	 ((prob (hay-example-600-1000 i)))
	 (setq cumul (+ cumul prob))
	 (print
	    prob
	    (current-buffer))
	 (princ "Cumulatively," (current-buffer))
	 (print
	    cumul
	    (current-buffer)))
	    
      (princ "\n" (current-buffer))))



Analysis of the effect of moving one unit of cloneshare from one
candidate to another.

We'll analyze it as if there were 3 candidates, X, Y, Z.  Z is a third
candidate that is only indirectly affected.  Z can stand homogeneously
for all the remaining candidates, except as far as Z's internal
heterogeneity affects the scaling factor, which is not much.

Some conventions:

Amount of cloneshare that candidate X would have in the sincere slate:

	#X

This notation #X will only refer to the cloneshares of the sincere
slate; cloneshare in the strategic slate will be indicated
algebraically in terms of #X.

The utility function U will be indexed by [crowd] or [you] where it's
ambiguous.

What happens:

Replacing one X clone with a Y clone has the following effects:

Delete the transfer between the removed X and all the Z's 
Delete the transfer between the removed X and all the Y's 
Create a transfer between the new Y and all the X's 
Create a transfer between the new Y and all the Z's 

And renormalize.



Magnitude of transfers:

All transfers between to an X from all the Y's follow the formula:

	U(X-Y)[crowd] #Y

Probability:  

The probability that candidate i is chosen by a member of the crowd on
a single Hay vote is:

	P(i) = 
		(sqrt(n - 1) + votingmasstransfer(i)) 
		/ n sqrt(n - 1)

	      = (sqrt(n - 1) + c total_utility_delta(i)) 
		/ n sqrt(n - 1)

We'll define P'(i) proportional to P(i) to eliminate the denominator
and to group c with the constant term:

	P'(i) = 
	      = P(i) n sqrt(n - 1 / c
	      = (sqrt(n - 1)/c + total_utility_delta(i))

We will index this on [new] and [old].

We'll also define:

	inv_c_delta =:  sqrt(n - 1)(1/c[new] - 1/c[old])

This quantity is likely to be negligible, but we'll keep track of it
just in case.

The individual probabilities:

X's probability is affected just by:

	Create a transfer between the new Y and all the X's 
	There is one less X to be chosen.

So 
	total_utility_delta(X)[new] = 
		total_utility_delta(X)[old] + U(X-Y)

X's new probability is:

	P'(X)[new]
	= (#X - 1) (sqrt(n - 1)/c[new] + total_utility_delta(X)[new])
	= P'(X)[old] 
		- #X (sqrt(n - 1)/c[old] + total_utility_delta(X)[old])
		+ (#X - 1) (sqrt(n - 1)/c[new] + total_utility_delta(X)[new])
	= P'(X)[old] 
		+ #X inv_c_delta 
		-  sqrt(n - 1)/c[new] 
		+ #X U(X-Y) 
		- total_utility_delta(X)[new] 



Y's probability is affected just by:

	Delete the transfer between the removed X and all the Y's 
	Create a transfer between the new Y and all the X's 
	Create a transfer between the new Y and all the Z's 
	There is one more Y to be chosen.

The new transfers just make the new Y clone homogeneous with the other
Y clones.  So we need only to account for the first one:

	total_utility_delta(Y)[new] 
		= total_utility_delta(Y)[old] + U(X-Y)

Y's new probability is:

	P'(Y)[new]
	= (#Y + 1) (sqrt(n - 1)/c[new] + total_utility_delta(Y)[new])
	= P'(Y)[old] 
		+ #Y inv_c_delta 
		+ sqrt(n - 1)/c[new] 
		+ #Y U(X-Y)
		+ total_utility_delta(Y)[new]


Z's probability is affected just by:

Delete the transfer between the removed X and all the Z's 
Create a transfer between the new Y and all the Z's 

So
	total_utility_delta(Z)[new] 
		= total_utility_delta(Z)[old] + U(X-Z) + U(Z-Y)
		= total_utility_delta(Z)[old] + U(X-Y)


Z's new probability is:

	P'(Z)[new]
	= #Z (sqrt(n - 1)/c[new] + total_utility_delta(Z)[new])
	= P'(Z)[old] + #Z inv_c_delta + #Z U(X-Y)



The total utility deltas of X and Y differ exactly where i or j is A
or Y.  When i and j are the same, the delta is zero, so we need not
handle that case specially.  So:

	total_utility_delta(X) =
		total_utility_delta(Y) +
		sum(U(X) - U(i)) -
		sum(U(Y) - U(i))

	= 	total_utility_delta(Y) +
		sum(U(X-Y) for all i)

	= 	total_utility_delta(Y) + N U(X-Y)


We saw earlier that strategic voting can help Z, but what about your
other preferences?  On the whole, does strategic voting help you?

To figure out how much the substitution helps you, we can't simply use
your utility function U, because this stage may not be the last stage,
it feeds into another stage, and the probabilities there are not
linear because there are multiple iterations.  So we define U'(X),
your preference for candidate X getting slots on the next slate, that
varies monotically with U(X).  U'(X-Y) is also defined similarly
between pairs of candidates.


So your total utility delta if you succeed in strategically replacing
X by Y, conveniently scaled by n sqrt(n - 1) / c, is:

	sum((P(W)[new]-P(W)[old])U'(W)[you], member(W,{X,Y,Z}))

	= (P'(X)[new] - P'(X)[old]) U'(X)[you] 
		+ (P'(Z)[new] - P'(Z)[old]) U'(Z)[you] 
		+ (P'(Y)[new] - P'(Y)[old]) U'(Y)[you] 

	= (#X inv_c_delta -  sqrt(n - 1)/c[new] + #X U(X-Y)[crowd] 
			- total_utility_delta(X)[new])
			U'(X)[you] 
		+ (#Z inv_c_delta + #Z U(X-Y)[crowd]) 
			U'(Z)[you]
		+ (#Y inv_c_delta +  sqrt(n - 1)/c[new] + #Y U(X-Y)[crowd]
			+ total_utility_delta(Y)[new])
			U'(Y)[you] 

	= (inv_c_delta + U(X-Y)[crowd])
		(#X U'(X)[you] + #Z U'(Z)[you] + #Y U'(Y)[you])
		-  sqrt(n - 1)/c[new]U'(X-Y)[you]
		- total_utility_delta(X)[new][crowd] U'(X)[you]
		+ total_utility_delta(Y)[new][crowd] U'(Y)[you]

	= (inv_c_delta + U(X-Y)[crowd])
			(#X U'(X)[you] + #Z U'(Z)[you] + #Y U'(Y)[you])
		- ((sqrt(n - 1)/c[new]) + total_utility_delta(Y)[new][crowd])
			U'(X-Y)[you]
		- N U(X-Y)[crowd] U'(X)[you]


The first term means that the substitution helps you to the degree
that the sincere slate would align with your preferences and roughly
in proportion to the crowd's positive preference for X over Y.
inv_c_delta is almost certainly negligible.  Basically, by replacing a
clone of X with less-favored Y, you have lifted all the other
candidates a little, even the other X's.

The second term says that the substitution hurts you to more than the
degree that the crowd likes the new candidate X more than average, and
in proportion to your own positive preference for X over Y.

Both factors are likely to be negative.  Given the strategic-voting
assumption that the crowd's preference X > Z > Y, it is likely that
total_utility_delta(Y)[crowd] is negative.  This is offset positively
by (sqrt(n - 1)/c).

And recall that your preferences U'(X) are for slots in a slate, not
for outcomes.  Given that strategic voting has some effect, your
preference U'(X) will be relatively less than your preference U(X),
and U'(Y) relatively more than U(Y), and your U'(X-Y) is probably
negative, particularly as you take later stages into account.

The third term means that the substitution helps you in proportion to
the number of slots, the crowd's preference X > Y, and your negative
utility for X to be in the next slate.

So on the whole, it's iffy to move voting mass to a candidate you like
less, but reasonable to move extra voting mass to your favorite
candidate.

Your own possible participation in the next round does not materially
change this situation, because of our assumption that the crowd
outnumbers you.  You run a small extra risk of accidentally electing
your least-favorite candidate.