Saturday, July 23, 2011

Majority Vote

A Linear Time Majority Vote Algorithm was invented in 1980 by Bob Boyer and J Strother Moore (inventors of the popular Boyer-Moore string search algorithm). Not seeing this available in Factor, I thought to contribute one.
Note: this is also called the "Moore’s Voting Algorithm".
The algorithm simply looks at each element of the sequence:

Keep a candidate element and a counter (initially unknown and zero, respectively).

As we move across the sequence, look at each element:
  • If the counter is 0: the element is the candidate and the counter is 1.
  • If the counter is not 0: increment if the element is the candidate, decrement if not.
When we are done, the candidate is the majority element, if there is a majority.

Using this specification, we can implement the algorithm:
: majority ( seq -- elt/f )
    [ f 0 ] dip [
        over zero? [ 2nip 1 ] [ 
            pick = [ 1 + ] [ 1 - ] if
        ] if
    ] each zero? [ drop f ] when ;
A few simple tests show that this is working:
[ f ] [ { } majority ] unit-test
[ f ] [ { 1 2 } majority ] unit-test
[ 1 ] [ { 1 1 2 } majority ] unit-test
[ f ] [ { 1 1 2 2 } majority ] unit-test
[ 2 ] [ { 1 1 2 2 2 } majority ] unit-test
This is perhaps not quite idiomatic Factor, can you improve it?

6 comments:

Anonymous said...

Here's my solution, but not so sure it's idiomatic:

:: majority2 ( seq -- f/elt )
0 :> count! f :> candidate!
seq [ count 0 =
[ candidate! 1 count! ]
[ candidate =
[ count 1 + count! ] [ count 1 - count! ] if
] if ] each

count 0 = [ f ] [ candidate ] if ;

What's more disturbing to me is that the input { 1 2 3 1 } for both of our implementations of this algorithm output f, instead of 1.

I don't know why.

Anonymous said...

I looked at the page that you linked to, and the description matches your own description of the algorithm above too. However, I feel that there's something else missing.

I found that the input { 1 1 1 2 2 3 3 } also gives a wonky result. Is this algorithm not always 100% (similar perhaps to a bloom filter which has a % change to give a false positives)?

Anonymous said...

http://stackoverflow.com/questions/780937/linear-time-voting-algorithm-i-dont-get-it

mrjbq7, I didn't read the assumptions carefully enough! There needs to be a MAJORITY (duh!) element in the entire sequence of at greater than N/2.

I mistakenly thought that majority meant "highest frequency group", which is a different problem altogether.

Sorry for any confusion.

mrjbq7 said...

@otoburb: I think I just realized why. One of the requirements that I didn't notice is that the sequence has an element that exists more than N/2 times.

So, its not just the "most common element" (which I had originally thought), but actually the "majority element", meaning more than half the sequence consists of the element.

yac said...

Just for fun using tuple and reduce.

Shorter
[ [ 1 + ] ] [ [ 1 - ] ] if change-n
fails with Cannot apply “call” to a run-time computed value, sadly.

Jason said...

Here's my crack at it:

:: (majority) ( candidate count elt -- candidate count )
{
{ [ count 0 = ] [ elt 1 ] }
{ [ candidate elt = ] [ candidate count 1 + ] }
[ candidate count 1 - ]
} cond ;

: majority ( seq -- elt ) [ f 0 ] dip [ (majority) ] each drop ;