  #11  
Old Dec 16, 2009, 09:08 AM
quintin3265 quintin3265 is offline
Senior Member
 
Join Date: Mar 2009
Location: State College, PA
Posts: 193

Quote:
Originally Posted by Gigablah View Post
There is no foolproof way to prevent vote rigging, even if you require voters to be registered with authenticated email addresses. There are more subtle ways to catch cheaters but anyone with general savvy in the workings of web applications can easily circumvent them.

The most you can do with the logged information is to analyze the voting and traffic patterns and filter out suspicious votes. Correlating IP addresses may help, but only to some extent. There is also no way to conclusively prove that voters who share the same IP address (or subnet) are the same person, or that voters with different IP addresses are different people, unless you obtain evidence from their ISPs (good luck with that).

On the other hand, Quentin could just as easily withhold all voting information, and everyone would be none the wiser.
A further explanation about voting, for Andreas: all of the votes were kept, even though some may ultimately be disqualified. It doesn't make sense to throw away the records of disqualified votes. There haven't been any security breaches as far as I can tell, and the votes were not corrupted. So far, I don't see anything that needs to be changed to prevent security violations, because there weren't any (in this initial analysis, at least). This proposal wasn't made because there was a problem with the votes; it was made to be as open as possible, because we have nothing to hide. It should not be taken to mean that we won't also analyze the votes.

I disagree, however, that publicly posting IP addresses is a huge security risk. There are thousands of addresses in these votes. Anyone who wants to compromise a computer will simply port-scan millions of addresses to find zombies with security vulnerabilities. It's also impossible to link addresses to people (as the next paragraph states), and your IP address is recorded in the server logs of every single site you have ever visited. It's even posted on the reviews pages on remixSite if you post an anonymous comment. An IP address is not personally identifiable information in the way a credit card number is, and security experts who advise people to hide IP addresses should instead focus on creating stronger passwords and patching systems with the latest Windows updates.

I also agree with Chris's comment that no analysis of IP addresses, however strong, provides proof to disqualify anyone. Now that the voting is over, however, I can say that at the start of the competition I designed an algorithm that analyzes all of the data. I can't provide any more information, because then it might be possible for someone to defeat it in future competitions. The method is implemented in such a way that false negatives are possible, but the odds of a false positive when it comes to fraud are extraordinarily low. If any fraud is detected when this algorithm is run, we'll be almost absolutely certain that the identified user is responsible for it. If that happens, we'll reveal the nature of the algorithm for transparency purposes. I also want to state the following: I am confident enough in this algorithm to say that the results will adhere to the one-vote-per-person rule with 99% accuracy. Of course, nobody else has anything to go on but my word, but I hope that by not voting, and by not offering any comments about any song up to this point, I have shown that I'm not trying to influence this competition.

Another method that I'm using to analyze the ratings is a statistical analysis of the distribution of votes. If a song is rated 4.5, I'd expect the majority of its votes to be 4s or 5s. If all of the votes are 1s and 7s instead, the data indicates that more investigation is necessary. I'm expecting to find that the votes for each song follow a roughly normal distribution with a small standard deviation.
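To make the distribution check concrete, here is a rough Python sketch. It is not the fraud-detection algorithm mentioned above, and it is not necessarily how the real analysis is done. The function names, the "within 1 point of the mean" tolerance, and the 50% threshold are all assumptions picked for illustration.

```python
from statistics import mean, pstdev

def vote_spread(votes, tolerance=1.0):
    """Return (mean, standard deviation, fraction of votes near the mean).

    A vote counts as "near" when it is within `tolerance` points of the
    song's mean rating. The tolerance is an illustrative assumption.
    """
    avg = mean(votes)
    sd = pstdev(votes)
    near = sum(1 for v in votes if abs(v - avg) <= tolerance)
    return avg, sd, near / len(votes)

def needs_review(votes, min_near_fraction=0.5):
    """Flag a song when fewer than `min_near_fraction` of its votes sit
    near the mean. The 0.5 threshold is an illustrative assumption."""
    _, _, near_fraction = vote_spread(votes)
    return near_fraction < min_near_fraction

# Two made-up songs that both average 4.5.
clustered = [4, 5, 4, 5, 5, 4, 4, 5]                # votes hug the mean
polarized = [1, 7, 1, 7, 7, 1, 7, 1, 7, 1, 7, 7]    # only 1s and 7s

for name, votes in (("clustered", clustered), ("polarized", polarized)):
    avg, sd, near = vote_spread(votes)
    print(f"{name}: mean={avg:.2f} sd={sd:.2f} near={near:.0%} "
          f"review={needs_review(votes)}")
```

Both example songs have the same average, but only the second one gets flagged, which is the point: the average alone hides the pattern. A flag like this only means "look closer"; it doesn't prove anything by itself.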

Since there is disagreement over the release of any of this information, let's do this. If the results are clear-cut and there is no dispute after they are posted, I'll simply create graphs with the results of the statistical analysis and present them along with the results. The IP addresses and algorithm results will not be released. If it turns out that there is a tie as specified in the rules, the tiebreaking vote will be subject to the release of all information, including IP addresses and possibly the nature of the algorithm.

Last edited by quintin3265; Dec 16, 2009 at 09:19 AM.