Archive for September, 2004

On Mathematical models of the recall vote and fraud, part X: 2nd. Simon Bolivar Seminar

September 19, 2004

On Thursday the second Simon Bolivar University seminar on Statistical Analysis of the referendum process was held. There were supposed to be three talks, but nature conspired against Luis Raul Pericchi, who was in Puerto Rico and was unable to come to Venezuela due to Hurricane Jeanne. A videoconference was then planned, but unfortunately the island lost all electric power, making it impossible to set up. His talk has been tentatively rescheduled for next Thursday.


You can find the program for these conferences here. I thought all presentations would be placed there, but only one of them has been posted so far; more on that particular one later.


 


-There was a talk by Rafael Torrealba from the Math Department at Universidad Centro Occidental Lisandro Alvarado. The talk would have been useful two or three weeks ago, but by now the model is too simplistic to be useful. Basically, Torrealba calculated the probability of coincidences assuming all machines have 500 voters and approximating the binomial distribution by a “box” with zero probability more than one standard deviation above or below the mean. Using this, Torrealba found that coincidences were as likely as observed in the recall vote and cited Rubin’s work, but he was unaware of the work by Taylor, Valladares and Jimenez. Thus, it was too crude at this point to add much to the discussion.
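To get a feel for how crude a “box” is, here is a minimal sketch, not Torrealba's actual calculation. It assumes two machines with 500 voters each, a Si probability of 0.5 (my choice, purely for illustration), and a box that is uniform over the whole numbers within one standard deviation of the mean. The function names are hypothetical. It compares the chance that the two machines report the same Si count under the box with the same chance under the full binomial distribution.

```python
import math

def binom_pmf(n, k, p):
    """Binomial probability of exactly k successes in n trials (via logs for stability)."""
    log_pmf = (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
               + k * math.log(p) + (n - k) * math.log(1 - p))
    return math.exp(log_pmf)

def exact_match_prob(n, p):
    """P(two independent Binomial(n, p) counts are equal) = sum over k of pmf(k)^2."""
    return sum(binom_pmf(n, k, p) ** 2 for k in range(n + 1))

def box_match_prob(n, p):
    """Same probability if each count is uniform on the integers within one sigma of the mean.

    This reading of the "box" is an assumption made for illustration.
    """
    mu = n * p
    sigma = math.sqrt(n * p * (1 - p))
    lo, hi = math.ceil(mu - sigma), math.floor(mu + sigma)
    return 1.0 / (hi - lo + 1)

if __name__ == "__main__":
    n, p = 500, 0.5  # assumed values, not taken from the talk
    print("box approximation:", box_match_prob(n, p))
    print("full binomial:    ", exact_match_prob(n, p))
```

Under these assumptions the box gives a match probability of roughly 0.043 against roughly 0.025 for the full binomial, so a box of this kind would tend to overstate how often coincidences should occur.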


 


Torrealba also showed some voter distributions from the Barquisimeto area where he lives to discuss the implications of applying a binomial distribution.


 


-There was a second talk by Isbelia Martin on the binomial distribution and the vote from the recall. She gave a more complete presentation of the results I summarized here. In the talk she presented much more material than I showed, and if she places her presentation online I will link to it here in the future.


 


What she did was to present the data for a textbook binomial state, Vargas State, and compare it to the data I presented on Miranda State. There are more anomalies in the data than the ones I discussed, including the fact that if one does a fit through the “clouds” of results to obtain the average for each cloud, the fitted lines do not pass through zero as they should. Additionally, she and her colleagues find that in some cases the same center has machines in both clouds, which obviously makes no sense.


 


-Jimenez, Jimenez and Marcano have now placed a simplified version of their work on coincidences here. I wish everyone would make their work available like that; it would make discussions more lively and interesting.


 


What they have done is essentially to use what is called a bootstrap method, which is basically a simulation of the vote using the actual data from the recall referendum and modeling the details of the structure of the centers, tables (mesas) and machines. They allow all variables to fluctuate so that they do not have to assume the data is random, which it would not be if it had been tampered with.


 


Jimenez et al. also do a more detailed calculation of the problem by looking not only at the number of coincidences in the Si or No votes, but at the Si, No and total votes, and comparing the probability of coincidences for each type of center. That is, they do not only calculate how many centers had coincidences in two machines; they calculate how many centers with two machines had coincidences in any of the three numbers (Si, No or sum of votes), how many centers with three machines did, how many with four, etc. In this fashion one has a wider set of probabilities with which to compare the real data to what the simulations say.


 


They then did 1238 simulations and calculated the same probabilities for centers with 2 to 11 machines. In this manner they found that, in general, the proportion of coincidences is higher in the actual vote than in the simulations, which led them to do a test of ranges, calculating the probability that the observed number of coincidences in the recall vote may occur for centers with n=2,3,4,…,11 machines. In this manner, it is not simply a matter of asking what the probability of two machines coinciding is, but what the probability is that centers with two machines had the level of coincidences observed.
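To make the idea concrete, here is a minimal sketch of a simulation test of this general kind. It is not the procedure of Jimenez et al., whose bootstrap models centers, tables and machines in much more detail. The simplifying assumptions are mine: every voter in a center votes Si independently with that center's overall Si fraction, only Si counts are checked, and all names and the toy data are hypothetical.

```python
import random

def has_coincidence(counts):
    """True if at least two machines in a center report the same count."""
    return len(set(counts)) < len(counts)

def simulate_center(machine_sizes, p_si, rng):
    """Draw one Si count per machine: each voter says Si with probability p_si."""
    return [sum(rng.random() < p_si for _ in range(size)) for size in machine_sizes]

def coincidence_pvalue(centers, observed, n_sims=1000, seed=0):
    """Fraction of simulated votes with at least `observed` centers showing a Si coincidence.

    `centers` is a list of (machine_sizes, p_si) pairs for centers of one type,
    for example all centers with two machines.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_sims):
        count = sum(has_coincidence(simulate_center(sizes, p, rng))
                    for sizes, p in centers)
        if count >= observed:
            hits += 1
    return hits / n_sims

if __name__ == "__main__":
    # Toy data: 20 two-machine centers with 300 voters per machine (made up).
    toy_centers = [([300, 300], 0.4)] * 20
    print(coincidence_pvalue(toy_centers, observed=3, n_sims=200))
```

A small returned fraction means that few simulated votes produced as many coincidences as the observed number, which is the sense in which the probabilities quoted below should be read.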


 


You can see the results in their paper in Table 3, but I will summarize some cases with examples:


 


Centers with two machines: The probability of observing the number of Si coincidences seen was 0.0323, that of the number of No coincidences was 0.7746 and that of the number of total vote coincidences was 0.0638. Thus, while unlikely, that many coincidences could still have occurred by chance.


 


Centers with four machines: The probability of observing that number of Si coincidences was ZERO, with the probability of No coincidences being 0.2883 and the probability of total votes coinciding 0.00807. Similarly low probabilities were observed for the total number of coincidences in centers with 6 and 7 machines, and extremely low probabilities for Si coincidences in centers with six machines.


 


The authors conclude:


 


-The repetitions observed in the Si vote and the total number of voters per machine in one center are considerably larger than expected. It is strange, but probable.


 


-The repetitions observed in the NO votes are absolutely credible and in many cases, close to what was expected.


 


-The repetitions observed in the Si votes in centers with 4 machines and in the number of voters in centers with six machines are the extreme cases of their analysis. In these cases the authors CANNOT accept the hypothesis that the repetitions are due to randomness.


 


This last conclusion is the strongest found in the study of the coincidences in the number of votes within one center and it says the data could not have been random.

My favorite orchid?

September 18, 2004


Above, a thriving Cattleya Intermedia from Brazil with about a dozen blooms and another dozen buds. It not only blooms and blooms and grows and grows, but look at how spectacular the flower on the right is!



This is Cattleya Loddigesi Tony Boss from Brazil. Is this my favorite orchid? If it is not, it must be a very close call. I am always in awe of how delicate and well shaped it is. The plant seems to be ready to die on me every year, but always comes back. Not easy to grow, but wonderful blooms. It is supposed to be the best Loddigesi there is; glad I have one!

On the news: From Ministries to plane crashes and fraud

September 18, 2004

-Chavez creates more Ministries. In 1998, when Chavez won his first election, he criticized the size of the State, and he went on to reduce the number of Ministries from 21 to 15 by merging Interior with Justice; Health with Social Development; Education, Culture and Sports; Transport and Urban Development; and Agriculture with Industry and Commerce. Since then, he has created six, including Housing (needed), Foodstuffs (what about Agriculture?), Social Economy (as opposed to what?), Culture, Higher Education and Special Zones.


-Another Mirage crashed, one more in a long series of Mirages and F16s that have served as incredibly neat toys for the military, but unfortunately too many have crashed. At $15,000 per hour in fuel, maybe it’s for the better: the fewer there are, the less they will spend. There are 13 left… Maybe it is time to buy more to defend ourselves from whatever! So glad the three submarines have not sunk beyond their established parameters (thanks Ed!)!


 



 


-Danilo Anderson, the Prosecutor devoted to political vendettas, is investigating the people who went to the Presidential Palace on April 12th to see whether they can be charged with rebellion. So the Generals were exonerated from rebellion by the Supreme Court, and now they will go after the civilians. I certainly hope they call General Lucas Rincon, the highest ranking General in the military, to testify as to why he said Chavez had resigned on April 11th. Rincon was in Chavez’ Cabinet until August 18th and remains the most mysterious link in the supposed “coup” of April 2002, since it was his appearance that evening that led to all of the events of the next three days.


 


-Venezuela will sell US$ 1.5 billion in US dollar-denominated bonds abroad in the next two weeks. Apparently $44 per barrel is insufficient to finance the revolution. Since Chavez said prices are on their way to $100 per barrel, the revolution should be safe.


 


-The People’s Ombudsman came out to request that the Bank Superintendence eliminate the SICRIT. This is the database that banks share to establish credit for people. The Ombudsman says this is a violation of their rights. Watch out TRW! Watch out consumers, you will no longer be able to get credit if this is approved!


 


-Still waiting to give my opinion on the latest Carter Center report. I will wait for Hausmann and Rigobon to give their opinion. So far, it seems to me to be so silly that I don’t dare say more than that. They proved the votes and the signatures had a high correlation. Seems obvious to me, but… They also showed the random number generator works. The one in Excel on my PC does too, but that says nothing about what happened on the CNE computer. Better to shut up and not start talking like a Venezuelan expert (find ten Venezuelans, preferably men, ask them a question, and eight out of ten will be experts on the subject).

Pedestrian application of Benford’s Law to Exit polls

September 18, 2004

I have been in computer hell ever since my laptop decided two weeks ago not to charge properly. I got a new charger, but the problem turned out to be the connector on the motherboard, so it may die again. Apparently the only solution may be to get a new motherboard from the States. When I finally got it to work it gave me an error; I called Microsoft, who said to reinstall Windows, and I did. Unfortunately, after installing Windows I could not get the wireless network to even give me the right menu. I called Dell and had my first experience with a call center in India: very efficient, polite and knowledgeable. I am back on the air with my laptop for the time being.


Lots to post. For now, someone in the comments asked why Benford’s Law had not been applied to the exit polls, so I did it. Now, understand that all I did was plot the fraction of occurrences of each first digit in the exit polls for both the Si’s and the No’s, which gave the following histogram:


 



 


This seems to be textbook Benford’s Law for the data from the 325 centers polled by Proyecto Venezuela. I don’t know if mathematicians use other tests to evaluate how well or badly the data fit Benford’s Law. If anyone knows, let me know.
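One common way to put a number on the fit is a chi-square goodness-of-fit test against the Benford frequencies, log10(1 + 1/d) for first digit d. The following is a minimal sketch of that test, not the calculation behind the histogram above; the function names are hypothetical and the sample data are powers of 2, which are well known to follow Benford's Law, rather than the exit poll numbers.

```python
import math
from collections import Counter

def benford_expected():
    """Expected fraction of each first digit 1..9 under Benford's Law."""
    return {d: math.log10(1 + 1 / d) for d in range(1, 10)}

def first_digit(n):
    """Leading decimal digit of a nonzero integer count."""
    return int(str(abs(int(n)))[0])

def benford_chi_square(values):
    """Chi-square statistic comparing observed first digits with Benford's Law.

    Zero counts are skipped because they have no leading digit.
    """
    digits = [first_digit(v) for v in values if int(v) != 0]
    total = len(digits)
    observed = Counter(digits)
    expected = benford_expected()
    return sum((observed.get(d, 0) - total * expected[d]) ** 2 / (total * expected[d])
               for d in range(1, 10))

if __name__ == "__main__":
    sample = [2 ** k for k in range(100)]  # stand-in data, not the exit polls
    print(benford_chi_square(sample))
```

With nine digit classes there are 8 degrees of freedom, so a statistic above about 15.51 would reject Benford's Law at the 5% level; a value well below that is consistent with it.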

On Mathematical models of the recall vote and fraud, part IX: Too much correlation between the 2000 and 2004 vote?

September 15, 2004

Jose Huerta, whose page you can find here (he has some interesting statistics about education and poverty in Venezuela on his page), has been comparing the data from 4565 centers for the 2000 Presidential vote and the recent recall referendum vote, as well as for the 1999 referendum and the 2000 Presidential vote. Essentially, you note that, as usual, time was working against Chavez: with each vote the anti-Chavez vote went up and the pro-Chavez vote went down. (Here is the full presentation and details in PowerPoint format.) However, this trend stopped between 2000 and 2004, despite the fact that the time span between the two votes was much longer.


What is interesting about his results is that he finds that if you look at the number of anti-Chavez votes at the municipal level, there is a high correlation between the 1999 and 2000 votes, with R^2=0.9784:


 



 


 


 


But what is remarkable is that the correlation actually went UP between 2000 and 2004, with R^2 of 0.9866 at the municipal level:


 



 


 


which is somewhat counterintuitive given the time frame and everything that happened in Venezuela in those four years, including the new voters, changes in voting centers, migrations and political unrest.


 


Certainly very intriguing.
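For readers who want to reproduce this kind of number, here is a minimal sketch of how an R^2 for a straight-line fit between two sets of vote counts can be computed. For a simple linear fit it is the square of the Pearson correlation coefficient. This is not Huerta's code; the function name is hypothetical and the data in the example are made up.

```python
def r_squared(x, y):
    """Coefficient of determination of a straight-line fit of y on x.

    For simple linear regression this equals the squared Pearson correlation.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy * sxy / (sxx * syy)

if __name__ == "__main__":
    # Made-up anti-Chavez vote counts for four municipalities in two elections.
    votes_first = [1200, 5400, 800, 15000]
    votes_second = [1350, 5900, 700, 16100]
    print(r_squared(votes_first, votes_second))
```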

Andrew Morse compares Carter and Rather

September 15, 2004


Good article by Andrew Morse in Tech Central Station comparing the fake Rather documents and the rush to judgment by CBS to Carter’s rush to judgment in the Venezuelan recall. I love this passage:


The deficiencies exhibited by CBS and the Carter Center are problematic not only for big media and elite NGOs. Left unaddressed, they are problems that will eventually feed back into the blogosphere. The rise of the blogosphere has been predicated on the existence of a reliable, common base of information that people can discuss — a base of information that full-timers are still in the best position to provide. The blogosphere is most robust when it can draw upon the resources of media organizations with global reach that honestly vet sources and conduct comprehensive follow-up reporting. It depends on the boots on the ground that a Carter Center can provide to monitor elections in faraway lands. Without the work of an organization like the Carter Center, there might not be detailed election data from Venezuela to examine in the first place. “


For those who don’t know about the Dan Rather affair, you can read all about it at Instapundit, but basically somebody gave CBS fake documents that questioned President Bush’s National Guard service and the type of treatment he received while there. The documents were debunked by bloggers in less than 24 hours. The giveaway? Easy: the documents were supposed to be from before the era of PCs but were printed using one, which one could tell from the quality of the spacing and font! More here too.