Oct 27 2009

MySQL SUM() Doesn’t Play Well With Floats

I had to write some reports for some legacy software today and I was unpleasantly surprised with the results of my SQL queries. I was selecting dollar values and summing them to for the monthly spending of certain individuals. Easy enough right? I wrote a query something like this :


SELECT SUM(t.money_spent) as sum_of_spent,
c.customer_name 
from transactions t 
join customers c on t.customer_id=c.customer_id 
group by customer_name order by c.customer_name asc

I ended up getting numerical values that were 10 decimal places long with seemingly random numbers. After checking to make sure the database didn’t have any odd entries I stumbled on this bug report.
The ‘money_spent’ column had a data type of float, which is a waste, but I still don’t think that it should sum up incorrectly. When I select individual values I get proper two decimal results.
Apparently floats and doubles use floating point math, which deals with approximate values for numbers and can thus result in confusion like this. It seems that it isn’t really possible to store 0.1 in a column of type float. You can only store 0.00999999977648258. This behavior is a little silly but easily fixed by using the ROUND() function :

SELECT SUM(ROUND(t.money_spent)) as sum_of_money_spent,
c.customer_name from transactions t 
join customers c on t.customer_id=c.customer_id 
group by customer_name order by c.customer_name asc

Share

Oct 7 2009

Reset Mysql Root Password On Linux

If you have root access to a linux server and you don’t have the root mysql password, but need it, then you can easily reset the root mysql password in just a few commands. These commands probably differ depending on what linux distro you use. I was using Ubuntu 9.04 (Jaunty Jackalope) when I wrote this.

Firstly you will want to turn the mysql service off.


codytaylor@server:~$ sudo /etc/init.d/mysql stop
 * Stopping MySQL database server mysqld   

Now we restart the mysql server with the ‘skip-grant-tables’ option which basically allows anyone to do whatever they like. It’s usually preferable to include the ‘skip-networking’ option so that only localhost (you) have access to the naked database.

 codytaylor@server:~$ sudo mysqld_safe --skip-grant-tables --skip-networking &

Now all that is left is actually changing the root password. Log into the mysql monitor and change the root password.

codytaylor@server:~$ mysql -u root mysql
mysql> UPDATE user SET Password=PASSWORD('password') WHERE User='root';
mysql> FLUSH PRIVILEGES;

Those commands will reset the root mysql password to ‘password’. Now you’ll probably want to restart the mysql service and have it run normally.

codytaylor@server:~$ sudo /etc/init.d/mysql restart

If you are using windows and you want to reset the mysql root password then check the mysql documentation.

Share

Sep 14 2009

Optimize WordPress

My webhost recently moved all my sites to a ‘stabilization’ server because my sites were using far to much CPU time and Memory. After reviewing the logs it looked like some bot from India decided to repeatedly scrape one of my sites in it’s entirety without any delays between requests. So the support team over there either requires me to correct the problem or upgrade to a dedicated server plan at ridiculous costs.

Since I didn’t really think that there was a problem I emailed back about the single IP address that was causing all the issues and took steps to prevent requests from that IP address from accessing the site. The support team replied saying that my usage was still high and that I still needed to correct the problem. A little frustrated, I did some research on how to improve my site’s load time and hopefully reduce CPU and memory usage.

Most of my sites use wordpress so I found a large number of articles geared specifically to optimizing wordpress blogs. Before I tried anything I backed up my entire public_html directory and did a dump of all my mySQL databases (took almost 20 minutes for the dump).

Dealing with Plugins
The first thing I did was upgrade all my plugins. Most wordpress plugins allow you to upgrade automatically so all you really have to do is click a button and all the work is done for you. I also deactivated and deleted a surprising number of plugins that I haven’t really had any use for recently. Apparently a lot of free plugins can cause large amounts of unneccesary load on your server due to the authors not really knowing or caring how well their software performs.

Dealing with spam bots
I have been using the Akismet plugin for awhile and it has been reporting large amounts of spam comments and pingbacks. It’s not really something that most people worry about because the spam is automatically deleted after a period of time. It does however increase server load, especially if it’s in the thousands of messages a day. I found this little mod_rewrite snippet to deny any blatent spammers that don’t have a proper referer :


RewriteEngine On
RewriteCond %{REQUEST_METHOD} POST
RewriteCond %{REQUEST_URI} .wp-comments-post\.php*
RewriteCond %{HTTP_REFERER} !.*codytaylor.org.* [OR]
RewriteCond %{HTTP_USER_AGENT} ^$
RewriteRule ^(.*)$ ^http://lemonparty.org//$ [R=301,L]

Cache and Compress
Since most of my pages rarely change it’s silly to generate every page for every request dynamically. After some reading I decided to use WP Super Cache to help optimize my WordPress sites. Of course just enabling Super Cache in the WP Super Cache plugin didn’t really improve load times for the end user but it should reduce server load immensely. What did improve load times drastically was the Super Cache Compression. This was a little more involved to get going but if you’re comfortable with copying and pasting code into a .htaccess file then it shouldn’t be difficult as long as your host supports mod_mime, mod_rewrite, and mod_deflate.

After going through all that, my sites now average at about half the load time they used to. Hopefully my web host feels that I’ve done enough to get off the ‘stabilization’ server so I don’t have to transfer all my stuff to another company.

Share

May 14 2009

MySQL Founder Starts Open Database Alliance, Plans Refactoring

Gary Pendergast writes “Monty Widenius, the ‘father’ of MySQL, has created the the Open Database Alliance, with the aim of becoming the industry hub for the MySQL open source database. He wants to unify all MySQL-related development and services, providing a potential solution to the fragmentation and uncertainty facing the communities, businesses and technical experts involved with MySQL, following the news of the Oracle acquisition of Sun.” Related to this, an anonymous reader writes that “MySQL has announced a project to refactor MySQL to be a more Drizzle-like database.”

Read more of this story at Slashdot.


Share

May 12 2009

Break-In Compromises 160k Medical Records At UC Berkeley

nandemoari writes “Hackers have reportedly infiltrated restricted computer databases at the University of California Berkeley, putting the private data of 160,000 students, alumni, and others at risk. According to UC Berkeley, computer administrators determined on April 9, 2009 that electronic databases in University Health Services had been breached by overseas criminals. The breakins began in October 2008. Information contained on the breached databases included Social Security numbers, health insurance information, and non-treatment medical information such as records of immunization and names of treating physicians.”

Read more of this story at Slashdot.


Share

May 10 2009

DoJ Budget Request Details Advanced Surveillance, Biometrics

An anonymous reader writes with a report about programs revealed in the Department of Justice’s 2010 budget request, which includes 3.9 million in funding for an “Advanced Electronic Surveillance” project, and .6 million to establish the Biometric Technology Center. The surveillance project is designed to help the FBI “deal with changing technology and ways to intercept phone calls such as those used by VOIP phones or technology such as Skype. The program is also conducting research on ways to conduct automated analysis to look for links between subjects of surveillance and other investigative suspects.” The Center for Democracy and Technology’s Jim Dempsey warns, “It is appropriate for the FBI to develop more and more powerful interception tools, but the privacy laws that are supposed to guide and limit the use of those tools have not kept pace.” The biometrics plan lays groundwork for a “vast database of personal data including fingerprints, iris scans and DNA which the FBI calls the Next Generation Identification,” a system we have discussed in the past.

Read more of this story at Slashdot.


Share

May 5 2009

Virginia Health Database Held For Ransom

An anonymous reader writes “The Washington Post’s Security Fix is reporting that hackers broke into servers at the Virginia health department that monitors prescription drug abuse and replaced the homepage with a ransom demand. The attackers claimed they had deleted the backups, and demanded million for the return of prescription data on more than 8 million Virginians. Virginia isn’t saying much about the attacks at the moment, except to acknowledge that they’ve involved the FBI, and that they’ve shut down e-mail and a whole mess of servers for the state department of health professionals. The Post piece credits Wikileaks as the source, which has a copy of the ransom note left behind by the attackers.”

Read more of this story at Slashdot.


Share

May 4 2009

The Manga Guide to Databases

stoolpigeon writes “Princess Ruruna, of the Kingdom of Kod, has a problem. Her parents, the King and Queen, have left to travel abroad. Ruruna has been left to manage the nations fruit business. Much is at stake, Kod is known as “The Country of Fruit.” Ruruna is not happy though, as she is swamped by paperwork and information overload. A mysterious book, sent by her father, contains Tico the fairy. Tico, and the supernatural book are going to help Princess Ruruna solve her problems with the power of the database. This is the setting for all that takes place in The Manga Guide to Databases. If you are like me and learned things like normalization and set operations from a rather dry text book, you may be quite entertained by the contents of this book. If you would like to teach others about creating and using relational databases and you want it to be fun, this book may be exactly what you need.” Read below for the rest of JR’s review.

Read more of this story at Slashdot.


Share

Apr 30 2009

Web Analytics Databases Get Even Larger

CurtMonash writes “Web analytics databases are getting even larger. eBay now has a 6 1/2 petabyte warehouse running on Greenplum — user data — to go with its more established 2 1/2 petabyte Teradata system. Between the two databases, the metrics are enormous — 17 trillion rows, 150 billion new rows per day, millions of queries per day, and so on. Meanwhile, Facebook has 2/12 petabytes managed by Hadoop, not running on a conventional DBMS at all, Yahoo has over a petabyte (on a homegrown system), and Fox/MySpace has two different multi-hundred terabyte systems (Greenplum and Aster Data nCluster). eBay and Fox are the two Greenplum customers I wrote in about last August, when they both seemed to be headed to the petabyte range in a hurry. These are basically all web log/clickstream databases, except that network event data is even more voluminous than the pure clickstream stuff.”

Read more of this story at Slashdot.


Share

Apr 27 2009

UK Government To Monitor All Internet Use

nk497 writes “The UK government has further detailed plans to track all communications — mobile phone calls, text messages, email and browser sessions — in the fight against terrorism, pedophiles and organized crime. The government said it’s not looking to see what you’re saying, just to whom and when and how. Contrary to previous plans to keep it all in a massive database, it will now let ISPs and telecoms firms store the data themselves, and access it when it feels it needs it.” And to clarify this Barence writes “The UK Government has dropped plans to create a massive database of all internet communications, following stern criticism from privacy advocates. Instead the Government wants ISPs and mobile phone companies to retain details of mobile phone calls, emails and internet sites visited. As with the original scheme, the actual content of the phone calls and messages won’t be recorded, just the dates, duration and location/IP address of messages sent. The security services would then have to apply to the ISP or telecoms company to have the data released. The new proposals would also require ISPs to retain details of communications that originated in other countries but passed over the UK’s network, such as instant messages.”

Read more of this story at Slashdot.

Share