Selecting Data types for columns, while being important, is one of the most overlooked parts of a database. Use of data types requires understanding to run a database efficiently. For example MySQL supports 10 different numeric data types such as INTEGER, TINYINT, SMALLINT, MEDIUMINT, BIGINT, DECIMAL, NUMERIC, FLOAT, REAL, and DOUBLE PRECISION. Each of some data types exist for specific operations and improper use can result in stress to the database.
The following are some helpful pointers to keep in mind when designing MySQL databases. Understanding how disk I/O affects databases – Typically, databases are stored on Hard Disks. HD I/O is extremely slow when compared to memory I/O. Therefore the larger you database the longer it takes to fetch data. This problem can be exacerbated by poor design where more space than necessary is allocated, thus leading to inefficiency. Picking the right numeric type – Whatever data type you choose, space is allocated to the maximum size of that data type. For example BIGINT occupies 8 bytes of space whether the number you store in it is a “1” or “5,000”. Careless use of data types leads to a massive space requirement and vast amounts of seek times. Storing text in a database – Learning the differences between CHAR and VARCHAR will only increase your efficiency. To put it simply, CHAR should only be used when the length of the data is known and fixed and similarly VARCHAR should be used when the length of the data is not fixed. Also, instead of using TEXT columns to store large blocks of text abstract the TEXT columns to a different table. Data types you should avoid – ENUM and SET should be used with extreme caution and with proper understanding of how they work. Add to the ENUM or SET declaration involves MySQL rebuilding the entire table. Now imagine if you have 2-3 million records, this process could take about a week. Understanding the above points will help you to design a better database. The key message here is that a characteristic of good database design is choosing the proper data type; and a good design will always enable expansion and offer the best possible data retrieval times, thus improving productivity.