Naming conventions for database columns

February 07, 2012

Earlier I was listening to the CFHour Podcast #124 (by the way, this is a great show for anyone wanting to keep up on the ColdFusion community). This particular episode discussed (among other things) databases and naming conventions for columns. The main point that was brought up was prefixing columns with the table name, and when developers should/shouldn't do this. The example given (which is probably the most common scenario) was the primary key "ID" column -- i.e. naming the column "TableNameID", rather than just "ID". The discussion of "why/when to do this?" went back and forth, and it pretty much boiled down to a few things: 1. Personal preference on naming columns -- some people use "ID", some use "TableNameID".
2. It adds to the readability of the code. But WHY? In a complex query, you're probably aliasing the tables anyway, so you'd end up with "TableAlias.ID", which is just as readable (assuming you use an appropriate name for your table alias).
3. It's redundant. If I'm in the Users table, I know that "ID" is referring to the ID of a User.
4. But what about, say, in a CFLoop? If the code contains "#ID#", it's not clear which table ID is being looped over (which is true...but really the problem in this case is, the developer was lazy and didn't scope the variable -- "queryName.ID" being much more readable than "ID").
Personally (unless there is some other shop standard to which I must adhere), I use "TableNamePK" for my primary keys, and "TableNameFK" for my foreign keys. Here's why: I've run into several projects where (because of business requirements), I'll have tables that look something like so: tblManufacturing --
ManufacturingID
Name
VendorID
PartnerID
OrderModuleID
Let's say that VendorID, PartnerID are NOT foreign keys to other tables (but OrderModuleID is), they are just columns that meet a business requirement (maybe this is data that's entered manually by an employee). Maybe "VendorID" is an industry standard term for that piece of data, predating our application by years or even decades. If I rename "VendorID" and "ParnterID" to avoid this confusion (which is only confusing to me, the developer, as the business units will never see the primary keys), it would have slowed down communication among the team members ("when you say VendorID, you actually mean the column I've renamed to VendorAccessCode", etc). So we can't really rename these "ID" columns that are based on business rules, but since the business folks don't really care about the primary key column, we can use a different naming convention for that, without slowing down anyone. Looking at it from the other side, which "ID columns" represent foreign keys to other tables? Which columns should have constraints applied to them? By just looking at this table, I have no way of answering that. This could slow down development, causing developers and DBAs to go looking for foreign key relationships that don't exist. But what if the table were named like so: tblManufacturing --
ManufacturingPK
Name
VendorID
PartnerID
OrderModuleFK
Now, by just looking, which column is the Primary Key? And which ones are Foreign Keys to other tables? And which ones store some sort of 'ID' that is part of a business rule, but isn't actually a foreign key? By using "PK" and "FK", it's much more clear which types of data are stored in the columns, and I've typed the exact same number of characters...so nobody is allowed to complain about this being too much typing. ;) Whichever convention you pick, you should stick to it consistently throughout the application (unless you have a really good reason for breaking it). I've had to write complex recursive algorithms before that traversed down a hierarchial database of roughly 500 tables, and needed to be able to "predict" the name of the next primary key in the tree. Because I used the same naming convention for every table in the database, the algorithm worked throughout the entire system. If there was no rhyme or reason to how things were named, I would have had to provide some sort of "map", listing each table and its primary key column (yet another thing to maintain, as well as a possible point of failure in the application). Additionally, I was able to write regex patterns to find foreign keys, by searching for any column name ending with "FK", which also wouldn't have worked if I'd stuck with the "ID" convention for everything. -nolan