Organizing Your WordPress Data: Understanding Custom Taxonomies vs Custom Fields

This article looks at the two major systems for adding data to posts in WordPress: custom taxonomies and custom fields. Since some needs can reasonably be solved by either custom fields or custom taxonomies, we’ll use a practical example, a movie review site, to help define overall principles for when it’s best to use each tool.

The Two Crucial Types of Post Data in WordPress

In WordPress, you need to think about two types of data that you can attach to your posts: taxonomies and “fields.” Since we’re discussing creating custom data, I’ll mostly be referring to “custom taxonomies” and “custom fields.”

What Custom Taxonomies Are

To understand custom taxonomies well, it’s most helpful to think about WordPress’s built-in taxonomies: Post Tags and Post Categories. Both are systematic ways to organize and think about your data. Their biggest distinguishing feature, though, is that they’re not unique data. For both tags and categories, you’re using the same word or phrase on multiple different posts. You’re essentially attaching an independent data point (that tag or category) to your post.

It’s important to realize that taxonomy terms like tags hold their own data, and can change independently of the posts they are attached to. For example, I may have once used the tag “wapo” on a site because I had a post that referred to The Washington Post. Now I want to be clearer, so I want that tag (and the 79 posts that have it) to read “The Washington Post (newspaper)”. Because of the way taxonomies work, linking your post to a separate record rather than copying text into it, this is a change I need to make just once, and it shows up everywhere like magic.

That same feature also means that finding everything tagged with that tag, be it “wapo” or “The Washington Post (newspaper)”, is pretty fast for WordPress. Because WordPress is essentially just saving the links (this post has these 5 tags linked to it, this tag has 47 posts connected), finding all the places that tag recurs is quick.
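
To make the “linking” idea concrete, here is a rough sketch in Python (not WordPress PHP) using an in-memory SQLite database. The three tables are a simplified, hypothetical stand-in for how terms and posts relate; the table names, column names, and post titles are all assumptions made up for illustration, not the real WordPress schema.

```python
import sqlite3

# Simplified, hypothetical schema for illustration only; it is NOT the
# real WordPress table layout.
db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE posts (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE terms (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE term_links (post_id INTEGER, term_id INTEGER);
""")
db.executemany("INSERT INTO posts VALUES (?, ?)",
               [(1, "Election night"), (2, "Paywalls"), (3, "Recipes")])
db.execute("INSERT INTO terms VALUES (10, 'wapo')")
# Posts 1 and 2 carry the tag; post 3 does not.
db.executemany("INSERT INTO term_links VALUES (?, ?)", [(1, 10), (2, 10)])


def tags_for(post_id):
    """Names of every term linked to one post."""
    rows = db.execute(
        "SELECT t.name FROM terms t JOIN term_links l ON l.term_id = t.id "
        "WHERE l.post_id = ? ORDER BY t.name", (post_id,))
    return [name for (name,) in rows]


def posts_for(term_id):
    """IDs of every post linked to one term: a lookup on the link table."""
    rows = db.execute(
        "SELECT post_id FROM term_links WHERE term_id = ? ORDER BY post_id",
        (term_id,))
    return [pid for (pid,) in rows]


# One UPDATE on the single term row renames the tag for every linked post.
db.execute("UPDATE terms SET name = ? WHERE id = ?",
           ("The Washington Post (newspaper)", 10))
```

After the single `UPDATE`, `tags_for(1)` and `tags_for(2)` both report the new name, and `posts_for(10)` still finds the same posts, because the posts only ever stored a link to the term.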

The final detail about custom taxonomies is that they can be hierarchical or flat. I expanded on this a lot more in a past post about custom taxonomies; the summary is that “hierarchical” means both “like Post Categories” and “can nest.” So you can organize things so all your references to “US Politics” are also captured when you look for all references to “Politics.” Post Tags are different: they’re “flat,” so without using human intelligence (or fuzzy string-matching), a computer wouldn’t necessarily know the relation between “Politics” and “US Politics.”
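
As a minimal sketch of what “hierarchical” buys you, the Python below models a tiny term tree and shows a query for a parent term picking up posts attached to its children. The `parents` and `post_terms` structures, the “Sports” term, and the post IDs are hypothetical; this illustrates the idea rather than how WordPress implements it.

```python
# Hypothetical term tree: each term maps to its parent (None = top level).
parents = {"Politics": None, "US Politics": "Politics", "Sports": None}

# Terms attached directly to each (made-up) post ID.
post_terms = {1: {"US Politics"}, 2: {"Politics"}, 3: {"Sports"}}


def descendants(term):
    """Return the term itself plus every term nested beneath it."""
    found = {term}
    for child, parent in parents.items():
        if parent == term:
            found |= descendants(child)
    return found


def posts_in(term):
    """IDs of posts attached to the term or to any of its descendants."""
    wanted = descendants(term)
    return sorted(pid for pid, terms in post_terms.items() if terms & wanted)
```

Here `posts_in("Politics")` includes the post that was only filed under “US Politics”. With a flat taxonomy there is no `parents` mapping to consult, so that relationship simply isn’t known.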

So what are Custom Fields then?

Like custom taxonomies, “custom fields” are a way to attach additional data to your content. But from there, the two pretty quickly diverge. There aren’t great examples of “custom fields” that everyone in WordPress is familiar with, because we’re talking about “custom” ones. The closest analogy you definitely know, though, is the title of your post. It’s not the content of the post itself, but it’s related to it, and it’s stored in a way similar to (but subtly different from) custom fields. What’s important is that WordPress assumes your post has only one title, that the title is unique to that post, and that it wouldn’t make sense to worry about the speed of finding posts with a title identical to the current post’s, because you (probably) use each title only once on your site.

The easiest actual example of “custom fields” is an SEO plugin. There the data (your custom meta description, your customized page title, etc.) is really stored in a custom field, and for reasons very similar to the title of a post. That data is unique to each post, not likely to be reused, and unlikely to be something you’d want to change in bulk (in the way I described with the “wapo” tag). These are the key qualities of custom fields: they’re data that’s unique to the post, not likely to be identical in another context, and stored in a way that doesn’t facilitate (but doesn’t prohibit) cross-referencing. For more about actually using custom fields in WordPress, check out my recent post.
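
For contrast with taxonomies, here is a small Python sketch of custom fields as per-post key/value data. The `post_meta` structure, the `meta_description` field name, and the sample text are assumptions for illustration, not the storage format of any particular SEO plugin.

```python
# Hypothetical per-post metadata: post ID -> {field name: value}.
post_meta = {
    1: {"meta_description": "Why the paywall went up."},
    2: {"meta_description": "Election night, hour by hour."},
}


def get_field(post_id, key, default=None):
    """Read one field for one post: the common, cheap operation."""
    return post_meta.get(post_id, {}).get(key, default)


def posts_where(key, value):
    """Cross-referencing is possible, but there is no shared record to
    follow, so this sketch compares every post's stored value in turn."""
    return sorted(pid for pid, fields in post_meta.items()
                  if fields.get(key) == value)
```

Reading a field for a known post is direct. Asking “which posts share this value?” works too, but in this model it means checking each post’s copy of the value, which is the “doesn’t facilitate (but doesn’t prohibit)” trade-off in miniature.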

An Applied Example: When to Use Which

I think it’s really valuable to understand the difference between these two metadata types, as getting the choice right upfront pays off many times over the life of a WordPress site. It’s not impossible to convert from a custom taxonomy to a custom field or the reverse, but you may need expert WordPress help (which we offer!) to really get it done. Best to avoid the problem and potential performance headache by understanding it deeply before you dive in.

To make this clear, let’s brainstorm a site. In our scenario, we have a client asking for a system for publishing their movie reviews. Think of it like a site for the late Roger Ebert. A film critic like Mr. Ebert might find value in recording relevant metadata about the movies they’re reviewing, including:

  • Poster Image
  • Genre
  • Plot summary
  • Major Actors
  • Release date
  • Overall score

Now, your job as the WordPress professional is to decide whether each of those data points should be implemented as custom fields or custom taxonomies for the site. Got your answers? Read on! (Hopefully the formatting below means you didn’t accidentally glance at my answers.)

There are a few interesting challenges in that set, but we’ll save them for last. I would certainly make both the movie’s genre and its major actors custom taxonomies. Why? Because reuse of both across movies is almost certainly desirable, and the ability to quickly find all the reviewed movies that starred “Tom Hanks” or that were “Westerns” is really likely to appeal to anyone browsing the site. As an added benefit, for hard-to-spell names our reviewer might also get some built-in autocomplete. (For extra credit, I’d probably make “Genre” a “hierarchical” taxonomy so that “Spaghetti Western” or “Space Western” could be specified, but those movies would also show up as I browsed “Western” movies. For actors, though, you’d have to try really hard to find a reason that one would be a subset of another, so you’d almost certainly want that taxonomy to be flat or “tag-like.”)

The plot summary, I think, is equally obvious: it should be a custom field. And the movie poster image, if it’s not better suited as the featured image of the post, should almost certainly be a custom field too. (Though they hide it well, featured images actually are just a custom field under the hood.) Reusing a poster image or plot summary for a similar movie doesn’t make sense; it’s just flat data that you’d almost never want to browse by, so custom fields it should pretty clearly be.

The more interesting cases are the scores and release date, as for both you can argue in either direction convincingly. Let’s consider release date first. The big question about release date is which matters more to you: exactness or ease-of-browsing.

  • If you favor exactness, I’d personally probably store the exact time of release as a Unix time value (which is just an integer and really hard to misinterpret in your handling of it) in a custom field. This has the big advantage that comparisons of numbers are easy, and selecting ranges would be pretty intuitive.
  • If you favor ease of browsing, and especially if exactness isn’t so important, you can make a case for a hierarchical taxonomy. Essentially you might make your top-level terms decades, then within them put your years, and within those put months. It has the big advantage that if you showed the whole hierarchy, you’d make it easy for someone browsing the site (and relatively easy for your server) to take a trip back to the 1980s or 1993 and relive their past. But if you wanted to let someone see exactly what day a movie came out, or allow them to see movies that came out between May 2003 and July 2004, you’d have quite a programming headache on your hands.
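
To show why a Unix timestamp in a custom field makes range questions easy, here is a hedged Python sketch. The film titles and dates are invented placeholders, and `release_date` stands in for a numeric custom field; a real site would run the equivalent comparison in its database query.

```python
from datetime import datetime, timezone


def ts(year, month, day):
    """Unix timestamp (whole seconds) for midnight UTC on the given date."""
    return int(datetime(year, month, day, tzinfo=timezone.utc).timestamp())


# Hypothetical reviews, with the release date stored as a plain integer.
release_date = {
    "Film A": ts(2003, 5, 30),
    "Film B": ts(2004, 6, 30),
    "Film C": ts(1985, 7, 3),
}


def released_between(start, end):
    """Titles released in the half-open range [start, end), oldest first."""
    hits = [(when, title) for title, when in release_date.items()
            if start <= when < end]
    return [title for when, title in sorted(hits)]


def in_decade(first_year):
    """Decade browsing is just another range with hard-coded boundaries."""
    return released_between(ts(first_year, 1, 1), ts(first_year + 10, 1, 1))


# "Between May 2003 and July 2004" becomes two integer comparisons.
may_2003_to_july_2004 = released_between(ts(2003, 5, 1), ts(2004, 8, 1))
```

The same two-comparison filter answers both the awkward “May 2003 to July 2004” question and the “take me back to the 1980s” one, which is the flexibility a decade/year/month term hierarchy struggles to match.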

In general, for my money, the answer for most date values would probably be a custom field. Even though years are relatively naturally ordered when pulled out of a taxonomy, the mess of nesting you get would pretty thoroughly talk me out of some minimal performance gain in browsing. And building the browsing features described above, like a quick zoom to the 1980s, from a custom field with hard-coded date ranges isn’t so tough.

The trade-offs on ratings are pretty much the same, but the other (really important) factor is what the critic favors. If they want the ability to declare that a movie is a 5.2 out of 10, there’s almost no argument that could convince me to store that data in a taxonomy. On the other hand, if they simply wanted a “Thumbs Up”/“Thumbs Down” system, like the one Mr. Ebert’s television show made famous, you’d get a ton of both conceptual and practical benefit from making that a taxonomy with those two terms. A system of stars (0 to 5, whole numbers only) is really on the line for me. Personally I’d probably favor a taxonomy and figure out the sorting manually, but you could definitely make an argument that the time to figure out the sort isn’t worth the effort. And if half-stars enter the picture…
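
The sorting trade-off can be sketched in a few lines of Python. Everything here (the film titles, the scores, the term names in `STAR_ORDER`) is hypothetical; the point is only that numbers sort themselves while term labels need an explicit ranking.

```python
# Hypothetical scores stored as numeric custom fields: sorting is direct.
scores = {"Film A": 5.2, "Film B": 8.0, "Film C": 3.5}
by_score = sorted(scores, key=scores.get, reverse=True)

# Hypothetical star ratings stored as taxonomy term names. Terms are just
# labels, so the ranking has to be spelled out by hand.
STAR_ORDER = ["0 stars", "1 star", "2 stars", "3 stars", "4 stars", "5 stars"]
star_terms = {"Film A": "3 stars", "Film B": "5 stars", "Film C": "1 star"}


def star_rank(title):
    """Position of a film's star term in the hand-written ordering."""
    return STAR_ORDER.index(star_terms[title])


by_stars = sorted(star_terms, key=star_rank, reverse=True)
```

With six whole-star terms the hand-written `STAR_ORDER` list is manageable. Add half-stars, or a 5.2-out-of-10 scale, and that list grows until the numeric field is the obviously simpler choice.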

What We’ve Learned

Hopefully you came through that long example with me. It’s certainly not always an easy decision which is better between custom taxonomies and custom fields. Both have strengths and weaknesses.

Custom fields are great for arbitrary data, and their flexibility makes them appealing and easy for someone new to dealing with custom data. But use them for things they don’t suit, such as data you’ll want to browse or cross-reference, and they can become a big, ugly performance bottleneck, especially as your content archive grows.

Custom taxonomies are awesome for making your data easy (and easy on your server) to browse. Anywhere you feel pretty sure that you could apply either the Categories or Tags idea to the data you’re storing, I’d use a custom taxonomy. But when you get into storing numerical data — or data that you’d ever want to sort by — in them, you can walk into a nightmare of complexity. In those cases, custom fields make a ton of sense.

Hopefully you feel a bit more empowered and knowledgeable about the tradeoffs between these two ways of storing custom post data in WordPress. If you have more questions, or want to quibble with my answers to the scenario, I’d love to hear from you in the comments. Thanks for reading!

Image credit: J_CMac, fo.ol, Articulate Mediaworks, mssarakelly


6 Comments
Larry Swanson
March 12, 2016 12:56 pm

Super-helpful. Thanks, David.

K.K.Smith
March 10, 2016 4:21 pm

Probably the best “think-through” on this that I have come across.

BV
October 23, 2015 3:04 pm

Thx for the explanation. If you want to see all the genres an actor has played in, how do you achieve this? So if you want to see all the types of genres (not the movies) Tom Hanks played?

Deniz Porsuk
June 6, 2015 4:05 am

Dear David

Thank you very much for this article

Could you please make some comments to our case.

http://stackoverflow.com/questions/30614345/perfomance-comparison-between-querying-by-taxonomy-vs-querying-by-custom-fields

Tweet Parade (no.16 Apr 2014) - Best Articles of Last Week | gonzoblog
April 19, 2014 2:03 pm

[…] Organizing Your WordPress Data: Understanding Custom Taxonomies vs Custom Fields – You can attach to your content-types — which for better worse still go by the name “posts” in much of WordPress — two importantly distinct kinds of data […]

Mark Simchock
April 15, 2014 9:32 pm

I think the best rule of thumb might be this:

Does the attribute span across the CPT and have a finite universe of values, or is it “unique” to individual rows?

For example, gender would be a taxonomy, but first name would be a custom meta field.

And for those who don’t like taxonomy (multiple) check boxes, it is possible to wire up a custom meta box that displays a taxonomy as a radio / select.