The difference between homeobox and Hox genes

Update: This post is now the top result on Google for variations on “homeobox vs hox”. With “homeobox hox” alone it’s only beaten by the Wikipedia page for Hox genes. If I had known this would become such a popular post I would have put in more effort. It’s just a quick rant. The popularity shows just how much of a problem this is in education, though.

This is a big pet peeve. Let’s get straight to business: the terms “homeobox” and “Hox” are not interchangeable. They do mean different things. I’m correct in saying that Amphioxus (Branchiostoma lanceolatum) has 15 Hox genes. I’m also correct in pointing out that it has over 130 homeobox genes.

Gene names can be very confusing and difficult to remember, so there are many abbreviations in biology. For example, the gene insulin-like growth factor 1 is abbreviated to Igf1. Does that make it easier to remember? Who knows. But I believe the use of abbreviations is partly responsible for the incredible confusion over homeobox and Hox genes. And I do mean incredible. It’s very obviously a confusing topic for students, or anyone new to evo-devo, developmental genetics, or gene regulation… but it’s so much worse than that. Professional publications make the mistake, academics make the mistake, and they do it often. I think the reason it keeps happening is that the word “Hox” looks like a shortened “Homeobox”. All over the internet you will see the terms used interchangeably, sometimes with the apparently shortened version in brackets: “Homeobox (Hox)”. This otherwise decent glossary at Epigenesys manages to dump the terms homeotic, homeobox, and Hox into one single paragraph and glossary entry, which is of little help to a confused student seeking clarity. The first Google result for “homeodomain” (ignoring Wikipedia) is R&D Systems saying, “The DNA sequence that encodes the homeodomain is called the ‘homeobox’ and homeobox-containing genes are known as ‘hox genes’.” This is wrong. A homeobox-containing gene is not necessarily a Hox gene. So let’s clear this up, and I’ll keep it quick.

First, let’s go over the facts, and the answer, before we discuss why these confusing names were chosen. Scientists discovered that some genes contain a very conserved region of DNA we now call the homeobox. When I say very conserved, I mean it. You have homeobox genes, the birds outside do, the grass outside does… even yeast does. The origin of homeobox genes is ancient, definitely pre-dating the origin of animals. This 180-base-pair homeobox codes for a 60-residue chain known as the homeobox domain (or homeodomain). So the region of the gene is the homeobox, and the region of the protein is the homeodomain. The explanation for why it is so conserved across organisms, through hundreds of millions of years of evolution, is that its function constrains its evolution. The homeodomain binds DNA (or RNA), allowing a protein with a homeodomain to act in gene regulation. For example, these proteins can be used to turn genes on and off. It’s an invention of evolution that’s persisted through the origin of the fungi, plants, and us animals, and the homeobox itself hasn’t changed much at all. So there’s your definition of a homeobox gene. It isn’t a specific gene. It’s a huge and ancient group of genes that all contain the homeobox, a region of DNA that codes for a domain which can bind to DNA.
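The 180-to-60 relationship is just codon arithmetic: three bases code for one amino acid residue. Here is a minimal Python sketch of that sum. The function name and constants are my own, chosen for illustration, and the numbers are the typical figures quoted above rather than a rule that every homeobox obeys exactly.

```python
# Typical homeobox length quoted in this post, in base pairs.
HOMEOBOX_LENGTH_BP = 180
# Three bases (one codon) code for one amino acid residue.
CODON_LENGTH_BP = 3


def residues_encoded(length_bp: int, codon_bp: int = CODON_LENGTH_BP) -> int:
    """Return how many residues a coding stretch of length_bp encodes.

    Illustrative helper only: it assumes the stretch is read in frame
    and contains no stop codon.
    """
    if length_bp % codon_bp != 0:
        raise ValueError("length is not a whole number of codons")
    return length_bp // codon_bp


homeodomain_length = residues_encoded(HOMEOBOX_LENGTH_BP)
print(homeodomain_length)  # 60
```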

Every Hox gene is a homeobox gene, but not every homeobox gene is a Hox gene. The homeobox genes have diversified so much through evolutionary history that there are now distinct classes of them. The most famous is definitely the family of Hox genes. This is also where the terms come from. Scientists first found the homeobox because they were studying animals that had mutated Hox genes. These mutants often had body parts in the wrong place, and were described as “homeotic mutants”. When they identified the genes responsible for these mutant phenotypes, they discovered that the genes all shared a common motif, so they named it the homeobox. This is one of the most incredible discoveries in biology, as they quickly realised that the homeobox is found in genes from humans, flies, jellyfish, daffodils, yeast, and so on. But the actual genes they had discovered were a distinct group of homeobox genes, which we now call the Hox genes. They definitely are homeobox genes, and they regulate other genes.
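If it helps to see the relationship as a subset rather than a synonym, here is a small Python sketch. The gene symbols are examples I have picked for illustration (a few human Hox genes and a few human homeobox genes from other families), not a complete list, and the variable and function names are my own.

```python
# A few human Hox genes (illustrative sample, not the full set).
HOX_GENES = frozenset({"HOXA1", "HOXB4", "HOXD13"})

# A few human homeobox genes that are NOT Hox genes (illustrative sample).
OTHER_HOMEOBOX_GENES = frozenset({"PAX6", "CDX2", "NKX2-5"})

# Homeobox genes are the larger group: Hox genes plus everything else.
HOMEOBOX_GENES = HOX_GENES | OTHER_HOMEOBOX_GENES


def is_homeobox_gene(symbol: str) -> bool:
    """True if the symbol is in our toy list of homeobox genes."""
    return symbol in HOMEOBOX_GENES


def is_hox_gene(symbol: str) -> bool:
    """True if the symbol is in our toy list of Hox genes."""
    return symbol in HOX_GENES


# Every Hox gene is a homeobox gene...
every_hox_is_homeobox = HOX_GENES <= HOMEOBOX_GENES
# ...but not every homeobox gene is a Hox gene.
every_homeobox_is_hox = HOMEOBOX_GENES <= HOX_GENES

print(every_hox_is_homeobox)  # True
print(every_homeobox_is_hox)  # False
print(is_homeobox_gene("PAX6"), is_hox_gene("PAX6"))  # True False
```

The subset test only goes one way, which is the whole point: swapping the two names is like swapping the two sets.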

Think about the confusion here. Hox genes are a distinct family of homeobox genes. Scientists discovered the homeobox motif by investigating which genes caused homeotic mutations. What they had found were the Hox genes, so calling Hox genes homeotic is fine. But they didn’t understand at the time that the homeobox motif is found in many genes that aren’t Hox genes. Many homeobox genes have absolutely nothing to do with body parts growing in the right or wrong places. But when they named the homeobox, they only knew of the Hox genes they were discovering via the homeotic mutants. This is where almost all the confusion stems from. Despite the name, most homeobox genes don’t produce homeotic mutants when mutated. The Hox genes, a specific family of homeobox genes, are great examples of genes that can.

In us bilaterian animals, one of the main roles of the Hox genes is to specify identity along the anteroposterior (head-to-tail) axis of the body. It’s a complicated system, but we’ll keep it simple. The Hox genes play a role in determining which body parts grow where on the body, so by messing with them you can make limbs grow in the wrong places. But there are plenty of other non-Hox homeobox genes: entirely different families with entirely different roles. There’s also still some uncertainty over the precise role of Hox genes in non-bilaterian animals. The Hox genes do appear to be unique to animals. You don’t find them in plants or fungi. Those groups have homeobox genes, but not the Hox genes, which appear to have arisen very early in animal evolution (there is evidence that the ancestors of sponges had Hox genes too, and that sponges have since lost them).

We know so much about homeobox genes, especially the Hox cluster, that we could discuss it all day. The evolution of the Hox, ParaHox, and NK clusters is quite fascinating, as are the roles of these gene families in a developing animal. I’ll save these for future entries. Today’s point is mostly just an early-morning rant. Hox genes are homeobox genes because they contain the homeobox, but the homeobox genes include Hox genes, ParaHox genes, and many other families. The terms are not interchangeable. It’s such an easy mistake to make that it appears in books, academic websites, and helpful videos on YouTube. Just keep it in mind and focus on what exactly is being discussed. It’s not wrong to describe a mobile phone as technology, but you can’t go around describing all technology as mobile phones. It makes no sense to say, “the electron microscope is a wonderful mobile phone”. Homeobox and Hox genes work the same way. You can describe a Hox gene as a homeobox gene because that’s exactly what it is, but you can’t call every homeobox gene a Hox gene.