New Discovery Human Health Published: August 9, 2018

Junk DNA and Cancer: Why the Trash in Your Cells is Very Important


Inside every cell in your body is DNA. Some of this DNA tells the body how to make proteins. However, a lot of this DNA does not make protein and some people call it “junk DNA.” A recent research study showed that variation (differences between individuals) in one particular piece of junk DNA might increase the risk of cancer. The scientists looked at a kind of junk DNA called MSR1 repeats. They showed that MSR1 repeats stuck on the end of cancer-causing genes, forming “tails,” and that shorter tails increased the risk of breast cancer and prostate cancer. This is an exciting finding, because it might allow better diagnosis of and treatments for cancer.

Inside every cell of our bodies is a long, thin molecule called DNA. DNA is your own personal instruction manual and it tells your body everything it needs to know! DNA determines your eye color, skin tone, how tall you are, and even whether your muscles are better at sprinting or running a marathon. Just like a real instruction manual, the instructions in DNA are written in a series of letters. In DNA, there are just four letters–A, T, G, and C. These letters are combined to spell out the instruction for proteins. Proteins are the building blocks of cells. Your brain, heart, and all other organs are made of lots of different proteins. The DNA letters needed to make one protein are referred to as a “gene.” Can you guess how many genes a human being has?

Over 20,000!

That is right, in every cell of your body, there are over 20,000 genes–each spelling out the instructions for a different protein! The genes are lined up along structures called chromosomes. Chromosomes are huge molecules of DNA that have been coiled up really tightly to fit into the cell. Every human cell has 23 pairs of chromosomes. You can see how DNA, genes, chromosomes, and cells relate to each other in Figure 1.

Figure 1 - You can imagine each cell as a library.
  • In the library there are 23 pairs of bookcases–and in the cell, there are 23 pairs of chromosomes. On the shelves of the bookcases are books–each book is a gene. The library has two copies of every book, as the bookcases are in pairs, remember! Inside the book are the letters A, T, C, and G in many combinations, which gives the instruction on how to make a single protein.

The genes are a secret code for proteins, so they are sometimes called “coding DNA.” However, between the genes, there are lots of other DNA letters that do not produce proteins. This is called “non-coding DNA,” because it is not part of the secret protein code. In the past, scientists thought that genes were the only important part of DNA. They called the non-coding bits “junk DNA,” because they thought it was trash! Some of the junk DNA is very repetitive, repeating the same letter sequence again and again–we call this repeat DNA. Yes, I know–scientists are not very imaginative! Take a look at Figure 2 to see how the junk DNA fits around the genes.

Figure 2 - Each chromosome (bookshelf) has lots of genes (books).
  • Each book contains the secret code for a protein. But the books (genes) are not all next to each other: there are loose sheets of paper in between the books. Sometimes the loose sheets of paper are in the back of the book–like an extra appendix. The sheets of paper contain DNA letters, but they are not part of the secret protein code. The sheets of paper are the “junk DNA.” Some of the words on the loose sheets of paper are very repetitive, for example, just saying CAT over and over again! When a sequence of letters is repeated again and again in the genome, we call this “repeat DNA.”
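If you like computers, here is a small sketch of how a program could spot repeat DNA like the "CAT over and over" example. It is only an illustration: the function name `longest_repeat_run` and the example sequence are made up for this sketch, and real MSR1 repeats are longer and messier than a three-letter word.

```python
def longest_repeat_run(sequence: str, motif: str) -> int:
    """Return the largest number of back-to-back copies of `motif` in `sequence`."""
    if not motif:
        raise ValueError("motif must contain at least one letter")
    best = 0
    # Try every starting position and count how many copies follow in a row.
    for start in range(len(sequence)):
        run = 0
        position = start
        while sequence.startswith(motif, position):
            run += 1
            position += len(motif)
        best = max(best, run)
    return best


# Hypothetical stretch of junk DNA: two G letters, CAT three times, two T letters.
example = "GG" + "CAT" * 3 + "TT"
print(longest_repeat_run(example, "CAT"))
```

Checking every starting position is slow for very long sequences, but it keeps the idea easy to follow: the longer the run of copies, the longer the repeat.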

DNA Variation

Have you ever used a thesaurus? It is a special type of dictionary that tells us words that have the same, or similar, meaning as each other. For example, you could look up “big” in the thesaurus and it might list “large, massive, enormous.” I think you would agree that the following sentences are all correct and have the same meaning–even though they use a slightly different word:

The cat sat on a dirty mat.

The cat sat on a filthy mat.

The cat sat on a muddy mat.

The same thing can happen in DNA. Do you remember that the sequence of letters that tell the body how to make one protein is called a gene? Imagine you are looking at the letters of a gene in lots of different people. The letters would mostly be the same in every person, but occasionally a different letter would be used, just like using an alternative word from the thesaurus! For example, if you looked at an eye color gene, there is one version for blue eyes, one for green eyes, one for brown eyes, and one for gray eyes. The letters might be slightly different, but they are all correct versions of the gene. We call these small, normal differences “natural variation.”
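Here is a rough sketch of how a computer could compare the same gene in two people, letter by letter, to find natural variation. The two short sequences and the function name `find_differences` are invented for this example, and real genes are thousands of letters long.

```python
def find_differences(version_a: str, version_b: str) -> list[tuple[int, str, str]]:
    """List the positions (counting from 1) where two versions use a different letter."""
    if len(version_a) != len(version_b):
        raise ValueError("this simple sketch only compares versions of equal length")
    differences = []
    for position, (letter_a, letter_b) in enumerate(zip(version_a, version_b), start=1):
        if letter_a != letter_b:
            differences.append((position, letter_a, letter_b))
    return differences


# Hypothetical versions of the same short stretch of a gene in two people.
person_1 = "ATGGCATTC"
person_2 = "ATGGCGTTC"
print(find_differences(person_1, person_2))
```

Most letters match, and the few that differ are the natural variation, just like swapping "dirty" for "filthy" in the sentence about the cat.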

Junk DNA and Cancer

Junk DNA can have natural variation, too. Recently, Dr. Anna Rose and her colleagues showed that natural variation in junk DNA can increase your risk of cancer [1].

Cancer is a disease in which some cells in your body become out of control. They divide too fast and cause a dangerous lump, called a tumor. Cancer is extremely common–you might know someone who has had cancer or have heard stories of cancer patients in the news. Cancer can affect different parts of the body. Breast cancer usually affects women, and about one in eight women will have breast cancer at some point in their life [2]. Prostate cancer affects men and is just as common as breast cancer [2]. So, how does natural variation in junk DNA increase your risk of these cancers?

The researchers looked at one specific type of junk DNA, called MSR1 repeats. They found that clusters of the MSR1 repeats were often found very close to genes. They found one of these MSR1 clusters very interesting, because this cluster of junk DNA actually was stuck on the end of a known cancer-causing gene. If you look back at Figure 2, you can see that the loose sheets of paper (junk DNA) are found between the books (genes) or tucked at the end of the book like an appendix. In this case, the loose sheets of paper were in the back pages of the book! You could think of the MSR1 repeats as being a tail for the cancer-causing gene. The scientists wondered whether the MSR1 tail was important.

MSR1 Repeats Show a Lot of Natural Variation

First, researchers looked at the MSR1 tail in lots of different people, to check for natural variation in the length. And they found loads of it! In people from the UK and Australia, they saw that different people had everything from very short MSR1 tails to very long tails (Figure 3).

Figure 3
  • The scientists found that MSR1 repeats (blue circles) formed a tail at the end of the cancer-causing gene (like the appendix of loose sheets of paper in the back cover of a book in Figure 2). They looked at the MSR1 tails in a big group of people from the UK and Australia and found that the length of the tails showed natural variation. Some people had very short tails, while some had very long tails–and others were somewhere in between!

Remember–the chromosomes are in pairs, so each person has two of every gene. That means every person has two of the cancer-causing genes, and two MSR1 tails! So, on one chromosome, there might be a short tail–but on the other chromosome, there might be a long tail. On the other hand, there might be a short tail on both chromosomes, or a long tail on both chromosomes.
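The possible combinations can be listed with a few lines of code. This is a simplified sketch: it assumes every tail is either "short" or "long", even though the real tails come in many lengths, and the names `TAIL_TYPES`, `tail_pairs`, and `count_short_tails` are made up for illustration.

```python
from itertools import combinations_with_replacement

# Simplifying assumption: only two tail lengths exist.
TAIL_TYPES = ("short", "long")


def tail_pairs() -> list[tuple[str, str]]:
    """Every pair of tails a person could carry, one tail per chromosome."""
    return list(combinations_with_replacement(TAIL_TYPES, 2))


def count_short_tails(pair: tuple[str, str]) -> int:
    """How many of the two tails in the pair are short (0, 1, or 2)."""
    return sum(1 for tail in pair if tail == "short")


for pair in tail_pairs():
    print(pair, "has", count_short_tails(pair), "short tail(s)")
```

The order of the two chromosomes does not matter here, so "short and long" counts as the same pair as "long and short". That leaves three possibilities: two short tails, one of each, or two long tails.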

The scientists knew that the tail of a gene is often important in controlling how much or how little protein is produced from that gene. You can imagine that a gene has a control switch–when the gene is “off,” no protein is produced from the gene. When the gene is “on,” protein is produced. Or–more accurately–the gene is controlled like a light on a dimmer switch. The gene is not simply on or off, but can be off, low light, medium light, or bright light! The scientists thought that maybe the MSR1 tail was a dimmer switch for the cancer-causing gene. They did a complicated experiment that showed that the short tail produced a lot more protein than the long tail. So, they had shown that the MSR1 tail was a dimmer switch–and that the long tail was the low-light setting, but the short tail was the bright-light setting (Figure 4).

Figure 4 - The MSR1 tail acts like a dimmer switch for the cancer-causing gene.
  • The short tail is the bright-light setting on the switch and causes a lot of protein to be produced. Conversely, the long tail is the dim-light setting–meaning that not much protein is produced from the gene.

MSR1 Repeats in Breast Cancer and Prostate Cancer

Next, the researchers thought about what this might mean for cancer. Other scientists had already found that breast cancer and prostate cancer tumors had high levels of the protein produced by the cancer-causing gene [3]. Dr. Rose and her colleagues had seen that the short tail was the bright-light switch and produced high levels of the cancer-causing protein (take another look at Figure 4 to remind yourself of this, if you need to). So, they thought that if a person has the short-length tail in the gene, that person might be at risk of breast cancer and prostate cancer.

First, they investigated breast cancer. They looked at a group of women from the UK who had breast cancer, and the same number of women who did not have breast cancer. They measured the length of the MSR1 tails that the women had on the cancer-causing gene on each of their chromosomes (remember–everyone has two of each chromosome). They found that the women with breast cancer were much more likely to have short tails of MSR1. In fact, they used maths to show that if a person has a short tail on both chromosomes, that person is five times more likely to get breast cancer at a young age. Even a short tail on just one of the chromosomes makes a person almost two times more likely to get breast cancer.

Next, they investigated prostate cancer. This time, they looked at a group of men from Australia with prostate cancer, and the same number of men without prostate cancer. Again, they found that the short-tail MSR1 put men at risk of prostate cancer. They calculated that a short tail on both chromosomes made a man 1.5 times more likely to get prostate cancer.
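How do scientists get from two groups of people to a number like "1.5 times more likely"? One common tool for comparing people with a disease to people without it is the odds ratio. The sketch below shows that calculation as an assumption about the general approach, not as the exact method used in the study. The counts are invented purely for illustration and are not the study's data.

```python
def odds_ratio(cases_with: int, cases_without: int,
               controls_with: int, controls_without: int) -> float:
    """Odds of carrying the feature among people with the disease,
    divided by the odds among people without the disease."""
    if cases_without <= 0 or controls_with <= 0:
        raise ValueError("cases_without and controls_with must be greater than zero")
    return (cases_with * controls_without) / (cases_without * controls_with)


# Invented counts (NOT from the study): 100 people with the disease, 100 without.
result = odds_ratio(
    cases_with=50,        # have the disease and carry short tails
    cases_without=50,     # have the disease, no short tails
    controls_with=25,     # no disease, carry short tails
    controls_without=75,  # no disease, no short tails
)
print(result)
```

An odds ratio of 1 would mean short tails are equally common in both groups. The further the number rises above 1, the more strongly short tails are linked with the disease.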

What Next?

It is pretty cool to know how the DNA in our cells is controlled. It was very exciting for the scientists to discover that the MSR1 repeat acted like a dimmer switch. Understanding how genes are controlled is an important part of science nowadays. But can we use this to help people? Probably!

Your DNA is largely the same from the day you are born until the day you die. This means that a scientist could test people’s blood to find out how long their MSR1 tails are when they are young. The scientist would then know which people have the short tails on their chromosomes, which would tell the scientist which people are at higher risk of breast cancer or prostate cancer. This information will help doctors monitor these people more carefully–and hopefully detect any cancer very early. This means the people who are at risk have a much better chance of being cured of cancer.

However, we also need to think about the ethics of any new genetic test–take a look in Box 1 to see more about medical ethics and whether or not Dr. Rose would take the test herself!

Box 1. What has ethics got to do with genetics?

Medical ethics is a type of philosophy that looks at the morality of scientific experiments—simply, whether it is right or wrong to carry out the research. Medical ethics is particularly important in medical science, because we are often experimenting on human beings, or samples from humans (like DNA samples). Before a scientist runs a research project, he or she must get permission from a group of specialist people called the “Ethics Board,” who consider whether the study is ethically right.

In this research project, I used DNA samples from many people—but not my own. It would not be ethical to use my own DNA sample. This is crucial because, when doing research, we could learn something completely unexpected. What would I do if I discovered by accident that I had a gene mutation for an incurable, serious disease? These are the sort of important questions that medical ethics considers.

However, I would choose to have the test for junk DNA tail length. This is because, although cancer is a very serious disease, there is a cure for it. If I had the test and found out I were at high risk, I would be able to be more prepared for the disease. I would get screened more regularly, and then if I got the disease I would be able to get treatment earlier. However, if there were a genetic test for a different disease that did not have a treatment, I would not want to have that test. For me, it would create more worry for no benefit. What would you do in each situation?

It might also be possible to make new treatments for cancer. This research showed us that MSR1 repeats are important in cancer. So, perhaps drug researchers will be able to make a drug that targets the MSR1 repeats. This could be a new type of chemotherapy–a cancer-fighting drug.

Understanding the genetic changes that increase the risk of getting cancer is really important in continuing to fight cancer. Hopefully, this new finding will allow scientists and doctors to detect cancer sooner and make new-and-improved treatments. And all of this is from so-called “junk DNA”!

Not so rubbish after all, eh?


DNA: The long, thin molecule that carries the human genetic code, written in four letters: A, T, G, and C.

Protein: The building blocks of every cell in your body.

Gene: The set of DNA letters that spell the instruction for making one protein.

Chromosome: A massive string of DNA that is inside every cell of the body.

Non-coding DNA: The DNA letters that are not part of a gene, so do not spell out the instructions for a protein.

Genome: The name for the complete set of all of the DNA on all of the chromosomes.

Natural variation: Tiny differences between the genetic code of different people.

Cancer: A disease where some cells of the body are out of control and grow to form a lump, or tumor.

Chemotherapy: A drug that fights cancer cells.

Conflict of Interest Statement

The author declares that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Original Source Article

Rose, A. M., Krishan, A., Chakarova, C. F., Moya, L., Chambers, S. K., Hollands, M., et al. 2018. MSR1 repeats modulate gene expression and affect risk of breast and prostate cancer. Ann. Oncol. 29(5):1292–1303. doi:10.1093/annonc/mdy082


[1] Rose, A. M., Krishan, A., Chakarova, C. F., Moya, L., Chambers, S. K., Hollands, M., et al. 2018. MSR1 repeats modulate gene expression and affect risk of breast and prostate cancer. Ann. Oncol. 29(5):1292–1303. doi:10.1093/annonc/mdy082

[2] Cancer Research UK Website Statistics. Available at: (Accessed: March 1, 2018).

[3] Kontos, C. K., and Scorilas, A. 2012. Kallikrein-related peptidases (KLKs): a gene family of novel cancer biomarkers. Clin. Chem. Lab. Med. 50(11):1877–91. doi:10.1515/cclm-2012-0247