I am building a script that takes the exons of a particular gene and plots them by their base pair coordinates on a 2D graph. Using data from Ensembl, I can match each exon to its respective transcript and protein domain. I also use Ensembl data to match each domain to the genomic base pairs responsible for its formation, so I can plot the protein domains above their respective exon(s).

However, my plot currently shows certain introns as being responsible for domain formation. In my particular case, gene ENSG00000111671 appears to code domain SOCS_C from an intron, block I6.1. I am taking the raw coordinates from Ensembl for both the gene and the protein, so how could this even be possible?

Here is the image. Each row is a transcript and each blue block is an exon. Exons together form a block, and the gaps between the blue blocks are introns. The brown lines above the exons are protein domains; you can see where they overlap with the introns.
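To make the overlap check concrete, here is a minimal sketch of how the intronic part of a domain's genomic span could be computed. The function name `intronic_parts` and all coordinates are hypothetical placeholders, not my actual script or real Ensembl values; intervals are assumed to be inclusive `(start, end)` pairs on the same strand and coordinate system.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # inclusive (start, end) in genomic base pairs


def intronic_parts(domain: Interval, exons: List[Interval]) -> List[Interval]:
    """Return the sub-intervals of `domain` that are not covered by any exon."""
    start, end = domain
    gaps: List[Interval] = []
    cursor = start  # first position of the domain not yet accounted for
    for ex_start, ex_end in sorted(exons):
        if ex_end < cursor:
            continue  # exon lies entirely before the uncovered region
        if ex_start > end:
            break  # exon lies entirely after the domain
        if ex_start > cursor:
            # positions between cursor and this exon fall outside every exon
            gaps.append((cursor, ex_start - 1))
        cursor = max(cursor, ex_end + 1)
    if cursor <= end:
        gaps.append((cursor, end))
    return gaps


# Hypothetical coordinates, for illustration only
exons = [(100, 200), (300, 400), (500, 600)]
print(intronic_parts((150, 350), exons))  # domain span crossing one intron
print(intronic_parts((320, 380), exons))  # domain span inside a single exon
```

A non-empty result means the plotted domain line covers positions outside every exon of that transcript, which is the situation visible in the image.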