ALIGNMENT DNA SEQUENCE

September 30, 2011

Hi,

I'm a PhD student working in the field of DNA sequence analysis, but with non-biological data.

My problem is that I don't know which type of alignment to use with my data: global or local.

What should I base the choice between global and local alignment on?

If anyone needs more details about my problem, let me know.

If you need a sample of my data, I can send it.

I would appreciate any advice.

Thanks

September 30, 2011

It depends on what you want to see. Local alignments are superior at detecting local similarities and are generally better suited to sequences that are dissimilar overall but share smaller conserved regions. Global alignments, however, are better at aligning longer stretches of sequence that are similar along their whole length. Again, it depends a lot on the type of sequences you have (e.g. their overall similarity) and what you want to see.
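
To make the distinction concrete, here is a minimal sketch (not taken from any particular tool) of the two classic dynamic-programming scores: Needleman-Wunsch for global alignment and Smith-Waterman for local alignment. The function name `alignment_score` and the scoring values (match = 2, mismatch = -1, gap = -2) are assumptions chosen purely for illustration.

```python
def alignment_score(a, b, local=False, match=2, mismatch=-1, gap=-2):
    """Return the best alignment score of sequences a and b.

    local=False: Needleman-Wunsch (global), both sequences aligned end to end.
    local=True:  Smith-Waterman (local), best-scoring pair of sub-sequences.
    The scoring values are illustrative assumptions, not recommended settings.
    """
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]

    if not local:
        # Global alignment: leading gaps are penalised.
        for i in range(1, rows):
            score[i][0] = i * gap
        for j in range(1, cols):
            score[0][j] = j * gap

    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            pair = match if a[i - 1] == b[j - 1] else mismatch
            cell = max(
                score[i - 1][j - 1] + pair,  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,       # gap in b
                score[i][j - 1] + gap,       # gap in a
            )
            if local:
                # Local alignment: a negative running score restarts at 0.
                cell = max(cell, 0)
                best = max(best, cell)
            score[i][j] = cell

    return best if local else score[-1][-1]


short = "ACGT"
padded = "GGGGACGTGGGG"  # the same motif embedded in unrelated flanks
print(alignment_score(short, padded, local=True))   # 8
print(alignment_score(short, padded, local=False))  # -8
```

With these settings the embedded motif scores 8 locally, but the global score drops to -8 because the eight unmatched flanking symbols must each be paid for as gaps.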

October 1, 2011

Author

Thanks for this information.

Now, if the sequence is very long, how can I decide whether it has local similarity or overall similarity?

I would appreciate your advice.

Thanks

October 1, 2011

Well, most of the time you would have some idea how similar they are. If you really have no clue, then I would just try two common algorithms, one for each case.

October 1, 2011

Author

I'm very new to this field, so I have no idea.

I appreciate your help.

Thanks

October 2, 2011

Ok, do you have two or more sequences and more importantly, what do you want to see?

October 2, 2011

Author

Thanks,

I have two sequences, so I have to use pairwise alignment, right?

I'm not sure, but I think my problem can be solved with pairwise rather than multiple sequence alignment.

Briefly:

I am trying to find the similarity (homology) among a set of data in order to make clusters.

My dataset is from an online forum (a social network).

If you need more details or a sample of the data, let me know.

October 5, 2011

Yes, pairwise is appropriate (though not strictly necessary). I am still not sure what the sequences are supposed to look like and what kind of clusters you want to have (or what you mean by cluster, for that matter). For starters I would just go for a tool that uses, e.g., an implementation of the Smith-Waterman algorithm and take it from there. Again, since I am not quite sure what you really want, I would just hunt down some tools and play around with them.

October 5, 2011

Author

Thanks,

I was waiting for your reply, and I badly need it.

OK, you need more details about my work.

As I said earlier, my data is not biological: it is a dataset about users who are subscribers to an online forum.

The data represent the actions of users over time. I would like to find clusters of users who are alike in terms of activity (behaviour) over time.

You said that the choice of alignment type depends on the data.

So my last question was: how can I know whether my data contains local or global similarity?

Was I clear?

Many, many thanks
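
For readers following along, the clustering goal described above can be sketched independently of how the pairwise distances are obtained. The following is only an illustration under stated assumptions: the function name `single_linkage_clusters`, the threshold, and the toy distance matrix are all hypothetical, and threshold-based single linkage is just one simple way to group items.

```python
def single_linkage_clusters(dist, threshold):
    """Group items whose pairwise distance is <= threshold, transitively.

    `dist` is a symmetric matrix (list of lists) of pairwise distances,
    for example alignment-based distances between users' action sequences.
    Returns a sorted list of clusters, each a list of item indices.
    """
    n = len(dist)
    parent = list(range(n))  # union-find forest, one tree per cluster

    def find(i):
        # Follow parent links to the root, compressing the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if dist[i][j] <= threshold:
                parent[find(i)] = find(j)  # merge the two clusters

    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


# Toy distances between four hypothetical users.
toy = [
    [0, 1, 9, 9],
    [1, 0, 9, 9],
    [9, 9, 0, 2],
    [9, 9, 2, 0],
]
print(single_linkage_clusters(toy, threshold=2))  # [[0, 1], [2, 3]]
```

The choice between local and global alignment then only affects how the distance matrix is filled in, not the grouping step itself.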

Sorry, I forgot to tell you that I converted my data into a protein sequence.

I posted a thread about this topic.

My data has the range 0-1600, so I need 11 bits to represent each value in binary.

Then I convert it to a protein sequence.

Since there are 20 amino acids, I took each group of five bits and converted it to one amino acid.

Is that representation proper?
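
The exact bit-to-letter mapping is not spelled out in the thread. Note that five bits can take 32 different values, which is more than the 20 letters available, and that 11 bits do not divide evenly into groups of five. The sketch below is therefore NOT the scheme described above but one hypothetical alternative: writing each value in base 20, so that three letters cover 0-7999. The alphabet order and the function names `encode_value` and `decode_value` are assumptions.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard one-letter codes


def encode_value(value, width=3):
    """Encode an integer as `width` base-20 letters (hypothetical scheme)."""
    base = len(AMINO_ACIDS)
    if not 0 <= value < base ** width:
        raise ValueError("value out of range for this width")
    letters = []
    for _ in range(width):
        value, digit = divmod(value, base)  # peel off the lowest base-20 digit
        letters.append(AMINO_ACIDS[digit])
    return "".join(reversed(letters))       # most significant letter first


def decode_value(letters):
    """Inverse of encode_value."""
    base = len(AMINO_ACIDS)
    value = 0
    for ch in letters:
        value = value * base + AMINO_ACIDS.index(ch)
    return value


print(encode_value(1600))    # FAA
print(decode_value("FAA"))   # 1600
```

Every value maps to a fixed-width, reversible string, so no two values collide; whether alignment scores on such strings are meaningful is a separate question.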

October 5, 2011

OK, I still do not understand the purpose; however, the problem that I see here is the distance matrices. The substitutions in an amino acid sequence are related to changes at the base level (i.e. they are connected to the genetic code). So an amino acid exchange that requires only one base exchange is treated differently from one that takes two, for instance.

Since your amino acid string is based on a completely different system, the distance estimates will be off. In fact, I think what you have is a simple computational problem that I am not qualified to solve. Since the string has no biological basis, you cannot apply the same theoretical framework. From what I understand, the only reason to call it an amino acid sequence is that you use the same 20-letter code. I am going to move this to the computational science section; maybe someone else can look it over.
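
The point about substitution matrices can be illustrated with a sketch. One possible workaround, which is an assumption for illustration and not something proposed in the thread, is to compare the raw action sequences with a unit-cost edit distance (Levenshtein distance). It treats every mismatch alike and needs no biological matrix. The function name `edit_distance` and the sample action codes are hypothetical.

```python
def edit_distance(a, b):
    """Unit-cost edit (Levenshtein) distance between two sequences.

    Works on any sequences of comparable symbols, e.g. lists of integer
    action codes, so no amino acid encoding or substitution matrix is needed.
    """
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, x in enumerate(a, start=1):
        curr = [i]                  # cost of deleting the first i symbols of a
        for j, y in enumerate(b, start=1):
            substitution = prev[j - 1] + (0 if x == y else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1]


user_a = [3, 17, 17, 1600]  # hypothetical action codes over time
user_b = [3, 17, 1600]
print(edit_distance(user_a, user_b))  # 1
```

If some actions are genuinely closer to each other than others, the 0/1 cost could be replaced by a cost table designed for the data rather than one borrowed from protein evolution.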

October 8, 2011

Author

Hi,

Sir, please give me a chance to give you more details about my work; it is not as you think.

I cannot give details in an online forum. I need your email. My email is halmamory@yahoo.com

Please, I badly need your advice.

Thanks in advance

October 8, 2011

Mr. Huda, I have not worked on the DNA alignment problem, but I have experience with algorithms, if you need any help designing yours.

Anyway, you should know that overall similarity depends on local similarities, so you have to plan how your local similarities can lead you to a global optimum.

October 26, 2011

Author

I left a private message for you.

October 29, 2011

Problem: DNA Alignment

Type: Sequence Alignment -- see Wikipedia

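Complexity: polynomial for two sequences (O(mn) by dynamic programming for lengths m and n); optimal multiple sequence alignment is NP-hard under the usual sum-of-pairs scoring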

Algorithms:

- Heuristic Search

- Linear Optimization

- Genetic Programming

- Probabilistic Methods

- Dynamic Programming

- Global Optimization

You have to specify your needs: do you prefer speed over solution quality, or would you rather have a slow method that gives good results? The size of the DNA database matters too!

Based on those answers, you will be able to choose the algorithm that fits.

Edited October 29, 2011 by khaled
