<div dir="ltr">I would use the Aho-Corasick automaton.<div>It is widely used in bioinformatics. <a href="http://carshen.github.io/data-structures/algorithms/2014/04/07/aho-corasick-implementation-in-python.html">http://carshen.github.io/data-structures/algorithms/2014/04/07/aho-corasick-implementation-in-python.html</a></div><div><br></div></div><div class="gmail_extra"><br><div class="gmail_quote">2017-11-26 10:20 GMT+01:00 Giuseppe Costanzi <span dir="ltr">&lt;<a href="mailto:giuseppecostanzi@gmail.com" target="_blank">giuseppecostanzi@gmail.com</a>&gt;</span>:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Hi everyone,<br>

<br>

I have a sequence like this:<br>

<br>

ACTGATCGATTACGTATAGTAGAATTCTAT<wbr>CATACATATATATCGATGCGTTCAT<br>

<br>

Scanning through it, I need to find a target sequence, GAATTC:<br>

<br>

ACTGATCGATTACGTATAGTA  "GAATTC"  TATCATACATATATATCGATGCGTTCAT<br>

<br>

<br>

then split the sequence at the G, the first letter of the target sequence,<br>

and compute the lengths of the two resulting fragments, this one:<br>

<br>

ACTGATCGATTACGTATAGTAG<br>

<br>

and this one:<br>

<br>

GAATTCTATCATACATATATATCGATGCGT<wbr>TCAT<br>

<br>

This is what I came up with:<br>

<br>

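seq = "ACTGATCGATTACGTATAGTAGAATTCTATCATACATATATATCGATGCGTTCAT"<br>

target = "GAATTC"<br>

<br>

s = []  # sliding window holding the last len(target) bases<br>

for i, base in enumerate(seq):<br>

    s.append(base)<br>

    if len(s) > len(target):<br>

        s.pop(0)  # drop the oldest base so the window keeps its size<br>

    if ''.join(s) == target:<br>

        # i is the index of the last base of the match; cut right after the leading G<br>

        cut = i - len(target) + 2<br>

        print(cut)<br>

        print(len(seq) - cut)<br>

        break<br>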

<br>

It works, but I'm not really convinced by it.<br>

Do you have any suggestions on how to iterate over the sequence?<br>

<br>

Regards,<br>

beppe<br>

<br>

<br>

P.S.<br>

<br>

It's an exercise I found in P4B, Python for Biologists:<br>

<br>

<br>

"Let's start this exercise by solving the problem manually. If we look through the<br>

DNA sequence we can spot the EcoRI site at position 21. Here's the sequence with<br>

the base positions labelled above and the EcoRI motif in bold:<br>

<br>

(the part in bold would be this: GAATTCT)<br>

<br>

012345678901234567890123456789<wbr>0123456789012345678901234<br>

ACTGATCGATTACGTATAGTAGAATTCTAT<wbr>CATACATATATATCGATGCGTTCAT<br>

Since the EcoRI enzyme cuts the DNA between the G and first A, we can figure out<br>

that the first fragment will run from position 0 to position 21, and the second<br>

fragment from position 22 to the last position, 54. Therefore the lengths of the two<br>

fragments are 22 and 33."<br>
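
<br>

The manual calculation quoted above can be checked with a short sketch. It assumes Python's built-in str.find is enough for a single motif; the helper name fragment_lengths and its offset parameter are made up for illustration and come from neither the book nor the original message:<br>

```python
seq = "ACTGATCGATTACGTATAGTAGAATTCTATCATACATATATATCGATGCGTTCAT"
site = "GAATTC"  # EcoRI recognition site


def fragment_lengths(seq, site, offset=1):
    """Cut `offset` bases into the first occurrence of `site`.

    Returns (length of first fragment, length of second fragment),
    or None when the site does not occur in the sequence.
    """
    start = seq.find(site)  # index of the leading G, or -1 if absent
    if start == -1:
        return None
    cut = start + offset    # EcoRI cuts between the G and the first A
    return cut, len(seq) - cut


print(fragment_lengths(seq, site))
```

For the sequence above this prints (22, 33), which agrees with the lengths given in the quoted passage.<br>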

______________________________<wbr>_________________<br>

Python mailing list<br>

<a href="mailto:Python@lists.python.it">Python@lists.python.it</a><br>

<a href="https://lists.python.it/mailman/listinfo/python" rel="noreferrer" target="_blank">https://lists.python.it/<wbr>mailman/listinfo/python</a><br>

</blockquote></div><br></div>
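<div dir="ltr">As a rough illustration of the Aho-Corasick suggestion at the top of this reply, here is a minimal sketch written from the textbook description of the algorithm, not taken from the linked article. The function names build_automaton and find_all are made up for this example. For a single motif it is overkill, but it would let you look for several restriction sites in one pass over the sequence.</div>

```python
from collections import deque


def build_automaton(patterns):
    """Build the goto, failure and output tables for the given patterns."""
    goto = [{}]   # goto[state][char] -> next state
    fail = [0]    # fail[state] -> state to fall back to on a mismatch
    out = [[]]    # out[state] -> patterns that end at this state

    # Phase 1: insert every pattern into a trie.
    for pattern in patterns:
        state = 0
        for ch in pattern:
            if ch not in goto[state]:
                goto.append({})
                fail.append(0)
                out.append([])
                goto[state][ch] = len(goto) - 1
            state = goto[state][ch]
        out[state].append(pattern)

    # Phase 2: breadth-first pass to compute the failure links.
    queue = deque(goto[0].values())  # depth-1 states fall back to the root
    while queue:
        state = queue.popleft()
        for ch, nxt in goto[state].items():
            queue.append(nxt)
            f = fail[state]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[nxt] = goto[f].get(ch, 0)
            # Inherit the matches reachable through the failure link.
            out[nxt] = out[nxt] + out[fail[nxt]]
    return goto, fail, out


def find_all(text, patterns):
    """Return (start index, pattern) for every occurrence in text."""
    goto, fail, out = build_automaton(patterns)
    state = 0
    hits = []
    for i, ch in enumerate(text):
        while state and ch not in goto[state]:
            state = fail[state]
        state = goto[state].get(ch, 0)
        for pattern in out[state]:
            hits.append((i - len(pattern) + 1, pattern))
    return hits


seq = "ACTGATCGATTACGTATAGTAGAATTCTATCATACATATATATCGATGCGTTCAT"
start, site = find_all(seq, ["GAATTC"])[0]
cut = start + 1  # EcoRI cuts between the G and the first A
print(cut, len(seq) - cut)
```

<div dir="ltr">With the sequence from the exercise this prints 22 33. Passing more motifs in the list would report all of them in a single scan.</div>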