<html><head><meta http-equiv="Content-Type" content="text/html charset=windows-1252"></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space; "><div><div>On Jul 16, 2013, at 12:09 PM, Nicola Larosa &lt;<a href="mailto:nico@tekNico.net">nico@tekNico.net</a>&gt; wrote:</div><br class="Apple-interchange-newline"><blockquote type="cite">Here is the documentation for text_content:<br><br>&lt;<a href="http://lxml.de/lxmlhtml.html#html-element-methods">http://lxml.de/lxmlhtml.html#html-element-methods</a>&gt;<br><br>In the same place you will find the docs for:<br><br>- find_class (if you know the CSS class of the elements you are<br> interested in);<br>- get_element_by_id (if you know the id of the element you are interested in);<br>- cssselect (to use CSS selectors, which are very powerful);<br>- a brief mention of xpath, documented elsewhere<br> &lt;<a href="http://lxml.de/xpathxslt.html#xpath">http://lxml.de/xpathxslt.html#xpath</a>&gt;, which is also very powerful.<br><br>The example uses find_class &lt;<a href="http://lxml.de/lxmlhtml.html#examples">http://lxml.de/lxmlhtml.html#examples</a>&gt;.<br></blockquote></div><br><div><br></div><div>Alternatively, for this kind of web scraping I have always used</div><div>BeautifulSoup (<a href="http://www.crummy.com/software/BeautifulSoup/">http://www.crummy.com/software/BeautifulSoup/</a>).</div><div><br></div><div>Btw:</div><div>[…]</div><div><span style="font-family: Times; background-color: rgb(255, 255, 255); ">Beautiful Soup sits on top of popular Python parsers like </span><a href="http://lxml.de/" style="font-family: Times; background-color: rgb(255, 255, 255); ">lxml</a><span style="font-family: Times; background-color: rgb(255, 255, 255); "> and </span><a href="http://code.google.com/p/html5lib/" style="font-family: Times; background-color: rgb(255, 255, 255); ">html5lib</a><span style="font-family: Times; background-color: rgb(255, 255, 255); ">, allowing you to try out different parsing strategies or trade speed for 
flexibility.</span></div><div><font style="background-color: transparent;">[…]</font></div><div><font style="background-color: transparent;"><br></font></div><div><font style="background-color: transparent;">--</font></div><div><font style="background-color: transparent;">Valerio</font></div></body></html>