lxml is a compact library for lightweight, effective HTML/XML parsing without resorting to regular expressions. It installs easily with pip and works with both Python 2 and 3. As an example, let’s extract the token and expires form values from the NYTimes site.

Installing lxml

# Install lxml using pip3
pip3 install lxml

# Verify that lxml appears among the installed packages
pip3 list
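
The install can also be checked from Python itself. The following is a minimal sketch, assuming lxml was installed into the same interpreter that runs the snippet; the variable name lxml_version is only illustrative.

```python
# Check that lxml is importable and report its version
from lxml import etree

# LXML_VERSION is a tuple of integers describing the installed lxml release
lxml_version = ".".join(str(part) for part in etree.LXML_VERSION)
print("lxml version:", lxml_version)
```

If the import fails with an ImportError, pip most likely installed the package for a different Python interpreter.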

Using lxml

# Import the lxml HTML parser and the requests HTTP library
import lxml.html
import requests

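# Use the requests library to fetch the page
# (the URL is left blank here; supply the address of the page to parse)
response = requests.get('')

# Build an HTML tree from the raw bytes of the response body
htmltree = lxml.html.document_fromstring(response.content)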

# Use XPath to get the token value (stays None if no input matches)
token_val = None
for input_el in htmltree.xpath("//input[@name='token']/@value"):
    token_val = input_el

# Use XPath to get the expires value (stays None if no input matches)
expires_val = None
for input_el_2 in htmltree.xpath("//input[@name='expires']/@value"):
    expires_val = input_el_2

# Print both values
print(token_val)
print(expires_val)


If all went well, the token and expires values are printed to your terminal, one per line.
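
The exact values depend on the page served at the time of the request. To try the same XPath lookups without a network call, here is a minimal sketch that parses an inline HTML string instead. The markup in SAMPLE_HTML, the field values, and the helper hidden_value are all made up for illustration and are not taken from the NYTimes page.

```python
import lxml.html

# Hypothetical form markup standing in for a real page (not the NYTimes HTML)
SAMPLE_HTML = """
<html><body>
  <form action="/login" method="post">
    <input type="hidden" name="token" value="abc123">
    <input type="hidden" name="expires" value="1700000000">
    <input type="text" name="userid">
  </form>
</body></html>
"""


def hidden_value(tree, field_name):
    """Return the value attribute of the first <input> with this name, or None."""
    # $name is an XPath variable; lxml fills it from the keyword argument
    matches = tree.xpath("//input[@name=$name]/@value", name=field_name)
    return matches[0] if matches else None


tree = lxml.html.document_fromstring(SAMPLE_HTML)
token_val = hidden_value(tree, "token")
expires_val = hidden_value(tree, "expires")
missing_val = hidden_value(tree, "userid")  # this input has no value attribute

print(token_val)    # abc123
print(expires_val)  # 1700000000
print(missing_val)  # None
```

Passing the field name as an XPath variable avoids building the expression with string formatting, and returning None for a missing field keeps the caller from hitting a NameError when the page layout changes.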


About Ali Gajani

Hi. I am Ali Gajani. I started Mr. Geek in early 2012 as a result of my growing enthusiasm and passion for technology. I love sharing my knowledge and helping out the community by creating useful, engaging and compelling content. If you want to write for Mr. Geek, just PM me on my Facebook profile.