A handy companion to handle URLs in Python

Kevin Tewouda
Apr 10, 2023


furl at your rescue


If you have ever manipulated URLs in Python with the standard urllib.parse module (urlparse in Python 2), you may have felt frustrated at having to juggle several functions to get the job done. furl gives you an intuitive and uniform API instead. I will introduce it in this blog post.

Installation

To install it, you will need Python 2.7 or higher (yes, this is one of the rare libraries that still support Python 2!). Then you can use pip or a modern tool like poetry to install it.

$ pip install furl

# or with poetry
$ poetry add furl

Usage

Here is a basic usage example.

from furl import furl

f = furl('https://username:password@example.com/some/path/?a=b#fragment')
print(f.scheme) # https
print(f.username) # username
print(f.password) # password
print(f.netloc) # username:password@example.com
print(f.host) # example.com
print(f.origin) # https://example.com
print(f.path) # /some/path/
print(f.query) # a=b
print(f.fragment) # fragment

Here we get the same kind of information about the URL that the standard urllib.parse module provides. What is really cool is that we can easily change parts of the URL, and the parameters are encoded automatically, without having to juggle functions like quote.
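For comparison, here is a rough sketch of what adding a query parameter looks like with the standard library alone. The helper name add_query_params is made up for this illustration; it is not part of furl or of urllib.parse.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def add_query_params(url, params):
    """Return `url` with the items of `params` appended to its query string.

    Hypothetical helper written for this comparison only.
    """
    parts = urlsplit(url)
    # Decode the existing query string into a list of (key, value) pairs.
    pairs = parse_qsl(parts.query, keep_blank_values=True)
    pairs.extend(params.items())
    # Re-encode the pairs and rebuild the URL around the new query string.
    return urlunsplit(parts._replace(query=urlencode(pairs)))


new_url = add_query_params('https://example.com/some/path/?a=b#fragment', {'one': 'two'})
print(new_url)  # https://example.com/some/path/?a=b&one=two#fragment
```

Four different functions are needed for a single change, which is the kind of juggling furl hides behind one object.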

Query

Here you can see how to add/remove query parameters.

from furl import furl

f = furl('https://example.com')
f.query.add({'one': 'two', 'hello': 'world'})
print(f.url)
# https://example.com?one=two&hello=world
f = f.remove(['one'])
print(f.url)
# https://example.com?hello=world

There is another way to see or handle query parameters using the args property.

f = furl('https://example.com?hello=world')
print(f.args) # {'hello': 'world'}

f.args['foo'] = 'bar'
print(f.args) # {'hello': 'world', 'foo': 'bar'}

del f.args['hello']
print(f.args) # {'foo': 'bar'}

furl also handles encoding seamlessly.

f = furl('https://example.com')
f.query.add({'param with space': 'hehe', 'an emoji!': '☺'})
print(f.url)
# https://example.com?param+with+space=hehe&an+emoji%21=%E2%98%BA
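If you are curious about what this encoding involves, the sketch below runs the same two parameters through urllib.parse.urlencode from the standard library. It only illustrates the encoding rules; it is not a claim about how furl does it internally.

```python
from urllib.parse import urlencode

# Spaces become '+', '!' becomes %21 and the emoji is UTF-8 percent-encoded.
encoded = urlencode({'param with space': 'hehe', 'an emoji!': '☺'})
print(encoded)  # param+with+space=hehe&an+emoji%21=%E2%98%BA
```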

Path

We can access the different segments of a path.

from furl import furl

f = furl('https://www.google.com/a/large ish/path')
print(f.path)
# /a/large%20ish/path

print(f.path.segments)
# ['a', 'large ish', 'path']
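To see what furl saves us here, this is a minimal standard-library sketch that produces the same encoded path and decoded segments. It is a simplification: it drops empty segments, so it would not preserve a trailing slash.

```python
from urllib.parse import quote, unquote, urlsplit

# urlsplit leaves the space in the path untouched.
raw_path = urlsplit('https://www.google.com/a/large ish/path').path

# quote() percent-encodes the space but keeps the slashes.
encoded_path = quote(raw_path)
print(encoded_path)  # /a/large%20ish/path

# Split on '/' and decode each non-empty segment.
segments = [unquote(segment) for segment in encoded_path.split('/') if segment]
print(segments)  # ['a', 'large ish', 'path']
```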

Changing the path is easy.

f = furl('https://example.com/a/large ish/path')
f.path.segments = ['a', 'new', 'path']
print(f.path) # /a/new/path

f.path = 'or/this/way'
print(f.path) # /or/this/way
print(f.path.segments) # ['or', 'this', 'way']

Note that setting the path attribute directly may cause you issues with a static code analyzer. This is because there is no property setter defined to handle the path attribute. Instead, it is handled with the __setattr__ method. I guess it is because the author wanted to support the slash operator to modify the path as follows:

f = furl('https://example.com')
f.path /= 'a'
print(f.path) # /a
f.path = f.path / 'new' / 'path'
print(f.path) # /a/new/path

So if you don’t want to deal with this issue, just use the first method and assign to the segments property.

We can tell whether a path ends with a slash by looking at the isdir and isfile properties.

f = furl('https://example.com/is/dir/')
print(f.path.isdir) # True
print(f.path.isfile) # False
f = furl('https://example.com/is/file')
print(f.path.isdir) # False
print(f.path.isfile) # True

We can normalize a path containing more slashes than needed.

f = furl('https://example.com////a/./b/lolsup/../c/')
f.path.normalize()
print(f.url) # https://example.com/a/b/c/
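To make clear what this normalization does, here is a small sketch built on posixpath.normpath from the standard library. The function normalize_path is a hypothetical helper written for this post. I am assuming furl follows similar rules, but this is not furl's own code.

```python
import posixpath


def normalize_path(path):
    """Collapse '.', '..' and repeated slashes, keeping a trailing slash."""
    if not path:
        return path
    normalized = posixpath.normpath(path)
    # normpath keeps exactly two leading slashes (a POSIX special case),
    # which makes no sense in a URL path, so reduce them to one.
    if normalized.startswith('//'):
        normalized = '/' + normalized.lstrip('/')
    # normpath drops the trailing slash, so put it back if the input had one.
    if path.endswith('/') and not normalized.endswith('/'):
        normalized += '/'
    return normalized


print(normalize_path('////a/./b/lolsup/../c/'))  # /a/b/c/
```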

Fragment

It is worth mentioning that fragments can have a path and a query. Let’s see the following examples.

f = furl('https://example.com')
print(f.fragment) # prints an empty line, the fragment is empty

f.fragment.path = 'hell'
print(f.fragment) # hell
print(f.url) # https://example.com#hell

f.fragment.path.segments.append('foo')
print(f.fragment) # hell/foo

f.fragment.query = 'one=two&hello=world'
print(f.fragment) # hell/foo?one=two&hello=world

del f.fragment.args['one']
f.fragment.args['fruit'] = 'apple'
print(f.fragment) # hell/foo?hello=world&fruit=apple
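The standard library, by contrast, hands the fragment back as one opaque string. The sketch below shows the manual splitting you would need to get at its path-like and query-like parts; the variable names are only illustrative.

```python
from urllib.parse import parse_qsl, urlsplit

fragment = urlsplit('https://example.com#hell/foo?one=two&hello=world').fragment
print(fragment)  # hell/foo?one=two&hello=world

# Split the fragment into a path-like part and a query-like part by hand.
frag_path, _, frag_query = fragment.partition('?')
frag_segments = frag_path.split('/')
frag_params = parse_qsl(frag_query)
print(frag_segments)  # ['hell', 'foo']
print(frag_params)  # [('one', 'two'), ('hello', 'world')]
```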

Miscellaneous

Of course, furl understands other types of URLs.

f = furl('file:///c:/Windows')
print(f.scheme) # file
print(f.origin) # file://
print(f.path) # /c:/Windows

We can set multiple parts of a URL with the set method.

f = furl('https://example.com')

# note that international domain names are handled
f.set(host='ドメイン.テスト', path='джк', query='☃=☺')
print(f.url)
# https://xn--eckwd4c7c.xn--zckzah/%D0%B4%D0%B6%D0%BA?%E2%98%83=%E2%98%BA
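Each part of that URL follows a different encoding rule. As a rough illustration, the sketch below reproduces the three pieces with the standard library: Python's built-in idna codec for the host and quote for the path and the query. Whether furl uses these exact functions internally is an assumption I have not checked.

```python
from urllib.parse import quote

# The host is converted to its ASCII (punycode) form, label by label.
host = 'ドメイン.テスト'.encode('idna').decode('ascii')
print(host)  # xn--eckwd4c7c.xn--zckzah

# The path and the query are percent-encoded as UTF-8 bytes.
path = quote('джк')
query = quote('☃') + '=' + quote('☺')
print(path)  # %D0%B4%D0%B6%D0%BA
print(query)  # %E2%98%83=%E2%98%BA
```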

We can copy a furl object if we don’t want to alter the original one.

f1 = furl('https://example.com')
f2 = f1.copy().set(args={'one': 'two'}, path='/path')
print(f1.url) # https://example.com
print(f2.url) # https://example.com/path?one=two

We can join URLs. The join method combines the furl object's URL with the provided relative or absolute URL and returns the furl object for method chaining.

f = furl('https://www.foo.com')
f.join('new/path')
print(f.url) # https://www.foo.com/new/path

f.join('../replaced')
print(f.url) # https://www.foo.com/replaced

f.join('path?query=yes#fragment')
print(f.url) # https://www.foo.com/path?query=yes#fragment

f.join('ftp://baba.com/path')
print(f.url) # ftp://baba.com/path
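These results follow the usual rules for resolving a reference against a base URL, the same ones urllib.parse.urljoin applies. As a quick sanity check, this sketch replays the four joins with the standard library only. The difference is that urljoin returns a new string each time, while furl updates the object in place.

```python
from urllib.parse import urljoin

url = 'https://www.foo.com'
steps = []
for reference in ['new/path', '../replaced', 'path?query=yes#fragment', 'ftp://baba.com/path']:
    # Each result becomes the base URL for the next join.
    url = urljoin(url, reference)
    steps.append(url)
    print(url)

# https://www.foo.com/new/path
# https://www.foo.com/replaced
# https://www.foo.com/path?query=yes#fragment
# ftp://baba.com/path
```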

Finally, we can inspect various information about a furl object with the asdict method.

from pprint import pprint
from furl import furl

f = furl('https://xn--eckwd4c7c.xn--zckzah/path?foo=bar#frag')
pprint(f.asdict(), indent=4)

You will have this output:

{   'fragment': {   'encoded': 'frag',
                    'path': {   'encoded': 'frag',
                                'isabsolute': False,
                                'isdir': False,
                                'isfile': True,
                                'segments': ['frag']},
                    'query': {'encoded': '', 'params': []},
                    'separator': True},
    'host': 'ドメイン.テスト',
    'host_encoded': 'xn--eckwd4c7c.xn--zckzah',
    'netloc': 'xn--eckwd4c7c.xn--zckzah',
    'origin': 'https://xn--eckwd4c7c.xn--zckzah',
    'password': None,
    'path': {   'encoded': '/path',
                'isabsolute': True,
                'isdir': False,
                'isfile': True,
                'segments': ['path']},
    'port': 443,
    'query': {'encoded': 'foo=bar', 'params': [('foo', 'bar')]},
    'scheme': 'https',
    'url': 'https://xn--eckwd4c7c.xn--zckzah/path?foo=bar#frag',
    'username': None}

That is all for this article. I hope you enjoyed reading it. Take care of yourself and see you soon! 🙂

If you like my article and want to continue learning with me, don’t hesitate to follow me here and subscribe to my newsletter on substack 😉
