requests-based www backend (tendril.utils.www.req)

This module provides a requests-based backend for www access, with response caching handled by CacheControl.

tendril.utils.www.req.requests_cache = <cachecontrol.caches.file_cache.FileCache object>

The module’s cachecontrol.caches.FileCache instance, which should be used whenever cached responses to requests are desired. The cache is stored in the directory defined by tendril.config.REQUESTS_CACHE. This cache currently uses very weak permissions, which should probably be fine-tuned.


Given a heuristic, constructs and returns a cachecontrol.CacheControlAdapter attached to the module’s requests_cache.

tendril.utils.www.req.get_session(target='http://', heuristic=None)[source]

Gets a pre-configured requests session.

This function configures the following behavior on the session:

  • Proxy settings are added to the session.

  • It is configured to use the module’s requests_cache.

  • Permanent redirect caching is handled by CacheControl.

  • Temporary redirect caching is not supported.

Each module or class instance which uses this should subsequently maintain its own session, with whatever modifications it requires, within a scope that makes sense for the use case (and should probably close it when done).

The session returned from here uses the module’s REQUESTS_CACHE with a single, though configurable, heuristic. If additional caches or heuristics need to be added, it is the caller’s responsibility to set them up.


TODO: The caching here seems to perform poorly, particularly for the Digi-Key passive component search. The cause is not yet understood.

Parameters

  • target (str) – Defaults to 'http://'. A string containing a URL prefix for the targets whose responses should be cached. Use this to set up site-specific heuristics.

  • heuristic (cachecontrol.heuristics.BaseHeuristic) – The heuristic to use for the cache adapter.

Return type

requests.Session
tendril.utils.www.req.get_soup_requests(url, session=None)[source]

Gets a bs4 parsed soup for the URL specified by the url parameter. The lxml parser is used.

If a session (previously created from get_session()) is provided, that session is used and left open. If it is not, a new session is created for the request and closed before the soup is returned.

Using a caller-defined session allows a single session to be re-used across multiple requests, thereby taking advantage of HTTP keep-alive to speed things up. It also provides a way for the caller to modify the cache heuristic, if needed.

Any exceptions encountered are raised and left for the caller to handle. The assumption is that an HTTP or URL error is going to make the soup unusable anyway.
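The session-ownership behavior described above can be sketched as follows. The name get_soup is illustrative, not tendril’s actual function; and while the real module uses the lxml parser, this sketch uses the stdlib 'html.parser' to avoid the extra dependency:

```python
# Hedged sketch of get_soup_requests' described session handling: use a
# caller-provided session and leave it open, or create a throwaway one
# and close it before returning the soup.
import requests
from bs4 import BeautifulSoup

def get_soup(url, session=None):
    owns_session = session is None
    if owns_session:
        session = requests.Session()
    try:
        resp = session.get(url)
        resp.raise_for_status()  # propagate HTTP errors to the caller
        # The real module uses 'lxml' here.
        return BeautifulSoup(resp.content, 'html.parser')
    finally:
        # Only close sessions we created; caller-provided sessions stay open.
        if owns_session:
            session.close()
```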