Introduction
When scraping the web, there is a real chance of being blacklisted by the websites you visit. Tor is useful when requests have to be made without disclosing your IP address. Here, we will use a Python wrapper that makes Tor easier to work with.
TOR: Meaning
The Onion Router, also known as TOR, is a global network of volunteer-run servers that grew out of research at the US Naval Research Laboratory. It allows anonymous web browsing, and the nonprofit organization behind it, the Tor Project, researches and develops online privacy tools.
TOR can mean two things:
- The software you install to use TOR
- The network of computers that relays TOR connections
In other words, TOR routes your web traffic through several other computers, which makes it very hard for a third party to link the traffic to a specific user. Anyone trying to trace the traffic would see only seemingly random TOR nodes rather than your own address.
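The idea of passing traffic through several computers can be pictured with a toy model. The sketch below is only an illustration, not how TOR is implemented: Base64 stands in for real encryption, and the relay names and the helpers build_onion and route are hypothetical. It shows that each relay opens one layer and learns only the next hop.

```python
import base64
import json


def _seal(layer):
    # Stand-in for encryption. Real Tor encrypts each layer with a key
    # shared only with one relay; Base64 just makes the blob opaque here.
    return base64.b64encode(json.dumps(layer).encode()).decode()


def _open(blob):
    return json.loads(base64.b64decode(blob))


def build_onion(message, relays, destination):
    """Wrap the message in one layer per relay, innermost layer first."""
    next_hops = relays[1:] + [destination]  # where each relay forwards to
    blob = message
    for next_hop in reversed(next_hops):
        blob = _seal({"next": next_hop, "inner": blob})
    return blob


def route(onion, relays):
    """Let each relay peel one layer; record what each relay learns."""
    learned = []
    blob = onion
    for relay in relays:
        layer = _open(blob)
        learned.append((relay, layer["next"]))
        blob = layer["inner"]
    return learned, blob


relays = ["entry", "middle", "exit"]  # hypothetical relay names
learned, payload = route(build_onion("GET /", relays, "example.com"), relays)
print(learned)
print(payload)
```

In this model only the last relay sees the destination, and only the first one would see the sender, which is the property the paragraph above describes.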
Set up TOR
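TorRequest depends on TOR, so install TOR first.
These instructions are intended for Debian and Ubuntu users, where TOR is typically installed from the package repository (for example, sudo apt-get install tor). To install it on a Mac or Windows computer, see the Tor Project's installation documentation.
Then restart the TOR service (for example, sudo service tor restart).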
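After a restart, one quick way to confirm that TOR is running is to check whether its SOCKS port accepts connections. The sketch below assumes the default SOCKS port 9050 on the local machine; the helper name is_port_open is made up for this example.

```python
import socket


def is_port_open(host="127.0.0.1", port=9050, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


print("TOR SOCKS port reachable:", is_port_open())
```

A result of False usually means the service is not running or listens on a different port.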
TOR Configuration
Let's generate a hashed password so that unauthorized outside parties cannot access the control port. TOR can produce the hash itself (tor --hash-password followed by a password of your choice).
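If you prefer to do this step from Python, the following is a minimal sketch that calls the tor binary's --hash-password option. It assumes tor is on your PATH; the helper names are made up for this example. Because TOR may print notices before the hash, the code looks for the line starting with 16:.

```python
import subprocess


def extract_hash(tor_output):
    """Return the hashed password (the line starting with '16:')."""
    for line in tor_output.splitlines():
        line = line.strip()
        if line.startswith("16:"):
            return line
    raise ValueError("no hashed password found in the output")


def hash_control_password(password, tor_binary="tor"):
    """Run `tor --hash-password` and return the resulting hash."""
    result = subprocess.run(
        [tor_binary, "--hash-password", password],
        capture_output=True,
        text=True,
        check=True,
    )
    return extract_hash(result.stdout)


# Example call (requires Tor to be installed):
# print(hash_control_password("your_unhashed_password"))
```

Keep the plain-text password as well, since that is what you will later pass to TorRequest.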
The generated hash is a long string of letters and digits that starts with the prefix 16: and this is the value that goes into the configuration. Let's now make the necessary changes to the TOR configuration file (torrc).
The location of the torrc file depends on the operating system you're using and on how you installed Tor. On Debian and Ubuntu it is typically /etc/tor/torrc.
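If you are unsure where the file lives, a small script can check a few common locations. The paths below are typical defaults and are listed as assumptions, not an exhaustive set; find_torrc is a hypothetical helper.

```python
import os

# Typical locations; which one applies depends on the OS and on how
# Tor was installed.
CANDIDATE_PATHS = [
    "/etc/tor/torrc",               # Debian/Ubuntu package
    "/usr/local/etc/tor/torrc",     # Homebrew on Intel macOS
    "/opt/homebrew/etc/tor/torrc",  # Homebrew on Apple Silicon
]


def find_torrc(candidates=CANDIDATE_PATHS):
    """Return the first candidate path that exists, or None."""
    for path in candidates:
        if os.path.isfile(path):
            return path
    return None


print(find_torrc())
```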
We have three tasks to complete.
- Enable the "ControlPort" listener on port 9051, the port on which TOR listens for communication from applications that talk to the Tor controller
- Set the hashed password to the one generated above
- Enable cookie authentication
To do this, uncomment and edit the corresponding lines in torrc. In the default file they sit just above the section for location-hidden services.
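As a rough guide, the three settings end up looking like the lines printed by the sketch below. ControlPort, HashedControlPassword and CookieAuthentication are real torrc options; the function name and the placeholder hash are only for illustration.

```python
def control_port_settings(hashed_password, control_port=9051):
    """Return the three torrc lines described above as one string."""
    return "\n".join([
        f"ControlPort {control_port}",
        f"HashedControlPassword {hashed_password}",
        "CookieAuthentication 1",
    ])


# Replace the placeholder with the hash generated earlier.
print(control_port_settings("16:<your hashed password>"))
```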
Restart TOR after saving and exiting.
TOR is now ready to use.
About TorRequest
TorRequest is a wrapper around the requests and stem libraries that makes it possible to send requests through TOR.
TorRequest can be installed from PyPI with pip (pip install torrequest).
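Let's try TorRequest. Start a Python interpreter and import the TorRequest class.
Pass TorRequest the plain-text (unhashed) password you chose earlier.
Then make an ordinary request, without TOR, to a service that echoes your IP address; the response is your real IP address.
Continuing the same example, reset the TOR identity and repeat the request through TorRequest.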
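Put together, the session might look like the sketch below. It is a hedged illustration rather than the definitive usage: the URL https://api.ipify.org is an assumed IP-echo service (any service that returns the caller's IP as plain text would do), the password is a placeholder, and compare_ips and main are hypothetical helpers. The TorRequest constructor, reset_identity and get are the calls the torrequest package provides.

```python
def compare_ips(direct, via_tor, url="https://api.ipify.org"):
    """Fetch `url` directly and through Tor; return both reported IPs."""
    original_ip = direct.get(url).text.strip()
    via_tor.reset_identity()  # ask Tor for a fresh identity
    tor_ip = via_tor.get(url).text.strip()
    return original_ip, tor_ip


def main():
    # Imported here so the helper above can be read and tested without
    # the third-party packages installed.
    import requests
    from torrequest import TorRequest

    # Placeholder: the plain-text password whose hash is in torrc.
    with TorRequest(password="your_unhashed_password") as tr:
        original_ip, tor_ip = compare_ips(requests, tr)
    print("Original IP address:", original_ip)
    print("IP address via TOR: ", tor_ip)
```

Calling main() with TOR running should print two different addresses. Calling reset_identity() again before the next request asks TOR for another identity.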
Your IP address has changed at this point. Reset the TOR identity once more to obtain a fresh IP address.
Conclusion
You can now use TorRequest to quickly disguise your IP address in Python.