To the Swift: Proxies on Your Relay Team, Part Two
In part one of this article, we noted how much proxy speed contributes to the benefits of proxy services. We asked, “Can a proxy server speed up your internet transactions?” and discussed objective measures of proxy speed along with links to some speed testing services. Now let’s talk about working with proxies on your relay team: what you can do to optimize proxy transmission speed, or at least keep from slowing transmissions down.
Latency is the delay in connecting and transmitting data between two computers communicating over the Internet. The delay is generally due to the geographical distance between them and the number of “hops” between connecting servers.
The locations you choose can greatly affect a proxy’s speed in processing your requests. Optimum locations are close to you, and also close to the target site. A good choice of locations can help minimize latency.
Proxy and Target in the Same Country
You can reduce latency by choosing a proxy as close as possible to your target server. Services like ProxyMesh and WonderProxy provide proxy servers in many different locations. If you use proxies on a different continent from your target servers, you may encounter much higher latency.
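One way to compare candidate locations is to time the same request through each proxy and rank the results. The sketch below is only an illustration under stated assumptions: the function names (`fetch_via_proxy`, `measure_latency`, `rank_proxies`) are hypothetical, the proxy URLs you pass in are your own, and it uses only the Python standard library.

```python
import time
import urllib.request


def fetch_via_proxy(url, proxy_url, timeout=10):
    """Fetch url through proxy_url and return the response body as bytes."""
    handler = urllib.request.ProxyHandler({"http": proxy_url, "https": proxy_url})
    opener = urllib.request.build_opener(handler)
    with opener.open(url, timeout=timeout) as response:
        return response.read()


def measure_latency(url, proxy_url, fetch=fetch_via_proxy, attempts=3):
    """Return the lowest elapsed time in seconds over several attempts."""
    timings = []
    for _ in range(attempts):
        start = time.perf_counter()
        fetch(url, proxy_url)
        timings.append(time.perf_counter() - start)
    return min(timings)


def rank_proxies(url, proxy_urls, fetch=fetch_via_proxy):
    """Return (proxy_url, seconds) pairs sorted from fastest to slowest."""
    results = [(p, measure_latency(url, p, fetch=fetch)) for p in proxy_urls]
    return sorted(results, key=lambda pair: pair[1])
```

Taking the minimum of a few attempts filters out one-off network hiccups. The `fetch` parameter is there so you can swap in a different HTTP client or a stub when testing.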
Try a World Proxy
A world proxy is a proxy server that has outgoing IPs located all around the world. If a world proxy server is physically located in the US, and the outgoing IP is in your target country, requests through the world proxy will likely take at least 1 second longer than a direct request. But world proxies offer advantages. If you can’t find a proxy dedicated to a specific country, you may still find a world proxy that includes IP addresses for that country. ProxyMesh, for example, offers world proxy access along with a custom header to target requests to some 37 countries.
Try an Open Proxy
Open proxies are free and available to all Internet users, and can forward requests from and to any site. Like other types of proxies, an open proxy offers online anonymity and privacy by concealing your IP address from web servers, since your requests appear to originate from the proxy server. With an open proxy, however, this anonymity is not total. Also, open proxies are slower and more error prone than commercial proxies. But the tradeoff is a huge increase in quantity and diversity of IP addresses.
You can help keep connections fast by making sure your location and the remote site location are geographically close. But keep in mind that server locations may change. Some hosting providers may reassign blocks of IP addresses to a data center in a different geolocation from their original one. And it can take some time for the geo IP databases to update the IP location so that it is accurately stated in responses to location testing.
It’s good practice to periodically check the location of your proxy. You can use services like WhatIsMyIP.com to check the current geo IPs of a proxy server. Make sure you check geo IP over HTTPS to get an accurate reading.
Use Data Compression
To speed transmission and help control bandwidth usage, include an Accept-Encoding header to take advantage of compression options such as gzip. Most remote sites support at least one of these content encoding methods.
Compression may be especially useful in speeding high-traffic research with sizeable amounts of data per request. Although many proxies will strip out identifying headers, they do not alter the content of your request. Whichever compression headers you send will therefore be passed through the proxy to the remote server, and the proxy will send back the requested data in compressed format. This article on HTTP Compression provides more details.
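As a rough illustration, here is how a script might ask for gzip and decompress the reply using only the Python standard library. The helper names (`build_request`, `decode_body`, `fetch_compressed`) are hypothetical, and the sketch assumes the proxy passes the Accept-Encoding header through unchanged, as described above.

```python
import gzip
import urllib.request


def build_request(url):
    """Build a request that advertises gzip support to the remote server."""
    return urllib.request.Request(url, headers={"Accept-Encoding": "gzip"})


def decode_body(raw_body, content_encoding):
    """Decompress the body only if the server actually gzipped it."""
    if (content_encoding or "").strip().lower() == "gzip":
        return gzip.decompress(raw_body)
    return raw_body


def fetch_compressed(url, proxy_url, timeout=10):
    """Fetch url through proxy_url, requesting and decoding gzip content."""
    handler = urllib.request.ProxyHandler({"http": proxy_url, "https": proxy_url})
    opener = urllib.request.build_opener(handler)
    with opener.open(build_request(url), timeout=timeout) as response:
        encoding = response.headers.get("Content-Encoding")
        return decode_body(response.read(), encoding)
```

Checking the Content-Encoding response header matters because a server is free to ignore the request and reply uncompressed. Many higher-level HTTP clients handle this negotiation for you.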
Distribute requests over many IPs to reduce delayed responses and timeouts. A rotating proxy server can help you avoid rate limits and blocking by choosing a random IP for each request.
Rate limits are limits on the number of HTTP requests a user can make in a given period. You can get around them, for example, by changing your IP address frequently when you encounter a site or API that uses IP throttling or IP address rate limiting.
Proxy servers typically have request and response timeouts. If the remote server does not respond in that time, you will get a 408 response code. If you need to wait longer for a complete response, you may be able to use a custom header to specify the number of seconds you want to wait.
If a significant portion of your requests are timing out (the 408 response code), here are a few possible causes:
- The network connections between you and the proxy and/or between the proxy and the remote site could be unreliable. Try some different proxies and see if that fixes the problem.
- The pages you’re requesting take a long time to load. If a proxy server normally waits for 20 seconds, you could increase that time with a custom timeout header.
- The proxy IPs have been blocked by the remote site. If you think this is the case, then you’ll want to switch proxies, and ideally use multiple proxies to distribute your requests.
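The "try some different proxies" advice can be automated. The following is a minimal sketch, not a definitive implementation: `fetch_status` and `fetch_with_fallback` are hypothetical names, the proxy list is yours to supply, and the code simply moves to the next proxy whenever one returns 408 or fails to connect.

```python
import urllib.error
import urllib.request


def fetch_status(url, proxy_url, timeout=30):
    """Return (status_code, body) for url fetched through proxy_url."""
    handler = urllib.request.ProxyHandler({"http": proxy_url, "https": proxy_url})
    opener = urllib.request.build_opener(handler)
    try:
        with opener.open(url, timeout=timeout) as response:
            return response.status, response.read()
    except urllib.error.HTTPError as error:
        # Error statuses such as 408 arrive as exceptions; report them as data.
        return error.code, error.read()


def fetch_with_fallback(url, proxy_urls, fetch=fetch_status):
    """Try each proxy in order, moving on when one times out or cannot connect."""
    last_status = None
    for proxy_url in proxy_urls:
        try:
            status, body = fetch(url, proxy_url)
        except OSError:
            # URLError and socket timeouts are OSError subclasses: unreliable link.
            continue
        if status == 408:
            last_status = status
            continue
        return proxy_url, status, body
    raise RuntimeError(f"all proxies failed for {url} (last status: {last_status})")
```

The function returns the proxy that succeeded along with the response, so you can log which proxies are reliable for a given target and drop the ones that are not.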
Some other time-saving strategies:
- Reduce the number of requests from the same IP address by using rotating proxies, which helps you avoid rate limits and blocking.
- Avoid timeouts by using proxies located near your target sites. Try configuring a custom request header for this result when using a world proxy.
- Handle 301 responses (i.e., a site has been permanently moved) by scripting your request to follow redirection.
Other Good Practices
Here are more recommended practices that can speed your proxy responses and minimize timeouts.
- Reduce the number of concurrent requests from a single IP. This could involve using an additional IP for crawling, or slowing down your crawl rate on your current requests.
- With added proxies, you have more connection strategies available, such as putting all of your authorized proxies in a list in your code or script, then randomly choosing one proxy for each request.
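The list-and-random-choice strategy in the last item might look like the sketch below. The proxy addresses are hypothetical placeholders, and `choose_proxy` and `open_with_random_proxy` are illustrative names rather than part of any proxy service's API.

```python
import random
import urllib.request

# Hypothetical placeholder addresses; substitute the proxies you are authorized to use.
PROXIES = [
    "http://proxy-a.example.com:31280",
    "http://proxy-b.example.com:31280",
    "http://proxy-c.example.com:31280",
]


def choose_proxy(proxies=PROXIES, rng=random):
    """Pick one proxy at random so requests spread across the whole list."""
    return rng.choice(proxies)


def open_with_random_proxy(url, proxies=PROXIES, timeout=30):
    """Open url through a randomly chosen proxy and return the response object."""
    proxy_url = choose_proxy(proxies)
    handler = urllib.request.ProxyHandler({"http": proxy_url, "https": proxy_url})
    opener = urllib.request.build_opener(handler)
    return opener.open(url, timeout=timeout)
```

Random choice is the simplest policy. If you need an even spread over a short run of requests, cycling through the list in order is a reasonable alternative.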
Speed is essential in a proxy operation. Measuring proxy speed can be complex. But you can find reliable services to help you understand and measure proxy speed. Try using the strategies we’ve outlined here to maximize the benefits of proxies.