Can’t Keep Up With IS-1

I woke up this morning to find a new file on IS-1.  I downloaded it and started banging on it.

An hour later I refreshed the page and the same file’s timestamp had changed.  I never noticed this before, so I’m starting to wonder whether it has been doing this all along.  If so, this site has the richest supply of proxies on the Internet.

I’m at the limit of my processing power importing three files simultaneously on the AMD64x2 box, so I may have to enlist another VM if the file updates again today.  Or I can just start stockpiling data and process it catch-as-catch-can.

-= UPDATE 12:00PM =-

I have implemented a check once every 15 minutes on this file, and it appears to be refreshed every 30 minutes, like clockwork.  It’s not a new file, but an update.  The file always has about 250,000 proxies, so I’ll need to hack out a diff to make this manageable.
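
For anyone wanting to do the same kind of periodic check, here is a minimal sketch of one way to tell whether a downloaded file has changed since the last poll: compare a hash of the content.  This is not my actual check; the function names are made up for illustration, and the downloading and the 15-minute scheduling (cron or similar) are left out.

```python
import hashlib


def digest(content: bytes) -> str:
    """Return a SHA-256 hex digest of the downloaded file content."""
    return hashlib.sha256(content).hexdigest()


def has_changed(previous_digest, content: bytes) -> bool:
    """True if the content differs from what was seen on the last poll.

    previous_digest is None on the first poll, which counts as a change.
    """
    return previous_digest != digest(content)
```

Each run would store the new digest somewhere and only kick off an import when has_changed() returns True.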

-= UPDATE 1:15PM =-

I hacked out the diff.  Using – surprise – diff!
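
I used the diff utility itself, but the same idea can be sketched in a few lines of Python for anyone who would rather keep it in a script.  This is an illustration only, assuming one proxy per line in a host:port style format; the function name is hypothetical.

```python
def new_proxies(old_lines, new_lines):
    """Return entries in the new snapshot that were not in the old one.

    Order follows the new snapshot; blank lines and repeats are skipped.
    """
    seen = {line.strip() for line in old_lines if line.strip()}
    fresh = []
    for line in new_lines:
        entry = line.strip()
        if entry and entry not in seen:
            seen.add(entry)  # also suppresses duplicates within the new file
            fresh.append(entry)
    return fresh
```

With about 250,000 lines per snapshot, a set lookup like this keeps the comparison cheap, and only the entries it returns need to be imported.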

This site just may max out my processing capabilities.  Right now the page says we have 995,000 proxies, but we’ve probably already gone over a million.

The page updates are taking almost an hour with the extra data.  The twelve o’clock run didn’t make it to the server until 12:46.  I may have to look at that code.  It checks the new proxies sequentially, each with a 45-second timeout, which can slow things down considerably.  There must be some multitasking opportunities in there somewhere.
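
One possible shape for that multitasking is a thread pool, since the checks spend nearly all their time waiting on the network.  The sketch below is not the existing code: the plain TCP connect in is_reachable() is a stand-in assumption for whatever the real check does, and the names and the worker count of 50 are made up for illustration.  Only the 45-second timeout comes from the current setup.

```python
import socket
from concurrent.futures import ThreadPoolExecutor

TIMEOUT_SECONDS = 45  # same timeout the sequential check uses


def is_reachable(proxy, timeout=TIMEOUT_SECONDS):
    """Assumed stand-in check: can we open a TCP connection to host:port?"""
    host, _, port = proxy.rpartition(":")
    try:
        with socket.create_connection((host, int(port)), timeout=timeout):
            return True
    except (OSError, ValueError, OverflowError):
        # Unreachable, timed out, or a malformed host:port entry.
        return False


def check_all(proxies, check=is_reachable, workers=50):
    """Run check() on every proxy concurrently; return {proxy: result}."""
    proxies = list(proxies)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map yields results in the same order as the input list.
        return dict(zip(proxies, pool.map(check, proxies)))
```

With 50 workers, a batch of dead proxies that each burn the full 45 seconds would cost roughly one-fiftieth of the sequential wall-clock time, at the price of more open sockets at once.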

