680 Proxies Pulled from the Trash

Sometimes I scare myself.

The re-check is still running and almost finished. I believe the issue was our old friend, html2text.

If you’ve been following this madness, you know I have noted several times that html2text outputs some seriously screwy things when you redirect its output to a file, even though it looks fine on the screen. I think that is where the lost proxies went. I dropped html2text from the re-check code (the proxy judges are very simple Web pages anyway), and now even the junk filter is working, so there will be fewer “Undefined” proxies in the list.
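For anyone wondering what going without html2text can look like on pages this simple, here is a minimal sketch in Python using only the standard library. It is an illustration, not the re-checker's actual code: the names TextExtractor and html_to_text are made up for this example, and the list of tags treated as line breaks is a guess at what a plain proxy judge page would need.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect the visible text of a simple HTML page (hypothetical helper)."""

    # Tags that should start a new line in the text output (an assumption).
    BREAK_TAGS = ("br", "p", "div", "tr", "li", "pre")

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self._chunks = []
        self._skip_depth = 0  # greater than 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1
        elif tag in self.BREAK_TAGS:
            self._chunks.append("\n")

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep text only when it is not script or style content.
        if not self._skip_depth:
            self._chunks.append(data)

    def text(self):
        # Trim each line and drop the empty ones.
        lines = (line.strip() for line in "".join(self._chunks).splitlines())
        return "\n".join(line for line in lines if line)


def html_to_text(html):
    """Return the visible text of `html`, one non-empty line per row."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()
```

Because the function returns an ordinary string, the result is the same whether it is printed to the screen or written to a file, which sidesteps the piping problem entirely.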

This has worked so well I’m going to rewrite the purge, which does basically the same thing with the live proxies. The problem is, the purge runs on the VM with the database, whereas I wrote the re-checker on the AMD64x2 Mythbuntu box. I seriously doubt the VM can handle it. I’ll work something out. Until it’s rewritten I’m discontinuing the purges, although I’ll probably run a few during testing.

html2text is also a big part of the Google Hack, so that’s going to have to get rewritten as well.

