I've searched everywhere and can't find anything that answers this, so it would be great if anyone could help.
So basically, I recently wrote a program in C# that checks certain parts of a site's source code and returns the URL on a specific match. So far, I loop through the pages with a while-loop driven by a variable (member_id) that increases on each iteration (unless a match is found). On every iteration, the source code is read from website/profile/ + member_id (where website is the name of the website, obviously), and its content is checked for possible matches.
The major issue with this method is that it takes WAY too much time. I've calculated that scanning the whole website this way would take around a week: there are 500,000 pages I want to check, which works out to a bit over a second per page. I'm not expecting it to be fast, but come on... My question is: is there any way to speed this up? I only need a small part of each page's source code (just the head element), so it feels unnecessary to read everything else as well.
Here's the code I currently have (*website* and *match* are placeholders for the real values):
    using System;
    using System.IO;
    using System.Net;
    using System.Text;

    class Program
    {
        static void Main()
        {
            int member_id = 1;
            while (member_id < 486252) // will change to non-constant value later
            {
                string URL = "http://*website*/profile/" + member_id;
                StringBuilder content = new StringBuilder();
                byte[] b = new byte[310]; // non-constant value will come later...
                HttpWebRequest req = (HttpWebRequest) WebRequest.Create(URL);

                // Dispose the response and its stream so the connection is released
                using (HttpWebResponse res = (HttpWebResponse) req.GetResponse())
                using (Stream response = res.GetResponseStream())
                {
                    int x;
                    // Read the whole page, one buffer at a time
                    while ((x = response.Read(b, 0, b.Length)) > 0)
                    {
                        content.Append(Encoding.ASCII.GetString(b, 0, x));
                    }
                }

                member_id++;
                string page = content.ToString();
                if (page.Contains("<title>*Match*</title>"))
                {
                    Console.WriteLine("Match has been found!"); // Just for debugging
                    member_id = 500000; // Lazy, temporary method
                }
                Console.WriteLine(page); // Just for debugging
                Console.WriteLine(page.Length); // Just for debugging
                Console.ReadKey(); // Also just for debugging
            }
        }
    }
I've tried messing around with the size of the b array, the Read parameters, and the GetString parameters. None of this helped, though I didn't really expect it to. Right now, I'm kind of desperate for a solution, to be honest.
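To make the "only the head element" idea concrete, here is a rough sketch of what I have in mind: read the stream in chunks and stop as soon as the closing </head> tag shows up, instead of reading to the end of the page. The names HeadReader and ReadHead are made up for this example, and the demo feeds it a MemoryStream with a fake page rather than a real response. I haven't confirmed that stopping early actually saves a meaningful amount of time per request.

```csharp
using System;
using System.IO;
using System.Text;

static class HeadReader
{
    // Hypothetical helper: reads chunk by chunk and returns everything up to
    // and including the first </head>. If the tag never appears, the whole
    // stream content is returned.
    public static string ReadHead(Stream stream, int chunkSize = 310)
    {
        const string endTag = "</head>";
        var content = new StringBuilder();
        var buffer = new byte[chunkSize];
        int read;

        while ((read = stream.Read(buffer, 0, buffer.Length)) > 0)
        {
            int before = content.Length;
            content.Append(Encoding.ASCII.GetString(buffer, 0, read));

            // Only re-scan the new data plus enough of the old data to catch
            // a tag that was split across two chunks.
            int searchFrom = Math.Max(0, before - (endTag.Length - 1));
            string tail = content.ToString(searchFrom, content.Length - searchFrom);
            int idx = tail.IndexOf(endTag, StringComparison.OrdinalIgnoreCase);
            if (idx >= 0)
            {
                return content.ToString(0, searchFrom + idx + endTag.Length);
            }
        }
        return content.ToString();
    }
}

static class Program
{
    static void Main()
    {
        // Fake page standing in for a real response body (assumption for the demo).
        byte[] page = Encoding.ASCII.GetBytes(
            "<html><head><title>*Match*</title></head><body>lots of markup</body></html>");

        using (var stream = new MemoryStream(page))
        {
            string head = HeadReader.ReadHead(stream, 16);
            Console.WriteLine(head); // <html><head><title>*Match*</title></head>
        }
    }
}
```

In my loop this would replace the inner read loop, i.e. something like `string head = HeadReader.ReadHead(response);` followed by the same Contains check on head.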
I got the part that actually fetches the source code from a tutorial, with a few personal modifications. I started with C# yesterday, though I've been working with C++ and some other languages (like PHP) for a while. Generally, though, my programming knowledge is pretty basic, so please be understanding.
Thanks for reading.