Full Version: programming help
Hydrogenaudio Forums > Misc. > Off-Topic
pyrosb
Well, actually my dad does, and I have next to no experience with this sort of thing.
What I need it to do is parse a webpage and yank the information out of a table, then stick it in a comma-delimited text file. I have no idea how to do this and would appreciate any help.
tigre
I'm not sure if I understand correctly what you're trying to do, but here's how I handle such things (without programming skills):

Highlight the table, copy it (Ctrl+C) -> Paste it to an Excel sheet (highlight the whole sheet, press Ctrl+V) -> Highlight the needed entries in the Excel sheet only and Copy them again (Ctrl+C) -> Paste them to notepad.
Now the table entries are separated by tabs. Using Notepad's find and replace you can replace every tab with "," or whatever you want (copy a tab from the pasted text and paste it into the Find box, since you can't type one there).
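
If the find-and-replace step gets tedious, the same tab-to-comma conversion can be scripted. The following is only a minimal C sketch of that idea, not something from the steps above: `tsv_line_to_csv` is a made-up helper name, and it assumes each line has already been read into memory without its trailing newline. It also wraps any field that contains a comma or a double quote in quotes, which a plain find-and-replace would not do.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: converts one tab-separated line (no trailing newline)
   into a comma-separated line. Fields containing a comma or a double quote
   are wrapped in double quotes, with embedded quotes doubled.
   Returns 0 on success, -1 if `out` (of size `outsize`) is too small. */
int tsv_line_to_csv(const char *in, char *out, size_t outsize)
{
    size_t o = 0;
    const char *field = in;

    if (outsize == 0)
        return -1;

    for (;;) {
        /* The current field runs up to the next tab or the end of the line. */
        size_t len = strcspn(field, "\t");
        int needs_quotes = memchr(field, ',', len) != NULL ||
                           memchr(field, '"', len) != NULL;
        size_t i;

        if (needs_quotes) {
            if (o + 1 >= outsize) return -1;
            out[o++] = '"';
        }
        for (i = 0; i < len; i++) {
            /* Worst case writes two characters; keep room for the terminator. */
            if (o + 2 >= outsize) return -1;
            if (field[i] == '"')
                out[o++] = '"';   /* double an embedded quote */
            out[o++] = field[i];
        }
        if (needs_quotes) {
            if (o + 1 >= outsize) return -1;
            out[o++] = '"';
        }

        if (field[len] == '\0')
            break;                /* that was the last field */
        if (o + 1 >= outsize) return -1;
        out[o++] = ',';
        field += len + 1;         /* skip past the tab */
    }
    out[o] = '\0';
    return 0;
}
```

Calling it once per line read from the pasted text, and writing each result to the output file, would reproduce the Notepad step.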
sthayashi
The simplest way that I know of is to search for the string "<td>" or "<TD>", then read every character after it into a string (a C++ vector may be best for this) until you reach the following five characters IN THIS ORDER: "</td>" or "</TD>".

Here's an example that looks a lot like C, except that I'm not going to run this through a compiler to find out if it is or not.
CODE

while (readchar <> EOF) {
 read(readchar, fromFile);
 // Match "<td>" or "<TD>" one character at a time
 if (readchar == "<") {
   read(readchar, fromFile);
   if ((readchar ==  "t")
        || (readchar == "T")) {
      read(readchar, fromFile);
      if ((readchar == "d")
           || (readchar == "D")) {
         read(readchar, fromFile);
         if (readchar == ">") {
             // Opening tag found: collect everything up to the closing tag
             tableString[n] = StringRead(fromFile);
             n++;
         }
      }
   }
 }
}

And StringRead(fromFile) would look something like:
CODE

string StringRead(File fromFile)
{
  // win[] is a sliding window over the last five characters read;
  // win[4] is the oldest, win[0] the newest
  char win[5];
  string buffer;
  // Fill the window, oldest character first
  for (i = 4; i >= 0; i--) {
     read(win[i], fromFile);
     if (win[i] == EOF)
        exit("ERROR!!! Unexpected end of file");
  }
  // Keep going until the window holds exactly "</td>" or "</TD>"
  while (!( (win[4] == "<") &&
            (win[3] == "/") &&
            ((win[2] == "t") || (win[2] == "T")) &&
            ((win[1] == "d") || (win[1] == "D")) &&
            (win[0] == ">"))) {
         // The oldest character is cell text: keep it and slide the window
         addToString(win[4], buffer);
         win[4] = win[3];
         win[3] = win[2];
         win[2] = win[1];
         win[1] = win[0];
         read(win[0], fromFile);
         if (win[0] == EOF)
            exit("ERROR!!! Unexpected end of file");
  }
  return(buffer);
}

I don't claim to be a programmer, and I'm sure there are far more efficient ways of doing this. But this is the approach I'd take. Alternatively, you may want to look into AWK, because it may be able to do things like this.
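
For what it's worth, here is a minimal C sketch of the same search-then-copy idea, working on a page that has already been loaded into memory rather than read one character at a time. It is an illustration under assumptions, not a tested parser: `next_cell` and `starts_with_ci` are made-up names, opening tags with attributes such as `<td class="x">` are accepted (the approach above only matches a bare `<td>`), and nested tables or a `>` inside an attribute value are not handled. Any tags inside a cell are copied through unchanged.

```c
#include <ctype.h>
#include <stddef.h>

/* Returns nonzero if `s` starts with `prefix`, ignoring ASCII case. */
static int starts_with_ci(const char *s, const char *prefix)
{
    while (*prefix != '\0') {
        if (tolower((unsigned char)*s) != tolower((unsigned char)*prefix))
            return 0;
        s++;
        prefix++;
    }
    return 1;
}

/* Hypothetical helper: copies the contents of the next <td ...>...</td> cell
   found in `html` into `cell`, truncating to `size` - 1 characters.
   Returns a pointer just past the closing </td>, or NULL if no complete
   cell remains. */
const char *next_cell(const char *html, char *cell, size_t size)
{
    const char *p = html;
    size_t len = 0;

    if (size == 0)
        return NULL;

    /* Look for "<td" followed by '>' or whitespace, so "<tdfoo>" is skipped. */
    for (;; p++) {
        if (*p == '\0')
            return NULL;
        if (starts_with_ci(p, "<td") &&
            (p[3] == '>' || isspace((unsigned char)p[3])))
            break;
    }

    /* Skip any attributes up to the end of the opening tag. */
    while (*p != '\0' && *p != '>')
        p++;
    if (*p == '\0')
        return NULL;
    p++;

    /* Copy characters until the closing tag. */
    while (!starts_with_ci(p, "</td>")) {
        if (*p == '\0')
            return NULL;            /* unterminated cell */
        if (len + 1 < size)
            cell[len++] = *p;
        p++;
    }
    cell[len] = '\0';
    return p + 5;                   /* step over "</td>" */
}
```

Calling `next_cell` in a loop, feeding each returned pointer back in as the new `html`, walks through every cell on the page until it returns NULL.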
pyrosb
great, thanks for the code, i'll have to look into AWK and see what i can find
Jasper
The C equivalent of the pseudo-code above would be roughly the following (it still needs a main function, some includes, perhaps some compiler-specific changes, etc.):
CODE

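/* Assumed declarations: the cell limit of 1024 is arbitrary. */
char* tableString[1024];
int n = 0;
FILE* fromFile = fopen("somefile", "r");
int readchar;  /* int, not char, so EOF can be told apart from data */
if (fromFile == NULL) {
  perror("somefile");
  exit(EXIT_FAILURE);
}
/* Check each read for EOF; feof() only turns true after a read has failed. */
while ((readchar = fgetc(fromFile)) != EOF) {
if (readchar == '<') {
  readchar = fgetc(fromFile);
  if ( readchar ==  't'
       || readchar == 'T' ) {
     readchar = fgetc(fromFile);
     if ( readchar == 'd'
          || readchar == 'D' ) {
        readchar = fgetc(fromFile);
        if (readchar == '>' && n < 1024) {
            tableString[n] = StringRead(fromFile);
            n++;
        }
     }
  }
}
}
fclose(fromFile);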

The StringRead function:
CODE

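/* Needs <stdio.h>, <stdlib.h>, <string.h> and <ctype.h>. */
char* StringRead(FILE* fromFile)
{
 char window[5];              /* the last five characters read ("char" is a reserved word) */
 char* buffer = malloc(2048); /* heap memory outlives this call; the caller frees it */
 size_t len = 0;
 int c;
 if (buffer == NULL) {
    fprintf(stderr, "ERROR!!! Out of memory\n");
    exit(EXIT_FAILURE);
 }
 if (fread(window, 1, 5, fromFile) != 5) {
    fprintf(stderr, "ERROR!!! Unexpected end of file\n");
    exit(EXIT_FAILURE);
 }
 /* strnicmp is not standard C, so compare case-insensitively by hand */
 while (!(window[0] == '<' && window[1] == '/' &&
          tolower((unsigned char)window[2]) == 't' &&
          tolower((unsigned char)window[3]) == 'd' &&
          window[4] == '>')) {
   if (len < 2047)            /* cells longer than the buffer are truncated */
      buffer[len++] = window[0];
   memmove(window, window + 1, 4);
   if ((c = fgetc(fromFile)) == EOF) {
      fprintf(stderr, "ERROR!!! Unexpected end of file\n");
      exit(EXIT_FAILURE);
   }
   window[4] = (char)c;
 }
 buffer[len] = '\0';          /* terminate the string before handing it back */
 return buffer;
}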


Other than that, I have some code lying around that imports specific tables from an HTML file to an MS Access table.
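
None of the code in this thread covers the last step of the original question, writing the cells out as a comma-delimited file. A minimal C sketch of that step might look like the following. The names `write_csv` and `write_csv_field` are made up for illustration, and the `columns` parameter is an assumption: the extraction code above collects cells into `tableString` but does not record where each table row ends, so the caller has to supply the column count. Every field is quoted unconditionally to keep the sketch short.

```c
#include <stdio.h>

/* Writes one field wrapped in double quotes, doubling any embedded quotes. */
static void write_csv_field(FILE *out, const char *field)
{
    fputc('"', out);
    for (; *field != '\0'; field++) {
        if (*field == '"')
            fputc('"', out);
        fputc(*field, out);
    }
    fputc('"', out);
}

/* Hypothetical helper: writes `n` cells as comma-delimited rows of
   `columns` fields each. A final short row is still ended with a newline. */
void write_csv(FILE *out, char *cells[], int n, int columns)
{
    int i;

    if (columns < 1)
        columns = 1;    /* avoid dividing by zero on bad input */

    for (i = 0; i < n; i++) {
        write_csv_field(out, cells[i]);
        /* End the line after the last column, or after the final cell. */
        if ((i + 1) % columns == 0 || i + 1 == n)
            fputc('\n', out);
        else
            fputc(',', out);
    }
}
```

After the extraction loop, something like `write_csv(out, tableString, n, 3)` would write three-column rows, where `out` is a `FILE*` opened for writing and 3 stands in for the table's real column count.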