Trying to digest the Anime Studio Forum
Posted: Tue Dec 12, 2006 5:25 pm
by Rasheed
At the moment, I'm gathering tools to create a digest of the subjects on the AS forum. I'm using a Mac, and perhaps these tools are handy for other Mac users as well. My current OS is OS X 10.3.9 (Panther).
WebGrabber (freeware)
A piece of software to grab files from a website.
Finder (built-in software)
To remove the files I will not be working on. The files I will be working on are postings, with a common phrase "viewtopic.php?p=" in their filenames. If I simply list the php files in alphanumeric order, and remove all files which haven't got that magic "viewtopic.php?p=" in their filename, I'm left with the files I'll need.
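The Finder step above could also be scripted. The following is only a minimal sketch in Python, not something used in the thread; the function name `keep_posting_files` is hypothetical, and the only thing taken from the post is the "viewtopic.php?p=" marker.

```python
# Marker quoted in the post: posting files carry it in their filename.
MARKER = "viewtopic.php?p="


def keep_posting_files(filenames):
    """Return the names that contain the posting marker, in sorted order."""
    return sorted(name for name in filenames if MARKER in name)


if __name__ == "__main__":
    # Hypothetical sample names, for illustration only.
    sample = [
        "index.php",
        "viewtopic.php?p=10&sid=ad7b145925c85cd17980b9e9b84afaf8",
        "viewtopic.php?t=5",
    ]
    for name in keep_posting_files(sample):
        print(name)
```

Everything that is not returned by the function is a candidate for deletion.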
Batches (freeware)
An AppleScript droplet you put in the Dock, which lets me rename files in bulk (well, not too many at once, otherwise the script chokes). The files grabbed by WebGrabber have a strange name format, derived from the URLs used on this forum, such as:
Code:
viewtopic.php?p=10&sid=ad7b145925c85cd17980b9e9b84afaf8
which can't be opened with any program, unless you define that for each file, or add an appropriate extension to the filename.
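Adding an extension in bulk can be sketched in a few lines of Python. This is an illustration under assumptions, not the Batches workflow from the post: the ".html" extension, the function names `add_extension` and `rename_all`, and the folder argument are all hypothetical.

```python
from pathlib import Path

MARKER = "viewtopic.php?p="  # marker quoted in the post


def add_extension(name, ext=".html"):
    """Return the name with the extension appended, unless it is already there."""
    return name if name.endswith(ext) else name + ext


def rename_all(folder, ext=".html"):
    """Rename every posting file in the folder; return the new names."""
    renamed = []
    # sorted() builds the full list first, so renaming does not disturb iteration.
    for path in sorted(Path(folder).iterdir()):
        if path.is_file() and MARKER in path.name:
            target = path.with_name(add_extension(path.name, ext))
            # Skip files that already have the extension or would overwrite another file.
            if target != path and not target.exists():
                path.rename(target)
                renamed.append(target.name)
    return renamed
```

Calling `rename_all("grabbed")` on a hypothetical folder named "grabbed" would leave files that open in a browser by default.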
EasyFind (freeware)
A very fast find utility (faster than Finder's search function), which has wildcard search (very handy indeed). For more powerful searches I should really learn to use the built-in Unix command grep via Terminal.
Posted: Tue Dec 12, 2006 8:45 pm
by bupaje
Great idea. I did this on my old forums, the old-fashioned way though. I had some very simple VB utilities I made to just cut off certain stuff and insert the text in a new template, but I had to manually weed out text and non-useful responses. Still, it can be a very useful addition to the Wiki or similar.
If you do this, I recommend including at the end of each entry:
-url to original post
-original date
-people involved in main content (except for the "Wow great" responses)
-applies to/or tested with, if pertinent (it might be a technique that won't work with a later version, for example), e.g. Tested with Moho 5.4
-If it gets added to a Wiki then maybe later people can add corrections "Updated for AS3.0" or "Verified by _forum_user xxx using AS2.5" or "_forum_user xxx - this code acts differently after version 123. You can get the same result by ..."
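The list above amounts to a small record per digest entry. A minimal sketch in Python of how such a record might look; the class name `DigestEntry`, its field names, and the sample values are hypothetical and only mirror the bullet points.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class DigestEntry:
    """Metadata appended to one digested topic (field names are assumptions)."""
    url: str                      # url to original post
    original_date: str            # original date, kept as plain text
    contributors: List[str]       # people involved in the main content
    tested_with: str = ""         # e.g. "Moho 5.4"; empty if not pertinent
    updates: List[str] = field(default_factory=list)  # later corrections

    def footer(self) -> str:
        """Render the metadata as a plain-text footer, one item per line."""
        lines = [
            f"-url to original post: {self.url}",
            f"-original date: {self.original_date}",
            f"-people involved: {', '.join(self.contributors)}",
        ]
        if self.tested_with:
            lines.append(f"-tested with: {self.tested_with}")
        lines.extend(f"-{note}" for note in self.updates)
        return "\n".join(lines)


if __name__ == "__main__":
    # Illustrative values only.
    entry = DigestEntry(
        url="http://www.lostmarble.com/forum/viewtopic.php?p=10",
        original_date="Tue Dec 12, 2006",
        contributors=["Rasheed"],
        tested_with="Moho 5.4",
    )
    print(entry.footer())
```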
Anyway great work on the forum Rasheed. Thanks for all your energetic help.
Posted: Tue Dec 12, 2006 11:36 pm
by Rasheed
I don't know about adding it to a wiki...
Anyway, earlier I recommended Batches for file renaming. Forget that. Use FileWrangler. You can find it on your favorite Mac software site (I couldn't find the developer's site). It has similar capabilities, but is 10 times faster (if not more), because it's a Cocoa app and not an AppleScript.
The structure of the complete forum is rather easy to create by using an appropriate four layer folder structure (forum -> forum sections -> topics -> pages). The naming is also pretty obvious. Nevertheless, there has to be a reference to the original forum, so I'll keep the original file names, with some modifications.
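The four-layer layout (forum -> forum sections -> topics -> pages) can be expressed as a path-building helper. This is a sketch under assumptions: the function name `page_path`, the folder names, and the "page-001.html" naming scheme are hypothetical, since the post does not spell out the actual names.

```python
from pathlib import PurePosixPath


def page_path(root, section, topic, page):
    """Build forum/section/topic/page for one downloaded page."""
    # Zero-padded page numbers keep the files in order in a plain listing.
    return PurePosixPath(root) / section / topic / f"page-{page:03d}.html"


if __name__ == "__main__":
    # Hypothetical section and topic names.
    print(page_path("forum", "scripting", "topic-1234", 2))
```

To keep a reference to the original forum, the original file name could be stored next to each page or embedded in the topic folder name.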
Now that I know what to download and how to structure it, I can grab the entire forum contents and process them.
BTW, many of the questions that are repeatedly asked in the forum are already answered by Lost Marble in the FAQ. People just don't seem to know that they have to look there first.
Posted: Wed Dec 13, 2006 2:51 am
by cribble
That batches tool is real handy, thanks.
Posted: Wed Dec 13, 2006 11:04 am
by Rasheed
You're welcome.
I guess if you're an AppleScript programmer, Batches might be a handy tool to write your code for, and it already has some other useful stuff in it. Renaming is not fast enough for me, though: with over 28,000 files, and Batches handling about 2 files per second on my system, it would take roughly four hours.
Posted: Wed Dec 13, 2006 11:40 am
by Rasheed
I found a neat URL to see what everyone on the forum is currently watching. Some forums have this exposed, but on the Anime Studio forum this is a hidden feature.
http://www.lostmarble.com/forum/viewonline.php
Edit: When you encounter a page that needs user authentication, you'll be redirected to a login.php page. I don't really need those pages, and some other pages I don't need as well. Therefore, I filtered out these wildcard patterns in WebGrabber (skip these files):
Code:
*login.php* *memberlist.php* *profile.php* faq.php* groupcp.php* posting.php* search.php* index.php*
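The effect of these skip patterns can be checked with shell-style wildcard matching. The sketch below uses Python's standard `fnmatch` module; it assumes WebGrabber treats `*` the same way, which the post does not confirm, and the function name `should_skip` is hypothetical.

```python
from fnmatch import fnmatchcase

# Patterns copied from the post above.
SKIP_PATTERNS = [
    "*login.php*", "*memberlist.php*", "*profile.php*", "faq.php*",
    "groupcp.php*", "posting.php*", "search.php*", "index.php*",
]


def should_skip(filename):
    """Return True if the file name matches any of the skip patterns."""
    # fnmatchcase is case-sensitive on every platform, unlike fnmatch.
    return any(fnmatchcase(filename, pattern) for pattern in SKIP_PATTERNS)


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    for name in ["login.php?redirect=viewtopic.php", "viewtopic.php?p=10", "faq.php"]:
        print(name, should_skip(name))
```

Note that the patterns without a leading `*` only match names that start with the script name.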
Posted: Wed Dec 13, 2006 2:24 pm
by JCook
I've never tried Batches, but I've been using Renamer4Mac and it seems to work quite well. Here's a link if you want to check it out. Just another tool in the toolbox.
http://www.power4mac.com/renamer/
Jack
Posted: Thu Dec 14, 2006 10:04 am
by Rasheed
Thanks for pointing me to this tool, Jack.
I had problems with WebGrabber. It got caught in a recursive loop and kept downloading the same set of files (each time with a different sid). I had downloaded 9000+ files, of which only 136 or so were unique.
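Duplicates like these differ only in the sid (session id) parameter, so they can be detected by removing that parameter before comparing. A minimal sketch in Python using the standard `urllib.parse` module; the function names `strip_sid` and `unique_pages` are hypothetical and this is not something WebGrabber itself offers.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def strip_sid(url):
    """Return the URL (or file name) with the sid query parameter removed."""
    parts = urlsplit(url)
    query = [(key, value)
             for key, value in parse_qsl(parts.query, keep_blank_values=True)
             if key != "sid"]
    return urlunsplit(parts._replace(query=urlencode(query)))


def unique_pages(urls):
    """Keep the first occurrence of each page, ignoring the session id."""
    seen = set()
    result = []
    for url in urls:
        key = strip_sid(url)
        if key not in seen:
            seen.add(key)
            result.append(key)
    return result
```

Running `unique_pages` over the list of downloaded names would show how many distinct pages are really there.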
So I changed tactics. I've downloaded the forum pages on which each topic is listed. From there I can view every post on the forum, as long as I'm connected to the internet. The plan is to create a page with links to those topics, and a small description of the contents of each discussion.
However, I still wanted a program to download individual files from the forum. I could use a web browser, but that changes the files to locally stored pages (with images and all). I didn't want that. I wanted an exact copy of each page as rendered by the forum. I searched the whole night and tried several solutions (even DarwinPorts by OpenDarwin), but to no avail. Finally, I found CocoaWget on a Japanese server (luckily, the page is in English). This app wasn't even listed in the usual software repositories (VersionTracker, MacUpdate), so I just got lucky. SimpleGet didn't do it for me, but CocoaWget did.
Thanks to the Japanese. They have given us the programming language Ruby (the language behind Ruby on Rails), and several other cool apps you don't see in the West.
I will put this forum digestion project on hold for a while, so I can give my full attention to the Scripting Tutorial.
Posted: Thu Dec 14, 2006 7:11 pm
by cribble
Posted: Thu Dec 14, 2006 7:36 pm
by Rai López
Buuut... what a nice feature, isn't it??
Thank you Lost Marb... errrm... RASHEED!!!

Posted: Thu Dec 14, 2006 9:54 pm
by Genete
Not hidden...

Posted: Thu Dec 14, 2006 10:12 pm
by Rai López
If... I... click... there... it brings me... to your "Putfile.com" files site! (...)
My, my... how easy it is to laugh at other people's ignorance... (and how much fun!)
PS: Yeah, yeah, Genete... now drop the giggles...

Posted: Fri Dec 15, 2006 12:25 am
by Genete
I think you misunderstood me, Ramón. When I say "Click here", I mean click on the same spot as in the picture, but on the forum page. Clicking on "Who is Online" takes you to that link.
Forgive me if I didn't understand anything, but joking, and in English at that, is not my strong point.
Sorry if I offended anyone.
Regards
Genete
Posted: Fri Dec 15, 2006 4:03 am
by Rai López
...Hahaha, well, it was all a joke, yes.
...And, well, I also found it funny that I had never noticed that "Who is Online" link after so many, many days hanging around here.
...That's all!
...Bla bla bla, mu mu mu, la la la
...Bla bla bla, mu mu mu, la la la, bla bla bla, mu mu mu, la la la, bla bla bla, mu mu mu, la la la, bla bla bla "Mu mu Mu" la la la, bla, bla, bla, mu mu mu, la la la, bla bla bla, mu mu mu, la la la
...Bla bla bla!
PS:

Posted: Fri Dec 15, 2006 8:37 am
by Genete
I got it now!
Ramón López's signature wrote: wanted to be a Marble too...
Now I do put in the giggles. Good vibes, eh? (let's see them translate that)
Bye