Showing posts with label shell script. Show all posts

Monday, April 13, 2009

Shell script for Google search result parsing

This is the shell script I wrote to help with the analysis I did for Quest 5.

1. Perform a site:yoursite.edu search in Google, displaying 100 results per page.
2. Save each results page (Google will give you at most 10 pages) into a folder named yoursite.edu
3. Download the shell script to the directory that contains the yoursite.edu directory.
4. At the command prompt, type:
./google-results-parse yoursite.edu

5. OR, if you named the yoursite.edu directory something different, run this:
./google-results-parse yoursite.edu savedresultsdirectory

6. The script creates a "savedresultsdirectory-parsed" directory, which contains a "domainlist" file and a "pagelinks" directory. The "domainlist" file gives the subdomain breakdown of the search results. The "pagelinks" directory contains one file per subdomain, listing all of the search result URLs for that subdomain.

Download the file here.
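
If the download isn't handy, here is a rough sketch of the kind of parsing the steps above describe. To be clear, this is not the original google-results-parse script; it is a minimal reconstruction that assumes the saved pages are plain HTML containing the full result URLs. The function name parse_results and the URL pattern are my own assumptions for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical sketch only, NOT the original google-results-parse script.
# Assumes the saved result pages are HTML files that contain full
# http:// or https:// result URLs.

# parse_results DOMAIN [SAVED_DIR]
# SAVED_DIR defaults to DOMAIN, matching the two invocations in the steps.
parse_results() {
    local domain="$1"
    local saved_dir="${2:-$1}"
    local out_dir="${saved_dir}-parsed"
    local links_dir="${out_dir}/pagelinks"
    local all_urls="${out_dir}/all-urls.tmp"
    local escaped

    mkdir -p "$links_dir" || return 1

    # Escape the dots so "yoursite.edu" is matched literally in the regex.
    escaped=$(printf '%s' "$domain" | sed 's/\./\\./g')

    # Collect every unique URL whose host is DOMAIN or one of its subdomains.
    grep -rhoE "https?://([A-Za-z0-9-]+\.)*${escaped}(/[^\"'<> ]*)?" "$saved_dir" \
        | sort -u > "$all_urls"

    # domainlist: number of result URLs per host, most common first.
    # Splitting "http://host/path" on "/" puts the host in field 3.
    awk -F/ '{ print $3 }' "$all_urls" | sort | uniq -c | sort -rn \
        > "${out_dir}/domainlist"

    # pagelinks/<host>: the result URLs that belong to that host.
    awk -F/ -v dir="$links_dir" '{ print > (dir "/" $3) }' "$all_urls"

    rm -f "$all_urls"
}

# Run directly as: ./sketch yoursite.edu [savedresultsdirectory]
if [ "$#" -gt 0 ]; then
    parse_results "$@"
fi
```

The "domainlist" format here (a count followed by the host name) is just what uniq -c produces; the real script's output may look different.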

Open Ed. Quest 5 -- Searching for a Better Way (to Search)

Quest 5


"Many BYU faculty already openly share their syllabi and other course materials on personal websites, through iTunesU, and through other mechanisms ... Find as many of the open educational resources being shared by BYU faculty as you can..."

It seems to me that discoverability is really going to be the ultimate make-or-break issue for OER. One could produce world-class, high-quality OER that trumps everything any institutional OER effort produces, and yet remain in complete obscurity with no hope of ever actually sharing these wonderful OER with anyone at all. And after all, if you take the time and trouble to make some kind of resource with openness in mind, it seems silly to have it be completely worthless (or at least, gravely underused) in the end because you weren't able to put it somewhere that people would find it.

This post isn't going to discuss the hows and whys of publishing open educational content for maximum discoverability. We'll save that for another time. However, Quest 5 gives us the specific assignment to comb through BYU's web presence looking for faculty-produced OER content, and it raises the question, "How would one go about finding all of the OER on a university's web space?"

The task is not trivial.