1. Perform a site:yoursite.edu search in Google, displaying 100 results per page.
2. Save each results page (Google will give you at most 10 pages) into a folder named yoursite.edu.
3. Download the shell script to the directory that contains the yoursite.edu directory.
4. At the command prompt, type:
./google-results-parse yoursite.edu
5. Or, if you saved the results pages in a directory with a different name, run this instead:
./google-results-parse yoursite.edu savedresultsdirectory
6. The script creates a "savedresultsdirectory-parsed" directory, which contains a "domainlist" file and a "pagelinks" directory. The "domainlist" file gives the subdomain breakdown of the search results. The "pagelinks" directory contains one file per subdomain, each listing all of the search result URLs for that subdomain.
Download the file here.
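
For readers curious how the parsing step might work, here is a minimal sketch in bash. It is not the google-results-parse script itself. The function name parse_results, the URL-matching pattern, and the "count, then host" layout of the domainlist file are all assumptions made for illustration, and the real script may behave differently.

```shell
#!/usr/bin/env bash
# Hypothetical sketch only; NOT the original google-results-parse script.
# The function name, regex, and output layout are assumptions.

# parse_results DOMAIN [RESULTS_DIR]
# Reads the saved results pages in RESULTS_DIR (default: DOMAIN) and writes
# RESULTS_DIR-parsed/domainlist and RESULTS_DIR-parsed/pagelinks/<host>.
parse_results() {
    local domain="$1"
    local results_dir="${2:-$1}"
    local out_dir="${results_dir}-parsed"
    local domain_re urls_file url host

    [ -n "$domain" ] || { echo "usage: parse_results DOMAIN [RESULTS_DIR]" >&2; return 1; }

    # Escape the dots so "yoursite.edu" is matched literally in the regex.
    domain_re=$(printf '%s' "$domain" | sed 's/\./\\./g')

    # Start from a clean output directory so reruns do not append duplicates.
    rm -rf -- "$out_dir"
    mkdir -p "$out_dir/pagelinks"
    urls_file=$(mktemp)

    # Collect unique URLs whose host is the domain or one of its subdomains.
    grep -rhoE "https?://([A-Za-z0-9-]+\.)*${domain_re}(/[^\"'<>& ]*)?" "$results_dir" \
        | sort -u > "$urls_file"

    # pagelinks: one file per host, listing that host's URLs.
    while IFS= read -r url; do
        host="${url#*://}"   # drop the scheme
        host="${host%%/*}"   # drop the path
        printf '%s\n' "$url" >> "$out_dir/pagelinks/$host"
    done < "$urls_file"

    # domainlist: number of URLs per host, most frequent first.
    sed -E 's#^https?://([^/]+).*#\1#' "$urls_file" \
        | sort | uniq -c | sort -rn \
        | awk '{print $1, $2}' > "$out_dir/domainlist"

    rm -f -- "$urls_file"
}
```

If this were saved in a file and sourced, running parse_results yoursite.edu savedresultsdirectory would mirror the two-argument form shown in step 5. The sketch counts each distinct URL once, so a link that appears on several saved pages does not inflate a subdomain's total.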