
I'm looking to do a POST request on this site:

http://web1.ncaa.org/stats/StatsSrv/careersearch

The form on the right has four dropdowns. When I run the code below, the "School" dropdown stubbornly never gets selected. There's a hidden input that may be causing the problem, but I haven't been able to work around it. The JavaScript on the page doesn't seem to have any effect, but I could be wrong. Any help is appreciated:

#!/usr/bin/python

import urllib
import urllib2

url = 'http://web1.ncaa.org/stats/StatsSrv/careersearch'
values = {'searchOrg': '30123', 'academicYear': '2011', 'searchSport': 'MBA', 'searchDiv': '1'}

data = urllib.urlencode(values)
req = urllib2.Request(url, data)
response = urllib2.urlopen(req)
the_page = response.read()

print the_page
hward86

2 Answers


As you suspected, you're missing a hidden field: doWhat = 'teamSearch' (for submitting the form on the right).

Using these request values works for me:

values = {'doWhat': 'teamSearch', 'searchOrg': '30123', 'academicYear': '2011', 'searchSport': 'MBA', 'searchDiv': '1'}
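
For completeness, here is a minimal sketch of the full request with the hidden field included (same Python 2 urllib/urllib2 approach as in the question; untested beyond the values shown above):

#!/usr/bin/python

import urllib
import urllib2

url = 'http://web1.ncaa.org/stats/StatsSrv/careersearch'

# The hidden doWhat field tells the server which of the page's forms was submitted.
values = {
    'doWhat': 'teamSearch',
    'searchOrg': '30123',
    'academicYear': '2011',
    'searchSport': 'MBA',
    'searchDiv': '1',
}

data = urllib.urlencode(values)
req = urllib2.Request(url, data)
response = urllib2.urlopen(req)
print response.read()
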
Lukas Graf

I used mechanize:

import mechanize
from BeautifulSoup import BeautifulSoup

mech = mechanize.Browser()
mech.set_handle_robots(False)  # ignore robots.txt (see the note below)
response = mech.open('http://web1.ncaa.org/stats/StatsSrv/careersearch')

# nr=2 selects the third form on the page (0-indexed), i.e. the search form on the right.
mech.select_form(nr=2)

# Select (dropdown) controls take their values as lists.
mech.form['searchOrg'] = ['30123']
mech.form['academicYear'] = ['2011']
mech.form['searchSport'] = ['MBA']
mech.form['searchDiv'] = ['1']

mech.submit()
soup = BeautifulSoup(mech.response().read())

Note that mechanize expects the values for searchOrg, academicYear, searchSport, and searchDiv as lists, because they are select (dropdown) controls. Also be mindful of the site's robots.txt; the code above deliberately bypasses it with set_handle_robots(False).
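
If you want to pull data out of the parsed page afterwards, something along these lines should work. The exact layout of the results page isn't shown here, so treat this as a rough sketch and adjust the table/row selection to whatever the search actually returns:

for table in soup.findAll('table'):
    for row in table.findAll('tr'):
        # Join the text nodes inside each cell (BeautifulSoup 3 API).
        cells = [''.join(cell.findAll(text=True)).strip() for cell in row.findAll('td')]
        if cells:
            print cells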

tijko