I'm trying to log in to this page using python-requests:

headers = {
    'content-type': 'application/x-www-form-urlencoded',
    'User-Agent':'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Ubuntu Chromium/33.0.1750.152 Chrome/33.0.1750.152 Safari/537.36'
}

data = {
    'username':myusername,
    'password':mypassword,
}
r = requests.post(url,data=data,headers=headers)

I printed the returned response with print r and the output was <Response [200]>, but the HTML I got back was that of the login page. I was expecting the HTML of the page we are redirected to after login.

user3628682
  • You should *not* set the content type header; requests will take care of that for you depending on the requirements of the form. A POST defaults to `application/x-www-form-urlencoded` anyway. – Martijn Pieters May 18 '14 at 09:53

2 Answers

The login form contains several hidden fields:

<input type="hidden" name="lt" value="LT-1314930-GPfgUfyUj5eRY4RCaoa1Xi3gi5Jfsf" />
<input type="hidden" name="execution" value="e3s1" />
<input type="hidden" name="_eventId" value="submit" /> 

Most likely the first, and perhaps the second field are auto-generated and tied to the session. You'll need to load the login page first (using a session), parse those fields and include them in your POST.
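As a rough sketch of that flow (load the page, collect the hidden fields, post them back with the credentials), something like the following could work. The names `HiddenFieldParser`, `hidden_fields` and `login` are made up for illustration, the standard library's `html.parser` stands in for BeautifulSoup, and it assumes the form posts back to the same URL the login page was loaded from, which may not hold for this site.

```python
from html.parser import HTMLParser


class HiddenFieldParser(HTMLParser):
    """Collect name/value pairs of every <input type="hidden"> element."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        # Self-closing tags such as <input ... /> also end up here.
        attrs = dict(attrs)
        is_hidden = (attrs.get("type") or "").lower() == "hidden"
        if tag == "input" and is_hidden and attrs.get("name"):
            self.fields[attrs["name"]] = attrs.get("value") or ""


def hidden_fields(html):
    """Return a dict of the hidden form fields found in the HTML text."""
    parser = HiddenFieldParser()
    parser.feed(html)
    return parser.fields


def login(session, login_url, username, password):
    """Fetch the login page, then post credentials plus the hidden fields.

    `session` is expected to behave like requests.Session(), so cookies set
    by the first request are sent along with the POST.
    Assumption: the form's action is the same URL as the login page.
    """
    page = session.get(login_url)
    data = hidden_fields(page.text)
    data.update({"username": username, "password": password})
    return session.post(login_url, data=data)
```

With requests installed, this would be called as `login(requests.Session(), url, myusername, mypassword)`. Passing the session in keeps the cookie handling in one place and lets you reuse the same session for later requests.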

The reason you get 200 responses is that the site redirects unauthorized requests back to the login page; check r.history, there will be one or more 302 responses in that list.
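To see those redirects, a small helper along these lines can list every hop. The name `redirect_chain` is hypothetical; it relies only on the `history`, `status_code` and `url` attributes of a requests response.

```python
def redirect_chain(response):
    """Return (status_code, url) for each hop, ending with the final response."""
    hops = list(response.history) + [response]
    return [(hop.status_code, hop.url) for hop in hops]
```

Calling `redirect_chain(r)` on the response from the question should then show any 302 entries before the final 200, if the site is indeed bouncing you back to the login page.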

You could use BeautifulSoup to parse this, or use robobrowser, which combines requests and BeautifulSoup, together with a dedicated form handler to make a browser-like framework for navigating a website:

from robobrowser import RoboBrowser

browser = RoboBrowser(history=True,
    user_agent='Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Ubuntu Chromium/33.0.1750.152 Chrome/33.0.1750.152 Safari/537.36')
browser.open('http://selleraccounts.snapdeal.com/')

form = browser.get_form(id='fm1')
form['username'].value = myusername
form['password'].value = mypassword
browser.submit_form(form)
Martijn Pieters
  • Whoa, this looks like an amazing tool, but what do I do now? I'm lost here. Do I still need to use requests/urllib in my program, or can robobrowser alone do everything? It would be great if you could give me a hint on how to proceed. I tried to look for robobrowser tutorials, but all I could find was the API. – user3628682 May 18 '14 at 09:53
  • Robobrowser is relatively new, so example code is still sparse. You don't need anything else, really. You can get the original response object as `browser.state.response`. – Martijn Pieters May 18 '14 at 10:04
  • Thanks :) I used `browser.parsed` for the complete text. Is that correct? I have one more query: will robobrowser maintain the session automatically, similar to `requests.Session()`? I don't want to log in every time. – user3628682 May 18 '14 at 10:30
  • Yes, `.parsed` is the BeautifulSoup parse tree. The `browser` object has several methods that let you search it too (like `find()`, `find_all()` and `select()`). Robobrowser uses a requests Session under the hood. – Martijn Pieters May 18 '14 at 10:43

Two things:

  1. Just because the response code is 200 for your login request doesn't mean it's succeeding. It could be that the developer isn't following REST guidelines for whatever reason and returns a 200 with the body indicating an error.

  2. When I tried to log in to the website you provided with Chrome developer tools enabled, I inspected the traffic and observed that the website is transmitting more than just the username and password. Specifically, there are 4 other fields:

    username:adb  
    password:asdf    
    lt:LT-1315009-vg7Xm5MTSfBYkGNuaiUbAFZqVZNmoP   
    execution:e2s1   
    _eventId:submit   
    submit:LOGIN
    

I suspect that some of them are anti-CSRF tokens, which you might need to scrape from the login page that you initially receive, but regardless I don't think the login request will go through unless you provide the correct value for each of these fields.

gilsho