I'm trying to log into a webpage using Python's requests library. It doesn't work, and I think the main issue is that I'm forgetting to send some information with the request. Unfortunately, I don't know how to figure out what exactly is missing.
Important:
This question is not about how to use Python to log in to a webpage (there are already enough other questions that answer this [see here, here, etc.]). I'd like to know how to figure out, from a given HTML page, what I need to send to get past a login screen.
Example
Regardless of that, I think an example can't hurt.
The login page I tried to get past is https://mangadex.org/login. Looking at the HTML, I found:
<input autofocus="" tabindex="1" type="text" name="login_username" id="login_username" class="form-control" placeholder="Username" required="">
<input tabindex="2" type="password" name="login_password" id="login_password" class="form-control" placeholder="Password" required="">
So my first attempt was:
import requests

url = 'https://mangadex.org/login'
payload = {'login_username': 'XXXXXX',
           'login_password': 'YYYYYY'}

# Use 'with' to ensure the session is closed after use.
with requests.Session() as s:
    p = s.post(url, data=payload)
    # Print the returned HTML (or something more intelligent) to check whether the login succeeded.
    print(p.text)
Unfortunately, I just get redirected to the login screen. So there seems to be something "hidden" that gets sent along with the login information, as suggested here (see step 1.3). The issue is that I don't know whether the above website has something like this (there were some hidden fields, but they don't seem to be involved in the login process). If not, I really don't understand how I'm supposed to figure out what is missing.
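To make concrete what "figuring it out from the HTML" could look like, here is a minimal sketch that lists every form on a page together with its action, its method, and all named input fields, including hidden ones. It only uses the standard library's html.parser. The sample HTML, the action URL, and the hidden field name hypothetical_token are made up for illustration and are not the actual MangaDex markup. A listing like this would also miss anything the page adds or submits via JavaScript.

```python
from html.parser import HTMLParser


class FormFieldCollector(HTMLParser):
    """Collect each <form>'s action, method, and named <input> fields."""

    def __init__(self):
        super().__init__()
        self.forms = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "form":
            self.forms.append({
                "action": attrs.get("action"),
                # HTML forms default to GET when no method is given.
                "method": (attrs.get("method") or "get").lower(),
                "fields": {},
            })
        elif tag == "input" and self.forms and attrs.get("name"):
            # Attach the input to the most recently opened form.
            self.forms[-1]["fields"][attrs["name"]] = attrs.get("value") or ""


def list_form_fields(html):
    """Return one dict per form found in the given HTML string."""
    parser = FormFieldCollector()
    parser.feed(html)
    return parser.forms


# Hypothetical markup, for illustration only.
SAMPLE_HTML = """
<form method="post" action="/hypothetical/login-endpoint">
  <input type="hidden" name="hypothetical_token" value="abc123">
  <input type="text" name="login_username" required="">
  <input type="password" name="login_password" required="">
</form>
"""

forms = list_form_fields(SAMPLE_HTML)
for form in forms:
    print(form["method"], form["action"], form["fields"])
```

In this sketch the form's action, rather than the page URL, would be the target of the POST, and the hidden field would have to be sent along with the username and password.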
TL;DR:
Given the HTML of a webpage, how do I figure out what information must be sent to the website to log in successfully?