I know there are existing solutions for scraping websites that require a login, but my case is specific: the login form's action is a relative address ("/login" in the page source), and my login code does not work. I need to scrape a database to which I have authorized access. This is the relevant part of the login page's source:
<form id="login-box" class="box-primary neutral-login-box" action="/login" method="post">
I used the following code to log in:
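import requests

payload = {'username': 'myusername', 'password': 'mypassword'}
with requests.Session() as session:
    # Relative URL copied from the form's action attribute; this call raises MissingSchema
    post = session.post('/login', data=payload)
    response = session.get('myURL')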
This raises the error "MissingSchema: Invalid URL '/login': No schema supplied.". I also tried using the full URL of the login page, and appending "/login" to the site's main address, but in both cases the request goes through and the login still does not succeed. Does anyone know a way to deal with this situation?
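For reference, here is a minimal sketch of what I understand a browser does with a relative action like "/login": it resolves it against the URL of the page that contains the form, and it submits every input in the form, not only the two visible ones. The helper names (LoginFormParser, build_login_request, login) are hypothetical, and two things are assumptions on my part: that the form's inputs are really named "username" and "password", and that the page may contain hidden inputs (for example a CSRF token) that the server expects back. Neither is confirmed by the snippet above.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LoginFormParser(HTMLParser):
    """Collects the action and named inputs of the form with id 'login-box'."""

    def __init__(self):
        super().__init__()
        self.in_form = False
        self.action = None
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "form" and attrs.get("id") == "login-box":
            self.in_form = True
            self.action = attrs.get("action")
        elif tag == "input" and self.in_form and attrs.get("name"):
            # Keep pre-filled values, e.g. hidden tokens (assumed, may not exist).
            self.fields[attrs["name"]] = attrs.get("value") or ""

    def handle_endtag(self, tag):
        if tag == "form":
            self.in_form = False


def build_login_request(page_url, html, username, password):
    """Return (absolute POST URL, payload) for the login form found in html."""
    parser = LoginFormParser()
    parser.feed(html)
    # Resolve the relative action ('/login') against the page's own URL.
    post_url = urljoin(page_url, parser.action or "")
    payload = dict(parser.fields)
    # Field names taken from my payload above; they may differ on the real page.
    payload.update({"username": username, "password": password})
    return post_url, payload


def login(session, page_url, username, password):
    """Fetch the login page with a requests.Session, then submit the form."""
    page = session.get(page_url)  # also stores any cookies set by the login page
    page.raise_for_status()
    post_url, payload = build_login_request(page.url, page.text, username, password)
    return session.post(post_url, data=payload)
```

It would be called inside the same `with requests.Session() as session:` block as above, passing the full address of the login page (scheme included) as page_url, so that the later session.get reuses the cookies from the login response.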