Login page issue - scraping website that requires login with Python

Question

I know there are existing solutions for scraping websites that require a login, but my case is specific: the login page has an unusual address (just "/login" in the source code), and my login code does not work. I need to scrape a database to which I have authorized access. This is the relevant source of the login page:

   <form id="login-box" class="box-primary neutral-login-box" action="/login" method="post">

I used the following code to log in:

import requests

payload = {'username': 'myusername', 'password': 'mypassword'}
with requests.Session() as session:
    post = session.post('/login', data=payload)
    response = session.get('myURL')

I get the error "MissingSchema: Invalid URL '/login': No schema supplied.". I also tried using the actual link to the login page, and appending "/login" to the main website address, but in both cases the login simply does not work. Does anyone know how to deal with this situation?

Possible duplicate of [Returning 403 Forbidden from simple get](https://stackoverflow.com/questions/49542986/returning-403-forbidden-from-simple-get) — ivan_pozdeev, Apr 24 '18 at 10:47

score 0 · Answer 1 · answered Apr 24 '18 at 10:21

The problem is this line:

post = session.post('/login', data=payload)

You need to provide a full URL, including the scheme and host. Something like this:

post = session.post('http://google.com/login', data=payload)
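One way to build that full URL, instead of typing it by hand, is to resolve the form's `action` value against the address of the page that contains the form. The following is only a minimal sketch: `LOGIN_PAGE_URL` and the `resolve_form_action` helper are hypothetical names introduced for illustration, and `example.com` stands in for the real site.

```python
from urllib.parse import urljoin

# Hypothetical address of the page that serves the login form (an assumption,
# replace it with the real page URL).
LOGIN_PAGE_URL = "https://example.com/account/login"


def resolve_form_action(page_url: str, action: str) -> str:
    """Resolve a form's action attribute against the URL of its page."""
    # urljoin applies the same relative-reference rules a browser uses:
    # an action starting with "/" replaces the whole path of page_url.
    return urljoin(page_url, action)


# "/login" is the action value from the form in the question.
login_url = resolve_form_action(LOGIN_PAGE_URL, "/login")
print(login_url)
```

The resulting `login_url` can then be passed to `session.post(login_url, data=payload)` in place of the bare `'/login'`.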

Cao Minh Vu

