
I know there are existing solutions for scraping websites that require a login, but my question is specific: the login form's action is a relative path ("/login" in the page source), and my login code does not work with it. I need to scrape a database to which I have authorized access. This is the relevant part of the login page's source:

   <form id="login-box" class="box-primary neutral-login-box" action="/login" method="post">

I used the following code to log in:

import requests

payload = {'username': 'myusername', 'password': 'mypassword'}
with requests.Session() as session:
    post = session.post('/login', data=payload)
    response = session.get('myURL')

I get the error "MissingSchema: Invalid URL '/login': No schema supplied." I also tried using the full URL of the login page, and appending "/login" to the site's base address, but in both cases the login does not succeed. Does anyone know how to deal with this situation?

Agaz Wani
  • Possible duplicate of [Returning 403 Forbidden from simple get](https://stackoverflow.com/questions/49542986/returning-403-forbidden-from-simple-get) – ivan_pozdeev Apr 24 '18 at 10:47

1 Answer


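The problem is this line:

post = session.post('/login', data=payload)

requests does not know which site a relative path such as '/login' belongs to, so you have to pass the full URL, including the scheme and host. Something like this: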

post = session.post('http://google.com/login', data=payload)
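If you would rather not hard-code the login URL, you can resolve the form's action against the address of the page that contains the form, which is roughly what a browser does. The following is only a minimal sketch: LOGIN_PAGE_URL, resolve_form_action and log_in are hypothetical names, and the example.com address is a placeholder, not your real site.

```python
from urllib.parse import urljoin

# Hypothetical address of the page that contains the login form;
# replace it with the real one.
LOGIN_PAGE_URL = "https://example.com/account/signin"


def resolve_form_action(page_url, action):
    """Turn a form's action attribute into an absolute URL."""
    return urljoin(page_url, action)


def log_in(session, page_url, action, payload):
    """POST the credentials to the resolved form action and return the response.

    `session` is expected to be a requests.Session (or any object with
    compatible get/post methods).
    """
    # Load the login page first so the session picks up any cookies it sets.
    session.get(page_url)
    response = session.post(resolve_form_action(page_url, action), data=payload)
    # Raise on 4xx/5xx; a 200 status alone does not prove the login succeeded.
    response.raise_for_status()
    return response


# A root-relative action replaces the whole path of the page URL.
print(resolve_form_action(LOGIN_PAGE_URL, "/login"))

# Usage with requests installed (not executed here, it needs a real site):
#
#     import requests
#     payload = {'username': 'myusername', 'password': 'mypassword'}
#     with requests.Session() as session:
#         log_in(session, LOGIN_PAGE_URL, "/login", payload)
#         response = session.get('myURL')
```

If the POST reaches the right URL and the login still fails, two common causes (assumptions here, since the full form is not shown in the question) are that the payload keys do not match the name attributes of the form's input fields, or that the form contains hidden fields such as a CSRF token that must be sent along with the credentials.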
Cao Minh Vu