
I have checked many posts, but I can't find what I am doing wrong. I am trying to log in to a URL, but after the `requests.post` call, the printed result is still the sign-in page. Can you tell me what I am doing wrong? Here is the HTML:

<div class="sign-in-section form-section">
<form class="simple_form form-vertical" novalidate="novalidate" action="/users/sign_in" accept-charset="UTF-8" method="post">
    <input name="utf8" type="hidden" value="✓">
    <input type="hidden" name="authenticity_token" value="TneAOk3yv/mEeuKsJucEN1YUSng7+EJc5YfBqKxwugv6gq2lxsZIjGecFwOK/jA0fYYF3aRb9ih15glcoHCWkg==">
    <div class="form-group email optional user_email">
        <input type="text" class="string email optional form-control" placeholder="Email" value="" name="user[email]" id="user_email">
    </div>
    <div class="password-and-submit-wrapper">
        <div class="form-group password optional user_password">
            <input class="password optional form-control" placeholder="Password" type="password" name="user[password]" id="user_password">
        </div>
        <div class="sign-in-button-and-forgot-password-link-wrapper">
            <div class="forgot-password-link-wrapper">
                <a class="forgot-password-link" href="/users/password/reset_email">Forgot password?</a>
            </div>
            <div class="sign-in-button">
                <input type="submit" name="commit" value="Sign in" class=" submit elcurator-button">
            </div>
        </div>
    </div>
</form>
</div>

And here is my code, which runs but doesn't take me to the after-login page:

# Import libraries
import requests
import urllib.request
import time
from bs4 import BeautifulSoup
from lxml import html

login_url = "https://www.elcurator.net/users/sign_in"
url = "https://www.elcurator.net/shared_articles"
    
#Login
USERNAME = "some_email"
PASSWORD = "some_password"

def main():
    headers = {
        "User-Agent": "Mozilla/5.0 (Windows NT 6.3; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2272.101 Safari/537.36",
    }
    session_requests = requests.session()
    session_requests.headers.update(headers)
    result = session_requests.get(login_url,verify=False)
    tree = html.fromstring(result.text)
    authenticity_token = list(set(tree.xpath("//input[@name='authenticity_token']/@value")))[0]

# Create payload
    payload = {
        "user[email]": USERNAME,       
        "user[password]": PASSWORD,     
        "authenticity_token": authenticity_token
    }
    
    result = session_requests.post(login_url, data = payload, headers = dict(referer=login_url))
    print(result)

# Connect to the URL
    response = requests.get(url, verify=False)
    soup = BeautifulSoup(response.text, "html.parser")
    print(soup)

if __name__ == '__main__':
    main()

The output shows "you have to login first" and other messages telling me that I am not logged in. What's wrong?

Danisotomy
  • Whether you're logged in or not is stored in cookies: https://stackoverflow.com/questions/7164679/how-to-send-cookies-in-a-post-request-with-the-python-requests-library (and please don't add random tags to your questions just to get to five) –  Nov 16 '19 at 12:24
  • I think it's more of a problem with the cookies not being set in the response to your login call – amatrue Nov 16 '19 at 12:26

2 Answers


Thanks, it worked. I had to change these lines:

result = session_requests.post(login_url, data = payload, headers = dict(referer=login_url), cookies=result.cookies)
response = requests.get(url, verify=False,cookies=result.cookies)
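
A note on why this change works: the original second request went through `requests.get` (the module-level function), which starts with an empty cookie jar, while the login cookies lived on `session_requests`. Routing every request through the same `Session` object avoids passing `cookies=` by hand. A minimal local sketch of how a session's cookie jar persists (the cookie name and value here are made up for illustration):

```python
import requests

session = requests.Session()

# Simulate the cookie a sign-in response would leave in the session's jar.
session.cookies.set("_session_id", "abc123", domain="www.elcurator.net")

# Any later request made through this same session to a matching domain
# sends the cookie automatically; no cookies= argument is needed.
print(session.cookies.get("_session_id"))  # -> abc123
```

With that in place, the original script would only have needed `response = session_requests.get(url, verify=False)` instead of `requests.get(...)`.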
Danisotomy

@Danisotomy You can try parsing the cookie from the sign_in call's response headers and sending it in every subsequent request.
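
To make this concrete, here is a small sketch of parsing a `Set-Cookie` header with the standard library's `http.cookies` and turning it into a dict that `requests` accepts via its `cookies=` parameter; the header value below is invented for illustration:

```python
from http.cookies import SimpleCookie

# A made-up Set-Cookie header, as it might appear in response.headers.
set_cookie_header = "_app_session=abc123; path=/; HttpOnly"

jar = SimpleCookie()
jar.load(set_cookie_header)

# Build the dict you would pass on every subsequent request,
# e.g. requests.get(url, cookies=cookies).
cookies = {name: morsel.value for name, morsel in jar.items()}
print(cookies)  # -> {'_app_session': 'abc123'}
```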

amatrue