Python web scraping on site that requires login every time

Question

I've tried a few different methods for this, including the ones detailed in "How to scrape a website that requires login first with Python": https://kazuar.github.io/scraping-tutorial/

The problem, I've realized, is that the website I am attempting to scrape requires a login every time (there's no "keep me logged in" checkbox), so when I make the request (with cookies passed in) I still get

response = requests.get("URL behind login", headers=headers, cookies=cookies)

Did you perhaps mean to paste an error of some sort? Sadly, the answer here may depend heavily on the login mechanism the site uses, but a quick search on SO pointed me to another potential route, given that you are using `requests`: https://stackoverflow.com/questions/12737740/python-requests-and-persistent-sessions – shawnwall, Sep 07 '21 at 18:00
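The persistent-session route mentioned in the comment above could look roughly like the sketch below. It assumes a plain form-based login; the function name `login_and_fetch`, the URLs and the credential field names are hypothetical and depend on the site. Some sites also require a hidden token from the login form, which this sketch does not handle.

```python
def login_and_fetch(session, login_url, protected_url, credentials):
    """Log in once, then fetch a protected page over the same session.

    `session` is expected to be a requests.Session, which stores the cookies
    set by the login response and sends them on later requests automatically.
    """
    # POST the login form; the keys in `credentials` are site-specific (assumption).
    login_response = session.post(login_url, data=credentials)
    login_response.raise_for_status()

    # Reuse the same session so the login cookies accompany this request.
    page_response = session.get(protected_url)
    page_response.raise_for_status()
    return page_response.text
```

With `requests` installed, this would be called as `login_and_fetch(requests.Session(), login_url, protected_url, {...})`. Because the login and the fetch happen in the same run, it does not matter that the site offers no "keep me logged in" option.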

score 0 · Answer 1 · answered Sep 07 '21 at 17:59


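import json  # needed for json.dump / json.load below

# Assumes `page` is a pyppeteer Page and that this runs inside an `async def` function.
# Network.getAllCookies returns a dict of the form {'cookies': [...]}.
cookiesObject = await page._client.send('Network.getAllCookies')
print(cookiesObject)
# Save the cookies so a later run can reuse the logged-in session.
with open('cookiesfile.json', 'w', encoding='utf-8') as f:
    json.dump(cookiesObject, f, ensure_ascii=False, indent=4)
# On the next run, load the saved cookies and restore them before navigating.
with open('cookiesfile.json', encoding='utf-8') as data_file:
    data_loaded = json.load(data_file)
for cookie in data_loaded['cookies']:  # iterate over the cookie list, not the dict keys
    await page.setCookie(cookie)
await page.goto('url')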

Use this method to save the current browsing session's cookies so they can be restored on the next run.
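The save-and-restore step can also be factored into two small helpers, sketched here with the standard library only. The helper names `save_cookies` and `load_cookies` are hypothetical, and the sketch assumes the payload is a dict with a `cookies` list in which each cookie may carry an `expires` timestamp in seconds since the epoch. Dropping cookies whose expiry has passed is an addition for illustration, not part of the answer above.

```python
import json
import time
from pathlib import Path


def save_cookies(cookies_object, path):
    """Write the cookie payload (a dict with a 'cookies' list) to a JSON file."""
    text = json.dumps(cookies_object, ensure_ascii=False, indent=4)
    Path(path).write_text(text, encoding="utf-8")


def load_cookies(path, now=None):
    """Return the saved cookies whose expiry time has not passed yet.

    Cookies without a positive 'expires' value (session cookies) are kept,
    because the file alone cannot tell whether the server still accepts them.
    """
    now = time.time() if now is None else now
    data = json.loads(Path(path).read_text(encoding="utf-8"))
    return [
        cookie
        for cookie in data.get("cookies", [])
        if cookie.get("expires", -1) <= 0 or cookie["expires"] > now
    ]
```

If `load_cookies` returns nothing usable, or the site still redirects to the login page, the saved session is no longer valid and a fresh login is needed.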

answered Sep 07 '21 at 17:59

Zafar Abbas


Please provide additional details in your answer. As it's currently written, it's hard to understand your solution. – Community Sep 07 '21 at 20:19
