I would like to log in to Facebook Messenger and parse the HTML.

    import requests
    from bs4 import BeautifulSoup
    import webbrowser
    page = requests.get("https://www.messenger.com", auth=
    ('username', 'password'))

    soup = BeautifulSoup(page, 'html.parser')

    print(soup)

I got this from another Stack Overflow question, but it throws this error:

  File "C:/Code/Beautiful Soup Web Scraping.py", line 7, in <module>
    soup = len(BeautifulSoup(page, 'html.parser'))
  File "C:\Users\Ethan\AppData\Local\Programs\Python\Python37\lib\site-packages\bs4\__init__.py", line 246, in __init__
    elif len(markup) <= 256 and (
TypeError: object of type 'Response' has no len()

How can I get this to work?

SquidLord
  • Try printing page to check that your login actually succeeded. I'm guessing the auth is not being passed correctly, and that is why the next line fails – Luan Naufal Nov 30 '18 at 20:26
  • For scraping, the most efficient way to achieve those results is Selenium, as you can easily store cookies based on user profiles: https://www.seleniumhq.org/ – Luan Naufal Nov 30 '18 at 20:27
  • In any case, take a look at how to use cookies with cookiejar and bs4: https://stackoverflow.com/questions/23102833/how-to-scrape-a-website-which-requires-login-using-python-and-beautifulsoup – Luan Naufal Nov 30 '18 at 20:35
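
The two non-Selenium suggestions in the comments above (check whether the login went through, and keep the login cookies) could be sketched roughly like this. Note that this swaps in requests.Session for the cookiejar approach from the linked question. The function names, the login URL, and the form field names are all hypothetical, and a plain form POST is most likely not enough for Messenger itself, since real login forms often require hidden tokens or JavaScript.

```python
import requests


def login_with_session(login_url, form_fields, timeout=10):
    """POST a login form; return the session (which keeps cookies) and the response."""
    session = requests.Session()
    # form_fields is a hypothetical dict such as {"email": "...", "pass": "..."};
    # the real field names must be read from the site's login form.
    response = session.post(login_url, data=form_fields, timeout=timeout)
    return session, response


def summarize_login(session, response):
    """Gather the details worth printing to judge whether the login worked."""
    return {
        "status": response.status_code,  # HTTP status code of the final response
        "final_url": response.url,       # still on the login page?
        "cookie_names": sorted(session.cookies.keys()),
    }
```

Printing summarize_login(session, response) tells you more than print(page), which only shows something like <Response [200]>. Also keep in mind that auth=('username', 'password') in requests sends HTTP Basic Auth credentials, which is a different mechanism from filling in a site's login form.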

2 Answers

You must pass BeautifulSoup the content of the web page, not the Response object returned by requests.get. BeautifulSoup calls len() on whatever markup it receives, and a Response has no length, hence the TypeError. To get the content, use the Response.content property.

In your example, use: soup = BeautifulSoup(page.content, 'html.parser')
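
As a minimal sketch of that fix, assuming requests and bs4 are installed (the helper names soup_from_response and fetch_soup are made up for illustration):

```python
import requests
from bs4 import BeautifulSoup


def soup_from_response(response):
    """Parse the body of a requests Response, not the Response object itself."""
    # response.content is the raw body as bytes; BeautifulSoup works out the encoding.
    return BeautifulSoup(response.content, "html.parser")


def fetch_soup(url, timeout=10):
    """Download a page and return it parsed; raises on HTTP error statuses."""
    response = requests.get(url, timeout=timeout)
    response.raise_for_status()
    return soup_from_response(response)
```

Calling fetch_soup("https://www.messenger.com") should get rid of the TypeError, but it will most likely return the public login page rather than your messages, because this only fixes the parsing step and does not log you in.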

Louis Saglio

I would recommend using Selenium, which will allow you to log in to Facebook, navigate to the desired page, and retrieve the HTML. You can then pass the HTML to BeautifulSoup. Take a look at this blog post to get started.

tdurnford