Sorry in advance if I am going about this completely wrong, as I am very new to Jsoup.

I am trying to log in to my school's grade website in order to scrape grade data from it and display it in a program. The login portal is at "https://portal.mcpsmd.org/public/"; however, the page I am trying to scrape is located at "https://portal.mcpsmd.org/guardian/home.html#/termGrades".

This is the code I am trying to use, which I got from a similar Stack Overflow question:

String url = "https://portal.mcpsmd.org/guardian/home.html#/termGrades";
Document doc = Jsoup.connect(url)
        .data("fieldAccount", "MY_SCHOOL_ID")
        .data("fieldPassword", "MY_SCHOOL_PASSWORD")
        .userAgent("Mozilla")
        .post();
System.out.println(doc);

Currently, when I run this program, it prints the HTML of the login page instead of the term grades page. What is the best way to scrape data from a website that requires me to log in first? I am building an Android app that scrapes data from my school's website, but I don't know how to get past the login screen.

  • In general, Jsoup is an HTML parser library, not a web scraping engine. If you want to do more than parse HTML, you will need something more robust. I suggest Selenium WebDriver, which acts like a browser and so works well both for navigating the site and for scraping it. You can still use Jsoup as your parser if you like. Here is some documentation: https://crossbrowsertesting.com/blog/test-automation/automate-login-with-selenium/ https://stackoverflow.com/questions/27720839/web-scrapping-with-jsoup-and-selenium – Adi Ohana Apr 22 '19 at 13:42
  • Look here - https://stackoverflow.com/questions/31871801/problems-submitting-a-login-form-with-jsoup/31877829#31877829 – TDG Apr 22 '19 at 18:28
  • @JamesMcGreivy please don't introduce the Android Studio tag on questions that aren't about the IDE. Use [android] or other related tags instead – Zoe Apr 23 '19 at 14:38
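For reference, the login flow that the comments point toward can be sketched with Jsoup alone. This is a minimal sketch under several assumptions, not a tested solution for this portal: it assumes the login form is the first form on the page at "https://portal.mcpsmd.org/public/", that the field names "fieldAccount" and "fieldPassword" from the question match the real form, and that the server tracks the session with cookies. The class name PortalLogin and the helper hiddenFields are made up for illustration. Two things are worth keeping in mind: the "#/termGrades" part of the URL is a fragment, which is never sent to the server, and Jsoup does not execute JavaScript, so if the grades are rendered client-side they will not appear in the returned HTML (that is the case where the Selenium suggestion applies).

```java
import java.io.IOException;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

import org.jsoup.Connection;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class PortalLogin {
    static final String LOGIN_URL = "https://portal.mcpsmd.org/public/";
    // The "#/termGrades" fragment is dropped: browsers never send it to the server.
    static final String GRADES_URL = "https://portal.mcpsmd.org/guardian/home.html";
    static final String USER_AGENT = "Mozilla/5.0";

    /** Collects the named hidden inputs (for example, tokens) of the first form on the page. */
    public static Map<String, String> hiddenFields(Document page) {
        Map<String, String> fields = new LinkedHashMap<>();
        Element form = page.selectFirst("form"); // assumption: the login form comes first
        if (form == null) {
            return fields;
        }
        for (Element input : form.select("input[type=hidden][name]")) {
            fields.put(input.attr("name"), input.attr("value"));
        }
        return fields;
    }

    public static Document fetchGrades(String user, String password) throws IOException {
        // 1. GET the login page to obtain session cookies and any hidden form fields.
        Connection.Response loginPage = Jsoup.connect(LOGIN_URL)
                .userAgent(USER_AGENT)
                .method(Connection.Method.GET)
                .execute();
        Document loginDoc = loginPage.parse();
        Map<String, String> cookies = new HashMap<>(loginPage.cookies());

        // 2. Build the form data: hidden fields plus the credentials.
        Map<String, String> formData = hiddenFields(loginDoc);
        formData.put("fieldAccount", user);      // field name taken from the question, unverified
        formData.put("fieldPassword", password); // field name taken from the question, unverified

        // 3. POST to the form's action URL, falling back to the login URL if none is found.
        Element form = loginDoc.selectFirst("form");
        String action = (form != null) ? form.absUrl("action") : "";
        if (action.isEmpty()) {
            action = LOGIN_URL;
        }
        Connection.Response loginResult = Jsoup.connect(action)
                .userAgent(USER_AGENT)
                .cookies(cookies)
                .data(formData)
                .method(Connection.Method.POST)
                .execute();
        cookies.putAll(loginResult.cookies());

        // 4. Request the protected page with the session cookies attached.
        return Jsoup.connect(GRADES_URL)
                .userAgent(USER_AGENT)
                .cookies(cookies)
                .get();
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 2) {
            System.err.println("usage: PortalLogin <school-id> <password>");
            return;
        }
        System.out.println(fetchGrades(args[0], args[1]));
    }
}
```

The code in the question posts the credentials straight to the grades URL without a session, which is probably why the server answers with the login page. If the sketch above still returns the login page, comparing its POST with the request the browser sends (visible in the browser's network tab) should show which field names or tokens differ.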
