The goal is to log in to a SunTrust bank account and scrape the checking account transaction data.

I have tried both the requests library and the Selenium library. I am currently using Selenium so that I can see where the code fails.

import time
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

LOGIN_URL = 'https://login.onlinebanking.suntrust.com/olb/login'
userID = 'username'
password = 'password'

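chrome_path = "path_to_chromedriver"
chrome_options = webdriver.ChromeOptions()
# Start Chrome with the options object and the given ChromeDriver binary
driver = webdriver.Chrome(executable_path=chrome_path, options=chrome_options)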

driver.get(LOGIN_URL)
time.sleep(5)
driver.get_cookies()
driver.find_element_by_id('userId').send_keys(userID)
driver.find_element_by_id('password').send_keys(password)
driver.find_element_by_class_name("suntrust-sign-on").click()

The program should log the user in successfully. However, I receive an error message saying ReasonCode = 6004.

undetected Selenium

1 Answer

I modified your code a bit and tried to log in as follows:

from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.common.exceptions import TimeoutException

options = webdriver.ChromeOptions()
options.add_argument('start-maximized')
options.add_argument('disable-infobars')
options.add_argument('--disable-extensions')
driver = webdriver.Chrome(chrome_options=options, executable_path=r'C:\WebDrivers\chromedriver.exe')
driver.get("https://login.onlinebanking.suntrust.com/olb/login")

WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.CSS_SELECTOR, "input.suntrust-input-text.ng-pristine.ng-valid.ng-touched#userId"))).send_keys("username")
driver.find_element_by_css_selector("input.suntrust-input-text.ng-untouched.ng-pristine.ng-valid#password").send_keys("password")
driver.find_element_by_css_selector("button.suntrust-sign-on.suntrust-button-text>span").click()

However, I was still unable to log in.

If you inspect the DOM tree of the SunTrust Online Banking sign-on page, you will find the following tags within the <body> tag:

  • <script type="text/javascript" src="dist/runtime.7d6aba6a1596ee0b757c.js"></script>
  • <script type="text/javascript" src="dist/polyfills.65913a8531010587b6fe.js"></script>
  • <script type="text/javascript" src="dist/scripts.46e57c2d57ad1b3d210d.js"></script>
  • <script type="text/javascript" src="dist/vendor.43f2240dc35276d98b10.js"></script>
  • <script type="text/javascript" src="dist/main.5d227767baa37ef78819.js"></script>
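The script sources listed above can also be collected programmatically rather than by eye. The following is a minimal sketch using only Python's standard library; the names `ScriptSrcCollector`, `script_sources` and `SAMPLE_HTML` are hypothetical, and the sample markup is a shortened copy of the tags above, not a live fetch of the page.

```python
from html.parser import HTMLParser


class ScriptSrcCollector(HTMLParser):
    """Collect the src attribute of every <script> tag in a document."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the opening tag
        if tag == "script":
            src = dict(attrs).get("src")
            if src:
                self.sources.append(src)


def script_sources(html):
    """Return the src values of all <script> tags found in html."""
    parser = ScriptSrcCollector()
    parser.feed(html)
    return parser.sources


# Hypothetical sample: two of the tags listed above, wrapped in <body>.
SAMPLE_HTML = """
<body>
<script type="text/javascript" src="dist/runtime.7d6aba6a1596ee0b757c.js"></script>
<script type="text/javascript" src="dist/main.5d227767baa37ef78819.js"></script>
</body>
"""

sources = script_sources(SAMPLE_HTML)
# Keep only the scripts served from the dist/ path
dist_sources = [s for s in sources if s.startswith("dist/")]
print(dist_sources)
```

With a running Selenium session, the same function could be given `driver.page_source` instead of the sample string.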

The presence of the phrase dist in these script paths is a clear indication that the website is protected by the bot management service provider Distil Networks, and that navigation by ChromeDriver is detected and subsequently blocked.


Distil

As per the article There Really Is Something About Distil.it...:

Distil protects sites against automatic content scraping bots by observing site behavior and identifying patterns peculiar to scrapers. When Distil identifies a malicious bot on one site, it creates a blacklisted behavioral profile that is deployed to all its customers. Something like a bot firewall, Distil detects patterns and reacts.

Further,

"One pattern with Selenium was automating the theft of Web content", Distil CEO Rami Essaid said in an interview last week. "Even though they can create new bots, we figured out a way to identify Selenium the tool they're using, so we're blocking Selenium no matter how many times they iterate on that bot. We're doing that now with Python and a lot of different technologies. Once we see a pattern emerge from one type of bot, then we work to reverse engineer the technology they use and identify it as malicious".
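The article does not say how Distil identifies Selenium. One signal that any page script can read is the standard `navigator.webdriver` property, which the W3C WebDriver specification defines as true in a browser under WebDriver control. Whether Distil relies on this property is an assumption, not something the quote confirms. The sketch below shows how to check what the page would see; `reports_webdriver_flag` and `FakeDriver` are hypothetical names, and `FakeDriver` only stands in for a real Selenium driver so the example runs without a browser.

```python
def reports_webdriver_flag(driver):
    """Return True if page JavaScript sees navigator.webdriver as truthy.

    driver is any object with a Selenium-style execute_script method.
    """
    return bool(driver.execute_script("return navigator.webdriver"))


class FakeDriver:
    """Hypothetical stand-in for a Selenium driver, for illustration only."""

    def __init__(self, flag):
        self._flag = flag

    def execute_script(self, script):
        # A real driver would run the script in the browser;
        # this stub just returns a preset value.
        return self._flag


automated = reports_webdriver_flag(FakeDriver(True))
ordinary = reports_webdriver_flag(FakeDriver(None))
print(automated, ordinary)
```

With a real session, calling `reports_webdriver_flag(driver)` after `driver.get(...)` would show whether the flag is exposed to the page.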


Reference

You can find a couple of relevant discussions in:

undetected Selenium
  • Is it possible to use a proxy so that the login succeeds? If so, and you know how to add that layer, please share the solution. Thanks – Amit Jain Jun 10 '19 at 20:38
  • @AmitJain Sounds like a possible alternative worth trying, but I am sure it will be a tough call to bypass Distil. – undetected Selenium Jun 10 '19 at 20:41
  • Would using other methods such as Scrapy or the requests library have the same result? – Jonathan Cooper Jun 10 '19 at 20:48
  • @JonathanCooper Honestly, at this point I would avoid commenting on Scrapy or the requests library. Of course, they are worth a try. – undetected Selenium Jun 10 '19 at 20:50
  • Related: this question and its answers, which I found really interesting when I first read them: https://stackoverflow.com/questions/33225947/can-a-website-detect-when-you-are-using-selenium-with-chromedriver – QHarr Jun 11 '19 at 05:17