I want to download a CSV file using CasperJS. This is what I wrote:

var login_id = "my_user_id";
var login_password = "my_password";

var casper = require('casper').create();

casper.userAgent('Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/27.0.1453.116 Safari/537.36 ');

casper.start("http://eoddata.com/symbols.aspx",function(){
    this.evaluate(function(id,password) {
        document.getElementById('tl00_cph1_ls1_txtEmail').value = id;
        document.getElementById('ctl00_cph1_ls1_txtPassword').value = password;
        document.getElementById('ctl00_cph1_ls1_btnLogin').submit();

    }, login_id, login_password);
});

casper.then(function(){
    this.wait(3000, function() {
        this.echo("Waiting...");
    });
});

casper.then(function(){
    this.download("http://eoddata.com/Data/symbollist.aspx?e=NYSE","nyse.csv");
});

casper.run();

I got nyse.csv, but the file contained the HTML of the site's registration page instead of CSV data.

It seems the login process fails. How can I log in correctly and save the CSV file?

Update 2015/05/13

Following @Darren's advice, I wrote this:

casper.start("http://eoddata.com/symbols.aspx");
casper.waitForSelector("form input[name = ctl00$cph1$ls1$txtEmail ]", function() {
  this.fillSelectors('form', {
    'input[name = ctl00$cph1$ls1$txtEmail ]' : login_id,
    'input[name = ctl00$cph1$ls1$txtPassword ]' : login_password,
    }, true);
});

This code ends with the error "Wait timeout of 5000ms expired, exiting.". As far as I understand, the error means that the CSS selector did not match any element. How can I fix this?

Update 2015/05/18

I changed the code to this:

var fs = require('fs'); // PhantomJS file system module, used by fs.write() below

casper.waitForSelector("form input[name = ctl00$cph1$ls1$txtEmail]", function() {
    this.fillSelectors('form', {
        'input[name = ctl00$cph1$ls1$txtEmail]' : login_id,
        'input[name = ctl00$cph1$ls1$txtPassword]' : login_password,
    }, true);
}, function() {
    fs.write("timeout.html", this.getHTML(), "w");
    casper.capture("timeout.png");
});

I inspected timeout.html with Chrome Developer Tools and Firebug, and confirmed several times that the input element is there:

<input name="ctl00$cph1$ls1$txtEmail" id="ctl00_cph1_ls1_txtEmail" style="width:140px;" type="text">

How can I fix this problem? I have already spent several hours on this issue.

Update 2015/05/19

Thanks to Darren, unarist and Artjom I got rid of the timeout error, but there is still another problem.

The downloaded CSV file was still the registration HTML page, so I rewrote the code like this to find the cause:

casper.waitForSelector("form input[name ='ctl00$cph1$ls1$txtEmail']", function() {
    this.fillSelectors('form', {
        "input[name ='ctl00$cph1$ls1$txtEmail']" : login_id,
        "input[name ='ctl00$cph1$ls1$txtPassword']" : login_password,
    }, true);
});/*, function() {
    fs.write("timeout.html", this.getHTML(), "w");
    casper.capture("timeout.png");
});*/

casper.then(function(){
    fs.write("logined.html", this.getHTML(), "w");
});

In logined.html the email field was filled in correctly, but the password field was empty. Does anyone have a guess as to the cause?

ironsand

2 Answers

At first glance your script looks reasonable. But there are a couple of ways to make it simpler, which should also make it more robust.

First, instead of your evaluate() call, use fillSelectors():

this.fillSelectors('form', {
  'input[name = id ]' : login_id,
  'input[name = pw ]' : login_password,
  }, true);

The final true argument tells fillSelectors() to submit the form after filling it in. (I guessed the field names, but I'm fairly sure you could continue to use CSS IDs if you prefer.)

But, even better is to not fill the form until you are sure it is there:

casper.waitForSelector("form input[name = id ]", function() {
  this.fillSelectors('form', {
    'input[name = id ]' : login_id,
    'input[name = pw ]' : login_password,
    }, true);
});

This matters if the login form is placed on the page dynamically by JavaScript (possibly even from an Ajax call), and so does not exist yet at the moment the page has loaded.

The other change is to replace casper.wait() with one of the casper.waitForXXX() functions, to make sure the CSV file link is there before you try to download it. A fixed 3-second wait fails whenever the remote server takes longer than that to respond, and wastes time whenever it responds in 1 second.
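As a rough sketch of that second change, the fixed wait could be replaced by something like the following. The function name addDownloadStep and the LINK_SELECTOR value are my own assumptions (I have not checked what the link on the page actually looks like); only waitForSelector(), download(), echo() and capture() are CasperJS calls.

```javascript
// Sketch only: wait for a link to appear instead of sleeping for 3 seconds.
// LINK_SELECTOR is a guess at the markup; adjust it to the real page.
var LINK_SELECTOR = 'a[href*="symbollist.aspx"]';

// Hypothetical helper: registers a wait-then-download step on a casper instance.
function addDownloadStep(casper, url, target) {
    casper.waitForSelector(LINK_SELECTOR, function () {
        // The link exists, so the page is ready: fetch the file.
        casper.download(url, target);
    }, function () {
        // Timeout handler: leave evidence behind for troubleshooting.
        casper.echo("Download link never appeared; saving a screenshot.");
        casper.capture("timeout.png");
    }, 10000); // wait up to 10 seconds instead of the 5-second default
}
```

In a real script this would be called once before casper.run(), for example addDownloadStep(casper, "http://eoddata.com/Data/symbollist.aspx?e=NYSE", "nyse.csv").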

UPDATE: When you get a time-out on the waitFor lines it tells you the root of your problem is you are using a selector that is not there. This, I find, is the biggest time-consumer when writing Casper scripts. (I recently envisaged a tool that could automate trying to find a near-miss, but couldn't get anyone else interested, and it is a bit too big a project for one person.) So your troubleshooting start points will be:

  • Add an error handler to the timing-out waitFor() command and take a screenshot (casper.capture()).
  • Dump the HTML. If you know the ID of a parent div, you could give that, to narrow down how much you have to look for.
  • Open the page with Firebug (or the tool of your choice) and poke around to find what is there. (Remember you can type a jQuery command, or a document.querySelector() command, in the console, which is a good way to find the correct selector interactively.)
  • Try with SlimerJS, instead of PhantomJS (especially if still using PhantomJS 1.x). It might be that the site uses some feature that is only supported in newer browsers.
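One near-miss worth ruling out early, as the comments below report for this page, is an unquoted attribute value containing `$`. In CSS an unquoted attribute value has to be a valid identifier, and `$` is not allowed in one, so quoting the value is the safe choice. Here is a minimal sketch of two helpers for building selectors from ASP.NET-style control names. Both function names are hypothetical and not part of CasperJS, and the name-to-ID mapping is only assumed from the markup shown in the question.

```javascript
// Hypothetical helpers (not part of CasperJS) for building selectors
// from control names such as "ctl00$cph1$ls1$txtEmail".

// Build an attribute selector with the value in double quotes,
// so that characters like "$" are accepted by the selector engine.
function nameSelector(tag, name) {
    // Escape backslashes and double quotes inside the value.
    var escaped = name.replace(/\\/g, "\\\\").replace(/"/g, '\\"');
    return tag + '[name="' + escaped + '"]';
}

// Assumption: the id is the name with every "$" replaced by "_",
// which matches the <input> shown in the question.
function idSelector(name) {
    return "#" + name.replace(/\$/g, "_");
}

console.log(nameSelector("input", "ctl00$cph1$ls1$txtEmail"));
// input[name="ctl00$cph1$ls1$txtEmail"]
console.log(idSelector("ctl00$cph1$ls1$txtEmail"));
// #ctl00_cph1_ls1_txtEmail
```

Either string can then be passed to waitForSelector() or used as a key in fillSelectors().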
Darren Cook
  • Thanks for the great tips! But I ended up with an error: "Wait timeout of 5000ms expired, exiting.". I added more detail to my question. – ironsand May 13 '15 at 06:41
  • Sorry, I didn't notice your answer's update. I followed your advice, but I couldn't solve the problem. All I know is that casperJS can't find the input element. – ironsand May 18 '15 at 05:10
  • @ironsand An attribute value should probably be wrapped in `"` because [CasperJS uses CSS3 selectors or XPath](http://casperjs.readthedocs.org/en/latest/selectors.html), not jQuery. – unarist May 18 '15 at 07:39
  • @unarist I personally always wrap my attribute selectors in `"`. But I just did a test of removing them in a working casper test (a login form), and it continued to work perfectly (both with slimerjs and phantomjs 1.9). However, I noticed the OP is using IDs with `$`. It may be that special characters require double quotes (unfortunately I cannot test that so easily...) – Darren Cook May 18 '15 at 08:20
  • @ironsand Do try proper quoting, as suggested by unarist, but if you still have no luck then the problem is (almost certainly) that you are using the wrong selector. If you followed the troubleshooting advice I gave, you cannot help but find the correct selector. Why not update the question with the HTML of the login form? Also, when using Firebug, remember you can type a jQuery command (or `document.querySelector()`) in the console, which is a good way to interactively confirm you have the correct selector. – Darren Cook May 18 '15 at 08:26
  • @ironsand You are getting a time-out on the waitForSelector, aren't you? So it is the HTML for the tag you need to be paying attention to. BTW instead of `'input[name = "ctl00$cph1$ls1$txtEmail" ]'` you could just use `'#ctl00_cph1_ls1_txtEmail'` :-) – Darren Cook May 18 '15 at 10:28
  • @ironsand I just tried quoting the attribute selector value with PhantomJS 2. The element was found with the quotes and the wait timed out without the quotes. Unarist's hunch was correct: `waitForSelector("form input[name = 'ctl00$cph1$ls1$txtEmail' ]", function() {...` – Artjom B. May 18 '15 at 13:00
  • Thanks for the help! I finally got rid of the timeout error, but somehow the password field is not filled in, although the email field is filled in correctly. I added more detail about this issue to my question. I'm very grateful for any help. – ironsand May 19 '15 at 03:11

The trick is to log in successfully. There are multiple ways to log in. I've tried some, and the only one that works on this page is triggering the form submission with the Enter key. This is done with the PhantomJS page.sendEvent() function. The fields can be filled using casper.sendKeys().

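// Credentials, as defined in the question
var login_id = "my_user_id";
var login_password = "my_password";

var casper = require('casper').create();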

casper.userAgent('Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/27.0.1453.116 Safari/537.36 ');

casper.start("http://eoddata.com/symbols.aspx",function(){
    this.sendKeys("#ctl00_cph1_ls1_txtEmail", login_id);
    this.sendKeys("#ctl00_cph1_ls1_txtPassword", login_password, {keepFocus: true});
    this.page.sendEvent("keypress", this.page.event.key.Enter);
});

casper.waitForUrl(/myaccount/, function(){
    this.download("http://eoddata.com/Data/symbollist.aspx?e=NYSE", "nyse.csv");
});

casper.run();

It seems that it is necessary to wait for that specific page explicitly. For some reason CasperJS doesn't notice that a new page was requested, so a plain then() step does not wait for it.

Other ways that I tried were:

  • Filling and submitting the form with casper.fillSelectors()
  • Filling through the DOM with casper.evaluate() and submitting by clicking on the login button with casper.click()
  • Mixing all of the above.
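Since a failed login still produces a file (in the question, nyse.csv contained the registration page), it may be worth checking the downloaded content before trusting it. Below is a minimal sketch of such a check. The function looksLikeHtml is a hypothetical helper of mine, not a CasperJS API, and the CSV text in the example is made-up sample data.

```javascript
// Hypothetical helper (not part of CasperJS): guess whether downloaded
// text is an HTML page rather than CSV data.
function looksLikeHtml(text) {
    // Skip a leading byte-order mark and whitespace, then look for markup.
    var head = text.replace(/^\uFEFF/, "").replace(/^\s+/, "")
                   .slice(0, 200).toLowerCase();
    return head.indexOf("<!doctype") === 0 || head.indexOf("<html") === 0;
}

// Made-up sample inputs for illustration.
console.log(looksLikeHtml("<!DOCTYPE html><html><body>Register</body></html>")); // true
console.log(looksLikeHtml("Symbol,Description\nA,Agilent Technologies"));        // false
```

In a CasperJS script the text could be read back after the download step, for example with the PhantomJS fs module (require('fs').read("nyse.csv")), and the script could report a login failure when the check returns true.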
Artjom B.