
I have a simple text string that contains a dollar sign ($) in a Perl program:

open my $fh, "<", $fp or die "can't read open '$fp': $OS_ERROR";
  while (<$fh>)
  {
    $line=''; #Initialize the line variable
    $line=$_; #Reading a record from a text file
    print "Line is $line\n"; #Printing for confirming
    (@arr)=split('\|',$line);
    

$line gets the following pipe-separated string (confirmed by printing $line value):

Vanilla Cake $3.65 New Offering|Half pound Vanilla Cake||Cake with vanilla, cream and cheese

then split and pull that record into specific array elements:

(@arr)=split('\|',$line);

$arr[0] gets Vanilla Cake $3.65 New Offering, $arr[1] gets Half pound Vanilla Cake, $arr[2] remains empty/NULL, and $arr[3] gets Cake with vanilla, cream and cheese.
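
For reference, a minimal sketch of this split step in isolation. The helper name split_record is hypothetical, and the record is passed as a single-quoted literal rather than read from the real file. The chomp and the -1 limit are additions for the sketch: without them the last field keeps its newline and trailing empty fields are dropped. The "empty/NULL" field comes back as an empty string, not as undef.

```perl
use strict;
use warnings;

# Hypothetical helper: split one pipe-separated record into its fields.
sub split_record {
    my ($line) = @_;
    chomp $line;                     # drop the trailing newline, if any
    return split /\|/, $line, -1;    # -1 keeps trailing empty fields
}

# Single quotes keep the dollar sign literal in this test record.
my @arr = split_record(
    'Vanilla Cake $3.65 New Offering|Half pound Vanilla Cake||Cake with vanilla, cream and cheese' . "\n"
);

# Brackets make empty fields and stray whitespace visible.
printf "arr[%d] = [%s]\n", $_, $arr[$_] for 0 .. $#arr;
```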

Now I check whether $arr[0] contains a price value. The pattern to match is some text (Vanilla Cake ), then a dollar sign ($), followed by one or more digits (3 in this case), and then an optional decimal part: a decimal point followed by one or more digits (.65 in this case). I am using the following regex:

if ($arr[0]=~ /(.*?)(\$\d+(?:\.\d+)?)/)
{
     print "match1 is $1, match2 is $2, match3 is $3, match4 is $4\n";
}

The problem is that $1, $2, $3, $4 - all matching pattern values are printing as NULL/EMPTY. I suppose it is because of the $ sign being a part of the string $arr[0].

My guess is that, because of the $3.65 value, Perl is treating the $3 part (before the decimal point) as a variable and trying to substitute it, and $3 is NULL. So the regex matching is happening but the value extraction may be failing, because the whole string may be getting interpreted as Vanilla Cake .65 and not as Vanilla Cake $3.65.

Probably, that's why the regex matching & extraction is failing.

I also read somewhere that it may be dependent on the variable initialization ($line or $arr[0] as single quote or double quote) - I have no clue about such a dependency (that's why included all the code like initialization of $line variable as above). $line reads one record from a file at a time, so needs to be initialized at each iteration.

Have tried solutions given in Escape a dollar sign inside a variable and Trouble escaping dollar sign in Perl, but unable to get it working. Other trial and errors on creating the regex on https://regex101.com/r/FQjcHp/2/ are also not helping.

Can someone please let me know how to get the values of Vanilla Cake and $3.65 from the above string using the right regex code?

PS: Adding a screenshot of an online compiler run with the same code, which works fine and captures the $ value correctly. Somehow, in my program it is not picking it up.

levent001
  • Can you post a full block of code? You likely have an issue elsewhere, because everything you have shown looks fine and your regex looks correct (https://regex101.com/r/k0ANH0/1). In Perl, 'single quotes preserve $special characters', whereas "double quotes interpolate $variables unless \$escaped". – Negative Zero May 24 '22 at 12:46
  • @NegativeZero - more block code included. – levent001 May 24 '22 at 12:49
  • You only have two capture groups... `$3` and `$4` will always be empty. – Shawn May 24 '22 at 12:56
  • `while (my $line = <$fh>)` (or should it be `$wh`?) and `my @arr = split ...`, btw. Do you have `use warnings;` and `use strict;` in effect? – Shawn May 24 '22 at 13:01
  • Really needs a [mcve]. Use the [DATA](https://perldoc.perl.org/perldata#Special-Literals) file handle instead of a separate input file to keep it self contained. – Shawn May 24 '22 at 13:05
  • I've updated it with more detailed code blocks. Not sure how much more can be added further. Pls let me know what may be needed to diagnose further. – levent001 May 24 '22 at 14:03
  • Your match does not fail. And if the match would fail, you would not get a print. The if-clause is only executed if the match succeeds. If you have warnings enabled, you will get a warning about the uninitialized values $3 and $4, but the print will still happen. Also, your guess is wrong, the $ will not be transformed into another variable. (That would be crazy. Although if you `eval` the variable, it will do that) – TLP May 24 '22 at 14:03
  • @TLP - added a picture to original q where I can see the desired values correctly getting captured when I run it online at https://www.onlinegdb.com/online_perl_compiler. But the same regex is not producing output in my program run locally on my PC. – levent001 May 24 '22 at 14:11
  • @levent001 That is what I get when I run your code with your input. What does it produce on your PC? Copy and paste. (Also, posting images of code is frowned upon) – TLP May 24 '22 at 14:12
  • Also, always use `use strict; use warnings`. It will help you avoid many simple errors and typos. – TLP May 24 '22 at 14:12
  • @TLP - Thanks for the inputs, appreciate your help. Please allow me some time to include strict and warnings. It will mean placing terms like `my` against each and every variable & other statements. Its a part of a very lengthy program and I'll need time to update everything. It's client code, so I don't have all liberty to change everything, but will try with a separate copy. I'll revert as soon as i can. – levent001 May 24 '22 at 14:28
  • Writing a "very lengthy program" without strict and warnings sounds like a very bad idea. However, for the purposes of this question, adding `my` should not take all that long. Or shall we consider your problem solved? – TLP May 24 '22 at 14:30
  • Allow me some time pls - around 8 hours - will revert. Yes, coding standards need to be maintained, but the code was written years ago by somebody else, and now it has become my responsibility to fix it. Typical coder's job :-) – levent001 May 24 '22 at 14:42
  • If it's now your responsibility, switching to strict/warnings will save you a lot of headaches down the line. I know it can be hard to convince the powers that be sometimes, but still... – Ecuador May 24 '22 at 17:30
  • @levent001 I take it adding strict fixed your problem. – TLP May 25 '22 at 10:35
  • Thanks to both @TLP and Aquaholic for providing respective inputs & solutions, as I was able to diagnose & fix the issue with both of your inputs. Much deeper analysis indicates that the code was spread across multiple files (& subroutines). In one internal file, the variable was being altered under double quotes with dollar taken off which was almost invisible in a short line of code. Strict and warnings are the mandatory way to go! Thanks again. – levent001 May 25 '22 at 15:27

2 Answers


This code

if ($foo =~ /(.*?)(\$\d+(?:\.\d+)?)/) {
     print "match1 is $1, match2 is $2, match3 is $3, match4 is $4\n";
}

With this input

Vanilla Cake $3.65 

Will print

Use of uninitialized value $3 in concatenation (.) or string at ...
Use of uninitialized value $4 in concatenation (.) or string at ...
match1 is Vanilla Cake , match2 is $3.65, match3 is , match4 is

The warnings will be silent if you do not have use warnings enabled.

This is what the code you have supplied does with this input, and your screenshot shows the same result. You say, in comments, that it does not do this on your own PC. I would say that is impossible.

Either your code is different, your input is different, or your Perl installation is different (although this is unlikely to be the issue). There is really no alternative.
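
One way to narrow down which of these differs is to print the exact contents of the variable immediately before the match. The following is a minimal sketch, not the asker's actual code: the helper name show_exact is hypothetical, and the sample record is fed through an in-memory filehandle as a stand-in for the real input file. It uses the core Data::Dumper module with its Useqq option so that every character in the string is visible.

```perl
use strict;
use warnings;
use Data::Dumper;

# Hypothetical helper: render a value so that every character is visible.
sub show_exact {
    my ($value) = @_;
    local $Data::Dumper::Useqq  = 1;    # show control characters as escapes
    local $Data::Dumper::Terse  = 1;    # omit the "$VAR1 =" prefix
    local $Data::Dumper::Indent = 0;    # keep the dump on a single line
    return Dumper($value);
}

# Assumption: this in-memory filehandle stands in for the real input file.
my $record = 'Vanilla Cake $3.65 New Offering|Half pound Vanilla Cake||Cake with vanilla, cream and cheese' . "\n";
open my $fh, '<', \$record or die "cannot open in-memory file: $!";

while (my $line = <$fh>) {
    my @arr = split /\|/, $line;
    # If the 3.65 with its dollar sign shows up here, the data is intact
    # at this point and the problem lies somewhere else in the program.
    print "field 0 is ", show_exact($arr[0]), "\n";
}
close $fh;
```

Placing the same kind of dump at several points in a long program shows where the string stops looking the way you expect.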

One huge problem is that you are not using use strict; use warnings with your code. That can mean that any number of problems with your code are hidden. Most likely, in your case, I would say it is a typo, such as:

$Iine = $_;
if ($line =~ /...../)  # <---- not the same variable

But you asked for 8 hours to update your code, so I guess we will find out in 8 hours.


A few pointers

  while (<$fh>)
  {
    $line=''; #Initialize the line variable
    $line=$_; #Reading a record from a text file
  • You do not need to "initialize" the line variable. The next line will make that line completely redundant.
  • That line is not actually reading a record from your file, the readline statement <$fh> is doing that.
  • Usually you would write this line as: while (my $line = <$fh>).
  • $3 and $4 in your print statement can never hold a value, because you lack the capture groups ( ... ) necessary. Two capture groups means only $1 and $2 will be populated.
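
The point about capture groups can be checked with a short sketch. It assumes the regex from the question and a hypothetical helper named price_captures. In list context a match returns one value per capture group, so the length of the returned list shows how many numbered variables can be populated.

```perl
use strict;
use warnings;

# Hypothetical helper: return the captures of the question's regex as a list.
# In list context a successful match returns ($1, $2, ...); a failed match
# returns an empty list.
sub price_captures {
    my ($text) = @_;
    return $text =~ /(.*?)(\$\d+(?:\.\d+)?)/;
}

my @captures = price_captures('Vanilla Cake $3.65 New Offering');

printf "%d capture groups matched\n", scalar @captures;
printf "\$%d = [%s]\n", $_ + 1, $captures[$_] for 0 .. $#captures;
```

With two pairs of capturing parentheses the list has two elements. The (?: ... ) group is non-capturing and does not add a third.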

When writing Perl code, you should always use

use strict;
use warnings;

Leaving them out does not help you; it only hides your problems.
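
As a rough illustration of what strict catches, the sketch below compiles two small snippets with a string eval and reports any compile error. The helper name strict_error and both snippets are made up for this example. The first snippet contains the kind of typo described above, a capital I in place of a lowercase l.

```perl
use strict;
use warnings;

# Hypothetical helper: compile a snippet under "use strict" and return the
# compile error, or an empty string if the snippet compiled cleanly.
sub strict_error {
    my ($snippet) = @_;
    my $ok = eval "use strict; use warnings; $snippet; 1";
    return $ok ? '' : $@;
}

# Single quotes keep the snippets literal until eval compiles them.
my $typo  = 'my $line; $Iine = "Vanilla Cake"';   # capital I, not lowercase l
my $fixed = 'my $line; $line = "Vanilla Cake"';

print "typo:  ", (strict_error($typo)  || "compiled cleanly\n");
print "fixed: ", (strict_error($fixed) || "compiled cleanly\n");
```

Without strict, the misspelled name would silently become a new, empty global variable, and the later match would run against the wrong data.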

Also make a habit of placing the declaration (my $var) in as small a scope as possible. Sample code:

use strict;
use warnings;
use feature 'say';

while (my $line = <DATA>) {
    my @x = split /\|/, $line;
    if ($x[0] =~ /(.*?)(\$\d+(?:\.\d+)?)/) {
        say "$1 is $2";
    }
}

__DATA__
Vanilla Cake $3.65 New Offering|Half pound Vanilla Cake||Cake with vanilla, cream and cheese
TLP

I ran into a similar problem around 2 years back - and had to break my head for more than 5 days before I could get to the root of the issue with the great $ sign. Here's how it went:

Dollar regex value was not printing - something similar to what you are observing.

The perl code written ages ago by someone had initialized the string var with double quotes. Something like

$string="This is some text";

And it worked perfectly till I touched it. :-)

What I did was inserted a variable into it, like

$string="This is some $PriceVariableHavingDollarSign text";

and then I tried to run a dollar matching regex on the $string variable with a hope to detect the dollar. Not exactly, but something very similar to what you are trying to do as follows:

$string=~ /(.*?)(\$\d+(?:\.\d+)?)/

And it either gave a compilation error, or failed to pick up the dollar sign at all with the different regex combinations I tried.

So my answer-cum-suggestion is to check in your "lengthy code" if something similar is happening with double quotes on your variable. Most probably, that may be causing the problem.

If possible, escape the $ sign with a backslash where the value is assigned at the source (that, at least, solved my problem). Instead of

$PriceVariableHavingDollarSign = "Cake is $3.5";

try having

$PriceVariableHavingDollarSign ="Cake is \$3.5";

Here is a great explanation of what happens with double quotes and single quotes in Perl. https://www.effectiveperlprogramming.com/2012/01/understand-the-order-of-operations-in-double-quoted-contexts/
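
The difference between the quoting styles can be checked with a small sketch. The helper name find_price and the variable names are made up for this example, and the uninitialized-value warning is switched off only around the one assignment that triggers it. In a double-quoted literal, Perl interpolates $3 (undefined here, so it becomes an empty string), which leaves nothing for a dollar-matching regex to find.

```perl
use strict;
use warnings;

# Hypothetical helper: return the first dollar amount in a string, or undef.
sub find_price {
    my ($text) = @_;
    return $text =~ /(\$\d+(?:\.\d+)?)/ ? $1 : undef;
}

my $interpolated;
{
    no warnings 'uninitialized';       # $3 is undefined at this point
    $interpolated = "Cake is $3.5";    # $3 is interpolated, the dollar sign is lost
}
my $escaped = "Cake is \$3.5";         # backslash keeps the dollar sign
my $single  = 'Cake is $3.5';          # single quotes never interpolate

for my $text ($interpolated, $escaped, $single) {
    my $price = find_price($text);
    printf "%-14s -> %s\n", $text, defined $price ? $price : 'no price found';
}
```

This only affects string literals written in the source code. A value read from a file, as in the question, is never interpolated.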

And good job for the explicit details you've put in the question, comments and graphic. It helps you to get all possible angles, scenarios as well as solutions.

Aquaholic
  • Actually, your problem was not that you had a variable interpolate into a double quoted string. Your main problem was that you did not use `use strict; use warnings`. Because then your problem would have been solved in 10 seconds. Not my DV, btw – TLP May 25 '22 at 10:34
  • You're right about `use strict; use warnings`. And if it is someone else's code, written ages ago, that you are now required to upgrade or fix, it poses new challenges altogether – Aquaholic May 25 '22 at 11:35
  • Agreed. At some point, it becomes easier to just rewrite the whole thing from scratch. – TLP May 25 '22 at 13:15