5.39. Perl regular expressions

Published: 2023-10-20 23:00:04 UTC

A regular expression describes a string-matching pattern. It can be used to check whether a string contains a certain substring, to replace a matching substring, or to extract the substrings of a string that meet a certain condition.

Perl's regular expression support is very powerful, arguably among the most capable of the commonly used languages, and many other languages have borrowed from it when designing their own regular expressions.

Perl regular expressions come in three forms: matching, replacement, and conversion.

  • Match: m// (it can also be abbreviated as / /, omitting m)

  • Replace: s///

  • Conversion: tr///

These three forms are generally used together with =~ or !~. The =~ operator tests for a match, and !~ tests for no match.
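As a minimal sketch of how the two binding operators differ, the following script tests one sample string with both. The string and the variable names are chosen only for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $text = "welcome to runoob site.";   # sample string, assumed for this sketch

# =~ is true when the pattern matches somewhere in the string
my $has_run = ($text =~ /run/) ? 1 : 0;

# !~ is true when the pattern does NOT match anywhere in the string
my $lacks_google = ($text !~ /google/) ? 1 : 0;

print "contains 'run'\n"            if $has_run;
print "does not contain 'google'\n" if $lacks_google;
```

Both conditions hold for this string, so both lines are printed.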

5.39.1. Matching operator

The matching operator m// tests a string against a regular expression. For example, the following code checks whether the scalar $bar contains "run":

Example

#!/usr/bin/perl

$bar = "I am runoob site. welcome to runoob site.";
if ($bar =~ /run/) {
    print "First match\n";
} else {
    print "First Mismatch\n";
}

$bar = "run";
if ($bar =~ /run/) {
    print "Second match\n";
} else {
    print "Second Mismatch\n";
}

Execute the above program, and the output is as follows:

First match
Second match

5.39.2. Pattern matching modifiers

Pattern matching has some common modifiers, as shown in the following table:

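Modifier  Description
--------  -----------
i         Ignore case in the pattern
m         Multiline mode: "^" and "$" also match at the start and end of each line
o         Compile the pattern only once
s         Single-line mode: "." also matches "\n" (by default it does not)
x         Ignore whitespace in the pattern
g         Global matching: find all matches
cg        After a global match fails, keep the search position instead of resetting it, so that matching can resume from there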
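The effect of the most common modifiers can be seen in a short sketch. The sample strings and variable names below are assumptions made for illustration; each result is stored as 1 (matched) or 0 (not matched).

```perl
#!/usr/bin/perl
use strict;
use warnings;

# i: ignore case
my $i_match = ("RUNOOB" =~ /runoob/i) ? 1 : 0;

# m: ^ and $ also match at line boundaries inside the string
my $lines     = "first\nsecond\n";
my $with_m    = ($lines =~ /^second$/m) ? 1 : 0;
my $without_m = ($lines =~ /^second$/)  ? 1 : 0;

# s: "." also matches a newline
my $with_s    = ("a\nb" =~ /a.b/s) ? 1 : 0;
my $without_s = ("a\nb" =~ /a.b/)  ? 1 : 0;

# x: whitespace inside the pattern is ignored, which allows a spaced-out layout
my $x_match = ("2023-10-20" =~ / \d{4} - \d{2} - \d{2} /x) ? 1 : 0;

# g: in list context, return every match instead of only the first
my @all   = ("runoob run" =~ /run/g);
my $count = scalar @all;

print "i: $i_match\n";
print "m: $with_m (without m: $without_m)\n";
print "s: $with_s (without s: $without_s)\n";
print "x: $x_match\n";
print "g: $count matches\n";
```

Running the script prints 1 for every modified match, 0 for the two comparisons made without m and s, and a count of 2 for the global match.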

5.39.3. Regular expression variables

After a successful match, Perl sets three special variables:

  • $`: the part of the string before the match

  • $&: the matched string

  • $': the part of the string after the match

Concatenating these three variables in that order gives back the original string.

For example:

Example

#!/usr/bin/perl

$string = "welcome to runoob site.";
$string =~ m/run/;
print "String before matching: $`\n";
print "Matched string: $&\n";
print "String after matching: $'\n";

The output result of executing the above program is:

String before matching: welcome to
Matched string: run
String after matching: oob site.
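The statement that the three variables reassemble the original string can be checked directly. This is a small sketch; the variable $rebuilt is a name introduced here for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $string  = "welcome to runoob site.";
my $rebuilt = "";

if ($string =~ m/run/) {
    # prefix . match . suffix, in that order
    $rebuilt = $` . $& . $';
}

print(($rebuilt eq $string) ? "same\n" : "different\n");
```

The variables are only meaningful after a successful match, which is why they are read inside the if block.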

5.39.4. Replace operator

The replace operator s/// extends the matching operator: it replaces the text matched by a pattern with a new string. The basic format is as follows:

s/PATTERN/REPLACEMENT/;

PATTERN is the pattern to match, and REPLACEMENT is the replacement string.

For example, the following code replaces "google" in the string with "runoob":

Example

#!/usr/bin/perl

$string = "welcome to google site.";
$string =~ s/google/runoob/;
print "$string\n";

The output result of executing the above program is:

welcome to runoob site.

5.39.5. Replace operation modifiers

The replacement operation modifier is shown in the following table:

Modifier  Description
--------  -----------
i         Makes the pattern case-insensitive, so "a" and "A" are treated as the same.
m         By default "^" and "$" match only at the start and end of the whole string. With "m", they also match at the start and end of each line in the string.
o         The expression is compiled only once.
s         By default "." matches any character except a line break. With "s", "." matches any character, including a line break.
x         Whitespace in the expression is ignored unless it is escaped.
g         Replace all matching strings, not only the first.
e         Evaluate the replacement string as an expression.
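A brief sketch of the g, i, and e modifiers on substitution follows. The sample strings and variable names are assumptions for illustration; each substitution works on a copy so the original string stays unchanged.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $text = "Google google GOOGLE";

# No modifiers: only the first case-sensitive match is replaced
(my $first_only = $text) =~ s/google/runoob/;

# g + i: replace every occurrence, regardless of case
(my $all = $text) =~ s/google/runoob/gi;

# e: the replacement is evaluated as Perl code, here doubling each number
my $prices = "apple 3, pear 5";
(my $doubled = $prices) =~ s/(\d+)/$1 * 2/ge;

print "$first_only\n";   # Google runoob GOOGLE
print "$all\n";          # runoob runoob runoob
print "$doubled\n";      # apple 6, pear 10
```

The `(my $copy = $original) =~ s/.../.../` form copies the string first and then applies the substitution to the copy.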

5.39.6. Conversion operator

The following are the modifiers related to the conversion operator:

Modifier  Description
--------  -----------
c         Complement: convert all characters that are not specified
d         Delete all specified characters that have no corresponding replacement character
s         Squeeze runs of identical output characters into one

The following example converts all lowercase letters in the variable $string to uppercase:

#!/usr/bin/perl

$string = 'welcome to runoob site.';
$string =~ tr/a-z/A-Z/;

print "$string\n";

The output result of executing the above program is:

WELCOME TO RUNOOB SITE.

The following example uses the /s modifier to squeeze runs of repeated characters in the variable $string:

Example

#!/usr/bin/perl

$string = 'runoob';
$string =~ tr/a-z/a-z/s;
print "$string\n";

The output result of executing the above program is:

runob

More examples:

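$string =~ tr/0-9/ /c;    # Replace every non-digit character with a space
$string =~ tr/\t //d;     # Remove tabs and spaces
$string =~ tr/0-9/ /cs;   # Replace each run of non-digit characters with a single space

tr/// works on character lists and ranges, not on regular expression classes such as \d, so digits are written as 0-9.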
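To make the effect of the c, d, and s modifiers concrete, here is a small sketch that applies them to one sample string. The string and the variable names are assumptions for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $raw = "tel: 010-1234  5678";   # sample input, assumed for this sketch

# c + d: delete every character that is NOT a digit
(my $digits_only = $raw) =~ tr/0-9//cd;

# c + s: turn each run of non-digit characters into one space
(my $spaced = $raw) =~ tr/0-9/ /cs;

# d: delete tabs and spaces
(my $no_blanks = $raw) =~ tr/\t //d;

print "[$digits_only]\n";   # [01012345678]
print "[$spaced]\n";        # [ 010 1234 5678]
print "[$no_blanks]\n";     # [tel:010-12345678]
```

The square brackets in the output make the leading space in the second result visible.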

5.39.7. More regular expression rules

Expression  Description
----------  -----------
.           Matches any character except a newline
x?          Matches x zero or one time
x*          Matches x zero or more times, as many times as possible
x+          Matches x one or more times, as many times as possible
.*          Matches any character zero or more times
.+          Matches any character one or more times
{m}         Matches the preceding item exactly m times
{m,n}       Matches the preceding item at least m and at most n times
{m,}        Matches the preceding item at least m times
[]          Matches any one of the characters inside []
[^]         Matches any one character that is not inside []
[0-9]       Matches any digit
[a-z]       Matches any lowercase letter
[^0-9]      Matches any non-digit character
[^a-z]      Matches any character that is not a lowercase letter
^           Matches at the beginning of the string
$           Matches at the end of the string
\d          Matches a digit; same as [0-9]
\d+         Matches one or more digits; same as [0-9]+
\D          Matches a non-digit; same as [^0-9]
\D+         Matches one or more non-digits; same as [^0-9]+
\w          Matches a letter, digit, or underscore; same as [a-zA-Z0-9_]
\w+         Same as [a-zA-Z0-9_]+
\W          Matches a character that is not a letter, digit, or underscore; same as [^a-zA-Z0-9_]
\W+         Same as [^a-zA-Z0-9_]+
\s          Matches a whitespace character; same as [ \n\t\r\f]
\s+         Same as [ \n\t\r\f]+
\S          Matches a non-whitespace character; same as [^ \n\t\r\f]
\S+         Same as [^ \n\t\r\f]+
\b          Matches at a word boundary (between a \w character and a \W character)
\B          Matches at any position that is not a word boundary
a|b|c       Matches a, b, or c
abc         Matches a string containing abc
(pattern)   Remembers the matched text. The text matched by the first () becomes $1 (or \1 inside the pattern), the text matched by the second () becomes $2 (or \2), and so on.
/pattern/i  The i modifier ignores case when matching.
\           Placed before a special character such as * so that it is matched literally instead of acting as a special character.
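Several of the rules in the table can be combined in a short sketch: capture groups with $1, $2, and $3, a \1 backreference, escaping a special character, and greedy matching. All sample strings and variable names here are assumptions for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Capture groups: each () stores its match in $1, $2, $3, ...
my ($year, $month, $day) = ("", "", "");
if ("2023-10-20" =~ /^(\d{4})-(\d{2})-(\d{2})$/) {
    ($year, $month, $day) = ($1, $2, $3);
}

# Backreference: \1 must match the same text that group 1 captured,
# so this finds a word that is immediately repeated
my $repeated = ("this is is a test" =~ /\b(\w+) \1\b/) ? $1 : "";

# Escaping: \* matches a literal asterisk instead of meaning "zero or more"
my $literal_star   = ("2*3" =~ /^2\*3$/) ? 1 : 0;   # literal "2*3"
my $unescaped_star = ("223" =~ /^2*3$/)  ? 1 : 0;   # zero or more 2s, then 3

# Greedy matching: .+ takes as many characters as it can
my ($greedy) = "<b>bold</b>" =~ /(<.+>)/;

print "$year / $month / $day\n";   # 2023 / 10 / 20
print "repeated word: $repeated\n";   # repeated word: is
print "literal star: $literal_star, unescaped star: $unescaped_star\n";
print "greedy: $greedy\n";   # greedy: <b>bold</b>
```

The greedy result spans the whole string because .+ extends to the last ">" it can reach, not the first.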

5.39.8. More references

Perl regular expressions: https://perldoc.perl.org/perlre#Regular-Expressions
