Regular expression in Python
I am struggling to write a regular expression in Python.
For instance, I get the following right:
"GET /images/launch-logo.gif HTTP/1.0" 220 1839
is matched by
"(\S+) (\S+)\s*(\S*)" (\d{3}) (\S+)
However, I still need a single expression that also covers all of the following cases:
1. "GET /history/history.html hqpao/hqpao_home.html HTTP/1.0" 200 1502
2. "GET /shuttle/missions/missions.html Shuttle Launches from Kennedy Space Center HTTP/1.0"200 8677
3. "GET /finger @net.com HTTP/1.0"404 -
Obviously I should change the bold part of the expression
"(\S+) (\S+)\s*(\S*)" (\d{3}) (\S+)
but how should I change it? One approach I have in mind is to change the bold part to
[\s |(\s*)(\S+) |(\S+)(12) |(\S+)]
where the 2nd, 3rd, and 4th alternatives are meant to handle the extra cases (1), (2), and (3).
But my expression does not work. What am I misunderstanding about regular expressions, given that I am simply handling it case by case?
python regex
edited Nov 11 at 11:31 by das-g
asked Nov 11 at 10:20 by Ricky Ng
Are the beginnings (1), (2) and (3) part of what you want to match, or is that a numbered list of strings to match?
– das-g, Nov 11 at 10:27
My regular expression has to cover all of (1), (2), (3) and the very first case.
– Ricky Ng, Nov 11 at 10:28
Yes, but is the actual string to match in the (1) case (1)"GET /history/history.html hqpao/hqpao_home.html HTTP/1.0" 200 1502, or is it just "GET /history/history.html hqpao/hqpao_home.html HTTP/1.0" 200 1502, and the (1) is just there to number it in your post here?
– das-g, Nov 11 at 10:32
Oh! It is just for numbering and is not part of the requirement.
– Ricky Ng, Nov 11 at 11:29
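On the question's "what do I misunderstand": one likely culprit is the square brackets. In Python's `re` syntax, `[...]` is a character class that matches a single character from a set, and `|`, `(`, `)` lose their special meaning inside it. Alternation between whole sub-patterns needs a group such as `(?:...|...)`. A minimal sketch with made-up toy patterns (`ab` and `cd` are placeholders, not taken from the log format):

```python
import re

# Inside [...] every character is just a member of a set, so this matches
# exactly ONE character out of: a, b, |, c, d.
as_class = re.compile(r'[ab|cd]')

# A (non-capturing) group with | matches either the string "ab" or "cd".
as_group = re.compile(r'(?:ab|cd)')

class_matches_ab = as_class.fullmatch('ab') is not None    # two chars: no
class_matches_pipe = as_class.fullmatch('|') is not None   # literal pipe: yes
group_matches_ab = as_group.fullmatch('ab') is not None    # alternative 1: yes
group_matches_pipe = as_group.fullmatch('|') is not None   # not an alternative: no

print(class_matches_ab, class_matches_pipe, group_matches_ab, group_matches_pipe)
```

So a case-by-case approach can work in principle, but the cases have to be written as alternatives inside a group rather than inside brackets.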
2 Answers
This might be a bit messy, but it works:
"(\S+) (\S+[\s\w.@]*)\s*(\S*)"\s?(\d{3})\s(\S+)*
You can play with it on Regexr. Regexr Shared Link
answered Nov 11 at 10:35 by Dani G
You may use
^"([^\s"]+)\s+([^\s"]+)(?:\s+([^"]+?))?\s+([A-Z]+/\d[\d.]*)"\s*(\d{3})\s*(\S+)$
See the regex demo
Details
- `^` - start of a line (use `re.M` if you are reading the whole file into a variable, `f.read()`)
- `"` - a double quotation mark
- `([^\s"]+)` - Group 1: one or more chars other than whitespace and a double quotation mark
- `\s+` - 1+ whitespaces
- `([^\s"]+)` - Group 2: one or more chars other than whitespace and a double quotation mark
- `(?:\s+([^"]+?))?` - an optional non-capturing group matching
  - `\s+` - 1+ whitespaces
  - `([^"]+?)` - Group 3: any 1 or more chars other than `"`, as few as possible
- `\s+` - 1+ whitespaces
- `([A-Z]+/\d[\d.]*)` - Group 4: 1+ uppercase letters, `/`, and then 1 digit followed by any 0+ digits or `.` chars
- `"` - a double quotation mark
- `\s*` - 0+ whitespaces
- `(\d{3})` - Group 5: three digits
- `\s*` - 0+ whitespaces
- `(\S+)` - Group 6: 1 or more non-whitespace chars
- `$` - end of string (or end of a line when `re.M` is used).
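As a quick check, here is a minimal sketch that applies the pattern above to the four sample lines from the question. The names `LOG_LINE`, `text` and `rows` are made up for illustration, and the lines are joined with newlines to imitate a file read with `f.read()`, which is why `re.M` is passed:

```python
import re

# Pattern from the answer above, with the backslashes written out.
LOG_LINE = re.compile(
    r'^"([^\s"]+)\s+([^\s"]+)(?:\s+([^"]+?))?\s+([A-Z]+/\d[\d.]*)"\s*(\d{3})\s*(\S+)$',
    re.M,  # make ^ and $ match at the start and end of every line
)

# Sample lines from the question, joined as if read with f.read().
text = "\n".join([
    '"GET /images/launch-logo.gif HTTP/1.0" 220 1839',
    '"GET /history/history.html hqpao/hqpao_home.html HTTP/1.0" 200 1502',
    '"GET /shuttle/missions/missions.html Shuttle Launches from Kennedy Space Center HTTP/1.0"200 8677',
    '"GET /finger @net.com HTTP/1.0"404 -',
])

# One tuple of six groups per matched line; Group 3 is None when the
# optional middle part is absent.
rows = [m.groups() for m in LOG_LINE.finditer(text)]
for row in rows:
    print(row)
```

When processing a file line by line instead, `LOG_LINE.match(line.rstrip("\n"))` on each line should behave the same way, and `re.M` is then not needed.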
answered Nov 11 at 11:22 by Wiktor Stribiżew