Regex to change all text to lowercase but leave out parts of text that start and end in a specific way












Is there a way to change all text to lowercase except words that start with a specific combination of letters ("ABC") and end with whitespace (dots, hyphens, and underscores can appear within them)?
I want to preserve capitalization in words like "ABCkjkJ.90_1 " or "ABC-12_OLL " but lowercase everything else.



Find:

(I have no idea)

[^ABC][\s]$

Replace with:

\L$1


Also, how should I delete all punctuation from the rest of the text (but not from the words starting with ABC)?










  • Regexes are not language agnostic. \L and other case-changing operators are not supported in many regex libraries. Other features you may need for this task may differ from regex library to regex library.

    – Wiktor Stribiżew
    Nov 16 '18 at 6:11


















regex language-agnostic














edited Nov 16 '18 at 6:55







Antidisestablishmentarianism

















asked Nov 16 '18 at 4:17









Antidisestablishmentarianism













1 Answer
The problem boils down to matching words that don't start with ABC. Because words in your string can contain dots and hyphens, which aren't word characters, we unfortunately can't use \b to find the start of a word. Instead, match the preceding space (or the beginning of the string) with

(?: |^)

then use a negative lookahead for ABC, and match as many word characters, dots, or hyphens as possible:

(?: |^)(?!ABC)[\w.-]*

Then, lowercase every full match.



https://regex101.com/r/QSShDu/1



Example, for input:



Baz Buzz ABCkjkJ.90_1 ABC-12_OLL Foo Bar


you get



baz buzz ABCkjkJ.90_1 ABC-12_OLL foo bar
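
Since \L is not available in every regex library, the "lowercase every full match" step can also be done with a replacement callback. The following is a minimal Python sketch, assuming the prefix to protect is the literal uppercase ABC; the function name lowercase_except_abc is hypothetical and chosen only for illustration.

```python
import re

# A space or line start, then a run of word characters, dots, or hyphens
# that does not begin with "ABC" (assumed to be uppercase, as in the question).
NON_ABC_WORD = re.compile(r"(?: |^)(?!ABC)[\w.-]*")


def lowercase_except_abc(text: str) -> str:
    """Lowercase every matched word; words starting with ABC never match."""
    return NON_ABC_WORD.sub(lambda m: m.group(0).lower(), text)


print(lowercase_except_abc("Baz Buzz ABCkjkJ.90_1 ABC-12_OLL Foo Bar"))
# prints: baz buzz ABCkjkJ.90_1 ABC-12_OLL foo bar
```

Note that a word is only recognized when it follows a space or starts the string, so text glued to other punctuation (for example after a comma with no space) is left untouched by this pattern.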


If the ABC part always occurs at the beginning of the string, then it's a lot easier: capture the first word in one group, capture the rest of the string in a second group, and lowercase the second group:

([\w.-]*)(.+)

replace with

\1\L\2


https://regex101.com/r/QSShDu/2
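
For libraries without \L, the same two-group idea can be sketched in Python with a callback. This is only an illustration under the assumption that each line is processed separately and begins with the ABC token; the function name lowercase_after_first_word is hypothetical.

```python
import re

# Group 1: the leading token (word characters, dots, hyphens).
# Group 2: everything after it on the same line.
FIRST_WORD_AND_REST = re.compile(r"([\w.-]*)(.+)")


def lowercase_after_first_word(line: str) -> str:
    """Keep the first token as-is and lowercase the remainder of the line."""
    return FIRST_WORD_AND_REST.sub(
        lambda m: m.group(1) + m.group(2).lower(), line, count=1
    )


print(lowercase_after_first_word("ABC-12_OLL Foo Bar Baz"))
# prints: ABC-12_OLL foo bar baz
```

One caveat of this pattern: because the second group needs at least one character, a line that consists of the token alone gives up its last character to group 2 through backtracking, and that character is lowercased.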






  • Thanks! If that matters, the "ABC***" string is always at the beginning of the line. Each line invariably starts with "ABC" and gibberish characters that need to maintain their capitalization, but the rest of the line contains the text that needs to be lowercase.

    – Antidisestablishmentarianism
    Nov 16 '18 at 4:46











  • Thank you so much! One last thing, if I'm not being too impertinent: how do I delete all punctuation except the apostrophe from the rest of the string? Would the pattern be something like ([\w.-]*)(\W\S)? And what do I replace it with?

    – Antidisestablishmentarianism
    Nov 16 '18 at 6:40













  • Put every punctuation character you want to remove in a character set, then replace every occurrence with the empty string, e.g. [._-]

    – CertainPerformance
    Nov 16 '18 at 7:17











  • I decided to forget about the apostrophe. I tried ([\w.-]*)([[:punct:]]) replaced with \1[._-]\2. That didn't work.

    – Antidisestablishmentarianism
    Nov 16 '18 at 7:28













  • You shouldn't try matching word characters; only match the punctuation marks you want to remove, in a character set, e.g. [._-] to remove all dots, underscores, and dashes.

    – CertainPerformance
    Nov 16 '18 at 7:33
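
The character-set advice from the comments can be combined with the "ABC token comes first" case. Below is a rough Python sketch, assuming the token is separated from the rest of the line by a single space and that [._-] is the set to strip (extend it with whatever other marks you need; the apostrophe is simply left out of the set). The function name clean_rest_of_line is hypothetical.

```python
import re

# The example set from the comment: dots, underscores, and hyphens.
# Assumption: add further punctuation here as needed; apostrophes are not listed,
# so they are preserved.
PUNCTUATION = re.compile(r"[._-]")


def clean_rest_of_line(line: str) -> str:
    """Leave the leading token untouched; lowercase and strip punctuation after it."""
    # Split at the first space: head is the ABC token, rest is the free text.
    head, sep, rest = line.partition(" ")
    return head + sep + PUNCTUATION.sub("", rest.lower())


print(clean_rest_of_line("ABCkjkJ.90_1 Some-Text. With_Dots"))
# prints: ABCkjkJ.90_1 sometext withdots
```

Splitting the line first keeps the replacement pattern as simple as the comment suggests, since the regex no longer has to know which part of the line is protected.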














edited Nov 16 '18 at 4:50

























answered Nov 16 '18 at 4:27









CertainPerformance












