In 4.2 final
Searching with the regular expression
\b. or \<. jumps to the beginning of a word, unless the
cursor is already inside a word, in which case it finds the
next character within that word.
This makes it impossible to replace characters only at
the beginning of a word.
replace with BeanShell snippet
should force the first character of every word to be
lower case. However, it actually forces *every*
character of every word to be lower case.
|Submitted||abgrover - 2005-10-12 - 17:50:09z||Assigned||nobody|
|Priority||4||Category||search and replace|
|2005-10-13 - 03:39:00z
As far as I can tell, \b, \< and some others are not
supported by the gnu.regex package.
One more reason to throw it away and use
java.util.regex instead. ;-)
|2008-03-02 - 21:44:33z
The gnu.regex package has been thrown away now. Not so the bug.
- The expression "\b." matches *every* single character in a word.
- The expression ".\b" matches the last character in a word (as expected) but also a following space character.
|2012-01-19 - 20:15:50z
|I don't think this is a bug. I see 2 issues with Alan G's approach:
1) Note that the boundary matching characters match the boundary and not the characters themselves. Each word has two boundaries, one BEFORE the first character and one AFTER the last. Your regex will match the first boundary of your complete search string, which occurs before the first word character.
2) The replace string you specify indicates that you want to replace every character with a lower-case character since "_0" refers to the complete contents of your searched text. To clarify, if my search text is the string "this text" Alan's BeanShell snippet is equivalent to "this text".toLowerCase().
As an alternative, the following appears to work for me:
Search regex: \b(\w)(\w*)
Replace with BeanShell snippet: _1.toLowerCase() + _2
By separating the first character of each word (following a boundary) from the rest I can transform just that one character.
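For readers who want to try the same idea outside jEdit, here is a minimal sketch using plain java.util.regex. The class and method names are made up for illustration, and `m.group(1)` / `m.group(2)` only stand in for the `_1` / `_2` variables of the BeanShell snippet; this is not how jEdit itself evaluates the replacement.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of jEdit.
final class LowerFirstLetter {
    // Group 1: first word character after a boundary. Group 2: rest of the word.
    private static final Pattern WORD = Pattern.compile("\\b(\\w)(\\w*)");

    static String lowerFirst(String text) {
        Matcher m = WORD.matcher(text);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            // Same idea as the snippet "_1.toLowerCase() + _2".
            String replacement = m.group(1).toLowerCase() + m.group(2);
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(lowerFirst("This Text, Again")); // prints "this text, again"
    }
}
```

Because each match consumes the whole word, only the first character of every word is changed and the rest is copied through untouched.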
|2012-01-21 - 12:21:24z
|Just tried again:
1. ".\b" matches the word boundary and the preceding character *as expected*
2. "\b." matches *every single character* in a word, which is *a bug*, isn't it?
jEdit 4.4.2 and jEdit 4.5pre1
|2012-01-22 - 10:55:23z
|As noted, \b matches word boundaries, i.e. both the beginning and end of a "word".
So \b. will indeed match the first letter of any word, but also the first whitespace
character (and any other non-word character such as the dot after a sentence) AFTER
every word. (Though that doesn't matter if you just want to upper-case it.)
If you want to upper-case the first letter of every word you should use \b\w instead, or even \b[a-zA-Z] (or similar), depending on whether the search or the conversion is slower.
I'm guessing that in jEdit, after each match has been handled, the pattern is applied again to whatever comes after the last matched character. This will indeed cause every character in a word to match, since the first position of every string will match \b. In other words, the first match for \b in any non-blank string is (and should be) ^, i.e. the first position of the matched string.
In that case it is not a bug, and the correct way is indeed to use something like \b(\w)\w* as suggested by Mr. Jakob. That would let each match consume the rest of the word so that it is not matched in the next iteration.
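The guess above can be reproduced with plain java.util.regex. The sketch below is only a model of the suspected behaviour, not jEdit code: `matchOnRemainder` (a made-up name) restarts the search on a fresh substring after every match, while `matchInPlace` lets a single matcher walk the whole text.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical model of the behaviour guessed at above; not jEdit code.
final class BoundaryRestartDemo {
    // One matcher over the whole text: \b can see the preceding character.
    static List<String> matchInPlace(Pattern p, String text) {
        List<String> hits = new ArrayList<>();
        Matcher m = p.matcher(text);
        while (m.find()) {
            hits.add(m.group());
        }
        return hits;
    }

    // After each match, search a fresh substring: the left context is lost,
    // so index 0 of the remainder looks like the start of a word.
    static List<String> matchOnRemainder(Pattern p, String text) {
        List<String> hits = new ArrayList<>();
        int pos = 0;
        while (pos < text.length()) {
            Matcher m = p.matcher(text.substring(pos));
            if (!m.find()) {
                break;
            }
            hits.add(m.group());
            pos += Math.max(m.end(), 1); // guard against empty matches
        }
        return hits;
    }

    public static void main(String[] args) {
        Pattern dot = Pattern.compile("\\b.");
        Pattern word = Pattern.compile("\\b(\\w)\\w*");
        System.out.println(matchInPlace(dot, "ab cd"));      // [a,  , c]
        System.out.println(matchOnRemainder(dot, "ab cd"));  // [a, b, c, d]
        System.out.println(matchOnRemainder(word, "ab cd")); // [ab, cd]
    }
}
```

In this model "\b." hits every word character once the search is restarted on the remainder, whereas "\b(\w)\w*" consumes the rest of the word and so behaves the same either way.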
|2012-04-22 - 14:21:04z
|The search-and-replace design does not allow this to be fixed without a significant
interface extension. The SearchMatcher class has a findNext method that always starts
from index 0. It does not allow a different index to be supplied. If the findNext method
has no access to the preceding characters, it cannot perform a word boundary search correctly.
So I don't expect this to be fixed soon.
A fix would require much attention because there are many clauses for reverse search which must be taken into consideration. I'm not going to do it.
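For comparison, java.util.regex itself can search from an arbitrary index while still seeing the preceding characters, via `Matcher.region` combined with `Matcher.useTransparentBounds`. The sketch below only illustrates that mechanism; `findFrom` is a hypothetical helper, and nothing here claims to match the SearchMatcher interface or how a jEdit fix would have to look.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper showing region bounds in java.util.regex; not jEdit code.
final class TransparentBoundsDemo {
    private static final Pattern P = Pattern.compile("\\b.");

    // Returns the start index of the first match at or after `from`, or -1.
    static int findFrom(String text, int from, boolean transparent) {
        Matcher m = P.matcher(text);
        // Transparent bounds let \b inspect characters outside the region.
        m.useTransparentBounds(transparent);
        m.region(from, text.length());
        return m.find() ? m.start() : -1;
    }

    public static void main(String[] args) {
        // Opaque bounds: index 1 looks like a word start, so "b" matches.
        System.out.println(findFrom("ab cd", 1, false)); // 1
        // Transparent bounds: "a" is visible, so the next boundary is at index 2.
        System.out.println(findFrom("ab cd", 1, true));  // 2
    }
}
```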
I don't think it is really crucial functionality, so I am lowering the priority. I even have a workaround. First replace all "\b" with "X" (this works), then replace all "X." with a suitable Java snippet. Of course X must be substituted with something that is not contained in the file.
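The two-step workaround can be sketched in plain java.util.regex as follows. The class name and the marker character are assumptions made for illustration. One extra step is added that the workaround above does not mention: in java.util.regex a marker placed at the end of a line or of the text is not followed by a character that "." matches, so leftover markers are stripped at the end. Whether jEdit needs that cleanup was not checked here.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch of the marker workaround; not jEdit code.
final class MarkerWorkaround {
    // Assumed marker: a control character that must not occur in the text.
    private static final String MARKER = "\u0001";
    private static final Pattern MARKED_CHAR =
            Pattern.compile(Pattern.quote(MARKER) + "(.)");

    static String lowerFirst(String text) {
        if (text.contains(MARKER)) {
            throw new IllegalArgumentException("marker already present in text");
        }
        // Step 1: put the marker on every word boundary.
        String marked = text.replaceAll("\\b", Matcher.quoteReplacement(MARKER));
        // Step 2: replace marker + character with the lower-cased character.
        Matcher m = MARKED_CHAR.matcher(marked);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            m.appendReplacement(out,
                    Matcher.quoteReplacement(m.group(1).toLowerCase()));
        }
        m.appendTail(out);
        // Step 3 (added assumption): drop markers with no following character.
        return out.toString().replace(MARKER, "");
    }

    public static void main(String[] args) {
        System.out.println(lowerFirst("Ab Cd")); // prints "ab cd"
    }
}
```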